How to create interactive line charts with Pandas and Altair

A line chart is an essential tool in data analysis. It shows how a value changes across successive measurements, which makes it especially important when working with time series. Trend, seasonality and correlation are some of the characteristics that can be observed in well-made line charts. In this article, we will create interactive line charts using two Python libraries: Pandas and Altair.

We have already touched on visualization with the Altair library in an article on creating interactive maps. Today, for the start of our Data Science course, we are sharing a simple guide to one of the most important chart types; with this guide you can start learning Altair in practice.


Pandas provides the data, and Altair renders beautiful and informative line charts. While it is also possible to plot data with Pandas, data visualization is not its main focus. In addition, we will make the charts interactive, whereas the default plotting in Pandas produces static charts.

Let’s start by generating data. A typical use case for line charts is stock price analysis. One of the easiest ways to get stock price data is provided by the pandas-datareader library. First, we need to import it along with the Pandas already installed in Google Colab.

import pandas as pd
import altair as alt  # used for the charts later in the article
from pandas_datareader import data

We will get the closing share prices of three companies for one year. The start date, end date and data source must be specified:

start = "2020-01-01"
end = "2020-12-31"
source = "yahoo"

You need to know one more detail, the ticker symbol of each stock:

apple = data.DataReader("AAPL", start=start, end=end, data_source=source).reset_index()[["Date", "Close"]]
ibm = data.DataReader("IBM", start=start, end=end, data_source=source).reset_index()[["Date", "Close"]]
microsoft = data.DataReader("MSFT", start=start, end=end, data_source=source).reset_index()[["Date", "Close"]]

We now have stock prices for Apple, IBM and Microsoft in 2020. It is better to put them in a single dataframe. Before combining, we need to add a column that indicates which stock each price belongs to. The following block of code adds this column to each dataframe and then concatenates the dataframes using the concat function:

apple["Stock"] = "apple"
ibm["Stock"] = "ibm"
microsoft["Stock"] = "msft"
stocks = pd.concat([apple, ibm, microsoft])
# The month can only be derived once the combined dataframe exists
stocks["Month"] = stocks.Date.dt.month

We have also added information about the month, which may be useful for analysis. Now you can start creating graphs.
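If the download step cannot be run (for example, without network access), the same tagging and concatenation steps can be tried on synthetic data. The sketch below is a stand-in, not the data used in this article: `fake_prices` is a hypothetical helper that generates a random walk, and the starting prices are made up. It also shows one possible use of the Month column, a monthly average of the closing price.

```python
import numpy as np
import pandas as pd

# Business days of 2020; the prices below are synthetic, not real quotes
dates = pd.date_range("2020-01-01", "2020-12-31", freq="B")
rng = np.random.default_rng(0)

def fake_prices(start_price):
    """Hypothetical helper: a random walk around a made-up starting price."""
    steps = rng.normal(0, 1, len(dates))
    return pd.DataFrame({"Date": dates, "Close": start_price + steps.cumsum()})

frames = {
    "apple": fake_prices(75.0),
    "ibm": fake_prices(135.0),
    "msft": fake_prices(160.0),
}

# Tag each frame with its stock name, then stack them into one long dataframe
stocks = pd.concat(
    [df.assign(Stock=name) for name, df in frames.items()],
    ignore_index=True,
)
stocks["Month"] = stocks["Date"].dt.month

# One row per month, one column per stock
monthly_mean = stocks.groupby(["Stock", "Month"])["Close"].mean().unstack("Stock")
print(monthly_mean.round(2))
```

The long format (one row per date and stock) is what the `color` encoding used later expects, which is why the frames are stacked rather than joined side by side.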

Altair

Altair is a statistical visualization library for Python. As we will see in the examples, its syntax is clean and easy to understand, and it makes creating interactive visualizations very easy. I will briefly explain the structure of an Altair chart and then focus on creating interactive line charts. If you are new to Altair, a four-part Altair tutorial series is a good place to start.

Here’s a simple line graph without any interactivity:

alt.Chart(stocks).mark_line().encode(
   x="Date",
   y="Close",
   color="Stock"
).properties(
   height=300, width=500
)

The basic structure starts with a top-level Chart object. The data can be a Pandas dataframe or a string with a URL pointing to a JSON or CSV file. Then the mark type is specified (mark_circle, mark_line, and so on).

The encode function tells Altair which columns of the given dataframe to plot and how. Thus, everything we write in the encode function must be bound to the data. We distinguish the stocks using the color parameter, which is similar to the hue parameter in Seaborn. Finally, the properties function sets certain properties of the chart, such as its height and width.

Selections are one of the ways Altair interacts with the user: a selection captures the user's actions.

selection = alt.selection_multi(fields=["Stock"], bind="legend")
alt.Chart(stocks).mark_line().encode(
   x="Date",
   y="Close",
   color="Stock",
   opacity=alt.condition(selection, alt.value(1), alt.value(0.1))
).properties(
   height=300, width=500
).add_selection(
   selection
)

The selection object above is built on the Stock column, which contains the stock names. It is bound to the legend. The selection object is passed to the opacity parameter, so the opacity of each line changes according to the selected stock name.

We also need to add the selection to the chart using the add_selection function. The two images below demonstrate how the selection works. We just need to click on the name of a stock in the legend, and the chart is updated accordingly.

Altair provides other options for interactivity. For example, you can create a chart in which a line is highlighted when the mouse pointer passes over it. The code below creates a selection object that does just that:

hover = alt.selection(
   type="single", on="mouseover", fields=["Stock"], nearest=True
)

Let’s use the selection object to capture the point on the chart nearest to the pointer, and then highlight the line to which this point belongs.

There are three components in the code below. The first creates a line chart. The second is a scatter plot drawn over the line chart; it is used to determine the nearest point. We set its opacity to zero so that the scatter plot is not visible. The third component highlights the line that contains the point captured on the second chart:

# line plot
lineplot = alt.Chart(stocks).mark_line().encode(
   x="Date:T",
   y="Close:Q",
   color="Stock:N",
)
# nearest point
point = lineplot.mark_circle().encode(
   opacity=alt.value(0)
).add_selection(hover)
# highlight
singleline = lineplot.mark_line().encode(
   size=alt.condition(~hover, alt.value(0.5), alt.value(3))
)

Now, by combining the second and third plots, you can create an interactive line plot:

point + singleline

The first image shows the original chart. The second shows the updated version while the cursor hovers over a line.

Conclusion

Altair makes it easy to add interactive components to visualizations. Once you understand its elements of interactivity, you can use them to enrich your own charts.

Data visualization is worth the effort because it helps you look at data in a new way, both literally and figuratively. But visualization is just one aspect of working with data. If you want to know more, take a look at our Data Science course, in which you will learn how to extract data, clean it, analyze it and use it to solve your problems.
