How to build beautiful graphs in Python with Seaborn
This article is a translation of useful material.
Data visualization is a technique that enables data scientists to transform raw data into charts and graphs that provide valuable information. Charts reduce the complexity of the data and make it easier for any user to understand.
There are plenty of no-code data visualization tools, such as Tableau, Power BI, ChartBlocks and others. They are very powerful, and each has its own audience. However, for working with raw data that requires processing, and as a sandbox for experiments, Python is an excellent fit.
Despite the fact that this path is more difficult and requires programming skills, Python will allow you to perform any manipulation, transformation and visualization of your data. It is ideal for data scientists.
Python is the best data science tool for many reasons, but the most important is its library ecosystem. There are many great libraries for working with data in Python, such as numpy, pandas, matplotlib, and tensorflow.
Matplotlib is probably the best-known plotting library for Python. Its level of customization and its usability are what put it first. However, some of its operations and customizations can be hard to deal with.
Developers have created a new library based on matplotlib, called seaborn. Seaborn is as powerful as matplotlib, but it also provides a higher-level abstraction that simplifies plotting and brings some unique features.
In this article, we will focus on how to work with seaborn to create top-notch charts. If you want, you can create a new project and repeat all the steps, or just refer to my seaborn tutorial on GitHub.
What is Seaborn?
Seaborn is a library for making statistical graphics in Python. It builds on matplotlib and integrates closely with pandas data structures.
Seaborn’s design lets you explore and understand your data quickly. Its plotting functions operate on dataframes or arrays containing whole datasets, and internally perform the semantic mapping and statistical aggregation needed to turn the data into informative plots.
It abstracts away complexity, allowing you to design graphics according to your needs.
Installing Seaborn
Installing seaborn is as easy as installing any other library; all you need is your favorite Python package manager. When you install seaborn, its dependencies are installed as well, including matplotlib, pandas, numpy, and scipy.
Let’s install seaborn now, and of course also the notebook package, to get access to our data sandbox.
pipenv install seaborn notebook
Apart from that, let’s import a few modules before getting started.
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
Building the first charts
Before we start plotting, we need data. The beauty of seaborn is that it works directly with pandas dataframes, which makes it very convenient. Moreover, the library comes with some built-in datasets that you can load directly from code, with no need to load files manually.
Let’s see how this works on an airplane flight dataset.
flights_data = sns.load_dataset("flights")
flights_data.head()
| | year | month | passengers |
|---|---|---|---|
| 0 | 1949 | Jan | 112 |
| 1 | 1949 | Feb | 118 |
| 2 | 1949 | Mar | 132 |
| 3 | 1949 | Apr | 129 |
| 4 | 1949 | May | 121 |
All the magic happens in the call to the load_dataset function, which expects the name of the dataset to load and returns a dataframe. All of these datasets are available in a repository on GitHub.
Scatter Plot
A scatter plot is a chart that displays points based on two dimensions of a dataset. It is very easy to create a scatterplot in seaborn, as it only requires one line of code.
sns.scatterplot(data=flights_data, x="year", y="passengers")

Easy, isn’t it? The scatterplot function takes the dataset to render and the columns that will act as the x and y axes.
Line Plot
This plot draws a line that shows how continuous or categorical data evolves. It is a popular, well-known kind of chart, and it is easy to create. As before, we use the lineplot function with the dataset and the columns representing the x and y axes. Seaborn will do the rest for us.
sns.lineplot(data=flights_data, x="year", y="passengers")

Bar Plot
This is probably the best-known type of chart, and, you guessed it, we can build it with seaborn just as we did for line and scatter plots, this time using the barplot function.
sns.barplot(data=flights_data, x="year", y="passengers")

It’s very colorful, I know. Later we will learn how to customize it.
Extending functionality with matplotlib
Seaborn builds on matplotlib, extending its functionality and abstracting complexity. At the same time, seaborn does not lose its power. Any seaborn chart can be customized using functions from the matplotlib library. This mechanic can come in handy in certain cases and allows seaborn to take advantage of matplotlib’s capabilities without having to rewrite all of its functions.
Let’s say you want to plot several charts at the same time using seaborn. In this case you can use the subplot function from matplotlib.
diamonds_data = sns.load_dataset('diamonds')
plt.subplot(1, 2, 1)
sns.countplot(x='carat', data=diamonds_data)
plt.subplot(1, 2, 2)
sns.countplot(x='depth', data=diamonds_data)

With the subplot function, several plots can be drawn on one figure. The function takes three parameters: the first is the number of rows, the second is the number of columns, and the third is the index of the plot to draw next (starting at 1).
We render one seaborn plot in each part of the figure, mixing matplotlib and seaborn functionality.
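The same layout can also be sketched with matplotlib's object-oriented interface, where each seaborn call receives the axes it should draw on through the `ax` parameter. This is only an illustration under stated assumptions: the tiny dataframe and its `cut` and `color` columns are made up so the snippet runs without downloading anything, and the `Agg` backend is selected so no window is needed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Small hypothetical dataset, made up for illustration only.
df = pd.DataFrame({
    "cut": ["Ideal", "Good", "Ideal", "Fair", "Good", "Ideal"],
    "color": ["E", "E", "G", "G", "E", "H"],
})

# One row and two columns of axes; each seaborn call draws on the axes it is given.
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))
sns.countplot(data=df, x="cut", ax=ax_left)
sns.countplot(data=df, x="color", ax=ax_right)
fig.tight_layout()
```

Passing `ax` explicitly avoids relying on matplotlib's notion of the "current" plot, which can make longer notebooks easier to follow.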
Seaborn and Pandas
We have already said that seaborn loves pandas, to the point that all of its functions are built around the dataframe structure. Up to this point we have been using seaborn with preloaded data, but what if we want to render data that has already been loaded with pandas?
drinks_df = pd.read_csv("data/drinks.csv")
sns.barplot(x="country", y="beer_servings", data=drinks_df)
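If the CSV file is not at hand, the same idea can be tried with a dataframe built in code. The sketch below assumes the column names `country` and `beer_servings` from the snippet above; the rows themselves are hypothetical placeholders. It also keeps only the largest values with `nlargest`, since a bar per country can get crowded on a real dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical rows; only the column names come from the article's example.
drinks_df = pd.DataFrame({
    "country": ["A", "B", "C"],
    "beer_servings": [120, 85, 240],
})

# Keep the two countries with the most servings to keep the chart readable.
top_df = drinks_df.nlargest(2, "beer_servings")

# Any pandas dataframe can be passed to seaborn through the `data` argument.
ax = sns.barplot(data=top_df, x="country", y="beer_servings")
```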

Create beautiful graphs with styles
Seaborn lets you change the look of your charts. Five styles are available out of the box: darkgrid, whitegrid, dark, white and ticks.
sns.set_style("darkgrid")
sns.lineplot(data=flights_data, x="year", y="passengers")

Here’s another example.
sns.set_style("whitegrid")
sns.lineplot(data=flights_data, x="year", y="passengers")

Cool use cases
We’ve learned the basics of seaborn, so now let’s put this knowledge into practice and build several charts from the same dataset. In our case, we will use the “tips” dataset, which can be loaded directly from seaborn.
First, let’s load the dataset.
tips_df = sns.load_dataset('tips')
tips_df.head()
| | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| 2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |
I like to display the first few rows of a dataset to get a feel for the columns and the data itself. I usually use a few pandas functions to fix data issues such as null values, or to add information to the dataset that may be useful. You can read more about this in the pandas manual.
Let’s add another column that holds the tip as a share of the total bill.
tips_df["tip_percentage"] = tips_df["tip"] / tips_df["total_bill"]
tips_df.head()
Now the data looks like this:
| | total_bill | tip | sex | smoker | day | time | size | tip_percentage |
|---|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 | 0.059447 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 | 0.160542 |
| 2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 | 0.166587 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 | 0.139780 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 | 0.146808 |
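The new column can be checked by hand on the five rows shown above. The sketch below rebuilds just the `total_bill` and `tip` columns from those rows, so it needs no download. The extra `tip_percent` column and the `missing_per_column` variable are names introduced here for illustration. Note that `tip_percentage` holds a fraction between 0 and 1, not a value in percent.

```python
import pandas as pd

# The first five rows of the tips dataset, as displayed in the table above.
sample = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
})

# Element-wise division: one ratio per row.
sample["tip_percentage"] = sample["tip"] / sample["total_bill"]

# Hypothetical extra column: the same ratio expressed in percent.
sample["tip_percent"] = (sample["tip_percentage"] * 100).round(1)

# A quick check for null values, one count per column.
missing_per_column = sample.isna().sum()
```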
And now we will start building graphs.
Tip percentage
Let’s look at how the tip percentage is distributed. For this we use the histplot function, which generates a histogram.
sns.histplot(tips_df["tip_percentage"], binwidth=0.05)

To make the chart readable we had to set the binwidth property, but now we can understand and evaluate the data much faster. Most customers tip 15 to 20% of the bill, but there are a few cases where the tip exceeds 70%. Such values are called anomalies or outliers, and they are always worth a look to check whether they are errors.
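One way to take that closer look is to filter the rows behind the outliers with a boolean mask. This is a minimal sketch on a few illustrative rows; the 0.7 threshold is an assumption taken from the "exceed 70%" observation above, and the variable names are hypothetical.

```python
import pandas as pd

# A few illustrative rows; the third one has an unusually high tip.
tips_sample = pd.DataFrame({
    "total_bill": [16.99, 10.34, 7.25, 21.01],
    "tip": [1.01, 1.66, 5.15, 3.50],
})
tips_sample["tip_percentage"] = tips_sample["tip"] / tips_sample["total_bill"]

# Assumed threshold: anything above 70% of the bill is treated as an outlier.
threshold = 0.7
outliers = tips_sample[tips_sample["tip_percentage"] > threshold]
```

Inspecting `outliers` next to the original bill and tip amounts usually makes it clear whether a value is a data entry error or just a generous customer.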
I was also wondering if the tip percentage changes with the time of day.
sns.histplot(data=tips_df, x="tip_percentage", binwidth=0.05, hue="time")

This time, we created the chart from the entire dataset rather than a single column, and set the hue property to the time column. This uses a different color for each value of time in the chart and adds a legend.
Total tip for a specific day of the week
Another interesting metric is the total amount of tips the staff receive depending on the day of the week.
sns.barplot(data=tips_df, x="day", y="tip", estimator=np.sum)

It seems like Friday is a good day to stay at home.
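The bar heights produced with `estimator=np.sum` correspond to a grouped sum, which can be reproduced in plain pandas. The sketch below uses made-up rows and also counts the records per day, because a low total can simply mean that fewer bills were recorded on that day.

```python
import pandas as pd

# Made-up rows for illustration only.
sample = pd.DataFrame({
    "day": ["Thur", "Fri", "Sat", "Sat", "Sun", "Sun"],
    "tip": [2.0, 3.0, 1.5, 2.5, 4.0, 1.0],
})

# Total tip and number of records for each day.
totals = sample.groupby("day")["tip"].agg(["sum", "count"])
```

Looking at `sum` and `count` side by side helps separate "people tip less on this day" from "there were fewer customers on this day".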
Influence of table size and day of the week on tip
Sometimes you need to understand how several variables affect the final result. For example, how do the day of the week and the table size affect the tip percentage?
To build the following chart, we combine the pivot_table function from pandas for preprocessing and then draw a heatmap.
pivot = tips_df.pivot_table(
    index=["day"],
    columns=["size"],
    values="tip_percentage",
    aggfunc=np.average)
sns.heatmap(pivot)
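To see what the heatmap receives, it can help to run the pivot step alone on a handful of rows. The data below is made up for illustration, and `aggfunc="mean"` is used here as an equivalent spelling of the average.

```python
import pandas as pd

# Made-up rows for illustration only.
sample = pd.DataFrame({
    "day": ["Sat", "Sat", "Sat", "Sun", "Sun"],
    "size": [2, 2, 4, 2, 4],
    "tip_percentage": [0.10, 0.20, 0.15, 0.18, 0.12],
})

# Rows become days, columns become table sizes, cells hold the mean tip percentage.
pivot = sample.pivot_table(
    index="day",
    columns="size",
    values="tip_percentage",
    aggfunc="mean",
)
```

Each cell of the resulting table becomes one colored square in the heatmap. Passing `annot=True` to `sns.heatmap` additionally writes the value inside each square.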

Conclusion
Of course, there are many more things we can do with seaborn. You can find out about them in the official documentation.
I hope you enjoyed this article as much as I did. Thanks for reading!