from matplotlib import pyplot as plt

# Some data for plotting
x = [0, 1, 2, 3, 4, 5, 6]
y_1 = [0, 2, 4, 6, 8, 10, 12]
y_2 = [0, 3, 6, 9, 12, 15, 18]
# Let's start plotting
plt.plot(x, y_1, color='red', linestyle='dashed', label='Y values')
plt.xlabel('x-values')
plt.ylabel('y-values')
plt.title('X vs Y')
plt.grid(alpha=.25)
plt.legend(loc='upper left')
plt.show()
6 Plotting You Need to Know
matplotlib unleashed! (Image from Python Data Visualization with Matplotlib — Part 1)
What to Expect in this Chapter
In this chapter, I will show you how to generate high-quality, publication-ready plots using Python. You can readily use this knowledge in your other modules (for example, when writing reports for your experiments). There are many packages (e.g. matplotlib, plotly, seaborn, bokeh, folium) that you can use to plot with Python. Of these, matplotlib is the most popular and most versatile. Since some of the other packages are built using matplotlib, it is usually good to know your way around matplotlib first.
6.1 A simple plot
6.1.1 Let’s look at some code.
The code corresponding to the plot is shown at the top of this chapter. Copy, paste and run this code snippet to generate the plot.
Spend a few minutes perusing the code to figure out what each line achieves in the plot. A simple way to do this is by commenting out some of the lines to turn off their contributions or by changing some parameters (e.g. loc to 'lower left') and numbers (e.g. alpha to .8). It will help you learn faster if you try to predict what will happen before running the code.
Note
- You can use the following abbreviations if you like:

  | Long form | Abbreviation |
  |---|---|
  | color | c |
  | linestyle | ls |
  | linewidth | lw |

  So, both the following lines produce the same result.

  plt.plot(x, y, color='red', linestyle='dashed', linewidth=2)
  plt.plot(x, y, c='red', ls='dashed', lw=2)

- Jupyter is an interactive environment, so you will see an output even if you omit plt.show(). However, it is good practice to include this line anyway so that your code will also work in non-interactive environments (e.g. when the script is run directly from the command line).
- The plotting functions usually have default values for the styling parameters. So if you wish, you can keep it simple and plot as follows:

  plt.plot(x, y)
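If you would like to convince yourself that the abbreviations really are equivalent, here is a minimal sketch that plots the same data both ways and compares the resulting line styles. The names long_line, short_line and same_style are my own, and the data is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a script; leave out in Jupyter
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y = [0, 2, 4, 6]

# plt.plot() returns a list of line objects; each call here draws one line
(long_line,) = plt.plot(x, y, color='red', linestyle='dashed', linewidth=2)
(short_line,) = plt.plot(x, y, c='red', ls='dashed', lw=2)

# Compare the styling that matplotlib actually stored for each line
same_style = (
    long_line.get_color() == short_line.get_color()
    and long_line.get_linestyle() == short_line.get_linestyle()
    and long_line.get_linewidth() == short_line.get_linewidth()
)
print(same_style)
```

Which form you use is purely a matter of taste; the long forms are easier to read, the short ones quicker to type.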
6.1.2 Adding another plot
You can add another plot command to the graph to plot the data of y_2 in blue by adding the following line.
plt.plot(x, y_2, color='blue', label='Y2 values')
Once you do this, the code will look like this:
# Some data for plotting
x = [0, 1, 2, 3, 4, 5, 6]
y_1 = [0, 2, 4, 6, 8, 10, 12]
y_2 = [0, 3, 6, 9, 12, 15, 18]
# Let's start plotting
plt.plot(x, y_1, color='red', linestyle='dashed', label='Y values')
plt.plot(x, y_2, color='blue', label='Y2 values')
plt.xlabel('x-values')
plt.ylabel('y-values')
plt.title('X vs Y')
plt.grid(alpha=.25)
plt.legend(loc='upper left')
plt.show()
6.1.3 Yet another plot but with error bars
Let me add another plot, but this time I will also include \(x\) and \(y\) error bars for the points. The plotting command I need to use for this is called errorbar().
plt.errorbar(x, y_3, xerr=x_error, yerr=y_error,
             color="green", label="Y3 with errors")
Once you do this, the code will look like this:
# Some data for plotting
x = [0, 1, 2, 3, 4, 5, 6]
y_1 = [0, 2, 4, 6, 8, 10, 12]
y_2 = [0, 3, 6, 9, 12, 15, 18]
y_3 = [0, 4, 8, 12, 16, 20, 24]
x_error, y_error = .1, 0.75
# Let's start plotting
plt.plot(x, y_1, color='red', linestyle='dashed', label='Y values')
plt.plot(x, y_2, color='blue', label='Y2 values')
plt.errorbar(x, y_3, xerr=x_error, yerr=y_error,
color='green', label='Y3 with errors')
plt.xlabel('x-values')
plt.ylabel('y-values')
plt.title('X vs Y')
plt.grid(alpha=.25)
plt.legend(loc='upper left')
plt.show()
In this example, I have provided constant errors for all the points. However, you can also provide a list of errors (one value per point) so that each error bar has a different length.
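As a rough sketch of the per-point case, the snippet below passes lists to xerr and yerr instead of single numbers. The error values are invented purely for illustration; the only requirement is that each list has one entry per data point.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a script; leave out in Jupyter
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4, 5, 6]
y_3 = [0, 4, 8, 12, 16, 20, 24]

# Made-up errors: one value for each of the seven points
x_error = [0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4]
y_error = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]

container = plt.errorbar(x, y_3, xerr=x_error, yerr=y_error,
                         color='green', label='Y3 with errors')
plt.legend(loc='upper left')
```

In an interactive session, follow this with plt.show() to see the error bars grow from left to right.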
6.1.4 Experiment, Experiment, Experiment
- Visit colorbrewer and pick other colours for the lines.
- Change the value of alpha. (Have you figured out what alpha does?)
- Can you change the colour of the title?
- Specify more options for the grid!
  plt.grid(alpha=.25, color='red', linestyle='dashed')
- Change the location of the legend.
  - Can you figure out the acceptable options for the location of the legend? Hint: Make a mistake in specifying loc!
  - What happens if you do not specify a location?
From here onwards, to reduce clutter, I will show only a minimum of styling code. You should, however, still keep the styling lines in your own code to get nice-looking plots.
6.1.5 Better with NumPy
Often it is easier to use NumPy arrays instead of Python lists. So, let’s first convert the Python lists into NumPy arrays and then redo the plot in the previous step.
import numpy as np

# Some data for plotting
x = [0, 1, 2, 3, 4, 5, 6]
y_1 = [0, 2, 4, 6, 8, 10, 12]
y_2 = [0, 3, 6, 9, 12, 15, 18]

# Convert the lists into NumPy arrays
np_x = np.array(x)
np_y_1 = np.array(y_1)
np_y_2 = np.array(y_2)

plt.plot(np_x, np_y_1, color='red', linestyle='dashed', label='Y values')
plt.plot(np_x, np_y_2, color='blue', label='Y2 values')
plt.show()
You will have to wait until the next step to see why NumPy is better.
6.1.6 Adding mathematical functions
One of the advantages of NumPy arrays is that they make it easy to generate data from mathematical functions, because a function is applied to every element of the array at once. Let’s reuse our previous code to plot \(x^2\) and \(\sin(x)\).
x = np.array([0, 1, 2, 3, 4, 5, 6])
x2 = x**2
sin_x = np.sin(x)

plt.plot(x, x2, color='red', linestyle='dashed', label='x^2')
plt.plot(x, sin_x, color='blue', label='sin(x)')
plt.legend()
plt.show()
Alas, our plot does not look good because \(\sin(x)\) lies between \(\pm 1\), but \(x^2\) has no such bounds. One way to fix this is to add another y-axis that shares the same x-axis.
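To see the mismatch in numbers rather than by eye, you can print the range of each data set. This is only a small diagnostic sketch using the same x values as above.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6])
x2 = x**2            # every element is squared in one go
sin_x = np.sin(x)    # sin is applied to every element

# Compare the vertical extent of the two curves
print('x^2 range:   ', x2.min(), 'to', x2.max())
print('sin(x) range:', sin_x.min(), 'to', sin_x.max())
```

With \(x^2\) reaching 36 while \(\sin(x)\) never leaves \(\pm 1\), the sine curve is squashed almost flat when both share one y-axis.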
We need another axis!
matplotlib offers a variety of ways to have multiple axes. The simplest way to have another y-axis that shares the same x-axis is to use the command twinx() as follows.
x = np.array([0, 1, 2, 3, 4, 5, 6])
x2 = x**2
sin_x = np.sin(x)

plt.plot(x, x2, color='red', linestyle='dashed', label='x^2')
plt.legend(loc='lower left')

# This creates a new y-axis for the plot that comes after
plt.twinx()
plt.plot(x, sin_x, color='blue', label='sin(x)')
plt.legend(loc='lower right')

plt.show()
Note that we have two legend() calls, one for each axis.
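If two separate legends feel cluttered, one possible alternative is to collect the entries from both axes and draw a single legend. The sketch below does this with matplotlib's object-based interface (plt.subplots() and ax.twinx()), which this chapter has not introduced; treat it as an optional aside, and note that the names ax_left and ax_right are my own.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a script; leave out in Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6])

fig, ax_left = plt.subplots()
ax_left.plot(x, x**2, color='red', linestyle='dashed', label='x^2')

# Second y-axis that shares the same x-axis
ax_right = ax_left.twinx()
ax_right.plot(x, np.sin(x), color='blue', label='sin(x)')

# Gather the legend entries from both axes and show them in one box
handles_left, labels_left = ax_left.get_legend_handles_labels()
handles_right, labels_right = ax_right.get_legend_handles_labels()
legend = ax_right.legend(handles_left + handles_right,
                         labels_left + labels_right,
                         loc='lower right')
```

The two-legend version is simpler to write; the combined version is tidier when the plot goes into a report.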
But our plot still does not look good because we have only a few points. Let’s use np.linspace to fix this with:
x = np.linspace(0, 6, 100)
Now we end up with the following result.
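In case it is not obvious what np.linspace(0, 6, 100) produces, here is a short sketch that compares it with the seven hand-typed points used earlier. The variable names coarse and fine are mine.

```python
import numpy as np

coarse = np.array([0, 1, 2, 3, 4, 5, 6])
fine = np.linspace(0, 6, 100)   # 100 evenly spaced values from 0 to 6, both ends included

print(len(coarse), len(fine))
print(fine[0], fine[-1])
print(fine[1] - fine[0])        # spacing between neighbours: 6 / 99
```

More points means shorter straight-line segments between them, which is why the sine curve now looks smooth.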
6.1.7 Saving to disk
If you want to use your plot in a report or presentation, you must first save it to disk. Luckily, matplotlib makes it astonishingly easy to export plots into many formats (PDF, JPEG, PNG, BMP…) and at different resolutions. For this, we use the function savefig(). You specify the format through the extension of the file name (e.g. filename.pdf) and the resolution through the dots-per-inch (dpi); that’s all! Yes, it’s that easy!
x = np.linspace(0, 6, 100)
x2 = x**2
sin_x = np.sin(x)

plt.plot(x, x2, color='red', linestyle='dashed', label='x^2')
plt.legend(loc='lower left')

# This creates a new y-axis for the plot that comes after
plt.twinx()
plt.plot(x, sin_x, color='blue', label='sin(x)')
plt.legend(loc='lower right')

plt.savefig('simple-plot.png', dpi=150)
When you run this code, you will find the file saved in the same directory (folder) as the one your notebook lives in. If you want it saved elsewhere, you need to specify the path in more detail. For example:
Mac OS X
plt.savefig('~/Desktop/simple-plot.png', dpi=150)
Windows
plt.savefig('C://Desktop/simple-plot.png', dpi=150)
If I had wanted the plot saved in JPEG format, I would have used simple-plot.jpeg. Further, I would increase dpi if I wanted a higher resolution.
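Python's own file handling does not expand the ~ shortcut, and as far as I know savefig() does not either, so a path that starts with ~ may fail on your machine. One way around this is to build the path with pathlib. The sketch below uses a helper I have called save_figure() (hypothetical, not part of matplotlib) and writes into a temporary folder so that it runs anywhere; in practice you would pass something like '~/Desktop' instead.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a script; leave out in Jupyter
import matplotlib.pyplot as plt
import numpy as np


def save_figure(fig, folder, filename, dpi=150):
    """Hypothetical helper: save fig inside folder and return the full path."""
    folder = Path(folder).expanduser()          # turns '~' into your home directory
    folder.mkdir(parents=True, exist_ok=True)   # create the folder if it is missing
    path = folder / filename
    fig.savefig(path, dpi=dpi)                  # format is taken from the extension
    return path


x = np.linspace(0, 6, 100)
fig, ax = plt.subplots(figsize=(6, 4))          # figure size in inches
ax.plot(x, np.sin(x), color='blue', label='sin(x)')

demo_folder = tempfile.mkdtemp()                # stand-in for e.g. '~/Desktop'
png_path = save_figure(fig, demo_folder, 'simple-plot.png', dpi=150)
pdf_path = save_figure(fig, demo_folder, 'simple-plot.pdf')

# 6 x 4 inches at 150 dpi gives an image 900 pixels wide and 600 pixels tall
image = plt.imread(str(png_path))
print(image.shape[:2])                          # (height, width) in pixels
```

The last two lines also show what dpi does: the pixel size of the saved image is the figure size in inches multiplied by the dpi.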
6.2 A real example: Global Warming
6.2.1 Plotting data from files
Plotting data stored in a file (e.g. spreadsheet, text file, database) is a routine task for a scientist. In fact, the first thing you should do with any data is to look at it with a simple plot.
For the rest of this section, I will use the Earth’s land temperature data from the Berkeley Earth website. Please visit the site (Global Warming \(\rightarrow\) Data Overview) and download the average temperature data for Daily Land. The original name of the file should be Complete_TAVG_daily.txt.
data = np.loadtxt('Complete_TAVG_daily.txt', skiprows=24)
date = data[:, 0]
anomaly = data[:, -1]
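If you are unsure what skiprows and the column slicing do, it can help to try them on a tiny made-up "file" first. The sketch below fakes a file in memory with io.StringIO; the two header lines, the three columns and all the numbers are invented for illustration and are not taken from the Berkeley Earth data.

```python
import io

import numpy as np

# Made-up stand-in for a data file: two header lines, then three columns
fake_file = io.StringIO(
    "% header line 1\n"
    "% header line 2\n"
    "1880.001 1 -0.69\n"
    "1880.004 2 -0.56\n"
    "1880.007 3 -0.42\n"
)

data = np.loadtxt(fake_file, skiprows=2)  # skip the two header lines
date = data[:, 0]       # all rows, first column
anomaly = data[:, -1]   # all rows, last column

print(data.shape)
print(date)
print(anomaly)
```

For the real file, the idea is the same: skiprows=24 jumps over the header text, and the first and last columns are picked out by slicing.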
Time to plot.
plt.plot(date, anomaly, alpha=.5)
plt.ylim([-8, 8])
I have used a small alpha value to soften the colour of the plot and made the plot range symmetrical in the \(y\) direction.
Let’s add a horizontal line at the zero value to highlight the trend shown by the data. The hlines() function needs a \(y\)-value and starting and ending values for \(x\).
plt.hlines(0, date[0], date[-1], linestyle='--', colors='grey')
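As an aside, matplotlib also has axhline(), which draws a horizontal line across the full width of the axes without needing the start and end \(x\) values. The sketch below shows both side by side on synthetic data (a made-up trend plus random noise, so you can run it without the downloaded file).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a script; leave out in Jupyter
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the temperature data (not real measurements)
rng = np.random.default_rng(0)
date = np.linspace(1880, 2020, 500)
anomaly = 0.01 * (date - 1950) + rng.normal(0, 1, date.size)

plt.plot(date, anomaly, alpha=.5)
plt.ylim([-8, 8])

# hlines(): we supply the x-range ourselves
segment = plt.hlines(0, date[0], date[-1], linestyle='--', colors='grey')

# axhline(): spans the whole width of the axes, whatever the x-limits are
full_line = plt.axhline(0, linestyle=':', color='black', linewidth=.5)
```

Use hlines() when the line should cover only part of the plot, and axhline() when it should always run edge to edge.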
Unlike myself, most people (very sensibly) might just want a decent-looking plot without having to spend time customising it. To facilitate this, matplotlib offers some standard style templates (see here). I am going to use the one called fivethirtyeight.
plt.style.use('fivethirtyeight')
This line must be included right at the top!
Et voila! Do you see global warming?!
Here is the complete code for the final product.
plt.style.use('fivethirtyeight')

data = np.loadtxt('Complete_TAVG_daily.txt', skiprows=24)

date = data[:, 0]
anomaly = data[:, -1]

plt.plot(date, anomaly, alpha=.5)
plt.hlines(0, date[0], date[-1], linestyle='--', colors='grey')
plt.ylim([-8, 8])

plt.xlabel('Date')
plt.ylabel('Temperature Anomaly')
plt.title('Temperature anomaly\n(Relative to average from Jan 1951 - Dec 1980.)')
plt.show()
By the way, if you want to reset things and jump out of this style, you need to set the default style using:
plt.style.use('default')
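If you only want a style for a single plot, a possible alternative to switching back and forth is plt.style.context(), which applies the style inside a with block and restores the previous settings afterwards. The sketch below also checks plt.style.available, the list of style names your installation knows about. The variables grid_before, styled_grid and grid_after are just there to show that the setting really changes and changes back.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a script; leave out in Jupyter
import matplotlib.pyplot as plt
import numpy as np

# Is the style we want installed?
print('fivethirtyeight' in plt.style.available)

x = np.linspace(0, 6, 100)
grid_before = plt.rcParams['axes.grid']

# The style applies only inside this block
with plt.style.context('fivethirtyeight'):
    fig, ax = plt.subplots()
    ax.plot(x, np.sin(x))
    styled_grid = plt.rcParams['axes.grid']   # this style switches the grid on

grid_after = plt.rcParams['axes.grid']        # back to whatever it was before
```

This keeps the rest of your notebook unaffected, so there is nothing to reset.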
6.2.2 xkcd!
Okay, since we are talking about styles, I must tell you that the developers of matplotlib have a healthy sense of humour and have included the option of making your plots in the xkcd style. To enable this, just run plt.xkcd() instead of setting a style. Here is what the previous plot looks like using the xkcd style. Cool!
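Calling plt.xkcd() on its own affects every plot that follows. If you want the hand-drawn look for one figure only, plt.xkcd() can also be used in a with block, as in this small sketch with a sine curve standing in for the temperature data. Depending on the fonts installed on your machine, matplotlib may warn that it cannot find the comic-style font and fall back to another one.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a script; leave out in Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 6, 100)
sketch_before = plt.rcParams['path.sketch']

# The xkcd look applies only inside this block
with plt.xkcd():
    fig, ax = plt.subplots()
    ax.plot(x, np.sin(x))
    ax.set_title('sin(x), hand-drawn')
    sketch_inside = plt.rcParams['path.sketch']   # the wobbly-line setting is now on

sketch_after = plt.rcParams['path.sketch']        # restored once the block ends
```

In an interactive session, put plt.show() inside the with block so that the figure is drawn while the xkcd settings are still active.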