Python 3


Matplotlib

Please note that this is old material. The latest material can be found in a Jupyter Notebook for this session.





Matplotlib documentation

(my slides are the clearest though)


Matplotlib documentation can be found here: http://matplotlib.org/

A very good rundown of features is in Python for Data Analysis, 2nd Edition (PDF), Chapter 9.

A clear tutorial on the central plotting module, pyplot (part of which was used for this presentation), can be found here: https://matplotlib.org/users/pyplot_tutorial.html





Plotting in a Python Script

Use plt.plot() to plot; plt.savefig() to save as an image file.


Python script using the pyplot module:

import matplotlib.pyplot as plt
import numpy as np

line_1_data = [1, 2, 3, 2, 4, 3, 5, 4, 6]
line_2_data = [6, 4, 5, 3, 4, 2, 3, 2, 1]

plt.plot(line_1_data)          # plot 1st line
plt.plot(line_2_data)          # plot 2nd line

plt.savefig('linechart.png')   # use any image extension
                               # for an image of that type
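
A script that runs without a display (on a server, for example) can select a non-interactive backend before importing pyplot. The sketch below assumes the Agg backend is suitable and writes to the system temp directory; the labels and the output file name are illustrative choices, not part of the original example.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")            # non-interactive backend: renders to files only
import matplotlib.pyplot as plt

line_1_data = [1, 2, 3, 2, 4, 3, 5, 4, 6]
line_2_data = [6, 4, 5, 3, 4, 2, 3, 2, 1]

plt.plot(line_1_data, label="line 1")   # label is picked up by plt.legend()
plt.plot(line_2_data, label="line 2")
plt.xlabel("index")
plt.ylabel("value")
plt.title("Two lines")
plt.legend()

# hypothetical output location; any writable path works
out_path = os.path.join(tempfile.gettempdir(), "linechart_labeled.png")
plt.savefig(out_path)
plt.close()                      # release the figure once it is saved
```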




Plotting in a Jupyter Notebook

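Jupyter notebook session using the pyplot module

# load matplotlib visualization functionality
# (in JupyterLab or Notebook 7+, %matplotlib inline or %matplotlib widget may be needed instead)
%matplotlib notebook
import matplotlib.pyplot as plt
import numpy as np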

linedata = np.random.randn(1000).cumsum()

plt.plot(linedata)

Any calls to .plot() will display the figure in Jupyter.





The Figure and Subplot Objects

The figure represents the overall image; a figure may contain multiple subplots.


Here we establish a figure with one subplot. The subplot object ax can then be used to plot.


import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)            # this figure will have one subplot
                                         # 1 row, 1 column, position 1 within that
ax.plot(np.random.randn(1000).cumsum())
ax.plot(np.random.randn(1000).cumsum())




Multiple Subplots Within a Figure

We may create a column of 3 plots, a 2x2 grid of 4 plots, etc.


import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()
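ax1 = fig.add_subplot(2, 2, 1)                   # 2x2 grid, position 1 (top left)
ax1.plot(np.random.randn(50).cumsum(), 'k--')    # black dashed line
ax2 = fig.add_subplot(2, 2, 2)                   # position 2 (top right)
ax2.plot(np.random.randn(50).cumsum(), 'b-')     # blue solid line
ax3 = fig.add_subplot(2, 2, 3)                   # position 3 (bottom left)
ax3.plot(np.random.randn(50).cumsum(), 'r.')     # red point markers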

Establishing a grid of subplots with pyplot.subplots()


import matplotlib.pyplot as plt
fig, axes = plt.subplots(2, 3)

print(axes)
   # array([ [ <matplotlib.axes._subplots.AxesSubplot object at 0x7fb626374048>,
   #           <matplotlib.axes._subplots.AxesSubplot object at 0x7fb62625db00>,
   #           <matplotlib.axes._subplots.AxesSubplot object at 0x7fb6262f6c88> ],
   #         [ <matplotlib.axes._subplots.AxesSubplot object at 0x7fb6261a36a0>,
   #           <matplotlib.axes._subplots.AxesSubplot object at 0x7fb626181860>,
   #           <matplotlib.axes._subplots.AxesSubplot object at 0x7fb6260fd4e0> ] ],
   #           dtype=object)
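
The array returned by plt.subplots() can be indexed by row and column to reach each subplot. A minimal sketch, assuming a 2x3 grid with shared axes; the random data, the seed, and the titles are illustrative only.

```python
import numpy as np
import matplotlib.pyplot as plt

# 2 rows, 3 columns; sharex/sharey give every subplot the same axis ranges
fig, axes = plt.subplots(2, 3, sharex=True, sharey=True)

rng = np.random.default_rng(0)          # seeded so the plot is repeatable
for row in range(2):
    for col in range(3):
        ax = axes[row, col]             # pick one subplot out of the grid
        ax.plot(rng.standard_normal(50).cumsum())
        ax.set_title(f"row {row}, col {col}")

fig.tight_layout()                      # keep titles and tick labels from overlapping
```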

A fine discussion can be found at http://www.labri.fr/perso/nrougier/teaching/matplotlib/matplotlib.html#figures-subplots-axes-and-ticks





Line Plotting along 1 or 2 axes

plot() with a single list plots the values along the y axis, indexed by the list indices along the x axis (0-3):

fig = plt.figure()
ax1 = fig.add_subplot(1, 1, 1)
ax1.plot([1, 2, 3, 4])         # indexed against 0, 1, 2, 3 on x axis

With two lists, plot() uses the first list for positions along the x axis and the second list for the values along the y axis:

fig = plt.figure()
ax1 = fig.add_subplot(1, 1, 1)
ax1.plot([10, 20, 30, 40], [1, 4, 9, 16])
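
To check which list ended up on which axis, the line object returned by plot() can be inspected. A small sketch; the variable names are illustrative.

```python
import matplotlib.pyplot as plt

x_values = [10, 20, 30, 40]
y_values = [1, 4, 9, 16]

fig, ax = plt.subplots()
(line,) = ax.plot(x_values, y_values)    # first argument is x, second is y

print(line.get_xdata().tolist())         # the positions along the x axis
print(line.get_ydata().tolist())         # the values along the y axis
```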




Adding Seaborn to any Plot

Importing seaborn and applying its theme makes plots look better.


import seaborn as sns
import matplotlib.pyplot as plt

sns.set()      # apply seaborn's default theme (required since seaborn 0.8)

barvals = [10, 30, 20, 40, 30, 50]
barpos = [0, 1, 2, 3, 4, 5]

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.bar(barpos, barvals)

seaborn is an add-on library that can make any matplotlib plot more attractive, through the use of muted colors and additional styles. The library can be used for detailed control of style, but simply applying its default theme with sns.set() provides a distinct improvement over the default primary colors. (Before seaborn 0.8, importing the library was enough to apply the theme.)





Line Color, Style, Markers

import matplotlib.pyplot as plt

line_1_data = [1, 2, 3, 2, 4, 3, 5, 4, 6]
line_2_data = [6, 4, 5, 3, 4, 2, 3, 2, 1]

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(line_1_data, linestyle='dotted', color='red')
ax.plot(line_2_data, linestyle='dashed', color='green', marker='o')


Line Styles and Markers

'-'     solid line style
'--'    dashed line style
'-.'    dash-dot line style
':'     dotted line style
'.'     point marker
','     pixel marker
'o'     circle marker
'v'     triangle_down marker
'^'     triangle_up marker
'<'     triangle_left marker
'>'     triangle_right marker
'1'     tri_down marker
'2'     tri_up marker
'3'     tri_left marker
'4'     tri_right marker
's'     square marker
'p'     pentagon marker
'*'     star marker
'h'     hexagon1 marker
'H'     hexagon2 marker
'+'     plus marker
'x'     x marker
'D'     diamond marker
'd'     thin_diamond marker
'|'     vline marker
'_'     hline marker


Line Colors

'b'     blue
'g'     green
'r'     red
'c'     cyan
'm'     magenta
'y'     yellow
'k'     black
'w'     white
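
The codes in these tables can be combined into a single format string passed to plot(), as a shorthand for the color, linestyle and marker keyword arguments. A minimal sketch showing both spellings side by side; the data values are illustrative.

```python
import matplotlib.pyplot as plt

line_1_data = [1, 2, 3, 2, 4, 3, 5, 4, 6]
line_2_data = [6, 4, 5, 3, 4, 2, 3, 2, 1]

fig, ax = plt.subplots()

# format string: 'r' = red, '--' = dashed, 'o' = circle marker
(line_a,) = ax.plot(line_1_data, 'r--o')

# the same styling spelled out with keyword arguments
(line_b,) = ax.plot(line_2_data, color='red', linestyle='--', marker='o')
```

The keyword form is longer but also accepts color names and values that have no one-letter code.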





Setting Axis Ticks and Tick Range

import matplotlib.pyplot as plt

ydata = [1, 2, 3, 2, 4, 3, 5, 4, 6]
xdata = [0, 10, 20, 30, 40, 50, 60, 70, 80]    # x positions (if no list is passed for x, the default is 0, 1, 2, ...)

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)

ax.set_yticks([2, 4, 6, 8])
ax.set_xticks([0, 25, 50, 75, 100])

ax.set_ylim(0, 10)
ax.set_xlim(0, 100)

ax.set_xticklabels(['zero', 'twenty-five', 'fifty', 'seventy-five', 'one hundred'],
                   rotation=30, fontsize='small')

line1, = ax.plot(xdata, ydata)
line2, = ax.plot([i+10 for i in xdata], ydata)

ax.legend([line1, line2], ['this line', 'that line'])

By default, matplotlib chooses the ticks on each axis automatically from the data: the x axis (horizontal) from the first list passed to plot(), the y axis (vertical) from the second. The calls above override those defaults.

setting the tick range limit


ax.set_ylim(0, 10)
ax.set_xlim(0, 100)

setting the ticks specifically


ax.set_yticks([2, 4, 6, 8])
ax.set_xticks([0, 25, 50, 75, 100])

setting tick labels


ax.set_xticklabels(['zero', 'twenty-five', 'fifty', 'seventy-five', 'one hundred'],
                    rotation=30, fontsize='small')

plt.grid(True) to add a grid to the figure


plt.grid(True)

setting a legend


line1, = ax.plot(xdata, ydata)
line2, = ax.plot([i+10 for i in xdata], ydata)

ax.legend([line1, line2], ['this line', 'that line'])
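
An alternative way to build the legend is to label each line when it is plotted and then call ax.legend() with no line handles. A sketch under the same data as above; the loc value is an illustrative choice.

```python
import matplotlib.pyplot as plt

ydata = [1, 2, 3, 2, 4, 3, 5, 4, 6]
xdata = [0, 10, 20, 30, 40, 50, 60, 70, 80]

fig, ax = plt.subplots()

# label= attaches the legend text to each line as it is drawn
ax.plot(xdata, ydata, label='this line')
ax.plot([x + 10 for x in xdata], ydata, label='that line')

ax.set_xlim(0, 100)                     # tick range
ax.set_ylim(0, 10)
ax.set_xticks([0, 25, 50, 75, 100])     # explicit tick positions
ax.grid(True)                           # grid on this subplot only

legend = ax.legend(loc='upper left')    # collects the labeled lines
```

Labeling at plot time keeps each legend entry next to the line it describes, which helps when lines are added or removed later.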




Saving to File

fig.savefig() saves the figure to a file.


fig.savefig('myfile.png')

The filename extension of a saved figure determines the filetype.

print(fig.canvas.get_supported_filetypes())

    # {'eps': 'Encapsulated Postscript',
    #  'pdf': 'Portable Document Format',
    #  'pgf': 'PGF code for LaTeX',
    #  'png': 'Portable Network Graphics',
    #  'ps': 'Postscript',
    #  'raw': 'Raw RGBA bitmap',
    #  'rgba': 'Raw RGBA bitmap',
    #  'svg': 'Scalable Vector Graphics',
    #  'svgz': 'Scalable Vector Graphics'}
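
savefig() also accepts options that control the output. The sketch below assumes a resolution of 150 dpi and tight cropping are wanted, and writes to an in-memory buffer instead of a file name, in which case the format must be given explicitly.

```python
import io

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 2, 4, 3, 5, 4, 6])

buf = io.BytesIO()                       # file-like object instead of a path
fig.savefig(buf,
            format='png',                # no extension to infer the type from
            dpi=150,                     # resolution in dots per inch
            bbox_inches='tight')         # crop extra whitespace around the plot

png_bytes = buf.getvalue()
print(len(png_bytes), 'bytes written')
```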




Visualizing pandas Series and DataFrame

pandas has fully incorporated matplotlib into its API.


pandas Series objects have a plot() method that plots the values against the index

import pandas as pd
import numpy as np

ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
ts = ts.cumsum()
ts.plot(kind="line")   # "line" is default

pandas DataFrames also have a .plot() method that plots multiple lines


import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4]})
df.plot()

pandas DataFrames also have a set of df.plot.* methods, one for each type of chart.


df.plot.area     df.plot.barh     df.plot.density  df.plot.hist     df.plot.line     df.plot.scatter
df.plot.bar      df.plot.box      df.plot.hexbin   df.plot.kde      df.plot.pie
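
Each of these methods returns a matplotlib subplot (Axes) object, so a pandas plot can be adjusted with the same calls used elsewhere in these slides. A minimal sketch, assuming a bar chart is wanted; the axis label and title are illustrative.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4]})

fig, ax = plt.subplots()
returned_ax = df.plot.bar(ax=ax)     # draw onto our own subplot; one bar group per row

ax.set_ylabel('value')               # ordinary matplotlib calls still apply
ax.set_title('Columns a and b by row')
```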

The pandas visualization documentation can be found here: http://pandas.pydata.org/pandas-docs/stable/visualization.html





Bar Charts

plt.bar() draws one bar per position, with bar heights taken from a list of values.


import matplotlib.pyplot as plt
import numpy as np

langs =    ('Python', 'C++', 'Java', 'Perl', 'Scala', 'Lisp')
langperf = [     10,     8,      6,      4,       2,      1]

y_pos = np.arange(len(langs))

plt.bar(y_pos, langperf, align='center', alpha=0.5)
plt.xticks(y_pos, langs)
plt.ylabel('Usage')
plt.title('Programming language usage')




Pie Charts

Pie charts set slice values as portions of a summed whole


import numpy as np
import matplotlib.pyplot as plt
plt.pie([2, 3, 10, 20])
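
The "portion of a summed whole" can be computed by hand to see what each slice represents. A sketch that works out the fractions with numpy and uses them as slice labels; the label format is an illustrative choice.

```python
import numpy as np
import matplotlib.pyplot as plt

values = np.array([2, 3, 10, 20])
fractions = values / values.sum()        # each value as a share of the total (35)

# label each slice with its value and its share of the whole
labels = [f"{v} ({f:.1%})" for v, f in zip(values, fractions)]

fig, ax = plt.subplots()
ax.pie(values, labels=labels)
ax.set_aspect('equal')                   # keep the pie circular
```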




Scatterplot

Scatterplots set points at x, y coordinates, at varying sizes and colors.


import matplotlib.pyplot as plt
import numpy as np

N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
area = np.pi * (15 * np.random.rand(N))**2  # 0 to 15 point radii

plt.scatter(x, y, s=area, c=colors, alpha=0.5)
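
When point colors carry data, a colorbar shows how the numbers map to colors. A sketch assuming the viridis colormap and a seeded random generator for repeatability; both are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)          # seeded so the plot is repeatable
N = 50
x = rng.random(N)
y = rng.random(N)
colors = rng.random(N)                   # one number per point, mapped to a color
area = np.pi * (15 * rng.random(N)) ** 2 # marker sizes, in points squared

fig, ax = plt.subplots()
points = ax.scatter(x, y, s=area, c=colors, alpha=0.5, cmap='viridis')
fig.colorbar(points, ax=ax, label='color value')   # adds a second axes for the bar
```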



