PLEASE NOTE THAT THIS IS OLD MATERIAL. Latest material can be found in a Jupyter Notebook for this session.
(my slides are the clearest though)
Matplotlib documentation can be found here: http://matplotlib.org/
A very good rundown of features is in the Python for Data Analysis 2nd Edition PDF, Chapter 9.
A clear tutorial on the central plotting module pyplot (part of which was used for this presentation) can be found here: https://matplotlib.org/users/pyplot_tutorial.html
Use plt.plot() to plot; plt.savefig() to save as an image file.
Python script using pyplot object:
import matplotlib.pyplot as plt
import numpy as np
line_1_data = [1, 2, 3, 2, 4, 3, 5, 4, 6]
line_2_data = [6, 4, 5, 3, 4, 2, 3, 2, 1]
plt.plot(line_1_data) # plot 1st line
plt.plot(line_2_data) # plot 2nd line
plt.savefig('linechart.png') # use any image extension
# for an image of that type
Jupyter notebook session using pyplot object
# load matplotlib visualization functionality
%matplotlib notebook
import matplotlib.pyplot as plt
import numpy as np
linedata = np.random.randn(1000).cumsum()
plt.plot(linedata)
Any calls to .plot() will display the figure in Jupyter.
The figure represents the overall image; a figure may contain multiple subplots.
Here we establish a figure with one subplot. The subplot object ax is then used to plot:
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1) # this figure will have one subplot
# 1 row, 1 column, position 1 within that
ax.plot(np.random.randn(1000).cumsum())
ax.plot(np.random.randn(1000).cumsum())
We may create a column of 3 plots, a 2x2 grid of 4 plots, etc.
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 1) # 2x2 grid, position 1
ax1.plot(np.random.randn(50).cumsum(), 'k--')
ax2 = fig.add_subplot(2, 2, 2) # 2x2 grid, position 2
ax2.plot(np.random.randn(50).cumsum(), 'b-')
ax3 = fig.add_subplot(2, 2, 3) # 2x2 grid, position 3
ax3.plot(np.random.randn(50).cumsum(), 'r.')
Establishing a grid of subplots with pyplot.subplots()
import matplotlib.pyplot as plt
fig, axes = plt.subplots(2, 3)
print(axes)
# array([ [ <matplotlib.axes._subplots.AxesSubplot object at 0x7fb626374048>,
# <matplotlib.axes._subplots.AxesSubplot object at 0x7fb62625db00>,
# <matplotlib.axes._subplots.AxesSubplot object at 0x7fb6262f6c88> ],
# [ <matplotlib.axes._subplots.AxesSubplot object at 0x7fb6261a36a0>,
# <matplotlib.axes._subplots.AxesSubplot object at 0x7fb626181860>,
# <matplotlib.axes._subplots.AxesSubplot object at 0x7fb6260fd4e0> ] ],
# dtype=object)
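As a minimal sketch of how the returned grid might be used: the array can be indexed by [row, column], or walked with .flat. The Agg backend, the seeded random generator and the subplot titles are assumptions added here so the example runs without a display and gives repeatable data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, only for repeatable data

fig, axes = plt.subplots(2, 3)  # 2 rows, 3 columns of subplots
print(axes.shape)               # (2, 3)

# pick a single subplot by [row, column]
axes[0, 1].plot(rng.standard_normal(50).cumsum(), 'k--')

# or visit every subplot in row-major order with .flat
for i, ax in enumerate(axes.flat):
    ax.set_title('subplot {}'.format(i))
```

Indexing by row and column keeps the code readable for small grids; .flat is convenient when every subplot gets the same treatment.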
A fine discussion can be found at http://www.labri.fr/perso/nrougier/teaching/matplotlib/matplotlib.html#figures-subplots-axes-and-ticks
plot() with a single list plots the values along the y axis, indexed by the list indices along the x axis (0-3):
fig = plt.figure()
ax1 = fig.add_subplot(1, 1, 1)
ax1.plot([1, 2, 3, 4]) # indexed against 0, 1, 2, 3 on x axis
With two lists, plot() uses the first list as the x axis values and the second list as the y axis values:
fig = plt.figure()
ax1 = fig.add_subplot(1, 1, 1)
ax1.plot([10, 20, 30, 40], [1, 4, 9, 16])
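One way to check which list ends up on which axis is to ask the returned line object for its data. This is only a small sketch: the Agg backend is an assumption for running without a display, and the names xs and ys are introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

fig = plt.figure()
ax1 = fig.add_subplot(1, 1, 1)

# plot() returns a list of line objects; unpack the single line
line, = ax1.plot([10, 20, 30, 40], [1, 4, 9, 16])

xs = [int(v) for v in line.get_xdata()]  # values placed along the x axis
ys = [int(v) for v in line.get_ydata()]  # values placed along the y axis
print(xs)  # [10, 20, 30, 40]
print(ys)  # [1, 4, 9, 16]
```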
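Applying seaborn's default theme makes plots look better.
import seaborn as sns
sns.set() # apply the default theme (since seaborn 0.8, importing alone no longer restyles plots)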
import matplotlib.pyplot as plt
barvals = [10, 30, 20, 40, 30, 50]
barpos = [0, 1, 2, 3, 4, 5]
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.bar(barpos, barvals)
seaborn is an add-on library that can make any matplotlib plot more attractive, through the use of muted colors and additional styles. The library can be used for detailed control of style. In versions before 0.8, simply importing it restyled every plot; current versions need a call such as sns.set() to apply the theme, which provides a distinct improvement over the default primary colors.
import matplotlib.pyplot as plt
line_1_data = [1, 2, 3, 2, 4, 3, 5, 4, 6]
line_2_data = [6, 4, 5, 3, 4, 2, 3, 2, 1]
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(line_1_data, linestyle='dotted', color='red')
ax.plot(line_2_data, linestyle='dashed', color='green', marker='o')
Line Styles and Markers
'-' | solid line style |
'--' | dashed line style |
'-.' | dash-dot line style |
':' | dotted line style |
'.' | point marker |
',' | pixel marker |
'o' | circle marker |
'v' | triangle_down marker |
'^' | triangle_up marker |
'<' | triangle_left marker |
'>' | triangle_right marker |
'1' | tri_down marker |
'2' | tri_up marker |
'3' | tri_left marker |
'4' | tri_right marker |
's' | square marker |
'p' | pentagon marker |
'*' | star marker |
'h' | hexagon1 marker |
'H' | hexagon2 marker |
'+' | plus marker |
'x' | x marker |
'D' | diamond marker |
'd' | thin_diamond marker |
'|' | vline marker |
'_' | hline marker |
Colors
'b' | blue |
'g' | green |
'r' | red |
'c' | cyan |
'm' | magenta |
'y' | yellow |
'k' | black |
'w' | white |
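The codes in these tables can also be combined into a single format string passed right after the data, as in the 'k--' and 'r.' strings used earlier. The sketch below assumes the Agg backend so it runs without a display; the data values and the names line_a and line_b are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)

# format string = color code + marker code + line style code
line_a, = ax.plot([1, 2, 3, 2, 4], 'ro--')  # red, circle markers, dashed line
line_b, = ax.plot([4, 2, 3, 1, 2], 'g^:')   # green, triangle_up markers, dotted line

# the same settings can be read back from the line objects
print(line_a.get_marker(), line_a.get_linestyle())
print(line_b.get_marker(), line_b.get_linestyle())
```

The keyword form (linestyle=, color=, marker=) is more explicit; the format string is a compact alternative for quick plots.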
import matplotlib.pyplot as plt
ydata = [1, 2, 3, 2, 4, 3, 5, 4, 6]
xdata = [0, 10, 20, 30, 40, 50, 60, 70, 80] # (if no list is passed for x, the default is the indices 0-8)
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.set_yticks([2, 4, 6, 8])
ax.set_xticks([0, 25, 50, 75, 100])
ax.set_ylim(0, 10)
ax.set_xlim(0, 100)
ax.set_xticklabels(['zero', 'twenty-five', 'fifty', 'seventy-five', 'one hundred'],
rotation=30, fontsize='small')
line1, = ax.plot(xdata, ydata)
line2, = ax.plot([i+10 for i in xdata], ydata)
ax.legend([line1, line2], ['this line', 'that line'])
By default, the ticks on the x axis (horizontal) are set based on the data values of the first list passed, and the ticks on the y axis (vertical) are set based on the data values of the second list passed.
setting the axis range limits
ax.set_ylim(0, 10)
ax.set_xlim(0, 100)
setting the ticks specifically
ax.set_yticks([2, 4, 6, 8])
ax.set_xticks([0, 25, 50, 75, 100])
setting tick labels
ax.set_xticklabels(['zero', 'twenty-five', 'fifty', 'seventy-five', 'one hundred'],
rotation=30, fontsize='small')
plt.grid(True) to add a grid to the figure
plt.grid(True)
setting a legend
line1, = ax.plot(xdata, ydata)
line2, = ax.plot([i+10 for i in xdata], ydata)
ax.legend([line1, line2], ['this line', 'that line'])
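An alternative way to build the legend, sketched here under the assumption that each line is given a label= keyword when plotted: ax.legend() then picks the labels up without needing the line handles. The Agg backend and the loc='upper left' placement are choices made for this example only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

xdata = [0, 10, 20, 30, 40, 50, 60, 70, 80]
ydata = [1, 2, 3, 2, 4, 3, 5, 4, 6]

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)

# attach the legend text to each line as it is plotted
ax.plot(xdata, ydata, label='this line')
ax.plot([i + 10 for i in xdata], ydata, label='that line')

ax.set_xlim(0, 100)
ax.set_ylim(0, 10)
ax.grid(True)  # grid on this subplot only

legend = ax.legend(loc='upper left')  # labels are taken from the plotted lines
print([t.get_text() for t in legend.get_texts()])  # ['this line', 'that line']
```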
fig.savefig() saves the figure to a file.
fig.savefig('myfile.png')
The filename extension of a saved figure determines the filetype.
print(fig.canvas.get_supported_filetypes())
# {'eps': 'Encapsulated Postscript',
# 'pdf': 'Portable Document Format',
# 'pgf': 'PGF code for LaTeX',
# 'png': 'Portable Network Graphics',
# 'ps': 'Postscript',
# 'raw': 'Raw RGBA bitmap',
# 'rgba': 'Raw RGBA bitmap',
# 'svg': 'Scalable Vector Graphics',
# 'svgz': 'Scalable Vector Graphics'}
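As a rough sketch, the same figure can be written out in several formats just by changing the extension. The temporary output directory, the dpi value and the Agg backend are assumptions made so the example is self-contained.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot([1, 2, 3, 2, 4, 3, 5, 4, 6])

outdir = tempfile.mkdtemp()  # throwaway directory for the example
saved = []
for ext in ('png', 'pdf', 'svg'):
    path = os.path.join(outdir, 'myfile.' + ext)
    # the extension selects the file type; dpi matters mainly for raster output such as png
    fig.savefig(path, dpi=150)
    saved.append(path)

print([os.path.basename(p) for p in saved])  # ['myfile.png', 'myfile.pdf', 'myfile.svg']
```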
pandas has fully incorporated matplotlib into its API.
pandas Series objects have a plot() method that draws the Series using matplotlib.
import pandas as pd
import numpy as np
ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
ts = ts.cumsum()
ts.plot(kind="line") # "line" is default
pandas DataFrames also have a .plot() method that plots multiple lines
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4]})
df.plot()
pandas DataFrames also have a set of methods that create the type of chart desired:
df.plot.area, df.plot.bar, df.plot.barh, df.plot.box, df.plot.density, df.plot.hexbin, df.plot.hist, df.plot.kde, df.plot.line, df.plot.pie, df.plot.scatter
The pandas visualization documentation can be found here: http://pandas.pydata.org/pandas-docs/stable/visualization.html
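A short sketch of one of these chart methods, using the same small DataFrame as above. The plot methods return a matplotlib Axes object, so the tick, label and title calls shown earlier still apply. The Agg backend and the label text are assumptions for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4]})

ax = df.plot.bar()  # equivalent to df.plot(kind='bar'); returns a matplotlib Axes
ax.set_ylabel('value')
ax.set_title('a and b by row')

fig = ax.get_figure()  # the Figure that pandas created, e.g. for fig.savefig()
print(len(ax.patches))  # 6 bars: 3 rows x 2 columns
```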
Bar charts draw one bar per value at the given positions
import matplotlib.pyplot as plt
import numpy as np
langs = ('Python', 'C++', 'Java', 'Perl', 'Scala', 'Lisp')
langperf = [ 10, 8, 6, 4, 2, 1]
y_pos = np.arange(len(langs))
plt.bar(y_pos, langperf, align='center', alpha=0.5)
plt.xticks(y_pos, langs)
plt.ylabel('Usage')
plt.title('Programming language usage')
Pie charts set slice values as portions of a summed whole
import numpy as np
import matplotlib.pyplot as plt
plt.pie([2, 3, 10, 20])
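To make the "portions of a summed whole" idea concrete, the sketch below adds slice labels and percentage text to the same values. The labels 'a' to 'd' are hypothetical, and the Agg backend is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

values = [2, 3, 10, 20]        # sum is 35, so each slice is value / 35 of the pie
labels = ['a', 'b', 'c', 'd']  # hypothetical slice names

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)

# autopct formats each slice's share of the total as a percentage
wedges, texts, autotexts = ax.pie(values, labels=labels, autopct='%1.1f%%')
ax.axis('equal')  # equal aspect ratio keeps the pie circular

print([t.get_text() for t in autotexts])  # ['5.7%', '8.6%', '28.6%', '57.1%']
```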
Scatterplots set points at x,y coordinates, at varying sizes and colors
import matplotlib.pyplot as plt
import numpy as np
N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
area = np.pi * (15 * np.random.rand(N))**2 # 0 to 15 point radii
plt.scatter(x, y, s=area, c=colors, alpha=0.5)