Python Matplotlib
Pandas has plotting commands as well. These are just thin wrappers around matplotlib.
1. Do a scatter plot for columns 'A' and 'B' from df.
2. Do a scatter_matrix of all numeric columns in df.
# 1. Scatter plot
# Syntax 1 (matplotlib style)
df.plot('A', 'B', kind='scatter')
# Syntax 2 (python / pandas style)
df.plot.scatter('A', 'B')
# 2. Scatter matrix
pd.plotting.scatter_matrix(df)
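A minimal runnable sketch of both pandas commands. The DataFrame contents, the extra column 'C', and the choice of the non-interactive Agg backend are assumptions made only so the example runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names 'A' and 'B' come from the card.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["A", "B", "C"])

# Scatter plot of column 'A' against column 'B'; returns a matplotlib Axes.
ax = df.plot.scatter(x="A", y="B")

# Grid of pairwise scatter plots for all numeric columns;
# returns a 2D NumPy array of Axes (one row and column per variable).
axes = pd.plotting.scatter_matrix(df)
```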
Create a 1 x 2 grid with two subplots that are unpacked as ax1 and ax2
fig, (ax1, ax2) = plt.subplots(1, 2)
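As a sketch of the argument order: `plt.subplots(nrows, ncols)` takes rows first, so a 1 x 2 grid (one row, two columns) is `plt.subplots(1, 2)`, while `plt.subplots(2, 1)` would stack the two plots vertically. The plotted data and titles below are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

# One row, two columns: the axes come back as a 1D array that can be unpacked.
fig, (ax1, ax2) = plt.subplots(1, 2)

ax1.plot([0, 1, 2], [0, 1, 4])
ax1.set_title("left")
ax2.plot([0, 1, 2], [4, 1, 0])
ax2.set_title("right")
```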
Write this code in one line:
fig = plt.figure()
ax = fig.add_subplot(111)
fig, ax = plt.subplots(1, 1)
6 basic types of charts
plot([x], y)
bar(x, y)
barh(x, y)
hist(x)
scatter(x, y)
boxplot(x)
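The six chart types can be tried side by side in one figure. This is only a sketch: the data values, the 2 x 3 layout, and the figure size are assumptions. Note that `boxplot` takes the sample data as a single argument.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: five (x, y) points plus 200 random samples.
rng = np.random.default_rng(0)
x = np.arange(5)
y = np.array([3, 7, 2, 9, 4])
samples = rng.normal(size=200)

fig, axes = plt.subplots(2, 3, figsize=(9, 5))
axes[0, 0].plot(x, y)        # line plot
axes[0, 1].bar(x, y)         # vertical bars
axes[0, 2].barh(x, y)        # horizontal bars
axes[1, 0].hist(samples)     # histogram of one variable
axes[1, 1].scatter(x, y)     # scatter plot
axes[1, 2].boxplot(samples)  # box plot of one variable
fig.tight_layout()
```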
SYN: plot a simple histogram
plt.hist(x)
plt.hist(x, bins=20)
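A small sketch of a histogram call with an explicit bin count. The function for this is `plt.hist`; the sample data here is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample data: 1000 draws from a standard normal distribution.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)

# hist() returns the bin counts, the bin edges, and the drawn bar patches.
counts, bin_edges, patches = plt.hist(x, bins=20)
```

With 20 bins there are 21 bin edges, and the counts add up to the number of samples.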
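What is the difference between plt.subplot() and plt.subplots()?
plt.subplot() adds a single axes at one position of a grid in the current figure, makes it the current axes, and returns it. For example: ax2 = plt.subplot(2, 1, 2)
plt.subplots() creates a new figure together with the full grid of axes and returns a tuple with two elements, the figure and all axes: fig, (ax1, ax2) = plt.subplots(1, 2)
Related: fig.add_subplot() is the Figure method for adding one axes to an existing figure. There is no plt.add_subplot().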
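The difference in return values can be sketched as follows. The variable names are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

# plt.subplot(): adds ONE axes to the current figure and returns it.
fig1 = plt.figure()
ax_bottom = plt.subplot(2, 1, 2)  # 2 rows x 1 column grid, second cell
n_axes_fig1 = len(fig1.axes)      # only the requested cell exists so far

# plt.subplots(): creates a new figure AND all axes of the grid at once.
fig2, (ax1, ax2) = plt.subplots(1, 2)

# fig.add_subplot(): the Figure method for adding a single axes.
fig3 = plt.figure()
ax_single = fig3.add_subplot(1, 1, 1)
```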
How do you create a grid with 3 rows and 2 columns?
plt.subplots(3, 2)
# Or, if we want to access the figure and/or axes directly afterwards:
fig, ([ax1, ax2], [ax3, ax4], [ax5, ax6]) = plt.subplots(3, 2)
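Instead of unpacking six names, the axes can also be kept as one array and indexed by row and column. A short sketch, with made-up panel titles:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

# For a 3 x 2 grid, `axes` is a 2D NumPy array with shape (3, 2).
fig, axes = plt.subplots(3, 2)

# axes.flat walks the grid row by row, left to right.
for i, ax in enumerate(axes.flat):
    ax.set_title(f"panel {i + 1}")

# Individual cells are addressed as axes[row, column].
last_title = axes[2, 1].get_title()
```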
Add a title and axis labels to a plot
plt.xlabel('the x axis')
plt.ylabel('the y axis')
plt.title('spiffy title')
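The same labels can be set through the object-oriented interface, where the methods live on the Axes and carry a `set_` prefix. A minimal sketch with hypothetical data:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Axes-level equivalents of plt.xlabel, plt.ylabel, and plt.title.
ax.set_xlabel("the x axis")
ax.set_ylabel("the y axis")
ax.set_title("spiffy title")
```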
The main module in matplotlib that lets you plot Matlab style?
pyplot
How do you draw a horizontal bar chart with user-friendly categorical labels (let's say the base data is just numeric)?
# Use the barh chart type
# Create a names list. Then set it using plt.yticks
x = range(5)
y = [15, 34, 45, 77, 82]
names = ['shoes', 'shirts', 'garden', 'grocery', 'service']
plt.yticks(x, names)
plt.barh(x, y)
What is a figure, axes and axis? How are they related?
figure: the whole drawing area. my_fig = plt.figure()
axes: what you think of as a "plot". A figure may have several axes.
axis: each (2D) axes has two axis objects, the x-axis and the y-axis. They carry the ticks and tick labels.
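A sketch of how the three objects relate in code. The tick positions and label text are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

# One figure holding two axes ("plots").
fig, (ax_left, ax_right) = plt.subplots(1, 2)
ax_left.plot([0, 1, 2], [0, 1, 4])

# Each axes owns an x-axis and a y-axis object, which manage ticks and labels.
ax_left.xaxis.set_ticks([0, 1, 2])
ax_left.yaxis.set_label_text("y values")
```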
You want to view the frequency/density of numeric variable(s). Name three ways for 1 variable (1D) and their corresponding 2D plot.
Approach: 1D -> 2D
-------------------------
1. "Show every value": rug -> scatter
2. "Binning": histogram -> hexbin
3. "Density (=Smoothing the bins)": kde -> contour
Describe the hierarchy of artist classes
A figure can have many axes (=subplots). Each Axes contains the x-axis, the y-axis, and artists such as Line2D objects and the title. Besides one or more Axes, the figure also contains a Rectangle that serves as its background.
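The containment can be inspected with `get_children()`. This is a sketch for exploration, with a made-up line and title:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from matplotlib.patches import Rectangle

fig, ax = plt.subplots()
(line,) = ax.plot([0, 1], [0, 1])
ax.set_title("demo")

# The figure contains its background Rectangle (fig.patch) and its Axes.
fig_children = fig.get_children()

# The Axes contains the line, the title text, and the two Axis objects.
ax_children = ax.get_children()
```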
How do you set properties for the plot function?
Alt 1: Use keyword arguments in the plot function. E.g. plot(x, y, linewidth=2.0, linestyle='--')
Alt 2: Use a format-string shorthand. E.g. plot(x, y, '--')
Alt 3: Set properties after the plot has been created and the list of line objects returned.
my_lines = plt.plot(x, y)
plt.setp(my_lines, linestyle='--')
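The three alternatives can be compared in a short sketch. The data is hypothetical; note that `plot` returns a list of `Line2D` objects, which is what `plt.setp` operates on.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 10)
y = x ** 2
fig, ax = plt.subplots()

# Alt 1: keyword arguments.
(line_kw,) = ax.plot(x, y, linewidth=2.0, linestyle="--")

# Alt 2: format-string shorthand for a dashed line.
(line_fmt,) = ax.plot(x, y + 1, "--")

# Alt 3: set properties afterwards on the returned list of lines.
lines = ax.plot(x, y + 2)
plt.setp(lines, linestyle="--", linewidth=2.0)
```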
The three layers of matplotlib?
Backend: supplies the canvas and renders the drawing. There are several GUI backends and a few non-GUI backends such as 'pdf' and 'svg'.
Artist: the objects you see (rectangles, lines) are instances of Artist subclasses.
Scripting (=pyplot): a shortcut way to work with the different artist objects and methods.
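A rough sketch that touches all three layers. Selecting the non-GUI Agg backend and writing SVG into an in-memory buffer are choices made here so that the example needs neither a display nor a file.

```python
import io

import matplotlib
matplotlib.use("Agg")  # backend layer: pick a non-GUI renderer explicitly
import matplotlib.pyplot as plt

# Scripting layer: pyplot creates the artist objects for us.
fig, ax = plt.subplots()

# Artist layer: the returned line is an Artist living inside the Axes.
(line,) = ax.plot([0, 1], [0, 1])

# Backend layer again: the figure's canvas belongs to the active backend,
# and savefig can render through another non-GUI backend such as SVG.
backend_name = matplotlib.get_backend()
canvas_type = type(fig.canvas).__name__
buffer = io.BytesIO()
fig.savefig(buffer, format="svg")
```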
How do you supply data points to the plot() function?
Case 1: Supply only the y values (x defaults to 0, 1, 2, ...)
plot(y)
Case 2: Supply x and y values
plot(x, y)
Case 3: Supply several pairs of (x, y)
plot(x1, y1, x2, y2, x3, y3)
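A sketch of the three calling conventions, using small made-up lists. When only y is given, matplotlib uses the index positions as x values.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Case 1: y only; x is filled in as 0, 1, 2.
(line_y,) = ax.plot([1, 4, 9])

# Case 2: explicit x and y.
(line_xy,) = ax.plot([10, 20, 30], [1, 4, 9])

# Case 3: several (x, y) pairs in one call; one line is created per pair.
pair_lines = ax.plot([0, 1], [0, 1], [0, 1], [1, 0])
```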
What does a Seaborn command return?
It depends on the function. Axes-level functions draw onto a single subplot and return the matplotlib axes. Figure-level functions create the entire figure and return a Seaborn grid object (such as FacetGrid, JointGrid, or PairGrid), which wraps the figure.
How do you make these 1D and 2D distribution plots in matplotlib and seaborn (if possible) using standard commands?
1. "Show every value": rug -> scatter
2. "Binning": histogram -> hexbin
3. "Density (=Smoothing the bins)": kde -> contour
In Seaborn:
===============
* distplot for all 1D plots (deprecated since seaborn 0.11; displot, histplot, kdeplot, and rugplot replace it)
* jointplot for all 2D plots
Or individual commands
In matplotlib:
===============
* no standard rugplot
* plt.scatter()
* plt.hist()
* plt.hexbin()
* no standard kde plot
* plt.contour()
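A sketch of the matplotlib side, covering the "show every value" and "binning" rows. The rug here is a hand-made workaround (vertical tick markers drawn at height zero), not a built-in rug command, and the data, bin count, and grid size are assumptions. The density row is left out because it needs a separate density estimate to feed `plt.contour()`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 500 points from two independent normal distributions.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = rng.normal(size=500)

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# 1. Show every value: improvised rug (1D) and scatter (2D).
rug_lines = axes[0, 0].plot(x, np.zeros_like(x), "|")
axes[0, 1].scatter(x, y, s=5)

# 2. Binning: histogram (1D) and hexbin (2D).
counts, bin_edges, _ = axes[1, 0].hist(x, bins=30)
hexagons = axes[1, 1].hexbin(x, y, gridsize=20)
fig.tight_layout()
```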
You have a dataframe with some numeric variables. You want to see bivariate (scatterplot) distributions for each pair of variables. How can you do this?
Pandas: pd.plotting.scatter_matrix()
Seaborn: sns.pairplot()
With sns.pairplot() you can also pass hue= to add color as a third (categorical) dimension.
What language is Matplotlib written in?
Mostly Python, built on NumPy, with some performance-critical parts written in C/C++.
https://matplotlib.org/gallery/recipes/common_date_problems.html
READ THIS
What do the commands plt.subplot(3,2,1) and plt.plot(x,y) do?
subplot(nrows, ncols, plot_number)
subplot(3,2,1) sets up a grid of 3 rows x 2 columns and makes the first of these six positions (upper left corner) the current plot. Then plot(x,y) draws in the current plot location.
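A short sketch of the numbering: positions count from 1, row by row, so in a 3 x 2 grid number 1 is the upper left cell and number 6 the lower right. The data is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y = [0, 1, 4, 9]

fig = plt.figure()

# Position 1: upper left cell becomes the current plot, then gets the line.
ax_first = plt.subplot(3, 2, 1)
plt.plot(x, y)

# Position 6: lower right cell becomes the current plot.
ax_last = plt.subplot(3, 2, 6)
plt.plot(x, y[::-1])
```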