Data Visualization w/ Matplotlib
What are some parameters of ax.plot, and what arguments can be passed to them?
marker = "o": circles as markers; "v": triangles pointing downwards. linestyle = : specifies the type of connecting line. Options: "--" for dashed and "None" for no connectors. color = : color of points and connectors. Options: "r" for red.
How do I create 3 rows and 2 columns of subplots? How are the plots referenced, and are the rows and columns zero-indexed?
fig, ax = plt.subplots(3, 2). Each plot is referenced as ax[row, col], e.g. ax[0, 0] for the top-left plot. Both rows and columns are zero-indexed.
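A minimal sketch of the indexing described above. The titles are placeholders, and the Agg backend is an assumption made only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# 3 rows and 2 columns of subplots; ax is a 3x2 NumPy array of axis objects
fig, ax = plt.subplots(3, 2)

# Zero-indexed: ax[row, col]
ax[0, 0].set_title("top-left")
ax[2, 1].set_title("bottom-right")
```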
How do I create a figure and axis object? What is a figure and axis object?
fig, ax = plt.subplots(). Figure object = a container that holds everything you see in the output. Axis object (an Axes in Matplotlib's terminology) = the part of the page that displays the data.
How do I save a figure? Where can I find the saved figure?
fig.savefig("title.fileextension"). Unless the name includes a path, the file is saved in the present working directory.
When would saving a figure as a .jpg be useful?
If the image will be part of a website: .jpg uses lossy compression, so files are small and load quickly.
What does matplotlib do for date-time data to show the full range of data?
It infers the x-ticks: tick locations and labels are chosen automatically to span the full range of dates.
What is a small multiple?
multiple small plots that show similar data across different conditions.
What are some useful parameters of .savefig()? What are some effective arguments to pass to them?
quality = 50 (.jpg only): controls compression; avoid values above 95 because compression is no longer effective. Recent Matplotlib releases no longer accept quality directly, so pass pil_kwargs = {"quality": 50} instead. dpi = 300: 300 is a high-quality resolution; a higher resolution means a larger file size. Figure size is set on the figure rather than in savefig: fig.set_size_inches([5, 3]) sets the aspect ratio, where the first number is the width and the second number is the height, both in inches.
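A minimal sketch of setting the size and resolution before saving. The temporary directory and the file name figure.png are assumptions for illustration; a bare file name would land in the working directory instead.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Width 5 inches, height 3 inches
fig.set_size_inches([5, 3])

# Hypothetical output location: a temporary directory instead of the working directory
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "figure.png")

# 300 dots per inch: sharper image, larger file
fig.savefig(png_path, dpi=300)
```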
What happens if I pass sharey =True to .subplots?
All subplots will share the same y-axis range and ticks.
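A small sketch of the effect, using made-up values on very different scales so the shared range is visible.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Two side-by-side subplots that share one y-axis range
fig, ax = plt.subplots(1, 2, sharey=True)

# Made-up data on different scales
ax[0].plot([0, 1, 2], [0, 10, 20])
ax[1].plot([0, 1, 2], [100, 200, 300])

# Both subplots now report the same y limits
shared_limits = ax[0].get_ylim()
```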
What does plt.style.use("string") do?
Applies the named style to all figures created afterwards in the session.
How do I make an annotation at a specific xy-coordinate?
ax.annotate("message", xy = (x, y))
How do I make a bar chart using an axis object?
ax.bar(x, y), where x and y are columns such as df['col_name']: x gives the bar positions and y gives the bar heights.
If I make two .bar calls, how do I indicate which data is at the bottom of the stacked bars? What about three stacked bars?
Pass bottom = the heights already drawn. Second call: ax.bar(x, y, bottom = df['col1']). Third call: ax.bar(x, y, bottom = df['col1'] + df['col2']).
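A minimal sketch of three stacked .bar calls. The DataFrame, its column names, and the values are hypothetical, made up for illustration; label and ax.legend() are included so each layer can be identified.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical medal counts, made up for illustration
df = pd.DataFrame(
    {"gold": [3, 1, 2], "silver": [2, 2, 1], "bronze": [1, 4, 2]},
    index=["A", "B", "C"],
)

fig, ax = plt.subplots()
# First layer sits on the x-axis
gold_bars = ax.bar(df.index, df["gold"], label="Gold")
# Second layer starts where the first one ends
silver_bars = ax.bar(df.index, df["silver"], bottom=df["gold"], label="Silver")
# Third layer starts at the combined height of the first two
bronze_bars = ax.bar(
    df.index, df["bronze"], bottom=df["gold"] + df["silver"], label="Bronze"
)
legend = ax.legend()
```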
How do I form multiple boxplots? How do I label each one?
ax.boxplot([list of datasets]). Then use ax.set_xticklabels([list of string labels, one per boxplot, in order]).
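A short sketch with two boxplots on one axis object. The two datasets and their group names are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Hypothetical measurements, made up for illustration
group_a = [160, 165, 170, 172, 175, 180, 185]
group_b = [150, 158, 160, 163, 166, 170, 171]

fig, ax = plt.subplots()
# One boxplot per dataset in the list, drawn left to right
result = ax.boxplot([group_a, group_b])
# Labels are applied in the same order as the datasets
tick_labels = ax.set_xticklabels(["Group A", "Group B"])
```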
What are some guidelines for choosing plotting style?
1. Dark backgrounds are usually less visible, so they are discouraged. 2. If color is important, consider colorblind-friendly options: "seaborn-colorblind" (named "seaborn-v0_8-colorblind" in Matplotlib 3.6 and later) or "tableau-colorblind10". 3. If you think someone will want to print your figure, use less ink: avoid colored backgrounds. 4. If it will be printed in black and white, use the "grayscale" style.
For multiple .bar() calls, how do I indicate which bars belong to which call in a stacked bar chart?
Pass label = "name" to each .bar() call, naming the column plotted as y. Then call ax.legend() and a legend will appear on the chart.
When plotting similar data in two subplots stacked in one column, how many axis labels should I use in total?
One x label, on the bottom subplot only, and two different y labels, one per subplot.
How do I zoom in on shorter time intervals in my plot?
Slice the index of the dataframe I am using to plot, then plot the slice.
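A minimal sketch of zooming in by slicing a date index. The DataFrame, its "value" column, and the date range are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical daily time series for 2015, indexed by date
dates = pd.date_range("2015-01-01", periods=365, freq="D")
df = pd.DataFrame({"value": range(365)}, index=dates)

# Slice the DatetimeIndex with date strings to keep only October
october = df.loc["2015-10-01":"2015-10-31"]

fig, ax = plt.subplots()
# The index column is the independent variable
ax.plot(october.index, october["value"])
ax.set_xlabel("Date")
```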
How do I make a histogram using an axis object? What are some useful arguments?
ax.hist(df['col']). label = "label": names the data in the legend (shown once ax.legend() is called), indicating which bars belong to which data when plotting multiple histograms on the same axis object. bins = n (number of bins) or a list of endpoints for the intervals (nothing is plotted before the first value or after the last value). histtype = "step": draws only thin outlines that frame the bars.
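A sketch of two histograms on one axis object using explicit bin endpoints and step outlines. The random samples and group names are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical samples, generated only for illustration
rng = np.random.default_rng(0)
group_a = rng.normal(170, 8, 200)
group_b = rng.normal(180, 8, 200)

# Explicit interval endpoints: values outside 150..200 are not plotted
edges = [150, 160, 170, 180, 190, 200]

fig, ax = plt.subplots()
counts_a, edges_a, _ = ax.hist(group_a, bins=edges, histtype="step", label="Group A")
counts_b, edges_b, _ = ax.hist(group_b, bins=edges, histtype="step", label="Group B")
legend = ax.legend()
```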
How do I create a twin axis that shares the x-axis if I want to plot two datasets on the same plot but they have different scales for the y-axis?
ax.plot(...); ax.set_ylabel(...); ax2 = ax.twinx(); ax2.plot(...); ax2.set_ylabel(...); plt.show()
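A minimal sketch of the twinx sequence. The variable names and values are made up for illustration; each y label is colored to match its line so the two scales are easy to tell apart.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Hypothetical data on very different scales, made up for illustration
years = [2000, 2001, 2002, 2003]
series_small = [0.1, 0.3, 0.2, 0.5]
series_large = [370, 372, 374, 376]

fig, ax = plt.subplots()
ax.plot(years, series_large, color="b")
ax.set_ylabel("Large-scale series", color="b")

# Second axis object sharing the same x-axis, with its own y-axis on the right
ax2 = ax.twinx()
ax2.plot(years, series_small, color="r")
ax2.set_ylabel("Small-scale series", color="r")
```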
How do I plot on an axis object? How does it actually plot?
ax.plot(col1, col2), with x first and y second as positional arguments (x = and y = keywords are not accepted). It pairs the entries of x and y at the same position into (x, y) ordered pairs and plots them.
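A minimal sketch of ax.plot with positional x and y, also showing the marker, linestyle, and color arguments. The data values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Hypothetical sample data, made up for illustration
months = [1, 2, 3, 4, 5]
rainfall = [2.1, 3.4, 1.8, 4.0, 2.9]
snowfall = [3.0, 2.2, 1.1, 0.4, 0.0]

fig, ax = plt.subplots()
# x and y are positional; entries at the same position form (x, y) pairs.
# Red circles joined by a dashed line:
(dashed_line,) = ax.plot(months, rainfall, marker="o", linestyle="--", color="r")
# Blue downward triangles with no connecting line:
(points_only,) = ax.plot(months, snowfall, marker="v", linestyle="None", color="b")
```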
How do I create a scatter plot? What are some helpful arguments of scatter?
ax.scatter(x, y). color = "blue": sets the color of the points. label = "label": sets the label to be displayed in the legend. c = col: useful for encoding a third variable in the plot; it maps the values in col to colors through a colormap, so the points vary in shade.
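A short sketch of encoding a third variable with c. The three lists are hypothetical; the colorbar is an optional extra, added here so the color scale can be read.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Hypothetical data, made up for illustration
x = [1, 2, 3, 4]
y = [2, 4, 1, 3]
third_variable = [10, 20, 30, 40]

fig, ax = plt.subplots()
# Each point's color is looked up from its third_variable value
points = ax.scatter(x, y, c=third_variable)
# Optional: show how values map to colors
fig.colorbar(points, ax=ax)
```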
How do I set an x-axis label for an axis object? A y label? A title?
ax.set_xlabel("label");ax.set_ylabel("y label"); ax.set_title("title")
How do I set x-tick labels and rotate the ticks 90 degrees counter-clockwise?
ax.set_xticklabels(df['col'],rotation = 90)
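If one dimension of the subplot grid is 1 and the other is greater than 1 (say plt.subplots(2, 1)), how do I reference the axis object to plot on and label the first subplot?
ax[0]. The array of axis objects is one-dimensional, so a single index is enough. With 2 rows and 1 column, ax[0] is the top subplot.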
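A small sketch of single-index referencing with 2 rows and 1 column. The label text is placeholder wording; the x label is set only on the bottom subplot.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# 2 rows, 1 column: ax is a one-dimensional array of length 2
fig, ax = plt.subplots(2, 1)

ax[0].plot([0, 1, 2], [1, 3, 2])
ax[0].set_ylabel("First quantity")   # top subplot

ax[1].plot([0, 1, 2], [5, 4, 6])
ax[1].set_ylabel("Second quantity")  # bottom subplot
ax[1].set_xlabel("Time")             # one x label for the column
```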
How do I reference a particular subplot in a grid if I want to modify its data or labels?
ax[m, n], where m is the row and n is the column.
What happens if I execute two ax.plot commands before plt.show()?
Both are drawn on the same plot.
What types of plots can I add error bars to and how are these errors expressed on the chart?
Bar charts: pass yerr = series.std() to ax.bar, and a vertical tick showing the standard deviation is added to each bar. Line plots: call ax.errorbar(x, y, yerr = series.std()), and vertical markers showing the standard deviation are added to each point.
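A sketch of both cases. The DataFrame, its column names, and all values are hypothetical; the bar heights here are column means, which is one common choice and not the only one.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical height measurements, made up for illustration
heights = pd.DataFrame(
    {"rowing": [185, 190, 188, 192], "gymnastics": [160, 165, 158, 163]}
)

# Bar chart: one bar per group, with the standard deviation as the error tick
fig, ax = plt.subplots()
ax.bar("Rowing", heights["rowing"].mean(), yerr=heights["rowing"].std())
ax.bar("Gymnastics", heights["gymnastics"].mean(), yerr=heights["gymnastics"].std())
ax.set_ylabel("Height (cm)")

# Line plot: a vertical error marker at every point
months = [1, 2, 3, 4]
mean_temp = [2.0, 4.5, 9.0, 14.0]   # made-up monthly means
std_temp = [1.0, 1.5, 2.0, 1.2]     # made-up monthly standard deviations
fig2, ax2 = plt.subplots()
line_with_bars = ax2.errorbar(months, mean_temp, yerr=std_temp)
```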
If I have two datasets plotted on the same plot, how can I distinguish between them and the axes that correspond to them?
Pass the same color to .plot and to .set_ylabel for each dataset, so each y-axis label matches the color of its line.
What is the convention for axis label capitalization?
Capitalize the first word and any subsequent proper nouns.
What does pd.Timestamp("2015-10-06") do?
Converts a datetime-like string to a Timestamp object.
If I have NaN values in my data, how will they be represented in a line plot?
There will be a break in the line.
When using ax.twinx(), what does the output look like when I plot the two datasets?
They share the same x-axis ticks, but there are two y-axes, one on each side of the plot, with differently scaled ticks that each correspond to one of the datasets.
What is the size of the whiskers of a boxplot? What percentage of the distribution do the box and whiskers represent? What do the points outside of the box and whiskers represent?
Each whisker extends at most 1.5x the interquartile range beyond the edge of the box, ending at the most extreme data point within that limit. The box and whiskers cover about 99% of the data if it is normally distributed. Points outside them are outliers.
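A small sketch that checks the whisker rule numerically, assuming Matplotlib's default whisker setting of 1.5. It uses matplotlib.cbook.boxplot_stats, the helper that computes boxplot statistics, on a made-up dataset with one obvious outlier.

```python
import numpy as np
from matplotlib import cbook

# Made-up data with one extreme value
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])

# Quartiles and interquartile range computed by hand
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
upper_limit = q3 + 1.5 * iqr
lower_limit = q1 - 1.5 * iqr

# Statistics Matplotlib would use to draw the boxplot
stats = cbook.boxplot_stats(data)[0]

# Whiskers end at the most extreme data points inside the limits;
# anything beyond them is reported as a flier (outlier)
whisker_low, whisker_high = stats["whislo"], stats["whishi"]
outliers = stats["fliers"]
```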
What are some helpful arguments in ax.annotate?
xytext = (x-coord, y-coord): coordinates of the text. arrowprops = dict, e.g. {"arrowstyle": "->", "color": "gray"}: specifies the properties of an arrow, which automatically connects the text at xytext to the point at xy.
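A minimal sketch of an annotation with an arrow. The plotted values, the message, and the coordinates are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])

# Text is placed at xytext; the arrow points from the text to xy
note = ax.annotate(
    "steep rise",
    xy=(3, 9),
    xytext=(1, 7),
    arrowprops={"arrowstyle": "->", "color": "gray"},
)
```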
Can I plot using the index column of time-series data as the independent variable? How do I reference this index column?
Yes. Reference it with df.index.