Introduction to Seaborn
How do I rotate xtick labels for both a facetgrid and axessubplot object?
plt.xticks(rotation = 90) rotates the xtick labels of the current axes 90 degrees counter-clockwise, so it can be called right after drawing either kind of plot.
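A minimal sketch of the call above on both object types. The DataFrame and its "city" column are invented for illustration. Because plt.xticks() acts only on the current axes, a grid with several subplots may need FacetGrid.set_xticklabels(rotation = 90) instead.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data, invented for this example.
df = pd.DataFrame({"city": ["Amsterdam", "Boston", "Cairo", "Boston"]})

# AxesSubplot case: draw on a single Axes, then rotate its labels.
fig, ax = plt.subplots()
sns.countplot(x="city", data=df, ax=ax)
plt.xticks(rotation=90)  # acts on the current Axes (ax)

# FacetGrid case: catplot() opens a new figure and returns a FacetGrid.
g = sns.catplot(x="city", data=df, kind="count")
plt.xticks(rotation=90)  # acts on the grid's single Axes

# Call plt.show() to display both figures.
```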
What happens if multiple observations are given per x-value when using a lineplot? What is the shaded region around the line?
Seaborn aggregates all observations at each x-value into a single summary measure, which is the mean by default. The shaded region around the line is a 95% confidence interval for that mean.
Does one have to label a countplot if using a dataframe column as data?
No. Seaborn automatically labels the x-axis with the column name and the y-axis with "count".
What does the order param do for catplots?
order accepts a list of category values. The categories appear along the x-axis in the order of the list, from left to right.
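A short sketch of order in use, assuming a made-up "size" column whose values arrive in a different order than the one we want on the axis.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data, invented for this example.
df = pd.DataFrame({"size": ["large", "small", "medium", "small", "large", "small"]})

# Desired left-to-right order on the x-axis.
size_order = ["small", "medium", "large"]
g = sns.catplot(x="size", data=df, kind="count", order=size_order)

# Call plt.show() (from matplotlib.pyplot) to display the figure.
```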
How does one create a custom palette?
Pass a list of color names or hex codes to sns.set_palette().
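For instance, with three arbitrarily chosen hex codes (the colors themselves are not from these notes):

```python
import seaborn as sns

# Arbitrary example colors.
custom_palette = ["#39A7D0", "#36ADA4", "#F2A104"]
sns.set_palette(custom_palette)

# Read the active palette back as hex strings to confirm the change.
current = sns.color_palette().as_hex()
```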
How does one create a scatterplot using relplot()?
pass in kind = "scatter".
How do we create a boxplot using .catplot()?
.catplot(kind = "box")
How do I plot a line plot?
sns.relplot(kind = "line")
How do I create a title using a FacetGrid object and move it higher up on the grid?
1. Assign the FacetGrid plot to a variable. 2. Call var_name.fig.suptitle("New Title", y = n), where n > 1.
How do I add a title to an AxesSubplot object and move it higher up on the grid?
1. Assign the AxesSubplot plot to a variable. 2. Call var_name.set_title("New Title", y = n), where n > 1.
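Both title recipes side by side, as a sketch. The data and the value y = 1.03 are illustrative choices, not prescribed values.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data, invented for this example.
df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": [2, 4, 1, 3],
    "grp": ["a", "a", "b", "b"],
})

# FacetGrid: the title belongs to the whole figure.
g = sns.relplot(x="x", y="y", col="grp", data=df, kind="scatter")
g.fig.suptitle("New Title", y=1.03)

# AxesSubplot: the title belongs to the single Axes.
fig, ax = plt.subplots()
sns.scatterplot(x="x", y="y", data=df, ax=ax)
ax.set_title("New Title", y=1.03)
```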
What are the three types of palettes used to change the main elements of a plot? Describe the uses for the non-custom types?
1. Diverging palettes - useful when the data lie on a scale whose two endpoints are opposites with a neutral midpoint. 2. Sequential palettes - great for emphasizing a variable on a continuous scale. 3. Custom palettes.
How are the subplots organized if I use relplot to make a scatterplot and I pass columns in for row and col?
Each unique value of the column passed to row gets its own row of subplots, and each unique value of the column passed to col gets its own column. Every subplot therefore shows only the observations that match one row value and one col value: all plots in a grid row share the same row value, and all plots in a grid column share the same col value.
What is the relationship between a FacetGrid and an AxesSubplot object? What are their characteristics?
A FacetGrid consists of one or more AxesSubplots. FacetGrid - plot types: relplot(), catplot(); characteristics: can create subplots. AxesSubplot - plot types: scatterplot(), countplot(), etc.; characteristics: creates only a single plot.
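A small sketch of the row/col layout. The "smoker" and "time" columns are hypothetical, with two unique values each, so the grid should be 2 x 2.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data: every smoker/time combination appears twice.
df = pd.DataFrame({
    "x": [0, 1, 2, 3, 4, 5, 6, 7],
    "y": [3, 1, 4, 1, 5, 9, 2, 6],
    "smoker": ["yes", "no"] * 4,
    "time": ["lunch", "lunch", "dinner", "dinner"] * 2,
})

g = sns.relplot(x="x", y="y", data=df, kind="scatter",
                row="smoker", col="time")

# g.axes is a 2-D array of Axes: one grid row per smoker value,
# one grid column per time value.
grid_shape = g.axes.shape
```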
If I have data where there are multiple rows of an x value where each row represents a subgroup, what param = args can I use to effectively plot the data?
Set both style and hue to the column that holds the subgroup identifier, and each subgroup gets its own line with a unique linestyle and color. Adding markers = True places a marker, in the line's color, at each data point. If you don't want the linestyle to vary by group, pass dashes = False.
What does a confidence interval tell us?
A confidence interval tells us that, based on our sample (which should be random), we are 95% confident that the population mean or other statistic lies within the interval. It indicates the uncertainty in using the sample mean to estimate the population mean.
What does a point plot show? What are some advantages it has over bar plots? How do you create a point plot?
A point plot shows the mean of a quantitative variable for the observations in each category. Advantages: 1. It's easier to compare the heights of subgroups, because in a point plot their points are stacked above each other. 2. It's easier to see differences in slope between categories than to compare the heights of the bars between them. To create a point plot: sns.catplot(kind = "point")
How do I turn off conf interval on a barplot?
ci = None
How do we replace a confidence interval with the standard deviation for all points at a particular x-value? How do we turn off confidence interval without replacing it with another measure?
ci = "sd"; ci = None. (Seaborn 0.12 and later deprecate ci in favor of errorbar, where the equivalents are errorbar = "sd" and errorbar = None.)
What are some param = args for .relplot() for scatterplots when showing a third variable?
col_wrap = n sets how many subplots appear per row. col_order = a list of the unique values in the column passed to col; subplots appear in the order of that list, left to right, continuing on the next row when a row is full. size = col_name makes point size grow with the value in the column; it works best for a quantitative variable or a categorical variable with ordered levels such as small, medium, large. hue = col_name colors the points by value; for a quantitative column, seaborn uses different shades of the same color instead of a different color per value. style = col_name gives each unique value in the column its own point style. alpha = n, with n in [0, 1], sets transparency, where 0 is completely transparent and 1 is completely opaque; it is useful when many points overlap and you want to see which areas have more or fewer observations.
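A sketch combining several of these parameters. The "day" and "party" columns are invented; with three days and col_wrap = 2, the third subplot should wrap onto a second row.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data, invented for this example.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2, 1, 4, 3, 6, 5],
    "day": ["Thu", "Fri", "Sat", "Thu", "Fri", "Sat"],
    "party": [1, 2, 3, 2, 3, 1],
})

g = sns.relplot(x="x", y="y", data=df, kind="scatter",
                col="day", col_wrap=2, col_order=["Sat", "Fri", "Thu"],
                size="party", alpha=0.5)

# Subplot titles, read left to right and then row by row.
titles = [ax.get_title() for ax in g.axes.flat]
```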
What is a different way to show the impact of a third variable on a scatter plot than using the hue parameter? How can this be done with relplot()?
Create subplots on that third variable. This can be done with relplot() by passing a column to the row or col params. Passing it to just row creates subplots in one column; passing it to just col creates subplots in one row.
What are line plots useful for?
Showing how a value changes when each point represents the same "thing", typically tracked over time.
What happens to a scatterplot if I add the hue parameter and set it equal to a third variable?
hue colors the points according to the value of the third variable and adds a legend showing which color corresponds to which value.
What does adding hue to a countplot() do?
hue splits each category into the values of the hue variable and draws one bar per hue value within each category.
What are important .scatterplot() param = args related to the hue?
hue_order = an ordered list of the unique values in the hue column; the order of the list determines which value is associated with which color. palette = dict_name, where the dict's keys are the unique values of the hue column and its values are the colors to assign to them. Colors can be color names, abbreviations of color names, or HTML hex color codes, all given as strings (hex codes start with "#").
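As a sketch, with a made-up "smoker" column and arbitrarily chosen colors:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: "no" appears first, so hue_order changes the legend order.
df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": [4, 3, 2, 1],
    "smoker": ["no", "yes", "no", "yes"],
})

# Keys are hue values; values are a color name and a hex code.
hue_colors = {"yes": "black", "no": "#d95f02"}

fig, ax = plt.subplots()
sns.scatterplot(x="x", y="y", data=df, hue="smoker",
                hue_order=["yes", "no"], palette=hue_colors, ax=ax)

legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
```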
How do I import seaborn? What else do I need to import?
import seaborn as sns; import matplotlib.pyplot as plt
What does .relplot() allow us to do?
it allows us to create subplots in a single figure.
What are some useful param= args for catplot when kind= "point"?
join = False removes the line connecting the points. estimator = median changes the estimator from the default mean to the median (after from numpy import median). capsize = 0.2 places perpendicular caps of width 0.2 on the ends of the confidence interval lines for easier viewing. ci = None turns off the confidence intervals.
How do I plot a bar chart? What does a barplot show? Does .catplot automatically display a confidence interval?
sns.catplot(kind = "bar"). Barplots show the mean of a quantitative variable per category. Yes.
How do I create a countplot using .catplot()?
sns.catplot(kind = "count")
How do I create a countplot? What is a countplot? What is the difference between passing the data to x or y?
sns.countplot(x = variable) or sns.countplot(y = variable). A countplot is a bar plot of the counts of each unique value. Passing the data to y creates a horizontal bar plot.
How do I create a countplot with dataframe column data?
sns.countplot(x or y = "col_name",data = df_name)
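A sketch of both orientations, using an invented "species" column. It also shows the automatic axis labels.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data, invented for this example.
df = pd.DataFrame({"species": ["cat", "dog", "cat", "cat", "dog", "bird"]})

fig, (ax_v, ax_h) = plt.subplots(1, 2)
sns.countplot(x="species", data=df, ax=ax_v)  # vertical bars
sns.countplot(y="species", data=df, ax=ax_h)  # horizontal bars

# Seaborn fills in the axis labels from the column name and "count".
labels_v = (ax_v.get_xlabel(), ax_v.get_ylabel())
labels_h = (ax_h.get_xlabel(), ax_h.get_ylabel())
```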
How do I load a dataset within seaborn's library?
sns.load_dataset("data_set_name")
How do I create a scatter plot? How do I display seaborn plots?
sns.scatterplot(x = x_values, y = y_values), where x_values and y_values can each be a list or a column from a DataFrame. plt.show() displays the plot.
How does one change the scale of the plot elements and labels of a plot? What are some scales you can use?
sns.set_context(). Scales, from smallest to largest: "paper", "notebook" (default), "talk", "poster".
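One way to see the relative scales is to compare the font size each context would apply. This sketch only inspects the settings with sns.plotting_context() and then picks "talk" as an arbitrary example.

```python
import seaborn as sns

contexts = ["paper", "notebook", "talk", "poster"]

# plotting_context(name) returns the parameters a context would set.
font_sizes = [sns.plotting_context(name)["font.size"] for name in contexts]

# Apply one of the contexts to all following plots.
sns.set_context("talk")
```

font_sizes should increase from "paper" to "poster".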
How do I change the palette of the main elements of a plot?
sns.set_palette()
How do I change the style for all my plots? What are the 5 preset options, their layouts, and their advantages?
sns.set_style(). 1. white - solid white background, no ticks. Adv: good when we only care about the comparison between groups or the general trend across groups rather than specific values. 2. whitegrid - same as the white preset except it adds a grey grid in the background of the plot. Adv: good if your audience needs to determine specific values of the points instead of making higher-level observations. 3. ticks - similar to white except it adds small tick marks to the axes. 4. dark - grey background, no ticks. 5. darkgrid - grey background with a white grid, no ticks.
What are some useful param = args in .catplot() for the creation of boxplots?
sym = "" omits all outliers from the plot; sym can also change the point style used for outliers. whis = n or a two-item list: if a float n is passed, it scales the whisker length as a multiple of the IQR. If a list is passed, the first item is the percentile the lower whisker extends to and the second item is the percentile the upper whisker extends to; if [0, 100] is passed, the whiskers reach the minimum and maximum, so there are no outliers on the plot. hue = col_name creates multiple box plots per category to show the subgroups of col_name.
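The differences between the presets can be inspected with sns.axes_style(), which returns the parameters a style would set. This is a sketch for exploration only; "whitegrid" at the end is an arbitrary choice.

```python
import seaborn as sns

styles = ["white", "whitegrid", "ticks", "dark", "darkgrid"]

# For each preset: is the grid on, are x ticks drawn, what is the background?
summary = {
    name: (
        sns.axes_style(name)["axes.grid"],
        sns.axes_style(name)["xtick.bottom"],
        sns.axes_style(name)["axes.facecolor"],
    )
    for name in styles
}

# Apply one preset to all following plots.
sns.set_style("whitegrid")
```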
What is the best way to show contrast when plotting a third variable on a scatter plot?
use hue with size or style to have a color difference and either size or point style difference.
How do we create subplots using .catplot()?
Use the col and row parameters.
What do I need to know about subplot titles?
Using .suptitle() puts a title on the whole figure. Each subplot already has a title for the specific subgroup being plotted, e.g. "Group = Group2". var_name.set_titles("This is {col_name}") assigns each subplot a title where the braces are filled in with the subgroup name.
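A sketch of both title levels on one grid, with hypothetical "group" and "answer" columns and an example figure title.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data, invented for this example.
df = pd.DataFrame({
    "group": ["Group1", "Group1", "Group2", "Group2"],
    "answer": ["yes", "no", "yes", "yes"],
})

g = sns.catplot(x="answer", data=df, kind="count", col="group")

# One title for the whole figure, moved slightly above the grid.
g.fig.suptitle("Answers by group", y=1.03)

# One title per subplot; {col_name} is replaced by the subgroup value.
g.set_titles("This is {col_name}")
titles = [ax.get_title() for ax in g.axes.flat]
```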
How do I add axis labels for both a FacetGrid and an AxesSubplot object?
var_name.set(xlabel = "x label name",ylabel = "y label name")
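The same .set() call on each object type, sketched with invented data:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data, invented for this example.
df = pd.DataFrame({"x": [1, 2, 3], "y": [3, 1, 2]})

# FacetGrid: set() forwards the labels to every subplot in the grid.
g = sns.relplot(x="x", y="y", data=df, kind="scatter")
g.set(xlabel="x label name", ylabel="y label name")

# AxesSubplot: set() applies to the single Axes.
fig, ax = plt.subplots()
sns.scatterplot(x="x", y="y", data=df, ax=ax)
ax.set(xlabel="x label name", ylabel="y label name")
```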
How do we switch orientation of barplots and countplots?
Switch what's passed to x and y.