Seaborn


A ... aggregates a numerical column within each category using an estimator, which by default is the mean. It can also be understood as a visualization of a group-by operation.

Barplot Explanation/Analysis: Looking at the plot, we can say that the average total_bill is higher for males than for females. CATEGORICAL PLOT
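The group-by reading of a bar plot can be checked directly. This is a minimal sketch, assuming a small hand-made DataFrame (a hypothetical stand-in for the tips dataset, not the real data) and a headless matplotlib backend:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips dataset referred to in these cards.
tips = pd.DataFrame({
    "sex": ["Male", "Male", "Male", "Female", "Female", "Female"],
    "total_bill": [20.0, 24.0, 28.0, 15.0, 18.0, 21.0],
})

# The group-by that the bar plot visualizes: mean total_bill per sex.
group_means = tips.groupby("sex")["total_bill"].mean()

# barplot aggregates with the mean unless another estimator is passed.
ax = sns.barplot(x="sex", y="total_bill", data=tips)

# Each bar's height should match one of the group means.
bar_heights = sorted(bar.get_height() for bar in ax.patches)
```

Passing a different function through the estimator parameter (for example a median) changes the aggregation that the bars show.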

A ... is also known as a box-and-whisker plot. It shows the distribution of quantitative data in a way that makes comparisons between variables, or across the levels of a categorical variable, easy.

Boxplot Explanation/Analysis: x takes a categorical column and y a numerical column, so we can see the distribution of the total bill for each day. The hue parameter adds a further categorical separation. Looking at the plot, we can say that people who do not smoke had a higher bill on Friday than people who smoke. CATEGORICAL PLOT
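A short sketch of the x, y, and hue parameters described above. The DataFrame is a hypothetical stand-in with made-up values, so the picture it produces says nothing about the real dataset:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical data: six bills per day, three per smoker status.
tips = pd.DataFrame({
    "day": ["Fri"] * 6 + ["Sat"] * 6,
    "smoker": ["Yes", "Yes", "Yes", "No", "No", "No"] * 2,
    "total_bill": [12.0, 15.5, 13.2, 18.0, 22.4, 19.9,
                   21.0, 25.3, 23.8, 17.5, 20.1, 19.0],
})

# x is categorical, y is numerical, hue splits each day into two boxes.
ax = sns.boxplot(x="day", y="total_bill", data=tips, hue="smoker")
```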

It is used for a univariate set of observations, which it visualizes through a histogram. Because only one variable is plotted, we choose one particular column of the dataset.

Distplot Explanation: KDE stands for Kernel Density Estimation, which is also available as a separate kind of plot in seaborn. bins sets the number of bins in the histogram; a good value depends on your dataset. color sets the color of the plot. Looking at the plot, we can say that most total bills lie between 10 and 20. Note that distplot is deprecated as of seaborn 0.11 in favor of histplot and displot. DISTRIBUTION PLOT

It is the most general of the categorical plots and provides a parameter called kind to choose the kind of plot we want, which saves us from writing these plots separately. The kind parameter can be bar, violin, swarm, etc.

Factorplot (renamed catplot in seaborn 0.9) CATEGORICAL PLOT
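factorplot was renamed catplot in seaborn 0.9, so this sketch assumes seaborn 0.9 or later and calls catplot. It draws the same hypothetical data three times, changing only the kind parameter:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical data: four bills for each of two days.
tips = pd.DataFrame({
    "day": ["Fri"] * 4 + ["Sat"] * 4,
    "total_bill": [12.0, 15.5, 18.0, 22.4, 17.5, 21.0, 23.8, 25.3],
})

# One call per kind; each returns a FacetGrid holding its own figure.
grids = {
    kind: sns.catplot(x="day", y="total_bill", data=tips, kind=kind)
    for kind in ("bar", "violin", "swarm")
}
```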

A ... displays matrix-shaped data as a grid of colored cells. To use a heatmap, the data should be in matrix form: both the index labels and the column labels must be meaningful, so that each cell holds the value relating its row label to its column label (as in a correlation matrix or a pivot table).

Heatmap MATRIX PLOTS
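One common way to get data into matrix form is a correlation matrix, where the index and the columns carry the same labels. A minimal sketch, assuming hypothetical numeric columns named after the tips dataset:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical numeric data.
tips = pd.DataFrame({
    "total_bill": [10.3, 16.9, 21.0, 23.7, 24.6, 30.4],
    "tip": [1.7, 2.0, 3.5, 3.3, 3.6, 5.6],
    "size": [2, 2, 3, 2, 4, 4],
})

# corr() returns a square DataFrame whose index and columns match.
corr = tips.corr()

# Each cell is colored by its value; annot=True also prints the number.
ax = sns.heatmap(corr, annot=True, cmap="coolwarm")
```

A pivot table built with DataFrame.pivot_table would work the same way, with one variable on the index and another on the columns.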

It draws a plot of two variables with both bivariate and univariate graphs. It combines two kinds of plot: a joint plot of the two variables together and a marginal plot of each variable on its own.

Jointplot Explanation: kind controls how the data is visualized inside the jointplot. The default is scatter; other options include hex, reg (regression), and kde. x and y are strings naming columns of the DataFrame passed as the data parameter. Here we see tip on the y axis and total bill on the x axis, along with a roughly linear relationship between the two: the tip tends to increase with the total bill. DISTRIBUTION PLOT

sns.lmplot(x='total_bill', y='tip', data=dataset, col='sex', row='time', hue='smoker')

lmplot with facets Explanation: We draw multiple plots by specifying a separation with the row and col parameters. Each row contains the plots of tip versus total bill for one of the times in the dataset. Each column contains the plots of tip versus total bill for one of the genders. A further separation within each plot is made with the hue parameter, based on whether the person smokes. REGRESSION PLOT
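The row and column layout can be confirmed on generated data. This sketch assumes synthetic, hypothetical values (five per combination of sex, time, and smoker) and passes ci=None only to skip the bootstrap and keep the example fast:

```python
import itertools
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)

# Synthetic data: tip is roughly 15% of the bill plus a little noise.
rows = []
for sex, time, smoker in itertools.product(
        ["Male", "Female"], ["Lunch", "Dinner"], ["Yes", "No"]):
    for bill in rng.uniform(10, 40, size=5):
        rows.append({"sex": sex, "time": time, "smoker": smoker,
                     "total_bill": bill,
                     "tip": 0.15 * bill + rng.normal(0, 0.5)})
tips = pd.DataFrame(rows)

# One facet per (time, sex) pair; hue adds two regression lines per facet.
g = sns.lmplot(x="total_bill", y="tip", data=tips,
               col="sex", row="time", hue="smoker", ci=None)
```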

It shows pairwise relationships across the entire DataFrame and supports an additional argument called hue for categorical separation. It creates a bivariate plot for every pair of numerical columns, so it can take a while if the DataFrame is very large.

Pairplot Explanation: hue sets up the categorical separation between the entries of the dataset. palette sets the colors used for the hue levels. DISTRIBUTION PLOT

It plots the data points in an array as sticks on an axis. Just like a distplot, it takes a single column. Instead of drawing a histogram, it draws a dash at every observation. Comparing the two, a histogram (as in a distplot or the margins of a jointplot) counts the dashes that fall in each bin and shows the counts as bars.

Rugplot DISTRIBUTION PLOT

It creates a scatter plot in which one of the variables is categorical.

Stripplot Explanation/Analysis: One problem with a strip plot is that you can't tell which points are stacked on top of each other, so we use the jitter parameter to add some random noise (only along the categorical axis). This is useful when many points overlap, because it makes the distribution easier to see. hue provides an additional categorical separation. Setting dodge=True (called split before seaborn 0.9) draws separate strips for each level of the hue variable. CATEGORICAL PLOT

It is very similar to the stripplot, except that the points are adjusted so that they do not overlap. It can be thought of as combining the ideas of a violin plot and a strip plot.

Swarmplot Explanation/Analysis: One drawback of swarmplots is that they do not scale well to large numbers of observations, because arranging the points takes a lot of computation. To show the shape of the distribution alongside the individual points, we can draw a swarmplot on top of a violinplot. CATEGORICAL PLOT
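Drawing a swarmplot on top of a violinplot amounts to two calls that share one axes. A sketch on hypothetical data; the colors are arbitrary choices made for contrast:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical data: six bills for each of two days.
tips = pd.DataFrame({
    "day": ["Fri"] * 6 + ["Sat"] * 6,
    "total_bill": [12.0, 13.2, 15.5, 18.0, 19.9, 22.4,
                   17.5, 19.0, 20.1, 21.0, 23.8, 25.3],
})

# The violin gives the overall shape of each distribution.
ax = sns.violinplot(x="day", y="total_bill", data=tips, color="lightgray")

# The swarm adds the individual observations on the same axes.
swarm_ax = sns.swarmplot(x="day", y="total_bill", data=tips,
                         color="black", ax=ax)
```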

A ... counts the observations in each category and shows the counts as bars.

Countplot Explanation/Analysis: Looking at the plot, we can say that there are more males than females in the dataset. Because it only counts the values of a categorical column, we need to specify only the x parameter. CATEGORICAL PLOT
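The counts a countplot draws are the same numbers pandas reports with value_counts. A minimal sketch on a hypothetical column:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical categorical column: four males, two females.
tips = pd.DataFrame({
    "sex": ["Male", "Male", "Female", "Male", "Female", "Male"],
})

# Only x is needed, because the bar height is the count itself.
ax = sns.countplot(x="sex", data=tips)

bar_heights = sorted(bar.get_height() for bar in ax.patches)
counts = sorted(tips["sex"].value_counts())
```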

Histograms and KDE can be combined using ...

distplot

sns.set_style('whitegrid')
sns.lmplot(x='total_bill', y='tip', data=dataset, hue='sex', markers=['o', 'v'])

lmplot Explanation: lmplot() creates a linear model plot, that is, a simple linear regression plot made of a scatter plot with a linear fit on top of it. The x and y parameters name the columns used for the x and y axes. sns.set_style('whitegrid') draws grid lines on a white background instead of the default plain background. The data parameter specifies the DataFrame the plot is drawn from. REGRESSION PLOT

We can see the joint distribution and the marginal distributions together using

sns.jointplot

Rather than a histogram, we can get a smooth estimate of the distribution using a kernel density estimation, which Seaborn does with ...

sns.kdeplot

Visualizing the multidimensional relationships among the samples is as easy as calling

sns.pairplot

It is similar to the boxplot, except that it uses kernel density estimation to show the shape of the data distribution in more detail.

Violinplot Explanation/Analysis: hue separates the data further by the sex category. Setting split=True draws half of a violin for each level of the hue variable, which can make it easier to compare the distributions directly. CATEGORICAL PLOT
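A sketch of hue together with split=True, which requires a hue variable with exactly two levels. The data is hypothetical, with three bills for every day and sex combination:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical data.
tips = pd.DataFrame({
    "day": ["Fri"] * 6 + ["Sat"] * 6,
    "sex": ["Male", "Male", "Male", "Female", "Female", "Female"] * 2,
    "total_bill": [14.0, 18.5, 22.0, 11.5, 15.0, 17.8,
                   19.0, 24.5, 28.0, 16.2, 20.4, 23.1],
})

# Each day gets one violin: one half for each sex.
ax = sns.violinplot(x="day", y="total_bill", data=tips,
                    hue="sex", split=True)
```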

