Data Visualization I & II and Descriptive
You have data about a child's height and their father's height. What type of visualization would be most appropriate?
Scatterplot, because both variables are quantitative.
position-angle experiment:
when comparing magnitude, bar charts make the comparison more visually obvious than pie charts
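A minimal sketch of the comparison, assuming matplotlib as the plotting library (the notes do not name one) and using made-up category values that are close in magnitude. The same numbers are drawn once as angles (pie) and once as positions on a shared axis (bar).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical values, deliberately close in magnitude.
labels = ["A", "B", "C", "D", "E"]
values = [23, 21, 20, 19, 17]

fig, (ax_pie, ax_bar) = plt.subplots(1, 2, figsize=(8, 4))

# Pie chart: each magnitude is encoded as an angle.
ax_pie.pie(values, labels=labels)
ax_pie.set_title("Pie (angle)")

# Bar chart: each magnitude is encoded as a position along a common axis.
bars = ax_bar.bar(labels, values)
ax_bar.set_ylim(bottom=0)  # bar heights are only comparable when the axis starts at 0
ax_bar.set_ylabel("Value")
ax_bar.set_title("Bar (position)")

fig.tight_layout()
fig.savefig("pie_vs_bar.png")  # hypothetical output file name
```

In the saved figure the ordering of A through E should be easy to read from the bars and much harder to read from the slices.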
Visualization Best Practices:
1) Choose the right type of visualization 2) Be mindful when choosing colors 3) Label axes 4) Make sure your numbers add up 5) make sure the number 6) Make sure the text size is big enough 7) Make comparisons easy for the reader 8) Use y-axes that start at 0 for barplots 9) Keep it simple 10) Allow the viewer to make comparisons from top to bottom 11) Order rows logically (compare down columns and order sensibly from left to right) 12) Limit the number of rows and columns (a few rows = table) 13) Include informative labels 14) Be mindful of significant digits 15) Include a good caption 16) Include a source 17) Format the table so it can be quickly understood
What makes a good exploratory visualization?
j
bar chart
plotting the distribution of a categorical variable
Boxplot
A summary of numerical values across categories. The middle line represents the median and tells you the typical value for each category (e.g., the typical height for females and males). The lines (whiskers) give you an idea of the typical range (min and max) of values for each category.
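The numbers behind a boxplot can be computed directly. This is a sketch using only Python's standard library; the heights and the helper name `five_number_summary` are made up for illustration. Note that many plotting libraries draw whiskers only out to 1.5 × IQR and show points beyond that as outliers, so the whiskers match the min and max only when there are no extreme values.

```python
import statistics

# Hypothetical heights in inches, grouped by a categorical variable.
heights = {
    "female": [60, 62, 63, 64, 65, 66, 68],
    "male": [64, 66, 68, 69, 70, 72, 74],
}

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for one group of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

# One summary per category, i.e. one box per category in the plot.
summaries = {group: five_number_summary(vals) for group, vals in heights.items()}
for group, summary in summaries.items():
    print(group, summary)
```

The median is the middle line of each box, Q1 and Q3 are the box edges, and the min and max are the whisker ends in the simple case described above.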
What colors are good to use? Which colors and color combinations to avoid?
Avoid combining red and green, which are hard to tell apart for viewers with red-green color blindness. Green and purple work well together.
You want to quickly visualize the most popular age in your dataset. What type of visualization would be most appropriate?
Bar chart. When comparing values, bar charts make it much easier to see the difference between groups.
You have a dataset including gender information. What type of visualization would be most appropriate?
barplot
You want to see if there's a difference between the heights of individuals who eat breakfast vs those who do not. What type of visualization would be most appropriate?
box plot
Histograms are used for..................
information about a single set of numbers.
Density plots are used for
information about a single set of numbers. Demonstrates the distribution of the data (A smoothed version of a histogram) and helps to identify extreme values.
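To make "a smoothed version of a histogram" concrete, here is a rough standard-library sketch. The ages, the bin settings, the bandwidth, and both helper functions are assumptions chosen for illustration, and the Gaussian kernel is one common smoothing choice, not necessarily the one used in class.

```python
import math

# Hypothetical sample of a single quantitative variable (ages).
ages = [21, 22, 22, 23, 25, 25, 26, 28, 31, 45]

def histogram_counts(values, start, width, n_bins):
    """Count the values that fall in each bin [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts

def gaussian_kde(values, x, bandwidth):
    """Density estimate at x: the average of Gaussian bumps centered on each value."""
    norm = len(values) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm

# Histogram: discrete counts per 5-year bin, printed as a text chart.
counts = histogram_counts(ages, start=20, width=5, n_bins=6)
for i, c in enumerate(counts):
    print(f"{20 + 5 * i}-{24 + 5 * i}: {'#' * c}")

# Density: a smooth curve evaluated at a few ages.
for x in (20, 25, 30, 35, 40, 45):
    print(x, round(gaussian_kde(ages, x, bandwidth=2.0), 4))
```

Both views show the cluster in the twenties, and the isolated value at 45 shows up as a separate bar in the histogram and a small separate bump in the density, which is how extreme values stand out.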
What functions are used to generate the basic plot types discussed in class?
k
What package is used for basic data visualization in Python?
k
What takes a visualization from exploratory to explanatory?
k
What does it mean to iteratively improve a visualization?
l
What is the "data to ink ratio" and how does changing that ratio aid in understanding data visualization?
l
What's the difference between an exploratory and an explanatory visualization?
l
Barplots are used to
show the count of values in each category of a categorical variable. For example, you can easily see whether there are more females than males.
histograms and density plots are used for......
quantitative variables
box plot
summarize a quantitative and a categorical variable together, with the categorical variable on the x-axis and the quantitative variable on the y-axis
A data visualization is good if.....
the colors indicate what the variable name is describing, but the % is not shown for every number
Which graphs are appropriate for various types of data (i.e. when would you make a histogram? A scatterplot? etc.)
1) Histograms are used for information about a single set of numbers. 2) Density plots are used for information about a single set of numbers. They demonstrate the distribution of the data (a smoothed version of a histogram) and help to identify extreme values. 3) Scatterplots are used to show the relationship between two numerical variables (e.g., generally, the more you weigh, the taller you are). 4) Barplots are used to show the count of values within a categorical variable. You can easily see that there are more females than males. 5) Boxplots are a summary of numerical values across categories. The middle line represents the median and tells you the typical height for females and males. The lines give you an idea of the typical range (min and max) of values for each category. 6) Tables are effective ways to display data summaries. Tables count as data visualizations.
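The pairing of plot type to data type can be sketched in code. This example assumes matplotlib, which may not be the package used in class, and a randomly generated dataset (gender, height, weight) that exists only for illustration. Density plots are left out because matplotlib has no single call for them; libraries such as seaborn provide one (`kdeplot`).

```python
import random
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

random.seed(0)

# Hypothetical dataset: 100 people with gender, height (in), and weight (lb).
genders = [random.choice(["female", "male"]) for _ in range(100)]
heights = [random.gauss(64 if g == "female" else 69, 2.5) for g in genders]
weights = [4.5 * h - 150 + random.gauss(0, 12) for h in heights]

fig, axes = plt.subplots(2, 2, figsize=(9, 7))

# Histogram: distribution of one quantitative variable.
n, bin_edges, _ = axes[0, 0].hist(heights, bins=12)
axes[0, 0].set_xlabel("Height (in)")
axes[0, 0].set_ylabel("Count")

# Scatterplot: relationship between two quantitative variables.
axes[0, 1].scatter(weights, heights)
axes[0, 1].set_xlabel("Weight (lb)")
axes[0, 1].set_ylabel("Height (in)")

# Barplot: counts within one categorical variable.
gender_counts = Counter(genders)
axes[1, 0].bar(list(gender_counts.keys()), list(gender_counts.values()))
axes[1, 0].set_ylabel("Count")

# Boxplot: one quantitative variable summarized across categories.
groups = sorted(gender_counts)
by_group = [[h for h, g in zip(heights, genders) if g == grp] for grp in groups]
axes[1, 1].boxplot(by_group)
axes[1, 1].set_xticks([1, 2])
axes[1, 1].set_xticklabels(groups)
axes[1, 1].set_ylabel("Height (in)")

fig.tight_layout()
fig.savefig("plot_types.png")  # hypothetical output file name
```

Each panel labels its axes, in line with the best practices listed earlier, and the barplot's y-axis starts at 0 by default.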
Tables are
effective ways to display data summaries. Tables count as data visualizations