Stats Week 2 - Data Displays, Descriptive Statistics, and Graphs
Q1
25th percentile = median of the lower half of the data
Q2
50th percentile = median
Q3
75th percentile = median of the upper half of the data
From a boxplot you can tell the exact pattern of the data set (beyond just whether the data are skewed or symmetric).
False
The median must be one of the numbers in the data set.
False
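A quick check of this card, as a minimal sketch with made-up values: with an even number of data points, the median is the average of the two middle values, so it can fall between them.

```python
import statistics

data = [1, 2, 3, 4]            # made-up values, even count
med = statistics.median(data)  # average of the two middle values, 2 and 3

print(med)           # 2.5
print(med in data)   # False
```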
The standard deviation has no units.
False
A flat histogram (with a line straight across) contains no variability whatsoever, according to our definition.
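False (a flat histogram has more variability, because the values are spread evenly across the whole range instead of piling up near the center)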
The slices on a pie chart represent relative frequencies.
True
The starting point can affect the way a graph looks.
True
What is the 5 number summary?
min, Q1, median, Q3, max
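One way to compute the five-number summary, sketched under the assumption that Q1 and Q3 are the medians of the lower and upper halves, with the middle value left out of both halves when the count is odd. Other textbooks and software use slightly different quartile rules, so results may differ a little. The function name `five_number_summary` and the data are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]          # values below the middle position
    upper = data[n - half:]      # values above the middle position
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )

scores = [2, 4, 4, 5, 7, 8, 9, 10, 12]   # made-up data
print(five_number_summary(scores))       # (2, 4.0, 7, 9.5, 12)
```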
Which type of graph is best for COMPARING two or more quantitative data sets, a boxplot or a histogram?
Boxplot
Which type of graph is made from the 5-number summary?
Boxplot
Categorical Data
Data that consists of names, labels, or other nonnumerical values
A listing of all possible values in a data set and how often they occurred is called a data _____________________.
Distribution
If the mean of a data set is large, the standard deviation has to be large also.
False
Data Distribution
List of all data values and how often they occur
Center
Measured by the mean and the median, which *don't have to be a value in the data set*
more variability vs. less variability
More variability: about as many values lie far from the center as close to it (a flat shape, the same height all around). Less variability: the values are concentrated in the middle (the shape goes up, then down).
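The comparison can be checked numerically. This is a small sketch with made-up data sets over the same range, using the population standard deviation (`statistics.pstdev`) as an assumed measure of variability.

```python
import statistics

flat = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]    # every value equally common
mound = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]   # piled up around the middle

sd_flat = statistics.pstdev(flat)
sd_mound = statistics.pstdev(mound)
print(sd_flat > sd_mound)   # True
```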
Frequencies
Number in each category - table, bar graph showing # in each category
Which is more affected by skewness, the IQR or standard deviation?
Standard Deviation
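A rough illustration with made-up data: stretching one value far to the right leaves the IQR alone but inflates the standard deviation. The helper `iqr` is hypothetical and assumes the median-of-halves rule for Q1 and Q3.

```python
import statistics

def iqr(values):
    """Q3 - Q1, with quartiles taken as medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    q1 = statistics.median(data[:half])
    q3 = statistics.median(data[len(data) - half:])
    return q3 - q1

symmetric = [1, 2, 3, 4, 5, 6, 7, 8]
skewed = [1, 2, 3, 4, 5, 6, 7, 80]   # same data, largest value stretched to the right

print(iqr(symmetric), iqr(skewed))   # 4.0 4.0
print(statistics.stdev(symmetric), statistics.stdev(skewed))   # second value is far larger
```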
As we heard in lecture, the "average distance from the mean" is measured by the __________________________.
Standard deviation
Boxplot A and Boxplot B are drawn on the same axes. The box part of Boxplot A is shorter in length than the box part of Boxplot B. What can you tell about the two data sets?
You cannot tell anything from the information provided.
Skewed left
mean < median
Symmetric Data
mean = median
relative frequency
the fraction or percent of the time that an event occurs in an experiment - table, pie chart, or bar graph showing % in each category
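Frequencies and relative frequencies can be tabulated in a few lines. This is a minimal sketch with made-up categorical data; the variable names are illustrative only.

```python
from collections import Counter

colors = ["red", "blue", "red", "green", "blue", "red"]   # made-up categorical data

frequencies = Counter(colors)              # count in each category
total = sum(frequencies.values())
relative = {category: count / total for category, count in frequencies.items()}

print(frequencies["red"])   # 3
print(relative["red"])      # 0.5
```

The relative frequencies always add up to 1 (or 100%), which is why they can be drawn as slices of a pie chart.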
Variables
Any measurable conditions, events, characteristics, or behaviors that are controlled or observed in a study.
What is the standard deviation of the data set 1, 1, 1, 1?
0
If you add 10 to every value of a data set, what happens to the standard deviation?
Stays the same
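Both standard deviation facts above can be verified directly. The sketch below uses the population standard deviation; the sample version (`statistics.stdev`) behaves the same way here. The data in `original` are made up.

```python
import math
import statistics

constant = [1, 1, 1, 1]
print(statistics.pstdev(constant))   # 0.0, no value differs from the mean

original = [3, 7, 8, 12]                # made-up data
shifted = [x + 10 for x in original]    # add 10 to every value

# Shifting moves the mean by 10 but leaves every distance from the mean unchanged.
print(math.isclose(statistics.pstdev(original), statistics.pstdev(shifted)))   # True
```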
If a data set is skewed to the left, how will the mean and median compare?
The mean will be less than the median.
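A small made-up example of left skew: the few low values in the tail pull the mean down, while the median stays with the bulk of the data.

```python
import statistics

left_skewed = [1, 8, 9, 9, 10, 10]   # made-up data with a long tail toward low values

mean = statistics.mean(left_skewed)
median = statistics.median(left_skewed)
print(mean < median)   # True
```

Mirroring the data (a long tail toward high values) reverses the inequality.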
Variability
how the data are dispersed or spread around the mean/center - standard deviation, quartiles
IQR
interquartile range (Q3-Q1)
What can never be negative?
standard deviation
Boxplot
- A graph of the five-number summary - Box from Q1 to Q3 indicates the middle 50% of the data - Can immediately see median, IQR, and skewness - Useful when you are interested in how concentrated the data are - Can't see the mean or sample size - Easy to compare data sets
An outlier in a data set can significantly affect the value of the mean but not the median.
True
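To see the effect of an outlier, compare a made-up data set with a copy where the largest value is replaced by an extreme one.

```python
import statistics

typical = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 500]   # 5 replaced by an extreme value

print(statistics.mean(typical), statistics.median(typical))             # 3 3
print(statistics.mean(with_outlier), statistics.median(with_outlier))   # 102 3
```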
You can have two data sets with the same mean but different standard deviations.
True
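For instance, these two made-up data sets share a mean of 5 but have different spreads (sample standard deviation shown):

```python
import statistics

tight = [4, 5, 6]
spread = [0, 5, 10]

print(statistics.mean(tight) == statistics.mean(spread))   # True
print(statistics.stdev(tight), statistics.stdev(spread))   # 1.0 5.0
```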
Standard deviation
a computed measure of how much scores vary around the mean score - Same units as the original data - Never negative - Can = 0 - Is affected by outliers and skewness
Histograms
- Nice way to see shape of data set - Hard to identify quartiles - Rough idea of center or variability - Hard to compare data sets
Suppose your data represent revenues from a group of 20 stores in a retail chain across the country, and revenue is measured in millions of dollars. The standard deviation of this data set would also be measured in millions of dollars.
True
Quantitative Data
measurements and counts
Skewed right
mean > median