How to use statistics to describe quantitative data
What percentage of data falls below Q1?
25%
How does the median divide our data set?
50% of the values are larger than the median and 50% are smaller.
Define measures of spread
A measure of spread (variability, dispersion, scatter) refers to how the data within the set is "spread out" (or "dispersed", or "scattered") about the mean.
What's a bin?
An interval (a range of values) used to group data, for example an age range in a histogram. Each observation is counted in exactly one bin.
What's a box plot?
Box plots are a type of graph that summarizes a data set visually using its minimum, first quartile, median, third quartile, and maximum.
Standard deviation measures how tightly the observations _________________
cluster around the mean.
What are descriptive statistics?
Descriptive statistics are brief descriptive coefficients that summarize a given data set, which can represent either an entire population or a sample of one.
Define "measures of center"
A single value that describes the middle or typical value of a data set, such as the mean (average), median, or mode.
Define mode
The most frequent number in a data set.
What's the benefit of quartiles, deciles, percentiles?
They describe where a particular observation lies compared with everyone else.
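One way to make "where an observation lies compared with everyone else" concrete is a percentile rank. The sketch below assumes the convention "percentage of values strictly below the observation" (other conventions count ties differently); the function name `percentile_rank` and the scores are invented for illustration.

```python
def percentile_rank(values, observation):
    """Return the percentage of values strictly below the observation."""
    below = sum(1 for v in values if v < observation)
    return 100 * below / len(values)


# Hypothetical exam scores, invented for illustration.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

rank = percentile_rank(scores, 90)
print(rank)  # 70.0 -> a score of 90 is higher than 70% of the scores
```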
What is the first task in descriptive statistics?
To find some measure of the middle of a set of data.
A column in a spreadsheet is associated with a random variable.
True
Median also means ______________________
middle value
The point of using the median is to keep outliers from skewing the measure of center.
True
What's the symbol for standard deviation?
σ (the Greek letter sigma)
Quartiles divide data at three points that are called ________________
the lower quartile, the median, and the upper quartile
Formula for calculating the range is ___________________________
range = maximum - minimum. Calculate the range by subtracting the minimum from the maximum, that is, the difference between the largest and smallest values.
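As a quick check of the range formula (largest value minus smallest value), here is a minimal sketch. The data set is invented for illustration.

```python
# Hypothetical data, invented for illustration.
data = [4, 9, 1, 7, 12]

# Range = largest value minus smallest value.
data_range = max(data) - min(data)
print(data_range)  # 11
```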
How do I find Q1?
1. Order the data and divide it at the median into a lower half and an upper half. 2. Find the median of the lower half; that value is Q1.
What are the 3 measures of central tendency?
1. Mean 2. Median 3. Mode
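The three measures can be computed with Python's standard `statistics` module. This is a small sketch on an invented data set, not the only way to compute them.

```python
import statistics

# Hypothetical data, invented for illustration.
data = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(data)      # sum / count = 30 / 6 = 5
median_value = statistics.median(data)  # mean of the two center values, 3 and 5
mode_value = statistics.mode(data)      # most frequent value, 3

print(mean_value, median_value, mode_value)
```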
What are some common aggregations?
1. Mean 2. median 3. mode
What 2 things do descriptive statistics measure?
1. Measures of central tendency 2. Measures of variability (spread).
What do the mean, median, and mode output?
A single number from lots of different numbers.
Each column in a statistical spreadsheet is assigned a _____________
Capital letter or variable
The sum symbol in Excel is called _________________
Sigma (Σ)
Example of standard deviation: The average weight of passengers on a plane is 155 pounds. A group of marathon runners also has an average weight of 155 pounds. Which group would have more dispersion from the midpoint?
The airline passengers. For example, there might be a 325-pound person and a 6-pound baby. The marathon runners are likely all near 155 pounds.
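The comparison can be illustrated with two made-up groups that share the same mean of 155 pounds. The individual weights below are assumptions chosen only to match the card's description; they are not real data.

```python
import statistics

# Hypothetical weights in pounds; both groups average 155.
passengers = [6, 120, 155, 169, 325]
runners = [150, 153, 155, 157, 160]

passenger_mean = statistics.mean(passengers)
runner_mean = statistics.mean(runners)

# Population standard deviation: spread of each group around its mean.
passenger_sd = statistics.pstdev(passengers)
runner_sd = statistics.pstdev(runners)

print(passenger_mean, runner_mean)  # same center
print(passenger_sd > runner_sd)     # True: the passengers are more spread out
```

With these numbers the passengers' standard deviation is roughly 102 pounds while the runners' is roughly 3.4 pounds, even though the means are identical.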
"What is the Variance?"
The average of the squared differences from the Mean.
How do you calculate the median?
The formula is different for an even set of values vs. an odd set of values.
How do you calculate the variance?
To calculate the variance follow these steps: 1. Work out the mean (the simple average of the numbers). 2. For each number, subtract the mean and square the result (the squared difference). 3. Work out the average of those squared differences.
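The steps above can be sketched directly in Python. This version divides by the number of values, which is the population variance; a sample variance would divide by one less than the number of values. The function name `population_variance` and the data are invented for illustration.

```python
import statistics


def population_variance(values):
    """Average of the squared differences from the mean."""
    mean = sum(values) / len(values)                   # step 1: the mean
    squared_diffs = [(v - mean) ** 2 for v in values]  # step 2: squared differences
    return sum(squared_diffs) / len(squared_diffs)     # step 3: their average


# Hypothetical data, invented for illustration. The mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = population_variance(data)
print(variance)  # 4.0

# The standard library's population variance agrees.
print(variance == statistics.pvariance(data))  # True
```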
What's a hundredth or percentile?
A percentile divides the distribution into 100 equal parts, so each percentile represents 1 percent of the distribution.
The standard deviation is the descriptive statistic that allows us to ________________________________
assign a single number to this dispersion around the mean.
What are the four aspects of analyzing quantitative data?
Center, shape, spread, and outliers
Measures of spread give you an idea of _________________________
how the observations (for example, students) differ from one another
Define standard deviation.
It measures how dispersed the data are from their mean.
What's the range?
the difference between the highest and lowest numbers in a dataset.
What is variability?
The extent to which data points in a statistical distribution or data set vary from the average value, as well as the extent to which these data points differ from each other.
How do I find the five number summary of data with an even set of values?
1. First order the values: 1, 2, 3, 3, 5, 8, 10, 105. 2. What's the minimum? 1. What's the maximum? 105. 3. Find the median: with an even number of values it is the mean of the two center values, (3 + 5) / 2 = 4. 4. To find Q1 and Q3, divide the data between the two values used to find the median, 3 and 5. The lower half is 1, 2, 3, 3 and the upper half is 5, 8, 10, 105. 5. Q1 is the median of the lower half, 2.5, and Q3 is the median of the upper half, 9. 6. Calculate the range by subtracting the minimum from the maximum: 105 - 1 = 104. 7. Calculate the interquartile range: Q3 - Q1 = 9 - 2.5 = 6.5.
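The walkthrough above can be sketched as a small Python function. It uses the "median of each half" method from these cards; spreadsheet and library quartile functions may interpolate differently and give slightly different Q1 and Q3. The function name `five_number_summary` is invented for illustration.

```python
import statistics


def five_number_summary(values):
    """Min, Q1, median, Q3, max using the median-of-halves method.

    Assumes at least two values. With an odd count, the middle value
    is left out of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return {
        "min": ordered[0],
        "q1": statistics.median(lower),
        "median": statistics.median(ordered),
        "q3": statistics.median(upper),
        "max": ordered[-1],
    }


data = [1, 2, 3, 3, 5, 8, 10, 105]
summary = five_number_summary(data)

data_range = summary["max"] - summary["min"]  # 105 - 1 = 104
iqr = summary["q3"] - summary["q1"]           # 9 - 2.5 = 6.5

print(summary)
print(data_range, iqr)
```

Note how the single large value, 105, stretches the range to 104 while the interquartile range stays at 6.5.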
What's the procedure for calculating the median for an odd number of values?
1. Order the set of values from smallest to largest. 2. Count how many values you are working with. 3. With an odd number of values, the median is the exact middle value.
What's the procedure for calculating the median for an even number of values?
1. Order the set of values from smallest to largest. 2. Calculate the mean of the 2 center values to get the median.
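Both procedures (odd and even counts) fit in one short function. This is a minimal sketch; the function name `median` is chosen for illustration, and Python's standard library offers `statistics.median` for the same job.

```python
def median(values):
    """Middle value for an odd count, mean of the two center values for an even count."""
    ordered = sorted(values)  # order from smallest to largest
    n = len(ordered)          # count the values
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]   # odd count: the exact middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of the 2 center values


odd_median = median([10, 1, 3, 8, 2, 5, 3])         # ordered: 1, 2, 3, 3, 5, 8, 10
even_median = median([1, 2, 3, 3, 5, 8, 10, 105])   # center values: 3 and 5

print(odd_median)   # 3
print(even_median)  # 4.0
```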
What are the 4 measures of spread?
1. Range 2. Interquartile range 3. Standard deviation 4. Variance
Here is a set of numbers: (1,2,3) 3 (5,8,10). What's the median?
3
What percentage of data falls below Q3?
75%
What's a coefficient?
A number used to multiply a variable. Example: 6z means 6 times z, and "z" is a variable, so 6 is a coefficient.
What's a histogram?
A chart that shows how many observations fall into each bin, for example age ranges. You would plot how many people fall in each age range.
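To show the idea of counting people per age range, here is a rough text-only sketch. The ages, the bin width of 20 years, and the helper name `bin_label` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical ages, invented for illustration.
ages = [3, 17, 21, 25, 34, 38, 41, 47, 52, 68]
bin_width = 20  # assumed width of each age range


def bin_label(age):
    """Return the label of the age range that contains this age, e.g. '20-39'."""
    start = (age // bin_width) * bin_width
    return f"{start}-{start + bin_width - 1}"


# Count how many people fall in each age range.
counts = Counter(bin_label(age) for age in ages)

# Print one row of '#' marks per age range, ordered by the start of the range.
for label in sorted(counts, key=lambda s: int(s.split("-")[0])):
    print(f"{label:>6} | {'#' * counts[label]}")
```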
Define aggregation
A way to turn multiple numbers into fewer numbers.
Define the term "normal distribution"
Data that are distributed normally are symmetrical around their mean in a bell shape.
What is spread?
How far from average or median the numbers in a data set are.
What's the median?
It divides a distribution in half, meaning half of the observations lie above the median and half lie below.
How do quartiles work?
Just as the median divides the data in half, so that 50% of the measurements lie below the median and 50% lie above it, the quartiles break the data down into quarters.
What are the 3 measures of center?
Mean, median, mode
What might distort the mean or average of a set of data?
Outliers. For example, when measuring income, the few mega-rich will skew the per capita income measurement.
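A small numeric sketch shows the effect. The incomes below are invented for illustration: adding one very large income pulls the mean far upward while the median barely moves.

```python
import statistics

# Hypothetical annual incomes, invented for illustration.
incomes = [30_000, 35_000, 40_000, 45_000, 50_000]
with_outlier = incomes + [10_000_000]  # add one mega-rich person

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before)  # both 40000
print(mean_after, median_after)    # mean jumps to 1700000, median is 42500.0
```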
The lower quartile, or first quartile, is denoted as _________________________
Q1
What's the most common way professionals measure the spread of data with a single value?
Standard deviation
What's the interquartile range?
The IQR is the difference between Q3 and Q1.
The upper or third quartile, denoted as Q3, is ___________________________
The central point that lies between the median and the highest number of the distribution.
What's the five number summary?
The five number summary includes 5 items: 1. The minimum (min). 2. Q1 (the first quartile, or the 25% mark). 3. The median (med). 4. Q3 (the third quartile, or the 75% mark). 5. The maximum (max).
What's the standard deviation formula?
The formula is easy: it is the square root of the variance.
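As a minimal sketch of "square root of the variance", the example below uses the population variance (dividing by the number of values) on an invented data set.

```python
import math
import statistics

# Hypothetical data, invented for illustration. The mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(data)  # population variance: 4
std_dev = math.sqrt(variance)          # square root of the variance

print(std_dev)  # 2.0

# The standard library's population standard deviation gives the same result.
print(math.isclose(std_dev, statistics.pstdev(data)))  # True
```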
What values do you need to graph a box plot?
To graph a box plot the following data points must be calculated: the minimum value, the first quartile, the median, the third quartile, and the maximum value.
Each quartile contains 25% of the total observations.
True
What's a decile?
One of ten equal groups of a distribution; each decile contains 10% of the observations.
What is the algebra symbol for an unknown random variable?
X
How do you measure a quartile?
By dividing the distribution into four groups. Quartiles divide the data at three points (a lower quartile, the median, and an upper quartile) to form four groups of the dataset.
Define an "absolute figure".
Figures that can be interpreted without being compared to anything else, e.g. "It's 58 degrees outside."
Q1 can be thought of as the median for the ________________________
first half of the data
The word expect is associated with the _________________________ of our data set.
mean or average
Central tendency is also another word for ________________
the center of the data, most often summarized by the mean or average
If Q1 and Q3 can be thought of as the medians on either side of Q2, then for the set (1,2,3) 3 (5,8,10), what's Q1? What's Q3?
Q1 = 2 (the median of 1,2,3) and Q3 = 8 (the median of 5,8,10)
The weight of the airline passengers is _________________________
more spread out
The median divides a distribution in half. The distribution can be further divided into _________________________
quarters or quartiles. The first quartile consists of the bottom 25% of the observations, the second quartile consists of the next 25%, and so on.
If I place 9th in the golf tournament, that is a _________________ statistic.
relative
Q3 can be thought of as the median for the __________________________
second half of the data
Another name for the median is the ________________________
second quartile
What 4 things do we discuss when describing quantitative data?
shape, spread, outliers, and center
The definition of standard deviation is __________________________
the measure of how spread out numbers are
Q1 is the middle number that falls between the smallest value of the dataset and ________________________________
the median.
The second quartile, Q2, is also _________________________________
the median.
What does a quartile measure?
the spread of values above and below the median.
Measures of variability include _______________________
the standard deviation, the variance, and the minimum and maximum values.