Stats 1430 Midterm
interquartile range
- Distance taken up by the middle 50% of the data - A small IQR means a high concentration of data in the middle
standard deviation
- Same units as original data - Never negative - Can equal 0 - Affected by outliers and skewness
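A quick check of these properties, as a sketch only. The data lists below are made-up values for illustration, and the sample SD from Python's `statistics.stdev` is assumed to be the version the course uses.

```python
import statistics

# Hypothetical data sets, not from the course
constant = [7, 7, 7, 7]                # every value identical
typical = [10, 12, 11, 13, 12]         # no outliers
with_outlier = typical + [60]          # same data plus one extreme value

sd_constant = statistics.stdev(constant)      # no spread at all, so SD is 0
sd_typical = statistics.stdev(typical)
sd_outlier = statistics.stdev(with_outlier)   # pulled up by the outlier

print(sd_constant, sd_typical, sd_outlier)
```

The SD is 0 only when all values are the same, and a single outlier makes it noticeably larger.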
Correlation of a sample (r) will always be a number between
-1 and 1
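To see the range in action, here is a minimal sketch that computes r by hand. The helper name `pearson_r` and the data are assumptions made up for this example.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation r, computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])     # points on a perfect uphill line
r_down = pearson_r(x, [10, 8, 6, 4, 2])   # points on a perfect downhill line
r_weak = pearson_r(x, [3, 1, 4, 1, 5])    # loose pattern, somewhere in between

print(r_up, r_down, r_weak)
```

A perfect uphill line gives r = 1, a perfect downhill line gives r = -1, and anything less tidy lands strictly between the two.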
3 characteristics when describing relationship between x and y
1. General pattern - linear relationship or not 2. Direction - up or down 3. Strength - how closely the points follow the pattern
Measures of Center
1. Mean 2. Median
Correlation is a measure of the strength and direction of what type of relationship between two quantitative variables?
Linear relationship
What descriptive statistics are in units of MPG?
All four: a. Mean b. Median c. IQR d. Standard deviation
Correlation has a. The same units as the data b. No units
b
If two data sets have the same median, they must have the same IQR. a. True b. False
b
If two data sets have the same standard deviation, they must have the same mean. a. True b. False
b
The mean of a data set must be one of the values in the data set. a. True b. False
b. false
An "and" probability is the probability that _______ events occur.
both
skewed right
- A few larger values - Mean > Median
skewed left
- A few small values - Mean < Median
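The mean-versus-median comparison for both skew directions can be checked with a short sketch. Both data sets are invented for illustration.

```python
import statistics

# Hypothetical data: one unusually large value drags the mean up
right_skewed = [20, 22, 23, 25, 26, 28, 95]
# Hypothetical data: one unusually small value drags the mean down
left_skewed = [5, 72, 74, 75, 77, 78, 80]

right_mean = statistics.mean(right_skewed)
right_median = statistics.median(right_skewed)
left_mean = statistics.mean(left_skewed)
left_median = statistics.median(left_skewed)

print(right_mean, right_median)   # mean sits above the median
print(left_mean, left_median)     # mean sits below the median
```

The median barely reacts to the extreme value, while the mean is pulled toward the tail.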
boxplots
- Can immediately see median and IQR and whether or not data is skewed - Bigger boxes mean more variability - Easier to compare data sets
Rules of Independence
1. P(A|B) = P(A) or P(B|A) = P(B) 2. P(A|B) = P(A|Not B) 3. P(A and B) = P(A) * P(B)
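The three checks can be tried on a small example. This sketch assumes one roll of a fair six-sided die, with A = "even" and B = "4 or less"; the events and helper names are made up for illustration.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}   # sample space for one fair die roll
A = {2, 4, 6}                   # event A: roll is even
B = {1, 2, 3, 4}                # event B: roll is 4 or less
not_B = outcomes - B

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

def cond(event, given):
    """Conditional probability P(event | given)."""
    return prob(event & given) / prob(given)

rule1 = cond(A, B) == prob(A) and cond(B, A) == prob(B)
rule2 = cond(A, B) == cond(A, not_B)
rule3 = prob(A & B) == prob(A) * prob(B)

print(rule1, rule2, rule3)
```

All three checks agree here, so A and B are independent. For independent events, passing any one of the checks means the others pass too.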
Measures of Variability
1. Standard deviation 2. IQR
Standard deviation cannot be negative.
True
Suppose at the end of the year everyone at Bob's restaurant gets a 5 PERCENT raise on their existing hourly wage. How does this raise affect the standard deviation of their wages? a. SD is larger than before b. SD is smaller than before c. SD is the same as before
a
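Why the SD grows: a 5 percent raise multiplies every wage by 1.05, and multiplying all values by a constant multiplies the SD by that constant. A sketch with invented wages, with a flat $1 raise shown for contrast:

```python
import statistics

# Hypothetical hourly wages, for illustration only
wages = [9.50, 11.00, 12.25, 15.00, 22.00]

percent_raise = [w * 1.05 for w in wages]   # everyone gets 5 percent more
flat_raise = [w + 1.00 for w in wages]      # everyone gets $1 more

sd_before = statistics.stdev(wages)
sd_percent = statistics.stdev(percent_raise)
sd_flat = statistics.stdev(flat_raise)

print(sd_before, sd_percent, sd_flat)
```

The percent raise stretches the gaps between wages, so the SD is 1.05 times as large. The flat raise shifts every wage by the same amount and leaves the SD unchanged.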
Why would you say the data must be close to symmetric? a. Because it is easier to interpret symmetric data b. Because the mean and median are close to each other c. Because the max and min are close together d. There is not enough information to determine that the data are close to symmetric
b
A ____________ probability is the probability of a single event.
marginal
An ________ probability is the probability that at least one of the events occurs.
or
residual
observed - predicted
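A small worked example of the subtraction. The fitted line y-hat = 2 + 3x and the data points are assumptions chosen only to illustrate.

```python
def predict(x):
    """Hypothetical fitted line: y-hat = 2 + 3x."""
    return 2 + 3 * x

# Hypothetical (x, observed y) pairs
points = [(1, 6), (2, 7), (4, 14)]

# residual = observed - predicted
residuals = [y - predict(x) for x, y in points]

print(residuals)
```

A positive residual means the point lies above the line, a negative one means it lies below, and 0 means the point is exactly on the line.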
A ____________ is a set of all possible outcomes of some random process.
sample space
In general, can you recreate the original data values from its histogram?
No
Calculating IQR
Q3 - Q1
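A sketch of the calculation using Python's `statistics.quantiles`. The data are made up, and quartile conventions differ slightly between textbooks and software, so the values of Q1 and Q3 may not match a hand method on every data set.

```python
import statistics

# Hypothetical data set, already sorted for readability
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 splits the data into quarters and returns the three cut points
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1   # width of the middle 50% of the data

print(q1, q3, iqr)
```

For this data set Q1 is 3 and Q3 is 9, so the IQR is 6.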
Three important characteristics in a data set (quantitative data)
1. Shape - symmetric or skewed 2. Center - where the middle of the data is concentrated 3. Variability - how spread out the data are around the center (e.g., range)
Scatterplots examine relationships between what types of variables? a. Categorical b. Quantitative c. Both categorical and quantitative variables
b
Suppose the mean of a data set is 10, the median is 20, and the standard deviation is 5. What happens if you multiply all the values in the data set by -1? a. The mean, median, and SD are all multiplied by -1 b. The mean and median are multiplied by -1 and the standard deviation does not change c. The mean, median, and standard deviation do not change d. None of the above
b
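A quick check of this answer, sketched with an invented data set. Multiplying by -1 flips the data around 0, so the center changes sign but the spread stays the same, and the SD can never be negative.

```python
import statistics

# Hypothetical data, for illustration only
data = [4, 8, 15, 16, 23, 42]
flipped = [-1 * v for v in data]

mean_before = statistics.mean(data)
mean_after = statistics.mean(flipped)
median_before = statistics.median(data)
median_after = statistics.median(flipped)
sd_before = statistics.stdev(data)
sd_after = statistics.stdev(flipped)

print(mean_before, mean_after)
print(median_before, median_after)
print(sd_before, sd_after)
```

The mean and median change sign, while the two SD values match.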
The median of a data set must be one of the values in the data set. a. True b. False
b. false
Suppose you have a numerical data set that is very much skewed to the left. If you had to pick one, which measure of center best represents most of the data in this case? a. Mean b. Median
b. median
When skewness is present in a set of data, which of the following descriptive summary measures are most appropriate? a. Mean and standard deviation b. Maximum and minimum c. IQR and median d. Mean and IQR
c
A ____________ probability is the probability of one event, given another has occurred.
conditional