Stats final exam
Conditions for 1-sample t for means
1. Randomness of data collection 2. Normality of population (plot has no outliers) or large sample size
Conditions for matched pairs t for mean of differences
1. Randomness of data collection 2. Normality of population (plot has no outliers) or large sample size (applied to differences)
Conditions for chi-square
1. Randomness of data collection 2. Large sample size (All expected counts > 5)
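The expected-count condition above can be checked directly from a two-way table. The sketch below assumes the usual formula for expected counts under no association (row total × column total / grand total); the function names and the example tables are hypothetical, made up for illustration.

```python
def expected_counts(observed):
    """Expected counts under no association:
    (row total * column total) / grand total for each cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def large_sample_ok(observed, minimum=5):
    """True when every expected count exceeds the minimum (> 5 on the card)."""
    return all(e > minimum for row in expected_counts(observed) for e in row)


# Hypothetical 2x3 table of observed counts (illustration only).
observed = [[20, 30, 50],
            [30, 20, 50]]
print(expected_counts(observed))
print(large_sample_ok(observed))
```

Note that the check is on the expected counts, not on the observed counts.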
Conditions for 2 sample z for proportions
1. Randomness of data collection 2. Normality of the sampling distribution of p̂1 − p̂2:
Conditions for 1-sample z for proportions
1. Randomness of data collection 2. Normality of the sampling distribution of p̂:
Conditions for 2-sample t for difference of means (pooled variance)
1. Randomness of data collection 2. Normality of populations (sample plots have no outliers) or large n 3. Equal population standard deviations (Largest s/smallest s < 2)
Conditions for ANOVA
1. Randomness of data collection 2. Normality of populations (sample plots have no outliers) or large n 3. Equal population standard deviations (Largest s/smallest s < 2)
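The equal-standard-deviation rule of thumb shared by the pooled 2-sample t and ANOVA cards (largest s / smallest s < 2) is easy to compute. This is a minimal sketch; the function name and the sample data are hypothetical, and it assumes no group has a sample standard deviation of zero.

```python
from statistics import stdev


def equal_sd_ok(*samples, max_ratio=2.0):
    """Rule of thumb from the card: largest s / smallest s < 2.
    Assumes every sample has a nonzero standard deviation."""
    sds = [stdev(sample) for sample in samples]
    return max(sds) / min(sds) < max_ratio


# Hypothetical groups (illustration only).
group_a = [10, 12, 14, 16, 18]
group_b = [10, 13, 14, 15, 18]
group_c = [13, 14, 14, 14, 15]

print(equal_sd_ok(group_a, group_b))           # similar spreads
print(equal_sd_ok(group_a, group_b, group_c))  # group_c is much less spread out
```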
Suppose you want to estimate the proportion of voters who will vote for George Smith, a candidate for state representative. How many should you sample in order to estimate p with a margin of error of 0.05 (5%) and 95% confidence?
385
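The answer of 385 can be reproduced with the standard sample-size formula n = (z*)² · p*(1 − p*) / m², rounded up. The sketch below assumes the conservative guess p* = 0.5, since the question gives no prior estimate of p; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, p_star=0.5):
    """Smallest n whose margin of error is at most `margin`.
    p_star = 0.5 is the conservative guess used when p is unknown."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return ceil(z_star ** 2 * p_star * (1 - p_star) / margin ** 2)


print(sample_size_for_proportion(0.05))  # 385
```

The raw value is about 384.1, and sample sizes are always rounded up so the margin of error does not exceed the target.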
Role type classification for Chi-Square
C -> C
Role type classification for 2-sample z for proportions
C -> C Explanatory variable has 2 levels
Role type classification for ANOVA (Analysis of Variance)
C -> Q Explanatory variable has 3+ levels
Role type classification for 2-sample t for difference of means (pooled variance)
C -> Q Explanatory variable has two levels
Role type classification for Matched pairs t for mean of differences
C -> Q Explanatory variable has two levels
Role type classification for 1-sample z for proportions
Categorical
Hypothesis for 1-sample t for means
H0: µ = µ0 Ha: µ > µ0, µ < µ0, or µ ≠ µ0
Hypothesis for Chi-Square
Ho: There is no association Ha: There is an association
Hypothesis for Linear Regression
Ho: There is no linear relationship between the two variables Ha: There is a linear relationship between the two variables
Hypothesis for 1-sample z for proportions
Ho: p = po Ha: p > po, p < po, or p ≠ po
Hypothesis for 2-sample z for proportions
Ho: p1 = p2 Ha: p1 < p2, p1 > p2, or p1 ≠ p2
Hypothesis for 2-sample t for difference of means (pooled variance)
Ho: µ1 = µ2 Ha: µ1 > µ2, µ1 < µ2, or µ1 ≠ µ2
Hypothesis for ANOVA (Analysis of Variance)
Ho: µ1 = µ2 = ... = µn Ha: at least one mean is different
Hypothesis for Matched pairs t for mean of differences
Ho: µd = 0 Ha: µd > 0, µd < 0, or µd ≠ 0
Conditions for linear regression
1. Linearity: linear pattern in scatterplot 2. Independence: randomness in data collection 3. Normality: histogram of residuals is approximately normal 4. Equal population standard deviations: scatterplot has no megaphone pattern
Table to use for 2-sample z for proportions
P-value: z table. z*: t table
Table to use for 1-sample z for proportions
P-value: z table. z*: t table
Role type classification for Linear Regression
Q -> Q
To do regression inference, the data must satisfy all of the following except: (a) The response variable (y) has a Normal distribution at each value of x. (b) The values of the explanatory variable (x) must follow a Normal distribution. (c) The true relationship must be linear. (d) The standard deviation of the y's about the true line is the same everywhere.
The values of the explanatory variable (x) must follow a Normal distribution.
df and table for chi-square
df = (r - 1) * (c - 1), where r = number of rows and c = number of columns. Table: chi-square
df and table for 1-sample t for means
df: n - 1 table: t
df and table for matched pairs t for mean of differences
df: n - 1 table: t
df and table for 2-sample t for difference of means (pooled variance)
df: n1 + n2 - 2 table: t
Changing the unit of measurement in the X or Y variable changes the value of r.
false
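A quick numerical check of why this is false: converting units multiplies each variable by a positive constant (and possibly adds one), which leaves r unchanged. The sketch computes Pearson's r by hand; the height and weight values are hypothetical, made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data (illustration only).
heights_in = [60, 62, 65, 68, 71]
weights_lb = [110, 125, 140, 155, 180]

# Same measurements expressed in different units.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

r_original = pearson_r(heights_in, weights_lb)
r_converted = pearson_r(heights_cm, weights_kg)
print(abs(r_original - r_converted) < 1e-9)  # same r up to floating-point rounding
```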
Margin of error for an approximate confidence interval for p is z* × √(p(1 − p)/n).
false
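One reading of why this is false: p is unknown when building a confidence interval, so the margin of error is computed from the sample proportion p̂ instead, as z* × √(p̂(1 − p̂)/n). The sketch below assumes that reading; the function name and the inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Margin of error for an approximate CI for p, using the
    sample proportion p_hat in the standard error."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical sample: p_hat = 0.5 from n = 385 respondents.
print(margin_of_error(0.5, 385))
```

With p̂ = 0.5, n = 385 is the smallest sample size that brings the 95% margin of error under 0.05.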
The mean of the sampling distribution of the sample proportion equals p̂.
false
We check np̂ ≥ 10 and n(1 − p̂) ≥ 10 before obtaining the p-value to test H0: p = 0.60.
false
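For a hypothesis test, the sampling distribution is worked out assuming H0 is true, so the usual check uses the null value p0 rather than p̂. A minimal sketch under that assumption, with the threshold of 10 taken from the card; the function name and sample sizes are hypothetical.

```python
def normality_ok_for_test(n, p0, minimum=10):
    """Check n*p0 >= 10 and n*(1 - p0) >= 10 using the null value p0."""
    return n * p0 >= minimum and n * (1 - p0) >= minimum


# Testing H0: p = 0.60 with two hypothetical sample sizes.
print(normality_ok_for_test(50, 0.60))  # 30 and 20 expected successes/failures
print(normality_ok_for_test(20, 0.60))  # only about 8 expected failures
```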
df for linear regression
n - 2
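The degrees-of-freedom formulas from the cards can be collected in one place for quick self-checking. The function names are hypothetical; the formulas are the ones stated in this deck.

```python
def df_chi_square(rows, cols):
    """Chi-square test on an r x c table: (r - 1)(c - 1)."""
    return (rows - 1) * (cols - 1)


def df_one_sample_t(n):
    """1-sample t for means: n - 1."""
    return n - 1


def df_matched_pairs_t(n_pairs):
    """Matched pairs t: n - 1, where n is the number of pairs."""
    return n_pairs - 1


def df_pooled_two_sample_t(n1, n2):
    """2-sample t with pooled variance: n1 + n2 - 2."""
    return n1 + n2 - 2


def df_linear_regression(n):
    """Linear regression inference: n - 2."""
    return n - 2


print(df_chi_square(3, 4))             # 6
print(df_pooled_two_sample_t(12, 15))  # 25
print(df_linear_regression(30))        # 28
```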
Which one of the following represents the parameter estimated with a 99% confidence interval for difference in proportions?
p1 − p2
Role type classification for 1-sample t for means
Quantitative
Correlation ignores the distinction between explanatory and response variables.
true
Correlation is a valid measure of the strength of a relationship whenever the relationship between two quantitative measurements on each individual appears linear in the scatterplot.
true
We compute the standard error of p̂ using the formula √(p̂(1 − p̂)/n).
true
When no information is available about the value of p and we need to determine the sample size needed to estimate a proportion, we can safely use p* = 0.5 in the sample size formula.
true
μ1 − μ2, the difference in two population means, represents the parameter used to compare the means of two populations.
true
What is the symbol for the difference between two population means?
μ1 − μ2