Stats Last Exam
Which of the following would have to be true for the study to meet the assumptions underlying chi square? (Select all that apply.)
- The deviations (Oj - Ej) are normally distributed. - No student is majoring in more than one of the departments in the college (i.e. every student falls in only one category) - No one is measured more than once in the study (the scores are independent of each other)
What is a significance level
- A fixed value that we compare with the p value; choosing it is a matter of judgment about what counts as "statistically significant". - The probability that a pattern in the results could have arisen by chance.
power + beta =
1.00
What is a nominal scale:
A scale organized by names that can't be put in any sort of numerical order. Ex: Please circle the item that most closely identifies your racial background: 1) African 2) Asian ....
Does that make r a 'statistic' or a 'parameter'?
a statistic
What is a rank scale?
A subcategory of an ordinal scale in which items are organized in order from least to greatest. Ex: who won a race (first place, second place)
What is beta?
Beta is the probability that, if H0 is false, we will incorrectly decide not to reject H0.
The correlation between X and Y in our population is symbolized by...
ρ (rho)
You want to predict your stress level by using the number of minutes you exercise each week. What test should you use?
Regression. You still have two cardinal variables, but you are using one to predict the other, so regression.
A measure of correlation is called a
correlation coefficient
Pearson's r is the appropriate measure of correlation to use when you want to measure the _______ relationship between two cardinal variables.
linear
Given no other information, the best prediction of someone's score on variable Y would be the ____ of Y.
mean
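A small sketch of why the mean is the best single guess, assuming "best" means the smallest mean squared error. The scores and the helper name `mse_of_guess` are made up for illustration.

```python
def mse_of_guess(scores, guess):
    """Mean squared distance between each score and one constant guess."""
    return sum((y - guess) ** 2 for y in scores) / len(scores)

y_scores = [2, 1, 4, 3, 5]               # made-up scores on Y
y_mean = sum(y_scores) / len(y_scores)   # 3.0

# Guessing the mean for everyone gives a smaller error than other guesses.
for guess in (2, y_mean, 4):
    print(guess, mse_of_guess(y_scores, guess))
```

Any constant other than the mean makes the mean squared error larger, which is why regression has to beat the mean to be useful.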
Going back to the lecture on variance. The variance of a group of scores is the mean squared distance each score is from the ______. (textual, not numeric, answer)
mean
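The definition of variance can be written out directly. This sketch divides by N, matching the "mean squared distance" wording; a sample estimate of the population variance would divide by N - 1 instead. The scores are made up.

```python
def variance(scores):
    """Mean squared distance of each score from the mean (divides by N)."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / len(scores)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores, mean = 5
print(variance(scores))            # 4.0
```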
The hypothesis which essentially states that any pattern, any effect, we see in our sample is just the result of random error, that it is not indicative of the existence of a similar pattern or effect in the population from which we sampled, is known as the
null hypothesis
Is ρ (rho) a 'statistic' or a 'parameter'?
parameter
The group of interest, the group we are actually trying to find out about when we run a study, is the
population
If r = 0.5, then there is __________ correlation between X and Y in our sample.
positive
Let's say that bigger houses tend to cost more, and smaller houses tend to cost less. This would be an example of _________ correlation between X and Y.
positive
Is 'r' the correlation between X and Y in our sample, or the correlation between X and Y in our population?
r is the correlation in our sample.
The mean squared error of the prediction is the mean squared distance each score is from the ________ line.
regression
Which of the following would be an appropriate statistical tool to test if the independent variable had an effect in a repeated measures design.
t for dependent groups
Which of the following would be an appropriate statistical tool to test if there is a significant difference between two group means in a true experimental design?
- t test for independent groups - ANOVA for independent groups
The 'p value' of an experiment is...
the probability that we would have obtained data at least as extreme as ours if H0 were true
If H0 is true, and we repeated the experiment many times, then the mean value of chi square would be? μχ² =?
the same number as the df
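One way to check this is a quick simulation. The sketch below assumes the textbook definition of a chi-square variable as the sum of df squared standard normal scores; the function name, the number of repetitions, and the seed are arbitrary choices for illustration.

```python
import random

def simulate_mean_chi_square(df, n_experiments=20000, seed=0):
    """Average chi-square value over many simulated experiments where H0 is true."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_experiments):
        # One chi-square value: sum of df squared standard normal draws.
        total += sum(rng.gauss(0, 1) ** 2 for _ in range(df))
    return total / n_experiments

print(simulate_mean_chi_square(3))  # should land close to 3
```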
The power of an experiment is the probability that...
we will be able to conclude the independent variable had an effect, when it really did.
The power of an experiment is the probability that...
we will be able to reject H0 when H0 is actually false.
Which of the following best describes the possible values that r can be?
will always be between -1 and 1 (i.e. -1 ≤ r ≤ 1)
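A minimal sketch of how Pearson's r is computed from raw scores, using the standard sum-of-products formula. The data are made up; the three examples show a perfect positive, a perfect negative, and an in-between correlation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of cardinal scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))        # perfect positive
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))        # perfect negative
print(pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))  # positive, less than 1
```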
Our decision regarding the null hypotheses is limited to which of the following choices...
- do not reject H0 - reject H0
What is a correlational design?
1) Subjects are not randomly divided into groups 2) The independent variable is not manipulated by the experimenter. The independent variable is just measured and it becomes how people are assigned to groups
If you exclude the high and low values of your variables and only analyze the middle scores, then this will...
Artificially deflate the value of r.
If you exclude the middle values of your variables and only analyze the high and low scores, then this will...
Artificially inflate the value of r.
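The two range-restriction cards can be illustrated with simulated data. This is only a sketch under stated assumptions: the scores are generated so that the true correlation is about 0.6, cases are selected on X alone, and the "middle" is taken to be the central third of X scores. The function names and the seed are made up.

```python
import math
import random

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

def restriction_demo(n=6000, seed=1):
    """Return r for all cases, for the middle third of X, and for the outer thirds."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = 0.6 * x + 0.8 * rng.gauss(0, 1)  # built so the true r is about 0.6
        pairs.append((x, y))
    pairs.sort()                 # order the cases by their X score
    third = n // 3
    middle = pairs[third:2 * third]
    extremes = pairs[:third] + pairs[2 * third:]

    def r_of(subset):
        return pearson_r([p[0] for p in subset], [p[1] for p in subset])

    return r_of(pairs), r_of(middle), r_of(extremes)

r_all, r_middle, r_extremes = restriction_demo()
print(r_all, r_middle, r_extremes)
```

With these settings the middle-only r should come out smaller than the full-sample r, and the extremes-only r larger.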
Which of the following would be an appropriate statistical tool to test if there is a significant difference between three group means in a true experimental design?
ANOVA for independent groups
Moe wants to study the relationship between income and drug dependence. He interviews 300 randomly selected adults, and asks how many times a month they use illicit drugs. He also asks them what their annual income is. What kind of test should Moe use to analyze his data? A. independent groups t-test B. factorial ANOVA C. correlation D. chi squared goodness of fit
Correlation. Both variables (# of times they used drugs, and income) are cardinal, and Moe is looking for a relationship between them.
You want to know if the number of minutes you spend exercising each week is related to your stress level (rated on a 1-10 scale). What test could you use to analyze this data?
Correlation. You have two cardinal variables here, and you want to know if they are associated, or related to one another.
Moe wants to know how age and drug use affect income. He samples groups of young, middle aged, and older adults, and determines whether each person is a non-user, infrequent user, or frequent user. He also asks each person their annual income. What kind of test should Moe use to analyze his data? A. independent groups t-test B. factorial ANOVA C. correlation D. chi squared test of association
Factorial ANOVA. You have two IV's (age and frequency of drug use) and one cardinal DV (income).
What would be the appropriate H0?
H0: ρ = 0
What is a cardinal scale?
Measurements that directly measure some quantity. Ratio and interval scales are types of cardinal scales. Ex: temperature, heart rate
The measure of error for regression is:
Mean squared distance the actual scores are from the regression line.
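A sketch of this error measure, assuming an ordinary least-squares regression line and dividing by N to match the "mean squared distance" wording (some textbooks divide by N - 2 instead). The data and function names are made up.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for predicting Y from X."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

def mean_squared_error(xs, ys, intercept, slope):
    """Mean squared distance of the actual Y scores from the predicted Y' scores."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [1, 2, 3, 4, 5]   # made-up X scores
ys = [2, 1, 4, 3, 5]   # made-up Y scores
intercept, slope = fit_line(xs, ys)
print(intercept, slope)                               # about 0.6 and 0.8
print(mean_squared_error(xs, ys, intercept, slope))   # about 0.72
```

For comparison, predicting the mean of Y for everyone gives a mean squared error of 2.0 on these scores, so the regression line reduces the error.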
The various approaches to increasing the power of the experiment will increase our chances of rejecting H0
Only if H0 is actually false
What if we have a confounding variable in our experiment?
Then we don't know whether it was the independent variable, the confounding variable, or both that made the groups different.
If H0 is false, then we expect...
The value of χ² to be greater than 2
Alpha is what kind of error?
Type 1 error
Beta is what kind of error?
Type 2 error
If p is greater than .05
- We do not reject H0. - The difference between the sample means is not statistically significant. - We cannot conclude that the population means are different; maybe they are, maybe they aren't. - We cannot conclude that the independent variable had an effect on the dependent variable; maybe it did, maybe it didn't.
If p is less than or equal to .05
- We reject H0. - The difference between the sample means is statistically significant. - We can conclude that the independent variable had an effect on the dependent variable.
Which of the following is the symbol for the predicted value of Y?
Y'
Are you simply looking at frequencies, how many subjects fall into each category?
Yes: this calls for chi square, either goodness of fit (one variable) or test of association (two variables).
If there is __________ correlation the values on X and Y tend to go in opposite directions; people with low scores on X tend to have high scores on Y, and those with high scores on X tend to have low scores on Y.
a negative
Pearson's r is the appropriate measure of correlation to use when both variables of interest are ________ variables.
cardinal
When we reject H0 we are saying that the results were not just due to random error. To be able to then conclude that it was the independent variable that had an effect we need to make sure that there are no serious ____________ variables in the study.
confounding
If r = -.3, then there is __________ correlation between X and Y in our sample.
negative
Let's say that people who take more Vitamin R tend to get sick less (measured as days of being sick per year), and those who take less Vitamin R tend to get sick more. This would be an example of _________ correlation between X and Y.
negative
If you find a statistically significant correlation between X and Y can you conclude that X causes Y?
no
If r = 0, then there is __________ correlation between X and Y in our sample.
no correlation
In the sampling distribution of χ² the p value reflects the area of the curve
to the right of our computed χ²
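The right-tail area can be computed by hand in one special case. This sketch relies on the closed-form tail formula that holds when df is an even number, and it deliberately refuses odd df; the function name is made up. The printed values use the familiar .05 critical values (5.991 for df = 2, 9.488 for df = 4).

```python
import math

def chi_square_p_value(chi_sq, df):
    """Area to the right of chi_sq under the chi-square curve (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive, even df")
    half = chi_sq / 2
    # Closed-form right-tail area for even df.
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

print(chi_square_p_value(5.991, 2))  # close to .05
print(chi_square_p_value(9.488, 4))  # close to .05
```

A larger computed χ² leaves less area to its right, so the p value shrinks as χ² grows.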
alpha
= significance level
What are the things that will increase the power of the experiment?
- Increasing N - Decreasing the variability of the scores within each group - Increasing the effect of the independent variable
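All three factors can be seen in a simplified power calculation. This sketch swaps in a two-tailed one-sample z test with a known standard deviation, which is an assumption made to keep the arithmetic short; it is not the t test or ANOVA used elsewhere in these cards. The function name and all numbers are made up.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Approximate power of a two-tailed one-sample z test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Distance of the true mean from the H0 mean, in standard-error units.
    shift = effect * n ** 0.5 / sigma
    # Probability that the test statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

base = z_test_power(effect=0.5, sigma=1.0, n=25)
print(base, 1 - base)                             # power and beta add up to 1
print(z_test_power(effect=0.5, sigma=1.0, n=50))  # larger N
print(z_test_power(effect=0.5, sigma=0.5, n=25))  # less variability
print(z_test_power(effect=1.0, sigma=1.0, n=25))  # larger effect
print(z_test_power(effect=0.0, sigma=1.0, n=25))  # H0 true: rejection rate is alpha
```

When the effect is zero (H0 is true), the rejection rate stays at alpha no matter how large N is, which matches the card about power only helping when H0 is actually false.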
You perform a t test for dependent groups. p=.03
- We can conclude that the independent variable had an effect. - We can conclude that μD doesn't equal zero. - MD is significantly different from zero. - Reject H0.
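A sketch of where the t value for dependent groups comes from, assuming the usual formula: the mean difference MD divided by its estimated standard error, with df = n - 1. The paired scores and the function name are made up, and the p value itself would still come from a t table or software.

```python
import math

def dependent_t(first, second):
    """t statistic and df for paired scores (each subject measured twice)."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = sum(diffs) / n                                     # MD
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)     # estimated variance of D
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1

before = [5, 6, 7, 8]    # made-up scores, first measurement
after = [6, 8, 8, 10]    # same subjects, second measurement
print(dependent_t(before, after))  # t is about 5.2 with df = 3
```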
You test the statistical significance of a correlation and arrive at p=.045
- We can conclude there is a correlation between those two variables in the population from which we sampled. - Reject H0 - The correlation in the sample is statistically significant.
You perform a t test for dependent groups. p=.45
- do not reject H0 - We cannot determine whether or not the independent variable had an effect. - MD is not significantly different from zero. - We cannot determine whether or not μD differs from zero.
Are you investigating whether an independent variable has an effect on a dependent variable by looking at the effect of the independent variable on group means?
- If there are two groups: either a t test or an F (ANOVA) test. - If there are more than two groups, you have to use ANOVA. - If there is more than one independent variable, it will be a factorial ANOVA.
Which of the following are possible outcomes of null hypothesis testing:
- we can fail to determine whether or not the null hypothesis is false - we can conclude that the null hypothesis is false
If H0 is true, what would be the average value of 'r'?
0
Cramer's V will always be a value between...
0 and 1.00
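A sketch of the usual formula for Cramér's V, which rescales a chi-square test of association value by the sample size and the smaller table dimension. The function name and the example numbers are made up.

```python
import math

def cramers_v(chi_sq, n, n_rows, n_cols):
    """Cramér's V from a chi-square value, total N, and the table dimensions."""
    return math.sqrt(chi_sq / (n * (min(n_rows, n_cols) - 1)))

print(cramers_v(4.0, 100, 2, 2))    # about 0.2, a weak association
print(cramers_v(100.0, 100, 2, 2))  # 1.0, the strongest possible association
print(cramers_v(0.0, 100, 2, 3))    # 0.0, no association at all
```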
What is a true experimental design?
1) an independent variable must be manipulated 2) there must be the element of control 3) randomization of subjects
What is a quasi-experimental design?
1) subjects are not randomly divided into groups 2) the independent variable must be manipulated
What is an ordinal scale?
A scale on which an increase in the number reflects an increase in quantity, but the distances between numbers are not necessarily equal. Ex: Please indicate your level of support for the current administration: 1) Strongly oppose 2) Slightly oppose ....
You want to know whether skiers or snowboarders buy more lift tickets in Utah. You gather this data (# of tickets sold to skiers and # of tickets sold to snowboarders) from the various resorts. What test could you use to analyze the data?
Chi Square Goodness of Fit. You have one variable (type of athlete) which is categorical (skier vs. snowboarder).
You want to know if skiing ability affects a person's likelihood of getting injured while skiing. In other words, you'd like to know whether more injuries are suffered by beginning vs. intermediate vs. expert skiers. You gain access to Park City Resort's ski patrol records, and categorize each injury report as being suffered by a beginner, intermediate, or expert skier. What test could you use on this data?
Chi Square Goodness of Fit. You only have one variable (level of skier), and it is categorical (aka nominal). Injuries is not a variable; everyone you categorize is injured. The only variable here is level of skiing ability.
Mr. Burns wonders if living near the nuclear reactor is unhealthy. He purchases 40 rats, and randomly assigns them to the nuclear or non-nuclear group. He puts the nuclear group in cages near the nuclear plant. He puts the non-nuclear group in cages 100 miles from the plant. With the same diet and living conditions, after 6 months, he categorizes each rat as healthy, sick, or dead. What test should he use to determine whether there is a relationship between living near the nuclear reactor, and health? A. independent groups t-test B. independent groups ANOVA C. chi square goodness of fit D. chi square test of association
Chi Square Test of Association. The outcome variable (healthy, sick, or dead) is categorical rather than cardinal. Also, you have two variables (proximity to the plant, and health), so use the Chi Square Test of Association rather than Goodness of Fit.
What would be the appropriate Ha?
Ha: rho ≠ 0
You wonder if a breakfast of oatmeal is better for children than processed, sugared cereal, in terms of their ability to focus in school. You recruit 40 families, and ask them to feed their children sugared cereal for breakfast for two weeks. During that time, you ask the children's teachers to rate how attentive each child is in class each day (on a 1-7 scale). You average those ratings for an overall score for each child. During the 2nd phase of the experiment, you ask the families to switch to oatmeal for breakfast. Again, you have teachers rate each child's attentiveness, and average those daily scores for an overall score for each child. What test should you use to analyze the data?
Dependent samples t-test (note you could also use a repeated measures ANOVA). Because you are measuring each child on the DV (attentiveness) twice, this is a within subjects, or repeated measures, design. As there is only one IV (type of breakfast) and only two levels of that IV (sugared cereal vs. oatmeal), you can use either a dependent samples t-test or a within subjects ANOVA. Bonus for noticing that the hypothesis is that oatmeal is better, and for using a one-tailed t-test.
Same as the DMV driving test question (older, young adult, and teen-aged drivers), but this time you also want to know whether the gender of the driver (as well as category of age) makes a difference. What test would you use to analyze the data?
Factorial ANOVA. You have two IVs (age of driver and gender of driver), and one cardinal DV (number of mistakes made). Incidentally, this would be a 3x2 ANOVA, because there are three levels of age, and two of gender (assuming you force people to choose between only male and female; if you added a third category, it would be 3x3).
Are you interested in the strength and the direction of the relationship between two cardinal variables?
If yes, correlation
You want to know whether giving students a Red Bull drink (which is full of stimulants) will affect their test scores. You arrive at a class on test day, and randomly assign half the students to drink a Red Bull a few minutes before the test. The other half of the students do not drink a Red Bull (or anything else). Which statistical test should you use to see whether drinking Red Bull improves test scores?
Independent samples t-test (note that you could also use a one-way ANOVA), because there is only one (categorical) IV (Red Bull or not) and the DV (test score) is cardinal.
What kind of decision is power?
Power is a correct decision. It is the probability that, when H0 is false, we will correctly reject it.
If the expected value in every category is at least 5
Probably safe to ignore the assumption of normality. (If any expected value is less than 5, the assumption may be violated.)
The DMV wants to know who is likely to make more mistakes on a driving test: older drivers, young adult drivers, or teen-aged drivers. You go through their driving test records and categorize each driver according to age, and also record how many mistakes each driver made. What test could you use to analyze this data?
One-way ANOVA. You have only 1 IV (age of driver), and one cardinal variable as the DV (number of mistakes made). You cannot use a t-test because there are three levels of the IV (older, young adult, and teen-age)
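A sketch of how the F ratio for independent groups is built, assuming the standard formula: variability between the group means divided by variability within the groups. The three groups of scores and the function name are made up.

```python
def one_way_anova_f(groups):
    """F ratio, df between, and df within for independent groups."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]
    # Between groups: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within groups: how far each score is from its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

mistakes = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]  # made-up scores for three groups
print(one_way_anova_f(mistakes))              # F = 21.0 with df = 2 and 6
```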
The various approaches to increasing the power of the experiment will increase our chances of proving our independent variable had an effect
Only when the independent variable really does indeed have an effect
Are you looking for a formula that will allow you to predict one variable based on the score on the other variable?
Regression
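The "Are you..." cards in this set work like a decision tree, and one hedged way to summarize them is a small lookup function. The function name, the argument names, and the goal labels are all invented for this sketch; the returned test names follow the answers on the cards.

```python
def choose_test(goal, n_variables=1, n_groups=2, n_ivs=1, repeated_measures=False):
    """Suggest a test from the kind of question being asked (hypothetical helper)."""
    if goal == "frequencies":
        # Counting how many subjects fall into each category.
        if n_variables == 1:
            return "chi square goodness of fit"
        return "chi square test of association"
    if goal == "relationship":
        # Strength and direction between two cardinal variables.
        return "correlation"
    if goal == "prediction":
        # A formula to predict one variable from the other.
        return "regression"
    if goal == "group means":
        if n_ivs > 1:
            return "factorial ANOVA"
        if n_groups > 2:
            return "ANOVA"
        if repeated_measures:
            return "t test for dependent groups"
        return "t test for independent groups"
    raise ValueError(f"unknown goal: {goal!r}")

print(choose_test("frequencies", n_variables=2))
print(choose_test("group means", n_groups=3))
print(choose_test("group means", repeated_measures=True))
```

With two groups, ANOVA is also acceptable; the sketch simply returns the t test as the more common choice.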
You compute the correlation between X and Y in your sample and arrive at a correlation of r=0. Which of the following can you conclude?
There is no linear relationship between X and Y in the sample.
Which of the following would be an appropriate statistical tool to test to see if the number of subjects who fall into the various categories on one variable differ from what would be expected if H0 were true.
chi square goodness of fit
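A sketch of the goodness of fit calculation, assuming the usual formula: the sum of (O - E)² / E over the categories, with df = number of categories - 1. When no expected counts are supplied, the sketch assumes H0 says the categories are equally likely. The counts and the function name are made up.

```python
def chi_square_goodness_of_fit(observed, expected=None):
    """Chi-square value and df for one categorical variable."""
    n = sum(observed)
    if expected is None:
        # Assumed H0: subjects are spread evenly across the categories.
        expected = [n / len(observed)] * len(observed)
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_sq, len(observed) - 1

print(chi_square_goodness_of_fit([18, 22, 20]))  # small chi square: close to H0
print(chi_square_goodness_of_fit([50, 30, 20]))  # larger chi square: far from H0
```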
Which of the following would be an appropriate statistical tool to test to see if there is a relationship between two nominal variables.
chi square test for association
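A sketch of the test of association calculation, assuming the usual approach: each cell's expected count is (row total × column total) / N, and df = (rows - 1)(columns - 1). The table of counts and the function name are made up.

```python
def chi_square_association(table):
    """Chi-square value and df for a table of counts (a list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi_sq += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi_sq, df

counts = [[30, 20],   # made-up counts: rows are one nominal variable,
          [20, 30]]   # columns are the other
print(chi_square_association(counts))  # (4.0, 1)
```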
A researcher wants to know whether or not underage drinking is related to college attendance. She polls 200 young people between the ages of 18 and 21. She records whether or not they drink, and whether or not they attend college. Which kind of test should she use?
chi squared test of association
I want to know whether socio-economic status (poor, middle class, or rich) influences one's political persuasion (Democrat vs. Republican). What kind of test should I use?
chi squared test of association
If there is __________ correlation the values on X and Y tend to go up and down together; people with low scores on X tend to have low scores on Y, and those with high scores on X tend to have high scores on Y.
a positive
What is alpha?
Alpha is the probability that, if H0 is true, we will incorrectly decide to reject H0; that is, the probability of making a Type 1 error.
The hypothesis which states that the pattern or effect we see in our sample is indicative of a similar pattern or effect in the population from which we sampled, is known as the
alternative hypothesis
Bart thinks that sleep deprivation causes irritability in school children. He polls the families of 100 5th graders, and finds out how many hours of sleep each child gets a night. He then has their teachers rate each of the children on how irritable they are at school (on a 1-10 scale). What kind of test should he use?
correlation
Which of the following would be an appropriate statistical tool to measure the strength and direction of the relationship between two cardinal variables?
correlation
I want to know whether cross training and stretching out before exercising can reduce the number of injuries suffered by athletes. I recruit 50 high school track team members, and randomly assign them to one of four conditions. One group stretches, but doesn't cross train. One group stretches and cross trains. One group doesn't stretch and doesn't cross train, and the last group doesn't stretch and cross trains. I record the number of injuries suffered by the athletes in each group over the track season so I can compare the mean number of injuries in each group. Which kind of test should I use?
factorial ANOVA
Anthony wants to study the relationship between income and drug dependence. His theory is that drug use is higher among lower and higher incomes, but lower among middle income ranges. He plans to recruit 300 participants, 100 each from low, middle, and high income groups. He then plans to ask them how many times they have used illicit drugs during the past 6 months. What kind of test should he use?
independent groups ANOVA
I want to know whether eating an apple a day really does reduce the number of visits a person makes to the doctor. I recruit 100 people, and randomly assign them to the "apple a day" or "no apples" group. I then record the number of times they visit the doctor that month. Which kind of test should I use?
independent groups t-test
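A sketch of the t statistic for independent groups, assuming the pooled-variance version of the formula with df = n1 + n2 - 2. The two groups of scores and the function name are made up.

```python
import math

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and df for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = sum(group_a) / n_a, sum(group_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in group_a)
    ss_b = sum((x - mean_b) ** 2 for x in group_b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df

apple_group = [1, 2, 3]      # made-up doctor visits
no_apple_group = [3, 4, 5]   # made-up doctor visits
print(independent_t(apple_group, no_apple_group))  # t is about -2.45 with df = 4
```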
What is a confounding variable?
A variable, other than random sampling error and other than our independent variable, that could account for why the sample means are different.
If low values for X occur with low, medium, and high values of Y; and high values for X occur with low, medium, and high values of Y, this would be an example of __________ correlation.
no correlation
Which of the following would be an appropriate statistical tool to use someone's score on variable X to predict their score on variable Y?
regression
Moving on to the 'Integration' lecture... The group we actually have in our hands, the group we measure, is the
sample