Stats 210 Exam #3
df interaction
(df rows) (df columns)
Beta weight can only be between what values?
-1.00 and 1.00
Small effect size for R squared
0.01
Medium effect size for R squared
0.09
Two-way ANOVAs include?
2 nominal IVs and a scale DV
Penny is figuring the regression line for some data but needs help in first figuring the predicted value of Y. She knows that the slope is 3 and the intercept is 4. What is the predicted Y value for an X score of 7?
25
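The answer above follows from the simple linear regression equation, predicted Y = intercept + (slope)(X). A minimal Python sketch of that arithmetic (the function name `predict_y` is only illustrative):

```python
def predict_y(intercept, slope, x):
    """Predicted Y from the regression line: Y-hat = a + b * X."""
    return intercept + slope * x

# Penny's values from the card: slope b = 3, intercept a = 4, X = 7
predicted = predict_y(intercept=4, slope=3, x=7)
print(predicted)  # 25
```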
How many F statistics calculated in a two-way between-groups ANOVA?
3 F statistics (2 main effects and 1 for the interaction)
The Educational Testing Services (ETS) is conducting a study to determine the relation between method of presentation (standard lecture vs. computer presentation) and type of lecture (psychology, English, or statistics) to lecture comprehension. How many cells are in this design?
6
A researcher sets up a study to examine how lecture comprehension of college students is affected by lecture type (history, psychology, statistics, or English), classroom size (large or small), presentation method (blackboard, overhead projector, or computer), and instructor (graduate student, assistant professor, or full professor). How many cells are in this design?
72
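Both cell counts above come from multiplying the number of levels of each IV. A short sketch of that rule (the helper name `count_cells` is made up for illustration):

```python
from math import prod

def count_cells(levels_per_iv):
    """Number of cells = product of the number of levels of each IV."""
    return prod(levels_per_iv)

# ETS study: 2 presentation methods x 3 lecture types
ets_cells = count_cells([2, 3])
# Lecture study: 4 lecture types x 2 class sizes x 3 methods x 3 instructors
lecture_cells = count_cells([4, 2, 3, 3])
print(ets_cells, lecture_cells)  # 6 72
```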
What else can a correlation be?
A descriptive statistic or an inferential statistic
Pearson r only reflects what type of relationship?
A linear relationship
Correlation often depicts what type of relationship?
A linear relationship between variables (Increase in one variable is associated with an increase in another variable)
What is the Standard Error of the Estimate?
A measure of how well the data fit the regression line (the typical distance of the data points from the line of best fit).
A researcher wishes to investigate relations between depression and cerebral asymmetry. She collects EEG data from 50 subjects and based on their HRSD scores classifies them as Severely depressed/Moderately depressed/Non-depressed. What statistical test would be appropriate?
A one-way between groups ANOVA *2 variables used: level of depression (nominal/ordinal) and EEG scores (scale variable)
What statistical test would we use if we examine students' cognitive performance before and after taking methamphetamine?
A paired-samples t test
What is the formula for Beta?
β = (b)(√SS_X / √SS_Y), i.e., the slope b times the square root of SS subscript X over the square root of SS subscript Y
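As a rough check of the Beta formula, the sketch below computes the raw slope b, converts it to Beta, and compares it with Pearson r. The data and all function names are made up for illustration; in simple linear regression the two values should match.

```python
from math import sqrt

def sum_of_squares(scores):
    """Sum of squared deviations from the mean (SS)."""
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores)

def cross_products(x, y):
    """Sum of the products of paired deviations from each mean."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

def raw_slope(x, y):
    """Unstandardized slope b."""
    return cross_products(x, y) / sum_of_squares(x)

def beta_weight(b, ss_x, ss_y):
    """Standardized slope: Beta = b * sqrt(SS_X) / sqrt(SS_Y)."""
    return b * sqrt(ss_x) / sqrt(ss_y)

def pearson_r(x, y):
    return cross_products(x, y) / sqrt(sum_of_squares(x) * sum_of_squares(y))

# Hypothetical paired scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b = raw_slope(x, y)
beta = beta_weight(b, sum_of_squares(x), sum_of_squares(y))
r = pearson_r(x, y)
print(round(b, 3), round(beta, 3), round(r, 3))  # 0.6 0.775 0.775
```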
Only with a one-way within-groups ANOVA do you have to....?
Be concerned about order effects because every participant is exposed to every condition
Which character is the standardized regression coefficient symbolized by?
Beta
What does β = r mean?
Beta (the standardized regression coefficient in simple linear regression) is equal to the Pearson correlation coefficient
What is a cell?
Box depicting a unique combination of levels of IVs in a factorial design
When there are more than two levels of the variables in a chi-square analysis, what is true?
Calculating adjusted standardized residuals helps to assess individual cells for significant differences.
What do textbook authors recommend using a computer program for?
Calculating the Standard Error of Estimate
Manifest Variables
Can be observed + measured
Correlation does not mean....?
Causation
Large standard error of the estimate graph
Data points are NOT close to the line of best fit
Small standard error of the estimate line
Data points are close to the line of best fit
What are out-of-line outliers?
Data points that aren't in line with the other data points and REDUCE the size of the correlation
Benefits of within-groups ANOVA?
Decreases error due to differences between groups: the groups are identical on all of the relevant variables because each group includes exactly the same participants. This reduces within-groups VARIABILITY due to individual differences across groups (the denominator of the F statistic decreases, so the F statistic is larger)
Restricting the range of one of your variables (Skew/Ceiling/Floor Effects) does what to your correlation?
Decreases the correlation
Between-groups variance?
Estimate of population variance based on difference among GROUP means
True or false: 1 nominal variable with 3 levels, a chi-square test for independence would be used.
False--> Only 1 variable so a chi-square Goodness of Fit would be used **Chi-square Independence has 2+ variables
The variances of the groups are equal
Homoscedasticity
Reliability
How consistent your measure is
Predictor variables should be....?
Independent of each other (reflecting different abilities or skills)
What is the within-groups variance?
Individual differences *Standard Error
In the simple linear regression formula, (blank) is the predicted value for Y when X is equal to zero.
Intercept
How strong is the association?
Large because the data points fall almost exactly in a straight line.
Establishing validity is usually what?
More difficult than establishing reliability, so it is not always done.
df between?
N (groups)-1
What does a Pearson r=0.00 indicate?
No correlation
In looking at a graph of data, there seems to be a curved pattern, possibly because of the influence of a third variable. Should simple linear regression be used?
No; the data are nonlinear
Chi-Square Tests
Nonparametric test
Two-way ANOVA effect size?
R squared
When performing a two-way between-groups ANOVA, effect size is measured using _____, and it is calculated for _____ effects.
R²; three
Partial Correlation?
Relationship between 2 variables while controlling for a third variable.
_____ use(s) a ratio of two conditional proportions to measure the chance of an occurrence, giving an estimate of the size of the effect in a chi-square test.
Relative risk
R squared and eta squared (η²) are equal to?
SSbetween/SStotal
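A small sketch of this ratio, computed from raw scores. The three groups are hypothetical data and the function name is illustrative only.

```python
def r_squared(groups):
    """Effect size for a one-way between-groups ANOVA: SS_between / SS_total."""
    all_scores = [s for g in groups for s in g]
    grand_mean = sum(all_scores) / len(all_scores)
    # SS_between: each group mean's squared distance from the grand mean,
    # counted once per score in that group
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # SS_total: each score's squared distance from the grand mean
    ss_total = sum((s - grand_mean) ** 2 for s in all_scores)
    return ss_between / ss_total

# Hypothetical scores for three groups
effect = r_squared([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(effect, 4))  # 0.8125
```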
The formula for correlation is similar to the formula for z score in that they both:
Take into account sample size and variability
What is the Grand Mean (GM)?
Taking all of the scores and averaging them
Numerator in the F statistic is?
The difference among the sample means
A researcher investigates the impact of color and lighting on mood at work. He manipulates the paint used in offices, including the presentation of four different colors. He also manipulates lighting by using three different lighting conditions. The null hypothesis for the interaction in this study would be?
The effect of office color is not dependent on the type of lighting condition.
Which graph has the smaller standard error of the estimate
The graph with data points closer to the line of best fit.
In ANOVA, what does MS stand for/represent?
The variance
People with more education earn more money throughout their lifetimes than those with less education. How are they correlated?
They are positively correlated.
Analysis of variance is always a what?
Two-tailed test
ANOVA
Used with 1+ nominal independent variables (with 2+ levels) and an interval or ratio dependent variable
When a test accurately measures what it is intended to measure, we say that the test is?
Valid
What is SSerror?
What's left over from within-groups variability **SSerror contributes to the denominator of the F-statistic
When does an interaction occur?
When 2 IVs have an effect in combination that we do not see when looking at each IV individually
What is a main effect?
When one IV influences the DV
Orthogonal
When predictor variables are independent of each other and reflect different abilities or skills
Types of one-way ANOVAs
Within-Groups: 2+ samples with same participants Between-Groups: 2+ samples with different participants in each sample/each level of IV
Which source of variability does not require the grand mean in its calculation?
Within-groups sum of squares
Which of these is NOT an assumption of a two-way between-groups ANOVA?
a) The variability in underlying populations is similar. * b) The participants are matched on important characteristics. c) The underlying population distributions are normal. d) Participants are randomly selected.
A marginal mean in a two-way ANOVA is the mean of the dependent variable:
at one level of an independent variable, disregarding the distinction of levels of the other independent variable.
Df formula for between-groups ANOVA is...?
df between= N(groups)-1
Df formula for interaction effect of a between-subjects factorial design?
df interaction= (df rows)(df columns)
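The degrees-of-freedom formulas for a two-way between-groups ANOVA can be gathered in one helper. This is a sketch: the function name is made up, the sample size of 60 is hypothetical, and the df within line (total N minus number of cells) is the standard result rather than something stated on these cards.

```python
def two_way_df(n_rows, n_columns, n_total):
    """Degrees of freedom for a two-way between-groups ANOVA."""
    df_rows = n_rows - 1
    df_columns = n_columns - 1
    df_interaction = df_rows * df_columns
    df_total = n_total - 1
    # Whatever is left over belongs to within-groups (N_total - number of cells)
    df_within = df_total - df_rows - df_columns - df_interaction
    return {
        "rows": df_rows,
        "columns": df_columns,
        "interaction": df_interaction,
        "within": df_within,
        "total": df_total,
    }

# Example: 4 office colors x 3 lighting conditions, 60 participants (hypothetical N)
dfs = two_way_df(n_rows=4, n_columns=3, n_total=60)
print(dfs)
```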
In a _____, the effect of one independent variable reverses based on the _____.
qualitative interaction; level of the other independent variable
In a _____, the strength of the effect varies under certain conditions, but not the _____.
quantitative interaction; direction
Outliers
r can be biased by outliers (can exaggerate other variables when in line with other data points--> In-line outliers)
Effect size for ANOVA we calculate?
r squared (proportion of variance accounted for)
Calculating Effect Size
R squared, or eta squared (η²), is a common measure of effect size for ANOVAs
SStotal=?
s squared (total/grand variance) * dftotal
The resulting F statistic is equal to?
t squared (the t statistic squared, when only two groups are compared)
How do we calculate the F statistic?
variance between-groups (different group conditions)/ variance within-groups
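A minimal sketch of the F ratio for a one-way between-groups ANOVA, computed by hand so each piece is visible. The data are hypothetical and the function name is illustrative.

```python
def one_way_f(groups):
    """F = MS_between / MS_within for a one-way between-groups ANOVA."""
    all_scores = [s for g in groups for s in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups: how far the group means are from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups: how far each score is from its own group mean
    ss_within = sum((s - m) ** 2
                    for g, m in zip(groups, group_means) for s in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within

# Hypothetical scores for three groups
f_stat = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 2))  # 13.0
```

The group means here are far apart relative to the spread inside each group, which is why the ratio comes out well above 1.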
What is the Pearson r analogous to?
z scores (because both standardized measures)
Z Score Formula
z_X = (X − M_X) / SD_X *SD_X = the standard deviation of the sample, NOT the population
Formula for Tukey HSD?
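A quick sketch of the z score formula. The score, mean, and standard deviation below are hypothetical values chosen for easy arithmetic.

```python
def z_score(x, mean, sd):
    """z = (X - M) / SD: how many standard deviations X is from the mean."""
    return (x - mean) / sd

# Hypothetical: a score of 84 in a distribution with M = 85 and SD = 5
z = z_score(84, 85, 5)
print(z)  # -0.2
```

A score below the mean gives a negative z, and a score above the mean gives a positive z.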
(M1-M2)/SM (Standard error of mean)
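A sketch of the Tukey HSD calculation for one pair of means. It assumes equal group sizes, so the standard error is taken as the square root of MS within divided by N per group; the numbers and the function name are hypothetical.

```python
from math import sqrt

def tukey_hsd(mean_1, mean_2, ms_within, n_per_group):
    """HSD = (M1 - M2) / s_M, where s_M = sqrt(MS_within / N).

    Assumes every group has the same number of participants.
    """
    s_m = sqrt(ms_within / n_per_group)
    return (mean_1 - mean_2) / s_m

# Hypothetical values: M1 = 6, M2 = 2, MS_within = 1, 4 participants per group
hsd = tukey_hsd(6, 2, ms_within=1, n_per_group=4)
print(hsd)  # 8.0
```

The resulting value would then be compared with the critical value from the q table.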
Error (Proportionate Reduction in Error) formula?
(Y-Y(with hat))= the difference between the actual score (Y) and the predicted score (Y with hat) ** This difference is the error of prediction, also called the deviation score
Chi-Square Test for Independence
**Analyzes 2 nominal variables **Same 6 steps for hypothesis testing **Same formula for statistic as Goodness of Fit test
Chi-Square Test for Goodness-of-Fit
**Nonparametric test when we have 1 nominal variable (data are observed frequencies f subscript 0) **Null hypothesis specifies proportions in each category (proportions of scores that should fall in a particular category) **Use observed frequencies to compute expected frequencies
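A sketch of the goodness-of-fit calculation: expected frequencies come from the null-hypothesis proportions, and the statistic sums (observed − expected)² / expected over the categories. The observed counts are hypothetical and the function name is illustrative.

```python
def chi_square_gof(observed, null_proportions):
    """Chi-square goodness-of-fit statistic and its degrees of freedom."""
    n = sum(observed)
    # Expected frequency for each category under the null hypothesis
    expected = [p * n for p in null_proportions]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return chi2, df

# Hypothetical: 18 students in 2 categories, null hypothesis of an even split
chi2, df = chi_square_gof(observed=[12, 6], null_proportions=[0.5, 0.5])
print(chi2, df)  # 2.0 1
```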
What number reflects the strongest relationship between variables (correlations)? 0.51, -0.64, or 1.30
-0.64 (negative sign doesn't matter with correlations)
What numbers does the correlation coefficient fall between?
-1.00 and 1.00
What is the only value that can possibly be a Pearson r coefficient?
-1.00 to 1.00
When to use a nonparametric test....
-DV is nominal -DV or IV is ordinal -Sample size is small (violating assumption about shape of sampling distribution) -When underlying population is not normal
Limitations of nonparametric tests....
-cannot easily use confidence intervals or effect sizes -have less statistical power than parametric tests (more likely to make a Type Two error) -nominal and ordinal data provide less information (want to use scale data)
Large effect size for R squared
0.25
Steps of regression with z scores
1) Calculate z score 2) Multiply z score by correlation coefficient 3) Convert z score to a raw score
How do we determine the Regression Equation (steps)?
1) Find z score for X 2) Use z score to calculate the predicted Y value 3) Convert z score to its raw score
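The z score steps above can be sketched as a single function. All means, standard deviations, and the correlation below are hypothetical, and the function name is made up for illustration.

```python
def predict_via_z(x, mean_x, sd_x, mean_y, sd_y, r):
    """Predict a raw Y score from a raw X score using z scores."""
    z_x = (x - mean_x) / sd_x        # 1) convert X to a z score
    z_y_hat = r * z_x                # 2) predicted z score for Y = r * z_X
    return mean_y + z_y_hat * sd_y   # 3) convert the predicted z back to a raw score

# Hypothetical: X = 70 (M_X = 60, SD_X = 10), M_Y = 100, SD_Y = 20, r = 0.5
y_hat = predict_via_z(70, mean_x=60, sd_x=10, mean_y=100, sd_y=20, r=0.5)
print(y_hat)  # 110.0
```

Because r is less than 1, the predicted Y is closer to its mean (0.5 standard deviations above) than X is to its mean (1 standard deviation above), which is regression to the mean.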
What variances are within the Within-groups variances?
1) Subjects (find sum of squares for subjects and remove it from within-groups variance to find) 2) Error
What are the two things we need to know for a regression?
1) What is the value of Y when X is equal to zero (y-intercept) 2) What is the slope
Assumptions of ANOVAS?
1)Random selection of samples 2) Scale variable 3) Normally distributed 4) Homogeneity of variances
One-way repeated measures ANOVA, calculate (blank) F statistics and are actually only interested in (blank) F statistics.
2 F statistics (between and Subjects) 1 F statistic we care about
Repeated-measures analysis of variance produces SStotal=40 and SSwithin=10. What is the SSbetween?
30 (if SSwithin was SSerror we wouldn't be able to answer this question)
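The partition used in this card can be sketched as follows, following the cards' convention that SS within is itself split into SS subjects and SS error. The SS subjects value is hypothetical, and the function names are illustrative.

```python
def ss_between(ss_total, ss_within):
    """SS_total = SS_between + SS_within, so SS_between = SS_total - SS_within."""
    return ss_total - ss_within

def ss_error(ss_within, ss_subjects):
    """In a repeated-measures design, SS_within = SS_subjects + SS_error."""
    return ss_within - ss_subjects

between = ss_between(ss_total=40, ss_within=10)
error = ss_error(ss_within=10, ss_subjects=4)  # SS_subjects = 4 is hypothetical
print(between, error)  # 30 6
```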
A researcher sets up a study to examine how lecture comprehension of college students is affected by lecture type (history, psychology, statistics, or English), classroom size (large or small), presentation method (blackboard, overhead projector, or computer), and instructor (graduate student, assistant professor, or full professor). How is the design described?
4 × 2 × 3 × 3 between-groups ANOVA
Test with multiple groups of scores but specific comparison have been chosen prior to data collection?
A planned comparison (a priori test)
What type of interaction has crossed lines?
A qualitative interaction because DIRECTION of DIFFERENCES CHANGE.
How is a correlation different from a regression analysis?
A regression enables us to make predictions, while a correlation describes a relationship.
What is the only kind of regression we conduct?
A simple linear regression
Correlation Coefficient
A statistic that QUANTIFIES a relation between two variables *can be positive (changes in same direction) or negative (changes in opposite direction)
The Pearson Correlation Coefficient
A statistic that quantifies a linear relation between 2 scale variables * "r"
Multiple Regression
A statistical technique that includes 2+ predictor variables in a prediction equation (similar to y=mx+b)
What does a positive correlation look like (1.00)?
A straight line starting at the bottom left corner and going up to the right
Validity
A valid measure is one that measures what it was DESIGNED or INTENDED to measure
In a chi-square test where there are more than two levels of one of the variables, the _____ allow(s) you to make more specific conclusions rather than a general interpretation based on rejecting the null hypothesis
Adjusted Standardized Residuals
What does partial correlation allow us to do?
Allows us to QUANTIFY the relation between 2 variables, controlling for the correlation of each of these variables with a third related variable.
What does r=0.76 changed to r=0.90 indicate?
An in-line outlier (EXAGGERATES other data points/increases correlation)
Non-parallel lines across levels indicates what?
An interaction (crossed lines)
An IV that makes a separate+distinct contribution in the prediction of a DV, as compared with another variable, is called?
An orthogonal variable
What does r=0.76 changed to r=0.27 indicate?
An out-of-line outlier (REDUCES size of other data points/correlation)
Regression to the mean
Any time there is a less-than-perfect correlation
SEM encourages researchers to think of variables in what way....?
As a series of connections
The standardized regression coefficient, which is equal to the Pearson correlation coefficient in a simple linear regression, is also called:
Beta weight
The larger the Beta the....?
Better the prediction
What is the correlation coefficient referred to as?
Bivariate statistic= a relationship between two variables (requires paired scores)
Body weight can be predicted based on the amount of calories consumed by an individual due to the positive correlation between the two variables. When looking at the line of best fit for the linear regression, the data points are clustered close together. Predicted shyness based on the number of friendships a person has is also correlated, but the data points are more scattered around the line of best fit, showing a general negative correlation. Which has the higher predictive power and why?
Calories consumed and body weight, because the variance is lower
When r is smaller than what it SHOULD be what effect is in play?
Ceiling AND Floor Effects
What does a negative Correlation Coefficient indicate?
Changes in opposite directions
What does a positive Correlation Coefficient indicate?
Changes in the same direction
It has been suggested that the rate of forgetting (normal vs. fast) for word lists often used in psychological experiments may depend on the type of memory test employed, either free-recall or recognition. What is the best way to test this hypothesis?
Chi-square test
A four-year college wants students to live on campus during their first and second year. In a class of 18 students, the professor does a quick check to see the distribution of students across those two living options. What statistical test could be used to analyze these data?
Chi-square test for goodness of fit
A political campaign manager wants to investigate whether the top 10 percent of college graduates are more likely to be either a Democrat or Republican. What type of test would be MOST appropriate for his design?
Chi-square test for goodness-of-fit
Some people are "cat people" and some are "dog people," but which are there more of in Sam's research methods class? What statistical test could help Sam evaluate this question and the data she would likely collect?
Chi-square test for goodness-of-fit
A director of a childcare center is interested in whether there is a tendency for fathers and mothers to be differentially involved in the care of their child, depending on the gender of that child. She collects data on which parent drops off a child and the gender of that child. Her center has very few children, so her data are limited. What statistical test could she use to analyze these data?
Chi-square test for independence
Does attractiveness to people of different races or ethnic backgrounds depend on your own race or ethnicity? Mackenzie decides to collect data on this topic for a class project. She focuses on three different racial/ethnic groups for both the person and the subject of their attraction. What type of analysis could she use?
Chi-square test for independence
A magazine surveyed 300 men and 250 women and found that 55 percent of the men said that baseball was their favorite sport, and 30 percent of women also reported baseball as their favorite. All participants were asked to select from four sports. How could this be tested if this difference is significant, given that the samples are uneven?
Chi-square test for independence, converting the data into frequencies for a contingency table
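A sketch of that conversion and the test statistic. Because only the baseball percentages are given, the four sports are collapsed here into "baseball" vs. "other", which is a simplification; the function names are illustrative.

```python
def to_counts(n, proportion):
    """Convert a percentage into [yes, no] frequencies for a group of size n."""
    yes = round(n * proportion)
    return [yes, n - yes]

def chi_square_independence(table):
    """Chi-square statistic and df for a contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Rows: men, women. Columns: baseball, other sport.
table = [to_counts(300, 0.55), to_counts(250, 0.30)]
chi2, df = chi_square_independence(table)
print(table)               # [[165, 135], [75, 175]]
print(round(chi2, 2), df)  # 34.65 1
```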
What is a correlation?
Co-variation or co-relation between two variables *These variables change together *Usually scale (interval or ratio) variables
Proportionate Reduction in Error is also called....?
Coefficient of Determination
The Bonferroni Test is more....?
Conservative than the Tukey test (which is why it isn't used very often) *post-hoc test that provides a more strict critical value for every comparison of means *Use smaller critical region to make it more difficult to reject the null hypothesis (determine # of comparisons we plan to make)
Before starting the calculations of a correlation, one should:
Construct a scatterplot
Correlation is used to calculate validity often by....?
Correlating a new measure with existing measures known to assess the variable of interest.
In simple linear regression if you know the (blank) you know what the proportionate reduction in error is.
Correlation
Regression builds on what?
Correlation *The difference is a question of prediction vs. relation
There is a correlation of 0.91 between ice cream consumption and drowning deaths.
Correlation is NOT causation
Psychometricians use....?
Correlation to examine 2 important aspects of the development of measures--> reliability and validity
A researcher wishes to investigate relations between depression and cerebral asymmetry. She collects EEG data from 50 subjects and assesses their level of depression on the Hamilton Rating Scale for Depression (HRSD). What statistical test would be appropriate?
Correlation.
A variable we suspect associates with the IV of interest and impacts the results of our study is called a:
Covariate- a variable linked to IV of interest(want to try to control this).... like a confounding variable but removing it to remove its effects
What is the effect size for Chi-Square test?
Cramér's V (also called Cramér's phi)
An effective way to understand the size of the relation between two variables assessed using the chi-square statistic is to:
Create a visual depiction or graph
What is the comparison distribution of correlation coefficients determined by?
Degrees of freedom--> df= N-2 (N= number of paired scores)
Within-groups ANOVA is similar to a....?
Dependent-samples t test because use the same participants and conduct a test multiple times (more than 2 times)
How can a correlation be a descriptive statistic?
Descriptive features: describes a relationship, indicates the direction+strength of that relationship
Qualitative Interaction?
Direction of differences changes as you go from one level to another
Quantitative Interaction?
Direction of differences does not change *Difference between variables changes but direction does not
How do you conduct a Bonferroni Test?
Divide p level by the number of comparisons
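A one-line sketch of the Bonferroni adjustment (the function name is illustrative, and the choice of three comparisons is hypothetical):

```python
def bonferroni_p_level(overall_p_level, n_comparisons):
    """Per-comparison p level = overall p level / number of comparisons."""
    return overall_p_level / n_comparisons

# Overall p level of 0.05 spread across 3 planned comparisons
adjusted = bonferroni_p_level(0.05, 3)
print(round(adjusted, 4))  # 0.0167
```

Each individual comparison must then reach this stricter p level to be called significant.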
F statistic?
Each score in the sample is a combination of treatment effects and individual variability or error
P (Type 1 error) increases with?
Each statistical test
The statistic that describes the variability of a set of data points to the line of best fit in a linear regression is the standard:
Error of the estimate
Within-groups variance?
Estimate of population variance based on differences WITHIN sample distributions
Why use two-way between groups ANOVA?
Evaluate effects of two IVs, more efficient to do single study rather than two studies with one IV each *Can explore interactions between variables
Combined Groups
Exaggerate the correlation/create a correlation that doesn't really exist
Internal Reliability
Examining correlations with each individual item and the overall score *Average of all possible split-halves
In a two-way ANOVA, three _____ statistics are calculated.
F
True or False: When an F statistic has higher within-groups variance than between-groups variance, we can infer a significant difference between means.
False
True or False: As the correlation between two variables increases, the proportionate reduction in error decreases.
False
True or False: AxB interaction is significant, then at least one of the two main effects must also be significant.
False- 2 lines that cross perfectly but no main effect still indicates an interaction
True or False: Repeated-measures design reported as "F(2,12)=7.85, p<0.05". This study used a sample of n=13 participants.
False- 7 subjects in study (dfsubjects= N-1 --> 7-1=6) dferror=(dfbetween)(dfsubjects)--> 12= (2)(6)--> 6 is dfsubjects
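The reasoning in this answer can be sketched by working backward from the reported degrees of freedom. The function name is made up for illustration.

```python
def participants_from_f_df(df_between, df_error):
    """Recover N from F(df_between, df_error) in a one-way repeated-measures ANOVA.

    Uses df_error = (df_between)(df_subjects) and df_subjects = N - 1.
    """
    if df_error % df_between != 0:
        raise ValueError("df_error must be a multiple of df_between")
    df_subjects = df_error // df_between
    return df_subjects + 1

n_participants = participants_from_f_df(df_between=2, df_error=12)
print(n_participants)  # 7
```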
True or False: When performing a hypothesis test for correlation, we test against the null hypothesis that ρ = 1.0
False- the null hypothesis is that ρ = 0 (no correlation in the population)
A p-value is the probability that the null hypothesis is true
False- the p-value is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true
True or false: Partial correlation describes a situation where complete correlation calculation has NOT been performed.
False--> Partial correlation quantifies the relation between 2 variables while controlling for a third variable.
True or false: Test-retest reliability is intended to measure a person's, not a test's, consistency over time.
False--> Test-retest reliability is a way we assess the reliability of the measure or test itself
As the number of t tests increases, the risk of a Type 1 Error decreases
False--> it increases
True or False: The standardized regression coefficient (Beta) for each variable in a multiple regression equation is equivalent to that variable's correlation (r) with the DV.
False--> only with simple linear regression will Beta=r but not in multiple regression
True or False: 2 variables are negatively correlated, they cannot be causally related.
False-possibly causal relationship (we just don't know).
True or False: In contrast to correlation, regression allows the researcher to make a determination of cause+effect relations amongst variables.
False: regression has the same limitations as correlations
Why not use multiple t-tests?
Fishing for a finding. Problem of Type 1 Error: p (the probability of making a Type 1 error) increases with each test we compute
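A rough illustration of how the chance of at least one Type 1 error grows with the number of tests. This sketch assumes the tests are independent, which multiple t tests on the same groups are not exactly, so treat the numbers as approximate; the function name is illustrative.

```python
def familywise_error(p_level, n_tests):
    """Probability of at least one Type 1 error across independent tests."""
    return 1 - (1 - p_level) ** n_tests

for k in (1, 3, 6):
    print(k, round(familywise_error(0.05, k), 4))
# 1 test  -> 0.05
# 3 tests -> 0.1426
# 6 tests -> 0.2649
```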
The chi-square test for _____ is appropriate when there is one nominal variable, while the chi-square test for _____ is appropriate when there are two nominal variables.
Goodness-of-fit; independence
What is the null and research hypothesis prediction for a Correlation?
H0: ρ = 0 and H1: ρ ≠ 0
Matched group designs (type of within-groups design)....
Have increased statistical power over a between-groups design. Drawbacks: the researcher may not be aware of all the important variables that participants should be matched on, and if one participant drops out of the study, the data for that person's match must also be discarded. They do NOT typically involve testing fewer participants than a within-groups design
The correlation of height and weight is 0.75, and the correlation of depression and satisfaction with work is -0.42. Which is a stronger correlation and why?
Height and weight; 0.75 is closer to 1 than -0.42 is to -1
Example of combined groups?
Height x weight for male and female participants: when both genders are analyzed together, the relationship becomes EXAGGERATED (combined groups)
Correlation is used by psychometricians to....?
Help professional sports teams assess the reliability of athletic performance (how fast a pitcher can throw a baseball).
A clinical psychologist notices that admission to the psychiatric hospital he works at seems to vary by season. Using a chi-square test, he rejects the null hypothesis that hospital admission is independent of season. What conclusions can he make?
Hospital admissions vary by season
What is the SSsubjects?
How much of the within-groups variability is due to individual differences **Like in a paired-samples t test, removing individual differences from the within-groups variability gives a more powerful test
One-Way ANOVA
Hypothesis test including 1 nominal IV with more than 2 levels and a scale DV
What is a factor in a study?
IV in a study with more than one level
More variability between-groups
Indicated by larger difference between means (means that are further apart)
Standard Error of Estimate
Indicates the typical distance between the regression line and the actual data points
A main effect for a two-way ANOVA is the?
Influence of one of the independent variables on the dependent variable.
Two-way between-groups ANOVA, which F statistic, if it is significant, should you first interpret?
Interaction effects
Overall, children did not remember more words than adults on a memory test, F (1, 153) = 3.00, p > 0.10, and participants listening to music during the test did not remember more words than those not listening to music, F (1, 153) = 1.98, p > 0.10. However, listening to music significantly aided the children in remembering the words, while it hindered the adults in remembering the words, F (1, 153) = 12.73, p < 0.05. Which effect(s) must be present in this study for these results to be valid?
Interaction of age and music
What are the 3 possible explanations for a correlation?
Invisible third variables 1) A-->B 2)B-->A 3)C-->A and B **Don't know which one it is
Psychometrics
Is used in the development of tests and measures
What is another limitation of regression?
Issues of generalizability (because no random sampling)
Dr. Garoule is trying to determine which of his patients has the highest likelihood of depression. He calculates a linear regression equation with the scores on an anxiety measure, which are positively correlated with scores on a scale measuring depression. Dr. Garoule converts patient D's anxiety score to a z score and predicts the z score for the depression scale to be 0.65. Is patient D's raw score for depression above or below the mean and why?
It is above the mean because the z score is positive.
Dr. Garoule is trying to determine which of his patients has the highest likelihood of depression. He calculates a linear regression equation with the scores on an anxiety measure, which are positively correlated with scores on a scale measuring depression. Dr. Garoule converts patient D's anxiety score to a z score and predicts the z score for the depression scale to be -0.35. Is patient D's raw score for depression above or below the mean and why?
It is below the mean because the z score is negative.
Bart scored 84 on his statistics midterm, and the class average was 85. What would his deviation and z score be in a distribution of the midterm grades?
It would be negative
Bart scored 86 on his statistics midterm, and the class average was 85. What would his deviation and z score be in a distribution of the midterm grades?
It would be positive
In the social sciences, there are numerous variables that can be discussed and considered as important phenomena, but they cannot be observed directly. These are called _____ variables. However, _____ variables, which can be observed and measured, are used to assess the intangible variables.
Latent; manifest
When parametric assumptions are met, nonparametric tests have (blank) statistical power than parametric test
Less **Nonparametric tests= less power + smaller sample size used
A cell depicts one unique combination of _____ of the independent variable in a two-way ANOVA.
Levels
Cell depicts one unique combination of (blank) of the independent variables in a two-way ANOVA.
Levels *Factors= the IV themselves
The regression line is also known as the:
Line of best fit
How do you use the Q table?
Look at df within and number of groups (k)
What do you do with a main effect?
Looking at one IV and ignoring the other
Mike wants to test how different levels of verbal ability and math ability influence a measure of reading comprehension and a measure of graph interpretation. He is concerned that IQ may confound the results since it is likely to have an effect, too. What test should he use to account for this third variable?
MANCOVA
Factorial design with 2 IV (gender+age of participants) and DV is memory performance. Found females remembered more than males. Say there is a MAIN EFFECT of gender of participants.
Main effect is the answer
In determining main effects and interactions in a factorial ANOVA, it is helpful to use the _____ in a table of means.
Marginal means
(Blank) refers to means of the rows and the columns in a table that shows cells of a study with a two-way ANOVA design
Marginal means (rows+columns measured to analyze data)
In an ANOVA we refer to s squared as the?
Mean Sum of Squares (MS) or Average Sum of Squares
What is the between-groups variance?
More variability here
Predicting an individual's IQ score from two variables, for example, socioeconomic status and education level, would involve the use of:
Multiple regression
df columns?
N columns-1
df rows?
N rows-1
df total?
N total (total # of scores)-1
What can happen by conducting 2 separate ANOVAs?
NOT GOOD! It inflates alpha
What do you need to calculate before using the Tukey HSD test?
Need to calculate the standard error
Structural equation modeling graphs depict a _____ among several variables, demonstrating how all of the variables combine to create a _____.
Network of relations; statistical model
To avoid any controversy over a chi-square analysis, which of these is the recommendation for the minimum expected frequency per cell?
No fewer than 5 observations in every cell or at least 5 times as many participants as cells in the design
Parallel lines indicates what?
No interaction
Tysha notices that every time she wears jewelry to work she has a very productive day and finishes all her projects on time. When she doesn't wear jewelry she struggles to get everything done. She concludes that wearing jewelry is the cause for her successful days at work and buys more jewelry. Is she correct?
No; she cannot make causal claims from correlations.
Shay discovers his data are not normally distributed and appear to be skewed. He knows his sample size is small, but he cannot collect more data due to limited funds and time constraints. Which test should he use?
Nonparametric test
_____ tests are useful for comparisons of data where the dependent variable is _____.
Nonparametric; nominal
The probability of Type 1 Error increases as the....?
Number of statistical comparisons increases (don't want this to happen because we want to keep the p level at 0.05)
N' could be....?
Number of subjects in particular group if group number is the same
When can regression be used?
ONLY when there is a linear relationship between variables
Factorial ANOVA uses?
One scale DV and at least 2 nominal IV (factors)
Simple regression allows us to use (blank) IV (s) while a multiple regression analysis allows us to use (blank) IV(s).
One; two or more
What are in-line outliers?
Outliers that are in-line with other data points and EXAGGERATE other data points
Regression line allows us to what?
PREDICT new data associated with the same correlation (Ex: predicting a student's GPA based on a given poverty level)
What is another name for r squared rows?
Partial Eta Squared
Karl Pearson
Pearson Correlation Coefficient
Which correlation uses "ρ" (rho) for its correlation coefficient?
Population correlation
As the age of a male professor increases, his income increases. This is an example of what type of correlation?
Positive
Every day for the last month, Boris has recorded the number of times it has rained and the height of all of his tomato plants. He wants to see if the two are correlated, so he creates a scatterplot of the data. He sees a pattern that he could probably draw a line through (from the bottom left going up to the top right) as rainy days and plant height both increase. What type of correlation is this?
Positive
A priori (planned) comparisons
Post-hoc tests are different from this
What does Z (subscript Y with hat) represent?
Predicted Y-value
Intercept
Predicted value of Y when X is equal to zero *called "a" in the formula
Regression (blank), Correlation (blank) relations
Predicts, describes
What does the proportionate reduction in error do?
Quantifies how much more accurate our predictions are when we use the regression line INSTEAD of the mean as a PREDICTION TOOL
Logic behind F statistic?
Quantifies overlap *2 ways to estimate population variance (between and within groups variability)
A Correlation Coefficient does what?
Quantifies the relationship between two variables.
A researcher is examining the relation between patrol location (more affluent vs. less affluent neighborhoods) and amount of training (5 weeks, 10 weeks, or 15 weeks) to police job performance in a small town. The researcher finds a significant interaction in that the strength of the effect of training varies depending on patrol location; however, the direction of the effect does not change. This type of effect is a:
Quantitative Interaction
What are the two types of interactions in ANOVAs?
Quantitative and Qualitative
The interaction strengthens the effect (in the same direction)
Quantitative interaction
Assumptions for a correlation?
Random sampling, normally distributed, scale data, homogeneity of variance (variance is the same at all levels of the IV), linear relationship needed
Analysis of variance is the?
Ratio of between-groups variance to within-groups variance
(blank) refers to the tendency of scores that are particularly high or low to drift towards the mean over time.
Regression to the mean
The definition of (blank) is: predict z score on the dependent variable that is closer to the mean than is the z score on the independent variable.
Regression to the mean
The idea that patterns of extreme scores will balance out if sampling continues indefinitely or trends are looked at over the long run is known as:
Regression to the mean
Desmond thinks his new tutoring methods are highly effective compared to commercially available methods. He selects the worst students in his statistics class and tries his new tutoring strategy. Which statement describes a threat to the validity of his hypothesis even if the students do very well after the tutoring sessions?
Regression to the mean is likely to occur.
_____ is a useful statistical analysis for predicting behavior, and _____ is a useful technique for finding the direction and strength of a relation between two variables.
Regression; correlation
Which is used more often? Reliability or Validity?
Reliability
What 2 important measures do psychometricians examine?
Reliability and Validity
A psychometrician uses correlation to examine what important aspects of the development of measures?
Reliability and validity
Subjects
Removes variability of subjects from within-groups variance to find the error
What is another name for a one-way within-groups ANOVA?
Repeated-measures ANOVA
Multiple regression is often computed via....?
SPSS
SS interaction for two-way ANOVA?
SS (dose x gender)=SS between- SS dose- SS gender
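The subtraction above can be sketched as follows. The sums of squares are invented numbers, and `ss_interaction` is a hypothetical helper name, not something from the course materials.

```python
def ss_interaction(ss_between: float, ss_rows: float, ss_columns: float) -> float:
    """SS interaction = SS between - SS for each main effect."""
    return ss_between - ss_rows - ss_columns


# Hypothetical values: SS between = 120, SS dose = 70, SS gender = 30.
ss_dose_by_gender = ss_interaction(ss_between=120.0, ss_rows=70.0, ss_columns=30.0)
print(ss_dose_by_gender)  # 20.0
```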
Sum of squares for a two-way ANOVA?
SS total= s squared (grand/total variance) * (N-1)
Between-groups variance?
SSbetween= Sum of (M-GM) squared
r squared columns=?
SScolumns/(SStotal-SSrows-SSinteraction)
Within-subjects ANOVA, use label SSwithin to correspond with what term from the source table?
SSerror
When calculating R squared for one-way within-groups ANOVA not concerned with calculation for....?
SSerror (IS concerned with SSbetween, SStotal, and SSsubjects)
r squared interaction=?
SSinteraction/(SStotal-SSrows-SScolumns)
R squared rows=?
SSrows/(SStotal-SScolumns-SSinteraction)
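The three R squared formulas in this set follow one pattern: divide the effect's sum of squares by the total sum of squares minus the other two effects. A small sketch, with made-up sums of squares and a hypothetical function name:

```python
def r_squared_effects(ss_rows: float, ss_columns: float,
                      ss_interaction: float, ss_total: float) -> dict:
    """Effect sizes for the two main effects and the interaction in a two-way ANOVA."""
    return {
        "rows": ss_rows / (ss_total - ss_columns - ss_interaction),
        "columns": ss_columns / (ss_total - ss_rows - ss_interaction),
        "interaction": ss_interaction / (ss_total - ss_rows - ss_columns),
    }


# Hypothetical source-table values.
effects = r_squared_effects(ss_rows=30.0, ss_columns=70.0,
                            ss_interaction=20.0, ss_total=400.0)
for name, value in effects.items():
    print(f"R squared {name}: {value:.3f}")
```

Each value can then be compared with the small (0.01), medium (0.09), and large (0.25) conventions listed on the effect size cards.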
SStotal=?
SSwithin+SSbetween
MSwithin=?
SSwithin/dfwithin
Within-groups variance?
SSwithin= Sum of (X-M) squared
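To see how the sums of squares cards fit together, here is a rough sketch of a one-way between-groups ANOVA on invented scores. The function name and data are assumptions; MSbetween = SSbetween/dfbetween is the standard counterpart of the MSwithin card.

```python
from statistics import mean


def one_way_anova(groups: list) -> dict:
    """Sums of squares, mean squares, and F for a one-way between-groups design."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # SSwithin: squared deviation of each score from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    # SSbetween: (M - GM) squared, counted once for every score in the group.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SStotal: squared deviation of each score from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    df_between = len(groups) - 1
    df_within = sum(len(g) - 1 for g in groups)  # df1 + df2 + ... + dflast
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within, "ss_total": ss_total,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical scores for three groups.
result = one_way_anova([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
print(result)
```

With these scores, SSwithin + SSbetween adds up to SStotal, which matches the SStotal card.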
Which correlation uses "r" for its correlation coefficient?
Sample correlation
What is the Pearson Correlation Coefficient?
Sample correlation symbolized by "r" *Population correlation symbolized by "ρ" (called rho) *Intended for scale data on both variables
Homoscedasticity (Homogeneity of variances)
Samples come from population with the same variances
What type of graph is particularly useful for displaying a correlation?
Scatterplot
A significant F does not guarantee what?
Significant differences between means
You want to predict your score on the statistics final exam using your grade point average for the semester. Which statistical technique is best for this type of analysis?
Simple linear regression
In the equation for a line in statistics, the _____ is the predicted amount of increase for Y when X is increased by 1, and the _____ is the predicted value for Y when X crosses the y-axis (X = 0).
Slope; intercept
Values for R squared Effect Sizes?
Small, Medium, Large
R squared effect sizes?
Small= 0.01 Medium= 0.09 Large= 0.25
What are the r (effect size) values for a correlation?
Small= 0.1 Medium=0.3 Large=0.5
Nonparametric test
Special class of hypothesis tests (inferential statistics) used when assumptions for parametric tests are not met.
What is the best way to test a measure's internal consistency?
Split-half
How do you calculate the SM (standard error of the mean) in a Tukey HSD?
Square root of (MSwithin/N'), where N'= N(groups)/ sum of (1/N) for each group
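A short sketch of that standard error calculation, assuming MSwithin has already been taken from the source table. The function name and the numbers are invented for illustration.

```python
import math


def tukey_standard_error(ms_within: float, group_sizes: list) -> tuple:
    """Return (N', standard error) for a Tukey HSD test."""
    # N' = number of groups / sum of (1/N) over the groups.
    n_prime = len(group_sizes) / sum(1 / n for n in group_sizes)
    # Standard error = square root of (MSwithin / N').
    return n_prime, math.sqrt(ms_within / n_prime)


# Hypothetical: MSwithin = 4.0 with three groups of 4 scores each.
n_prime, s_m = tukey_standard_error(ms_within=4.0, group_sizes=[4, 4, 4])
print(n_prime, s_m)  # 4.0 1.0
```

When every group has the same size, N' is simply that group size, which agrees with the N' card.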
The Pearson r is also called a....?
Standardized measure (always between -1.00 and 1.00)
Standardized Regression Coefficient
Standardized version of the slope in a regression equation. Predicted change in Y, expressed in units of Standard Deviation, for an increase of one Standard Deviation in X. **Called Beta
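One way to check this card numerically: in simple linear regression, the slope b can be standardized as b times the square root of SS for X divided by the square root of SS for Y, and the result should match r. The sketch below assumes that formula (it is not written out in this set) and uses invented data.

```python
import math


def slope_beta_r(xs: list, ys: list) -> tuple:
    """Return (b, beta, r) for a simple linear regression of Y on X."""
    m_x = sum(xs) / len(xs)
    m_y = sum(ys) / len(ys)
    ss_x = sum((x - m_x) ** 2 for x in xs)
    ss_y = sum((y - m_y) ** 2 for y in ys)
    sp = sum((x - m_x) * (y - m_y) for x, y in zip(xs, ys))  # sum of products
    b = sp / ss_x                                  # raw slope
    beta = b * math.sqrt(ss_x) / math.sqrt(ss_y)   # standardized slope
    r = sp / math.sqrt(ss_x * ss_y)                # Pearson correlation
    return b, beta, r


# Hypothetical paired scores.
b, beta, r = slope_beta_r([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 5.0, 6.0])
print(b, beta, r)
```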
What does a negative correlation look like (-1.00)?
Starting in the top left corner and going down towards the bottom right of the graph
Structural Equation Modeling (SEM)
Statistical technique that quantifies how well the sample data "fit" a theoretical model that hypothesizes a set of relations among multiple variables. **Includes Manifest and Latent Variables
One particular type of reliability is?
Test-retest reliability
How can a correlation be an inferential statistic?
Testing the correlation for significance and making predictions
What does SPSS call the within groups variability?
The "Error" in an F distribution
What is the line used for a regression called?
The Line of Best Fit
What Correlation Coefficient table should you use based on the degrees of freedom?
The Pearson Correlation Coefficient Table NOT the Spearman Correlation Coefficient Table
Value of the correlation coefficient indicates what?
The STRENGTH of the relation (not the positive or negative sign)
What measures how much variability there is between actual scores and predicted scores?
The Standard Error of Estimate
Correlation can be used to establish what?
The VALIDITY of a personality test
Slope
The amount the Y is predicted to increase for an increase of 1 in X *called "b" in the formula
Mean square within groups (sometimes referred to as mean square error) in a between-groups ANOVA is?
The average of individual sample variances provided there are equal N's, weighted average of all sample variances (when not all equal N's), estimate of variance of each of population involved
Denominator in the F statistic is?
The average of the sample variances
The value of Beta should be the same value as what?
The correlation coefficient
Chi-Square Statistic is based on....?
The degrees of freedom
What is the first part discussed when looking at the results (even if there are also main effects), because it is the most important result?
The interaction
If shyness is negatively correlated with the number of friendships a person has, which statement regarding the line of best fit, or regression line, would be true?
The line will start in upper left corner of the graph and end in the lower right corner.
In repeated-measures analysis of variance (within-subjects design), how does magnitude of mean differences from one treatment to another contribute to F-ratio?
The mean differences add to numerator of F ratio
Conducting a hypothesis test for Pearson correlation coefficient r, we calculate degrees of freedom by subtracting 2 from the sample size. In Pearson correlation, the sample size is....?
The number of participants, because it is the number of PAIRED scores, not the number of INDIVIDUAL scores
Between MS (mean sum of squares) is associated with what part of the F-statistic?
The numerator
The term coefficient of determination is also known as:
The proportionate reduction in error (how much of improvement you get in predictions when you use the line of best fit).
A student researcher conducts a study that is analyzed using a two-way between-groups ANOVA. Unfortunately, none of the three F statistics were statistically significant. The student is confident that there is something to her research question, and she wonders if she simply didn't collect enough data to detect anything. She computes effect size measures and gets the following values: 0.012 for the first main effect, 0.073 for the second main effect, and 0.0017 for the interaction. If she were to conduct additional research, what would you recommend she pursue?
The second main effect because it has a medium effect size
Proportionate Reduction in Error should equal?
The square of the correlation coefficient
If the points on a scatterplot are all close to the regression line:
The standard error of the estimate is small
Which statement BEST describes an interaction in a two-way ANOVA?
The two independent variables have a combined effect on the dependent variable that is not present with either independent variable alone.
Correlation coefficient (in simple linear regression) is equal to what?
The value of B (Beta)
F distribution is the square of what?
The z distribution if there are only 2 samples and a sample size of infinity (or a very large sample); square of t distribution if there are only 2 samples
A bar graph displays the mean scores on a test of mental rotation for students majoring in psychology or chemistry who have either high or low spatial abilities. Psychology and chemistry students with low spatial skills scored about 35 percent correct on the mental rotation test, while psychology and chemistry students with high spatial skills scored about 70 percent correct. What can be inferred from these data?
There appears to be a main effect of spatial ability only
A bar graph displays the mean scores on a test of mental rotation for students majoring in psychology or chemistry who have either high or low spatial abilities. Psychology and chemistry students with low spatial skills scores about 35% correct on the mental rotation test, while psychology and chemistry students with high spatial skills scored about 70% correct. What can you infer from these data?
There appears to be a main effect of spatial ability only (might be another main effect but we don't know this)
One limitation of nonparametric tests is that confidence interval and effect size measures are typically not available. What is true of chi-square tests?
There is a measure of effect size, Cramer's V.
What can be concluded from the following chi-square test for independence result, χ²(1, N = 98) = 5.17, p < 0.05?
There is evidence of dependency between the two variables
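As a sketch of how Cramer's V could be attached to that result: the usual formula is the square root of chi-square divided by (N times the smaller of the row and column degrees of freedom). The formula and the helper name are not from the cards, and the 2 x 2 table size is inferred from df = 1.

```python
import math


def cramers_v(chi_square: float, n: int, n_rows: int, n_cols: int) -> float:
    """Effect size for a chi-square test for independence."""
    df_smaller = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi_square / (n * df_smaller))


# Values from the reported result: chi-square = 5.17, N = 98, df = 1 (2 x 2 table).
v = cramers_v(chi_square=5.17, n=98, n_rows=2, n_cols=2)
print(round(v, 2))  # 0.23
```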
In a one-way between-groups ANOVA, how many sums of squares are calculated?
Three: Total Sum of Squares, Sum of Squares Between Groups, and Sum of Squares Within Groups
True or False: If 2 variables are positively correlated, then low scores on one variable would be associated with low scores on the other variable.
True
True or False: The first step in calculating a correlation is to examine the data using a visual display
True
True or False: The strength of the correlation coefficient is independent of its sign.
True
True or False: When a two-factor design results in a significant interaction, you should be cautious about interpreting main effects because the interaction can distort, conceal, or exaggerate the main effects of the individual factors
True
True or False: Within-groups ANOVA, SSwithin groups= SSsubjects + SSerror
True
True or false: Validity is difficult to assess and is sometimes not considered as a result.
True
True or False: If a researcher expects a new teaching technique to be more effective for high school students than for younger middle school students, the researcher is predicting an interaction between teaching style and grade in school.
True (differential effect)
Linear Regression
Trying to figure out equation of a line *Includes an intercept and a slope
When you have 3 groups, and F is significant, how do you know where the differences are?
Tukey HSD and Bonferroni
The Educational Testing Services (ETS) is conducting a study to determine the relation between method of presentation (standard lecture vs. computer presentation) and type of lecture (psychology, English, statistics) to lecture comprehension. What kind of ANOVA should ETS use and what are the factors and/or levels?
Two-way ANOVA: Factor 1 is presentation, with two levels (standard vs. computer), and factor 2 is lecture type, with three levels (psychology, English, and statistics).
A (blank) is a hypothesis test that includes 2 nominal independent variables and an interval dependent variable, and a (blank) is a statistical analysis used with one interval dependent variable and at least 2 nominal independent variables
Two-way ANOVA; factorial ANOVA (beyond one IV)
A researcher wants to examine the effect of drugs and diet on systolic blood pressure. She randomly assigns 20 individuals with high blood pressure to one of four treatments: control (no diet or drugs), diet modification only, drug only, drug and diet modification. What type of design is appropriate for this study?
Two-way between-groups ANOVA
When comparing 3+ groups, use ANOVA. It would be incorrect to use multiple t tests instead because doing so increases the chances of making a....?
Type 1 Error (more likely to reject the null hypothesis by mistake)
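A rough illustration of how quickly the Type 1 error rate grows with multiple t tests. It assumes the comparisons are independent, which is a simplification, and the helper name is hypothetical.

```python
from math import comb


def familywise_error(n_groups: int, alpha: float = 0.05) -> tuple:
    """Return (number of pairwise t tests, chance of at least one Type 1 error)."""
    n_comparisons = comb(n_groups, 2)  # every pair of groups gets its own t test
    # Assumes independent comparisons: 1 - (1 - alpha)^c.
    return n_comparisons, 1 - (1 - alpha) ** n_comparisons


for k in (3, 4, 5):
    n_tests, rate = familywise_error(k)
    print(f"{k} groups: {n_tests} t tests, error rate about {rate:.3f}")
```

With 3 groups there are already 3 comparisons, and the overall rate is well above the intended 0.05.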
What are some limitations of regression?
Typically cannot establish causation (just like correlation does not mean causation)
Skew/Ceiling/Floor Effects
Underestimate the true relationship of a correlation
Restricted Range
Underestimates r *Small range in scores may underestimate r
What is Matched Groups?
Use different people who are similar on all of the variables that we want to control **We can analyze our data as if the same people are in each group, giving us additional statistical power.
Multiple regression differs from simple linear regression because it:
Uses more than one independent variable to make predictions
If between and within-groups variances are similar then?
Variability (F statistic) of about 1 will occur suggesting little difference between groups
What is a new assumption for a Correlation?
Variability around each data points should be the same (similarity around different data points)
F statistic equals?
(Variability due to treatment effect + variability due to individual differences) / variability due to individual differences
Latent Variables
Variables we are interested in studying but cannot be clearly defined or measured directly
What statistical measure best describes MS (Mean Sum of Squares, e.g., MSbetween and MSwithin)?
Variance (the average squared deviation from the mean)
Scatterplot
Visual portrayal of a correlational relationship *Look at BEFORE conducting correlation *Can have outliers distorting data dramatically *Each point on graph reflects one subject's scores
For a chi-square test for goodness-of-fit, what must be true to receive empirical support for the research hypothesis?
We are actually hoping for a bad fit between the observed data and what we expect according to the null hypothesis.
Is a sample with 346 participants normally distributed?
We don't know because we don't have information on the population
Within-groups ANOVA > between-groups ANOVA because within groups ANOVA:
We have REDUCED ERROR because same participants contribute to each condition of the study
Assume that we find a positive correlation between the number of hours students spend studying for an exam and their grade on the exam. If we calculate the regression equation for these data and find that the y intercept is 65, what conclusion can we draw?
When students do not study at all, we would predict a score of 65 on the exam.
What is an interaction effect?
When the effect of one IV on the DV changes as a result of the level of a second IV
What is an indicator of regression to the mean?
When the z score for the independent variable (X) is more extreme than the predicted z score for Y
When do we use an F distribution?
When we are working with 2+ samples
When is a within-groups ANOVA used?
When we have one IV with at least 3 levels, a scale DV, and the same participants in each group
Tukey HSD Test?
Widely used post-hoc test that uses means and standard error
The grand mean in a factorial ANOVA is NOT useful in calculating which of these?
Within groups sum of squares
A study of seasonal variations in subjects' level of self-reported depression records the level of reported depression at 4 points over the course of a year (different seasons). Which test is appropriate?
Within-groups ANOVA
When determining the critical value or cutoffs for the three effects tested in a two-way between-groups ANOVA, what value is always the same for each effect?
Within-groups degrees of freedom
Formula for converting a z Score to a Raw Score
Y (with hat)= Z(Y with hat)(SD(Y))+ M(Y)
Regression formula
Y (with hat)= a+bX
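The regression formula can be sketched in a couple of lines. The numbers come from the Penny card earlier in this set (slope 3, intercept 4, X of 7); the function name is made up.

```python
def predict_y(intercept: float, slope: float, x: float) -> float:
    """Predicted Y = a + bX, where a is the intercept and b is the slope."""
    return intercept + slope * x


y_hat = predict_y(intercept=4, slope=3, x=7)
print(y_hat)  # 25
```

Setting X to 0 returns the intercept, which is how the "students who do not study" card reads a y intercept of 65.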
When you reject the null hypothesis what do you really reject....?
You reject the idea that variables are independent so they are therefore DEPENDENT variables.
Formula for Regression to the Mean?
Z (subscript Y with hat)= (r subscript XY) (Z subscript X)
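A small sketch that chains this formula with the z-score-to-raw-score card: convert X to a z score, multiply by r, then convert the predicted z score back to a raw Y. All means, standard deviations, and the correlation below are invented.

```python
def predict_via_z(r_xy: float, x: float, mean_x: float, sd_x: float,
                  mean_y: float, sd_y: float) -> tuple:
    """Return (z for X, predicted z for Y, predicted raw Y)."""
    z_x = (x - mean_x) / sd_x        # z score on the independent variable
    z_y_hat = r_xy * z_x             # predicted z for Y = r * z for X
    y_hat = z_y_hat * sd_y + mean_y  # predicted Y = z(SD of Y) + mean of Y
    return z_x, z_y_hat, y_hat


# Hypothetical: r = 0.5, X = 130 on a scale with M = 100, SD = 15;
# Y has M = 50 and SD = 10.
z_x, z_y_hat, y_hat = predict_via_z(0.5, 130.0, 100.0, 15.0, 50.0, 10.0)
print(z_x, z_y_hat, y_hat)  # 2.0 1.0 60.0
```

Because r is between -1.00 and 1.00, the predicted z score for Y is never farther from the mean than the z score for X, which is regression to the mean in action.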
Chi-Square Test for Independence uses what for degrees of freedom?
df (row)= k(row)-1; df (column)= k(column)-1; df (χ²)= (df row)(df column)
Chi-Square Statistic degrees of freedom formula
df (χ²)= k-1 **k= the number of categories
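The two chi-square degrees-of-freedom rules above, side by side. The function names and example category counts are illustrative assumptions.

```python
def df_goodness_of_fit(k_categories: int) -> int:
    """Chi-square goodness-of-fit: df = k - 1."""
    return k_categories - 1


def df_independence(k_rows: int, k_columns: int) -> int:
    """Chi-square test for independence: df = (df row)(df column)."""
    return (k_rows - 1) * (k_columns - 1)


print(df_goodness_of_fit(4))  # 3
print(df_independence(2, 2))  # 1
print(df_independence(3, 4))  # 6
```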
What is the df if there are 3 levels of IV?
df between= 3-1= 2
A repeated-measures study examines the anxiety levels of 10 people with snake phobia in response to 4 types of snake stimuli. The df error (df within) is....?
df between= 4-1=3, df subjects= 10-1=9, df error= (3)(9)= 27
df within?
df1+df2+df3+....dflast
One-way within-groups ANOVA formula for degrees of freedom error?
dferror= (dfbetween) (dfsubjects)
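A quick sketch of the within-groups degrees of freedom, using the snake phobia card (4 stimulus types, 10 participants) as the example. The function name is hypothetical.

```python
def within_groups_df(n_levels: int, n_subjects: int) -> dict:
    """Degrees of freedom for a one-way within-groups (repeated-measures) ANOVA."""
    df_between = n_levels - 1
    df_subjects = n_subjects - 1
    return {
        "between": df_between,
        "subjects": df_subjects,
        "error": df_between * df_subjects,    # dferror = (dfbetween)(dfsubjects)
        "total": n_levels * n_subjects - 1,   # N total - 1
    }


df_snake = within_groups_df(n_levels=4, n_subjects=10)
print(df_snake)
```

The between, subjects, and error values add up to df total (3 + 9 + 27 = 39), a handy check on the arithmetic.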
Proportionate Reduction in Error formula?
r squared= (SS (total)- SS (error))/ SS (total) *SS (total)= the variability when we use the MEAN as a predictor *SS (error)= the variability around the REGRESSION LINE
What does each variable in the formula Z (underscore Y with hat)=r (underscore XY) (Z (underscore X)) mean?
r underscore XY= correlation between X and Y; z underscore X= z score for X; Z underscore Y with hat= predicted z score for Y
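A sketch of the proportionate reduction in error on invented data. It fits the least-squares line, computes SS total (error around the mean) and SS error (error around the line), and checks that the result matches r squared. The data and function name are assumptions.

```python
import math


def proportionate_reduction_in_error(xs: list, ys: list) -> tuple:
    """Return (PRE, r) for a simple linear regression of Y on X."""
    m_x = sum(xs) / len(xs)
    m_y = sum(ys) / len(ys)
    ss_x = sum((x - m_x) ** 2 for x in xs)
    sp = sum((x - m_x) * (y - m_y) for x, y in zip(xs, ys))
    slope = sp / ss_x
    intercept = m_y - slope * m_x
    # SS total: error when the mean of Y is used as the prediction.
    ss_total = sum((y - m_y) ** 2 for y in ys)
    # SS error: error when the regression line is used as the prediction.
    ss_error = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r = sp / math.sqrt(ss_x * ss_total)
    return (ss_total - ss_error) / ss_total, r


# Hypothetical paired scores.
pre, r_value = proportionate_reduction_in_error([1.0, 2.0, 3.0, 4.0, 5.0],
                                                [2.0, 4.0, 5.0, 4.0, 5.0])
print(round(pre, 3), round(r_value ** 2, 3))  # 0.6 0.6
```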
