PSY 333 Midterm 2
What is Linear Regression? a. what are the two types?
-extension of Pearson's correlation (which measures the strength & direction between 2 scale variables) -assists in predicting the value of a scale DV Ex: motivation & GPA GPA = DV, trying to predict Types: 1)Single linear regression 2)Multiple regression
What is a type II error?
-the failure to reject a false null hypothesis or FALSE NEGATIVE -reason why it is so important to check for normality with dependent variable -"telling lady with bump that she isn't pregnant" -skewness makes denominator larger...which makes t value smaller...which will make it not significant when it actually might be
What is Levene's test of Equality of Error Variances?
-determines whether there is statistical evidence that the variances are equal across groups -test for equality of variance: tests whether the variances of the groups are similar (it compares variances, not means) Ho: equal variances assumed Ha: equal variances not assumed ****when p > .05 we fail to reject the null... - if p > .05, equal variances assumed - if p < .05, the equality of variances assumption is violated
What is a one-way ANOVA?
1 independent variable with 3 levels or more -used to test for differences between groups with three or more means/conditions -a parametric test -between-groups design with three or more levels of the independent variable
What are the three ratios from Two-way/Factorial ANOVA? - Define main effect and interaction effect
1&2 = Main effect: each IV's effect on DV (note: if 2 levels, the value = t test) 3 = Interaction effect: effect on DV of 2 or more IVs working together Example - IV1: beer or pizza - IV2: males or females - Main Effect 1: With diet, is there a difference between means not taking sex into consideration? - Main Effect 2: Regardless of which group they were assigned to, is there a change in weight based on sex? - Interaction Effect: how does sex interact with diet to see if there are differences in change of weight?
1. What is Binary Logistic Regression? 2. What is Multinomial Logistic Regression?
1. BLR: If there are only 2 categories in the DV 2. MLR: If there are more than 2 categories of the DV "think bi=2 and m=more"
In Linear Regression... 1. What is the Criterion Variable? 2. What is the Predictor Variable?
1. Criterion: Y, DV 2. Predictor: X, IV -overlap area of criterion and predictor variable = how much influence the IV (predictor) has on the DV (criterion)...the more overlap = the more influence it has
Nonparametric Test: What are the violations of assumptions of parametric tests?
1. DV is not "scale" (i.e. ordinal) 2. DV is scale but not normally distributed -may be due to small sample size (<25) OR... -outliers/skewness
What are the assumptions for a Two-way ANOVA?
1. DV is scale 2. DV has a normal distribution -since it is a parametric test
Explain the parts of the scientific method as an ongoing process (7 steps)
1. Develop general theories 2. Make observations 3. Think of interesting questions 4. Formulate hypotheses 5. Develop testable predictions 6. Gather data to test predictions 7. Refine, Alter, Expand, or Reject Hypotheses DMTFDGR
1. Larger R² value = 2. Smaller R² value =
1. Larger R² = actual & estimated values are closer together, good predictor, close to 1 (i.e. R² = .90) 2. Smaller R² = actual & estimated values are far apart, not a good predictor, close to 0 (R² = .02)
What tests are used when... 1. Looking for pattern in the data = 2. Looking for differences between groups/means/mean ranks = 3. Two-way/Factorial ANOVA =
1. Pearson's, Chi-square, Spearman's 2. All of the others 3. no agreed-upon nonparametric test, need to do a lit search and calculate by hand
In Single Linear Regression... 1. What is the regression line? 2. What is the prediction equation?
1. Regression line: line that best fits observed data points (the line with the smallest total squared distance from the data points) 2. Prediction equation: a precise equation derived from observed relationships, the equation of the line Y = a + bX
In linear regression, R² 1. R² or r*r = ? 2. R² is the proportion ______
1. R² is the correlation coefficient squared (r*r), r = direction and strength 2. R² is the proportion of variance in Y explained by X in the model
What are the two types of multiple regression? When are each used?
1. Standard Multiple Regression - use when you have 1 model 2. Hierarchical Multiple Regression - use when you have > 1 model and have prior data...used to "control" for known predictor variables and then test for effects of other variables
1. What are nonparametric statistics? 2. How do they normally work?
1. allows you to violate assumptions needed for parametric stat and still obtain reliable findings 2. many nonparametric statistics order the data and then rank the data (to minimize the effect of the assumption violations)
What are the three main types of study design?
1. correlational design 2. independent groups/between groups design 3. dependent groups/repeated measures/within groups design
1. what does large/small = 2. what does similar/similar = 3. what does small/large =
1. large/small = reject null - At least one mean is an outlier and each distribution is narrow; distinct from each other 2. similar/similar = fail to reject null - Means are fairly close to overall mean &/or distributions overlap a bit; hard to distinguish 3. small/large = fail to reject null - The means are very close to overall mean &/or distributions "melt" together
How do you test for normality?
1. look at histogram: superimpose normal curve onto histogram 2. run statistical test: EXPLORE Command (S-W or K-S) check normality test results and for outliers, follow up with z-score calculations for skewness and kurtosis
What are other possible solutions to the "Problem of Multiple Comparisons?"
1. reduce the number of correlations (ex. from 10 to 3 so BFC= .05/3) 2. collect preliminary data to identify key factors of interest and collect new data/only focus on those correlations 3. use a multiple regression model instead (consider IVs together)...to see how good variables are at predicting your DV
What is ANOVA?
ANalysis Of VAriance = a variability ratio(called an "F ratio") of the Sum of squares(which is an index of variance) -used to compare the means of more than 2 groups -not asking if they are EXACTLY equal...we are asking if each mean likely came from the larger overall population
What is a correlational design and what tests are used?
Correlational design: examining the ASSOCIATIONS between pre-existing or naturally occurring variables -Pearson's Correlation [Spearman's Correlation] [Chi-square Test of Independence]
What can error bars in a bar graph represent? What is the rule of thumb with standard error bars?
Error bars can represent: standard error of the mean (SEM) or 95% CI -if the SE bars do not overlap, then likely to have significance between the group or significant difference between the means
What is the F ratio = ?
F = MS[between] / MS[within] = Variance[between] / Variance[within], where each MS (mean square) = SS / df -This F ratio is like a "signal-to-noise ratio"; it is a more complicated formula because you no longer have only 2 groups...you have 3 or more groups -The greater the difference between the means of the groups relative to the spread within the groups, the more likely you are to get significance and reject the null hypothesis -More spread within groups = makes the denominator larger, so F gets smaller ***The smaller the F ratio the less likely you are to get significance
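Worked example (a minimal sketch, not from the course materials): the F ratio computed by hand for three small groups. The data and the function name `f_ratio` are made up for illustration; each sum of squares is divided by its degrees of freedom before taking the ratio.

```python
import statistics

def f_ratio(groups):
    """One-way ANOVA F ratio: between-group variance / within-group variance."""
    all_scores = [s for g in groups for s in g]
    grand_mean = statistics.mean(all_scores)
    # "Signal": how far each group mean is from the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # "Noise": how spread out the scores are inside their own group
    ss_within = sum((s - statistics.mean(g)) ** 2 for g in groups for s in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical data: same group means (2, 3, 6), different spread within groups
narrow_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
wide_groups = [[-4, 2, 8], [-3, 3, 9], [0, 6, 12]]

f_narrow = f_ratio(narrow_groups)  # small spread within groups -> large F
f_wide = f_ratio(wide_groups)      # large spread within groups -> small F
print(f_narrow, f_wide)
```

Same means, more spread within groups = smaller F, which is the "signal-to-noise" idea.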
How do you reduce the severity of Bonferroni?
Focus on the most relevant analyses to reduce amount of correlations/the severity of Bonferroni
Nonparametric Tests: Friedman Test
Fr - test of within groups differences between 3 or more related groups **use if violate assumptions of Repeated Measures ANOVA "Fr is for Fries and you want more than 3 fries so you repeat your favorite restaurant"
Nonparametric Tests: Kruskal-Wallis Test
H -test of between-group differences between 3 or more groups **use if violate One-way ANOVA "Krusty krab serves Hamburgers and one-way or anova Im gonna find ya, Im gonna Kruskal-Wallis ya"
What is the null hypothesis for ANOVA?
Ho: μ₁ = μ₂ = μ₃ Population mean of first group is close enough to population mean of second group... -Alternative is somewhere the means are not equal: Can be one (1 & 2) or all (1&2, 2&3)
What is the operational definition? Why is it important?
Operational definition: precise statement of how a conceptual variable is turned into a measured variable -Important so that study can be replicated and all aspects of study are clearly explained (other labs might measure differently) ex. how do you operationally define road rage?
What are pairwise comparisons? -What test does this apply to?
Pairwise comparisons show if there are significant differences between pairs of means... for each condition of IV -if p value < .05, then the specific pairwise comparison is significant -applies to repeated measures (within-groups design) ANOVA Ex: between Woman 1 and Woman 2, between Woman 1 and Man 1
R² is a Ratio of... -What do the values of the denominator and numerators mean?
Ratio of the summed squares of the distance between the estimated values to the mean (numerator) with the distance between actual values to the mean (denominator) Numerator: distance b/w estimated values to mean Denominator: distance b/w actual values to mean ...Ratio: distance bw (estimated)/(actual) values to mean -larger denominator = smaller R² = line not good at predicting values (points are far away from mean) -1/1 or R² = 1, actual values = estimated values -Note: the numerator of R² estimated values = values of regression line
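Worked example (a sketch under assumptions): R² computed as the ratio described above, using made-up study-time and quiz-score data. The function name `r_squared` and the data are hypothetical; the line is fitted with the usual least-squares formulas.

```python
import statistics

def r_squared(x, y):
    """R² = SS(estimated values to mean) / SS(actual values to mean)."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    # Least-squares slope (b) and intercept (a) of the regression line
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    estimated = [a + b * xi for xi in x]  # values on the regression line
    ss_estimated = sum((e - mean_y) ** 2 for e in estimated)  # numerator
    ss_actual = sum((yi - mean_y) ** 2 for yi in y)           # denominator
    return ss_estimated / ss_actual

study_time = [1, 2, 3, 4, 5]          # hypothetical X
quiz_score = [2, 4, 5, 4, 5]          # hypothetical Y
perfect_score = [3, 5, 7, 9, 11]      # falls exactly on a line

r2_noisy = r_squared(study_time, quiz_score)
r2_perfect = r_squared(study_time, perfect_score)
print(r2_noisy, r2_perfect)
```

When the actual values sit exactly on the line, numerator = denominator and R² = 1.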
What is replication and what does it do?
Replication: refers to the repetition of a research study -If replicated study achieves the same or similar results as the OG study, it gives greater validity to the findings ... --if results can be replicated it helps establish a theory --an independent lab replicating gives more strength --adding validity to findings --If you cannot replicate results, you CANNOT support your theory *Important to help advance psychological and medical research
What is the Sum of Squares (SS)? What is variance?
SS: the sum of the squared deviations (difference) of data points from the distribution mean = the indicator/index of variance -variance: spread of data
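Worked example (a minimal sketch with made-up scores): SS computed directly from its definition, then turned into a sample variance by dividing by N - 1. The function name `sum_of_squares` is hypothetical.

```python
import statistics

def sum_of_squares(values):
    """Sum of squared deviations of each data point from the mean."""
    mean = statistics.mean(values)
    return sum((v - mean) ** 2 for v in values)

scores = [2, 4, 4, 4, 5, 5, 7, 9]             # hypothetical data, mean = 5
ss = sum_of_squares(scores)
sample_variance = ss / (len(scores) - 1)       # SS is the index of variance
print(ss, sample_variance)
```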
Nonparametric Tests: Wilcoxon-Signed Ranks Test
T -test of within-groups differences between two related groups **use if violate assumptions of Paired-samples t-test "Wilcoxon=cox= T for testicles = pair of them"
Which type of error is worse? Type I or Type II?
Type I error is worse -if you move standard of judgement to make less type I, there will be more type II -in research it is bad because it is saying there is a cause-effect when there isn't...it is easier to prove something exists rather than try to prove something no longer exists -ex. innocent found guilty
Nonparametric Tests: Mann-Whitney __ Test
U -test of between-group differences between two groups: ordinal or non normal scale variable **use if violate assumptions of independent-samples t-test "Mann U suck, I am an independent woman"
What is a Dependent Groups/Within Groups Design and what tests are used?
Within groups design: looking for DIFFERENCES WITHIN the same group of participants - Paired-samples t-test [Wilcoxon Signed-Ranks Test] - Repeated-measures ANOVA [Friedman Test]
What is a Repeated-measures ANOVA?
Within-groups design with 1 IV and 3 or more levels (more than 2 groups) -all participants receive all levels Ex. Measure heart rate & blood pressure at 3 time points during a movie
Regression Line Equation: Y = a + bX...what does each value mean? -How do you use the equation?
Y: estimated value of Y, given X a: y-intercept b: slope coefficient: how much the DV changes for a 1 unit change in IV -use line equation to predict values of the DV from IV (plug in X to get Y) -the points along this regression line are estimated values
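Worked example (a sketch, assuming the standard least-squares formulas for a and b): fit the line to made-up data, then plug in a new X to get the estimated Y. The names `fit_line` and `predict` and the data are hypothetical.

```python
import statistics

def fit_line(x, y):
    """Return (a, b) for Y = a + bX using least squares."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))   # slope
    a = mean_y - b * mean_x                        # y-intercept
    return a, b

def predict(a, b, x_new):
    """Plug X into the line equation to get the estimated Y."""
    return a + b * x_new

study_time = [1, 2, 3, 4, 5]   # hypothetical IV (hours)
quiz_score = [2, 4, 5, 4, 5]   # hypothetical DV

a, b = fit_line(study_time, quiz_score)
predicted = predict(a, b, 6)   # estimated quiz score for 6 hours of study
print(a, b, predicted)
```

Here b says how much the estimated quiz score changes for each extra hour of study.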
What is the z score? -What is significant?
Z score: # of SD a score is from the mean If z score is within +/- 1.96 (95% of scores are within this range) then data is not significant - only a 5% likelihood that z score is outside this range (this is the significance level at .05)
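Worked example (a minimal sketch with made-up numbers): convert a score to a z score and check it against the +/- 1.96 cutoff. The function names are hypothetical.

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score is from the mean."""
    return (score - mean) / sd

def is_significant(z, critical=1.96):
    """True when z falls outside +/- 1.96 (the .05 significance level)."""
    return abs(z) > critical

# Hypothetical distribution: mean = 100, SD = 15
z_high = z_score(130, 100, 15)   # 2 SDs above the mean
z_near = z_score(110, 100, 15)   # less than 1 SD above the mean
print(z_high, is_significant(z_high))
print(z_near, is_significant(z_near))
```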
What is ClinicalTrials.gov?
a database of privately and publicly funded clinical studies conducted around the world -find a study given condition, disease, drug name, researcher, etc. -can use if someone has condition that is not responding to treatment
Define 3. Spurious relationship
a plausible but false relationship -Ex: the negative association of athletics to GPA "disappears" when studying is taken into account...it is not that being an athlete leads to lower GPA, it is that being an athlete leads to less study time which in turn leads to lower gpa. Being an athlete is a spurious relationship because study time is the actual relationship
a. What is skewness? b. What is a negative skew and positive skew? c. What happens to mean, median, and mode when a curve is skewed?
a. Skewness: measure of asymmetry b. positive skew= right skew or pulled to right -negative skew= left skew or pulled to left c. right skew: mean gets pulled to right... Mean>Median>Mode left skew: mean gets pulled to left... Mean<Median<Mode
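Worked example (a sketch with made-up data): mean, median, and mode for a right-skewed set of scores and its mirror image. The data are hypothetical and chosen so the ordering is easy to see.

```python
import statistics

# Hypothetical right-skewed scores: a few large values pull the mean to the right
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 12]
# Mirror image of the same scores gives a left-skewed set
left_skewed = [13 - v for v in right_skewed]

def center_measures(values):
    """Return (mean, median, mode) for a list of scores."""
    return statistics.mean(values), statistics.median(values), statistics.mode(values)

r_mean, r_median, r_mode = center_measures(right_skewed)
l_mean, l_median, l_mode = center_measures(left_skewed)
print(r_mean, r_median, r_mode)   # mean > median > mode
print(l_mean, l_median, l_mode)   # mean < median < mode
```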
a. What is kurtosis? b. What is positive and negative kurtosis? c. How does kurtosis affect the bell shape?
a. kurtosis: the degree of peakedness/flatness in the variable distribution b. positive: high degree of peakedness, kurtosis >0, leptokurtic - negative: low degree of peakedness, kurtosis <0, platykurtic c. it doesn't...only becomes more flattened bell or tall bell: mean=median=mode
Examples of uses for multiple regression a. sociologists b. psychologists c. educational policy makers
a. sociologist: "Which social indicators best predict whether or not a new immigrant group will adapt into society?" b. psychologist: "Which personality characteristics best predict social adjustment in new environments?" c. edu policy makers: "What are the best predictors of academic success in college?"
What is the Bonferroni correction?
adjustment made to p-value, accounts for problems of multiple data comparisons - protects from Type I error -makes it harder to get significance so less likely to get false positives
Define anecdotal evidence -what is it used for?
anecdotal evidence is the use of anecdote: unusual or amusing incident/story - used to support a claim -commonly used in decision making or testimonials *should NOT be used as evidence
Define 1. Collective Effects
collective effects: the interplay among factors (i.e. non-drinking, female, athlete to predict gpa) -allows you to see interplay or how factors interact -collective or all together...rather than separate I.e. Multiple Regression
Define 2. Control variables
control variables: measures the impact of a variable "above & beyond" the effects of other variables (helps account for spurious relationships) -above & beyond = control variables I.e. Hierarchical multiple regression
What is Logistic Regression?
dependent variable is measured at the nominal level - used to see if something does or does not happen Ex: a student passes or fails a class with a certain instructor. Motivation & major-related job/internship/grad school when graduate (yes/no)
How is the Bonferroni correction calculated? -What happens if you make the correction?
divide the original alpha value (.05) by the # of analyses you plan to run on the DV Ex. if you have 10 correlations with college Gpa then altered p-value = .05/10 = .005 so p<.005 is significant - Now vulnerable to type II error: Less type I → more type II
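Worked example (a minimal sketch): the corrected cutoff for 10 analyses and which p-values still count as significant. The p-values and function names are made up for illustration.

```python
def bonferroni_alpha(alpha, n_tests):
    """Corrected significance cutoff: original alpha / number of analyses."""
    return alpha / n_tests

def significant_after_correction(p_values, alpha=0.05):
    """Flag each p-value as significant (True) or not under the corrected cutoff."""
    corrected = bonferroni_alpha(alpha, len(p_values))
    return [p < corrected for p in p_values]

# Hypothetical p-values from 10 correlations with college GPA
p_values = [0.001, 0.004, 0.02, 0.049, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]

cutoff = bonferroni_alpha(0.05, len(p_values))   # .05 / 10
flags = significant_after_correction(p_values)
print(cutoff, flags)
```

Note that .02 and .049 would have been significant at .05 but are not after the correction, which is where the extra Type II risk comes from.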
What type of evidence is needed to support a theory?
empirical evidence to provide support for a theory -must be subjected to scientific scrutiny
What is the Standard Error of the Mean? What does a larger sample size mean? What does a high or low standard error of the mean represent?
how precisely the sample mean estimates the population mean; used to find where the true population mean exists -SEM = SD/√N -larger sample size = smaller SEM = more representative -smaller value = more representative the sample is of the population (the closer the mean values are to each other) -larger value = less representative the sample is of the population (the further the sample mean is from the population mean)
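Worked example (a minimal sketch with made-up numbers): SD/√N for the same SD at two sample sizes. The function name `standard_error` is hypothetical.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of N."""
    return sd / math.sqrt(n)

# Hypothetical: same SD of 10, two different sample sizes
sem_small_sample = standard_error(10, 25)
sem_large_sample = standard_error(10, 100)
print(sem_small_sample, sem_large_sample)
```

Quadrupling N only halves the standard error, because N sits under a square root.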
What does R² indicate?
how well this line predicts/estimates actual values -the higher the value, the better it is at predicting values...the closer it is to 1 the better predictor and the closer to 0 the worse predictor
What is the Standardized Coefficients Beta (β)?
in hierarchical regression, it is the standardized slope...report these values*** + slope = positive relationship (if significant) - slope = negative relationship (if significant)
What is the Unstandardized Coefficient B?
in hierarchical regression...it tells us the value of the slope for each independent variable (+/- relationship)
Define empirical evidence
information collected through systematic observation or experimentation to test hypotheses
How can you identify inconspicuous design flaws?
manipulation check, ask colleagues and peers -pilot study, feedback, etc.
What can lead to "unnatural" non-normal distributions? 1. _________ affects the type of statistical analyses that can be conducted
manner in which data was collected may lead to an "unnatural" non-normal distribution -biases, ceiling, or floor effects lead to non-normal distributions where there should be a normal one 1. normality
What is mean rank?
nonparametric tests take the mean of the ranks and compares the differences in the mean of ranks to see if there is a significant difference STEPS: 1. Data from both groups combined are assigned ranks from the lowest to the highest 2. Then the ranks given to one group are compared with the ranks given to the other group 3. The mean ranks shown here indicate which group had the rating of "importance of body" as ranked higher...go to test statistic to determine if significant
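Worked example (a sketch of steps 1 and 2 only, with made-up ratings): combine both groups, rank from lowest to highest, then average the ranks per group. Tied scores share the average of their ranks, which is an assumption about tie handling; the function names are hypothetical, and the significance test itself is not shown.

```python
def average_ranks(values):
    """Rank values from lowest (1) to highest; ties get the average rank."""
    ranks = []
    for v in values:
        less = sum(1 for other in values if other < v)
        equal = sum(1 for other in values if other == v)
        ranks.append(less + (equal + 1) / 2)
    return ranks

def mean_ranks(group_a, group_b):
    """Mean rank of each group after ranking the combined data."""
    ranks = average_ranks(group_a + group_b)
    ranks_a = ranks[:len(group_a)]
    ranks_b = ranks[len(group_a):]
    return sum(ranks_a) / len(ranks_a), sum(ranks_b) / len(ranks_b)

# Hypothetical ratings from two independent groups
group_a = [3, 5, 8]
group_b = [1, 2, 4]

mean_rank_a, mean_rank_b = mean_ranks(group_a, group_b)
tied_example = average_ranks([1, 2, 2, 3])   # the two 2s share rank 2.5
print(mean_rank_a, mean_rank_b, tied_example)
```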
What type of distribution naturally exists?
normal distributions are common but not universal -BUT many non-normal distributions naturally exist
What is a ceiling effect and what does it cause?
occurs when scores can go no higher than an upper limit and "pile up" at the top -causes negative skew -ex. scores on an easy exam
What is a floor effect and what does it cause?
occurs when scores can go no lower than a lower limit and "pile up" at the bottom -causes a positive skew -ex. household income, # of children, very difficult test
When testing for normality what is the p value and when is it a normal distribution (null or alternative?)
p > .05: we fail to reject the null, so the distribution is treated as normal Ho: your distribution = normal distribution Ha: your distribution ≠ normal distribution
Explain empirical research 1. what is the objective 2. what is gathered 3. what method is used
psychology is an empirical discipline 1. objective to conduct scientific research in the most unbiased and controlled fashion 2. gather and analyze empirical evidence 3. uses the scientific method
What is a type I error?
rejecting the null when you should not - FALSE POSITIVE due to pure chance - "a guy is pregnant" - saying there is an effect when there really isn't - if p = .05 there is still a 5% chance you get significance when it is actually a fluke
Nonparametric Tests: Spearman's Correlation/ Spearman's Rho
rₛ or ρ -measures of association between two variables: scale/ordinal, ordinal/ordinal **use if violate assumptions of Pearson's correlation (Extra violation if non-linear) "through spear through o"
What is 95% Confidence Interval?
the range of values above and below the sample mean within which the true population mean exists with 95% certainty
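Worked example (a sketch under assumptions): a 95% CI built as mean +/- 1.96 standard errors. Using 1.96 assumes the normal (z) cutoff; a t-based cutoff would give a somewhat wider interval for small samples. The numbers and the function name are made up.

```python
import math

def confidence_interval_95(mean, sd, n):
    """Return (lower, upper) bounds: sample mean +/- 1.96 * SEM."""
    sem = sd / math.sqrt(n)      # standard error of the mean
    margin = 1.96 * sem          # assumption: normal (z) cutoff for 95%
    return mean - margin, mean + margin

# Hypothetical sample: mean = 100, SD = 10, N = 25
lower, upper = confidence_interval_95(100, 10, 25)
print(lower, upper)
```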
What is critical to the scientific process?
transparency!!!
What is a two-way ANOVA? -__x__ = ?
two-way ANOVA: two IVs with any number of levels per variable, "Factorial ANOVA" - # x # = 2 independent variables, where the number is how many levels are in each variable
What are evidence based decisions?
uses empirical evidence to make evidence-based decisions about a program, practice or policy using the best available research data (and reexamine when new data is collected)
What is the problem of multiple comparisons?
when conducting multiple analyses on the same DV, the chance of committing a Type I error increases
What is partial eta squared?
ηp² -a measure of effect size -closer to 0 the smaller the effect and the closer to 1 the larger the effect -aka eta squared -use when running two-way/factorial ANOVA
What is Wilks' Lambda? -What test does it apply to?
λ -if p value is < .05, there is a significant difference between means -repeated measures ANOVA (within-group)
What is Multiple Regression?
'Multiple Linear Regression' -multiple = 2 or more IV/predictor variables to predict a single scale DV/criterion -explains variation in the DV from each IV...IVs can operate independently or work together -*The statistical procedure allows you to go beyond correlation and look at collective effects (the interplay) among factors Equation: Y = a + b1 X1 + b2 X2 +...+ bk Xk EX: study time (x1) & motivation to achieve (x2) predict quiz score (Y)
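Worked example (a minimal sketch): plugging predictor values into Y = a + b1 X1 + b2 X2. The intercept and slopes below are made-up numbers, not fitted from real data, and `predict_multiple` is a hypothetical name; in practice the coefficients come from the regression output.

```python
def predict_multiple(intercept, slopes, predictors):
    """Estimated Y = a + b1*X1 + b2*X2 + ... + bk*Xk."""
    if len(slopes) != len(predictors):
        raise ValueError("need one slope per predictor")
    return intercept + sum(b * x for b, x in zip(slopes, predictors))

# Hypothetical coefficients: a = 1.0, b1 (study time) = 0.5, b2 (motivation) = 2.0
a = 1.0
slopes = [0.5, 2.0]

# A student with 4 hours of study time and a motivation score of 3
predicted_quiz = predict_multiple(a, slopes, [4, 3])
print(predicted_quiz)
```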
What is Single Linear Regression?
'Simple Linear Regression or Linear Regression' -a statistical technique that describes the relationship between 2 variables by calculating a prediction equation -single refers to # of IVs used to predict...so simple = single = 1 predictor variable Ex: study time (x) predicts quiz score (y)
What is a quasi-independent variable?
An independent variable that is NOT manipulated because it occurs naturally Ex. can't make someone a different age or race
What is an outlier? 1. Mild Outlier 2. Extreme Outlier
An outlier is an extreme score 1. mild outlier = from 1.5 to 3 IQRs beyond 1st or 3rd quartile, o symbol 2. extreme outliers = > 3 IQRS beyond 1st or 3rd quartile, * symbol
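Worked example (a sketch under assumptions): flag mild and extreme outliers using the 1.5 and 3 IQR rules above. The quartiles come from Python's `statistics.quantiles` with its default method, which is an assumption; statistics packages may compute quartiles slightly differently. The data and the function name are made up.

```python
import statistics

def classify_outliers(values):
    """Return {value: 'mild' or 'extreme'} for scores beyond the IQR fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)   # assumption: default quartile method
    iqr = q3 - q1
    labels = {}
    for v in values:
        # Distance beyond the 1st or 3rd quartile (0 if inside the box)
        distance = max(q1 - v, v - q3, 0)
        if distance > 3 * iqr:
            labels[v] = "extreme"     # shown with * in a boxplot
        elif distance > 1.5 * iqr:
            labels[v] = "mild"        # shown with o in a boxplot
    return labels

scores = [2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40]   # hypothetical data
outliers = classify_outliers(scores)
print(outliers)
```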
What is an Independent Groups/Between Groups Design and what tests are used?
Between groups design: looking for DIFFERENCES BETWEEN groups of participants - Independent-samples t-test [Mann-Whitney U Test] - One-way ANOVA [Kruskal-Wallis Test] - Two-way/Factorial ANOVA
How do you determine if skewness or kurtosis or both are violating normality? a. equation for skewness b. equation for kurtosis
Use descriptives table a. Z = Skewness/Std. Error b. Z = Kurtosis/Std. Error If outside +/- 1.96 range = significant
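Worked example (a minimal sketch): the two z calculations using made-up values of the kind read off a descriptives table. The statistics, standard errors, and function names are hypothetical.

```python
def normality_z(statistic, std_error):
    """Z = skewness (or kurtosis) divided by its standard error."""
    return statistic / std_error

def violates_normality(statistic, std_error, critical=1.96):
    """True when the z value falls outside +/- 1.96."""
    return abs(normality_z(statistic, std_error)) > critical

# Hypothetical descriptives: skewness = 0.85 (SE = 0.30), kurtosis = -0.40 (SE = 0.60)
skew_z = normality_z(0.85, 0.30)
kurt_z = normality_z(-0.40, 0.60)
skew_problem = violates_normality(0.85, 0.30)    # outside the range
kurt_problem = violates_normality(-0.40, 0.60)   # inside the range
print(skew_z, skew_problem, kurt_z, kurt_problem)
```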
Nonparametric Tests: Chi Square Test of Independence
χ² - measure of association between two variables: nominal/nominal or nominal/ordinal "chi is big so you want to nom nom (take a bite)"