Psych MSP Exam 2
Mean square within or MS(wn) is known as the "_________."
"error term"
Between-subjects factor
- A factor that is studied using independent samples in all conditions - Involves using the formulas for a between-subjects ANOVA
Within-subjects factor
- A factor that is studied using related (dependent) samples in all levels - Involves a set of formulas called a within-subjects ANOVA
Threats to internal validity
- Confounds that must be controlled so that a cause-effect relationship can be demonstrated - History, maturation, testing, instrumentation, statistical regression, selection, selection interactions, mortality/attrition
What issues are most important when recruiting participants for an experiment (including ethical issues that should be addressed)?
- Informed consent: "study might involve listening to a three-minute song" - Prescreening: screen out those at increased risk; legally able (e.g. over 18) - Recruiting participants: power, diversity of sample, prescreening, nonprobability sampling - Random assignment: not necessary to randomly select the sample but random assignment essential (want similar groups to ensure individual differences are equally represented in each group)
Restriction of range
- Problem that arises when range between the lowest and highest scores on one or both variables is limited (produces a coefficient smaller than it would be if range were not restricted); floor and ceiling effects - Decreases the power of our study or our ability to see whether a relationship exists
Coefficient of determination (r^2)
- Proportion of variability accounted for by knowing the relationship (correlation) between two variables - Reflects the usefulness or importance of a relationship (think effect size)
Why do we compute confidence intervals?
- So that we have an interval estimation - Because we are trying to best describe the population
Two-sample experiment
- Subjects are measured under two conditions of the independent variable - Condition 1 produces sample mean X̄(1) that represents µ(1) and condition 2 produces sample mean X̄(2) that represents µ(2)
Which threats (to internal validity) apply to a group design?
- Threats due to experiences or environment: history, maturation, testing, instrumentation - Threats due to participant characteristics: attrition/mortality, selection, selection interactions
Post hoc comparison
- We compare all possible pairs of means from a factor, one pair at a time, to determine which means differ significantly - Only performed when F-obtained is significant
Why are 95% confidence intervals so common?
- α is usually .05 - 1-α = probability of avoiding a Type I error - .95 probability that the interval contains the population mean (µ)
What are the 4 steps for computing the independent-samples t-test?
1. Calculate the estimated population variance for each condition 2. Compute the pooled variance 3. Compute the standard error of the difference between the means 4. Compute t-obtained for the two independent samples
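The four steps above can be sketched in Python; the scores and condition labels are made up for illustration:

```python
import math

# Hypothetical scores for two independent conditions (made-up data).
cond1 = [3, 5, 4, 6, 5, 7]
cond2 = [6, 8, 7, 9, 8, 10]

n1, n2 = len(cond1), len(cond2)
m1 = sum(cond1) / n1
m2 = sum(cond2) / n2

# Step 1: estimated population variance for each condition (n - 1 in the denominator).
s2_1 = sum((x - m1) ** 2 for x in cond1) / (n1 - 1)
s2_2 = sum((x - m2) ** 2 for x in cond2) / (n2 - 1)

# Step 2: pooled variance (weighted average of the two estimates).
s2_pool = ((n1 - 1) * s2_1 + (n2 - 1) * s2_2) / ((n1 - 1) + (n2 - 1))

# Step 3: standard error of the difference between the means.
se_diff = math.sqrt(s2_pool * (1 / n1 + 1 / n2))

# Step 4: t-obtained, testing H0: mu1 - mu2 = 0.
t_obt = (m1 - m2) / se_diff

df = (n1 - 1) + (n2 - 1)  # compare t_obt against t_crit at these df
```

t-obtained is then compared against t(crit) for df = (n1-1) + (n2-1) at the selected α.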
What are the steps for computing F-obtained?
1. Compute the total sum of squares (SS(tot)) 2. Compute the sum of squares between groups (SS(bn)) 3. Compute the sum of squares within groups (SS(wn)) 4. Compute the degrees of freedom (k-1 between, N-k within, N-1 total) 5. Compute the mean squares 6. Compute F-obtained
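A minimal Python sketch of these six steps for a hypothetical three-level factor (all data made up); it also computes eta squared from the same sums of squares:

```python
# Hypothetical scores for three levels of one factor (made-up data).
groups = {
    "level1": [2, 3, 4, 3],
    "level2": [5, 6, 5, 6],
    "level3": [8, 9, 7, 8],
}

all_scores = [x for g in groups.values() for x in g]
N = len(all_scores)
k = len(groups)
grand_mean = sum(all_scores) / N

# Steps 1-3: the sums of squares.
ss_tot = sum((x - grand_mean) ** 2 for x in all_scores)
ss_bn = sum(len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups.values())
ss_wn = ss_tot - ss_bn  # equivalently, squared deviations within each group

# Step 4: degrees of freedom.
df_bn, df_wn = k - 1, N - k

# Steps 5-6: mean squares and F-obtained.
ms_bn = ss_bn / df_bn
ms_wn = ss_wn / df_wn
f_obt = ms_bn / ms_wn

# Effect size: eta squared = proportion of total variability between groups.
eta_sq = ss_bn / ss_tot
```

Note that SS(tot) = SS(bn) + SS(wn), which is why SS(wn) can be found by subtraction.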
Critical values for the independent-samples t-test (t(crit)) are determined based on what 3 things?
1. Degrees of freedom: df = (n(1)-1) + (n(2)-1) 2. The selected α 3. Whether a one-tailed or two-tailed test is used
What are the 4 different types of IV manipulation?
1. Environmental 2. Scenario 3. Instructional 4. Physiological
What are the two versions of a two-sample t-test?
1. Independent-samples t-test 2. Related-samples t-test (or dependent-samples)
What are the 3 ways to maximize the power of a t-test?
1. Larger differences produced by changing the independent variable increase power 2. Smaller variability in the raw scores increases power 3. A larger N increases power
What are the 4 most important things to do when designing a simple experiment?
1. Maximize power 2. Ensure IV manipulation is reliable and valid 3. Ensure DV measurement is reliable and valid 4. Maximize internal validity
To maximize power in the independent-samples t-test, you should do what 3 things?
1. Maximize the size of the difference between the means 2. Minimize the variability of the scores within each condition 3. Maximize the size of N, that is, n(1) + n(2)
What are the three steps for setting up a one-sample t-test?
1. Set up the statistical hypotheses (H0 and HA); these are done in precisely the same fashion as in the z-test 2. Select alpha; α = .05 typically is used 3. Check the assumptions for a t-test
What are the 4 ways to maximize power in a simple experiment?
1. Strong manipulation of the IV 2. Extreme levels of the IV 3. Homogeneity of participants 4. Increase N
The critical F value (F(crit)) depends on what 3 things?
1. The degrees of freedom (both the df(bn) = k-1 and the df(wn) = N-k) 2. The α selected 3. The F-test is always a one-tailed test
What are the 4 assumptions of the independent-samples t-test?
1. The dependent scores measure an interval or ratio variable 2. The populations of raw scores form normal distributions 3. The populations have homogeneous variances (homogeneity of variance = the variances of the populations being represented are equal) 4. While Ns may be different, they should not be massively unequal
What are the 3 assumptions of the one-way between-subjects ANOVA?
1. The experiment has only one independent variable and all conditions contain independent samples 2. The dependent variable measures normally distributed interval or ratio scores 3. The variances of the populations are homogeneous
What are the 3 assumptions for a t-test?
1. You have one random sample of interval or ratio scores 2. The raw score population forms a normal distribution 3. The SD of the raw score population is estimated by computing s(x), the estimated population SD
What are the degrees of freedom for an independent-samples t-test that uses two samples with n=12 in each sample?
22
If our α is .01, what would our confidence interval be?
99%
Fisher's Least Significant Difference (LSD) test
A commonly used post hoc test that computes the smallest amount that group means can differ in order to be significant
Experiment
A design that includes manipulation of an IV, measurement of a DV, random assignment, and control of confounds
Quasi-experiment
A group design in which a researcher compares pre-existing or naturally occurring groups that are exposed to different levels of a variable of interest
Squared point-biserial correlation
A measure of effect size for the independent-samples t test, providing the percentage of variance in the outcome (or DV) accounted for by the predictor (or IV)
Linear relationship
A relationship between two variables in which the scores change together in one consistent direction, so the pattern fits a straight line
Correlation
A relationship between variables
Multiple regression (R)
A statistical technique that compares both the individual and combined contribution of two or more variables to the prediction of another variable; used when we have more than two variables and we want to know the predictive validity that results from knowing the relationship among all the variables
Levene's Test for Equality of Variances
A statistical test that examines whether the variability within different samples is similar
Eta squared (η^2)
A measure of effect size that tells us the percentage of variability in the measured (dependent) variable accounted for by group membership
Multiple independent-groups design
A study examining the effect of a manipulated IV or the relationship of a variable which had three or more levels on a DV; the participants in each level of the IV are unrelated
Simple experiment
A study investigating the effect of a manipulated IV with two conditions on a DV; the IV is nominal scale and the DV is interval or ratio
Ecological validity
A type of external validity that assesses the degree to which a study's findings generalize to real-world settings
Correlational design
A type of study that tests the hypothesis that variables are related
Confound
A variable that varies systematically with the variables of interest in a study and is a potential alternative explanation for causality
What is the chance (probability) we have made a Type I error (rejecting the null when the null is really true)?
Alpha (α)
ANOVA is the abbreviation of what?
Analysis of variance
Which threat is not directly controlled for in an experiment?
Attrition/mortality
Standard error of the estimate
Average difference between the predicted Y' value for each X and the actual Y values
What is the purpose of an experiment? A. Describe B. Predict C. Explain
C. Explain
Demand characteristics
Characteristics of the study that lead participants to guess at the study's hypothesis and change their behavior accordingly (threat to internal validity)
For nominal data, what is the appropriate test for comparing a sample to a population or an expected value?
Chi-square goodness of fit test
How does an experiment control for threats to internal validity?
Control for as many confounds as possible by: keeping extraneous variables controlled across IV conditions and using random assignment so participants are assigned to an IV randomly
Confidence interval
Defines the highest mean difference and the lowest mean difference (and the values in between) we would expect for the population mean represented by the results found in our study
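As a sketch, here is a 95% confidence interval for a single µ computed from made-up scores; the t(crit) value (2.262 for df = 9, two-tailed α = .05) is taken from a standard t-table:

```python
import math

# Hypothetical sample (made-up data); 95% CI for the population mean.
scores = [4, 6, 5, 7, 6, 5, 4, 6, 5, 7]
N = len(scores)
mean = sum(scores) / N

# Estimated population SD (N - 1) and estimated standard error of the mean.
s_x = math.sqrt(sum((x - mean) ** 2 for x in scores) / (N - 1))
se = s_x / math.sqrt(N)

# Two-tailed t_crit for df = 9 and alpha = .05, from a t-table.
t_crit = 2.262

lower = mean - t_crit * se
upper = mean + t_crit * se
# We are 95% confident the interval (lower, upper) contains mu.
```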
Descriptive designs __________, correlational designs ___________, and quasi-experiments/experiments ___________.
Describe; predict; explain
Point estimation
Describes a point on the variable at which the µ is expected to fall
Point-biserial correlation coefficient (r(pb))
Describes the relationship between a dichotomous variable and an interval/ratio variable; interpreted similarly to a Pearson correlation coefficient
Effect size
Describes the strength of the effect of an IV (or the strength of the relationship between a predictor and outcome in a correlational study)
Mean square between groups
Describes the variability between the means of our levels
Mean square within groups
Describes the variability of scores within the conditions of an experiment
__________ and __________ designs have better external validity, while _________ and __________ designs have better internal validity.
Descriptive; correlational; quasi-experiments; experiments
Group design
Design in which a researcher compares two or more groups of participants who are exposed to different levels of a variable of interest
Treatment effect
Differences produced by the independent variable
In a one-sample t-test, what do you do after you compute the estimated population SD, or s(x)?
Divide the difference between the sample mean and µ by the standard deviation of the sampling distribution of means; because σ(x) is usually not available, we use the estimated standard error of the means (S(x), computed as s(x)/√N)
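Putting the one-sample t-test together in Python (made-up scores; µ is the value under H(0)):

```python
import math

# Hypothetical sample compared to a value expected under H0 (made-up data).
scores = [10, 12, 11, 13, 12, 14, 11, 13]
mu = 10  # population mean under H0

N = len(scores)
mean = sum(scores) / N

# Estimated population SD, s_x (N - 1 in the denominator).
s_x = math.sqrt(sum((x - mean) ** 2 for x in scores) / (N - 1))

# Estimated standard error of the mean.
se = s_x / math.sqrt(N)

# t-obtained, compared against t_crit at df = N - 1.
t_obt = (mean - mu) / se
df = N - 1
```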
Regression equation
Equation that describes the relationship between two variables and allows us to predict Y from X
Mean square within or MS(wn) estimates what?
Estimates the error variance in the population
Mean square between groups or MS(bn) estimates what?
Estimates the variability between the means of the levels in a factor
What are the two measures of effect size for a one-sample t-test?
Eta squared and Cohen's d
Dependent-groups experiment
Experiment in which the groups are related, in that participants were matched prior to exposure to the IV or in that the participants experience all levels of the IV
Between experiments and quasi-experiments, ___________ have better internal validity and ___________ have better external validity.
Experiments; quasi-experiments
The statistic for the ANOVA is _______.
F
True or false: There is more focus on external validity in an experiment as compared to a quasi-experiment.
False (there is more focus on external validity in a quasi-experiment; quasi-experiments often use real-world interventions rather than artificial lab settings)
If ________ is false, then MS(bn) contains estimates of both error variance (which measures difference within each population) and treatment variance (which measures difference between the populations).
H(0) or null
If __________ is true, each condition is a sample from the same population.
H(0) or null
When ________ is true, MS(bn) estimates the variability among the individual scores in the population just like MS(wn) does, and so MS(bn) should be equal to MS(wn).
H(0) or null
The null hypothesis in numerical terms is H(0): M = µ. Another way to consider the null is as an equation, which is?
H(0): µ - M = 0 (zero)
We know that in an independent-samples t-test we could state our null as H(0): µ(1) = µ(2). However, we usually state it as __________ instead.
H(0): µ(1) - µ(2) = 0
What are the threats (to internal validity) of a one group pre-post test design?
History, maturation, instrumentation (all have to do with experiences or environment) and statistical regression (biased recruitment)
Mean square between groups or MS(bn) is used to determine what?
How each level mean deviates from the overall mean of the experiment (indicates how different the level means are from each other)
What threats (to internal validity) are inherent to a quasi-experiment?
Instrumentation, selection, attrition/mortality
We assume there is one value for the error variance in the population and each _________ estimates that value.
MS(wn) or mean square within groups
___________ is an estimate of the error variance, the inherent variability within a population represented by the samples.
MS(wn) or mean square within groups
The greater the _________, the more accurately the t-distribution represents the population means (S(x) will be closer to σ(x)).
N (sample size)
Degrees of freedom = __________ because we are making an estimate about the population and we are dealing with a sample.
N - 1
One-group pre-posttest design
Nonexperimental design in which all participants are tested prior to exposure to a variable of interest and again after exposure
For interval or ratio data, what is the appropriate test for comparing a sample to a population or an expected value?
One-sample t-test
Experimenter expectancy effect (or Rosenthal effect)
Phenomenon in which a researcher unintentionally treats the groups differently so that results support the hypothesis (threat to internal validity)
Hawthorne effect
Phenomenon in which participants change their behavior simply because they are in a study and have the attention of researchers (threat to internal validity)
What are the two ways to estimate the population mean (µ)?
Point estimation and interval estimation
Matched random assignment
Process in which participants are put into matched sets and then each member of the set is assigned to one IV level so that all in the set have an equal chance of experiencing any of the levels
Linear regression
Process of describing a correlation with the line that best fits the data points (Y' = bX + a)
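A small Python sketch of fitting Y' = bX + a to made-up paired scores:

```python
# Hypothetical paired X, Y scores (made-up data); fit Y' = bX + a.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n

# Slope b: sum of XY cross-products over sum of squared X deviations.
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx  # intercept: the line of best fit passes through (mx, my)

def y_pred(x):
    """Predicted Y (Y') for a given X."""
    return b * x + a
```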
Prescreening
Process of identifying those who have characteristics that the researcher wants to include or exclude in the study
Between-subjects experiment (or independent-groups experiment)
Random assignment to conditions; each subject serves in only one condition
Floor effect
Restricting the lower limit of a measure so that lower scores are not assessed accurately (lowest score of a measure is set too high)
Ceiling effect
Restricting the upper limit of a measure so that higher levels of a measure are not assessed accurately (highest score of a measure is set too low)
What terms make up a summary ANOVA table?
Source (treatment, error, total), sum of squares (SS(B), SS(W), SS(tot)), degrees of freedom, MS/mean square, and F-obtained
Interval estimation
Specifies a range of values within which we expect µ to fall
What would you do if you wanted to get the estimated population variance from the estimated population SD?
Square everything (eliminate the square root sign from the estimated population SD formula)
How does the standard error of the difference differ from the standard deviation?
Standard error of the difference is the average variability in a sampling distribution of differences BETWEEN means, while SD is the average deviation of each score FROM the mean
Pearson's r (Pearson's product-moment correlation coefficient)
Statistic used to describe a linear relationship between two interval/ratio measures; describes the direction (positive or negative) and strength of the relationship
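Pearson's r can be computed by hand from made-up paired scores; squaring it gives the coefficient of determination (r^2):

```python
import math

# Hypothetical paired interval scores (made-up data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # sum of cross-products
ssx = sum((x - mx) ** 2 for x in xs)
ssy = sum((y - my) ** 2 for y in ys)

r = sp / math.sqrt(ssx * ssy)  # direction and strength, ranges from -1 to +1
r_sq = r ** 2                  # proportion of variability accounted for
```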
Predictor variable
The X variable used to predict a Y value; term used instead of IV in a correlational design
Standard error of the difference between the means (SD(x-x))
The average variability in a sampling distribution of differences between means
t-distribution
The distribution of all possible values of t computed for random sample means selected from the raw score population described by H(0); each sample has its own calculated mean, estimated SD, and t-obtained
Internal validity
The extent to which we can say that one variable caused a change in another variable
Experiment-wise error rate
The overall probability of making a Type I error anywhere among the comparisons in an experiment
One-sample t-test
The parametric inferential procedure for a one-sample experiment when the standard deviation of the raw score population must be estimated; compare a sample to a population, to a known or expected score
Analysis of variance (ANOVA)
The parametric procedure for determining whether significant differences occur in an experiment containing two or more sample means
Independent-samples t-test
The parametric procedure used for testing two sample means from independent samples
Two-sample t-test
The parametric statistical procedure for determining whether the results of a two-sample experiment are significant
Criterion variable
The Y variable being predicted; term used instead of DV in a correlational design
Power
The probability of rejecting the null when the null is in fact false
Manipulation check
The process of verifying that the participants noticed/attended to the manipulation; usually appears after the DV measure
In relation to an ANOVA, what does eta squared (η^2) indicate?
The proportion of variance in the dependent variable that is accounted for by changing the levels of a factor
What are we stating if we reject the null in an independent-samples t-test?
The sample mean difference represents a difference between two population µs that is significantly different from zero
F-distribution
The sampling distribution showing the various values of F that occur when H(0) is true and all conditions represent one population
Standard error of the means (σ(x) or SEM)
The standard deviation of the sampling distribution of means
Cohen's d (d)
The standardized size of the difference between the two means
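A quick sketch using made-up summary numbers (the means and pooled variance are hypothetical, e.g. from the two conditions of an independent-samples design):

```python
import math

# Hypothetical two-sample results (made-up numbers).
m1, m2 = 5.0, 8.0
s2_pool = 2.0  # pooled variance from the two conditions

# Cohen's d: difference between the means in pooled-SD units.
d = (m1 - m2) / math.sqrt(s2_pool)
# By Cohen's conventions, |d| near .2 is small, .5 medium, .8 large.
```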
Line of best fit
The straight line that best fits a correlation and consists of each X value in the relationship and its predicted Y value
Sum of squares (SS)
The sum of squared deviation scores, obtained by subtracting the mean from each score and squaring the result
Sum of squares total or SS(total)
The sum of squared deviations around the mean of the entire sample
Diffusion of treatment
The treatment administered to one group is shared with another group through cross-group interactions (threat to internal validity)
What is σ(x)?
The true standard deviation of the raw score population
Y predicted (Y')
The value that results from entering a particular X value in a regression equation
What does it mean if Levene's test is significant?
If Levene's test is significant (p < .05), the variances are not homogeneous (the homogeneity of variances assumption is violated)
When is a one-way ANOVA performed?
When only one independent variable is tested in the experiment
When is the Tukey HSD multiple comparisons test used?
When the Ns in all levels of the factor are equal
When are two samples independent?
When we randomly select participants for a sample, without regard to who else has been selected for either sample
What is tested in a chi-square goodness of fit test?
Whether the observed frequencies of the categories reflect the expected population frequencies
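A minimal sketch with made-up counts, assuming an H(0) of equal expected frequencies across three categories:

```python
# Hypothetical observed category counts (made-up data).
observed = [30, 20, 10]
N = sum(observed)

# H0: the three categories are equally likely in the population.
expected = [N / 3] * 3

# Chi-square obtained: sum of (O - E)^2 / E over the categories.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # df = number of categories - 1
```

chi_sq is then compared against the chi-square critical value for df = k - 1 at the selected α.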
What are reasons for conducting a factorial study?
You have reason to expect that the relationship between your variables will depend on a third, moderating variable or you want to systematically control confounds or extraneous variables
Because we can have different regions of rejection (critical values), the probability of __________ can change.
a Type I error
You'll have an independent-groups factorial design if __________.
all the IVs or predictors (factors) are independent groups
You'll have a dependent-groups factorial design if __________.
all the IVs or predictors are dependent groups
We perform interval estimation by creating a ______________.
confidence interval
When the t-test for independent samples is significant, a __________ for the difference between the two µs should be computed.
confidence interval
When F-obtained is significant, it indicates that two or more means ___________. It does not indicate which specific means __________.
differ significantly (the same phrase fills both blanks)
Using the ANOVA allows us to compare the means from all levels of the factor and keep the _________ equal to α.
experiment-wise error rate
When there are more than two means in an experiment, using multiple t-tests results in a _________ much larger than the one we have selected.
experiment-wise error rate
In ANOVAs, an independent variable is called a ____________.
factor
At a card players' club, the poker players had a contest with the blackjack players to see who could win the most money. The appropriate design for testing the significance of the difference between the means is __________.
independent-samples t-test
The symbol for the number of levels in a factor is _________.
k
The ANOVA uses different terminology for variance. What we previously called the estimated population variance is now a ____________.
mean square (mean of squared deviations)
With the one-sample t-test, we find the difference between the ____________ and our ____________.
population mean; sample mean
When the F-test is significant, we perform __________ comparisons.
post hoc
Differently shaped t-distributions will have differently shaped ______________.
regions of rejection
If the difference between our two groups' means is greater than what we would expect by chance, we __________.
reject the null and retain the alternative
If the difference between our two groups' means is about what we would expect because of individual differences, we __________.
retain the null
The computations for the ANOVA require the use of several ___________.
sums of squared deviations / sum of squares (SS)
In an experiment involving only two conditions of the independent variable, you may use either a _________ or _________.
t-test; ANOVA
Post hoc comparisons are like __________.
t-tests
Always use a ___________ confidence interval even if your test is one-tailed.
two-tailed
You'll have a mixed factorial design if _________.
you have at least one independent-groups factor and one dependent-groups factor
Calculating a one-sample t-test is a very similar process to calculating a __________.
z-test
Use the ______ test when σ(x) is known and use the ______ test when σ(x) must be estimated by calculating s(x), the estimated population SD.
z-test; t-test
The confidence interval for a single _________ describes a range of values of __________.
µ (the same symbol fills both blanks: the confidence interval for a single µ describes a range of values of µ)
When we use a t-test, we only have one comparison between the two means in an experiment, so the experiment-wise error rate equals _________.
α (alpha)
We obtain the appropriate value of t(crit) from the t-tables using both the appropriate ________ and ________.
α and df