PSYC 301 - Exam 2
F is an omnibus test means....? For our purposes, just say that F tells ....
whether the groups differ from each other or not; it doesn't necessarily tell us which groups differ from each other. A significant F means there is a significant effect, or significant difference, between the groups.
Type .... errors, however, are not as directly controlled but instead are related to factors such as sample size. Type ... errors are particularly sensitive to the number of subjects in a sample, and as that number increases, Type .... errors ..... In other words, as the sample characteristics more closely match those of the population (achieved by increasing the sample size), the likelihood that you will accept a false null hypothesis .....
II II Decreases decreases
What does it mean to reject the null hypothesis?
If p < .05, the statistical procedure is telling you that your data pattern would be less than 5% likely if the null were true. As social scientists, we're willing to take this 5% risk and reject the null (that is, reject the starting assumption that these groups are fundamentally the same/equivalent).
Why do the tails never, ever touch the x-axis?
If the tails did touch, the probability of an event being very extreme in one or the other tail would be absolutely zero. But since they do not touch, there is always a chance, no matter how perfect we might be, that an event can occur - no matter how small and unlikely its probability might be.
What does p < .05 mean?
"the probability of observing that outcome is less than .05" and often expressed in a report or journal article simply as "significant at the .05 level"
Imagine 100 people conducted studies comparing math ability across genders. 10 studies found men are better than women at math; 90 found that they aren't. Which set is more likely to get published? Why?
The 10 studies. It is actually likely that those 10 are Type I errors. The apparent consistency in the published literature makes us incorrectly think the null is false.
The research hypothesis is ...tailed and .... because it posits a ...., but ...
2 nondirectional a difference, but in no particular direction
Let's pretend the p-value is .02. This means....
2% chance of obtaining those results if there were no effect of time of presentation
A 3x2 factorial design: 3 indicates: 2 indicates: ==>how many possibilities
3 levels of 1 group factor 2 levels of the other grouping factor. ==> 6 possibilities
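As a quick check on the counting, here is a minimal sketch that enumerates the cells of a 3x2 design. The level names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical level names; only the counts (3 and 2) come from the card.
factor_a = ["low", "medium", "high"]   # first factor: 3 levels
factor_b = ["control", "treatment"]    # second factor: 2 levels

# Every combination of one level from each factor is one cell of the design.
cells = list(product(factor_a, factor_b))
print(len(cells))  # 6
```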
The probability of a raw score falling within + or - 1.96 z-scores or standard deviations is ...% The probability of a raw score falling within + or - 2.58 z-scores or standard deviations is ...%
95% 99%
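These percentages can be checked against the normal curve. The sketch below is one way to do it using only the standard library; the helper names are illustrative, and 2.58 is used as the usual two-decimal rounding of the 99% cutoff.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def area_within(z):
    """Area between -z and +z under the standard normal curve."""
    return normal_cdf(z) - normal_cdf(-z)

print(round(area_within(1.96), 2))  # 0.95
print(round(area_within(2.58), 2))  # 0.99
```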
The area in the tails gets bigger with ....dfs That means with smaller samples you have to move the critical values farther out to keep ....
Smaller the same % (Ex. 5%) in the critical region.
The null hypothesis is really false. If you accept the false null hypothesis, you make a ..... ..... are also represented by the .... letter ... or
Type II error Type II errors Greek beta
The level of chance or risk you are willing to take is expressed as .... It is the risk associated with not being 100% confident that what you observe in an experiment is due to the treatment or what was being tested.
a significance level
What is t-test for dependent samples? Also, is called?
A single group of the same subjects is studied under 2 conditions, so each subject is tested more than once and there are 2 sets of scores. Also called: repeated-measures design, within-subjects design.
the accepted cutoff for calling a result "statistically significant". Data more improbable than this cutoff will lead to rejecting the null hypothesis. Traditionally set at .05 (5% risk), but can be anything.
alpha level
Effect size gives us ...
an idea about the relative positions of 1 group to another
Statistical significance is used to ...
back up your argument that some variable matters but significance does not necessarily always equal MEANINGFULNESS
If the obtained value is more extreme than the critical value, the null .... If the obtained value does not exceed the critical value, the null..
cannot be accepted (reject the null) is the most attractive explanation (fail to reject the null)
Step 5:
determination of the value needed for rejection of the null hypothesis using the appropriate table of critical values for the particular statistic
Hypothesis testing is the process of ...
determining statistical significance
In an independent t-test, step 5: Our 1st task is to determine ... Every combination of alpha level and sample size (.....) has ....
the critical value. Every combination of alpha level and sample size (expressed as degrees of freedom, df, which approximate the sample size) has a corresponding critical value.
If the null hypothesis is false, you fail to reject it: p-value was .... than .05 ==>
greater Type II error (a.k.a False negative)
If the null hypothesis is true, you fail to reject: p-value was .... than .05 ==>
greater correct (but bummer)
Step 8:
If the obtained value does not exceed the critical value, the null hypothesis is the most attractive explanation. (If the obtained value is smaller than the critical value, fail to reject the null hypothesis.) This means your p-value is bigger than your cutoff.
The formula for computing the t-value for the t-test for independent means... the numerator is.... the denominator is ...
is the difference between the means the amount of variation within and between each of the 2 groups
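A minimal sketch of that ratio, assuming the standard pooled-variance form of the independent-means t-test (the card does not spell the formula out). The scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group1, group2):
    """Pooled-variance t for two independent groups; returns (t, df)."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Denominator: variability within the groups, pooled across both.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Numerator: difference between the two group means.
    return (mean(group1) - mean(group2)) / standard_error, df

# Hypothetical scores.
t, df = independent_t([7, 8, 9, 10, 11], [4, 5, 6, 7, 8])
print(t, df)  # t is 3.0 with df = 8
```

With df = 8 and a two-tailed alpha of .05, the tabled critical value is 2.306, so an obtained value of 3.0 would lead to rejecting the null.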
Step 4 (cont..): If the p-value is smaller than alpha: We.... If the p-value is bigger than alpha: We....
is to compare the p-value we obtained to our significance level (alpha = .05). We reject the null. You haven't "PROVEN" anything; there's still a 5% or less chance that you are wrong. We FAIL to reject the null. Your research hypothesis might still be right, but this data hasn't given you enough evidence to REJECT the null.
If the outcome representing the obtained value falls to the left of the critical value (it is .... (less/more) extreme), the conclusion is that the ..... is the most attractive explanation for any differences that are observed. In other words, the obtained value falls in the region (....% of the area under the curve) where we ....
less the null 95% where we expect only outcomes due to chance to occur
The file drawer effect: Studies that fail to reject the null are ...
less likely to get published than studies that reject the null (this is called publication bias)
If the null hypothesis is true, you reject it: p-value was .... than .05. ==> .....
lower Type I error. (a.k.a False positive)
If the obtained value falls to the right of the critical value (it is ... (less/more) extreme), the conclusion is that the .... is the most attractive explanation for any differences that are observed. In other words, the obtained value falls in the region (...% of the area under the curve) where we .....
more the research hypothesis 5% where we would expect only outcomes due to something other than chance to occur.
In an independent t-test, step 4: compute the test statistic, this is called the....
obtained value (OV)
For example. if the level of significance is .01, it means that ... (Type I error)
on any one test of the null hypothesis, there is a 1% chance you will reject the null hypothesis when the null is true and conclude that there is a group difference when there really is no group difference at all.
Using the standard error of the mean allows us ...
once again to use the table of z-scores to determine the probability of an outcome
What does one-way analysis of variance mean?
only one grouping dimension 1 factor/1 treatment variable is being explored and this factor has more than 2 levels
F is big ==> p is .... ==> reject or fail to reject the Null? (the Null is false or true?) The numerator is smaller or bigger than the denominator? ==> there is a big effect or not?
p is small ==> reject the null ==> The null is false Bigger ==> there is a big effect
Every obtained value has an associated ....
p-value
the probability of getting your observed results (or more extreme results) if the null hypothesis were true
p-value
Important: dependent t-test the Null does not..... Under the Null, some individuals will have ... +The average is .. +In the other words, ...
posit that each individual has to have a difference score equal to zero. Some individuals will have a positive difference score whereas others will have a negative difference score. +The average is close to zero. +In other words, the changes are essentially random.
The null hypothesis is really false. If you reject the false null hypothesis, when there .... This is also called ....
really are differences between the 2 groups. power 1 - beta
Type I errors ...
Rejection of a null that is actually true. False positive/mistaken hit. The chance of a Type I error is equal to your level of significance (alpha). On any one test of the null hypothesis at alpha = .05, there is a 5% chance that you will make a Type I error.
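One way to see the 5% figure is to simulate many studies in which the null really is true and count how often a two-tailed Z-test at alpha = .05 rejects it. This is only a sketch: the sample size, number of studies, and seed are arbitrary assumptions.

```python
import random
from math import sqrt

random.seed(0)  # arbitrary seed so the run is repeatable

def null_study_rejects(n=25):
    """Draw one sample from a population where the null is true (mean 0, SD 1)
    and report whether a two-tailed Z-test at alpha = .05 rejects the null."""
    sample = [random.gauss(0, 1) for _ in range(n)]
    sem = 1 / sqrt(n)                 # population SD is known to be 1
    z = (sum(sample) / n - 0) / sem   # obtained value
    return abs(z) > 1.96              # more extreme than the critical value?

n_studies = 10_000
type_i_rate = sum(null_study_rejects() for _ in range(n_studies)) / n_studies
print(type_i_rate)  # should land close to .05
```

Every rejection counted here is a false positive, because the data were generated with no effect at all.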
Example of degrees of freedom: Imagine a sample of 3 scores that has a mean of 5, the first 2 scores in the sample can be anything, they are independent of each other and are free to vary. However, once those 2 numbers are "Set", the third number is ...
restricted to a specific number (chapter 11, page 6/10)
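A small sketch of the same idea: once the mean and the first two scores are fixed, the third score can be computed rather than chosen. The two "free" scores below are arbitrary.

```python
n = 3
target_mean = 5
first_two = [2, 9]  # free to vary; these particular values are arbitrary

# The three scores must sum to n * mean, so the last one is forced.
third = n * target_mean - sum(first_two)
print(third)  # 4

# Only n - 1 = 2 scores were free to vary.
degrees_of_freedom = n - 1
```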
as t gets bigger, the p-value gets .... Remember, the test is ultimately the probability of ....
smaller getting that particular t ( in this type of test) under the assumption that the underlying t should be zero (The null)
When the obtained value is bigger than the critical value, then your p-value is ... When the obtained value is smaller than the critical value, then your p-value is...
smaller than your cutoff bigger than your cutoff.
The critical value is the point beyond which the obtained outcomes are judged to be ...
so rare that the conclusion is that the obtained outcome is not due to chance but to some other factor.
F is based on ...
sources of variance rather than the mean differences
The degree of risk you are willing to take that you will reject a null hypothesis when it is actually true is called the ___________.
statistical significance
Sum of Squares (between) is the ..... ==>
sum of the squared differences between the mean of each group and the mean of all scores ==> An idea of how different each group's mean is from the overall mean
What does the word Significant mean?
that any difference between the attitudes of the 2 groups is due to some systematic influence and not due to chance. we assume that all of the other factors that might account for any differences between groups were controlled.
A confidence interval is ...
the best estimate of the range of a population value (or population parameter) that we can come up with given the sample value (or sample statistic representing the population parameter)
The formula of one sample Z-test?
the differences between the sample mean and the population mean makes up the numerator the denominator, an error term, is called the Standard Error of Mean (SEM) and is the value we could expect by chance, given all the variability that surrounds the selection of all possible sample means from a population.
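A minimal sketch of that ratio, with the two-tailed p-value taken from the normal curve. The ACT-style numbers are invented for illustration, and the function names are not from the course materials.

```python
from math import erf, sqrt

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Obtained value: (sample mean - population mean) / SEM."""
    sem = pop_sd / sqrt(n)  # standard error of the mean
    return (sample_mean - pop_mean) / sem

def two_tailed_p(z):
    """Probability of a z at least this extreme in either tail, if the null is true."""
    upper_tail = 1 - 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return 2 * upper_tail

# Hypothetical values: a class of 36 with mean 22.5 vs. a population mean of 21, SD 5.
z = one_sample_z(sample_mean=22.5, pop_mean=21.0, pop_sd=5.0, n=36)
p = two_tailed_p(z)
print(round(z, 2), round(p, 3))  # z = 1.8, p is about .072
```

Because 1.8 does not exceed the critical value of 1.96 (and p is above .05), this made-up example would fail to reject the null.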
Some characteristics of dependent t-test:
The groups are not independent of each other; instead, they are dependent on one another. The test compares the means of each set of scores and focuses on the differences between paired scores.
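Because the test works on difference scores, it can be sketched as a one-sample test on those differences. The formula used here (mean difference divided by its standard error) is the standard paired t, which is an assumption since the cards do not state it; the pretest and posttest scores are made up.

```python
from math import sqrt
from statistics import mean, stdev

def dependent_t(pretest, posttest):
    """Paired t: mean difference score divided by its standard error; returns (t, df)."""
    differences = [post - pre for pre, post in zip(pretest, posttest)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1

# Hypothetical scores for the same 5 participants, tested twice.
pretest = [3, 5, 4, 6, 2]
posttest = [5, 6, 6, 7, 5]
t, df = dependent_t(pretest, posttest)
print(round(t, 2), df)  # t is about 4.81 with df = 4
```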
We conclude that the difference we observed is very unlikely if ...
the null is true
The F-formula: the numerator and denominator mean.... Ratio 1 means: ... ===>
The numerator: Mean Sum of Squares (between) --> due to the grouping factor --> the effect. The denominator: Mean Sum of Squares (within) --> due to chance --> error, chance, individual differences. A ratio of 1 means the amount of variability due to within-group differences is equal to the amount of variability due to between-group differences ===> any difference is not significant.
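The ratio can be worked through by hand. The sketch below computes both sums of squares, divides each by its degrees of freedom (k - 1 between, N - k within), and forms F. The scores are invented so that the arithmetic comes out cleanly.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)

    # Between: how far each group mean is from the grand mean (weighted by group size).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within: how far each score is from its own group's mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)

    df_between = len(groups) - 1               # k - 1
    df_within = len(all_scores) - len(groups)  # N - k

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

# Hypothetical scores for 3 groups.
f_ratio, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_ratio, 2), df_b, df_w)  # 13.0 2 6
```

Here the between-group mean square (13) is much larger than the within-group mean square (1), which is the "F is big" case.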
The appropriate table of critical values tells you ...
the obtained value that corresponds to the p-value you've set as alpha (typically .05)
The real differences between a Z-test and a t-test are that for a t-test, .. .
the population's standard deviation is not known while a Z-test, it is known. Another difference is that the tests use different distributions of critical values to evaluate the outcomes (which makes sense given that they're using different test statistics)
Hypothesis testing is ...
the process of determining statistical significance. Remember, though, that you haven't PROVEN your hypothesis, only supported it indirectly by REJECTING the null. You're still not 100% sure and never can be. Errors are always possible.
Why do degrees of freedom influence the critical value? The shape of the t-distribution is ... The kurtosis ... When the dfs reach infinity, .. Fewer dfs -->
the shape of the t-distribution is always bell shaped and has a mean of zero The kurtosis (variability) differs as a function of the degrees of freedom When the dfs reaches infinity, it's exactly the same as a normal (Z) distribution Fewer dfs --> flatter (the area in the tails gets bigger)
The standard error of the mean is... It's the best ... Reflects ...
the standard deviation of all the possible means selected from the population. It's the best Estimate we can come up with, given that it is impossible to compute all the possible means Reflects what that value would be for the entire population of all mean values.
What is the standard error? Used for
the standard deviation of all the sample means that could, in theory, be selected from the population. it is an alternative way of computing, and understanding, confidence intervals.
Sum of squares (within) is the... ==>
the sum of the squared differences between each individual score in a group and the mean of that group ==> An idea of how different each score in a group is from the mean of that group.
In a t-test dependent: pretest ----> .... ---> ....
treatment or time ----> posttest
If you reject a null hypothesis that is actually true, you are making an error. The risk you take in making this kind of error (or the level of significance) is also known as a .... Type ..... are also represented by the .... letter .... or
type I error type I errors Greek alpha
The t-test makes the major assumption that the amount of ... This is the ...
variability in each of the 2 groups is equal homogeneity of variance assumption
When do we use one sample Z-test?
when we want to test the difference between a sample and a population Ex: does the average ACT for this class differ from the population's average ACT score?
In a t-test, what if we only used a 1-tailed test, ..
you wouldn't find significance
Step 3:
Selection of the appropriate test statistic Collect data and compute sample statistics
What does p > .05 or "p =n.s" mean?
(for non significant) it means that the probability of rejecting a true null exceeds .05 and, in fact, can range from .050001 to 1.0.
In one sample Z-test, the critical value is ... if the obtained value is more extreme than the critical value, the null .... If the obtained value does not exceed the critical value, the null ....
+ or - 1.96 cannot be accepted is the most attractive explanation
A small effect size ranges from ... to ... A medium effect size ranges from .... to ... A large effect size ranges from .... to ...
0 to .2 .2 to .5 above .5
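To make the ranges concrete, here is a sketch that computes a standardized mean difference and labels it. It assumes the effect size in question is Cohen's d (difference in means divided by a pooled standard deviation), which the cards do not name; the treatment of the exact boundary values and the example numbers are also assumptions.

```python
def cohens_d(mean1, mean2, pooled_sd):
    """Standardized difference between two group means (assumed to be Cohen's d)."""
    return (mean1 - mean2) / pooled_sd

def label_effect(d):
    """Label an effect size using the ranges on the card (0-.2, .2-.5, above .5)."""
    size = abs(d)
    if size <= 0.2:
        return "small"
    if size <= 0.5:
        return "medium"
    return "large"

# Hypothetical group means of 105 and 100 with a pooled SD of 15.
d = cohens_d(105, 100, 15)
print(round(d, 2), label_effect(d))  # 0.33 medium
```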
What does it mean that significant findings occurred at the .05 level?
1 chance in 20 or 5 in 100 or 5%
Several types of ANOVA (4)
1-way ANOVA: 1 factor and more than 2 levels Repeated measures ANOVA: same participants tested more than twice Factorial ANOVA: more than 1 factor Mixed ANOVA: A factorial ANOVA when you have both between and within-subjects factors
Hypothesis testing procedure Step 1: Step 2: (We reject the null hypothesis we hit .....) (In statistics - when our probability is .... than the criterion we set - that is , .....) ....... level - the accepted cutoff for calling a result " ..." Data more improbable than this cutoff will lead to... Traditionally set at ... - but can be anything. In short, "I'm willing to take a ....% chance that I'm making the wrong decision"
A statement of the null hypothesis: A statement of equality Setting the level of risk (or the level of significance or Type I error) associated with the null hypothesis. (we hit a defined criterion of Improbability) (less than ; alpha) Alpha; statistically significant reject the null hypothesis .05 5%
Why not make alpha VERY small? Why not use an alpha of .0000001 so we can be more certain not to make mistake?
Because doing so would increase the chance of type II errors
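The trade-off can be illustrated numerically. The sketch below computes the Type II error rate (beta) of a two-tailed Z-test for two alpha levels, assuming a true effect that shifts the obtained z by 2.5 standard errors. That shift is an arbitrary assumption chosen for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

def type_ii_rate(alpha, shift):
    """Beta for a two-tailed Z-test when the true effect moves z by `shift` SEMs."""
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Power: chance the obtained z lands beyond either critical value.
    power = (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)
    return 1 - power

beta_05 = type_ii_rate(alpha=0.05, shift=2.5)
beta_tiny = type_ii_rate(alpha=0.0000001, shift=2.5)
print(round(beta_05, 2))   # about 0.29
print(beta_tiny > 0.99)    # True: the real effect is almost always missed
```

With the tiny alpha, the critical value moves so far out that a real effect of this size is almost never detected, which is the loss of power the answer refers to.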
Independent design(.....) Dependent design (...)
Between subject - no concerns about retesting - complete intervention or incomplete intervention Within subjects - each participant serves his or her own control - before intervention and after intervention
How to calculate degrees of freedom (between and within)?
Between: k-1 ( k is the number of groups) Within: N-K (N is the total sample size)
Step 5: this is the minimum value needed to reject the null hypothesis. This is the value you would expect the test statistic to yield if the null hypothesis is indeed true. This is ..... Each type of test statistic has a ....
CRITICAL VALUE a table of critical values that are based on your alpha and sample size These table tells you the MINIMUM value needed to reject the null
Step 6: Step 7:
Comparison of the obtained value with critical value. If the obtained value is more extreme (bigger) than the critical value, the null hypothesis cannot be accepted ( then your p-value is smaller than your cutoff)
Step 4: p-value: ....
Computation of the test statistic value. The test statistic value = obtained value: the result or product of a specific statistical test. (p-value) (Make a decision) The p-value is the probability of getting your observed results (or more extreme results) if the null hypothesis WERE true. (The null is NOT the more likely explanation if your p-value is less than .05.)
If the null hypothesis is false, you reject it: p-value was .... than .05 ==>
Correct You may get to publish
Differences between t-test and F-test
F-test: more than 2 means T-test: 2 means
What is Factor What is Level
A factor is the variable that designates the groups to be compared (Ex: type of teacher). Levels are the different groups within a factor (factors have levels) (Ex: stat, math, psych).
For example: How much do baseball salaries increase from their rookie year to their 9th year in the league? Independent samples:... Dependent samples: .... (Provide Pros and Cons for each sample)
Independent samples: 2 different groups of players (in the same year, pick a bunch of players who have been in the league 1 year and a bunch who have been in 9 years and compare their salaries) +Pros: fast +Cons: cohort effect. Dependent samples: the same group of players (compare salaries one year and then again 9 years later) +Pros: each player serves as his or her own control +Cons: takes 9 years to do the study.
Consequence of type 2 errors
May stop studying something you're interested in. Important research findings never come to light. Truth may never be uncovered.
Some characteristics of ANOVA: ... The variance due to ....: + +
More than 2 levels of the same variable are being tested and these groups are compared on their average performance. Testing for a difference between scores of different groups. Tested once. The variance due to differences in performance: +due to differences between individuals BETWEEN groups (treatment differences) +due to differences between individuals WITHIN groups (chance, error, individual differences)
Can the null be tested directly?
No
Example: the null hypothesis states that the sample average is equal to the population average. This is an example of which test?
One sample Z-test
F is small ==> p is ... ==> reject or fail to reject the Null? (the Null is false or true?) The numerator is smaller or bigger than the denominator? ==> there is a big effect or not?
P is big ==> fail to reject ==> the Null is true smaller ==> there is no effect
..... is a construct that has to do with how well a statistical test can detect and reject a null hypothesis when it is false. Mathematically, it's calculated by subtracting the value of the type .... error from .... A more ... test is always more desirable than a less .... test, because the more .... one lets you get to the heart of what's false and what's not
Power Type II error from 1
What causes type II error?
Sampling error. Measurement error. Imperfect operationalizations. Small sample size (POWER). As-yet-unknown latent variables. A noisy world.
Steps of hypothesis testing
Step 1: state the hypotheses (Predictor - independent, outcome - dependent) Step 2: Set the level of risk. Step 3: Compute statistics - selection of the appropriate test statistic Step 4: Make a decision - test statistic value (Reject or fail to reject) Step 5: determine the critical value Step 6: compare your obtained value to the critical value Step 7: If the obtained value is bigger than the critical value, reject the null hypothesis Step 8: if the obtained value is smaller than the critical value, fail to reject the null
While we are using the standard deviation to compute confidence intervals, many people choose to use the ... of the mean or ....
The standard error of the mean OR SEM
When to use an independent samples t-test.. Also referred to as ....
The t-test uses data from 2 separate groups to evaluate the mean difference between the 2 groups. Also referred to as Between Subjects t-test
What does p-value is 0.12 mean?
This means there is a 12% chance of obtaining those results IF there were NO effect of time of presentation
Statistical significance has to be considered in context:
To be meaningful the study must be sound Need for replication The absolute difference has to also be considered (called Effect size) Alpha = .05 is relatively arbitrary
......occurs when you inadvertently accept a false null hypothesis. For example, there may really be differences between the populations represented by the sample groups, but you mistakenly conclude there are not. What is this type of error?
Type II error
Characteristics of one sample Z-test?
We are examining differences between a sample and a population There is only 1 group being tested
When we fail to reject ....
We often don't know what to make of that. We are left wondering. This can lead to a problem known as the file drawer effect.
Is Type I error bad for science and the scientists? If yes, why?
Yes May waste time trying to replicate the effect yourself Others may waste time trying to replicate it Will be discredited when study doesn't hold up to replication Slows down the progress of science Replication crisis
In one sample Z-test, if there had been a significant difference, would you have stated the direction or make some conclusion? For example? Is this test directional or non-directional?
Yes, for example: kids in the athletic program weigh less than children in general directional
...... is another way to apply probability via the Z table. -An estimate range for an unknown population value, given the descriptive stats from a sample. Think margin of error in polling results Example: we want to know the population average for a second-grade spelling test (that is, what is the average score for ALL second graders). But we only have a sample of 40 scores
a confidence interval
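For the spelling-test example, a 95% interval can be sketched as the sample mean plus or minus 1.96 standard errors. Only the sample size of 40 comes from the card; the sample mean of 64 and standard deviation of 5 are assumed values for illustration.

```python
from math import sqrt

def confidence_interval(sample_mean, sd, n, z=1.96):
    """Range around the sample mean: mean +/- z * SEM (z = 1.96 gives 95%)."""
    sem = sd / sqrt(n)   # standard error of the mean
    margin = z * sem     # the "margin of error"
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample statistics for 40 second graders.
low, high = confidence_interval(sample_mean=64, sd=5, n=40)
print(round(low, 2), round(high, 2))  # 62.45 65.55
```

Passing z=2.58 instead would give the wider 99% interval.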
Degrees of freedom are .... For the independent t-test, the degrees of freedom are ...
a function of the sample size and the type of test you are using The number of scores in the final calculation of a statistic that are free to vary (n1-1) + (n2-1) = n1 + n2 - 2 (conceptually, this is: the number of people - the number of groups)
Effect size is ....
a measure of how different 2 groups are from one another - it's a measure of the magnitude of the treatment Kind of like, How big is big?