Stats Exam 3
Define the power of a statistical test
Probability of rejecting Ho when Ho is false or probability of detecting an effect when the effect exists.
Does the researcher hope to reject Ho or accept Ho? What about H1?
Reject Ho and accept H1.
Two decisions in hypothesis testing: Reject Ho or Retain Ho
Reject the null hypothesis (Ho) if the test statistic (e.g., the obtained z value) falls in a rejection area.
Why do we evaluate the Ho instead of evaluating the H1?
It is easier to reject a statement than to support it. The goal of most research is to support the research (alternative) hypothesis (H1). By rejecting the null hypothesis (Ho), we can support H1.
If exactly 5% of the t distribution is located in the tail beyond t = 2.353, how many degrees of freedom are there?
3
level of significance
A probability value that is used to define the very unlikely outcomes if the null hypothesis is true.
Which of the following sets of data would produce the largest value for an independent-measures t statistic? a. The two sample means are 10 and 12 with sample variances of 20 and 25. b. The two sample means are 10 and 12 with variances of 120 and 125. c. The two sample means are 10 and 20 with variances of 20 and 25. d. The two sample means are 10 and 20 with variances of 120 and 125.
c (a large mean difference with small variances)
Null hypothesis in a nondirectional (two-tailed) test
Ho: μ MP = 50 (Mean of achievement test for Mt. Pleasant children is the same as mean for Michigan norm)
Null hypothesis in a directional (one-tailed) test
Ho: μ MP ≤ 50 (Mean of achievement test for Mt. Pleasant children is 50 (Michigan norm) or less)
A researcher is evaluating the effectiveness of a new education program for elementary school children. The program is designed to reduce anger and aggression. A sample of n = 16 children is selected and the children are placed in the new program. After 3 months, each child is given a standardized aggression test. The mean aggression score is M = 36 for this sample. For the general population of elementary school children, the scores on the aggression test form a normal distribution with μ = 40 and σ = 8. State Ho and H1.
Ho: μ new program ≥ 40 (The new education program will not decrease anger and aggression among children) H1: μ new program < 40 (The new education program will decrease anger and aggression among children)
For the independent-measures t statistic, if other factors are held constant, increasing the sample mean difference will __________ the chances of a significant t statistic and __________ measures of effect size. a. increase, increase b. increase, decrease c. decrease, increase d. decrease, decrease
a (the mean difference is in the numerator of both the t and d formulas)
The critical region is
in the direction that is inconsistent with Ho (an unlikely outcome when Ho is true) but consistent with H1. If H1 states that a treatment increases performance, the critical region is located entirely in the right tail of the distribution (see Figure A). If H1 states that a treatment decreases performance, the critical region is located entirely in the left tail of the distribution (see Figure B).
The research strategy that uses a separate group of participants for each population is called an...?
independent-measures research design or a between-subjects design
For a sample of n = 16 scores with SS = 375, compute the sample variance and the estimated standard error for the sample mean.
s2 = 25, sM = 1.25
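The arithmetic behind this answer can be checked with a short Python sketch. It assumes the usual textbook definitions s2 = SS / (n - 1) and sM = sqrt(s2 / n); the variable names are illustrative only.

```python
import math

n = 16     # sample size given in the question
ss = 375   # sum of squared deviations (SS) given in the question

df = n - 1                                        # degrees of freedom
sample_variance = ss / df                         # s2 = SS / (n - 1)
standard_error = math.sqrt(sample_variance / n)   # sM = sqrt(s2 / n)

print(sample_variance, standard_error)
```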
If my sample is determined to be an unlikely outcome, the probability that is associated with my sample mean is (small or large).
small
As the alpha level gets smaller (from .05 to .001), the size of the critical region also gets smaller (True, False)
true
Which of the following is not an assumption for hypothesis testing using the t statistic?
The sample size must be greater than 30.
Retain the Ho if the test statistic falls in non-rejection area
With an alpha level of α = .05, if the z-score falls within ±1.96 (the nonrejection region of a two-tailed test), we fail to reject Ho.
Define the level of significance (α).
A probability value that is used to define the very unlikely outcomes if the null hypothesis is true.
What is the similarity between α and p?
Both are probabilities computed under the assumption that Ho is true.
A researcher reports t(24) = 5.30 for an independent-measures experiment. How many individuals participated in the entire experiment? a. 24 b. 25 c. 26 d. 12
C
For the two related samples t test , df = __________. a. n1 + n2 - 2 b. (n1 - 1) + (n2 - 1) c. n - 1 d. n1 + n2 - 1
C
Researchers usually want their sample mean to fall in the (critical region or non-critical region).
Critical region (because we reject Ho when the sample mean falls in the critical region. Rejecting Ho means that there is a significant treatment effect)
An independent sample t-test was used to test the difference between US teens (n = 45) and Chinese teens (n = 30) on the amount of time spent on internet use. df = ______________. a. 2 b. 74 c. 75 d. 73
D
One sample has n = 8 and SS = 21 and a second sample has n = 8 and SS = 35. What is the pooled variance for the two samples? a. 28 b. 56 c. 56/16 d. 56/14
D
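As a quick check of option d, the pooled variance can be sketched in Python. This assumes the standard pooled-variance formula (SS1 + SS2) / (df1 + df2); the function name is hypothetical.

```python
def pooled_variance(ss1, n1, ss2, n2):
    """Pooled variance: (SS1 + SS2) / (df1 + df2), with df = n - 1 per sample."""
    return (ss1 + ss2) / ((n1 - 1) + (n2 - 1))

# Values from the question: n = 8, SS = 21 and n = 8, SS = 35
sp2 = pooled_variance(ss1=21, n1=8, ss2=35, n2=8)
print(sp2)  # 56 / 14
```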
With α = .05 and a sample of n = 12 subjects in the two related samples t test, the two-tailed critical region for the t statistic has boundaries of __________. a. t = ±2.228 b. t = ±1.812 c. t = ±1.796 d. t = ±2.201
D
D = ________ MD = ___________ μD = ___________ s = ___________ sMD = ___________
D = difference scores (X2 - X1); MD = mean of D; μD = population mean of D under Ho (always zero, because Ho states no difference); s = standard deviation (SD) of D; sMD = estimated standard error
A small value of z (near zero) for the z statistic is evidence that Ho should be rejected (True, False)
False
In a research report, the results of a hypothesis test include the phrase "p < .01." This means that the test failed to reject the Ho (True, false)
False
When n is especially small, the t distribution is __________ and _______________.
Flatter, more spread out
Alternative hypothesis in a directional (one-tailed) test
H1: μ MP > 50 (Mean of achievement test for Mt. Pleasant children is greater than 50, the Michigan norm)
Alternative hypothesis in a nondirectional (two-tailed) test
H1: μ MP ≠ 50 (Mean of achievement test for Mt. Pleasant children is not the same as mean for Michigan norm)
A sample mean near the center of the distribution is associated with (low or high) probability when Ho is true.
High (because those means are likely outcomes when Ho is true)
z = (sample mean - hypothesized population mean) / standard error = obtained difference / difference due to chance
Our study that examines Mt. Pleasant students' performance (see p. 3 of this note). Michigan norm: μ = 50, σ = 10. Sample: n = 100 Mt. Pleasant students, M = 53. z = (53 - 50) / (10 / √100) = 3 / 1 = 3
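The same calculation can be sketched in Python. The upper-tail probability line is an addition for illustration; it uses `statistics.NormalDist` from the standard library and assumes the sampling distribution of the mean is normal.

```python
import math
from statistics import NormalDist

mu, sigma = 50, 10        # Michigan norm
n, sample_mean = 100, 53  # Mt. Pleasant sample

standard_error = sigma / math.sqrt(n)     # sigma_M = sigma / sqrt(n)
z = (sample_mean - mu) / standard_error   # obtained difference / standard error

# Probability of a sample mean at least this far above mu if Ho is true
p_upper_tail = 1 - NormalDist().cdf(z)

print(z, p_upper_tail)
```

The tail probability is about .0013, which is the exact probability the APA write-up steps in these notes refer to.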
What does the null hypothesis predict about a population or a treatment effect?
The Ho predicts that the treatment has no effect and the population is unchanged
Define a Type I error
Rejecting Ho when Ho is actually true. In other words, you commit a Type I error when you say that a treatment has an effect when in fact it does not. The probability of making this error is α.
Assuming all other factors stay the same, what happens to the proportion of the data in both tails as the degrees of freedom increase with a t statistic?
The proportion in the two tails combined decreases
In a research report, the term "significant" is used when Ho is rejected (True, False)
True
Dependent (Paired or Related) Samples
Two samples of data are dependent when each score in one sample is paired with a specific score in the other sample. Dependent samples can occur in several ways. A group may be measured twice, as in a pretest-posttest situation; the two scores from the same individual are then the dependent scores. Matched samples: each subject in one sample is matched on some relevant variable with a subject in the other sample. Samples that are biologically related or related by some important variable (husband-wife; twin pairs; siblings).
Independent Samples
Two samples of data are independent when each score in one sample is not dependent on a specific score in the other sample.
Type I Error
A Type I error is rejecting the null hypothesis when Ho is true. In other words, it is falsely concluding that there is an effect when there is no effect. Type I error rate = α
Type II Error
A Type II error is retaining the null hypothesis when Ho is false. In other words, it is falsely concluding that there is no effect when there is an effect. Type II error rate = β
Nondirectional (two-tailed) hypotheses
We use non-directional hypotheses when we want to know if there is a change or difference regardless of the direction of the effect or difference. This is called a two-tailed test because extreme results at either extreme or tail (Mt. Pleasant mean score is greater or less than 50) would make you want to reject the null hypothesis.
Directional (one-tailed) hypotheses
We use one-tailed hypotheses when we want to know the direction of the effect or difference (increase or decrease). It is also called a one-tailed test because the hypothesis test looks for an extreme result at just one end (in this case, the high end) of the curve.
The research strategy in which the two sets of data are obtained from the same group of participants is called...
a repeated-measures research design or a within-subjects design
A sports coach developed a new training method and is investigating its impact. For each statement, indicate whether it is associated with Ho or H1, and with a two-tailed or one-tailed hypothesis. a) The new training program produces different results from the existing one. b) The new training program produces results about like the existing one. c) The new training program produces better results than the existing one. d) The new training program produces no better results than the existing one.
a) two-tailed H1 b) two-tailed Ho c) one-tailed H1 d) one-tailed Ho
For an independent-measures research study, the data show a 10-point difference between the two treatment means and a pooled variance of 4. Given this information, the value of Cohen's d is __________. a. 10/4 b. 10/2 c. 4/10 d. 2/10
b (10/Sq Rt of 4) = 10/2
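A minimal Python check of this answer, assuming the estimated Cohen's d for independent measures is the mean difference divided by the square root of the pooled variance (the function name is hypothetical):

```python
import math

def cohens_d(mean_difference, pooled_variance):
    """Estimated Cohen's d = mean difference / pooled standard deviation."""
    return mean_difference / math.sqrt(pooled_variance)

d = cohens_d(mean_difference=10, pooled_variance=4)
print(d)  # 10 / 2
```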
Which of the following research situations is most likely to use an independent-measures design? a. Evaluate the effectiveness of a diet program by measuring how much weight is lost during 4 weeks of dieting. b. Evaluate the effectiveness of a cholesterol medication by comparing cholesterol levels before and after the medication. c. Evaluate the difference in verbal skills between 3-year-old girls and 3-year-old boys. d. Evaluate the development of verbal skills between age 2 and age 3 for a sample of girls.
c
How do we find a critical value?
Alpha level (α): In a two-tailed test, the alpha value is divided in half so that an equal proportion of area is placed in the upper and lower tails. For example, with α = .05 the extreme 5% is split between the two tails of the distribution, so there is exactly 2.5% (or .025) in each tail. In a one-tailed test, keep the original alpha level in a single tail.
A researcher would like to know whether lowering room temperature will affect eating behavior in rats. The lab temperature is usually kept at 72°, and under these conditions, the rats eat an average of μ = 10 grams of food each day. The amount of food varies from one rat to another, forming a normal distribution with σ = 4. The researcher selects a sample of 4 rats and places them in a room where the temperature is kept at 65°. The daily food consumption for these rats averaged M = 13 grams. Do these data indicate that lowering the temperature has a significant effect on eating? State Ho and H1.
Ho: μ 65 degree = 10 (room temperature has no effect on amount of food intake) H1: μ 65 degree ≠ 10 (room temperature has an effect on amount of food intake)
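The card only asks for the hypotheses, but the rest of the test can be sketched in Python. The alpha level is not stated in the question; α = .05 (two-tailed) is assumed here purely for illustration.

```python
import math
from statistics import NormalDist

mu, sigma = 10, 4         # population values at the usual temperature
n, sample_mean = 4, 13    # sample of rats kept at the lower temperature
alpha = 0.05              # ASSUMPTION: not given in the question

standard_error = sigma / math.sqrt(n)              # sigma_M = sigma / sqrt(n)
z = (sample_mean - mu) / standard_error            # obtained z
z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed boundary
reject_h0 = abs(z) > z_critical

print(z, round(z_critical, 2), reject_h0)
```

Under that assumed α, the obtained z of 1.5 lies inside ±1.96, so this sketch would retain Ho.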
STEPS TO WRITING APA CONCLUSIONS FOR USE WITH HYPOTHESIS TESTING:
1) Write a sentence describing your decision in the context of the problem. Make sure you use the words "statistically significant" or "significant" if you reject Ho, or the words "not significant" or "statistically non-significant" if you fail to reject Ho. End this sentence with a comma. 2) It is not essential, but it is a good idea to add the descriptive statistics (M, SD, etc.) in your write-up. 3) After the comma, use symbols to report the z-score. "z" should be italicized AND lowercase (all statistical symbols must be italicized in any conclusion). Then a space. Then an equal sign. Then another space. Then input your obtained numeric value, the sample z-score (not the critical z!). Then a comma. 4) After the comma, use symbols to report the p-value. "p" should be italicized AND lowercase (all statistical symbols must be italicized in any conclusion). Then a space. Next, indicate whether p is < or > alpha (in our problem, it was .05). You can do this OR simply input the exact probability associated with your sample statistic (we found this earlier -- .0013). Do one of these, not both. End with a period, of course.
Four Steps of Hypothesis Testing
1. Compute the difference scores (D) and state Ho and H1
• A difference score (D) is obtained by subtracting one score from the other: D = X2 - X1, where X1 is the person's score in the first treatment and X2 is the score in the second treatment. Difference scores are obtained before computing the t statistic.
• Subsequent calculations are based on D rather than on the raw scores (X).
• Two-tailed: Ho: μD = 0 (no difference between the population means); H1: μD ≠ 0 (a difference between the population means)
• One-tailed: Ho: μD ≤ 0 with H1: μD > 0, OR Ho: μD ≥ 0 with H1: μD < 0
2. Find the rejection areas and determine the critical t values, which depend on:
• α
• Direction (two-tailed or one-tailed)
• df = n - 1, where n = number of difference scores (D)
3. Calculate the t statistic (obtained t)
• t = (MD - μD) / sMD = MD / sMD (note: μD = 0 under Ho)
• sMD = s / √n = √(s2 / n) = estimated standard error
4. Make a conclusion
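Steps 1 and 3 above can be sketched in Python. The function name and the before/after scores are hypothetical, made up only to show the arithmetic; they are not data from these notes.

```python
import math

def related_samples_t(first, second):
    """Return (t, df, MD, sMD) for two related samples, testing Ho: muD = 0."""
    diffs = [x2 - x1 for x1, x2 in zip(first, second)]   # D = X2 - X1
    n = len(diffs)
    mean_d = sum(diffs) / n                               # MD
    ss = sum((d - mean_d) ** 2 for d in diffs)            # SS of the D scores
    variance = ss / (n - 1)                               # s2
    se = math.sqrt(variance / n)                          # sMD = sqrt(s2 / n)
    t = mean_d / se                                       # (MD - 0) / sMD
    return t, n - 1, mean_d, se

# Hypothetical scores for four participants
before = [20, 25, 22, 27]
after = [21, 18, 15, 20]

t, df, mean_d, se = related_samples_t(before, after)
print(t, df, mean_d, se)
```

The obtained t would then be compared with the critical value for df = n - 1 at the chosen α and direction (step 2) before drawing a conclusion (step 4).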
Repeated-Measures Versus Independent-Measures Designs
1. Number of subjects. Because a repeated-measures design uses the same individuals in both treatment conditions, this type of design usually requires fewer participants than would be needed for an independent-measures design. 2. Study changes over time. In addition, the repeated-measures design is particularly well suited for examining changes that occur over time, such as learning or development. 3. Individual differences. The primary advantage of a repeated-measures design, however, is that it reduces variance and error by removing individual differences, resulting in higher statistical power (pp. 367-368). 4. Time-related Factors and order effects. When participation in one treatment influences the scores in another treatment, the results may be distorted by order effects; this can be a serious problem in repeated-measures designs.
When do we use the t-test for Two Related Samples?
1. Repeated measures design ("pre-post design"): The t-test for two related samples is used to test a hypothesis about the population mean difference between two treatment conditions using sample data from a repeated- measures research study. a. In this design, a single group of individuals is obtained and each individual is measured in both of the treatment conditions being compared. b. Thus, the data consist of two scores for each individual. c. Example - you could measure athletic performance in a sample of athletes before and after a training camp 2. Matched-subjects design: The t-test for two related samples is also used to test a hypothesis about the population mean difference between two treatment conditions using sample data from a matched-subjects design. Each individual in one treatment is matched one-to-one with a corresponding individual in the second treatment based on the common characteristics or traits that they share (e.g., sex, IQ, GPA etc.). a. The matching is accomplished by selecting pairs of subjects so that the two subjects in each pair have identical (or nearly identical) scores on the variable that is being used for matching. Thus, the data consist of pairs of scores with each pair corresponding to a matched set of two "nearly identical" subjects. This procedure ensures that the samples are equivalent with respect to some specific variables. b. Matched sample has twice as many subjects as a repeated-measures design. c. However, because the matching process can never be perfect, matched-subjects designs are relatively rare. Repeated-measures designs (using the same individuals in both treatments) make up the vast majority of related-samples studies.
Four Steps of Hypothesis Test
1. Set Ho and H1
• Two-tailed: Ho: μ1 - μ2 = 0 (no difference between the population means); H1: μ1 - μ2 ≠ 0 (a difference between the population means)
• One-tailed: Ho: μ1 - μ2 ≤ 0 with H1: μ1 - μ2 > 0, OR Ho: μ1 - μ2 ≥ 0 with H1: μ1 - μ2 < 0
2. Find the rejection region and determine the critical t value (tcv), which depends on:
• α
• Direction (two-tailed or one-tailed)
• df = df1 + df2 = (n1 - 1) + (n2 - 1) = n1 + n2 - 2, where df1 and n1 are the df and sample size for the first sample, and df2 and n2 are the df and sample size for the second sample
3. Calculate the t statistic (obtained t)
• t = mean difference / estimated standard error
• t = [(M1 - M2) - (μ1 - μ2)] / s(M1-M2) = [(M1 - M2) - 0] / s(M1-M2)
• Ho says that there is no mean difference between the two populations. Therefore μ1 = μ2 and μ1 - μ2 = 0.
4. Make a conclusion
Effect size (Cohen's d) for the independent-samples t test. Formula 2: derive d from the t statistic. You can use this formula once you have the t statistic; it is simpler than Formula 1, because d can be calculated from t and df alone. We will use this formula.
• When the two groups have the same sample size (n1 = n2): d = 2t / √df
• When the two groups do not have the same sample size (n1 ≠ n2): d = t(n1 + n2) / (√df × √(n1 × n2))
• Note: df = n1 + n2 - 2
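Step 3 and the Formula 2 effect size can be sketched in Python as follows. The function names are hypothetical, and the two sample means (13 and 10) are made up for illustration; the SS and n values are borrowed from the pooled-variance question earlier in these notes.

```python
import math

def independent_t(m1, ss1, n1, m2, ss2, n2):
    """Return (t, df) for an independent-measures t test of Ho: mu1 - mu2 = 0."""
    df = (n1 - 1) + (n2 - 1)
    pooled_var = (ss1 + ss2) / df                         # pooled variance
    se = math.sqrt(pooled_var / n1 + pooled_var / n2)     # estimated standard error
    t = ((m1 - m2) - 0) / se                              # (mu1 - mu2) = 0 under Ho
    return t, df

def d_from_t(t, n1, n2):
    """Formula 2 (general case): d = t(n1 + n2) / (sqrt(df) * sqrt(n1 * n2))."""
    df = n1 + n2 - 2
    return t * (n1 + n2) / (math.sqrt(df) * math.sqrt(n1 * n2))

# Hypothetical means; SS and n from the pooled-variance question
t, df = independent_t(m1=13, ss1=21, n1=8, m2=10, ss2=35, n2=8)
d = d_from_t(t, n1=8, n2=8)
print(t, df, d)
```

With equal sample sizes the general formula reduces to d = 2t / √df, so a single function covers both cases listed above.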
How to find critical z values α = .01, one-tailed test (H1: Treatment decreases performance)
1. α = .01 2. Z-table. a. Look up the proportion of .01 in column C (proportion in tail). You see .0102 and .0099, but not .01. As .0099 is closer to .01, choose .0099. The z value that corresponds to .0099 is 2.33. As H1 states that the treatment "decreases" performance, the critical region is in the left tail. Therefore, the critical z value is z = -2.33.
How to find critical z values α = .01, two-tailed test
1. α/2 = .01/2 = .005 in each tail 2. Z-table (Table B.1). a. Look up the proportion of .005 in column C (proportion in tail). You see .0049 and .0051, but not .005. As both .0049 and .0051 are an equal distance from .005, you can use either probability. Z critical values = ±2.57 or Z critical values = ±2.58.
How to find critical z values α = .05, one-tailed test (H1: Treatment increases performance)
1. α = .05 2. Z-table (Table B.1). a. Look up the proportion of .05 in column C (proportion in tail). You see .0495 and .0505, but not .05. As both .0495 and .0505 have an equal distance from .05, you can use either probability. Z critical values = +1.64 or Z critical values = +1.65.
How to find critical z values α = .05, two-tailed test
1. α/2 = .05/2 = .025 in each tail 2. Z-table (Table B.1). a. Look up the proportion of .025 in column C (proportion in tail) and find that the z-score boundary is z = 1.96. The extreme 5% is in the tails of the distribution beyond z = ±1.96.
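The four table lookups above can be cross-checked without a z-table. This is a sketch using `statistics.NormalDist.inv_cdf` from the Python standard library; the helper name `critical_z` is hypothetical, and it returns only the magnitude (the sign depends on the direction stated in H1).

```python
from statistics import NormalDist

def critical_z(alpha, tails=2):
    """Magnitude of the critical z for a given alpha and number of tails."""
    tail_area = alpha / tails                    # proportion in each tail
    return NormalDist().inv_cdf(1 - tail_area)   # z with that area beyond it

for alpha, tails in [(0.01, 1), (0.01, 2), (0.05, 1), (0.05, 2)]:
    print(alpha, tails, round(critical_z(alpha, tails), 3))
```

The unrounded boundaries for .005 and .05 in one tail fall between two adjacent table entries, which is why the notes accept either 2.57 or 2.58, and either 1.64 or 1.65.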
A random sample of n = 30 individuals is selected from a population with μ = 15, and a treatment is administered to each individual in the sample. After treatment, the sample mean is found to be M = 23.1 with SS = 400. In order to determine if the treatment had a significant effect, which of the following can we use?
A t statistic. There is not enough information to use a z-score.
What are the differences between α and p?
α: The probability that determines the boundary between common outcomes and rare outcomes when Ho is true. It is set by the researcher before collecting data. It is usually a small probability value (.10, .05, .01, .001).
p: The probability that is associated with the researcher's sample result when Ho is true. It is determined by the researcher's sample statistical outcome. It can be any value between 0 and 1.
Define p.
The probability of getting my sample result by chance if Ho is true.
How would our conclusion change if we want to know whether the number of cigarettes smoked after the treatment is reduced? (one-tailed test)
As there is a clear direction in our prediction, we need to use a one-tailed test.
Step I: Ho: The number of cigarettes smoked is not reduced after the treatment. H1: The number of cigarettes smoked is reduced after the treatment.
Step II: α = .05, df = 3, one-tailed, tcv = -2.353
Step III: t = -2.5
Step IV: Make a conclusion
• The obtained t value is in the rejection area, so reject Ho.
• p = .088 (SPSS, two-tailed). As a one-tailed test is used, p should be .044 (p/2).
• .044 (p) < .05 (α), so reject Ho.
There was a significant difference between before and after treatment in the number of cigarettes smoked, t(3) = -2.5, p < .05 (or p = .044). The treatment effect was very large, d = 1.25, indicating that the number of cigarettes smoked after the treatment was 1.25 SD less than before the treatment.
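The p values and effect size quoted above can be reproduced with a short Python sketch. It relies on the closed-form CDF of the t distribution that holds only for df = 3, and it assumes d for related samples is computed as |t| / √n with n = 4 difference scores; both are assumptions of this sketch rather than steps shown in the notes.

```python
import math

def t_cdf_df3(t):
    """CDF of Student's t distribution for exactly 3 degrees of freedom."""
    x = t / math.sqrt(3)
    return 0.5 + (x / (1 + x * x) + math.atan(x)) / math.pi

t_obtained = -2.5
n = 4                                   # df = n - 1 = 3

p_one_tailed = t_cdf_df3(t_obtained)    # area in the lower tail beyond t
p_two_tailed = 2 * p_one_tailed         # what a two-tailed report would show
d = abs(t_obtained) / math.sqrt(n)      # ASSUMED formula: d = |t| / sqrt(n)

print(round(p_two_tailed, 3), round(p_one_tailed, 3), d)
```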
A research report describing the results from the two related samples t test states, "The data showed a significant difference between treatments, t(22) = 4.71, p < .01." From this report you can conclude that __________. a. a total of 22 individuals participated in the study b. a total of 23 individuals participated in the study c. a total of 24 individuals participated in the study d. It is impossible to determine the number of participants from the information given.
B
An independent-measures research study uses two samples, each with n = 10, to compare two treatments. If the results are evaluated with a t statistic using a two-tailed test with α = .05, then the critical region would have boundaries of __________. a. t = ±2.262 b. t = ±2.101 c. t = ±2.093 d. t = ±2.086
B
In a hypothesis test, if an independent-measures t statistic has a value of zero, then __________. a. the two population means must be equal b. the two sample means must be equal c. the two sample variances must be equal d. None of the other 3 choices is correct.
B
The null hypothesis for the independent-measures t test states __________. a. there is no difference between the two sample means b. there is no difference between the two population means c. the difference between the two sample means is identical to the difference between the two population means d. None of the other 3 choices is correct.
B
The results of an independent-measures research study are reported as "t(5) = -2.12, p > .05, two tails." For this study, what t values formed the boundaries for the critical region? a. +2.015 and -2.015 b. +2.571 and -2.571 c. +2.776 and -2.776 d. cannot be determined from the information given
B
Which of the following possibilities is a serious concern with a repeated-measures study? a. You will obtain negative values for the difference scores. b. The results will be influenced by order effects. c. The mean difference is due to individual differences rather than treatment differences. d. All of the other options are major concerns.
B
Why do we state hypotheses in terms of population parameters?
Because the population is the group that we are interested in. Using a sample result, we would like to say something about the population.
Which of the following is a problem with using the z-score statistic?
It requires knowing the population variance, which is often difficult to obtain.
We can reject the Ho more easily if our α level is (small or large).
Large (because a larger alpha level means a larger rejection area)
A sample is selected from a population with μ = 50, and a treatment is administered to the sample. If the sample variance is s2 = 121, which set of sample characteristics has the greatest likelihood of rejecting the null hypothesis?
M = 45 for a sample size of n = 75
Define critical region for a hypothesis test
The critical region consists of sample outcomes that are very unlikely to be obtained if the Ho is true. The boundaries of the critical region depend on the alpha level.
Alternative hypotheses (Research hypotheses) (H1 or HA)
The hypothesis that there is an effect or difference, change, something happened with respect to some characteristic of the underlying populations.
Null hypotheses
The hypothesis that there is no effect, no difference, no change, nothing happened with respect to some characteristic of the underlying populations.
Define a Type II error
Failing to reject Ho when Ho is actually false. In other words, you commit a Type II error when you say that a treatment has no effect when in fact it does. The probability of making this error is β.
The following data were obtained from a repeated-measures research study. What is the value of MD for these data? a. 3 b. 3.5 c. 4 d. 4.5
Subject   1st   2nd
#1        10    15
#2        4     8
#3        7     5
#4        6     11
a ( (15-10)+(8-4)+(5-7)+(11-6))/4 )
What determines the probability of a Type I error?
The alpha (α) level
If the null hypothesis is true, then the t statistic (on average) should have a value of __________. a. 0 b. 1 c. 1.96 d. cannot be determined.
a (no mean difference leads to 0 in the numerator, resulting in a t value of 0)
Indicate (i) whether the groups are independent or dependent (related) and (ii) identify the (quasi-) independent and dependent variables (IV and DV).
a. Is the mean age of death for left-handed people lower than it is for right-handed people? i. Sample (independent samples / dependent samples) ii. IV ( ) DV ( )
b. College students' attitude toward gun control issues before and after watching "Bowling for Columbine." i. Sample (independent samples / dependent samples) ii. IV ( ) DV ( )
c. Are rich people happier than poor people? i. Sample (independent samples / dependent samples) ii. IV ( ) DV ( )
d. Any gender difference on average hours of watching March Madness? i. Sample (independent samples / dependent samples) ii. IV ( ) DV ( )
e. Do husbands spend more time watching sports than wives do? i. Sample (independent samples / dependent samples) ii. IV ( ) DV ( )
a. Independent; IV (handedness) DV(age of death) b. Related; IV(before vs. after treatment); DV (attitude scores) c. Independent; IV (rich vs. poor) DV(happiness) d. Independent; IV(men vs. women); DV (hours of TV watching) e. Related; IV(husband vs. wife); DV(hours of TV watching)
Assume that each of the following statements is in error: each describes a researcher's conclusions, but the researcher is mistaken. Indicate whether the error is Type I or Type II. a. "The data indicate that there are significant differences between males and females in their ability to perform task 1." b. "There are no significant differences between males and females in their ability to perform task 2." c. "On the basis of our data, we reject the null hypothesis."
a. Type I b. Type II c. Type I
A sample is selected from a population and a treatment is administered to the sample. If there is a 3-point difference between the sample mean and the original population mean, which set of sample characteristics has the greatest likelihood of rejecting the null hypothesis?
b. s2 = 4 for a sample with n = 10