Eval/Research - Test 2
The inclusion criteria in a meta-analysis are very much like
*a. delimitations
A null hypothesis is
*d. a statistical hypothesis that assumes that there is no difference among the effects of treatments
Comparing test items with the course objectives (course topics) checks which type of validity?
*a. content
In calculating ES for a treatment effect between a pretest and a posttest, the text recommends the use of which standard deviation?
*a. the pretest standard deviation
When a researcher claims that there is a difference between treatments (i.e., rejects the null hypothesis) when there really is no difference, what type of error is this?
*a. type I error
When a researcher states that a result is significant, this means that
*c. the result is unlikely to be a chance occurrence
A survey technique that asks a jury of experts to respond to a series of questionnaires in order to reach a consensus about important issues is the
*d. Delphi survey method
When an experimenter sets the level of significance at .05, the experimenter is setting the probability of committing which type of error?
*a. type I error
The first step in conducting a questionnaire survey is to
*b. list specific objectives to be achieved
It is common when looking at a distribution to encounter extreme or unusual scores that could have a misleading effect on the results. These extreme scores are termed
*b. outliers
A researcher tests the knowledge that 10,000 school children from the southwestern United States have about nutrition, and the scores are transformed into percentiles by grade level. This type of research is a ________ survey study.
*a. normative
Which of the following are most difficult to analyze in a questionnaire study?
*a. open-ended items
In a meta-analysis, the ES for the experimental versus control groups was 0.68, which represents a percentile equivalent of 75. This means that
*b. the average score of the experimental group was better than 75% of the control group's scores
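A quick check of the percentile equivalent, assuming the control group's scores are normally distributed (the function name `es_to_percentile` is made up for this sketch):

```python
from statistics import NormalDist

def es_to_percentile(es):
    """Percent of a normally distributed control group scoring below the experimental mean."""
    return NormalDist().cdf(es) * 100

print(round(es_to_percentile(0.68)))  # 75
```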
The fact that a meta-analysis quantifies findings from many studies is a frequently mentioned advantage. Another strong advantage of the meta-analysis over the normal review paper is
*b. the description of the coding of study characteristics and the criteria for excluding studies
In a meta-analysis, the researcher found that the mean for the experimental group in a study was 14, and the mean for the control group was 10. The standard deviation of the control group was 5. Based on these data, what is the ES?
*c. 0.80
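The arithmetic behind this answer can be sketched as follows, assuming ES is the difference between the means divided by the control group's standard deviation (the function name is hypothetical):

```python
def effect_size(mean_experimental, mean_control, sd_control):
    """Standardized mean difference, expressed in control-group SD units."""
    return (mean_experimental - mean_control) / sd_control

print(effect_size(14, 10, 5))  # 0.8
```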
A major difference in the purpose of a meta-analysis and a review of literature for a research paper is that the meta-analysis uses the literature
*a. for empirical and theoretical conclusions
When a tester divides the number of students who answered a test item correctly by the number of students who attempted the item, the result is called
*a. index of difficulty
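A minimal sketch of that calculation; the counts of 30 correct out of 40 attempts are invented for illustration:

```python
def item_difficulty(num_correct, num_attempted):
    """Proportion of students attempting the item who answered it correctly."""
    return num_correct / num_attempted

print(item_difficulty(30, 40))  # 0.75
```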
A physical education teacher develops a skill test in volleyball. After administering the test to 50 students, she asks the volleyball coach to rate the students on volleyball skills. She then correlates the students' test scores with the coach's rating. This is an example of what type of validity?
*a. criterion validity
Which hypothesis is stated in the null form?
*a. There is no difference between the vocabulary scores of average- and high-ability students.
A researcher decides to use an alpha of .01 and a power of .80. To determine the needed sample size the researcher must ascertain the expected
*c. effect size
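To see why effect size is the missing piece, here is a rough sketch using the normal-approximation formula for comparing two group means with a two-tailed test. The formula choice and the expected ES of 0.5 are assumptions for illustration, not taken from the test:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.01, power=0.80):
    """Approximate participants needed per group (two-tailed, two independent groups)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    z_power = z.inv_cdf(power)          # value corresponding to the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Hypothetical expected effect size of 0.5 standard deviations
print(sample_size_per_group(0.5))
```

With alpha and power fixed, the required sample size depends only on the expected effect size: a larger expected effect needs fewer participants.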
The use of intraclass correlation to estimate reliability in motor performance instead of Pearson r has the advantage(s) of
*d. a and b (a. allowing analysis of trial-to-trial change; b. estimating reliability for more than two trials)
When a national poll reports that its survey has a margin of error of 3%, it is referring to error due to
*a. sampling
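A sketch of how sampling error produces such a figure, assuming a simple random sample, a 95% confidence level, and the worst-case proportion of 0.5. The sample size of 1,067 is an illustrative value, not from the test:

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(n, p=0.5, confidence=0.95):
    """Half-width of the confidence interval for a sample proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)

# Roughly 0.03 for a sample of about 1,067 respondents
print(round(margin_of_error(1067), 3))
```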
The maximum value of item discrimination in norm-referenced test items occurs when the item difficulty is
*b. .50
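One way to see this, assuming the discrimination index is the proportion correct in the upper half minus the proportion correct in the lower half, with equal-size halves. Under that assumption the largest possible index for a given difficulty p is 2 * min(p, 1 - p):

```python
def max_discrimination(difficulty):
    """Largest possible upper-minus-lower index for a given item difficulty."""
    return 2 * min(difficulty, 1 - difficulty)

difficulties = [i / 10 for i in range(11)]
best = max(difficulties, key=max_discrimination)
print(best, max_discrimination(best))  # 0.5 1.0
```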
Validity is determined by finding the correlation between scores on
*c. a test and some independent criterion
Scaled responses to questionnaire items such as strongly agree, agree, no opinion, and so on are referred to as ________ scale items.
*b. Likert-type
Which of the following is a technique for providing information about the reliability of an instrument?
*b. coefficient alpha
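A minimal sketch of coefficient alpha using the standard formula: k / (k - 1) times (1 minus the sum of item variances divided by the variance of total scores). The score matrix below is invented; rows are respondents and columns are items:

```python
from statistics import variance

def coefficient_alpha(scores):
    """Cronbach's coefficient alpha for a respondents-by-items score matrix."""
    k = len(scores[0])  # number of items
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses from five people on four Likert-type items
scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(coefficient_alpha(scores), 2))
```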
In a meta-analysis of 100 studies that compared boys' and girls' knowledge of AIDS, an average ES of 0.50 was found. This number represents
*d. a difference of 0.50 standard deviation units
The author of a test of anxiety proceeds on the premise, based on the literature, that the performance of people with high anxiety suffers when they are under stress. In an experiment, the author finds that people who scored as highly anxious on the test performed more poorly on a stressful task. The test author maintained that this finding was evidence of what kind of validity?
*d. construct
A teacher wishes to determine the reliability of three trials on a performance test. He uses ANOVA to obtain the reliability coefficient. This technique is called
*d. intraclass correlation
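A sketch of one common form of the intraclass correlation, computed from a one-way ANOVA as (MS between subjects - MS within subjects) / MS between subjects. Several ICC variants exist, and the form used in the course text is not stated here, so treat this as an illustration only. The trial scores are invented:

```python
def intraclass_correlation(trials):
    """One-way ANOVA intraclass R; rows are subjects, columns are trials."""
    n = len(trials)      # number of subjects
    k = len(trials[0])   # number of trials per subject
    grand_mean = sum(sum(row) for row in trials) / (n * k)
    subject_means = [sum(row) / k for row in trials]

    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum(
        (x - m) ** 2 for row, m in zip(trials, subject_means) for x in row
    )
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / ms_between

# Hypothetical scores for four students on three trials
trials = [
    [10, 11, 10],
    [14, 15, 15],
    [8, 8, 9],
    [12, 12, 13],
]
print(round(intraclass_correlation(trials), 2))
```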
Choose the correct sequence for developing a questionnaire.
*a. determining objectives, delimiting the sample, writing items, pilot testing
A measure of meaningfulness that expresses the difference between the experimental and control group in standard deviation units is the
*a. effect size
Consider the following questionnaire item (which uses a yes/no format): "Do you use objective test items or essay items?" What rule regarding item construction does it violate?
*c. Do not use items that have two or more separate ideas in the same item.
A predictive validity technique involving the construction of a two-way grid of percentages of students at different levels of an aptitude test who achieved different grades is called
*c. an expectancy table
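A small sketch of building such a grid: for each aptitude level, the percentage of students who earned each grade. The records and level labels are made up for illustration:

```python
from collections import Counter, defaultdict

def expectancy_table(records):
    """Map each aptitude level to the percentage of its students earning each grade."""
    counts = defaultdict(Counter)
    for level, grade in records:
        counts[level][grade] += 1
    table = {}
    for level, grade_counts in counts.items():
        n = sum(grade_counts.values())
        table[level] = {g: 100 * c / n for g, c in grade_counts.items()}
    return table

# Hypothetical (aptitude level, course grade) pairs
records = [
    ("high", "A"), ("high", "A"), ("high", "A"), ("high", "B"),
    ("low", "B"), ("low", "C"), ("low", "C"), ("low", "C"),
]
for level, row in expectancy_table(records).items():
    print(level, row)
```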
The practice of selecting a small random sample of a survey's nonrespondents is primarily intended to
*c. determine whether the nonrespondents are a different population from the respondents
In a meta-analysis, the results of various studies on a topic are quantified using a standard measure (ES) that is called
*c. effect size
A threat to the validity of rating scales is that the rater may tend to let her previous knowledge of the student in another capacity (e.g., another course) influence the rating. This threat is called the
*c. halo effect
If a thermometer measured the temperature in an oven as 400° five days in a row when the temperature was actually 337°, this measuring instrument would be considered
*c. reliable but not valid
If a researcher finds a small difference in average test scores between a large sample (over 700) of experimental participants and a large sample (same size) of control participants, it is very likely that the difference is
*c. statistically significant but does not have a high degree of meaningfulness
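A rough illustration of why this happens, using a normal approximation in which the standard error of a standardized mean difference is about sqrt(2 / n) for two equal groups of size n. The ES of 0.15 is a hypothetical small effect:

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p(effect_size, n_per_group):
    """Approximate two-tailed p value for a standardized mean difference."""
    z = effect_size / sqrt(2 / n_per_group)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same small effect, very different sample sizes
print(two_tailed_p(0.15, 700))  # below .05 with 700 per group
print(two_tailed_p(0.15, 20))   # well above .05 with 20 per group
```

The effect size is the same 0.15 standard deviations in both cases; only the sample size changes whether the result reaches significance.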
Reliability of a measure may be established by
*d. a and b only (a. giving the test to the same people on two different occasions and correlating the two sets of scores; b. correlating scores from alternate forms of the same test)
For scores from a test to have good content validity, the following statement(s) must be true:
*d. b and c only (b. The test adequately samples what was covered in the course; c. The percentage of points for each topic area reflects the amount of emphasis given that topic.)
To help ensure an acceptable return rate in a questionnaire study, the researcher should
*d. b and c only (b. use follow-up letters or e-mails; c. enclose or attach another copy of the survey)
The accuracy with which a 12-min run estimates maximal oxygen consumption in a group of male high school seniors represents
*d. concurrent validity
The most frequent and severe criticism of meta-analysis is that
*d. the studies represent wide differences in methodologies, designs, and measurements
If the researcher fails to reject the null hypothesis when there really is a difference, this is an example of a
*d. type II error