Assessment 521: Reliability and Validity
Validity coefficients greater than __________ are considered in the very high range. .50; .60; .70; .80
.50
If the reliability coefficient of a test is determined to be .27, what percentage of the variance in scores is attributed to random chance or error? 27%; 73%; 2.7%; Unknown percentage
73%
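Note: the arithmetic behind this answer is that the reliability coefficient is read as the share of score variance that is systematic, so the remainder (1 minus the coefficient) is attributed to error. A minimal sketch, where the function name `error_variance_share` is made up for illustration:

```python
def error_variance_share(reliability: float) -> float:
    """Return the proportion of score variance attributed to error."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    # Whatever the reliability coefficient does not account for is error.
    return 1.0 - reliability


share = error_variance_share(0.27)
print(f"{share:.0%}")  # 73%
```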
A researcher wants to measure content-sampling error and has two versions of an achievement test available. What method of estimating reliability would be best in this situation? Alternate forms; Test-retest; Split-half reliability; Internal consistency reliability
Alternate forms
The goal of factor analysis is to: Measure the effectiveness of specific interventions in research. Reveal how scores differ from one group to the next. Prove the age of the individuals taking the test impacts their scores. Decrease the number of variables into fewer, more general variables.
Decrease the number of variables into fewer, more general variables.
Exploratory factor analysis can be used to: Investigate the differences between groups. Develop new testing instruments. Confirm expectations about a scale's dimensions. Determine if a test produces negative consequences.
Develop new testing instruments.
To evaluate content validity evidence, test developers may use: Expert judges; Factor analysis; Experimental results; Evidence of homogeneity
Expert judges
Jose has developed a test that has poor reliability; he can seek to increase reliability by: Increasing the number of test questions. Decreasing the number of test questions. Making the test questions more ambiguous. Starting over and developing a new test.
Increasing the number of test questions.
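Note: the effect of lengthening a test is commonly estimated with the Spearman-Brown prophecy formula, r_new = n·r / (1 + (n − 1)·r), where n is the factor by which the test is lengthened. The sketch below assumes the added questions are comparable in quality to the existing ones. The function name and the example values (.60, doubled length) are illustrative and not taken from the card.

```python
def spearman_brown_prophecy(reliability: float, length_factor: float) -> float:
    """Estimate reliability after changing test length by length_factor.

    Assumes the new items are parallel to (as good as) the existing ones.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    n, r = length_factor, reliability
    return (n * r) / (1 + (n - 1) * r)


# Doubling a test whose current reliability is .60 (hypothetical values).
print(round(spearman_brown_prophecy(0.60, 2), 2))  # 0.75
```

A length factor below 1 models shortening the test, which lowers the estimate.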
The researcher reports the standard error of measurement (SEM); this is: A different term for the standard deviation. The spread of scores of a single individual if he/she repeated a test several times. The spread of scores of a group of test takers on a single test. A measure to use alone as an index of reliability.
The spread of scores of a single individual if he/she repeated a test several times.
A researcher administers an achievement test to the same group of participants on three different occasions. In reporting the results, he describes the error that occurs from repeatedly testing the same individuals; this is called: Content-sampling error. Time-sampling error. Interrater differences error. Test-taker variables error.
Time-sampling error.
The SEM for an achievement test is 2.45. Johnny scores 100 and we assume that 68% of the time his true score falls within ± 1 SEM; this means the confidence interval would be between: 0 and 102.45; 2.45 and 100; 95.10 and 104.90; 97.55 and 102.45
97.55 and 102.45
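Note: the interval is the observed score plus and minus a multiple of the SEM. One SEM on either side gives the 68% band used in this card. Two SEMs (roughly 95%) gives 95.10 to 104.90, which is one of the distractors. A minimal sketch with a hypothetical helper name:

```python
def sem_interval(observed: float, sem: float, n_sems: float = 1.0):
    """Return (lower, upper) bounds: observed score +/- n_sems * SEM."""
    margin = n_sems * sem
    return observed - margin, observed + margin


low, high = sem_interval(100, 2.45)
print(f"{low:.2f} to {high:.2f}")  # 97.55 to 102.45
```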
_______________ are concepts, ideas, or hypotheses that cannot be directly measured or observed. Constructs; Variables; Standards; Specifications
Constructs
A professor wants to assess students' knowledge of material taught through lectures. However, the professor asks questions that were not discussed during class lectures. This may result in problems with _______________ validity. Construct; Content; Discriminant; Face
Content
_______________ is calculated by correlating test scores with the scores of tests or measures that assess the same construct. Convergent validity; Discriminant validity; Face validity; Content validity
Convergent validity
A test designed for elementary school children was administered to 11th grade students. To these students, the test seemed extremely childish and inappropriate. They cooperated poorly with the testing procedure, and, as a result, this negatively impacted their outcomes on the test. Which of the following would have best addressed this problem? Evidence of homogeneity. Discriminant evidence. Evidence based on consequences of testing. Evidence based on response processes.
Evidence based on response processes.
In terms of accurate prediction of a criterion variable, a person who is predicted to do well during the first semester of college (based on an SAT score) and then does poorly would fall into the _______________ quadrant. True positive; True negative; False positive; False negative
False positive
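Note: the four quadrants come from crossing the prediction (success or not) with the actual outcome on the criterion. A small sketch of that lookup, using a made-up function name:

```python
def prediction_quadrant(predicted_success: bool, actual_success: bool) -> str:
    """Name the quadrant for one person's prediction and actual outcome."""
    # "True"/"false" says whether the prediction was right;
    # "positive"/"negative" says what was predicted.
    correctness = "True" if predicted_success == actual_success else "False"
    direction = "positive" if predicted_success else "negative"
    return f"{correctness} {direction}"


# Predicted to do well from the SAT score, then did poorly.
print(prediction_quadrant(predicted_success=True, actual_success=False))
# False positive
```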
An administrator and the school psychologist were observing a child to assess for behavioral problems. An error may occur in reviewing what the two observers notice; this is reported as: Content-sampling error; Time-sampling error; Interrater differences; Test-taker variables
Interrater differences
What information is included on a Multitrait-Multimethod Matrix? Subtests and correlations between each subtest. Methods of assessment, traits examined, and correlations. Loading factors and correlations of subtests. False positives, false negatives, true positives, and true negatives.
Methods of assessment, traits examined, and correlations.
You are reading about reliability of a test in the test manual and notice that the researchers report using a Spearman-Brown coefficient. You can infer that internal consistency reliability was measured using: The Kuder-Richardson formulas; Coefficient alpha; Split-half reliability; Test-retest
Split-half reliability
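Note: Spearman-Brown points to split-half because the correlation between two half-tests understates the reliability of the full-length test, and the Spearman-Brown correction, 2r / (1 + r), steps it back up. The sketch below assumes an odd/even item split and uses invented 0/1 item scores. The helper names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("each half must show some score variation")
    return cov / sqrt(ss_x * ss_y)


def split_half_reliability(item_scores):
    """Odd/even split-half reliability with the Spearman-Brown correction.

    item_scores holds one row per test taker and one column per item.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    # Step the half-test correlation up to full test length.
    return (2 * r_half) / (1 + r_half)


# Invented data: five test takers, four right/wrong items.
scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(split_half_reliability(scores), 2))
```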
A researcher is concerned with measuring internal consistency reliability and has decided to use the Kuder-Richardson Formulas with a Likert scale test; this is a problem because the: Test does not have dichotomous test items. Researcher needs a second test for comparison. Test does not measure internal consistency reliability. Researcher is concerned with content sampling error.
Test does not have dichotomous test items.
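Note: KR-20 is built from the proportion passing (p) and failing (q = 1 − p) each item, which only makes sense when every item is scored 0 or 1. That is why a Likert scale does not fit. A minimal sketch, assuming the population variance of total scores (some texts use the sample variance instead). The function name and data are illustrative.

```python
def kr20(item_scores):
    """Kuder-Richardson Formula 20 for dichotomous (0/1) items.

    item_scores holds one row per test taker and one column per item.
    """
    n = len(item_scores)
    k = len(item_scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two test takers and two items")
    for row in item_scores:
        if len(row) != k:
            raise ValueError("every row must have the same number of items")
        if any(score not in (0, 1) for score in row):
            raise ValueError("KR-20 requires dichotomous (0/1) items")

    # p = proportion answering each item correctly; q = 1 - p.
    p = [sum(row[j] for row in item_scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)

    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n
    # Assumption: population variance (divide by n) of the total scores.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have no variance")

    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Invented right/wrong data: four test takers, three items.
dichotomous = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(dichotomous))  # 0.75

# Likert-style responses (1 to 5) are rejected.
try:
    kr20([[1, 5, 3], [2, 4, 4]])
except ValueError as err:
    print(err)
```

For Likert-type items, coefficient alpha is the usual alternative.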
You are attempting to account for time-sampling error and decide to administer the test a second time. In discussing reliability, you report this as what method of estimating reliability? Alternate forms; Test-retest; Split-half reliability; Internal consistency reliability
Test-retest