COUN 521 Assessment Procedures for Counselors and Helping Professionals, Chapters 5 & 6: Reliability and Validity
test does not have dichotomous test items
A researcher is concerned with measuring internal consistency reliability and has decided to use Kuder-Richardson Formulas with a Likert Scale test. This is a problem because the:
evaluating the actual and potential consequences of a given test; assessing the social impact of a test's interpretations (the answer on the quiz may say "both a and b")
According to Messick (1989), consequential validity includes...
interrater differences
An administrator and the school psychologist were observing a child to assess for behavioral problems. An error may occur in reviewing what the two observers notice. This is reported as:
experimental results
Comparing pre- and post-test scores of two groups, one group that experienced an intervention and one group that did not, is an example of:
develop new testing instruments
Exploratory factor analysis can be used to...
73%
If the reliability coefficient of a test is determined to be .27, what percentage is attributed to random chance or error?
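The arithmetic behind this card can be sketched in a few lines. This assumes the classical test theory view that observed-score variance splits into true-score variance and error variance, so the error share is 1 minus the reliability coefficient. The function name `error_variance_percent` is hypothetical and used only for illustration.

```python
def error_variance_percent(reliability: float) -> float:
    """Percent of observed-score variance attributed to random error.

    Assumes the classical test theory decomposition:
    observed variance = true-score variance + error variance.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return round((1.0 - reliability) * 100, 2)


print(error_variance_percent(0.27))  # 73.0
print(error_variance_percent(0.92))  # 8.0
```

The second call mirrors the card with a reliability of .92: 92% of the variance reflects real differences, leaving 8% for error.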
observed score
Keisha receives a 79 on her test. This is her:
intelligence, achievement, emotionally disturbed, learning disabled
List of examples of constructs in education:
multitrait-multimethod matrix
The ______________ is characterized by assessing both convergent and discriminant validity evidence and displaying data on a table of correlations.
decrease the number of variables into fewer, more general variables
The goal of factor analysis is to...
not generally acceptable
The researcher determines that the reliability coefficient is .65. This means the reliability is:
the spread of scores a single individual would obtain if he or she took the test repeatedly
The researcher reports the standard error of measurement (SEM). This is:
92% of the variance in scores is explained by real differences
The test manual reports a reliability coefficient (r) of .92, which means:
construct validity
The tripartite view of validity includes content validity, criterion validity, and
methods of assessment, traits examined, and correlations
What information is included on a multitrait-multimethod matrix?
the more the scores represent random error
When the researcher interprets the reliability coefficient, the closer the score is to 0:
test-retest
You are attempting to account for time sampling error and decide to administer the test a second time. In discussing reliability, you report this as what method of estimating reliability?
Reliability: the degree of consistency with which an instrument measures an attribute (concept); precision, accuracy, stability, equivalence, and homogeneity over repeated measures. Validity: whether investigators are measuring what they think they are measuring; the degree to which an instrument measures what it is supposed to measure.
Discuss the relationship between reliability and validity
content
A professor wants to assess students' knowledge of material taught through lectures. However, the professor asks questions that were not discussed during class lectures. This may result in problems with ________ validity.
time sampling error
A researcher administers an achievement test to the same group of participants on three different occasions. In reporting the results, he describes the error that occurs from repeatedly testing the same individuals. This is called:
alternate forms
A researcher wants to measure content-sampling error and has two versions of an achievement test available. What measure of estimating reliability would be best in this situation?
coefficient alpha
A researcher wants to measure content-sampling error with a Likert scale test. Which of the following methods would be best?
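As a rough illustration of why coefficient alpha suits Likert-type items, the sketch below computes it from item variances and total-score variance, which works for any numeric item scale and not only for 0/1 items. The response data are made up for the example, and `coefficient_alpha` is a hypothetical helper name.

```python
from statistics import variance


def coefficient_alpha(responses):
    """Coefficient (Cronbach's) alpha.

    responses: one row per respondent, each row a list of item scores.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(responses[0])
    items = list(zip(*responses))  # transpose rows into per-item columns
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical data: 5 respondents answering 4 Likert items scored 1 to 5.
likert_responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
alpha = coefficient_alpha(likert_responses)
print(round(alpha, 2))  # 0.96
```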
split-half reliability
A researcher wants to measure internal consistency in a test that measures two different constructs (self-esteem and depression) without subdividing the items into the two construct groupings. Which of the following would be the best method to measure internal consistency?
evidence based on response processes
A test designed for elementary school children was administered to 11th grade students. To these students the test seemed extremely childish and inappropriate. They cooperated poorly with the testing procedure and as a result this negatively impacted the outcome of the test. Which of the following would have best addressed this problem?
test taker variables
A test was administered to a group of students the morning after homecoming. Several of the students appeared tired and some were coughing and sneezing. These factors may result in what type of error:
relevant, uncontaminated, and reliable (on the quiz, the answer will be "all of the above")
Criterion measures that are chosen for the validation process must be
how uniform test items and components are in measuring one construct
Evidence of homogeneity refers to...
content sampling error
In reviewing a newly developed test instrument, the evaluator noticed that some of the items did not appear to reflect the construct being measured. He reported there was:
false positive
In terms of accurate prediction of a criterion variable, a person who is predicted to do well during the first semester of college (based on an SAT score) and then does poorly would fall into the __________ quadrant.
increasing the number of test questions
Jose has developed a test that has poor reliability. He can seek to increase reliability by:
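One common way to estimate how much lengthening a test helps is the Spearman-Brown prophecy formula. The sketch below is a minimal version of it, assuming the added items are comparable in quality to the existing ones. The function name and the example values are hypothetical.

```python
def spearman_brown_prophecy(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length.

    length_factor is new length / old length (2.0 means twice as many items).
    Assumes the new items are parallel to the existing items.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical case: a test with reliability .65 is doubled in length.
predicted = spearman_brown_prophecy(0.65, 2.0)
print(round(predicted, 2))  # 0.79
```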
group differentiation studies
Scores on the Kaufman Assessment Battery for Children have been shown to differ significantly between children with ADHD and children who are gifted. This is an example of which type of validity evidence?
quality of test items
Several test takers complained that items on the test were vague and confusing. This creates concern for:
97.55 and 102.45
The SEM for an achievement test is 2.45. Johnny scores 100, and we assume that 68% of the time his true score falls within ± 1 SEM of his observed score. This means the confidence interval would be between:
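The interval on this card is the observed score minus and plus one SEM. A minimal sketch of that arithmetic follows; `confidence_band` is a hypothetical helper, and the `n_sems` parameter is an added assumption that allows wider bands (for example, 2 SEMs for roughly 95%).

```python
def confidence_band(observed: float, sem: float, n_sems: float = 1.0):
    """Return (lower, upper) bounds around an observed score.

    n_sems = 1 gives the roughly 68% band used on the card.
    """
    half_width = n_sems * sem
    return (round(observed - half_width, 2), round(observed + half_width, 2))


lower, upper = confidence_band(100, 2.45)
print(lower, upper)  # 97.55 102.45
```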
expert judges
To evaluate content validity evidence, test developers may use:
.50
Validity coefficients greater than _________ are considered in the very high range.
carryover effect
When interviewing test takers who took an achievement test on three different occasions, participants reported that they had remembered some of the answers from the previous test administration. This is known as:
split-half reliability
You are reading about reliability of a test in the test manual and notice that the researchers report using a Spearman-Brown coefficient. You can infer that internal consistency reliability was measured using:
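The link between Spearman-Brown and split-half reliability can be sketched as follows: the test is split into two halves, the half scores are correlated, and the correlation is stepped up to estimate reliability for the full-length test. The odd/even split and the response data are assumptions chosen for illustration, and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half_reliability(responses):
    """Odd/even split-half reliability with the Spearman-Brown correction."""
    odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    # Step the half-test correlation up to full test length.
    return (2 * r_half) / (1 + r_half)


# Hypothetical data: 5 respondents, 4 items each.
item_scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
estimate = split_half_reliability(item_scores)
print(round(estimate, 2))
```

The correction is needed because each half contains only half the items, and shorter tests tend to yield lower reliability estimates.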
constructs
_________________ are concepts, ideas, or hypotheses that cannot be directly measured or observed
convergent validity
_________________ is calculated by correlating test scores with the scores of tests or measures that assess the same construct.
construct underrepresentation
___________________________ is a threat to validity that implies that a test is too narrow and fails to include important dimensions or aspects of the identified construct.