PSYC Chapter 6
Which is an example of a false positive? A) A test identifies a client as schizophrenic when the client is not. B) A test correctly identifies a client as schizophrenic. C) A test correctly identifies a client as not having schizophrenia. D) A test indicates that a client is not schizophrenic when in fact the client is.
A) A test identifies a client as schizophrenic when the client is not.
Which best describes the concept of validity as applied to tests? A) It refers to how well a test measures what the test authors intend it to measure. B) It refers to whether the same results could have occurred by chance less than five times in a hundred. C) It refers to how well a specific sample performs on an administration of a test. D) It refers to whether or not a test is administered under standardized conditions.
A) It refers to how well a test measures what the test authors intend it to measure.
Which of the following is true of test bias as compared to test fairness? A) Test bias is dependent on statistical analyses while test fairness relates to values. B) Test bias is dependent on values while test fairness relates to statistical analyses. C) Whether a test is fair can be answered with certainty while whether a test is biased cannot be. D) None of the answers is true.
A) Test bias is dependent on statistical analyses while test fairness relates to values.
For future research on the validity of the Constructive and Unconstructive Worry Questionnaire, the developers of this test suggested that studies be conducted using A) a clinical population of pathological worriers. B) subjects from varied cultural backgrounds. C) varied criteria to qualify subjects for participation in the research. D) All of the answers are correct.
A) a clinical population of pathological worriers.
Which is an example of convergent evidence for the construct validity of a test measuring fear of cats? A) a high correlation between the test and an existing validated test measuring fear of cats B) a high correlation with an existing validated test measuring more generalized fear C) a low correlation between the test and a test to measure fear of dogs D) Both a high correlation between the test and an existing validated test measuring fear of cats and a high correlation with an existing validated test measuring more generalized fear are correct.
A) a high correlation between the test and an existing validated test measuring fear of cats
To ensure the Constructive and Unconstructive Worry Questionnaire had acceptable content validity, the items would need to A) adequately sample the variety of characteristics of worry. B) reflect content specified in Alfred E. Neuman's Elements of Worry. C) total no more than 20 for each of the two variables. D) All of the answers are correct.
A) adequately sample the variety of characteristics of worry.
Gonsalvez and Crowe (2014) concluded that psychotherapy supervisors' judgments of supervisees' competence are A) compromised by leniency errors. B) compromised by severity errors. C) reasonably accurate given subsequent ratings. D) unreliable in the light of subsequent ratings.
A) compromised by leniency errors.
Before constructing a comprehensive final examination that covers everything you have studied since the first day of your course, your instructor reviews the objectives of the course, the textbook, and all lecture notes. Your instructor is clearly making a diligent effort to maximize the _____ validity of the final examination. A) content B) criterion-related C) predictive D) internal consistency
A) content
According to your textbook, factor analysis A) derives its name from the "factoring" of correlations. B) is a data reduction technique. C) will typically yield a publishable study for a researcher. D) None of the answers is correct.
A) derives its name from the "factoring" of correlations.
If a newly developed test designed to measure happiness correlates with other tests of happiness but not with tests of sadness, this is referred to as _____ evidence of validity. A) discriminant B) convergent C) homogeneous D) concurrent
A) discriminant
In the development of the Constructive and Unconstructive Worry Questionnaire, the test authors hypothesized that the tendency to worry _____ would be positively related to trait-anxiety. A) excessively B) frequently C) constructively D) unconstructively
A) excessively
Which of the following is not included in the traditional "trinitarian" conceptualization of validity? A) face validity B) content validity C) construct validity D) criterion-related validity
A) face validity
If a test is a valid measure of a particular construct, we would expect that A) groups of people who differ with respect to the construct will obtain different test scores. B) groups of people who differ with respect to the construct will obtain similar test scores. C) groups of people who obtain similar scores will have similar personalities. D) None of the answers is correct.
A) groups of people who differ with respect to the construct will obtain different test scores.
Which term is used to refer to the tendency of a rater to evaluate ratees higher than they objectively deserve because of the rater's inability to discriminate between aspects of the ratee's behavior? A) halo effect B) random error C) generosity error D) severity error
A) halo effect
In order to remain consistent with a test's blueprint, a test administered on a regular basis is likely to require A) item-pool management. B) base rate maintenance. C) predictive validity certification. D) None of the answers is correct.
A) item-pool management.
What type of validity evidence best sheds light on whether a college admissions test is valid for selecting students who will complete the program within four years? A) predictive criterion-related validity B) concurrent criterion-related validity C) content validity D) construct validity
A) predictive criterion-related validity
The form of criterion-related validity that reflects the degree to which a test score correlates with a criterion measure that was obtained subsequent to the test score is known as A) predictive validity. B) construct validity. C) concurrent validity. D) content validity.
A) predictive validity.
The results of a predictive validity study of a test will likely be most affected by A) the characteristics of the sample tested, such as attrition and self-selection. B) the number of items on the test, with longer tests demonstrating higher predictive validity. C) the correlation coefficient chosen to measure the validity. D) the administration time required for the test compared with that of the criterion test chosen.
A) the characteristics of the sample tested, such as attrition and self-selection.
Criterion contamination occurs when A) the criterion measure is influenced by the predictor measure. B) subjects talk to one another about the test. C) the characteristic being measured occurs with low frequency in the group being studied. D) All of the answers are correct.
A) the criterion measure is influenced by the predictor measure.
Which of the following is the best definition of hit rate? A) the proportion of people a test correctly identifies as possessing a particular trait, behavior, characteristic, or attribute B) the proportion of people in the general population who possess a particular trait, behavior, characteristic, or attribute C) the proportion of people a test incorrectly identifies as possessing a particular trait, behavior, characteristic, or attribute D) the degree of validity of a particular test
A) the proportion of people a test correctly identifies as possessing a particular trait, behavior, characteristic, or attribute
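The outcome categories used throughout these cards (hits, false positives, false negatives) can be tallied from paired decisions. The following is a minimal sketch, not taken from the text: the function names and the ten-person data set are hypothetical, and hit rate is computed here as the share of people who truly have the trait whom the test flags, which is one assumed reading of the definition above.

```python
from collections import Counter

def tally_outcomes(pairs):
    """Count outcomes for (test_says_positive, truly_positive) boolean pairs."""
    labels = {
        (True, True): "true_positive",
        (True, False): "false_positive",
        (False, True): "false_negative",
        (False, False): "true_negative",
    }
    return Counter(labels[(bool(test), bool(actual))] for test, actual in pairs)

def hit_rate(counts):
    """Share of truly positive cases that the test flags (assumed reading)."""
    positives = counts["true_positive"] + counts["false_negative"]
    return counts["true_positive"] / positives if positives else float("nan")

# Hypothetical data: 10 people, 4 of whom truly have the trait.
pairs = (
    [(True, True)] * 3      # correctly flagged
    + [(False, True)] * 1   # missed by the test (false negative)
    + [(True, False)] * 2   # flagged in error (false positive)
    + [(False, False)] * 4  # correctly cleared
)
counts = tally_outcomes(pairs)
rate = hit_rate(counts)
```

With these made-up numbers, three of the four people who have the trait are flagged, so the sketch reports a hit rate of 0.75 alongside two false positives and one false negative.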
A key difference between concurrent and predictive validity has to do with A) the time frame during which data on the criterion measure is collected. B) the magnitude of the reliability coefficient considered significant at the .05 level. C) the magnitude of the validity coefficient considered significant at the .05 level. D) Both the magnitude of the reliability coefficient considered significant at the .05 level and the magnitude of the validity coefficient considered significant at the .05 level are correct.
A) the time frame during which data on the criterion measure is collected.
The reason the Constructive and Unconstructive Worry Questionnaire was developed was A) to capture both diagnostic and therapeutic value of the worry construct. B) to learn if the worrying life cycle can help minimize or prevent worrying in the future. C) Both to capture diagnostic and therapeutic value of the worry construct and to learn if the worrying life cycle can help minimize or prevent worrying in the future are correct. D) None of the answers is correct.
A) to capture both diagnostic and therapeutic value of the worry construct.
A test is considered valid when the test A) measures what it purports to measure. B) measures whatever it is that it measures consistently. C) can be administered efficiently and cost-effectively. D) has little or no error associated with it.
A) measures what it purports to measure.
In the development of the Constructive and Unconstructive Worry Questionnaire, after a review of the preliminary items, a total of _____ items remained in the final form of the test. A) 40 B) 18 C) 16 D) 12
B) 18
The initial version of the Constructive and Unconstructive Worry Questionnaire contained _____ items, and each of these items was checked to ensure that they were _____ and concise. A) 80; relatively equal in difficulty B) 40; unique C) 80; unique D) 40; relatively equal in difficulty
B) 40; unique
Which statistic is appropriate for use to estimate the heterogeneity of a test composed of multiple-choice items? A) point-biserial correlation coefficient B) Pearson product-moment correlation coefficient C) coefficient alpha D) chi square
B) Pearson product-moment correlation coefficient
In an undergraduate measurement course, an instructor announces that the first examination will cover the topics of reliability and validity. One student in the class, Jamarr, publicly predicts that only questions on reliability will be posed. As it turns out, true to Jamarr's prediction, all of the test questions are only on the topic of reliability. Given this background, which of the following is the most reasonable conclusion that Jamarr's fellow students could draw? A) The first examination lacked concurrent validity. B) The first examination lacked content validity. C) The first examination lacked face validity. D) Jamarr should be consulted prior to the second examination.
B) The first examination lacked content validity.
If you were a psychologist working in the field of human resource management, which claim for a new personnel selection test by a test publisher would be most compelling and persuasive? A) The test identifies a large number of false positives. B) The test improves the hit rate. C) The test identifies a large base rate. D) The test improves the selection ratio.
B) The test improves the hit rate.
A team of consumer psychologists is interested in conducting research to test the palatability of Papa John's Pizza (PJP). A PJP Palatability Survey is developed on the basis of the opinions of a sample of death row prison inmates. These same inmates are then used to validate the paper-and-pencil "PJP Palatability Survey." Which error was made by the researchers? A) The researchers used too small a population to test. B) The test was invalid due to criterion contamination. C) Convergent evidence was confused with discriminant evidence. D) Serving the inmates PJP was a violation of a Constitutional prohibition against cruel and unusual punishment.
B) The test was invalid due to criterion contamination.
Which assessment technique is the best example of a face-valid method? A) a personality test in which test takers are asked to describe what they see in inkblots B) administering a word processing test to a person applying to a job that requires the use of a word processor C) asking test takers to draw a picture of their family to assess family relationships D) measuring the height of applicants applying for a semi-pro basketball team
B) administering a word processing test to a person applying to a job that requires the use of a word processor
Studies indicate that Attention Deficit Disorder occurs in approximately 2 percent of the population. Here, 2 percent is the _____ for the disorder. A) hit rate B) base rate C) miss rate D) sample
B) base rate
A rater systematically assigns ratings in the middle range, thus avoiding extremely positive or extremely negative ratings. Which type of error best characterizes this rater's ratings? A) leniency error B) central tendency error C) severity error D) halo effect
B) central tendency error
A test developer compares a student's performance on a newly developed math achievement test to the same student's performance on a well-established math achievement test for the purpose of exploring the _____ validity of the new test. A) content B) concurrent criterion-related C) predictive criterion-related D) construct
B) concurrent criterion-related
What type of validity evidence best sheds light on how a shorter and less expensive test compares with a longer and more expensive one? A) predictive criterion-related validity B) concurrent criterion-related validity C) content validity D) construct validity
B) concurrent criterion-related validity
A psychologist wants to determine the criterion-related validity of an intelligence test by determining how well it predicts a student's placement in a special class. If the psychologist uses the intelligence test for both diagnosis and special class placement, that criterion is said to be A) irrelevant. B) contaminated. C) invalid. D) negatively skewed.
B) contaminated.
Blueprinting is best associated with A) construct validity. B) content validity. C) criterion-related validity. D) architectural validity.
B) content validity.
Relating scores obtained on a test to other test scores or data from other assessment procedures is typically done in an effort to establish the _____ validity of a test. A) content-related B) criterion-related C) face D) about-face
B) criterion-related
Predictive and concurrent validity can be subsumed under A) content validity. B) criterion-related validity. C) face validity. D) true score validity.
B) criterion-related validity.
"Unequal levels of difficulty between two groups" characterizes the definition of a biased test by A) any random member of the general public. B) federal judges. C) a psychometrician. D) All of the answers are correct.
B) federal judges.
In legal terminology, a valid contract is a contract that A) measures what it purports to measure. B) has been executed with the proper formalities. C) is well grounded on principles of evidence. D) was designed with all of the needs of the parties.
B) has been executed with the proper formalities.
If new predictors explain something about a predicted score that was not already explained by existing predictors, the new predictor might be praised for its A) test-retest reliability. B) incremental validity. C) construct validity. D) face validity.
B) incremental validity.
As the term is applied to a test, validity is a judgment or estimate of how well a test A) measures what it purports to measure under all circumstances. B) measures what it purports to measure in a particular context. C) satisfies the deductions that could logically be made from inferences about it. D) a test result can be duplicated under the same or similar circumstances.
B) measures what it purports to measure in a particular context.
Each of the three approaches to validity assessment in the trinitarian model should best be thought of as A) mutually exclusive evidence of a test's validity, and any one approach is necessary and sufficient for demonstrating a test's validity. B) one type of evidence that, with others, contributes to a judgment concerning the validity of a test. C) insufficient, either by themselves or together with the other two, to demonstrate the validity of a test. D) None of the answers is correct.
B) one type of evidence that, with others, contributes to a judgment concerning the validity of a test.
Each of the three approaches to validity assessment in the trinitarian model should best be thought of as A) mutually exclusive as evidence of a test's validity with any one source necessary and sufficient for demonstrating a test's validity. B) one type of evidence that, with others, contributes to a judgment concerning the validity of a test. C) insufficient, either by themselves or together with the other two, to demonstrate the validity of a test. D) None of the answers is correct.
B) one type of evidence that, with others, contributes to a judgment concerning the validity of a test.
Quotas may be viewed as one type of remedy for A) low reliability of selection tests. B) previously unfair practices. C) low validity of selection tests. D) All of the answers are correct.
B) previously unfair practices.
A supervisor unintentionally rates his supervisees less favorably than they really deserve. Which type of error has been made? A) unconscious error B) severity error C) random error D) vocational error
B) severity error
To improve raters' judgments of competency, Gonsalvez and Crowe (2014) recommended that A) at least three raters be used. B) specific competencies be evaluated. C) all raters be certified as competent themselves. D) All of the answers are correct.
B) specific competencies be evaluated.
In psychological testing and assessment, bias best refers to A) random variation in test performance attributable to covert prejudice on the part of the test developer. B) systematic variation in test performance that is unrelated to the construct a test is intended to measure. C) a test or testing practice that systematically favors the performance of one group of test takers over another. D) All of the answers are correct.
B) systematic variation in test performance that is unrelated to the construct a test is intended to measure.
The first step in developing the Constructive and Unconstructive Worry Questionnaire was A) defining the construct to be measured. B) the creation of an item pool. C) identifying a subject pool of worriers. D) None of the answers is correct.
B) the creation of an item pool.
Using a test that measures a low base rate trait A) will likely result in more correct than incorrect classifications. B) will likely result in more incorrect than correct classifications. C) will result in an equal number of correct and incorrect classifications. D) will have results that cannot be determined based on the information presented.
B) will likely result in more incorrect than correct classifications.
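One way to see the effect of a low base rate is to look only at the people a test flags and ask what fraction of them truly have the trait. The sketch below is illustrative: the sensitivity and specificity of 0.90 are hypothetical values, and the 2 percent and 50 percent base rates are chosen only for contrast. It uses the standard conditional-probability arithmetic for the proportion of positive results that are true positives.

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """Fraction of positive test results that are true positives."""
    true_pos = base_rate * sensitivity               # has trait and is flagged
    false_pos = (1 - base_rate) * (1 - specificity)  # lacks trait but is flagged
    return true_pos / (true_pos + false_pos)

# Same hypothetical test applied to a low and a high base-rate group.
low = positive_predictive_value(base_rate=0.02, sensitivity=0.90, specificity=0.90)
high = positive_predictive_value(base_rate=0.50, sensitivity=0.90, specificity=0.90)
```

Under these assumptions, `low` comes out near 0.16 and `high` at 0.90: with a 2 percent base rate, most of the people the test flags are flagged incorrectly, even though the test itself has not changed.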
After a live performance of Justin Bieber, the tweets of his die-hard fans on Twitter can be expected to reflect _____ error. A) a leniency B) a generosity C) Both leniency and generosity are correct. D) None of the answers is correct.
C) Both leniency and generosity are correct.
Prior to the development of the Constructive and Unconstructive Worry Questionnaire, research on worry had shown that the act of worrying can lead to A) positive outcomes. B) negative outcomes. C) Both positive outcomes and negative outcomes are correct. D) None of the answers is correct.
C) Both positive outcomes and negative outcomes are correct.
Which is true regarding the concept of test fairness? A) In contrast to bias, fairness is relatively easy to determine. B) Fairness is usually determined statistically. C) Fairness often involves moral or ethical issues. D) All of the answers are correct.
C) Fairness often involves moral or ethical issues.
If the hit rate of a test identifying the occurrence of a particular disorder in a population is low, what impact does this have on the results of the test? A) There will be no impact on the accuracy of the classification. B) More individuals will be correctly classified as having the disorder. C) Fewer individuals will be correctly classified as having the disorder. D) The impact cannot be determined based on the information provided.
C) Fewer individuals will be correctly classified as having the disorder.
Messick supported a unitary view, while _____ supported the trinitarian approach. A) Cronbach B) Lawshe C) Guion D) Dangerfield
C) Guion
A new test designed to gauge competency to stand trial is found to lack face validity. Which is the most likely consequence of this fact? A) Judges will urge assessors to use this test. B) Lawyers will urge assessors not to use this test. C) Impression management will be a poor factor in the test results. D) Defendant competency will be a poor factor in the test results.
C) Impression management will be a poor factor in the test results.
A test is considered to be biased if A) 50 percent of the test takers fail the test. B) one group, such as males, consistently performs better than another group, such as females. C) a factor inherent in the test systematically prevents accurate measurement. D) the test developer was found to harbor prejudice against some group.
C) a factor inherent in the test systematically prevents accurate measurement.
Employment test data suggests that an individual applicant is incapable of successfully performing a particular job. However, in reality, this individual would be very successful at the job. Such a scenario exemplifies A) a base rate. B) a false positive. C) a false negative. D) a false expectancy.
C) a false negative.
A coefficient of correlation is calculated between Henry's score on a test of sociopathy and a clinician's rating of Henry on the variable of sociopathy. This coefficient of correlation might also be referred to as A) an index of reliability. B) an index of sociopathy. C) a validity coefficient. D) a content-related validity coefficient.
C) a validity coefficient.
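A validity coefficient of this kind is simply a correlation between test scores and a criterion measure. The sketch below computes a Pearson correlation by hand for a small group; the scores, the clinician ratings, and the function name are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: five test takers' scores and a clinician's ratings of them.
test_scores = [10, 12, 15, 18, 20]
clinician_ratings = [2, 3, 3, 4, 5]
validity_coefficient = pearson_r(test_scores, clinician_ratings)
```

For this made-up data the coefficient is about 0.96. Because it is a correlation, it cannot exceed 1.0 in absolute value.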
In contrast to a trinitarian view of validity, a unitary view of validity takes into account A) two of the three elements of the trinitarian view. B) none of the elements of the trinitarian view but a new model based on consequences of test use. C) all three elements of the trinitarian view plus additional factors such as societal values. D) None of the answers is correct.
C) all three elements of the trinitarian view plus additional factors such as societal values.
Ecological validity refers to a judgement regarding how well a test measures what it purports to measure A) but only in a specified environment. B) but only in a specified environment and within certain frequency limits. C) at the time and place that the variable being measured is actually emitted. D) All of the answers are correct.
C) at the time and place that the variable being measured is actually emitted.
Which of the following are best viewed as types of criterion-related validity? A) concurrent validity and face validity B) content validity and predictive validity C) concurrent validity and predictive validity D) concurrent validity and content validity
C) concurrent validity and predictive validity
Criterion-related validity is to predictive validity as criterion-related validity is to A) construct validity. B) content validity. C) concurrent validity. D) test bias.
C) concurrent validity.
The form of criterion-related validity that reflects the degree to which a test score is correlated with a criterion measure obtained at the same time that the test score was obtained is known as A) predictive validity. B) construct validity. C) concurrent validity. D) content validity.
C) concurrent validity.
A significant, positive relationship exists between scores on a new test of intelligence and scores on the fourth edition of the Stanford-Binet intelligence scale. This may be viewed as supportive of which type of validity evidence for the new test? A) criterion-related validity B) content validity C) convergent evidence of construct validity D) discriminant evidence of construct validity
C) convergent evidence of construct validity
A statistically insignificant correlation exists between scores on a new test of depression and a well-established measure of satisfaction with life. These data may be construed as which type of validity evidence with regard to the test of depression? A) criterion-related validity B) convergent evidence of construct validity C) discriminant evidence of construct validity D) None of the answers is correct because there was an insignificant relationship.
C) discriminant evidence of construct validity
Which is not a method of evaluating the validity of a test? A) evaluating scores on the test as compared to scores obtained on other tests B) evaluating the content of the test C) evaluating the percentage of passing and failing grades on the test D) evaluating test scores as they relate to predictions from a particular theory
C) evaluating the percentage of passing and failing grades on the test
Comedian Rodney Dangerfield was cited in the text to illustrate which of the following? A) test validation B) content validity C) face validity D) construct validity
C) face validity
The extent to which a particular factor contributes to a test score is referred to as A) true score. B) base rate. C) factor loading. D) hit rate.
C) factor loading.
A study of the ecological validity of a test is likely to be conducted A) by a researcher interested in learning about behavior that occurs at a specific time and place. B) only during the season that the targeted behavior occurs if the targeted behavior is seasonal in nature. C) in an environment that is similar to one in which the targeted behavior will naturally occur. D) All of the answers are correct.
C) in an environment that is similar to one in which the targeted behavior will naturally occur.
A Child Abuse Potential (CAP) Inventory boasts a hit rate of approximately 90 percent. Properly interpreted, this means that A) 90 percent of the people who score high on the CAP physically abuse children. B) 90 percent of the people who score low on the CAP do not physically abuse children. C) in groups with a 50 percent base rate, 90 percent of those who abuse children are correctly identified. D) in groups with a 90 percent base rate, 50 percent of those who abuse children are correctly identified.
C) in groups with a 50 percent base rate, 90 percent of those who abuse children are correctly identified.
In the context of validity, a valid test A) may be used fairly. B) may be used unfairly. C) may be used either fairly or unfairly. D) is only used by biased test users.
C) may be used either fairly or unfairly.
Comparing Scholastic Assessment Test (SAT) scores with the first semester college grade point averages of students is a process related to establishing the _____ validity of the SAT. A) content B) concurrent criterion-related C) predictive criterion-related D) construct
C) predictive criterion-related
In Chapter 6 of your text, Adam Shoemaker, the featured professional in Meet an Assessment Professional, described the use of a test with little criterion validity. Dr. Shoemaker recalled that this test was used for the purpose of A) gauging inter-item consistency of another test. B) gaining "buy-in" from the test users. C) providing a "job preview" of sorts to aspirants. D) hiring candidates for mid-level executive positions.
C) providing a "job preview" of sorts to aspirants.
According to the text, face validity may ultimately be more of an issue of _____ than _____. A) social values; psychometric soundness B) psychometric soundness; public relations C) public relations; psychometric soundness D) social values; public perception
C) public relations; psychometric soundness
Face validity refers to A) the most preferred method for determining validity. B) another name for content validity. C) the appearance of relevancy of the test items. D) validity determined by means of face-to-face interviews.
C) the appearance of relevancy of the test items.
In a psychometric context, a definition of test fairness is most likely to include reference to A) the percent of items answered correctly by members of different groups. B) the mean scores earned by various groups on a particular test. C) the degree to which a test is used in an impartial, just, and equitable way. D) All of the answers are correct.
C) the degree to which a test is used in an impartial, just, and equitable way.
In the development of the Constructive and Unconstructive Worry Questionnaire, the subjects in one of the preliminary studies were A) 98 Korean foreign exchange students studying at New York University. B) 398 convicted felons in the federal prison system. C) 698 residents of a South Florida trailer park during the hurricane season. D) 998 Australian residents of wildfire-prone areas.
D) 998 Australian residents of wildfire-prone areas.
A construct is A) unobservable. B) something that describes behavior. C) something that is assumed to exist. D) All of the answers are correct.
D) All of the answers are correct.
An investigation of a test's construct validity may yield evidence that A) the test is measuring a single construct. B) the test correlates with another test purporting to measure the same construct. C) test scores increase as a function of age. D) All of the answers are correct.
D) All of the answers are correct.
Face validity A) may influence the way a test taker approaches the situation. B) relates more to what a test appears to measure than what the test may actually measure. C) is given short shrift as compared to other indices of validity. D) All of the answers are correct.
D) All of the answers are correct.
If a test developer has only a "fuzzy" vision of the construct being measured, then A) the content validity of the test is likely to suffer. B) the construct validity of the test is likely to suffer. C) content irrelevant to the targeted construct may be measured. D) All of the answers are correct.
D) All of the answers are correct.
Pretest and posttest scores may be affected by A) therapy. B) medication. C) education. D) All of the answers are correct.
D) All of the answers are correct.
Rating errors A) may be unintentional. B) may be intentional. C) may involve a tendency to be lenient in rating. D) All of the answers are correct.
D) All of the answers are correct.
Test blueprinting is applied in the design of A) an attitude test. B) a personality test. C) an employment test. D) All of the answers are correct.
D) All of the answers are correct.
The magnitude of a validity coefficient may be affected by A) attrition of the sample. B) restriction of range. C) inflation of range. D) All of the answers are correct.
D) All of the answers are correct.
The validation of a test is a process A) that can be carried out by the test author. B) that can be carried out by the test user. C) of gathering evidence of the test's validity. D) All of the answers are correct.
D) All of the answers are correct.
Which is an example of a criterion? A) achievement test scores B) success in being able to repair a defective toaster C) student ratings of teacher effectiveness D) All of the answers are correct.
D) All of the answers are correct.
Which is true regarding the adjustment of test scores by group membership? A) According to the Civil Rights Act of 1991, it is illegal for purposes of making hiring or promotion decisions. B) It is viewed as helping guarantee the proportional representation of various minority groups in the workplace. C) It is viewed as allowing preferential treatment of certain groups. D) All of the answers are correct.
D) All of the answers are correct.
Which qualifies as a construct? A) depression B) intelligence C) mechanical aptitude D) All of the answers are correct.
D) All of the answers are correct.
Evidence of the homogeneity of a test can be found in the A) correlation between a test and some criterion. B) correlation between test items and total test scores. C) correlation between subtest scores and total test scores. D) Both correlation between test items and total test scores and correlation between subtest scores and total test scores are correct.
D) Both correlation between test items and total test scores and correlation between subtest scores and total test scores are correct.
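Worked illustration of item-total correlations, as a minimal sketch: the response matrix is invented, and `pearson_r` and `item_total_correlations` are hypothetical helper names. Each item is correlated with the total score (the item itself is included in the total here; a corrected version would leave it out).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def item_total_correlations(responses):
    """Correlate each item (column) with the total score of each respondent (row)."""
    totals = [sum(row) for row in responses]
    n_items = len(responses[0])
    return [
        pearson_r([row[i] for row in responses], totals)
        for i in range(n_items)
    ]

# Invented data: 6 respondents (rows) by 4 items (columns), scored 1 = right, 0 = wrong.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

correlations = item_total_correlations(responses)
for number, r in enumerate(correlations, start=1):
    print(f"item {number}: r with total = {r:.2f}")
```

Items that correlate strongly and positively with the total are consistent with a homogeneous test; an item with a low or negative value may be measuring something else.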
In the development of the Constructive and Unconstructive Worry Questionnaire, the amount of worry one experiences was captured using A) the Worry Domains Questionnaire. B) the Penn State Worry Questionnaire. C) trained raters marking a 5-point scale. D) Both the Worry Domains Questionnaire and the Penn State Worry Questionnaire are correct.
D) Both the Worry Domains Questionnaire and the Penn State Worry Questionnaire are correct.
_____ is defined as the degree to which an additional predictor explains something about the criterion measure that is not explained by predictors already in use. A) A false positive rate B) Evidence of construct validity C) Predictive validity D) Incremental validity
D) Incremental validity
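Worked illustration of incremental validity, as a minimal sketch: the correlations are invented, and `r_squared_two_predictors` and `incremental_validity` are hypothetical names. It uses the standard two-predictor formula for R squared and reports how much the second predictor adds beyond the first.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 for a criterion regressed on two predictors, from their correlations.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors (must not be +1 or -1).
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

def incremental_validity(r_y1, r_y2, r_12):
    """Gain in explained criterion variance from adding predictor 2 to predictor 1."""
    return r_squared_two_predictors(r_y1, r_y2, r_12) - r_y1 ** 2

# Invented values: predictor already in use (r = .50 with the criterion),
# a new predictor (r = .40), and the two predictors correlating .30 with each other.
gain = incremental_validity(0.50, 0.40, 0.30)
print(round(gain, 4))
```

The more the new predictor overlaps with the one already in use (larger `r_12`), the less unexplained criterion variance is left for it to account for.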
Which is true regarding a rating? A) It refers only to a numerical judgment that places a person or an attribute along a continuum. B) It refers only to a verbal judgment that places a person or an attribute along a continuum. C) It tends not to involve a judgment. D) It refers to either a numerical or a verbal judgment that places a person or an attribute along a continuum.
D) It refers to either a numerical or a verbal judgment that places a person or an attribute along a continuum.
Which magnitude of validity coefficient is typically acceptable to conclude that a test is valid? A) 1.50 B) 1.80 C) above 1.90 D) None of the answers is correct.
D) None of the answers is correct.
A standard against which a test or test score is evaluated is known as A) a facet. B) a correlation coefficient. C) a validity coefficient. D) a criterion.
D) a criterion.
A test reviewer comes to the conclusion that a certain test is "a valid test." This means that the reviewed test has been shown to be valid for A) a particular use with a particular population for all time. B) a particular use with a universal population of test takers for a limited time. C) universal use with all test takers for the life of the test. D) a particular use with a particular population at a particular time.
D) a particular use with a particular population at a particular time.
Issues of "fairness" as applied to tests A) are seldom discussed in the popular media. B) may be determined through mathematical procedures. C) are generally agreed on. D) are rooted in moral and philosophical issues.
D) are rooted in moral and philosophical issues.
The effect of _____ of test scores for remedying adverse impact is to make equivalent all scores that fall within a particular range. A) within-group norming B) differential cutoffs C) preference policies D) banding
D) banding
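Worked illustration of banding, as a minimal sketch: the scores and the band width of 5 points are invented, and `assign_bands` is a hypothetical name. Bands are counted down from the top score, and every score inside the same band is treated as equivalent.

```python
def assign_bands(scores, width):
    """Return a band number for each score; band 0 is the top band.

    Bands are fixed-width and measured down from the highest score,
    so all scores within the same band count as equivalent.
    """
    top = max(scores)
    return [int((top - score) // width) for score in scores]

# Invented test scores and an arbitrary band width of 5 points.
scores = [95, 93, 91, 88, 84]
bands = assign_bands(scores, width=5)

for score, band in zip(scores, bands):
    print(f"score {score} -> band {band}")
```

Here 95, 93, and 91 all land in the top band and would be treated as the same score, while 88 and 84 fall into lower bands.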
All validity evidence can be interpreted as _____ validity. A) content B) criterion-related C) predictive D) construct
D) construct
"It's a measure of validity that is arrived at by a comprehensive analysis of how scores on a test relate to other test scores." This statement refers to A) face validity. B) content validity. C) the trinitarian index. D) construct validity.
D) construct validity.
A review of existing measures of individual differences in worry suggested to the authors of the Constructive and Unconstructive Worry Questionnaire that none of the measures were made to distinguish people's tendency to worry A) about things with momentous consequences versus those with trivial consequences. B) about things coming up in the future versus things one had done in the past. C) in an ideal-based fashion from a reality-based fashion. D) constructively from their tendency to worry unconstructively.
D) constructively from their tendency to worry unconstructively.
In the development of the Constructive and Unconstructive Worry Questionnaire, which research tool was used to assist the test developers in selecting the final form of the test? A) analysis of variance B) regression analysis C) critical incident analysis D) factor analysis
D) factor analysis
Which type of error is likely to occur when a music critic, who is also a big fan of an artist, reviews the artist's latest album? A) banding effect B) central tendency error C) severity error D) halo effect
D) halo effect
The names attributed to different factors in a factor analysis are A) dictated by the factors themselves. B) subject to change as new analyses occur. C) thoroughly validated against dictionary definitions. D) typically dependent on an analyst's judgment.
D) typically dependent on an analyst's judgment.
In the development of the Constructive and Unconstructive Worry Questionnaire, the test authors hypothesized that the tendency to worry _____ would be negatively related to one's tendency to be punctual. A) excessively B) frequently C) constructively D) unconstructively
D) unconstructively
"How can group differences on cognitive ability tests be reduced while retaining existing high levels of reliability and criterion-related validity?" According to Gottfredson, the answer to this question A) lies in the judicious application of affirmative action strategies. B) must be answered by measurement professionals for themselves. C) must come from strategies designed to minimize adverse impact. D) will not come from measurement-related research.
D) will not come from measurement-related research.