Chapter 4: Validity and Test Development

Convergent Validity

demonstrated when a test correlates highly with other variables or tests with which it shares an overlap of constructs. Ex., two tests designed to measure different types of intelligence should, nonetheless, share enough of the general factor in intelligence to produce a hefty correlation when jointly administered to a heterogeneous sample of subjects.

Discriminant Validity

demonstrated when a test does not correlate with variables or tests from which it should differ. Ex., social interest and intelligence are theoretically unrelated, and tests of these two constructs should correlate negligibly, if at all.

Sensitivity

accurate identification of patients who have a syndrome, in this case, dementia.

Rational Scale construction (Internal consistency)

all scale items correlate positively with each other and also with the total score for the scale

Test Utility

can be summed up by the question, "Does use of this test result in better patient outcomes or more efficient delivery of services?" ex., we might envision an experiment in which individual psychotherapy clients were randomly assigned to two groups.

Ratio Scale

has all the characteristics of an interval scale but also possesses a conceptually meaningful zero point in which there is a total absence of the characteristic being measured.

Extra-validity concerns

include side effects and unintended consequences of testing.

Likert Scale

presents the examinee with five responses ordered on a strongly agree/strongly disagree or approve/disapprove continuum.

Testing is big business

test development is extraordinarily expensive, which means that publishers are inherently conservative about introducing new tests.

Ordinal Scale

Constitutes a form of ordering or ranking. A ranking of "1" is "more" than a ranking of "2", and so on. The "more" refers to the order of preference; however, ordinal scales fail to provide information about the relative strength of rankings.

Interval Scale

Provides information about ranking, but also supplies a metric for gauging the differences between rankings. Ex., a scale from 1 to 100.

Validity coefficient

The correlation between test and criterion. The higher the validity coefficient, the more accurate the test is in predicting the criterion.
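
As a rough illustration, a validity coefficient can be computed as a Pearson correlation between test scores and criterion scores. The sketch below assumes that reading; the function name and the score lists are hypothetical and not taken from the chapter.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: entrance exam scores and later grade point averages.
test_scores = [52, 60, 65, 70, 78, 85]
criterion_gpa = [2.1, 2.4, 2.9, 2.8, 3.4, 3.7]

validity_coefficient = pearson_r(test_scores, criterion_gpa)
print(round(validity_coefficient, 2))
```

Values near 1.0 would suggest the test tracks the criterion closely; values near 0 would suggest it does not.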

Feedback from the examinees

The inter-university entrance exam is a group test consisting of five multiple choice subtests: General knowledge, Figural Reasoning, Comprehension, Mathematical Reasoning, and English.

Factor Analysis

The purpose is to identify the minimum number of determiners (factors) required to account for the intercorrelations among a battery of tests.

Validity Shrinkage

A common discovery in cross-validation research is that a test predicts the relevant criterion less accurately with the new sample of examinees than with the original tryout sample.

Factor Loading

A correlation between an individual test and a single factor. Factor loadings can vary between -1.0 and +1.0. The final outcome of a factor analysis is a table depicting the correlation of each test with each factor.

Mini-Mental State Examination

A short screening test of cognitive functioning. Consists of a number of simple questions (ex. what day is this?) and easy tasks (ex. remembering three words). The test yields a score from 0 (no items correct) to 30 (all items correct). Dementia is a general term that refers to significant cognitive decline and memory loss caused by a disease process such as Alzheimer's disease or the accumulation of small strokes. Patients are known from independent, comprehensive medical and psychological workups to meet the criteria for dementia or not.

Face Validity

A test has face validity if it looks valid to test users, examiners, and especially the examinees.

Validity

A test is valid to the extent that inferences made from it are appropriate, meaningful, and useful

Criterion Validity

A criterion is any outcome measure against which a test is validated. Criteria must be more than just imaginative; they must also be reliable, appropriate, and free of contamination from the test itself. Ex., accumulations of stressful life events (divorce, job promotion, traffic tickets).

Content Validity

Determined by the degree to which the questions, tasks, or items on a test are representative of the universe of behaviour the test was designed to sample. Nothing more than a sampling issue.

Multitrait-multimethod matrix

Each test is administered twice to the same group of subjects and scores on all pairs of tests are correlated. This matrix is a rich source of data on reliability, convergent validity, and discriminant validity.

False Positives

Some persons predicted to succeed will, in fact, fail.

Method of empirical keying

Test items are selected for a scale based entirely on how well they contrast a criterion group from a normal sample. Ex., depression scale could be derived from a pool of true-false personality inventory questions.

Standard error of estimate

The margin of error to be expected in the predicted criterion score.
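
One commonly used formula for this margin of error, stated here as an assumption since the card does not give one, relates it to the standard deviation of the criterion scores and the validity coefficient. The symbols SD_Y (criterion standard deviation) and r_XY (validity coefficient) are labels chosen for this sketch.

```latex
SE_{est} = SD_Y \sqrt{1 - r_{XY}^{2}}

% Worked example with hypothetical values SD_Y = 0.6 and r_{XY} = 0.5:
% SE_{est} = 0.6 \sqrt{1 - 0.25} = 0.6 \times 0.866 \approx 0.52
```

Under this formula, a higher validity coefficient shrinks the expected error, and a coefficient of zero leaves the error as large as the spread of the criterion itself.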

Homogeneous scale

The most commonly used method for building a homogeneous scale is to correlate each potential item with the total score and select items that show high correlations with the total score.
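
The item-total approach described above can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the response matrix, the 0.5 cutoff, and the function names are all hypothetical, and the total score here includes the item being evaluated.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def item_total_correlations(responses):
    """Correlate each item (column) with the total score of each examinee (row)."""
    totals = [sum(row) for row in responses]
    n_items = len(responses[0])
    return [
        pearson_r([row[i] for row in responses], totals)
        for i in range(n_items)
    ]

# Hypothetical data: 6 examinees (rows) answering 4 items (columns) on a 1-5 scale.
responses = [
    [5, 4, 5, 2],
    [4, 4, 4, 5],
    [3, 3, 2, 1],
    [2, 2, 3, 4],
    [1, 2, 1, 3],
    [4, 5, 4, 3],
]

correlations = item_total_correlations(responses)

# Keep items whose item-total correlation meets an assumed cutoff of 0.5.
selected = [i for i, r in enumerate(correlations) if r >= 0.5]
print(selected)
```

In this made-up data set the fourth item correlates only weakly with the total, so it would be the one dropped.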

Specificity

has to do with accurate identification of normal patients.
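
Sensitivity and specificity can both be read off a comparison between screening results and known diagnoses. The sketch below uses the Mini-Mental State Examination score range from this set (0 to 30); the scores, the diagnoses, and the cutoff of 23 are hypothetical values chosen only for illustration.

```python
def sensitivity_specificity(scores, has_dementia, cutoff):
    """Treat scores at or below `cutoff` as positive screens and
    compare them with independently known diagnoses."""
    tp = fn = tn = fp = 0
    for score, diagnosed in zip(scores, has_dementia):
        screened_positive = score <= cutoff
        if diagnosed and screened_positive:
            tp += 1  # patient with dementia correctly identified
        elif diagnosed:
            fn += 1  # patient with dementia missed by the screen
        elif screened_positive:
            fp += 1  # normal patient wrongly flagged
        else:
            tn += 1  # normal patient correctly identified
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Hypothetical screening scores and diagnoses for ten patients.
mmse_scores = [12, 18, 22, 25, 20, 27, 29, 24, 30, 23]
has_dementia = [True, True, True, True, False, False, False, False, False, False]

sens, spec = sensitivity_specificity(mmse_scores, has_dementia, cutoff=23)
print(sens, round(spec, 2))
```

Moving the cutoff up or down trades one quantity against the other: a higher cutoff flags more patients, which tends to raise sensitivity and lower specificity.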

Production of testing materials

testing materials must be user friendly if they are to receive wide acceptance by psychologists and educators.

Regression equation

the best-fitting straight line for estimating the criterion from the test. For current purposes, it is more important to understand the nature and function of regression equations.
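
To make the idea of a best-fitting straight line concrete, here is a small least-squares sketch, assuming the usual form criterion = slope * test + intercept. The function names and the data are hypothetical.

```python
def fit_regression_line(x, y):
    """Least-squares slope and intercept for predicting y (criterion) from x (test)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(score, slope, intercept):
    """Estimated criterion score for a given test score."""
    return slope * score + intercept

# Hypothetical data in which the criterion is exactly 2 * test + 1.
test_scores = [1, 2, 3]
criterion_scores = [3, 5, 7]

slope, intercept = fit_regression_line(test_scores, criterion_scores)
print(slope, intercept)               # 2.0 1.0
print(predict(4, slope, intercept))   # 9.0
```

Real data never fall exactly on the line, which is why a prediction is always accompanied by some margin of error.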

Nominal Scale

the numbers serve only as category names. Ex., when collecting data for a demographic study, a researcher might code males as "1" and females as "2". Simply a form of naming.

Cross Validation

the practice of using the original regression equation in a new sample to determine whether the test predicts the criterion as well as it did in the original sample.
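
A minimal sketch of this practice: fit the regression equation on the original tryout sample, then apply the same equation, unchanged, to a new sample and compare prediction error. The two samples are hypothetical, and root mean squared error is used here as one possible accuracy measure; the card itself does not name one.

```python
from math import sqrt

def fit_regression_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    return slope, mean_y - slope * mean_x

def rmse(x, y, slope, intercept):
    """Root mean squared error of the predictions slope * x + intercept."""
    errors = [(b - (slope * a + intercept)) ** 2 for a, b in zip(x, y)]
    return sqrt(sum(errors) / len(errors))

# Hypothetical original tryout sample (test scores, criterion scores).
orig_test = [1, 2, 3, 4, 5]
orig_criterion = [2.0, 2.9, 4.1, 5.0, 6.0]

# Hypothetical new sample of examinees.
new_test = [1, 2, 3, 4, 5]
new_criterion = [2.5, 2.4, 4.6, 4.4, 6.3]

# The equation is derived once, from the original sample only.
slope, intercept = fit_regression_line(orig_test, orig_criterion)

rmse_original = rmse(orig_test, orig_criterion, slope, intercept)
rmse_new = rmse(new_test, new_criterion, slope, intercept)
print(rmse_original < rmse_new)
```

In this made-up example the equation predicts less accurately in the new sample, which is the pattern the Validity Shrinkage card describes.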

Publishing the Test

the test developer must oversee the production of the testing materials, publish a technical manual, and produce a user's manual.

Construct Validity

A construct is a theoretical, intangible quality or trait in which individuals differ. Ex., leadership ability, overcontrolled hostility, depression, and intelligence.

Guttman Scale

Produced by selecting items that fall into an ordered sequence of examinee endorsement. Ex. Depression: ( ) I occasionally feel sad ( ) I often feel sad ( ) I feel sad most of the time ( ) I always feel sad and I can't stand it

Decision Theory

The purpose of psychological testing is not measurement per se but measurement in the service of decision making.

False Negatives

Some predicted to fail would, if given the chance, succeed.

Technical manual and user's manual

Technical data about a new instrument are usually summarized with appropriate references. The prospective user can find information about item analyses, scale reliabilities, cross-validation studies. The user's manual gives instructions for administration and provides guidelines for test interpretation.

Concurrent Validity

Under criterion-related validity: the criterion measures are obtained at approximately the same time as the test scores. For example, the current psychiatric diagnosis of patients would be an appropriate criterion measure to provide validation evidence for a paper-and-pencil psychodiagnostic test. Ex., arithmetic achievement test scores could be used to predict, with reasonable accuracy, the current standing of students in a mathematics course.

Predictive Validity

Under criterion-related validity: the criterion measures are obtained in the future, usually months or years after the test scores are obtained, as with the college grades predicted from an entrance exam. Test scores are used to estimate outcomes to be measured at a later date.

Criterion-related Validity

When a test is shown to be effective in estimating an examinee's performance on some outcome measure. The test score is useful only insofar as it provides a basis for accurate prediction of the criterion. Ex., a college entrance exam that is reasonably accurate in predicting the subsequent grade point average of examinees would possess criterion-related validity.

Method of Absolute scaling

a procedure for obtaining a measure of absolute item difficulty based on results for different age groups of test takers.

