Validity


Define the standard error of estimate and explain its interpretation.

(When using linear regression) The standard error of estimate describes the expected amount of prediction error that results from the imperfect validity of the test. It is reported with confidence intervals around predicted scores.
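As a concrete sketch (all values hypothetical), the standard error of estimate can be computed from a validity coefficient and the criterion's standard deviation using the standard formula SEE = SD_y · sqrt(1 − r²), then used to band a predicted score:

```python
import math

def standard_error_of_estimate(r_xy, sd_y):
    """SEE = SD_y * sqrt(1 - r_xy^2): the expected spread of actual
    criterion scores around the regression-predicted value."""
    return sd_y * math.sqrt(1 - r_xy ** 2)

# Hypothetical values: validity coefficient .60, criterion SD 15
see = standard_error_of_estimate(0.60, 15)

# An approximately 68% confidence interval around a predicted score of 100
predicted = 100
interval = (predicted - see, predicted + see)
print(round(see, 1))                          # 12.0
print(tuple(round(x, 1) for x in interval))   # (88.0, 112.0)
```

A smaller validity coefficient widens the band, which is why a predicted score should always be reported with its interval rather than as a point value.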

Explain how validity evidence is integrated to develop a sound validity argument.

...

Review validity evidence presented in a test manual and evaluate the usefulness of the test scores for specific purposes.

...

A test is valid for measuring an attribute if:

1. The attribute exists.
2. Variations in the attribute causally produce variations in the outcomes of the measurement procedure.

Inconsistent validation results can be interpreted in three ways:

1. The test does not measure the construct.
2. The theoretical network that generated the hypothesis is incorrect.
3. The experimental design failed to test the hypothesis properly.

Describe the five categories of validity evidence specified in the 1999 Standards

1. Validity evidence based on test content.
2. Validity evidence based on relations to other variables.
3. Validity evidence based on response processes.
4. Validity evidence based on internal structure.
5. Validity evidence based on consequences of testing.

Trace the development of the contemporary conceptualization of validity.

1974 - In the beginning, there were three distinct types of validity: content, criterion-related, and construct validity. This was known as the traditional nomenclature.
1985 - Then validity came to be seen as a more interrelated concept: the "types of validity" really only represented different ways of collecting evidence to support the validity of interpretations of performance on a test (test scores). "Types of validity" began to be referred to as "types of validity evidence" instead.
1999 - Validity became a unitary concept with five different categories of evidence. Sources of validity evidence differ in their importance according to factors such as the construct being measured, the intended use of the test scores, and the population being assessed.

Decision-theory model

An aid to the test user in determining how much information a predictor test can contribute when making classification decisions. Factors other than the correlation between the test and the criterion are important to consider, e.g. the selection ratio (the proportion of applicants needed to fill positions) and the base rate (the proportion of applicants who can be successful on the criterion).

Describe the major threats to validity.

Construct underrepresentation - the test measures less than the construct it is supposed to measure. Construct-irrelevant variance - the test measures more than the construct it is supposed to measure. External factors: examinee characteristics (e.g. high test-taking anxiety), test administration and scoring procedures (e.g. not following time limits), and instruction and coaching (e.g. being told the questions before the test).

2. Evidence based on relations to other variables:

Examining the relationships between test scores and other variables (e.g. a critical behavior); also known as criterion-related validity.
Test-criterion evidence: the test is used to predict performance on some variable typically referred to as a criterion (a measure of some attribute or outcome that is of primary interest).
Two types of validity studies are typically used to collect test-criterion evidence:
1. Predictive studies - the test is administered, there is an intervening time interval, then the criterion is measured. (+) predicts outcomes (better suited to education, e.g. the SAT); (-) expensive and requires considerable time.
2. Concurrent studies - the test is administered and the criterion is measured at about the same time. (+) saves time and money; (+) determines current status.
Validity generalization, sensitivity, and specificity are important considerations in the interpretation of predictive and concurrent studies. A stronger correlation indicates stronger validity evidence.

Item relevance testing

Experts examine each individual test item and determine whether it reflects essential content in the specified domain.

Content coverage

Experts look at the overall test and rate the degree to which the items cover the specified domain.

Face validity vs. Content validity

Face validity really has nothing to do with what a test actually measures, only what it appears to measure, whereas content validity is acquired through a systematic and technical analysis of the test content. Face validity is still important: it can increase examinee motivation, improve test performance, and make the test seem more meaningful. In forensic settings, where detection of malingering may be emphasized, face validity is undesirable.

Construct validity

Involves an integration of evidence that relates to the meaning or interpretation of test scores.

Criterion-related validity

Involves examining the relationships between the test and external variables that are thought to be direct measures of the construct.

Content validity

Involves how adequately the test samples the content area of the identified construct.

Discriminant validity evidence

Is obtained when you correlate a test with existing tests that measure dissimilar constructs. - Want to see a low correlation.

Convergent validity evidence

Is obtained when you correlate a test with existing tests that measure the same or similar constructs. - Want to see a high correlation.

Describe the steps in factor analysis and how factor analytic results can contribute evidence of validity.

ONLY READ TO PAGE 180 - FACTOR ANALYSIS START

1. Evidence based on test content:

Related to content validity - the relationship between the content and the construct of the test. Test content refers to the themes, wording, and format of the items, tasks, or questions on a test, as well as the guidelines regarding administration and scoring. First identify what we want to measure: define a construct and develop a table of specifications to guide writing of the actual test items, consulting professionals and experts. After the test is written, systematically review and evaluate the correspondence between test content and test construct via: 1. Item relevance testing 2. Content coverage. A quantitative index that reflects the degree of agreement among experts is called multidimensional scaling analysis. Content-based validity evidence is preferred for achievement tests and for employee selection and classification; it is NOT preferred for personality and aptitude tests.

Multitrait-multimethod matrix

A relatively sophisticated validation technique that combines convergent and discriminant strategies. It requires examining two or more traits using two or more measurement methods, which yields a correlation matrix that can be compared against preexisting predictions. The validity coefficients (rxy) appear in the shorter diagonals; the reliability coefficients (rxx) appear in the longer diagonal. Important things to look for when interpreting an MTMM matrix:
- High reliability.
- Convergent validity (monotrait-heteromethod) correlations should be significantly different from zero and should be larger than the discriminant correlations (both types).
- The pattern of discriminant validity should be similar within triangles (heterotrait-monomethod and heterotrait-heteromethod).
- The method of measurement should not account for variance (heterotrait-monomethod correlations being much higher than heterotrait-heteromethod ones would suggest method variance).
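The core MTMM check can be sketched programmatically (all correlations below are invented): convergent (monotrait-heteromethod) values should exceed every discriminant (heterotrait) value.

```python
# Hypothetical MTMM correlations for two traits (anxiety, depression)
# measured by two methods (self-report, clinician interview).
# Key: (trait1, method1, trait2, method2) -> correlation
mtmm = {
    ("anx", "self", "anx", "clin"): 0.65,  # monotrait-heteromethod (convergent)
    ("dep", "self", "dep", "clin"): 0.61,
    ("anx", "self", "dep", "self"): 0.40,  # heterotrait-monomethod
    ("anx", "clin", "dep", "clin"): 0.38,
    ("anx", "self", "dep", "clin"): 0.22,  # heterotrait-heteromethod
    ("dep", "self", "anx", "clin"): 0.20,
}

# Same trait, different method = convergent; different traits = discriminant
convergent   = [r for (t1, m1, t2, m2), r in mtmm.items() if t1 == t2 and m1 != m2]
discriminant = [r for (t1, m1, t2, m2), r in mtmm.items() if t1 != t2]

# Every convergent correlation should beat every discriminant one
print(min(convergent) > max(discriminant))  # True
```

In a full analysis the heterotrait-monomethod and heterotrait-heteromethod triangles would also be compared to gauge how much variance the measurement method itself contributes.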

Explain the relationship between reliability and validity.

Reliability is a necessary but insufficient condition for validity. No matter how reliable a test is, reliability does not guarantee validity. Only true score variance (as opposed to error variance) is reliable and can be related to a construct, so only true score variance contributes to validity. Although low reliability limits validity, high reliability does not ensure validity (e.g. head circumference can be measured reliably, but an inference from it to intelligence would not be valid). Reliability places limits on the magnitude of validity coefficients when a test score is correlated with variables external to the test itself.
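The ceiling that reliability places on validity is classically expressed in classical test theory as r_xy ≤ sqrt(r_xx · r_yy). A small sketch with hypothetical reliabilities:

```python
import math

def max_validity(rxx, ryy):
    """Classical test theory ceiling: the observed correlation between a
    test and a criterion cannot exceed sqrt(r_xx * r_yy), the product of
    the square roots of the two measures' reliabilities."""
    return math.sqrt(rxx * ryy)

# Hypothetical reliabilities: test .81, criterion .64
print(round(max_validity(0.81, 0.64), 2))  # 0.72
```

So even a perfectly valid test could not show a validity coefficient above .72 against this criterion; unreliability in either measure attenuates the observable correlation.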

Criterion contamination

Scores on the predictor should not in any way influence criterion scores. If the predictor scores do influence the criterion, the criterion is contaminated, which may artificially inflate the resulting validity coefficient.

Sensitivity

The ability of the test, at a predetermined cut score, to detect the presence of the disorder. Has an inverse relationship with specificity. Screening measures for the presence of a condition should always emphasize sensitivity over specificity. Sensitivity = true positives / (true positives + false negatives).

Specificity

The ability of the test, at a predetermined cut score, to determine the absence of the disorder. Has an inverse relationship with sensitivity. Screening for disorders of minimal consequence may emphasize specificity. Specificity = true negatives / (true negatives + false positives).
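The two formulas above can be computed directly from the four cells of a screening outcome table (counts below are hypothetical):

```python
def sensitivity(tp, fn):
    # True positives / (true positives + false negatives):
    # of those who actually have the disorder, the fraction flagged
    return tp / (tp + fn)

def specificity(tn, fp):
    # True negatives / (true negatives + false positives):
    # of those without the disorder, the fraction correctly cleared
    return tn / (tn + fp)

# Hypothetical screening results at a chosen cut score
tp, fn, tn, fp = 45, 5, 80, 20
print(sensitivity(tp, fn))  # 0.9
print(specificity(tn, fp))  # 0.8
```

Moving the cut score trades one against the other: a more lenient cut raises sensitivity at the cost of specificity, and vice versa.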

Define validity and explain its importance in the context of psychological assessment.

The accuracy or appropriateness of the interpretation of test scores (not of a test itself). The validity of interpretations of test scores is directly tied to the usefulness of those interpretations. Validity must always have a context, and that context is the interpretation, not the test itself. Interpretations must be consistent with nomological networks, theoretical rationales, etc. When test scores are interpreted in multiple ways, each interpretation must be evaluated. Validity is not static but continually reevaluated, and it is the joint responsibility of the test user and the test maker.

Explain how validity coefficients are interpreted.

The validity of a test is examined by correlating it with an external variable - a measure outside of the test (to avoid criterion contamination). To establish validity we can pick external variables that: (a) are connected to what the test is supposed to measure (true score) - convergent evidence; or (b) should not correlate with the test, but might due to systematic error - discriminant evidence. If a test provides information that helps predict criterion performance better than any other existing predictor, the test may be useful even if the validity coefficients are relatively small. Linear regression allows you to predict values on one variable (criterion performance) given information on another variable (predictor test scores).

2 models for validating scales:

Theoretical relations (what we expect/want) between constructs, and empirical relations (the observed data). If they match, there is evidence of construct validity. Validity cannot be formulated in the absence of a theory that relates the construct to other constructs.

Validity vs. Validation

Validation is a process - an activity or theory testing. Validity is a concept - an ideal or property.

relative validity

Validity is not an all-or-none concept; it exists on a continuum. We therefore refer to it as relative validity, or degrees of validity.

