Ch 5: Validity

standard error of estimate

A measure of how imperfect the validity of a test is. To estimate the accuracy of predicted values, it is useful to calculate the ________.
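
A minimal sketch of the usual formula (function name invented for illustration): the SEE is the standard deviation of the residuals around the regression line, and a perfectly valid test (r = 1) gives SEE = 0.

```
import numpy as np

def standard_error_of_estimate(y, y_pred):
    """Typical size of the errors when predicting criterion scores from test scores."""
    residuals = np.asarray(y, float) - np.asarray(y_pred, float)
    # n - 2 degrees of freedom for a simple linear regression
    return np.sqrt(np.sum(residuals ** 2) / (len(residuals) - 2))

# Equivalent shortcut from the validity coefficient r and criterion SD s_y:
# SEE = s_y * sqrt(1 - r**2)
```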

validity (cont.)

There is no one way to test validity; psychological tests test theory, and some approaches are more common than others: predictive validity in practical situations, construct validity in theoretical situations.

Regression approach

The slope of the line relating test scores to criterion scores, when both variables are standardised, is the "validity coefficient". If the slope = .82, then the test has a predictive validity of .82 for this purpose. *** Ignore SEe and "suppression", p. 98
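
To see why the standardised slope is just the correlation coefficient, here is a small simulation (all numbers invented for illustration):

```
import numpy as np

rng = np.random.default_rng(0)
test = rng.normal(size=200)                                  # test scores
criterion = 0.82 * test + rng.normal(scale=0.57, size=200)   # criterion scores

# z-score both variables, then fit a least-squares line
zx = (test - test.mean()) / test.std()
zy = (criterion - criterion.mean()) / criterion.std()
slope = np.polyfit(zx, zy, 1)[0]

r = np.corrcoef(test, criterion)[0, 1]
print(round(slope, 3), round(r, 3))  # identical: the validity coefficient
```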

main types of validity

content validity (see below); predictive validity (see below) = concurrent / incremental; construct validity (the main one, see below) = convergent / divergent

test sensitivity

The proportion of true cases the test correctly classifies (how good the test is), e.g., the proportion of gamblers correctly identified as gamblers; i.e., the number of people correctly identified as having the relevant characteristic.
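
A minimal sketch with invented counts for the gambler example; the outcome labels match the decision-theory card below:

```
# Invented screening counts for the gambler example
valid_pos = 40   # gamblers the test flagged (hits)
false_neg = 10   # gamblers the test missed
valid_neg = 85   # non-gamblers correctly cleared
false_pos = 15   # non-gamblers wrongly flagged

sensitivity = valid_pos / (valid_pos + false_neg)  # 0.80: proportion of gamblers caught
specificity = valid_neg / (valid_neg + false_pos)  # 0.85: proportion of non-gamblers cleared
print(sensitivity, specificity)
```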

concurrent validity

This term characterises situations in which the test and the criterion are administered together. Concurrent validity: a measurement's ability to correlate, or vary directly, with an accepted measure of the same construct.

Factor analysis

Looking critically at the items in a survey or scale to determine convergent and discriminant evidence of construct validity. Used to identify which items in a test tend to group together (variable reduction). E.g., John is good at hockey, cricket, tennis, running . . . = John is good at "all sports".

validity

The extent to which the test measures what it purports to measure. Validity gives a test meaning: whether it is measuring social IQ, musical IQ, cognitive ability, etc.

divergent validity

This is when the test does not correlate strongly with tests of different (theoretically distinct) constructs. E.g., IQ test A for reading ability does not correlate strongly with an IQ test for visual/spatial ability.

Binet and Simon example

The first measure of intelligence. They hypothesised that if it truly does measure intelligence, then we would expect: older kids to do better than younger kids; the kids that teachers call 'bright' to do well on the test. Then we can make predictions based on the theoretical construct of intelligence.

quantification of validity

-A correlation (−1 to +1): Pearson, Spearman, Kendall's tau, point-biserial
-Degree of relationship (0 to +1): cross-tabulation, chi-squared, phi coefficient, kappa, Kendall's coefficient of concordance
-Proportion of variability explained (0 to 100%): R²
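
A quick sketch of the correlation-based options using scipy (scores invented for illustration):

```
from scipy import stats

test_scores = [12, 15, 9, 18, 14, 11, 16, 13]
criterion = [24, 30, 20, 35, 27, 22, 33, 25]

r, _ = stats.pearsonr(test_scores, criterion)
rho, _ = stats.spearmanr(test_scores, criterion)
tau, _ = stats.kendalltau(test_scores, criterion)
print(r, rho, tau)
print(r ** 2)  # R²: proportion of criterion variability the test explains
```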

factor structure

A factor matrix found in an oblique rotation that represents the simple correlations between variables and factors, incorporating the unique variance and the correlations between factors. Most researchers prefer to use the factor pattern matrix when interpreting an oblique solution.

Indicator: a single variable used in conjunction with one or more other variables to form a composite measure.

exploratory factor analysis (EFA)

A statistical method used to uncover the underlying relationships among a relatively large set of variables.
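
A minimal EFA sketch on simulated data (item values and loadings invented): six items generated from two latent factors, which FactorAnalysis then recovers.

```
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
factors = rng.normal(size=(300, 2))                     # two latent variables
loadings = np.array([[0.8, 0.0], [0.7, 0.0], [0.9, 0.0],
                     [0.0, 0.8], [0.0, 0.7], [0.0, 0.9]])
items = factors @ loadings.T + rng.normal(scale=0.5, size=(300, 6))

efa = FactorAnalysis(n_components=2, rotation="varimax").fit(items)
print(efa.components_.round(2))  # items 1-3 load on one factor, items 4-6 on the other
```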

factor analysis

A statistical procedure that identifies clusters of related items (called factors) on a test; used to identify different dimensions of performance that underlie one's total score. No detail needed for this: it's a reduction of items by summarising. E.g., "John is good at hockey, footy, baseball" is summarised to "John is good at sport".

multitrait-multimethod matrix (used for construct validity)

A way of simultaneously evaluating convergent and divergent validity. Separates method variance (e.g., all measures are self-report) from variance associated with the construct to be measured (e.g., cheerfulness). Need to use at least two traits and at least two methods (e.g., self-report vs peer assessment). Measures of the same construct should relate more strongly than measures of different constructs, whether the same or different methods are used.
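
To see the logic, here is a small simulation (all variables invented): two traits each measured by two methods, with shared method variance built in.

```
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 300
cheerfulness = rng.normal(size=n)   # trait 1
anxiety = rng.normal(size=n)        # trait 2, unrelated to trait 1

self_bias = rng.normal(size=n)      # shared bias in all self-report scores
peer_bias = rng.normal(size=n)      # shared bias in all peer-rating scores

def measure(trait, method_bias):
    # observed score = trait + method variance + random error
    return trait + 0.4 * method_bias + rng.normal(scale=0.5, size=n)

scores = pd.DataFrame({
    "cheer_self": measure(cheerfulness, self_bias),
    "cheer_peer": measure(cheerfulness, peer_bias),
    "anx_self": measure(anxiety, self_bias),
    "anx_peer": measure(anxiety, peer_bias),
})
print(scores.corr().round(2))
# Same trait / different methods (cheer_self vs cheer_peer) correlates highest;
# different traits / same method (cheer_self vs anx_self) reflects method variance only.
```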

Decision theory and test utility

Application of decision theory to the evaluation of tests. Emphasises errors in decision making as well as correct decisions:
-Valid positives (predicted to have the characteristic and does)
-False positives (predicted to have the characteristic and does not)
-Valid negatives (predicted to not have the characteristic and does not)
-False negatives (predicted to not have the characteristic and does)
Important false positives: selecting airline pilots. Important false negatives: detecting malignant tumours.

Know these: important false positives, e.g., selecting airline pilots; important false negatives, e.g., detecting malignant tumours.

Predictive validity

How adequately a test score can predict some 'criterion' in the future. The criterion must be theoretically linked to the construct being measured but can be almost anything: another test score, belonging to a group (or not), an amount of time, a rating, a diagnosis, a training cost, the effect of an intervention, number of days absent, how likely they are to develop schizophrenia, or whether they should be in the army.

In psychometrics, predictive validity is the extent to which a score on a scale or test predicts scores on some other criterion measure (like what we did for our assignment). For example, the validity of a cognitive test for job performance is the correlation between test scores and, say, future supervisor performance ratings; or predicting how relaxed a person will be after an intervention like a meditation tape (has anxiety decreased?). This is about validity.

sampling error

It exists because we use a sample and not the whole population. Good methods can reduce sampling error, but there will always be variation because the samples aren't identical. This is why we measure validity. Sampling error is the naturally occurring discrepancy, or error, between a sample statistic and the corresponding population parameter.

construct validity

Theoretical concepts (anxiety, intelligence, personality). Operationalisation. Theoretical validity. Interaction between test and theory: test development informs psychological theory, and theory informs test development.

receiver operating characteristic (ROC) curve

Curves that graphically show how a test's sensitivity (hit rate) trades off against its false-positive rate (1 − specificity) as the cut score varies.
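
A small sketch using scikit-learn (data simulated for illustration): each point on the curve corresponds to one candidate cut score.

```
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(3)
has_trait = rng.random(500) < 0.3                         # true status
scores = rng.normal(loc=np.where(has_trait, 1.0, 0.0))    # test scores

fpr, tpr, cut_scores = roc_curve(has_trait, scores)  # one (fpr, tpr) pair per cut
print(roc_auc_score(has_trait, scores))              # area under the curve
```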

construct validity

The validity of a test ultimately depends on the extent to which the test truly reflects the construct it purports to measure. Strong theory, stronger validity (e.g., intelligence vs creativity). An umbrella term: the "unitary" view. Test development is very much a part of theory development:
-theory informs test development, which informs theory
-convergent and discriminant validity
-interventions

method variance

Variability in responding that is due to specific properties of the assessment method used (e.g. gathering self-reports vs. informant ratings)

confirmatory factor analysis (CFA)

A type of factor analysis used for checking construct validity. It is hypothesis-based and much more in-depth than exploratory factor analysis, with significance testing added.

predictive validity

Allows us to estimate scores on a criterion external to the test itself. If the estimates the test provides are good, then we are likely to accept the test as a valid measure of the criterion in question. So high scores on an anxiety test should predict psychiatrists' ratings of patients' anxiety levels.

Convergent validity (and see divergent validity)

Convergent validity is when the test correlates with other tests that assess the same construct or trait. There might be different items on the tests, but all are measuring the same construct. Creativity might correlate with emotional intelligence BUT not with the WASC or an IQ test.

base rate

how common a characteristic or behavior is in the general population
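
Base rate matters when interpreting a positive test result. A small worked sketch (sensitivity and specificity values invented) showing how the probability that a positive result is correct falls as the characteristic gets rarer:

```
sens, spec = 0.90, 0.90  # invented test properties

for base_rate in (0.50, 0.10, 0.01):
    true_pos = sens * base_rate                  # hits per person screened
    false_pos = (1 - spec) * (1 - base_rate)     # false alarms per person screened
    ppv = true_pos / (true_pos + false_pos)      # P(has characteristic | positive test)
    print(base_rate, round(ppv, 2))
# 0.5 -> 0.9, 0.1 -> 0.5, 0.01 -> 0.08: with a rare characteristic,
# most positives are false positives.
```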

test specificity

How well the test picks out the ones who do not have the characteristic (the non-gamblers correctly classified as non-gamblers); i.e., the proportion correctly identified as not having the relevant characteristic.

construct validity

This is the ultimate test: does the test reflect what it purports to measure? E.g., anxiety as a construct doesn't exist other than in the responses an individual makes in certain situations, like when under threat.

In psychology, constructs are invented to make sense of a person's behaviour; intelligence and anxiety are constructs. Because we see commonalities in the way people solve problems or adapt to their surroundings, we speak of intelligence. It is not a thing, like a chair, but an idea that potentially makes sense of differences in the way people solve problems. We use construct validity to evaluate how well the test gives the construct meaning.

decision errors

incorrect conclusion in hypothesis testing in relation to the real (but unknown) situation, such as deciding the null hypothesis is false when it is really true.

incremental validity

Adding extras or improvements over previous tests must be worthwhile. The extent to which knowledge of a score on a test adds to that obtained from another pre-existing test score or psychological characteristic. The degree to which an additional predictor explains something about the criterion measure "over and above" that explained by predictors already in use. How well the test improves selection decisions made without knowledge of test results, or the improvement in prediction over other measures available (e.g., a psychological test over demographic information). A small improvement can have major (financial, clinical) implications. Related to multiple regression.
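
A sketch of the multiple-regression view (data simulated, variable names invented): the incremental validity of the new test is the gain in R² over the predictors already in use.

```
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
n = 400
demographics = rng.normal(size=(n, 1))   # predictor already in use
test_score = rng.normal(size=(n, 1))     # the new psychological test
criterion = 0.3 * demographics[:, 0] + 0.4 * test_score[:, 0] + rng.normal(size=n)

r2_old = LinearRegression().fit(demographics, criterion).score(demographics, criterion)
both = np.hstack([demographics, test_score])
r2_new = LinearRegression().fit(both, criterion).score(both, criterion)
print(round(r2_new - r2_old, 3))  # R² gained by adding the test
```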

there are degrees of validity and degrees of

reliability

tests should have both

sensitivity and test specificity

we test for validity with

statistical analysis and critical thinking

multitrait-multimethod

to measure construct validity

Concurrent validity is much more common

Two tests, but both are given out at the same time (like our assignment).

how do you measure quantification of validity?

Usually it's a correlation (−1 to +1): Pearson's, Spearman's, or Kendall's tau. It can be a degree of relationship (0 to 1): cross-tabulation, chi-squared. Or, if you're using regression, it's the proportion of variability explained (0 to 100%): R squared.

validity used for predicting behaviour or trends in the future

validity: the extent to which the test measures what it purports to measure; the degree to which accumulated evidence and theory (empirical) support specific interpretations of test scores entailed by the proposed uses of a test. Without validity a test score has no meaning. There are degrees of validity (like reliability) that depend on the purpose of the test: a test is valid for a particular purpose with a particular population (e.g., a braille version of the WISC). Also the extent to which scores on the test predict some criterion external to the test itself, i.e., predictive validity (e.g., for organisational testing, the Myers-Briggs). A quality of a psychological test.

ways to investigate validity

REGRESSION APPROACH -regression coefficient -Pearson product-moment correlation coefficient (least-squares solution) -measure the standard error of estimate
DECISION-THEORETIC APPROACH -two-choice decision (this or that category) -cut score determined (from prior research); see what is above and below -results in 4 possible outcomes: valid positive, valid negative, false positive, false negative (see the cut-score sketch below)
CONSTRUCT VALIDITY -Cronbach: multitrait-multimethod (MTMM); variation due to the method and due to the underlying disposition you want to assess -factor analysis: calculate correlation coefficients between all possible pairs of tests; reduction reveals underlying latent variables
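
A sketch of the decision-theoretic approach (all data simulated): pick a cut score, make the two-choice decision, and count the four outcomes.

```
import numpy as np

rng = np.random.default_rng(4)
has_trait = rng.random(500) < 0.3                        # true status
scores = rng.normal(loc=np.where(has_trait, 1.0, 0.0))   # test scores

cut = 0.5                      # cut score, e.g. from prior research
predicted = scores >= cut      # two-choice decision

valid_pos = np.sum(predicted & has_trait)
false_pos = np.sum(predicted & ~has_trait)
valid_neg = np.sum(~predicted & ~has_trait)
false_neg = np.sum(~predicted & has_trait)
print(valid_pos, false_pos, valid_neg, false_neg)
```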

Content validity aka Face validity

When inferences can be made as to what is being tested just by looking at the items on a test. E.g., end-of-term tests should show content validity, as everyone should know what the test is about. But content validity can be a poor judge of what is known compared to the more rigorous predictive and construct validity. -Cultural differences can arise from the same test -OK for achievement testing, but still follow up with item analysis -Not necessary for psychometric soundness; some tests do not look at item content at all (e.g., empirical keying and projective tests, such as the MMPI Addiction Potential Scale) -May be necessary for acceptance by test users and courts

One issue: predictive validity involves a time interval, so criterion testing is expensive; i.e., it is repeated, say at high school and then later at uni with the Wechsler. Far cheaper is the concurrent validity test (see below).

Where criterion scores are obtained after some time interval from the test scores. E.g., the Wechsler predicts future success at school, BUT in the end they might have done well because other things intervened to cause the success.

NB: the distinction between exploratory and confirmatory factor analysis is useful (p. 113)

You have a set of data on a scale (e.g., the DASS and a sensation-seeking scale) and you run factor analysis over it just to see what comes out: that is exploratory. For confirmatory factor analysis you already have an idea what the underlying factors are, so you hypothesise two subscales and test how well the data fit them.
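
A hedged sketch of the confirmatory step, assuming the third-party semopy package and a pandas DataFrame df with hypothetical item columns d1-d3 and s1-s3 (lavaan-style model syntax); the two-subscale structure is specified in advance and then tested against the data:

```
import semopy  # third-party SEM package; API assumed here

# Hypothesis: two subscales, each measured by its own items (names invented)
model_desc = """
Depression =~ d1 + d2 + d3
Sensation  =~ s1 + s2 + s3
"""

model = semopy.Model(model_desc)
model.fit(df)            # df: pandas DataFrame with columns d1..d3, s1..s3
print(model.inspect())   # loadings, factor covariance, significance tests
```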

