Chapter 10: validity & reliability
extraneous factors must be?
controlled or quantified
reliability
consistency of a specific measurement
how to measure quantitative data
instrumented devices, clinician measurement, clinician observation, pt self-report
the next 10 definitions are threats to
internal validity
multiple treatment interference
testing multiple variables might interfere with others Ex: so many tests on pt they tire & skew results
Intra-rating
testing of same person with same tester on a different day (tester doesn't change)
how to ensure objective results?
the design of an experiment to test the hypothesis must be unbiased
the less generalizable the study results are to the general population
the weaker its external validity
blinding is pertinent to?
the internal validity of a study
fidelity of intervention is an example of
the reliability of the independent variable (in an intervention study)
internal validity refers to?
the validity of a study's experimental design
Testing
"practice" or "learning" effect Ex: child given test multiple times & learns items
Intertester reliability (interrater reliability)
-ability of different testers to produce consistent repeated measures of a test
Accuracy
-closeness of measured value to true value of what is being assessed (Precision is dividing up a number again & again)
Intraclass correlation coefficients (ICCs)
-estimates reliability for measures of continuous data -scale of 0-1
construct validity
-how well a specific measure or scale captures a defined entity -stems from psychology, but may be used in health science
Concurrent Validity
-how well one measure correlates with an existing gold standard -important to be established for new measures aiming to assess the same properties as an existing test
when is validity discussed in an experiment?
-in relation to the structure of the overall design -the measurements performed -the intervention assessed
Discriminative validity
-indicative of a given measure's lack of correlation or divergence from existing measures that it should not be related to Ex: specificity & sensitivity testing
instrument accuracy
-instruments must be valid & reliable -multiple people collecting data affects results (interrater reliability) -sloppy data-gathering affects results (intrarater reliability)
how to tightly control a study
-subject selection -administration of interventions -control of confounding factors
Content validity
-the amount that a particular measure represents all facets of the construct it is supposed to measure -more scientifically rigorous than face validity Ex: items add up to one particular construct like an intelligence or concussion test
artificial nature of experimental condition
-the more artificial the environment, the less generalizable
Bias
-threat to internal validity -may be inherent to subjects or experimenters
Face validity
-whether a specific measure actually assesses what it's designed to measure -issue in developing "functional tests" for pts in rehab
measurement data is classified into 3 types
1. Categorical -you can or cannot (0 or 1) 2. Ordinal -measures how well it was done 3. Continuous -standardized measure
what are the two main threats to internal validity?
1. threats associated with the participants 2. threats associated with the measurement
4 threats to external validity
1. effects of pretesting 2. subject & Tx interaction 3. artificial nature of experiment 4. multiple Tx interference
we must measure fidelity to confirm?
1. interventionists adhere to intervention 2. key ingredients making intervention unique 3. differentiation of this intervention from another intervention
how to optimize external validity (limit desired generalization)
1. larger population 2. other outcomes (dependent variables) 3. other conditions (independent variables) 4. other setting
3 types of blinding
1. subjects 2. members of experimental team 3. clinicians treating pts
4 types of error
1. unreliable & invalid 2. unreliable & valid 3. reliable & invalid 4. reliable & valid
internal validity should be thought of as along a continuum rather than
a dichotomous property
Categorical or Nominal data involves?
a finite number of classifications for observations -numeric value assigned to each category -order of numbers assigned to each category is inconsequential
"Intra-tester reliability" (intra-rater reliability, test-retest reliability)
ability of same tester to produce consistent, repeated measures of a test
face validity is determined subjectively
and often by expert opinion
Pearson's r
assesses association between two continuous measures across a sample of subjects -indicates that scores on the two measures are highly correlated -does NOT assess systematic error, or show the scores of 2 measures are systematically diverging from each other
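A minimal sketch of why Pearson's r misses systematic error. The device names and readings are hypothetical: device_b is assumed to read exactly 5 units higher than device_a on every subject, so the two measures disagree on every score yet correlate perfectly.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical readings from two instruments on the same five subjects.
device_a = [10.0, 12.0, 15.0, 18.0, 20.0]
device_b = [a + 5.0 for a in device_a]  # assumed constant bias of +5 units

r = pearson_r(device_a, device_b)
mean_difference = sum(b - a for a, b in zip(device_a, device_b)) / len(device_a)

print(f"Pearson's r: {r:.3f}")
print(f"Mean difference (device_b - device_a): {mean_difference:.1f}")
```

The correlation comes out at 1 even though the second device is off by 5 units every time, which is why r alone cannot show that two measures agree.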
Selection
biases resulting from pre-existing differences between groups Ex: males taking tampon surveys
Ordinal data uses?
categories -order of numeric classification is of consequence Ex: Likert scales, in which a numeric value is assigned to each possible response (face pain scale)
Selection Bias
characteristics that subjects have before they enroll in a study (age, injury, illness) that influence the results
Delimitations
decisions that investigators make to improve the internal validity of their studies
error variance
difference between true score & observed score
mortality
differential attrition from groups
controlling confounding factors does what to internal validity?
enhances it
bias can be avoided by?
ensuring subjects in different groups have similar characteristics by random assignment or a matching procedure
sources of error include?
error or biological variability by the subject, or error by the tester or instrumentation used to take the measure
Agreement
estimates consistency or reproducibility of categorical data
History
events concurrent with independent variable affecting the dependent Ex: unknown variable is exposed & researcher is aware
confounding variables
extraneous factors that may result in false relationships
statistical regression
extreme scores tend to regress to the mean over time Ex: evaluating mean of the 3pt scorer to the game today
subject & treatment interaction
group selection biases -> will it generalize to other groups? Ex: being too selective with pts, must be a smoker, obese, with arthritis...
Precision of measurement
how confident one is in the reproducibility of a measure -takes into account ICC & standard deviation
when is the study's validity questioned?
if other factors are influencing the dependent variable
placebo effect
improvement due to expectation
Hawthorne effect
improvement due to observation... changing something because you know people are watching
Continuous Data
measured on a scale that can continuously be broken down into smaller & smaller increments
convergent validity
measures whether a given measure is highly correlated with other measures of the same construct Ex: balance measures strength & other things
maturation
processes occurring within subjects as a result of time instead of the independent variable
Inter-rating
rating between two people or having two different people run the test
threats associated with measurement
regression, instrumentation, testing effect
External Validity
relates to the degree to which the results of a study are generalizable to the real world
what are 2 measures of key components of validity?
reliability & agreement
Intrarater & interrater agreement are defined the same as?
reliability measures
kappa statistic
reports the estimation of agreement on a scale of 0-1, with 1 indicating perfect agreement
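A minimal sketch of how a kappa value could be computed, assuming the two-rater Cohen's kappa form (the card does not say which kappa is meant). The rater lists are hypothetical. Kappa compares observed agreement with the agreement expected by chance: kappa = (observed - expected) / (1 - expected).

```python
from collections import Counter

def cohens_kappa(ratings_1, ratings_2):
    """Cohen's kappa for two raters classifying the same subjects."""
    n = len(ratings_1)
    # Proportion of subjects on which the two raters gave the same category.
    observed = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    # Chance agreement from each rater's marginal category counts.
    counts_1 = Counter(ratings_1)
    counts_2 = Counter(ratings_2)
    categories = set(ratings_1) | set(ratings_2)
    expected = sum(counts_1[c] * counts_2[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical categorical ratings (positive / negative test) on 10 pts.
rater_a = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg"]
rater_b = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg", "pos"]

kappa = cohens_kappa(rater_a, rater_b)
kappa_perfect = cohens_kappa(rater_a, rater_a)

print(f"kappa (8 of 10 ratings agree): {kappa:.3f}")
print(f"kappa (identical ratings): {kappa_perfect:.3f}")
```

Here the raters agree on 80% of pts, but kappa is about 0.58 because about half of that agreement would be expected by chance alone.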
validity is an inherent principle in?
research design
Halo effect
researcher has expectations about performance of the subject
threats associated with participants
selection, maturation, attrition, history
Precision is reported as?
standard error of measurement (SEM) in the unit of measure
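A minimal sketch of one way SEM is commonly calculated, assuming the formula SEM = SD x sqrt(1 - ICC). The notes only say precision takes the ICC and standard deviation into account, so the exact formula is an assumption here, and the SD and ICC values are hypothetical.

```python
import math

def standard_error_of_measurement(sd, icc):
    """SEM in the unit of the measure, from the sample SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)

# Hypothetical example: range-of-motion measure with SD = 4.0 degrees, ICC = 0.91.
sem = standard_error_of_measurement(sd=4.0, icc=0.91)

print(f"SEM: {sem:.2f} degrees")
```

Because the result stays in the unit of the measure (degrees in this example), it can be read directly against a pt's score, unlike the unitless ICC.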
Ecological validity
translating treatments from controlled lab studies to typical clinical practice
Effects of pretesting
treatment may have its own effect if preceded by a pretest (making treatment generalizable only under those conditions) Ex: calling attention to problem like posture, they may try to correct it
total variance is due to
true variance & error variance
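A small worked sketch of that split, using hypothetical scores chosen so that the measurement errors are uncorrelated with the true scores (the condition under which the two variances add exactly). The reliability ratio at the end (true variance / total variance) is an assumption taken from classical test theory, not something stated in these notes.

```python
from statistics import pvariance

# Hypothetical true scores for four subjects and the error on each observation.
true_scores = [10.0, 10.0, 20.0, 20.0]
errors = [-1.0, 1.0, -1.0, 1.0]  # mean 0 and uncorrelated with the true scores

# Observed score = true score + error.
observed_scores = [t + e for t, e in zip(true_scores, errors)]

true_variance = pvariance(true_scores)       # variability between subjects
error_variance = pvariance(errors)           # variability due to measurement error
total_variance = pvariance(observed_scores)  # variability in what was actually recorded

# Assumed classical-test-theory ratio: share of total variance that is true variance.
reliability_ratio = true_variance / total_variance

print(f"true = {true_variance}, error = {error_variance}, total = {total_variance}")
print(f"true / total = {reliability_ratio:.3f}")
```

With real data the errors are rarely perfectly uncorrelated with the true scores, so the sum holds only approximately.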
what is the key quantitative inquiry?
unbiased & objective measurement of the dependent variables
true variance
variability between subjects
double-blinded
when both researchers and participants are blinded
when does Pearson's r approach 1?
when one measure increases in value and the second measure increases incrementally
when is the ICC lower?
when systematic error is present
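A minimal sketch of this effect, assuming the ICC(2,1) form (two-way random effects, absolute agreement, single measurement); the notes do not say which ICC form is meant, and consistency-type ICCs would not drop for a constant offset. The scores are hypothetical: in the biased case, tester 2 is assumed to score every subject 5 units higher than tester 1.

```python
def icc_absolute_agreement(scores):
    """ICC(2,1) from a table with one row per subject and one column per tester."""
    n = len(scores)     # number of subjects
    k = len(scores[0])  # number of testers
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    tester_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_testers = n * sum((m - grand_mean) ** 2 for m in tester_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_testers

    # Mean squares.
    ms_subjects = ss_subjects / (n - 1)
    ms_testers = ss_testers / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_testers - ms_error) / n
    )

# Hypothetical scores for five subjects from tester 1.
tester_1 = [10.0, 12.0, 15.0, 18.0, 20.0]

# Case 1: tester 2 reproduces tester 1 exactly (no systematic error).
no_bias = [[a, a] for a in tester_1]
# Case 2: tester 2 always scores 5 units higher (systematic error).
with_bias = [[a, a + 5.0] for a in tester_1]

icc_no_bias = icc_absolute_agreement(no_bias)
icc_with_bias = icc_absolute_agreement(with_bias)

print(f"ICC without systematic error: {icc_no_bias:.3f}")
print(f"ICC with +5 systematic error: {icc_with_bias:.3f}")
```

The subjects are ranked identically in both cases; only the constant offset between testers pulls the ICC down, from 1.0 to roughly 0.58.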
when is a study valid?
when the independent variable proves to have a definite effect on the dependent
Internal Validity
where the independent variable is responsible for observed effects on the dependent variable