Reliability and Validity

Split-Half Reliability

-A measure of internal consistency based on dividing the instrument items into two groups and assessing the correlation between the two halves. Advantage: -Eliminates time and physical environmental factors that can confound test-retest and alternate forms methods
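A minimal sketch of the split-half idea, assuming an odd/even item split and Pearson's r as the correlation (both are illustrative choices, and the scores are hypothetical):

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def split_half_r(item_scores):
    """Correlate odd-item totals with even-item totals.

    item_scores: one row per respondent, one column per item.
    The odd/even split is an assumption; other splits are possible.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    return pearson_r(odd_half, even_half)

# Hypothetical data: 5 respondents, 4 items scored 1-5.
scores = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 5, 4, 4],
    [5, 4, 5, 5],
]
r_half = split_half_r(scores)
print(round(r_half, 3))
```

Because both halves come from a single administration, no time passes between the two sets of scores being compared.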

Selection of an Instrument

-A useful instrument must have high reliability and be a valid measure of outcome. -Applies to both clinical and research use -An instrument has to be reliable in order to be valid


Kappa Statistic

-Also a measure of reliability and agreement -This is an agreement measure for categorical scales that corrects for "chance agreement" -Proportion of exact agreement = number of exact agreements / number of possible agreements -"Chance-corrected measure"
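The chance correction described above matches the usual definition of Cohen's kappa for two raters; assuming that is the intended statistic, it can be sketched as follows (the ratings are hypothetical):

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on a categorical scale."""
    n = len(rater_a)
    # Observed agreement: exact agreements / possible agreements.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's category proportions, summed.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical ratings of 10 patients by two raters.
a = ["yes", "yes", "yes", "no", "no", "no", "yes", "no", "yes", "no"]
b = ["yes", "yes", "no", "no", "no", "yes", "yes", "no", "yes", "no"]
kappa = cohen_kappa(a, b)
print(round(kappa, 2))
```

Here the raters agree on 8 of 10 cases (0.80), but half of that agreement is expected by chance, so kappa is lower than the raw agreement.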

Alternate Forms Reliability

-Also called equivalent or parallel forms reliability: -Assesses statistical equivalence. -Eliminates memory of particular responses in traditional test-retest format. -Correlation coefficients used to assess reliability

Intra-Rater Reliability

-Assesses the rater's influence on the accuracy of the measurement -Reflects the stability of ratings by one individual -Is a type of test-retest reliability

Concurrent Validity

-Criterion measures are taken at the same time (concurrently) -Used to establish validity

Convergent Validity

-How the measure assesses similar areas (not so much about comparing to gold standard) -Positive correlations

Discriminant Validity

-How the measure differs or performs against instruments known to assess other areas -Low correlations

Test-Retest Reliability

-Indicates the stability (consistency) of an instrument through repeated trials. -Time between tests, circumstances, carryover, and testing effects between assessments must be considered.

Content Validity

-Instrument adequately addresses all aspects of a particular variable of interest and nothing else. -Also called sampling validity -Subjective process by "panel of experts" during test development; non-statistical procedure. -Especially important in questionnaires, examinations, inventories & interviews that attempt to evaluate a range of information.

Face Validity

-Instrument appears to test what it is supposed to and it seems reasonable to implement. -Least rigorous, scientifically weak -Subjective process, yet necessary for all parties involved to acknowledge since perceived relevance can influence motivation and accuracy.

Norm Referenced Test

-Instruments are designed to assess an individual's skill relative to peers, or to a known group -Have been administered to a large sample to determine group means and standard deviations. -This information is used to determine how a test taker "ranks" among the group. Examples: -School placement exams -Pediatric developmental assessments
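As an illustration of how group means and standard deviations turn a raw score into a "rank", the sketch below converts a score to a z-score and an approximate percentile. It assumes the norm group is roughly normally distributed; the norm values are hypothetical.

```python
from statistics import NormalDist

def z_score(score, norm_mean, norm_sd):
    """Distance of a score from the norm-group mean, in standard deviations."""
    return (score - norm_mean) / norm_sd

def percentile_rank(score, norm_mean, norm_sd):
    """Approximate percentile, assuming normally distributed norm scores."""
    return 100 * NormalDist().cdf(z_score(score, norm_mean, norm_sd))

# Hypothetical norms: mean 100, SD 15; test taker scores 115.
z = z_score(115, 100, 15)
pct = percentile_rank(115, 100, 15)
print(z, round(pct, 1))
```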

Regression to the Mean

-Measurement error is also connected to score's position in the distribution -Extremely high and low scores reflect differing degrees of error (positive & negative) -In a pre-test post-test design, scores have a tendency to move towards the mean
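A small simulation can make the pre-test/post-test tendency concrete. This is only a sketch under assumed values (normally distributed true scores and errors, a fixed random seed), not a model of any particular instrument:

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: true scores around 50; each test adds random error.
true_scores = [random.gauss(50, 10) for _ in range(5000)]
pre = [t + random.gauss(0, 10) for t in true_scores]
post = [t + random.gauss(0, 10) for t in true_scores]

# Select the people with the highest 10% of pre-test scores.
cutoff = sorted(pre)[int(0.9 * len(pre))]
top = [i for i, score in enumerate(pre) if score >= cutoff]

pre_top_mean = mean(pre[i] for i in top)
post_top_mean = mean(post[i] for i in top)
overall_mean = mean(pre)
print(round(pre_top_mean, 1), round(post_top_mean, 1), round(overall_mean, 1))
```

The extreme pre-test group scores lower on the post-test, closer to the overall mean, even though nothing about the people changed between tests.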

Criterion-Related Validity

-New (target) instrument is compared to a "gold standard" (criterion) measure. -How well it predicts other tests. -most objective and practical test of validity

Intraclass Correlation

-Preferred technique for evaluating rater reliability -Measures reliability & agreement -There are 6 different ways to calculate the ICC depending on purpose, study design, level of measurement, and conceptualization -Indicates agreement within one subject and across raters -Application: test-retest

Measurement Error

-Some level of inconsistency is inevitable. Sources of inconsistency: -Tester (rater) -Instrument -Subject or characteristic. Observed Score = True Score ± Measurement Error

Criterion Referenced Test

-Tests are sometimes developed to assess specific performance based on an "expected/required" outcome (criterion) -These are tests where the test taker must achieve/demonstrate a specific level of knowledge/skill in order to "move on to the next level" Examples: -Safety evaluation score -Licensure or board examinations

Construct Validity

-The ability to accurately measure an abstract concept -Usually used with multidimensional concepts -Often includes many different types of validity -Known groups - if it can identify presence or absence of a particular characteristic -Can be assessed by measuring concepts opposite to the construct

Internal Consistency

-The extent to which all items in an instrument measure the same trait -Most commonly done by evaluating the relationship between items in their subscale ex. visual perceptual tests and their many subtests -A good scale is one that measures different aspects of the same attribute (e.g. ADL function: bathing, dressing, grooming, toileting) -Note: internal consistency should not be confused with construct validity. Internal consistency is a measure of reliability

Cronbach's Alpha

-Used to assess internal consistency -Can be used with dichotomous or ordinal scales -Related to split-half reliability: it indicates the reliability across all possible splits of the items
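A minimal sketch of the standard alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name and the data are hypothetical:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows of respondents and columns of items."""
    k = len(item_scores[0])            # number of items
    columns = list(zip(*item_scores))  # one tuple of scores per item
    item_variance_sum = sum(pvariance(col) for col in columns)
    total_variance = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical data: 5 respondents, 4 items scored 1-5.
scores = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 5, 4, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

Population variance is used for both the items and the totals; using sample variance for both would give the same ratio.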

Variance and Correlation

-Variance is the basis of correlation -The reliability coefficient is linked to the amount of variance found in the "True Score"

Rater Bias

-assessment/detection bias -When one therapist performs both assessments, first results might affect second time results -Knowledge of the first rating influences the second Strategy: -Blinding of the assessor -Establish scoring procedures -Employ training procedures -Evaluate across raters


Reliability

-Consistency (dependable). -Reflects how consistent and free from random error a measurement is -Important for generalizing findings to other samples

Systematic Errors

-Consistent, unidirectional, and predictable (if detected) -Relatively easy to correct through recalibration or a correction factor -Affects validity more than reliability.
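To illustrate why a systematic error hurts validity more than reliability, the sketch below adds a constant offset to a set of hypothetical reference values. The +2 offset and the data are assumptions chosen for illustration:

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical reference values and a device that always reads 2 units high.
reference = [10.0, 12.0, 15.0, 20.0, 26.0]
miscalibrated = [value + 2.0 for value in reference]

r = pearson_r(reference, miscalibrated)          # relative order is unchanged
bias = mean(miscalibrated) - mean(reference)     # constant, one-directional error
corrected = [value - bias for value in miscalibrated]  # simple recalibration
print(round(r, 3), bias)
```

The correlation stays perfect because relative scores are preserved, yet every reading is wrong by the same amount until the offset is subtracted.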


Validity

-Measures what it is supposed to measure (meaningful). -Valid = faithful, true -The degree to which an instrument actually measures what it is meant to measure. -Important for drawing inferences and application of results. -Reliability is a prerequisite for validity, but not vice-versa. -Validity is determined by assessing the relationship between test results, characteristics, performances, and/or certain behaviors. -It is more difficult to determine validity of abstract variables (e.g. QoL).

Random Errors

-Occur by chance and alter scores in unpredictable ways -Generally are not influenced by magnitude of true score -Larger samples minimize the influence of random errors, leading to the use of average scores as a good estimate of true score
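A short simulation of the averaging argument, assuming normally distributed random error around a hypothetical true score of 70 (the seed, error size, and sample sizes are arbitrary choices):

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable
TRUE_SCORE = 70.0

def observe(n):
    """Simulate n observed scores: true score plus random error (SD 5)."""
    return [TRUE_SCORE + random.gauss(0, 5) for _ in range(n)]

small_sample_mean = mean(observe(5))
large_sample_mean = mean(observe(10000))
print(round(small_sample_mean, 2), round(large_sample_mean, 2))
```

With many observations the positive and negative errors tend to cancel, so the large-sample average lands very close to the true score.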

Continuous Data

-often converted to dichotomous variables for diagnostic purposes. -Cutoff scores are based on the consequences of false positives vs. false negatives (sensitivity and specificity).
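The trade-off behind a cutoff score can be sketched by dichotomizing the same continuous scores at two different cutoffs and counting each kind of error. The scores, the true statuses, and the "score at or above cutoff means positive" rule are all hypothetical:

```python
def classify(scores, cutoff):
    """Convert continuous scores to positive/negative at a cutoff."""
    return [score >= cutoff for score in scores]

def count_errors(predicted, actual):
    """Return (false positives, false negatives)."""
    false_pos = sum(p and not a for p, a in zip(predicted, actual))
    false_neg = sum(a and not p for p, a in zip(predicted, actual))
    return false_pos, false_neg

# Hypothetical test scores and whether each person truly has the condition.
scores = [2, 4, 5, 6, 7, 9]
has_condition = [False, False, True, False, True, True]

low_cutoff_errors = count_errors(classify(scores, 5), has_condition)
high_cutoff_errors = count_errors(classify(scores, 7), has_condition)
print(low_cutoff_errors, high_cutoff_errors)
```

Lowering the cutoff catches every true case at the cost of a false positive; raising it removes the false positive but misses a true case.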

Minimal Detectable Difference

-Refers to the smallest increment of change that can be reliably measured -There is always measurement error; conceptually, we want to know whether an observed change is due to error or to "actual differences"
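The source gives no formula here. One commonly used convention, shown purely as an assumption, derives the minimal detectable change at 95% confidence from the standard error of measurement (SEM = SD × sqrt(1 − reliability)); the input values are hypothetical:

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM from the score SD and a reliability coefficient (0 to 1)."""
    return sd * math.sqrt(1 - reliability)

def minimal_detectable_change(sd, reliability, z=1.96):
    """MDC at the confidence level implied by z (1.96 for 95%).

    sqrt(2) accounts for error in both the first and second measurement.
    """
    return z * math.sqrt(2) * standard_error_of_measurement(sd, reliability)

# Hypothetical instrument: SD of 10 points, test-retest reliability of 0.91.
mdc = minimal_detectable_change(sd=10.0, reliability=0.91)
print(round(mdc, 2))
```

Under these assumptions, a change smaller than the printed value could plausibly be measurement error alone.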

Minimal Clinically Important Difference

-What is the smallest amount of change that has practical or "clinical meaning" that a patient would identify as important? -Usually based on some external criterion that has value to clinicians, clients, or caregivers. Example: -FIM differences of 10 points below the level of 80 have been connected to an additional hour of care needed.

Predictive Validity

Attempts to determine if a measure is a suitable predictor of future criterion score

Inter-Rater Reliability

Compares variation between separate raters on the same group of participants

Population Specific Reliability

Some assessments may be difficult to standardize on certain populations -Pain -Deformity -Weakness -Anxiety -Spasticity

Reliability Coefficient

-Reliability coefficient = True variance / (True variance + Error variance) = T / (T + E) -Range: 0 to 1; < 0.50 (poor), 0.50 - 0.75 (moderate), > 0.75 (good) -Based upon score variance -Correlation reflects the degree of association or proportion between scores. -Agreement reflects the actual equality of scores. -Systematic errors do not affect the reliability coefficient since relative scores remain consistent (high correlation).
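The ratio and the interpretation bands above can be written out directly. The function names are hypothetical, and treating exactly 0.75 as "moderate" is an assumption about the boundary:

```python
def reliability_coefficient(true_variance, error_variance):
    """T / (T + E): share of observed variance that is true variance."""
    return true_variance / (true_variance + error_variance)

def interpret(coefficient):
    """Label a coefficient using the < 0.50 / 0.50-0.75 / > 0.75 bands."""
    if coefficient < 0.50:
        return "poor"
    if coefficient <= 0.75:
        return "moderate"
    return "good"

# Hypothetical variances: 8 units of true variance, 2 units of error variance.
r = reliability_coefficient(8.0, 2.0)
print(r, interpret(r))
```

As error variance shrinks toward zero the coefficient approaches 1; as it grows, the coefficient approaches 0.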

Standardized Test

assessments with specific: -Administration procedures -Scoring procedures -Test items or material Instruments can be considered standardized while NOT being classified as norm or criterion referenced

Pearson's Coefficient

interval or ratio data

Spearman's Coefficient

ordinal data
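One way to see the difference between the two coefficients: Spearman's coefficient can be computed as Pearson's coefficient applied to ranks, so it only uses ordering. A minimal sketch with hypothetical data (tied values receive the average of their ranks):

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data with a perfectly ordered but non-linear relationship.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
```

The ordering is perfectly preserved, so Spearman's value is 1, while Pearson's value is slightly lower because the relationship is not a straight line.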


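Specificity

-True negatives / actual negatives. -Related to, but not the same as, negative predictive value (true negatives / all negative test results) -Tells whether the test is eliminating participants who should be eliminated.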


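Sensitivity

-True positives / actual positives. -Related to, but not the same as, positive predictive value (true positives / all positive test results) -Tells whether everyone who has the condition is being caught by the test.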
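The two ratios above (true negatives over actual negatives, true positives over actual positives) are conventionally called specificity and sensitivity. The sketch below computes them next to the predictive values, which use test results rather than actual status as the denominator. The counts are hypothetical:

```python
def diagnostic_summary(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV, and NPV from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives / actual positives
        "specificity": tn / (tn + fp),  # true negatives / actual negatives
        "ppv": tp / (tp + fp),          # true positives / all positive results
        "npv": tn / (tn + fn),          # true negatives / all negative results
    }

# Hypothetical counts: 50 people with the condition, 100 without.
summary = diagnostic_summary(tp=40, fp=20, tn=80, fn=10)
for name, value in summary.items():
    print(name, round(value, 3))
```

In this example sensitivity and specificity are both 0.80, while the predictive values differ from them and from each other, which is why the terms are not interchangeable.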
