Reliability and Validity

Split-Half Reliability

-A measure of internal consistency based on dividing the instrument's items into two groups and assessing the correlation between the two halves.
Advantage:
-Eliminates the time and physical-environment factors that can confound the test-retest and alternate forms methods
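
As an illustration, here is a minimal Python sketch of the split-half approach, assuming hypothetical simulated item scores: the items are split into two halves, the half-scores are correlated, and the Spearman-Brown correction adjusts for the shortened length.

```python
import numpy as np

# Hypothetical item-level data: rows = respondents, columns = instrument items.
rng = np.random.default_rng(0)
true_trait = rng.normal(size=(50, 1))
items = true_trait + rng.normal(scale=0.8, size=(50, 10))  # 10 items sharing one trait

# Split items into two halves (odd vs. even items) and sum each half.
half_a = items[:, 0::2].sum(axis=1)
half_b = items[:, 1::2].sum(axis=1)

# Correlation between the two half-scores.
r_halves = np.corrcoef(half_a, half_b)[0, 1]

# Spearman-Brown correction: estimates the reliability of the full-length
# instrument from the correlation between its two halves.
split_half_reliability = 2 * r_halves / (1 + r_halves)
print(f"half-score correlation: {r_halves:.2f}")
print(f"Spearman-Brown corrected split-half reliability: {split_half_reliability:.2f}")
```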

Selection of an Instrument

-A useful instrument must have high reliability and be a valid measure of the outcome of interest, whether for clinical or research purposes.
-An instrument has to be reliable in order to be valid.

Kappa

-Also a measure of reliability and agreement.
-An agreement measure for categorical scales that corrects for "chance agreement."
-Simple agreement is the number of exact agreements divided by the number of possible agreements; kappa adjusts this for the agreement expected by chance alone.
-A "chance-corrected" measure.
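
A minimal sketch of computing Cohen's kappa by hand in Python, using a hypothetical 2x2 agreement table for two raters:

```python
import numpy as np

# Hypothetical 2x2 agreement table: rows = rater A, columns = rater B,
# for 100 cases classified as "impaired" / "not impaired".
table = np.array([[40, 10],
                  [ 5, 45]])

n = table.sum()
p_observed = np.trace(table) / n          # proportion of exact agreements
row_marg = table.sum(axis=1) / n          # rater A's category proportions
col_marg = table.sum(axis=0) / n          # rater B's category proportions
p_chance = np.sum(row_marg * col_marg)    # agreement expected by chance alone

kappa = (p_observed - p_chance) / (1 - p_chance)
print(f"observed agreement: {p_observed:.2f}, "
      f"chance agreement: {p_chance:.2f}, kappa: {kappa:.2f}")
```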

Alternate Forms Reliability

-Also called equivalent or parallel forms reliability.
-Assesses the statistical equivalence of two versions of an instrument.
-Eliminates memory of particular responses in the traditional test-retest format.
-Correlation coefficients are used to assess reliability.

Intra-Rater Reliability

-Assesses the rater's influence on the accuracy of the measurement -Reflects the stability of ratings by one individual -Is a type of test-retest reliability

Concurrent Validity

-Criterion measures are taken at the same time (concurrently) as the target measure.
-Used to establish criterion-related validity.

Convergent Validity

-Assesses how the measure relates to instruments measuring similar constructs (not so much a comparison against a gold standard).
-Expect positive correlations.

Discriminant Validity

-Assesses how the measure differs from or performs against instruments known to assess other, unrelated areas.
-Expect low correlations.

Test-Retest Reliability

-Indicates the stability (consistency) of an instrument through repeated trials. -Time between tests, circumstances, carryover, and testing effects between assessments must be considered.

Content Validity

-The instrument adequately addresses all aspects of the variable of interest and nothing else.
-Also called sampling validity.
-A subjective process carried out by a "panel of experts" during test development; a non-statistical procedure.
-Especially important in questionnaires, examinations, inventories, and interviews that attempt to evaluate a range of information.

Face Validity

-The instrument appears to test what it is supposed to test and seems reasonable to implement.
-Least rigorous and scientifically weakest form of validity.
-A subjective process, yet necessary for all parties involved to acknowledge, since perceived relevance can influence motivation and accuracy.

Norm Referenced Test

-Instruments designed to assess an individual's skill relative to peers or to a known group.
-Have been administered to a large sample to determine group means and standard deviations.
-This information is used to determine how a test taker "ranks" within the group.
Examples:
-School placement exams
-Pediatric developmental assessments

Regression to the Mean

-Measurement error is also connected to a score's position in the distribution.
-Extremely high and low scores reflect differing degrees of (positive and negative) error.
-In a pre-test/post-test design, extreme scores tend to move toward the mean on retest.
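
A small simulation with hypothetical numbers illustrates the effect: people selected for extreme pre-test scores score closer to the mean on the post-test, even though their true scores never changed.

```python
import numpy as np

rng = np.random.default_rng(1)
true_score = rng.normal(loc=50, scale=10, size=10_000)
pre = true_score + rng.normal(scale=5, size=10_000)   # pre-test = true score + random error
post = true_score + rng.normal(scale=5, size=10_000)  # post-test = same true score, new error

# Select people who scored extremely high on the pre-test.
extreme = pre > np.percentile(pre, 95)
print(f"mean pre-test of extreme group:  {pre[extreme].mean():.1f}")
print(f"mean post-test of extreme group: {post[extreme].mean():.1f}")  # drifts back toward 50
```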

Criterion-Related Validity

-The new (target) instrument is compared to a "gold standard" (criterion) measure.
-Indicates how well the target instrument predicts or agrees with the criterion.
-The most objective and practical test of validity.

Intraclass Correlation

-Preferred technique for evaluating rater reliability.
-Measures both reliability and agreement.
-There are six different ways to calculate the ICC, depending on purpose, study design, level of measurement, and conceptualization.
-Indicates agreement within one subject and across raters.
-Applications include test-retest and inter-rater designs.
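
As a sketch, the following Python function computes one of the six forms, ICC(2,1) (two-way random effects, single measure, absolute agreement, per Shrout & Fleiss); the ratings matrix is hypothetical.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1): two-way random-effects, single-measure, absolute-agreement form.
    ratings is an (n subjects) x (k raters) array."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Sums of squares from the two-way ANOVA decomposition.
    ss_subjects = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_raters = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_total = np.sum((x - grand) ** 2)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)          # between-subjects mean square
    ms_raters = ss_raters / (k - 1)              # between-raters mean square
    ms_error = ss_error / ((n - 1) * (k - 1))    # residual mean square

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )

# Hypothetical example: 5 subjects rated by 3 raters on a 0-10 scale.
ratings = [[7, 8, 7],
           [5, 5, 6],
           [9, 9, 8],
           [3, 4, 3],
           [6, 7, 6]]
print(f"ICC(2,1) = {icc_2_1(ratings):.2f}")
```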

Measurement Error

-Some level of inconsistency is inevitable.
Sources of inconsistency:
-Tester (rater)
-Instrument
-Subject or characteristic being measured
Observed Score = True Score ± Measurement Error

Criterion Referenced Test

-Tests are sometimes developed to assess specific performance against an "expected/required" outcome (criterion).
-The test taker must achieve or demonstrate a specific level of knowledge or skill in order to "move on to the next level."
Examples:
-Safety evaluation score
-Licensure or board examinations

Construct Validity

-The ability to accurately measure an abstract concept.
-Usually used with multidimensional concepts.
-Often includes many different types of validity.
-Known-groups approach: whether the instrument can identify the presence or absence of a particular characteristic.
-Can also be assessed by measuring concepts opposite to the construct.

Internal Consistency

-The extent to which all items in an instrument measure the same trait.
-Most commonly evaluated through the relationship between items in a subscale (e.g. visual perceptual tests and their many subtests).
-A good scale measures different aspects of the same attribute (e.g. ADL function: bathing, dressing, grooming, toileting).
-Note: internal consistency should not be confused with construct validity; internal consistency is a measure of reliability.

Cronbach's Alpha

-Used to assess internal consistency.
-Can be used with dichotomous or ordinal scales.
-Reflects the reliability of all possible split halves (conceptually, the average of all possible split-half estimates).
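
A minimal sketch of the standard alpha formula, alpha = k/(k-1) x (1 - sum of item variances / total-score variance), applied to a hypothetical 4-item scale:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha: items is an (n respondents) x (k items) array."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1)        # variance of each item
    total_variance = x.sum(axis=1).var(ddof=1)    # variance of the summed scale score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical 4-item scale scored by 6 respondents (ordinal 1-5 ratings).
scores = [[4, 5, 4, 4],
          [2, 2, 3, 2],
          [5, 5, 4, 5],
          [3, 3, 3, 4],
          [1, 2, 2, 1],
          [4, 4, 5, 4]]
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```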

Variance and Correlation

-Variance is the basis of correlation.
-The reliability coefficient reflects the proportion of total score variance attributable to the "true score."

Rater Bias

-A form of assessment/detection bias.
-When one therapist performs both assessments, the results of the first may affect the second.
-Knowledge of the first rating influences the second.
Strategies:
-Blinding of the assessor
-Establishing scoring procedures
-Employing training procedures
-Evaluating across raters

Reliability

-Consistency (dependability).
-Reflects how consistent and free from random error a measurement is.
-Important for generalizing findings to other samples.

Systematic Errors

-Consistent, unidirectional, and predictable (if detected).
-Relatively easy to correct through recalibration or adjustment.
-Affects validity more than reliability.

Validity

-The instrument measures what it is supposed to measure (meaningful).
-Valid = faithful, true.
-The degree to which an instrument actually measures what it is meant to measure.
-Important for drawing inferences and applying results.
-Reliability is a prerequisite for validity, but not vice versa.
-Validity is determined by assessing the relationship between test results and relevant characteristics, performances, and/or behaviors.
-It is more difficult to determine the validity of abstract variables (e.g. QoL).

Random Errors

-Occur by chance and alter scores in unpredictable ways.
-Generally not influenced by the magnitude of the true score.
-Larger samples minimize the influence of random errors, which is why average scores are used as a good estimate of the true score.

Continuous Data

-often converted to dichotomous variables for diagnostic purposes. -Cutoff scores are based on the consequences of false positives vs. false negatives (sensitivity and specificity).

Minimal Detectable Difference

-Refers to the smallest increment of change that can be reliably measured.
-There is always measurement error; conceptually, we want to know whether a measured change reflects error or an "actual difference."
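
One common formulation is MDC95 = 1.96 x SEM x sqrt(2), where SEM = SD x sqrt(1 - reliability); the sketch below uses hypothetical reliability and SD values.

```python
import math

# Hypothetical inputs: test-retest reliability (e.g. an ICC) and the sample SD
# of baseline scores on the instrument.
reliability = 0.90
baseline_sd = 8.0

sem = baseline_sd * math.sqrt(1 - reliability)   # standard error of measurement
mdc_95 = 1.96 * sem * math.sqrt(2)               # minimal detectable change, 95% confidence

print(f"SEM = {sem:.2f} points, MDC95 = {mdc_95:.2f} points")
# A change smaller than MDC95 cannot be distinguished from measurement error.
```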

Minimal Clinically Important Difference

-The smallest amount of change that has practical or "clinical" meaning, i.e. change a patient would identify as important.
-Usually based on some external criterion that has value to clinicians, clients, or caregivers.
Example:
-FIM differences of 10 points below the level of 80 have been connected to an additional hour of care needed.

Predictive Validity

Attempts to determine whether a measure is a suitable predictor of a future criterion score.

Inter-Rater Reliability

Compares variation between separate raters on the same group of participants

Population Specific Reliability

Some assessments may be difficult to standardize in certain populations because of factors such as:
-Pain
-Deformity
-Weakness
-Anxiety
-Spasticity

Reliability Coefficient

Reliability coefficient = True variance / (True variance + Error variance) = T / (T + E)
-Range: 0 to 1; < 0.50 poor, 0.50-0.75 moderate, > 0.75 good.
-Based upon score variance.
-Correlation reflects the degree of association or proportionality between scores.
-Agreement reflects the actual equality of scores.
-Systematic errors do not affect the reliability coefficient, since relative scores remain consistent (high correlation).
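
A quick simulation with hypothetical distributions shows the coefficient as the proportion of observed-score variance attributable to true-score variance:

```python
import numpy as np

rng = np.random.default_rng(2)
true_scores = rng.normal(loc=100, scale=15, size=10_000)   # T
errors = rng.normal(loc=0, scale=5, size=10_000)           # E (random error)
observed = true_scores + errors                            # Observed = True + Error

# Reliability coefficient = T / (T + E), in terms of variance components.
reliability = true_scores.var() / (true_scores.var() + errors.var())
print(f"reliability coefficient ~ {reliability:.2f}")

# A systematic error (a constant offset added to every observed score) shifts
# all scores equally, so relative standing -- and the coefficient -- is unchanged.
```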

Standardized Test

Assessments with specific:
-Administration procedures
-Scoring procedures
-Test items or materials
Instruments can be considered standardized while NOT being classified as norm- or criterion-referenced.

Pearson's Coefficient

interval or ratio data

Spearman's Coefficient

ordinal data
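
A short example using SciPy with hypothetical paired trial scores; pearsonr is appropriate for interval/ratio data, spearmanr for ordinal (rank-based) data:

```python
from scipy import stats

# Hypothetical paired measurements from two administrations of a test.
trial_1 = [12, 15, 9, 20, 17, 11, 14, 18]
trial_2 = [13, 14, 10, 21, 16, 12, 15, 19]

r_pearson, p_pearson = stats.pearsonr(trial_1, trial_2)       # interval/ratio data
rho_spearman, p_spearman = stats.spearmanr(trial_1, trial_2)  # ordinal (rank-based) data

print(f"Pearson r = {r_pearson:.2f} (p = {p_pearson:.3f})")
print(f"Spearman rho = {rho_spearman:.2f} (p = {p_spearman:.3f})")
```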

Specificity

= true negatives / actual negatives (TN / (TN + FP)).
-Tells you whether the test correctly rules out participants who should be ruled out.
-Distinct from the negative predictive value, which divides true negatives by all negative test results.

Sensitivity

= true positives / actual positives (TP / (TP + FN)).
-Tells you whether everyone who has the condition is being caught by the test.
-Distinct from the positive predictive value, which divides true positives by all positive test results.
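
A worked example with hypothetical counts, also contrasting sensitivity and specificity with the predictive values:

```python
# Hypothetical screening-test results cross-tabulated against a gold standard.
true_positives = 45    # test positive, condition present
false_negatives = 5    # test negative, condition present
true_negatives = 90    # test negative, condition absent
false_positives = 10   # test positive, condition absent

sensitivity = true_positives / (true_positives + false_negatives)  # TP / actual positives
specificity = true_negatives / (true_negatives + false_positives)  # TN / actual negatives

# Predictive values condition on the test result, not on the true status.
ppv = true_positives / (true_positives + false_positives)  # positive predictive value
npv = true_negatives / (true_negatives + false_negatives)  # negative predictive value

print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```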

