Types of Reliability/Validity
Parallel Forms Reliability
Administering different versions of a test/assessment tool that cover the same material/knowledge to the same group to see how consistent the scores are across the versions. Ex: two versions of the AP US History test given to the same year of students, testing the same material
Test-Retest Reliability
Administering the same test twice over a period of time to the same group and checking whether the scores from the two administrations correlate, to evaluate the test's reliability. Ex: Indigo test scores may change between administrations, and the correlation between the two sets of scores can be used to evaluate how reliable that test is
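A minimal sketch of the correlation step, assuming the Pearson correlation coefficient is the statistic used (the notes only say the scores should "correlate"). The score lists and the helper name `pearson_r` are hypothetical, made up for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five students on two administrations
first_attempt = [70, 82, 91, 65, 88]
second_attempt = [72, 80, 93, 68, 85]

# A value close to 1 suggests the test gives consistent results over time
test_retest_r = pearson_r(first_attempt, second_attempt)
print(round(test_retest_r, 2))
```

The same calculation would apply to parallel forms reliability, with the two lists holding scores from the two versions instead of two administrations.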
Face Validity
How well a procedure/test appears, on its face, to assess its stated aims. Ex: when given a test about psychology, ALL the questions are on psychology, not something else
Ecological Validity
How well a task used in research represents real life. Ex: research into eyewitness testimony can lack ecological validity b/c the participants saw things on video instead of in real life
Predictive Validity
How well a test predicts future behavior. Ex: if a depression test accurately matches diagnoses of depression, its scores can be used to predict the future possibility of depression
Population Validity
How well the results of a study can be generalized to groups other than the group used in the experiment/study
Internal Validity
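Whether the instruments/procedures used in research measure what they're intended to measure, so that the results are due to the variable being studied and not to confounding variables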
Inter-Rater Reliability
Measure of reliability that looks at how closely different judges/raters agree in their assessment decisions: do they interpret answers the same way? Do they respond the same way to answers given on certain content? Ex: how Flan grades FRQs vs. how the AP graders do; they should score them the same if there's inter-rater reliability
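One simple way to put a number on rater agreement is the proportion of responses that two raters score identically. This is only a sketch under that assumption; the notes do not name a specific statistic, and the scores and the function name `percent_agreement` are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Fraction of responses that both raters scored identically."""
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)

# Hypothetical FRQ scores from a teacher and an AP reader for six essays
teacher_scores = [5, 4, 6, 3, 7, 5]
ap_reader_scores = [5, 4, 5, 3, 7, 6]

# The raters match on 4 of the 6 essays
agreement = percent_agreement(teacher_scores, ap_reader_scores)
print(round(agreement, 2))
```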
Sampling Validity
Random sampling increases external validity, and random assignment increases internal validity
External Validity
Whether results can be generalized beyond the immediate study. Ex: a finding that studying over periods of time works better than last-minute cramming should be applicable to more than one school subject, like English and math
Average Inter-Item Correlation
Subtype of internal consistency reliability: obtained by taking all items on a test geared toward the same thing (like reading comprehension), determining the CORRELATION COEFFICIENT for each PAIR of items, and taking the average of the CCs.
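The pairwise procedure can be sketched as follows, assuming Pearson's r as the correlation coefficient. The item scores and the function names are hypothetical and only illustrate the steps.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def average_inter_item_correlation(items):
    """items: one list of scores per item, each across the same respondents."""
    pairs = list(combinations(items, 2))  # every distinct PAIR of items
    return sum(pearson_r(a, b) for a, b in pairs) / len(pairs)

# Hypothetical: three reading comprehension items answered by five respondents
item_scores = [
    [1, 2, 3, 4, 5],
    [2, 3, 3, 5, 5],
    [1, 1, 3, 4, 4],
]

avg_r = average_inter_item_correlation(item_scores)
print(round(avg_r, 2))
```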
Split-Half Reliability
Subtype of internal consistency reliability: all test items looking at the same area of knowledge (e.g., the American Revolution) are split in half, making two sets of items. Then the entire test is given to a group; the total score on each set is recorded, and the split-half reliability is determined by the correlation between the 2 set totals.
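A rough sketch of the split-half calculation, assuming an odd/even split of the items and Pearson's r for the correlation (the notes do not say how the items are divided). The response data and function names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(responses):
    """responses: one list of item scores per respondent."""
    # Assumed split: items 1, 3, ... in one set and items 2, 4, ... in the other
    first_set_totals = [sum(r[0::2]) for r in responses]
    second_set_totals = [sum(r[1::2]) for r in responses]
    return pearson_r(first_set_totals, second_set_totals)

# Hypothetical: five respondents, four items scored 1 (correct) or 0 (incorrect)
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

split_half_r = split_half_reliability(responses)
print(round(split_half_r, 2))
```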
Construct Validity
The test is actually measuring what it's supposed to measure and not other variables. You can use experts in the field you're studying to check whether what you're doing truly measures what you want to find out more about.
Internal Consistency Reliability
Used to evaluate whether different test items that "probe" the same thing produce similar results. Ex: Do all questions on a mental health test evaluating for depression produce similar results?
Criterion-Related Validity
Used to predict current/future performance: test results correlate with a criterion of interest. Ex: Using the current ACT to develop the new SAT we know and love today.