Chapter 5: Test & Measurements
Construct Validity Evidence
*Refers to the degree to which the individual possesses a trait (construct) presumed to be reflected in the test performance. *Anxiety, intelligence, and motivation are constructs. *Examples - cardiovascular fitness and tennis skills. *Construct validity can be demonstrated by comparing higher-skilled individuals with lesser-skilled individuals.
Content Validity
*Related to how well a test measures all skills and subject matter that have been presented to individuals. *To have content validity, a test must be related to the objectives of the class, presentation, etc. (that for which the group is responsible). *Ask yourself - Does the test measure what the group has been taught?
Methods of estimating reliability (Test-retest method)
*Requires two administrations of the same test to the same group of individuals. *Calculate the correlation coefficient between the two sets of scores (the intraclass correlation coefficient is best). *Greatest source of error in this method is caused by changes in the individuals being tested. *The appropriate time interval between administrations of the test is sometimes difficult to determine.
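The test-retest calculation can be sketched in a few lines of Python. This is only an illustration: the scores are made-up sit-up counts, `pearson_r` is a hypothetical helper name, and it computes the simpler Pearson correlation rather than the intraclass correlation coefficient that the notes call best.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each administration.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores: the same five individuals tested on two occasions.
first_administration = [30, 42, 25, 38, 45]
second_administration = [32, 40, 27, 39, 44]

test_retest_r = pearson_r(first_administration, second_administration)
print(f"test-retest r = {test_retest_r:.2f}")
```

Individual scores change between the two administrations, but the rank order of the scores stays nearly the same, so the coefficient comes out close to 1.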
Three types of validity evidence reported for norm-referenced tests
- Content Validity - Criterion Validity - Construct Validity
Validity of criterion-referenced test
- Directly related to predetermined behavioral objectives - Test items should be constructed to parallel behavioral objectives
Factors of Validity
- The characteristics of the test takers: a test is valid only for individuals of the same age, gender, and experience as those on whom it was validated.
Factors affecting reliability
1. Method of scoring - The more objective the test, the higher the reliability. 2. The heterogeneity of the group - Reliability coefficients based on test scores from a group with a wide range of abilities will be overestimated. 3. The length of the test - The longer the test, the greater the reliability. 4. Administrative procedures - The directions must be clear; all individuals should be ready, be motivated to do well, and perform the test in the same way; the testing environment should be favorable to good performance.
Criterion behavior
Minimum level of performance
Validity
* Traditionally, validity refers to the degree to which a test actually measures what it claims to measure. - Most important criterion to consider when evaluating a test. - Refers more to the agreement between what the test measures and the performance, skill, or behavior the test is designed to measure. - Means validity is specific to a particular use and group.
Criterion-referenced measurement
* Used when individuals are expected to perform at a specific level of achievement. - An individual's level of performance is not compared with the performance of others.
Norm-referenced measurement
* Used when you wish to compare an individual's performance on a test with performance of other individuals. - Norms usually reported by gender, weight, height, age, or grade level.
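A norm-referenced comparison can be illustrated with a percentile rank. The sketch below is an assumption-laden example: the norm scores are invented, `percentile_rank` is a hypothetical helper, and it uses one common definition (the percentage of norm-group scores that fall below the individual's score).

```python
def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores that fall below the given score."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


# Hypothetical push-up norms for one age and gender group.
norm_group = [12, 15, 18, 20, 22, 25, 27, 30, 33, 38]

individual_score = 27
rank = percentile_rank(individual_score, norm_group)
print(f"Score {individual_score} is at percentile rank {rank:.0f}")
```

The result describes the individual only relative to the norm group, which is why the sample size, population, and date of the norms matter.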
Criterion Validity Evidence
*Indicated by how well test scores correlate with a specific criterion (successful performance). *May be subdivided into predictive validity (future performance) and concurrent validity (current performance). *SAT and ACT: predictors of success in college; the criterion measure is success in college. Three criterion measures used most often: 1. Expert ratings 2. Tournament play 3. Previously validated test
Reliability
*Refers to the consistency of a test. *A reliable test should obtain approximately the same results each time it is administered. *Individuals may not obtain the same score on the second administration of a test (fatigue, motivation, environmental conditions, and measurement error may affect scores), but the order of the scores will be approximately the same if the test is reliable. *To have a high degree of validity, a test must have a high degree of reliability. *Objective (based on facts) measures have higher reliability than subjective (personal perspectives, feelings, or opinions) measures.
Factors Affecting Validity
1. The characteristics of the individuals being tested - Test is valid only for individuals of gender, age, and experience similar to those on whom the test was validated. 2. The criterion measure (variable that has been defined as indicating successful performance of a trait) selected - Different measures correlated with the same set of scores will produce different correlation coefficients (expert ratings, tournament play, previously validated tests). 3. Reliability - Test must be reliable to be valid. 4. Administrative procedures - Validity will be affected if unclear directions are given, or if all individuals do not perform the test the same way.
Norm-referenced factors
1. The sample size used to determine the norms (more confidence in a large sample) 2. The population used to determine the norms (age and experience) 3. The date the norms were established
Criterion-referenced examples
1. To meet the good health standard, a twelve-year-old male should have a body fat percentage no greater than 25 percent. 2. For successful completion of a running fitness program, the individual must be able to run 2 miles in 14 minutes or less.
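The two standards above can be expressed as simple pass/fail checks, which shows how a criterion-referenced score is judged against a fixed cutoff and not against other people. The function names are hypothetical; the cutoffs (25 percent body fat, 14 minutes for 2 miles) are the ones stated in the examples.

```python
def meets_body_fat_standard(body_fat_percent, max_percent=25.0):
    """True if body fat is no greater than the standard's maximum."""
    return body_fat_percent <= max_percent


def completes_running_program(two_mile_minutes, max_minutes=14.0):
    """True if the 2-mile run time is at or under the required time."""
    return two_mile_minutes <= max_minutes


# Hypothetical individuals judged only against the fixed standards.
print(meets_body_fat_standard(22.5))    # within the 25 percent limit
print(completes_running_program(14.5))  # slower than the 14-minute limit
```

No norm group appears anywhere in the check: each individual either meets the predetermined standard or does not.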
Administrative Feasibility
Administrative considerations may determine which test you use: 1. Cost 2. Time 3. Ease of administration 4. Scoring 5. Norms. A good sports skills test will be similar to game performance.