chapter 5: what is a good test?

objectivity

-a test has high objectivity when two or more persons can administer the same test to the same group and obtain approximately the same results
-is a form of reliability and can be determined by the test-retest method

split-half method

-a test is split into halves and the scores of the two halves are correlated
-the reliability calculated by this method is for a test only half the length of the original test. because reliability usually increases as a test increases in length, the reliability of the full-length test must be estimated; the Spearman-Brown formula is often used for this purpose (see the formula sketched below)
-may produce an inflated correlation coefficient
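A sketch of the Spearman-Brown step, using the standard two-half form of the formula (the .60 half-test correlation in the example is hypothetical, not from this set): if r_half is the correlation between the two halves, the estimated reliability of the full-length test is

$$ r_{\text{full}} = \frac{2\,r_{\text{half}}}{1 + r_{\text{half}}} $$

For example, a half-test correlation of .60 gives an estimated full-test reliability of (2 × .60) / (1 + .60) = .75.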

reliability of criterion-referenced tests

-defined as consistency of classification, or how consistently the test classifies individuals as masters or nonmasters
-applies to a single cluster of items
-the proportion of agreement is determined by how many group members are classified as masters on both test days and how many are classified as nonmasters on both test days (a worked example follows below)
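A minimal worked example of the proportion of agreement, with hypothetical numbers: if A is the number classified as masters on both days, D the number classified as nonmasters on both days, and N the total number tested, then

$$ P = \frac{A + D}{N} $$

So if 30 students are tested on two days, with 18 classified as masters both days and 7 as nonmasters both days, P = (18 + 7) / 30 = .83.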

kuder-richardson formula 21

-estimates the average correlation that might be obtained if all possible split-half combinations of a group of items were correlated
-rests on two assumptions: 1) the test items can be scored 1 for correct and 0 for incorrect, and 2) the total score is the sum of the item scores (the usual form of the formula is given below)
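For reference, the standard statement of KR-21 (not quoted from this set), where K is the number of items, X-bar the mean of the total scores, and s² the variance of the total scores:

$$ r_{\text{KR21}} = \frac{K}{K - 1}\left(1 - \frac{\bar{X}\,(K - \bar{X})}{K s^{2}}\right) $$

For example, a 20-item test with a mean of 14 and a variance of 9 gives r = (20/19)(1 - (14 × 6)/(20 × 9)) ≈ .56.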

criterion validity

-indicated by how well test scores correlate with a specific criterion (successful performance)
-can be subdivided into two types: predictive validity and concurrent validity

concurrent validity

-indicates how well a test measures the current level of skill or knowledge of an individual
-the test and the criterion measurement are administered at approximately the same time

content validity

-is related to how well a test measures all the skills and subject matter that have been presented to the test taker
-for example, a test must be reduced to a realistic number of items and still represent the total content of a longer test
-asks: "does the test measure what the test takers have been taught?"

criterion measure

-the choice of the criterion measure usually comes from:
1. expert ratings
2. tournament play
3. a previously validated test (VO2 max, skinfolds, etc.)

criterion-referenced

-is used when individuals are expected to perform at a specific level of achievement
-one individual's performance is not compared with the performance of others
-example: the student must be able to run 2 miles in 14 minutes

norm-referenced measurement

-is used when you wish to interpret each individual's performance on a test in comparison with other individuals' performances
-examples: percentiles, z-scores, and T-scores (the standard-score formulas are sketched below)
-norms for tests in physical education are usually reported by gender, weight, height, age, or grade level
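A quick sketch of the two standard-score formulas mentioned above (standard definitions; the raw score, mean, and standard deviation in the example are hypothetical):

$$ z = \frac{X - \bar{X}}{s}, \qquad T = 50 + 10z $$

For example, a raw score of 62 in a group with a mean of 50 and a standard deviation of 8 gives z = 1.5 and T = 65; percentiles are then read from the group's norm table.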

reliability

-refers to the consistency of a test
-a test should obtain approximately the same results regardless of the number of times it is given
-for a test to have a high degree of validity, it must have a high degree of reliability; however, high reliability does not necessarily mean high validity

construct validity

-refers to the degree to which the individual possesses a trait (construct) presumed to be reflected in the test performance
-construct validity evidence can be demonstrated by comparing higher-skilled individuals with lesser-skilled individuals

parallel forms method

-requires the administration of parallel or equivalent forms of a test to the same group and calculation of the correlation coefficient between the two sets of scores
-a problem associated with this method is the difficulty of constructing two tests that are parallel in content and item characteristics

test-retest method

-requires two administrations of the same test to the same group of individuals, with the calculation of the correlation coefficient between the two sets of scores
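A minimal computational sketch of the test-retest calculation in Python; the score lists and the helper pearson_r are hypothetical, and the same correlation step applies to the parallel forms method above.

```python
# Test-retest reliability: correlate two administrations of the same test
# given to the same group (scores below are hypothetical).

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5

day_1 = [14, 18, 22, 25, 30, 31, 35]   # first administration
day_2 = [15, 17, 24, 24, 29, 33, 36]   # second administration, same group

print(f"test-retest reliability: {pearson_r(day_1, day_2):.2f}")
```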

validity

-the most important criterion to consider when evaluating a test
-refers to the degree to which a test actually measures what it claims to measure

criterion-referenced limitations

-when a pass/fail standard is used, it does not show how good or how poor an individual's level of ability is

predictive validity

-is used when you wish to estimate future performances
-is obtained by giving a predictor test and correlating it with the criterion measure, which is obtained at a later time (as sketched below)
-example: college entrance tests are used as predictive tests, with the criterion measure being success in college
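A minimal sketch of gathering predictive validity evidence, with hypothetical entrance-test scores as the predictor and later first-year GPA as the criterion; statistics.correlation requires Python 3.10 or newer.

```python
from statistics import correlation  # Pearson r; available in Python 3.10+

# Hypothetical data: the predictor test is given now, the criterion later.
entrance_scores = [480, 520, 550, 600, 640, 700]   # predictor test
first_year_gpa = [2.4, 2.6, 3.0, 2.9, 3.4, 3.7]    # criterion, obtained later

# Predictive validity coefficient: correlation of predictor with criterion.
print(f"predictive validity: {correlation(entrance_scores, first_year_gpa):.2f}")
```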

objectivity factors

1) complete and clear instructions for administration and scoring of the test
2) trained testers administer the test
3) simple measurement procedures are followed
4) appropriate mechanical tools are used for measurement
5) results are expressed as numerical scores

administrative feasibility (5)

1) cost
2) time
3) ease of administration
4) scoring: will the services of another individual affect the objectivity of scoring?
5) norms: are norms available to compare your test takers with others?

factors that affect reliability

1) method of scoring: the more objective the test, the higher the reliability
2) the heterogeneity of the group
3) length of the test: the longer the test, the greater the reliability
4) administrative procedures

factors to consider when selecting a test with norms

1. sample size
2. the population used to determine the norms - for example, if a basketball skills test has norms for tenth-grade students, only tenth-grade students should be used to develop the norms; varsity basketball players and students in other grades should not be used
3. the time the norms were established - norms should be updated

factors that affect validity

1. the characteristics of the test takers - a test is valid only for individuals of age, gender, and experience similar to those on whom the test was validated
2. the criterion measure selected
3. reliability
4. administrative procedures

reliability of norm-referenced tests

test-retest, parallel forms, split-half, and Kuder-Richardson formula 21

validity of a norm-referenced test is accepted if more than one type of the following validity evidence is strong

content validity, criterion validity, and construct validity

domain-referenced validity and decision validity evidence

domain-referenced validity evidence: if the items, or tasks, on a test represent the criterion behavior, the test has logical validity evidence. decision validity evidence: used when a test's purpose is to classify individuals as proficient (masters) or non-proficient (non-masters) in the criterion behavior.

criterion-referenced validity is related directly to

predetermined behavioral objectives.

