Test Norms
2 special issues in evaluating the norm group
1. the effects of non-participation (who declined to participate and how might this affect the norms?) 2. the effects of motivation (did participants exert full ability/effort?)
2 strengths of standard scores
1. avoid the problem of inequality of units because they are linear transformations of z-scores 2. are amenable to statistical calculations
4 purposes of a psychological report
1. provides accurate assessment-related information to the referral source and other concerned parties 2. serves as a basis for clinical hypotheses, interventions, and research 3. provides meaningful baseline information for evaluating an individual's progress after intervention 4. serves as a legal document
equation for standard score
SS = (SDs/SDr)(X - Mr) + Ms, where: SS = standard score; SDs = standard deviation in the standard score system; SDr = standard deviation in the raw score system; X = raw score; Mr = mean in the raw score system; Ms = mean in the standard score system
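A minimal Python sketch of the linear conversion above. The function name, parameter names, and the norm-group values (raw mean 40, raw SD 8) are hypothetical and chosen only for illustration.

```python
def standard_score(raw, raw_mean, raw_sd, new_mean, new_sd):
    """Linear conversion: SS = (SDs / SDr) * (X - Mr) + Ms."""
    if raw_sd <= 0:
        raise ValueError("raw_sd must be positive")
    return (new_sd / raw_sd) * (raw - raw_mean) + new_mean


# Hypothetical norm group: raw mean 40, raw SD 8.
# Target system: mean 50, SD 10.
t_score = standard_score(52, raw_mean=40, raw_sd=8, new_mean=50, new_sd=10)
print(t_score)  # 65.0
```

A raw score 1.5 raw-score SDs above the raw mean lands 1.5 new-system SDs above the new mean.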
equation of standard score using z-score
SS=z(SDs)+Ms
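The z-score form and the raw-score form of the equation agree once z is written out. A short derivation, using the same symbols as the cards (X, Mr, SDr, Ms, SDs):

```latex
\begin{align*}
z  &= \frac{X - M_r}{SD_r} \\
SS &= z \cdot SD_s + M_s = \frac{SD_s}{SD_r}\,(X - M_r) + M_s
\end{align*}
```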
percentiles
reflects the percentage of cases in the norm group falling below a certain score; range = 1 to 99; median = 50
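A minimal sketch of that definition in Python. The function name and the norm-group scores are hypothetical. The sketch counts only cases strictly below the score, so it does not handle ties or restrict results to the 1 to 99 range the way a published norm table might.

```python
def percentile_rank(score, norm_scores):
    """Percentage of norm-group cases falling strictly below `score`."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


norm_group = [12, 15, 15, 18, 20, 21, 23, 25, 27, 30]  # hypothetical raw scores
print(percentile_rank(21, norm_group))  # 50.0
```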
3 characteristics of standard scores
1. mean is set at a standard value 2. SD is set at a standard value 3. the value of any score expresses its location relative to the normal distribution
accuracy of Barnum effect descriptors
they are accurate, yet they are not discriminating
age equivalents
norms created based on age such as mental age (determined by finding the median for examinees at successive age levels)
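A rough Python sketch of the median-per-age idea. The scores and the `age_equivalent` helper are hypothetical, and picking the nearest median is a simplifying assumption; real norm tables may interpolate between age levels.

```python
from statistics import median

# Hypothetical raw scores collected at successive age levels.
scores_by_age = {
    6: [10, 12, 14],
    7: [15, 17, 19],
    8: [20, 22, 24],
}


def age_equivalent(raw, scores_by_age):
    """Return the age level whose median raw score is closest to `raw`."""
    medians = {age: median(scores) for age, scores in scores_by_age.items()}
    return min(medians, key=lambda age: abs(medians[age] - raw))


print(age_equivalent(18, scores_by_age))  # 7
```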
3 major types of norms
1. percentiles 2. standard scores 3. developmental norms
strengths of percentiles
easy to explain and easy to calculate and thus widely used
2 other developmental norms
1. anthropometric measurements (e.g. height and weight; **many are essentially age equivalents) 2. stage theories of human development (tests based on these theories may place an individual at a certain stage; e.g. Piaget's theory of development)
3 weaknesses of percentiles
1. confusion with percentage-right score (a score of 50 is not failing, it is average) 2. represents an ordinal scale of measurement (no information of absolute differences between people) 3. inequality of units throughout the scale (percentiles are bunched up in the middle and spread out in the ends)
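The inequality of units can be illustrated with a short Python sketch, assuming the trait is normally distributed. It compares the z-score distance covered by 10 percentile points in the middle of the scale with the distance covered by 10 points near the top. The `z_gap` helper is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1


def z_gap(lower_pct, upper_pct):
    """Distance in z-score units between two percentiles."""
    return std_normal.inv_cdf(upper_pct / 100) - std_normal.inv_cdf(lower_pct / 100)


middle_gap = z_gap(50, 60)  # 10 percentile points near the median
tail_gap = z_gap(89, 99)    # 10 percentile points near the top
print(round(middle_gap, 2), round(tail_gap, 2))
```

Under this assumption the same 10-point percentile difference covers about a quarter of an SD near the median but more than a full SD near the top.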
2 weaknesses of developmental norms
1. limited to "growth" traits (traits must have a developmental pattern) 2. uncontrolled standard deviations (SD changes from level to level and test to test)
norm groups
groups that arise from norming (standardization) programs and provide the basis for norms
standard scores used in personality tests
T-scores with a mean of 50 and a standard deviation of 10
normed score
the individual's raw score is compared with scores of individuals in the normative (standardized) group
2 strengths of developmental norms
1. natural interpretations (e.g. "he's reading at 8th grade level") 2. useful for measuring growth (e.g. comparing an individual's achievement level in 1st, then 4th, then 7th grade)
applicability of criterion referencing
a test must have a well-defined body of content for it to be criterion referenced
linear vs. non-linear standard scores
most standard scores are linear transformations of raw scores whereas some standard scores are derived from non-linear transformations when the raw scores are not normally distributed
narrative reports vs. Barnum descriptors
narrative reports should include information from test results that uniquely describes the individual NOT statements that could apply to anyone
real world example of Barnum effect
astrological signs and horoscope interpreting
national norm groups
groups that are representative of the segment of the population for whom the test is intended (e.g. people applying to college)
raw score
immediate result of an individual's responses to a test
weaknesses of standard scores
difficult to explain and one must know the mean and standard deviation
area transformation
results in a normalized standard score when doing a non-linear transformation
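One way to sketch an area transformation in Python: convert each raw score to the proportion of cases below it, look up the z-score with that area under the normal curve, then rescale to T-scores. The function name and data are hypothetical, and the midpoint convention for ties is an assumption made here so the proportion never reaches 0 or 1.

```python
from statistics import NormalDist


def normalized_t_scores(raw_scores):
    """Area transformation: raw score -> proportion below -> normal z -> T-score."""
    n = len(raw_scores)
    std_normal = NormalDist()
    result = {}
    for x in sorted(set(raw_scores)):
        below = sum(1 for s in raw_scores if s < x)
        equal = sum(1 for s in raw_scores if s == x)
        # Assumption: midpoint convention (half of the tied cases count as below),
        # which keeps the proportion strictly between 0 and 1.
        proportion = (below + 0.5 * equal) / n
        z = std_normal.inv_cdf(proportion)
        result[x] = 50 + 10 * z  # T-score system: M=50, SD=10
    return result


skewed = [1, 1, 1, 2, 2, 3, 4, 6, 9, 15]  # hypothetical, positively skewed
for raw, t in normalized_t_scores(skewed).items():
    print(raw, round(t, 1))
```

Because the mapping goes through areas rather than raw distances, the resulting scores follow the normal curve even though the raw scores do not, which is what makes the transformation non-linear.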
convenience norm groups
groups based on one or more groups that are conveniently available for testing
user norms
convenience norms based on whatever groups actually took the test (generally within some specified time)
stability of norm groups
determined by the size of the group (larger groups are more stable)
subgroup norms
groups based on the common traits within a larger norm group (e.g. sex or race); **useful when there are substantial differences between the subgroups on the variable measured by the test
developmental norms
applicable when a trait develops systematically with time; raw score can be interpreted in terms of age or grade
representativeness of norm groups
determined by how well-defined a population is; if a norm group is not representative, one should get good information about the norm group; if a norm group is representative, one should compare it with a target group on key variables
criterion referenced vs. norm referenced
the framework for interpreting test performance can be related either to a criterion (like an "A") or to other individuals' scores
testing example of Barnum effect
"the battery of tests shows that Ned is not equally proficient in all areas. His teacher should capitalize on his strengths while motivating him to improve in his weaker areas."
key variables of interest
Age, sex, racial ethnic group, SES, ability level, educational level, geographic region, size of city
relationship of norms
most types of norms are systematically related to one another and interpreted in the context of the normal curve
purpose of norms
translate a raw score into a "normed" score that is based on other people's performance on the same test
narrative report
translates normed scores into words using clear and precise language to form a logical, meaningful, and relevant evaluation; **every report must be individualized and professional
grade equivalents
norms created based on grade, such as different tests for different grades in order to obtain the norm for each grade
4 examples of widely used versions of standard scores
T-scores on personality tests (M=50, SD=10), SAT scores (M=500, SD=100), Wechsler and SB IQ scores (M=100, SD=15), multilevel scaled scores like the Wechsler and SB subtests (M=10, SD=3)
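Since each of these systems is a mean and SD applied to a z-score, one z-score can be expressed in all four. A small Python sketch using the M and SD pairs listed above; the `SYSTEMS` table, its labels, and `convert_z` are hypothetical names, and the label "T-score" for the M=50, SD=10 system follows the personality-test card.

```python
# (mean, SD) pairs as listed in the card above.
SYSTEMS = {
    "T-score": (50, 10),
    "SAT": (500, 100),
    "IQ": (100, 15),
    "subtest scaled score": (10, 3),
}


def convert_z(z):
    """Express one z-score in each standard score system: SS = z * SD + M."""
    return {name: mean + z * sd for name, (mean, sd) in SYSTEMS.items()}


print(convert_z(1.0))
# {'T-score': 60.0, 'SAT': 600.0, 'IQ': 115.0, 'subtest scaled score': 13.0}
```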
types of norm groups
1. national 2. convenience 3. subgroup norms
usefulness of norm groups
provide a representative, stable, and meaningful framework for interpreting the test results
standard scores
conversion of a z-score into a new system with an arbitrarily chosen M and SD (generally these values are convenient such as M=50 and SD=10)
Barnum effect
people's tendency to believe high base-rate (frequent), non-obvious statements that are probably true about everyone but contain NO unique information arising from the specific test