Assessment Chapter 4 (Of tests and testing)
Criterion-referenced testing and assessment
A method of evaluation and a way of deriving meaning from test scores by evaluating an individual's score with reference to a set standard. Also called domain-referenced or content-referenced testing and assessment.
Subgroup norms
Norms for any defined group within a larger group. A normative sample can be segmented by any of the criteria initially used in selecting subjects for the sample; subgroup norms are created when such narrowly defined groups are sampled (e.g., by socioeconomic status, education level, or age).
Criterion
A standard on which judgment or decision may be based.
Percentile
An expression of the percentage of people whose score on a test or measure falls below a particular raw score. For example, if a student scores in the 85th percentile on a standardized test, then 85% of those taking the test had lower scores.
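The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a published scoring procedure: the function name percentile_rank and the list of raw scores are made up for the example, and "falls below" is read as strictly below.

```python
def percentile_rank(scores, raw_score):
    """Percentage of scores in `scores` that fall strictly below `raw_score`."""
    if not scores:
        raise ValueError("scores must not be empty")
    below = sum(1 for s in scores if s < raw_score)
    return 100.0 * below / len(scores)


# Hypothetical raw scores for a group of 20 testtakers (illustration only).
group = [52, 55, 58, 60, 61, 63, 64, 66, 67, 70,
         71, 72, 74, 75, 77, 79, 80, 85, 88, 93]

# 17 of the 20 scores are below 85, so a raw score of 85 sits at the 85th percentile.
print(percentile_rank(group, 85))  # 85.0
```

Note that the raw score (85) and the percentile (85th) coincide here only because of how the sample data were chosen; in general they are different numbers.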
National norms
Norms derived from a standardization sample that was nationally representative of the population. For example, a national norm may be measured for all students in the United States who take the SATs in the Fall of their senior year. SAT officials will add all the scores for all the thousands of students who took the exam, and divide that large number by the total number of students who took the exam. The number that results from calculating the average is the SAT national norm for the Fall of that year.
Anchoring
Permits conversion of raw scores on the new version of the test into fixed reference group scores.
Local norms
Provide normative information with respect to the local population's performance on some test. Normative information about some limited population, frequently of specific interest to the test user. Local norms are derived from the local population's performance on a measure and are typically created locally (e.g., by a guidance counselor or personnel director).
Norming
Refers to the process of deriving norms. The term may be modified to describe a particular type of norm derivation.
Psychometric soundness of tests
Reliable, valid.
Cumulative scoring
The assumption that the more the testtaker responds in a particular direction as keyed by the test manual as correct or consistent with a particular trait, the higher the testtaker is presumed to be on the targeted ability or trait.
Reliability
The consistency of the measuring tool: the precision with which the test measures and the extent to which error is present in measurement.
Norm-referenced testing and assessment
A method of evaluation and a way of deriving meaning from test scores by evaluating an individual testtaker's score and comparing it to scores of a group of testtakers.
Incidental sample (convenience sample)
A sample that is convenient or available for use.
Error variance
The component of a test score attributable to sources other than the trait or ability measured.
Assumptions about psychological testing and assessment
1. Psychological traits and states exist. 2. Psychological traits and states can be quantified and measured. 3. Test-related behavior predicts non-test-related behavior. 4. Tests and other measurement techniques have strengths and weaknesses. 5. Various sources of error are part of the assessment process. 6. Testing and assessment can be conducted in a fair and unbiased manner. 7. Testing and assessment benefit society.
Construct
An informed, scientific concept developed or constructed to describe or explain behavior.
Overt behavior
An observable action or the product of an observable action, including test- or assessment-related responses.
Trait
Any distinguishable, relatively enduring way in which one individual varies from another.
Grade norms
Norms specifically designed as a reference in the context of the grade of the testtaker who achieved a particular score. They are developed by administering the test to representative samples of children over a range of consecutive grade levels. For example, a high school may administer a standardized test to all senior students in the fall of this year. When the results are in, half of the students scored below 65 percent and half scored above 65 percent. The median, or grade norm, for seniors at that high school in the fall of this year is therefore 65 percent.
States
Distinguish one person from another but are relatively less enduring.
Norm-referenced
Evaluating a test score in relation to other scores on the same test.
Validity
The extent to which a test does, in fact, measure what it purports to measure.
Stratified-random sampling
Stratified sampling in which the selection is random, that is, every member of the population has the same chance of being included in the sample.
Purposive sample
A sample that is arbitrarily selected because it is believed to be representative of the population.
Assumption 6: Testing and assessment can be conducted in a fair and unbiased manner
More controversial than the other 6 assumptions. One source of fairness-related problems is the test user who attempts to use a particular test with people whose background and experience are different from the background and experience of people for whom the test was intended.
National anchor norms
Provide some stability to test scores by anchoring them to other test scores. An equivalency table for scores on two nationally standardized tests designed to measure the same thing. Used to consider two tests that were normed by using the same sample (i.e., each member of the sample took both tests).
Standardized test
Tests that have clearly specified procedures for administration, scoring, and interpretation in addition to norms.
Normative sample
That group of people whose performance on a particular test is analyzed for reference in evaluating the performance of individual testtakers.
Standard
That which others are compared to or evaluated against.
Age-equivalent scores (age norms)
The average performance of different samples of testtakers who were at various ages at the time the test was administered.
Race norming
The controversial practice of norming on the basis of race or ethnic background.
Percentage correct
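Refers to the distribution of raw scores: specifically, the number of items answered correctly, multiplied by 100 and divided by the total number of items. Not to be confused with a percentile, which refers to a percentage of testtakers.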
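As a quick worked example of the arithmetic, here is a minimal Python sketch. The function name percentage_correct and the 34-out-of-40 figures are hypothetical, chosen only to illustrate the formula.

```python
def percentage_correct(num_correct, num_items):
    """Number of items answered correctly, times 100, divided by total items."""
    if num_items <= 0:
        raise ValueError("num_items must be positive")
    if not 0 <= num_correct <= num_items:
        raise ValueError("num_correct must be between 0 and num_items")
    return num_correct * 100 / num_items


# A hypothetical testtaker answers 34 of 40 items correctly.
print(percentage_correct(34, 40))  # 85.0
```

This value describes one testtaker's own raw performance; it says nothing about how many other testtakers scored lower.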
Fixed reference group scoring system
The distribution of scores obtained on the test from one group of testtakers, referred to as the fixed reference group, is used as the basis for the calculation of test scores for future administrations of the test (e.g., the SAT).
Standardization or test standardization
The process of administering a test to a representative sample of testtakers for the purpose of establishing norms.
Sampling
The process of selecting the portion of the universe deemed to be representative of the whole population.
Whether a trait manifests itself in observable behavior, and to what degree it manifests, is presumed to depend on
Not only the strength of the trait in the individual but also the nature of the situation.
Psychological trait
Covers a wide range of possible characteristics. Psychological traits relate to intelligence, specific intellectual abilities, cognitive styles, adjustment, interests, attitudes, sexual orientation and preferences, psychopathology, personality in general, and specific personality traits.
User norms (program norms)
Descriptive statistics based on a group of testtakers in a given period of time rather than norms obtained by formal sampling methods.
Samples of behavior may be obtained by:
Direct observation or pencil-to-paper answers.
Age norms
Norms specifically designed for use as a reference in the context of the age of the testtaker who achieved a particular score. In a separate, non-psychometric sense, the term is also commonly defined as social rules for age-appropriate behavior, including everyday actions and the timing and sequencing of major life events (e.g., marriage, parenthood).
Stratified sampling
Sampling that draws from each defined subgroup (stratum) of the population. Such sampling helps prevent sampling bias and ultimately aids in the interpretation of the findings. The most accurate way of developing a norm group. Common demographics to stratify by: age, gender, socioeconomic status, geographic region.
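A minimal sketch of proportionate stratified-random sampling in Python follows. Everything here is an assumption for illustration: the function name, the made-up population of 100 people split 60/40 by region, and the choice to draw the same fraction from every stratum. Real norming studies use more elaborate sampling plans.

```python
import random
from collections import defaultdict


def stratified_random_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from every stratum of the population."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)

    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        # Within a stratum, every member has the same chance of selection.
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 60 urban and 40 rural residents.
population = [{"id": i, "region": "urban" if i < 60 else "rural"}
              for i in range(100)]

sample = stratified_random_sample(population, lambda p: p["region"], 0.1)
urban = sum(1 for p in sample if p["region"] == "urban")
rural = sum(1 for p in sample if p["region"] == "rural")
print(len(sample), urban, rural)  # 10 6 4
```

The sample keeps the 60/40 regional split of the population, which is the point of stratifying: no stratum is over- or under-represented by chance.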
Classical test theory (CTT, true score theory)
The assumption is made that each testtaker has a true score on a test that would be obtained but for the action of measurement error.
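This idea is usually written as observed score = true score + error. The simulation below is only a sketch under stated assumptions: the true score of 100, the error standard deviation of 5, and the choice of normally distributed error are all hypothetical, picked to show that averaging many observed scores recovers something close to the true score.

```python
import random
import statistics


def simulate_observed_scores(true_score, error_sd, administrations, seed=0):
    """Observed score = true score + random measurement error (assumed normal)."""
    rng = random.Random(seed)
    return [true_score + rng.gauss(0, error_sd) for _ in range(administrations)]


# One hypothetical testtaker, tested many times with no learning or fatigue.
observed = simulate_observed_scores(true_score=100, error_sd=5,
                                    administrations=10_000)

mean_observed = statistics.mean(observed)        # close to the true score
error_variance = statistics.pvariance(observed)  # close to error_sd ** 2

print(round(mean_observed, 1), round(error_variance, 1))
```

Because the true score is held constant here, all of the variance in the observed scores is error variance.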
Equipercentile method
The equivalency of scores on different tests is calculated with reference to corresponding percentile scores.
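A simplified sketch of the idea in Python: find the percentile rank of a score on test A, then look up the score on test B with the closest percentile rank. The function names, the two score lists, and the nearest-match rule are assumptions made for illustration; operational equating typically involves interpolation and smoothing that this sketch omits.

```python
def percentile_rank(scores, raw_score):
    """Percentage of scores that fall strictly below raw_score."""
    below = sum(1 for s in scores if s < raw_score)
    return 100.0 * below / len(scores)


def equipercentile_equivalent(score_a, scores_a, scores_b):
    """Score on test B whose percentile rank is closest to that of score_a on test A."""
    target = percentile_rank(scores_a, score_a)
    candidates = sorted(set(scores_b))
    return min(candidates,
               key=lambda b: abs(percentile_rank(scores_b, b) - target))


# Hypothetical scores from the same ten people on two different tests.
scores_a = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
scores_b = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

# A 22 on test A is at the 60th percentile; so is a 70 on test B.
print(equipercentile_equivalent(22, scores_a, scores_b))  # 70
```

Repeating this lookup for every possible score on test A would produce an equivalency table between the two tests.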
Norms
The test performance data of a particular group of testtakers that are designed for use as a reference when evaluating or interpreting individual test scores.