Psychological Testing: Chapter 4

classical test theory

(CTT; also referred to as true score theory) A theory that assumes each testtaker has a true score on a test that would be obtained but for the action of measurement error.

Standard error of the mean

A measure of sampling error
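
As a rough illustration, the standard error of the mean is commonly estimated as the sample standard deviation divided by the square root of the sample size. The sketch below assumes that formula; the function name and the scores are hypothetical and not taken from the chapter.

```python
import math
import statistics


def standard_error_of_mean(scores):
    """Estimate the standard error of the mean as sample SD / sqrt(n)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate the SD")
    return statistics.stdev(scores) / math.sqrt(n)


# Hypothetical raw scores from a small sample of testtakers.
sample_scores = [2, 4, 4, 4, 5, 5, 7, 9]
se_mean = standard_error_of_mean(sample_scores)
print(round(se_mean, 3))
```

A larger sample shrinks this value, which is why means from large normative samples are treated as more stable.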

Subgroup norms

A normative sample can be segmented by any of the criteria initially used in selecting subjects for the sample. What results from such segmentation are more narrowly defined subgroup norms.

Standard error of the difference

A statistic used to estimate how large a difference between two scores should be before the difference is considered statistically significant
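
One common way to compute this statistic is to combine the standard errors of measurement of the two scores: the square root of the sum of their squares. The sketch below assumes that formula and a conventional z threshold of 1.96 (roughly the 95% level); the function names and numbers are hypothetical.

```python
import math


def standard_error_of_difference(sem_a, sem_b):
    """Combine two standard errors of measurement: sqrt(sem_a^2 + sem_b^2)."""
    return math.sqrt(sem_a ** 2 + sem_b ** 2)


def is_significant_difference(score_a, score_b, sed, z=1.96):
    """Return True if the score gap exceeds z standard errors of the difference.

    z=1.96 is an assumed conventional threshold, not a value from the chapter.
    """
    return abs(score_a - score_b) > z * sed


# Hypothetical example: two tests with SEMs of 3 and 4 points.
sed = standard_error_of_difference(3.0, 4.0)
print(sed, is_significant_difference(110, 98, sed))
```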

Standard error of measurement

A statistic used to estimate the extent to which an observed score deviates from a true score
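
A widely used formula for this statistic is the test's standard deviation multiplied by the square root of one minus the reliability coefficient. The sketch below assumes that formula; the function name and the values (SD of 15, reliability of .91) are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Compute SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical test with SD = 15 and a reliability coefficient of .91.
sem = standard_error_of_measurement(sd=15.0, reliability=0.91)
print(round(sem, 2))
```

Under this formula a perfectly reliable test (reliability of 1) has an SEM of zero, and the SEM grows toward the full standard deviation as reliability falls.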

Validity

A test is considered valid for a particular purpose if it does, in fact, measure what it purports to measure.

Age norms

Also known as age-equivalent scores, age norms indicate the average performance of different samples of testtakers who were at various ages at the time the test was administered.

Standard

As a noun, standard may be defined as that which others are compared to or evaluated against. As an adjective, standard often refers to what is usual, generally accepted, or commonly employed. The verb "to standardize" refers to making or transforming something into something that can serve as a basis of comparison or judgment.

Error variance

Because error is a variable that must be taken account of in any assessment, we often speak of error variance, that is, the component of a test score attributable to sources other than the trait or ability measured.

Grade norms

Designed to indicate the average test performance of testtakers in a given school grade, grade norms are developed by administering the test to representative samples of children over a range of consecutive grade levels.

Standard error of estimate

In regression, an estimate of the degree of error involved in predicting the value of one variable from another
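
For simple (one-predictor) regression, a common textbook form of this statistic is the standard deviation of the predicted variable multiplied by the square root of one minus the squared correlation. The sketch below assumes that form, without a small-sample correction; the function name and values are hypothetical.

```python
import math


def standard_error_of_estimate(sd_y, r):
    """Compute SD_y * sqrt(1 - r^2) for a one-predictor regression."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    return sd_y * math.sqrt(1.0 - r * r)


# Hypothetical criterion with SD = 10 predicted from a test with r = .60.
see = standard_error_of_estimate(sd_y=10.0, r=0.6)
print(round(see, 2))
```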

cumulative scoring

Inherent in cumulative scoring is the assumption that the more the testtaker responds in a particular direction as keyed by the test manual as correct or consistent with a particular trait, the higher that testtaker is presumed to be on the targeted ability or trait.
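
A minimal sketch of cumulative scoring, assuming one point per response that matches the keyed direction. The function name, the true/false item format, and the key are hypothetical; real tests may weight items differently.

```python
def cumulative_score(responses, key):
    """Count the responses that match the keyed answer, one point each."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return sum(1 for given, keyed in zip(responses, key) if given == keyed)


# Hypothetical four-item true/false scale.
answer_key = ["T", "T", "T", "F"]
testtaker_responses = ["T", "F", "T", "T"]
print(cumulative_score(testtaker_responses, answer_key))
```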

norm-referenced

One way to derive meaning from a test score is to evaluate the test score in relation to other scores on the same test. As we have pointed out, this approach to evaluation is referred to as norm-referenced.

local norms

Provide normative information with respect to the local population's performance on some test.

Assumption 2

Psychological Traits and States Can Be Quantified and Measured

Assumption 1

Psychological Traits and States Exist

Assumption 3

Test-Related Behaviour Predicts Non-Test-Related Behaviour

Assumption 7

Testing and Assessment Benefit Society

Assumption 6

Testing and Assessment Can Be Conducted in a Fair and Unbiased Manner

Assumption 4

Tests and Other Measurement Techniques Have Strengths and Weaknesses

Standardisation

The process of administering a test to a representative sample of testtakers for the purpose of establishing norms is referred to as standardisation or test standardisation.

Sampling

The process of selecting the portion of the universe deemed to be representative of the whole population is referred to as sampling. The test developer can obtain a distribution of test responses by administering the test to such a sample of the population.

Assumption 5

Various Sources of Error Are Part of the Assessment Process

norm-referenced testing and assessment

a method of evaluation and a way of deriving meaning from test scores by evaluating an individual testtaker's score and comparing it to scores of a group of testtakers. In this approach, the meaning of an individual test score is understood relative to other scores on the same test. A common goal of norm-referenced tests is to yield information on a testtaker's standing or ranking relative to some comparison group of testtakers.

criterion

a standard on which a judgment or decision may be based.

States

Like traits, states distinguish one person from another, but they are relatively less enduring (Chaplin et al., 1988).

percentile

an expression of the percentage of people whose score on a test or measure falls below a particular raw score.
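
Following the definition above, a percentile can be sketched as the share of scores in a distribution that fall strictly below a given raw score. The function name and the score list are hypothetical; other conventions (for example, counting half of the tied scores) also exist.

```python
def percentile_rank(raw_score, scores):
    """Return the percentage of scores that fall below raw_score."""
    if not scores:
        raise ValueError("scores must not be empty")
    below = sum(1 for s in scores if s < raw_score)
    return 100.0 * below / len(scores)


# Hypothetical distribution of raw scores from a normative sample.
norm_scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(percentile_rank(7, norm_scores))
```

Note that the result describes standing relative to other testtakers, not the proportion of items answered correctly.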

construct

an informed, scientific concept developed or constructed to describe or explain behaviour. We can't see, hear, or touch constructs, but we can infer their existence from overt behaviour.

national norms

are derived from a normative sample that was nationally representative of the population at the time the norming study was conducted.

to norm

as well as related terms such as norming, refer to the process of deriving norms. Norming may be modified to describe a particular type of norm derivation.

trait

has been defined as "any distinguishable, relatively enduring way in which one individual varies from another" (Guilford, 1959, p. 6). The trait term that an observer applies, as well as the strength or magnitude of the trait presumed to be present, is based on observing a sample of behaviour. Samples of behaviour may be obtained in a number of ways, ranging from direct observation to the analysis of self-report statements or pencil-and-paper test answers.

Criterion-referenced testing and assessment

may be defined as a method of evaluation and a way of deriving meaning from test scores by evaluating an individual's score with reference to a set standard.

domain sampling

may refer to either (1) a sample of behaviors from all possible behaviors that could conceivably be indicative of a particular construct or (2) a sample of test items from all possible items that could conceivably be used to measure a particular construct.

overt behaviour

overt behaviour refers to an observable action or the product of an observable action, including test- or assessment-related responses.

Percentage correct

refers to the distribution of raw scores. More specifically, it is the number of items answered correctly, multiplied by 100 and divided by the total number of items.
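
The arithmetic in this definition can be sketched directly. The function name and the item counts are hypothetical.

```python
def percentage_correct(items_correct, total_items):
    """Return items_correct * 100 / total_items."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    if not 0 <= items_correct <= total_items:
        raise ValueError("items_correct must be between 0 and total_items")
    return items_correct * 100 / total_items


# Hypothetical testtaker who answers 18 of 24 items correctly.
print(percentage_correct(18, 24))
```

Unlike a percentile, this value depends only on the testtaker's own raw score, not on how other testtakers performed.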

normative sample

that group of people whose performance on a particular test is analyzed for reference in evaluating the performance of individual testtakers. Whether broad or narrow in scope, members of the normative sample will all be typical with respect to some characteristic(s) of the people for whom the particular test was designed.

race norming

the controversial practice of norming on the basis of race or ethnic background.

Reliability

the criterion of reliability involves the consistency of the measuring tool: the precision with which the test measures and the extent to which error is present in measurements. In theory, the perfectly reliable measuring tool consistently measures in the same way.

error

traditionally refers to something that is more than expected; it is actually a component of the measurement process. More specifically, error refers to a long-standing assumption that factors other than what a test attempts to measure will influence performance on the test.

fixed reference group scoring system

A type of aid providing a context for interpretation. Here, the distribution of scores obtained on the test from one group of testtakers, referred to as the fixed reference group, is used as the basis for the calculation of test scores for future administrations of the test. Perhaps the test most familiar to college students that exemplifies the use of a fixed reference group scoring system is the SAT.

Norm

used in the scholarly literature to refer to behavior that is usual, average, normal, standard, expected, or typical. In a psychometric context, norms are the test performance data of a particular group of testtakers that are designed for use as a reference when evaluating or interpreting individual test scores.

Criteria for a good test

would include clear instructions for administration, scoring, and interpretation. It would also seem to be a plus if a test offered economy in the time and money it took to administer, score, and interpret it. Most of all, a good test would seem to be one that measures what it purports to measure. Beyond simple logic, there are technical criteria that assessment professionals use to evaluate the quality of tests and other measurement procedures. Test users often speak of the psychometric soundness of tests, two key aspects of which are reliability and validity.

