Test Development
Grade-based
if performance is a function of grade
Stanine
if raw scores are transformed into scores that range from 1 to 9
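The 1-to-9 conversion above can be sketched in Python. The band cut points (roughly 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent of the distribution) reflect the usual stanine percentile bands; the function name and mapping via percentile rank are illustrative assumptions, not part of the source.

```python
import bisect

# Cumulative percentile cut points for stanines 1 through 8
# (stanine 9 is everything above the last cut). These follow the
# conventional 4-7-12-17-20-17-12-7-4 percent bands.
CUTS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine(percentile_rank):
    """Map a percentile rank (0-100) to a stanine from 1 to 9."""
    return bisect.bisect_right(CUTS, percentile_rank) + 1

# A score at the 50th percentile falls in the middle band:
print(stanine(50))  # 5
```

In practice raw scores are first converted to percentile ranks (or a normalized z-scale) before banding; this sketch assumes that step has already been done.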
Age-based
if test performance is a function of age
Constructing Relevant Test Items
Items can be classified as either selection-type or supply-type.
Scale Values
are assigned to different amounts of the trait, attribute, or characteristic being measured
Item difficulty
calculate the proportion of the total number of test-takers who answered the item correctly
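The difficulty index described above is just a proportion. A minimal sketch, assuming items are scored 1 (correct) or 0 (incorrect); the function name is illustrative:

```python
def item_difficulty(responses):
    """Return p, the proportion of test-takers who answered the item correctly.

    responses: list of 0/1 scored answers for a single item.
    """
    return sum(responses) / len(responses)

# 8 of 10 examinees answered correctly, so p = 0.8
print(item_difficulty([1, 1, 1, 0, 1, 1, 0, 1, 1, 1]))  # 0.8
```

Note that a higher p means an easier item, so the statistic is sometimes called an item-easiness index.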
Item fairness
degree to which a test item is biased; a biased test item is an item that favors one particular group of examinees in relation to another when differences in group ability are controlled
Interval
differences between any two adjacent points on the scale represent the same number of scale units; has no absolute zero point
Item discrimination
indicate how adequately an item separates or discriminates between high scorers and low scorers on an entire test
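One common discrimination index, d, compares the proportion answering an item correctly in the highest- and lowest-scoring groups on the whole test (often the upper and lower 27% of examinees). A minimal sketch under that extreme-groups approach; the function and variable names are illustrative:

```python
def discrimination_index(upper_group, lower_group):
    """d = proportion correct among high scorers minus proportion
    correct among low scorers, for one item (scored 0/1)."""
    p_upper = sum(upper_group) / len(upper_group)
    p_lower = sum(lower_group) / len(lower_group)
    return p_upper - p_lower

# High scorers mostly got the item right; low scorers mostly missed it:
print(discrimination_index([1, 1, 1, 1, 0], [0, 1, 0, 0, 0]))  # ~0.6
```

A d near +1 means the item separates high from low scorers well; a d near 0 (or negative) flags an item for revision or discard.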
Speed tests
the closer an item is to the end of the test, the more difficult it may appear to be; this is because test takers may not get to items near the end of the test before time runs out
Guessing
the issues surrounding guessing are complex; on selection-type items, a test developer must decide whether, and how, scores should be corrected for the effects of chance success
Pilot Work
Also called pilot study or pilot research. Refers to the preliminary research surrounding the creation of a prototype of the test. The test developer attempts to determine how best to measure a targeted construct. May include literature reviews, experimentation, creation, revision and deletion of preliminary test items.
Item Analysis
Data from the tryout will be collected and test-takers' performance on the test as a whole and on each item will be analyzed. Statistical procedures are employed to assist in making judgments about which items are good as they are, which items need to be revised, and which items should be discarded
Scaling
Defined as the process of setting rules for assigning numbers in measurement. Process by which a measuring device is designed and calibrated and by which numbers (or other indices) are assigned to different amounts of the trait, attribute, or characteristic being measured
General Guidelines in Item Writing
1. Select the type of test item that measures the intended learning outcome most directly. 2. Write the test item so that the performance it elicits matches the performance in the learning task. 3. Write the test item so that the task is clear and definite. 4. Write the test item so that it is free from nonfunctional material. 5. Write the test item so that irrelevant factors do not prevent an informed student from responding correctly. 6. Write the test item so that irrelevant clues do not enable the uninformed student to respond correctly. 7. Write the test item so that the difficulty level matches the intent of the learning outcome, the age group to be tested, and the use to be made of the results. 8. Write the test item so that there is no disagreement concerning the answer. 9. Write test items far enough in advance that they can later be reviewed and modified as needed. 10. Write more test items than called for by the test plan.
Test Construction
Stage that entails writing test items, as well as formatting items, setting scoring rules, and designing and building a test
Test Conceptualization
Takes place once the idea for a test is conceived
Use of Item Response Theory (IRT) in Building and Revising Tests
Item response theory is a probabilistic model that attempts to explain the response of a person to an item
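A minimal sketch of that probabilistic idea, using the one-parameter logistic (Rasch) model; the source does not name a specific IRT model, so this choice, and the function name, are illustrative:

```python
import math

def rasch_probability(theta, b):
    """P(correct response) under the one-parameter (Rasch) IRT model.

    theta: person ability, b: item difficulty (both on the same logit scale).
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When ability exactly equals item difficulty, the probability
# of a correct response is .50:
print(rasch_probability(0.0, 0.0))  # 0.5
```

Richer models (2PL, 3PL) add item discrimination and guessing parameters, but the core idea is the same: the model predicts the probability of a response from person and item parameters.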
Ratio
meaningful zero point
Supply-type items
1. Completion 2. Essay (restricted response) 3. Essay (extended response)
Selection-type items
1. Multiple choice 2. True-false 3. Matching 4. Classification
5 Stages of Test Development
1. Test conceptualization 2. Test construction 3. Test tryout 4. Item analysis 5. Test revision
Due for revision when
1. The stimulus materials look dated and current test-takers cannot relate to them. 2. The verbal content of the test, including the administration instructions and the test items, contains dated vocabulary that is not readily understood by current test-takers. 3. Certain words or expressions in the test items or directions may be perceived as inappropriate or even offensive to a particular group. 4. Test norms are no longer adequate due to changes in group membership. 5. Reliability or validity of the test can be significantly improved by a revision. 6. The theory on which the test was originally based has been improved significantly, and these changes should be reflected in the design and content of the test.
Test Tryout
Happens once a preliminary form of the test has been developed. The test is administered to a representative sample of test-takers under conditions that simulate the conditions under which the final version of the test will be administered. The test should be tried out on people who are similar in critical respects to the people for whom the test was designed. One issue is the number of people on whom the test should be tried out. Rule of thumb: no fewer than 5 and preferably as many as 10 for each item. The more subjects in the tryout, the better.
Test Revision As a Stage in New Test Development
Having conceptualized the new test, constructed it, tried it out, and item-analyzed it, what remains is to act judiciously on all the information and mold the test into its final form
Types of Scales
Nominal, Ordinal, Interval, Ratio, Age-based, Grade-based, Stanine, Unidimensional or Multidimensional (e.g., height vs. achievement), Comparative or Categorical
Test Revision
Refers to action taken to modify a test's content or format for the purpose of improving the test's effectiveness as a tool of measurement. Usually based on item analyses, as well as related information derived from the test tryout
Selection-type Items
Require students to select from a predetermined list of potential answers. These questions include multiple-choice, true/false, matching, and classification questions. They are often viewed as less challenging in terms of the thinking skills required to answer them. However, when well written, they can measure higher levels of thinking, not simply the recall of facts. Writing these test items can be challenging, and they frequently take more time to construct. When well written, though, they are easier to score and can provide a more objective method of assessment than supply-type items.
Test Revision in the Life Cycle of an Existing Test
Tests should be revised "when significant changes in the domain represented, or new conditions of test use and interpretation, make the test inappropriate for its intended test use"
Supply-type Items
These questions measure the student's ability to communicate effectively, not just their understanding of content. They include completion items and essays (restricted and extended response). Most teachers have to balance several considerations when choosing what to include in a test, including the fair assessment of knowledge or skills, the number of students taking the test, and the amount of time available to score the test. This type of question is often easier to write but can require more time to score. Scoring is often less reliable because it is more subjective.
Test Development
Umbrella term for all that goes into the process of creating a test (Cohen, Swerdlik, & Sturman, 2013). The impetus may be an emerging social phenomenon or pattern of behavior, or a need to assess mastery in an emerging occupation or profession.
Writing Items
What range of content should the items cover? Which of the many different types of items should be employed? How many items should be written in total and for each content area covered?
Item validity
provides an indication of the degree to which the test is measuring what it purports to measure
Item reliability
provides an indication of the internal consistency of a test; the higher this index, the greater the test's internal consistency. Equal to the product of the item score standard deviation and the correlation between the item score and the total test score
Nominal
purpose of naming
Ordinal
ranking