Midterm AOL
when the components or factors of a test are hypothesized to have a positive correlation with one another.
Convergent Validity
CLASSIFICATION OF SELECTION TYPE TEST ITEMS OR OBJECTIVE TEST
1. Multiple Choice 2. Matching Type 3. True or False 4. Rearrangement 5. Analogy 6. Identification
TWO GENERAL TYPES OF TEST ITEM TO USE IN AN ACHIEVEMENT TEST USING PAPER AND PENCIL
1. Selection type items or Objective test 2. Supply type items or Subjective Type
TRUE-FALSE TEST ITEMS Three Forms
1. Simple - consists of only two choices 2. Complex - consists of more than two choices 3. Compound - two choices plus a conditional completion response
Different variations of True-False items.
1. T-F Correction or Modified True or False Question 2. Yes-No Variation 3. A-B Variation
Multiple choice items can provide
1. Versatility in measuring all levels of cognitive ability 2. Highly reliable test scores 3. Scoring efficiency and accuracy 4. Objective measurement of student achievement or ability 5. A wide sampling of content or objectives 6. A reduced guessing factor when compared to true-false items 7. Different response alternatives which can provide diagnostic feedback
made up of items consisting of pairs of words related to each other. It is designed to measure the ability of the examinee to observe the relationship of the first word to the second word
ANALOGY
· this variety is designed to identify errors in a word, phrase, or sentence in a paragraph
CONTAINED-OPTIONS VARIETY
describes the present status of the individual by correlating the sets of scores obtained FROM TWO MEASURES GIVEN CONCURRENTLY.
Concurrent Validity
It is the validity established by analyzing the activities and processes that correspond to a particular concept; is established statistically by comparing psychological traits or factors that theoretically influence scores in a test.
Construct Validity
it is done through a careful and critical examination of the objectives of assessment so that it reflects the curricular objectives.
Content Validity
- It is established statistically such that a set of scores revealed by the measuring instrument is CORRELATED with the scores obtained in another EXTERNAL PREDICTOR OR MEASURE.
Criterion-related Validity
It follows the program of studies. (e.g. Curriculum Guide, Syllabus)
Curricular Validity
The external environment may include room temperature, noise level, depth of instruction, exposure to materials, and quality of instruction, which could affect changes in the responses of examinees in a test.
External Environment
It is done by examining the physical appearance of the instrument.
Face Validity
this variety consists of a group of words or terms in which one does not belong to the group
GROUP-TERM VARIETY
it requires the examinees to identify what is being defined in the statement or sentence, and there are no options to choose from
IDENTIFICATION TYPE
• Subjective and objective types of tests are best in assessing low-level learning targets in terms of coverage and efficiency
IDENTIFY ASSESSMENT TOOLS (HOW TO ASSESS?)
Knowledge and simple understanding pertain to mastery of substantive subject matter and procedure.
IDENTIFY LEARNING OBJECTIVES / LEARNING OUTCOMES AND TOPIC (WHAT TO ASSESS?)
Answer the following question: -What are you assessing for? Is it for Placement? Feedback (Formative)? Diagnosis and Intervention? Or for Grading (Summative)?
IDENTIFY PURPOSE (ROLE) OF ASSESSMENT (WHY ASSESS?)
Interrater reliability (also called interobserver reliability) measures the degree of agreement between different people observing or assessing the same thing. You use it when data is collected by researchers assigning ratings, scores or categories to one or more variables. KEY WORDS: DIFFERENT PEOPLE, SAME TEST
INTER-RATER
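A minimal sketch of the idea behind inter-rater reliability, using simple percent agreement between two raters (Cohen's kappa, which corrects for chance agreement, is the more rigorous statistic; the ratings below are hypothetical):

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of cases where two raters gave the same rating."""
    assert len(rater_a) == len(rater_b)
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)

# Hypothetical ratings of 8 essays by two teachers (1 = pass, 0 = fail)
teacher_a = [1, 1, 0, 1, 0, 1, 1, 0]
teacher_b = [1, 0, 0, 1, 0, 1, 1, 1]
print(percent_agreement(teacher_a, teacher_b))  # 0.75
```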
Internal consistency assesses the correlation between multiple items in a test that are intended to measure the same construct. KEY WORDS: DIFFERENT QUESTIONS, SAME CONSTRUCT
INTERNAL CONSISTENCY
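Internal consistency is commonly estimated with Cronbach's alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A hedged sketch with hypothetical quiz data:

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """scores: one row per examinee, each row a list of item scores."""
    k = len(scores[0])                       # number of items
    columns = list(zip(*scores))             # item-wise score columns
    item_var = sum(pvariance(col) for col in columns)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical scores of 4 students on a 3-item quiz (1 = correct)
data = [[1, 1, 1], [1, 1, 0], [0, 1, 0], [0, 0, 0]]
print(round(cronbach_alpha(data), 2))  # 0.75
```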
Every participant possesses characteristics that affect their performance in a test.
Individual differences of participants
In general, a test composed of items of moderate or average difficulty (0.30 to 0.70) will have higher reliability than one composed primarily of very easy or very difficult items.
Item Difficulty
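The difficulty index above is simply the proportion of examinees who answered the item correctly; a short sketch with hypothetical responses:

```python
def difficulty_index(responses):
    """Proportion of examinees answering the item correctly (1 = correct)."""
    return sum(responses) / len(responses)

# Hypothetical responses of 10 examinees to one item
item = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1]
p = difficulty_index(item)
print(p)                    # 0.6
print(0.30 <= p <= 0.70)    # True -> moderate difficulty
```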
In general, a test composed of more discriminating items will have greater reliability than one composed of less discriminating items.
Item Discrimination
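A common discrimination index is D = (proportion correct in the upper-scoring group) minus (proportion correct in the lower-scoring group); a minimal sketch with hypothetical counts:

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """D = proportion correct in the upper group minus the lower group."""
    return (upper_correct - lower_correct) / group_size

# Hypothetical: of 10 high scorers, 9 answered the item correctly;
# of 10 low scorers, only 3 did.
print(discrimination_index(9, 3, 10))  # 0.6 -> a highly discriminating item
```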
1. Difficult and time-consuming to construct 2. May lead a teacher to favor simple recall of facts 3. Places a high degree of dependence on the student's reading ability and the teacher's writing ability
Limitations of Multiple Choice
Consists of two columns - Column A contains the descriptions and is placed on the left side, while Column B contains the options and is placed on the right side
MATCHING TYPE TEST
· Used to measure knowledge outcomes and other types of learning outcomes such as comprehension and application · Most commonly used format in measuring student achievements at different levels of learning
MULTIPLE CHOICE
· Easy to develop and use because it just works around the objectives without considering the different levels of cognitive behavior
ONE WAY TABLE OF SPECIFICATIONS
· Maps out the content or topic, test objectives, number of hours spent, and format, number, and placement of items
ONE WAY TABLE OF SPECIFICATIONS
reliability measures the correlation between two equivalent versions of a test. You use it when you have two different assessment tools or sets of questions designed to measure the same thing. KEY WORDS: SAME PEOPLE, SAME TIME, DIFFERENT TEST
PARALLEL FORMS
can be used to assess facts, concepts, principles, and procedures
Paper-and-pencil test
is also used to assess skills and products. Performance assessment is called "authentic assessment" when used in a real-life and meaningful context
Performance assessment
describes the future performance of an individual by correlating the sets of scores obtained from TWO MEASURES GIVEN AT LONGER TIME INTERVAL
Predictive Validity
consists of multiple-option items that require a chronological, logical, or rank order
REARRANGEMENT TYPE
an essay item that places STRICT LIMITS on both content and response given by the student.
RESTRICTED RESPONSE
It refers to the consistency of scores obtained by the same person when retested using the same instrument or its parallel form, or when compared with other students who took the same test
Reliability
· the optional response of this type of multiple-choice test is dependent upon a setting or foundation of some sort · A setting can be in the form of a sentence, paragraph, graph, equation, picture, or some other form of representation
SETTING-AND-OPTION VARIETY
· Alternative form of assessment where the student needs to supply or create the appropriate word(s), symbol(s), or number(s) to answer the question or complete the statement · Two ways of constructing short answer or completion type - question form or complete-the-statement form
SHORT ANSWER OR COMPLETION
Administer a test to a group of examinees and split the items into halves. In this technique, get the sum of points in the odd-numbered items and correlate it with the sum of points in the even-numbered items. KEY WORDS: ODD-EVEN TECHNIQUE, SAME PEOPLE
SPLIT-HALF
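The odd-even technique can be sketched as follows: correlate the two half-test scores, then apply the Spearman-Brown correction, 2r/(1+r), to estimate the reliability of the full-length test (the score data below are hypothetical):

```python
from statistics import mean

def pearson(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def split_half(scores):
    """scores: one row per examinee. Odd-even split, then Spearman-Brown."""
    odd = [sum(row[0::2]) for row in scores]    # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in scores]   # items 2, 4, 6, ...
    r = pearson(odd, even)
    return 2 * r / (1 + r)                      # Spearman-Brown correction

# Hypothetical scores of 5 examinees on a 6-item test
data = [[1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 0, 1],
        [1, 0, 1, 1, 0, 0],
        [0, 1, 0, 1, 0, 0],
        [0, 0, 1, 0, 0, 0]]
print(round(split_half(data), 2))
```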
most commonly used in classroom testing, board examinations, civil service examinations and many others
STEM-AND-OPTION VARIETY
· this variety makes use of structured responses, which are commonly used in natural science classroom testing · answered only in a specific way, such as true or false; yes, no, or don't know; not good, good, fair, or very good, etc.
STRUCTURED-RESPONSE VARIETY
· Considered one of the standards of quality assessment by Chappuis, Chappuis & Stiggins (2009) · It is a process wherein students have an opportunity to reflect on and rate their own work and judge how well they have performed in relation to a set of assessment criteria
STUDENT SELF-ASSESSMENT
· It requires students to create and supply their own answer or perform a certain task to show mastery of knowledge or skills · It is also known as constructed response test
SUPPLY TYPE TEST ITEMS OR SUBJECTIVE TYPE
· A table that maps out the test objectives, contents, or topics covered by the test; the levels of cognitive behavior to be measured; the distribution of items, number, placement, and weights of test items; and the test format.
TABLE OF SPECIFICATIONS
· a test blueprint that guides the teacher in constructing the test.
TABLE OF SPECIFICATIONS
· It is a form of on-going assessment usually done in combination with oral questioning · It can also be used to assess the effectiveness of teaching strategies and academic interventions
TEACHER OBSERVATION
reliability measures the consistency of results when you repeat the same test on the same sample at a different point in time. You use it when you are measuring something that you expect to stay constant in your sample. KEY WORDS: SAME PEOPLE, DIFFERENT TIMES
TEST-RETEST
• In this type of test, examinees determine whether the statement is true or false • It is appropriate to assessing behavioral objectives such as "identify", "select", or "recognize" • It is appropriate when there are only two plausible alternatives
TRUE OR FALSE TYPE
Typically used to measure the ability to identify whether statements of fact are correct. The basic format is simply a declarative statement that the student must judge as true or false
TRUE-FALSE TEST ITEMS
· Allows one to see the levels of cognitive skills and dimensions of knowledge that are emphasized by the test
TWO - WAY TABLE OF SPECIFICATION
· Reflects not only the content, time spent, and number of items but also the levels of cognitive behavior targeted per test content based on the theory behind cognitive testing
TWO - WAY TABLE OF SPECIFICATION
The more items a test has, the higher its reliability tends to be.
The number of items in a test
· Reflects the features of one-way and two-way TOS · It challenges the test writer to classify objectives based on the theory behind the assessment
Three-Way Table of Specifications
· Shows the variability of thinking skills targeted by the test · It takes a much longer time to develop
Three-Way Table of Specifications
Adding a time factor may improve reliability for lower-level cognitive items. Since students do not all work at the same pace, a time factor adds another criterion to the test that increases discrimination among examinees, thus improving reliability.
Time Limits
It is the degree to which the assessment instrument measures what it intends to measure. It also refers to the usefulness of the instrument for a given purpose. It is the most important criterion of a good assessment instrument.
Validity
SELECTION TYPE ITEMS OR OBJECTIVE TEST
· Requires only one answer in each item
when the components or factors of a test are hypothesized to have a negative correlation. An example is correlating scores in tests of intrinsic and extrinsic motivation.
Divergent Validity
· Appropriate to assess student ability to organize and present their original idea
ESSAY TYPE OF TEST
allows students to determine the length and complexity of the response
EXTENDED RESPONSE
The type of students taking the test can influence reliability. A group of students with heterogeneous abilities will produce a larger spread of test scores than a group with homogeneous ability.
Spread of Scores