God help us
Reliability coefficients range from
0-1
Which is the most appropriate description of a learning progression?
A learning progression is an ordered sequence of the stuff a student must learn so as to achieve a significant curricular outcome
Imagine that you had rejected the chapter's recommendation for teachers not to seek reliability evidence for most of their own classroom tests. Moreover, you routinely ask your students to complete each of your exams twice, usually two or three days apart. You make sure that nothing takes place during the two or three days separating those test-taking occasions that would bear directly on students' mastery of what's being tested. You then correlate students' scores on the two testing occasions. What you hope to determine by these two-time testing activities is an answer to the question: How stable are my classroom assessments? Accordingly, which of the following reliability evidence would you regard as most appropriate, and personally gratifying, for any of your twice-taken tests?
A test-retest r of positive .68
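A minimal sketch of how a test-retest coefficient like the one above could be computed, assuming it is the ordinary Pearson correlation between students' scores on the two occasions. The score lists are hypothetical, invented only for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each occasion
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for six students on the same exam, a few days apart
first_sitting = [12, 15, 9, 18, 14, 11]
second_sitting = [13, 14, 10, 17, 15, 10]

test_retest_r = pearson_r(first_sitting, second_sitting)
print(round(test_retest_r, 2))
```

A value closer to 1 suggests that students held roughly the same rank order across the two sittings, which is what "stability" means here.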
Which of the following is the best definition for the concept of accessibility for students with disabilities?
Accessibility refers to the notion that all test takers must have an unobstructed opportunity to demonstrate their status with respect to the constructs being measured by an educational test. This is a key component of supporting individuals with disabilities
Which of the following is not a dimension of learning science found in the Next Generation Science Standards?
Advanced mathematics for the sciences
Suppose that you are an elementary school teacher whose students are tested each spring using a state-adopted accountability test in mathematics. This year, the state has shifted its annual accountability tests to a new set of exams developed collaboratively by a group of partner-states during the previous several years. At each grade level, one of four available versions of that grade's tests may be administered. You have reviewed the technical manual for the new 50-item tests, and you are pleased with the new test's reliability coefficients reported for students at the same grade level as the grade you currently teach. Although the following indicators are totally fictitious, which of the following reliability indicators should have properly triggered the greatest satisfaction on your part?
Alternate-form r=.69
A dozen middle-school mathematics teachers in a large school district have collaborated to create a 30-item test of students' grasp of what the test's developers have labeled "Essential Quantitative Aptitude" —that is, students' EQA. All 30 items were constructed in an effort to measure each student's EQA. Before using the test with many students, the developers wish to verify that all or most of its items are functioning homogeneously —that is, are properly aimed at gauging a test-taker's EQA. On which of the following indicators of assessment reliability should the test developers focus their efforts?
An internal-consistency reliability coefficient
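One common internal-consistency coefficient is Cronbach's alpha. The sketch below assumes that is the coefficient of interest (the question does not name a specific one) and uses a small, hypothetical table of right/wrong item scores.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per student, one column per item)."""
    k = len(scores[0])  # number of items
    # Variance of each item across students
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of students' total scores
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 students x 4 items, 1 = correct, 0 = incorrect
item_scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]

alpha = cronbach_alpha(item_scores)
print(round(alpha, 2))
```

Only one administration of the test is needed, which is why this kind of coefficient suits developers who want to check whether items are functioning homogeneously.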
Please assume you are a middle-school English teacher who, despite this chapter's urging that you rarely, if ever, collect reliability evidence for your own tests, stubbornly decides to do so for all of your midterm and final exams. Although you wish to determine the reliability of your tests for the group of students in each of your classes, you only wish to administer the tests destined for such reliability analyses on one occasion, not two or more. Given this constraint, which of the following coefficients would be most suitable for your reliability-determination purposes?
An internal-consistency reliability coefficient
A prominent procedure to minimize assessment bias for students with disabilities is to employ ________________.
Assessment accommodations
This illustrative item is intended for use in a middle-school American history course. Directions: Remembering the class discussions of America's current immigration issues, please provide a brief essay on each of the issues cited below. You will have a full 50-minute class period to complete this examination, and you should divide your essay-writing efforts equally between the two topics. In grading your twin essays, equal weight will be given to each essay. Remember, compose two clear essays —one for each issue. Your Two Essay Topics 1. Why would some form of "amnesty" for illegal aliens be a helpful solution to at least part of today's U.S. immigration problems? 2. Why would some form of "amnesty" for illegal aliens be a disastrous solution to today's U.S. immigration problems? Which of the following statements most accurately describes the match between the illustrative item and the Chapter 7 guidelines for creating essay items?
At least one of the chapter's guidelines has been explicitly followed in the illustrative item.
If a teacher's students include children with disabilities or children who are English language learners, which assertion about assessment bias is most defensible?
Because assessment bias erodes the validity of inferences derivative from students' test performances, even greater effort should be made to reduce assessment bias when working with these two distinctive populations.
Why do some members of the measurement community prefer to use the phrase "absence-of-bias" rather than "assessment bias" when quantitatively reporting the degree to which an educational test appears to be biased?
Because both reliability and validity, two key attributes of educational tests, are positive qualities, "absence of bias" is likewise framed as a positive quality to be sought in educational tests.
Which category of test items best describes the following item: According to educators, one of the major advantages of the Every Student Succeeds Act is that it forces schools and teachers to focus only on the important material included in the state test. a. True b. False
Binary Choice
Which category of test items best describes the following: True or False: Mount Everest is the tallest mountain on Earth.
Binary-Choice
Consider the following test item. Your primary concern in selecting techniques to assess a learning objective or objectives should be classroom practicality and efficiency. a. True b. False Which category best describes this item?
Binary-choice
Considering the knowledge you've gained regarding formative assessment in this chapter, which of the following characteristics of a class discussion could yield formative assessment information?
Class discussions that show what students are thinking
Only one of the following statements about a test's classification consistency is accurate. Select the accurate statement regarding classification consistency.
Classification consistency indicators represent the proportion of students classified identically on two testing occasions
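The definition above can be sketched as a simple proportion. The cut score and both score lists below are hypothetical; the function just counts how many students land on the same side of the cut score on both occasions.

```python
def classification_consistency(scores_a, scores_b, cut_score):
    """Proportion of students classified the same way (pass/fail) on two occasions."""
    same = sum(
        (a >= cut_score) == (b >= cut_score)
        for a, b in zip(scores_a, scores_b)
    )
    return same / len(scores_a)

# Hypothetical scores for six students on two testing occasions
occasion_1 = [70, 82, 64, 91, 77, 58]
occasion_2 = [74, 80, 69, 88, 68, 61]

consistency = classification_consistency(occasion_1, occasion_2, cut_score=70)
print(round(consistency, 2))
```

Because the result depends on where the cut score sits, moving `cut_score` closer to the bulk of the scores will usually change the proportion.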
These standards, released by the Council of Chief State School Officers and the National Governors Association Center for Best Practices, were an attempt to establish continuity and consistency across varying state curricular aims.
Common Core State Standards
One of the most commonly misused terms in educational jargon is the word "standards." In reality, there is no singular, all-encompassing concept of a standard, but rather more specific subtypes of educational standards. Which of the following subtypes of standards could best be described as "the knowledge or skills that educators want students to learn"?
Content standard
When teachers consider their "sought-for ends of instruction," they are considering the ways in which a student's knowledge or skills will change for the better as a function of quality teaching. Which of the following terms best describes this concept?
Curriculum instruction
This illustrative essay item was written for sixth graders. Thinking back over the mathematics lessons and homework assignments that you received during the past 12 weeks, what mathematical conclusions can you draw? Describe those conclusions in no more than 300 words, written by hand on the test-booklets provided or as a printed copy of your conclusions composed on one of our classroom computers. Select the statement that most accurately appraises this essay item for sixth-grade students.
Despite its adherence to one of the chapter's item-writing guidelines for essay items, the depiction of a student's task renders the item dysfunctional
Proper formative assessment is conducted
During the instructional process
The Common Core State Standards were an attempt to outline what students should know at each grade level in which of the following subjects?
English language arts and mathematics
Which of the following is not a significant challenge in regard to assessing students who are English language learners (ELLs)?
English language learners often score higher on assessments due to their increased focus in language acquisition
This illustrative short-answer item was written for a third-grade class. The purpose is to help both the teacher and the students determine how well those students had achieved mastery of a recent state-approved language arts curriculum goal. Please write your answer legibly. _____________ is a good one-word description for commas, periods, question marks, and colons. Which statement most accurately describes the illustrative item?
For young students such as these third graders, direct questions should be used instead of incomplete statements, so the illustrative item violates an item-writing guideline for short-answer items.
Which of the following is not a step in the four steps for creating a learning progression?
Form a basic and introductory understanding of a target curricular aim
Which represents the most appropriate definition of formative assessment?
Formative assessment is a planned process in which assessment-elicited evidence of students' status is used by teachers to adjust their ongoing instructional procedures or by students to adjust their current learning tactics.
The relationship between the degree to which an educational test is biased and the test's disparate impact on certain groups of learners is an important one. Which statement best captures the nature of this relationship?
If an educational assessment displays a disparate impact on different groups of test takers it may or may not be biased
Which of the following indices of a test's reliability is most often provided by developers of the kinds of standardized tests destined for use with large numbers of students?
Internal-consistency reliability coefficients
Which category of test items best describes the following: Consider these three categories of test items: multiple choice, binary, and matching. Choose the appropriate term to match the description: Multiple choice This type of question offers the test-taker only two options from which to choose. Binary This type of question offers the test-taker several options from which to choose. Matching This type of question may ask the test taker to attach vocabulary words with their proper definition
Matching
Consider the following test-item. Which of the following decisions requires educators to use quality assessment information? a. Choosing who should get into this college b. Deciding what reading group a student should be placed in c. Determining whether a student is legally disabled d. All of the above e. Only (a) and (b) Which category best describes this item?
Multiple choice
Which term best describes the type of measurement that would yield the following feedback: Jonathan scored in the 92nd percentile on the SAT?
Norm-referenced assessment
One important group of students in need of protection from assessment bias is English language learners (ELLs). Therefore, it is important to understand the qualifying categories for an ELL student. Which is not a category of students considered to be English language learners (ELLs)?
Students who are fluent in English, but prefer to speak another primary language.
When educators collect test-based evidence to inform decisions about already completed instructional activities they are engaging in a basic form of
Summative assessment
Which strategy seems most suitable for teachers to use when trying to detect and eliminate assessment bias in their own teacher-made tests?
Teachers should pay particular attention to the possibility that assessment bias may have crept into their teacher-made tests and should strive to rely on their best judgments about the presence of such bias on all their classroom tests, but especially on their most significant classroom assessments.
If a multistate assessment consortium has generated a new performance test of students' oral communication skills and wishes to verify that students' scores on the performance test remain relatively similar regardless of the time during the school year when the test was completed, which of the following kinds of consistency evidence would be most appropriate?
Test-retest evidence of reliability
The 49ers scored four safeties during Thursday's game, while the Giants scored three field goals. What was the final score in the game? Show your work.
The assessment seems biased in favor of students who already understand the rules of American football.
This illustrative item is destined for use in a high-school speech course that, in recent weeks, has been focused on debate preparation. Directions: To conclude our unit on how to prepare successfully for a debate, please consider carefully the following preparation-focused topics. After doing so, choose one that you regard as most important, to you, and then write a 300-400 word essay describing how best to prepare for whatever topic you chose. Be sure to identify which of the potential topics you have selected. You will have 40 minutes to prepare your essay. Potential Essay Topics: Introducing your position and defending it; Use of evidence during the body of the debate; Preparing for your opponents' rebuttal. Please choose the statement that most accurately reflects the illustrative item's congruence with Chapter 7's guidelines for writing essay items.
The illustrative item is structured in direct opposition to one of the chapter's guidelines for writing essay items.
Consider the illustrative three-option multiple-choice item: An anonymously completed, self-report item regarding a student's values —an item that has no clearly correct answer —is best suited for use in an: a. cognitive examination b. affective inventory c. psychomotor skills test Which statement best characterizes the illustrative item?
The illustrative item violates a general item-writing guideline by providing a blatant grammatical clue to the correct answer.
This illustrative short-answer item was constructed for 10th-grade students. Following World War II, an international organization intended to maintain world peace was established, namely, the United Nations. Similarly, after World War I a peace-oriented international organization was established. What was the name of that earlier organization? Which of the following statements best mirrors the degree to which the illustrative item is in accord with Chapter 7's guidelines for writing short-answer items?
The illustrative item violates none of the chapter's guidelines for writing short-answer items.
Consider the following illustrative binary-choice item. For this next True/False item, indicate whether the item's statement is true or false by circling the T or F following the item. Matching items should employ homogenous lists, but should seek to achieve relative brevity. (Circle one: T or F) Which statement best describes the illustrative item?
The illustrative true/false item violates one of the item-category guidelines by including two substantial concepts in a single item.
Here's an illustrative short-response item intended for use with ninth-graders in a high-school government course: Please accurately fill in the blanks you find in the statement given below regarding "Western Exploration and Expansion." In _______, _______ and _______ explored what ultimately became the _______ section of the northwestern United States with the assistance of a native-American guide known as _______. Select the most accurate of the following statements regarding this illustrative short-answer item.
The item satisfies the guideline regarding linear equality, yet violates the number-of-blanks guideline.
Consider whether the following binary-choice item adheres to the item-writing guidelines presented in the text. Presented below is a binary-choice item. Please indicate —by circling the R or W —whether the statement given in the item is right (R) or wrong (W). • R or W: Absence-of-bias determinations are typically made as a function of judgmental scrutiny and, when possible, empirical analysis. Which statement best describes the illustrative item?
The item violates none of the chapter's guidelines, either the five general guidelines or the specific guidelines for binary-choice items.
Here is an illustrative response-scoring plan devised by a high-school Latin teacher. Please review how the teacher plans to evaluate students' Latin compositions, then select the option that most accurately describes the teacher's scoring intentions. A Latin teacher in an urban high school (that has a long and oft-honored history of preparing students for college) frequently expresses during faculty meetings her complete disdain for what she calls "multiple-guess exams." As part of her annual teacher-evaluation evidence, she has been asked by her school's principal to present a written description of how she plans to evaluate students' responses to her constructed-response items. Please consider the following description supplied by the teacher, then select from four alternatives the most accurate comment regarding this teacher's scoring plans. "I plan to score my students' essay responses holistically, not analytically, because I invariably ask students to generate brief essays in which they must incorporate at least half of the new vocabulary terms encountered during the previous week. I supply students with a set of explicit evaluative criteria that I will incorporate in arriving at a single, overall judgment of an essay's quality. Actually, I always pre-weight each of these evaluative criteria and post those weights for students in advance of their tackling this task. Because this is a course emphasizing the writing of Latin (rather than oral Latin), I make it clear to my students —well in advance —that grammar and the other mechanics of writing are very important. When I score students' essays, if there is more than one essay per test, I score all of Essay One before moving on to Essay Two. Because I want these students to become, in a sense, Latin "journalists," I require that they clearly identify themselves with a byline at the outset of each essay. 
This scoring system, based on nearly 20 years of my teaching Latin to hundreds of our school's students, really works." Select the statement that most accurately depicts this teacher's scoring plans.
The teacher's approach violates one of the chapter's essay-scoring guidelines
Review the following language arts item drawn from a high-school English course for assessment bias. In which one of the following four statements are all of the pronouns used properly? a. I truly enjoyed his telling of the joke. b. We watched him going to the coffee shop. c. We listened to them singing the once-popular, but rarely heard song. d. Dad watched them joking about politicians, while approving of it all.
This assessment does not appear to be biased
Consider the following illustrative binary-choice item. It deals with a reliability/precision concept treated in the Standards for Educational and Psychological Testing (2014). Directions: Please indicate whether the statement below regarding the reliability/precision of educational tests is Accurate (Circle the A) or Inaccurate (Circle the I). A or I Because the standard error of measurement can be employed to generate confidence intervals around reported scores, it is typically more informative than a reliability coefficient. Which statement best describes the illustrative item?
This illustrative binary-choice item violates none of the general or item-category guidelines for this type of selected-response item.
This illustrative essay item was written for 11th-grade students taking an English course. In the space provided in your test booklet, please compose a brief editorial (of 250 words or less) in favor of the school district's after-school tutorial program. The intended audience for your position statement consists of those people who routinely read this town's weekly newspaper. Because you will have the entire class period to complete this task, you may wish to write a draft editorial using the scratch paper provided so that you can then revise the draft before copying your final version into the test booklet. Your grade on this task will contribute 40 percent toward the grade for the six-week persuasive writing unit. Which statement best characterizes this item?
This illustrative item contains no serious violation of any of the chapter's guidelines for writing essay items.
Consider the multiple binary-choice item with its four separate subitems and then decide how well the item adhered to the chapter's item-writing guidelines. Directions: For each statement in the following cluster of four statements, please indicate whether the statement is true (T) or false (F) by circling the appropriate letter. In an elaborate effort to ascertain the reliability of a new high-stakes test developed in their district, central-office administrators have calculated the following types of evidence based on a tryout of the test with nearly 2,300 students: • Internal consistency r = .83 • Test-retest r = .78 • Standard error of measurement = 4.3 T or F (1) The three types of reliability evidence calculated by the central-office staff are essentially interchangeable. T or F (2) The trivial difference between the test-retest coefficient and the internal consistency coefficient constitutes no cause for alarm. T or F (3) The test-retest r should never be smaller than a test's internal consistency estimate of reliability. T or F (4) The standard error measurement (4.3 in this instance) is derived more from validity evidence than from reliability evidence. Choose the most accurate of the following statements regarding the illustrative multiple binary-choice item as a whole.
This illustrative item seems to violate none of the chapter's guidelines for constructing such items, that is, the general guidelines, the guidelines for multiple binary-choice items, and the guidelines for binary-choice items.
Consider the illustrative binary-choice item. Please decide whether the following statement regarding the reliability of educational tests is True or False. Please place a check behind the True or False to indicate your answer. True ___ False___ When determining a test's classification consistency, there is no need to consider the cut score employed nor that cut score's location in the score distribution. Which statement best describes the illustrative item?
This illustrative item violates the item-specific guideline regarding the use of negative statements in a binary-choice item.
Consider the following illustrative five-option multiple-choice item. It addresses content presented in the Standards for Educational and Psychological Testing (2014) related to the fundamental notion of assessment validity. When we encounter a test whose scores are affected by processes that are quite extraneous to the test's intended purpose, we assert that the test displays which one of the following? a. Construct underrepresentation b. Construct deficiency c. Construct corruption d. Construct-irrelevant variance e. All of the above Which statement best describes the illustrative item?
This illustrative item, because it includes an "all of the above" alternative, violates an important item-writing guideline
Which of the five general item-writing commandments is violated in the following item? True or False: Special education teachers must rely on school psychologists because of their expertise in psychometrics.
Thou shall not employ ambiguous statements in your assessment items
Consider the following test item. To what purpose, and for what end-goal, is diagnostic assessment done? a. To ascertain the achievement level of the students. b. To ascertain the specific strengths and weaknesses of the students. c. To ascertain the effectiveness of teaching. Which of the five general item-writing guidelines has been violated in this test item?
Thou shall not employ complex syntax in your assessment items.
Consider test items and the accompanying directions. Answer the following questions correctly. Computer based assessment is becoming increasingly popular for which of the following reasons? 1. Schools are technologically equipped. TRUE or FALSE 2. Teachers enjoy computer-based assessments. TRUE or FALSE 3. Districts are increasingly securing adequate bandwidth. TRUE or FALSE 4. Computer based assessments are harder for students. TRUE or FALSE Which of the five general item-writing commandments has most likely been ignored in this scenario?
Thou shall not provide opaque directions to students regarding how to respond.
Consider the illustrative three-option multiple-choice item. Schools often purchase shorter-duration tests in an attempt to assist classroom teachers in adjusting their instructional activities to the progress of their students. These tests are an example of an ___________: a. interim assessment b. computer-based assessment c. constructed response Which of the five general item-writing principles has been violated?
Thou shall not provide students with unintentional clues regarding appropriate responses
Which of the five general item-writing commandments is violated in the following test item? Complete the following sentence. In multiple-choice items, the first part of the item, drawing its name from plant biology, is called the _______. a. stem b. cluster c. binary-choice d. test-item
Thou shall not provide students with unintentional clues regarding appropriate responses.
Which of the five general item-writing commandments is most likely violated in the following question: When you scrupulously measure students' performance, you are assessing them. a. True b. False
Thou shall not use vocabulary that is more advanced than required
Which of the five general item-writing commandments is most likely violated in the following question: Which of the following is a misrepresentation of the tenets of satisfactory binary-item production thus obfuscating the concept? a. Include multiple concepts in each statement. b. Rarely use negative statements, and never use double negatives. c. Have an approximately equal number of items representing the two categories being tested. d. Keep item length similar for both categories being tested.
Thou shall not use vocabulary that is more advanced than required
Which of the following is not one of the conceptual categories listed in the CCSS mathematics standards?
Trigonometry
Which of the following is not one of the three types of reliability evidence?
Validity
If educators wish to accurately estimate the likelihood of consistent decisions about students who score at or near a high-stakes test's previously determined cut-score, which of the following indicators would be most useful for this purpose?
a conditional standard error of measurement (near the cut-score)
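For context, here is a hedged sketch of the overall standard error of measurement under classical test theory, SEM = SD × sqrt(1 − reliability). A conditional SEM is estimated separately at particular score levels (such as near the cut score) and needs more detailed data than this sketch uses. The standard deviation, reliability, and observed score below are hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(score_sd, reliability):
    """Overall SEM from the score standard deviation and a reliability coefficient."""
    return score_sd * sqrt(1 - reliability)

def score_band(observed_score, sem, n_sems=1):
    """Band of plus or minus n_sems standard errors around an observed score."""
    return (observed_score - n_sems * sem, observed_score + n_sems * sem)

# Hypothetical test: SD of 10 points, reliability of .84
sem = standard_error_of_measurement(score_sd=10, reliability=0.84)

# Band around a hypothetical observed score of 72
low, high = score_band(observed_score=72, sem=sem)
print(round(sem, 1), round(low, 1), round(high, 1))
```

If a cut score falls inside the band around a student's score, a pass/fail decision for that student is less likely to be consistent across testing occasions.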
Which of the following is not an example of unfair penalization?
a group of students that failed to complete their homework.
Which of the following types of assessment targets a student's attitudes, interests, and values?
affective assessment
Which of the following terms refers to the degree to which there is a meaningful agreement between two or more of the following: curriculum, instruction, and assessment?
alignment
These illustrative short-answer items were created for use in a 12th-grade English course and are intended to be used in the course's midterm exam. Please complete the short-answer items below by filling in the blank you will find in each item. o __________ is the case to be employed with all modifiers of gerunds —definitely including pronouns. o A __________ infinitive that, in former times, was regarded as a grammatical error is now acceptably encountered in all kinds of writing. Which of the following assertions best reflects how these two short-answer items conform to the chapter's item-writing guidelines for such items.
Although several of the chapter's item-writing guidelines have been properly followed, there is the same violation of an item-writing guideline in both items.
A recently established for-profit measurement company has just published a new set of interim tests intended to measure students' progress in attaining certain scientific skills designated as "twenty-first century competencies." There are four supposedly equivalent versions of each interim test, and each of these four versions is to be administered about every two months. Correlation coefficients showing the relationship between every pair of the four versions are made available to users. What kind of coefficient do these between-version correlations represent?
an alternate-form coefficient
Review the following constructed-response item for bias. It's intended to measure students' composition skills. Please compose a short essay consisting of 500 to 1,000 words on the topic: "Soccer Outside the United States." Either use one of our classroom computers or write the essay by hand. Be sure to engage in appropriate prewriting activities, draft an initial version of the essay, and then revise your draft at least once. You will have 90 minutes to complete this task.
Biased toward children who live outside of the United States
Which of the following terms refers to a group of related standards under the CCSS for mathematics?
clusters
Reliability refers to the
consistency of the test scores
Describe each of the eight factors to consider when trying to decide what to measure with a classroom test
constructed response
Which term best describes the type of measurement that would yield the following feedback: Jonathan mastered 92 percent of the tested content?
Criterion-referenced measurement
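The contrast between "mastered 92 percent of the tested content" and "scored in the 92nd percentile" can be sketched as two different calculations. The first compares a student with the content; the second compares the student with other test takers. The norm-group scores are hypothetical, and percentile rank is taken here as the percentage of the norm group scoring below the student, which is one of several conventions.

```python
def percent_correct(items_correct, items_total):
    """Criterion-style result: share of the tested content answered correctly."""
    return 100 * items_correct / items_total

def percentile_rank(score, norm_group_scores):
    """Norm-style result: percentage of the norm group scoring below this score."""
    below = sum(1 for s in norm_group_scores if s < score)
    return 100 * below / len(norm_group_scores)

# Hypothetical norm group of ten students on a 50-item test
norm_group = [30, 35, 40, 42, 44, 45, 47, 48, 49, 50]

mastery = percent_correct(items_correct=46, items_total=50)
rank = percentile_rank(score=46, norm_group_scores=norm_group)
print(mastery, rank)
```

The same raw score of 46 yields 92 percent correct but a much lower percentile rank in this strong hypothetical group, which is why the two kinds of feedback are not interchangeable.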
Which term best describes the desired outcome of instruction?
curricular aim
Decisions linked to classroom assessments should be made
in advance
Public Law 94-142 installed the use of an ___________________ to outline the educational processes for students with disabilities.
individualized education program
Ramon Ruiz is sorting out empty tin cans he found in the neighborhood. He has four piles based on different colors of the cans. He thinks he has made a mistake in adding up how many cans are in each pile. Please identify Ramon's addition statement that is in error. a. 20 bean cans plus 32 cans = 52 cans. b. 43 bean cans plus 18 cans = 61 cans. c. 38 bean cans plus 39 cans = 76 cans. d. 54 bean cans plus 12 cans = 66 cans
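The item is potentially offensive to Hispanic students because it pairs a Hispanic name with a negative stereotype (collecting discarded cans).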
What are the two major causes of assessment bias we encounter in typical educational tests?
offensiveness and unfair penalization
This excerpt from a teacher's memo includes faculty-created rules for scoring their students' responses to essay items. The following rules for scoring students' responses to essay items were created last year by our faculty and were approved by a near unanimous vote of the faculty. Please review what those rules recommend prior to our taking this year's "confirmatory" faculty vote on these rules. RULES FOR SCORING RESPONSES TO ESSAY ITEMS When teachers in this school score their students' responses to essay items, those teachers should always (1) make a preliminary judgment about how much importance should be assigned to the conventions of writing, such as spelling, (2) decide whether to score holistically or analytically, (3) prepare a tentative scoring key prior to actually scoring students' responses, (4) try to score students' responses anonymously without knowing which student supplied which response, and (5) score a given student's responses to all essay items on a test and then move on to the next student's responses. Please select the most accurate assertion regarding these rules.
Only one of the faculty-approved rules is basically opposed to the Chapter 7 guidelines for scoring students' responses to essay items.
Amy Johnson has a large collection of Barbie dolls. Originally, she had 49. Recently, she somehow lost 12 Barbies. How many Barbies does Amy have left? (Show your work.) a. 37 Barbies b. 61 Barbies c. 27 Barbies
This assessment might offend people who view girls as having much broader interests than playing with dolls.