Classroom Assessment - FINAL
An approach to assessment in which a student's test performance is interpreted according to how much of a defined assessment domain the student has mastered
Criterion-referenced
Which of the following is not a key step that a classroom teacher needs to take in implementing a portfolio assessment program?
Decide which students should be involved in the portfolio assessment program.
Mr. Ramirez administered an instructionally diagnostic test to his students in order to better understand how to proceed in his teaching of fractions. The diagnostic test he administered took him nearly three days to score, thus countering its practical usefulness. Which attribute of an instructionally diagnostic test did this test violate?
Ease of usage
Which of the following is not one of the five rules for rubrics outlined in Chapter 8?
Employ as many evaluative criteria as possible
Which of the following rules is not one that is recommended when creating a scoring rubric that will have a positive impact on classroom instruction?
Employ as many evaluative criteria as possible to judge major aspects of students' responses.
Based on the course textbook, which definition of portfolio assessment is most accurate?
An assessment approach centered on the systematic appraisal of a student's collected work samples.
Select all that are key ingredients in classroom portfolios.
Ensure students "own" their portfolios; Communicate the portfolio process to parents and offer parents opportunities to review work samples; Select criteria by which to evaluate portfolio work samples
Mr. Smith, a second-year mathematics teacher in a large urban high school, is seeking frequent reactions to his teacher-made tests from the other mathematics teachers in his school. He typically first secures his colleagues' agreement during informal faculty-lounge conversations, then relays copies of his tests - along with brief review forms - at several points in the school year. Although Mr. Smith simultaneously carries out systematic reviews of his own tests by employing what he regards as a first-rate test-appraisal rubric from his school district, when his own views regarding any of his test's items conflict with those of his colleagues, he always defers to the reactions of his much more experienced fellow teachers. Which option represents the most accurate statement regarding Mr. Smith's test-improvement efforts?
Even though Mr. Smith is wise in seeking the item-quality reactions of his school's other math teachers - especially because he is only in his second year of teaching - the ultimate decision about the quality of any of his test items should not be deferentially based on collegial input but, rather, based on Mr. Smith's own judgment.
Which is not a factor around which teachers should structure the evaluation of the quality of their instruction?
Evidence constructed exclusively from classroom observations
According to Popham, the results of efforts to employ portfolios for accountability have been promising.
False
All constructed-response items can be considered performance assessments since a student is performing a task to answer the question.
False
The smaller the size of the standard deviation, the more spread out the scores in the distribution.
False
Ms. Cooke computed some basic descriptive statistics for each of her two Algebra I classes. Her first period class had a standard deviation of 7.6. Her second period class had a standard deviation of 4.5. Which class has a greater variability in test scores and how do you know?
First period has a greater variability because the standard deviation of 7.6 is larger than the standard deviation of 4.5.
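As a rough illustration of the comparison in this card, the sketch below computes a standard deviation for two small score lists and reports which one is more spread out. The scores are invented for illustration and are not Ms. Cooke's data. Using the population formula (`statistics.pstdev`) is an assumption, since the card does not say which formula was used.

```python
from statistics import pstdev

# Hypothetical scores, invented for illustration only.
first_period = [60, 70, 80, 90, 100]
second_period = [76, 78, 80, 82, 84]

# Population standard deviation of each class's scores.
sd_first = pstdev(first_period)
sd_second = pstdev(second_period)

# The class with the larger standard deviation has the greater variability.
more_variable = "first period" if sd_first > sd_second else "second period"
print(round(sd_first, 2), round(sd_second, 2), more_variable)
```

Both lists have the same mean (80), so the difference in standard deviation reflects spread alone.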
Which characteristic would not be considered a quality of an item that would cause the item to be instructionally insensitive?
Format of the item (binary, multiple choice, etc.); Length of item
Which best describes the two types of evaluation processes under which teachers are routinely evaluated?
Formative and summative evaluations
A first-year classroom teacher, George Jenkins, has just finished preparing the initial set of three classroom tests he intends to use with his fifth-grade students early in the school year (one test each in mathematics, language arts, and social studies). In an effort to improve those tests, he has e-mailed a draft version of the three tests to his mother, who provided continuing support for George while he completed his teacher-education coursework as well as a semester-long student teaching experience. He asks his mother to suggest improvements that he might make in the early-version tests. Which best describes George's effort to enhance the quality of his tests?
George could probably have secured better advice about his draft tests had he solicited it from his school's teachers and from his fifth-grade students before they took the tests.
The score based on the grade level and months of the school year represented by a student's test performance
Grade equivalent
Which of the following is the most often misinterpreted score-interpretation indicator used with standardized tests?
Grade-equivalent score
Given your reading of Chapter 8, which would you anticipate being a limitation of performance-based assessments?
Lengthy administration time
Which of the following is the most troublesome problem facing those educators who wish to rely heavily on the use of performance tests?
Making valid inferences about students' generalized skill-mastery
The arithmetic average of a set of scores
Mean
Measurements of Central Tendency
Mean; Median; Mode
The midpoint in a set of scores when the scores are ranked from lowest to highest
Median
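The central-tendency definitions in the cards above can be checked with Python's standard `statistics` module. This is a minimal sketch; the score list is hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical test scores, already ranked from lowest to highest.
scores = [70, 85, 85, 90, 100]

avg = mean(scores)          # arithmetic average: sum of scores / number of scores
mid = median(scores)        # midpoint of the ranked scores
most_common = mode(scores)  # most frequently occurring score

print(avg, mid, most_common)
```

With an odd number of scores the median is simply the middle value; with an even number, `median` averages the two middle values.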
Mr. Miller administered a test to his students following his unit on converting fractions to decimals. He concluded that his class did not respond well to his instruction due to low test scores. While this may seem like common practice on the surface, what piece of information is Mr. Miller missing to make an adequate determination of his students' response to his instruction?
Mr. Miller is missing pretest data.
An example of analytic scoring would be to evaluate _______.
the performance of each step listed on a process checklist
Which represents an accurate description of a performance assessment?
An approach to measuring a student's status based on the way the student completes a specified task.
In a normal distribution, approximately what percentage of test scores would fall more than two standard deviations above the mean?
2 percent
The book outlines ___________ rules for creating a rubric.
5
Suppose a group of test scores forms a perfectly normal distribution. Approximately what percentage of test scores will fall within one standard deviation of the mean?
68 percent
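The normal-curve areas that cards like this one ask about can be checked numerically. The sketch below uses `statistics.NormalDist` (Python 3.8 or later) with a standard normal distribution; the variable names are illustrative only.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Share of scores within one and within two standard deviations of the mean.
within_one_sd = z.cdf(1) - z.cdf(-1)
within_two_sd = z.cdf(2) - z.cdf(-2)

# Share of scores more than two standard deviations above the mean.
above_two_sd = 1 - z.cdf(2)

print(f"{within_one_sd:.1%} {within_two_sd:.1%} {above_two_sd:.1%}")
```

These come out to roughly 68 percent, 95 percent, and 2 percent, respectively.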
How many evaluative criteria for performance tasks did the author provide in the chapter?
7
The author describes ________ "key ingredients" for creating a portfolio.
7
What is one way in which a performance assessment differs from a more conventional test?
A conventional test will differ from a performance assessment in the degree to which the test situation approximates the real-life situation about which inferences are made.
A rubric is a scoring guide to be employed in judging students' responses to constructed-response assessments such as a performance test. Which one of the following elements is the least necessary feature of a properly constructed rubric?
A designation of a performance standard required for skill-mastery
Which of the following is not an element typically embodied in performance tests?
A direct link to a preexisting content standard
Which of the following is an instructionally beneficial rubric?
A skill-focused rubric
Which type of instructionally diagnostic test is most commonly found in special education?
Classification-focused
Anita Gonzales teaches middle-school English courses. At least half of her classroom tests call for students to author original compositions. Her other tests are typically composed of selected-response items. Anita has recently committed herself to the improvement of these selected-response tests, so when she distributes those tests to her students, she also supplies an item-improvement questionnaire to each student. The questionnaire asks students as they complete their tests to identify any items that they regard as (1) confusing, (2) having multiple correct answers, (3) having no correct answers, or (4) containing unfamiliar vocabulary terms. Students are to turn in their questionnaires along with their completed tests, but are given the option of turning in the questionnaires anonymously or not. Which statement most accurately portrays Anita's test-improvement procedures?
Although seeking students' judgments regarding her tests has much to commend it, Anita should have sought students' reactions to a test only after they had completed it - by distributing blank copies of the test along with the item-improvement questionnaire.
During last year's end-of-school evaluation conference, Jessica Jones, a high-school social studies teacher, was told by the principal that her classroom tests were "haphazard at best." Jessica now intends to systematically review each of the classroom tests she builds, based on her principal's suggestions. She intends to personally evaluate each test on the basis of (1) its likely contribution to a valid test-based inference, (2) the accuracy of its content, (3) the absence of any important content omissions, and (4) the test's fundamental fairness. Which option represents the best appraisal of Jessica's test-review plans?
Although the four test-review factors Jessica chose will help her identify certain deficiencies in her tests, she should also incorporate as review criteria a full range of widely endorsed experience-based and research-based (1) item-specific guidelines and (2) general item-writing guidelines.
The teaching staff in a suburban middle school is concerned with the quality of their school's teacher-made classroom assessments. This issue has arisen because the district school board has directed all schools to install a teacher-evaluation process featuring prominently weighted evidence of students' learning as measured chiefly by teacher-made tests. The district office requires teachers to submit all students' responses from each classroom assessment immediately after those assessments have been administered. Then, in less than 2 weeks after submission, teachers receive descriptive statistics for each test (such as students' means and standard deviations). Teachers also receive an internal consistency reliability coefficient for the total test and a p-value and an item-discrimination index for each item. Teachers then must personally judge the quality of their own tests' items. The teachers' reviews of their test's individual items are seen as subjective by almost everyone involved, whereas the empirical evidence of item quality is regarded as objective. The school's faculty unanimously decides to weight teachers' own per-item judgments at 25 percent while weighting the statistical per-item p-values and item-discrimination indices at 75 percent. Please select the statement that most accurately characterizes the test-improvement procedures in this suburban middle school.
Because the relevance of traditional item-quality indicators, such as those supplied by this school's district office, can vary depending on the specific use to which a teacher-made test will be put, the across-the-board weightings (25 percent judgmental; 75 percent empirical) may be inappropriate for the proposed teacher-evaluation process.
Consider the three statements. Which are considered characteristics of portfolio assessments? I. Represents the range of reading and writing students are engaged in II. Engages students in assessing their progress and/or accomplishments and establishing ongoing learning goals III. Mechanically scored or scored by teachers who have little input
I and II
Consider the following statements regarding test-preparation. Which one is not accurate?
If teachers simultaneously direct their instruction toward a test's specific items and the curricular aim on which the test is based, this constitutes appropriate test preparation.
Engaging in assessment improvement is a natural part of the teaching process. However, sometimes reviewing your self-created materials alone can prove problematic. Which represents a reasonable explanation of why self-review can be problematic?
If you created the assessment, you are prone to be biased in its favor.
Which of the following is one of the attributes of concern for an instructionally diagnostic test?
Length of assessment
Which of the following is not one of the five review criteria for judgmentally based improvement procedures listed in Chapter 11?
Item's ease of grading
Which of the following is not a common flaw when scoring performance assessments?
Lack of scorer familiarity with the totality of the content being assessed
Ms. Troy is the principal of Sunnyside Elementary School. In order to evaluate one of her first-grade teachers, Mrs. Stelter, Ms. Troy sits down and observes a 30-minute lesson. At the conclusion of the lesson, Ms. Troy concludes that Mrs. Stelter is an impactful teacher. What is the main issue with Ms. Troy's evaluation process?
Ms. Troy has not collected any outcome data.
Which represents a disadvantage of percentile scores?
None of these are disadvantages of percentile scores.
A parent receives his child's grade equivalent score on the third-grade math assessment. The grade equivalent score is 5.3. The parent calls the teacher to discuss whether or not his daughter should skip the fourth grade. Which of the following pieces of advice is most appropriate?
None of these represent appropriate advice.
Consider the three statements. Which are considered characteristics of portfolio assessments? I. Assesses all students on the same dimensions. II. Addresses achievement only. III. Separates learning, testing, and teaching.
None of these represent characteristics of portfolio assessment.
An approach to assessment in which a student's test performance is interpreted relatively - that is, according to how the student's performance compared with that of other test takers
Norm-referenced
Which type of score would indicate a test-taker's standing in relation to that of a norm group?
Percentile score
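To make the idea of "standing in relation to a norm group" concrete, here is a minimal sketch of a percentile calculation. The function name, the norm-group scores, and the choice to count only scores strictly below the student's score are all assumptions for illustration; test publishers may define percentiles slightly differently (for example, "at or below").

```python
def percentile_rank(score, norm_group):
    """Percent of norm-group scores that fall below `score` (illustrative definition)."""
    below = sum(1 for s in norm_group if s < score)
    return 100 * below / len(norm_group)

# Hypothetical norm group of ten raw scores.
norm_group = [52, 58, 61, 65, 70, 74, 77, 81, 88, 93]

rank = percentile_rank(81, norm_group)
print(rank)
```

Here seven of the ten norm-group scores fall below 81, so the student's raw score of 81 corresponds to the 70th percentile under this definition.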
(1) Which of the following is not a component of the seven-step sequence for portfolio assessment found in Chapter 9?
Portfolio assessment is complicated and should only be done by the teacher.
Which of the following, from a classroom teacher's perspective, is probably the most serious drawback of portfolio assessment?
Portfolio assessment's time-demands on teachers
Some performance-assessment proponents contend that genuine performance assessments must exhibit which of the following features?
Prespecified quality standards; Judgmental appraisal; Multiple evaluative criteria
Which represents an appropriate linear representation of the classic pretest versus posttest model?
Pretest --> Instruction --> Posttest
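A minimal sketch of how the pretest and posttest scores from this model might be compared. The student names and scores are hypothetical, and a simple mean gain is only one possible summary, not necessarily the one the textbook recommends.

```python
from statistics import mean

# Hypothetical scores for the same students before and after instruction.
pretest = {"Ana": 40, "Ben": 55, "Cal": 60}
posttest = {"Ana": 70, "Ben": 80, "Cal": 80}

# Gain for each student: posttest score minus pretest score.
gains = {name: posttest[name] - pretest[name] for name in pretest}
mean_gain = mean(list(gains.values()))

print(gains, mean_gain)
```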
Calculated by subtracting the value of the lowest test score from the value of the highest test score
Range
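The range calculation described in this card takes one line of Python. The scores below are hypothetical.

```python
# Hypothetical test scores.
scores = [62, 75, 81, 88, 97]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)
print(score_range)
```

Because the range depends on only two scores, a single unusually high or low score can change it substantially.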
Which statistic would yield information regarding the variability of a group's test scores?
Range
Which term represents the number of items that a student has answered correctly on an assessment?
Raw score
Portfolio outcomes
Represents a collaborative approach to assessments; Represents the range of reading and writing students are engaged in; Engages students in assessing their progress and/or accomplishments and establishing ongoing learning goals; Measures each student's achievements, while allowing for individual differences between students; Has a goal of student self-assessment; Links assessment and teaching to learning
Converted raw scores to employ a new, arbitrarily chosen scale to represent a student's performance
Scale score
Scale scores are converted raw scores that use a new, arbitrarily chosen scale to represent levels of achievement or ability. What is one clear advantage of a scale score?
Scale scores allow for the comparison of several equidifficult forms of a test.
According to the text, which of the following are parts of a rubric?
Scoring scale; Evaluative criteria
Which of the following is not one of the attributes of concern for an instructionally diagnostic test?
Simplicity of content
Performance assessments are often scored via rubric. Given the knowledge you've gained from this chapter, which type of rubric do you feel is most often best-suited for performance assessments?
Skill-focused rubrics
Mr. Miller has decided to incorporate a pretest/posttest design into his classroom evaluation procedures in order to gather better classroom data to guide his instruction. He created two equidifficult forms of the test he plans to use, divided his class in half, and administered one of the two forms to each half of his class. He then administered the opposite version of the test to each different half. So, in the end, each half of the class had taken both tests. What type of testing design is Mr. Miller following?
Split-and-switch design
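The form assignment described in this card can be sketched as follows. The function name, the roster, and the use of a random split are assumptions for illustration; the card says only that the class was divided in half.

```python
import random

def split_and_switch(roster, seed=0):
    """Assign each student one form as a pretest and the other form as a posttest."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = roster[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2

    plan = {}
    for i, student in enumerate(shuffled):
        if i < half:
            # First half: Form A before instruction, Form B after.
            plan[student] = {"pretest": "Form A", "posttest": "Form B"}
        else:
            # Second half: Form B before instruction, Form A after.
            plan[student] = {"pretest": "Form B", "posttest": "Form A"}
    return plan

roster = ["Ana", "Ben", "Cal", "Dee", "Eli", "Fay"]  # hypothetical class
plan = split_and_switch(roster)
for student, forms in plan.items():
    print(student, forms)
```

In the end every student has taken both forms, but no student sees the same form twice.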
Measurements of Variability
Standard deviation; Range
Mr. Byron is concerned about his students' end of year test scores. In his state, student test data are compared across years. For example, his students' scores on last year's exam were taken into account so that this year, students who scored similarly last year can be compared to one another this year. The comparisons are reported in percentiles such as, "Addy scored as well or better than 95 percent of her peers who took this assessment and who scored at the same scale score last year." What type of approach to evaluation is being described?
Student growth percentile
Testing outcomes
Student self-assessment is not a goal; Assessment process is not collaborative; Addresses achievements only; Mechanically scored or scored by teachers who have little input; Assesses students across a limited range of reading and writing assignments that may not match what students do; Separates learning, testing, and teaching; Assesses all students on the same dimensions
Which of the following is not a reason that should dissuade policymakers from evaluating educational quality on the basis of students' scores on certain educational achievement tests?
Substantial gaps between minority and majority students' performance on most accountability tests will rarely be found.
(2) Which of the following is not a component of the seven-step sequence for portfolio assessment found in Chapter 9?
Teachers should autonomously decide on evaluative criteria as the content experts.
Given your reading in this chapter, which would you suggest are more effectively measured by performance-based assessments?
The ability to formulate problems
Which would be the best justification for the relatively large amount of time required to respond to many performance-based assessment tasks?
The tasks can provide students with valuable learning opportunities
Given your reading of Chapter 9, which would you anticipate being a major obstacle to the effective use of portfolios?
They are labor intensive.
Given your reading of Chapter 9, which would you anticipate being a common misperception of portfolios?
They consist of a haphazard collection of student work.
Which of the following represents a key impediment to teachers' portfolio assessment?
Time demands linked to a teacher's implementation of portfolio assessment
When considering instructionally diagnostic tests, at least how many strengths and weaknesses should be addressed?
Two
Which of the following terms is appropriate for an evaluation model that employs a student's prior achievement and background characteristics as statistical controls to help isolate the effects on student achievement of specific teachers, schools, or districts?
Value-added model
Of the following test-preparation practices, which does the author consider to meet both guidelines? Select all that apply.
Varied-format preparation; Generalized test-taking preparation
Which of the following phrases is most commonly used to describe a portfolio focused on a student's self-evaluative and ongoing improvement in the quality of work products?
Working portfolio
Because she is eager for her students to perform well on their 12th-grade senior mathematics tests (administered by the state department of education), Mrs. Williamson gives students answer keys for all of the test's selected-response items. When her students take the test in the school auditorium, along with all of the school's other 12th-graders, she urges them to use the answer keys discreetly, and only if necessary. Mrs. Williamson's activities constitute ________.
a violation of both guidelines
Mrs. Gordon makes sure she has her sixth-grade students practice each fall for the nationally standardized achievement tests required in the spring by the district's school board. Fortunately, she has been able to make photocopies of most of the test's pages during the past several years, so she can organize a highly relevant 2-week preparation unit wherein students are given actual test items to solve. At the close of the unit, students are supplied with a practice test on which about 60 percent of the test consists of actual items copied from the nationally standardized commercial test. Mrs. Gordon provides students with an answer key after they have taken the practice test so that they can check their answers. Mrs. Gordon's activities constitute _______.
a violation of both guidelines
Mrs. Jones, a third-grade teacher, was asked by officials of her state department of education two years ago to serve as a member of a Bias Review Committee whose task was to consider whether a set of not-yet-final items being prepared for the state's annual accountability tests contained any assessment bias that would preclude their use. Even though Mrs. Jones realized that her committee's item-by-item reviews would not be the only factor determining whether such underdeveloped items would actually be used on the state-administered accountability tests, she was convinced that many of the items she had reviewed would end up on those tests. Accordingly, based on the informal notes she had taken during a two-day meeting of the Bias Review Committee, she always makes certain to give her own third-grade students plenty of guided and independent practice in responding to items similar to those she had reviewed. Mrs. Jones generates these practice items herself, always trying to make her practice items resemble the specific details of the items she reviewed. Because a new teacher-evaluation system in her district calls for the inclusion of state test scores of each teacher's students, Mrs. Jones was pleased to see that her own third-graders scored well on this year's state tests. Mrs. Jones's activities constitute ________.
a violation of both guidelines
The district where Todd Blanding teaches high-school chemistry stipulates that up to 100 percent of a teacher's student-growth evidence, used for teacher evaluations, can be based on before-instruction and after-instruction classroom assessments. Todd and the other teachers in his high school realize how important it is for their students to score well on classroom tests, particularly any tests being used to collect evidence of pre-instruction to post-instruction growth. Accordingly, each month the high school's staff participates in content-alike learning communities so they can explore together suitable test-preparation alternatives. Based on these monthly explorations, Todd has developed a pretest-to-posttest instructional approach whereby he never provides item-specific instruction for more than half of the items he intends to use for any upcoming posttest. (Item-specific instruction explicitly explores the nuances of a particular item.) Because at least half of the items on an instructional unit's posttest will not have been discussed in class prior to the posttest, Todd is confident that he can base valid interpretations about students' growth from their pretest-to-posttest performances. Todd's activities constitute __________.
a violation of both guidelines
Mr. Thompkin teaches mathematics in an urban middle school serving many students from lower-income families. Although Mr. Thompkin personally finds his district's heavy emphasis on educational testing to be excessive, he concedes that his students will benefit from scoring well on the many math tests he is obliged to administer during a school year. Because most of his students cannot afford to enroll in the commercial test-preparation programs that are available throughout his city, Mr. Thompkin entices a psychologist friend of his - a friend who is particularly knowledgeable about test-taking skills - to visit all of his courses one day during the first month of school. The psychologist explains to students not only how to take tests successfully but also how to prepare in advance for any high-stakes testing situations. Mr. Thompkin believes one class period per year that's focused on test-taking rather than learning mathematics is a decent trade-off for his students. Mr. Thompkin's activities constitute ________.
a violation of neither guideline
Mrs. Hilliard knows the reading test administered to all state eighth graders contains a set of five fairly lengthy reading selections, each of which is followed by about eight multiple-choice items dealing with such topics as (1) the main idea of the selection or the main idea of its constituent paragraphs, (2) the meaning of technical terms that can be inferred from contextual clues, and (3) the defensibility of post-reading inferential statements linked to the selection. Mrs. Hilliard routinely spends time in her eighth-grade language arts class trying to improve her students' reading comprehension capabilities. She has the students read passages similar to those used in the statewide test, then gives her students a variety of practice tests, including written multiple-choice, true-false, and oral short-answer tests in which, for example, individual students must state aloud what they believe to be the main idea of a specific paragraph in the passage. Mrs. Hilliard's activities constitute _________.
a violation of neither guideline
Ms. Sanchez realizes that many of her fourth-graders are relatively recent arrivals in the United States, having come from Mexico and Central America. Most of her students speak English as a second language and possess limited experience in taking the kinds of standardized tests used so frequently these days in U.S. schools. Accordingly, Ms. Sanchez has located a number of English-language standardized tests for her fourth-grade students, and she has photocopied segments of the tests so the introductory pages will be available to all of her students. Once every few weeks, Ms. Sanchez asks her fourth-graders to spend classroom instructional time trying, as she says, to "make sense" out of these tests. About 20 minutes is devoted to students' reading the tests' directions and then determining if they can understand specifically how they are to complete each of the standardized tests. She makes no copies of any items other than those used in a test's directions. Ms. Sanchez's activities constitute _________.
a violation of neither guideline
Fred Phillips prepares his sixth-grade social studies students to do well on a state-administered social studies examination by having all of his students take part in practice exercises using test items similar to those found on the state examination. Fred tries to replicate the nature of the state examination's items without ever using exactly the same content as it is apt to appear on the examination. He weaves his test-preparation activities into his regular social studies instruction so cleverly that most students really don't know they are receiving examination-related preparation. Fred Phillips' activities constitute _________.
a violation of the educational defensibility guideline
Srijati is eager to have her fourth-grade students become better "close readers"—that is, to be better able to read written materials carefully so that they are capable of, as Srijati says, "sucking all of the meaning out of what they read." Because of reductions in assessment funds, Srijati's school district has been obliged to eliminate all constructed-response items assessing students' reading comprehension. All items measuring students' reading comprehension, therefore, must be selected-response types of items and, beyond that, district officials have indicated that only three specific item types will be used in district-developed reading tests. So that her students will perform optimally on the district-developed reading tests, Srijati provides "close-reading practice" based exclusively on the three district-approved ways for students to display their reading comprehension. Srijati's fourth-graders really shine when it is time to take the district reading tests. Srijati's activities constitute _______.
a violation of the educational defensibility guideline
Because there is a statewide reading comprehension test that must be passed by all high-school students before they receive state-sanctioned diplomas, Mr. Gillette, a 10th-grade English teacher, spends about four weeks of his regular class sessions getting students ready to pass standardized tests. He devotes one week to each of the following topics: (1) time management in examinations, (2) dealing with test-induced anxiety, (3) making calculated guesses, and (4) trying to think like the test's item writers. Mr. Gillette's students seem appreciative of his efforts. Mr. Gillette's activities constitute ________.
a violation of the professional ethics guideline
An advantage of performance-based assessments over achievement tests is that they can be used to evaluate _________.
both the process and product of a task
The teacher gives students guided or independent practice with actual items copied from a state-developed high school graduation test that is currently being used.
current-form preparation
According to the author, which of the following are chief purposes or functions of portfolio assessments? Check all that apply.
documenting student progress; showcasing student accomplishments; evaluation of student status
Instructionally diagnostic tests are generally designed to yield informative results regarding _____________________.
individual students
Links to socioeconomic status is a quality that might cause an item to be __________________________.
instructionally insensitive
To evaluate the quality of instruction, the author recommends teachers use which of the following models?
pretest-posttest design; split-and-switch design
Analytic scoring is better than holistic scoring when an educator is trying to _________.
provide diagnostic feedback to students
A mathematics achievement test includes addition problems formatted only in vertical columns, and the teacher provides practice with addition problems formatted solely in this manner.
same-format preparation
According to the author, which type of rubric can markedly enhance a teacher's instruction?
skill-focused
A ______________________ is a test that is designed to yield either norm-referenced or criterion-referenced inferences and that is administered, scored, and interpreted in a predetermined manner.
standardized test
The author suggests focusing on three types of evaluative evidence to measure the impact of instruction. Select the three types of evidence from the examples below.
students' performances on teacher-made unit assessments; evidence regarding unanticipated effects on instruction