Final for Classroom Assessment
Which of the following statements about goal-attainment grading is most defensible?
"If a teacher can collect defensible assessment evidence of a student's mastery of the teacher's designated curricular aims, then this evidence should be the only basis for goal-attainment grading."
In a normal distribution, approximately what percentage of test scores would fall more than two standard deviations above the mean?
2 percent
A rubric is a scoring guide to be employed in judging students' responses to constructed-response assessments such as a performance test. Which one of the following elements is the least necessary feature of a properly constructed rubric?
A designation of a performance standard required for skill-mastery
Which of the following is not an element typically embodied in performance tests?
A direct link to a preexisting content standard
Mr. Cory has established a set of guidelines for grading his ninth-grade students' English essays. He was quite pleased, because as he graded the essays he realized that almost 95 percent of the class earned an A. Which type of grading is most likely being described?
Absolute grading
Anita Gonzales teaches middle-school English courses. At least half of her classroom tests call for students to author original compositions. Her other tests are typically composed of selected-response items. Anita has recently committed herself to the improvement of these selected-response tests, so when she distributes those tests to her students, she also supplies an item-improvement questionnaire to each student. The questionnaire asks students as they complete their tests to identify any items that they regard as (1) confusing, (2) having multiple correct answers, (3) having no correct answers, or (4) containing unfamiliar vocabulary terms. Students are to turn in their questionnaires along with their completed tests, but are given the option of turning in the questionnaires anonymously or not. Which statement most accurately portrays Anita's test-improvement procedures?
Although seeking students' judgments regarding her tests has much to commend it, Anita should have sought students' reactions to a test only after they had completed it - by distributing blank copies of the test along with the item-improvement questionnaire.
Mr. McMillan was busy assigning grades to his sixth-graders' science fair projects. One of his star students, Darren, submitted an excellent project. However, Mr. McMillan felt the project didn't represent the full extent of Darren's potential, so he gave Darren a B. What type of grading is Mr. McMillan applying?
Aptitude-based grading
Which of the following is not a key step that a classroom teacher needs to take in implementing a portfolio assessment program?
Decide which students should be involved in the portfolio assessment program
Which of the following is not one of the five rules for rubrics outlined in Chapter 8?
Employ as many evaluative criteria as possible
Which of the following is not one of the four described steps in the development of a goal-attainment approach to grading?
Ensuring that grading criteria are kept private for security purposes
Mr. Smith, a second-year mathematics teacher in a large urban high school, is seeking frequent reactions to his teacher-made tests from the other mathematics teachers in his school. He typically first secures his colleagues' agreement during informal faculty-lounge conversations, then relays copies of his tests - along with brief review forms - at several points in the school year. Although Mr. Smith simultaneously carries out systematic reviews of his own tests by employing what he regards as a first-rate test-appraisal rubric from his school district, when his own views regarding any of his tests' items conflict with those of his colleagues, he always defers to the reactions of his much more experienced fellow teachers. Which option represents the most accurate statement regarding Mr. Smith's test-improvement efforts?
Even though Mr. Smith is wise in seeking the item-quality reactions of his school's other math teachers - especially because he is only in his second year of teaching - the ultimate decision about the quality of any of his test items should not be deferentially based on collegial input but, rather, based on Mr. Smith's own judgment.
To provide a more complete picture of students' current affective status, it is sensible to ask students to supplement their anonymous responses to a self-report inventory by adding optional, unsigned explanatory comments if they wish to do so.
False
A first-year classroom teacher, George Jenkins, has just finished preparing the initial set of three classroom tests he intends to use with his fifth-grade students early in the school year (one test each in mathematics, language arts, and social studies). In an effort to improve those tests, he has e-mailed a draft version of the three tests to his mother, who provided continuing support for George while he completed his teacher-education coursework as well as a semester-long student teaching experience. He asks his mother to suggest improvements that he might make in the early-version tests. Which best describes George's effort to enhance the quality of his tests?
George could probably have secured better advice about his draft tests had he solicited it from his school's teachers and from his fifth-grade students before they took the tests.
Engaging in assessment improvement is a natural part of the teaching process. However, sometimes reviewing your self-created materials alone can prove problematic. Which represents a reasonable explanation of why self-review can be problematic?
If you created the assessment, you are prone to be biased in its favor.
Which of the following is not one of the five review criteria for judgmentally based improvement procedures listed in Chapter 11?
Item's ease of grading
A high-school biology teacher, Nicholas, relies heavily on his students' test performances when he assigns grades to those students. Typically, he sends his selected-response classroom tests to the district's assessment director who, usually in 24 hours, returns a set of item analyses to the teachers. These analyses usually contain an overall mean and a standard deviation for each class's test performances, as well as p-values and item-discrimination indicators for every item in each test. Nicholas teaches four different biology classes, so four separate analyses are carried out at the district office. Nicholas is pleased that very few of his tests' items display exceptionally high or exceptionally low p-values. Moreover, the vast majority of the items appear to have discrimination indices of a positive .25 or above. Three items have negative discrimination indices. After looking at the phrasing of those items, Nicholas sees how he should revise them to eliminate potentially confusing ambiguities.
Nicholas was appropriately pleased with the results of the district-conducted item analyses, and he made a sensible decision to revise the three negatively discriminating items
Which type of score would indicate a test-taker's standing in relation to that of a norm group?
Percentile score
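As a rough illustration of how a percentile locates one student relative to a norm group, the sketch below counts the share of norm-group scores that fall below a given score. The scores and the function name `percentile_rank` are made up for illustration, and conventions differ (some definitions count scores at or below the student's score), so treat this as one possible reading rather than the textbook's formula.

```python
def percentile_rank(score, norm_scores):
    """Percent of norm-group scores strictly below the given score (one common convention)."""
    below = sum(1 for s in norm_scores if s < score)
    return 100 * below / len(norm_scores)

# Hypothetical norm group of ten raw scores.
norm_group = [48, 52, 55, 60, 61, 67, 70, 74, 79, 85]

# A student with a raw score of 72 outperformed seven of the ten norm-group members.
rank = percentile_rank(72, norm_group)
print(rank)
```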
Which of the following is not a component of the seven-step sequence for portfolio assessment found in Chapter 9?
Portfolio assessment is complicated and should only be done by the teacher.
Which of the following, from a classroom teacher's perspective, is probably the most serious drawback of portfolio assessment?
Portfolio assessment's time-demands on teachers
Which represents an appropriate linear representation of the classic pretest versus posttest model?
Pretest --> Instruction --> Posttest
Mrs. Kate administered a test on the scientific method to her 10th-grade biology students. When Mrs. Kate's administrator reviewed her assigned grades on the test, he noticed that the grades represented a normal distribution. When he asked why, Mrs. Kate said, "The students who performed best in the class received the highest grades and then grades were distributed normally based on class performance."
Relative grading
When considering the clarification of curricular aims, what two groups of stakeholders should be the teacher's focus?
Students and parents
Which of the following would be the most suitable assessment target for a multifocus affective inventory?
Students' interests in different subjects
Which one of the following statements regarding the improvement of classroom assessments is not accurate?
Students' reactions to test items should play little or no role in item improvement.
Which of the following is not a reason that should dissuade policymakers from evaluating educational quality on the basis of students' scores on certain educational achievement tests?
Substantial gaps between minority and majority students' performance on most accountability tests will rarely be found.
Which of the following is not a component of the seven-step sequence for portfolio assessment found in Chapter 9?
Teachers should autonomously decide on evaluative criteria as the content experts.
Which of the following represents a key impediment to teachers' portfolio assessment?
Time demands linked to a teacher's implementation of portfolio assessment
Which would be the best justification for the relatively large amount of time required to respond to many performance-based assessment tasks?
The tasks can provide students with valuable learning opportunities
Which of the following is the most accurate with regard to grading student effort?
There is no clear, commonly defined, approach to evaluating student effort and therefore effort cannot be graded.
Which of the following terms is appropriate for an evaluation model that employs a student's prior achievement and background characteristics as statistical controls to help isolate the effects on student achievement of specific teachers, schools, or districts?
Value-added model
Which of the following indices is most commonly used to represent an item's difficulty?
p-value
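For reference, an item's p-value in this sense is simply the proportion of students who answered the item correctly. A minimal sketch, assuming a made-up matrix of scored responses (1 = correct, 0 = incorrect) and a hypothetical helper named `item_p_values`:

```python
# Hypothetical scored responses: one row per student, one column per item.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 1],
]

def item_p_values(rows):
    """Return, for each item, the proportion of students who answered it correctly."""
    n_students = len(rows)
    return [sum(column) / n_students for column in zip(*rows)]

p_values = item_p_values(responses)
print(p_values)
```

A p-value near 1.0 marks an item almost everyone answered correctly; a p-value near 0 marks an item almost everyone missed.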
Analytic scoring is better than holistic scoring when an educator is trying to _________.
provide diagnostic feedback to students
A ______________________ is a test that is designed to yield either norm-referenced or criterion-referenced inferences and that is administered, scored, and interpreted in a predetermined manner
standardized test
If respondents who are completing an affective self-report inventory that's presented by a computer are informed at the outset that their responses will be anonymous, it can be safely assumed that almost all students will believe their responses to be truly anonymous.
False
Which of the following rules is not one that is recommended when creating a scoring rubric that will have a positive impact on classroom instruction?
Employ as many evaluative criteria as possible to judge major aspects of students' responses.
Many experienced assessors of student affect suggest the most appropriate way for classroom teachers to monitor their students' affect is through the use of:
Anonymously completed self-report inventories
Which is not a factor around which teachers should structure the evaluation of the quality of their instruction?
Evidence constructed exclusively from classroom observations
The only acceptable response options presented to a student who must complete a self-report affective inventory containing statements about the same topic should be the following: Strongly Agree, Agree, Uncertain, Disagree, Strongly Disagree.
False
The vast majority of educators think revealing to students the nature of a teacher's affective curricular aims—at the outset of instruction—is an instructionally sensible action to take.
False
Which best describes the two types of evaluation processes under which teachers are routinely evaluated?
Formative and summative evaluations
Consider the following set of factors that could be employed to judge the quality of the tasks for performance tests. Which one is not generally endorsed as a task-selection factor?
Motivational impact on students
Suppose a group of test scores forms a perfectly normal distribution. Approximately what percentage of test scores will fall within one standard deviation of the mean?
68 percent
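The standard normal-curve figures can be checked numerically. The sketch below uses the error function from Python's standard library; the helper names `share_within` and `share_above` are illustrative only. It shows that roughly 68 percent of scores lie within one standard deviation of the mean, and roughly 2 percent lie more than two standard deviations above it.

```python
import math

def share_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean (both sides)."""
    return math.erf(k / math.sqrt(2))

def share_above(k: float) -> float:
    """Proportion lying more than k standard deviations above the mean (upper tail only)."""
    return (1 - share_within(k)) / 2

within_one = share_within(1)  # about 0.68
above_two = share_above(2)    # about 0.02
print(f"{within_one:.3f} {above_two:.3f}")
```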
About a month into the new school year, high school teacher Rodney Gardner used a 25-item multiple-choice test to measure how well students had mastered a rather large array of factual information about key events in U.S. history. He calculated the p-value for each of the test's 25 four-option items, and he was relatively pleased when the average p-value for the entire set of 25 items was .56. Much later in the year, with the same class, he tried out another assessment tactic with a brand-new test consisting of 40 true/false items. When he calculated p-values for each of the 40 items, he was gratified to discover that the average p-value was .73. Mr. Gardner concluded that because of the p-value "bump" of .17, his students' learning had increased substantially. Please select from the four choices below the statement that most accurately describes Mr. Gardner's interpretation of his students' test results
A serious flaw in Mr. Gardner's conclusion about his students' hypothetical improved learning is that the increase in p value is a univariate result of student learning.
Which of the following steps is not one that should be followed in the creation of a multifocus affective inventory for use in classroom assessment?
Create a series of exclusively positive statements related to each affective variable selected
A teacher's challenge in reducing students' tendency to supply socially desirable answers on a self-report inventory is identical to the challenge in reducing students' socially desirable responses to the items on a cognitive test.
False
When anonymously completed self-report inventories are being used in an attempt to assess students' affect, if some students respond too positively and, at the same time, some students respond too negatively, teachers simply cannot draw valid group-focused inferences about students' affective status.
False
When assessing the affective dispositions of a group of students by using a self-report inventory, it is obligatory to employ a Likert inventory as similar as possible to the inventories introduced by Rensis Likert in 1932.
False
Which characteristic would not be considered a quality of an item that would cause the item to be instructionally insensitive?
Format of the item (binary, multiple choice, etc.)
Given your reading of Chapter 8, which would you anticipate being a limitation of performance-based assessments?
Lengthy administration time
Mr. Miller administered a test to his students following his unit on converting fractions to decimals. He concluded that his class did not respond well to his instruction due to low test scores. While this may seem like common practice on the surface, what piece of information is Mr. Miller missing to make an adequate determination of his students' response to his instruction?
Mr. Miller is missing pretest data.
Ms. Troy is the principal of Sunnyside Elementary School. In order to evaluate one of her first-grade teachers, Mrs. Stelter, Ms. Troy sits down and observes a 30-minute lesson. At the conclusion of the lesson, Ms. Troy concludes that Mrs. Stelter is an impactful teacher. What is the main issue with Ms. Troy's evaluation process?
Ms. Troy has not collected any outcome data.
Consider the three statements. Which are considered characteristics of portfolio assessments?
None of these represent characteristics of portfolio assessment
A substantial number of educators regard affective curricular aims as being equal in importance to cognitive curricular aims—or, possibly, of even greater importance.
True
Most teachers want their students, at the close of an instructional period, to exhibit subject-approaching tendencies (that is, an interest in the subject being taught) equal to or greater than the subject-approaching tendencies those students displayed at the beginning of instruction.
True
Multifocus self-report affective inventories typically contain far fewer items related to each affective variable being measured than do traditional Likert inventories.
True
Which of the following is an instructionally beneficial rubric?
A skill-focused rubric
Mr. Ramirez administered an instructionally diagnostic test to his students in order to better understand how to proceed in his teaching of fractions. The diagnostic test he administered took him nearly three days to score, thus countering its practical usefulness. Which attribute of an instructionally diagnostic test did this test violate?
Ease of usage
Performance assessments are often scored via rubric. Given the knowledge you've gained from this chapter, which type of rubric do you feel is most often best-suited for performance assessments?
Skill-focused rubrics
Given your reading of Chapter 9, which would you anticipate being a major obstacle to the effective use of portfolios?
They are labor intensive.
Given your reading of Chapter 9, which would you anticipate being a common misperception of portfolios?
They consist of a haphazard collection of student work.
Instructionally diagnostic tests are generally designed to yield informative results regarding _____________________.
individual students
An example of analytic scoring would be to evaluate _______.
the performance of each step listed on a process checklist
Consider the three statements. Which are considered characteristics of portfolio assessments?
I and II
Which of the following is not one of the attributes of concern for an instructionally diagnostic test?
Length of assessment
Which of the following is not a common flaw when scoring performance assessments?
Lack of scorer familiarity with the totality of the content being assessed
Which characteristic would not be considered a quality of an item that would cause the item to be instructionally insensitive?
Length of item
For tests intended to provide norm-referenced interpretations, which of the following kinds of items should be sought?
Positive discriminators
Which statistic would yield information regarding the variability of a group's test scores?
Range
Which term represents the number of items that a student has answered correctly on an assessment?
Raw score
Scale scores are converted raw scores that use a new, arbitrarily chosen scale to represent levels of achievement or ability. What is one clear advantage of a scale score?
Scale scores allow for the comparison of several equidifficult forms of a test.
Given your reading in this chapter, which would you suggest are more effectively measured by performance-based assessments?
The ability to formulate problems
Which of the following is not an important rule to be followed in the classroom assessment of student affect?
To monitor students' ever-changing attitudes and interests, assess affect on at least a bi-weekly basis.
For an affective self-report inventory, very young students can be asked to reply to simple statements (sometimes presented orally) by the use of only two or three agreement options per statement.
True
Generally speaking, teachers can make defensible decisions about the impact of affectively oriented instruction by arriving at group-focused inferences regarding their students' affective status prior to and following that instruction.
True
One of the more difficult decisions to be faced when constructing a multifocus self-report affective inventory is arriving at a response to the following question: How many items are needed for each affective variable being measured?
True
When teachers administer an affective assessment to their students early in an instructional program and intend to administer the same or a similar assessment to their students later, the assessments often have a substantial impact on the teacher's instruction.
True
Students' affect is most often measured in school because evidence of students' affect is believed to help predict how students are apt to behave in particular ways later—when those students' educations have been concluded.
True
Which of the following phrases is most commonly used to describe a portfolio focused on a student's self-evaluative and ongoing improvement in the quality of work products?
Working portfolio
Because she is eager for her students to perform well on their 12th-grade senior mathematics tests (administered by the state department of education), Mrs. Williamson gives students answer keys for all of the test's selected-response items. When her students take the test in the school auditorium, along with all of the school's other 12th-graders, she urges them to use the answer keys discreetly, and only if necessary. Mrs. Williamson's activities constitute ________.
a violation of both guidelines
Because there is a statewide reading comprehension test that must be passed by all high-school students before they receive state-sanctioned diplomas, Mr. Gillette, a 10th-grade English teacher, spends about four weeks of his regular class sessions getting students ready to pass standardized tests. He devotes one week to each of the following topics: (1) time management in examinations, (2) dealing with test-induced anxiety, (3) making calculated guesses, and (4) trying to think like the test's item writers. Mr. Gillette's students seem appreciative of his efforts. Mr. Gillette's activities constitute ________.
a violation of the professional ethics guideline
During last year's end-of-school evaluation conference, Jessica Jones, a high-school social studies teacher, was told by the principal that her classroom tests were "haphazard at best." Jessica now intends to systematically review each of the classroom tests she builds, based on her principal's suggestions. She intends to personally evaluate each test on the basis of (1) its likely contribution to a valid test-based inference, (2) the accuracy of its content, (3) the absence of any important content omissions, and (4) the test's fundamental fairness. Which option represents the best appraisal of Jessica's test-review plans?
Although the four test-review factors Jessica chose will help her identify certain deficiencies in her tests, she should also incorporate as review criteria a full range of widely endorsed experience-based and research-based (1) item-specific guidelines and (2) general item-writing guidelines.
Which represents an accurate description of a performance assessment?
An approach to measuring a student's status based on the way the student completes a specified task.
Jurgen James, an experienced mathematics teacher, loves fussing with numbers based on his classroom assessments. He studies the performance of his previous year's students on key tests to help him arrive at criterion-referenced interpretations regarding which segments of his instructional program seem to be working well or badly. Based on the performances of both of his algebra classes, he learns that the differences in p-values for items taken by uninstructed students (based on a first-of-year pretest) and p-values for items taken by instructed students (based on final exams) are staggering. That is, when pretest p-values are subtracted from final exam p-values, the resulting differences are mostly at least .40 or higher. Mr. James concludes that his algebra items were doing precisely the job they were created to do.
"Jurgen came to the correct conclusion, and I'm not surprised by his students' p-value jumps - he is a spectacular math teacher!"
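The comparison Mr. James makes amounts to subtracting each item's pretest p-value from its posttest p-value. A minimal sketch with invented p-values for four items (the numbers are not from the scenario, and the .40 cutoff is taken only from the scenario's wording):

```python
# Hypothetical per-item p-values for the same four items.
pretest_p = [0.20, 0.35, 0.15, 0.50]   # uninstructed students (first-of-year pretest)
posttest_p = [0.75, 0.80, 0.60, 0.85]  # instructed students (final exam)

# Posttest p-value minus pretest p-value, item by item.
differences = [round(post - pre, 2) for pre, post in zip(pretest_p, posttest_p)]

# Flag items whose gain reaches the .40 level mentioned in the scenario.
large_gain = [d >= 0.40 for d in differences]
print(differences, large_gain)
```

Large positive differences suggest items that uninstructed students tend to miss and instructed students tend to answer correctly, which is what a criterion-referenced use of the items calls for.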
The teaching staff in a suburban middle school is concerned with the quality of their school's teacher-made classroom assessments. This issue has arisen because the district school board has directed all schools to install a teacher-evaluation process featuring prominently weighted evidence of students' learning as measured chiefly by teacher-made tests. The district office requires teachers to submit all students' responses from each classroom assessment immediately after those assessments have been administered. Then, in less than 2 weeks after submission, teachers receive descriptive statistics for each test (such as students' means and standard deviations). Teachers also receive an internal consistency reliability coefficient for the total test and a p-value and an item-discrimination index for each item. Teachers then must personally judge the quality of their own tests' items. The teachers' reviews of their tests' individual items are seen as subjective by almost everyone involved, whereas the empirical evidence of item quality is regarded as objective. The school's faculty unanimously decides to weight teachers' own per-item judgments at 25 percent while weighting the statistical per-item p-values and item-discrimination indices at 75 percent. Please select the statement that most accurately characterizes the test-improvement procedures in this suburban middle school.
Because the relevance of traditional item-quality indicators, such as those supplied by this school's district office, can vary depending on the specific use to which a teacher-made test will be put, the across-the-board weightings (25 percent judgmental; 75 percent empirical) may be inappropriate for the proposed teacher-evaluation process.
Which type of instructionally diagnostic test is most commonly found in special education?
Classification-focused
If a teacher sets out to bring about changes in students' values, he or she needs to select as a curricular aim the promotion of only those values that are supported by more than 50 percent of the students' parents and at least half of the general citizenry of the state in which the teacher's school is located.
False
The difficulties stemming from the contamination of students' responses by "social desirability" can be effectively addressed by informing students in the initial directions that they are to identify themselves only after responding to all items.
False
The greater the educational significance that a teacher attributes to the pursuit of affective curricular aims, the more acceptable it is for the teacher to use students' self-report responses not only to arrive at affectively focused inferences about groups of students but also to make inferences about particular students' affective status.
False
Ms. Cooke computed some basic descriptive statistics for each of her two Algebra I classes. Her first period class had a standard deviation of 7.6. Her second period class had a standard deviation of 4.5. Which class has a greater variability in test scores and how do you know?
First period has a greater variability because the standard deviation of 7.6 is larger than the standard deviation of 4.5.
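To see how a larger standard deviation reflects more spread-out scores, the sketch below compares two made-up sets of class scores using Python's `statistics` module. The scores are invented and are not meant to reproduce the 7.6 and 4.5 values in the question; the range is included as a second, cruder indicator of variability.

```python
import statistics

# Hypothetical test scores for two classes.
first_period = [62, 70, 75, 81, 88, 94]
second_period = [72, 75, 78, 80, 82, 85]

def spread(scores):
    """Return (population standard deviation, range) for a list of scores."""
    return statistics.pstdev(scores), max(scores) - min(scores)

sd_first, range_first = spread(first_period)
sd_second, range_second = spread(second_period)

# The class with the larger standard deviation has the more variable scores.
print(sd_first > sd_second, range_first, range_second)
```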
Sue Philips, a health education teacher in a large urban middle school, has recently begun analyzing her selected-response classroom tests using empirical data from students' current performances on those tests. She has acquired a simplified test-analysis program from her district's administrators and applies the program on her own laptop computer. She tries to base students' grades chiefly on their test scores and hopes to find that her items display item-discrimination indices below .20. A recent analysis of items from one of her major classroom tests indicated that three items were negative discriminators. Sue was elated. Please select the statement that most accurately describes Sue's test-improvement understanding.
Given Sue's use of students' test performances to assign grades, her understanding of item-discrimination indices is confused—actually, her items should be yielding strong positive indices rather than low or negative indices.
Which of the following is the most often misinterpreted score-interpretation indicator used with standardized tests?
Grade-equivalent score
Which of the following contentions about affective assessment in the classroom is not accurate?
If classroom affective assessment is introduced, it substantially diminishes the attention given to the measurement of higher-level cognitive outcomes.
Consider the following statements regarding test-preparation. Which one is not accurate?
If teachers simultaneously direct their instruction toward a test's specific items and the curricular aim on which the test is based, this constitutes appropriate test preparation.
Joshua Jenkins teaches a particularly popular high-enrollment series of American government courses in a large suburban high school. He has recently been tinkering with the multiple-choice exams he gives to his students because he wants to select the very best students to take part in an upcoming off-campus community project. He carries out a distractor analysis for one exam, given at the end of a 6-week unit on U.S. politics. He uses the data from the two largest of his four American government classes. After reviewing the distractor analysis data for item 27, Joshua decides to continue using the item in his exam covering the U.S. politics unit.

Distractor Analysis Table

Item No. 27 (p = .62, D = .15)
Response Options     A   B*   C   D   F   Omit
Top 20 students      1   15   1   1   2   0
Bottom 20 students   2   12   2   1   3   0
*Correct Answer

Please consider the following four options, then select the most accurate appraisal regarding Joshua's decision.
Joshua erred in deciding to retain the item because, in order to make the intended norm-referenced interpretations about his students, the discrimination index of .15 is too low.
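The D value reported for item 27 can be reproduced from the counts in the distractor analysis table, assuming the common upper-minus-lower definition of the discrimination index: the proportion of the high group answering correctly minus the proportion of the low group answering correctly. The function name `discrimination_index` is illustrative only.

```python
# Response counts for item 27, copied from the distractor analysis table.
top = {"A": 1, "B": 15, "C": 1, "D": 1, "F": 2, "Omit": 0}
bottom = {"A": 2, "B": 12, "C": 2, "D": 1, "F": 3, "Omit": 0}
correct = "B"

def discrimination_index(high, low, key):
    """Proportion correct in the high group minus proportion correct in the low group."""
    p_high = high[key] / sum(high.values())
    p_low = low[key] / sum(low.values())
    return p_high - p_low

d_index = discrimination_index(top, bottom, correct)
print(round(d_index, 2))
```

With 15 of the top 20 and 12 of the bottom 20 students choosing B, the index is .75 minus .60, which matches the D = .15 shown for the item.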
Which of the following is the most troublesome problem facing those educators who wish to rely heavily on the use of performance tests?
Making valid inferences about students' generalized skill-mastery
Which represents a disadvantage of percentile scores?
None of these are disadvantages of percentile scores.
A parent receives his child's grade equivalent score on the third-grade math assessment. The grade equivalent score is 5.3. The parent calls the teacher to discuss whether or not his daughter should skip the fourth grade. Which of the following pieces of advice is most appropriate?
None of these represent appropriate advice.
Which of the following is not one of the attributes of concern for an instructionally diagnostic test?
Simplicity of content
Mr. Miller has decided to incorporate a pretest/posttest design to his classroom evaluation procedures in order to gather better classroom data to guide his instruction. He created two equidifficult forms of the test he plans to use, divided his class in half, and administered one of the two forms to each half of his class. He then administered the opposite version of the test to each different half. So, in the end, each half of the class had taken both tests. What type of testing design is Mr. Miller following?
Split-and-switch design
Mr. Byron is concerned about his students' end-of-year test scores. In his state, student test data are compared across years. For example, his students' scores on last year's exam were taken into account so that this year, students who scored similarly last year can be compared to one another this year. The comparisons are reported in percentiles such as, "Addy scored as well as or better than 95 percent of her peers who took this assessment and who scored at the same scale score last year." What type of approach to evaluation is being described?
Student growth percentile
If classroom teachers set out to improve their own tests using judgmental approaches, which of the following review criteria is not a factor teachers ought to consider?
The likelihood that, if a test is seen by parents, those parents will recognize the suitability of an item's content coverage
Whenever self-report inventories are employed to measure students' affective dispositions, students' perceptions that their responses are anonymous are typically far more important than is "actual" anonymity
True
Whereas most cognitively oriented classroom assessments attempt to measure students' optimal performances, affectively oriented classroom assessments attempt to get an accurate fix on students' typical dispositions.
True
When considering instructionally diagnostic tests, at least how many strengths and weaknesses should be addressed?
Two
Mrs. Gordon makes sure she has her sixth-grade students practice each fall for the nationally standardized achievement tests required in the spring by the district's school board. Fortunately, she has been able to make photocopies of most of the test's pages during the past several years, so she can organize a highly relevant 2-week preparation unit wherein students are given actual test items to solve. At the close of the unit, students are supplied with a practice test on which about 60 percent of the test consists of actual items copied from the nationally standardized commercial test. Mrs. Gordon provides students with an answer key after they have taken the practice test so that they can check their answers. Mrs. Gordon's activities constitute _______.
a violation of both guidelines
Mrs. Jones, a third-grade teacher, was asked by officials of her state department of education two years ago to serve as a member of a Bias Review Committee whose task was to consider whether a set of not-yet-final items being prepared for the state's annual accountability tests contained any assessment bias that would preclude their use. Even though Mrs. Jones realized that her committee's item-by-item reviews would not be the only factor determining whether such underdeveloped items would actually be used on the state-administered accountability tests, she was convinced that many of the items she had reviewed would end up on those tests. Accordingly, based on the informal notes she had taken during a two-day meeting of the Bias Review Committee, she always makes certain to give her own third-grade students plenty of guided and independent practice in responding to items similar to those she had reviewed. Mrs. Jones generates these practice items herself, always trying to make her practice items resemble the specific details of the items she reviewed. Because a new teacher-evaluation system in her district calls for the inclusion of state test scores of each teacher's students, Mrs. Jones was pleased to see that her own third-graders scored well on this year's state tests. Mrs. Jones's activities constitute ________.
a violation of both guidelines
The district where Todd Blanding teaches high-school chemistry stipulates that up to 100 percent of a teacher's student-growth evidence, used for teacher evaluations, can be based on before-instruction and after-instruction classroom assessments. Todd and the other teachers in his high school realize how important it is for their students to score well on classroom tests, particularly any tests being used to collect evidence of pre-instruction to post-instruction growth. Accordingly, each month the high school's staff participates in content-alike learning communities so they can explore together suitable test-preparation alternatives. Based on these monthly explorations, Todd has developed a pretest-to-posttest instructional approach whereby he never provides item-specific instruction for more than half of the items he intends to use for any upcoming posttest. (Item-specific instruction explicitly explores the nuances of a particular item.) Because at least half of the items on an instructional unit's posttest will not have been discussed in class prior to the posttest, Todd is confident that he can base valid interpretations about students' growth on their pretest-to-posttest performances. Todd's activities constitute __________.
a violation of both guidelines
Mr. Thompkin teaches mathematics in an urban middle school serving many students from lower-income families. Although Mr. Thompkin personally finds his district's heavy emphasis on educational testing to be excessive, he concedes that his students will benefit from scoring well on the many math tests he is obliged to administer during a school year. Because most of his students cannot afford to enroll in the commercial test-preparation programs that are available throughout his city, Mr. Thompkin entices a psychologist friend of his - a friend who is particularly knowledgeable about test-taking skills - to visit all of his courses one day during the first month of school. The psychologist explains to students not only how to take tests successfully but also how to prepare in advance for any high-stakes testing situations. Mr. Thompkin believes one class period per year that's focused on test-taking rather than learning mathematics is a decent trade-off for his students. Mr. Thompkin's activities constitute ________.
a violation of neither guideline
Mrs. Hilliard knows the reading test administered to all state eighth graders contains a set of five fairly lengthy reading selections, each of which is followed by about eight multiple-choice items dealing with such topics as (1) the main idea of the selection or the main idea of its constituent paragraphs, (2) the meaning of technical terms that can be inferred from contextual clues, and (3) the defensibility of post-reading inferential statements linked to the selection. Mrs. Hilliard routinely spends time in her eighth-grade language arts class trying to improve her students' reading comprehension capabilities. She has the students read passages similar to those used in the statewide test, then gives her students a variety of practice tests, including written multiple-choice, true-false, and oral short-answer tests in which, for example, individual students must state aloud what they believe to be the main idea of a specific paragraph in the passage. Mrs. Hilliard's activities constitute _________.
a violation of neither guideline
Ms. Sanchez realizes that many of her fourth-graders are relatively recent arrivals in the United States, having come from Mexico and Central America. Most of her students speak English as a second language and possess limited experience in taking the kinds of standardized tests used so frequently these days in U.S. schools. Accordingly, Ms. Sanchez has located a number of English-language standardized tests for her fourth-grade students, and she has photocopied segments of the tests so the introductory pages will be available to all of her students. Once every few weeks, Ms. Sanchez asks her fourth-graders to spend classroom instructional time trying, as she says, to "make sense" out of these tests. About 20 minutes is devoted to students' reading the tests' directions and then determining if they can understand specifically how they are to complete each of the standardized tests. She makes no copies of any items other than those used in a test's directions. Ms. Sanchez's activities constitute _________.
a violation of neither guideline
Fred Phillips prepares his sixth-grade social studies students to do well on a state-administered social studies examination by having all of his students take part in practice exercises using test items similar to those found on the state examination. Fred tries to replicate the nature of the state examination's items without ever using exactly the same content as it is apt to appear on the examination. He weaves his test-preparation activities into his regular social studies instruction so cleverly that most students really don't know they are receiving examination-related preparation. Fred Phillips' activities constitute _________.
a violation of the educational defensibility guideline
Srijati is eager to have her fourth-grade students become better "close readers"—that is, to be better able to read written materials carefully so that they are capable of, as Srijati says, "sucking all of the meaning out of what they read." Because of reductions in assessment funds, Srijati's school district has been obliged to eliminate all constructed-response items assessing students' reading comprehension. All items measuring students' reading comprehension, therefore, must be selected-response types of items and, beyond that, district officials have indicated that only three specific item types will be used in district-developed reading tests. So that her students will perform optimally on the district-developed reading tests, Srijati provides "close-reading practice" based exclusively on the three district-approved ways for students to display their reading comprehension. Srijati's fourth-graders really shine when it is time to take the district reading tests. Srijati's activities constitute _______.
a violation of the educational defensibility guideline
An advantage of performance-based assessments over achievement tests is that they can be used to evaluate _________.
both the process and product of a task