ED 412 Final

Which of the following statements could be technically correct?

"The test-based inference is valid"

Essay Item Guidelines

1. Convey to students a clear idea regarding the extensiveness of the response desired.
2. Construct items so the student's task is explicitly described.
3. Provide students with the approximate time to be expended on each item as well as each item's value.
4. Do not employ optional items.
5. Precursively judge an item's quality by composing, mentally or in writing, a possible response.

Matching Item Guidelines

1. Employ homogeneous lists.
2. Use relatively brief lists, placing the shorter words or phrases at the right.
3. Employ more responses than premises.
4. Order the responses logically.
5. Describe the basis for matching and the number of times responses may be used.
6. Place all premises and responses for an item on a single page/screen.

T/F Guidelines

1. Phrase items so that a superficial analysis by the student suggests a wrong answer.
2. Rarely use negative statements, and never use double negatives.
3. Include only one concept in each statement.
4. Have an approximately equal number of items representing the two categories being tested.
5. Keep item length similar for both categories being tested.

Guidelines for Scoring Essay Responses

1. Score responses holistically and/or analytically.
2. Prepare a tentative scoring key in advance of judging students' responses.
3. Make decisions regarding the importance of the mechanics of writing prior to scoring.
4. Score all responses to one item before scoring responses to the next item.
5. Insofar as possible, evaluate responses anonymously.

Guidelines for Multiple T/F:

1. Separate item clusters vividly from one another.
2. Make certain that each item meshes well with the cluster's stimulus material.

Guidelines for Multiple Choice:

1. The stem should consist of a self-contained question or problem.
2. Avoid negatively stated stems.
3. Do not let the length of alternatives supply unintended clues.
4. Randomly assign correct answers to alternative positions.
5. Never use "all-of-the-above" alternatives, but do use "none-of-the-above" alternatives to increase item difficulty.

What are the 5 General Item-Writing Commandments?

1. Thou shall not provide opaque directions to students regarding how to respond to your assessment instruments.
2. Thou shall not employ ambiguous statements in your assessment items.
3. Thou shall not provide students with unintentional clues regarding appropriate responses.
4. Thou shall not employ complex syntax in your assessment items.
5. Thou shall not use vocabulary that is more advanced than required.

Short Answer Guidelines

1. Usually employ direct questions rather than incomplete statements, particularly for young students.
2. Structure the item so that a response should be concise.
3. Place blanks in the margin for direct questions or near the end of incomplete statements.
4. For incomplete statements, use only one or, at most, two blanks.
5. Make sure blanks for all items are equal in length.

What is this item's dominant shortcoming?
• Which of the following men was a U.S. president during the twentieth century?
A. Franklin Roosevelt
B. Jimmy Carter
C. Dwight Eisenhower
D. All of the above

The dreaded "all of the above" alternative was used.

What is this item's dominant shortcoming?
• Directions: On the line to the left of each state in Column A, write the letter of the city from the list in Column B that is the state's capital. Each city in Column B can be used only once.
Column A                Column B
_____ 1. Oregon         a. Bismarck
_____ 2. Florida        b. Tallahassee
_____ 3. California     c. Los Angeles
_____ 4. Washington     d. Salem
_____ 5. Kansas         e. Topeka
                        f. Sacramento
                        g. Olympia
                        h. Seattle

The responses are not ordered logically.

What is this item's dominant shortcoming?
Which of the following isn't an example of how one might collect construct-related evidence of validity?
A. intervention studies
B. test-retest studies
C. differential-population studies
D. related-measures studies

This item violates the guideline urging the avoidance of negatives in a multiple-choice item's stem. If a negative is employed, it should be clearly identified rather than hidden as an unitalicized contraction of "is not."

The most compelling empirical support for formative assessment is that supplied by:

A 1998 research review by Paul Black and Dylan Wiliam of classroom-assessment studies

As most educators currently use the expression "content standard," to which of the following is that phrase most equivalent?

A curricular aim

Formative assessment is best thought of as:

A process in which assessment-elicited evidence informs adjustment decisions

Which of the following is typically recommended for use with students who have the most serious cognitive disabilities?

Alternate assessments

An English teacher has created three versions of an exam. He decides he wants to see if all three versions of the examinations are providing students with the same kind of challenge. What sort of reliability evidence should he gather?

Alternate-form

Which of the following best defines the standard error of measurement?

An estimate of the consistency of an individual's performance on a test
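A worked formula may help here. Assuming the test's score standard deviation (s_X) and its reliability coefficient (r_XX) are known, the standard error of measurement is commonly estimated as

SEM = s_X \sqrt{1 - r_{XX}}

so a test with a standard deviation of 10 and a reliability of .91 would have an SEM of about 10 × √.09 = 3 points, and a student's obtained score would be expected to fall within roughly ±1 SEM of the true score about two-thirds of the time.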

Which of the following is an attribute of the Smarter Balanced assessments that distinguishes them from PARCC assessments?

Assessments utilize Computer Adaptive Technology (CAT)

Which of the following types of tests is most frequently, but mistakenly, pushed by its developers as formative assessments? Select one:
a. Nationally standardized achievement tests
b. Interim assessments
c. Tests in the National Assessment of Educational Progress
d. Statewide annual accountability tests

B) Interim assessments

Formative assessment is most similar to which of the following? Select one:
a. Assessment as learning
b. Assessment against learning
c. Assessment for learning
d. Assessment of learning

C) Assessment for learning

What kind of instructional objective is Mrs. Jenkins pursuing when she wants her students to become able to distinguish between facts and opinions presented in a newspaper's "Letters to the Editor"?

Cognitive

Using Bloom's taxonomy, what kinds of educational objectives are most frequently measured by teachers' classroom assessments?

Cognitive (lower level)

Reliability=

Consistency

Intervention studies, differential-population studies, and related-measures studies are all instances of ways to collect:

Construct-related evidence of validity

Classroom teachers are most apt to focus on which of the following:

Content-related evidence of validity

What appears to be the most significant obstacle to creating assessments of all ELL students using the first-language of those students?

Cost

What kind of evidence is most eagerly sought by the commercial testing firms that develop academic aptitude tests?

Criterion-related evidence of validity

Although many people use the terms "grading" and "evaluation" as though the two words are synonymous, what is the technical difference between these two terms?

Evaluation deals with the determination of a teacher's instructional effectiveness; grading deals with letting students know how well they are performing

T/F: For a test item to be biased, it must offend at least one group of students on the basis of those group members' personal characteristics, such as race, religion, or gender

F

T/F: "English Language Learners" ELL are those students who have been identified as capable of responding appropriately to an English-language assessment

False

T/F: "Teaching to the test," having been employed as a descriptive phrase for so many years by both educators and laypersons, is fairly well understood by most people.

False

T/F: All student responses to essay tests should be first scored analytically, then scored holistically.

False

T/F: Because part of the function of an essay item is to distinguish between high-achievers and low-achievers, it is not necessary for the items to be constructed so all students can understand the task.

False

T/F: Because students' parents can ordinarily become heavily involved in portfolio assessment, a teacher's first task is to make sure parents "own" their child's portfolio.

False

T/F: Current-form preparation, that is, special instruction based on an existing form of a test, can be appropriate in some situations.

False

T/F: Directions regarding how students should respond to classroom assessments intended to be especially challenging should be somewhat opaque in order to increase the assessment's difficulty.

False

T/F: Empirical bias-detection techniques, especially DIF-based approaches, will invariably identify more biased items than will even a well-trained and representative bias-review committee.

False

T/F: For short-answer items employing incomplete statements, use at least two blanks per item, preferably more.

False

T/F: For short-answer items, place blanks for incomplete statements near the beginning of the statement.

False

T/F: Fortunately, well organized teachers do not need to devote much time to the conduct of portfolio conferences.

False

T/F: In order to help a teacher's more able students respond to essay questions, it is acceptable to employ fairly advanced vocabulary terms in the directions regarding how to respond to such questions.

False

T/F: Most standardized achievement tests are accompanied by fairly explicit descriptions of what is being measured, such descriptions being sufficiently clear for most teachers' instructional planning purposes.

False

T/F: Students should be asked to review their own work products only near the end of the school year so their self-evaluations can be more accurate.

False

T/F: Students should rarely be involved in the determination of the evaluative criteria by which a portfolio's products will be appraised.

False

T/F: Students' work products must be stored in file folders, then placed in a lockable metal file cabinet to prevent unauthorized use of a student's portfolio.

False

T/F: Teachers should prepare a tentative scoring key after first scoring, say, a half-dozen students' responses to an essay item.

False

T/F: The Common Core Standards were designed to be set at the median level of all states' standards so that they were not too high or too low

False

T/F: The use of "clone" items in test-preparation activities is highly recommended because it is consistent with both of the chapter's two test-preparation guidelines.

False

T/F: There is, in reality, no truly acceptable way a teacher can prepare students for a high-stakes test.

False

T/F: To best assess students' ability to respond to essay items, allow students to choose among optional items.

False

T/F: When teachers score their students' responses to essay questions, teachers should score all of a particular student's responses to different questions, then move on to scoring the next student's responses.

False

T/F: Mrs. Peterson collects monthly affective inventories (entitled "How Am I Doing?") from the students in her instructional technology classes. She not only uses anonymous self-report inventories, but also encourages students to suggest ways that she could improve her teaching.

False-- (Remedy: Do not ask for handwritten, hence potentially identifiable, student suggestions or, if important, ask students to submit such suggestions on sheets separate from the self-report inventories.)

T/F: Mrs. Hashizuma has her students complete self-report, multifocus affective inventories at the beginning of each school year. Then, just as the inventories have been completed, she announces that students are to go to the back of the room and place their completed inventories in a large box intended for that purpose.

False-- (Remedy: Mrs. Hashizuma should have made her announcement about the collection method before students began to complete their inventories. The idea with such anonymity-enhancement collection methods is to induce honest responding by students. A post-completion announcement would have had no positive impact on the candor of students' responses.)

T/F: Ms. Stafford is convinced that she can make accurate inferences about the affective status of individual students, and thereupon can make better instructional decisions about them. Accordingly, she asks students to write their names on the reverse side of their multifocus attitude inventories.

False-- (Remedy: Ms. Stafford's confidence notwithstanding, students' anonymity must be preserved if they are to respond truthfully. Otherwise, students are likely to produce "socially desirable" responses instead of honest ones. Inferences about individual students, therefore, are likely to be invalid. Ms. Stafford should employ anonymously completed inventories and be satisfied with valid group-based inferences.)

T/F: Mr. Evans uses an anonymous self-report affective inventory to gauge students' current preferences regarding highly visible Republican and Democratic political figures.

False-- (Remedy: Select noncontroversial affective targets for assessment.)

T/F: Miss Meadows assesses her English students' affect at the end of each semester. She then arrives at interpretations regarding the attitudes toward English of each class of students.

False-- (Remedy: She should use a pre-instruction as well as a post-instruction assessment to account for differences in her students' entry attitudes.)

T/F: The heart of the Professional Ethics Guideline is that teachers should not prepare students for tests in a way that violates universal canons of fundamental morality. (this one is tricky. read question and text carefully)

False-- In addition to "general ethical" considerations (lying, stealing, etc.), teachers should be governed by ethical considerations within their profession. The key phrase was "Professional Ethics."

If a student's raw score were at the 45th percentile, in what stanine would the student have scored? Select one: a. Sixth b. Fourth c. Fifth

Fifth
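The arithmetic behind that answer, assuming the conventional stanine widths of 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent for stanines 1 through 9: the cumulative percentile boundaries fall at 4, 11, 23, 40, 60, 77, 89, and 96, so the 45th percentile lies between 40 and 60 and lands in the fifth stanine. A minimal Python sketch of that mapping (the function name is illustrative):

from bisect import bisect_right

# Upper percentile boundaries for stanines 1 through 8; anything above 96 is stanine 9.
STANINE_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine_from_percentile(percentile):
    # Map a percentile rank (0-100) to a stanine, assuming the
    # conventional 4-7-12-17-20-17-12-7-4 percent stanine widths.
    return bisect_right(STANINE_BOUNDS, percentile) + 1

print(stanine_from_percentile(45))  # -> 5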

If a teacher wished to secure "nonpartisan blind-scoring" of students' test performances, how could this best be done if the teacher employs a pretest-posttest design or a split-and-switch design?

For either of these two designs, the pretests and posttests (left undated) should be coded so they can be identified after scoring; the pretests and posttests are then mixed together so that scorers do not know whether a student's response was made prior to or following instruction. If parents, members of the business community, or teachers from other schools can be enlisted as blind-scorers, they will be seen as nonpartisans.

What does it mean when the use of traditional standardized achievement tests is rejected because of these tests' "technical tendency to exclude items covering important content?"

In order to create a suitable score spread among test-takers, items with high p values are typically not included in such tests or are removed at test-revision time. Yet the items on which students perform well often cover the content teachers thought important enough to stress. The better students do on an item, the less likely that item, and the important, teacher-stressed content it covers, will be found on certain educational accountability tests.
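For reference, an item's p value is simply the proportion of examinees answering it correctly (p = number correct / number of examinees), so items that almost everyone answers correctly contribute little to score spread. A minimal Python sketch of that screening logic; the .90 cutoff is an illustrative assumption, not a fixed rule:

def p_value(item_scores):
    # item_scores is a list of 0/1 results for one item across all examinees.
    return sum(item_scores) / len(item_scores)

items = {
    "item_1": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],  # p = .90, very easy
    "item_2": [1, 0, 1, 0, 1, 1, 0, 1, 0, 1],  # p = .60
}
too_easy = [name for name, scores in items.items() if p_value(scores) >= 0.90]
print(too_easy)  # -> ['item_1'], the kind of item a spread-seeking test would drop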

Instruction : _____ :: Curriculum : _____

Instruction : Means :: Curriculum : Ends

What is meant by "instructional sensitivity"?

Instructional sensitivity is the degree to which students' performances on a test accurately reflect the quality of instruction specifically provided to promote students' mastery of what is being assessed.

Suppose you were trying to show that the items constituting an educational assessment procedure all measured the same underlying trait. What sort of evidence should you attempt to secure?

Internal Consistency
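One common single-administration estimate of internal consistency for dichotomously scored items is the Kuder-Richardson formula 20:

KR_{20} = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} p_i q_i}{\sigma_X^2}\right)

where k is the number of items, p_i is the proportion of examinees answering item i correctly, q_i = 1 - p_i, and \sigma_X^2 is the variance of the total scores. Cronbach's coefficient alpha generalizes this formula to items that are not scored simply right or wrong.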

James Popham asserts that there is no such thing as a "valid test". What does he mean?

It's the validity of the score-based inference that is at issue

What are the four levels of formative assessment?

Level 1-- Teachers' Instructional Adjustments
Level 2-- Students' Learning Tactic Adjustments
Level 3-- Classroom Climate Shift
Level 4-- Schoolwide Implementation

Which of the following levels of formative assessment best characterizes a fundamental shift in classroom climate? Select one: a. Level 3 b. Level 1 c. Level 4 d. Level 2

Level 3

What should be the two major concerns of a classroom teacher who wishes to eliminate bias in the teacher's assessment instruments?

Offensiveness and unfair penalization of items in the teacher's tests

Which of the following represents the most synonymous label for "student academic achievement standards"?

Performance standards

Identify the four common categories of accommodations typically considered for either assessment or instruction (as identified by CCSSO, 2005)

Presentation, response, setting, and timing/scheduling

Mr. Acura has persuaded a colleague to administer Mr. Acura's seventh-grade science test to the colleague's seventh-grade social studies students (who have not yet taken a science class). Mr. Acura, who wants his classroom assessments to yield criterion-referenced interpretations about his students, is delighted to learn that in the item-by-item comparisons between the "instructed" science students and the "uninstructed" social studies students, Mr. Acura's science students dramatically out-perform their social studies counterparts. Mr. Acura decides to leave his tests largely unaltered. Was Mr. Acura's decision right or wrong?

Right

Mr. Goldberg, a kindergarten teacher, has recently compared his observation-based classroom assessments with the recently state-approved Content Standards for Kindergartners. He discovered that fully five of the new content standards are not even addressed in his observation-based assessments. Accordingly, he decides to alter his observation instrument so all of the new state-sanctioned kindergarten content standards will be represented by his classroom tests. Was Mr. Goldberg's decision right or wrong?

Right

Mr. Hubbart, a science teacher, reviews the content of his own tests he created two years ago. He discovers the content in a half-dozen items has been rendered inaccurate by recent studies published in scientific research journals. Without checking with any colleagues, he decides to revise the half-dozen items so that they are consistent with the latest research findings. Was he right or wrong?

Right

Mr. Villa uses one of his chemistry tests to discriminate among students so the most able students can take part in a special competition sponsored by the National Science Foundation. A testing specialist at the school district's office has performed several analyses on Mr. Villa's items indicating at least a fourth of the items on this test have discrimination indices of less than .15. Seeing these results, Mr. Villa decides to alter those items so they might discriminate more efficiently. Was Mr. Villa's decision right or wrong?

Right
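As background, one common way to compute an item's discrimination index is to contrast the proportions answering the item correctly in the highest- and lowest-scoring groups (often the top and bottom 27 percent of examinees): D = p_upper - p_lower, so an index below .15 means the item barely separates high scorers from low scorers. A minimal Python sketch under that assumption:

def discrimination_index(upper_correct, upper_n, lower_correct, lower_n):
    # D = proportion correct in the upper group minus proportion correct
    # in the lower group (e.g., the top and bottom 27% of total scorers).
    return upper_correct / upper_n - lower_correct / lower_n

# Example: 14 of 20 high scorers and 12 of 20 low scorers answered the item correctly.
print(round(discrimination_index(14, 20, 12, 20), 2))  # -> 0.1, a weak discriminator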

Mrs. Chang, a fifth-grade teacher, asks a colleague to review her major examinations. The colleague has recently completed a graduate course in classroom assessment and has expressed a willingness to help other teachers improve their classroom tests. The colleague identifies about 10 items that clearly violate item-writing rules based on the item-type involved. Mrs. Chang decides to change the items as suggested because all 10 items, as she re-examines them, do appear to have problems. Was Mrs. Chang's decision right or wrong?

Right

Which of the following is NOT a traditionally cited reason classroom teachers need to know about assessment?
-So teachers can monitor students' progress
-So teachers can assign grades to students
-So teachers can diagnose students' strengths and weaknesses
-So teachers can clarify their instructional intentions

So teachers can clarify their instructional intentions

A history teacher, Mrs. Scoggins, tries to determine the consistency of her tests by occasionally readministering them to her students, then seeing how much similarity there was in the way her students performed. What kind of reliability evidence is Mrs. Scoggins attempting to collect?

Stability

A self-report inventory focusing on students' interest in history was field-tested by being administered to 1,000 students in early October and again in late November. The inventory's developers wanted to see if students' interest in history shifted over a seven-week period. What brand of reliability evidence was the focus of the field-test?

Stability

Test-retest data regarding assessment consistency is an instance of:

Stability reliability
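In practice, stability (test-retest) reliability is usually reported as the correlation between the scores from the two administrations. A minimal Python sketch that computes a Pearson r by hand; the score lists are invented for illustration:

from math import sqrt

def pearson_r(x, y):
    # Pearson correlation between two equal-length lists of scores.
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

first_administration = [78, 85, 62, 90, 71]    # hypothetical scores, time 1
second_administration = [80, 83, 65, 92, 69]   # hypothetical scores, time 2
print(round(pearson_r(first_administration, second_administration), 2))  # high r -> stable scores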

T/F: Even if the individual items in a test are judged to be bias-free, it is possible for the total set of items, in aggregate, to be biased

True

T/F: When Ms. Marks tries out an early version of a 20-item multifocus affective inventory intended to assess five different factors related to students' interest in pursuing a post-secondary education, she discovers that most of the four items related to each factor appear to be measuring different things. Accordingly, she concludes that her multifocus inventory does, indeed, appear to measure students' affect regarding the five factors in which she is interested.

True

It is said that one reason students' scores on educational accountability tests should not be used to evaluate instruction is that there is a "teaching-testing mismatch." What does this mean?

Teaching-testing mismatches signify that a substantial amount of the content contained in the test's items may not have been taught, or may never even have been supposed to be taught.

The Common Core State Standards were developed by...

The National Governors Association (NGA) and the Council of Chief State School Officers (CCSSO)

It was suggested in the chapter that World War I's Army Alpha was a particularly influential aptitude test that shaped traditional psychometric thinking about how educational tests should be built. What was the measurement mission of the Alpha?

The chief measurement mission of the Army Alpha was to provide comparative score interpretations for test-takers.

Why is test-based teacher evaluation consonant with federal insistence that student growth constitute "a significant factor" in the evaluation of teachers?

The essence of test-based teacher evaluation is to make certain that students' test scores—reflecting student growth—constitute a dominant determiner of judgments reached about a teacher's quality.

What is one reason that a simple one-group, pretest-posttest design will rarely yield convincing evidence of a teacher's instructional effectiveness?

The pretest is usually reactive, and this often confounds how students react to the instruction with how they react to the posttest. Alternatively, a pretest and posttest of different difficulty provide difficult-to-interpret evidence of the instruction's impact.

Which of the following statements best describes the relationship among the three sanctioned forms of reliability evidence?

The three forms of evidence represent fundamentally different ways of representing a test's consistency

Which of the following views regarding the assessment of English language learners (ELL) is most defensible?

The use of assessment accommodations for ELL students typically leads to more valid test-based inferences about those students

What is this item's dominant shortcoming?
If a classroom teacher actually computed a Kuder-Richardson coefficient for a final exam, this would be an example of an:
A. stability reliability coefficient
B. internal consistency reliability coefficient
C. content validity coefficient
D. construct validity coefficient

The use of the article "an" renders only Choice B ("internal consistency reliability coefficient") grammatically correct, thereby supplying an unintended clue to the answer.

What is one reason that the "split-and-switch" design can provide believable evidence of a teacher's instructional effectiveness?

This data-gathering design provides (based on a 50 percent sample of a teacher's class) two pretest-to-posttest contrasts using an identical pretest and posttest, but contrasts in which students will not have already seen the posttest they are given.

What is this item's dominant shortcoming? • True? Or False? A classroom test that validly measures appropriate knowledge and/or skills is likely to be reliable because of the strong link between validity and reliability.

This is a double-concept item

What is this item's dominant shortcoming? Correct? Or Incorrect? When teachers assess their students, it is imperative they understand the content standard on which the test is based.

This item is ambiguous because it is unclear to whom the pronoun "they" refers, that is, to the "teachers" or the "students" in the statement.

What is this item's dominant shortcoming?• True? Or False? Test items should never be constructed that fail to display a decisive absence of elements which would have a negative impact on students because of their gender or ethnicity.

This item maximizes, rather than minimizes, negatives.

What is this item's dominant shortcoming? • True? Or False? Having undertaken a variety of proactive steps to forestall the inclusion of items that might, due to an item's content, have an unwarrantedly adverse impact on students because of any student's personal characteristics, the test-developers then, based on sound methodological guidelines, should carry out a series of empirically based bias-detection studies.

This item royally violates the item-writing commandment about the avoidance of complex syntax.

What is this item's dominant shortcoming? • True? Or False? One of the most heinous transgressions in the genesis of assessment items is the item's incorporation of obfuscative verbiage.

This item violates the commandment to which it refers by using excessively advanced vocabulary terms.

What is this item's dominant shortcoming?
A set of properly constructed binary-choice items will:
A. Typically contain a substantially greater proportion of items representing one of the two alternatives available to students.
B. Incorporate qualities that permit students to immediately recognize the category into which each item falls, even based on only superficial analyses.
C. Vary the length of items representing each of the two binary-option categories so that, without exception, shorter items represent one category while longer items represent the other.
D. Contain no items in which more than a single concept is incorporated in each of the items.

This item's stem is too terse and its alternatives too verbose.

What best describes the stated purpose of the Common Core State Standards?

To ensure that all students graduate college and career ready

T/F: A teacher's instruction should be directed toward the body of knowledge and/or skills represented by a test rather than toward the actual items on a test.

True

T/F: Construct all essay items so a student's task is explicitly described.

True

T/F: Early in a school year, a teacher who is using a portfolio assessment should make sure the students' parents understand the portfolio process.

True

T/F: For essay items, make decisions regarding the importance of the mechanics of students' writing prior to scoring students' responses.

True

T/F: Generalized test-taking preparation, if not excessively lengthy, represents an appropriate way to ready students for a high-stakes test.

True

T/F: Generally speaking, the marked increase in inappropriate test-preparation during the last decade or so stems chiefly from the imposition of high-stakes tests associated with a widespread push for educational accountability.

True

T/F: In general, a wide variety of work products should be included in a portfolio rather than a limited range of work products.

True

T/F: In order for students to evaluate their own efforts, the evaluative criteria to be used in judging a portfolio's work products must be identified, then made known to students.

True

T/F: Insofar as possible, classroom teachers should evaluate their students' essay responses anonymously, that is, without knowing which student wrote which response.

True

T/F: Mr. Gomez is confident the parents of his sixth-grade students will uniformly want their children to possess positive attitudes toward learning. Accordingly, he develops a 10-item self-report inventory dealing with such attitudes. He administers it anonymously at the beginning and end of each school year.

True

T/F: Mrs. Chappel teaches third-grade students. In an effort to form an inference about her students' attitudes toward mathematics, she uses an eight-item self-report inventory with only three response options (that is, Agree, Not Sure, Disagree) rather than the five levels of disagreement often used for older children (that is, Strongly Agree, Agree, Uncertain, Disagree, Strongly Disagree).

True

T/F: Ms. Johnson asks her fourth-grade students, on a pretest and posttest basis, to complete anonymously a self-report affective inventory about their interest in the subjects she teaches. Ms. Johnson uses students' responses to arrive at a group-based inference about the aggregate interests of her students

True

T/F: Parents should become actively involved in reviewing the work products in a child's portfolio.

True

T/F: Provide students with the approximate time to be expended on each essay item as well as each item's value.

True

T/F: Short-answer items, especially those intended for young children, should employ direct questions rather than incomplete statements.

True

T/F: The essence of the Educational Defensibility Guideline is that a suitable test preparation practice will boost students' mastery on both a curricular aim's domain knowledge and/or skill as well as the test representing that curricular aim.

True

T/F: Typically, judgement-only approaches to the detection of item bias are employed prior to use of empirical bias-detection techniques

True

T/F: When creating a multifocus affective inventory for his French classes, Mr. Bouvier asks a group of his colleagues to review draft statements for his new self-report inventory. Any statements that his colleagues do not universally classify as positive or negative are discarded by Mr. Bouvier.

True

T/F: When held, portfolio conferences should not only deal with the evaluation of a student's work products, but should also improve the student's self-evaluation abilities.

True

T/F: When using blanks for short-answer incomplete statements, make sure the blanks for all items are equal in length

True

When evaluating a teacher's instruction, what is the difference between formative evaluation and summative evaluation?

Whereas formative evaluation focuses on decisions aimed at the improvement of a teacher's ongoing instruction, summative evaluation deals with more permanent go/no-go decisions such as termination or the granting of tenure

A sixth-grade teacher, Mrs. Jones, has subjected her major tests (which she hopes will yield accurate criterion-referenced inferences) to a pretest-posttest type of item analysis. Because she discovers that almost all of her items reflect substantial pre-to-post increases in the number of students' correct responses, Mrs. Jones decides to make about half of her items more difficult so students' pre-to-post improvements will not be so pronounced. Was Mrs. Jones' decision right or wrong?

Wrong

Although Ms. Thompson wants her classroom tests to provide evidence for her to make criterion-referenced interpretations about each of her students' current achievement levels, she still wants her tests to be "technically superior." As a consequence, she has decided to replace all of the binary-choice items in her test for which students' response data indicate the items are nondiscriminators. Was Ms. Thompson's decision right or wrong?

Wrong

Mr. Gurtiza teaches English to high school seniors. Although he employs writing samples to assess students' composition skills, many items in his tests are selected-response in nature. To improve his exams, Mr. Gurtiza asks his students, as they are taking their exams, to circle the numbers of any multiple-choice items that they find need revisions because of ambiguity, incorrectness, and so on. Mr. Gurtiza then tallies the frequency of items whose numbers have been circled. He decides to discard or modify any item circled by at least 20 percent of his students. Was Mr. Gurtiza's decision right or wrong?

Wrong

Mr. Robinson decides to eliminate all items from his pretest on which students score too well. He plans to eliminate most subsequent instruction dealing with the content represented by those items. He finds that four of his pretest's items have p values of .20 or less. As a result, he decides to replace each of these easy items with more difficult ones. Was Mr. Robinson's decision right or wrong?

Wrong

Parents who want their children to score high on standardized achievement tests would be most happy if their child earned which of the following percentiles? a. 99th b. 2nd c. 50th

a. 99th

Which of the following score-interpretation options is most often misunderstood? a. Grade equivalents b. Percentiles c. Stanines

a. Grade equivalents

Which one of the following three ways of interpreting standardized test scores has a descriptive statistical function fundamentally different than the other two? a. Mean b. Standard deviation c. Range

a. Mean

Which of the following is not a group-focused index employed in the interpretation of students' test performances? a. Raw score b. Standard Deviation c. Median

a. Raw score

If two sets of test scores indicate that Score-Set X has a standard deviation of 10.2, while Score-Set Y, on the same test, has a standard deviation of 8.4, what does this signify?
a. The performances of the X students were more variable than those of the Y students.
b. The scores of the Y students were more spread out than those of the X students.
c. Students in Score-Set X outperformed their Score-Set Y counterparts.

a. The performances of the X students were more variable than those of the Y students.
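A minimal Python sketch of what the comparison means; the score sets below are invented for illustration, and the larger standard deviation of Score-Set X simply indicates that its scores are more spread out around their mean:

from statistics import pstdev

score_set_x = [35, 48, 52, 60, 71, 84]  # hypothetical scores, widely spread
score_set_y = [50, 54, 57, 60, 63, 66]  # hypothetical scores, tightly clustered

print(round(pstdev(score_set_x), 1))  # larger SD -> more variable performance
print(round(pstdev(score_set_y), 1))  # smaller SD -> less variable performance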

Most score-interpretation indices for large-scale tests rely on a fundamentally similar interpretational framework. That framework is best described as: a. relative b. absolute c. arbitrary

a. relative

Validity=

Accuracy

Which of the following score-interpretation indices were initially introduced to permit amalgamation of students' scores on different standardized tests? a. Grade equivalents b. Normal curve equivalents c. Stanines

b. Normal curve equivalents

Which of the following score-interpretation options is most readily interpretable? a. Grade equivalents b. Percentiles c. NCEs

b. Percentiles

Small/Broad Scope: Students will be able to write sentences in which verbs agree in number with relevant nouns and pronouns

broad

Small/Broad Scope: Students will compare and contrast findings presented in a text to those from other sources, noting when the findings support or contradict previous explanations or accounts

broad

Small/Broad Scope: Students will assess how point of view or purpose shapes the content and style of a text

broad

The chief ingredients of a learning progression are its:

building blocks

If a teacher set out to install a variation of formative assessment in which assessment-elicited evidence was being used chiefly to permit students to modify learning tactics, which level of formative assessment would be the teacher's dominant focus? Select one: a. Level 4 b. Level 3 c. Level 2 d. Level 1

c. Level 2

Which of the following would not be a clear instance of summative assessment? Select one:
a. When results of accountability tests are employed to evaluate instructional quality
b. When teachers use students' performances on major exams to issue semester or school-year grades
c. When students employ their performances on classroom tests to decide whether to adjust how they are trying to achieve curricular goals
d. When scores of a teacher's students on a late spring district-wide test are used to evaluate the teacher's effectiveness

c. When students employ their performances on classroom tests to decide whether to adjust how they are trying to achieve curricular goals

Which of the following score-interpretation options is especially useful in equalizing the disparate difficulty levels of different test forms? a. Grade equivalents b. Percentiles c. Scale scores

c. Scale scores

Of the following options, which one is, by far, the most integral to the implementation of formative assessment in a classroom? Select one:
a. Teachers' willingness to refrain from grading most of their students' exams
b. Teachers' use of a variety of both selected-response and constructed-response items
c. Use of assessment-elicited evidence to make adjustments
d. A teacher's willingness to try new procedures in class

c. Use of assessment-elicited evidence to make adjustments

Which of the following is not a likely reason that formative assessment is employed less frequently in our schools than the proponents of formative assessment would prefer?
a. The prevalence of instructionally insensitive accountability tests
b. Teachers' reluctance to alter their current practices
c. Misunderstandings by teachers regarding the nature of formative assessment
d. The absence of truly definitive evidence that formative assessment improves students' learning

d. The absence of truly definitive evidence that formative assessment improves students' learning

T/F: If an accountability test produces a statistically significant disparate impact between minority and majority students' performances, it is certain to possess assessment bias

False

Small/Broad Scope: Students will be able to correctly define 10 new vocabulary words

small

Small/Broad Scope: Students will recognize that in a multi-digit number, a digit in one place represents 10 times as much as it represents in the place to its right and 1/10 of what it represents in the place to its left

small

T/F: If a teacher's classroom test in mathematics deals with content more likely to be familiar to girls than boys, it is likely the test may be biased

True

