Classroom Assessment Midterm


The standard error of measurement is focused chiefly on: A. Test content evidence of validity B. None of the above C. Relations to other variables evidence of validity D. Response processes evidence of validity

B. None of the above

Classroom teachers are most apt to focus on which of the following? A. Evidence of alternate-form reliability B. Test content evidence of validity C. Internal structure evidence of validity D. Evidence of internal-consistency reliability

B. Test content evidence of validity

A history teacher, Mrs. Scroggins, tries to determine the consistency of her tests by occasionally re-administering them to her students, then seeing how much similarity there was in the way her students performed. What kind of reliability evidence is Mrs. Scroggins attempting to collect? A. Internal consistency B. Test-retest C. Alternate form D. Precision

B. Test-retest

Which of the following is not a traditional reason that teachers assess students? A. To determine instructional effectiveness B. To clarify instructional intentions C. To monitor students' progress D. To assign grades to students

B. To clarify instructional intentions

Which one of the following statements could be technically correct? A. "The test is face-valid." B. "The test is consequentially valid." C. "The test-based inference is valid." D. "The test is definitely valid."

C. "The test-based inference is valid."

Which of the following answer choices depicts the chronologically accurate development of assessment legislation in the United​ States? a. ESSA, ESEA, NCLB b. ESEA, ESSA, NCLB c. ESEA, NCLB, ESSA d. NCLB, ESSA, ESEA

c. ESEA, NCLB, ESSA

Which of the following approaches to bias-elimination is it most reasonable to expect classroom teachers to use? a. Neither empirical nor judgmental approaches are reasonable for classroom teachers to use. b. Empirical approaches c. Judgmental approaches d. Both empirical and judgmental approaches are equally reasonable for classroom teachers to use.

c. Judgmental approaches

Consider the following​ test-item. Which of the following decisions requires educators to use quality assessment​ information? a. Choosing who should get into this college b. Deciding what reading group a student should be placed in c. Determining whether a student is legally disabled d. All of the above e. Only​ (a) and​ (b) Which category best describes this​ item? a. Matching b. Multiple​ binary-choice c. Multiple choice d. Binary-choice

c. Multiple choice

A district chooses a commercial test to provide information about the social studies skills and knowledge that the students seem to be having difficulty in mastering. A relatively elaborate series of alignment studies will be carried out early in the school year in an attempt to provide validity evidence to confirm this instructionally supportive usage. Which source of validity evidence will a person supervising the alignment studies rely​ on? a. Validity evidence based on test content b. Validity evidence based on internal structure of the social studies test c. Validity evidence based on relationships between​ students' test scores and other variables d. Validity evidence based on response processes

a. Validity evidence based on test content

A prominent procedure to minimize assessment bias for students with disabilities is to employ​ ________________. a. assessment accommodations b. classroom modifications c. individualized education programs d. positive behavior supports

a. assessment accommodations

Which of the following types of assessment targets a​ student's attitudes,​ interests, and​ values? a. Cognitive assessment b. Affective assessment c. Psychomotor assessment d. Standardized assessment

b. Affective assessment

Which of the following is not a step in the four steps for creating a learning​ progression? a. Form a basic and introductory understanding of a target curricular aim. b. Determine the measurability of each preliminary identified building block. c. Arrange all the building blocks in an instructionally sensible sequence. d. Identify all requisite precursory subskills and bodies of enabling knowledge.

a. Form a basic and introductory understanding of a target curricular aim.

Review the following mathematics item for assessment bias. Amy Johnson has a large collection of Barbie dolls.​ Originally, she had 49.​ Recently, she somehow lost 12 Barbies. How many Barbies does Amy have​ left? (Show your​ work.) a. 37 Barbies b. 61 Barbies c. 27 Barbies a. The assessment might offend people who view girls as having much broader interests than playing with dolls. b. The assessment item incorporates a discernible stereotype regarding socioeconomic status. c. The assessment item does not appear to be biased. d. This assessment item is biased against students who struggle with mathematics.

a. The assessment might offend people who view girls as having much broader interests than playing with dolls.

One important group of students in need of protection from assessment bias is English language learners (ELLs). Therefore, it is important to understand the qualifying categories for an ELL student. Which is not a category of students considered to be English language learners (ELLs)? a. Students who are beginning to learn English but could benefit from school instruction. b. Students who are fluent in English, but prefer to speak another primary language. c. Students who are proficient in English but could use additional assistance in academic or social contexts. d. Students whose first language is not English and know little, if any, English.

b. Students who are fluent in​ English, but prefer to speak another primary language.

Based on the 2014 edition of the Standards for Educational and Psychological Testing​, and on common​ sense, which one of the following statements about​ students' test results represents a potentially appropriate phrasing​ that's descriptive of a set of​ students' test​ performances? a. The​ students' test scores are invalid. b. Students' scores on the test permit valid interpretations for this​ test's use. c. The scores are valid​ if, and only​ if, they were elicited by a valid test. d. The​ students' test scores are valid for unlimited educational purposes.

b. Students' scores on the test permit valid interpretations for this​ test's use.

Which of the following is a consequence of a collaborative effort by the National Governors Association (NGA) and the Council of Chief State School Officers (CCSSO)? a. No Child Left Behind b. The Common Core State Standards (CCSS) c. Every Student Succeeds Act d. Elementary and Secondary Education Act

b. The Common Core State Standards​ (CCSS)

The Elementary and Secondary Education Act contains various subsections referred to as​ "titles." Which of the following​ "titles" gets the most attention​ (and funding) from​ policymakers? a. Title III b. Title I c. Title II d. Title IX

b. Title I

Which of the following is not one of the three types of reliability evidence? a. Internal consistency b. Validity c. Alternate form d. Test-retest

b. Validity

Which of the following descriptions of validity is most accurate? a. Validity refers to the consistency with which a test measures whatever it is measuring. b. Validity refers to the accuracy of score-based interpretations for specific purposes. c. Validity describes the legitimacy of the decision to which a test-based inference will be put. d. Validity describes the degree to which a test's usage leads to appropriate consequences for students.

b. Validity refers to the accuracy of score-based interpretations for specific purposes.

Public Law 94-142 installed the use of an ___________________ to outline the educational processes for students with disabilities. a. accommodating education plan b. individualized education program c. individualized education plan d. assessment monitoring protocol

b. individualized education program

When educators collect​ test-based evidence to inform decisions about already completed instructional activities they are engaging in a basic form of​ _____________________. a. instructional assessment b. summative assessment c. standardized assessment d. formative assessment

b. summative assessment

If educators wish to accurately estimate the likelihood of consistent decisions about students who score at or near a​ high-stakes test's previously determined​ cut-score, which of the following indicators would be most useful for this​ purpose? a. A standard error of measurement for the entire test b. A traditionally computed​ internal-consistency reliability coefficient c. A conditional standard error of measurement​ (near the​ cut-score) d. A standard deviation signifying the degree of score spread among​ test-takers

c. A conditional standard error of measurement​ (near the​ cut-score)

Which is the most appropriate description of a learning​ progression? a. A learning progression is an ordered sequence of the stuff a student must learn in order to perform at a high level on a summative assessment. b. A learning progression is an ordered sequence of the stuff a teacher must teach so as to achieve a significant curricular outcome. c. A learning progression is an ordered sequence of the stuff a student must learn so as to achieve a significant curricular outcome. d. None of these appropriately describe a learning progression.

c. A learning progression is an ordered sequence of the stuff a student must learn so as to achieve a significant curricular outcome.

Which statement best characterizes this nation's current use of formative assessment? a. A four-levels approach to formative assessment is used in the lower grades. b. Only a five-strategies approach to formative assessment is often encountered. c. Although research-supported, formative assessment is not widely used. d. Most teachers now employ the formative-assessment process in their classes.

c. Although research-supported, formative assessment is not widely used.

Please assume you are a​ middle-school English teacher​ who, despite this​ chapter's urging that you​ rarely, if​ ever, collect reliability evidence for your own​ tests, stubbornly decides to do so for all of your midterm and final exams. Although you wish to determine the reliability of your tests for the group of students in each of your​ classes, you only wish to administer the tests destined for such reliability analyses on one​ occasion, not two or more. Given this​ constraint, which of the following coefficients would be most suitable for your​ reliability-determination purposes? a. A standard error of measurement b. A​ test-retest reliability coefficient c. An​ internal-consistency reliability coefficient d. An​ alternate-forms reliability coefficient

c. An​ internal-consistency reliability coefficient

Which of the following conclusions regarding multiple binary-choice items has not been supported by available research? A. These items are a bit less difficult for students than multiple-choice items. B. These items are highly efficient in gathering student achievement data. C. These items are regarded by students as more difficult than multiple-choice items. D. These items tend to be more reliable than other forms of selected-response items.

A. These items are a bit less difficult for students than multiple-choice items.

What should be the two major fairness concerns of a classroom teacher who wishes to eliminate bias in the teacher's assessment instruments? A. Offensiveness and absence-of-bias in items in the teacher's tests B. Unfair penalization and reliability of items in the teacher's tests C. Offensiveness and unfair penalization of items in the teacher's tests D. Disparate impact and offensiveness of items in the teacher's tests

C. Offensiveness and unfair penalization of items in the teacher's tests

Which of the following is not an item-writing rule for the creation of binary-choice items? A. Include only a single concept in any statement. B. Phrase items so that a superficial analysis by students will suggest an incorrect answer. C. Rarely use statements containing double negatives, although single or triple negatives are acceptable. D. Keep item-length similar for both of the binary categories being assessed.

C. Rarely use statements containing double negatives, although single or triple negatives are acceptable.

What kind of evidence is most eagerly sought by the commercial testing firms that develop academic aptitude tests? A. Test content evidence of validity B. Evidence that a test is regarded by some as face-valid C. Relations to other variables evidence of validity D. Evidence that the consequences of a test's use will be appropriate

C. Relations to other variables evidence of validity

Suppose that you and several other teachers in a middle school were trying to construct a new test intended to be predictive of​ high-school students' subsequent scores on the SAT and ACT college admissions exams.​ Moreover, suppose that you were in no particular hurry to assemble validity evidence in support of the accuracy of those inferred predictions. Which source of validity evidence would supply the most compelling support for the validity of your anticipated​ predictions? a. Validity evidence based on the new​ test's internal structure b. Concurrent validity evidence based on the new​ test's relation to other variables c. Predictive validity evidence based on the new​ test's relation to other variables d. Validity evidence based on the content of your new test

c. Predictive validity evidence based on the new​ test's relation to other variables

Validity evidence can be collected from a number of sources. For​ instance, suppose that a mathematics test has been built by a school​ district's officials to help identify those​ middle-school students who are unlikely to pass a statewide​ 11th-grade high-school diploma test. The new test will routinely be given to the​ district's seventh-grade students. To secure evidence supporting the validity of this kind of predictive​ application, the new test will be administered to current​ seventh-graders, and the​ seventh-grade tests will also be given to the​ district's current​ eleventh-graders. This will permit the​ eleventh-graders' two sets of test results to be compared. Which best describes this source of validity​ evidence? a. The internal structure of the​ seventh-grade test b. Students' responses observed during the two​ test-taking experiences c. The relationship of​ 11th-graders' performances on the two tests d. Test content evidence

c. The relationship of​ 11th-graders' performances on the two tests

Consider the following illustrative​ five-option multiple-choice item. It addresses content presented in the Standards for Educational and Psychological Testing​ (2014) related to the fundamental notion of assessment validity. When we encounter a test whose scores are affected by processes that are quite extraneous to the​ test's intended​ purpose, we assert that the test displays which one of the​ following? a. Construct underrepresentation b. Construct deficiency c. Construct corruption d.​ Construct-irrelevant variance e. All of the above Which statement best describes the illustrative​ item? a. This illustrative​ multiple-choice item contains negatively phrased​ options, and thus violates a general​ item-writing guideline. b. The​ multiple-choice item illustrated here employs ambiguous directions to the​ test-taker and,​ therefore, violates one of the​ chapter's item-category guidelines. c. This illustrative​ item, because it includes an​ "all of the​ above" alternative, violates an important​ item-writing guideline. d. This illustrative item contains no violation of the general or​ item-category guidelines presented for this type of​ selected-response item.

c. This illustrative​ item, because it includes an​ "all of the​ above" alternative, violates an important​ item-writing guideline.

Proper formative assessment is conducted​ ___________________. a. prior to the beginning of the instructional process b. when summative assessment does not provide clear evaluative information c. during the instructional process d. at the conclusion of the instructional process

c. during the instructional process

Decisions linked to classroom assessments should be made​ _____________. a. after thorough review b. at the conclusion of the assessment c. in advance d. during initial review of the exam

c. in advance

Which of the following is the most useful indicator of the consistency of an individual student's test performance? a. A dichotomous item-analysis b. An alternate-form reliability coefficient c. A polytomous item-analysis d. A standard error of measurement

d. A standard error of measurement
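
The card above names the standard error of measurement (SEM) but does not show how it is obtained. As a minimal sketch, assuming the conventional classical-test-theory formula SEM = SD * sqrt(1 - reliability), it can be computed as follows. The function name and the sample numbers are illustrative assumptions, not taken from the study set.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """Classical-test-theory SEM: SD * sqrt(1 - reliability).

    score_sd    -- standard deviation of the test scores (hypothetical input)
    reliability -- a reliability coefficient between 0 and 1
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability is expected to lie between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


# Hypothetical test: SD of 10 points, reliability of 0.91.
sem = standard_error_of_measurement(10.0, 0.91)

# A rough one-SEM band around one student's observed score of 75.
observed = 75
band = (observed - sem, observed + sem)
```

With these made-up numbers the SEM is roughly 3 points, so the band around an observed 75 runs from about 72 to about 78. A smaller SEM indicates a more consistent estimate of an individual student's performance.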

Which of the following is typically recommended for use with students who have the most serious cognitive disabilities? A. Alternate assessments B. Regular assessments C. Assessment accommodations D. No assessments whatsoever

A. Alternate assessments

Which of the following is a generally recommended item-writing rule for matching items? A. In the test's directions, describe the basis for matching and the number of times a response can be used. B. Employ relatively long lists, usually containing at least two-dozen premises or responses. C. Whenever possible, employ heterogeneous lists of premises and responses. D. Typically, employ more premises than responses, allowing each response to be used more than once.

A. In the test's directions, describe the basis for matching and the number of times a response can be used.

Stability data regarding assessment consistency is an instance of: A. Test-retest reliability B. Precision reliability C. Internal-consistency reliability D. Alternate-form reliability

A. Test-retest reliability
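
Test-retest (stability) evidence is usually summarized as the correlation between students' scores on two administrations of the same test. The sketch below assumes a Pearson correlation is used for that purpose; the helper name `pearson_r` and the score lists are hypothetical.

```python
import math


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(first)
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    # Sum of cross-products and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    sxx = sum((x - mean_x) ** 2 for x in first)
    syy = sum((y - mean_y) ** 2 for y in second)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five students on the original test and a retest.
first_administration = [72, 85, 90, 64, 78]
second_administration = [70, 88, 91, 60, 80]
stability = pearson_r(first_administration, second_administration)
```

A coefficient close to 1.0 suggests that students kept roughly the same relative standing across the two occasions.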

Which of the following conceptions of assessment validity is most consistent with the Standards for Educational and Psychological Testing released in 2014 by AERA, APA, and NCME? A. How consistently a test measures whatever it measures B. The accuracy of score-based interpretations for intended test uses C. The accuracy of score-based inferences about test-takers D. The appropriateness of a test's usage

B. The accuracy of score-based interpretations for intended test uses

Please review the following item for assessment bias. It was used to assess the basic computation mathematics aims being pursued by an inner-city, elementary school's staff in a Midwestern state. Ramon Ruiz is sorting out empty tin cans he found in the neighborhood. He has four piles based on different colors of the cans. He thinks he has made a mistake in adding up how many cans are in each pile. Please identify Ramon's addition statement that is in error. a. 20 bean cans plus 32 cans = 52 cans. b. 43 bean cans plus 18 cans = 61 cans. c. 38 bean cans plus 39 cans = 76 cans. d. 54 bean cans plus 12 cans = 66 cans. A. The assessment item does not appear to be biased. B. These are not two major causes of assessment bias encountered in typical educational tests. C. The assessment item appears to be biased against Americans of Latino backgrounds. D. The assessment item appears to be biased in favor of recyclers.

C. The assessment item appears to be biased against Americans of Latino backgrounds.

Which of the following statements best describes the relationship among the three sanctioned forms of reliability evidence? A. All three types of evidence are essentially equivalent. B. Test-retest reliability evidence is more important than either internal-consistency evidence or alternate-form evidence. C. The three forms of evidence represent fundamentally different ways of representing a test's consistency. D. The three forms of evidence differ in their significance because internal-consistency evidence of a test's reliability is a necessary condition for the other two types of consistency.

C. The three forms of evidence represent fundamentally different ways of representing a test's consistency.

Of the following four statements, one is not a guideline to be followed when constructing multiple-choice items. Which statement is it? A. Never use "all of the above" as a response option. B. Randomly assign correct answers to the available answer-choice positions. C. To keep stems brief, place most words in an item's alternatives. D. Don't let the length of alternatives suggest correct or incorrect answers.

C. To keep stems brief, place most words in an item's alternatives.

Which of the following is not a general item-writing rule for classroom assessments? A. Do not employ sentence-structures unlikely to be easily understood by students. B. Do not provide unclear directions to students about how to respond to an assessment. C. Do not employ words, phrases, or sentences apt to be regarded as ambiguous to students. D. Do not inform students about how much the items on a test will be weighted.

D. Do not inform students about how much the items on a test will be weighted.

Which of the following should be most influential in guiding those who create educational tests? A. The Common Core State Standards B. Test specifications of the PARCC Assessment Consortium C. Test specifications of the Smarter Balanced Assessment Consortium D. The AERA-APA-NCME Standards for Educational and Psychological Testing

D. The AERA-APA-NCME Standards for Educational and Psychological Testing

Which of the following views regarding the assessment of English language learners (ELL) is most defensible? A. Almost all groups charged with the assessment of ELL students urge the use either of translated tests or, failing that, the provision of interpreters for ELL students. B. It is relatively easy to develop alternate tests for ELL students in those students' native languages. C. ELL students, if tested with regular English-language tests, will often outperform students whose native language is English. D. The use of assessment accommodations for ELL students typically leads to more valid test-based inferences about those students.

D. The use of assessment accommodations for ELL students typically leads to more valid test-based inferences about those students.

The National Assessment of Educational Progress (NAEP): A. was first established in 1996 by the U.S. Congress to serve as an accountability oriented nation's "report card." B. is a mandatory examination in reading and mathematics that must be taken annually by a sample of students in each state. C. sets forth prescribed assessment frameworks at three grade levels, such frameworks to be followed by the 50 states in framing their own state content standards. D. assesses national samples of U.S. students at three grade levels every few years in certain academic subjects.

D. assesses national samples of U.S. students at three grade levels every few years in certain academic subjects.

The Common Core State Standards were an attempt to outline what students should know at each grade level in which of the following​ subjects? a. English Language Arts and Mathematics b. Science and Mathematics c. English Language Arts and Science d. Mathematics and Social Studies

a. English Language Arts and Mathematics

Which of the following is a recommended item-writing rule for the construction of binary-choice items? a. Employ a roughly equal number of statements representing the two categories being tested. b. If one category being assessed requires longer statements than the other category, be sure that the disparity in statement-length is constant. c. Employ relatively few double-negative statements and, if you do, be sure to emphasize with italics or bold-face type that a negative is involved. d. Include no more than two concepts in any one statement.

a. Employ a roughly equal number of statements representing the two categories being tested.

Reliability coefficients range from​ ________________. a. -1.00 to​ +1.00 b. 0 to 1.00 c. -100.00 to​ +100.00 d. -10.00 to​ +10.00

a. -1.00 to​ +1.00

Please select the one answer that most accurately identifies the particular​ item's "quoted" reason for teachers to know about assessment.​ "Teachers need to give classroom assessments in order to assign grades to students indicating how well each student has attained the learning outcomes set for​ them." This​ is: a. A traditional reason for teachers to know about assessment b. One of ​today's reasons for teachers to know about assessment c. Both a traditional reason and one of​ today's reasons for teachers to know about assessment d. Neither a traditional reason nor one of​ today's reasons for teachers to know about assessment

a. A traditional reason for teachers to know about assessment

Which of the following terms refers to the degree to which there is a meaningful agreement between two or more of the​ following: curriculum,​ instruction, and​ assessment? a. Alignment b. Common core c. Continuity d. Standardization

a. Alignment

Which of the following is a reasonable explanation for why the word assessment may be used over the word​ testing? a. Assessment is a broader descriptor of the type of measurement practices in which teachers engage. b. The word testing produces anxiety in students. c. Students are no longer asked to participate in testing given the more nuanced measurement approaches required by federal legislation. d. Teachers would rather avoid the word testing as it creates situations where classroom management becomes difficult.

a. Assessment is a broader descriptor of the type of measurement practices in which teachers engage.

Which category of test items best describes the following​ item: According to​ educators, one of the major advantages of the Every Student Succeeds Act is that it forces schools and teachers to focus only on the important material included in the state test. a. True b. False a. Binary-choice b. Matching item c. Multiple​ binary-choice d. Multiple choice

a. Binary-choice

Which category of test items best describes the following: True or False: Mount Everest is the tallest mountain on Earth. a. Binary-choice b. Multiple choice c. Matching item d. Multiple binary-choice

a. Binary-choice

Considering the knowledge​ you've gained regarding formative assessment in this​ chapter, which of the following characteristics of a class discussion could yield formative assessment​ information? a. Class discussions that show what students are thinking. b. Class discussions that get all students involved. c. Class discussions that are about important topics. d. None of these types of class discussions would yield formative assessment information.

a. Class discussions that show what students are thinking.

How does "classification consistency" differ conceptually from more traditional indicators of test reliability? a. Classification-consistency approaches are focused more on capturing the degree of students' consistent categorizations rather than supplying only numerical indices of students' score consistency. b. Whereas the three traditional forms of reliability evidence are largely interchangeable, classification-consistency approaches to reliability are truly distinctive. c. In contrast to traditional reliability approaches, if there are no actual decisions linked to a test's use, it is impossible to determine a test's classification consistency. d. Classification-consistency approaches do not employ numerical indicators of an assessment's consistency, unlike traditional reliability procedures.

a. Classification-consistency approaches are focused more on capturing the degree of students' consistent categorizations rather than supplying only numerical indices of students' score consistency.
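
One simple way to express classification consistency is the proportion of students who land in the same category (for example, pass or fail) on two administrations. The sketch below assumes a single cut score and a pass-at-or-above rule; the function name, the cut score, and the scores are hypothetical.

```python
def classification_consistency(first, second, cut_score):
    """Proportion of students classified the same way on both occasions.

    A student "passes" when the score is at or above cut_score.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty, equal-length score lists")
    same = sum(
        (a >= cut_score) == (b >= cut_score)
        for a, b in zip(first, second)
    )
    return same / len(first)


# Hypothetical scores for four students, with a cut score of 60.
occasion_1 = [70, 55, 80, 62]
occasion_2 = [72, 61, 78, 58]
agreement = classification_consistency(occasion_1, occasion_2, 60)
```

Here the second and fourth students change category between occasions, so agreement is 0.5. The result depends directly on where the cut score sits, since students scoring near it are the ones most likely to be reclassified.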

Which represents the most appropriate definition of formative​ assessment? a. Formative assessment is a planned process in which​ assessment-elicited evidence of​ students' status is used by teachers to adjust their ongoing instructional procedures or by students to adjust their current learning tactics. b. Formative assessment is a testing structure in which teachers assess unplanned student activities and use the results to inform their instruction. c. Formative assessment is an unplanned process in which​ assessment-elicited evidence of​ students' status is used by teachers to adjust their ongoing instructional procedures or by students to adjust their current learning tactics. d. Formative assessment is a testing structure in which student evidence is used by teachers to assign quantitative scores to student work.

a. Formative assessment is a planned process in which​ assessment-elicited evidence of​ students' status is used by teachers to adjust their ongoing instructional procedures or by students to adjust their current learning tactics.

Cronbach's coefficient alpha and the Kuder-Richardson reliability formulae are examples of: a. Internal-consistency coefficients b. Test-retest coefficients c. Alternate-form coefficients d. None of the above

a. Internal-consistency coefficients
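
Because internal-consistency coefficients need only one administration, they can be computed from a single table of item scores. The sketch below assumes the standard formula for Cronbach's coefficient alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(score_rows):
    """Cronbach's alpha for a list of rows (one row of item scores per student)."""
    k = len(score_rows[0])  # number of items
    if k < 2 or len(score_rows) < 2:
        raise ValueError("need at least 2 items and 2 students")
    # Transpose so each tuple holds every student's score on one item.
    item_columns = list(zip(*score_rows))
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in score_rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical right/wrong (1/0) scores for five students on four items.
scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
alpha = cronbach_alpha(scores)
```

For items scored 0 or 1, as in this sample, alpha corresponds to the Kuder-Richardson formula 20 up to the choice of variance estimator.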

Which category of test items best describes the​ following: Consider these three categories of test​ items: multiple-choice, binary, and matching. Choose the appropriate term to match the​ description: _____ Multiple choice (a.) This type of question offers the​ test-taker only two options from which to choose. _____ Binary (b.) This type of question offers the​ test-taker several options from which to choose. _____ Matching (c.) This type of question may ask the test taker to attach vocabulary words with their proper definition. a. Matching b. Binary-choice c. Multiple​ binary-choice d. Multiple choice

a. Matching

Which of the following pieces of federal legislation had notably been an attempt at reversing the growing achievement gap that left poor and minority students in failing schools while requiring a​ "show us" approach to student​ evaluation? a. No Child Left Behind b. Elementary and Secondary Education Act c. Every Student Succeeds Act d. Individuals with Disabilities Education Act

a. No Child Left Behind

Please select the one answer that most accurately identifies the particular​ item's "quoted" reason for teachers to know about assessment. ​"Just a year​ ago, the voters in our school district voted favorably in a huge​ school-levy election that brought in substantial tax dollars for our schools. Most of the​ district's teachers are convinced that this positive support for the schools was based on our​ schools' consistently high rankings on the​ state's annual accountability​ tests." This​ is: a. One of ​today's reasons for teachers to know about assessment b. A traditional reason for teachers to know about assessment c. Both a traditional reason and one of​ today's reasons for teachers to know about assessment d. Neither a traditional reason nor one of​ today's reasons for teachers to know about assessment

a. One of ​today's reasons for teachers to know about assessment

Which strategy seems most suitable for teachers to use when trying to detect and eliminate assessment bias in their own teacher-made tests? a. Teachers should pay particular attention to the possibility that assessment bias may have crept into their teacher-made tests and should strive to rely on their best judgments about the presence of such bias on all of their classroom tests but especially on their most significant classroom assessments. b. Because decisions based on hard evidence will almost always be more defensible than decisions based dominantly on human judgment, teachers should identify potentially biased items in their teacher-made tests by using empirical methods and only thereafter confirm those identifications using human judgment. c. Given that students' self-identification of potentially biased items can prove remarkably illuminating to teachers regarding the items in their teacher-made tests, all significant classroom assessments should provide an opportunity for students themselves to indicate that they regarded an item as biased and to indicate the nature of this bias while actually completing a test's items. d. Because teachers spend so much time focusing on avoiding bias, it is highly unlikely that teacher-produced exams contain bias.

a. Teachers should pay particular attention to the possibility that assessment bias may have crept into their​ teacher-made tests and should strive to rely on their best judgments about the presence of such bias on all of their classroom tests but especially on their most significant classroom assessments.

Which of the following sources of validity evidence are teachers most likely to collect? a. Test content b. Response processes c. Relations to other variables d. Internal structure

a. Test content

If a multi-state assessment consortium has generated a new performance test of​ students' oral communication skills and wishes to verify that​ students' scores on the performance test remain relatively similar regardless of the time during the school year when the test was​ completed, which of the following kinds of consistency evidence would be most​ appropriate? a. Test-retest evidence of reliability b. Internal-consistency evidence of reliability c. Alternate-form evidence of reliability d. The standard error of measurement

a. Test-retest evidence of reliability

Consider the illustrative​ binary-choice item. Please decide whether the following statement regarding the reliability of educational tests is True or False. Please place a check behind the True or False to indicate your answer. True​ ___ ​ False___ When determining a​ test's classification​ consistency, there is no need to consider the cut score employed nor that cut​ score's location in the score distribution. Which statement best describes the illustrative​ item? a. This illustrative item violates the​ item-specific guideline regarding the use of negative statements in a​ binary-choice item. b. This illustrative item is quite consistent with the​ item-specific guideline regarding the phrasing of items to elicit wrong answers. c. This illustrative​ item, regrettably, relies on particularly complicated syntax​ and, therefore, is apt to confuse​ test-takers. d. This illustrative item violates none of the​ chapter's general​ item-writing guidelines or the specific guidelines for writing​ binary-choice items.

a. This illustrative item violates the​ item-specific guideline regarding the use of negative statements in a​ binary-choice item.

Which source of validity evidence should be of most interest to teachers when evaluating their own​ teacher-made tests? a. Validity evidence based on test content b. Validity evidence based on response processes of​ students' taking the tests c. Validity evidence based on the internal structure of the​ teacher-made tests d. Validity evidence based on relationships between​ students' scores on​ teacher-made tests and those​ students' performances on other variables

a. Validity evidence based on test content

Which of the following statements most accurately reflects the relationship between students' aptitude and their achievement? a. Whereas aptitude tends to reflect potential, achievement tends to reflect prior learning. b. The level of a student's aptitude can never exceed the level of the student's achievement. c. Actually, achievement is little more than an operationalization of aptitude. d. Both aptitude and achievement are equivalent to a traditional conception of intelligence.

a. Whereas aptitude tends to reflect potential, achievement tends to reflect prior learning.

Reliability refers to the​ ___________________. a. consistency of the test scores b. relevancy of the test scores c. usefulness of the test scores d. None of the provided answer choices

a. consistency of the test scores

One of the important rules to be followed in creating multiple binary-choice items is that: a. Most items should mesh sensibly with a cluster's stimulus material. b. The stimulus material for any cluster of items should contain a substantial amount of extraneous information. c. Item clusters should be strikingly separated from one another. d. Multiple binary-choice items, to avoid confusion, should never be included in a test already containing binary-choice items.

c. Item clusters should be strikingly separated from one another.

Please select the one answer that most accurately identifies the particular item's "quoted" reason for teachers to know about assessment. "Wishing that students will make progress does not guarantee that students actually will do so. And this is why I believe teachers have a fundamental responsibility to monitor their students' progress throughout the school year. I try to administer informal progress-monitoring quizzes every few weeks to make sure my instruction is 'taking.' If my instruction is not working as well as I want it to work, then I can make modifications in my upcoming teaching plans. Assessment-based monitoring of students' progress is so very sensible that it's hard for me to understand why it is not more widely used." This is: a. One of today's reasons for teachers to know about assessment b. A traditional reason for teachers to know about assessment c. Both a traditional reason and one of today's reasons for teachers to know about assessment d. Neither a traditional reason nor one of today's reasons for teachers to know about assessment

b. A traditional reason for teachers to know about assessment

Which of the following is the best definition for the concept of accessibility for students with​ disabilities? a. Accessibility refers to the notion that some test takers must have an unobstructed opportunity to demonstrate their status with respect to the​ construct(s) being measured by an educational test. While most students have an unobstructed​ opportunity, educators only need to focus on a small few. b. Accessibility refers to the notion that all test takers must have an unobstructed opportunity to demonstrate their status with respect to the​ construct(s) being measured by an educational test. This is a key component of supporting individuals with disabilities. c. Accessibility refers to the notion that most test takers must have an unobstructed opportunity to demonstrate their status with respect to the​ construct(s) being measured by an educational test. It is impossible to achieve this for​ everyone, but is achievable for most students. d. None of these are reasonable definitions.

b. Accessibility refers to the notion that all test takers must have an unobstructed opportunity to demonstrate their status with respect to the​ construct(s) being measured by an educational test. This is a key component of supporting individuals with disabilities.

Suppose that the developers of a new science achievement test had inadvertently laden their​ test's items with​ gender-based stereotypes regarding the role of women in science​ and, when the new test was​ given, the test scores of girls were markedly lower than the test scores of boys. Which of the following deficits most likely accounts for this gender disparity in​ students' scores? a. Concurrent invalidity b. Construct-irrelevant variance c. Construct underrepresentation d. Predictive invalidity

b. Construct-irrelevant variance

Which of the following pieces of federal legislation attempts to install greater degrees of flexibility so that states and districts can particularize their programs for implementing chief provisions of the current successor to​ ESEA? a. NCLB b. ESSA c. ESEA d. IDEA

b. ESSA

Assume a​ state's education authorities have recently established a policy​ that, in order for students to be promoted to the next grade​ level, those students must pass a​ state-supervised English and language arts​ (ELA) exam. Administered near the close of grades​ three, six, and​ eight, the three new​ grade-level exams are intended to determine a​ student's mastery of the official​ state-approved ELA curricular targets for those three grades. As state authorities set out to provide support for these​ promotion-denial exams, which one of the following sources of validity evidence are they likely to rely on most​ heavily? a. Evidence based on responses of students as they take a​ grade-level test b. Evidence based on test content c. Evidence based on the test​ scores' relationship to other variables d. Evidence based on internal structure of each of the tests

b. Evidence based on test content

Which of the following statements is not an element in a research-supported conception of formative assessment? a. Formative assessment is a process, not a test. b. Formative assessment should be used only by teachers to adjust their ongoing instructional activities. c. Formative assessment must be carefully planned. d. Formative assessment calls for the use of assessment-elicited evidence in making adjustment decisions.

b. Formative assessment should be used only by teachers to adjust their ongoing instructional activities.

Which of the following represents the most appropriate strategy by which to support the validity of​ score-based interpretations for specific​ uses? a. Assembly of​ test-users' personal perceptions regarding the accuracy of their own​ score-based interpretations of​ test-takers' performances b. Generation of an​ evidence-laden validity argument in support of a particular​ usage-specified score interpretation c. Isolation of as much as possible​ inference-linked validity​ evidence, both positive and​ negative, regarding how to interpret​ test-takers' scores d. Collection of​ validity-relevant evidence, as well as​ validity-irrelevant evidence, regarding the best and the worst ways to give meaning to​ students' test scores

b. Generation of an​ evidence-laden validity argument in support of a particular​ usage-specified score interpretation

Which term best describes the type of measurement that would yield the following​ feedback: Jonathan scored within the 92nd percentile on the​ SAT? a. Assessment b. Classroom-created measurement c. Norm-referenced measurement d. Criterion-referenced measurement

c. Norm-referenced measurement

Why do some members of the measurement community prefer to use the phrase​ "absence-of-bias" rather than​ "assessment bias" when quantitatively reporting the degree to which an educational test appears to be​ biased? a. Because​ "absence of​ bias" does not include the qualifier​ "assessment" and,​ thus, is more comprehensive in its applicability. b. Because the quantitative indices typically employed when describing the extent to which a test is biased are most commonly conceptualized in a negative rather than a positive fashion. c. Because both reliability and​ validity, two key attributes of educational​ tests, are positive qualities.​ "Absence-of-bias" is a positive quality to be sought in educational tests. d. None of the provided answer choices is accurate.

c. Because both reliability and​ validity, two key attributes of educational​ tests, are positive qualities.​ "Absence-of-bias" is a positive quality to be sought in educational tests.

Consider the following test item. Your primary concern in selecting techniques to assess a learning objective or objectives should be classroom practicality and efficiency. a. True b. False Which category best describes this​ item? a. Multiple​ binary-choice b. Multiple choice c. Binary-choice d. Matching

c. Binary-choice

Please select the one answer that most accurately identifies the particular​ item's "quoted" reason for teachers to know about assessment. ​"I was quite surprised when our​ state's department of education insisted that each of the​ state's teachers collect accurate evidence of their​ students' growth because such evidence was to be used in evaluating all of the​ state's teachers. I​ have, for my entire​ career, collected pretest and posttest evidence of my​ students' achievement status because this helps me irrespective of what the state wants me to do to determine which​ changes, if​ any, are needed during next​ year's instruction." This​ is: a. A traditional reason for teachers to know about assessment b. One of ​today's reasons for teachers to know about assessment c. Both a traditional reason and one of​ today's reasons for teachers to know about assessment d. Neither a traditional reason nor one of​ today's reasons for teachers to know about assessment

c. Both a traditional reason and one of​ today's reasons for teachers to know about assessment

These standards, released by the Council of Chief State School Officers and the National Governors Association Center for Best Practices, were an attempt to establish continuity and consistency across varying state curricular aims. a. Every Student Succeeds Act b. Individuals with Disabilities Education Act c. Common Core State Standards d. No Child Left Behind

c. Common Core State Standards

One of the most commonly misused terms in educational jargon is the word​ "standards." In​ reality, there is no​ singular, all-encompassing concept of a​ standard, but rather more specific subtypes of educational standards. Which of the following subtypes of standards could best be described as​ "the knowledge or skills that educators want students to​ learn"? a. Teaching standard b. Performance standard c. Content standard d. Academic standard

c. Content standard

Ms. Brown attempted to design an assessment that assessed the reading comprehension of her students in regard to their ability to comprehend informative text. After reviewing her drafted assessment, she realized she had relied on other kinds of text and had not included an adequate amount of informative text. Which phrase best describes Ms. Brown's mistake? a. Inaccurate assessment formation b. Test bias c. Content underrepresentation d. Construct-irrelevant variance

c. Content underrepresentation

When students' test scores are, as predicted, correlated positively with those students' scores on a test aimed at a similar measurement mission, of what is this an example? a. Construct-irrelevant variance b. Divergent validity evidence c. Convergent validity evidence d. Construct underrepresentation

c. Convergent validity evidence

Which of the following was a major shift in the 2014 AERA-APA-NCME Standards for Educational and Psychological Testing? a. Formally sanctioning both the construct and the label: "Consequential Validity." b. Adding a complete chapter on affective assessment for individual students. c. Defining assessment validity as inference-accuracy for a specific purpose. d. Deleting a significant chapter of previous standards dealing with "fairness."

c. Defining assessment validity as inference-accuracy for a specific purpose.

Which one of the following pairs of validity evidence most frequently revolves exclusively around judgments focused on test​ content? a. Clearly explicated test​ developers' statements about what is to be assessed and correlations between​ test-takers' scores and their performances on external variables b. Descriptions of developmental care and correlations of​ test-takers' performances with their scores on similar tests c. Developmental-care documentation and external content reviews by nonpartisan judges d. Careful analyses of a​ test's internal structure and​ test-developers' descriptions of the care with which a test was built

c. Developmental-care documentation and external content reviews by nonpartisan judges

Which of the following best captures the testing comparison between NCLB and ESSA? a. ESSA demands more stringent assessment than NCLB. b. ESSA eliminates NCLB's focus on fairly assessing special subgroups of students. c. ESSA permits greater state-determinations of testing than NCLB. d. ESSA emphasizes more appropriate testing of female students than NCLB.

c. ESSA permits greater state-determinations of testing than NCLB.

Classroom assessment in public schools is often a function of federal legislation. Which of the following pieces of legislation is historically considered to have had the greatest impact on public school testing​ policies? a. No Child Left Behind b. Individuals with Disabilities Education Act of 2004 c. Elementary and Secondary Education Act​ (ESEA) of 1965 d. Every Student Succeeds Act

c. Elementary and Secondary Education Act​ (ESEA) of 1965

Which of the following terms best describes the means teachers employ in their attempt to promote​ students' achievement of the curricular ends being​ sought? a. Curriculum b. Measurability c. Instruction d. Assessment

c. Instruction

If Mr. Higgins, a fourth-grade teacher, tries to evaluate his major exams by ascertaining the degree to which his test's items are functioning in a similar manner, what kind of test-evaluative evidence is this? a. Test-retest reliability evidence b. Alternate-form reliability evidence c. Internal-consistency reliability evidence d. None of the above

c. Internal-consistency reliability evidence

If a​ teacher's students include children with disabilities or children who are English language​ learners, which assertion about assessment bias is most​ defensible? a. In view of the inherent difficulties that students with disabilities and English language learners are bound to experience in completing most educational​ tests, meaningful relaxations should be allowed in the levels of assessment challenges given to those two groups of students. b. Given the atypical nature of these two groups of students and the inability of most test accommodations to adequately level the playing​ field, there is really no need for​ test-developers to give any extra attention to​ bias-reduction for either of these two special groups of students. c. Students with disabilities and English language learners are at no greater risk for experiencing assessment bias. d. Because assessment bias erodes the validity of inferences derivative from​ students' test​ performances, even greater effort should be made to reduce assessment bias when working with these two distinctive populations.

d. Because assessment bias erodes the validity of inferences derivative from​ students' test​ performances, even greater effort should be made to reduce assessment bias when working with these two distinctive populations.

One of the following rules for the construction of essay items is accurate. The other three rules are not. Which is the correct rule? a. Give students an opportunity to match their achievement levels with the essay test by allowing them to choose, from optional items, those they will answer. b. Judge the quality of a given set of essay items by seeing how accurately a tryout group of students can comprehend what responses are sought. c. Force students to allocate their time judiciously by never indicating how much time should be expended on a particular item. d. Construct all essay items so the student's task for each item is unambiguously described.

d. Construct all essay items so the student's task for each item is unambiguously described.

Which term best describes the type of measurement that would yield the following​ feedback: Jonathan mastered 92 percent of the tested​ content? a. Assessment b. Classroom-created measurement c. Norm-referenced measurement d. Criterion-referenced measurement

d. Criterion-referenced measurement

Differential item functioning (DIF) is employed in connection with which of the following approaches to bias-detection? a. Neither empirical nor judgmental approaches b. Both empirical and judgmental approaches c. Judgmental approaches d. Empirical approaches

d. Empirical approaches

Which of the following rules is often recommended for the generation of matching items? a. Order all of the premises alphabetically, but arrange the responses in an unpredictable manner. b. Ideally, both the premises and the responses should represent fundamentally heterogeneous lists. c. Place the premises for an item on one page, then put most of the responses for that item on the following page. d. Employ relatively brief lists, placing the shorter words or phrases at the right.

d. Employ relatively brief lists, placing the shorter words or phrases at the right.

Which of the following assertions most accurately captures the relationship between disparate impact of a test and assessment bias? a. If a test is biased, it will only rarely have a disparate impact on different student subgroups. b. A test that is biased against either gender group will almost certainly be biased against ethnic groups. c. If a test has a disparate impact on different student subgroups, the test is a priori biased. d. If a test has a disparate impact on different student subgroups, the test is not necessarily biased.

d. If a test has a disparate impact on different student subgroups, the test is not necessarily biased.

Please select the one answer that most accurately identifies the particular​ item's "quoted" reason for teachers to know about assessment.​ "Just as physicians need to know about​ patients' blood pressure and what it​ indicates, teachers need to know about educational testing. It is simply part of what a solid educational professional needs to​ understand." This​ is: a. One of ​today's reasons for teachers to know about assessment b. A traditional reason for teachers to know about assessment c. Both a traditional reason and one of​ today's reasons for teachers to know about assessment d. Neither a traditional reason nor one of​ today's reasons for teachers to know about assessment

d. Neither a traditional reason nor one of​ today's reasons for teachers to know about assessment

Which of the following​ non-profit organizations played a significant role in the rapid adoption of the Common Core State Standards​? a. The National Education Association b. The American Federation of Teachers c. The Council for Exceptional Children d. The Bill and Melinda Gates Foundation

d. The Bill and Melinda Gates Foundation

Which of the following is generally conceded to be a key component of formative assessment? a. A teacher's exclusive reliance on the collection of data using constructed-response tests. b. Data obtained via standardized achievement tests c. A heavy emphasis on using students' classroom test results as a dominant factor in determining students' grades d. The framework provided by a learning progression's building blocks

d. The framework provided by a learning progression's building blocks

What is the chief function of validity evidence when employed to confirm the accuracy of​ score-based interpretations about​ test-takers' status in relation to specific uses of an educational​ test? a. To underscore the importance of consequential validity​ that's intended to verify that the particular uses for​ test-takers' scores are legitimate b. To explore the full range of potential uses for educational tests originally built to support one​ or, at​ most, two specific uses of​ test-takers' results c. To confirm rival hypotheses that might legitimately challenge a proposed interpretation of a​ test-taker's performance d. To support relevant propositions in a validity argument​ that's marshaled to determine the defensibility of certain​ score-based interpretations

d. To support relevant propositions in a validity argument​ that's marshaled to determine the defensibility of certain​ score-based interpretations

