Tests and Measurements Final Exam

The Rorschach Inkblot Test is an example of which kind of test?

Projective

Define mean.

Same as average

How can an r-value (correlation) of +0.82 be interpreted?

The two variables or the two sets of data tend to be closely related.

What is another word for "mean"?

Average

The most effective marking and reporting systems are those designed to provide the information needed by whom?

Users of the report

Find the z-score for the following: raw score = 35, mean = 37, SD = 8

-0.25 (use your formula sheet)
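
The formula sheet is not reproduced in these cards. Assuming it uses the standard definition z = (raw score - mean) / SD, the calculation can be sketched as follows; the function name `z_score` is illustrative only.

```python
def z_score(raw: float, mean: float, sd: float) -> float:
    """Return how many standard deviations `raw` lies from `mean`.

    Assumes the standard definition z = (raw - mean) / sd.
    """
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


# Values from the card above: raw score = 35, mean = 37, SD = 8
print(z_score(35, 37, 8))  # -0.25
```

A negative result means the raw score lies below the mean.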

Find the z-score if the T-score is 35.

-1.5 (use your formula sheet)
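
This conversion relies on the usual convention that T-scores are scaled to a mean of 50 and a standard deviation of 10 (an assumption here, since the formula sheet is not shown). A minimal sketch, with a hypothetical helper name:

```python
T_MEAN = 50.0  # conventional T-score mean (assumed)
T_SD = 10.0    # conventional T-score standard deviation (assumed)


def t_to_z(t_score: float) -> float:
    """Convert a T-score to a z-score, assuming T = 50 + 10 * z."""
    return (t_score - T_MEAN) / T_SD


print(t_to_z(35))  # -1.5
```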

The mean, median, and mode all fall at a z-score of ___ in a normal distribution.

0.0

In what ways should marking and reporting systems benefit the pupil? (3 ways)

1) Indicating the pupil's strengths and weaknesses 2) negatively reinforcing undesirable behavior 3) enhancing and maintaining pupil motivation

A percentile rank of 16 is how many standard deviations below the mean in a normal distribution?

1.0 (use your normal distribution picture)

If a pupil were told her raw score was 24 in a distribution, with a mean = 16 and a standard deviation = 6, what would her z-score be?

1.33 (use your formula sheet)

Approximately what percentage of cases fall between a z-score of +2 and a z-score of +3?

2% (use your normal distribution picture)

What percent of students in a distribution falls between the first and third quartiles?

50%

Find the raw score for the following: z-score = -0.6, mean = 70, SD = 10

64 (use your formula sheet)
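
Going from a z-score back to a raw score inverts the z-score formula. Assuming the standard relation raw = mean + z × SD, a short sketch (the function name is illustrative):

```python
def raw_from_z(z: float, mean: float, sd: float) -> float:
    """Recover a raw score from a z-score, assuming raw = mean + z * sd."""
    return mean + z * sd


# Values from the card above: z = -0.6, mean = 70, SD = 10
print(round(raw_from_z(-0.6, 70, 10), 2))  # 64.0
```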

In a normal distribution, what percentage of scores will fall between -1 standard deviation and +1 standard deviation?

68% (use your normal distribution picture)

What percentage of the normal curve lies between z-scores of -2.0 and +2.0?

95% (use your normal distribution picture)
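
The rounded percentages read off the normal distribution picture in the cards above can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper `area_between` is a hypothetical name.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
std_normal = NormalDist(mu=0.0, sigma=1.0)


def area_between(z_low: float, z_high: float) -> float:
    """Proportion of cases between two z-scores in a normal distribution."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


print(f"{area_between(-1, 1):.1%}")  # 68.3%
print(f"{area_between(-2, 2):.1%}")  # 95.4%
print(f"{area_between(2, 3):.1%}")   # 2.1%
print(f"{std_normal.cdf(-1):.1%}")   # 15.9%, so a percentile rank of about 16
```

The exact areas differ slightly from the rounded 68%, 95%, and 2% figures used on the picture.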

In what instances would predictive validity be of high interest?

A biographical data bank being used in picking airplane pilots (when you need to predict a FUTURE behavior about someone)

Define multimodal.

A distribution with more than one most frequent score

What could one conclude about a correlation of 1.25 between Form A and Form B of a test?

A mistake has been made; a correlation can ONLY be between -1 and +1

Define median.

A point on the scale that divides the number of scores in half.

Which marking (grading) practice results in the most reliable grades?

A, B, C, D, F

What types of test results NEED to have interpretation provided when being given to parents?

ALL TYPES

List some examples of building reliability into a test. (3 examples)

Adding items of good quality; administering the test to a heterogeneous group; controlling the conditions of test administration

Briefly describe an alternate-forms reliability check.

Administering two different tests (or two different test forms) at two different times

What is provided by norm-referenced achievement tests?

An index of a student's achievement relative to a designated group.

How should each mark (grade) in a course be assigned and interpreted?

As a measure of the level of achievement

What is another name for a histogram?

Bar Graph

Which type of marking system would require the greatest amount of teacher time?

Basing marks on student improvement

Why should one not practice combining achievement and attitude (or effort) in a single mark?

Because the mark will be difficult to interpret

Why is it not reasonable to expect the typical school system to bring all 4th-graders up to the 4th-grade norm on a standardized achievement test in a subject such as reading?

Because the norm is an average, rather than a minimum standard

Within a given item format, the items of a test should be arranged how?

By increasing level of complexity

If a teacher wanted to determine how well a standardized achievement test would measure the objectives which she had been trying to teach, it would be best for her to ________?

Compare the test itself with her objectives

According to the text, what should NOT be considered in combining grades into a composite mark?

Conduct Grades

What is the primary intent of Congress in requiring annual school- and district-wide assessments?

Congress' primary intent was to enhance accountability for the achievement of special education students.

What type of validity coefficient is most important for personality tests?

Construct

The procedures for item analysis of a norm-referenced test require that both the high group and the low group ________________.

Contain the same number of pupils

Which type of validity refers to comparing test items with objectives?

Content

Which type of validity is most important for tests constructed by teachers?

Content Validity

Provide a few, brief examples of predictive validity.

Correlating ACT scores with a 4th-year GPA

Define concurrent validity.

Correlating a new test/measurement device with an existing device measuring the same things

What are some recommendations to help prepare students for a test?

Equalize the advantages between test-wise and non-test-wise students

What are some appropriate uses for the results of standardized aptitude tests?

Evaluating students' readiness for new learning; adjusting instruction to meet individual needs; assisting students with their educational plans

What is the factor that most seriously limits the value of grades for certification?

Grades are not comparable between classes and schools

Define "high-stakes" testing.

HST refers to the use of standardized test results alone or in combination with other measures to make educational decisions that significantly affect students, staff, or policies.

If the correlation between piano-playing ability and proficiency in weight-lifting is negative, what could we assume or predict about a poor piano player?

He would tend to be a good weight-lifter

What should be the role of IQ tests in grading or marking practices?

IQ tests should play NO ROLE AT ALL in grading

What would make the content validity high for a teacher-constructed test?

If the teacher has matched items to objectives.

What is the best way to interpret test scores?

In light of both test and non-test information

What type of reliability can be computed after a single test administration?

Internal consistency

What are some disadvantages of using a pass/fail grading system?

It makes passing grades difficult to interpret.

What happens when one departs from the exact instructions in administering a standardized test?

It will probably affect the reliability of measurement most seriously.

What is a common characteristic of items found in standardized tests?

Items have been analyzed and refined

What does "aptitude" mean?

A person is likely to be successful if the aptitude is present and likely to be unsuccessful if it is not

What is meant by "a given test item is highly discriminating?"

Many more high-scoring students than low-scoring students answer it correctly.

What are some major disadvantages of comparisons based on aptitude?

Marks tend to unjustly reward low aptitude students.

What statistic is also called the 50th percentile?

Median

What type of test would tell a parent how well the student is doing compared to students in other schools?

Norm-referenced test

What happens to the reliability of a test if all test-takers' scores are inadvertently increased by 5 points?

Nothing - the reliability coefficient remains the same.
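
A reliability coefficient is typically computed as a correlation between two sets of scores, and a correlation does not change when a constant is added to every score. The sketch below illustrates this with made-up scores (not from any real test) and a hand-written Pearson correlation; all names are illustrative.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for six test-takers on two administrations
first = [12, 15, 19, 22, 25, 30]
second = [14, 13, 21, 20, 27, 29]

# Every score inadvertently increased by 5 points
first_shifted = [score + 5 for score in first]
second_shifted = [score + 5 for score in second]

r_original = pearson_r(first, second)
r_shifted = pearson_r(first_shifted, second_shifted)
print(abs(r_original - r_shifted) < 1e-12)  # True
```

Adding 5 to every score moves the mean by 5 but leaves each score's deviation from the mean unchanged, which is why the coefficient stays the same.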

What types of numbers indicate a strong, linear correlation?

Numbers very close to +1 or -1

What does a negative correlation mean?

One set of data increases while the other set decreases

Pupil progress reports in the form of letter grades are most likely to adequately serve the needs of whom?

Parents

Aptitude is to future as achievement is to ______.

Past

What is the name for the point below which a given percentage of scores lies?

Percentile

How does performance-based assessment compare to standardized testing?

Performance-based assessment is more time-consuming.

Standardized readiness and diagnostic tests provide information useful for what purposes?

Predicting success in certain areas; diagnosing pupil weaknesses; determining whether pupils are prepared to learn

As compared with achievement tests, aptitude tests are mainly used for ________.

Prediction

What type of validity is most appropriate for aptitude tests?

Predictive

Which type of validity requires a time interval for its determination?

Predictive

To what other measurement is the standard error of measurement closely related?

Reliability

What happens if the instructions for administering and scoring a standardized achievement test are NOT followed rigidly when administered?

Reliability is affected to an unknown extent

A student's reading achievement score is one standard deviation above the national norm. How could you best interpret this finding to her parents?

She is above average in reading

What measure is the most dependable derived unit for measuring how far a score varies from the mean for the group?

Standard deviation

The majority of scores tend to cluster toward the middle of the range of scores in which type of distribution?

Symmetrical distribution

What type of test is called for when a teacher gives a final exam to see if students have met his course objectives?

Teacher-made test

What is a major advantage of teacher-made tests over standardized achievement tests?

Teacher-made tests can better match course objectives

Which type of reliability coefficient is usually greater over a shorter period of time rather than a longer period of time?

Test-retest

What is the most stable measure of central tendency?

The Median

What factors should determine the weights assigned to various components of a final mark?

The emphasis the teacher has attached to each of the components during the instructional period.

What is the main distinction between aptitude and achievement tests?

The purpose for which they are used

What is the first decision made by the test constructor?

The purpose of the test

What is the raw score when the z-score is 0?

The raw score is the mean

Define mode.

The score that is duplicated (repeated) the largest number of times in a set of scores

Briefly define the standard error of measurement.

The standard error of measurement on a test is a value that decreases as the accuracy of the test increases. It is also the index that estimates the extent to which a score obtained on a test approximates the individual's true score on the test.
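
The cards do not give a formula for this value. A commonly used one, assumed here, is SEM = SD × sqrt(1 - reliability). The sketch below uses hypothetical numbers to show the value shrinking as reliability rises.

```python
from math import sqrt


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM under the assumed formula SEM = sd * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1.0 - reliability)


# Hypothetical test with SD = 10 at three reliability levels
for reliability in (0.75, 0.91, 0.99):
    sem = standard_error_of_measurement(10, reliability)
    print(reliability, round(sem, 2))
# 0.75 5.0
# 0.91 3.0
# 0.99 1.0
```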

From a measurement point of view, what is the major objection to reporting grades by intervals of one percentage point rather than by five letter grades?

The unit is too small to be judged accurately

What happens to the validity coefficient when the time interval between predictor and criterion measures is decreased?

The validity coefficient increases

What is the purpose for interpreting a test score as a band of scores rather than a specific value?

To prevent attaching significance to chance differences

National, regional, and state norms for a standardized achievement test are appropriate for interpreting test results only when the test is administered how?

Under the same instructions used for the norm group

What are some characteristics of teacher-made tests?

Uniform directions are not specified; content is determined by the classroom teacher; hurried, often haphazard construction

What may have unintentionally happened when Congress agreed to allow accommodations and alternative standard assessments?

Unintentionally, Congress may have limited the comparability of test results for special education students.

How can the reliability of grades be improved?

Using 5 to 15 categories

What kinds of decisions can teachers make using reports from standardized achievement tests?

What students to put in advanced groups; the difficulty level of individual assignments; whether to encourage a student to plan for college

How can one build reliability into a test?

Write items of various difficulty levels

Can a test that yields a "large" negative correlation be useful?

Yes, it would be just as useful as one with a positive correlation of the same size.

Briefly describe a test-retest situation.

You administer Test A on April 1 and on May 15 to the same group of subjects and you then correlate the results.

Record-keeping is most time consuming when grades are based on comparison with what?

Effort

How would you describe an oven thermometer that read 400°F five days in a row when the oven temperature was actually 397°F?

Reliable but not valid

What can best describe a student whose science achievement score is one standard deviation below the national norm?

She is below average in science

A student obtained a score of 25 on a test. What does this mean?

That he obtained a raw score of 25, and little else

Why were the earliest IQ tests constructed?

To identify children who would probably have difficulty learning in the typical classroom

