Tests and Measurements Final Exam
The Rorschach Inkblot Test is an example of which kind of test?
Projective
Define mean.
Same as average
How can an r-value (correlation) of +0.82 be interpreted?
The two variables or the two sets of data tend to be closely related.
Find the z-score for the following: raw score = 35, mean = 37, SD = 8
-0.25 (use your formula sheet)
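The arithmetic behind this card can be checked with a short sketch. It assumes the usual formula-sheet definition z = (raw score - mean) / SD; the function name z_score is only illustrative.

```python
def z_score(raw: float, mean: float, sd: float) -> float:
    """Return how many standard deviations `raw` lies from `mean`."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


# Card values: raw score = 35, mean = 37, SD = 8
print(z_score(35, 37, 8))  # -0.25
```

A negative result means the raw score sits below the mean.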
Find the z-score if the T-score is 35.
-1.5 (use your formula sheet)
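A minimal sketch of the conversion, assuming the standard T-score scale (mean 50, standard deviation 10). The constant and function names are illustrative, not taken from the formula sheet.

```python
T_MEAN = 50.0  # assumed mean of the T-score scale
T_SD = 10.0    # assumed standard deviation of the T-score scale


def t_to_z(t_score: float) -> float:
    """Convert a T-score to a z-score."""
    return (t_score - T_MEAN) / T_SD


def z_to_t(z: float) -> float:
    """Convert a z-score back to a T-score."""
    return T_MEAN + T_SD * z


print(t_to_z(35))    # -1.5
print(z_to_t(-1.5))  # 35.0
```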
Know that the mean, median, and mode all fall at a z-score of ___ in a normal distribution.
0.0
In what ways should marking and reporting systems benefit the pupil? (3 ways)
1) indicating the pupil's strengths and weaknesses 2) negatively reinforcing undesirable behavior 3) enhancing and maintaining pupil motivation
A percentile rank of 16 is how many standard deviations below the mean in a normal distribution?
1.0 (use your normal distribution picture)
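The normal distribution picture rounds this value. As a rough check, the sketch below uses Python's statistics.NormalDist to convert between percentile ranks and z-scores; the helper names are illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_for_percentile_rank(percentile_rank: float) -> float:
    """z-score below which `percentile_rank` percent of cases fall."""
    return STANDARD_NORMAL.inv_cdf(percentile_rank / 100)


def percentile_rank_for_z(z: float) -> float:
    """Percentage of cases falling below the given z-score."""
    return 100 * STANDARD_NORMAL.cdf(z)


print(round(z_for_percentile_rank(16), 2))    # -0.99
print(round(percentile_rank_for_z(-1.0), 1))  # 15.9
```

So a percentile rank of 16 is approximately, not exactly, one standard deviation below the mean.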
If a pupil were told her raw score was 24 in a distribution, with a mean = 16 and a standard deviation = 6, what would her z-score be?
1.33 (use your formula sheet)
Approximately what percentage of cases fall between a z-score of +2 and a z-score of +3?
2% (use your normal distribution picture)
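The percentage between any two z-scores is the difference of the two cumulative areas. A small sketch, assuming a standard normal distribution (percent_between is an illustrative name):

```python
from statistics import NormalDist


def percent_between(z_low: float, z_high: float) -> float:
    """Percentage of a normal distribution lying between two z-scores."""
    standard_normal = NormalDist()
    return 100 * (standard_normal.cdf(z_high) - standard_normal.cdf(z_low))


print(round(percent_between(2, 3), 2))  # 2.14
```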
What percent of students in a distribution falls between the first and third quartiles?
50%
Find the raw score for the following: z-score = -0.6, mean = 70, SD = 10
64 (use your formula sheet)
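This is the z-score formula solved for the raw score: raw = mean + z * SD. A minimal sketch with an illustrative function name:

```python
def raw_from_z(z: float, mean: float, sd: float) -> float:
    """Recover a raw score from its z-score, the mean, and the SD."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return mean + z * sd


# Card values: z = -0.6, mean = 70, SD = 10
print(round(raw_from_z(-0.6, 70, 10), 2))  # 64.0

# A z-score of 0 always maps back to the mean itself.
print(raw_from_z(0, 70, 10))  # 70
```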
In a normal distribution, what percentage of scores will fall between -1 standard deviation and +1 standard deviation?
68% (use your normal distribution picture)
What percentage of the normal curve lies between z-scores of -2.0 and +2.0?
95% (use your normal distribution picture)
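The 68% and 95% figures are rounded values from the picture. One way to check them is the identity P(|Z| < k) = erf(k / sqrt(2)) for a standard normal variable; the sketch below applies it with an illustrative function name.

```python
import math


def percent_within(k: float) -> float:
    """Percentage of a normal distribution within k SDs of the mean."""
    return 100 * math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(percent_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```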
In what instances would predictive validity be of high interest?
A biographical data bank being used in picking airplane pilots (when you need to predict a FUTURE behavior about someone)
Define multimodal.
A distribution with more than one most frequent score
What could one conclude about a correlation of 1.25 between Form A and Form B of a test?
A mistake has been made: a correlation can ONLY be between -1 and +1
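To see why a value of 1.25 signals an error, the sketch below computes a Pearson correlation by hand. The Form A and Form B scores are made-up illustration data, and pearson_r is an illustrative name. The covariance term can never exceed the product of the two spreads, which is what keeps r within -1 and +1.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when a list is constant")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


form_a = [12, 15, 9, 20, 17]   # hypothetical Form A scores
form_b = [11, 16, 10, 19, 18]  # hypothetical Form B scores

r = pearson_r(form_a, form_b)
print(round(r, 2))   # 0.97
print(-1 <= r <= 1)  # True
```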
Define median.
A point on the scale that divides the scores into halves.
Which marking (grading) practice results in the most reliable grades?
A, B, C, D, F
What types of test results NEED to have interpretation provided when being given to parents?
ALL TYPES
List some examples of building reliability into a test. (3 examples)
Adding items of good quality; administering the test to a heterogeneous group; controlling the conditions of test administration
Briefly describe an alternate-forms reliability check.
Administering two different tests (or two different test forms) at two different times
What is provided by norm-referenced achievement tests?
An index of a student's achievement relative to a designated group.
How should each mark (grade) in a course be assigned and interpreted?
As a measure of the level of achievement
What is another word for "mean"?
Average
What is another name for a histogram?
Bar Graph
Which type of marking system would require the greatest amount of teacher time?
Basing marks on student improvement
Why should one not practice combining achievement and attitude (or effort) in a single mark?
Because the mark will be difficult to interpret
Why is it not reasonable to expect the typical school system to bring all 4th-graders up to the 4th-grade norm on a standardized achievement test in a subject such as reading?
Because the norm is an average, rather than a minimum standard
Within a given item format, the items of a test should be arranged how?
By increasing level of complexity
If a teacher wanted to determine how well a standardized achievement test would measure the objectives which she had been trying to teach, it would be best for her to ________?
Compare the test itself with her objectives
According to the text, what should NOT be considered in combining grades into a composite mark?
Conduct Grades
What is the primary intent of Congress in requiring annual school- and district-wide assessments?
Congress' primary intent was to enhance accountability for achievement in special education students.
What type of validity coefficient is most important for personality tests?
Construct
The procedures for item analysis of a norm-referenced test require that both the high group and the low group ________________.
Contain the same number of pupils
Which type of validity refers to comparing test items with objectives?
Content
Which type of validity is most important for tests constructed by teachers?
Content Validity
Provide a few, brief examples of predictive validity.
Correlating ACT scores with a 4th-year GPA
Define concurrent validity.
Correlating a new test/measurement device with an existing device measuring the same things
What are some recommendations to help prepare students for a test?
Equalize the advantages between test-wise and non-test-wise students
What are some appropriate uses for the results of standardized aptitude tests?
Evaluating students' readiness for new learning; adjusting instruction to meet individual needs; assisting students with their educational plans
What is the factor that most seriously limits the value of grades for certification?
Grades are not comparable between classes and schools
Define "high-stakes" testing.
HST refers to the use of standardized test results alone or in combination with other measures to make educational decisions that significantly affect students, staff, or policies.
If the correlation between piano-playing ability and proficiency in weight-lifting is negative, what could we assume or predict about a poor piano player?
He would be a good weight-lifter
What should be the role of IQ tests in grading or marking practices?
IQ tests should play NO ROLE AT ALL in grading
What would make the content validity high for a teacher-constructed test?
If the teacher has matched items to objectives.
What is the best way to interpret test scores?
In light of both test and non-test information
What type of reliability can be computed after a single test administration?
Internal consistency
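As an illustration only: Cronbach's alpha is one commonly used internal-consistency coefficient (the card does not name a specific one). The sketch below computes it from a single administration; the item scores are made-up data and cronbach_alpha is an illustrative name.

```python
from statistics import pvariance


def cronbach_alpha(item_scores: list[list[float]]) -> float:
    """Cronbach's alpha; rows are test-takers, columns are items."""
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    item_columns = list(zip(*item_scores))
    sum_item_variances = sum(pvariance(column) for column in item_columns)
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores have no variance")
    return (n_items / (n_items - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical right/wrong (1/0) scores: 5 test-takers, 4 items
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(scores), 2))  # 0.8
```

Only one set of responses is needed, which is why this kind of reliability can be computed after a single test administration.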
What are some disadvantages of using a pass/fail grading system?
It makes interpreting passing grades difficult.
What happens when one departs from the exact instructions in administering a standardized test?
It will probably affect the reliability of measurement most seriously.
What is a common characteristic of items found in standardized tests?
Items have been analyzed and refined
What does "aptitude" mean?
The potential to succeed: a person is likely to be successful if aptitude is present and likely to be unsuccessful if not
What is meant by "a given test item is highly discriminating"?
Many more high-scoring students than low-scoring students answer it correctly.
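Item analysis often summarizes this with a discrimination index. The sketch below assumes the common form D = (correct in high group - correct in low group) / group size, with equal-sized high and low groups. The counts are made up for illustration.

```python
def discrimination_index(high_correct: int, low_correct: int, group_size: int) -> float:
    """Discrimination index for one item, given equal-sized groups."""
    if group_size <= 0:
        raise ValueError("group size must be positive")
    if not (0 <= high_correct <= group_size and 0 <= low_correct <= group_size):
        raise ValueError("correct counts must lie between 0 and group size")
    return (high_correct - low_correct) / group_size


# Hypothetical item: 9 of 10 high scorers and 3 of 10 low scorers answer correctly
print(discrimination_index(9, 3, 10))  # 0.6
```

A value near 0 would mean the item does not separate the two groups, and a negative value would mean low scorers outperform high scorers on it.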
What are some major disadvantages of comparisons based on aptitude?
Marks tend to unjustly reward low aptitude students.
What statistic is also called the 50th percentile?
Median
What type of test would tell a parent how well the student is doing compared to students in other schools?
Norm-referenced test
What happens to the reliability of a test if all test-takers' scores are inadvertently increased by 5 points?
Nothing - the reliability coefficient remains the same.
What types of numbers indicate a strong, linear correlation?
Numbers very close to +1 or -1
What does a negative correlation mean?
One set of data increases while the other set decreases
Pupil progress reports in the form of letter grades are most likely to adequately serve the needs of whom?
Parents
Aptitude is to future as achievement is to ______.
Past
What is the name for the point below which a given percentage of scores lies?
Percentile
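A small sketch of the idea, using a simple "percentage of scores strictly below" definition (textbooks differ on how ties are handled). The scores are made-up data and percent_below is an illustrative name.

```python
from statistics import median


def percent_below(scores: list[float], value: float) -> float:
    """Percentage of scores that fall strictly below `value`."""
    if not scores:
        raise ValueError("scores must not be empty")
    return 100 * sum(score < value for score in scores) / len(scores)


scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # hypothetical class scores

print(percent_below(scores, 75))              # 40.0
print(median(scores))                         # 77.5
print(percent_below(scores, median(scores)))  # 50.0
```

The last line shows the median acting as the 50th percentile: half of the scores lie below it.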
How does performance-based assessment compare to standardized testing?
Performance-based assessment is more time-consuming.
Standardized readiness and diagnostic tests provide information useful for what purposes?
Predicting success in certain areas; diagnosing pupil weaknesses; determining whether pupils are prepared to learn
As compared with achievement tests, aptitude tests are mainly used for ________.
Prediction
What type of validity is most appropriate for aptitude tests?
Predictive
Which type of validity requires a time interval for its determination?
Predictive
To what other measurement is the standard error of measurement closely related?
Reliability
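The link is usually written as SEM = SD * sqrt(1 - reliability). The sketch below assumes that standard formula; the SD and reliability values are made up for illustration.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM from the test's standard deviation and reliability coefficient."""
    if sd < 0:
        raise ValueError("standard deviation cannot be negative")
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1 - reliability)


# Same SD, two hypothetical reliability coefficients
print(round(standard_error_of_measurement(10, 0.75), 2))  # 5.0
print(round(standard_error_of_measurement(10, 0.91), 2))  # 3.0
```

The more reliable test has the smaller standard error of measurement.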
What happens if the instructions for administering and scoring a standardized achievement test are NOT followed rigidly when administered?
Reliability is affected to an unknown extent
A student's reading achievement score is one standard deviation above the national norm. How could you best interpret this finding to her parents?
She is above average in reading
What measure is the most dependable derived unit for measuring how far a score varies from the mean for the group?
Standard deviation
The majority of scores tend to cluster toward the middle of the range of scores in which type of distribution?
Symmetrical distribution
What type of test is called for when a teacher gives a final exam to see if students have met his course objectives?
Teacher-made test
What is a major advantage of teacher-made tests over standardized achievement tests?
Teacher-made tests can better match course objectives
Which type of reliability coefficient is usually greater over a shorter period of time rather than a longer period of time?
Test-retest
What is the most stable measure of central tendency?
The Median
What factors should determine the weights assigned to various components of a final mark?
The emphasis the teacher has attached to each of the components during the instructional period.
What is the main distinction between aptitude and achievement tests?
The purpose for which they are used
What is the first decision made by the test constructor?
The purpose of the test
What is the raw score when the z-score is 0?
The raw score is the mean
Define mode.
The score that is duplicated (repeated) the largest number of times in a set of scores
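For a quick check of the three measures of central tendency, Python's statistics module can be used. The scores below are made up; multimode returns every most-frequent score, so a multimodal set shows more than one value.

```python
from statistics import fmean, median, multimode

scores = [70, 75, 75, 80, 85, 85, 90]  # hypothetical set of scores

print(fmean(scores))      # 80.0
print(median(scores))     # 80
print(multimode(scores))  # [75, 85]
```

Here 75 and 85 each occur twice, so this set is bimodal, a special case of multimodal.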
Briefly define the standard error of measurement.
The standard error of measurement on a test is a value that decreases as the accuracy of a test increases. ALSO: The standard error of measurement is the index which estimates the extent to which a score obtained on a test approximates the individual's true score on the test.
From a measurement point of view, what is the major objection to reporting grades by intervals of one percentage point rather than by five letter grades?
The unit is too small to be judged accurately
What happens to the validity coefficient when the time interval between predictor and criterion measures is decreased?
The validity coefficient increases
What is the purpose for interpreting a test score as a band of scores rather than a specific value?
To prevent attaching significance to chance differences
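A minimal sketch of a score band, assuming the band is the observed score plus or minus one standard error of measurement (the width is a choice, not something this card specifies). Scores and SEM are made up.

```python
def score_band(observed: float, sem: float, width: float = 1.0) -> tuple[float, float]:
    """Band of `width` SEMs on either side of an observed score."""
    if sem < 0:
        raise ValueError("SEM cannot be negative")
    return (observed - width * sem, observed + width * sem)


def bands_overlap(band_a: tuple[float, float], band_b: tuple[float, float]) -> bool:
    """True when two score bands share at least one point."""
    return band_a[0] <= band_b[1] and band_b[0] <= band_a[1]


# Two hypothetical pupils on a test with SEM = 3
pupil_a = score_band(85, 3)
pupil_b = score_band(89, 3)

print(pupil_a)                          # (82.0, 88.0)
print(pupil_b)                          # (86.0, 92.0)
print(bands_overlap(pupil_a, pupil_b))  # True
```

Because the bands overlap, the 4-point gap between the two pupils could be a chance difference.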
National, regional, and state norms for a standardized achievement test are appropriate for interpreting test results only when the test is administered how?
Under the same instructions used for the norm group
What are some characteristics of teacher-made tests?
Uniform directions are not specified; content is determined by the classroom teacher; hurried, often haphazard construction
What may have unintentionally happened when Congress agreed to allow accommodations and alternative standard assessments?
Unintentionally, Congress may have limited the comparability of test results for special education students.
The most effective marking and reporting systems are those designed to provide the information needed by whom?
Users of the report
How can the reliability of grades be improved?
Using 5 - 15 categories
What kinds of decisions can teachers make using reports from standardized achievement tests?
What students to put in advanced groups; the difficulty level of individual assignments; whether to encourage a student to plan for college
How can one build reliability into a test?
Write items of various difficulty levels
Can a test that yields a "large" negative correlation be useful?
Yes, it would be just as useful as one with the same-sized positive correlation.
Briefly describe a test-retest situation.
You administer Test A on April 1 and on May 15 to the same group of subjects and you then correlate the results.
Record-keeping is most time consuming when grades are based on comparison with what?
Effort
What description could you give to an oven thermometer that measured the temperature in an oven to be 400°F five days in a row when the temperature was actually 397°F?
Reliable but not valid
What can best describe a student whose science achievement score is one standard deviation below the national norm?
She is below average in science
A student obtained a score of 25 on a test. What does this mean?
That he obtained a raw score of 25, and little else
Why were the earliest IQ tests constructed?
To identify children who would probably have difficulty in learning in the typical classroom