Quizzes for Midterm
When a researcher tests the accuracy of a regression equation on a second group of people that is different from the group on which the equation was developed, what is being done?
Cross-validation
Which of the following is a practical method of ensuring data recording accuracy when two people are administering a skinfold test?
The person writing down the score should repeat the score to the tester
Stability reliability is obtained when each subject is measured with the same instrument on two occasions or on two different days, and the scores are then correlated.
True
The Spearman-Brown prophecy formula is used to estimate the reliability of a measurement when the length of the test is changed.
True
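As an illustration, the standard form of the Spearman-Brown formula is r_kk = (k * r) / (1 + (k - 1) * r), where r is the current reliability and k is the factor by which the test length changes. The sketch below is a minimal example; the function name and the sample values (reliability 0.60, test doubled) are hypothetical and not taken from the quiz.

```python
def spearman_brown(r: float, k: float) -> float:
    """Estimate reliability after changing test length by a factor of k.

    r is the reliability of the current test; k is new length / old length.
    """
    return (k * r) / (1 + (k - 1) * r)


# Hypothetical example: a test with reliability 0.60 is doubled in length.
doubled = spearman_brown(0.60, 2)
print(round(doubled, 2))  # 0.75
```

With k greater than 1 the estimate rises, and with k less than 1 (a shortened test) it falls.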
The agreement of competent judges about the value of a measure is considered rater reliability.
True
When graphing, X scores are placed on the horizontal axis, and Y scores are placed on the vertical axis.
True
Which of the following is an example of a gold standard test that would be suitable for obtaining criterion-related validity evidence?
VO2 max (aerobic fitness)
In which of the following situations is logical validity evidence most likely used?
Validity for a three-item fitness test.
Which of the following scenarios is most representative of predictive validity evidence?
Worse scores on a treadmill test are shown to be associated with higher subsequent risk for mortality
Is objectivity important for criterion-referenced tests and can it be estimated?
Yes, it is important and could be estimated as rater reliability.
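A student runs the 50-yard dash in 6.1 seconds. If the mean is 6.52 and the standard deviation is .25, what is the student's z-score?
z = (X - mean)/S = (6.1 - 6.52)/.25 = -1.68; because a lower time is a better score, the sign is reversed and the z-score is reported as 1.68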
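The arithmetic for this item can be checked with a short sketch. Note that (X - mean)/S gives -1.68 for these numbers; the positive 1.68 results when the sign is flipped because a lower dash time is a better score, which is an assumption about the convention used here. The function name is hypothetical.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Standard score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd


z = z_score(6.1, 6.52, 0.25)
print(round(z, 2))   # -1.68 (raw z: the time is below the mean)
print(round(-z, 2))  # 1.68 (sign reversed, assuming lower times are better)
```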
Reliability of criterion-referenced tests is:
defined specifically as consistency of classifying subjects the same.
What is the reliability for the data below using an intraclass R if the score for each person is the mean of his/her trial scores and a one-way ANOVA is used? MSA is 2.81 and MSW is .59. R = (MSA - MSW)/MSA
Person    A  B  C  D
Trial 1   2  1  3  2
Trial 2   2  2  4  1
.79
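A minimal sketch of this calculation, using the mean squares exactly as quoted in the question (MSA = 2.81, MSW = .59) rather than recomputing them from the trial scores. The function name is hypothetical.

```python
def intraclass_r_mean(ms_among: float, ms_within: float) -> float:
    """One-way ANOVA intraclass R for the mean of each person's trials."""
    return (ms_among - ms_within) / ms_among


r = intraclass_r_mean(2.81, 0.59)
print(round(r, 2))  # 0.79
```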
A person's height is measured in centimeters as 174.4 cm. What is his height in meters, rounded to 2 decimal places?
1.74
Which of the following factors ensures that test scores have objectivity?
A defined scoring method for the test.
Why is objectivity estimated from a two-way ANOVA model and not a one-way ANOVA model?
A difference among judges is expected to occur.
Within a simple distribution frequency table of questionnaire scores, the cumulative percentage for a score of 77 is 23. What does this tell us?
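Not "23% of respondents scored 77" (a cumulative percentage counts the respondents at that score together with all scores listed before it in the table, not only those who scored exactly 77)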
Which of the following is desirable during the development and administration of a test?
Planning the equipment and procedures that will be used when administering the test
If people were administered a bench press test and a 100-yard dash test on one day and 60 days later the people were classified as to football playing ability, this is an example of obtaining:
Predictive validity
What is the probability that a z-score is > 0.00?
50% probability
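This follows from the symmetry of the standard normal curve around z = 0. A quick check, assuming Python 3.8 or later so that `statistics.NormalDist` is available:

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# P(z > 0) is one minus the cumulative probability up to 0.
p_above_zero = 1.0 - standard_normal.cdf(0.0)
print(p_above_zero)  # 0.5
```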
What is the mean score for the following set of data? Score: 4, 5, 5, 7, 7, 8
6
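The mean here is the sum of the scores divided by the number of scores, 36 / 6. A short check using the six scores from the question:

```python
from statistics import mean

scores = [4, 5, 5, 7, 7, 8]

# Arithmetic average: sum of scores divided by how many there are.
average = mean(scores)
print(sum(scores), len(scores), average)  # 36 6 6
```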
Usually the reliability coefficient will be higher if it is for:
An internal consistency coefficient rather than a stability coefficient.
What is the rationale for considering an experienced team coach's subjective rating of player ability to be a good gold standard criterion measure?
All of the above (the coach has seen the athlete perform in real-world game situations; the coach is an expert; the coach has seen the player perform under a variety of playing conditions; the coach has seen the player perform over a long period of time)
Why is it inappropriate to make statements such as "The test is valid"?
All of the above (validity is specific to the situation and population for which it is established; validity pertains to how we use the test; validity is not simply a property of the test; validity is not generalizable)
If the validity coefficient for a test is low, it may be that:
All of the above (the criterion it was validated against is not appropriate; the reliability for the test is not high; the validity for the test is low; the test is not appropriate for the group used to obtain validity evidence)
Why would you convert a performance time score recorded in different units (e.g., time to complete a treadmill test, recorded in minutes and seconds) into a total number of seconds?
Because it increases the precision of the scores
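A minimal sketch of the conversion, so that every score is expressed in a single unit before analysis. The function name and the sample time (9 minutes 30 seconds) are hypothetical.

```python
def to_total_seconds(minutes: int, seconds: float) -> float:
    """Convert a time recorded as minutes and seconds into seconds only."""
    return minutes * 60 + seconds


# Hypothetical treadmill time of 9 minutes 30 seconds.
print(to_total_seconds(9, 30))  # 570
```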
The reliability of a test may be increased by:
D. All of the above (A. increasing the number of trials; B. increasing the number of potential scores; C. increasing the amount of difference among students)
If you wished to use SPSS to calculate z-scores, using a known mean and standard deviation, which SPSS program or sub-routine would you use?
Compute variable
When testing skinfolds, a physical education teacher ensures that children's scores are not available to anyone except to the pupil being tested. What aspect of ethical testing does this cover?
Confidentiality
What type of evidence of validity is demonstrated when performance of an advanced group is superior to performance of an intermediate group?
Construct-related
What type of evidence of validity is demonstrated when a test is correlated with subjective ratings of performance from judges?
Criterion-related
Which of the following is not necessary when administering a sports skills test in a school setting?
Encouraging students to interact while performing the test
If test scores change little from one day to another, the scores are said to be internally consistent.
False
The descriptive program in the SPSS package is used to get a reliability coefficient.
False
Which of the following pairs of tests would yield a discriminant validity coefficient?
Sit-up test and sit and reach test.
Which of the following example situations describes an outlier?
In a data set containing blood pressure scores, one score is 3.5 standard deviations below the mean
When creating an SPSS or Excel data file, why is it advisable to use short variable names?
It makes scrolling back and forth within large data sets more straightforward.
Which of the following is an alternative term used in kinesiology for content validity evidence?
Logical validity
The mean is the most popular measure of central tendency because it is:
The arithmetic average of all scores.
If half the test scores are above the mean of 76, which of the following statements is true?
The mean and median are equal
The general form of a prediction equation is Y = bX + c. In this equation, what does "b" represent?
The slope of the regression line
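To see why b is the slope, a small sketch with made-up coefficients: raising X by one unit changes the predicted Y by exactly b, while c only shifts the line up or down. The function name and the values b = 2 and c = 1 are hypothetical.

```python
def predict(x: float, b: float, c: float) -> float:
    """Prediction equation Y = bX + c (b is the slope, c the constant)."""
    return b * x + c


b, c = 2.0, 1.0  # hypothetical slope and constant
print(predict(3.0, b, c))                       # 7.0
print(predict(4.0, b, c) - predict(3.0, b, c))  # 2.0, equal to the slope b
```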
When a teacher uses a battery of tests (i.e., a set of tests that assess a variety of aspects of something), which of the following is true about the tests that compose that battery?
The tests in the battery should not be highly correlated with each other
The main advantage of z-scores is:
They can be used for comparing scores from different tests.
Which of the following general statements regarding testing situations within kinesiology is true?
They usually comprise a broad range of types of tests and test situations.
Which of the following statements about validity is correct?
To be valid, a test must be reliable.
A criterion score commonly used when multiple trial scores are collected is the mean of the trial scores.
True
Which of the following is a "measurement instrument"?
a) A paper and pencil exam b) A shuttle run test c) A skinfold caliper d) A psychological questionnaire ( e) All of the above are examples of measurement instruments )
Which of the following is an important test characteristic?
a) Reliability b) Validity c) Discrimination d) Practicality e) Mass testability ( f) All of the above)
Which of the following is not a stage of testing?
a) Test Selection b) Test Preparation c) Test Administration d) Data Processing e) Decision Making and Feedback ( f) All of the above are stages of testing.)
What is/are the purpose(s) of a test manual?
a) To train testers how to use the test b) To provide standardized instructions for the test c) To provide reliability and validity evidence for the test d) To provide test norms ( e) All of the above are purposes of a test manual)
Which of the following is not relevant to ensuring safety during a test?
b) Ensure the test is not too difficult (basketball dribble)
The formula R= (MSa-MSw)/MSa yields an estimate of the reliability of the:
mean score
Poor timekeeping, such as not stopping the stopwatch quickly enough as an athlete crosses the finish line, is an example of:
positive measurement error
The best method for determining the reliability of physical performance scores is by calculating the intraclass R among:
scores of a test administered on each of several days.