Psychological Testing Midterm
Two significant problems with the 1937 scale were
(a) the reliability of the test and (b) differential variability in IQ scores.
What value of validity coefficients is considered adequate?
.30 to .40
In the 1960 revision of the Stanford-Binet scale (SB-LM), the problem of differential variation in IQ scores was solved through the implementation of the deviation IQ concept. Complete the following statements related to this concept. The deviation IQ was a standard score with a mean of _________ and a standard deviation of _________. The deviation IQ was derived by examining the standard deviation of __________ for a representative sample of examinees at each chronological age level. The deviation IQ could be compared across chronological age levels because an IQ at one age level (for example, an IQ of 90 at the 10-year-old age level) corresponded to the same ___________ as the same IQ at another age level (for example, an IQ of 90 at the 18-year-old age level).
100; 16; mental age; percentile
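A minimal sketch of the deviation IQ idea, assuming it can be treated as a z-score on mental age within an age group, rescaled to a mean of 100 and a standard deviation of 16. The function name and the age-group norms are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def deviation_iq(mental_age, group_mean, group_sd, mean=100.0, sd=16.0):
    """Rescale a mental age to a standard score using same-age norms."""
    z = (mental_age - group_mean) / group_sd  # distance from the age-group mean in SD units
    return mean + sd * z

# Hypothetical norms, in months of mental age (not real standardization data).
iq_age_10 = deviation_iq(mental_age=110.0, group_mean=120.0, group_sd=16.0)
iq_age_18 = deviation_iq(mental_age=201.0, group_mean=216.0, group_sd=24.0)
print(iq_age_10, iq_age_18)
```

Both examinees sit the same number of standard deviations below their own age-group mean, so they receive the same deviation IQ and therefore the same percentile, even though the spread of mental ages differs between the two groups.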
If researchers found a criterion validity coefficient of .40 for an achievement test, it would mean that _____.
16% of the variation in the criterion can be explained by variation in achievement test scores
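The arithmetic behind this answer can be sketched as follows: squaring the validity coefficient gives the proportion of criterion variance accounted for by the test. The function name is made up for illustration.

```python
def variance_explained(validity_coefficient):
    """Square a validity coefficient to get the proportion of shared variance."""
    return validity_coefficient ** 2

proportion = variance_explained(0.40)
print(f"{proportion:.0%} of criterion variance explained")
```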
The _____, a structured personality test, was developed primarily through factor analysis.
16PF
The 1960 revision of the Stanford-Binet scale (SB-LM) incorporated several improvements over its predecessors. For example, instructions for scoring and test administration were improved. Further, the age range was extended from 16 to _______ years of age. Also, tasks that met two criteria were selected for the SB-LM: tasks that showed an increase in the percentage passing with an increase in _____, and tasks that correlated highly with ______ as a whole.
18; age; scores
The maximum possible mental age a person could obtain on the Stanford-Binet is
19.5
In the year _________, Binet and Simon revised their intelligence test again to create a third version containing minor improvements over the second (1908) version.
1911
A test has a reliability coefficient of .77. This coefficient means that _____.
77% of the variance in test scores is true score variance, and 23% is error variance
Suppose you are in the 87th percentile on a test. This means that _____.
87% of the students got a score lower than your score
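A small sketch of how a percentile rank can be computed, assuming the simple definition used in this card (the percentage of scores that fall below a given score). The score list is invented for illustration.

```python
def percentile_rank(scores, your_score):
    """Percentage of scores in the group that fall below your_score."""
    below = sum(1 for s in scores if s < your_score)
    return 100.0 * below / len(scores)

# Hypothetical class of ten test scores.
class_scores = [55, 60, 62, 65, 70, 71, 75, 80, 88, 93]
rank = percentile_rank(class_scores, 88)
print(rank)
```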
Freeman's definition of intelligence
Adjustment or adaptation of the individual to his total environment, the ability to learn, and the ability to carry on abstract thinking
Anderson's definition of intelligence
Based on individual differences in information-processing speed and executive functioning influenced largely by inhibitory processes
Evidence suggests that _____ developed the earliest systematic program of psychological testing.
China
_____ assumes that each person has a true score that would be obtained if there were no errors in measurement.
Classical test score theory
Which of the following is a recommendation regarding the evaluation of validity coefficients found in the Standards for Educational and Psychological Testing?
Ensure that the results of the validity study are sufficiently generalizable to other groups.
_____ applied the concept of survival of the fittest to the study of individual differences among human beings in his book Hereditary Genius.
Francis Galton
Why was the standardization sample of the 1916 Stanford-Binet scale considered inadequate?
It consisted exclusively of white children from California.
The formula used by Terman to derive the intelligence quotient (IQ) is
(MA / CA) × 100
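A quick worked example of the ratio IQ, as a sketch only. The function name and the ages are illustrative; MA is mental age and CA is chronological age, both in years.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, times 100."""
    return 100 * mental_age / chronological_age

print(ratio_iq(10, 8))   # mental age ahead of chronological age
print(ratio_iq(8, 8))    # mental age equal to chronological age
print(ratio_iq(8, 10))   # mental age behind chronological age
```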
The score of a child who took the 1908 revision of the Binet-Simon scale would probably be reported as a(n) _____.
Mental age
Alfred Binet was appointed to a commission by the French minister of public instruction in order to identify children who were _________ ___________. Binet was especially qualified for the commission because he had studied _________ ________.
mentally subnormal; human abilities
_____ are points that divide the frequency distribution into equal fourths, or 25%, while _____ divide the distribution by 10%.
Quartiles; deciles
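For illustration, Python's standard library can compute both sets of cut points with `statistics.quantiles`. The score list is invented, and different interpolation methods can give slightly different cut points, so no exact values are claimed here.

```python
import statistics

# Hypothetical distribution of 100 test scores.
scores = list(range(1, 101))

quartiles = statistics.quantiles(scores, n=4)   # 3 cut points -> 4 equal groups
deciles = statistics.quantiles(scores, n=10)    # 9 cut points -> 10 equal groups
print(quartiles)
print(deciles)
```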
Who viewed intelligence in terms of general mental ability as well as specific factors?
Spearman
Spearman's definition of intelligence
The ability to educe either relations or correlates
Gardner's definition of intelligence
The ability to resolve genuine problems or difficulties as they are encountered
_____ is a standard score with a mean of 100 and a standard deviation of 16 (later 15) that was first introduced in the 1960 revision of the Stanford-Binet.
The deviation IQ
Sternberg's definition of intelligence
The mental activities involved in purposive adaptation to, shaping of, and selection of real-world environments relevant to one's life
factor analysis
A method for reducing a set of variables or scores to a smaller number of hypothetical variables. Approximately half of the variance in a set of diverse mental ability tests is represented in the g factor.
positive manifold
The observation that, when a set of diverse ability tests is administered to large unbiased samples of the population, almost all of the correlations indicate that higher levels of ability on one test are associated with higher levels of ability on the others. All tests, no matter how diverse, load on g.
Which statement best explains why the use of psychological testing declined among psychologists beginning in the 1950s through the 1970s?
The public increasingly criticized the potentially intrusive nature of psychological testing and feared the misuse of tests.
Binet's definition of intelligence
The tendency to take and maintain a definite direction; the capacity to make adaptations for the purpose of attaining a desired end; and the power of autocriticism
Which of the following is true regarding the conditions of a validity study?
They are never exactly reproduced.
Which statement is true of the work of German psychophysicists Herbart, Weber, Fechner, and Wundt?
This body of work established the idea that psychological testing requires rigorous experimental control.
A standardization sample is _____.
a comparison group
The gf-gc theory of intelligence is _____.
a theory that proposes two basic types of intelligence
Jerome is taking a test where they are scored in terms of both speed and accuracy. This is a(n) _____.
ability test
What refers to the simple fact that one can differentiate older children from younger children by the former's greater capabilities?
age differentiation
Which of the following were the two guiding principles for Binet's test construction?
age differentiation and general mental ability
A student is taking the 1908 Binet-Simon scale and has been asked to recall six items from a passage. What age level is being assessed?
age level 9
Administering two supposedly equivalent forms of a test to the same group of individuals yields a correlation coefficient indicating _____.
alternative forms reliability
A necessary consequence of the maximum mental age used with the 1916 Stanford-Binet is that
anyone older than 19 years, 6 months would have an IQ less than 100
As part of the scholarship application for a summer dramatic arts program, 12-year-old Sharon completed a test that assessed their potential for acquiring acting-specific skills. Sharon has taken a(n) _____ test.
aptitude
Strong interrater reliability is probably most important to which type of test?
behavior rating scales
Which of the following is a new direction in psychological science?
big data
fluid intelligence
can best be thought of as those abilities that allow us to reason, think, and acquire new knowledge
Which of the following is a potential problem with the test-retest method?
carryover effects
Because of the potential problem of underestimating the IQs of people over a certain age, Terman placed a maximum limit on the chronological age that would be figured into the IQ equation. This is because it was believed that mental age __________ ____ _______ after _____ years of age.
ceased to improve; 16
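A sketch of why the chronological age limit matters, using the maximum mental age of 19.5 and the limit of 16 years from these cards. The 30-year-old examinee and the function name are hypothetical, and the real scoring tables may have differed in detail.

```python
def capped_ratio_iq(mental_age, chronological_age, max_ca=16.0):
    """Ratio IQ with chronological age limited to max_ca (assumed to be 16 years)."""
    return 100 * mental_age / min(chronological_age, max_ca)

# Hypothetical adult, aged 30, who earns the maximum mental age of 19.5.
uncapped = 100 * 19.5 / 30           # no limit on chronological age
capped = capped_ratio_iq(19.5, 30)   # chronological age treated as 16
print(uncapped, capped)
```

Without the limit, an adult who passed every item would still score well below 100, which is the underestimation problem the limit was meant to address.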
When using the split-half method and the two halves have unequal variances, which of the following can be used?
coefficient alpha
Which of the following can confirm that a test has substantial reliability but cannot tell you if a test is unreliable?
coefficient alpha
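A minimal sketch of coefficient alpha, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response data are invented and the function name is illustrative.

```python
import statistics

def coefficient_alpha(responses):
    """responses: list of examinees, each a list of k item scores."""
    k = len(responses[0])
    # Variance of each item across examinees.
    item_variances = [
        statistics.variance([person[i] for person in responses]) for i in range(k)
    ]
    # Variance of the examinees' total scores.
    total_variance = statistics.variance([sum(person) for person in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: five examinees answering three items.
responses = [[3, 4, 3], [2, 2, 3], [5, 4, 5], [1, 2, 1], [4, 5, 4]]
alpha = coefficient_alpha(responses)
print(round(alpha, 2))
```

Because alpha is generally treated as a lower-bound estimate, a high value supports reliability, while a low value does not by itself show that the test is unreliable.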
Job samples provide a good example of the use of which type of evidence?
concurrent validity evidence
If the math portion of the college entrance exam you took included only items related to geometry, you could say that the test suffered from which of the following?
construct underrepresentation
After taking an exam in a psychology course, a student makes this comment: "All the questions on the test came from Chapter Three, but the professor said the exam was over material in Chapters One through Six! What a waste of study time!" The student's criticism of the professor's exam is most obviously related to _____ evidence of validity.
content-related
Developers of a sensation-seeking scale found that scores on the scale were highly correlated with self-reported frequency of alcohol and drug use. This finding most clearly provides _____ evidence for the validity of the sensation-seeking scale.
convergent construct-related
A researcher has developed a new test for anxiety. The new test correlates well with already existing tests on anxiety. This indicates _____ for validity.
convergent evidence
Which of the following allows test developers to estimate what the correlation between two measures would have been if they had not been measured with error?
correction for attenuation
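The correction can be sketched with the usual formula, in which the observed correlation is divided by the square root of the product of the two reliabilities. The numbers and the function name are illustrative only.

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between two measures if both were error-free.

    r_xy: observed correlation; r_xx, r_yy: reliabilities of the two measures.
    """
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical values: observed r = .40, both reliabilities = .80.
estimated = correct_for_attenuation(0.40, 0.80, 0.80)
print(round(estimated, 2))
```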
Knowing that George Washington was the first president of the United States is an example of _____.
crystallized intelligence
The knowledge you have acquired through your academic studies would best be described in terms of _____.
crystallized intelligence
A central problem of the 1937 revision of the Stanford-Binet scale was that _____.
different age groups showed significant differences in the standard deviation of IQ scores
Dr. Lansing found that the correlation between scores on a measure of anxiety they developed and scores on an existing measure of depression was .82. It is likely that reviewers of Dr. Lansing's anxiety measure will view this correlation as evidence against _____ validity.
discriminant construct-related
The definition of intelligence, as the term is used in testing, is best described as
elusive
Administration of the modern Stanford-Binet requires examiners to continue testing until the _____.
examinee's ceiling is reached
Cognitive Tradition
examines adaptation to real-world demands
information-processing tradition
examines how we learn and solve problems
Psychometric Tradition
examines the elemental structure of a test
Which of the following does not need to be included when reporting reliability?
explanation of classical test theory
The most we could say about a self-esteem test consisting of the items "I feel I deserve to be treated with respect" and "I think I am worthless" is that it has _____.
face validity
A researcher has administered an assessment but believes the assessment actually measures several characteristics, instead of just one. What can the researcher use to determine if their conclusion is correct?
factor analysis
The total product of the various separate and distinct elements of intelligence, according to Binet, is called _____.
general mental ability
When you take a class exam, you are taking a(n) _____ test.
group
Test constructors can improve reliability by _____.
increasing the number of items on a test
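One way to see the effect of test length is the Spearman-Brown prophecy formula, sketched below. It assumes the added items are comparable to the existing ones; the starting reliability of .70 is a made-up value.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is lengthened by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical test with reliability .70, doubled in length.
projected = spearman_brown(0.70, 2)
print(round(projected, 3))
```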
KR 20 and coefficient alpha are both measures of the extent to which items on a test are _____.
intercorrelated
What refers to the intercorrelations among items within the same test?
internal consistency
What needs to be done to the validity coefficient to determine the percentage of variation in the criterion that can be expected to be known in advance?
it needs to be squared
Which of the following is likely the most important new development relevant to psychometrics?
item response theory
Reliability figures also varied as a function of IQ level, with higher reliabilities in the ______ IQ ranges. Each age group in the standardization sample produced a(n) _________ standard deviation of IQ scores. As a result, ____________.
lower; unique; IQs at one age were not equivalent to IQs at another
The reliability of a difference score is expected to be _____.
lower than the reliability of either test on which it is based
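A sketch of the commonly cited formula for the reliability of a difference between two standardized scores, assuming r11 and r22 are the reliabilities of the two tests and r12 is their correlation. All values below are hypothetical.

```python
def difference_score_reliability(r11, r22, r12):
    """Reliability of a difference score between two standardized tests."""
    return (0.5 * (r11 + r22) - r12) / (1 - r12)

# Hypothetical tests with reliabilities .90 and .70 that correlate .70.
r_diff = difference_score_reliability(0.90, 0.70, 0.70)
print(round(r_diff, 2))
```

In this example the difference score is far less reliable than either test, because the two tests share much of their true-score variance.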
The Civil Rights Act of 1991 _____.
made it illegal for employers to use separate race-related norms for employment testing
As a result of the Supreme Court's ruling in Griggs v. Duke Power, employers must be able to provide evidence that tests used to make selection and promotion decisions _____.
measure capabilities that are specific to particular jobs or situations
Construct-irrelevant variance is closest to which of the following concepts?
measurement error
The score in the exact middle of the distribution of scores, such that equal numbers of scores fall above and below it, is the _____.
median
A person's equivalent age capability is referred to as their _____.
mental age
Binet incorporated _____ in the first two versions (1905 and 1908) of the Binet-Simon scale of intelligence.
mental age
According to Spearman, g can be best conceptualized as
mental energy
If a standardization sample consists of 100 middle-class White men, the test can be used to evaluate the scores of whom?
middle-class White men
Both the mean of a test and the 50th percentile of a test would be considered the _____.
norm
What type of test compares each person with a norm?
norm-referenced test
_____ are obtained by administering a particular test to a defined group and obtaining the distribution of scores for that group.
norms
The sample with which Terman standardized his test can best be described as
not representative of the population of likely test-takers
Some of the terms that Binet used to classify intellectual deficiency are no longer used today because _____.
of their derogatory connotations
Describing the winners of a race as being in first, second, and third place is an example of which type of scale?
ordinal
If you lined children up according to their weight, from highest to lowest, you would be using a(n) _____ scale.
ordinal
Cora is taking a test that is intended to measure their typical behavior. Cora is taking a(n) _____ test.
personality
The forecasting function of tests is actually a type or form of criterion validity evidence known as _____.
predictive validity evidence
Miguel is taking a test where they are shown ambiguous stimuli and asked to describe what they see. What type of test is this?
projective personality test
What type of test provides subjects with stimuli that are considered ambiguous?
projective personality test
What are points that divide the frequency distribution into equal fourths?
quartiles
According to classical test theory, what can produce different scores by the same individual among repeated applications of the same test given on the same day?
random error
A measure can be _____ yet not _____.
reliable; valid
crystallized intelligence
represents the knowledge and understanding that we have acquired
According to Spearman's theory, intelligence consists of one general factor (g) plus a large number of ________ factors.
specific
A researcher administered a test, divided it in half, and scored each half separately. Which method did the researcher use?
split-half
The relative closeness of a person's observed score to their true score is estimated by the _____.
standard error of measurement
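The standard error of measurement is usually computed as the standard deviation of the test times the square root of one minus the reliability. A small sketch with hypothetical values:

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical test: SD of 15 and reliability of .91.
sem = standard_error_of_measurement(15, 0.91)
print(round(sem, 2))
```

The more reliable the test, the smaller the SEM, and the closer observed scores are expected to fall to true scores.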
Although the 1960 revision of the Stanford-Binet scale (SB-LM) did not include a new _________ sample, a more representative sample of __________ children was obtained in 1972, and was thereafter used with the 1960 revision.
standardization; 2100
Which of the following is needed to accurately evaluate the meaning of test scores?
standardization sample
Shakira is taking a test where they have to answer "True" or "False" to items such as "I like heavy metal music," "I am in good health," and "I sleep well at night." What type of test is this?
structured personality test
What provides a statement, usually of the "self-report" variety, and requires the subject to choose between two or more alternative responses such as "True" or "False"?
structured personality test
A measurement device or technique used to quantify behavior or aid in the understanding and prediction of behavior is referred to as a(n) _____.
test
Administering a test to a group of individuals, re-administering the same test to the same group at a later time, and correlating test scores at times 1 and 2 demonstrates which method of estimating reliability?
test-retest method
Sources of error associated with time sampling are best expressed in _____ reliability coefficients, whereas error associated with the use of particular items is best expressed in _____ reliability coefficients.
test-retest; internal consistency
Das's definition of intelligence
the ability to plan and structure one's behavior with an end in view
Which of the following aspects of a test need to be reliable and valid for the results of a criterion-related validity study to be accurate?
the criterion
Validity is best understood as
the extent to which a test measures what it claims to measure
The normative group, or standardization sample, is _____.
the group to which current examinees can be compared
The restricted range should be considered for which of the following when interpreting the validity coefficient?
the predictor and criterion
What system of dividing frequency distributions was developed in the U.S. Air Force during World War II?
the stanine system
For what primary reason were group tests of ability and personality developed?
to screen military recruits
Although the use of _____ is accepted in medical settings, it is much more controversial in educational settings.
tracking
The Binet scale was developed under the assumption that a person's intelligence can best be represented by a single score g that reflects the shared __________ underlying performance on a diverse set of tests.
variance
In his research on individual differences in human functioning, Galton examined all of the following except _____.
verbal comprehension
The concept of g refers to the _____.
view that one general mental ability factor underlies all intelligent behavior