Psychological Testing Midterm

Two significant problems with the 1937 scale were

(a) the reliability of the test and (b) differential variability in IQ scores.

What value of validity coefficients is considered adequate?

.30 to .40

In the 1960 revision of the Stanford-Binet scale (SB-LM), the problem of differential variation in IQ scores was solved through the implementation of the deviation IQ concept. Complete the following statements related to this concept. The deviation IQ was a standard score with a mean of _____ and a standard deviation of _____. The deviation IQ was derived by examining the standard deviation of _____ for a representative sample of examinees at each chronological age level. The deviation IQ could be compared across chronological age levels because an IQ at one age level (for example, an IQ of 90 at the 10-year-old age level) corresponded to the same _____ as the same IQ at another age level (for example, an IQ of 90 at the 18-year-old age level).

100; 16; mental age; percentile

If researchers found a criterion validity coefficient of .40 for an achievement test, it would mean that _____.

16% of the variation in the criterion can be explained by variation in achievement test scores
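The arithmetic behind this answer can be sketched as follows: squaring the validity coefficient gives the proportion of criterion variance accounted for. The function name is made up for illustration.

```python
def variance_explained(validity_coefficient: float) -> float:
    """Proportion of criterion variance explained: the coefficient squared."""
    return validity_coefficient ** 2


# A validity coefficient of .40 corresponds to roughly 16% of the variance.
print(f"{variance_explained(0.40):.0%}")
```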

The _____, a structured personality test, was developed primarily through factor analysis.

16PF

The 1960 revision of the Stanford-Binet scale (SB-LM) incorporated several improvements over its predecessors. For example, instructions for scoring and test administration were improved. Further, the age range was extended from 16 to _____ years of age. Also, tasks that met two criteria were selected for the SB-LM: tasks that showed an increase in the percentage passing with an increase in _____, and tasks that correlated highly with _____ as a whole.

18; age; scores

The maximum possible mental age a person could obtain on the Stanford-Binet is

19.5

In the year _________ , Binet and Simon revised their intelligence test again to create a third version containing minor improvements over the second (1908) version.

1911

A test has a reliability coefficient of .77. This coefficient means that _____.

77% of the variance in test scores is true score variance, and 23% is error variance

Suppose you are in the 87th percentile on a test. This means that _____.

87% of the students got a score lower than your score
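A minimal sketch of this definition, assuming percentile rank is computed as the percentage of scores strictly below the target score. The function name and the class scores are hypothetical.

```python
def percentile_rank(scores, target):
    """Percentage of scores in `scores` that fall strictly below `target`."""
    if not scores:
        raise ValueError("scores must not be empty")
    below = sum(1 for s in scores if s < target)
    return 100 * below / len(scores)


# Hypothetical class of ten scores; a score of 8 has seven scores below it.
class_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(percentile_rank(class_scores, 8))  # 70.0
```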

Freeman's definition of intelligence

Adjustment or adaptation of the individual to his total environment, the ability to learn, and the ability to carry on abstract thinking.

Anderson's definition of intelligence

Based on individual differences in information-processing speed and executive functioning influenced largely by inhibitory processes

Evidence suggests that _____ developed the earliest systematic program of psychological testing.

China

_____ assumes that each person has a true score that would be obtained if there were no errors in measurement.

Classical test score theory

Which of the following is a recommendation regarding the evaluation of validity coefficients found in the Standards for Educational and Psychological Testing?

Ensure that the results of the validity study are sufficiently generalizable to other groups.

_____ applied the concept of survival of the fittest to the study of individual differences among human beings in his book Hereditary Genius.

Francis Galton

Why was the standardization sample of the 1916 Stanford-Binet scale considered inadequate?

It consisted exclusively of white children from California.

The formula used by Terman to derive the intelligence quotient (IQ) is

MA/CA x 100
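As a small worked example of the formula, where MA is mental age and CA is chronological age (the function name and the sample ages are illustrative only):

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return mental_age / chronological_age * 100


# Hypothetical child with a mental age of 10 and a chronological age of 8.
print(ratio_iq(10, 8))  # 125.0
```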

The score of a child who took the 1908 revision of the Binet-Simon scale would probably be reported as a(n) _____.

Mental age

Alfred Binet was appointed to a commission by the French minister of public instruction in order to identify children who were _________ ___________. Binet was especially qualified for the commission because he had studied _________ ________.

mentally subnormal; human abilities

_____ are points that divide the frequency distribution into equal fourths, or 25%, while _____ divide the distribution by 10%.

Quartiles; deciles
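One way to see the difference is to compute both sets of cut points with Python's standard library. This is a sketch on made-up data; `statistics.quantiles` uses its default interpolation method here, which is only one of several conventions.

```python
import statistics

# Hypothetical scores: the integers 1 through 100.
scores = list(range(1, 101))

# n=4 yields the 3 cut points that split the data into fourths.
quartiles = statistics.quantiles(scores, n=4)

# n=10 yields the 9 cut points that split the data into tenths.
deciles = statistics.quantiles(scores, n=10)

print(quartiles)
print(deciles)
```

The middle quartile is the median, which is also the fifth decile.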

Who viewed intelligence in terms of general mental ability as well as specific factors?

Spearman

Spearman's definition of intelligence

The ability to educe either relations or correlates.

Gardner's definition of intelligence

The ability to resolve genuine problems or difficulties as they are encountered.

_____ is a standard score with a mean of 100 and a standard deviation of 16 (later 15) that was first introduced in the 1960 revision of the Stanford-Binet.

The deviation IQ
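A simplified sketch of the idea: a score is expressed as a z score within the examinee's own age group and then rescaled to a mean of 100 and a standard deviation of 16. The function name and the age-group values are assumptions for illustration; the actual SB-LM conversion tables are not reproduced here.

```python
def deviation_iq(score, age_group_mean, age_group_sd, sd=16):
    """Rescale a score to mean 100 and the given SD, relative to an age group."""
    if age_group_sd <= 0:
        raise ValueError("age_group_sd must be positive")
    z = (score - age_group_mean) / age_group_sd  # distance from the group mean in SD units
    return 100 + sd * z


# Hypothetical age group with mean 10 and SD 2; a score of 11 is half an SD above the mean.
print(deviation_iq(11, 10, 2))         # 108.0
print(deviation_iq(11, 10, 2, sd=15))  # 107.5, using the later SD of 15
```

Because every age group is rescaled the same way, a given deviation IQ sits at the same relative position in each group.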

Sternberg's definition of intelligence

The mental activities involved in purposive adaptation to, shaping of, and selection of real-world environments relevant to one's life.

factor analysis

A method for reducing a set of variables or scores to a smaller number of hypothetical variables. Approximately half of the variance in a set of diverse mental ability tests is represented in the g factor.

positive manifold

The observation that, when a set of diverse ability tests is administered to large unbiased samples of the population, almost all of the correlations indicate that higher levels of ability on one test are associated with higher levels of ability on the others. All tests, no matter how diverse, load on g.

Which statement best explains why the use of psychological testing declined among psychologists beginning in the 1950s through the 1970s?

The public increasingly criticized the potentially intrusive nature of psychological testing and feared the misuse of tests.

Binet's definition of intelligence

The tendency to take and maintain a definite direction; the capacity to make adaptations for the purpose of attaining a desired end; and the power of autocriticism.

Which of the following is true regarding the conditions of a validity study?

They are never exactly reproduced.

Which statement is true of the work of German psychophysicists Herbart, Weber, Fechner, and Wundt?

This body of work established the idea that psychological testing requires rigorous experimental control.

A standardization sample is _____.

a comparison group

The gf-gc theory of intelligence is _____.

a theory that proposes two basic types of intelligence

Jerome is taking a test where they are scored in terms of both speed and accuracy. This is a(n) _____.

ability test

What refers to the simple fact that one can differentiate older children from younger children by the former's greater capabilities?

age differentiation

Which of the following were the two guiding principles for Binet's test construction?

age differentiation and general mental ability

A student is taking the 1908 Binet-Simon scale and has been asked to recall six items from a passage. What age level is being assessed?

age level 9

Administering two supposedly equivalent forms of a test to the same group of individuals yields a correlation coefficient indicating _____.

alternative forms reliability

A necessary consequence of the maximum mental age used with the 1916 Stanford-Binet is that

anyone older than 19 years, 6 months would have an IQ less than 100
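The ceiling effect can be illustrated with the ratio IQ formula. This sketch assumes the uncapped chronological age is used in the denominator, and the sample ages are made up.

```python
MAX_MENTAL_AGE = 19.5  # maximum obtainable mental age (19 years, 6 months)


def ratio_iq(mental_age, chronological_age):
    """Ratio IQ with mental age capped at the scale's maximum."""
    capped = min(mental_age, MAX_MENTAL_AGE)
    return capped / chronological_age * 100


# Even a perfect performance falls below 100 once age exceeds 19.5.
for age in (19.5, 25, 40):
    print(age, round(ratio_iq(MAX_MENTAL_AGE, age), 1))
```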

As part of the scholarship application for a summer dramatic arts program, 12-year-old Sharon completed a test that assessed their potential for acquiring acting-specific skills. Sharon has taken a(n) _____ test.

aptitude

Strong interrater reliability is probably most important to which type of test?

behavior rating scales

Which of the following is a new direction in psychological science?

big data

fluid intelligence

can best be thought of as those abilities that allow us to reason, think, and acquire new knowledge

Which of the following is a potential problem with the test-retest method?

carryover effects

Because of the potential problem of underestimating the IQs of people over a certain age, Terman placed a maximum limit on the chronological age that would be figured into the IQ equation. This is because it was believed that mental age __________ ____ _______ after _____ years of age.

ceased to improve; 16

When using the split-half method and the two halves have unequal variances, which of the following can be used?

coefficient alpha

Which of the following can confirm that a test has substantial reliability but cannot tell you if a test is unreliable?

coefficient alpha

Job samples provide a good example of the use of which type of evidence?

concurrent validity evidence

If the math portion of the college entrance exam you took included only items related to geometry, you could say that the test suffered from which of the following?

construct underrepresentation

After taking an exam in a psychology course, a student makes this comment: "All the questions on the test came from Chapter Three, but the professor said the exam was over material in Chapters One through Six! What a waste of study time!" The student's criticism of the professor's exam is most obviously related to _____ evidence of validity.

content-related

Developers of a sensation-seeking scale found that scores on the scale were highly correlated with self-reported frequency of alcohol and drug use. This finding most clearly provides _____ evidence for the validity of the sensation-seeking scale.

convergent construct-related

A researcher has developed a new test for anxiety. The new test correlates well with already existing tests on anxiety. This indicates _____ for validity.

convergent evidence

Which of the following allows test developers to estimate what the correlation between two measures would have been if they had not been measured with error?

correction for attenuation
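A minimal sketch of the usual formula: the observed correlation is divided by the square root of the product of the two reliabilities. The function name and the example values are hypothetical.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated correlation between two measures if both were free of error.

    r_xy: observed correlation; r_xx, r_yy: reliabilities of the two measures.
    """
    if r_xx <= 0 or r_yy <= 0:
        raise ValueError("reliabilities must be positive")
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical values: observed r of .40 with reliabilities of .64 and .81.
print(round(correct_for_attenuation(0.40, 0.64, 0.81), 3))
```

With perfectly reliable measures (both reliabilities equal to 1) the correction leaves the observed correlation unchanged.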

Knowing that George Washington was the first president of the United States is an example of _____.

crystallized intelligence

The knowledge you have acquired through your academic studies would best be described in terms of _____.

crystallized intelligence

A central problem of the 1937 revision of the Stanford-Binet scale was that _____.

different age groups showed significant differences in the standard deviation of IQ scores

Dr. Lansing found that the correlation between scores on a measure of anxiety they developed and scores on an existing measure of depression was .82. It is likely that reviewers of Dr. Lansing's anxiety measure will view this correlation as evidence against _____ validity.

discriminant construct-related

The definition of intelligence, as the term is used in testing, is best described as

elusive

Administration of the modern Stanford-Binet requires examiners to continue testing until the _____.

examinee's ceiling is reached

Cognitive Tradition

examines adaptation to real-world demands

information-processing tradition

examines how we learn and solve problems

Psychometric Tradition

examines the elemental structure of a test

Which of the following does not need to be included when reporting reliability?

explanation of classical test theory

The most we could say about a self-esteem test consisting of the items "I feel I deserve to be treated with respect" and "I think I am worthless" is that it has _____.

face validity

A researcher has administered an assessment but believes the assessment actually measures several characteristics, instead of just one. What can the researcher use to determine if their conclusion is correct?

factor analysis

The total product of the various separate and distinct elements of intelligence, according to Binet, is called _____.

general mental ability

When you take a class exam, you are taking a(n) _____ test.

group

Test constructors can improve reliability by _____.

increasing the number of items on a test
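The effect of test length on reliability is commonly estimated with the Spearman-Brown formula. That formula is not named in this set, so treat the sketch below as an assumption about which estimate applies; it also assumes the added items are comparable to the existing ones.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is lengthened by `length_factor`.

    length_factor=2 means the test is doubled with comparable items.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical test with reliability .70, doubled in length.
print(round(spearman_brown(0.70, 2), 3))  # 0.824
```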

KR 20 and coefficient alpha are both measures of the extent to which items on a test are _____.

intercorrelated
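A sketch of how coefficient alpha can be computed from item-level data, using the common form based on item variances and the variance of total scores. The function name and the data are hypothetical; for items scored 0 or 1 the same computation should match KR 20.

```python
from statistics import pvariance


def coefficient_alpha(item_scores):
    """Coefficient alpha for a list of items, each a list of scores per examinee."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score for each examinee across all items.
    totals = [sum(person) for person in zip(*item_scores)]
    item_variance_sum = sum(pvariance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / pvariance(totals))


# Hypothetical data: three items answered by five examinees.
items = [
    [2, 4, 3, 5, 1],
    [3, 4, 3, 5, 2],
    [2, 5, 4, 4, 1],
]
print(round(coefficient_alpha(items), 3))
```

The more strongly the items intercorrelate, the larger the total-score variance is relative to the summed item variances, and the closer alpha gets to 1.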

What refers to the intercorrelations among items within the same test?

internal consistency

What needs to be done to the validity coefficient to determine the percentage of variation in the criterion that can be expected to be known in advance?

it needs to be squared

Which of the following is likely the most important new development relevant to psychometrics?

item response theory

Reliability figures also varied as a function of IQ level, with higher reliabilities in the _____ IQ ranges. Each age group in the standardization sample produced a _____ standard deviation of IQ scores. As a result, _____.

lower; unique; IQs at one age were not equivalent to IQs at another

A reliability of a difference score is expected to be _____.

lower than the reliability of either test on which it is based
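A sketch of the standard formula for the reliability of a difference between two standardized scores, which shows why the value drops as the two tests correlate more strongly. The function name and the example figures are hypothetical.

```python
def difference_score_reliability(r11, r22, r12):
    """Reliability of a difference score between two standardized tests.

    r11, r22: reliabilities of the two tests; r12: correlation between them.
    """
    if r12 >= 1:
        raise ValueError("r12 must be less than 1")
    return (0.5 * (r11 + r22) - r12) / (1 - r12)


# Hypothetical tests with reliabilities .90 and .70 that correlate .70.
print(round(difference_score_reliability(0.90, 0.70, 0.70), 2))  # 0.33
```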

The Civil Rights Act of 1991 _____.

made it illegal for employers to use separate race-related norms for employment testing

As a result of the Supreme Court's ruling in Griggs v. Duke Power, employers must be able to provide evidence that tests used to make selection and promotion decisions _____.

measure capabilities that are specific to particular jobs or situations

Construct-irrelevant variance is closest to which of the following concepts?

measurement error

The score in the exact middle of the distribution of scores, such that equal numbers of scores fall above and below it, is the _____.

median

A person's equivalent age capability is referred to as their _____.

mental age

Binet incorporated _____ in the first two versions (1905 and 1908) of the Binet-Simon scale of intelligence.

mental age

According to Spearman, g can be best conceptualized as

mental energy

If a standardization sample consists of 100 middle-class White men, the test can be used to evaluate the scores of whom?

middle-class White men

Both the mean of a test and the 50th percentile of a test would be considered the _____.

norm

What type of test compares each person with a norm?

norm-referenced test

_____ are obtained by administering a particular test to a defined group and obtaining the distribution of scores for that group.

norms

The sample with which Terman standardized his test can best be described as

not representative of the population of likely test-takers

Some of the terms that Binet used to classify intellectual deficiency are no longer used today because _____.

of their derogatory connotations

Describing the winners of a race as being in first, second, and third place is an example of which type of scale?

ordinal

If you lined children up according to their weight, from highest to lowest, you would be using a(n) _____ scale.

ordinal

Cora is taking a test that is intended to measure their typical behavior. Cora is taking a(n) _____ test.

personality

The forecasting function of tests is actually a type or form of criterion validity evidence known as _____.

predictive validity evidence

Miguel is taking a test where they are shown ambiguous stimuli and asked to describe what they see. What type of test is this?

projective personality test

What type of test provides subjects with stimuli that are considered ambiguous?

projective personality test

What are points that divide the frequency distribution into equal fourths?

quartiles

According to classical test theory, what can produce different scores by the same individual among repeated applications of the same test given on the same day?

random error

A measure can be _____ yet not _____.

reliable; valid

crystallized intelligence

represents the knowledge and understanding that we have acquired

According to Spearman's theory, intelligence consists of one general factor (g) plus a large number of ________ factors.

specific

A researcher administered a test, divided it in half, and scored each half separately. Which method did the researcher use?

split-half

The relative closeness of a person's observed score to their true score is estimated by the _____.

standard error of measurement
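A minimal sketch of the usual computation: the standard deviation of the test scores multiplied by the square root of one minus the reliability. The function name and the example values are illustrative assumptions.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1 - reliability)


# Hypothetical test with an SD of 16 and a reliability of .77.
print(round(standard_error_of_measurement(16, 0.77), 2))
```

A smaller value means observed scores tend to sit closer to true scores; a perfectly reliable test would have an SEM of zero.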

Although the 1960 revision of the Stanford-Binet scale (SB-LM) did not include a new _________ sample, a more representative sample of __________ children was obtained in 1972, and was thereafter used with the 1960 revision.

standardization; 2,100

Which of the following is needed to accurately evaluate the meaning of test scores?

standardization sample

Shakira is taking a test where they have to answer "True" or "False" to items such as "I like heavy metal music," "I am in good health," and "I sleep well at night." What type of test is this?

structured personality test

What provides a statement, usually of the "self-report" variety, and requires the subject to choose between two or more alternative responses such as "True" or "False"?

structured personality test

A measurement device or technique used to quantify behavior or aid in the understanding and prediction of behavior is referred to as a(n) _____.

test

Administering a test to a group of individuals, re-administering the same test to the same group at a later time, and correlating test scores at times 1 and 2 demonstrates which method of estimating reliability?

test-retest method

Sources of error associated with time sampling are best expressed in _____ reliability coefficients, whereas error associated with the use of particular items is best expressed in _____ reliability coefficients.

test-retest; internal consistency

Das's definition of intelligence

the ability to plan and structure one's behavior with an end in view

Which of the following aspects of a test need to be reliable and valid for the results of a criterion-related validity study to be accurate?

the criterion

Validity is best understood as

the extent to which a test measures what it claims to measure

The normative group, or standardization sample, is _____.

the group to which current examinees can be compared

The restricted range should be considered for which of the following when interpreting the validity coefficient?

the predictor and criterion

What system of dividing frequency distributions was developed in the U.S. Air Force during World War II?

the stanine system

For what primary reason were group tests of ability and personality developed?

to screen military recruits

Although the use of _____ is accepted in medical settings, it is much more controversial in educational settings.

tracking

The Binet scale was developed under the assumption that a person's intelligence can best be represented by a single score g that reflects the shared __________ underlying performance on a diverse set of tests.

variance

In his research on individual differences in human functioning, Galton examined all of the following except _____.

verbal comprehension

The concept of g refers to the _____.

view that one general mental ability factor underlies all intelligent behavior

