psychological testing 1

Consider the case where nine persons earn $10,000 and a tenth person earns $910,000. What is their average income?

$100,000
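
A quick check of the arithmetic, as a minimal sketch. It assumes the ten incomes total $1,000,000, that is, nine at $10,000 plus a tenth income of $910,000 (the only tenth income consistent with the $100,000 answer). The median is added for contrast, and the variable names are illustrative.

```python
from statistics import mean, median

# Assumed data: nine incomes of $10,000 plus one outlier of $910,000.
incomes = [10_000] * 9 + [910_000]

average_income = mean(incomes)    # pulled upward by the single outlier
typical_income = median(incomes)  # unaffected by the outlier

print(average_income, typical_income)
```

The gap between the two values shows why the mean can mislead when a distribution is strongly skewed.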

Factor loadings can vary between

-1.0 and +1.0

In general, the optimal level of item difficulty is

.5

Many authors suggest that reliability should be at least ___ for decisions about individuals.

.90

A group of standard scores always possesses a mean of ___ and a standard deviation of ___.

0.0, 1.0
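
A minimal sketch of why this holds, assuming standard scores are computed as z = (raw score - group mean) / group standard deviation. The raw scores below are hypothetical.

```python
from statistics import mean, pstdev

raw_scores = [12, 15, 9, 20, 14]  # hypothetical raw scores

group_mean = mean(raw_scores)
group_sd = pstdev(raw_scores)  # standard deviation of this whole group

# Convert each raw score to a standard (z) score.
z_scores = [(x - group_mean) / group_sd for x in raw_scores]

# Up to floating-point rounding, the mean is 0 and the SD is 1.
print(round(mean(z_scores), 10), round(pstdev(z_scores), 10))
```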

The optimal value for the item-discrimination index is

0.5

Currently, ethnic minorities constitute about __________ of the U.S. population.

1/3

A C scale consists of ____ units.

11

Idiocy and Its Treatment by the Physiological Method was first published in ______ by _________.

1866, Seguin

The first translation of the Binet-Simon scales was completed in _____ by _________.

1910, Goddard

The latest revision of the Stanford-Binet was completed in

2003

In a normal distribution, approximately ____ percent of the scores will fall within one standard deviation of the mean in either direction.

68
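
The figure can be checked numerically. This sketch uses the standard normal distribution from Python's `statistics` module; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of scores between one SD below and one SD above the mean.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(round(within_one_sd * 100))
```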

Full Scale IQ is ____ percent certain to be accurate within plus or minus 5 IQ points.

95

Brass instruments tests recorded

All of the above

Leta Stetter Hollingworth is known for which of the following accomplishments?

All of the above

Which of the following is a desirable feature for an individually administered test?

All of the above

Regarding the true score, which statement is correct?

All of the above: we can never know the true score with certainty, we can derive a probability that the true score resides within a certain interval, and we can derive a best estimate of the true score

The mean of a standardized score is typically set at

All of the above?

Regarding a test manual, ethical guidelines indicate that test publishers

Are required to publish a manual

_______________ share a common assumption that behavior is best understood in terms of clearly defined characteristics such as frequency, duration, antecedents, and consequences.

Behavioral procedures

In testing children, Binet warned scientists to be on the lookout for

Both suggestibility and failure of attention

The theoretical upper limit of the validity coefficient is constrained by the reliability of

Both the test and the criterion
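
One way to make this concrete is the classical-theory bound that the validity coefficient cannot exceed the square root of the product of the two reliabilities. The sketch below assumes that bound; the function name and the example reliabilities are hypothetical.

```python
from math import sqrt

def max_validity(test_reliability: float, criterion_reliability: float) -> float:
    """Upper limit of the validity coefficient under classical test theory."""
    return sqrt(test_reliability * criterion_reliability)

# Hypothetical reliabilities: 0.81 for the test, 0.64 for the criterion.
print(round(max_validity(0.81, 0.64), 2))
```

Lowering either reliability lowers the ceiling, which is why both the test and the criterion matter.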

Individual tests of intelligence, projective personality tests, and neuropsychological test batteries are examples of Level ___ tests.

C

What did Goddard report as the primary culprit of the low intelligence scores of immigrants?

Environmental deprivation

The first form of numerical rating scales can be traced to

Galen in the 2nd century

In developing his test, Rorschach was most heavily influenced by

Galton

On a __________ scale, respondents who endorse one statement also agree with milder statements pertinent to the same underlying continuum.

Guttman

The validity of a screening test is bolstered to the extent that it possesses _________ sensitivity and _______ specificity.

High; high
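
A minimal sketch of the two indices, assuming the usual definitions: sensitivity is the proportion of true cases the test flags, and specificity is the proportion of non-cases it correctly clears. The counts are hypothetical.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Proportion of persons with the condition whom the test identifies."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives: int, false_positives: int) -> float:
    """Proportion of persons without the condition whom the test clears."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical screening results for 50 cases and 100 non-cases.
print(sensitivity(true_positives=45, false_negatives=5))
print(specificity(true_negatives=90, false_positives=10))
```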

When test takers are made aware, in language that they can understand, of the reasons for testing, etc., this is called

Informed consent

In order to investigate potential sex bias in a test item, we would examine a(n)

Item-characteristic curve

__________________________ is a statistical index of how efficiently an item discriminates between persons who obtain high and low scores on the entire test.

Item-discrimination index

Which test is considered to have better norming?

MMPI-2

A form of ranking is found in ________ scales.

Ordinal

In _________ validity, test scores are used to estimate outcome measures

Predictive

________ validity is particularly relevant for entrance examinations and employment tests.

Predictive?

A(n) ________ scale has a conceptually meaningful zero point.

Ratio

For what kind of distribution would the highest number of persons score in the superior range?

Rectangular

For what kind of reliability is the Spearman-Brown formula relevant?

Split-half
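
For reference, a minimal sketch of the Spearman-Brown formula, which estimates full-test reliability from the correlation between two half-tests. The function name and the example correlation are hypothetical; the general form with a length factor is included as an assumption about how the formula is typically stated.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Estimate reliability of a test lengthened by `length_factor`.

    With the default factor of 2 this is the split-half correction:
    r_full = 2 * r_half / (1 + r_half).
    """
    return length_factor * r_half / (1 + (length_factor - 1) * r_half)

# Hypothetical correlation of 0.70 between the two halves of a test.
print(round(spearman_brown(0.70), 2))
```

The corrected value is higher than the half-test correlation because a longer test samples the domain more thoroughly.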

Which of the below is an index of the internal consistency of a test or scale?

Split-half

The mean of _________ scores is always 5, and the standard deviation is approximately 2.

Stanine

_________ may have the long-term effect of pressuring African-American students to "protectively disidentify" with achievement in school and related intellectual domains.

Stereotype threat

The classical theory of reliability is also known as the theory of

True and error scores

The "thought meter" was developed by

Wundt

A construct is

a theoretical, intangible quality or trait in which individuals differ

Under what circumstances is it considered ethical to ask a client to take a test such as the MMPI-2 home for completion?

almost never

A test is valid when the inferences made from it are

All of the above: appropriate, meaningful, and useful

Classical theory assumes that measurement errors

are not correlated with true scores

Which is the most comprehensive term?

assessing

Renorming of tests should

be the rule, not the exception

In Moore's adoption study, which group scored higher on IQ tests?

black children adopted into white families

A(n) ____________ effect is observed when significant numbers of examinees obtain perfect or near-perfect scores.

ceiling

The degree to which items on a test are representative of the universe of behavior the test was designed to sample is an index of ____________ validity.

content

A factor loading is actually a(n)

correlation

In a(n) _______________ test, the objective is to determine where the examinee stands with respect to very tightly defined educational objectives.

criterion-referenced

The text mentions that the following element(s) can be used to define informed consent from a legal standpoint.

All of the above: disclosure, competency, and voluntariness

Putting forth a variety of answers to a complex or fuzzy problem is an example of _________ thinking.

divergent

The major challenge with split-half reliability is

dividing the test into nearly equivalent halves

Perfect correlation

does not imply identical pre- and posttest scores for each examinee

It is common practice in test development that the prepublication version of a new instrument might contain _____________ the number of items desired on the final draft.

double

Why were group tests generally slow to catch on?

early versions had to be laboriously scored by hand

The Glasgow Coma Scale was developed by the method of

expert rankings

The only reason for building in ________ validity involves public relations.

face

In a ____________, the frequency of the class intervals is represented by single points rather than columns.

frequency polygon

The difference in underlying raw score points between percentiles of 90 and 99 is _________ that between percentiles of 50 and 59.

greater than
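
A numerical illustration, assuming the raw scores are normally distributed (an assumption made here for the sketch). The gaps are expressed in standard deviation units; the variable names are illustrative.

```python
from statistics import NormalDist

# z maps a cumulative proportion to its position on the standard normal curve.
z = NormalDist().inv_cdf

gap_90_to_99 = z(0.99) - z(0.90)  # distance in the sparse upper tail
gap_50_to_59 = z(0.59) - z(0.50)  # distance near the crowded center

print(round(gap_90_to_99, 2), round(gap_50_to_59, 2))
```

Scores bunch together near the middle of the distribution, so nine percentile points cover far less raw-score distance there than in the tail.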

According to the text, which kind of test generally requires the greatest vigilance from the examiner?

group and individual tests require equal vigilance

Regarding the publication of new or revised instruments, the most important guideline is to

guard against premature release of a test

The quantification or "common sense" approach to content validity advocated by the text

helps cull out existing items that are deemed inappropriate by expert raters

The standard error of the estimate is an index of the error of measurement caused by the ______________ of a test.

imperfect validity
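
A minimal sketch of the usual formula, in which the standard error of the estimate equals the criterion standard deviation times the square root of one minus the squared validity coefficient. The function name and example values are hypothetical.

```python
from math import sqrt

def standard_error_of_estimate(criterion_sd: float, validity: float) -> float:
    """Typical size of prediction errors when a test predicts a criterion."""
    return criterion_sd * sqrt(1 - validity ** 2)

# Hypothetical case: criterion SD of 10 and a validity coefficient of 0.60.
print(round(standard_error_of_estimate(10, 0.60), 2))
```

With perfect validity the error shrinks to zero; with zero validity it equals the criterion standard deviation.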

Access to psychological tests is restricted because:

All of the above: in the hands of unqualified persons, psychological tests can cause harm; the selection process is rendered invalid for persons who preview test questions; and leakage of item content to the general public completely destroys the efficacy of a test

An important advantage of _________ tests is that the examiner can gauge the level of motivation of the examinee.

individual

When test takers are made aware, in language that they can understand, of the reasons for testing, etc., this is called

informed consent

A homogeneous scale is also referred to as

internally consistent

Which type of reliability method is best for a test that involves subjectivity of scoring?

interscorer

The more powerful and useful statistics should only be used with ___________ levels of measurement.

interval and ratio

A construct possesses the following characteristic(s):

Both of the following: it cannot be operationally defined, and a network of predictions can be derived from theory about the construct

Another name for latent trait theory is

item response theory

A graphical display of the relationship between the probability of a correct response and the examinee's position on the underlying trait measured by the test is called

item-characteristic curve

If an examinee obtains a verbal score higher than his/her performance score, then the underlying true scores for verbal and performance abilities

may or may not show the same pattern

Which is the correct order for levels of measurement?

nominal, ordinal, interval, ratio

If all other factors are held constant, what effect does a strong practice effect have upon a test's reliability?

none

In a(n) _____________ test, the performance of each examinee is interpreted in reference to a relevant standardization sample.

norm-referenced

The ____________ is simply the normal distribution graphed in cumulative form.

normal ogive

Suppose we obtain the following response patterns on a multiple-choice question with correct answer "c":

              a   b   c   d   e
high-scorers  5   6  80   5   4
low-scorers  15  14  40  16  15

What needs to be done to improve this test item?

nothing, this is a good test item
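
The response pattern can be checked with a short sketch. It assumes one common definition of the item-discrimination index, d = (U - L) / N, where U and L are the numbers of high and low scorers answering correctly and N is the size of each group. It also checks that every distractor attracts more low scorers than high scorers. The variable names are illustrative.

```python
options = ["a", "b", "c", "d", "e"]
high_scorers = dict(zip(options, [5, 6, 80, 5, 4]))
low_scorers = dict(zip(options, [15, 14, 40, 16, 15]))
correct = "c"

# Both groups contain the same number of examinees in this example.
group_size = sum(high_scorers.values())

# Item-discrimination index: positive when high scorers outperform low scorers.
d = (high_scorers[correct] - low_scorers[correct]) / group_size

# A working distractor should be chosen more often by low scorers.
distractors_ok = all(
    low_scorers[o] > high_scorers[o] for o in options if o != correct
)

print(d, distractors_ok)
```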

What is the relationship between the variance and the standard deviation?

one can be computed from the other

What effect does an excessive emphasis on nationally normed achievement tests for selection and evaluation appear to promote?

outright fraud and cheating

When instructions for a task are neutral or nonthreatening, test-anxious subjects

perform just as well as low-anxious subjects

T score scales are especially common for __________ tests.

personality

The sources of measurement error are

potentially knowable for individual cases

A ______ test allows enough time for test takers to attempt all items, but is constructed so that no test taker is able to obtain a perfect score.

power

As a group, Native Americans may tend to emphasize ___________ more than European Americans.

present time

The "brass instruments" era was a dead end because

psychologists mistook simple sensory processes for intelligence

Which of the following is NOT true about the concept of variance?

psychometricians prefer variance to standard deviation as an index of variability

The necessary prerequisite(s) to administering a new test are:

All of the above: reading the manual, memorizing key elements of instructions, and rehearsing the test

What is the relationship between the reliability and the validity of a psychological test?

reliability is necessary but not sufficient for validity

When initial research indicates that an instrument produces skewed results in the standardization sample, test developers typically

revamp the test at the item level

A ______ test typically contains items of uniform and generally simple level of difficulty.

speed

The most commonly used statistical index of variability in a group of scores is the

standard deviation

In a _______ scale, all raw scores are converted to a single-digit system of scores ranging from 1 to 9.

stanine
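
A minimal sketch of a stanine conversion, assuming the conventional percentage bands of 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent, applied to percentile ranks. The function name is hypothetical, and the handling of scores that fall exactly on a cutoff is an assumption of this sketch.

```python
from bisect import bisect_right

# Cumulative percentile cutoffs implied by the 4-7-12-17-20-17-12-7-4 bands.
CUTOFFS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine(percentile_rank: float) -> int:
    """Map a percentile rank (0 to 100) to a stanine from 1 to 9."""
    return bisect_right(CUTOFFS, percentile_rank) + 1

for pr in (2, 30, 50, 85, 99):
    print(pr, stanine(pr))
```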

The restriction imposed by Hollerith cards was a main impetus for the development of ________ scales.

stanine

A Likert scale is also referred to as a __________________.

summative scale

Methods for computing the reliability coefficient for a test involve

Both temporal stability and internal consistency

The essential objective of _________________ is to determine the distribution of raw scores in the norm group so that the test developer can publish derived scores known as norms.

test standardization

What concept is best summed up by the question, "Does use of this test result in better patient outcomes or more efficient delivery of services?"

test utility

What was the catalyst for the development of Binet and Simon's test?

the call for an instrument to identify cognitively impaired school children needing special instruction

The use of normalized standard scores is appropriate when

Both of the following: the normative sample is large and representative, and the raw score distribution is only mildly non-normal

According to the functionalist perspective on test validity, a test is valid if

the test serves the purpose for which it is used

What approach do most test developers use in choosing a norm group?

they strive to make a good faith effort to select a representative sample

Which of the following was used as a situational test by the Office of Strategic Services during WWII?

All of the above: transporting equipment across a raging brook, scaling a ten-foot-high wall, and surviving a realistic interrogation

The expression ____________ refers to the phenomenon in which a test predicts a criterion less well when used on a new sample of subjects.

validity shrinkage

When are expert judges needed to determine the content validity of a test?

when the trait being measured is ill-defined

The test item writer's aim is to make all or nearly all considered guesses _________ guesses.

wrong
