PSY310 Psychological Testing MIDTERM

Which court case resulted in the ruling for the first time that diversity considerations can be used in university admissions decisions?

Regents of the University of California v. Bakke (1978).

If a standardized test requires assessors to score by hand, then the test's manual will ideally contain an ample number of...

examples of correct, incorrect, and partially correct responses.

Outliers can be useful in identifying test takers who...

failed to understand the test instructions and/or to follow the test instructions.

From the perspective of test users, a typical objective of test takers' test-taking behavior, such as blackening small grids on an answer sheet with a number 2 pencil, is to...

predict behavior apart from blackening grids.

In his article entitled "A Method of Scaling Psychological and Educational Tests," L. L. Thurstone introduced absolute scaling, which was a...

procedure for obtaining a measure of item difficulty.

In Chapter 6 of your text, Adam Shoemaker, the featured professional in Meet an Assessment Professional, described the use of a test with little criterion validity. Dr. Shoemaker recalled that this test was used for the purpose of...

providing a "job preview" of sorts to aspirants.

According to the Close-Up in Chapter 1 of the textbook, the use of smartphones as an assessment tool has been generally well received by...

psychiatric patients.

Much of nineteenth-century psychological measurement focused on...

sensory abilities.

When graphing ratio data, why is it preferable to set the ordinate of the y-axis at 0?

setting the ordinate to other values may yield an exaggerated impression of the changes in the variable.

According to Dr. Joni Mihura, a distinguishing feature of meta-analytic research is its...

systematic rules for study inclusion.

Unreliable findings that reach general acceptance in the academic community...

tend to linger too long.

An assumption in psychological testing and assessment is that psychological traits can be measured. For this to be true...

test developers must clearly define the trait constructs the test purports to measure.

As used in the vocabulary of assessment professionals, protocol typically refers to...

test forms.

The Code of Fair Testing Practices in Education was developed and endorsed by...

test publishers and professional organizations.

An investigation of a test's construct validity may yield evidence that...

test scores increase with age, the test correlates with another test purporting to measure the same construct, and/or the test is measuring a single construct.

Which of the following is true for estimates of alternate- and parallel-forms reliability?

test scores may be affected by factors (such as motivation, fatigue, or intervening events like practice, learning, or therapy), item sampling is a source of error variance, and two test administrations with the same group are required.

Of the following parties to the assessment enterprise, which group would be least likely to have read the Standards?

test takers.

Which organization published Technical Recommendations for Psychological Tests and Diagnostic Tests?

the American Psychological Association.

Which of the following methods of setting cut scores is not data-driven?

the Angoff method.

To better focus on the cultural identity of a patient, a tool of assessment used by Neil Krishan Aggarwal is...

the Cultural Formulation Interview (CFI).

Stanley (1971) wrote that in classical test theory, a so-called true score is "not the ultimate fact in the book of the recording angel." By this, Stanley meant that...

the amount of test variance that is true relative to error may never be known.

Which statement is true regarding this distribution of scores: 1, 2, 2, and 3?

the arithmetic mean is equal to the mode.
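
This answer can be checked quickly. The lines below are a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not from the source.

```python
from statistics import mean, median, mode

# The distribution from the question.
scores = [1, 2, 2, 3]

# Compute the three common measures of central tendency.
central = {
    "mean": mean(scores),
    "median": median(scores),
    "mode": mode(scores),
}
print(central)
```

For this symmetric distribution the mean, median, and mode all coincide at 2.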

Role play may be preferable to naturalistic observation as a tool of measurement in situations where...

the cost of naturalistic observation would be prohibitive.

The first step in developing the Constructive and Unconstructive Worry Questionnaire was...

the creation of an item pool.

In the context of item response theory, the term discrimination best refers to...

the degree to which an item differentiates people with respect to a trait being measured.

If you were a psychologist working in the field of human resource management, which claim for a new personnel selection test by a test publisher would be most compelling and persuasive?

the test improves the hit rate.

Detailed information regarding how a particular test was developed can typically be found in...

the test manual.

Griggs v. Duke Power Company and Albemarle Paper Company v. Moody had what issue in common?

the use of tests that discriminate against minorities because they do not measure the specific skills required for the job.

The American Psychological Association issues both guidelines and standards. Standards are different from guidelines because...

they must be followed and are not simply aspirational.

The reason the Constructive and Unconstructive Worry Questionnaire was developed was...

to capture both diagnostic and therapeutic value of the worry construct.

Norms come in many different forms. Which of the following is not correctly referred to by the term "norms"?

trait norms.

A basic assumption about psychological testing and assessment is that...

traits and states can be quantified and measured.

Consider an employment test that is highly accurate (about 98 percent correct in classifications), but very costly (about $5,000 per test). For which of the following positions would the use of the test be most warranted?

translator for the U.S. Ambassador to the United Nations.

The legal term disparate impact refers to an employer's hiring or promotion practices that...

unintentionally have a negative impact on members of a protected class.

In the term latent-trait theory, "latent" is a synonym for...

unobservable.

What are used by organizations to assess the saving associated with different decisions?

utility analyses.

The endpoint of a utility analysis is typically an educated decision about...

which of the many possible courses of action is optimal.

The process of differential item functioning (DIF) analysis entails...

scrutinizing item response curves for DIF items.

Which of the following traits is most likely to be valued in an individualist culture?

self-reliance.

A prospective test user may ask many questions about a test's validity. Which of the following is not a validity-related question?

"As a tool of measurement, is this test consistent?"

Generally, which correlation coefficient best reflects the relationship between the number of calories consumed and body weight?

+0.8.

What is the value of the item-discrimination index for an item answered correctly by an equal number of students in the higher- and lower-scoring groups?

0.
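
As a rough illustration, assuming the common upper-group/lower-group form of the index, d = (U - L) / n, where U and L are the numbers of test takers answering correctly in the higher- and lower-scoring groups and n is the size of each group. The function name and the group size of 32 are hypothetical.

```python
def item_discrimination(upper_correct, lower_correct, group_size):
    """Return d = (U - L) / n for one item (assumed formula, equal-sized groups)."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (upper_correct - lower_correct) / group_size

# Equal numbers correct in both groups: the item does not discriminate.
d_equal = item_discrimination(20, 20, 32)
# Only lower scorers answer correctly: the worst possible value.
d_worst = item_discrimination(0, 32, 32)
# Only higher scorers answer correctly: the best possible value.
d_best = item_discrimination(32, 0, 32)
print(d_equal, d_worst, d_best)
```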

T scores have a mean of _____ and a standard deviation of _____.

50 and 10.
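
A small sketch of the conversion, assuming the usual linear transformation T = 50 + 10z, where z is the raw score's distance from the mean in standard deviation units. The function name and the sample mean of 100 and standard deviation of 15 are illustrative only.

```python
def t_score(raw, mean, sd):
    """Convert a raw score to a T score (mean 50, standard deviation 10)."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (raw - mean) / sd  # z score: distance from the mean in SD units
    return 50 + 10 * z

# A raw score at the mean maps to 50; one SD above the mean maps to 60.
at_mean = t_score(100, 100, 15)
one_sd_above = t_score(115, 100, 15)
print(at_mean, one_sd_above)
```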

According to Dr. Joni Mihura, multiscale personality tests may contain over _____, and sometimes over _____, scales that each need to be evaluated separately.

50 and 100.

It is an online community of asexual individuals that has become a source of recruitment of subjects for asexuality research. It is called the...

Asexual Visibility and Education Network.

Which of the following is true of both the standard error of measurement and the standard error of the difference?

Both provide confidence levels.

Groupthink is best defined as the tendency to...

make poor decisions when in collective decision-making situations.

What name is best associated with therapeutic psychological assessment?

Dr. Stephen Finn.

A list of unpublished tests is published as a resource to the assessment community. This list is available from...

Educational Testing Service.

Which of the following tests was mentioned by name by Dr. Delphine Courvoisier as an instrument that she uses in her daily work?

Health Assessment Questionnaire.

Tests relevant primarily to white middle-class students produce inaccurate and misleading test scores when administered to lower-class African-American students. This was the conclusion of a court in which of the following cases?

Hobson v. Hansen (1967).

Which of the following statements is true regarding the differences between a power test and a speed test?

In a power test, a test taker has time to complete all items; in a speed test, a time limit is imposed.

According to the text, what makes a smartphone "smart"?

It has a computational capacity.

Test items that contain alternatives with five points ranging from "strongly agree" to "strongly disagree" are characterized as using...

Likert scaling.

Which statistic is appropriate for use to estimate the heterogeneity of a test composed of multiple-choice items?

Pearson product-moment correlation coefficient.

Dr. Joni Mihura's Rorschach research had the effect of inspiring the creation of the...

Rorschach Performance Assessment System.

Which court case resulted in the ruling that psychotherapists must reveal privileged information if a third party is endangered?

Tarasoff v. Regents of the University of California (1974).

Which statement is true concerning a coefficient of correlation?

a correlation coefficient may be useful in prediction.

What publication presents standards for constructing and using fair and non-discriminative employment tests?

Uniform Guidelines on Employee Selection Procedures.

A strategy for cheating on an examination entails one test taker memorizing items and later recalling and reciting them for the benefit of a future test taker. This cheating strategy may be countered by...

a computer-randomized presentation of test items or a computer-tailored test administration to each test taker.

A test is considered to be biased if...

a factor inherent in the test systematically prevents accurate measurement.

Frequency distributions may be illustrated in a variety of ways. Which of the following graphs is most likely to be used?

a histogram.

Which is an example of the selected-response item format?

a multiple-choice item.

A nonlinear transformation is used to convert a raw score to...

a normalized score.

It's a group of people whose performance on a particular test is analyzed for reference in evaluating the performance of future test takers. This group is called...

a normative sample.

If an outlier exists in graphed test-related data, it may signal...

a problem in the wording of one of the test questions.

With regard to item-discrimination indices, a d equal to -1 is...

a test developer's nightmare.

Which is true about reliability in the psychometric sense?

a test may be reliable in one context and unreliable in another.

Which assessment technique is the best example of a face-valid method?

administering a word processing test to a person applying to a job that requires the use of a word processor.

Which of the following is the best example of a psychological trait?

aggressiveness.

A utility analysis of a test may best be thought of as...

an evaluation of the costs and benefits of the test.

Jennifer and Rafael happen to walk into a large company at the same time to apply for an advertised position as an accounting clerk. The Human Resource (HR) professional responsible for selecting the best candidate for the position orders an appropriate test of basic mathematical skills for both the applicants. Based on their scores, the HR professional chooses Jennifer for the job. The reason for this choice is a determination that Jennifer has an 85 percent chance of performing at an acceptable level. By contrast, Rafael's score indicated that he has only a 50 percent chance of performing successfully. The tool of assessment used to make this hiring decision most probably was...

an expectancy table.

An educational psychologist conducts a utility analysis of a teaching program used to improve the handwriting of very young children. The measure of utility in this analysis will most likely be a variable related to...

an increase in performance level.

Which of the following is a term for the graphed representation of an extremely atypical score that can sometimes provide a hint regarding a deficiency in the testing or scoring procedures?

an outlier.

For the purposes of the Ariel et al. (2015) study, "use of force" was coded as being present on any occasion that a police verbal confrontation with a citizen escalated to the point of...

any physical contact.

The case of Daubert v. Merrell Dow Pharmaceuticals, Inc. (1993) established...

criteria for determining the admissibility of expert testimony.

Predictive and concurrent validity can be subsumed under...

criterion-related validity.

Relating scores obtained on a test to other test scores or data from other assessment procedures is typically done in an effort to establish the _____ validity of a test.

criterion-related.

The effect of _____ of test scores for remedying adverse impact is to make equivalent all scores that fall within a particular range.

banding.

Which is true of item-characteristic curves (ICCs)?

biased items exhibit different shapes of ICCs for different groups when the two groups do not differ in total test score.

The development of a criterion-referenced test usually entails...

exploratory work with a group of test takers who have mastered the material and exploratory work with a group of test takers who have not mastered the material.

Prior to the development of the Constructive and Unconstructive Worry Questionnaire, research on worry had shown that the act of worrying can lead to...

both positive and negative outcomes.

According to Dr. Delphine Courvoisier, when a research project demands that subjects respond to a series of telephone calls, it would be instructive to understand how _____ affects the other variables in the study.

compliance.

The form of criterion-related validity that reflects the degree to which a test score is correlated with a criterion measure obtained at the same time that the test score was obtained is known as...

concurrent validity.

The right to informed consent establishes that...

consent must be informed, competent, and voluntary, by mentally able participants able to carry out personal affairs; incompetence must be established through a special court hearing.

Blueprinting is best associated with ___ validity.

construct.

A review of existing measures of individual differences in worry suggested to the authors of the Constructive and Unconstructive Worry Questionnaire that none of the measures were made to distinguish people's tendency to worry...

constructively from their tendency to worry unconstructively.

Unlike norm-referenced interpretation, criterion-referenced interpretation...

describes a test taker's performance in terms of predetermined standards.

The term test heterogeneity best refers to the extent to which test items measure...

different factors.

A test item functions differently in one group of test takers as compared to another group of test takers known to have the same level of an underlying trait. This phenomenon is known as...

differential item functioning.

A psychological interview requires...

direct, reciprocal communication.

The higher the item-difficulty index, the _____ the item.

easier.
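
To see why, here is a minimal sketch that assumes the item-difficulty index is the proportion of test takers who answered the item correctly. The function name and the counts are made up for illustration.

```python
def item_difficulty(num_correct, num_test_takers):
    """Return p, the proportion of test takers answering the item correctly."""
    if num_test_takers <= 0:
        raise ValueError("num_test_takers must be positive")
    return num_correct / num_test_takers

# 90 of 100 correct gives a high index (an easy item);
# 25 of 100 correct gives a low index (a hard item).
p_easy = item_difficulty(90, 100)
p_hard = item_difficulty(25, 100)
print(p_easy, p_hard)
```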

Which is not an economic cost typically factored in a test utility analysis?

fees paid by test takers for testing services.

The television program Dancing With the Stars...

has a multiple-hurdle selection model in place.

A difference between a ratio scale and an interval scale is that a ratio scale...

has an absolute zero point.

Error variance for measures of inter-item consistency comes from...

heterogeneity of test items.

Which is an example of a false positive in the context of employee selection?

hired applicants who scored at or above the cut-off score on the employment test went on to fail on the job.

A new test designed to gauge competency to stand trial is found to lack face validity. Which is the most likely consequence of this fact?

impression management will be a poor factor in the test results.

The greater the value of the item-discrimination index, the more test takers answered the item correctly in the higher-scoring group as compared to test takers...

in the lower-scoring group.

As used in Chapter 5 of your text, the term inflation of variance is synonymous with...

inflation of range.

In general, criterion-referenced testing...

is useful for communicating information regarding the mastery of basic skills such as reading and writing.

Which is true regarding the adjustment of test scores by group membership?

it is viewed as allowing preferential treatment of certain groups, it is viewed as helping guarantee the proportional representation of various minority groups in the workplace, and (according to the Civil Rights Act of 1991) it is illegal for purposes of making hiring or promotion decisions.

How is a stanine different from other standard scores?

it takes on whole values from 1 to 9.

Which of the following conditions may lead to the decision to revise a psychological or educational test?

item content (including vocabulary used in instructions and pictures) has become dated, reliability and validity can be improved by a revision, and/or test norms no longer represent the population for which the test is designed.

Which source of error variance affects parallel- or alternate-form reliability estimates but does not affect test-retest estimates?

item sampling.

A vice president (VP) of personnel employs a "Corporate Screening Test" in the hiring process. For future testing purposes, the VP maintains records of scores achieved by _____, as opposed to _____, in order to avoid restriction of range effects.

job applicants; hired employees.

Legal and ethical mandates regarding vocational assessment require that performance on vocational tests measure...

job-related abilities.

Laws are rules of conduct that derive from a society's...

legislatures and courts.

Tests are categorized as...

level A, B, and C, depending on how much testing-related knowledge is necessary to administer the test.

Criterion-referenced testing and assessment is most typically employed in...

licensing for occupations and professions.

According to Dr. Delphine Courvoisier, quality-of-life research for patients with a chronic disease...

may be conducted at different points in time through the course of the disease.

When "putting tests to the test" and researching the validity of a particular instrument for a particular purpose, test users...

may need to research which combination of tests best suits a particular purpose.

According to Brotto and Yule, their new measure of asexuality performed satisfactorily on...

measures of incremental validity, convergent validity, and discriminant validity.

Which are examples of constructs?

mechanical aptitude, depression, and intelligence.

A negative item-discrimination index results for a particular item when...

more low scorers than high scorers on a test get the item correct.

In a scale used in a psychology experiment, a value of "1" is assigned to subjects with black hair, and a value of "2" is assigned to subjects with blonde hair. The type of scale used in this experiment is...

nominal.

Each of the three approaches to validity assessment in the trinitarian model should best be thought of as...

one type of evidence that, with others, contributes to a judgment concerning the validity of a test.

If a time limit is long enough to allow test takers to attempt all items, and if some items are so difficult that no test taker is able to obtain a perfect score, then the test is referred to as a _____ test.

power.

Sorting techniques can be employed to develop...

ordinal scales.

The median and the interquartile range are _____ in nature.

ordinal.

The Family Educational Rights and Privacy Act mandated that...

parents and students have an opportunity to review their school records.

Of the following list, where would Taylor-Russell tables be most likely to be found?

personnel offices.

The ABAP Diplomate is a symbol of...

recognition and accomplishment by a private organization.

Which is an example of a false negative in the context of employee selection?

rejected applicants who scored below the cut-off score on the employment test and who would have succeeded on the job had they been hired.

The term psychometric soundness is typically a reference to a test's...

reliability and validity.

The use of tools of psychological evaluation to gather data and to draw conclusions about a subject who is not in physical proximity to the person or people conducting an evaluation is called...

remote assessment.

Which of the following is not an alternative to classical test theory cited in your text?

representational theory.

The idea for a new test may come from...

review of available literature, social need, and/or common sense appeal.

The term norms refers to the...

scores with which the results of subsequent measurement can be compared.

In an undergraduate measurement course, an instructor announces that the first examination will cover the topics of reliability and validity. One student in the class, Jamarr, publicly predicts that only questions on reliability will be posed. As it turns out, true to Jamarr's prediction, all of the test questions are only on the topic of reliability. Given this background, which of the following is the most reasonable conclusion that Jamarr's fellow students could draw?

the first examination lacked content validity.

When setting a cut score on a predictor...

the goal of selection must be taken into account.

A new women's professional basketball team uses a composite score on a series of tests to evaluate 100 prospective players. For the 25 openings on the team, a process of top-down selection is used to award team positions. Accordingly...

the highest scorer is in the first position to be hired.

Test items with little discriminative ability prompt a test developer to consider the possibility that...

the item is poorly worded and needs to be rewritten, the item is too complex for the educational level of the population, and/or the content of the item does not match the construct measured by the other items in the scale.

What is the relationship between a coefficient of determination and a correlation coefficient?

the larger the correlation coefficient, the larger the coefficient of determination.
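
The relationship can be sketched numerically, taking the coefficient of determination to be the square of the correlation coefficient. Because squaring drops the sign, "larger" is best read in terms of absolute value. The function name and sample values are illustrative.

```python
def coefficient_of_determination(r):
    """Return r squared, the proportion of variance shared by two variables."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    return r * r

# Larger correlations (in absolute value) give larger shared variance.
shared = {r: round(coefficient_of_determination(r), 2) for r in (0.2, 0.5, 0.8)}
print(shared)
```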

Which historical event was the impetus for the awarding of federal funds to schools in an effort to identify gifted and talented students?

the launch of Sputnik.

If a distribution of scores has a few extremely low scores and no corresponding high scores, which of the following would be true?

the mean would be smaller than the median.
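
A quick numerical check, using a made-up set of scores with one extremely low value and no matching high value (the data are hypothetical):

```python
from statistics import mean, median

# One extreme low score pulls the mean down; the median barely moves.
skewed_scores = [5, 70, 72, 75, 78, 80]

skewed_mean = mean(skewed_scores)
skewed_median = median(skewed_scores)
print(skewed_mean, skewed_median)
```

The same behavior is why the median is usually preferred over the mean for skewed distributions.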

Which of the following statistics is the preferred measure of central tendency for a skewed distribution?

the median.

The known groups method for setting cut scores is also known as...

the method of contrasting groups.

In the context of utility, the hit rate is equivalent to...

the number of correct classifications divided by the total number of classifications.
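
As a sketch of that ratio for a selection test, correct classifications are taken here to be true positives plus true negatives. The function name and the counts are hypothetical.

```python
def hit_rate(true_pos, true_neg, false_pos, false_neg):
    """Return correct classifications divided by all classifications."""
    total = true_pos + true_neg + false_pos + false_neg
    if total == 0:
        raise ValueError("at least one classification is required")
    return (true_pos + true_neg) / total

# 40 correctly hired, 30 correctly rejected, 20 false positives, 10 false negatives.
rate = hit_rate(40, 30, 20, 10)
print(rate)
```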

Which is an example of biofeedback equipment that can be used as a tool of psychological assessment?

the penile plethysmograph.

In the language of psychological testing and assessment, reliability best refers to...

the proportion of total variance that can be attributed to true variance.
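
In classical test theory terms, total variance is the sum of true variance and error variance, so the proportion can be sketched as below. The function name and variance values are illustrative, and in practice true variance is estimated rather than observed.

```python
def reliability_coefficient(true_variance, error_variance):
    """Return true variance as a proportion of total (true + error) variance."""
    total_variance = true_variance + error_variance
    if total_variance <= 0:
        raise ValueError("total variance must be positive")
    return true_variance / total_variance

# 80 units of true variance and 20 units of error variance.
rel = reliability_coefficient(80, 20)
print(rel)
```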

A review of a new personality test is published in a journal. In that review, it would be reasonable to expect to find information about...

the psychometric soundness of the test.

Psychometrics may best be defined as...

the science of psychological measurement.

The term psychometrics may be defined as...

the science of psychological measurement.

A score at the 20th percentile on a high school history test is...

the score at or below which 20% of the scores in the distribution fall.
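
A minimal sketch of that definition, computing the percentage of scores in a distribution that fall at or below a given score. The function name and the ten scores are hypothetical.

```python
def percent_at_or_below(scores, score):
    """Return the percentage of scores that are less than or equal to score."""
    if not scores:
        raise ValueError("scores must not be empty")
    count = sum(1 for s in scores if s <= score)
    return 100 * count / len(scores)

history_scores = [55, 60, 62, 65, 70, 72, 75, 80, 85, 90]

# Two of the ten scores are at or below 60, so 60 sits at the 20th percentile.
pct = percent_at_or_below(history_scores, 60)
print(pct)
```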

The definition of a norm-referenced test as compared to a criterion-referenced test differs primarily with respect to...

the score to which a test taker's score is compared.

In a norm-referenced approach to testing, a test taker's scores are compared with...

the scores of other individuals who were administered the test.

The manifestation of behavior associated with a particular psychological trait is to some extent dependent on...

the situation a person is in.

Which statistic can help the test user determine how large a difference must be for scores yielded from two different tests to be considered statistically significant?

the standard error of the difference between two scores.
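
For illustration, assuming both tests are reported on the same scale with standard deviation sd and have reliability coefficients r1 and r2, the standard error of measurement is sd * sqrt(1 - r) and the standard error of the difference is sd * sqrt(2 - r1 - r2). The function names and sample values are illustrative.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement for one test."""
    return sd * math.sqrt(1 - reliability)

def sed(sd, reliability_1, reliability_2):
    """Standard error of the difference between scores on two tests."""
    return sd * math.sqrt(2 - reliability_1 - reliability_2)

# Two tests on a scale with sd = 10, each with reliability .84.
sem_value = sem(10, 0.84)
sed_value = sed(10, 0.84, 0.84)
print(round(sem_value, 2), round(sed_value, 2))
```

The second value is larger than the first because a difference score carries the measurement error of both tests.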

National anchor norms for a test taken by university students allow for comparisons between...

the students' scores on that test and the scores of another test with the same purpose.

Internal consistency estimates of reliability are inappropriate for...

word processing tests based on speed.

