Psychometrics

If an individual has a high score on the MMPI F scale, the test administrator might suspect (though would not know for sure) that the individual was responding with ___________________. a. acquiescence bias b. extremity bias c. malingering d. guessing

C

How might you be able to increase the reliability of a test? a. shorten the test by removing items that are consistent with the other items on the test b. lengthen the test by adding items that are relatively inconsistent with the other items on the test c. remove items that detract from internal consistency and replace them with items that enhance internal consistency d. remove items that increase internal consistency and replace them with items that reduce internal consistency

C

"When a respondent realizes that she/he is being assessed, and that realization affects or biases the way that she/he responds to the test" matches which term? a. participant reactivity b. hypothetical constructs c. operational definition d. psychometrics

A

Which of the following is the best for evaluating the construct validity of a test or assessment? a. Evaluate whether test scores are correlated with things that they should be correlated with. b. Evaluate whether test scores are consistent from one time to a later time. c. Evaluate whether the different items on a test are consistent with each other. d. Evaluate whether test scores do not differ (on average) across groups of respondents.

A

A doctoral student wants to develop a psychotherapeutic treatment for Dependent Personality Disorder (DPD). As part of his dissertation, he develops a new test to measure DPD, but he does not evaluate its psychometric quality very carefully. He subsequently uses the test in his dissertation. He finds that its scores are indeed affected (i.e., reduced) by a new psychotherapeutic treatment that he has developed. He, therefore, concludes that the new psychotherapeutic treatment is effective at reducing DPD, and he recommends it as a clinical option for DPD. However, one of his dissertation committee members points to the items on his new measure, and she notes (correctly) that many of them look like they reflect anxiety more so than DPD. Thus, his new measure (which supposedly reflects only levels of DPD) might reflect Generalized Anxiety Disorder (GAD) along with DPD (i.e., the scores likely reflect some messy blend of the two). Based on this, she thus suggests that, unfortunately, his new measure might lack ______________ validity. Unfortunately, it is thus not clear what his new psychotherapeutic treatment does. Yes, the treatment seems to affect the scores on his new measure. But does

A

Based upon the logic of the MTMM matrix, which of the following correlations would you expect to be the largest? a. The correlation between self-esteem scores from two self-report scales b. The correlation between self-esteem scores from a self-report measure of self-esteem, and self-esteem scores from an "informant report" method of measurement c. The correlation between self-esteem scores and extraversion scores, where both are measured by self-report methods d. The correlation between self-esteem scores from a self-report measure of self-esteem, and extraversion scores from an "informant report" method of measurement

A

Conceptually, what does reliability refer to? a. the degree to which differences in test scores are consistent with differences in true scores b. the degree to which we correctly interpret the meaning of test scores c. the degree to which a test is unbiased and works equally well for all groups of respondents d. the raw amount of true differences among respondents

A

Data involving levels of severities in depression might be considered what type of data? a. Ordinal b. Nominal c. Interval d. Ratio

A

From the "alternate forms" approach to reliability, how do you estimate reliability? a. Create two versions of the same test, administer both forms to a single group of participants, and correlate the scores from the two forms. The correlation is your estimate of reliability. b. Create two versions of the same test, administer both forms to a single group of participants, and correlate the scores from the two forms. Multiply the correlation by two (to reflect the two forms), and that is your estimate of reliability. c. Create two versions of the same test, administer both forms to a single group of participants, and compute the mean score on each version. The difference between the means reflects the reliability of the test. d. Create several alternate versions of the same test and determine which one is the best. Estimate the reliability of that test, and it's the best estimate for the entire test.

A

Imagine that your raw score on a Neuroticism scale was 45, where the average score was 50, with a standard deviation of 5. What is your z score (i.e., standard score)? a. -1 b. .90 c. 1 d. -5

A
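
The arithmetic behind this answer can be checked with a short sketch. The helper name `z_score` is hypothetical (it does not come from the quiz); the formula is the usual standard score, (raw score - mean) / standard deviation.

```python
def z_score(raw, mean, sd):
    """Return how many standard deviations `raw` lies from `mean`."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (raw - mean) / sd


# Values from the question: raw score 45, mean 50, standard deviation 5.
print(z_score(45, 50, 5))  # -1.0
```

A negative result means the score is below the mean; here it is exactly one standard deviation below.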

Imagine you were told that a new test had a reliability of .87. How should this be interpreted? a. This is a good level of reliability. b. This is a poor level of reliability. c. It depends; depending on the test and its purpose, this could be good or poor. d. This level of reliability is theoretically impossible.

A

In a research context, low reliability will: a. make the observed effect sizes (e.g., correlations among measures) weaker than they should be b. make the observed effect sizes (e.g., correlations among measures) stronger than they should be c. make the observed effect sizes (e.g., correlations among measures) either weaker or stronger than they should be d. not influence the observed effects, but it will influence the "true" effects

A

Loren develops a new test of mathematical ability. She views math ability as a two-factor construct—one factor representing calculus ability and one representing non-calculus ability. Her new test includes various types of math items (i.e., calculus, algebra, and geometry). Based on her theory of the "math ability" construct, she hypothesizes that there will be two weakly correlated dimensions to her test—a) calculus ability, and b) non-calculus math ability, with all algebra and geometry items loading equally strongly on the non-calc factor. After collecting responses to her test from many students, she finds that the test seems to have a three-dimensional structure (algebra ability, geometry ability, calculus ability) with weakly correlated dimensions. This finding suggests: a. a lack of internal structure validity b. a good level of internal structure validity c. a lack of content validity d. good content validity

A

Say that the test developer is designing a study to evaluate the convergent validity of her new self-report measure of extraversion. She is considering two studies. In study A, she would administer her new measure, along with a self-report measure of happiness. In study B, she would administer her new measure to a sample of participants, and she would observe each participant's behavior in an in-lab social interaction. Based on those behavioral observations, she would then rate each participant's apparent happiness. In both studies, she would correlate her scale's scores with scores on happiness (whichever way it had been measured). Setting other factors aside for a moment (e.g., random measurement error), what would she expect about a difference between her findings from these two studies? a. The validity correlation would be higher in Study A than in Study B b. The validity correlation would be higher in Study B than in Study A c. The validity correlation would be identical across the studies d. The validity correlation could not be determined for Study A or B.

A

Suppose a teacher creates a multiplication test, and its items involve questions related to American football (e.g., "how many yards does a team need to gain in order to get three first-downs?"). So, to do well on the test, respondents must be familiar with football and have an understanding of multiplication. This could raise concerns about construct bias because: a. Among people who are familiar with football, a poor test score indicates a "lack of understanding of multiplication"; however, among people who are NOT familiar with football, the same poor test score does not necessarily indicate "lack of understanding of multiplication." b. Among people who are familiar with football, those who do well on the test actually "lack an understanding of multiplication" whereas those who do poorly actually have a good "understanding of multiplication." c. Among people who are not familiar with football, those who do well on the test truly do have a good "understanding of multiplication," whereas those who do poorly truly have a poor "understanding of multiplication." d. Among people who are not familiar with football, a poor test score indicates a good "understanding of multiplication," whereas

A

Suppose we study cultural differences in negative emotionality (NE, the tendency to experience negative emotions). We administer a three-item test of NE to a group of Canadians and a group of Russians. Because we're aware of the potential for construct bias, we conduct a factor analysis of the NE items separately for each group. Within each group, we seem to have a one-factor structure, with the factor loadings in the table below. Based on these results, which of the following can we not conclude about the NE test? a. There is no concern about construct bias, according to these results. b. Among Canadians, NE test scores generally reflect differences in Anger. c. Among Russians, NE test scores generally reflect differences in Sadness. d. If we compare the NE test scores across cultures, we're mainly comparing Anger (for Canadians) to Sadness (for Russians).

A

The use of "balanced" scales is intended to minimize the effect of which type of response bias? a. acquiescence bias b. extremity bias c. malingering d. guessing

A

When evaluating her new measure of Schizoid PD, Sarah finds that its scores are correlated with clinicians' diagnoses of an entirely different PD (Obsessive-Compulsive) at r = .80, p < .05. This strong positive correlation would be evidence of: a. poor discriminant validity b. good convergent validity c. good structural validity d. construct bias

A

Which of the following is an example of extremity? a. A respondent could get a relatively high score on loneliness by selecting the more intense or extreme options (regardless of whether she/he truly feels lonely). b. A respondent's score might be a poor reflection of her/his true loneliness because the test blends easy and hard, potentially confusing the respondent's "agreement" choices. c. An "obvious" scale like this would be easy to manipulate, in terms of appearing lonely (or not) if one wanted to. d. A respondent could get a relatively high score on loneliness just by wanting to agree with items of all types (regardless of whether she/he truly feels lonely).

A

Why is a construct's nomological network important in terms of construct validity? a. It leads test-developers to form hypotheses about the other measures that a test should (and should not) be correlated with, thus guiding the design and evaluation of validity studies b. It tells test-developers which construct would be most appropriate for a pre-specified method of assessment, thus guiding test evaluation c. It tells researchers which "informants" from participants' social networks would be most likely to provide valid ratings of the participants, guiding the design of validity studies. d. It tells researchers which content/items should be included in a measure, thereby guiding the evaluation of content validity.

A

A researcher wants to test whether there are sex differences in math ability. He gives his students a multiplication test and finds that females score lower, on average, than males. In terms of construct bias, why might he need to be very cautious before interpreting this finding as evidence that females truly have lower "multiplication ability" (on average) than males? a. The males and females in his class might not accurately represent the "population" of males and females. b. The test scores might not reflect multiplication ability equally well in the two groups (e.g., the test scores might not be a good measure of multiplication ability for most/many females). c. The males in his class might have had more lucky guesses than females, thus earning higher scores. d. Math ability is not an important construct, so bias is inherent in its examination.

B

After taking this exam, you find out that your standard score (z score) is .99. What's the correct interpretation for this? a. Your score is higher than 99% of your classmates' scores. b. Your score is almost 1 standard deviation higher than the average class score. c. You earned 99% of the possible points on the test. d. Compared to the average person, your score is relatively low.

B

An important criterion for admission consideration to an inpatient psychiatric unit is whether a client is suicidal or not. This type of data is considered: a. Ordinal b. Nominal c. Interval d. Ratio

B

As students in Introductory Psychology class, Marie and Martin participate in a research study. They complete a survey about "attitude towards alcohol use." Let's say that they actually have the same level of that attitude—on a scale of 1 (totally anti-alcohol) to 10 (totally enthusiastic about alcohol), they are both "truly" 1. However, when completing the survey, they provide slightly different answers. Marie is willing to use the scale minimum (of 1), whereas Martin shies away from making such an intense response (e.g., he considers the fact that he does go to parties with alcohol, so he doesn't feel "justified" in making the lowest possible response). So, although they do have the same attitude, they end up with different scores on the survey. This situation is an example of which response bias? a. acquiescence bias b. extremity bias c. social desirability bias (impression management) d. social desirability bias (self-deception)

B

From the "test-retest forms" approach to reliability, how do you estimate reliability? a. Create two versions of the same test, administer both forms to a single group of participants, and correlate the scores from the two forms. The correlation is your estimate of reliability. b. Create one version of a test, administer it to a single group of participants at two different times, and correlate the scores from the two testing times. The correlation is your estimate of reliability. c. Create one version of a test, administer it to a single group of participants at two different times, and correlate the scores from the two testing times. Multiply the correlation by two (to reflect the two testing occasions), and that's your estimate of reliability. d. Create one version of a test, administer it twice to a single group of participants, and compute the mean score on each testing occasion. The difference between the means reflects the reliability of the test.

B

Let's say that you administered a test to a set of respondents. You then improved the test in a way that would ultimately improve its reliability. You then administered the revised test to the same set of original respondents. According to CTT, which of the following would happen? a. increase true score variance b. decrease error variance c. increase the correlation between true scores and error scores d. increase the average error score

B

Mayes and Ganster (1983) conducted an MTMM analysis of two methods of measuring four psychological needs: • Two methods: the Manifest Needs Questionnaire (MNQ) and the Personality Research Form (PRF) • Each method included scales measuring four different psychological needs: achievement, autonomy, affiliation, and power (i.e., these are hypothesized to be relatively independent needs). Below is the key Table from their article. The left-hand part of the table includes the scale labels, along with their means, standard deviations, and alpha values. In the right-hand part of the table is an MTMM matrix. From the perspective of an MTMM, the correlations in the "unboxed diagonal" above (i.e., .55, .38, .48, and .74) reflect what? a. content validity b. convergent validity c. discriminant validity d. predictive validity

B

Say my z score (standard score) on a self-esteem test is .50. How can you interpret that value? a. My self-esteem is 50% of the total possible self-esteem as measured by the test b. My self-esteem score is one-half of a standard deviation above the mean score. c. Compared to the average person, my self-esteem score is relatively low. d. I score higher than 50% of the other people who have taken the test.

B

Say that I develop a test, and I try to convince you that it is a measure of intelligence. You then gather responses to the test, estimate reliability, and find that those test scores have high (estimated) reliability. What can we conclude about those scores, assuming that they do indeed have high reliability? a. the test is unbiased and works equally well for all groups of respondents b. responses to the test items were not strongly affected by measurement error c. the test's (observed) scores are uncorrelated with other scores that they should be uncorrelated with d. The test is a good (precise) measure of intelligence

B

Say that a confidence interval around a true score was 950 to 1050. Which is the best interpretation of these values? a. We are 95% confident that the child's observed score falls between 950 and 1050. b. We are 95% confident that the child's true score lies between 950 and 1050. c. The true score is likely to be statistically significant if it is in the range of 950 to 1050. d. The true score is unlikely to be statistically significant if it is in the range of 950 to 1050.

B
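
How an interval like this arises can be sketched with the standard error of measurement, SEM = SD x sqrt(1 - reliability). This is a minimal illustration under assumptions: the point estimate (1000), standard deviation (100), and reliabilities (.90 and .99) are made-up values, the function names are hypothetical, and 1.96 is the usual multiplier for a 95% interval.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), with reliability between 0 and 1."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def true_score_interval(point_estimate, sd, reliability, z=1.96):
    """Return (lower, upper) bounds of a confidence interval around a true-score estimate."""
    sem = standard_error_of_measurement(sd, reliability)
    return (point_estimate - z * sem, point_estimate + z * sem)


# Hypothetical values: the same point estimate and SD, two different reliabilities.
for rxx in (0.90, 0.99):
    low, high = true_score_interval(1000, 100, rxx)
    print(f"reliability {rxx:.2f}: {low:.1f} to {high:.1f}")
```

Higher reliability yields a smaller SEM and therefore a narrower, more precise interval.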

Say that the test developer wants to use her newly developed brief test of intelligence to assess intellectual disability (ID). To determine whether the test's scores lead to the correct identification of individuals who have ID, she compares her "brief test" scores to the results of an in-depth cognitive assessment (seen as the gold standard of diagnosing ID). Let's say that she conducts a study in which participants are assessed via her new test and an in-depth interview. Comparing the scores, she finds a "sensitivity" of .90. What does this mean? a. Of the individuals who the test identifies as having ID, 90% truly have ID. b. 90% of the individuals who truly have ID are correctly identified (by the new test) as having ID c. 90% of the individuals who truly do NOT have ID are correctly identified (by the test) as not having ID d. It reflects the ability of the test to identify people who do not have ID

B
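
The distinction among the answer options can be made concrete with a small sketch. The counts below are hypothetical (they are not from the question), chosen so that sensitivity comes out to .90; the function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of people who truly have the condition that the test flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of people who truly lack the condition that the test clears."""
    return true_neg / (true_neg + false_pos)


def positive_predictive_value(true_pos, false_pos):
    """Proportion of people flagged by the test who truly have the condition."""
    return true_pos / (true_pos + false_pos)


# Hypothetical sample: 100 people with ID (90 flagged, 10 missed)
# and 400 people without ID (360 cleared, 40 wrongly flagged).
tp, fn, tn, fp = 90, 10, 360, 40
print(f"sensitivity: {sensitivity(tp, fn):.2f}")                # option b
print(f"specificity: {specificity(tn, fp):.2f}")                # option c
print(f"positive predictive value: {positive_predictive_value(tp, fp):.2f}")  # option a
```

With these made-up counts, sensitivity and positive predictive value differ, which is why options a and b are not interchangeable.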

Say that, in psychological reality, extraversion and happiness are positively linked: relatively extraverted people tend to be relatively happy, whereas relatively introverted people tend to be less happy. Let's say that the "true" correlation between extraversion and happiness is r = .40. Finally, let's say that the test-developer measures both constructs in a sample of participants, where the reliabilities of the two measures are .75 and .60. According to the assumptions of classical test theory, what will the correlation between the measures be? a. The correlation will be positive and greater than .40 b. The correlation will be positive but less than .40 c. The correlation will be zero. d. The correlation will be negative

B
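
The expected size of the observed correlation can be worked out with the classical attenuation formula, r_observed = r_true x sqrt(R_XX x R_YY). A minimal sketch using the numbers from the question; the function name is hypothetical.

```python
import math


def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Expected observed correlation given the true correlation and both reliabilities."""
    return true_r * math.sqrt(reliability_x * reliability_y)


# Values from the question: true r = .40, reliabilities .75 and .60.
observed = attenuated_correlation(0.40, 0.75, 0.60)
print(round(observed, 2))  # 0.27
```

Because both reliabilities are below 1, the expected observed correlation is positive but smaller than the true value of .40.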

The definition "theoretical psychological characteristics, attributes, processes, or states that cannot be directly observed" matches which term? a. participant reactivity b. hypothetical constructs c. operational definition d. psychometrics

B

The use of an MTMM matrix is based on which realization? a. Self-report questionnaires are (believed to be) prone to error from respondents' response biases. b. The magnitude of a validity correlation is affected by both: a) the actual association between constructs, and b) similarity or difference in the methods by which those constructs are measured. c. The correlation between a test and an important criterion might vary from one sample or context to another. d. If a test developer unsystematically eyeballs a set of convergent and discriminant validity correlations, then she or he might interpret the results in subjective and inaccurate ways.

B

The variance: a. is the sum of participants' scores divided by the number of scores b. reflects the degree to which participants' scores differ from each other c. is an index of skewness d. is inferior to the range as an index of central tendency

B

What is the best definition of a "point estimate" of a true score? a. the range of values within which we strongly believe the person's true score is likely to be located b. our best guess about a test taker's actual standing on a particular psychological attribute c. our best guess of the actual reliability of a set of test scores d. the point at which an individual's true score is greater than the amount of error affecting her/his observed score

B

Which is the best (i.e., more precise) confidence interval? a. 950 to 1050 b. 990 to 1010 c. 910 to 1010 d. 990 to 1050

B

Which of the following is not a valid statement about the actual occurrence and effects of various response biases? a. Acquiescence bias is most likely to occur when test-takers do not easily understand a test's items b. Research shows that, although it's a theoretical possibility, extremity bias does *not* seem to be a problem in practice (and thus there's little reason to be concerned about it) c. Psychologists are debating whether the so-called social desirability response "bias" at least partly reflects meaningful personality characteristics (and is thus not "bias"). d. There is evidence that some lawyers may encourage their clients to "fake bad" on psychological assessments in some circumstances

B

Which of the following results would typically be taken as evidence of discriminant validity? a. large positive correlation b. correlation of about zero c. large negative correlation d. two-factor internal structure

B

Which of the following would indicate the presence of construct bias? a. group differences in a test's means b. group differences in a test's factorial structure c. group differences in the correlation between test scores and a key criterion variable d. group differences in admission/hiring rates, where admission/hiring is at least partially based upon test scores

B

Construct underrepresentation occurs when _________________________. a. you're evaluating the validity of a test, and you find that the sample of research respondents does not have the construct of interest b. you're evaluating the validity of a test, and you find that the sample of research respondents does not represent the full range of trait/ability levels of the construct c. a test does not cover the entire range of content that is relevant to its intended construct d. the construct that's intended to be measured by a test is not well-established as an important psychological attribute worthy of measurement.

C

Which of the following is the most straightforward example of a lack of construct validity? a. A researcher uses a scale that a colleague designed to measure depression. The scale's scores measure depression but do so very imprecisely (with a great deal of random measurement error). This imprecision harms her ability to detect meaningful psychological effects in her data. b. The admissions staff at a university requires applicants to participate in an admissions interview. In this interview process, interviewers rate the applicant's "capacity for academic achievement." These interview-based ratings do reflect this capacity but do so more precisely for females than for males. This leads the staff to make better-informed admissions decisions for female applicants than for male applicants. c. A clinical psychologist designs a new scale that he believes to reflect psychopathy, and he uses it in his practice and interprets it as such when working with clients. In fact, the scale does not measure psychopathy; it instead reflects histrionic disorder. This misinterpretation leads to ineffective and potentially harmful work with his clients. d. A lack of construct validity cannot be constr

C

If the correlation (r_oe) between observed scores and error scores is 0, then R_XX will equal: a. 0 b. -1 c. 1 d. the value cannot be determined

C

A developmental psychologist studies the development of reading ability. He studies a sample of 7-year-olds and a sample of 10-year-olds. Not surprisingly, he finds that the 10-year-old group has a higher mean level of reading ability than the 7-year-old group. He also finds that the 10-year-old group has a larger standard deviation than the 7-year-old sample. What does the difference in standard deviations tell us? a. The average 10-year-old reads better than the average 7-year-old. b. All 10-year-olds read better than all 7-year-olds. c. There are greater differences among the 10-year-olds' reading abilities than among the 7-year-olds' reading abilities. d. There are fewer differences among the 10-year-olds' reading abilities than among the 7-year-olds' reading abilities.

C

A z score: a. is the inverse of the mean b. is a measure of error variance c. reflects how far a participant's score falls from the mean d. is a measure of the variance

C

Consider a legal context in which an individual sues an insurance company for compensation. The individual claims that, due to an accident, he is suffering from memory impairments. As part of the lawsuit, the insurance company hires a clinical psychologist to administer a standardized memory test. Which of the following biases is most likely to be a concern (for the insurance company) in this context? a. acquiescence bias b. extremity bias c. malingering d. guessing

C

Imagine that we examine the predictive accuracy of SAT scores by correlating those scores with college GPA. Imagine we find that the correlation between SAT scores and CGPA is r = .40 for females and r = .20 for males. What is the most accurate interpretation of these findings? a. SAT scores are higher among females than among males b. Both SAT scores and CGPA scores are higher (on average) among females than among males c. SAT scores are more strongly associated with CGPA among females than among males d. The distribution of SAT scores is more skewed among females than among males.

C

Imagine that you were asked to complete a personality inventory, and one of the items is: Which of the following is more characteristic of you? ___ Creative ___ Well-adjusted What type of item is that and which response bias is it intended to deal with? a. balanced; acquiescence bias b. balanced; social desirability bias c. forced-choice; social desirability bias d. forced-choice; extremity bias

C

In a Normal Distribution curve, a score that falls at 96% above or below the Mean score is how many Standard Deviations from the Mean Score? a. One standard deviation b. 1.5 standard deviations c. Two standard deviations d. 2.5 standard deviations

C

In some situations (e.g., research), the purpose of measurement is not to shape decisions or knowledge about specific individuals. In such situations, test-takers might remain anonymous. What is the potential effect of such anonymity? a. makes respondents less likely to answer with an extremity bias b. makes respondents more likely to answer with an extremity bias c. makes respondents more willing to admit to socially undesirable qualities or behaviors d. makes respondents feel that they can respond certainly and/or strategically

C

Let's say that Amy's true score for a self-esteem test was 45, and Barry's true score was 40. If the test scores have good reliability, what is the difference that we should see between Amy's and Barry's observed test scores? a. Amy's observed score will be exactly equal to Barry's. b. Amy's observed score will be 5 points less than Barry's. c. Amy's observed score will be 5 points higher than Barry's. d. Amy's observed score will be 10 points higher than Barry's.

C

Let's say that this class quiz is intended to assess two constructs that we'll call "knowledge of content validity" and "knowledge of internal structure validity." Let's say that the quiz includes an item about discriminant validity. The presence of that item represents: a. content validity b. construct underrepresentation c. construct-irrelevant content d. structural invalidity

C

Marie applies for graduate school in Clinical Psychology, and she is asked to take a personality test as part of the admissions process. One of the test's items states "I am persistent and a hard worker", and she has to respond "true" or "false" about herself. Marie knows that, while she's very smart, she's not usually a very hard worker and doesn't persist when things get tough. However, she also suspects that her chances of admission will increase if she responds "true" to the item, which she does. Her response reflects which type of response bias?: a. acquiescence bias b. extremity bias c. social desirability bias (impression management) d. social desirability bias (self-deception)

C

Micha wants to study moral tolerance, so he develops a new set of items to reflect that construct (to detect who is and is not morally tolerant). He sees moral tolerance as a psychological construct/dimension that is separate from moral relativism (e.g., relativists could either be tolerant or intolerant). He recruits a sample of respondents to answer his moral relativism items and his moral tolerance items. He conducts a factor analysis of the responses to all the items, expecting to find a two-factor structure. Micha is attempting to evaluate __________________ validity. a. content b. face c. internal structure d. conceptual

C

Psychologists have long discussed what "counts" as small, medium, and large correlations. What are Hemphill (2003)'s guidelines for this in psychology? a. small < .50, medium .50 to .80, large > .80 b. small < .30, medium .30 to .70, large > .70 c. small < .20, medium .20 to .30, large > .30 d. small < .05, medium .05 to .15, large > .15

C

Say we are examining a new personality scale to assess "feelings of loneliness." We conduct a factor analysis of the scale separately for males and females, and we find the same factor structure in the two groups. This would be evidence of: a. convergent validity b. discriminant validity c. a lack of construct bias d. the presence of construct bias

C

There are two key assumptions of Classical Test Theory. What is one of them? a. true scores are determined additively by Observed Score and Measurement Error b. measurement error is nonrandom/systematic c. measurement error is random/unsystematic d. the measure in question is a valid representation of the underlying latent variable

C

What is a "validity scale"? a. Sets of items that are embedded within a large inventory and are intended to be particularly good items that were written to reflect a measured construct with as much validity as possible. b. Sets of items that are embedded within a large inventory and are intended to "trick" respondents into responding without any response biases. c. Sets of items that are embedded within a large inventory and are intended to quantify the degree to which a respondent is manifesting specific response biases d. Sets of items that are embedded within a small inventory and are intended to qualify the degree to which a respondent is responding without any response biases.

C

What is the status of empirical evidence regarding the "validity of validity scales"? a. Despite the theoretical logic of such scales, most evidence indicates that they do not work well. b. The evidence overwhelmingly indicates that they work very well. c. Evidence regarding their effectiveness is mixed, though much research does indicate that they work fairly well. d. Despite the theoretical logic of such scales, most evidence indicates that they cannot be declared to work well or not.

C

What is the value of having test norms? a. The test norms reflect the one score that would be expected from a normative (i.e., typical) respondent. b. The test norms reflect the one score that would be expected from a normal (i.e., non-pathological) respondent. c. Any new test-taker can be compared to those test norms as a frame of reference for interpreting her/his score. d. Test norms allow researchers to use a test in their studies, as they allow researchers to compute correlations between the test's scores and other variables of interest.

C

When using an "adjusted true score estimate," how does an individual's estimated true score compare to their observed (unadjusted) score? a. The estimated true score will generally be lower than the individual's observed score. b. The estimated true score will generally be higher than the individual's observed score. c. The estimated true score will be less extreme (i.e., closer to the mean of observed scores) than the individual's observed score. d. The estimated true score will be more extreme (i.e., further away from the mean of observed scores) than the individual's observed score.

C
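
The "less extreme" answer can be illustrated with a minimal sketch. It assumes the common regression-based form of the adjusted true score estimate, estimate = mean + reliability × (observed − mean). The group mean of 100 and the reliability of .78 are hypothetical values chosen for illustration, not values from this card.

```python
def adjusted_true_score(observed, mean, reliability):
    """Pull an observed score toward the group mean in proportion to unreliability.

    Assumes the regression-based estimate: mean + reliability * (observed - mean).
    """
    return mean + reliability * (observed - mean)

# Hypothetical values: group mean 100, reliability .78
high = adjusted_true_score(observed=123, mean=100, reliability=0.78)
low = adjusted_true_score(observed=80, mean=100, reliability=0.78)

print(f"observed 123 -> estimated true score {high:.2f}")
print(f"observed 80  -> estimated true score {low:.2f}")
```

Both estimates land between the observed score and the mean: the high score is adjusted downward and the low score upward, which is why neither "generally lower" nor "generally higher" is correct.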

When writing good items for a psychological measure, test developers often avoid items that might be confusing for respondents (e.g., items with double negatives). In terms of response biases, why would we avoid such items? a. such items could prevent us from detecting response biases b. such items could prevent us from intervening (e.g., through statistical control) to deal with response biases that might have occurred c. such items could increase respondent frustration and lead to careless or unmotivated responding d. such items would not allow for test developers to employ multiple strategies to identify items that will probably suffer from bias

C

Which of the following is a definition of "criterion-referenced tests"? a. tests used to understand how a respondent compares with other respondents (e.g., who has higher or lower levels of the construct being measured) b. time-limited tests, with the goal of seeing how many questions a respondent can answer correctly in a specific amount of time c. tests with cutoff scores, used to sort people into groups (e.g., pass vs no pass) d. tests that are not time-limited, with the expectation that respondents will answer all questions

C

Which of these confidence intervals below represents the correct 95% CI bands, considering the following information. - Observed Score: 123 - Test Standard Deviation: 15 - Reliability: 0.78 a. 119 - 127 b. 120 - 126 c. 109 - 136 d. 107 - 138

C
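
The arithmetic behind this answer can be checked with a short sketch. It assumes the usual Classical Test Theory formula for the standard error of measurement, SEM = SD × sqrt(1 − reliability), and a 95% interval of ±1.96 SEM centered on the observed score. The function name is illustrative.

```python
import math

def confidence_interval(observed, sd, reliability, z=1.96):
    """Return (lower, upper) bounds around an observed score.

    Assumes SEM = sd * sqrt(1 - reliability) and a symmetric interval
    centered on the observed score (not on an adjusted true score estimate).
    """
    sem = sd * math.sqrt(1.0 - reliability)
    margin = z * sem
    return observed - margin, observed + margin

lower, upper = confidence_interval(observed=123, sd=15, reliability=0.78)
print(f"95% CI: {lower:.1f} to {upper:.1f}")
```

With these inputs the SEM is about 7.04 and the margin about 13.8, so the bounds fall near 109 and 137. Option c is the closest match among the choices.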

You conducted a study that examines the correlation between IQ and income, and you find a value of r = 0.75. At the end of the study, you find out all the IQ scores were scored 10 points too high. What will the value of r be after you rescore the IQ data, subtracting 10 from each person's score? a. r will be increased b. r will be decreased c. r will remain the same d. cannot be determined from the information given

C
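
A quick check of why r stays the same: subtracting a constant shifts every score and the mean by the same amount, so the deviations from the mean (and therefore the covariance and standard deviations) are unchanged. The sketch below uses made-up IQ and income values purely for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations around each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

# Made-up data: IQ scores that were all scored 10 points too high
iq_too_high = [105, 112, 98, 130, 121, 109]
income = [41, 52, 38, 75, 60, 58]

iq_corrected = [score - 10 for score in iq_too_high]

r_before = pearson_r(iq_too_high, income)
r_after = pearson_r(iq_corrected, income)
print(math.isclose(r_before, r_after))
```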

You have been asked to assess a patient for evidence of Alzheimer's Disease based on symptoms of cognitive impairment using the MoCA. The cut-off scores for the MoCA are: 11-18 for MCI and 18 or above for Alzheimer's Disease. The patient's true score is 19, but the observed score is 17. Based on the observed score, you determine that the patient does not have Alzheimer's Disease. You made a: a. Correct Decision b. Type I Error c. Type II Error

C

A group of respondents took a psychological test. In their distribution of test scores, the variance is 4.56. Which of the following is the most accurate statement about that value? a. It means that the people's scores on the test are likely connected to scores on another test b. It reflects a small amount of variability (people did not differ greatly from each other, in terms of their test scores) c. It reflects a large amount of variability (i.e., people do differ quite a lot from each other, in terms of their test scores) d. This value is possible, but it is difficult to know whether it reflects a large or small amount of variability.

D

Consider a very brief three-item measure of loneliness:
I spend more time by myself than I'd like. Disagree ____ Agree ____
I often feel lonely. Disagree ____ Agree ____
I wish that I had more connection to people. Disagree ____ Agree ____
For this measure, a respondent gets 1 "point" for each item with which she/he agrees. The respondent's loneliness score is then computed simply by summing these points. From the perspective of acquiescence bias, a concern about this measure is: a. A respondent could get a relatively high score on loneliness by selecting the more intense or extreme options (regardless of whether she/he truly feels lonely). b. A respondent's score might be a poor reflection of her/his true loneliness because the test blends easy and hard, potentially confusing the respondent's "agreement" choices. c. An "obvious" scale like this would be easy to manipulate, in terms of appearing lonely (or not) if one wanted to. d. A respondent could get a relatively high score on loneliness just by wanting to agree with items of all types (regardless of whether she/he truly feels lonely).

D

Imagine that you are measuring the following characteristics. For which one would the test-retest approach probably not be a good method for estimating reliability? a. general intelligence b. extraversion (as a personality trait) c. height d. happiness (as a mood state)

D

Imagine that you develop your own psychological test, and you want to write about validity when describing your analysis of the test. Which of the following statements is phrased in a way that is most consistent with the contemporary definition of validity? a. "My test is valid." b. "My test is moderately valid." c. "My test is valid as a measure of construct X." d. "My test is moderately valid as a measure of construct X."

D

In the parlance of the MTMM matrix, which of the following is the correct label for the correlation between self-esteem scores from a self-report measure of self-esteem, and extraversion scores from an "informant report" method of measurement? a. monotrait-monomethod correlation b. monotrait-heteromethod correlation c. heterotrait-monomethod correlation d. heterotrait-heteromethod correlation

D

Psychologists have developed many strategies to help deal with response biases in general. These strategies are intended to accomplish several specific goals. Which of the following is NOT one of those goals? a. prevent or minimize the existence of bias b. minimize the effects of bias c. detect bias and intervene d. eliminate the source of bias

D

Say, in a sample of students, there is a zero correlation between height and self-esteem. This means that _________. a. taller people tend to have higher self-esteem than do shorter people b. shorter people tend to have higher self-esteem than do taller people c. none of the students have high self-esteem d. a person with high self-esteem is equally likely to be tall or short

D

What is a zero point called when it indicates the complete absence of the attribute being measured, as with reaction time? a. absent zero b. attribute zero c. arbitrary zero d. absolute zero

D

What problem is the QCV procedure intended to avoid? a. Self-report questionnaires are (believed to be) prone to error from respondents' response biases. b. The magnitude of a validity correlation is affected by both: a) the actual association between constructs, and b) similarity or difference in the methods by which those constructs are measured. c. The correlation between a test and an important criterion might vary from one sample or context to another. d. If a test developer unsystematically eyeballs a set of convergent and discriminant validity correlations, then she or he might interpret the results in subjective and inaccurate ways.

D

Which Pearson correlation coefficient realistically shows the strongest relationship between two variables? a. 1.20 b. .00 c. .75 d. -.80

D

Which of the following could potentially affect responses to tests of knowledge, ability, achievement, or aptitude (i.e., tests in which there are presumably right or wrong answers)? a. acquiescence bias b. extremity bias c. social desirability bias (impression management) d. guessing

D

Which of the following values reflects "consistency" among the parts of a test? a. variance of the total test scores b. mean of the total test scores c. number of items d. sum of all the inter-item covariances

D
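
To see why the inter-item covariances carry the "consistency" information, the sketch below uses made-up responses (three items, five respondents) and the standard raw coefficient alpha formula. It relies on the identity that total-score variance equals the sum of the item variances plus twice the sum of the pairwise covariances (each pair is counted once here, hence the factor of 2).

```python
from itertools import combinations

def variance(xs):
    """Population variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def covariance(xs, ys):
    """Population covariance."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Made-up data: each inner list is one item, each position is one respondent
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]

totals = [sum(responses) for responses in zip(*items)]
sum_item_var = sum(variance(item) for item in items)
sum_cov = sum(covariance(a, b) for a, b in combinations(items, 2))
total_var = variance(totals)

# Raw coefficient alpha: larger covariances make total_var large relative
# to sum_item_var, which pushes alpha upward.
k = len(items)
alpha = (k / (k - 1)) * (1 - sum_item_var / total_var)

print(f"sum of item variances: {sum_item_var:.2f}")
print(f"sum of pairwise covariances: {sum_cov:.2f}")
print(f"total score variance: {total_var:.2f}")
print(f"alpha: {alpha:.3f}")
```

If the items were unrelated, the covariances would sum to roughly zero, the total variance would be close to the sum of the item variances, and alpha would fall toward zero.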

Which scale allows additive and multiplicative interpretations, such as saying 80 miles is "twice as far" as 40 miles? a. nominal b. ordinal c. interval d. ratio

D

