Psychological Testing Midterm

Representativeness Heuristics

"...judging the relationship between variables on the basis of similarity alone disregards other potentially significant factors..." Clinicians base their diagnoses on the degree to which an individual is thought to resemble those making up a diagnostic category. Stereotypes: how similar is person X to the typical person in diagnostic category Y? Prototypes: how similar is person X to the person showing all characteristics associated with diagnosis Y? Exemplars: how similar is person X to those that the clinician has seen in their personal work?

Hindsight Bias

"Ad hoc fallacy," or the tendency to explain an event after it occurs, unaware that a biased prediction has occurred. The "knew it all along" effect. Fosters an unrealistic sense of confidence.

Scaling

"the process by which a measuring device is designed and calibrated, and the way numbers - scale values - are assigned to different amounts of the trait, attribute, or characteristic being measured"

Test

A measuring device or procedure in which a sample of behavior is obtained, evaluated, and scored.

Test Retest Reliability

A reliable test should yield similar scores over time. Used to gauge the consistency of scores over time for the same person (via correlation). The test is given to the same group on two occasions, and scores on the first and second administrations are correlated.
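
As a rough illustration of "scores on the first and second exam correlated," the sketch below computes a Pearson correlation between two administrations. The scores and the helper name `pearson_r` are hypothetical, made up for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five people on two occasions.
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 12, 14, 19, 21]

# A value near 1 suggests scores are stable over time.
test_retest_r = pearson_r(time_1, time_2)
```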

Psychological Test Definition

A standardized procedure for sampling behavior and describing it with categories or scores. A measurement tool or technique that requires a person to perform one or more behaviors in order to make inferences about human attributes, traits, or characteristics or predict future outcomes. By inference, we mean using evidence to reach a conclusion.

Non random/systematic measurement error

A test consistently measures something other than the target. Occurs when a source of error always increases or always decreases a true score. Does not lower the reliability of a test, since the test is reliably inaccurate by the same amount each time.
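
A small sketch of why a constant error leaves consistency intact: every score shifts by the same amount, so the error never varies. The true scores and the 5-point bias are hypothetical values chosen for the example.

```python
from statistics import mean

# Hypothetical true scores and a measure that always reads 5 points too high.
true_scores = [50, 55, 60, 65, 70]
BIAS = 5
observed = [t + BIAS for t in true_scores]

errors = [o - t for o, t in zip(observed, true_scores)]
error_spread = max(errors) - min(errors)         # no variation in the error
mean_shift = mean(observed) - mean(true_scores)  # every score is off by BIAS
```

The error has no spread (the test is consistent), yet every score is wrong by the same amount (the test is inaccurate).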

Indirect Behavioral Assessment

E.g., ADHD assessments include rating scales designed to be completed by parents or teachers in more than one setting.

Affective Assessment

Affective assessment covers all noncognitive features of an individual, including temperament, clinical disposition, personality, attitudes, values, and interests. Structured inventories are used for diagnostic purposes, hypothesis testing, treatment planning, and progress evaluation (e.g., Minnesota Multiphasic Personality Inventory-2 (MMPI-2), Strong Interest Inventory). Unstructured assessment involves the use of projective techniques and qualitative methods. Based on psychoanalytic theory, these present the client with unstructured, ambiguous stimuli, allowing the client to "project" thoughts and feelings onto the stimulus. Inkblots, pictures, incomplete sentences... yield insights into a client's motivation, personality, values, etc.

Late 19th century: Intelligence Tests

Alfred Binet and the Binet-Simon Scale; Lewis Terman and the Stanford-Binet Intelligence Scales; David Wechsler and the Wechsler-Bellevue Intelligence Scale and the Wechsler Adult Intelligence Scale

Psychological Test Differences

Behavior performed; construct measured and outcome predicted; content; administration and format; scoring and interpretation; psychometric quality

Bootstrap approach

Combination of the rational and empirical approaches; a sequential method. First write items based on theory, then validate the items by administering them to samples and statistically analyzing the findings.

Clinician Bias in relation to patient characteristics

Cultural identities of the patient influence the diagnostic, therapeutic, and prognostic decisions made by clinical psychologists. E.g., more organic disorder diagnoses as patient age increases. E.g., more borderline personality disorder diagnoses (i.e., emotional dysregulation, fear of abandonment, rejection hypersensitivity...) among female patients. E.g., increasing age combined with poor health leads to less optimistic psychologist predictions regarding treatment and prognosis ("ageism" and "healthism" in everyday clinical practice).

Limitations of Psychological Tests

Decisions about people's lives should not be made on the basis of a single high-stakes test score. Tests are biased and unfair to minorities and women. Tests create anxiety and stress. Tests label and categorize. Test developers dictate what students must know or learn. "Teaching to the test" inflates scores. Multiple-choice questions punish creative thinkers and trivialize the complexities of the learning process.

Validity

Does your test actually measure what it is designed to measure? Truthfulness

Reliability

Does your test yield consistent results? Consistency

Measurement Error and Reliability

Error reduces the reliability or repeatability of psychological test results. A crucial assumption of classical theory is that unsystematic measurement errors act as random influences. Main features of classical theory: (1) measurement errors are random; (2) mean error of measurement = 0; (3) true scores and errors are uncorrelated: rTE = 0; (4) errors on different tests are uncorrelated: r12 = 0.

Correlation Coefficient [but comparing the same person]

Expresses the degree of linear relationship between two sets of scores obtained from the same person. Rxx = true score variance (T) / total variance of test scores (O), where O = T + E. If measurement error is very small (close to zero), Rxx approaches 1. If measurement error is very large, Rxx approaches 0.
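
The ratio of true score variance to total variance can be sketched as a one-line function. This assumes true scores and errors are uncorrelated, so that total variance is the sum of the two components; the function name and the variance values are hypothetical.

```python
def reliability(true_var, error_var):
    """Reliability as true score variance over total observed variance.

    Assumes true scores and errors are uncorrelated, so the total
    variance equals true_var + error_var.
    """
    return true_var / (true_var + error_var)

# Hypothetical variance components.
typical = reliability(80, 20)           # 0.8
tiny_error = reliability(80, 0.001)     # very close to 1
huge_error = reliability(80, 100_000)   # very close to 0
```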

Types of Validity

Face validity, content-related validity, construct validity, criterion-related validity

Possible Sources of Measurement error

Group 1: the test itself; Group 2: test administration; Group 3: test scoring; Group 4: test takers

Types of Tests

Group vs. individual; intelligence tests; aptitude tests; achievement tests; creativity tests; personality tests; interest inventories; behavioral procedures; neuropsychological tests

Psychological Test Similarities

Limited sample of behavior (an observable and measurable action). Standardized procedure. Behavior is used to make inferences about some psychological construct (an underlying, unobservable personal attribute, trait, or characteristic of an individual that is thought to be important in describing or understanding human behavior).

central tendency

Mean, median, mode
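
A quick sketch of the three measures using Python's standard `statistics` module; the score list is hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical exam scores.
scores = [70, 85, 85, 90, 100]

average = mean(scores)       # arithmetic average: 86
middle = median(scores)      # middle value when sorted: 85
most_common = mode(scores)   # most frequent value: 85
```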

Neuropsychological Tests

Measure cognitive, sensory, perceptual and motor performance to determine the extent, locus, and behavioral consequences of brain damage

Response Bias from Examinees

Motivation: fake good or fake bad (malingering). Addressed through integration of well-developed validity scales, review of both self- and other-report data, use of objective mental-status examination findings...

Creativity Tests

Novel, original thinking

Confidence Interval Calculations

O = T + E (observed score = true score + error). 95% CI = O +/- 2(SEM); 99.7% CI = O +/- 3(SEM)
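
A minimal sketch of the two intervals, assuming the band is centered on the person's obtained score and that the SEM is already known. The function name, the score of 100, and the SEM of 3 are hypothetical.

```python
def confidence_interval(score, sem, k=2):
    """Return (lower, upper) bounds: score +/- k standard errors of measurement."""
    return (score - k * sem, score + k * sem)

# Hypothetical obtained score of 100 with an SEM of 3.
ci_95 = confidence_interval(100, 3)        # (94, 106)
ci_997 = confidence_interval(100, 3, k=3)  # (91, 109)
```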

Behavioral Procedures

Objectively describe and count the frequency of a behavior. Identify the antecedents and consequences of the behavior.

Availability Heuristic

Pertains to the situation where the information used for prediction or decision making is that which is most easily accessed or recalled. Illusory correlation: forming test sign-symptom correlations without empirical evidence; correlations are based on clinicians' personal associations and projections rather than on data. Recall/memory availability and vividness can limit judgment accuracy. Limited by the clinician's memory capacity (e.g., only one piece of information is remembered).

Interest Inventories

Preference for certain activities or topics. Occupational/career interest.

Psychological Test assumptions

Psychological tests measure what they purport to measure or predict what they are intended to predict. An individual's behavior, and therefore test scores, will typically remain stable over time. Individuals understand test items the same way. Individuals will report accurately about themselves. Individuals will report honestly about their thoughts and feelings. The test score an individual receives is equal to his or her true score plus some error.

Measures of variability

Range, standard deviation
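
Both measures can be sketched in a few lines. The scores are hypothetical, and the population form of the standard deviation (`pstdev`) is used here as an assumption; a sample estimate would use `statistics.stdev` instead.

```python
from statistics import pstdev

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)  # highest minus lowest: 7
std_dev = pstdev(scores)                 # population standard deviation: 2.0
```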

Misrepresentation of change

Regression to the mean: extreme observations, scores, or performances on one occasion will likely be followed by less extreme results on future occasions. E.g., "His depression scale score is lower than the one he scored two months ago. He is definitely better! The treatment definitely worked!"
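
The effect can be illustrated with a small simulation in which nothing changes between two testings except random error. All numbers here (sample size, means, standard deviations, seed) are hypothetical choices for the sketch.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Each simulated person has a stable true score; each testing adds fresh random error.
true_scores = [rng.gauss(50, 10) for _ in range(2000)]
first = [t + rng.gauss(0, 10) for t in true_scores]
second = [t + rng.gauss(0, 10) for t in true_scores]

# Select the 200 most extreme (highest) scorers on the first occasion.
cutoff = sorted(first)[-200]
extreme = [i for i, s in enumerate(first) if s >= cutoff]

first_mean = mean(first[i] for i in extreme)
second_mean = mean(second[i] for i in extreme)
# No treatment was applied, yet second_mean falls back toward the overall mean of 50.
```

A drop in an extreme score on retest is therefore not, by itself, evidence that a treatment worked.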

Split Half Reliability

Reliability gauged by splitting a test into two parts and comparing an individual's scores on both halves. If a test is split into two (odd vs. even questions), the two halves should yield similar scores for a given individual.
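
A sketch of the odd vs. even split: each person's items are summed within each half, and the two half scores are correlated across people. The response matrix and the helper `pearson_r` are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: rows are people, columns are six items scored 0 or 1.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5
even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6
split_half_r = pearson_r(odd_half, even_half)
```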

Group One Test Itself Error Factors

Representation of the items, wording of the items, culturally biased items, linguistically biased items, double-barreled items

early 1900s: personality tests

Robert Woodworth and the Personal Data Sheet; Hermann Rorschach and the Rorschach Inkblot Test; Henry Murray and C. D. Morgan and the Thematic Apperception Test

Standard Error of measurement

SEM = SD * square root of (1 - r), where r is the reliability of the test
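
The formula translates directly into code. The standard deviation of 15 and reliability of 0.91 below are hypothetical inputs, and the function name is made up for the sketch.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)

# Hypothetical test with SD = 15 and reliability = 0.91.
sem = standard_error_of_measurement(15, 0.91)  # approximately 4.5
```

Higher reliability shrinks the SEM: with perfect reliability (1.0) the SEM is 0.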

Empirical approach to test construction

Administer a large pool of items to large samples, then identify the items that relate to the construct the test is attempting to measure.

Content Validity

Shows evidence that the test items adequately reflect the test domain (via literature review, consulting subject matter experts). Ensures that you comprehensively cover/capture the test domain of interest.

Criterion Related Validity

Shows that a test is able to predict the behavior that it is designed to predict (the criterion/outcome). Demonstrates that your test scores actually lead to predicted behavioral outcomes either now (concurrent) or in the future (predictive).

Assessment

Systematic procedures for making inferences about characteristics of people. Broader and more comprehensive than testing.

Adjustment and Anchoring

Tendency for final judgments to be biased in the direction of initially reviewed data. Judgment is overly influenced by the first page of the clinical material reviewed. Reviewers may reach different opinions regarding the same evidence if that evidence is reviewed in different sequences.

Test vs. Assessment

Test Assessment

Construct Validity

The extent to which the variables being studied represent the constructs they are purported to measure. (Are we comparing apples to oranges?) Shows evidence that the test items adequately capture the concept (or construct) that the test is designed to capture. Demonstrated by showing that your test correlates positively with similar tests (convergent) and weakly or not at all with tests of dissimilar constructs (discriminant).

Confirmation Bias

The tendency to selectively attend to information that is in line with one's viewpoint while minimizing or disregarding data that may disconfirm this position. e.g., "selective" data review

Rational Approach to test construction

Theoretical approach

Personality Tests

Traits, qualities or behaviors that determine a person's individuality//Checklists, inventories, and projective techniques

Observed Exam Score =

True Exam Score + error

early mid 1900s vocational tests

U.S. Employment Service and the General Aptitude Test Battery

Inter Rater Reliability

Used to gauge the consistency of ratings across multiple raters (e.g., figure skating judges, diving judges). For a scoring method to be reliable, independent ratings by multiple judges should be highly similar.

Parallel Forms Reliability

Used to gauge the equivalence of two or more different versions of an assessment that measure the same concept. Different versions of the same test should yield highly similar scores for a given individual. The test developer creates two forms of the test; equivalence is assessed by correlating scores on the two parallel forms.

Internal Consistency Reliability

Used to see if responses to a set of similar items are uniform (or consistent). Measured using coefficient alpha, a way to compare individuals' scores on all possible ways of splitting the test in halves (instead of just one random split). Logic: a reliable test should contain only those questions that measure the same concept. A heterogeneous test (or its homogeneous subsets) is split in half, and scores on the first half are correlated with scores on the second half. Assesses how related items or groups of items are to one another.
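
Coefficient alpha is usually computed from item variances rather than by literally enumerating every split. The sketch below uses the standard formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), which is not spelled out on this card; the response data and function name are hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """Coefficient alpha for a people-by-items matrix of scores."""
    k = len(responses[0])  # number of items
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings: rows are people, columns are three similar items.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [5, 4, 4],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)  # items that move together push alpha toward 1
```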

Psychometrician

a specialist in psychology or education who develops and evaluates psychological tests.

Intelligence Tests

ability in global areas

Ratio Variables

an interval scale, but with a true zero point. E.g., length, number of children, income, temperature in kelvins (Celsius and Fahrenheit have no true zero and are interval scales)...

Group Tests

are designed for administration to groups of participants simultaneously. Advantage: speed and efficiency. Limited in the type of test formats available (e.g., paper-and-pencil, computer-based). A major drawback is the inability to observe all examinees and control relevant individual factors (e.g., client motivation, mood).

Individual Tests

are often used for diagnostic decision making and generally require some interaction between the examiner and examinee. Allow the two to establish rapport and reduce anxiety. The administrator often requires special training. Can provide information about the client's presentation, affect, attitudes, verbal and nonverbal behaviors, etc.

Speed and Power Tests

bg

Aptitude Tests

capability in a specific task

Diagnostic Overshadowing Bias

A client's problem receives inadequate treatment because attention is diverted to a more salient characteristic. E.g., with a gay or lesbian patient, a clinician might perceive the presenting problem as related to conflicts over sexual orientation and fail to address other critical issues. E.g., individuals with AIDS were less likely to be referred to treatment for emotional symptoms (depression) than patients with other medical problems.

Heuristics

Decisional simplification strategies (mental shortcuts): representativeness, availability, adjustment and anchoring

Achievement Tests

degree of learning, success, or accomplishment in a subject or task

Behavioral Observations

Direct, indirect

Nonstandardized tests

do not have specific, replicable conditions for administration, timing, and scoring (unlike standardized tests).

Random/Unsystematic Measurement Error (reduce reliability)

Effects are unpredictable and inconsistent; random in nature. Will increase and decrease a person's score by exactly the same amount with infinite testing, so it cancels itself out. Lowers the reliability of a test.
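
The "cancels itself out" idea can be sketched by simulating many testings of one person. The true score, error size, number of testings, and seed are all hypothetical.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

TRUE_SCORE = 80
# Each testing adds a fresh random error with mean 0.
observed = [TRUE_SCORE + rng.gauss(0, 5) for _ in range(10_000)]
errors = [o - TRUE_SCORE for o in observed]

mean_error = mean(errors)                 # close to 0 over many testings
error_range = max(errors) - min(errors)   # yet any single score can be well off
```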

Standardized vs nonstandardized tests

f

Group vs. Individual Tests

ff

Objective vs subjective scored tests

ff

Cognitive vs affective tests

gg

Norm vs criterion referenced tests

gr

Standardized Tests

have specific conditions for administration, timing, and scoring. This ensures that no matter who the examiner or examinee is, the test will be administered under strict, replicable conditions. Allow comparability of scores and interpretations across time/situation. Conform to rigorous test construction guidelines.

Speed Test

measures how many simple items a person can complete within a certain amount of time; the score is simply the number of (correct) items completed within the time limit. E.g., the Coding subtest of the WAIS-IV.

Objective Tests

leave no doubt as to the correctness of a given answer: correct answers are predetermined and require no judgment on the part of the examiner. E.g., multiple-choice and true-false items. Help control subjective bias in scoring (interscorer reliability).

Test Administration Error

level of noise, room temperature, lighting, inconsistent way of giving instructions and/or answering questions

Cognitive Tests

Assess memory, perceptual, processing, and reasoning capacities. Intelligence tests: measure a person's ability to learn, solve problems, and understand increasingly complex or abstract information (e.g., Wechsler Adult Intelligence Scale - Fourth Edition (WAIS-IV)). Aptitude tests: predict a person's capacity to perform some skill or task in the future, such as college (e.g., the SAT, which actually can only predict the freshman year of college). Achievement tests: measure knowledge students have acquired through instruction or training up to a certain point in their academic career.

Test Takers Error

motivation, anxiety, attention, and fatigue level

Norm Referenced Tests

Often standardized tests; administered to a representative sample of participants (called a standardization sample) to determine average performances for various subgroups of interest (called a norm group). A client's score can then be compared to the average of the standardization sample (i.e., Average, Above Average, Below Average). Commonly used to assess intelligence, achievement, personality, cognitive functioning... The raw score is transformed into some type of standard score or percentile rank. Example: SAT.
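
Transforming a raw score into a standard score or percentile rank might look like the sketch below. The norm sample is hypothetical and far smaller than a real standardization sample, and percentile rank is defined here as the percentage of the norm sample scoring below the raw score, which is one of several conventions.

```python
from statistics import mean, pstdev

# Hypothetical (tiny) standardization sample.
norm_sample = [85, 90, 95, 100, 100, 105, 110, 115]

def z_score(raw, sample):
    """Standard score: distance from the norm-group mean in SD units."""
    return (raw - mean(sample)) / pstdev(sample)

def percentile_rank(raw, sample):
    """Percentage of the norm sample scoring below the raw score (assumed convention)."""
    return 100 * sum(s < raw for s in sample) / len(sample)

client_z = z_score(110, norm_sample)           # a little over 1 SD above the mean
client_pr = percentile_rank(110, norm_sample)  # 75.0
```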

Nominal Variables

qualitative system; if numbers are used they are arbitrary. E.g., sex, ethnicity, college major, political affiliation...

Ordinal Variables

ranking according to characteristics, with intervals not necessarily being consistent between ranks. E.g., tallest to shortest, Olympic medalists...

Interval Variables

ranking with equal space between units but with no true zero. Many exams/psych tests (at least aspire to) produce interval-level scores...

Subjective Tests

require the examiner to make a judgment on the quality of the response in scoring an item. E.g., essay and open-ended questions. Can elicit important and rich client information.

Test Scoring Error

scorers' levels of skills and qualifications, criteria of scoring, subjectivity

Criterion Referenced Tests

compare a person's score to a predetermined standard or level of performance - a criterion. The result is either "passed" or "failed," based on a cutoff score. For example, a total score of 20 or above on a depression screening test indicates further evaluation; DSM-5 diagnostic checklists (3 or more of the six listed symptom criteria make a clinical diagnosis).
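
A cutoff-score decision reduces to a single comparison against the criterion, with no reference to how other people scored. The cutoff of 20 comes from the screening example on this card; the function name and the sample scores are hypothetical.

```python
CUTOFF = 20  # total score at or above which further evaluation is indicated

def needs_further_evaluation(total_score, cutoff=CUTOFF):
    """Criterion-referenced decision: compare the score to a fixed standard."""
    return total_score >= cutoff

# Hypothetical screening totals.
results = {score: needs_further_evaluation(score) for score in (12, 19, 20, 27)}
```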

Face Validity

the degree to which a procedure, especially a psychological test or assessment, appears effective in terms of its stated aims.

Direct Behavioral Assessment

the examiner is physically present in the same environment as the client and uses a data collection procedure to assess the frequency, duration, and/or magnitude of one or more target behaviors. E.g., a school counselor may observe a 2nd-grade student referred for overactivity in the classroom using a time-on-task observation system. Has a natural control group.

Power Test

the score is an indicator of the skills or abilities possessed by the examinee, without the pressure of time limits. Items vary in difficulty; examinees eventually miss or cannot complete many items in a row (reaching the ceiling level), at which point the administration ceases. For example, Matrix Reasoning of the WAIS-IV.

Past Behavior Heuristic

use of previous behavior to predict future behavioral outcomes. E.g., Patient A's past substance use experience leads to the conclusion "Patient A is an addict today!"

