Psych 440 Midterm 1

Culture

"The socially transmitted behavior patterns, beliefs, and products of work of a particular population, community, or group of people" History suggests cultural bias in testing can have an adverse impact: - immigration restrictions - forced sterilization

Psychometrics

'Measuring the mind'. The fundamental goal of psychological measurement is to *predict behavior*.

Origins of testing

- Chinese civil service exams initiated in the Chang Dynasty, over 3000 years ago - 1859: Darwin's On the Origin of Species raised the issue of individual differences + provided a theoretical basis for animal models in medical and psychological testing - Wilhelm Wundt was a German medical doctor who studied how individuals were similar rather than different (Leipzig School) + described human abilities with respect to reaction time, perception, and attention span

Professional ethical standards

- APA 1895 - formed committee on mental measurement - APA 1954 - published Technical Recommendations for Psychological Tests and Diagnostic Techniques - Collaboration between APA and other organizations (AERA) has led to publication of sound practices in the field of testing and assessment

Age and grade norms

- Average performance of test-takers at various age/grades - Scores do not present equal units of measurement - Scores often used as evaluative standards - Not effective with very young or adult test-takers

Culture and testing

- Many early tests had NO minority individuals in standardization samples - Items culturally grounded in the dominant American culture: + "Who was the first person to discover America?" - Translation problems: no corresponding object/word, changes in meaning

Comparison of NRT and CRT

- NRT= covers a *broad* content domain - CRT= focuses on a more *specific* content domain - NRT= emphasizes *discrimination* among individuals - CRT= emphasizes *description* of what individuals can and cannot do - NRT= interpretation requires a clearly defined comparison group - CRT= interpretation requires a clearly defined standard or criterion level of performance - NRT= scores are usually reported in terms of standard scores or percentile ranks - CRT= scores are usually reported in absolute

Nominal scale

- Nominal (or naming) level - Lowest level of measurement - Ordering is not important, only the label attached to designate a mutually exclusive and exhaustive category Examples: - medical diagnoses, gender, political party affiliation

Error and prediction

- Standard Error of the Estimate (SE) - Indicates magnitude of errors in estimation - Higher correlations produce smaller SE - Lower correlations produce larger SE
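
The inverse relation between the correlation and the SE can be sketched as follows, assuming the common textbook formula SE = SD of Y × √(1 − r²). The function name and the criterion SD of 10 are illustrative assumptions, not values from the course.

```python
import math

def standard_error_of_estimate(sd_y: float, r: float) -> float:
    """Textbook approximation: SD of the predicted variable times sqrt(1 - r^2)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    return sd_y * math.sqrt(1.0 - r ** 2)

# Hypothetical criterion SD of 10 points; SE shrinks as r grows.
for r in (0.2, 0.5, 0.9):
    print(r, round(standard_error_of_estimate(10.0, r), 2))
```

With r = 0 the SE equals the SD of Y, meaning the predictor does no better than guessing the mean.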

Who's involved in assessment?

- Test developers - Test users - Test taker - Society at large

Techniques of psychological assessment

- Tests - Interviews - Case history data - Behavioral observation - Role-playing - Computer-based instruments - Other techniques

Testing in the U.S.

- U.S. military developed Army Alpha & Beta during WWI (Yerkes & Brigham) + used to identify intellectual abilities of recruits and personality risk factors for "shell shock" - In 1939 the Wechsler-Bellevue Intelligence Scale (now WAIS) developed for adults + later other versions were developed for use with preschool (WPPSI) and school age children (WISC)

Rights a test-taker has

-Informed consent -Informed of test findings -Privacy (confidentiality) -Least stigmatizing label

Assessment assumptions

1) Psychological states or traits exist, and can be quantified and measured. 2) Different approaches to measuring aspects of the same thing can be useful. 3) Various sources of error are part of the assessment process. 4) Test-related behavior can predict behavior in other settings. 5) Present-day behaviors can predict future behaviors.

Z-scores

A standard score where the mean of the scores is set at zero (0) and standard deviations are set at intervals of one (1).

Normative sample

A group of people whose performance on a particular test is analyzed for reference in evaluating the performance of individual test-takers. The sample must be representative or typical of the intended population of interest. Inadequate norms make it difficult to make proper interpretations.

Interpreting percentiles

A percentile difference of 10 near the middle of the group often represents a smaller difference in performance than a difference of 10 near the tails. In terms of skills, a difference of a few percentile points near the tails means more change has taken place than the same size difference near the middle of the group.

Standard scores

A raw score that has been converted from one scale to a new (standardized) scale with a prescribed mean and SD. Typically expressed in terms of number of SDs from the mean - all standard scores have equal unit sizes across the distribution

Percentiles

A raw score that has been converted into the percentage of a distribution that falls below that particular raw score. - widely used in test manuals as well as other literature on commercially published standardized tests

Convenience samples

A sample that is convenient or available for use.

Scale

A set of numbers whose properties model empirical properties of the variables to which the numbers are assigned. + discrete + continuous

Correlation

A statistical technique which allows us to make inferences about how two (or more) variables are related (co-relate) to each other (linearly). Expressed using a correlation coefficient: - statement about the direction of a relation - statement about the strength of the relation
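
A minimal sketch of how a Pearson correlation coefficient could be computed by hand. The function name and the two small data sets are made up for illustration and do not come from the course.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: sum of cross-products over the root of the product of sums of squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores: the sign gives the direction, the size gives the strength.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))   # perfectly positive relation
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))   # perfectly negative relation
```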

The normal distribution

A symmetrical, mathematically defined frequency distribution curve. - highest at the center (most frequent scores are at the mean) and tapering on both sides - asymptotic towards the abscissa + the curve never touches the x-axis, so percentile ranks never reach 0 or 100 and must fall between 1 and 99 - mean, median, and mode are equal - area under the curve is divided in terms of SD units and can aid in the interpretation of test scores

Coefficient of determination

Accurate interpretation of correlation coefficients requires another statistic, the coefficient of determination. - calculated by squaring r (written r²) The coefficient of determination tells how much variance in one variable is accounted for by the variance in the other.
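
As a quick illustration of why the squared value matters, the sketch below squares a few correlation values. The function name and the chosen r values are illustrative assumptions.

```python
def coefficient_of_determination(r: float) -> float:
    """Proportion of variance in one variable accounted for by the other."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    return r ** 2

# The sign of r drops out: r = .7 and r = -.7 share the same proportion of variance.
for r in (0.4, 0.7, -0.7):
    print(f"r = {r:+.1f} -> shared variance = {coefficient_of_determination(r):.2f}")
```

Note that a correlation of .7 accounts for only about half of the variance, which is why r alone can overstate how well one variable predicts another.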

Percentile pros and cons

Advantages: - can be used to interpret performance in terms of various groups and are easily understood Disadvantages: - percentiles are an ordinal scale - differences between individuals near the middle are magnified and differences at the extremes are compressed

Z-scores: pros and cons

Advantages: - indicates each person's standing as compared to the group mean - can easily be converted to percentiles Disadvantages: - negative z values can be difficult to work with and explain - dealing with fractional z values can be a hassle

Purposive samples

Arbitrarily selecting a sample because it is believed to represent some population.

Interval scale statistics

Because of equal intervals between values, some mathematical operations are meaningful: - addition and subtraction - multiplication and division are not appropriate because there is no true zero - statistical tests based on mean scores and/or variance

For every one-unit increase in the IV, the predicted change in the DV is given by...

Beta

Classification

Define when objects fall into the same or different categories with regards to an attribute. Examples: • Types of objects, college majors, sex, personality types

Discrete scale

Categorical labels or integers, no meaningful middle grounds between categories.

Variable

Characteristics or attributes of objects (people, places, things, animals, etc.) in a population *that are not constant*.

Alfred Binet

Commissioned by France's education system to help identify 'subnormal' children. - *developed first intelligence test* in 1905 with Theodore Simon - mental age proposed as criterion for evaluation - test revised by Lewis Terman at Stanford, current revisions still widely used

What is important to remember about correlation and causation?

Correlation ≠ causation!!

Test users

Counselors, other therapists, teachers, human resources, researchers.

Asymptotic

Curve in normal distribution doesn't touch x-axis.

Scales and descriptive statistics

Data must be measured on an interval or a ratio scale for the computation of means and other parametric statistics to be valid. Therefore, if data are measured on an ordinal scale, the median but not the mean can serve as a measure of central tendency.

Kurtosis

Describes the steepness of a distribution in its center. *Platykurtic*: flat *Leptokurtic*: peaked *Mesokurtic*: somewhere in between

Error

Deviation of some measurement from the true standing of an individual on some characteristic. Many sources of error: - effects of the environment - precision of the measurement device - confounding variables Error influences estimates of both central tendency and variability.

Psychological assessment assumption #2

Different approaches to measuring aspects of the same thing can be useful. - because we are making inferences it is better to have convergent evidence

Skewness

Distributions can be characterized by the extent to which they are asymmetrical or "skewed". *Positive skew*: only a few extremely high scores and many low scores *Negative skew*: only a few extremely low scores and many high scores

Quartiles

Dividing points between the four quarters of a distribution of test scores. *Interquartile range* is equal to the difference between Q3 and Q1 + the relative distance of Q1 and Q3 from the median (Q2) gives an indication of skewness of the distribution *Semi-interquartile range* equals the interquartile range divided by 2.
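
A minimal sketch of the quartile calculations using Python's standard library. The scores are borrowed from the first Range example in this set. Quartile conventions differ between textbooks and software, so the cut points here (the library's default "exclusive" method) are one reasonable choice rather than the only answer.

```python
import statistics

scores = [2, 5, 7, 7, 8, 8, 10, 12, 15, 17, 20]

# quantiles(n=4) returns the three cut points Q1, Q2 (the median), Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1            # interquartile range
semi_iqr = iqr / 2       # semi-interquartile range

print(q1, q2, q3)        # 7.0 8.0 15.0
print(iqr, semi_iqr)     # 8.0 4.0

# Q3 lies much farther from the median than Q1 does, hinting at positive skew.
print(q3 - q2, q2 - q1)  # 7.0 1.0
```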

Validity

Does the test measure effectively what it purports to measure?

Reliability

Does the test produce consistent measurement results?

Random samples

Each individual from the population has an equal chance of being included in the sample.

James McKeen Cattell

First American to systematically study assessment of *individual differences*. - a student of Wundt, but more influenced by Galton's methods - studied differences in reaction time - *coined the term 'mental test'* - named his daughter "Psyche"

Psyche

Greek word for 'the mind'.

Nominal scale statistics

If numbers are assigned, they cannot be meaningfully manipulated mathematically. Appropriate arithmetic operations: - counting - proportions - percentages - chi-square tests

Z-scores and the normal distribution

If we have a normal distribution, we can make the following assumptions: - approx 68% of the scores are between a z-score of 1 and -1 - approx 95% of the scores will be between a z-score of 2 and -2 - approx 99.7% of the scores will be between a z-score of 3 and -3
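
These percentages can be checked against the standard normal curve. The sketch below uses `statistics.NormalDist` from the standard library; the helper name `proportion_within` is an illustrative assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k: float) -> float:
    """Area under the standard normal curve between z = -k and z = +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/-{k} SD: {proportion_within(k) * 100:.1f}%")
```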

Ratio scale

Includes ordering, equal intervals AND an absolute zero. - all mathematical operations can be meaningfully performed Examples: - length and weight, Kelvin scale

Ordinal scale

Individuals or things are ranked or ordered on the basis of some criteria; intervals between ranks are not consistent. Examples: - grade level, ranking from shortest to tallest, movie sequels

Best portrayer of data if there is an outlier

Median

Norm-referenced (NRT) test interpretation

Interpretation is based on an individual's *relative standing* in some known group.

Criterion-referenced (CRT) test interpretation

Interpretation is based on measuring an individual's skill level in relation to a clearly specified standard (i.e., criterion).

Limitations

It is important to understand that normed scores do not represent standards or goals to be achieved by students. + norms simply describe typical or normal performance Criterion-referenced scores may have little or no application at the upper end of the knowledge/skill continuum. + more difficult to make proper comparisons between test-takers

Metric

From the Greek 'metron', meaning 'measure'.

Deviation scores

Measure of how far the raw score is from the mean of its distribution (X - μ).

Variability

Measures are used to describe how much fluctuation in scores there is in a sample of observations. - needed to interpret a person's score

Central tendency

Measures are used to describe the typical response seen in a sample of observations. - needed to interpret a person's score

Tarasoff vs. Univ. of California

Mental health professionals have a duty to warn people who may be in danger because of their patient.

Inferential statistics

Methods for making inferences about a population of objects based on information from a sample from that population. Examples: - chi-square test of association, t-test and ANOVA, correlation and regression

Measures of central tendency

Mode (most frequently observed) Mean (average score) Median (50th percentile score)
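
A short sketch of the three measures using Python's standard library, with the score lists taken from the Mode and Mean examples in this set.

```python
import statistics

scores = [3, 4, 4, 5, 5, 5, 6, 8]

mean = statistics.mean(scores)        # sum of scores divided by their count
median = statistics.median(scores)    # middle value (average of the two middle scores here)
modes = statistics.multimode(scores)  # every most-frequent score, as a list

print(mean, median, modes)

# A distribution can have more than one mode.
bimodal = statistics.multimode([3, 4, 4, 4, 5, 5, 5, 8])
print(bimodal)
```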

Mode

Most frequently observed score. - only measure of central tendency that can be used with nominal data. Examples: 3,4,4,5,5,5,6,8 = mode is 5 3,4,4,4,5,5,5,8 = modes are 4 and 5

Advantages of standardized measurements

• Objectivity • Quantification • Communication • Economy • Scientific generalizability

Variance and SD

Reflect the variability of scores about the mean of the group. Variance is the average of the squared deviations of each score from the mean. The SD is the square root of the variance. + is expressed in the same units of measurement as the original scores

National and anchor norms

National norms are derived from 'representative' samples of a country. + often developed using stratified sampling methods Anchor norms indicate how test scores for a measure compare to the norms for other measures of the same construct. + calculated using percentile scores

Interval scale

Numbering includes order, and intervals between successive levels represent equal differences; *no absolute zero point in the scale*. Examples: - Fahrenheit scale, intelligence test scores

Continuous scale

Numbers do not represent categories, middle ground between units possible.

Correlation coefficient

Pearson's r

Types of norms

Percentiles Age Grade National Anchor Subgroup

Median

Point which divides the group in half so that 50% of scores fall above it and 50% fall below it. - better measure of central tendency than the mean when the data are skewed because it is unaffected by extreme scores Examples: 2,3,4,5,6 = median is 4 2,3,4,5,6,7,8 = median is 5

Predictors of risk taking behavior

Positive predictors: - confidence - risk propensity - sensation seeking - gender (M>F) - extraversion Negative predictors: - age - social desirability - neuroticism - risk assessment

Prediction

Predicting values of one variable based on knowledge of scores on other variables is a practical use of correlation. Examples: - predicting job performance from aptitude test scores However, prediction technique must take into account both the scales of measurement and the correlation between the two variables.

Psychological assessment assumption #5

Present-day behaviors can predict future behaviors. - or there would be no point to testing

Descriptive statistics

Procedures for organizing, summarizing, and describing quantitative information. + academic performance can be described using descriptive statistics Examples: - batting average, census data, horsepower

Psychological assessment assumption #1

Psychological states or traits exist, and can be quantified and measured. - if these things are not real, then what's the point of trying to measure them? - if we can't quantify and measure them then psychologists are screwed

Test developers

Psychologists required to adhere to ethical standards (APA, AERA).

Federal testing legislation

Public interest in educational testing sparked by Sputnik (1957). - National Defense Education Act (1958) provided money for aptitude testing in an attempt to identify gifted children - Increased use of tests led to concerns about the value and effect of psychological testing on students

T-scores

Represent one transformation of z which overcomes the disadvantage of working with negative scores. T-score = (z-score X 10) + 50 - t-score mean = 50 - t-score SD = 10
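
The transformation can be sketched in a couple of lines. The function name and the example z values are illustrative.

```python
def t_score(z: float) -> float:
    """Convert a z-score to a T-score (mean 50, SD 10)."""
    return z * 10 + 50

# A negative z becomes a positive T-score, which is the point of the transformation.
for z in (-1.5, 0.0, 2.0):
    print(z, t_score(z))
```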

Scaling

Represent quantity of an attribute numerically. • Also used to measure psychological characteristics such as IQ test scores Examples: • Physical attributes and other quantities such as weight, height, age, and the cost of buying products or services

Stratified samples

Sampling individuals from subgroups in the population in the same proportion as the population they are part of. - best when population includes subgroups that differ on some potentially meaningful characteristic Helps prevent sampling bias

Scales of measurement

Scales, or levels, of measurement help determine what statistical analyses are appropriate; enable test users to make accurate score interpretations. Four levels: - nominal - ordinal - interval - ratio

What are the two types of measurement?

Scaling and Classification.

Standard deviation (SD) formula

Standard deviation: the average deviation of each score from the mean. Population: σ = √( Σ(X − μ)² / N ) Sample: s = √( Σ(X − X̄)² / (n − 1) )
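
A minimal sketch of both formulas written out by hand and compared with the standard library. The function names are illustrative; the scores reuse the first Mean example from this set.

```python
import math
import statistics

def population_sd(scores):
    """Divide the sum of squared deviations by N."""
    mu = sum(scores) / len(scores)
    return math.sqrt(sum((x - mu) ** 2 for x in scores) / len(scores))

def sample_sd(scores):
    """Divide the sum of squared deviations by n - 1."""
    mean = sum(scores) / len(scores)
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / (len(scores) - 1))

scores = [3, 4, 4, 5, 5, 5, 6, 8]
print(population_sd(scores), statistics.pstdev(scores))  # should agree
print(sample_sd(scores), statistics.stdev(scores))       # should agree
```

The sample version is always slightly larger because dividing by n − 1 corrects for a sample's tendency to underestimate the population's variability.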

Types of normative samples

Stratified Random Purposive Convenience

Negative relation

Strong negative (r= -.7) + political affiliation and willingness to vote for another party's candidate Moderate/weak negative relation (r= -.4) + brushing teeth and cavities

Positive relation

Strong relation (r= .7 or higher) + height and weight + age and job experience Moderate/weak relation (r= .4 or lower) + chemotherapy and cancer remission + GRE scores and grad student success r² = proportion of variance shared by variables

Sir Francis Galton

Studied genetic influence using pedigree charts. - attempted to quantify individual differences by classifying people - *developed first correlation coefficient* later refined by Karl Pearson - created Anthropometric Laboratory in London in 1884 - major proponent of the Eugenics Movement

What if a person scores below the mean?

Look up the proportion for the positive z-score and subtract it from 1: 1 - #

Measures of variability

Synonyms for variability are "spread" and "dispersion". Each term refers to differences among scores within a sample or population. Three common types are: - Range - Deviation scores - Variance and SD
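
A short sketch of the first two measures, using the scores from the first Range example in this set. It also shows why variance squares the deviations: raw deviation scores always sum to zero.

```python
scores = [2, 5, 7, 7, 8, 8, 10, 12, 15, 17, 20]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Deviation scores: each raw score minus the mean of the distribution.
mean = sum(scores) / len(scores)
deviations = [x - mean for x in scores]

print(score_range)
print([round(d, 2) for d in deviations])
print(round(sum(deviations), 10))  # effectively zero, so deviations must be squared before averaging
```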

Sampling and norms

Test administered to members of the sample under the same conditions. - environment - instructions - time restrictions - etc Developers calculate descriptive statistics and provide precise description of sample.

Psychological assessment assumption #4

Test-related behavior can predict behavior in other settings. - otherwise the test would be useless

Mean

The "average" of a set of scores. - found by summing all values and then dividing that sum by the total number of observed values - requires interval or ratio data - sensitive to every score in the sample, and may be inappropriate with skewed data Example: 3,4,4,5,5,5,6,8 = 40/8 = 5 3,3,4,4,6 = 20/5 = 4

Range

The difference between the highest and lowest scores; sensitive to outliers. Examples: 2,5,7,7,8,8,10,12,15,17,20 + Range = 20-2 = 18 2,5,6,6,7,8,11,14,15,15,23 + Range = 23-2 = 21

Population z-score formula

The population z-score is calculated by subtracting the population mean from the individual raw score and then dividing by the population SD. Where: - X is the individual score - μ is the population mean - σ is the population SD Z = (X - μ)/σ
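
The calculation can be sketched as a small function. The function name and the example scale (mean 100, SD 15) are illustrative assumptions rather than values from the course.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations a raw score lies above or below the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Hypothetical scale with mean 100 and SD 15.
print(z_score(115, 100, 15))  # one SD above the mean
print(z_score(85, 100, 15))   # one SD below the mean
```

The same function works for the sample version by passing the sample mean and sample SD.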

Measurement

The process of assigning numbers or symbols to a characteristic or attribute according to a set of rules.

Assessment

The process of gathering and integrating data for the purpose of making an evaluation. - most effective when info is obtained using multiple techniques

Testing

The process of measuring variables by means of devices or procedures designed to obtain a sample of behavior.

Sample z-score formula

The sample z-score is calculated by subtracting the sample mean from the individual raw score and then dividing by the sample SD. Where: - X is the individual score - X̄ is the sample mean - s is the sample SD Z = (X - X̄)/s

Examples of psychological scales

They often measure individual differences. • Interests • Personality • The GRE • Risk taking behavior

Describing data

Three methods: - pictorially - measures of central tendency - measures of variability (or dispersion)

Multiple regression

Used when more than one predictor variable is available. Takes into account the correlation between each predictor and what is being predicted, as well as the correlations among the predictors. Y = a + b1X1 + b2X2
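
A minimal sketch of how the two-predictor equation produces a predicted score once the weights are known. The intercept, weights, and predictor values below are hypothetical; in practice they would be estimated from data by statistical software.

```python
def predict(a: float, b1: float, b2: float, x1: float, x2: float) -> float:
    """Predicted Y from two predictors: Y = a + b1*X1 + b2*X2."""
    return a + b1 * x1 + b2 * x2

# Hypothetical intercept and weights, not estimated from real data.
print(predict(a=1.0, b1=0.5, b2=0.25, x1=4, x2=8))
```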

Simple linear regression

Used when one variable is used to predict values. - describes the relationship between one independent variable (x) and one dependent variable (y)
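
A minimal sketch of fitting a line with one predictor by ordinary least squares, written in plain Python. The function name and the data points are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for predicting y from x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope = fit_line([1, 2, 3], [3, 5, 7])
print(intercept, slope)
print(intercept + slope * 4)  # predicted y for x = 4
```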

Logistic regression

Used when the variable being predicted is dichotomous (e.g., gender).

Ordinal scale statistics

Values imply nothing about the magnitude of differences from one level to the next. - numbers are not units of measurement - statistical operations are limited to non-parametric tests

Psychological assessment assumption #3

Various sources of error are part of the assessment process. - our goal is to manage error sources instead of ignoring the problem

Adequate norms

Was the test developed using samples similar to the people taking the test?

Z-scores and percentile ranks

Z-scores can be used to calculate percentiles when raw scores have a normal distribution. - when used in conjunction with a Z-table, the z-score reveals the area of the normal distribution below the score in question
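
In place of a printed Z-table, the area below a z-score can be looked up with `statistics.NormalDist` from the standard library. This is a sketch that assumes normally distributed raw scores; the function name is illustrative.

```python
from statistics import NormalDist

def percentile_rank(z: float) -> float:
    """Percentage of a normal distribution falling below the given z-score."""
    return NormalDist().cdf(z) * 100

for z in (-1.0, 0.0, 1.0):
    print(f"z = {z:+.1f} -> percentile rank of about {percentile_rank(z):.1f}")
```

Because the curve is symmetrical, the rank for a negative z equals 100 minus the rank for the same positive z.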

Characteristics of an effective test

• Reliability • Validity • Adequate norms

No relation (pearson's r)

r = 0.00

.80 and -.80

Same strength of relation; only the direction differs.

Tests are defined by what (and how) they measure

• Content • Format • Administration procedures • Scoring and interpretation procedures • Psychometric quality - what makes an effective test?

