Psyc 442 - Exam 1 - McDermott
spearman rho
A method for computing correlation, used primarily when sample sizes are small or the variables are ordinal in nature
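As a rough illustration of the card above, Spearman's rho can be sketched with the rank-difference formula rho = 1 - 6Σd² / (n(n² - 1)). This is a minimal sketch that assumes there are no tied scores; the helper names `rank` and `spearman_rho` and the two sets of judge rankings are made up for the example.

```python
def rank(values):
    """Return 1-based ranks (smallest value gets rank 1); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_rho(x, y):
    """Rank-difference formula for Spearman's rho (no-ties assumption)."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank(x), rank(y)))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Hypothetical ordinal data: two judges ranking the same five essays.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
print(spearman_rho(judge_a, judge_b))
```

With tied scores the shortcut formula is no longer exact, so a tie-aware ranking would be needed.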
there are many different cultures that would result in different responses and cross comparing cultures would be difficult if not impossible
In the 1930's and 1940's developers of IQ tests devised culture-specific tests and clarified that the tests were not intended for minority cultures. Yet, the tests were used on individuals belonging to other cultures. What are some problems with this?
norm referenced tests involve comparing individuals to the normative group; with criterion referenced tests test-takers are evaluated as to whether they meet a set standard (e.g. a driving exam)
Norm-Referenced versus Criterion-Referenced Interpretation
operational definition
a carefully worded statement of the exact procedures used in a research study
David Wechsler (1939)
a clinical psychologist who introduced the Wechsler-Bellevue Intelligence Scale in 1939; it was later revised and renamed the Wechsler Adult Intelligence Scale (WAIS)
meta-analysis
a family of techniques to statistically combine information across studies to produce single estimates of the data under study; the estimates are in the form of effect size, which is often expressed as a correlation coefficient
histogram
a graph with vertical lines drawn at the true limits of each test score (or class interval), forming a series of contiguous rectangles
discrete scale
a measurement scale that allows for measurement of things with categorical value
Pearson r
a method of computing correlation when both variables are linearly related and continuous
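A minimal sketch of how Pearson r might be computed by hand, using the sum of cross-products of deviations divided by the square root of the product of the two sums of squares. The function name `pearson_r` and the sample data are assumptions for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: hours studied and exam score.
hours = [1, 2, 3, 4]
score = [2, 4, 6, 8]
print(pearson_r(hours, score))
```

The result always falls between -1 and +1; the sketch does not guard against a variable with zero variance, which would cause a division by zero.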
norm-referenced testing and assessment
a method of evaluation and a way of deriving meaning from test scores by evaluating an individual testtaker's score and comparing it to scores of a group of testtakers
subgroup norms
a normative sample can be segmented by any of the criteria initially used in selecting subjects for the sample
correlation coefficient
a number that provides us with an index of the strength of the relationship between two things
they may be minimized near the ends of the distribution and exaggerated in the middle of the distribution
a problem with percentiles: what can happen to real differences between raw scores?
standard score
a raw score that has been converted from one scale to another scale, where the latter scale has some arbitrarily set mean and standard deviation
cut score
a reference point, usually numerical, used to divide data into two or more classifications (e.g. pass or fail)
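To make the idea of a cut score concrete, here is a small sketch that sorts scores into pass or fail. The cut value of 70 and the rule that a score equal to the cut counts as passing are assumptions for the example, not part of the definition.

```python
def classify(score, cut_score=70):
    """Return 'pass' if the score meets or exceeds the cut score, else 'fail'."""
    return "pass" if score >= cut_score else "fail"


# Hypothetical exam scores.
results = {score: classify(score) for score in [55, 70, 88]}
print(results)
```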
incidental/convenience sampling
a sample that is convenient or available for use; may not be representative of the population.
distribution
a set of test scores arrayed for recording or study
central tendency
a statistic that indicates the average or midmost score between the extreme scores in a distribution
raw score
a straightforward, unmodified accounting of performance that is usually numerical
alternative assessment
a type of evaluation other than a conventional test; it is sometimes used with students who cannot take a conventional test for some reason or for whom a conventional test is not an accurate assessment of their knowledge or ability
rapport
a working relationship between the examiner and the examinee
respondents are arguably the best-qualified people to provide answers about themselves
advantage of self-report
- greater access to potential test-users - scoring and interpretation tend to be quicker - costs tend to be lower - facilitates testing of otherwise isolated populations and people with disabilities
advantages of internet testing
frequency distribution
all scores are listed alongside the number of times each score occurred
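A simple frequency distribution can be sketched with Python's `collections.Counter`, which tallies how many times each score occurs. The score list is invented for the example.

```python
from collections import Counter

# Hypothetical test scores.
scores = [85, 90, 85, 70, 90, 85]

# Counter maps each distinct score to the number of times it occurred.
frequency = Counter(scores)

# List scores from highest to lowest alongside their frequencies.
for score in sorted(frequency, reverse=True):
    print(score, frequency[score])
```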
James McKeen Cattell
an American who had studied with Galton; coined the term "mental test" in 1890 and was responsible for launching mental testing in its modern form
national anchor norms
an equivalency table for scores on two different tests; allows for a basis of comparison
outlier
an extremely atypical point (case), lying relatively far away from the other points in a scatterplot
variability
an indication of the degree to which scores are scattered or dispersed in a distribution
construct
an informed, scientific concept developed to describe or explain behavior
trait
any distinguishable, relatively enduring way in which one individual varies from another
test-taker
anyone who is the subject of an assessment or evaluation
purposive sampling
arbitrarily selecting a sample that is believed to be representative of the population
down
as error and error variance go up, reliability and validity go _______
role-play test
assessees are directed to act as if they were in a particular situation; useful in evaluating various skills
collaborative psychological assessment
assessment in which the assessor and assessee work as partners
dynamic assessment
an interactive approach to assessment that usually follows the model Evaluation → Intervention → Evaluation; typically employed in educational settings but may also be used in correctional, corporate, neuropsychological, clinical, and other settings
age norms
average performance of different samples of test-takers who were at various ages when the test was administered
t score
can be called a fifty plus or minus ten scale; that is, a scale with a mean set at 50 and a standard deviation set at 10
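A T score can be obtained from a z-score by rescaling to a mean of 50 and a standard deviation of 10, that is, T = 50 + 10z. The function name `t_score` is an assumption for this sketch.

```python
def t_score(z):
    """Convert a z-score to a T score (mean 50, standard deviation 10)."""
    return 50 + 10 * z


print(t_score(1.5))   # one and a half SDs above the mean
print(t_score(-2))    # two SDs below the mean
```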
simple scoring report extended scoring report interpretive report
computer reports may come in the form of a... (3)
interval scale
contains equal intervals between numbers. Each unit on the scale is exactly equal to any other unit on the scale
z-score
conversion of a raw score into a number indicating how many standard deviation units the raw score is below or above the mean of the distribution
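The conversion described above is z = (raw score - mean) / standard deviation. A minimal sketch, with a made-up mean of 50 and standard deviation of 15:

```python
def z_score(raw, mean, sd):
    """Number of standard deviation units a raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd


# Hypothetical distribution: mean 50, standard deviation 15.
print(z_score(65, 50, 15))
print(z_score(35, 50, 15))
```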
-1 and +1
correlation coefficients vary in magnitude between...
0
correlation of ___ indicates no relationship between two variables
The Standards for Educational and Psychological Testing
covers issues related to test construction and evaluation, test administration and use, special applications of tests and considerations for linguistic minorities
eugenics
term coined by Galton; the effort to improve a population through deliberate selective breeding
national norms
derived from a normative sample that was nationally representative of the population at the time the norming study was conducted
Francis Galton
devised a number of measures for psychological variables and abilities (e.g., questionnaires, rating scales, etc.)
range
difference between the highest and the lowest scores
- respondents may have poor insight into themselves. People might honestly believe some things about themselves that in reality are not true. - some respondents are unwilling to reveal anything about themselves that is very personal or paints them in a negative light
disadvantages of self-report
states
distinguish one person from another like traits but are relatively less enduring
- familiarity with test materials and procedures - ensuring that the room in which the test will be conducted is suitable and conducive to the testing - establish rapport during test administration
ethical testers have responsibilities before, during, and after the test including...
Henry Goddard
eugenicist who came up with early psychological testing of immigrant populations?
stratified-random sampling
stratified sampling in which every member of the population has an equal chance of being included in the sample
Woodworth Psychoneurotic Inventory
first widely used self-report personality test
grouped frequency distribution
frequency distribution that has class intervals rather than actual test scores
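One way a grouped frequency distribution might be built is to map each score to the lower limit of its class interval and count. This sketch assumes integer scores and an interval width of 10; the function name and data are illustrative.

```python
from collections import Counter


def grouped_frequency(scores, width):
    """Count integer scores per class interval of the given width."""
    # Integer division finds the lower limit of the interval containing each score.
    counts = Counter((score // width) * width for score in scores)
    # Label each interval by its (lower limit, upper limit).
    return {(low, low + width - 1): counts[low] for low in sorted(counts)}


# Hypothetical test scores.
scores = [52, 57, 61, 68, 69, 73, 88]
print(grouped_frequency(scores, 10))
```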
1. the construct is defined 2. values are assigned to different concepts 3. a scoring system and way to interpret results is devised
how can traits and states be quantified and measured?
assessments tend to be broader and more comprehensive, include multiple tests or evaluations, and answer some form of a bigger question
how is an assessment different from a test?
test user qualifications
in 1950 the APA published a report called Ethical Standards for the Distribution of Psychological Tests and Diagnostic Aids; it outlined three levels of tests in terms of expertise
positive correlation
indicates that as one variable increases or decreases, the other variable follows suit
negative correlation
indicates that as one variable increases the other decreases
case history data
information preserved in records, transcripts, or other forms
ratio scale
interval scale with a true zero point
nominal scale
involves classification or categorization based on one or more distinguishing characteristics; all things measured must be placed into mutually exclusive and exhaustive categories; no rank order or quantification
normalizing a distribution
involves "stretching" the skewed curve into the shape of a normal curve and creating a corresponding scale of standard scores
ordinal scale
involves classifications, like nominal scales but also allows rank ordering
scatterplot
involves simply plotting one variable on the X (horizontal) axis and the other on the Y (vertical) axis
the median
is the mean or the median affected less by skewing?
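A quick sketch of why the median holds up under skew: adding one extreme score pulls the mean sharply toward the tail while the median barely moves. The scores are invented for the example.

```python
import statistics

# Hypothetical symmetric set of scores.
scores = [70, 72, 74, 76, 78]
# Same scores plus one extreme high score, producing a positive skew.
skewed = scores + [200]

print(statistics.mean(scores), statistics.median(scores))
print(statistics.mean(skewed), statistics.median(skewed))
```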
minimal competency testing programs
many states in the 1970's passed laws to the effect that high school graduates should be able to meet "minimal competencies" in reading, writing, and arithmetic.
interview
method of gathering information through direct communication involving reciprocal exchange
behavioral observation
monitoring the actions of people through visual or electronic means
psychological measurement
most psychological measures are truly ordinal but are treated as interval measures for statistical purposes
bar graph
numbers indicative of frequency appear on the Y-axis, and reference to some categorization (e.g., yes/no/maybe, male/female) appears on the X-axis
I have this construct that I say exists so I create behavioral parameters of what that looks like and I try to quantify that
operational definition of a construct
truth in testing legislation
passed at the state level, starting in the 1980's, the objective was to give test-takers a way to learn the criteria by which they are being judged (Descriptions of the tests and the subject matter assessed, etc)
local norms
provide normative information with respect to the local population's performance on some test
utility of a test
refers to the usefulness or value of a test; typically measured by reliability and validity
real-world behavior
responses on tests are thought to predict...
- to know why they are being evaluated, how the test data will be used, and what (if any) information will be released to whom - with full knowledge of such information, test-takers give their informed consent - the right to be informed of test findings - the right to privacy and confidentiality - the right to the least stigmatizing label
rights of test-takers
stratified sampling
sampling that includes different subgroups, or strata, from the population (the entire group for which the test is designed)
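A rough sketch of stratified sampling with random selection inside each stratum. The strata names, the 10% sampling fraction, and the fixed seed are all assumptions chosen so the example is reproducible; real norming studies choose strata and proportions from the test's target population.

```python
import random


def stratified_random_sample(strata, fraction, seed=0):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)  # fixed seed only to make the example repeatable
    sample = {}
    for name, members in strata.items():
        # Take at least one member so no stratum is left out.
        k = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population of 100 people split into two strata.
strata = {"urban": list(range(60)), "rural": list(range(60, 100))}
sample = stratified_random_sample(strata, 0.1)
print({name: len(members) for name, members in sample.items()})
```

Because each stratum is sampled in proportion to its size, the sample keeps the population's 60/40 split.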
positive skew
skew where relatively few of the scores fall at the high end of the distribution
negative skew
skew where relatively few of the scores fall at the low end of the distribution
Wilhelm Wundt
started the first experimental psychology laboratory and measured variables such as reaction time, perception, and attention span
probability level below .05
statistical significance level
measures of variability
statistics that describe the amount of variation in a distribution
content
subject matter of a test
mean
sum of the observations divided by the number of observations
sampling
the process of selecting a portion of the population deemed representative of the whole; the population is the group for which the test is intended, whose members share at least one common, observable characteristic
frequency polygon
test scores or class intervals (as indicated on the X-axis) meet frequencies (as indicated on the Y-axis)
psychometrists; psychometricians
test users are sometimes referred to as __________________ or __________________
projective test
test where the individual is assumed to "project" onto some ambiguous stimulus his or her own unique needs, fears, hopes, and motivation; came out of psychoanalysis which suggests that an individual has an unconscious mind that they aren't aware of that drives their actions
level C tests
tests and aids that require substantial understanding of testing and supporting psychological fields together with supervised experience in the use of these devices
test developer
tests are created by them for research studies, publication (as commercially available instruments), or as modifications of existing tests
test user
tests are used by a wide range of professionals, each referred to as a ______
administration
tests may require certain tasks to be performed, trained observation of performance, or little involvement by the test administrator (e.g., self-report questionnaires)
level A tests
tests or aids that can adequately be administered, scored, and interpreted with the aid of the manual
level B tests
tests or aids that require some technical knowledge of test construction/use and knowledge of psychology and education
accommodation
the adaptation of a test, procedure, or situation, or the substitution of one test for another
variance
the arithmetic mean of the squares of the differences between the scores in a distribution and their mean
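The definition above translates directly into a few lines of code. This sketch divides by n (the number of scores), matching the "arithmetic mean of the squares" wording; a sample estimate would divide by n - 1 instead. The data are made up.

```python
def variance(scores):
    """Mean of the squared differences between each score and the mean."""
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / len(scores)


# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(scores))
```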
average deviation
the average of the absolute deviations of scores in a distribution from the mean
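A minimal sketch of the average deviation: take each score's distance from the mean, ignore the sign, and average the results. The function name and data are assumptions for illustration.

```python
def average_deviation(scores):
    """Mean of the absolute differences between each score and the mean."""
    mean = sum(scores) / len(scores)
    return sum(abs(s - mean) for s in scores) / len(scores)


# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(average_deviation(scores))
```

Absolute values are needed because the signed deviations from the mean always sum to zero.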
grade norms
the average test performance of testtakers in a given school grade
r
the coefficient used to measure the magnitude of an association between two different variables or constructs
error
the collective influence of all of the factors on a test score beyond those specifically measured by the test
error variance
the component of a test score attributable to sources other than the trait or ability measured
reliability
the consistency of the measuring tool; the precision with which the test measures and the extent to which error is present in measurements
Fixed Reference Group Scoring Systems
the distribution of scores obtained on the test from one group of testtakers is used as the basis for the calculation of test scores for future administrations of the test
validity
the extent to which a test measures or predicts what it is supposed to
format
the form, plan, structure, layout of test items, and other considerations (e.g. time limits)
assessment
the gathering and integration of data for the purpose of making a psychological evaluation through tools such as tests, interviews, case studies, behavioral observation, and other methods; objective is typically to answer a referral question, solve a problem or arrive at a decision through the tools of evaluation
median
the middle score in a distribution
mode
the most frequently occurring score(s) in a distribution
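The median and mode cards can be illustrated with Python's `statistics` module. `statistics.multimode` (Python 3.8+) returns every score tied for most frequent, which fits the "score(s)" wording. The scores are made up.

```python
import statistics

# Hypothetical scores with two modes (2 and 3 each occur twice).
scores = [1, 2, 2, 3, 3, 4]

# With an even number of scores, the median is the average of the two middle scores.
print(statistics.median(scores))
# multimode returns a list of all most frequently occurring scores.
print(statistics.multimode(scores))
```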
skewness
the nature and extent to which symmetry is absent in a distribution
percentile
the percentage of people whose score on a test or measure falls below a particular raw score
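Following the definition above, a percentile can be sketched as the percentage of scores in a distribution that fall below a given raw score. The function name `percentile_rank` and the ten scores are assumptions for the example; some texts handle scores equal to the raw score differently.

```python
def percentile_rank(scores, raw):
    """Percentage of scores in the distribution that fall below the raw score."""
    below = sum(1 for s in scores if s < raw)
    return 100 * below / len(scores)


# Hypothetical distribution of ten test scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_rank(scores, 85))
```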
standardization
the process of administering a test to a representative sample of test-takers for the purpose of establishing norms
testing
the process of measuring psychology-related variables by means of devices or procedures designed to obtain a sample of behavior; objective is typically to obtain some gauge, usually numerical in nature, with regard to an ability or attribute
normative sample
the reference group to which test-takers are compared
psychometrics
the science of psychological measurement
culture
the socially transmitted behavior patterns, beliefs, and products of work of a particular population, community, or group of people
standard deviation
the square root of the average squared deviations about the mean; it is the square root of the variance; typical distance of scores from the mean
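A short sketch of the standard deviation as the square root of the variance. It divides by n, matching the "average squared deviations" wording; a sample estimate would use n - 1. The data are invented.

```python
import math


def standard_deviation(scores):
    """Square root of the mean squared deviation from the mean."""
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / len(scores)
    return math.sqrt(variance)


# Hypothetical scores with a mean of 5 and a variance of 4.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(scores))
```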
kurtosis
the steepness of a distribution in its center
norms
the test performance data of a particular group of testtakers that are designed for use as a reference when evaluating or interpreting individual test scores
continuous scale
used for dimensional constructs in which it is theoretically possible to divide any of the values of the scale; typically having a wide range of possible values
scoring and interpretation
way of getting the results of a test that may be simple, such as summing responses to items, or may require more elaborate procedures
- non-verbal signs or body language may vary from one culture to another - psychoanalysis pays particular attention to the symbolic meaning of non-verbal behavior - other cultures may complete tasks at a different pace, which may be particularly problematic for timed tests
what are some problems between different cultures and non-verbal communication and behavior?
- cultures differ in regards to gender roles and views of psychopathology - cultures also vary in terms of collectivist vs. individualist value
what are some problems between different cultures and standards of evaluation?
- some meaning and nuance may be lost in translation - some interpreters may not be familiar with mental health issues so pre-training may be necessary - assessments need to be evaluated in terms of the language proficiency required and the current level of the test-taker
what are some problems between different cultures and verbal communication?
Stanford-Binet Intelligence Test and the Wechsler Adult Intelligence Scale
what are the two gold standard intelligence tests today?
World Wars I and II
what created the need for large-scale testing of intellectual ability?
sensory abilities
what did Galton think were related to intelligence?
- reliability and validity - administration, scoring, interpretation should be straightforward for trained examiners. (AKA user-friendly) - a good test is a useful test that will ultimately benefit individual test-takers or society at large. - cost-effectiveness - generalizability of findings
what makes a good test?
the launch of Sputnik by the Soviet Union (1957)
what prompted the U.S. government to greatly increase testing of abilities and aptitudes in schools to identify talented students?
they can accurately score things that a human can't score
what role do computers play in administration, scoring, and interpretation?
negatively skewed distribution
what type of distribution?
normal curve (bell-shaped)
what type of distribution?
positively skewed distribution
what type of distribution?
- educational - clinical - counseling - geriatric - business and military - government and organizational credentialing
what types of settings are tests given in?
after World War I when tests developed for military use were adapted and became widespread in schools and industry
when did public concerns about testing start?
China; to select people for government jobs
where were the first systematic tests developed? for what reason?
Alfred Binet and Theodore Simon (1905)
who developed the first intelligence test to identify mental retardation in schoolchildren?
the wealthy because they had time to study and were educated
who typically did better on systematic tests and why?
Charles Darwin
whose interest in individual differences led Galton to his work?
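there is no way to show that the intervals between scores are exactly equal, so the scales are strictly ordinal; treating them as interval permits statistics such as the mean and standard deviation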
why are most psychological measures truly ordinal but treated as interval?
he argued that intelligence is much broader than sensory abilities and involves more complex processes such as reasoning and judgment
why did Binet reject Galton's view of intelligence?
his findings were largely the result of using a translated Stanford-Binet intelligence test that overestimated mental deficiency in native English-speaking populations, let alone immigrant populations
why were Goddard's findings on immigrant populations controversial?