unit 5
for item discrimination, values of ___ and above are considered acceptable
.3
item difficulty between ___ and ___ is adequate
.3 & .7
What is the optimal difficulty on a true/false item?
.75 (midpoint between chance performance, .5, and 1.0)
What is the optimal difficulty on a multiple choice item with 5 response options?
.60
If 15 of 20 students responded correctly to an item, its item difficulty is
.75
Reliability coefficients range from ___ to ___; score of ___ or higher is considered acceptable, below is questionable
0 to 1; .8
Effective Instruction Delivery
1) Proximity 2) "Look at me." 3) "Good job looking." 4) Give a clear directive in a neutral tone. 5) Provide praise for following directions quickly.
A score that falls two standard deviations above the mean is at the _____ percentile
98
What is a psychological construct?
A conceptual label for an intangible skill, ability, attribute, or cluster of behaviors (e.g., motivation, intelligence, giftedness). Helps us summarize, communicate about, interpret, and respond to the complex range of human behaviors
Measurement invariance
Measures same construct for all test-takers (ex takes age into account) --Yields scores with same meaning for all
giftedness
Multidimensional -- School eligibility often based on IQ and/or academic achievement scores -- No federal criteria or mandate for giftedness -- Renzulli's Three-Ring Conception is ONE view
Physiological symptoms of anxiety
Muscle tension Racing heart rate Stomach ache
If a child has symptoms consistent with DSM-5 diagnostic criteria, but symptoms are only present in one setting (e.g., school), can we diagnose ADHD?
NO; needs to be present in multiple settings/ more pervasive
Test fairness
No advantage or disadvantage due to characteristics unrelated to target construct
Is talk therapy or CBT an effective treatment for ADHD?
No. Not characterized by cognitive biases so not as helpful.
CBT
Psychoeducation (educating them about the disorder and try to normalize it, explain the treatment process) Teaching relaxation & coping skills Cognitive restructuring Systematic desensitization - exposures
standard scores
Raw scores transformed to have set means and SDs -- On cognitive measures, standard scores tend to have M = 100 and SD = 15
How does an IQ test differ from an academic test?
IQ test items are read aloud to you → an IQ test usually doesn't require reading
RSVP test characteristics
Reliable (yields consistent results) Standardization (adheres to protocol) Valid (measures intended construct → construct validity) Practicality (time and other resources)
Intra-rater reliability
Results are consistent with the same person rating the test more than once
Autism DSM-5 essential features
a. Impairment in reciprocal social communication and social interaction b. Restricted, repetitive patterns of behavior, interests, or activities c. Symptoms are present in the early developmental period d. Impairment in social, occupational or other important areas of current functioning
purposes of assessment
baseline score and progress monitoring
inter-rater reliability
between graders; measure of agreement among observers on how they record and classify a particular event
negative skew
many high scores -- Tail is to the left (few low scores)
IQ Score Interpretation
norm referenced -- Scores are compared to scores of same-age peers -- A score at the 40th percentile means the student performed as well as or better than 40 out of 100 peers in the normative sample -- Percentile does NOT refer to percent correct
Formative Assessment
occurs before and during instruction → guides the instructional process
performance assessment
performing/ doing something to be judged like a presentation
mand
to request
Stability or Test-Retest Reliability
Give same assessment twice, separate days, result should correlate; reliability is the correlation between scores
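A minimal sketch of the card above, assuming the "correlation between scores" is a Pearson correlation. The function name `pearson_r` and the two score lists are hypothetical and made up for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each administration
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Made-up scores for the same five students on two separate days
day_1 = [10, 12, 15, 18, 20]
day_2 = [11, 12, 14, 19, 20]

# A value close to 1 suggests the scores are stable across administrations
print(round(pearson_r(day_1, day_2), 2))
```

The same calculation applies to alternate forms reliability, with the two lists holding scores from Form A and Form B instead of two days.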
Classroom-based supports for anxiety
Give them a cue card with relaxation options/ their strategies Accommodations to use coping strategies Extra time on tests or assignments
How do we identify signs of autism?
- Caregivers → familiarity with typical milestones and early risk indicators - Screenings in childcare, school, and healthcare settings - If a child doesn't respond to their name → might be a hearing problem not autism
ODD interventions
- Contingency Management - Parent Training - Antecedent Interventions
IQ Tests
- Historically, test development and uses lacked cultural sensitivity - Current commonly used measures use nationally representative samples - Threats to fairness and validity must always be considered - early IQ tests had inappropriate questions
Distractor quality is based on if the answers are
- Incorrect - Plausible (are the other answers seemingly correct? Ex: if one answer has a psychologist and the rest of the answers have Disney characters, then the only plausible answer is the psychologist) - The proportion of the students who chose each distractor is evaluated - If no one chooses a distractor, it may be implausible or too easy
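One way to evaluate the proportion of students who chose each distractor, as a rough sketch. The function names, the option labels A to D, and the response data are all hypothetical.

```python
from collections import Counter


def option_proportions(responses, options):
    """Proportion of students who chose each response option."""
    if not responses:
        raise ValueError("need at least one response")
    counts = Counter(responses)
    return {opt: counts[opt] / len(responses) for opt in options}


def unchosen_distractors(responses, options, correct):
    """Distractors that no student selected (possibly implausible)."""
    props = option_proportions(responses, options)
    return [opt for opt in options if opt != correct and props[opt] == 0]


# Made-up responses from 10 students; "A" is assumed to be the correct answer
responses = list("AAAAABBBCC")
options = ["A", "B", "C", "D"]

print(option_proportions(responses, options))
print(unchosen_distractors(responses, options, correct="A"))  # flags "D"
```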
Recommendations for teaching gifted students
- Instruction should be "individualized" → might be gifted in math or science, etc not all gifted students should be doing the same thing - Enrichment programs, pull out programs
How might anxiety interfere with school performance?
- Intrusive thoughts can be distracting - Fear of incompetence can interfere with participation and efficient, accurate work completion - Anxiety avoidance behaviors (e.g., school refusal) - Social difficulties
How can depression impact school performance?
- Lacking motivation - Not enough sleep → not good for concentration - Lack of interest in school work - Distracting thoughts
Gardner's Theory of Multiple intelligences
- Posits that we have 8 intelligences - Proceed with caution = not a lot of research supporting this
If medication is a well-established treatment why should we have a toolbox of non-pharmacological treatments?
- Some people might not want medication - Challenging side effects with medications - Appetite-related side effects
Percentile ranks - position relative to sample
- Sue scored as well as or better than 70% of test takers in the normative sample - Not equally distributed across normal curve -- Raw scores differences between percentile ranks are larger at the extremes of the distribution
Optimal Difficulty
- adjusting for guessing Find midpoint between chance performance and 100%
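The midpoint rule above can be sketched as follows, assuming chance performance is 1 divided by the number of response options. The function name `optimal_difficulty` is hypothetical.

```python
def optimal_difficulty(num_options: int) -> float:
    """Midpoint between chance performance (1 / num_options) and 1.0."""
    if num_options < 1:
        raise ValueError("an item needs at least one response option")
    chance = 1 / num_options
    return (chance + 1.0) / 2


print(round(optimal_difficulty(2), 2))  # true/false item: chance is .5
print(round(optimal_difficulty(5), 2))  # five options: chance is .2
```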
anxiety treatment
-- medication management -- CBT
treatment for depression
--CBT Psychoeducation Feelings identification Recognizing and replacing automatic thoughts Planning preferred activities Problem-solving --Medication management
Variability
--How widely scores are distributed --The range (difference between the highest and lowest scores) is examined -- Commonly measured in standard deviations (SD) Small SD - scores are tightly clustered around the mean Large SD - scores are spread out On some tests a large SD is desirable (such as college admissions exams); on others a small SD is desirable (such as a classroom test, where it suggests all students are learning)
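A small sketch comparing range and SD for a tightly clustered set of scores and a spread-out set. The helper `score_range` and both score lists are made up for illustration, and the population SD (`pstdev`) is an assumed choice.

```python
from statistics import pstdev


def score_range(scores):
    """Difference between the highest and lowest scores."""
    return max(scores) - min(scores)


# Made-up scores: one tightly clustered set, one spread-out set
clustered = [84, 85, 86, 85, 85]
spread_out = [60, 75, 85, 95, 110]

for label, scores in (("clustered", clustered), ("spread out", spread_out)):
    # pstdev treats the list as the whole group (population SD)
    print(label, score_range(scores), round(pstdev(scores), 2))
```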
Oppositional defiant disorder
--angry/ irritable mood --argumentative/ defiant behavior --vindictiveness Cannot be diagnosed if symptoms occur exclusively with siblings → symptoms may be limited to one context, as long as that context is not only sibling interactions
A student's standard score on a cognitive assessment is 70. What is the equivalent z-score?
-2
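The conversion can be written as a short sketch: a z-score is the distance from the mean in SD units. The function name `z_score` is hypothetical; the means and SDs are the ones given in these notes (standard scores M = 100, SD = 15; T scores M = 50, SD = 10).

```python
def z_score(score: float, mean: float, sd: float) -> float:
    """Distance of a score from the mean, in standard deviation units."""
    if sd <= 0:
        raise ValueError("SD must be positive")
    return (score - mean) / sd


print(z_score(70, mean=100, sd=15))  # standard score of 70 -> -2.0
print(z_score(60, mean=50, sd=10))   # T score of 60 -> 1.0
print(z_score(45, mean=50, sd=10))   # T score of 45 -> -0.5
```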
10 out of 50 students correctly answered a multiple choice question with five alternatives. What is the item difficulty index or p value for that item? Does the p value for this item reflect an adequate item difficulty index? Why or why not? What would be the optimal item difficulty level?
10/50= .2; No because .2 is not between .3 and .7 (too hard); .6
Standard scores have a mean of ______ and SD of ___
100; 15; 68% of test takers score within one standard deviation of the mean (85 to 115)
A student's T-score is 60. What is the equivalent z-score? This score falls 1 standard deviation(s) above the mean. What is the student's percentile rank?
z = 1; 84th percentile
Look at a normal distribution of test scores. If a student's cognitive assessment score falls at the 16th percentile, what was the student's standard score? Based on this score, would a psychologist suspect an intellectual disability? Why or why not?
85, no only 1 SD below the mean
Early risk indicators: (first 24 months)
Lack of appropriate gaze Lack of sharing enjoyment or interests Lack of response to name Not speaking single words by 16 months Repetitive movements or posturing of body, arms, hands, or fingers Absent use of gestures Etc
Sternberg's triarchic theory
Analytical abilities --Academic and intellectual problem-solving Creative abilities --Generating novel problem-solving strategies Practical abilities --Applying knowledge to real situations
Grade equivalent scores
Based on mean or median score for a grade level -- If median score for a 4th grader is 40, any child in any grade whose score is 40 would have a grade equivalent score at 4th grade -- Do not indicate instructional level or appropriate grade placement -- Not equal interval: skills are acquired more rapidly early on, so a 1-year "delay" at an early age / grade represents a greater difference in test performance (and functioning) than a 1-year delay at an older age / grade; GE scores are less reliable in older grades
Why is CBT well-established for anxiety and depression, but not typically indicated for ADHD?
CBT works well for anxiety and depression because it helps identify and replace the automatic thoughts that are often a problem for people with these disorders. It is not typically indicated for ADHD because ADHD is not characterized by cognitive biases (thoughts and beliefs about not being good enough), so that kind of talk therapy doesn't help in the same way. ADHD treatment usually relies on behavioral interventions rather than CBT or other interventions that focus on cognitive biases.
IQ Tests in Schools
Can be a component of a multi-faceted, comprehensive assessment Useful when there are questions of intellectual disability or giftedness The balance of the research does NOT support a link between IQ and learning disability identification or treatment
Spearman's (1904) G-Factor (general ability)
Central factor underlying cognitive abilities Spearman's Two-factor theory General ability → g Specific skills → s (run into problems with specific skills assessments)
Major depression: diagnostic criteria
Five or more symptoms for 2 or more weeks including at least one of the following: Depressed mood (or irritable mood in children)/ Loss of interest or pleasure -Changes in weight or appetite (Failure to gain expected weight (in children)) -Sleep disturbance -Psychomotor agitation or retardation (lethargic) -Fatigue -Feelings of worthlessness -Recurrent thoughts of death or suicidal ideation or attempts -Symptoms cause distress or impairment
Will criterion or norm-referenced tests have more items with higher p values?
Criterion tests
Contingency management
Daily report card system & Premack principle
If a first-grader's grade equivalent score on a reading test is 4.2, should we provide the first-grader with fourth grade reading materials? Why or why not?
No. A grade equivalent doesn't inform placement; it only tells us the child scored as well as the typical (mean or median) student at that grade level on this test → need more information
anxiety
Excessive fear and anxiety in most contexts (generalized) or in response to specific stimuli or situations (e.g., phobias, separation anxiety)
A student has an ADHD diagnosis from an MD and takes medication. The school must provide special education. True or false?
False. They might not be eligible at the state level (diagnosed by private practice)
Tom's score fell one standard deviation below the mean. score and percentile?
His score was 85; What percentile? 16 (everything to the left of 85)
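A quick check of the percentile figures in these notes, assuming scores follow a normal distribution. The function name `percentile_rank` is hypothetical; `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist


def percentile_rank(z: float) -> int:
    """Percent of a normal distribution at or below a z-score, rounded."""
    # NormalDist() with no arguments is the standard normal (mean 0, SD 1)
    return round(NormalDist().cdf(z) * 100)


print(percentile_rank(-1))  # one SD below the mean (standard score 85)
print(percentile_rank(1))   # one SD above the mean (T score 60)
print(percentile_rank(2))   # two SDs above the mean
```

These print 16, 84, and 98, matching the values on the cards.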
Why is it important to offer treatment at school?
Make sure kids are actually getting treatment, already at the school (takes away other possible barriers)
ADHD involves developmentally atypical & impairing levels of
Inattention & disorganization (Difficulty staying on task Losing things Seeming not to listen) Hyperactivity-impulsivity (Difficulty staying still and/or seated Excessive talking and/or interrupting)
Sternberg's Theory of Successful Intelligence
Individuals' definitions of success are informed by: --Personal Goals --Sociocultural Context Individuals adapt to their environments --Studying --Practicing a sport or working out for a sport Individuals shape and select environments --Ask questions --Choose classes
categories of disorders:
Internalizing → emotional states, cognitive distress, depression & anxiety Externalizing → outwardly directed behavior, ADHD, oppositional defiant disorder, conduct disorder (more observable) Developmental → basic skill deficits, communication disorders, autism
If item difficulty is 1, do we suspect that the item is too easy or too hard?
It is too easy because everyone got it correct
Item Difficulty - p
Item difficulty: Proportion of test takers who respond correctly to an item (an item = a question) Ranges from 0 to 1
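A minimal sketch of the p value calculation, using the .3 to .7 adequacy band from earlier in these notes. The function names are hypothetical.

```python
def item_difficulty(num_correct: int, num_test_takers: int) -> float:
    """Proportion of test takers who answered the item correctly (p)."""
    if num_test_takers <= 0:
        raise ValueError("need at least one test taker")
    return num_correct / num_test_takers


def is_adequate(p: float, low: float = 0.3, high: float = 0.7) -> bool:
    """True if p falls inside the .3 to .7 band used in these notes."""
    return low <= p <= high


p_easy = item_difficulty(15, 20)  # 15 of 20 students correct
p_hard = item_difficulty(10, 50)  # 10 of 50 students correct

print(p_easy, is_adequate(p_easy))  # 0.75 is above .7 (fairly easy)
print(p_hard, is_adequate(p_hard))  # 0.2 is below .3 (too hard)
```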
Z scores
Standard score ranging from -4 to 4 based on standard deviation (SD) units Ex: Z score of 1 is one SD above the mean
T score
Standard score with a mean of 50 and SD of 10 Ex: a t score of 45 is .5 SD below the mean
Multiple choice items Stem: Alternatives: Distractors:
Stem: the question itself Alternatives: all the answer options Distractors: incorrect choices
Autism Interventions
Teach communication skills Use visual supports, if needed Applied Behavior Analysis Discrete trial training Social skills instruction
construct validity
The extent to which an assessment measures the target construct -- ex= A math test with significant reading demands may have low construct validity
content validity
The extent to which test content represents the target domain; A math calculation test that only assesses addition, although the curriculum covered 4 operations, has low content validity
Standardization
Uniformity in administration and scoring procedures Scripted instructions Time limits Test materials Test environment Scoring system (e.g., partial credit? rubrics?)
How do we address reliability (inter-rater reliability and intra-rater reliability) when scoring subjective assessments?
Use a rubric to keep grading similar, training for graders, multiple people score the same assignment
Can we directly measure constructs?
We estimate construct levels
Summative assessment:
assess achievement after a unit or other timeframe
Criterion-referenced:
assigns scores based on predetermined standards
norm-referenced
assigns scores in comparison to peers' scores
Standardized assessment:
e.g., CRCT, SAT (no proctor changing)
subjective assessment
e.g., an essay
internal consistency
compare scores on one half of the test with scores on the other half to determine how well different items measure the same construct (or compare subtest scores to total scores)
Factor
construct that captures relationship between variables --Defined by its factor loadings
alternate forms reliability
create 2 forms of same test/ reliability is the correlation between scores
Picky eating →
if extremely picky (like 3 things only) might be a sign of autism but not necessarily
distractors
incorrect response options
2nd percentile or lower =
intellectual disabilities
Positive skew
few high scores - Tail is to the right (few high scores) - Scores clustered to the left (many low scores)
RIOT
record review, interviews, observations, testing
Central tendency:
score that is typical of the group; Mean: average = sum of scores / # of scores Median: Middle score in the series Mode: most frequent score
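The three measures can be computed with Python's standard `statistics` module. The scores below are made up for illustration.

```python
from statistics import mean, median, mode

# Made-up scores for five students
scores = [70, 85, 85, 90, 100]

print(mean(scores))    # sum of scores / number of scores
print(median(scores))  # middle score once the scores are sorted
print(mode(scores))    # most frequent score
```

For these scores the mean is 86, and the median and mode are both 85.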
Central tendency error
scoring error where a teacher is more likely to give everybody the same, middle scores
Halo effects:
scoring error where students already made a good impression on the teacher outside of the particular assignment, teacher grades them higher
Leniency or severity errors:
scoring error with extremes, almost everyone gets an A or a C
objective assessment
spelling test, math calculation test (no room for judgment in grading)
Item discrimination - D
statistic that tells how well an item distinguishes students who know item content from those who do not Ranges from -1 to 1 -- Positive - those who did well on test answered item correctly and those with low scores answered item incorrectly -- Negative - item discriminates in the unexpected direction (Those who did well on test responded incorrectly to this item; Check for error in item)
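These notes do not give a formula for D. One common convention, assumed here, is the proportion correct in the highest-scoring group minus the proportion correct in the lowest-scoring group. The function name and the counts are hypothetical.

```python
def discrimination_index(upper_correct: int, upper_n: int,
                         lower_correct: int, lower_n: int) -> float:
    """D = p(upper group) - p(lower group); ranges from -1 to 1.

    Assumption: groups are formed from total test scores, e.g. the
    highest-scoring and lowest-scoring students.
    """
    if upper_n <= 0 or lower_n <= 0:
        raise ValueError("each group needs at least one student")
    return upper_correct / upper_n - lower_correct / lower_n


# Made-up counts: 9 of 10 high scorers and 4 of 10 low scorers were correct
print(round(discrimination_index(9, 10, 4, 10), 2))  # positive D

# Negative D: low scorers outperformed high scorers, so check the item
print(round(discrimination_index(3, 10, 8, 10), 2))
```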
criterion validity
the extent to which a measure is related to an outcome -- The extent to which test scores are in agreement with (other scores) or predict (future scores) an external criterion
