Psych 1X03: Instrumental Conditioning
What is the over-justification effect?
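an external reward changes the perceived reason for a behaviour from intrinsic to extrinsic, so the behaviour may drop once the reward is gone (ex. volunteering for fun vs. for a resume)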
Variable Interval Graph:
straight slope
Law of Effect...
actions are maintained if followed by pleasant consequences, or eliminated if followed by unpleasant consequences
What is the S-Delta (Sδ)?
a cue that indicates when the contingent relationship is not valid; signals no contingency between response and reinforcement (ex. a different environment)
Partial Reinforcement: Fixed Ratio...
reinforcement is delivered after a set number of responses (ex. FR-1: one cake for every A+)
What does reinforcer mean?
a stimulus that changes frequency of a certain behaviour (most effective if specific behaviour immediately leads to specific consequence)
What is successive approximation?
shaping; learning a complex contingency in smaller, gradual steps, with reward training for each step closer to the target behaviour
Punishment Training...
presenting a negative reinforcer after a behaviour, which decreases the behaviour (ex. being scolded after teasing)
Escape Training...
removing a negative reinforcer after a behaviour, which increases the behaviour (ex. no chores after good grades)
Omission Training...
removing a positive reinforcer after a behaviour, which decreases the behaviour (ex. time out after being grumpy)
Reward Training...
presenting a positive reinforcer after a behaviour, which increases the behaviour (ex. a treat after a dog's trick)
What is instrumental conditioning?
explicit training for learning the contingency between a voluntary behaviour and its consequences (ex. touching a hot stove)
What are reward contrast effects?
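a change in response rate when the reward value shifts: switching to a better reward raises the response rate (positive contrast), switching to a worse reward lowers it (negative contrast)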
What is autoshaping?
learning behaviour-consequence contingency without explicit training
What is acquisition?
learning the contingency that a response leads to a consequence
Variable Ratio Graph:
straight slope; the more frequent the reward, the steeper the slope (higher response rate)
What is Partial Reinforcement?
reinforcement delivered after only some responses; the behaviour persists longer than with continuous reinforcement once reinforcement stops (more resistant to extinction)
Fixed Ratio Graph:
pause and run
Thorndike...
put cats in a puzzle box opened by pulling a rope; random behaviours gradually decreased and escape got faster, so learning was gradual rather than sudden (never a distinct "aha" moment for the cats)
Partial Reinforcement: Variable Interval...
reinforcement is delivered following the first response after a variable interval of time around a characteristic mean.. *approx. every 2 min (ex. VI-2: studying for quizzes given approx. 2/month; cake approx. every 2 min)
Partial Reinforcement: Variable Ratio...
reinforcement is delivered after a random number of responses around a characteristic mean (ex. VR-10: cake odds of 1 in 10)
Partial Reinforcement: Fixed Interval...
reinforcement is delivered following the first response after a set interval of time.. *every 2 min (ex. FI-2: exam studying 2/year; cake every 2 min)
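A minimal Python sketch of the four partial reinforcement schedules, to compare their rules side by side. The function names, the abstract time units, the choice of probability 1/mean for VR, and the exponentially distributed waits for VI are all assumptions made for illustration; none of them come from the course material.

```python
import random

def fixed_ratio(n):
    """FR-n: reinforce every n-th response (time is ignored)."""
    count = 0
    def respond(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond

def variable_ratio(mean, rng):
    """VR-mean: each response has a 1-in-mean chance of reinforcement (assumed model)."""
    def respond(t):
        return rng.random() < 1.0 / mean
    return respond

def fixed_interval(interval):
    """FI: reinforce the first response made at least `interval` after the last reinforcer."""
    last = 0.0
    def respond(t):
        nonlocal last
        if t - last >= interval:
            last = t
            return True
        return False
    return respond

def variable_interval(mean, rng):
    """VI: like FI, but each wait is drawn at random with the given mean (assumed exponential)."""
    last = 0.0
    wait = rng.expovariate(1.0 / mean)
    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.expovariate(1.0 / mean)
            return True
        return False
    return respond

# Usage: 20 responses on FR-5 earn 4 reinforcers.
fr5 = fixed_ratio(5)
earned = sum(fr5(t) for t in range(1, 21))
```

In this sketch the ratio schedules count responses and ignore the clock, while the interval schedules check the clock and ignore how many responses were made in between.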
What is continuous reinforcement?
reinforcement presented after each trial of behaviour
Fixed Interval Graph:
scalloped
What is generalization?
responding to stimuli that resemble the SD; the greater the similarity to the SD, the greater the rate of behaviour
What is extinction?
when the SD is presented without reward, so the contingency ends and the behaviour stops (ex. no longer eating veggies after grandparents stop giving cookies)
What is discrimination?
when the Sδ is presented without reward, so the contingency and the behaviour become specific to the SD (ex. eating veggies only at the grandparents' house, but not at a friend's house)
Stamping In/Stamping Out...
• behaviours with + consequences are stamped in • behaviours with - consequences are stamped out (behaviour refinement; response -> reward)
Increase frequency of behaviour (stamped in)...
• present + reinforcer (reward training) • remove - reinforcer (escape training)
Decrease frequency of behaviour (stamped out)...
• present - reinforcer (punishment training) • remove + reinforcer (omission training)
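The four training types form a 2x2 grid: type of reinforcer (+ or -) crossed with what is done to it (present or remove). A small lookup sketch in Python; the function name and string labels are hypothetical, chosen only for illustration.

```python
def training_type(reinforcer, action):
    """Map (reinforcer, action) to (training type, effect on behaviour frequency).

    reinforcer: "positive" or "negative"
    action:     "present" or "remove"
    """
    table = {
        ("positive", "present"): ("reward training", "increase"),
        ("negative", "remove"): ("escape training", "increase"),
        ("negative", "present"): ("punishment training", "decrease"),
        ("positive", "remove"): ("omission training", "decrease"),
    }
    return table[(reinforcer, action)]

# Usage: a time out removes something pleasant, so it is omission training.
name, effect = training_type("positive", "remove")
```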
What is the discriminative stimulus (SD)?
sets the occasion for a response (ex. the environment); signals the contingency between response and reinforcement