Psychology 410 - Instrumental conditioning (Reward/ Punishment)
Response-reinforcer
Behavior and reward. Changing a reward will change a response. Problem: behaviors aren't always reliant on rewards.
Extinguishing
Both Instrumental and Operant conditioning can be extinguished by withholding the reward.
Is it better to know when a stimulus will happen?
Can approach/avoid the situation. Sometimes the anticipation is worse. Depends on whether the individual can control his/her fate.
Side effects of learned helplessness
Emotional - psychosomatic (ulcers). Motivational - lack of response. Cognitive - belief that no action will help.
Omission
Taking something away in order to decrease behavior Bad grades = no money
The Sequential Hypothesis (PREE)
5 times - reward 6 times - reward 2 times - reward Person does not know WHAT they did for their reward, therefore they keep doing it.
Theory of avoidance Cognitive
Acquired pair of expectations Lucky socks -> win the game. Issue: You will wear these socks for every time you play. No evidence to suggest otherwise. No socks -> ? Video: The montana stain on the shirt. The wife (Ravens fan) washes it "by accident" Avoidance: You are avoiding not wearing the shirt because you do not want to lose.
Two types of avoidance
Active: Doing an action prevents something from happening (Texting your gf so that she doesn't get mad). Passive: Not doing an action prevents the stimulus from occurring. (Not bringing up something that upsets you to prevent an argument).
Punishment
Adding something negative in order to stop something from happening Bad grades = chores
Role of neuroscience in learning
Amygdala: very important. Needed DURING and AFTER learning.
Applications of instrumental conditioning
Animal Training Biofeedback Education Token Economies
Conditioned fear/ avoidance in response to punishment
Animal peeing when they hear the word "No"
Side effects of punishment and non-reward
Conditioned fear/ avoidance Aggression Secondary reinforcement
3 Theories of avoidance learning
Conditioning Cognitive Functional.
Fixed ratio
Constant behavior. After a set number of responses, you get a reward. Every time you make a bracelet, you get a reward. Factory piecework. Easy to extinguish. Steady response. Determined by the subject's effort.
Fixed interval
Constant time. After a certain amount of time, you get a reward. Normally you wait until the end and then increase behavior (studying for a test). Annual sales. Little response until just before reinforcement, then rapid response. Fairly easy to extinguish. *Look at graph for this.
What makes an effective punishment? Continuous or non-continuous punishment?
Continuous punishment works better. PREE makes it harder to extinguish the behavior in the absence of the aversive stimuli.
What makes an effective punishment? Delays in punishment
Delays in punishment The longer the delay, the less effective it is
Learned helplessness in every day life
Depression. Classroom learning (used to one type of performance might lead them to believe that this performance is to be expected every time).
Stimulus Reinforcer
Discriminative stimulus and reward. Seeing red light is the sign to press a button.
Protestant ethic effect
Earning something will make you feel much better about the reward The marbles coming out back but the kids not taking them
Education
Example with the machines they used. They were tested and then received immediate feedback, this prevented anxiety and made them very efficient at learning the correct responses. Guides behavior. DOES NOT WORK
Schedules of reinforcement
Fixed ratio Fixed interval Variable ratio Variable interval
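The four schedules can be sketched in code. This is only an illustration of the definitions in these notes, not a model from the course: the function names, the use of seconds as the time unit, and the uniform random draws for the variable schedules are all assumptions.

```python
import random


def fixed_ratio(n):
    """Reward every n-th response (n is a hypothetical parameter)."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reward after an unpredictable number of responses averaging mean_n."""
    # Assumption: the required count is drawn uniformly from 1..2*mean_n-1.
    target = rng.randint(1, 2 * mean_n - 1)
    count = 0

    def respond(t):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """Reward the first response made at least `seconds` after the last reward."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """Reward the first response after an unpredictable wait averaging mean_seconds."""
    # Assumption: the wait is drawn uniformly from 0..2*mean_seconds.
    wait = rng.uniform(0, 2 * mean_seconds)
    last = 0.0

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


def count_rewards(schedule, response_times):
    """Feed a sequence of response times to a schedule and count the rewards."""
    return sum(schedule(t) for t in response_times)


# Slow responder: one response per second for 20 seconds.
slow = [float(i) for i in range(1, 21)]
# Fast responder: two responses per second over the same 20 seconds.
fast = [i * 0.5 for i in range(1, 41)]

print(count_rewards(fixed_ratio(5), slow), count_rewards(fixed_ratio(5), fast))
print(count_rewards(fixed_interval(5), slow), count_rewards(fixed_interval(5), fast))
```

In this sketch, doubling the response rate doubles the rewards on the ratio schedule but leaves them unchanged on the interval schedule, which matches the note that ratio schedules are determined by the subject's effort.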
Theory of avoidance Functional approach
Focuses on innate responses to threat. Mice freeze, Rats stand up INVERSE OF INSTINCTIVE DRIFT This is Species Specific Defense Response (SSDR)
Instrumental conditioning methods
For animals: Skinner boxes mazes For humans Head turning and leg movement (with the mobile) for babies.
Variables affecting acquisition H, D and K
H - strength prior to training (H for Heavy - strength) D - how much you want it (Drive) K - reward amount and quality. K for K(C)urrency
Drive
How much you want it
Response strength
HxDxK
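A minimal sketch of the H x D x K product, assuming each variable is expressed as a non-negative number. The function name and the example values are hypothetical and chosen only to show how the multiplication behaves.

```python
def response_strength(habit, drive, incentive):
    """Response strength as the product H x D x K.

    habit     -- H, strength prior to training
    drive     -- D, how much you want it
    incentive -- K, reward amount and quality
    The numeric scales are an assumption; the notes give no units.
    """
    return habit * drive * incentive


# Hypothetical values: a well-learned habit with no drive gives no response.
print(response_strength(0.8, 0.0, 1.0))
# The same habit with some drive and a reward gives a non-zero strength.
print(response_strength(0.5, 2.0, 1.0))
```

Because the terms are multiplied rather than added, a zero on any one of H, D, or K makes the whole product zero.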
The Frustration Hypothesis (PREE)
If a kid keeps whining for candy and you eventually give it to them just to shut them up, then this sort of learned behavior takes a long time to let up.
Limitations to appropriate response
Instinctive drift. Hard to make an animal do something other than its normal behavior.
Instrumental (Thorndike) and operant conditioning (Skinner) differences
Instrumental: Separate trials occur over time in different places. Multiple subjects. Operant: Continuous training, same environment. Single subject. Rat in an operant chamber (Skinner box).
Criticisms of reinforcement
It's manipulative Behaviors should not always require rewards.
Social reinforcement
Keg stand when you shouldn't have Last shot when you shouldn't have. You did it due to social reinforcement.
Errorless discrimination training
Knowing the exact stimulus (red circle, brown, closer and closer and closer). SD - CS+. S Delta - CS-.
Variables affecting acquisition
Larger reward (more $) = increased behavior (Mowing the lawn better) This is a positive contrast. Decrease reward (Less $) = decrease in behavior (doing a crappy job) Negative contrast.
Does learning require reinforcement?
Learning does not require reinforcement You do not have to be aware to be reinforced
What makes an effective punishment? What kind of stimuli works better?
More intense stimuli work better Mild stimuli can cause habituation.
Chaining (appropriate response)
Multiple behaviors. Sit, spin, high five etc. Chicken video: walking around the frame, knocked over things.. Behavior was praised.
Reinforcers as information
No physical reward Directions Pick up lines - even though you might expect a reward, even if you don't, you still learn.
Extinction bursts
Non-reinforced behavior increases initially. The elevator close button does not work at first so you press it another time. A key on a keyboard doesn't work so you press it harder.
Variable ratio
Not constant behavior. After a random number of responses, you get a reward. Slot machines. You keep going because you expect a reward. Hard to extinguish. Rapid response.
Variable interval
Not constant time. After a random amount of time, you get a reward. Like pop quizzes, you must always be on the ball. Checking email, locating prey. Hard to extinguish. Steady response.
Biofeedback
Not originally considered effective. IS NOW EFFECTIVE Operant conditioning on the autonomic nervous system. You are getting feedback on what your body is doing. Monk example with the cold sheets. They were able to use their bodies to heat the sheets to the point of steaming them and drying them completely. Listening to their bodies. Hyper aware of their bodies.
INSTRUMENTAL CONDITIONING - PUNISHMENT. 3 types
Omission Punishment Avoidance The goal is to STOP bad behavior
Appropriate response
One-to-one behavior. Response learning - the manner in which responses and reinforcements occur. How contingent the behavior is on the reinforcer. The action must occur with the reward.
Skinner
Operant conditioning (consequence for a behavior - good work = salary bonus) and operant learning. *Classical is different, that was Pavlov.
Applications of Instrumental conditioning - Punishment/ non-rewards
Pet fence OCD training provide safe exposure to fear Preventative compulsive action showing that there aren't aversive effects. Behavior - consequence. obsessive behavior - consequence is that it is harmful to their life. Use instrumental conditioning to shape their behaviors and provide a better standard of living.
Secondary reinforcement
Petting the dog after punishing it because you feel bad. Eventually they do the bad thing to get the attention afterwards.
Thorndike
Placed cat in a puzzle box. The objective was to get out of the box. Learning was measured by the time it took to escape from the box.
Punishments (positive/ negative)
Positive punishment - the addition of an aversive stimulus Negative punishment - The removal of a pleasant stimulus
Instrumental response
Pressing the lever. See the lever (DS), you press it (IR). Whatever you do to get the reward. Everything to do with behavior.
Primary vs. Secondary reinforcement
Primary - directly reduces a biological need (water) or what you really want (love, happiness). Secondary - rewards that are neutral stimuli which, when paired with a primary reinforcer, acquire the capacity to reinforce on their own (evaluative conditioning - study with the brown pen and the black pen paired with good or bad music). Money is a good example.
Ratio
Proportion of responses.
What makes an effective punishment? Make sure there is no positive reinforcement
Punishing a kid in front of his peers, he may talk back since he is in front of them. This provides him with positive reinforcement when he is supposed to be getting punished.
The law of effect
Responses followed by satisfaction tend to be repeated. Responses followed by dissatisfaction tend to decrease in frequency. Good things make you want to do it again. Bad things don't.
Discriminative stimulus control
The SD tells us when the reward is ready. A tone or a light. S Delta is the stimulus that doesn't mean anything. Conditions a response to occur more often in its presence than in its absence (stimulus control). Discriminating between a tone and a light when the tone is the only one that provides the reward.
Shaping (appropriate response)
Shaping behavior. Teaching a cat to high five
TEST QUESTION - about active/ passive avoidance
She will ask about Escape - active and avoid - active/ passive. This will be WRONG. Active/ Passive only occurs with avoidance!!
Non-reward vs. punishment
A side effect of both is aggression. Non-reward is less associated with behavioral problems.
What makes an effective punishment? Verbal rationale along with punishment
So that the punishment is understood.
Escape
Something negative is already happening, your behavior is to get away from the aversive stimuli.
Discriminative stimulus
A stimulus that can be told apart from other stimuli. A red light vs. a blue light vs. the sound of a tone.
Positive reinforcement
Something that increases behavior. Food reward for pushing a lever.
Reinforcer
Something that will increase the behavior
Problem with Discrimination
Sometimes we want it to generalize but it won't. Kid not yelling in the grocery store does not generalize to the kid not yelling in the library. Or: Teaching a dog to sit at home might not work as well when you're at a dog park.
The Discrimination Hypothesis (PREE)
Subject sometimes cannot discriminate between reward and non-reward trials. If you don't always give your dog a treat, it won't know which time it's going to receive one.
What makes an effective punishment? Some responses are better than others
Tapping Ty on the butt will land him on the couch when he jumps. Tapping his face will slow him down making him fall over and eat shit
Reinforcers as behaviors
The Premack principle (considered the best). A more highly preferred activity can serve as a reinforcer for a less preferred one. Pinball/candy example: kids that loved pinball had to eat the candy first (eating the candy is considered work), and vice versa.
Instrumental conditioning
The association between a behavior and its consequence. If you act cute, you get spoiled. If you pee on the rug, you get scolded. The thing to be learned is not based on a UR.
Avoidance learning/ negative reinforcement
The removal of an aversive stimulus leads to an increase in behavior/ doing something to stop something negative from occurring. Bringing in good grades to avoid getting punished (active) Avoidance can occur actively or passively.
Reinforcers as stimuli
The rewards are the stimuli that you drive for. The reward is a stimulus that you really want. Drive and motivation. Brain Stimulation (people always want more dopamine) STIMULI - STIMULATION However sometimes people do things to increase their drives (XXX movies)
Stimulus-Response
Thorndike advocated this. Association between discriminative stimulus and response. Flash of light and food as opposed to tone and food. The discriminative stimulus matters.
Delay of reinforcement
Time between act you are reinforcing and the reinforcer is important to the strength of learning.
Skinner video (pigeons)
Told to peck/turn (taught to read) in exchange for a reward. Also made to peck the red part of the wall. Birds were unethically kept at 1/3 their body weight. Operant conditioned. On the test it wouldn't be asked whether this is operant or instrumental. Is this classical conditioning or operant response? Operant. Could this be considered a Skinner box? Yes - it is an operant chamber.
Token Economies
Used in mental health institutions and some classrooms. Buy a certain amount of things to get a reward. In class, do a certain number of things correct and you get a gold star etc. EFFECTIVE.
Animal training
Video with Ty the cat, the pig that did tricks and Jesse the dog. Discriminative stimulus
Theory of avoidance Conditioning
Watson-Mowrer theory. 1st part: classical conditioning. 2nd part: instrumental. 1st part: Shock -> Fear; Tone + shock -> Fear. 2nd part: Tone -> Escape. Learn to avoid or approach depending on the classical part.
How do we know acquisition occurred?
We test it. Contingent: Action and outcome at the same time. Non-contingent: Multiple outcomes, not always in line with the action.
Non-rewards
While rewards reinforce behavior, non-rewards are used to stop it.
What makes an effective punishment Not using incompatible responses
Yelling at a crying child won't do shit.
Extinguishing
You can extinguish a behavior by withholding a reward. Only if the behavior was associated with a reward at an earlier time.
Spontaneous recovery
An extinguished behavior can reappear after a break, even though it has not been rewarded again.
Positive/negative reinforcement video
https://www.youtube.com/watch?v=wfraBsz9gX4 Positive reinforcement: The addition of a reinforcer that increases behavior. Negative reinforcement: The removal of an aversive stimulus that will increase behavior.
Resistance to extinction
Higher if the behavior was trained with small rewards, or if rewards weren't presented immediately. Partial Reinforcement Extinction Effect (PREE) - Behaviors that were reinforced only some of the time are harder to extinguish than behaviors that were reinforced every time.
Why do some behaviors persist?
Approach-Avoidance Conflict: Sometimes the reward is worth it. (how big the reward is) If the punisher is pretty far away we might think we can get away with it
How to help learned helplessness?
Assertiveness training Hard and easy examples.
Reinforcers as strengtheners
Association between stimulus and response. Strengthens neural connections. Stimulus and behavior are getting
Aggression in response to punishment
Cat hisses/ dog growls when getting told off
Learned Helplessness
Rats that could/couldn't escape are placed in another location with an exit. Rats that didn't have one before will stay and not try to leave (no learning). Those who had an exit will bail (quick learning). Rats who weren't shocked before and are introduced into the new environment will eventually escape (quick learning).
Four theories of reinforcement
Reinforcers as stimuli Reinforcers as behaviors Reinforcers as strengtheners Reinforcers as information Reinforcers are things that you want.
Operant learning
Response operates on the stimulus environment to produce an outcome. Same thing as instrumental response.
What makes an effective punishment?
Response-contingent The punishment needs to be in response to one specific behavior
Skinner boxes
Otherwise known as operant chambers. For animals only. Rat in the box with the lights and lever.