final - 222
What are the two ways to do DRL?
1. reinforce the behavior only if a certain period of time has passed since its last occurrence (works like a FIXED INTERVAL schedule) 2. set a limit on the allowable number of occurrences in a time period
Theories of Avoidance
1. two-process theory 2. one-process theory
issues w/ Hull's Drive Reduction Theory
1. what about $$? What physical need is money meeting? 2. saccharin is reinforcing, but has no nutritional value 3. copulation w/o ejaculation is reinforcing - doesn't reduce sex drive 4. babies suck on pacifier to bring image into focus - no clear drive reduction
variable interval graph
- NOT scalloped like FI - higher rates of behavior than FI - steady but still pretty low rate of responding
Carl Warden monkey experiment results, implications
- observer monkeys solved the problem in less time when they had a model - even their failed attempts showed they had grasped the nature of the problem
cons: secondary reinforcers
- paired w/ primary reinforcers, so can be extinguished if no longer reliably paired - weaker than primary reinforcers, so not as effective
Theories of the partial reinforcement effect (PRE)
1. discrimination hypothesis 2. frustration hypothesis 3. sequential hypothesis 4. response unit hypothesis
unique variables affecting punishment
1. introductory level of the punisher 2. reinforcement of punished behavior 3. alternative sources of reinforcement
variables affecting observational learning: 1. difficulty of the task
hard tasks = less learning - but observational learning can help lead to success (especially w/ a model) - ex: think of watching someone solve a Rubik's cube
alternatives to punishment: 3b. Differential Reinforcement of Incompatible Behavior (DRI)
non-reinforcement of unwanted behavior (EXTINCTION) + REINFORCEMENT of an INCOMPATIBLE ALTERNATE behavior
What is this an example of (operant conditioning)?: a toddler burned by a hot stove will be less likely to touch the stove again
positive punishment
all of the following are examples of what (+/-) in operant conditioning?: a verbal reprimand, something painful (spanking), speeding ticket
positive punishment
What is this an example of (operant conditioning)? $10 for an A makes it more likely a student will earn more As
positive reinforcement
alternatives to punishment: 1. response prevention
prevent the behavior from occurring by altering the environment
What kind of reinforcer is social contact?
primary
vicarious punishment
process where the observer sees the model punished - WEAKENS likelihood of observer imitating model's behavior
Skinner Box
rat learns to associate pressing the lever/bar with the likelihood of a food pellet appearing - OPERANT CONDITIONING: lever (S) -> pressing lever (R) -> receiving the food as the consequence (C)
progressive schedules
reinforcement occurs after an increasingly bigger ratio/ interval/ duration/ time UNTIL behavior DROPS OFF (break point) - arithmetic progression -> linear increase - geometric progression -> exponential increase
crate training puppies to avoid them peeing outside of crate or chewing up something is an example of what type of alternative punishment?
response prevention
fixed ratio
schedule: behavior is reinforced when it has occurred a fixed number of times - every X responses produce 1 reinforcement - post-reinforcement pause
fixed interval
schedule: reinforcement is provided after a FIXED amount of TIME - behavior is reinforced after every X amount of time
variable interval
schedule: reinforcement is provided after a VARIABLE amount of TIME - on AVERAGE, after X amount of time, behavior is reinforced
variable ratio
schedule: reinforcement is provided after a VARIABLE number of correct responses - on average, every x number of behaviors is reinforced
fixed duration
schedule: reinforcement is provided for continuous performance for a fixed period of time
variable duration
schedule: reinforcement is provided for continuous performance for a varying period of time
fixed time
schedule: reinforcement occurs after a set period of time, regardless of behavior (response-independent)
variable time
schedule: reinforcement occurs after irregular periods of time, regardless of behavior (response-independent)
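The ratio/interval definitions above boil down to simple decision rules, which can be made concrete with a small simulation. This is a sketch for illustration only, not course material: the function names (`fixed_ratio`, `variable_ratio`, `fixed_interval`) and the choice to model a VR n schedule as a uniform draw averaging n are my own assumptions.

```python
import random

random.seed(0)  # make the variable-ratio demo reproducible

def fixed_ratio(n):
    """FR n: every n-th response produces one reinforcement."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True   # reinforcement delivered
        return False
    return respond

def variable_ratio(n):
    """VR n: on AVERAGE every n-th response is reinforced
    (modeled here as a uniform draw from 1..2n-1, mean n)."""
    needed = random.randint(1, 2 * n - 1)
    count = 0
    def respond():
        nonlocal needed, count
        count += 1
        if count >= needed:
            count = 0
            needed = random.randint(1, 2 * n - 1)
            return True
        return False
    return respond

def fixed_interval(t):
    """FI t: the FIRST response made after t time units have
    elapsed since the last reinforcement is reinforced."""
    last = 0
    def respond(now):
        nonlocal last
        if now - last >= t:
            last = now
            return True
        return False
    return respond

# FR3 reinforces exactly every third response:
fr3 = fixed_ratio(3)
outcomes = [fr3() for _ in range(9)]
# outcomes == [False, False, True, False, False, True, False, False, True]
```

Note how the interval schedule takes a clock value while the ratio schedules only count responses: that one design difference is the whole ratio-vs-interval distinction.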
What kind of reinforcer is praise?
secondary
What kind of reinforcer is recognition?
secondary
What process in operant conditioning is seen in this example?: some animals feed their young successively livelier prey
shaping
Pavlovian forces: Pavlok wristband example
shock (US) + rewarding stimulus (cookies) - counterconditioning - stimulus is conditioned to be aversive
variables in operant conditioning: 2. contiguity
shorter ISI = faster learning - seen in clicker training - learning can still occur despite delays; you can decrease the effects of delay if the reinforcer is predictably preceded by a specific stimulus (signaled vs. unsignaled delays)
variables in operant conditioning: 3. reinforcer characteristics
size/strength: - bigger is better, but w/ diminishing returns (the more you increase size, the less additional reinforcement you get) - small, frequent rewards > large, infrequent rewards qualitative differences: - ex: rats like bread & milk better than sunflower seeds
animal intelligence
general thought that animals learned through reasoning
raising hand in class after seeing someone else do it: congratulated for correct answer VS. embarrassed for answering incorrectly.... example of what in observational learning
variable affecting observational learning - consequences of observer's behavior
Which schedule of reinforcement produces the steepest straight line?
variable ratio
a slot machine is an example of what schedule of reinforcement?
variable ratio
In regards to a Skinner Box.... POSITIVE PUNISHMENT press lever -> ___________
weakens behavior - pressing the lever ADDS a shock
In regards to a Skinner Box.... NEGATIVE PUNISHMENT press lever -> ___________
weakens behavior - pressing the lever TAKES AWAY food
displaced aggression
when aggression is directed at someone who is not the punisher or even inanimate objects
problems with punishment: 3. APATHY
when neither escape nor aggression is possible, general suppression of behavior is more likely - if punishment is common, especially likely - "maybe if I just do nothing, I won't end up doing something that is going to get me punished."
KINDS of reinforcers
- NATURAL or CONTRIVED - PRIMARY or SECONDARY
alternatives to punishment
(not using punishment does NOT mean giving into another's every whim) 1. response prevention 2. extinction 3. differential reinforcement (DRA, DRI, DRL)
positive punishment
(punishment by application) - something is ADDED to the environment that you do NOT LIKE - WEAKENS behavior
negative punishment
(punishment by removal) - something is TAKEN AWAY that you DO LIKE - lose a privilege - WEAKENS behavior
MAIN DIFFERENCE: classical vs operant conditioning
** role of behavior ** (often occur together though; ex: Little Albert is classical conditioning because of the association between rat and loud bang; BUT, loud noise could also serve as a punishment - operant conditioning) - Pavlovian forces - operant forces
negative reinforcement allows you to either:
- ESCAPE something you don't like that is already present - AVOID something before it occurs
What's the difference between an FR30 and a chain of 3 FR30s?
- FR30: reward after behavior 30x - chain 3x FR30: reward after all 3 (behavior 90x)
vicarious reinforcement or punishment? - going in the water after watching Moana vs. Open Water / Jaws
- Moana: vicarious reinforcement - Open Water / Jaws: vicarious punishment
Neuromechanics of Reinforcement
- Olds & Milner (1954) - mapping reward pathways - ESB: Electrical Stimulation of Brain - stimulate the "reward pathways" - rich in cells that produce DA (a source of "natural high") - the reward prediction error hypothesis of dopamine (RPEH)
discrimination hypothesis
- PRE hypothesis - it's harder to distinguish (discriminate) between extinction and an intermittent schedule - periods of non-reinforcement similar to extinction, making it harder to tell them apart
response unit hypothesis
- PRE hypothesis - says that the PRE is an illusion - we shouldn't think of intermittent reinforcement as "press-nothing, press-nothing, press-reinforcement" (FR3) - instead, an FR3 is "press-press-press-reward": the 3 presses together are the unit that's rewarded - counted in units of behavior, partially reinforced schedules extinguish just as quickly as continuous schedules, so the PRE disappears
Relative Value Theory & the Premack Principle
- Premack Principle: a high probability behavior reinforces a low probability behavior - high prob: receiving food - low prob: pressing lever
contrived reinforcers
- Reinforcers that have been deliberately arranged to modify a behavior - they are not a typical consequence of the behavior in that setting - behavior does not follow spontaneously
fixed interval graph
- SCALLOPED - at the beginning of the interval, little/no responding - increases to rapid rate of responding before interval expiration
Skinner's air crib
- Skinner built a climate-controlled crib (the "air crib") for his infant daughter - contrary to the popular myth, he did not raise her in a Skinner box
Thorndike's line drawing study
- Thorndike attempted to draw a 4-inch line with eyes closed - no feedback - saw NO IMPROVEMENT over 3,000 trials - but, when eyes were open, improvement - conclusion: practice is only important in that it provides opportunity for reinforcement
problems with punishment: 4. ABUSE
- a large risk, especially w/ physical punishment - people often start w/ an insufficient level of punishment and gradually increase
Thorndike's Law of Effect
- behavior is a function of consequences - behaviors followed by favorable consequences ("satisfying state of affairs") become more likely, and behaviors followed by unfavorable consequences ("annoying state of affairs") become less likely
What do social and asocial learning have in common? Differences?
- both involve observation - differences: social includes a model and imitation; asocial doesn't (ghost)
What compound schedules only reinforce at the last schedule?
- chain schedule - tandem schedule
variable ratio graph
- constant and high rate of responding - resistant to change - has shorter and less frequent post-reinforcement pauses
Thorndike puzzle box results
- decreased time to escape over multiple trials
pros: secondary reinforcers
- don't satiate as quickly (food will stop being rewarding before $$) - easier to reinforce behavior quickly and w/o disruption to behavior (clicker training)
other simple schedules
- duration - time - progressive
two-process theory - dog training example
- escape: (dog) during training, OPERANT CONDITIONING leads to escaping the shock - avoidance: after training, PAVLOVIAN conditioning to the light causes fear; OPERANT conditioning leads to escaping the light (the fearful CS)
problems with punishment: 2. AGGRESSION
- especially likely when escape isn't possible - ex: criticize our critics, lashing out at person, steal work supplies, sabotage, vandalism, assault - aggression is not always directed at punisher; displaced aggression
experiment: tie hanging strings that are far apart w/ chair + pliers
- example of asocial learning - experimenter walks by and "bumps" string, setting them in motion - gives idea to participant: tie pliers to one string, swing to next string, tie it
Bobo the clown - kids tried to get marbles
- example of superstition - Operant Conditioning
sequential hypothesis
- explanation for PRE - continuous reinforcement - reinforcement is a signal for lever pressing - intermittent reinforcement - a period of reinforcement followed by non-reinforcement is a signal for lever pressing (the sequence of these two schedules is a signal to lever press) - rats keep pressing during extinction b/c in the past, long periods of non-reinforced behavior have been reliably followed by reinforcement
frustration hypothesis
- explanation for PRE - non-reinforcement = frustrating - frustration = aversive emotional state - anything decreasing frustration will be reinforcing - in periods of frustration during extinction, lever pressing will increase - frustration has become a signal to keep pressing lever
"insightful" primates
- fashioning a long stick to get a banana - stacking boxes to reach hanging bananas - very little record of learning history; later studies tested the effect of learning history
Skinner operant conditioning
- followed Thorndike - invented the Skinner Box - coined term "Operant Learning" (aka: instrumental learning, response learning, consequence learning, R-S Learning)
operant forces: Pavlok wristband example
- getting shocked is a consequence of eating cookies - eating cookies might decrease IF shock is aversive
Thorndike experimentally studies animal intelligence
- his plan: present animal w/ a problem over and over and see if performance can improve - fluency increases over time! (chicks + maze)
the problem with avoidance
- how can something that never happens affect behavior? - during training, dog gets shocked and THEN moves over quicker & quicker - after training, dog moves over and never experiences the shock - all kinds of things never happen and yet are reinforcing
issues w/ the Two-Process Theory
- if MORE avoidance behavior = LESS fear to the CS (less conditioned suppression).. ... how is Pavlovian conditioned fear to the CS leading to escape during avoidance? - avoidance is remarkably persistent (extinction should occur and avoidance should extinguish.. but it doesn't) - Sidman avoidance procedure: shock every 15 sec unless press lever; still learn avoidance by lever pressing (if you get rid of the CS, there shouldn't be anything to Pavlovian condition, right?)
variables affecting observational learning: 6. consequences of observer's behavior
- if observing gives you an advantage, you observe more - in the end, the observer's own reinforcement/punishment for behaving like the model wins out (it is more powerful than the model's consequences)
variables affecting observational learning: 5. consequences of observed acts
- if outcome of model's behavior is good = increased imitation - if outcome of model's behavior is bad = decreased imitation
variables affecting observational learning: 2. skilled vs. unskilled model
- if watching expert: observer sees exactly what's required - if watching unskilled model: observer can see what works and what doesn't (learn from their mistakes) - Which is better: watching Bill Nye juggle or Honey Booboo?..... depends!
explanations: over-imitation
- imitation evolved before observational learning - beneficial: ensures success; can edit later - imitation is learned (people are rewarded for imitating others) - generalized imitation: reinforce general tendency to imitate
What does strengthening behavior mean?
- increased probability - increased rate (and the other measures of learning) - (behavioral momentum) increased persistence: in extinction, in the face of negative consequences, if more work is required, even if there are alternative behaviors w/ reinforcers
problems w/ punishment
- it is very effective provided that punishment of initial sufficient strength is delivered immediately following behavior and alternate ways of earning reinforcer are provided **problems especially true w/ physical punishment** 1. escape 2. aggression 3. apathy 4. abuse 5. imitation
helplessness
- lack of persistence in the face of failure (varies a lot between people) - operant learning history can explain this - learned helplessness - learned experiences can also help prevent helplessness (growth mindset, the power of yet, praise hard work - not intelligence)
Olds & Milner (1954)
- laid foundation for mapping reward pathways - ESB: Electrical Stimulation of Brain - stimulate the "reward pathways" - rich in cells that produce DA (a source of "natural high")
variables affecting observational learning: 3. characteristics of the model
- learn more from attractive, likable, prestigious (higher status), non-neutral mood models - suggested that certain characteristics have more SALIENCE and attract more attention - ex: recall more from eyepatch condition (saliency)
issues w/ Relative Value Theory & the Premack Principle
- lower probability behaviors can reinforce high probability behaviors - ex: rats deprived of exercise had to drink water to get access to running wheel - has a tough time w/ secondary reinforcers (What behavior does praise give access to?)
What compound schedules are signaled?
- multiple schedule - chain schedule
What compound schedules reinforce throughout?
- multiple schedule - mixed schedule
variables affecting observational learning: 4. characteristics of the observer
- not all species rely on observation (humans get the most, followed by apes) - learning history - previous skills - how difficult a task is for YOU - age (younger = more imitation; older = gets more from observing; "advanced in years" = less benefits) - sex
Carl Warden
- observational learning exists - simple, controlled experiment - two identical compartments, with identical problems (chain pull) - demonstrated that some animals benefit from consequences of model's behavior (monkeys pull chain + open door to get raisins)
Epstein's Pigeon Experiment
- pigeons previously TRAINED to push boxes around solved the puzzle - pigeons NOT trained to push boxes did NOT solve the puzzle - provide evidence that there is no "insightful" problem solving - successful problem solving is the result of previous learning
variables in operant conditioning: 6. other variables
- previous learning - competing contingencies (most behaviors have both rewards & punishments) ---- going to class, taking a nap, buying expensive item you've been wanting
insightful problem solving
- problem: reinforcement is available but the behavior to produce it is not - insight: solution that occurs suddenly w/o learning (w/o any reinforcement) - we know from learning curves that usually behavior needed for the solution occurs gradually - "a-ha!" moment - ex: chain problem, triangle dot problem
Thorndike - observational learning
- puzzle box - chicks, cats, dogs - no matter the number of observations, animals seemed to learn nothing - said observational learning doesn't exist!
two-process theory - rat shock example
- rats learned avoidance of white shock-paired side - researchers closed the exit from the shock-paired side and required that rats learn to run in a wheel to open it - rats learned the behavior even though they never got shocks - evidence that the room itself could affect behavior as much as an actual shock - rats who avoided shock more showed LESS conditioned suppression (less fear to the tone)
unique variables affecting punishment: 1. introductory level of the punisher
- really important to use an effective level from the BEGINNING - using a weak punisher & gradually increasing its intensity actually shapes the behavior to PERSIST - a larger punisher is needed to suppress behavior w/ this gradual increase than one that would have stopped behavior from the start
How is creativity reinforced?
- reinforce any new behavior - ex: Malia the dolphin - Karen Pryor: animal trainer/ scientist
tips for successful shaping
- small steps - immediate reinforcement (high contiguity) - small reinforcers (doesn't distract; little satiation) - reward best approximation - back up when necessary
types of observational learning
- social - asocial
Herbert & Harsh (1944)
- social learning - cats, 5 problems - gave different "doses" of observations: 15 trials vs. 30 trials
Fixed Ratio graph
- steady, high rates of responding (upward line) UNTIL reinforcement - post-reinforcement pause (flat line): time out from responding after each reward - higher ratio, longer pause after each reward
issues w/ Response-Deprivation Theory
- struggles w/ secondary reinforcers (like RVT does) - we don't always have a baseline for rewarding processes
One-Process Theory - shock example
- the reduction in shock itself is reinforcing - no CS required
Hull's Drive Reduction Theory
- the theory of reinforcement that attributes a reinforcer's effectiveness to the reduction of a drive (when our physiological needs are met from a behavior, the drive is reduced) - drive-reduction explains why reinforcer is reinforcing - provides a good explanation for all physiologically-based behaviors (eating, drinking)
unique variables affecting punishment: 2. reinforcement of punished behavior
- the unwanted behavior is reinforced, or it wouldn't happen - the effectiveness of punishment depends on the variables controlling that reinforcement - w/o reinforcement, the behavior would decline on its own - more reinforcement = less effect of punishment (ex: punishing skipping class is less effective on a beautiful day, b/c skipping is extra reinforcing when it's nice out)
unique variables affecting punishment: 3. alternative sources of reinforcement
- the unwanted behavior is reinforced or it wouldn't happen - if there are other ways to earn the same reinforcer, punishment is more effective - points out the importance of providing alternative ways to get the reinforcer - ex: puppy chewing on shoes, replace it w/ a chew toy
"stretching the ratio"
- to create extinction resistant behavior - a type of shaping used to reinforce high behavior rates using few reinforcers - can be done in all schedules - gradually increase the work or wait required - what you are shaping is persistence - stretching a ratio schedule too far too fast causes ratio strain where behavior is disrupted
problems with punishment: 5. imitation
- while most commonly thought of with abused children, imitation isn't seen just in children... and also not just with physical punishment
Explanations - post-reinforcement pause:
1. FATIGUE: "catch our breath"; not supported 2. ESCAPE momentarily the aversiveness of work (procrastination) 3. PAUSE to work for other reinforcers; break from exercise to sleep, eat, study (assumes you can't do other reinforced behaviors at the same time)
Theories of Positive Reinforcement
1. Hull's Drive Reduction Theory 2. Relative Value Theory & The Premack Principle 3. Response-Deprivation Theory
variables in operant conditioning
1. contingency 2. contiguity 3. reinforcer characteristics 4. behavior characteristics 5. motivating operations 6. other variables
variables affecting observational learning
1. difficulty of the task 2. skilled vs. an unskilled model 3. characteristics of the model 4. characteristics of the observer 5. consequences of observed acts 6. consequences of observer's behavior
compound schedules
A complex contingency where two or more schedules of reinforcement are combined 1. multiple schedule 2. mixed schedule 3. chain schedule 4. tandem schedule
Are the following examples of DRA, DRI, or DRL?: - giving a puppy a chew toy so it can't chew on shoes - taking paper notes so you can't check Facebook
Differential Reinforcement of Incompatible Behavior (DRI)
Is the following an example of DRA, DRI, or DRL?: less than 2 hours watching TV in a day = special snack at bedtime
Differential Reinforcement of Low Rate (DRL)
The Dragon Riders of Pelhath have been training to drop Raging Sheep on their enemy by flying over a field and trying to drop one of the durable and violent sheep into a box that would represent an enemy. As they continue to drop the sheep at the same pace, they end up dropping the sheep outside of the box less often. What measure of learning would most likely be used to indicate that the Dragon Riders are learning to do the task? Intensity Speed Errors All of the above
Errors
primary reinforcers
Events that are inherently reinforcing because they satisfy biological needs
T/F: observation always leads to increased behavior
FALSE - depends on the consequences of the model's behavior - vicariously reinforced vs. vicariously punished
T/F: you can determine if something is reinforcing w/o even looking at the effect on behavior
FALSE - some stimuli that are aversive to you can be reinforcing to some people or in some situations - have to look at its effect on behavior
T/F: with operant conditioning, you reinforce the animal/person
FALSE - you reinforce the BEHAVIOR - you don't reinforce your dog by giving them a treat; you reinforce the behavior they are performing when you do that
T/F: practice is only helpful with feedback
FALSE! - feedback is more helpful, but practice can still improve performance
Reinforcement schedule? Rate of behavior? wait to reinforce a behavior until 4 sec. had passed
FI4s - likely slower behavior
Reinforcement schedule? Rate of behavior? for the first round, reward every clap
FR1 - quick learning (new behavior)
Reinforcement schedule? Rate of behavior? reinforced every four claps
FR4 - high rate - post-reinforcement pause
What does "FR3" mean?
Fixed Ratio schedule - every 3 behaviors = 1 reinforcement
Thorndike's puzzle box
In Thorndike's original experiments, food was placed just outside the door of the puzzle box, where the cat could see it. If the cat triggered the appropriate lever, it would open the door and let the cat out. - plan: present animal w/ problem over & over again and see if performance improves - learning measures: error? topography? speed? latency? rate? fluency?
asocial observational learning
Learning by observing environment (w/o model) - ghost condition - problem solving
Suppose there are frogs living in a pink pond. The frogs were blue, pink, purple, or green, but because of predators, only pink frogs remain. Last week a hurricane came through and changed the color of the pond to blue. Will the frogs evolve? No Yes
NO - only pink frogs remain - no genetic variation
event is independent of behavior - depends on another event
Pavlovian Conditioning - an event (US) is contingent on another event (CS)
female chimps learn from watching moms, while boys go do whatever; opportunity to observe differs... this is an example of what in observational learning
SEX is a VARIABLE affecting observational learning
negative reinforcement
STRENGTHENS a response by REDUCING/REMOVING an aversive (disliked) stimulus - anything that INCREASES the likelihood of a behavior by following it with the REMOVAL of an UNDESIRABLE event/state
positive reinforcement
STRENGTHENS a response by presenting a stimulus that you like after a response - anything that increases the likelihood of a behavior by following it with a DESIRABLE event/state - the subject receives something they want (ADDED)
who coined the term "operant learning"?
Skinner (even though he came after Thorndike)
T/F: many operant conditioning situations are a combination of the four types (reinforcement/ punishment)
TRUE - ex: Thorndike's cats - escaping a box (neg. reinforcement) and getting food (positive reinforcement) on pulling lever
T/F a reinforcer can also be a punisher
TRUE - what is reinforcing at one time might be punishing at another time - ex: eating the first cookie and liking it increases the tendency to eat more cookies. However, if you eat 50 cookies, they may no longer be reinforcing.
over-imitation
The tendency of observers to imitate acts by a model that are irrelevant to obtaining reinforcement. - a little like a superstition - elevator + turn backwards + hat - more imitation w/ age
Which variable schedule (VR vs. VI) has higher rate of behavior? (generally)
VR
Reinforcement schedule? Rate of behavior? reinforced behavior a varying number of claps - on average 4
VR4 - likely very high behavior rate
Which would lead to the least variability between participants in the treatment and control group? a. ABA reversal b. Between- subjects design c. Matched sampling d. All would be about equal
a. ABA reversal
In an attempt to make a bunny salivate to a bell, the bell is rung after every carrot that the bunny receives. Is this likely to be effective? Why or why not? a. No, because the US comes before the CS b. Yes, because carrots produce good conditioning in bunnies. c. Yes, as long as the timing is controlled correctly d. No, because the CS comes before the US
a. No, because the US comes before the CS
Which of these could be used to increase the strength of a behavior? a. Reinforcement b. Punishment c. Both are possible
a. Reinforcement
Which of the below would allow you to tell if a neutral stimulus has become a CS? a. Removing the US b. Through Higher Order conditioning c. Isolating the UR d. Delaying the time in between trials
a. Removing the US
After a period of extinction, the researcher waits a long while before presenting a bell, the CS to the dog. The dog once more begins to salivate at the sound of a bell. What is occurring? a. Spontaneous recovery b. Reinstatement c. Renewal d. Blocking
a. Spontaneous recovery
A new neurological connection being created between the conditioned stimulus neurons and the unconditioned stimulus neurons is a key principle of: a. Stimulus Substitution Theory b. Spontaneous Recovery c. Compensatory Response Theory d. Rescorla-Wagner Model e. Preparatory Response Theory
a. Stimulus Substitution Theory
Researchers are studying toddler reactions to stuffed elephants. They place stuffed elephants into the toddlers' cribs only moments before their parents come in, scoop the toddlers up, smile, snuggle, and kiss them, which the toddlers return in kind. Eventually, after enough training, when the researchers put the elephants into the cribs, the toddlers smile, snuggle, and kiss the stuffed elephants. Toddlers act affectionately toward stuffed elephants that predict their parents' loving behavior. Which theory explains this? a. Stimulus Substitution Theory b. Preparatory Response Theory c. Compensatory Response Theory d. Rescorla-Wagner Model
a. Stimulus Substitution Theory
How does the Sidman Avoidance Procedure provide evidence against the Two-process theory? a. There is no CS. b. There is no US. c. Extinction occurs, yet the behavior remains. d. It doesn't- it actually supports it.
a. There is no CS.
Which was evidence AGAINST the Two-process theory. a. There was less conditioned suppression for animals who avoid shock a lot. b. There is no way to deal with secondary reinforcers. c. Once extinction happens, the avoidance behavior ceases. d. All of the above.
a. There was less conditioned suppression for animals who avoid shock a lot.
Which schedule is most likely to lead to superstitious behavior? a. VT b. FD c. VR d. VI
a. VT
experience in the definition of learning refers to __________ a. changes in the environment b. mental states c. a relationship between an event and simple response d. our surroundings
a. changes in the environment
An establishing operation for reinforcement does what to the level of punishment needed? a. increases b. decreases c. nothing
a. increases
Joey likes to listen to rock music when cutting onions. After doing so for awhile, he notices his eyes get teary when listening to rock music. The onion in this situation is a(n) __________. a. unconditioned stimulus b. conditioned stimulus c. unconditioned response d. conditioned response
a. unconditioned stimulus
establishing or abolishing operation? - satiation - drugs - guilt
abolishing
What is it that makes something reinforcing?
always leads to STRENGTHENING behavior
What is it that makes something a punishment?
always leads to a WEAKENING of a behavior
What does VR3 mean?
an average of 3 behaviors produces 1 reinforcement
variables in operant conditioning: 1. contingency
an increase in correlation = strengthening of behavior
intermittent reinforcement
an operant conditioning principle in which only some of the responses made are followed by reinforcement - interval vs. ratio (fixed & variable)
shaping
an operant conditioning procedure in which reinforcers guide behavior toward closer and closer approximations of the desired behavior
Thorndike's studies of learning started as an attempt to understand _______
animal intelligence - he didn't trust the existing data because it was anecdotal
punishment
any consequence that DECREASES the likelihood of the behavior that follows it - operant conditioning
reinforcement
any consequence that INCREASES the likelihood of the behavior that follows it - operant conditioning
variables in operant conditioning: 5. motivating operations
anything that changes the effectiveness of a consequence - establishing vs. abolishing operations
Juan is always pulling his sister's hair; his mom punishes him, but only about 1/4 of the time. Which characteristic is causing learning to happen slowly? a. Contiguity b. Contingency c. The behavior is not innate d. The reinforcer is too strong.
b. Contingency
Which of the following correctly orders these four sources of data from most reliable to least reliable? a. Descriptive studies, experimental studies, case studies, anecdotes b. Experimental studies, descriptive studies, case studies, anecdotes c. Experimental studies, case studies, descriptive studies, anecdotes d. Anecdotes, case studies, descriptive studies, experimental studies
b. Experimental studies, descriptive studies, case studies, anecdotes
The formula for the Rescorla-Wagner Model is ΔVn = c(λ - Vn-1). What does ΔVn measure? a. The maximum amount of learning. b. How much learning happens on a given trial c. Salience of stimulus d. Number of trials
b. How much learning happens on a given trial
The formula for the Rescorla-Wagner Model is ΔVn = c(λ - Vn-1). According to this equation, which of the following variables would most greatly affect learning? a. Velocity b. Salience of the stimuli c. Intelligence of the participant d. Higher-order conditioning
b. Salience of the stimuli
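The two Rescorla-Wagner cards above can be made concrete with a small sketch. The function name and parameter values here are mine, not the course's; only the update rule ΔVn = c(λ - Vn-1) comes from the cards.

```python
# Rescorla-Wagner update from the cards above:
#   dV_n = c * (lambda - V_{n-1})
# c      = salience of the stimuli (acts as a learning rate, 0-1)
# lambda = maximum associative strength the US can support
# V      = associative strength of the CS going into the trial

def rescorla_wagner(c, lam, trials):
    """Return associative strength V after each trial, starting from V = 0."""
    v = 0.0
    history = []
    for _ in range(trials):
        dv = c * (lam - v)  # dV_n: how much learning happens on this trial
        v += dv             # learning per trial shrinks as V approaches lambda
        history.append(v)
    return history

salient = rescorla_wagner(c=0.5, lam=1.0, trials=10)  # easy-to-notice CS
faint = rescorla_wagner(c=0.1, lam=1.0, trials=10)    # hard-to-notice CS
```

Running both shows the higher-salience CS approaching λ much faster, which is why salience is the answer to the second card.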
What simple learning form(s) is pseudoconditioning a result of? a. Habituation b. Sensitization c. Reflex d. It could be any of the above
b. Sensitization
In one study, a researcher turned off the lights and then rang a bell, over and over again. The bell was then paired with food from outside the room, which was repeated over and over. After a while the subject started to salivate as soon as the lights turned off. Which concept does this demonstrate? a. Blocking b. Sensory Pre-Conditioning c. Latent Inhibition d. Higher Order Conditioning
b. Sensory Pre-Conditioning
A student is afraid of heights. A therapist has the individual create a list of frightening scenes related to heights, in order from least to most fear inducing. As a team, they work through the list, moving to each new situation only after the previous one no longer induces fear in the individual. What form of exposure therapy is being used? a. Aversion Therapy b. Systematic Desensitization c. Virtual Reality Exposure Therapy (VRET) d. Behavioral Activation
b. Systematic Desensitization
What is the main difference between the sequential and frustration hypotheses for the PRE? a. One involves a cue to tell you to keep behaving and the other does not. b. The cue to keep behaving is external in one and internal in the other. c. One is more closely related to the one-process theory while the other fits with the two-process theory better. d. Nothing- they are 2 names for the same theory.
b. The cue to keep behaving is external in one and internal in the other.
multiple schedule
compound schedule of reinforcement consisting of 2 or more simple schedules where the particular schedule in effect is SIGNALED - ex: MULT FI10" VR10
In an experiment there is a bell that is rung, a one second pause, then a small electric shock delivered to a rat. Eventually after a few trials of the bell being rung, a one second pause, and a shock, the rat begins flinching at the sound of the bell. What kind of conditioning is this? a. Delayed b. Trace c. Forward d. Backward
b. Trace
Which would be harder to extinguish? a. a behavior trained on an FR50 schedule b. a behavior trained on a VR100 schedule c. both equally difficult
b. a behavior trained on a VR100 schedule
punishment is _______ a. ineffective b. a fast way to change behavior c. a great way to increase behavior d. effective at all intensities
b. a fast way to change behavior
Which is true of a mixed schedule? a. it's signaled b. it's unsignaled c. you only give the reinforcer after all the schedules have been completed d. more than one above
b. it's unsignaled
Which of the following is primary reinforcer? a. giving gold stars to someone b. keeping someone warm c. telling someone "that's the way" d. giving a good grade to someone
b. keeping someone warm
Sally was pushing the buttons of a video game to earn gold pieces when she heard the bicycle-bell belonging to her cruel older brother, signaling that he was home. She will likely push the buttons ___________ due to ____________ a. more rapidly; conditioned suppression b. less rapidly; conditioned suppression c. more rapidly; instrumental transfer d. less rapidly; instrumental transfer
b. less rapidly; conditioned suppression
What's the most effective type of chaining? forward vs. backward?
backward! (4, 4+3, 4+3+2, 4+3+2+1)
Response Deprivation Theory
behavior becomes reinforcing when animal is prevented from doing it as often as they normally do (every behavior has a baseline rate; being below this baseline makes behavior reinforcing) - extension of the Premack Principle - now explains WHY, if you deprive a rat of exercise, running a wheel is reinforcing (emphasis is not on relative frequencies of behaviors to each other; instead, relative frequency of behavior to its baseline)
Skinner's Operant Conditioning
behavior increases or decreases depending on the consequences of that behavior - behavior is reinforced = strengthened - behavior is punished = weakened
superstitious behavior
behavior that occurs repeatedly despite the fact that it does not produce the reinforcers that maintain it
establishing vs. abolishing operations
both motivating operations - a variable in operant conditioning - establishing: increases effectiveness of reinforcement - abolishing: decreases effectiveness of reinforcement
Did Thorndike's puzzle box use positive or negative reinforcement?
both positive and negative reinforcement - pressing the lever is strengthened because it removes an aversive situation, confinement in the box (NEGATIVE), and is then rewarded w/ food (POSITIVE)
In a fixed ratio 3 schedule (FR3), how many times would a rat get a reinforcer with 12 lever presses? a. 6 b. 5 c. 4 d. 3
c. 4
What is an example of a FR schedule, specifically? a. Hunting where successfully killing a deer is the reinforcement. b. Playing slot machines in a casino where winning is reinforcement. c. An orchard worker getting paid per basket of apples. d. Getting paid commission for selling cars. The behavior measured is attempting to sell a car. Selling a car and getting commission is reinforcement.
c. An orchard worker getting paid per basket of apples.
If you use a progressive schedule to increase the work or waiting required to get reinforcement, the point at which the work/delay is too much and behavior decreases sharply or stops is called the ___________. a. Ceiling b. Point of exhaustion c. Break point d. Wall
c. Break point
Students drank beer in a controlled experimental room with certain kinds of decorations - blue walls, bright yellow pictures. Researchers recorded how much they drank over several sessions spaced a few days apart. During a final test session, the students drank beer in a new room - red with purple pictures. Students were much more intoxicated from fewer beers than they had drunk previously. This would best support which theory of Pavlovian Conditioning? a. Stimulus Substitution Theory b. Preparatory Response Theory c. Compensatory Response Theory d. Rescorla-Wagner Model
c. Compensatory Response Theory
After witnessing many great wins in the Dean Dome, Hridann was conditioned to feel a rush of excitement when the Tar Heels run out of the tunnel at the beginning of the game. After graduating, he moved to Winston-Salem and went to Carolina games at the Joel Coliseum. The Heels went through a bad slump, and he went through the process of extinction. He's now visiting a friend in Chapel Hill and decides to go to a game. Which is likely to happen when the Heels rush out of the tunnel and why? a. He will not feel a rush of excitement because extinction has occurred. b. He will feel a rush of excitement because of Spontaneous Recovery c. He will feel a rush of excitement because of the Renewal Effect d. He will feel a rush of excitement because of the Reinstatement Effect
c. He will feel a rush of excitement because of the Renewal Effect (context)
The birds are flying south for the winter. This is an example of what type of behavior? a. Reflex b. General behavior trait c. Modal action pattern
c. Modal action pattern
Mixed schedule
compound schedule of reinforcement consisting of 2 or more simple schedules where the particular schedule in effect is UNSIGNALED - ex: MIX FI10" VR10
What is the main difference between Relative Value Theory and the Response-Deprivation Theory a. RVT stresses the role of Pavlovian Conditioning and RDT does not. b. RDT stresses the role of Pavlovian Conditioning and RVT does not. c. RVT stresses the comparison between relative frequencies of 2 behaviors and RDT stresses the comparison between frequencies of a behavior and a baseline. d. RDT stresses the comparison between relative frequencies of 2 behaviors and RVT stresses the comparison between frequencies of a behavior and a baseline.
c. RVT stresses the comparison between relative frequencies of 2 behaviors and RDT stresses the comparison between frequencies of a behavior and a baseline.
Which memory activity would Clive Wearing have the most trouble completing? a. Playing the piano b. Given five words to remember and told to repeat those words back c. Remembering what happened on his birthday the year after his brain injury d. Remembering what his wife looks like
c. Remembering what happened on his birthday the year after his brain injury
Which is more effective in advertising if you want to create a strong preference for a particular brand? (Positive valence = makes you feel good) a. Showing a stimulus with positive valence and then the brand name. b. Showing a stimulus with positive valence and a brand name at the exact same time. c. Showing the brand name and then a stimulus with positive valence. d. Showing a brand name without any stimuli with positive valence.
c. Showing the brand name and then a stimulus with positive valence.
Which of the following is an example of negative reinforcement? a. Hillary cries after losing the card game. b. Leo hits his little brother because the brother broke Leo's bike. c. Stella wears her duck boots on a rainy day so she doesn't have to walk around with wet feet all day. d. Jaxon gets a gold star because he didn't act out.
c. Stella wears her duck boots on a raining day so she doesn't have to walk around with wet feet all day.
Consider a group of rats in which the conditioned response is to freeze. However, their unconditioned response to the unconditioned stimulus is to jump away from the source of the shock. For which theory of Pavlovian conditioning is this a problem? a. The Rescorla-Wagner Model b. The Preparatory Response Theory c. The Stimulus Substitution Theory d. All of the above
c. The Stimulus Substitution Theory
When Julio ate a hamburger, he also came down with the flu and was really nauseous. Now Julio can't eat a hamburger or even see a burger restaurant without feeling sick. Which statement is true about the Pavlovian components of Julio's learning? a. Nausea is a CS b. A hamburger is a CR c. The flu is a US d. The burger restaurant is a US
c. The flu is a US
Which of the following is FALSE? a. Some stimuli are inherently more likely to become conditioned stimuli than others. b. A shorter ITI, in general, leads to less conditioning compared to a longer ITI. c. With overshadowing, a previously established CS interferes with learning for a new CS in compound stimuli. d. All of the above are true.
c. With overshadowing, a previously established CS interferes with learning for a new CS in compound stimuli. ^^ this describes blocking, not overshadowing
Once a stimulus is conditioned, if it is then repeatedly presented alone, __________ will occur. a. discrimination b. acquisition c. extinction d. blocking e. latent inhibition
c. extinction
Which of the following is an example of a reinforcer with good contingency but weak contiguity? a. playing a slot machine b. sending sweepstakes coupons to the clearinghouse to try to win $1M c. mailing three cereal box tops to receive a plastic toy d. being burned by a hot stove
c. mailing three cereal box tops to receive a plastic toy
What is different about DRL vs. DRA and DRI? a. it takes longer to implement b. you need more reinforcers c. you aren't trying to get rid of a behavior completely d. it only works in humans
c. you aren't trying to get rid of a behavior completely
training complex behaviors
chain multiple behaviors together
two-process theory
classical and operant conditioning can interact to establish new behaviors
What kind of reinforcer is a "finger tap"?
contrived, secondary
What kind of reinforcer is clicker for dog training?
contrived, secondary
What kind of reinforcer is money?
contrived, secondary
How is taste aversion different from other forms of applied Pavlovian conditioning? a. The pairing only has to happen once b. The inter-stimulus interval can often be longer than usual c. The ITI is much shorter d. A & B e. All of the above
d. A & B
What is the defining difference between a CS and US? a. A response to a CS is the result of genetics. b. A response to the US is a result of learning. c. A response to the CS requires an active action on the part of the person/animal. d. A response to the US is innate.
d. A response to the US is innate.
What is the best way to maintain high contingency in your classical conditioning paradigm? a. Pair the conditioned stimulus with the unconditioned stimulus 25% of the time. b. Pair the conditioned stimulus with the unconditioned stimulus 50% of the time c. Never present the conditioned stimulus with the unconditioned stimulus d. Always pair the conditioned stimulus with the unconditioned stimulus.
d. Always pair the conditioned stimulus with the unconditioned stimulus.
Which schedule produces stereotypical "scalloping" of responses in the cumulative record data? a. VR b. VI c. FR d. FI
d. FI
Little Albert became scared of all white fluffy things after being conditioned to fear the white bunny. This fear of all white fluffy things is an example of what? a. Desensitization b. Overextension c. Implicit Bias d. Generalization e. reward conditioning
d. Generalization
The neighbor's dog used to bark loudly throughout the day. After the dog whisperer came for a visit, the dog now barks more quietly. This is an example of what measure of learning? a. Topography b. Errors c. Speed d. Intensity
d. Intensity
Who would be more likely to develop a conditioned eye blink to a buzzer? a. Fido: a shy, unexcitable dog b. Janet: an 85-year-old woman c. Kyle: a chronically stressed male d. Kaitlyn: an acutely stressed 12-year-old
d. Kaitlyn: an acutely stressed 12 years old
After Thanksgiving dinner, you try to reward your cousin for good behavior with some turkey. Is this likely to be reinforcing and why? a. Yes, food is a primary reinforcer. b. Yes, since he just had turkey, he's familiar with it, so it should be more reinforcing. c. No, he's probably asleep now. d. No, Satiation has likely occurred.
d. No, Satiation has likely occurred.
In a _____ schedule, reinforcement is contingent on number of behaviors whereas in a _____ schedule, reinforcement is contingent on behavior after an amount of time that has passed. a. Fixed, interval b. Interval, ratio c. Variable, fixed d. Ratio, interval
d. Ratio, interval
When researchers measured how fast rats would learn to freeze to a footshock-paired tone, they found that the higher-pitched the tone, the faster the rats learned to freeze to it. They think this is because the rats notice the higher-pitched tone more. With which theory does this best fit? a. Stimulus Substitution Theory b. Preparatory Response Theory c. Compensatory Response Theory d. Rescorla-Wagner Model
d. Rescorla-Wagner Model
Hannah wants to classically condition her brother to jerk his leg when he hears his favorite song. She does this by playing the song for him (CS) and hitting his knee (US) when the chorus comes on so that his leg jerks (UR). She pairs the US and CS together consistently for many trials. She tests it by playing the song alone. No reaction- why? a. She used backward conditioning which is an ineffective method of conditioning. b. Poor Contiguity c. The knee jerk response has become extinct. d. The failure in conditioning is a result of latent inhibition.
d. The failure in conditioning is a result of latent inhibition.
In general, the ______ the CS-US interval, the faster the rate of learning; the longer the inter-trial interval, the ____ the rate of learning. a. shorter, slower b. longer, slower c. longer, faster d. shorter, faster
d. shorter, faster
Neuromechanics of Reinforcement; RPEH What's the change in dopamine?: worse than expected = decrease behavior
decrease in dopamine
schedules of reinforcement
different patterns of frequency and timing of reinforcement following desired behavior; specific schedules produce specific patterns of behavior - continuous vs. intermittent - Fixed interval, variable interval - fixed ratio, variable ratio
Is the following an example of DRA, DRI, or DRL?: giving child praise and attention when drawing on paper instead of the wall
differential reinforcement of alternative behavior (DRA)
The fact that a behavior that gets us food increases when we are hungry supports which theory(ies)? a. Hull's Drive Reduction Theory b. Relative Value Theory c. Response Deprivation Theory d. A & B e. A & C f. All of the above
e. A & C
Which of the following can be impacted by natural selection? a. Reflexes b. MAPs c. General behavior traits d. A & B e. All of the above
e. All of the above
Martha got food poisoning from bad lettuce in a salad she had eaten earlier, which made her throw up. A few days later her mother fixes salads for lunch and Martha immediately feels nausea when she sees them. Which of the following is NOT paired correctly? a. US - bad lettuce b. CS - salad c. UR - vomiting d. CR - nausea e. All of the above are paired correctly
e. All of the above are paired correctly
In terms of Pavlovian conditioning, which statement(s) is true regarding drug addiction a. The drug is the US b. The drug high is the CR c. The physiological preparation for the drug is the CR d. Items, places, and people associated with the drug are the CS e. More than 1 above
e. More than 1 above a. The drug is the US c. The physiological preparation for the drug is the CR d. Items, places, and people associated with the drug are the CS
problems with punishment: 1. ESCAPE
escape can be attempted through: avoiding the punisher, skipping school, cheating, lying, suicide (extreme), etc. - a common side effect of frequent punishment: the punished individual gets really good at doing some of these things
establishing or abolishing operation? - deprivation - pain - fear - wanting to return to homeostasis - having to wait a long time for it - having to work hard for it
establishing
Would food deprivation be an establishing or abolishing operation?
establishing! - if an animal is deprived of food, food becomes much more reinforcing
natural reinforcers
events that follow spontaneously from a behavior
continuous reinforcement
every behavior is reinforced - required for some behaviors (e.g., shopping) - good for shaping or training new behaviors - relatively rare in nature
variables in operant conditioning: 4. behavior characteristics
evolved tendencies - teaching a dog to bark is easier than teaching it NOT to bark (barking is natural) - evolved tendencies don't usually include pressing levers or doing tricks, so those are harder to teach than, say, a pigeon pecking for food
learned helplessness
exposure to inescapable aversive at one time leads to inaction in future situations
sleep training (letting the baby cry itself to sleep) and turning your back on a jumping puppy are examples of what alternative to punishment?
extinction
T/F: natural selection helps the individual adapt to changes in its environment
false (helps the species, not the individual)
What schedule of reinforcement would this be... - a restaurant has daily specials - checking the mail at the same time every day
fixed interval
Which schedule of reinforcement produces a scalloped line?
fixed interval
Which reinforcement schedule includes the post-reinforcement pause?
fixed ratio
alternatives to punishment: 2. extinction
identify all the reinforcers maintaining the behaviors and remove them but.... - hard to do outside of lab - likely many reinforcers for a behavior - we don't have control over everything - attention itself can be reinforcing
Neuromechanics of Reinforcement; RPEH What's the change in dopamine?: better than expected = increase behavior
increase in dopamine
the partial reinforcement effect (PRE)
intermittent schedules tend to be more resistant to extinction ** the "thinner" the schedule, the more responses during extinction
secondary reinforcers
learned reinforcers that develop their reinforcing properties because of their association with primary reinforcers
social observational learning
learning by observing another individual (model) - vicarious reinforcement - vicarious punishment
observational learning
learning by observing others - also called social learning, vicarious learning
What kind of reinforcer is food?
natural, primary
What is this an example of (operant conditioning)?: a boy who loses his TV privileges for pulling his sister's hair will be less likely to pull her hair again
negative punishment
What is this an example of (operant conditioning)? taking aspirin relieves headaches and makes it more likely that aspirin will be taken in the future
negative reinforcement
creativity
new, novel, unique - can be reinforced - promising a reward for a behavior can have detrimental effects on creativity
Neuromechanics of Reinforcement; RPEH What's the change in dopamine?: as you expected = no behavioral change needed
no change in dopamine
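The three dopamine cards (RPEH = reward prediction error hypothesis) boil down to one subtraction: dopamine tracks actual minus expected reward. This sketch is my own framing of that idea, not the course's notation:

```python
def reward_prediction_error(actual, expected):
    """RPE: positive -> dopamine (and behavior) up;
    negative -> dopamine (and behavior) down; zero -> no change."""
    return actual - expected

# The three cards, as the sign of the prediction error:
assert reward_prediction_error(1.0, 0.5) > 0   # better than expected: dopamine increase
assert reward_prediction_error(0.0, 0.5) < 0   # worse than expected: dopamine decrease
assert reward_prediction_error(0.5, 0.5) == 0  # as expected: no change
```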
alternatives to punishment: 3a. differential reinforcement of alternative behavior (DRA)
non-reinforcement of unwanted behavior (EXTINCTION) + REINFORCEMENT of a specific ALTERNATE behavior - can be for the same reinforcer (or different reinforcer)
Herbert & Harsh cat experiment results
observational learning in cats - more observing, more learning
vicarious reinforcement
observing someone else receive a reward for behavior - happens if the observer tends to act like the model more (STRENGTHENED)
What's the prevailing theory of escape/avoidance?
one-process theory
environmental event depends on behavior
operant conditioning (you must do something to get food from the vending machine)
schedule effects
patterns of behavior are determined by the schedules of reinforcement
ghost condition
something is caused to happen, but there is no observable person/model doing the action - asocial learning - children learned just as well, if not better, when they saw the mat move on its own vs. when they observed the model moving it
In regards to a Skinner Box.... NEGATIVE REINFORCEMENT press lever -> ___________
strengthens behavior (escaping box)
In regards to a Skinner Box.... POSITIVE REINFORCEMENT press lever -> __________
strengthens behavior (gets food)
What happens when a behavior is reinforced/punished just by CHANCE?
superstition
chaining
teaching a behavior chain, or a connected series of behaviors - if a link doesn't occur, shaping is used first - each link is reinforced by the next
alternatives to punishment: 3c. Differential Reinforcement of Low Rate (DRL)
undesirable rate of behavior is on extinction + reinforcement of behavior only at low rates - two ways to do DRL
PRE paradox
unreinforced responses (which we have more of on a thin schedule) should weaken rather than strengthen behavior - yet thinner schedules are MORE resistant to extinction
One-Process Theory
the view that avoidance and punishment involve only one procedure - operant learning - most evidence supports this theory (for avoidance)
imitation
to behave in a way that resembles the behavior of a model - tend to imitate behaviors that are reinforcing for the model - failing to imitate could prevent rewarding consequences - humans seem to be compulsive about imitation
T/F: Almost any stimulus can become conditioned if it is regularly followed by the US.
true
T/F: learning always involves the acquisition of new behaviors
false (learning is a change in behavior due to experience - behaviors can also weaken or decrease, as in habituation)
chain schedule
two or more simple schedules where only the last schedule in the chain is reinforced; particular schedule in effect is SIGNALED
tandem schedule
two or more simple schedules where only the last schedule in the chain is reinforced; the particular schedule in effect is UNSIGNALED