Operant Conditioning
What consequences are associated with selective effect of reinforcement?
- A reinforcing consequence is theorized to not only strengthen behaviour but to select behaviour - Environment operates to select behaviours from the repertoire of behaviours that an organism emits in a manner similar to how natural selection operates to select genotypes that increase reproductive fitness
How do we know what is satisfying or discomforting for an animal according to Thorndike?
- By a satisfying state of affairs is meant one which the animal does nothing to avoid, often doing such things as attain and preserve it - By a discomforting state of affairs is meant one which the animal commonly avoids or abandons
What is instinctive drift?
- Cue-consequence specificity effect (Garcia effect) in classical conditioning showed that animals come biologically prepared to associate certain types of stimuli with other types of stimuli - Instinctive drift is the equivalent phenomenon in operant conditioning: a biological constraint that limits which behaviours can be operantly conditioned
What research did Guthrie and Horton do?
- Elaborated on Thorndike's work - Placed a cat in a puzzle box with a pole in the center - Pole could be tipped in any direction, in any way, to open the door - Camera took a photo at the instant the door opened to capture the behaviour of the cat at that moment - Each cat came to settle on a particular behaviour for opening the door from trial to trial; during earlier trials, a number of behaviours occurred; however, as trials progressed, a single behaviour dominated (decrease in variability) - Behaviour that came to dominate varied from cat to cat
What are some biological constraints on operant conditioning?
- Hereditary factors compete with and can overshadow reinforcement contingencies as determinants of behaviour - Phylogenetic influences and ontogenetic influences on behaviour can operate simultaneously - Reinforcement is not the sole determinant of a creature's behaviour
How does punishment affect the basic paradigm?
- If the effect of the consequence is to decrease the future probability of occurrence of the response in the presence of the discriminative stimulus, then the contingency is one of punishment - Punishment always decreases the behaviour - If the consequence involved the removal of something, the contingency is negative punishment - If the consequence involved the addition of something, the contingency is positive punishment
How does reinforcement affect the basic paradigm?
- If the effect of the consequence is to increase the future probability of occurrence of the response in the presence of the discriminative stimulus, then the contingency is one of reinforcement - Reinforcement always increases the behaviour - If the consequence involved the removal of something, the contingency is negative reinforcement - If the consequence involved the addition of something, the contingency is positive reinforcement
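The answers on punishment and reinforcement combine into a two-by-two classification: the effect on behaviour (increase or decrease) gives the type, and the kind of consequence (added or removed) gives the sign. A minimal Python sketch of that lookup follows; the function name and the string labels are illustrative assumptions, not terms from these notes.

```python
def classify_contingency(behaviour_change: str, stimulus_change: str) -> str:
    """Name a contingency from its effect on behaviour and the kind of consequence.

    behaviour_change: "increase" or "decrease" in the future probability of the response.
    stimulus_change: "added" or "removed", describing what the consequence did.
    """
    effects = {"increase": "reinforcement", "decrease": "punishment"}
    signs = {"added": "positive", "removed": "negative"}
    if behaviour_change not in effects or stimulus_change not in signs:
        raise ValueError("expected increase/decrease and added/removed")
    return f"{signs[stimulus_change]} {effects[behaviour_change]}"


print(classify_contingency("increase", "removed"))  # negative reinforcement
print(classify_contingency("decrease", "added"))    # positive punishment
```

Note that the type is read off the effect on behaviour alone, so the same stimulus can land in different cells depending on the contingency.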
What did Edward Thorndike research?
- Investigated systematically how an animal's non-reflexive behaviours can be modified as a result of experience - Used puzzle boxes to monitor animal's behaviour and how they attempted to get out - Concluded that the animal's first production of the appropriate response occurred purely by accident - Measured escape latency, or the amount of time it took the animal to get out of the box each time
How did Hays and Woodbury measure selective effect of reinforcement?
- Measured the distribution of lever press response forces when the force required to produce reinforcement was greater than or equal to 21g and when it was changed to greater than 36g - Change in the distribution reflects a selection of increasingly efficient response forces
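As a rough illustration of the two force criteria, the Python sketch below checks which presses would produce reinforcement under each one. The 21g and 36g thresholds come from the notes; the function name and the sample forces are hypothetical values invented for illustration, not data from the study.

```python
def meets_criterion(force_g: float, threshold_g: float, inclusive: bool) -> bool:
    """Return True if a lever press of force_g grams would produce reinforcement."""
    return force_g >= threshold_g if inclusive else force_g > threshold_g


# Hypothetical press forces in grams, invented for illustration only.
sample_forces = [15, 21, 30, 36, 40]

# First criterion: force greater than or equal to 21g.
first_phase = [f for f in sample_forces if meets_criterion(f, 21, inclusive=True)]
# Second criterion: force strictly greater than 36g.
second_phase = [f for f in sample_forces if meets_criterion(f, 36, inclusive=False)]

print(first_phase)   # [21, 30, 36, 40]
print(second_phase)  # [40]
```

Presses that were reinforced under the first criterion but not the second are the ones the changed contingency stops selecting.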
How does selective effect of reinforcement compare with natural selection?
- Natural selection favours changes that increase reproductive fitness of the species over the long run because changes in gene frequency within the population occur following reproduction - Works through genetic mechanisms - Selection by reinforcement describes the immediate effect of the contemporary environment on the behaviour of a single individual to increase the efficiency of their behaviour in that environment - Works through neural mechanisms underlying the biological basis of reinforcement that mediates the selective effects of the contemporary environment
What is Thorndike's Law of Effect?
- Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation so that when it recurs they will be more likely to recur - Connections can also be weakened if the responses are followed by discomfort - The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond
How did Page and Neuringer study stereotypy and variability?
- Placed pigeons in a chamber with 2 response keys and a food hopper - Contingency required pigeons to peck the response keys 8 times to obtain food; pecks could be distributed between keys in any manner (all left or right or a mix) - In order for reinforcement to occur, the sequence of 8 pecks had to be different than the sequences over the previous 50 trials - Results showed that the pigeons generated greater variability of sequences when the contingency was in effect - When the contingency of reinforcement was not selecting for variability, variable sequences decreased
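The variability contingency described here can be sketched as a sliding window over recent trials. This is a minimal Python sketch under stated assumptions: the class name, the 'L'/'R' string encoding, and the choice to store every trial (reinforced or not) in the comparison window are illustrative and not taken from the notes.

```python
from collections import deque


class LagVariabilityContingency:
    """Reinforce a peck sequence only if it differs from the last `lag` sequences."""

    def __init__(self, lag: int = 50, sequence_length: int = 8):
        self.sequence_length = sequence_length
        # A deque with maxlen discards the oldest trial once `lag` trials are stored.
        self.history = deque(maxlen=lag)

    def trial(self, pecks: str) -> bool:
        """pecks is a string of 'L'/'R' characters; returns True if reinforced."""
        if len(pecks) != self.sequence_length or set(pecks) - {"L", "R"}:
            raise ValueError(
                f"expected {self.sequence_length} pecks, each 'L' or 'R'"
            )
        reinforced = pecks not in self.history
        # Assumption: every trial, reinforced or not, enters the comparison window.
        self.history.append(pecks)
        return reinforced


schedule = LagVariabilityContingency()
print(schedule.trial("LLLLLLLL"))  # True: nothing to compare against yet
print(schedule.trial("LLLLLLLL"))  # False: repeats the previous trial
print(schedule.trial("LRLRLRLR"))  # True: a new sequence
```

With 8 pecks over 2 keys there are 256 possible sequences, so a 50-trial window can always be satisfied, but only by varying the sequence from trial to trial.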
What is the stop-action principle as found by Guthrie and Horton?
- Reinforcer stops the animal's ongoing behaviour and strengthens the association between the situation and the ongoing behaviour at the moment the reinforcer occurs - Because of the strengthening process, the specific bodily position and the muscle movements occurring at the moment of reinforcement will have a higher probability of occurring on the next trial
How did Stein, Xue and Belluzzi demonstrate reinforcement effects with neurons?
- Reinforcer was an injection of a small amount of dopamine onto the surface of a neuron in the hippocampus - Response required by the contingency was a burst of action potentials - Used a recording electrode that feeds into an amplifier and oscilloscope connected to a computer that controls the micropipette injector - Baseline group recorded the number of bursts before reinforcement - Reinforcement group was presented a small dose of dopamine upon a burst (0.1, 0.5, 1 and 2 mM) - Non-contingent group received injections of dopamine that were given at regular intervals independent of bursts - Non-contingent group showed a significantly lower rate of bursts across all 4 doses
How did Breland & Breland explain instinctive drift?
- Reported instances where species-specific behaviour interfered with operant conditioning - Attempted to train a raccoon to pick up coins and deposit them in a piggy bank - Raccoon would pick up coins, dip them into a box, take them out and rub them together for several minutes and not drop them in the box - Rubbing got worse with training - Because coins were presented just before food, the coins released behaviours that are part of the raccoon's food gathering repertoire - Raccoons dip a piece of food into a stream before eating it and the rubbing motions are similar to those used to remove the shell from a crustacean - Species-specific behaviour intruded into a context where it was inappropriate and actually postponed food reinforcement
How do stereotypy and variability differ?
- Selective and strengthening effect of reinforcement suggests that reinforcers produce a uniformity or stereotypy of behaviour - Reduces variability in behaviour as the individual repeatedly performs the same response - Contingency of reinforcement can be altered to select and strengthen variability in behaviour
What is superstitious behaviour and how does it relate to the stop-action principle?
- Stop-action principle suggests a possible explanation for superstitious behaviour - Whatever behaviour is occurring when reinforcement occurs will be strengthened regardless of whether the reinforcer depends on the occurrence of the behaviour
What is an example of superstitious behaviour done by Matute?
- Students were exposed to unpleasant loud tones - Told that tones could be turned off by typing the correct sequence of keys on a keyboard - No sequence of keystrokes terminated the tones - Tones went on and off independently of the typing - If a student was typing when a tone terminated, the student was likely to repeat the sequence of keystrokes that she/he made in the moments before the tone terminated - Most of the students developed stereotypic sequences of keystrokes that they believed turned off the tones
How did Deich, Allan and Zeigler measure selective effect of reinforcement?
- Used shaping to modify the gape size of a pigeon (gape = opening of beak when pigeon pecks key) - Altered the contingency of reinforcement - Reinforcement only occurred when the gape was larger than a particular width - Results showed that the distribution of gapes shifted to meet the criterion
What did George Romanes suggest?
- Wrote a book called Animal Intelligence (1882) - Collection of stories of animal behaviour - Anecdotal evidence was taken to suggest that animals learn through reasoning
What is the first misconception associated with reward=reinforcement?
1. Rewards may function as reinforcers but not all reinforcers are rewards - Ex. a subject responds on a lever to give themselves a shock of lower intensity than occurs when the subject does not respond - Not compatible with what people typically think of as reward
What is the three-term contingency?
1. The context or situation in which a response occurs 2. The response itself 3. The stimuli that follow the response
What is the second misconception associated with reward=reinforcement?
2. The concept of reward does not refer to the effect of giving the reward - A reward is only a reinforcer when it is given contingent upon the occurrence of a behaviour and the future probability of that behaviour increases - Reinforcement, unlike reward, is defined by the effect on the behaviour
What is the third misconception associated with reward=reinforcement?
3. Reinforcement, unlike reward, is a property of the relationship between the behaviour and the consequence rather than a property of a stimulus or event - The same stimulus can function as either a reinforcer or a punisher depending on the relationship between the behaviour and the consequence - Ex. consider $100 a stimulus; $100 strengthens the behaviour that produces it and weakens the behaviour that loses it - Reinforcement and punishment are not properties of stimuli but of environment-behaviour relations
What is contingency?
A future event or circumstance that is possible but cannot be predicted with certainty; a rule that states that some event will occur if and only if another event occurs
What is a conditioned reinforcer?
A previously neutral stimulus that has acquired the capacity to strengthen responses because that stimulus has been repeatedly paired with food or some other primary reinforcer
What is a response chain?
A sequence of behaviours that must occur in a specific order, with the primary reinforcer being delivered only after the final response of the sequence
What is a primary reinforcer?
A stimulus that naturally strengthens any response it follows (e.g. water, food, sexual pleasure)
How is the Law of Effect applied to Thorndike's puzzle box experiments?
Certain behaviours, those that opened the door, were closely followed by a satisfying state of affairs (escape and food) so when the animal was returned to the same situation it was more likely to produce those behaviours than it had been before
What is an example of generalization?
Changing the light from yellow to green; the pigeon still pecks at the light for a short period of time, as if expecting a reinforcer, despite never having been trained on the green light (responding generalizes across light colours)
Explain the basic paradigm involving SD - R - SR+?
- Discriminative Stimulus (SD) = signals the response-consequence contingency in effect (stimulus control) - Response (R) = in the presence of the SD it is followed by the SR+ - SR+ = consequence involving a change in the context contingent upon occurrence of the response; can involve the addition or the removal of something - Contingent relation between the response and the consequence is defined solely by its effect on the response
How did Skinner describe the three-term contingency?
In the presence of a specific stimulus, often called a discriminative stimulus, the reinforcer will occur if and only if the operant response occurs
How does extinction in operant conditioning work?
Involves no longer following the operant response with a reinforcer; the response then weakens and eventually disappears
Describe the difference between positive and negative punishment and provide an example of each.
Positive = consequence involved the addition of something (e.g. adding a shock); Negative = consequence involved the removal of something (e.g. taking away a toy)
Describe the difference between positive and negative reinforcement and provide an example of each.
Positive = consequence involved the addition of something (e.g. sugar water); Negative = consequence involved the removal of something (e.g. removing a shock)
In modern conditioning, what is "satisfying state of affairs" now replaced by?
Reinforcer
What is the big misconception with operant conditioning?
Reward = Reinforcement
What is a generalized reinforcer?
Special class of conditioned reinforcers - those that are associated with a large number of different primary reinforcers (e.g. money)
What is shaping?
The form of an existing response is gradually changed across successive trials towards a desired target behaviour by reinforcing successive approximations to that behaviour
What is the theory of conditioned reinforcement?
The strongest conditioned reinforcers are those that provide the best information about the delivery of primary reinforcers
