Chapter 4 "Reinforcement & Extinction of Operant Behavior"
Be able to discriminate between behavior that is elicited and emitted.
- Elicited behavior is an involuntary act drawn out by an antecedent stimulus (respondent or reflexive behavior); emitted behavior is a voluntary act that occurs without an eliciting stimulus and is controlled by its consequences (operant behavior)
What are the 3 main general steps in shaping new behaviors?
1) Pick a target behavior 2) Reinforce behaviors that come closer and closer to the target behavior (successive approximations) 3) Place earlier approximations and other behaviors on extinction while reinforcing closer ones (differential reinforcement)
Conditioned Reinforcer
A conditioned reinforcer is an event or stimulus that has acquired its effectiveness to increase operant rate on the basis of an organism's life or ontogenetic history.
Contingency of Reinforcement
A contingency of reinforcement defines the relationship between the occasion, the operant class, and the consequences that follow the behavior (e.g., S^D : R → S^r). We change the contingencies by altering one of the components and observing the effect on behavior. For example, a researcher may change the rate of reinforcement for an operant in a given situation. In this case, the R → S^r component is manipulated while the S^D : R component is held constant. Contingencies of reinforcement can include more than three terms as in conditional discrimination (e.g., four-term relations); also, the effectiveness of reinforcement contingencies depends on motivational events called establishing operations (e.g., deprivation and satiation)
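Not from the text, but a minimal Python sketch of the three-term relation: the discriminative stimulus only raises the probability of the operant, and the reinforcer follows the response only when S^D is present. All function names and probabilities here are hypothetical.

    import random

    def deliver_reinforcer():
        """Hypothetical consequence, e.g., operating the food feeder."""
        print("S^r delivered")

    def trial(sd_present: bool, p_response_sd: float = 0.8, p_response_sdelta: float = 0.1):
        """One pass through the S^D : R -> S^r contingency.

        The S^D does not force the response; it only raises the probability
        that the operant is emitted, and reinforcement follows the response
        only in the presence of S^D.
        """
        p = p_response_sd if sd_present else p_response_sdelta
        response = random.random() < p       # the operant is emitted (or not)
        if response and sd_present:          # R -> S^r only when S^D is present
            deliver_reinforcer()
        return response

    # Manipulating the R -> S^r component (e.g., rate of reinforcement) while
    # holding S^D : R constant corresponds to changing only the reinforcement branch.
    for _ in range(5):
        trial(sd_present=random.choice([True, False]))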
Cumulative Record
A cumulative record is a real-time graphical representation of operant rate. Each response produces a constant upward increment on the Y-axis, and time is indexed on the X-axis. The faster the rate of response is, the steeper the slope or rise of the cumulative record. See also cumulative recorder
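A minimal sketch of how a cumulative record is drawn, assuming matplotlib is available; the response times are made up. Each response steps the record up one constant unit, so faster responding yields a steeper slope.

    import matplotlib.pyplot as plt

    # Hypothetical response times in seconds: slow responding, then fast responding.
    response_times = [2, 6, 11, 17, 24, 26, 27, 28, 29, 30, 31, 33]

    # Cumulative count: each response steps the record up by one unit on the Y-axis.
    cumulative_counts = range(1, len(response_times) + 1)

    plt.step(response_times, cumulative_counts, where="post")
    plt.xlabel("Time (s)")
    plt.ylabel("Cumulative responses")
    plt.title("Cumulative record: steeper slope = higher response rate")
    plt.show()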
What is a discriminative stimulus? What does it do to behavior? What is the difference between an S^D and an SΔ?
A discriminative stimulus occurs before the behavior and signals the potential availability of a consequence. An S^D signals that reinforcement is available for the operant; an SΔ signals that reinforcement is not available
Describe the Premack Principle in terms of LFB and HFB. Be able to give an example
A higher-probability (higher-frequency) behavior can function as a reinforcer for a lower-probability (lower-frequency) behavior. Example: "If you do your homework (LFB), then you can watch the movie (HFB)"
Premack Principle
A higher-frequency behavior will function as reinforcement for a lower-frequency behavior.
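A toy sketch (hypothetical baseline numbers) of the Premack prediction: compare free-choice baseline frequencies, and the higher-frequency activity is predicted to reinforce the lower-frequency one.

    # Hypothetical baseline (free-choice) minutes spent on each activity per hour.
    baseline = {"watching a movie": 40, "doing homework": 5}

    def premack_predicts_reinforcer(contingent: str, instrumental: str) -> bool:
        """Premack: the contingent activity is predicted to reinforce the
        instrumental one only if its baseline frequency is higher."""
        return baseline[contingent] > baseline[instrumental]

    # "If you do your homework (LFB), then you can watch the movie (HFB)."
    print(premack_predicts_reinforcer("watching a movie", "doing homework"))  # True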
Operant Chamber
A laboratory enclosure or box used to investigate operant conditioning. An operant chamber for a rat is a small, enclosed box that typically contains a lever with a light above it and a food magazine or cup connected to an external feeder. The feeder delivers a small food pellet when electronically activated.
in-vitro Reinforcement (IVR)
A method used to investigate reinforcement in the neuron, increasing calcium bursts or firings by injection of dopamine agonists or other agents.
Positive Reinforcer
A positive reinforcer is any stimulus or event that increases the probability (rate of response) of an operant when presented.
Extinction Burst
A rapid burst of responses when an extinction procedure is first implemented
Intermittent Schedule of Reinforcement
A schedule programmed so that some rather than all operants are reinforced. In other words, an intermittent schedule is any schedule of reinforcement other than continuous (CRF).
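A sketch of the distinction, with assumed parameters: under CRF every response is reinforced, while an intermittent rule (here a random-ratio-like probability) reinforces only some responses.

    import random

    def reinforced_crf() -> bool:
        """Continuous reinforcement (CRF): every response is reinforced."""
        return True

    def reinforced_intermittent(p: float = 0.25) -> bool:
        """One intermittent rule (random-ratio-like): each response is
        reinforced with probability p, so only some responses are reinforced."""
        return random.random() < p

    for response in range(1, 6):
        print(response, "CRF:", reinforced_crf(), "intermittent:", reinforced_intermittent())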
Describe Thorndike's Law of Effect. What is responsible for controlling behavior? How is this different from Classical Conditioning?
Behaviors followed by favorable consequences become more likely, and behaviors followed by unfavorable consequences become less likely. Consequences (what follows the behavior) are responsible for controlling operant behavior. This differs from classical conditioning, in which behavior is elicited by antecedent stimuli (the CS and US) rather than controlled by its consequences
Spontaneous Recovery (operant)
After a period of extinction, an organism's rate of response may be close to operant level. After some time, the organism is again placed in the setting and extinction is continued. Responding initially recovers, but over repeated sessions of extinction the amount of recovery decreases. Repeated sessions of extinction eliminate stimulus control by extraneous features of the situation and eventually "being placed in the setting" no longer occasions the operant.
Repertoire
All the behavior an organism is capable of emitting on the basis of species and environmental history
Spontaneous Recovery (Respondent)
An increase in the magnitude of the conditioned response (CR) after respondent extinction has occurred and time has passed. A behavioral analysis of spontaneous recovery suggests that the CS-CR relation is weakened by extinction, but the context or features of the situation elicit some level of the CR. During respondent conditioning, many stimuli not specified by the researcher as the conditioned stimulus (CS), but present in the experimental situation, come to regulate behavior
Operant Conditioning
An increase or decrease in operant responses as a function of the consequences that have followed these responses.
Operant
An operant is behavior that operates on the environment to produce a change, effect, or consequence. These environmental changes select the operant appropriate to a given setting or circumstance. That is, particular responses increase or decrease in a situation as a function of the consequences they produced in the past. Operant behavior is emitted (rather than elicited) in the sense that the behavior may occur at some frequency before any known conditioning.
Describe the response-deprivation hypothesis; give an example. How is it different than the Premack Principle?
Any behavior can potentially be a reinforcer if deprivation occurs: restricting access to a behavior below its baseline (free-choice) level makes that behavior more preferred. Example: if TV time is restricted below its usual level, the opportunity to watch TV becomes an effective reinforcer for doing homework. Unlike the Premack principle, the contingent behavior does not have to be higher in frequency than the instrumental behavior; even a low-frequency behavior can function as a reinforcer if access to it is restricted below baseline
Law of Effect
As originally stated by Thorndike, the law refers to stamping in (or out) some response. A cat opened a puzzle-box door more rapidly over repeated trials. Currently the law is stated as the principle of reinforcement: operants may be followed by consequences that increase (or decrease) the probability or rate of response.
Outside of a decreased rate of responding, what are some other "side-effects" of extinction?
Extinction bursts: a short burst of responding when extinction first takes effect. Operant variability: the organism emits many different behaviors in an attempt to contact reinforcement again. Resurgence: behavior that was previously extinguished may reappear. Emotional responses: aggression and depression-like responses
Contingent Response
In the response deprivation hypothesis, the contingent response is the activity obtained by making the instrumental response, as in the contingency: if activity A occurs (instrumental response), then the opportunity to engage in activity B (contingent response) occurs.
Instrumental Response
In the response deprivation hypothesis, the instrumental response is the behavior that produces the opportunity to engage in some activity.
Partial Reinforcement Effect (PRE)
Intermittent reinforcement schedules generate greater resistance to extinction than continuous reinforcement (CRF). The higher the rate of reinforcement, the greater the resistance to change; however, the change from CRF to extinction is discriminated more rapidly than between intermittent reinforcement and extinction.
Negative Punishment
Negative punishment is a contingency that involves the removal of an event or stimulus following behavior, decreasing the rate of response. The negative punishment procedure requires that behavior (watching television) is maintained by positive reinforcement (entertaining programs) and the reinforcer is removed (TV turned off) if a specified response occurs (yelling and screaming). The probability of response is reduced by the procedure.
Negative Reinforcement
Negative reinforcement is a contingency where an ongoing stimulus or event is removed (or prevented) by some response (operant) and the rate of response increases. If it is raining, opening and standing under an umbrella removes the rain and maintains the use of the umbrella on rainy days. When operant behavior increases by removing an ongoing event or stimulus the contingency is called escape. The contingency is called avoidance when the operant increases by preventing the onset of the event or stimulus. Both escape and avoidance involve negative reinforcement.
Response Deprivation
Occurs when access to the contingent behavior is restricted and falls below its baseline (or free-choice) level of occurrence.
Renewal
One type of post-extinction effect is called renewal, involving the recovery of responding when the animal is removed from the extinction context. In respondent extinction, such recovery of responding is well established and is thought to occur because of inhibitory learning to the extinction context (Bouton, 2004). Once the animal is removed from the extinction setting, the contextual cues for inhibition no longer occur and responding recovers. A similar effect is observed with operant behavior after extinction, but the evidence is not as extensive.
Operant Variability
Operant behavior becomes increasingly more variable as extinction proceeds. From an evolutionary view, it makes sense to try different ways of acting when something no longer works. That is, behavioral variation increases the chances that the organism will reinstate reinforcement or contact other sources of reinforcement, increasing the likelihood of survival and reproduction of the organism.
Emitted
Operant behavior is emitted in the sense that it occurs at some probability in the presence of a discriminative stimulus (S^D), but the S^D does not force its occurrence.
Describe the process of operant extinction. How is this similar/different than classical conditioning?
Operant extinction: the reinforcer (S^R+) is no longer delivered after the response, and the rate of the behavior learned via reinforcement gradually decreases. In classical (respondent) extinction, the CS is presented without the US. Both procedures produce a gradual decline in responding, but operant extinction breaks the response-consequence contingency, whereas respondent extinction breaks the CS-US relation
Positive Reinforcement
Positive reinforcement is a contingency that involves the presentation of an event or stimulus following an operant that increases the rate of response.
Be able to define and give an example of each of the 4 basic contingencies of reinforcement and punishment.
Positive reinforcement: a stimulus is presented after a response and the response increases (e.g., food delivered after a lever press increases pressing). Negative reinforcement: a stimulus is removed or prevented by a response and the response increases (e.g., opening an umbrella removes the rain and maintains umbrella use). Positive punishment: a stimulus is presented after a response and the response decreases (e.g., a child is spanked for running into the street and running into the street decreases). Negative punishment: a stimulus is removed after a response and the response decreases (e.g., the TV is turned off when a child yells and screams, and yelling decreases)
What does "positive" and "negative" mean in relation to reinforcement and punishment?
Positive: a consequence is added contingent on the behavior. Negative: a consequence is removed contingent on the behavior
Operant Class
Refers to a class or set of responses that vary in topography but produce a common environmental consequence or effect. The response class of turning on the light has many variations in form (turn on light with left index finger, or right one, or side of the hand, or saying to someone "Please turn on the light").
Cumulative Recorder
Refers to a laboratory instrument that is used to record the frequency of operant behavior in real time (rate of response). For example, paper is drawn across a roller at a constant speed, and each time a lever press occurs a pen steps up one increment. When reinforcement occurs, this same pen makes a downward deflection. Once the pen reaches the top of the paper, it resets to the bottom and starts to step up again.
Discriminated Extinction
Refers to a low rate of operant behavior that occurs as a function of an S∆. For example, the probability of putting coins in a vending machine with an "out of order" sign on it is very low
Positive Punishment
Refers to a procedure that involves the presentation of an event or stimulus following behavior that has the effect of decreasing the rate of response. A child is given a spanking for running into the street and the probability of the behavior is decreased.
Emotional Responses
Refers to a response such as "wing flapping" in birds that occurs with the change in contingencies from reinforcement to extinction. A common emotional response is called aggression (attacking another organism or target).
Neuroplasticity
Refers to alterations of neurons and neural interconnections during a lifetime by changes in environmental contingencies.
Discriminative Stimulus (S^D)
Refers to an event or stimulus that precedes an operant and sets the occasion for operant behavior (antecedent stimulus).
Magazine Training
Refers to following the click of the feeder (stimulus) with the presentation of food (reinforcement). For example, a rat is placed in an operant chamber and a microcomputer periodically turns on the feeder. When the feeder is turned on, it makes a click and a food pellet falls into a cup. Because the click and the appearance of food are associated in time you would, after training, observe a typical rat staying close to the food magazine, and quickly moving toward it when the feeder is operated (see conditioned reinforcer)
Differential Reinforcement
Refers to reinforcing some responses while withholding reinforcement for (extinguishing) others. For example, after a period of time the applied behavior analyst delivers reinforcement for any behavior other than "getting out of seat" in a classroom. The target behavior is on extinction and any other behavior is reinforced.
Behavior Variability
Refers to the animal's tendency to emit variations in response form in a given situation. The range of behavioral variation is related to an animal's capabilities based on genetic endowment, degree of neuroplasticity, and previous interactions with the environment. Behavioral variability in a shaping procedure allows for selection by reinforcing consequences and is analogous to the role of genetic variability in natural selection.
Operant Rate/Probability of Response
Refers to the number of responses that occur in a given interval. For example, a bird may peck a key for food two times per second. A student may do math problems at the rate of 10 problems per hour.
Rate of Response
Refers to the number of responses that occur in a given interval. For example, a bird may peck a key for food two times per second. A student may do math problems at the rate of 10 problems per hour. This is the same as operant rate
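Rate of response is just a count divided by the observation interval; a minimal example using the numbers from the definition:

    def response_rate(num_responses: float, interval: float) -> float:
        """Operant rate: responses per unit of time."""
        return num_responses / interval

    print(response_rate(120, 60.0))  # a bird pecking 120 times in 60 s = 2 pecks per second
    print(response_rate(10, 1.0))    # 10 math problems in 1 hour = 10 problems per hour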
Resistance to Extinction
Refers to the perseverance of operant behavior when it is placed on extinction. Resistance to extinction is substantially increased when an intermittent schedule of reinforcement has been used to maintain behavior. See intermittent reinforcement effect
Topography
Refers to the physical form or characteristics of the response. For example, the way that a rat presses a lever with the left paw, the hind right foot, and so on. The topography of response is related to the contingencies of reinforcement in the sense that the form of response can be broadened or restricted by the contingencies. The contingency of reinforcement may require only responses with the left paw rather than any response that activates the microswitch; under these conditions left-paw responses will predominate. Generally, topography or form is a function of the contingencies of reinforcement
Continuous Reinforcement (CRF)
Refers to a schedule of reinforcement in which every occurrence of the operant is reinforced (each response produces the reinforcer). CRF is the simplest schedule; any schedule other than CRF is an intermittent schedule.
Deprivation Operation
Refers to the procedure of restricting access to a reinforcing event. Withholding an event or stimulus increases its effectiveness as a reinforcer
Operant Level
Refers to the rate of an operant before any known conditioning. For example, the rate of key pecking before a peck-food contingency has been established.
Latency
Refers to the time from the onset of one event to the onset of another. For example, the time it takes a rat to reach a goal box after it has been released in a maze.
Force of Response
Reinforcement can be made contingent on the force or magnitude of response. Force or magnitude is a property or dimension of behavior
What is a functional definition of reinforcement and punishment?
Reinforcement is a consequence that increases the future rate of the behavior on which it is contingent; punishment is a consequence that decreases the future rate of the behavior on which it is contingent
Elicited
Respondent (CR) and reflexive (UR) behavior is elicited in the sense that the behavior is made to occur by the presentation of a stimulus (CS or US).
Successive Approximations
Same as shaping; see the Shaping entry below.
Be familiar with the three-term contingency
The basic unit of analysis in the analysis of operant behavior; encompasses the temporal and possibly dependent relations among an antecedent stimulus, behavior, and consequence.
Ad Libitum Weight
The body weight of an organism that has free access to food 24 h a day
Shaping
The method of successive approximation or shaping may be used to establish a response. This method involves the reinforcement of closer and closer approximations to the final performance. For example, a rat may be reinforced for standing in the vicinity of a lever. Once the animal is reliably facing the lever, a movement of the head toward the bar is reinforced. Next, closer and closer approximations to pressing the lever are reinforced. Each step of the procedure involves reinforcement of closer approximations and nonreinforcement of more distant responses. Many novel forms of behavior may be shaped by the method of successive approximation.
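A highly simplified sketch of the shaping logic (the list of approximations and all numbers are hypothetical): reinforce responses that meet the current criterion, then require a closer approximation, while more distant responses go unreinforced.

    import random

    # Hypothetical ordered approximations to the final performance (a lever press).
    approximations = ["stands near the lever", "faces the lever", "moves head toward the lever",
                      "touches the lever", "presses the lever"]

    def shape(max_attempts: int = 200) -> None:
        """Reinforce closer and closer approximations; responses below the
        current criterion are no longer reinforced (differential reinforcement)."""
        level = 0                                   # index of the approximation currently required
        for _ in range(max_attempts):
            emitted = random.randint(0, min(level + 1, len(approximations) - 1))
            if emitted >= level:                    # meets or exceeds the current criterion
                print("Reinforce:", approximations[emitted])
                level = emitted + 1                 # now require a closer approximation
                if level == len(approximations):
                    print("Target behavior established.")
                    return
            # responses below the current criterion go unreinforced (extinction)

    shape()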
Response Deprivation Hypothesis
The principle that organisms work to gain access to activities that are restricted or withheld (deprivation), presumably to reinstate equilibrium or free-choice levels of behavior. This principle is more general than the Premack principle, predicting when any activity (high or low in rate) will function as reinforcement.
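One common formalization of response deprivation (the inequality form usually attributed to Timberlake & Allison, 1974; it is not spelled out in this chapter) predicts reinforcement when the ratio of required instrumental behavior to allowed contingent behavior exceeds the baseline ratio. A sketch with made-up numbers:

    def response_deprivation(instr_required: float, cont_allowed: float,
                             baseline_instr: float, baseline_cont: float) -> bool:
        """Predicts reinforcement when I/C > O_I/O_C: the schedule holds the
        contingent behavior below its baseline (free-choice) level relative
        to the instrumental behavior."""
        return (instr_required / cont_allowed) > (baseline_instr / baseline_cont)

    # Hypothetical baselines per free-choice hour: 10 min of homework, 60 min of TV.
    # A rule of "30 min homework buys 30 min TV" restricts TV below its baseline,
    # so access to TV is predicted to reinforce doing homework.
    print(response_deprivation(instr_required=30, cont_allowed=30,
                               baseline_instr=10, baseline_cont=60))  # True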
Extinction
The procedure of extinction involves the breaking of the contingency between an operant and its consequence. For example, bar pressing followed by food reinforcement no longer produces food. As a behavioral process, extinction refers to a decline in the frequency of the operant when an extinction procedure is in effect. In both instances, the term extinction is used correctly
Reinstatement
The recovery of behavior when the reinforcer is presented alone (response independent) after a period of extinction. In an operant procedure, reinstatement involves reinforcement of a response followed by extinction. After extinction, response independent reinforcement is arranged and the opportunity to respond is removed (using retractable levers). This is followed by tests that reinstate the opportunity to respond (response levers available)
Ontogenetic Selection
The selection of operant behavior during the lifetime of an organism is ontogenetic selection. The process involves operant variability during periods of extinction and selection by contingencies of reinforcement. An organism that alters its behavior (adaptation) on the basis of changing life experiences is showing ontogenetic selection. In this ontogenetic form of adaptation, the topography and frequency of behavior increase when reinforcement is withheld (increase in operant variability). These behavioral changes during extinction allow for the selection of behavior by new contingencies of reinforcement. Thus, a wild rat that has been exploiting a compost heap may find that the homeowner has covered it. In this case, the rat emits various operants that may eventually uncover the food. The animal may dig under the cover, gnaw a hole in the sheathing, or search for some other means of entry. A similar effect occurs when food in the compost heap is depleted and the animal emits behavior that results in getting to a new food patch. In the laboratory, this behavior is measured as an increase in the topography and frequency of bar pressing as the schedules of reinforcement change.
What is the partial reinforcement effect? Be able to give an example
The tendency for a response that is reinforced intermittently (after some, but not all, correct responses) to be very resistant to extinction. Under intermittent reinforcement it is harder to discriminate whether reinforcement is still in effect or extinction has begun, so responding persists. Example: giving a dog a treat only half of the time that it shakes your hand successfully
S-delta (SΔ)
When an operant does not produce reinforcement, the stimulus that precedes the operant is called an S-delta (S∆). In the presence of an S-delta, the probability of emitting an operant declines. See extinction stimulus.
Response Differentiation
When reinforcement is contingent on some difference in response properties, that form of response will increase. For example, the force or magnitude of response can be differentiated; if the contingencies of reinforcement require a forceful or vigorous response in a particular situation, then that form of response will predominate. In another example, when reinforcement is based on short interresponse times (IRT, 2-5 s), the distribution of IRTs becomes centered on short intervals. Changing the contingencies to reinforce longer IRTs (20-25 s) produces a new distribution centered on long intervals. See differential reinforcement.
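A sketch (hypothetical IRT values) of differentiating responses by interresponse time: only responses whose IRT falls in the target window are reinforced, which shifts the IRT distribution toward that window.

    def reinforce_irt(irt: float, low: float, high: float) -> bool:
        """Reinforce a response only if the time since the previous response
        (the interresponse time, IRT) falls within the target window."""
        return low <= irt <= high

    # Hypothetical stream of IRTs (seconds) under a contingency for short IRTs (2-5 s).
    irts = [1.2, 3.4, 4.8, 7.0, 2.6, 10.5]
    print([reinforce_irt(t, 2, 5) for t in irts])  # [False, True, True, False, True, False]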
Response Hierarchy
With regard to responses within a response class, a response hierarchy refers to the order or likelihood of the response forms in the class based on response properties (effort) or probability of reinforcement in a given situation. For a child, the parents may have differentially reinforced shouting rather than quiet conversation at the dinner table and loud talk has a higher probability of occurrence at dinner than talk at less volume. For a free-choice or baseline assessment (Premack, 1962), the responses in different classes for a situation are arranged in a hierarchy (between response classes) by relative frequency or probability of occurrence. For a rat the probability of eating, drinking, and wheel running might form a hierarchy with eating occurring most often and wheel running least.
Free-Operant Method
An organism may repeatedly respond over an extensive period of time. The organism is "free" to emit many responses or none at all. More accurately, responses can be made without interference from the experimenter (in contrast to a discrete-trials procedure, in which the experimenter controls when a response can occur).