Basic Learning Processes Exam 3 Experiments & Terms
Why is Sidman avoidance also known as "free-operant" avoidance? Does it support the two-process theory of avoidance?
(Quick summary: Sidman avoidance does NOT support the two-process theory of avoidance, because in Sidman avoidance there is no overt cue.) In a free-operant avoidance procedure, the aversive stimulus (e.g., shock) is scheduled to occur periodically without warning: let's say every five seconds. Each time the participant makes the avoidance response, it earns a period of safety: let's say 15 seconds long, during which shocks do not occur. Repeating the avoidance response before the end of the shock-free period starts the safe period over again. A free-operant avoidance procedure is constructed from two time intervals. One is the interval between shocks in the absence of a response, called the S-S (shock-shock) interval. The other is the interval between the avoidance response and the next scheduled shock, called the R-S (response-shock) interval. The R-S interval is the period of safety created by each response. This was bad news for Miller's two-process theory: with no explicit CS/cue/signal, there is no conditioned aversive stimulus to escape, and escape from an aversive stimulus was supposed to reinforce avoidance behavior. According to the two-process theory, reducing fear of the CS was the motivation for making the instrumental response. But in Sidman avoidance there is no cue/CS at all, so the Sidman avoidance paradigm contradicts the two-process theory of avoidance.
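The S-S and R-S timers can be made concrete with a short simulation. This is only an illustrative sketch (the function name, interval values, and response times are hypothetical, not taken from any experiment): a response made before the next scheduled shock restarts the R-S safe period; otherwise shocks recur at the S-S interval.

```python
def simulate_sidman(response_times, ss_interval=5, rs_interval=15, session_end=60):
    """Return the times at which shocks occur in a free-operant
    (Sidman) avoidance session.

    response_times: sorted times of avoidance responses.
    ss_interval:    shock-shock interval (when there is no responding).
    rs_interval:    response-shock interval (safe period earned per response).
    """
    shocks = []
    next_shock = ss_interval           # first shock if the subject never responds
    responses = iter(response_times)
    resp = next(responses, None)
    while next_shock <= session_end:
        if resp is not None and resp < next_shock:
            # Response before the scheduled shock: restart the R-S safe period.
            next_shock = resp + rs_interval
            resp = next(responses, None)
        else:
            shocks.append(next_shock)
            next_shock += ss_interval  # S-S interval runs until the next response
    return shocks

# Never responding: shocks every 5 s. Responding every 10 s (< 15 s R-S): no shocks.
print(simulate_sidman([]))                       # shocks at 5, 10, ..., 60
print(simulate_sidman([0, 10, 20, 30, 40, 50]))  # []
```

Because the assumed R-S interval (15 s) is longer than the S-S interval (5 s), each response buys more safety than simply waiting, which is why responding pays off even without a warning signal.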
How do you know which direction in the stimulus dimension will the peak shift, up or down the stimulus dimension?
(Quick summary: The peak shifts away from the S-, so the direction depends on where the S- falls on the stimulus dimension relative to the S+. If your S- is higher than the S+, the peak shifts down the stimulus dimension. If your S- is lower than the S+, the peak shifts up the stimulus dimension.) Let's say your training stimulus is a light with a wavelength of 550 nm (nanometers). This light is your S+. Now let's say in one experimental group you're also training with an S- light. This light has a wavelength of 555 nm. In another experimental group the S- will be 590 nm. In both cases, the S- (which has inhibitory properties) affects the generalization gradient. The shift of the peak of the generalization gradient away from the original S+ (in our example, the 550 nm light) is called the peak-shift effect. The gradient shifts DOWN the stimulus dimension, meaning the peak shifts to a lower wavelength in our example. Refer to the picture. This happened because the S- was a light of HIGHER wavelength in both groups (555 nm and 590 nm, respectively).
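Spence's model makes this prediction computable: the net response tendency is the excitatory gradient centered on S+ minus the inhibitory gradient centered on S-. The toy sketch below assumes Gaussian-shaped gradients with made-up widths and an arbitrary inhibition strength (illustrative assumptions only, not fitted to any data); the net peak lands below 550 nm, shifted away from the 555 nm S-.

```python
import math

def gaussian(x, center, width=15.0):
    """Bell-shaped generalization gradient (assumed shape, assumed width)."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def net_response(wavelength, s_plus=550, s_minus=555):
    """Spence: net tendency = excitation around S+ minus inhibition around S-.
    The 0.6 inhibition strength is an arbitrary illustrative value."""
    return gaussian(wavelength, s_plus) - 0.6 * gaussian(wavelength, s_minus)

# Find the peak of the net gradient on a 1-nm grid.
peak = max(range(500, 601), key=net_response)
print(peak)  # falls below the 550 nm S+, i.e., shifted away from the 555 nm S-
```

With these assumed parameters the peak lands a few nanometers below 550 nm; moving the S- farther away (e.g., to 590 nm) shrinks the overlap of the two gradients and hence the size of the shift.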
Lovibond et al. (2008)
A challenge to Miller's 2-process theory (Quick summary: The Stimulus A group made the instrumental avoidance response successfully but did not report fear of the CS. As this group's belief in the efficacy of the avoidance response increased, both their rated expectancy of shock and their skin conductance responses declined. This result contradicts Miller's two-process theory.) College students received conditioning with three different stimuli, designated as A, B, and C. The stimuli were colored blocks presented on a computer screen. The US was shock to the index finger at an intensity that was uncomfortable but not painful. On trials with Stimulus A, an avoidance conditioning procedure was in effect. Stimulus A was presented for 5 seconds, followed by shock 10 seconds later (A+). However, if the participants pressed the correct button during the CS, shock was omitted on that trial. Stimulus B received only Pavlovian conditioning as a comparison. Each presentation of B was followed by shock (B+) without the opportunity to avoid. Stimulus C was a control cue and was never followed by shock (C-). To track the effect of these procedures, the participants were asked to rate their expectation that shock would occur, and their skin conductance responses were recorded as an index of fear. Ratings of shock expectancy were obtained during the 10-second delay between the CS and the scheduled US. The left graph shows changes in skin conductance as a measure of fear. Fear was always low for the control stimulus, C, as would be expected because C never ended in shock. Fear increased across trials for the Pavlovian Stimulus B, which ended in shock on each trial (B+). In contrast, fear decreased across trials for the avoidance stimulus (A+). The changes in fear to Stimuli A and B were paralleled by changes in the expectancy of shock. Shock expectancy increased across trials for the Pavlovian Stimulus B but decreased for the avoidance Stimulus A.
Subsequent test trials indicated that the participants were not afraid of Stimulus A because they had learned to prevent shock on A trials. If their avoidance response was blocked, their fear returned, as did their expectation that shock would occur again.
Contextual cues
A more comprehensive analysis of the stimuli organisms experience during the course of instrumental conditioning indicates that 'discrete discriminative stimuli' (presented for a brief period, with a clear beginning and end, and easily characterized) occur in the presence of background contextual cues. The contextual cues may be visual, auditory, or olfactory features of the room or place where the discrete discriminative stimuli are presented. Experimental example: In a study of sexual conditioning by Akins (1998), contextual cues were used as a signal for sexual reinforcement, in much the same way that a discrete CS might be used. Male domesticated quail served as subjects, and the apparatus consisted of two large compartments that were distinctively different. (2 contexts, represented here as 2 distinct compartments) 1. One compartment had sand on the floor and the walls and ceiling were colored orange. 2. The other compartment had a wire-mesh floor and walls and ceiling painted green. Before the start of the conditioning trials, the subjects were allowed to move back and forth between the two compartments during a 10-minute preference test to determine their baseline preference. The NON-PREFERRED compartment was then designated as the CS. Conditioning trials consisted of placing the male subject in its CS context for five minutes, at which point a sexually receptive female was placed with it for another five minutes. Thus, these subjects received exposure to the CS context PAIRED WITH the sexual US (a sexually receptive female). Subjects in a control group received access to a female in their home cages two hours before being exposed to the CS context; for them, the CS and US were unpaired. Results: (Refer to picture) Notice that the paired and unpaired groups showed similar low preferences for the CS compartment at the outset of the experiment (for the paired group: before the introduction of the sexually receptive female).
This low preference persisted in the control group. In contrast, subjects that received the CS context paired with sexual reinforcement came to prefer that context. Thus, the ASSOCIATION of contextual cues WITH sexual reinforcement increased preference for those cues. Clarification: Why was the non-preferred compartment used as the CS? - Because the experiment needed a context that otherwise would not be remarkable in any way to the subjects. They did not care for the 'non-preferred' compartment at first. But throughout the course of the experiment - with the introduction of the sexual reinforcement, they came to prefer the compartment. This is because the context changed. It was now associated with the sexual reinforcement.
What is the relationship of the steepness of the stimulus generalization gradient and stimulus control?
A steep generalization gradient indicates strong control of behavior by the stimulus dimension that is tested. By contrast, a flat generalization gradient indicates weak or non-existent stimulus control.
Disinhibition
A temporary increase in previously extinguished behavior as a result of the introduction of a novel stimulus. Disinhibition is the extinction analog of dishabituation (dishabituation is the recovery of a habituated response following a novel stimulus).
Describe how the idea of predatory imminence explains avoidance behavior.
According to the predatory imminence continuum, different defensive responses occur depending on the level of danger faced by an animal. Predatory imminence is important to consider in all aversive conditioning situations because it reflects the innate coping mechanisms that come into play whenever the defensive behavior system is activated.
Brown and Jacobs (1949)
Acquired drive [This type of experiment was originally called an 'acquired drive' experiment because the 'drive' (motivation) to perform the instrumental response (fear of the CS) was learned through classical conditioning rather than being innate (such as hunger or thirst).] Now referred to as an 'escape from fear' (EFF) procedure. (Quick summary: Terminating the CS by making the instrumental response reduced the fear elicited by the CS. This fear reduction alone was apparently sufficient; the actual shock was never even presented in the second phase of the experiment.) In the typical avoidance procedure, classical conditioning of fear and instrumental reinforcement through fear reduction occur intermixed across trials. However, if these two processes make separate contributions to avoidance learning, it should be possible to demonstrate their operation when the two types of conditioning are not intermixed. This is the goal of 'escape from fear' (EFF) procedures. Experiment: The experiment consisted of two phases. In the first phase, rats underwent Pavlovian conditioning of a CS to a shock. In the second phase, the CS was presented and the rats could terminate it by performing a response. The US was never presented during this phase, regardless of the rats' actions, so the response did not play any role in avoiding the US. The rats learned to perform the response, suggesting that, as predicted by Miller's two-process theory, CS termination was sufficient to support avoidance learning. Results: The rats learned to cross faster just to turn off the CS. Their response was negatively reinforced by CS termination (fear reduction), even though no shock was ever delivered in the second phase.
What is extinction? Is extinction forgetting? How do you know?
Acquisition of conditioned behavior involves procedures in which a reinforcing outcome occurs. In Pavlovian (classical) conditioning, the outcome or unconditioned stimulus is presented as a consequence of a conditioned stimulus. In instrumental conditioning, the reinforcing outcome is presented as a consequence of the instrumental response. Extinction involves omitting the US, or reinforcer. 1. In classical conditioning, extinction involves repeated presentations of the CS by itself. 2. In instrumental conditioning, extinction involves no longer presenting the reinforcer as a consequence of the instrumental response. With both types of procedures conditioned responding declines. Extinction is NOT forgetting. Forgetting is the reduction of a learned response that occurs because of the passage of time, not because of particular experiences. *Extinction is an active process produced by the unexpected absence of the US or the reinforcer. Forgetting, by contrast, is a decline in responding that may occur simply because of the passage of time and does not require non-reinforced encounters with the CS or the instrumental response.*
Active avoidance
Active avoidance is a procedure that requires an active response to prevent the delivery of an aversive stimulus. Includes: 1. One-way active avoidance 2. Two-way active avoidance
What is the effect of extensive training with a training stimulus on the generalization gradient?
After extensive training, pigeons respond to stimuli that are very similar to the training stimulus, but respond little to other stimuli.
Describe the three most important hypotheses that have been proposed to explain PREE: the discrimination hypothesis, Amsel's frustration theory, and Capaldi's sequential memory theory. Which is/are probably correct?
All are correct in some way. 1. The discrimination hypothesis An explanation of the partial reinforcement extinction effect, stating that extinction is slower after partial reinforcement than continuous reinforcement because the onset of extinction is more difficult to detect following partial reinforcement. 2. Amsel's frustration theory (Quick summary [but read on for clarity]: The instrumental response becomes conditioned to the expectation of non-reinforcement. Hence, the response persists into extinction.) According to frustration theory, persistence in extinction results from learning something paradoxical, namely to continue responding when you expect to be nonreinforced or frustrated. This learning occurs in stages. Intermittent/partial reinforcement involves both rewarded and non-rewarded trials. Rewarded trials lead individuals to expect reinforcement and nonrewarded trials lead them to expect the absence of reward. Consequently, intermittent reinforcement initially leads to the learning of two competing expectations. These two competing expectations lead to conflicting behaviors: the expectation of reward encourages subjects to respond, and the anticipation of nonreinforcement discourages responding. However, as training continues, this conflict is resolved in favor of responding. The resolution of the conflict occurs because reinforcement is not predictable in the typical partial reinforcement schedule. Therefore, the instrumental response ends up being reinforced some of the times when the subject expects nonreward. Because of such experiences, the instrumental response becomes conditioned to the expectation of nonreward. According to frustration theory, this is the key to persistent responding in extinction. With sufficient training, intermittent reinforcement results in learning to make the instrumental response when the subject expects nonreward. 
Once the response has become conditioned to the expectation of nonreward, responding persists when extinction is introduced. 3. Capaldi's sequential memory theory This theory assumes that subjects can remember whether or not they were reinforced for performing the instrumental response in the recent past. They remember both recent rewarded and nonrewarded trials. The theory assumes further that during intermittent reinforcement training, the memory of nonreward becomes a cue for performing the instrumental response. Precisely how this happens depends on the sequence of rewarded (R) and nonrewarded (N) trials that are administered. That is why the theory is labeled sequential. The memory of a particular sequence itself becomes a cue later on in extinction. So when subjects stop being rewarded in extinction, they continue to expect reinforcement at some point, because they remember having gone unreinforced for a while during training and then eventually receiving a reward.
Two-way avoidance
Also called 'shuttle avoidance.' An avoidance procedure in which trials can start in either compartment of a shuttle box, and the avoidance response consists of going from the occupied compartment to the unoccupied compartment. Sometimes: A--->B[safe] B--->A[safe] Always harder to learn than one-way avoidance because there is no consistently safe side.
The Frustration Theory of PREE (Abraham Amsel, Ph.D.) - an emotion theory
Amsel believed that the animals who were given PRF had adapted their emotional behavior (they were able to deal with frustration). Hence, they continued to respond into the extinction phase. If animals never experienced frustration then they will 'give up' sooner. So a subject on a CRF schedule would give up sooner. Think of a child who is playing a board-game. If the child loses they may throw a fit and become upset. As we get older and gain life experience, we (hopefully) do not quit at the first sign of trouble. Resisting extinction can be beneficial socially-speaking.
Passive avoidance / Passive avoidance procedure
An aversive stimulus is avoided if no response is made. In this test, subjects learn to avoid an environment in which an aversive stimulus (such as a foot-shock) was previously delivered. Procedure: (Refer to picture.) Animals are allowed to explore both compartments on the first day. On the following day, they are given a mild foot shock in one of the compartments (let's say, 'A'). The animals learn to associate certain properties of the chamber with the foot shock. To test their learning and memory, they are then placed in the compartment where no shock was delivered (B) and tested on how long it takes them (the latency) to enter the compartment where they were shocked. Animals with normal learning and memory will avoid entering the chamber where they had previously been exposed to the shock.
One-way avoidance
An avoidance conditioning procedure in which the required instrumental response is always to cross from one compartment of a shuttle box to the other in the same direction. Always: A--->B[safe] One-trial learning is frequently observed.
Flooding
An effective and extensively investigated extinction procedure for avoidance behavior. It involves presenting the CS in the avoidance situation without the US, but with the apparatus altered in such a way that the participant is prevented from making the avoidance response. Thus, the subject is exposed to the CS without being permitted to terminate it. It is "flooded" with the CS. Keyword here is 'flooded'.
Stimulus discrimination
An organism is said to exhibit stimulus discrimination if it responds differently to two or more stimuli.
Jenkins (1962)
Another test of the discrimination hypothesis Rats were tasked with running down a runway. Training phase 1: Group 1 was sometimes reinforced / sometimes not (partial reinforcement, PRF). Group 2 was continuously reinforced (CRF). Training phase 2: Both groups were trained on CRF. So the animals that had had partial reinforcement were now being continuously reinforced. So what happened? Would Group 1 now be able to detect the onset of extinction in the testing phase? Testing phase: Group 1 (first PRF, then CRF) STILL responded as if unable to detect the onset of extinction: it was slower to extinguish than Group 2, even though its most recent training was CRF, so the switch to extinction should have been easy to discriminate. This persistence contradicts the discrimination hypothesis.
Panic disorder
Anxiety disorder in which the individual experiences recurrent, sudden onsets of intense apprehension or terror, often without warning and with no specific cause. Can be treated with systematic desensitization and flooding.
Describe the development of the generalization gradient as training progresses.
As training progresses we can expect to see a steeper generalization gradient. This indicates good control of behavior by the stimulus dimension that is being tested.
What is an SSDR? How do species specific defense reactions determine how difficult an avoidance response will be to learn?
Aversive stimuli and situations elicit strong unconditioned, or innate, responses. These innate responses are assumed to have evolved because they are successful in defense against pain and injury. The psychologist R. C. Bolles called these species-specific defense reactions (SSDRs). A major prediction of SSDR theory is that some responses will be more easily learned in avoidance experiments than others. Consistent with this prediction, Bolles (1969) found that rats can rapidly learn to run in a running wheel to avoid shock. By contrast, their performance of a rearing response (standing on the hind legs) did not improve much during the course of avoidance training. Presumably, running was learned faster because it was closer to the rat's SSDRs in the running wheel.
Brogden, Lipman, and Culler (1938)
Avoidance learning versus classical conditioning (Quick summary: Avoidance conditioning wins - it produced much higher response rates, faster, than the classically conditioned group. Avoidance learning is not the same as classical conditioning.) In the 1930s, researchers focused on the difference between a standard classical conditioning procedure and a procedure that had an instrumental avoidance component added. Two groups of guinea pigs were tested in a rotating wheel. - One group was classically conditioned. - The other group was trained with avoidance conditioning. A tone served as the CS and a shock as the US. The shock stimulated the animals to run and rotate the wheel. For the classical conditioning group, the shock was presented 2 seconds after the onset of the tone, regardless of the animal's behavior. For the avoidance group, the shock also followed the tone when the animals did not rotate the wheel. However, if the animals moved the wheel during the tone CS before the shock occurred, the scheduled shock was omitted. The avoidance group quickly learned to make the conditioned response and was responding on 100% of the trials within 8 days of training. With this high level of responding, these guinea pigs managed to avoid all scheduled shocks. In contrast, the classical conditioning group never achieved this high level of performance.
Shock frequency reduction theory
By definition, avoidance responses prevent the delivery of shock and thereby reduce the frequency of shocks an organism receives. The theories of avoidance we have discussed so far have viewed the reduction of shocks as a secondary by-product rather than as a primary cause of avoidance behavior. (This is the assumption of Miller's two-process theory of avoidance.) By contrast, the shock-frequency reduction hypothesis views the reduction of shocks to be critical to the reinforcement of avoidance behavior. (Research shows that shock-frequency reduction is not necessary for avoidance learning though.)
The introduction of extinction is easier to detect after which of the following: continuous reinforcement (CRF) or after partial reinforcement (PRF)?
CRF According to the discrimination hypothesis explanation of PREE (Partial Reinforcement Extinction Effect), PREE occurs because if you do not get reinforced after each response during training (in other words, partial reinforcement), you may not immediately notice when reinforcers are omitted altogether in extinction. The total absence of reinforcement during extinction is presumably much easier to detect after continuous reinforcement.
Guttman & Kalish (1956)
Demonstration of a generalization gradient Stimulus generalization is the opposite of differential responding, or stimulus discrimination. An organism is said to show stimulus generalization if it responds in a similar fashion to two or more stimuli. In the experiment, pigeons were reinforced on a VI schedule for pecking a response key illuminated by a yellow light with a wavelength of 580 nm. After training, the birds were tested with a variety of other colors presented in a random order without reinforcement, and the rate of responding in the presence of each color was recorded. The highest rate of pecking occurred in response to the original 580 nm color. But the birds also made a substantial number of pecks when lights of 570 nm and 590 nm wavelengths were tested. This indicates that responding GENERALIZED to the 570 nm and 590 nm stimuli. However, as the color of the test stimuli became increasingly different from the color of the original training stimulus, progressively fewer responses occurred. The results showed a gradient of responding as a function of how similar each test stimulus was to the original training stimulus. This is an example of a 'stimulus generalization gradient.' Additionally, the experimenters considered the hypothetical results they would have obtained with color-blind pigeons. In this hypothetical scenario, there would be no gradient: responding would be essentially identical across the spectrum (a more or less flat stimulus generalization gradient).
Dobrzecka, Szwejkowska, and Konorski (1966)
Demonstration of constraints on stimulus control Dogs were trained/tested with a buzzer (behind them) and a metronome (in front of them). The buzzer and metronome differ in sound and also in spatial location. In experiment 1, a right-left discrimination, the dogs were supposed to raise their left leg if they heard the buzzer behind them. If they heard the metronome in front of them, they were supposed to raise their right leg. The question for this particular experiment was: what happens when you switch the cue LOCATIONS? (Put the buzzer in front and the metronome behind.) Remember - the dog was trained to raise its left leg if it heard the buzzer (which was behind it) and raise its right leg if it heard the metronome (which was in front of it). So during the testing phase, now that the metronome was behind the dog, the dog actually raised its left leg - not its right leg as in the original training. And when it heard the buzzer (now in front) during the testing phase, the dog raised its right leg. So what happened? The dog was making a response that went along with the LOCATION of the sound and not the SOUND itself. This is not random. All dogs do the same thing. This is a constraint on stimulus control. When making a spatial location response, the dogs weren't learning much about the stimulus (the sound of it) itself - they were learning the location of the cue. In experiment 2, the dogs were again trained with the buzzer and metronome in a 'go / no-go discrimination.' During training, the buzzer was behind the dog and the metronome was in front. This was more of a qualitative experiment because the dogs either had to make a response or not make a response (go / no-go). During the testing phase, the positions of the buzzer and the metronome were switched. The buzzer was now in front and the metronome was now behind. What happened? This time the dogs responded according to the SOUND of the stimulus.
So they weren't responding with respect to the location of the cue. They were responding according to the qualitative aspect of the sound - i.e., what it sounded like. Again this is an example of a constraint on stimulus control. The animal is biased (due to inborn tendencies) to attend to certain kinds of cues for certain kinds of tasks. For locational stuff -> locational cues. For qualitative stuff (go/no-go), the qualitative aspect of the cue.
Reynolds (1961)
Demonstration of discrimination learning in pigeons (Quick summary: The pigeons responded differently to each of the 'stimulus components' of the original training stimulus [a white triangle with a red background]. This differential responding to the stimulus components is known as 'stimulus discrimination.' This is an example of discrimination learning.) Two pigeons were reinforced on a VI schedule for pecking a circular response key. Reinforcement for pecking was available whenever the response key was illuminated by a visual pattern consisting of a white triangle on a red background. Reynolds was interested in which of the stimulus components (the red background versus the white triangle) gained control over the pecking behavior. After training, the pigeons were tested with each stimulus component separately. Results: One of the pigeons pecked more frequently when the response key was illuminated with the red light than when it was illuminated with the white triangle. By contrast, the other pigeon pecked more frequently when the white triangle was projected on the response key than when the key was illuminated by the red light. For the first bird, the pecking behavior was more strongly controlled by the red light. For the second bird, the pecking behavior was more strongly controlled by the white triangle. The stimulus control of instrumental behavior is demonstrated by variations in responding or differential responding related to variations in stimuli. So if an organism responds one way in the presence of one stimulus and in a different way in the presence of another stimulus, then its behavior has come under the control of those stimuli.
Cite experimental evidence for and against Miller's two-process theory of avoidance learning. Specifically, know the results of the experiment by Brown & Jacobs (1949).
Evidence for: - See Brown & Jacobs (1949) Evidence against: - See Lovibond et al. (2008) - See Sidman Avoidance
Give an example of a stimulus discrimination procedure. Is a multiple schedule of reinforcement an example? Describe it.
Example of a stimulus discrimination procedure: - See Reynolds (1961). Yes, it is. An instrumental conditioning procedure in which responding is reinforced in the presence of one stimulus (the S+) and not reinforced in the presence of another cue (the S-) is a special case of a multiple schedule of reinforcement. Stimulus discrimination and multiple schedules are common outside the laboratory. Playing a game yields reinforcement only in the presence of enjoyable or challenging partners. Driving rapidly is reinforced when you are on a freeway, but not when you are on a crowded city street. Loud and boisterous discussion with your friends is reinforced at a party. The same type of behavior is frowned upon during a church service. Eating with your fingers is reinforced at a picnic, but not when you are in a fine restaurant. Daily activities typically consist of going from one situation to another (to the kitchen, to the bus stop, to your office, to the grocery store, and so on), and each situation has its own schedule of reinforcement.
Partial-reinforcement extinction effect (PREE)
Extinction is much slower and involves fewer frustration reactions if partial reinforcement rather than continuous reinforcement was in effect before the introduction of extinction. This phenomenon is called the partial reinforcement extinction effect (PREE).
What is 'forgetting'?
Forgetting is the reduction of a learned response that occurs because of the passage of time, not because of particular experiences.
Honig et al. (1963)
Generalization gradients of excitation and inhibition This experiment is associated with Spence's model of discrimination learning. The key idea is that the S- has inhibitory properties which, together with the excitatory properties of the S+, produce a net effect on learning. The basic idea of the model is that reinforcement of a response in the presence of the S+ conditions excitatory response tendencies, whereas non-reinforcement of responding during the S- conditions inhibitory properties to the S- that serve to suppress the instrumental behavior. Differential responding to S+ and S- reflects both conditioned excitation to S+ and conditioned inhibition to S-. The experiment: Honig et al. trained pigeons to peck at a vertical line, designated the S+. A blank response key was the S-. In the graph, the vertical line is 90 degrees. As the experimenters shifted the line so that it became more horizontal, responding decreased. This produced a generalization gradient of excitation. Next, the experimenters trained pigeons for whom the S- was the vertical line and the S+ was the blank response key. As the experimenters shifted the line away from its vertical position, responding increased. This produced a generalization gradient of inhibition.
How does the experiment by Guttman and Kalish (1956) illustrate the idea of stimulus control?
Guttman and Kalish performed a study on stimulus generalization in instrumental conditioning. The results showed a gradient of responding as a function of how similar each test stimulus was to the original training stimulus. This is an example of a stimulus generalization gradient. Stimulus generalization gradients are an excellent way to measure stimulus control because they provide precise information about how sensitive the organism's behavior is to variations in a particular aspect of the environment. A comparison of the results obtained by Guttman and Kalish and the hypothetical experiment with color-blind pigeons indicates that the steepness of a stimulus generalization gradient provides a precise measure of the degree of stimulus control. A STEEP generalization gradient indicates GOOD control of behavior by the stimulus dimension that is tested. In contrast, a FLAT generalization gradient (picture) indicates POOR stimulus control.
How did Honig et al.'s (1963) experiment demonstrate that discrimination training can cause stimuli to acquire excitatory and inhibitory properties?
Honig et al. trained pigeons to peck at a vertical line, designated the S+. A blank response key was the S-. In the graph, the vertical line is 90 degrees. As the experimenters tilted the line toward horizontal, responding decreased. This produced a generalization gradient of excitation. Next, the experimenters trained pigeons with the roles reversed: the vertical line was the S- and the blank response key was the S+. As the experimenters shifted the line away from its vertical position, responding increased. This produced a generalization gradient of inhibition.
Sidman Avoidance Non-discriminated avoidance AKA Free-operant avoidance
In this procedure, the aversive stimulus is NOT preceded by a tone or other signal. An animal receives an aversive stimulus (mild shock) at fixed intervals (for example, every 10 seconds), without a warning signal - unless it performs an avoidance response (such as pressing a lever). After each avoidance response the timer is reset, so the avoidance behavior earns the animal a timed respite (break): the shock is delayed for a set amount of time. If the animal presses the lever again before the end of this respite, it earns another timed delay. Thus the animals learn to avoid the shocks by timing their responses accurately. In this procedure there are two independent temporal variables: 1. The shock-shock (S-S) interval 2. The response-shock (R-S) interval Each occurrence of the response initiates a period without shock, as set by the R-S interval. In the absence of a response, the next shock occurs a fixed period after the last shock, as set by the S-S interval. This was bad news for Miller's Two-Process Theory because, according to that theory, having no explicit CS/cue/signal means no aversive stimulus to escape, and escape from an aversive stimulus was supposed to reinforce avoidance behavior. According to the Two-Process Theory, reducing fear of the CS was the motivation for making the instrumental response. But with Sidman avoidance, there's no cue/CS at all. So the Sidman avoidance paradigm contradicts the two-process theory of avoidance.
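The two timing rules above can be sketched as a small simulation. This is a minimal sketch, not from the source: the function name, the default interval values (S-S = 10 s, R-S = 15 s), and the session length are illustrative assumptions.

```python
def simulate_sidman(response_times, ss_interval=10, rs_interval=15, session_end=60):
    """Times at which shocks are delivered in a free-operant avoidance session.

    ss_interval: S-S interval (time between shocks when no response occurs)
    rs_interval: R-S interval (safe period started/restarted by each response)
    (Hypothetical default values, chosen only for illustration.)
    """
    shocks = []
    responses = sorted(response_times)
    i = 0
    next_shock = ss_interval  # first shock is scheduled one S-S interval in
    while next_shock <= session_end:
        if i < len(responses) and responses[i] < next_shock:
            # a response before the scheduled shock restarts the safe period
            next_shock = responses[i] + rs_interval
            i += 1
        else:
            shocks.append(next_shock)
            next_shock += ss_interval  # S-S clock restarts after each shock
    return shocks
```

Note that if every response comes sooner than one R-S interval after the last, no shock is ever delivered, which is how a well-trained animal avoids shock indefinitely despite there being no warning cue.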
Jenkins & Harrison (1962)
Intra-dimensional discrimination training (Quick summary: A steeper generalization gradient was observed when S+ and S- were stimuli on the same dimension, differing only in that dimension's value. When you have a stimulus versus the absence of a stimulus, but you're looking for the effect of one property of the stimulus in question, you'll see a broader generalization gradient, because the subject may not be attending to that property. In this experiment, the property in question is the pitch of the tones. But in a group where a tone is presented (S+) versus not presented (S-), the subject may be attending to the loudness of the sound, not the pitch.) What is intra-dimensional discrimination training? - A training procedure in which S+ and S- differ only in terms of the value of one stimulus feature (in this case, pitch). Below, you'll see that group 2 uses this procedure and thus exhibits the steeper generalization gradient. Jenkins & Harrison examined how auditory stimuli that differ in pitch can come to control the pecking behavior of pigeons reinforced with food. There were 3 different training procedures: 1. One group of pigeons received a discrimination training procedure in which a 1,000 cps (cycles per second) tone served as the S+ and the ABSENCE of a tone served as the S-. 2. A second group also received discrimination training where the 1,000 cps tone again served as the S+. However, this time the S- was a 950 cps tone. 3. The third group of pigeons served as the control group and did not receive discrimination training. For them the 1,000 cps tone was continuously turned on, and they could always receive reinforcement during the experimental sessions. Testing phase: Each group was tested for the pecking response in the presence of tones of various frequencies to see how precisely pecking was under stimulus control by the pitch of the tone.
Results: The control group - which did not receive discrimination training - responded nearly equally in the presence of all of the test stimuli. The pitch of the tones did not control their behavior. Each of the other two groups showed more stimulus control by pitch. However, the steepest generalization gradient, and hence the strongest stimulus control, was observed in group 2 - the pigeons that were trained with the 1,000 cps tone as S+ and the 950 cps tone as S-. Pigeons in group 1 showed an intermediate gradient. Conclusions: 1. Discrimination training increases the stimulus control of instrumental behavior. 2. A particular stimulus dimension (such as tonal frequency or pitch) is most likely to gain control over responding if the S+ and S- differ along that stimulus dimension. Why did we see weaker stimulus control in group 1 (where the 1,000 cps tone was S+ and the ABSENCE of a tone was S-)? - Because the discrimination between the presence of a tone and the absence of a tone could be based on the loudness or timbre of the tone rather than its pitch. Picture: Notice that for the 1,000 cps S+/950 cps S- group, the peak shifts to the right. This is because the S- is the 950 cps tone, and the peak shifts away from the S-. This is an example of the peak-shift effect.
Miller's 2-Process Theory of Avoidance Learning
Intro: Avoidance procedures involve a negative contingency between a response and an aversive stimulus. If you make the appropriate avoidance response, you will not fall (by watching where you step), get rained on (by opening up your umbrella), or drive off the road (by paying attention while driving). No particular pleasure is derived from these experiences. You simply do not get hurt. The absence of the aversive stimulus is presumably the reason that avoidance responses occur. *The study of avoidance is concerned with how the absence of something provides reinforcement for instrumental behavior.* ----- The 'two-process theory of avoidance' assumes that TWO mechanisms are involved in avoidance learning: 1. CLASSICAL CONDITIONING process activated by pairings of the warning stimulus (CS) with the aversive event (US) on trials when the organism fails to make the avoidance response. Through these classical conditioning trials, the CS comes to elicit fear. Thus, the first component of two-process theory is the classical conditioning of fear to the CS. Fear is an emotionally arousing, unpleasant state. The termination of an unpleasant or aversive event provides negative reinforcement for instrumental behavior. 2. The second process in two-process theory is based on such negative reinforcement. The instrumental avoidance response is learned because the response terminates the CS and thereby reduces the conditioned fear elicited by the CS. Thus, the second component in two-process theory is INSTRUMENTAL REINFORCEMENT OF THE AVOIDANCE RESPONSE through fear reduction. ------------------ Classical conditioning HAS to occur first because instrumental reinforcement through fear reduction is not possible until fear has become conditioned to the CS. (Need something to reduce in the first place.) Classical conditioning is an establishing operation that enables the reinforcement of the instrumental response through fear reduction. 
However, successful avoidance responses constitute extinction trials for the CS (because the US gets omitted). Thus, two-process theory predicts repeated interaction between classical and instrumental processes. So classical conditioning makes possible instrumental negative reinforcement, but successful instrumental avoidance responding can result in extinction of the classically conditioned fear.
Safety signal theory
Performance of an avoidance response always results in distinctive feedback stimuli, such as spatial cues involved in going from one side to the other in a shuttle box or tactile and other external stimuli involved in pressing a response lever. Because the avoidance response produces a period of safety in all avoidance conditioning procedures, response feedback stimuli may acquire conditioned inhibitory properties and become signals for the ABSENCE of aversive stimulation. Such stimuli are called safety signals. The safety-signal hypothesis predicts that introducing an explicit feedback stimulus (i.e., a stimulus presented right after the avoidance response is made) will facilitate the learning of an avoidance response. Thus, the safety signals that accompany avoidance responses may provide positive reinforcement for avoidance behavior.
How does the concept of relational learning and transposition explain some results that are difficult to explain otherwise?
Spence's model of discrimination learning is an absolute stimulus learning model. It predicts behavior based on the NET excitatory properties of individual stimuli. The alternative approach assumes that organisms learn to respond to a stimulus based on the relation of that stimulus to other cues in the situation. For example, when presented with an S+ that is larger than the S-, the subject may respond to the S+ based on its relative size (in comparison to the S-) rather than in terms of its absolute size. Example of transposition and relational learning: In a series of experiments, Wolfgang Köhler trained chickens with a simple simultaneous discrimination in which a response to a darker shade of grey was not reinforced and a response to a lighter shade of grey was reinforced (or vice versa). Once the chickens learned the original discrimination, they were given a choice between the original lighter shade of grey and a novel, still lighter, shade of grey. Köhler reasoned that if the chickens had learned to respond to a specific shade value, then they ought to select the original grey shade given in training; but if they had learned to respond to the lighter of two shades (i.e., to the relationship between the two shades), then they ought to respond to the novel shade and ignore the previously reinforced shade. *Köhler reported that chickens (and, in subsequent experiments, apes) selected the novel (lighter) shade on over 70% of trials, indicating a preference for the "relationally correct" stimulus.* Köhler called this behavioral result transposition—just as the notes of musical melodies do not change their relation to each other when the melodies are moved or transposed to different keys, the learned relation remains intact when new stimuli are substituted.
How do spontaneous recovery, renewal, and reinstatement show that extinction is not forgetting or unlearning?
Spontaneous Recovery: Extinction typically produces a decline in conditioned behavior, but this effect dissipates with time. If a rest period is introduced after extinction training, responding is observed to recover. Because nothing specific is done during the rest period to produce the recovery, the effect is called 'spontaneous recovery.' Renewal: Another strong piece of evidence that extinction does not result in permanent loss of conditioned behavior is the phenomenon of renewal. Renewal refers to a recovery of acquisition performance when the contextual cues that were present during extinction are changed. The change can be a return to the context of original acquisition or a shift to a neutral context. Renewal has been of special interest for translational research because it suggests that clinical improvements that are achieved in the context of a therapist's office may not persist when the client returns home or goes to work or school. Reinstatement: Reinstatement refers to the recovery of conditioned behavior produced by exposures to the unconditioned stimulus. Consider, for example, learning an aversion to fish because you got sick (US) after eating fish on a trip. Your aversion is then extinguished by nibbling on fish without getting sick on a number of occasions. In fact, you may learn to enjoy eating fish again because of this extinction experience. The phenomenon of reinstatement suggests that if you were to become sick again for some reason, your aversion to fish would return even if your illness had nothing to do with eating this particular food.
Stimulus discrimination and ______ are two ways of considering the same phenomenon.
Stimulus control. If an organism does not discriminate between two stimuli, its behavior is not under the control of those cues.
What experimental evidence supports the claim that discrimination procedures increase stimulus control?
Stimulus discrimination and stimulus control are two ways of considering the same phenomenon. One cannot have one without the other. If an organism does not discriminate between two stimuli, its behavior is not under the control of those cues. In the Reynolds (1961) experiment, each of the pigeons attended to a component of the original training stimulus. Remember that the original training stimulus was a white triangle on a red background, so it had 2 components. During testing, one pigeon responded more to the red background and another pigeon responded more to the white triangle. So for one pigeon, the pecking response was under the control of the red background stimulus, while for the other pigeon the pecking response was controlled by the white triangle. The stimulus control of instrumental behavior is demonstrated by variations in responding (differential responding) related to variations in stimuli. If an organism responds one way in the presence of one stimulus and in a different way in the presence of another stimulus, its behavior has come under the control of those stimuli. Such differential responding was evident in the behavior of both pigeons Reynolds tested.
Tyler, Wortz, and Bitterman (1953)
Test of the discrimination hypothesis Why do subjects given partial (intermittent) reinforcement persist in responding during the extinction phase longer than subjects on other schedules (such as continuous reinforcement)? The discrimination hypothesis: Subjects on the partial reinforcement schedule cannot detect the beginning of the extinction phase. Tyler, Wortz, and Bitterman wanted to test the PREE (partial reinforcement extinction effect). One group received 50% reinforcement on an alternating schedule before extinction; another group received 50% reinforcement on a randomized schedule before extinction. Both groups received equal numbers of reinforcers. What happened? The group with a predictable pattern (alternating) was able to notice the beginning of the extinction phase. The group given randomized reinforcement trials could NOT predict the beginning of the extinction phase, and subjects that cannot tell when extinction begins continue responding for some time into it. The prediction was that the randomized group would persist longer into extinction, and that is what happened.
S+ is...
The S+ is a discriminative stimulus in whose presence responding is reinforced.
S- is...
The S- is a discriminative stimulus in whose presence responding is not reinforced.
The Sequential Theory (E. John Capaldi, Ph.D.) - a memory theory
The animal is able to 'remember' (keep track of what predicts what) its experience with a previous sequence of non-reinforcement and reinforcement. The memory of non-reinforced trials becomes a cue that reinforcement is coming, so the animal persists in responding.
Explain how Dobrzecka, Szwejkowska, and Konorski (1966) demonstrated constraints on stimulus control.
The experiment consisted of two tasks. One task was a left/right discrimination. The other was a go/no-go discrimination. In the left/right discrimination the experimenters found that the dogs attended to the LOCATION of the sound, not its qualitative features. This was because the discrimination itself was spatial, requiring a left versus right response. In the go/no-go discrimination the experimenters found that the dogs attended to the QUALITATIVE features of the sound. This was because the discrimination itself was qualitative, tasking the dog with doing or not doing something. So the nature of the required response constrains which stimulus features gain control - these are constraints on stimulus control.
Overtraining extinction effect
The more training that is provided with reinforcement, the stronger will be the expectancy of reward, and therefore the stronger will be the frustration that occurs when extinction is introduced. That in turn should produce more rapid extinction. This prediction has been confirmed and is called the 'overtraining extinction effect.'
What is peak shift? How does the Hull-Spence model explain peak shift?
The shift of the peak of the generalization gradient away from the original S+ is called the peak-shift effect. Spence assumed that intradimensional discrimination training produces excitatory and inhibitory stimulus generalization gradients centered at S+ and S-, respectively, in the usual fashion. However, because the S+ and S- are similar in intradimensional discrimination tasks (e.g., both being colors), the generalization gradients of excitation and inhibition will overlap. Furthermore, the degree of overlap will depend on the degree of similarity between S+ and S-. Because of this overlap, generalized inhibition from S- will suppress responding to S+, resulting in a peak-shift effect. More inhibition from S- to S+ will occur if S- is closer to S+, and this will result in a greater peak-shift effect.
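The overlap-of-gradients account can be sketched numerically. This is a minimal sketch, not from the source: the Gaussian shape, the gradient heights, and the width value are all hypothetical choices made only to illustrate how subtracting an inhibitory gradient at S- from an excitatory gradient at S+ moves the net peak away from S-.

```python
import math

def gradient(x, center, height, width=10.0):
    """A hypothetical Gaussian generalization gradient around a training stimulus."""
    return height * math.exp(-((x - center) ** 2) / (2 * width ** 2))

def net_response(x, s_plus=550.0, s_minus=555.0):
    """Spence's rule: net tendency = excitation around S+ minus inhibition around S-.

    550 nm S+ and 555 nm S- match the example in these notes; the
    heights (1.0 and 0.6) are arbitrary illustrative values.
    """
    return gradient(x, s_plus, height=1.0) - gradient(x, s_minus, height=0.6)

# Locate the stimulus value with the strongest net response tendency.
wavelengths = [w / 10 for w in range(5200, 5810)]  # 520.0 .. 580.9 nm
peak = max(wavelengths, key=net_response)
```

Because the inhibitory gradient around the 555 nm S- overlaps the excitatory gradient around the 550 nm S+, the net peak lands below 550 nm, shifted away from S- - the peak-shift effect. Making S- closer to S+ (or the inhibitory gradient taller) increases the overlap and thus the size of the shift.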
Resistance to extinction
The tendency to continue responding once reinforcement stops. Example: PREE (partial reinforcement extinction effect)
What is it about partial reinforcement that causes the subject to, during testing, persist responding into the extinction phase?
There are 3 theories: 1. The Discrimination Hypothesis - under partial reinforcement (PRF), the subject cannot anticipate the onset of extinction 2. The Frustration Theory of the PREE (Abraham Amsel, Ph.D.) - an emotion theory 3. The Sequential Theory (E. John Capaldi, Ph.D.) - a memory theory
Use Reynolds' (1961) experiment to illustrate the idea of differential response and stimulus discrimination.
This experiment illustrates several important ideas. First, it shows how to experimentally determine whether instrumental behavior has come under the control of a particular stimulus. The stimulus control of instrumental behavior is demonstrated by variations in responding (differential responding) related to variations in stimuli. If an organism responds one way in the presence of one stimulus and in a different way in the presence of another stimulus, its behavior has come under the control of those stimuli. Such differential responding was evident in the behavior of both pigeons Reynolds tested. One pigeon attended more to the red background. Another pigeon attended more to the white triangle. Differential responding to two stimuli also indicates that the pigeons were treating each stimulus as different from the other - this is stimulus discrimination.
Magnitude of reinforcement extinction effect
This phenomenon refers to the fact that responding declines more rapidly in extinction following reinforcement with a larger reinforcer and is also readily accounted for in terms of the frustrative effects of non-reward. Non-reinforcement is likely to be more frustrating if the individual has come to expect a large reward than if the individual expects a small reward. Consider the following scenarios. In one you receive $100/month from your parents to help with incidental expenses at college. In the other, you get only $20/month. In both cases your parents stop the payments when you drop out of school for a semester. This nonreinforcement will be more aversive if you had come to expect the larger monthly allowance.
Systematic desensitization
This therapy aims to remove the fear response of a phobia and gradually substitute a relaxation response to the conditioned stimulus using counterconditioning. This is done by forming a hierarchy of fear-eliciting situations involving the conditioned stimulus (e.g., a spider), ranked from least fearful to most fearful. The patient works their way up, starting at the least unpleasant situation and practicing their relaxation technique as they go. When they feel comfortable at a given stage (they are no longer afraid), they move on to the next stage in the hierarchy, and so on. The keyword here is 'gradual'.