Chapter 11: Theories of learning
Schedule of reinforcement
A program for giving reinforcement; specifically, the frequency and manner in which a desired response is reinforced. The schedule that is used will influence the speed of learning (the response acquisition rate) and the strength of the learned response. Reinforcement may be provided on a continuous schedule (i.e. after every correct response) or on a partial reinforcement schedule (i.e. only on some occasions for performing the correct response).
Response
A reaction by an organism to a stimulus.
A symbolic model
A real or fictional character displaying behaviour in books, movies, television programs, online and other media.
A live model
A real-life person who may be demonstrating, acting out and/or describing or explaining a behaviour.
Token economies
A setting in which an individual receives tokens (reinforcers) for desired behaviour. These tokens can then be collected and exchanged for other reinforcers in the form of actual rewards. Tokens may also be withdrawn and, in many cases, penalties are used and individuals are 'fined' a certain number of tokens for inappropriate behaviour.
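The earn, fine and exchange steps of a token economy can be sketched as a small ledger. This is only an illustrative sketch: the class name `TokenEconomy`, its method names, the example behaviours and the rule that a balance cannot fall below zero are all assumptions, not part of the definition above.

```python
class TokenEconomy:
    """Toy ledger for one individual's tokens (hypothetical, for illustration)."""

    def __init__(self):
        self.balance = 0

    def reward(self, tokens):
        # Desired behaviour earns tokens, which act as reinforcers.
        self.balance += tokens

    def fine(self, tokens):
        # Inappropriate behaviour costs tokens.
        # Assumption: the balance is never allowed to drop below zero.
        self.balance = max(0, self.balance - tokens)

    def exchange(self, price):
        # Trade collected tokens for an actual reward, if there are enough.
        if self.balance >= price:
            self.balance -= price
            return True
        return False


economy = TokenEconomy()
economy.reward(5)             # e.g. completed a set task
economy.fine(2)               # e.g. fined for inappropriate behaviour
print(economy.balance)        # 3
print(economy.exchange(4))    # False: not enough tokens yet
economy.reward(2)
print(economy.exchange(4))    # True: reward granted, 1 token left
```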
Positive reinforcer
A stimulus that strengthens or increases the frequency or likelihood of a desired response by providing a satisfying consequence.
Operant conditioning
A type of learning whereby the consequences of an action determine the likelihood that it will be performed again in the future. More specifically, the theory proposes that an organism will tend to repeat behaviours (operants) that have desirable consequences (such as receiving a treat), or that enable it to avoid undesirable consequences (such as being given detention). Furthermore, organisms will tend not to repeat a behaviour that has undesirable consequences.
Observational learning processes
According to Bandura's social learning theory, observational learning involves a sequence of processes called attention, retention, reproduction, motivation, and reinforcement
Spontaneous recovery (operant conditioning)
After the apparent extinction of a conditioned response, spontaneous recovery can occur: the organism once again shows the response in the absence of any reinforcement. The response is likely to be weaker and will probably not last very long. It is often stronger when it occurs after a lengthy period following extinction than when it occurs relatively soon after extinction.
Conditioned emotional response
An emotional reaction that usually occurs when the autonomic nervous system produces a response to a stimulus that did not previously trigger that response (e.g. fears and phobias).
Stimulus
Any object or event that elicits (produces) a response from an organism
Operant
Any response (or set of responses) that acts (operates) on the environment to produce some kind of consequence. It is behaviour that has an impact on the environment in some way. In turn, the environment provides an event that makes the behaviour more or less likely to recur. Positive consequences strengthen the behaviour and make it more likely to recur and adverse consequences weaken the behaviour and make it less likely to recur. Since the consequences occur in the environment, the environment determines whether or not the operant occurs.
Unconditioned stimulus (UCS)
Any stimulus that consistently produces a particular, naturally occurring, automatic response.
Reinforcer
Any stimulus that strengthens or increases the frequency or likelihood of a response that it follows
Negative reinforcer
Any unpleasant or aversive stimulus that, when removed or avoided, strengthens or increases the frequency or likelihood of a desired response
Applications of classical conditioning
Essentially, classically conditioned responses are conditioned reflexes that are acquired through associative learning; they are 'conditional' upon an organism's experience. Classically conditioned responses are described as involving anticipatory behaviour. Consequently, learning through classical conditioning may be involuntary and relatively simple, but conditioned reflexes or responses acquired through classical conditioning may not necessarily be 'thoughtless' and are therefore not as 'mechanistic' as they were believed to be.
Appropriateness
For any stimulus to be a reinforcer, it must provide a pleasing or satisfying consequence for its recipient. Similarly, for any stimulus to be an appropriate punisher, it must provide a consequence that is unpleasant and therefore likely to decrease the likelihood of the undesirable behaviour. An inappropriate punisher can have the opposite effect and produce the same consequence as a reinforcer. Although punishment may temporarily decrease the occurrence of unwanted responses or behaviour, it doesn't promote more desirable or appropriate behaviour in its place.
Three-phase model of operant conditioning
Has three parts that occur in a specific sequence: (1) the discriminative stimulus that occurs before a particular response; (2) the response that occurs due to the discriminative stimulus; (3) the consequence of the response.
Retention
Having observed the model, we must be able to remember the model's behaviour. Responses learned by modelling are often not needed until some time after they have been acquired. We need to store in memory a mental representation of what we have observed, and the more meaningful we can make that representation, the more accurately we will be able to replicate the behaviour when necessary.
The discriminative stimulus
Must be present for a response to occur. The stimulus that precedes a particular response signals the probable consequence for the response and therefore influences the occurrence of the response.
Motivation
The observer must also be motivated to perform the behaviour; that is, they must want to reproduce what was observed. Unless the behavioural response is useful or provides an incentive or reward for the observer, it is unlikely that they will want to learn it in the first place, let alone perform it or continue to perform it
Extinction (operant conditioning)
The process is similar to its occurrence in classical conditioning. In operant conditioning, it is the gradual decrease in the strength or rate of a conditioned (learned) response following consistent non-reinforcement of the response. It is said to have occurred when a conditioned response is no longer present. It is less likely to occur when partial reinforcement is used; that is, when reinforcement does not regularly follow every correct response. The uncertainty of the reinforcement leads to a greater tendency for the response to continue.
Classical conditioning
Refers to a type of learning that occurs through repeated association of two (or more) different stimuli. We learn that two events go together after we experience them occurring together on a number of occasions. Occurs when a particular stimulus consistently produces a response that it didn't previously elicit. A response that is automatically produced by one stimulus becomes associated, or linked, with another stimulus that would not normally produce this response.
Acquisition (operant conditioning)
Refers to the overall learning process during which a specific response, or pattern of responses, is established. However, the means by which behaviour is acquired in operant conditioning differs from that of classical conditioning. In operant conditioning, it is the establishment of a response through reinforcement. The speed with which the response is established depends on which schedule of reinforcement is used.
Association
Refers to the pairing or linking of one stimulus with another stimulus; a stimulus that would not normally produce a particular automatic response is associated with a stimulus that would produce the automatic response.
Timing
Reinforcement and punishment are most effective when given immediately after the response has occurred. This timing helps to ensure that the organism associates the response with the reinforcer or punisher, without interference from other factors during the time delay. Timing also influences the strength of the response. If there is a considerable delay between the response and the consequence, learning will generally be very slow to progress and in some cases may not occur at all.
Reinforcement
Said to occur when a stimulus strengthens or increases the frequency or likelihood of a response that it follows. This may involve using a positive stimulus or removing a negative stimulus to subsequently strengthen or increase the frequency or likelihood of a preceding response or operant. An essential feature of reinforcement is that it is only used after the desired or correct response is made.
Acquisition (classical conditioning)
The overall process during which an organism learns to associate two events (the CS and the UCS). The presentations of the CS and the UCS occur close together in time and always in the same sequence. The duration of this stage is usually measured by the number of trials it takes for the CR to be acquired (learned). One of the important considerations in classical conditioning is the timing of the CS and the UCS pairing. A very short time between presentations of the two stimuli is most effective. The end of this stage is said to occur when the CS alone produces the CR
Conditioning
The process of learning associations between a stimulus in the environment (one event) and a behavioural response (another event). More to do with the learning process; how the learning occurs
Partial reinforcement
The process of reinforcing some correct responses but not all of them. May be delivered in a number of ways or by different schedules. Reinforcement can be given after a certain number of correct responses have been made (as a ratio) or after a certain amount of time has elapsed following the last correct response (after an interval). Furthermore, reinforcement may be given on a regular basis (fixed), or it may be unpredictable (variable). Behaviour that is conditioned or maintained on a partial reinforcement schedule is generally considered to be the most difficult to change.
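The fixed/variable distinction for ratio schedules can be sketched as two functions that decide which correct responses earn a reinforcer. This is a minimal sketch under stated assumptions: the function names and parameters are hypothetical, and the variable schedule draws each gap uniformly between 1 and `2 * mean_ratio - 1` responses, which is just one simple way to make the count unpredictable.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Return the 1-based positions of reinforced responses on a fixed-ratio schedule."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def variable_ratio(n_responses, mean_ratio, rng):
    """Reinforce after an unpredictable number of responses.

    Assumption: each gap is drawn uniformly from 1 .. 2 * mean_ratio - 1,
    so gaps average mean_ratio responses.
    """
    reinforced = []
    next_at = rng.randint(1, 2 * mean_ratio - 1)
    for i in range(1, n_responses + 1):
        if i == next_at:
            reinforced.append(i)
            next_at = i + rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


print(fixed_ratio(10, 1))  # ratio of 1: every response is reinforced (continuous)
print(fixed_ratio(10, 5))  # [5, 10]
print(variable_ratio(20, 5, random.Random(0)))  # positions vary with the seed
```

Passing a seeded `random.Random` keeps the variable schedule repeatable for demonstration, while the learner in a real setting would have no way of predicting which response is reinforced.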
Spontaneous recovery (classical conditioning)
The reappearance of the CR when the CS is presented, following a rest period (i.e. when no CS is presented) after the CR appears to have been extinguished. Furthermore, the CR tends to be weaker than it was originally.
Negative reinforcement
The removal or avoidance of an unpleasant stimulus. It has the effect of increasing the likelihood of a response being repeated and thereby strengthening the response
Unconditioned response (UCR)
The response that occurs automatically when the UCS is presented. A reflexive involuntary response that is predictably caused by a UCS.
Continuous reinforcement
The schedule of reinforcing every correct response after it occurs. Once a correct response consistently occurs, a different reinforcement schedule can be used to maintain, increase or strengthen the response.
Conditioned stimulus (CS)
The stimulus that is 'neutral' at the start of the conditioning process and does not normally produce the unconditioned response. However, through repeated association with the UCS, the CS triggers a very similar response to that caused by the UCS.
Stimulus generalisation (classical conditioning)
The tendency for another stimulus that is similar to the original CS to produce a response that is similar (but not necessarily identical) to the CR. The greater the similarity between stimuli, the greater the possibility that a generalisation will occur.
Stimulus discrimination (operant conditioning)
This occurs when an organism makes the correct response to a stimulus and is reinforced but does not respond to any other stimulus, even when stimuli are similar (but not identical).
Stimulus generalisation (operant conditioning)
This occurs when the correct response is made to another stimulus that is similar (but not necessarily identical) to the stimulus that was present when the conditioned response was reinforced. This response usually occurs at a reduced level.
A model
Who or what is being observed and may be live or symbolic
Neutral stimulus
Anything that does not normally produce a predictable response. In particular, this stimulus is 'neutral' to the UCR.
Key elements of classical conditioning
Unconditioned stimulus, unconditioned response, conditioned stimulus, conditioned response
Aversion therapy
A form of behaviour therapy that applies classical conditioning processes to inhibit or discourage undesirable behaviour by associating (pairing) it with an aversive (unpleasant) stimulus such as a feeling of disgust, pain or nausea.
Graduated exposure
A form of treatment for phobias and fears that involves presenting successive approximations of the CS until the CS itself does not produce the conditioned response. Techniques involve progressively introducing the client to stimuli that are increasingly similar to the one that produces the conditioned response requiring extinction, and ultimately to the CS itself. In this way, the client is gradually 'desensitised' to the fear- or anxiety-producing object or event.
Shaping
A procedure in which a reinforcer is given for any response that successively approximates and ultimately leads to the final desired response, or target behaviour.
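Shaping by successive approximations can be sketched as a criterion that is raised each time a response meets it, until it reaches the target behaviour. This is a rough sketch, not a standard procedure: the function `shape`, its parameters, the numeric measure of the response and the example data are all assumptions made for illustration.

```python
def shape(responses, target, start_criterion, step):
    """Reinforce responses that meet a criterion which is raised towards the target.

    Returns a list of (response, criterion_in_force, reinforced) tuples.
    """
    criterion = start_criterion
    log = []
    for response in responses:
        reinforced = response >= criterion
        log.append((response, criterion, reinforced))
        if reinforced and criterion < target:
            # Demand a closer approximation of the target next time.
            criterion = min(target, criterion + step)
    return log


# Hypothetical data: seconds a learner performs the behaviour; target is 10 s.
for row in shape([2, 3, 5, 4, 7, 9, 10], target=10, start_criterion=2, step=2):
    print(row)
# (2, 2, True)
# (3, 4, False)
# (5, 4, True)
# (4, 6, False)
# (7, 6, True)
# (9, 8, True)
# (10, 10, True)
```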
Social learning theory
Bandura's studies of observational learning processes led him to develop this. It emphasises the importance of the environment, or 'social context', in which learning occurs. Bandura proposed that from the time we are born we are surrounded by other people displaying a huge variety of behaviours, all of which we are able to observe. This provides us with a rich source of information about our environment. Through observation we learn many behaviours, not by actually carrying out the behaviour and experiencing the consequences, but simply by watching the behaviour and its consequences being experienced by someone else.
Respondents
Behaviours elicited by known or recognised stimuli
External reinforcement
Comparable to learning by consequences
Vicarious conditioning
During this, the individual watches a model's behaviour being either reinforced or punished, and then subsequently behaves in exactly the same way or in a modified way, or refrains from the behaviour, as a result of what they have observed.
Comparing classical and operant conditioning
In both CC and OC there is an acquisition process whereby a response is conditioned or learned. In CC, the association of two stimuli, the CS and UCS, provides the basis of learning. In OC, the association is with an operant response to a stimulus and the consequence that follows the response, as described by Skinner's three-phase model of S -> R -> C. In both types of conditioning, extinction of the learned response can occur. In CC, extinction takes place over a period when the UCS is withdrawn or is no longer present and the CS is repeatedly presented alone. In OC, extinction also occurs over time, but after reinforcement is no longer given. In both CC and OC, extinction can be interrupted by spontaneous recovery. Although not unique to conditioning, in both CC and OC stimulus generalisation and stimulus discrimination can occur. In addition, both types of conditioning are achieved as a result of the repeated association of two events that follow each other closely in time. These similarities in the two types of conditioning have led some psychologists to propose that both CC and OC are variants of a single learning process. Furthermore, CC and OC often occur in the same situation.
The role of the learner
In classical conditioning, the learner is a passive participant in the conditioning process. The learner does not have to do anything for the CS or the UCS to be presented. Furthermore, the response made by the learner occurs automatically without them having to make any effort or actively do anything. The learner essentially has no control over the learning process. In operant conditioning, the learner is an active participant in the learning process. The learner must operate on the environment before reinforcement or punishment is received. The learner is neither reinforced nor punished without performing the behaviour that produces the consequence. In this sense, the learner has control over the learning process.
Timing of the stimulus and response
In classical conditioning, the response (e.g. salivation) depends on the presentation of the UCS occurring first. In operant conditioning, the presentation of the reinforcer or punisher depends on the response occurring first. The response occurs in the presence of a stimulus. The reinforcer or punisher received as a consequence of the response strengthens or weakens the stimulus-response association. In classical conditioning, the timing of the two stimuli (CS then UCS) produces an association between them that conditions the learner to anticipate the UCS and respond to it even if it is not presented. In operant conditioning, the association that is conditioned is between the stimulus and the response. The response is either strengthened by reinforcement or weakened through punishment.
The nature of the response
In classical conditioning, the response by the learner is usually a reflexive involuntary one. In operant conditioning, the response by the learner is usually a voluntary one but may also be involuntary. In classical conditioning, the response is often one involving the action of the autonomic nervous system, and the association of the two stimuli is often not conscious or deliberate. In operant conditioning, the response may involve the autonomic nervous system but usually involves the central nervous system and is conscious, intentional and often goal-directed.
Differences between classical and operant conditioning
In operant conditioning the consequence of a response is a vital component of the learning process in that a behaviour becomes more or less likely, more or less frequent or strengthened, depending on its consequence. In classical conditioning the behaviour of the organism does not have any environmental consequences. Classical and operant conditioning also generally involve different types of responses. In classical conditioning, the response is involuntary; an automatic reaction to something happening in the environment (such as the sight of food or the sound of the bell). Operant conditioning, however, involves voluntary responses that are initiated by the organism, as well as involuntary responses.
Attention
In order to learn through observation, we must pay attention to, or closely watch, a model's behaviour and its consequences. If we do not attend to the model's behaviour, we will not recognise its distinctive features and may fail to notice the consequences. Attention may be influenced by several factors, including the perceptual capabilities of the observer, the motivation and interest level of the observer, the kinds of distracters that are present, and the characteristics of the model, such as attractiveness. Our level of attention is also influenced by such factors as the importance of the behaviour. According to Bandura, we pay closer attention to, and are more likely to imitate, models who have the following characteristics: the model is perceived positively, is liked and has a high status; there are perceived similarities between features and traits of the model and the observer, such as age and sex; the model is familiar to the observer and is known through previous observation; the model's behaviour is visible and stands out clearly against other 'competing' models; the model is demonstrating behaviour that the observer perceives themselves as being able to imitate.
Vicarious reinforcement
Increases the likelihood of the observer behaving in a similar way to a model whose behaviour is reinforced. Thus, the observer is conditioned through observing someone else being reinforced without personally experiencing the reinforcement or consequence directly
Reinforcement
Influences the motivation to reproduce the observed behaviour and increases the likelihood of reproduction. Bandura distinguished between different types of reinforcement that impact on motivation, in addition to the standard types described by Skinner.
Flooding
Involves bringing the client into direct contact with the anxiety or fear-producing stimulus and keeping them in contact with it until the conditioned response is extinguished. It is believed that people will stop fearing the stimulus and experiencing the anxiety associated with it when they are exposed to it directly.
Fixed-interval schedule
Involves delivery of the reinforcer after a specific time has elapsed since the previous reinforcer, provided the correct response has been made.
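The rule for a fixed-interval schedule can be sketched as a check on how much time has passed since the previous reinforcer. This is a minimal sketch: the function name, the use of plain numbers for time and the assumption that the clock starts at time 0 are all hypothetical choices for illustration.

```python
def fixed_interval(response_times, interval):
    """Return the times of correct responses that earn a reinforcer.

    A response is reinforced only if at least `interval` time units have
    elapsed since the previous reinforcer.
    """
    reinforced = []
    last_reinforcer = 0  # assumption: the schedule's clock starts at time 0
    for t in sorted(response_times):
        if t - last_reinforcer >= interval:
            reinforced.append(t)
            last_reinforcer = t
    return reinforced


# Hypothetical response times in seconds, with a 30-second interval.
print(fixed_interval([5, 20, 31, 45, 62, 70, 125], interval=30))  # [31, 62, 125]
```

Responses made before the interval has elapsed (at 5, 20, 45 and 70 seconds here) go unreinforced, even though they are correct.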
Response cost
Involves removal of any valued stimulus, whether or not it causes the behaviour. There is a 'cost' for making a 'response'. It is a form of punishment because it decreases the likelihood of a behaviour occurring.
Positive punishment
Involves the presentation (or introduction) of a stimulus, thereby decreasing (or weakening) the likelihood of a response occurring again.
Negative punishment
Involves the removal or loss of a stimulus, thereby decreasing (or weakening) the likelihood of a response occurring again.
Positive reinforcement
Occurs from giving or applying a positive reinforcer after the desired response has been made
Vicarious reinforcement
Occurs indirectly by observing the modelled behaviour being reinforced without personally experiencing the reinforcement.
Stimulus discrimination (classical conditioning)
Occurs when a person responds to the CS only, but not to any other stimulus that is similar to the CS.
Observational learning
Occurs when someone uses observation of a model's actions and the consequences of those actions to guide their future actions. Often called modelling or social learning. We are more likely to model, learn and reproduce responses that are observed to be desirable and reinforcing.
Vicarious punishment
Occurs when the likelihood of an observer performing a particular behaviour decreases after having seen a model's behaviour being punished.
Self-reinforcement
Occurs when we are reinforced by meeting certain standards of performance we set for ourselves
Factors that influence the effectiveness of reinforcement and punishment
Order of presentation, timing, appropriateness
Punishment
The delivery of an unpleasant consequence following a response, or the removal of a pleasant consequence following a response. Has the same unpleasant quality as a negative reinforcer, but unlike a negative reinforcer, it is given or applied, whereas the negative reinforcer is prevented or avoided. It weakens the response or decreases the probability of that response occurring again over time.
The consequence
The environmental event that occurs immediately after the response and determines whether or not the response will occur again. Skinner argued that any behaviour which is followed by a consequence will change in strength and frequency depending on the nature of that consequence.
Extinction (classical conditioning)
The gradual decrease in the strength or rate of a CR that occurs when the UCS is no longer presented. Is said to have occurred when a CR no longer occurs following the presentation of the CS.
Conditioned response (CR)
The learned response that is produced by the CS. Occurs after the CS has been associated with the UCS. The behaviour involved is very similar to that of the UCR, but it is triggered by the CS alone.
Order of presentation
To use reinforcement and punishment effectively it is essential that either be presented after a desired response, never before. This helps to ensure that the organism learns the consequences of a particular response.
The response
Voluntary behaviour that occurs in the presence of the discriminative stimulus. In all cases, it involves activity that has an effect on the environment in the form of a consequence that follows it
Variable-interval schedule
When reinforcement is given after irregular periods have passed, provided the correct response has been made.
Reproduction
When the model's behaviour has been closely attended to and retained in memory, we can attempt to reproduce, or imitate, what has been observed. We must, however, have the ability to put into practice what we observed.
Fixed-ratio schedule
When the reinforcer is given after a set, unvarying number of desired responses have been made
Variable-ratio schedule
When the reinforcer is given after an unpredictable number of correct responses