Chapter 11 - Positive reinforcement
Naturalistic free operant
conducted in the learner's everyday environment; the observer discreetly observes and records which activities the learner engages in and the duration of each.
Control procedures for positive reinforcement
control is demonstrated by comparing response rates in the absence and presence of a contingency, and then showing that the behavior can be turned on and off by presenting and withdrawing the contingency.
DRO
delivers reinforcer whenever target behavior has not occurred during a set time interval. ABCBC : A=baseline, B= reinforcer is presented contingent on occurrence of target behavior, C= DRO condition, where reinforcement is delivered contingent on absence of target behavior.
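The DRO rule above can be sketched in Python. This is only a minimal illustration, assuming a fixed-interval DRO in which the session is divided into equal, non-resetting intervals; the function name `dro_deliveries` and its parameters are hypothetical and do not come from the notes.

```python
def dro_deliveries(response_times, interval, session_length):
    """Return the end times of the intervals in which the reinforcer is delivered.

    Assumption: the session is split into consecutive intervals of equal
    length, and the reinforcer is delivered at the end of every interval
    that contains no occurrence of the target behavior.
    """
    deliveries = []
    start = 0
    while start + interval <= session_length:
        end = start + interval
        # A response at exactly `end` is counted in the next interval.
        occurred = any(start <= t < end for t in response_times)
        if not occurred:
            deliveries.append(end)
        start = end
    return deliveries


# Target behavior at 5 s, 12 s, and 31 s of a 40 s session with 10 s intervals:
# only the 20-30 s interval is free of the behavior.
print(dro_deliveries([5, 12, 31], interval=10, session_length=40))  # [30]
```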
Reinforcement
does more than increase future frequency of behavior it follows; it also changes the function of stimuli that immediately precede the reinforced behavior.
Magnitude of reinforcer
duration of reinforcer, # of reinforcers, intensity of reinforcer
Types of reinforcers
edible, sensory, tangible, social, activity
Multiple schedule
two or more component schedules for a single response with only one component schedule in effect at a given time. An SD signals the presence of each component schedule and that stimulus is present as long as the schedule is in effect.
Concurrent schedules
two or more contingencies operate independently and simultaneously for two or more behaviors. Puts two or more stimuli together to see which will produce the larger increase in responding. Also used to determine differences between relative and absolute reinforcement effects of stimuli. Potential effects of a stimulus as a reinforcer may be masked or overshadowed when that stimulus is pitted against another on a concurrent schedule.
Reinforcer assessment
variety of methods used to present one or more stimuli contingent on a target response and then measure the future effects on the rate of responding.
Premack principle
a high probability activity can be used as a reinforcer contingent on completion of a low probability activity. Not tangibles.
Methods to teach and promote reinforcer delay
1. Make reinforcers visible 2. Gradually increase delay 3. Use conditioned reinforcers during delay 4. Teach clients self-instruction or self-prompting ("I only have to wait a few minutes")
Response to reinforcement (teaching delay of gratification)
1. Start with short delay, then increase 2. Provide verbal assurance reinforcement is coming 3. Provide activity to bridge gap between behavior and reinforcer.
Behavior is likely a result of instructional control (rule following) if
1. There is no identifiable immediate consequence 2. The response-to-consequence delay is greater than 30 seconds 3. Frequency of behavior changes due to changes in antecedents 4. A single instance of reinforcement causes a large increase in behavior
Disadvantages of reinforcement
1. When a reinforcer is too powerful, it may evoke behaviors that interfere with the behavior it is supposed to reinforce. 2. When rate of responding increases for one behavior, responding can decrease for other behaviors. 3. Reinforcement of a particular response can strengthen all responses in the same functional response class. 4. Reinforcing stimuli that also function as conditioned elicitors can elicit behaviors that are incompatible with the target response. 5. An undesirable effect of SR+ is an increase in undesirable behaviors that are members of the same response class as the behavior being reinforced.
Using reinforcement effectively
1. set easy criterion at first, 2. use high quality reinforcers, 3. vary reinforcers, 4. use direct, rather than indirect reinforcement contingencies, 5. combine response prompts with reinforcement, 6. record each occurrence initially, 7. gradually increase response requirements, 8. gradually shift from contrived to naturally occurring reinforcers.
Generalized conditioned reinforcer
CR that has been paired with various UR/CR and doesn't depend on a current EO for any one particular form of reinforcement for its effectiveness. Provides the basis for implementing a token economy.
Reinforcer assessment (types)
Concurrent, multiple, progressive ratio
Response deprivation hypothesis
If access to a behavior is restricted below baseline levels, a learner will work to regain access to that activity, regardless of whether it is a high probability or low probability behavior. Restricting access to a behavior is like deprivation, so giving access to that behavior is a form of reinforcement.
Conditioned reinforcer (secondary/learned)
NS is paired with a UR/CR and acquires reinforcing properties through stimulus-stimulus pairing.
Stimulus preference assessment
Refers to a variety of procedures used to determine the stimuli the person prefers and the value of the stimuli
Control procedures for positive reinforcement (types)
Reversal design involving withdrawal of reinforcement contingency, NCR, DRO, DRA
Contrived/planned reinforcement
arranged for purpose of modifying behavior
Offering pre-task choice
ask participant to choose what he wants from options presented
Stimulus preference assessments (types)
asking person, asking significant other, offering pretask choice, free operant, contrived free operant, naturalistic free operant
Progressive ratio schedules
assess effectiveness of a stimulus as a reinforcer as response requirements increase. Response requirements are gradually increased over time, independent of the participant's behavior, until a breaking point is reached and response rate declines.
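A progressive ratio assessment can be sketched as follows. This is a rough illustration, assuming an arithmetic progression of requirements and taking the breaking point to be the last requirement the participant completed, which is one common convention; the helper names `progressive_ratio` and `breaking_point` are hypothetical.

```python
def progressive_ratio(start, step, n_steps):
    """Build an arithmetic series of response requirements, e.g. 2, 4, 6, ..."""
    return [start + step * i for i in range(n_steps)]


def breaking_point(requirements, responses_emitted):
    """Return the last requirement that was completed.

    responses_emitted[i] is the number of responses the participant emitted
    while requirements[i] was in effect. Returns None if even the first
    requirement was not met.
    """
    last_completed = None
    for required, emitted in zip(requirements, responses_emitted):
        if emitted < required:
            break  # responding fell short of the requirement
        last_completed = required
    return last_completed


schedule = progressive_ratio(start=2, step=2, n_steps=5)  # [2, 4, 6, 8, 10]
print(breaking_point(schedule, [2, 4, 6, 3, 0]))          # 6
```

A stimulus that sustains responding to a higher breaking point would, under this reading, be treated as the more effective reinforcer.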
A reinforcer doesn't affect the response it follows
it only increases the frequency with which similar responses are emitted in the future.
Quality of reinforcer
level of individual preference for reinforcer
Free operant
observe and record the activities the person engages in and the duration of each. Person has free access to a predetermined set of items.
Positive reinforcement
occurs when a response is immediately followed by the presentation of a stimulus and as a result, similar responses occur more frequently in the future.
Asking target person
open ended questions, choice format, rank ordering
Contrived free operant
participant is given brief exposure to each item prior to preference assessment, free operant conducted after initial exposure.
Multiple stimuli
person chooses from an array of three or more stimuli. Can occur with/without replacement.
Response deprivation hypothesis
predicts whether access to one behavior (contingent behavior) will function as reinforcement for another behavior (instrumental response).
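The prediction can be checked with a small calculation. A common formal statement of the hypothesis is that a contingency produces response deprivation when I / C > Bi / Bc, where I is the amount of the instrumental response required, C is the amount of contingent access earned, and Bi and Bc are the baseline levels of each. The sketch below assumes that form and uses hypothetical names and made-up numbers.

```python
def is_response_deprivation(instrumental_required, contingent_allowed,
                            baseline_instrumental, baseline_contingent):
    """Check the condition I / C > Bi / Bc.

    Cross-multiplied to avoid division; assumes all amounts are positive
    and expressed in the same unit (for example, minutes).
    """
    return (instrumental_required * baseline_contingent
            > baseline_instrumental * contingent_allowed)


# Hypothetical baseline: 10 min of worksheets and 20 min of drawing per session.
# Contingency 1: 5 min of worksheets earns 2 min of drawing (2.5 > 0.5).
print(is_response_deprivation(5, 2, 10, 20))   # True
# Contingency 2: 1 min of worksheets earns 10 min of drawing (0.1 < 0.5).
print(is_response_deprivation(1, 10, 10, 20))  # False
```

Only the first contingency restricts drawing below its baseline level, so only there is access to drawing predicted to function as reinforcement for worksheet completion.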
Self-control
preference for larger, delayed reinforcer rather than smaller, immediate reinforcer
Impulsivity
preference for smaller, immediate reinforcer rather than larger, delayed reinforcer
Noncontingent reinforcement (NCR)
presentation of a potential reinforcer on a FT/VT schedule independent of the occurrence of the target behavior. Presenting the reinforcer without the contingent response allows any effects of the stimulus presentation alone to be detected. Has a minimum of 5 phases (ABCBC): A = baseline, B = NCR condition, C = reinforcer is presented contingent on occurrence of target behavior. Offers the most unconfounded demonstration of the effects of positive reinforcement.
Paired stimuli (forced choice)
two stimuli are presented and the person chooses one; each stimulus is paired with every other stimulus in the set, in random order. Stimuli are then ranked as high, medium, or low preference.
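Scoring a paired-stimuli assessment can be sketched as below. This is a minimal example, assuming each trial is recorded as a `(left, right, chosen)` tuple and that preference is summarized as the percentage of presentations on which a stimulus was chosen; the function name and the sample data are hypothetical, and no cutoffs for high, medium, or low are implied.

```python
from collections import Counter


def selection_percentages(trials):
    """Rank stimuli by the percentage of presentations on which each was chosen.

    trials: iterable of (left, right, chosen) tuples, one per trial.
    Returns a list of (stimulus, percentage) pairs, highest first.
    """
    presented = Counter()
    chosen = Counter()
    for left, right, choice in trials:
        presented[left] += 1
        presented[right] += 1
        chosen[choice] += 1
    percentages = {s: 100 * chosen[s] / presented[s] for s in presented}
    return sorted(percentages.items(), key=lambda item: item[1], reverse=True)


trials = [
    ("ball", "book", "ball"),
    ("ball", "juice", "juice"),
    ("book", "juice", "juice"),
]
print(selection_percentages(trials))
# [('juice', 100.0), ('ball', 50.0), ('book', 0.0)]
```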
Reinforcement can also strengthen
rate, frequency, duration, latency, and magnitude of a behavior.
Automaticity of reinforcement
refers to the fact that behavior is modified by its consequences irrespective of the person's awareness. There doesn't need to be any logical connection between behavior and reinforcing consequence for reinforcement to occur. Reinforcement will strengthen any behavior that precedes it.
Automatic reinforcement
reinforcement that occurs without the mediation of others. This includes naturally occurring sensory consequences. Most common form of AR is babbling.
DRA
reinforcer is delivered contingent on the occurrence of a desirable, alternative behavior. ABCBC: A=baseline, B= reinforcer is presented contingent on occurrence of target behavior, C=DRA condition, reinforcement is presented contingent on the occurrence of an alternative behavior.
Naturally occurring reinforcers
secondary reinforcers
Trial based method (types)
single stimulus, paired stimuli, multiple stimuli w/ and w/o replacement
Trial based method
stimuli are presented in a series of trials and the learner's responses are measured as an index of preference. Three measures of behavior are recorded: approach (movement toward stimulus), contact (touching stimulus), engagement (duration of interaction with stimulus)
Single stimulus (successive choice method)
stimulus is presented and the person's reaction is recorded
Positive reinforcer
stimulus presented as a consequence and responsible for the subsequent increase in responding.
Unconditioned reinforcers (primary/unlearned)
stimulus that functions as a reinforcer with no prior learning history. Product of phylogenic development.
