PSY 1400 Final Exam
Moro reflex
When a human infant experiences a sudden loss of head support, it extends its head and widely spreads the arms with palms in front, fingers extended, and thumbs flexed. This reflex can save the life of a falling child by making it easier for the mother to grab onto the child.
The "Commitment" in Acceptance and Commitment Therapy
Accepting thoughts as they arise is an important skill to master if one is to live in accord with one's values. For example, if one of our values is "making lasting relationships," then behaving in a committed way to this value means meeting new people, even when thoughts of emotional insecurities arise. Helping clients identify their personal values is an important component of ACT. Accepting thoughts and recognizing that they are not incompatible with behaving in accord with one's values are complementary skills acquired in ACT. Values are defined in ACT as client-selected qualities of behavior that may be continuously emitted without reaching an end-goal. Therefore, "finding a mate" is not a value (it is an end-goal) but "making lasting relationships" is. Articulating a set of values and behaving in accord with them may enhance the reinforcing efficacy of consequences we were already interested in. If, for example, making lasting relationships is a newly articulated value, then introducing oneself to a stranger at a party produces a consequence (the person interacts with us) that may function as a more effective positive reinforcer than before. This enhanced reinforcing value may be important in tipping the scale of choice toward behaving in a values-consistent direction, instead of an experiential-avoidance direction. Importantly, the ACT therapist lets the client's values, not the therapist's, direct therapy. Again, if the client follows the therapist's advice mostly to earn the therapist's approval - pliance - it is unlikely that the client's adaptive behavior will continue after therapy concludes (and approval from the therapist is no longer available). The client needs to pursue their own values. Successful values-driven actions will produce the reinforcers best suited to maintaining the client's adaptive behavior.
Extinction-Induced Variability
An increase in the variety of operant response topographies following extinction. For example, saying "Hey Siri" with different tones of voice when she doesn't respond.
Behavior
An individual living organism's activity, public or private, which may be influenced by external or internal stimulation.
Interval Schedule of Reinforcement
An interval schedule of reinforcement specifies the amount of time that must elapse before a single response will produce the reinforcer. For example, an interval schedule might specify this contingency: IF 30 seconds pass and then a response is made → THEN the reinforcer is provided. If 30 seconds have not yet elapsed, then responding (no matter how much or how rapid) produces nothing. Interval schedules are time-based schedules - sometimes we just have to wait.
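A minimal Python sketch of the contingency just described (the function name and example values are illustrative, not from the text):

```python
# Minimal sketch (illustrative): the FI 30-second contingency described above.
# A response produces the reinforcer only once 30 seconds have elapsed;
# earlier responses produce nothing.
def fixed_interval_30(seconds_elapsed: float, response_made: bool) -> bool:
    """Return True if the response produces the reinforcer under FI 30 s."""
    return response_made and seconds_elapsed >= 30

print(fixed_interval_30(12, True))   # False: responding early produces nothing
print(fixed_interval_30(31, True))   # True: first response after the interval
print(fixed_interval_30(45, False))  # False: time alone is not enough; a response is required
```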
B.F. Skinner and intermittent reinforcement contingencies
B. F. Skinner began researching intermittent reinforcement in the early 1930s, and he quickly found that different contingencies produced very different response rates and patterns of behavior. For example, when behavior was reinforced every time it occurred, Skinner's nonhuman subjects responded quickly when given the opportunity to do so. By contrast, when Skinner reinforced the first response that occurred after a 1-minute timer elapsed, the subjects learned not to waste effort by responding early in the 1-minute interval. Instead, they waited for time to pass and began responding only when the available reinforcer drew nearer in time. Skinner, fascinated by these orderly and replicable patterns of behavior, studied complex contingencies of reinforcement for two decades with his collaborator, Charles Ferster.
Information Theory of Reinforcement
Behavior is controlled by the likely future, as exemplified in the past. Past reinforcement experiences provide information about what's likely to happen next. To explain the PREE (partial-reinforcement extinction effect), Information Theory holds that the estimate of the likely future is updated only when the individual detects a change in the reinforcement rate. If it is difficult to detect the change in reinforcement rate following extinction, then the estimate of the likely future is unchanged for a much longer period of time.
What does it mean to say "behavior is determined?"
Behavior has a cause, or multiple causes.
Determinism
Behavior is caused by biological and environmental factors
Assumption #1 of Behavior Analysis
Behavior is determined
Direct observation
Behavior is recorded as the behavior occurs, or a lasting product of the behavior is recorded at a later time
Nature
Behavioral determinants include biological variables such as the evolutionary past of the species and the unique genome of the individual. Innate behaviors and genetic conditions (e.g., a predisposition toward alcoholism, Down syndrome).
Habits
Behaviors that occur in specific settings, even when our motivation to obtain the reinforcer is low. Habits are operant behaviors that have shifted from consequence/motivational control to antecedent-stimulus control.
Effort
The third variable that produces exclusive choice is effort. When people and other animals are given a choice between working hard to get a reinforcer and working less hard to get the same reinforcer, they exclusively choose the less effortful option. Figure 13.3 shows an experiment conducted by Herrnstein and Loveland (1975). Under these conditions, pigeons exclusively chose less work (fixed ratio [FR] 1) over more work (FR 10) in pursuit of the same reinforcer. Translating this to your everyday life, if you need an elective course in plant science and you hear that one section is taught by an instructor who assigns a lot of busy work (BR) and another section is taught by a professor who doesn't assign any homework (BL), all else being equal, you will choose to enroll in the course taught by professor BL. The reason why is that both professors offer the same reinforcer (elective credits in plant science) but one requires less effort than the other. Why waste your effort? Similarly, many consumers exclusively buy products online because of the reduced effort in ordering. For example, Amazon's "dash buttons" allow you to order products like laundry detergent by pressing the button, conveniently located in the laundry room, once. What could be easier? Compare this with making a shopping list, driving to the store, finding the laundry detergent, and then hauling it home. Simply put, all else being equal, individuals choose low-effort reinforcers over high-effort reinforcers.
Mary Cover Jones
"mother of behavior therapy"; used classical conditioning to help "Peter" overcome fear of rabbits
What must we do to use extinction to positively influence behavior?
1. Determine if the problem behavior is operant behavior 2. Identify the reinforcer that maintains the behavior (Called Functional Analysis of Behavior)
What are the 3 steps to guide visual analysis?
1. Draw a trend arrow through the baseline data to predict what will happen if the independent variable is never turned on. 2. Evaluate if behavior in baseline is too variable (bouncy) to have confidence in the prediction of the trend arrow. 3. Draw trend or level lines through the intervention data. Evaluate if there is a convincing change in trend or level (whichever change is of interest)
3 Objections to Reinforcement
1. Intrinsic Motivation 2. Performance-Inhibiting Properties of Reinforcement 3. Cheating
4 empirically supported principles that increase the efficacy of Pavlovian conditioning
1. Use an important US (phylogenetically important to survival). 2. Use a salient CS (something noticeable). 3. Use a CS that signals a large delay reduction to the US (a larger delay-reduction ratio is better). 4. Make sure the CS is not redundant; there shouldn't be another CS that already signals the delay reduction.
Conditioned Stimulus (CS)
A formerly neutral stimulus that now evokes a conditioned response.
Continuous Reinforcement
A simple reinforcement contingency wherein every instance of the response is reinforced.
Contingency Management
A treatment for drug abuse that uses contingent consequences, establishing a new causal relation between drug abstinence (behavior) and a reward (consequence): IF you abstain from using drugs for the next two days, THEN you will earn a modest cash reward.
Functional Variable
A variable that, when changed, reliably and systematically influences behavior
What comes before each phylogenetic reflex?
An antecedent stimulus that elicits the response
Why do we behave?
Because behavior is important for survival, from the moment you are born to the time you die. Behavior helps us survive as long as the behavior is a good match for local conditions.
Why do patients prefer single subject design?
Because they don't want to be part of the control group.
Nurture
Behavioral determinants include all of the events experienced during an individual's life
What are the generic characteristics of highly effective reinforcers?
Contingency, Reinforcer Size, Reinforcer Quality, Reinforcer Immediacy
What makes a good habit good?
Conversely, what makes a good habit good is, very often, that it does not feel that good now, but the more we do it, the better it is for us. Consider exercising. Going out for a run once a year is not a habit - we've only done it once. However, if the behavior becomes habitual, it will be reliably evoked by antecedent stimuli, regardless of how we feel about the consequences of running. When running happens habitually, we gradually experience the physical and mental-health benefits of running. When a good habit is acquired, we take on a desired identity - we think of ourselves as runners.
What is an IOA NOT?
IOA does not measure accuracy or reliability, but it does increase the believability of the data.
Motivational variables
Environmental and biological variables that can increase or decrease the momentary efficacy of a reinforcer.
Across-Individual Replication
Evaluating whether the IV systematically and reliably influences the behavior of more than one individual.
Empirical evidence
Evidence that must be observable.
Second component of a behavioral experiment
Falsifiable hypothesis. Mentalistic hypotheses are not falsifiable
Examples of positive reinforcers
Food, water, electronic brain stimulation, heroin, methylphenidate, alcohol, cocaine, social reinforcers (reciprocal smiling, responsive parents)
Calculating IOA when using duration recording
For a disagreement, count the difference in seconds between the two observers' records. (If I record 152 seconds and they record 150 seconds, then the disagreement is 2 seconds.)
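A minimal Python sketch of one common way duration IOA is then expressed as a percentage - dividing the shorter recorded duration by the longer one and multiplying by 100. This convention and the function name are assumptions, not stated above:

```python
# Assumed convention (not stated in the text): duration IOA expressed as the
# shorter of the two recorded durations divided by the longer, times 100.
def duration_ioa(observer_a_seconds: float, observer_b_seconds: float) -> float:
    """Percent agreement between two observers' duration records."""
    shorter = min(observer_a_seconds, observer_b_seconds)
    longer = max(observer_a_seconds, observer_b_seconds)
    return 100 * shorter / longer

# The text's example: 152 s vs. 150 s -> roughly 98.7% agreement
print(round(duration_ioa(152, 150), 1))
```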
2 categories of functional variables that may be used therapeutically to change behavior
Functional antecedent variables and functional consequence variables.
What is the distinction between functional variables and causes?
Functional variables do not necessarily play a causal role; they may merely be correlated with the change. We don't pretend to know enough about behavior to know that the functional variable is actually the thing directly causing the behavior, but when we change the FV, the behavior changes as well.
What are some everyday behaviors that are operant behaviors influenced by contingent consequences?
Grades, paychecks, soothed infants, smiles, frowns, speeding tickets.
Habituation
Gradual reduction in responding following repeated presentations of the eliciting stimulus. It's a form of learning, not due to fatigue.
When are habits formed?
Habits are formed when an operant response has been repeatedly reinforced, hundreds, if not thousands, of times in the presence of the same antecedent stimulus. Applied to the behavior of humans, repeatedly drinking at 5 pm can shift an operant behavior from consequence/motivational control ("I could really go for a drink") to antecedent stimulus control ("It's 5 o'clock - time for a cocktail").
What are 3 reasons for distinguishing between positive and negative reinforcement?
Heuristics, Loss Aversion, and Preference for Positive Reinforcement
What do behavior analysts do to positively influence behavior?
Identifying functional antecedent variables and functional consequence variables is the core of behavior analysis. Changing these variables is what behavior analysts do to positively influence behavior.
Using Discrimination Training to Improve Stimulus Control
If a child is forever misreading words because of their resemblance to other words, that child will never become a fluent reader. She will find reading frustrating (the sentences make no sense), she will not read books at home, and she will struggle in school. As noted in Extra Box 1, the alternative to teaching children to read whole words is to teach them the sounds that individual letters make. Once they have mastered this skill, we can teach them to sound-out letter combinations that form simple words. When this is accomplished, the child has the fundamental skills needed to read thousands of simple words. For example, our budding reader may first be taught to say "kuh" when she sees the letter K. That is, we will reinforce saying "kuh" when K is the antecedent stimulus, and we will extinguish "kuh" when it occurs at other times. This discrimination training session will alter the function of K, such that it becomes an SD that evokes the vocal response "kuh." The left panel of Figure 12.5 illustrates the generalization gradient we might expect after a brief lesson in which she is shown the letter K and "kuh" is reinforced. The child is very likely to say "kuh" each time she sees the letter K (the SD), but she will also say "kuh" when she sees letters that look like K. Consistent with the bell-shape of the generalization gradient, those letters that look more like K will evoke more "kuh" responses than letters that less resemble K. The right panel in the figure shows improved stimulus control - the letter K is the only antecedent stimulus that evokes the "kuh" response. The response is nearly perfectly discriminated, and that is what we need if the child is to learn to read. To get our young reader's performance to transition from the left to the right panel of Figure 12.5, we will use discrimination training. As noted earlier, discrimination training works better when the SD is presented at random times, intermixed with SΔ stimuli. This can be accomplished with 11 flash cards, shuffled periodically, each showing one of the letters along the x-axis of Figure 12.5. When K is shown and our reader says "kuh" the response will be reinforced ("very good!"). When any of the other letters are shown, "kuh" will be extinguished. On these SΔ trials, the correct response is no response. We are teaching the child that K is an SD for "kuh" and all of the other letters that more or less resemble a K are SΔs for this verbal response. After a brief discrimination training lesson, the child's generalization gradient will "tighten up" as she begins to emit the "kuh" sound when she sees the K, and not when she sees any of the other letters. When she has mastered this, then she is ready to add another discriminated operant to her repertoire; perhaps learning to say "tuh" when she sees the letter T. As this example illustrates, it is often important for behavior to be under tight stimulus control. The same is true of our African pouched rats. The smell of TNT (SD) must powerfully evoke digging behaviors and all other smells must not. If the rats repeatedly dig in places where there are no landmines (a "false positive"), the rats will prove useless. Ditto for airport baggage screeners. The technology of discrimination training greatly aids in establishing and maintaining tight stimulus control (Edwards et al., 2015).
WHAT SUBSTITUTES FOR DRUG REINFORCERS?
If you grew up in the US in the last 40 years, there is a good chance you were taught that drugs are the ultimate reinforcers; they are so powerful that they have no substitutes. If you start using drugs, you were told, no other reinforcer will be able to compete and your life will be ruined. Is this true? No, not really. In his 2015 TedMed talk, Columbia University neuroscientist, Dr. Carl Hart, discusses the research that started the myth of drugs as the ultimate reinforcers. In these studies, rats were placed in an operant chamber where they could press a lever to self-administer a drug reinforcer, such as cocaine. Most of the rats pressed the lever and consumed high doses of drugs on a daily basis. When this finding was translated to humans, policy-makers concluded that drugs were the ultimate reinforcers; that a "war on drugs" was necessary, as were mandatory minimum jail sentences for those found in possession of drugs. What policy makers didn't realize was that these experiments were conducted in impoverished environments - the researchers made only one reinforcer available and it was a drug reinforcer. Meanwhile, other researchers were collecting data that challenged the policy-makers' conclusions. Researchers like Dr. Bruce Alexander wondered if drugs would lose their "ultimate reinforcer" status if the environment was not impoverished; that is, if other reinforcers were available that might compete with (substitute for) drugs. To find out, Dr. Alexander and his colleagues (Hadaway et al., 1979) raised one group of rats in an impoverished environment (living alone in a cage with nothing to do); these rats used a lot of morphine when given the opportunity. The other group of rats was raised in an environment in which there were opportunities to engage in behaviors that produced natural reinforcers - they could play with each other, they could climb on objects, and they could build nests. These animals raised in the "rat park" consumed far less morphine than their isolated counterparts. Was this because of alternative reinforcers, or was it because the rats living alone were stressed out? To find out, Dr. Marilyn Carroll housed monkeys in cages and periodically gave them the opportunity to press a lever to self-administer a drug (phencyclidine or "angel dust"). In one phase of the experiment, only one lever was inserted into the cage - take it or leave it. In another condition, two levers were inserted and the monkeys could choose between phencyclidine or a few sips of a sweet drink. The results were clear - monkeys took drugs less often when given a choice between drugs or soda pop (Carroll, 1985; Carroll et al., 1989). Yes, soda pop substituted for angel dust. The finding that substitute reinforcers can dramatically decrease drug use has been replicated many times, with many drugs (Comer et al., 1994), with many different substitutes (Cosgrove et al., 2002; Venniro et al., 2018), and in other species (Anderson et al., 2002; Carroll et al., 1991) including humans (Hart et al., 2000, 2005). Importantly, this finding is consistent with the predictions of the matching law (Anderson et al., 2002). 
If the rate of reinforcement for non-drug-taking activities is very low (e.g., R_NonDrug = 1 reinforcer per week), and the rate of reinforcement for taking drugs is twice as high (R_Drug = 2 reinforcers per week), then the matching law predicts that, all else being equal, 67% of the individual's time should be spent in drug-related activities (B_Drug): B_Drug / (B_Drug + B_NonDrug) = R_Drug / (R_Drug + R_NonDrug) = 2 / (2 + 1) = 2/3 ≈ 0.67, or 67%. In Dr. Carl Hart's TedMed talk, he discusses the implications of this behavioral research on public policy. Drug use is not a moral failing, he argues, it is the result of a lack of substitute reinforcers. To prevent drug use, Dr. Hart suggests, we must increase the value of R_NonDrug. That is, if an individual can obtain contingent reinforcers for engaging in non-drug-taking activities, that individual will be much less likely to develop a substance-use disorder. Perhaps the most effective of these non-drug-taking activities is work. Being gainfully employed substantially reduces one's risk of substance abuse. When economic prospects are dim, as they are in the areas of the United States where the opioid crisis has killed thousands, the risk of problem drug use is increased. Showing compassion by increasing R_NonDrug isn't a political stance; it is a scientific stance.
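A minimal Python sketch of the matching-law calculation above (the function name and the alternative reinforcement rate in the second example are illustrative):

```python
# Matching law: predicted share of behavior allocated to the drug alternative,
# given the rates of reinforcement for drug and non-drug activities.
def matching_law_share(r_drug: float, r_nondrug: float) -> float:
    """B_Drug / (B_Drug + B_NonDrug), predicted from R_Drug / (R_Drug + R_NonDrug)."""
    return r_drug / (r_drug + r_nondrug)

# The text's example: 2 drug reinforcers/week vs. 1 non-drug reinforcer/week
print(round(matching_law_share(2, 1) * 100))  # 67 (percent)

# Dr. Hart's point: raising R_NonDrug (e.g., to 6 reinforcers/week) shrinks the
# predicted share of drug-related behavior.
print(round(matching_law_share(2, 6) * 100))  # 25 (percent)
```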
Four Variables Affecting Choice
Imagine there are two identical buttons in front of you. You can choose to press either one and you are free to switch between them any time you like. Furthermore, if you don't want to press either button, preferring instead to stand on one leg while sipping tea, you can do that too. Before you press the buttons, we will assume you have no preference between them - one is as good as the other. However, if the consequences of pressing the buttons are different, a preference will develop. The next four sections describe variables that will strongly influence choice. Indeed, when these variables alone are at work, they tend to produce exclusive choice for one alternative over the other. The four variables are: (1) reinforcement vs. no consequence, (2) reinforcer size/quality, (3) effort, and (4) reinforcer delay.
Heuristics in distinguishing between positive and negative reinforcement
It is important to remember all your options when influencing behavior through reinforcement. SR+, SRE−, and SRA− provide a heuristic for remembering three different ways in which you can arrange reinforcement contingencies: consequences can be presented (SR+), removed/reduced (SRE−), or prevented (SRA−).
Persistently Following Incorrect Rules
In 1998, Dr. Andrew Wakefield and his colleagues published a paper in the prestigious British medical journal, the Lancet. The authors claimed that the measles, mumps, and rubella (MMR) vaccine caused autism in children. Wakefield's research methods broke many of the rules of good experimental design that were outlined in Chapter 2. For example, their data were correlational (correlation does not imply causation) and were based in part on parents' self-reports (subject to human biases and recall errors). Later reporting revealed that Wakefield altered the medical records of every single child in the study published in the Lancet (he faked his data). A hearing convened by the British government's General Medical Council concluded in 2010 that Wakefield was a fraud, and 10 of his coauthors on the Lancet paper agreed. In the same year, after many scientifically valid studies revealed no connection between the MMR vaccine and autism, the Lancet retracted Wakefield's article. However, these decisions were too late in coming. By 2008, hundreds of thousands of parents had adopted the rule, "IF you vaccinate your child → THEN the child will develop autism." These anti-vax parents are adamant in their adherence to this rule. From their perspective, they are tracking. That is, they are following the "don't vaccinate" rule because it appears to correctly describe a contingent relation between refusing vaccinations (behavior) and the avoidance of autism (SRA−). This is a sad example of how humans will sometimes persistently follow incorrect rules. Sad because this rule-following has led to a historic increase in measles, mumps, and rubella infections and no decrease whatsoever in the prevalence of autism (Centers for Disease Control and Prevention, 2018). Laboratory research on rule-following suggests that humans will often follow rules that do not accurately describe the reinforcement contingencies in operation. For example, Hayes et al. (1986) arranged alternating contingencies of point-delivery that were expected to maintain a high rate of responding when a fixed-ratio (FR) schedule was in operation, and a low rate when a differential reinforcement of low rate (DRL) schedule was in effect. One group of participants was given minimal instructions and they gradually learned to press the button quickly when the FR was in effect (orange light on) and to press more slowly when the DRL was active (red light on). However, when a different group of participants was instructed that the best way to earn points was to press slowly regardless of which light was on, they rigidly followed these instructions, earning fewer points when the FR schedule was in effect. Hayes et al. (1986) reasoned that following the "press slowly" rule was not an instance of tracking - the rule did not accurately describe how to effectively earn points. Instead, following the instruction was an instance of pliance; participants followed the instructions because they came from an authority figure - the experimenter - and authority figures can deliver socially mediated consequences for compliance and noncompliance (as in the Milgram experiment). As a thought experiment, imagine what would happen if the "press slowly" instruction was provided by an agent who had no authority; that is, they could never deliver a reinforcing consequence for following their rule.
For example, while playing a video game that sometimes requires fast button-pressing and sometimes slow button-pressing, an in-game character, we'll call him Jasper, appears on the screen and tells you to "press slowly." We know that Jasper's advice is worthless, so we will not engage in tracking. But will we engage in pliance? It depends. If Jasper has the ability to handsomely reinforce our rule-following, then we will do what he says (pliance). However, if Jasper is powerless, pliance is unlikely to occur (see Cerutti, 1994 for data consistent with the outcomes of this thought experiment). The more reinforcers and punishers an authority figure controls, the more pliance will occur. This can help us understand the behavior of anti-vax parents who live in the Hasidic Jewish neighborhoods in Brooklyn, New York. This tightly knit, tradition-bound community strictly adheres to religious rules regarding diet, prayer, and dress. So when a prominent rabbi instructed the community to refuse vaccinations (Belluz, 2019), their pliance is understandable. Similarly, parents who identify strongly with a social group committed to anti-vax practices may refuse vaccines because doing so is positively reinforced with credit, praise, and so on from the group and negatively reinforced by avoiding criticism as a "sellout" (pliance).
Impulsivity and Self-Control
In our everyday language, we often refer to people as "impulsive" or "self-controlled." Sometimes these words are used as descriptions of behavior ("I bought the dress on impulse"), and sometimes as explanations of behavior ("You know why she bought that dress - because she's so impulsive"). In the latter case, the cause of impulsive (or self-controlled) behavior is hypothesized to be inside the person. That is, we see an individual make several impulsive choices and then speculate that the cause of those choices is "impulsivity" or a lack of "willpower." Conversely, if we see someone resist a temptation, we point to "self-control" as the cause. As we have previously discussed, such explanations are circular (see Chapter 9). The only evidence for an internal "impulsivity" or "self-control" is the external pattern of choices made. When the only evidence for the cause (impulsivity) is the effect (impulsive choice), the explanation is circular. A more scientific approach to understanding impulsive and self-control choices begins by specifying behavioral definitions (what is an impulsive choice?), and then looking for functional variables that systematically influence these choices. We will use these definitions: Impulsive choice - choosing the smaller-sooner reward and foregoing the larger-later reward. Self-control choice - choosing the larger-later reward and foregoing the smaller-sooner reward. Let's apply these definitions to some everyday impulsive choices. How about the choice between eating junk food (impulsive choice) and eating healthy food (self-control choice)? To put some meat on this example, imagine going to Taco Bell for lunch and eating a beef quesarito, cheesy fiesta potatoes, and a medium Mountain Dew. This meal gives you the immediate enjoyment that only a beef quesarito can provide. However, by choosing to eat this, you are consuming almost all the calories you are allowed under your diet. Because you already had a mocha frappuccino for breakfast and you do not plan to skip dinner, you have chosen to forego your long-term weight loss goal. Is eating this meal an impulsive choice? Most of us would intuitively say yes, but does it meet the behavioral definition of an impulsive choice - is junk food a smaller-sooner reward that requires you to forego a larger-later reward? Relative to the long-term benefits of sticking to your diet (weight loss, improved health, reduced risk of disease, higher self-esteem, etc.), that beef quesarito is a pretty small reward. Is it sooner? Yes, the benefits of eating the quesarito meal are experienced right now. The benefits of sticking to your diet will not be felt for months. Therefore, eating junk food is impulsive - you are choosing the smaller-sooner reward and foregoing the larger-later reward. Conversely, choosing to stick to the diet is a self-control choice because you are preferring the larger-later reward and foregoing the smaller-sooner one. Before we provide more examples of impulsive and self-control choices, note that when we make an impulsive choice (like eating an unhealthy snack) we feel a strong sense of wanting. However, almost immediately after eating, we regret our decision. This is quite common with impulsive choices and we will have more to say about this later, in the section on "Preference Reversals." But for now, Table 13.3 provides more examples of impulsive and self-control choices. See if you would feel tempted by these impulsive choices, only to later regret having made them.
spontaneous recovery
Increase in conditioned responding following the passage of time since Pavlovian extinction
Visual Analysis
Involves looking at a graph of time-series single-subject behavior to evaluate if a convincing change occurred when the independent variable was introduced/removed.
Why is pavlovian learning adaptive?
It allows the individual to predict when the unconditioned stimulus will occur. (Like a soldier learning that shouting precedes an explosion)
What are some other examples of pavlovian conditioning in humans?
It can often be emotional responses. Children's excitement when birthday candles are lit. Excitement when hearing a gaming console boot up. Music played before sex can evoke positive emotions.
What happens if you change the behavioral definition during the study?
It would portray changes in behavior inaccurately
How is replication involved in single-subject research?
It's built in by switching the IV on and off repeatedly within the same individual, as well as by replicating across participants.
Who was Karl Popper?
Karl Popper was a philosopher of science and a contemporary of Sigmund Freud; he argued that scientific theories must be falsifiable.
Why do we let individuals select their own reinforcers?
Letting the individual select their own reinforcers is a more sensitive, humane, and effective method relative to having the therapist, manager, or pet owner decide what they like and imposing it on others.
Startle Reflex
Lifelong reflex. Elicited by loud noises
Salivary reflex
Lifelong reflex. Elicited by tastes in the mouth
Corneal reflex
Lifelong reflex. If an object or a puff of air enters the eye, there is a rapid eye-blink.
Withdrawal Reflex
Lifelong reflex. If you touch something hot, you will quickly remove yourself from danger
Natural Sciences
Make falsifiable predictions about things that happen next
Third component of a behavioral experiment
Manipulation of the independent variable
Examples of negative reinforcers
Medications: escape from pain, stress, depression, skin rashes etc. Addictive drugs: drug withdrawal (taking the drug stops withdrawal symptoms)
What makes changes in level and trend more obvious?
Minimal bounce. How big are the changes in trend and level relative to the amount of bounce?
What does money function as?
Money has a conditioned-reinforcing function
What reflexes are babies born with?
Moro, palmar grasp, swimming, rooting, suckling, parachute, respiratory occlusion, corneal, withdrawal, startle, salivary, milk let-down
What are the 2 determinants of behavior?
Nature and Nurture
What would happen if you stopped behaving?
No behavior = death = end of genes
Is free will real?
No. Behavior isn't really self-determined. Everything you do is influenced by causal biological and experiential variables.
Which type of inferential statistics can detect behavior changes with as few as 5 participants?
Nonparametric statistics
What are the characteristics of the scientific method?
Objective, Quantitative, Systematic, Empirical, Falsifiable Predictions, Experimentation, Peer-Review, and Replication
Antecedent
Observable stimulus that is present before the behavior occurs. Example: Presence of the button
Habit
Operant behavior that (1) is evoked by antecedent stimuli and (2) persists despite the imposition of an AO. When behavior becomes habitual, AOs no longer decrease the probability of the behavior that produces those reinforcers. Said another way, habits are less influenced by motivation to acquire the reinforcer and more influenced by the antecedent stimuli that were reliably present when the response was reinforced. These stimuli evoke habitual responding, even though the "reinforcer" is no longer reinforcing.
What is a reputation?
Our attempt to accurately predict long-term behavior in individuals around us. We keep track of others' behavior and use the information adaptively.
What are the 4 direct-observation methods?
Outcome recording, event recording, interval recording, duration recording
Little Albert Experiment
Pavlovian fear conditioning. 1920 - Watson - classical conditioning on a 9-month-old baby - a white rat was paired with a loud clanking noise, resulting in crying and fear of the rat. The experiment showed that adverse experiences in infancy can play a big role in later development (contrary to prevailing belief).
What are two ways that a consequence can come to function as a conditioned reinforcer?
Pavlovian learning and verbal learning
Principle 3 for effective conditioned reinforcement
Principle 3: Use a conditioned reinforcer that signals a large delay reduction to the backup reinforcer. The third principle of Pavlovian conditioning was, "Use a CS that signals a large delay reduction to the US." Translation: Use a conditioned reinforcer that signals a large delay reduction to the backup reinforcer. The bigger the delay reduction to the backup reinforcer, the more effective the conditioned reinforcer will be. As in Chapter 4, the amount of delay reduction signaled by the conditioned reinforcer is easily calculated using the delay-reduction ratio: Delay-reduction ratio = (US→US interval) / (CS→US interval). In this equation, the US→US interval refers to the average time between backup reinforcers. The CS→US interval is the time separating the conditioned reinforcer and the delivery of the backup reinforcer.
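A minimal Python sketch of the delay-reduction ratio with illustrative numbers (the intervals chosen here are hypothetical, not from the text):

```python
# Delay-reduction ratio = (US->US interval) / (CS->US interval).
# Larger ratios signal a bigger delay reduction and, therefore, a more
# effective conditioned reinforcer.
def delay_reduction_ratio(us_us_interval_s: float, cs_us_interval_s: float) -> float:
    """Average time between backup reinforcers divided by the time from the
    conditioned reinforcer to the backup reinforcer."""
    return us_us_interval_s / cs_us_interval_s

# If backup reinforcers arrive every 120 s on average, a conditioned reinforcer
# delivered 10 s before the backup reinforcer signals a 12-fold delay reduction.
print(delay_reduction_ratio(120, 10))  # 12.0
```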
4. Deliver punishment contingently
Punishers are, by definition, effective when delivered contingent upon a problem behavior. Noncontingent punishment - that is, presentation of a punisher regardless of whether problem behavior occurs - does not decrease the future probability of problem behavior. Noncontingent punishers can increase problem behavior and elicit aggression. The practical implication of this finding is clear: Parents and managers who noncontingently "punish" their children or employees are not going to reduce problem behavior. Instead, their actions may elicit negative emotions and aggression. Good parents and good managers are those who understand and adhere to good contingencies of reinforcement and punishment.
Reinforcer Quality
Quality refers to the subjective value of a reinforcer, which can vary from one individual to the next.
Motivation
Rather than viewing motivation as a mysterious essence existing inside the individual, behavior analysts point to environmental and/or biological factors that, when turned ON or OFF, change how much the individual "wants" the reinforcer and is "willing" to emit operant behavior to get it. We refer to those environmental and/or biological factors as "motivating operations" (MOs)
Two methods for identifying reinforcers
Reinforcer Surveys and Stimulus preference assessments
Frequency
Response count divided by time or opportunity to respond. 1 response per minute. Or 3 responses out of 5 opportunities.
The SDp
SDp - an antecedent stimulus that decreases a specific operant response because the individual has learned that when the SDp is present, that response will be punished. You can think of the SDp as a warning stimulus or the watchful eye of the punisher. As long as the warning stimulus (SDp) is present, the response is suppressed. When the SDp is removed (the police cruiser drives away), the punishable behavior increases in probability. The response-suppressing effects of the SDp have been demonstrated many times in the lab. For example, in an experiment conducted by Honig (1966), pigeons were taught to peck a white response key by arranging a variable-interval (VI) 20-second schedule of food reinforcement (SD = white). When a black line was projected onto the key, pecking it occasionally produced a brief electric shock (SDp = black line). During each session, these contingencies periodically alternated back and forth, always accompanied by the SD or the SDp. Figure 12.2 shows the discriminated operant behavior of the four pigeons that participated in Honig's experiment. Like our red-light running behavior, the pigeons' pecking was tightly controlled by the SD and the SDp. On average, the pigeons pecked about once per second when the SD was on, and they virtually never pecked during the SDp. The SDp decreased the operant response. Outside the lab, another everyday example of an SDp is a bee that lands on our arm. When we see this antecedent stimulus, it decreases the probability that we will attempt to pick up the bee. We have long ago learned that touching a bee is a good way to get stung, a consequence that punishes this behavior. The outcome of this learning is that now, when we see a bee land on our arm, we will do nothing (decreased behavior) until the bee flies off (the SDp is removed). The arrival of the bee is a warning stimulus - an SDp - which decreases insect-touching behaviors.
Calculating IOA when using partial or whole interval recording
The same IOA calculation method is used for both.
Pseudo-Sciences
Sciences that make unfalsifiable predictions (Popper)
How can we effectively use Differential Reinforcement?
See flowchart picture on phone taken on 10/11/2021.
Skinner's Functional Taxonomy of Speaker Behavior
Skinner's account of verbal behavior focused on the behavior of the speaker; that is, the individual who is talking, writing, signing, and so on. Four of his categories of verbal operants are defined and described here. These categories have generated the most research and have proven most useful in helping children with autism acquire language skills (DeSouza et al., 2017). The four categories are: echoic, mand, tact, and intraverbal.
Differential Reinforcement of High-Rate Behavior (DRH)
Sometimes the problem with a behavior is not its topography, but the rate at which it occurs. Rates of responding may be increased or decreased with differential reinforcement. This is used if the rate of behavior is too slow. Here, low-rate responding is put on extinction and high-rate responding is reinforced.
Influencing Impulsive Choice
Steeply discounting the value of future consequences (solid curve in Figure 13.13) can lead to impulsive choices. When the subjective value of the smaller-sooner reward (height of the green bar) exceeds that of the larger-later reward (red dot at T1), the impulsive choice is made. By contrast, when delayed rewards are discounted more shallowly (dashed curve), the subjective value of the larger-later reward (blue dot at T1) exceeds that of the smaller-sooner reward, and the self-control choice is made. If, in Figure 13.10, your own Rich Uncle Joe discounting curve was steeper than ours, there is a good chance that you are young and not wealthy; younger and less well-off individuals tend to have steeper discounting curves (Reimers et al., 2009). Consistent with the analysis shown in Figure 13.13, a substantial amount of evidence suggests steeply discounting the future is correlated with substance-use disorders such as cigarette smoking and the misuse of alcohol, cocaine, methamphetamine, heroin and other opiates (Bickel et al., 1999; Heil et al., 2006; Madden et al., 1997). Likewise, steeply discounting delayed consequences is correlated with relapsing during drug-abuse treatment (Coughlin et al., 2020; Harvanko et al., 2019; MacKillop & Kahler, 2009). Similarly, pathological gambling and behaviors that risk significant health losses are correlated with steep delay discounting (Dixon et al., 2003; Herrmann et al., 2015; Kräplin et al., 2014; Odum et al., 2000). Perhaps you remember from Chapter 2 that correlation does not imply causation. No experiments have established that steeply discounting the value of future consequences plays a causal role in addictions. However, there are some findings that are suggestive. For example, studies that have assessed delay discounting in human adolescents have reported that steep delay discounting precedes and predicts early drug use (Audrain-McGovern et al., 2009) and similar findings have been reported in rats (Perry et al., 2008). Such findings suggest reducing delay discounting early in life might improve decision-making and lessen human suffering. The following sections describe two methods for reducing impulsive choice. One method - using a commitment strategy - works by engineering the environment to improve the choices we make. The second method - delay-exposure training - works through learning.
Steps for Pavlovian Conditioning
Step 1. Baseline phase. Does the stimulus evoke the response? If not, it is a neutral stimulus. Step 2. Evaluate the ability of the dog biscuit (unconditioned stimulus) itself to elicit the salivary reflex (unconditioned response). Step 3. Pavlovian conditioning procedure. Shake the box (neutral stimulus) before placing the biscuit in your dog's mouth (unconditioned stimulus). Repeat several times. Step 4. Present the conditioned stimulus (shake the box), but don't give the dog a biscuit. If the dog salivates (conditioned response), the behavior function of the box shaking has changed.
The Three-Term Contingency
Stimulus control of operant behavior is critical to the survival of the individual. We refer to the functional relation between antecedent, behavior, and consequence as the three-term contingency (Skinner, 1938); this is sometimes referred to as the A-B-C contingency. This adds a third term (the antecedent stimulus) to the two-term (IF response → THEN consequence) contingency that we have discussed in previous chapters. The three-term contingencies discussed so far may be represented as follows (Antecedent + Behavior → Consequence):
IF SD AND Response → THEN Reinforcer
IF SΔ AND Response → THEN No Consequence
IF SDp AND Response → THEN Punisher
As noted by Skinner (1972), when individuals interact with consequences in the presence of an SD, SΔ, or SDp, the development of discriminated operant behavior is "practically inevitable." Consistent with that observation, well-controlled laboratory experiments have demonstrated that nonhuman animals (Colwill & Rescorla, 1990) and humans (Gámez & Rosas, 2007) explicitly learn the relation between all three terms in the three-term contingency.
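A minimal Python sketch of the A-B-C relations listed above (the dictionary, labels, and function are illustrative, not from the text):

```python
# Three-term (A-B-C) contingencies: the consequence of a response depends on
# which antecedent stimulus is present.
three_term = {
    "SD": "Reinforcer",           # IF SD AND response -> THEN reinforcer
    "S-delta": "No consequence",  # IF S-delta AND response -> THEN nothing
    "SDp": "Punisher",            # IF SDp AND response -> THEN punisher
}

def consequence(antecedent: str, response_made: bool) -> str:
    """Programmed consequence of responding (or not) under a given antecedent."""
    if not response_made:
        return "No consequence"   # no response, so no programmed consequence
    return three_term[antecedent]

print(consequence("SD", True))       # Reinforcer
print(consequence("SDp", True))      # Punisher
print(consequence("S-delta", True))  # No consequence
```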
How has reinforcement been used to positively influence behavior?
Teaching independent living skills to adults and children with intellectual and developmental disabilities, treatment of children with autism using positive reinforcement, positively influence behaviors that are at the heart of public health deficits, and teach critical helping behaviors to animals.
Extinction Burst
Terminating the reinforcement contingency sometimes, but not always, produces a temporary increase in the rate, magnitude, or duration of the previously reinforced response. If your rate of button-pressing temporarily increases (pressing rapidly), the magnitude of your presses increases (pressing really hard), or the duration of your presses increases (holding the button down longer than normal), then this was an extinction burst. Similarly, in clinical settings, extinction bursts happen in about half the cases in which extinction is the only therapy employed. Because extinction bursts can be stressful and can lead to accidental lapses in extinction-based therapy, it is important to continue studying the variables influencing extinction bursts.
What do we learn from the Libet study?
That when people report that they "decided" to lift their finger, their brain had already made the decision and sent the signal.
Differential Reinforcement of Incompatible Behavior (DRI)
The "something else" is a response that is topographically incompatible with the problem behavior. This was the technique used in the zoo study - holding the ring outside the cage was topographically incompatible with throwing feces and shaking the cage.
Organizational Behavior Management
The application of behavior analysis to business settings. Focused on objective measurement of individual employee performance and on changing environmental events to produce measurable performance improvements.
Magnitude
The force or intensity of a behavior. Could be measured in grams of force or decibels.
Positive Reinforcement (SR+)
The presentation of a consequence (a stimulus presentation), the effect of which is to increase operant behavior above its no-reinforcer baseline level. Any time that reinforcement involves the presentation of a stimulus we will classify it as a positive reinforcer.
The Matching Law and Attention
The world in which we live is filled with thousands, if not millions, of stimuli, yet we attend to very few of them. For example, when is the last time that you chose to attend to the space on the ceiling in the corner opposite from you right now. Have you ever looked at that spot? Why do you instead spend so much time attending to books, your computer, and your phone? In the 1980s, no one spent 4 hours a day looking at their phones. What changed? If you have understood this chapter so far, you should be able to propose a falsifiable hypothesis about the allocation of your attention. The hypothesis forwarded by the matching law is that more attention will be allocated to stimuli predictive of higher rates of reinforcement (like your phone today), less attention will be allocated to stimuli predictive of lower rates of reinforcement (like phones in the 1980s), and no attention will be allocated to stimuli predictive of no reinforcers (like that spot on the ceiling). The evidence for this matching law hypothesis of attention comes from several laboratory studies. Rats, pigeons, and humans reliably attend to stimuli signaling higher rates of reinforcement more than they attend to stimuli signaling lower rates, or no reinforcement, exactly as the matching law predicts (Della Libera & Chelazzi, 2006, 2009; Shahan & Podlesnik, 2008; for review see Shahan, 2013). Your everyday experience is consistent with these findings. Your attention is easily drawn to the sound of your name across a noisy room because this auditory stimulus has reliably signaled a momentary increase in the probability of a reinforcer. For example, when a friend calls your name, attending to them might provide an opportunity to watch a funny video, hear a juicy bit of gossip, or get invited to a party. Similarly, your attention is readily drawn to a $20 bill blowing across campus, but not at all drawn to a candy wrapper rolling in the wind. The matching law's account of human attention may also help to explain why humans choose to allocate more of their news-gathering activities to biased sources, either progressive or conservative. These news outlets provide a higher rate of reinforcement; that is, they more frequently publish stories their audiences find reinforcing. For example, individuals who describe themselves as progressive choose to allocate more of their attention to newspapers and cable news channels that frequently publish stories and editorials consistent with their world view (e.g., MSNBC), and less attention to news outlets that frequently publish news appealing to a conservative audience (e.g., Breitbart, Fox News). Those with the opposite world view allocate their attention in the opposite direction, where they find the higher rate of reinforcement. If you let Google or Facebook choose what news you read, then you should know that their artificial intelligence algorithms have learned what you find reinforcing, and, to keep your attention directed at their platform, they feed you a steady diet of news stories predicted to be reinforcing (Zuboff, 2019). The downside of this is that different individuals have different sets of "facts," leading to the shared belief that the other tribe is "nuts." As you can see, the predictions of the matching law have been subjected to a large number of empirical tests involving a wide variety of choices and species. The preponderance of the evidence is supportive of the matching law. 
More advanced courses in behavior analysis will outline even further the quantitative predictions of the matching law, but these are beyond the scope of this text. The next section explores one of these predictions without going into any further quantitative details. As you will see, the matching law helps us to understand some of the maladaptive and irrational choices that we all make every day.
The first goal of behavior analysis
To accurately predict behavior.
What is the consequence in SRA-?
Two-Factor Theory: Consequence is fear reduction. Relies on two learning processes (two factors): Pavlovian and operant conditioning. Pavlovian conditioning explains why fear arises (the warning stimulus is a CS that evokes fear) and operant conditioning explains why avoidance behavior occurs (fear reduction is the consequence that functions as an SRA−). One-Factor Theory: One-factor theory holds that operant conditioning alone can explain SRA−. The other factor - Pavlovian conditioning - is not necessary. Thus, according to one-factor theory, momentarily preventing the aversive event is the consequence that maintains SRA− behavior. There is no need for fear reduction.
Delay-Reduction ratio
(US→US interval) / (CS→US interval)
Why Study Schedules of Reinforcement?
We have devoted a lot of space in this chapter to describing pigeons pecking response keys and rats pressing levers. What does any of this have to do with positively influencing human behavior? When we understand how ratio and interval schedules maintain behavior, we can use that knowledge to accurately predict and positively influence important human actions. Consider the opioid crisis, heroin-use disorder, and the death toll it has imposed in our time. For generations, lawmakers and presidents have approached this problem by trying to stop drugs at the border. If successful, the street price of heroin would increase and, the politicians predict, drug use will decline. However, this approach can have unintended side effects that are predictable from an understanding of ratio schedules of reinforcement. This is illustrated in Figure 11.15. The red dot shows a heroin-user's spending on heroin when it costs $7.50 a bag (one hit) and the black curve summarizes spending at a wider range of heroin prices. Because the curve is nearly identical to that of the pigeons, rats, monkeys, and human cigarette smokers in Figure 11.7, we can predict what would happen if the street price of heroin increases. If the border crackdown was highly effective, and the price of heroin tripled, our law-enforcement efforts would be "rewarded" with an increase in user's spending on heroin (green data point). At a population level, some of that increased spending will come from criminal activities, so our well-meaning "cops at the border" policy will have the unintended side effect of increasing crime; not to mention the Breaking Bad scenario in which law-abiding citizens get into the drug trade because of the flood of new money in the heroin market. And it's not just in law enforcement that knowledge of contingencies is important. Unintended side effects of complex contingencies of reinforcement have, according to its CEO, Jack Dorsey, been the rule at Twitter. The creators of Twitter, Dorsey suggests, were naïve about the effects of the complex contingencies they were programming on "likes," "followers," "retweets," "retweets with comments," and so on. What seemed in the early days like an empowering platform connecting the world gradually became an environment in which emotion-eliciting posts filled with misinformation and conspiracies were differentially reinforced; thoughtful discourse was extinguished. "If we were to do all of this over again ..." Dorsey said in 2020, "I wish we would have understood and hired a game theorist [behavioral scientist] to understand the ramifications of the tiny decisions that we make". Accurately predicting behavior under complex contingencies of reinforcement is a core goal in behavior analysis and Twitter's experience helps us understand why - predicting unintended side effects allows us to avoid those effects. To their credit, Dorsey and Twitter have taken behavioral research more seriously in recent years. They are working to improve their platform, so it reinforces more nuanced, prosocial communications. We applaud these efforts, as they will be important if we are to undertake meaningful dialogues to solve those crises, like opioid-use disorders and global heating, that have at their core disordered contingencies of reinforcement that maintain maladaptive human behavior.
What makes a bad habit bad?
What makes a habit bad is that it feels good now, but the more we do it, the worse things get for us. For example, smoking one cigarette per year is not a bad habit. But the more often you smoke, the more likely it is to become habitual, and the worse it is for your health.
Consciousness
What is it like to be a bat? This is the question posed by the philosopher Thomas Nagel in his 1974 essay on consciousness. To Nagel, the study of consciousness is possible only if we can agree on what consciousness is. He turned to bats to shine light on the topic. For Nagel, "an organism has conscious mental states if and only if there is something that it is like to be that organism - something it is like for the organism." Is a bat conscious? Is there anything that it is like to be a bat; anything for the bat? This is a bit of a mind-bender, but this definition of consciousness points to awareness of interoceptive stimuli, that is, our private, inner, subjective experiences. Is a bat consciously aware of interoceptive stimuli or is it simply a well-oiled insect-catching machine? Although Nagel (1974) argued there was no way to know if a bat was conscious, most consciousness researchers disagree (Blackmore, 2005). It is possible, in theory, to test for the bare minimum of consciousness by arranging special contingencies of reinforcement to alter the function of an interoceptive stimulus and to make that private, inner stimulus an SD. Parents arrange these contingencies all the time, for example, when they teach their child to say "tummy ache" under the discriminative-stimulus control of an interoceptive stimulus - the sensation of pain in the mid-section: IF pain in the mid-section (SD) AND child says, "tummy ache" THEN parental care. Because the parent cannot feel the child's pain, they look for indirect evidence of that interoceptive stimulus - the child is crying and holding her belly (Skinner, 1945, 1957). Once acquired, if the child's verbal response "tummy ache" generalizes to an antecedent knock on the head, the parent will not reinforce it: IF pain in the head (SΔ) AND "tummy ache" THEN "no honey, you have a head-ache". This discrimination-training approach comes naturally to parents who need their children to accurately report on other private stimuli such as hunger, thirst, symptoms of an illness, and so on. Such discrimination training "tightens" interoceptive stimulus control and might be said to expand the child's consciousness (Blanshard & Skinner, 1967). That is, they become more aware of and attuned to their private, interoceptive stimuli. But what of bats, rats, and other nonhuman animals? Are they conscious? We cannot say for sure. We know that several nonhuman species are capable of discriminating at least one internal, private stimulus - that produced by drugs - but these demonstrations happen in the lab, not in the wild. In these drug-discrimination studies, pressing a lever after a morphine injection (which produces an interoceptive SD we might describe as "feeling high") is reinforced with food. On other days, the injection is saline (SΔ) and pressing the "I feel high" lever is not reinforced. When the animal accurately presses on test days (when no reinforcers are provided), we may infer that its behavior is under interoceptive SD and SΔ control (Maguire et al., 2013). So, on the one hand, nonhuman animals are capable of discriminating their internal, subjective states. But, on the other hand, in the wild we assume that no such discrimination-training contingencies exist. To the best of our knowledge, bats are not hanging about asking each other about a tummy ache. Thus, nonhuman consciousness of interoceptive stimuli seems unlikely, but the question remains unanswered.
An anecdotal piece of evidence about consciousness in the absence of discrimination-training contingencies comes from the case of Helen Keller. Keller, who lost her hearing and sight at 19 months, learned to communicate only at age 6, when Anne Sullivan arrived and used a form of discrimination training to teach her to communicate through tactile stimulation in the palms of her hands. Years later, Keller (1908) reflected on her days before that discrimination training began: Before my teacher came to me, I did not know that I am. I lived in a world that was no-world ... I never contracted my forehead in the act of thinking. I never viewed anything beforehand or chose it ... never in a start of the body or a heart-beat did I feel that I loved or cared for anything. My inner life, then, was a blank without past, present, or future, without hope or anticipation, without wonder or joy or faith. Keller's reflections on consciousness (or the lack thereof) must be treated with caution. They come decades after her self-described emergence into consciousness. Nonetheless, these reflections raise the possibility that without contingencies of reinforcement that make interoceptive stimuli functional (SD and SΔ), we would not attend to those stimuli; we would not be conscious of our own inner lives. Want to get a sense of the emergence of consciousness? Play the yellow car game with a friend. You both have to find one yellow car per day, take a picture of it, and text it to the other player. To earn a point, the yellow car needs to be a different one each day: IF new yellow car (SD) AND you text it to friend → THEN you have scored a point. Now, for the duration of time you play the game, notice how your attention is drawn to yellow cars. Where previously you had no conscious awareness of yellow cars, now you see them everywhere. Now they have a behavioral function. Your consciousness of yellow cars is expanded when they function as an SD. A similar expansion of consciousness may occur when interoceptive stimuli come to function as SDs and SΔs, through discrimination training. So, is there anything that it is like to be a bat? Perhaps not. Not unless contingencies of reinforcement are arranged that make interoceptive stimuli important (functional), something to take notice of because, like a yellow car, those stimuli acquire SD or SΔ properties. Perhaps it is contingencies that turn the lights of consciousness on. Perhaps.
Operant Extinction
When a previously reinforced behavior no longer works - it no longer produces the reinforcer. Responding that meets the reinforcement contingency no longer produces the reinforcer and, as a result, responding falls to baseline (no-reinforcer) levels. You did the IF behavior, but the THEN reinforcer didn't happen. Operant extinction can also increase the probability of behaviors other than the previously reinforced response.
Schedule Thinning
When learning a new behavior, it is important to receive reinforcement each time a correct response is made. Continuous reinforcement is critical in these early learning periods because correct responses are infrequent relative to incorrect responses. For example, imagine learning to play the bagpipes; we know, this requires some imagination. If we are just starting out and our instructor reinforces correct technique on an FR 60 schedule, we may never learn to execute clear notes - giving up before ever making the 60th correct response. Thus, continuous reinforcement is important at the beginning, when acquiring new behaviors. However, it is often impractical to reinforce correct responses each time they occur, particularly when the performance improves and they occur at a high rate. Indeed, when we are playing well, we don't want the bagpipe teacher to yell "Yes! Yes! Good one laddie!" each time we correctly play a note. And, if asked to do this, the teacher would complain that it takes too much time and energy. Therefore, after a new skill has been acquired, it is important to transition from continuous to intermittent reinforcement. The solution is schedule thinning, a procedure for gradually reducing the rate of reinforcement, while maintaining the desired behavior. By thinning the schedule of reinforcement, we can ensure that the adaptive behavior can be maintained outside the clinic, where it will not be reinforced every time. Schedule thinning can be accomplished in several ways. The correct method depends on the desired outcome. If we want the behavior to occur frequently, but it is impractical to reinforce every response, then we can increase the ratio requirement from FR 1 (continuous reinforcement) to FR 2. If the behavior is maintained and problem behavior is infrequent, further increases can be implemented. But sometimes the goal is not to have more of the desired behavior, but to have it occur at a reasonable rate. This is the case in our clinical example - we want appropriate requests for attention to occur occasionally, not constantly. One approach is to shift from FR 1 to a very brief FI schedule, and then gradually increase its duration as long as destructive behavior remains low (e.g., FI 2-s, FI 4-s, etc.). This was the approach taken by Hanley et al. Although destructive behavior was infrequent, thinning the FI schedule did not reduce appropriate requests for attention to a reasonable rate. Much better outcomes were obtained by visually signaling when appropriate requests would be reinforced (when a white card was saliently displayed) and when these requests would not be reinforced (when a red card was displayed). Schedule thinning occurred by presenting the red card (operant extinction) for longer and longer intervals of time. The positive effects of this procedure have been replicated by other clinicians. Although these methods of schedule thinning have yielded some successes, there remains much to be learned. A common difficulty is that as the schedule is thinned, the individual fails to obtain the reinforcer, which, of course, means the adaptive behavior is undergoing extinction. When that happens, many of the secondary effects of extinction can occur: unwanted emotional behavior, variations on the adaptive behavior (e.g., yelling the request), and resurgence of destructive behavior. If these behaviors are inadvertently reinforced, destructive behavior can be very difficult to treat. 
Therefore, it is important that schedule thinning be done with great care by an appropriately trained behavior analyst.
Consequence
an observable stimulus change that happens after behavior occurs. Consequences influence our behavior. Example: doors close and elevator begins to move.
What are the 4 dimensions of behavior?
frequency, duration, intensity, latency
Organizational Behavior Management
systematic application of positive reinforcement principles in organizational settings for the purpose of raising the incidence of desirable organizational behaviors. Can improve employee performance by an average of 69%.
Spontaneous recovery of operant behavior
temporary resumption in operant responding following time away from the extinction setting
First thing learned in pavlovian conditioning
the CS signals a delay reduction to the US (Delay reduction- the time/delay to the next US event is less than it was before the CS occurred)
second thing learned in Pavlovian conditioning
the CS signals when the US is coming (like when it predictably comes in 10 seconds)
third thing learned in Pavlovian conditioning
the CS signals which US is coming
Graduated exposure therapy
the client is gradually exposed to successively stronger approximations of the CS. Before each new CS-approximation is presented, steps are taken to reduce/eliminate any fear evoked by the prior CS-approximation. (Mary Cover Jones?)
Multiple baseline across participants design
time-staggered A-B replications are demonstrated across participants
multiple baseline across situations design
time-staggered A-B replications are demonstrated across situations
Intermittent Reinforcement
A complex reinforcement contingency wherein the response is sometimes but not always reinforced.
Rewards
Beneficial consequences that we think will function as reinforcers, but we don't know if they will.
Categories of functional variables
Biological variables (genetics, brain chemistry) and environmental variables (things we experience through our senses).
Political Will
A mentalistic construct. "So, let us renew it [political will], and say together: 'We have a purpose. We are many. For this purpose we will rise, and we will act.'" This may inspire us in the moment, but does it positively influence day-to-day decisions about recycling, setting our thermostat, using public transportation, or calling our legislators to demand that they do something about the climate crisis? Given that more than 10 years have passed since the speech and the crisis has only worsened, we fear the answer is no. What are the relevant MOs (e.g., the need to raise huge amounts of money from special interest groups to afford running for re-election)? What contingencies are currently operating (e.g., IF you vote for the bill that benefits my company but trashes the environment → THEN I will write a big check for your re-election campaign) and how can they be modified to generate the behavior needed to effectively tackle the climate crisis?
Spontaneous Extinction
the disappearance of an unconditioned or conditioned response. By contrast, spontaneous recovery refers to the reappearance of an unconditioned or conditioned response after a period of extinction.
How does motivation to acquire the reinforcer affect how quickly operant extinction occurs?
the hungriest rats (those deprived of food for 23 hours) pressed the lever during extinction far more often than rats that had eaten 1-3 hours ago.
Pavlovian extinction
the procedure of repeatedly presenting the CS without the US, the effect of which is a reduction or elimination of the CS's ability to evoke the CR
Taste aversion learning
a form of learning in which an organism learns to avoid a taste after just one pairing of that taste with illness. One-trial learning- revulsion comes after a single encounter.
Suckling reflex
elicited by an object (typically mother's nipple) passing the lips
What 3 factors affect spontaneous recovery?
1. The more time that passes between Pavlovian extinction/exposure sessions, the more spontaneous recovery will occur 2. Spontaneous recovery decreases as more Pavlovian extinction sessions are conducted 3. Less spontaneous recovery will occur if each Pavlovian extinction session is continued until the CS no longer evokes the CR
4 weaknesses of group experimental designs
1. When the independent variable is a therapeutic intervention, no one wants to be assigned to the control group. 2. Focusing on the behavior of the group means we aren't studying the behavior of the individual. 3. The behavior of the treatment and control groups will differ simply because the people assigned to the two groups are different. 4. Reliance on inferential statistics to evaluate if the IV changed behavior. Invites errors, subjectivity, and unscrupulous practices.
Primary punisher
A contingent consequence that functions as a punisher because, in the evolutionary past of the species, this consequence decreased the chances of survival. Primary punishers are pain-inflicting stimuli (e.g., extreme heat or cold) or the removal/reduction of a primary reinforcer such as food, water, a sexual partner, and so on. Each of these consequences decreases the individual's (or the individual's genes) chances of survival; therefore, natural selection has prepared us to find these consequences punishing. No new learning is required.
Generalized Conditioned Reinforcer
A conditioned reinforcer that signals a delay reduction to more than one backup reinforcer
Negative Reinforcement - Avoidance (SRA-)
A consequent prevention of a stimulus, the effect of which is to increase operant behavior above its no-reinforcer baseline level. Warning stimulus-- Operant response --> Aversive stimulus prevented. The operant response prevents an unpleasant stimulus change from happening. Operant response happens because of warning stimulus.
Negative Reinforcement - Escape (SRE-)
A consequent removal or reduction of a stimulus, the effect of which is to increase operant behavior above its no-reinforcer baseline level. (Aversive stimulus present-- operant response --> Aversive stimulus removed or reduced)
multiple baseline across behaviors design
A multiple baseline design in which the treatment variable is applied to two or more different behaviors of the same subject in the same setting staggered across time.
Neutral Stimulus
A stimulus that does not evoke the response of interest (e.g., salivating)
Unconditioned Stimulus (US)
A stimulus that elicits a response without any prior learning (food in dog's mouth)
Information Theory of Reinforcement
Agrees that reinforcers increase operant behavior above a baseline level, but it disagrees about how this happens. The Information Theory holds that reinforcers provide information that allows the individual to predict when or where subsequent reinforcers may be obtained. In much the same way that a road sign can direct you to Albuquerque, the delivery of a reinforcer provides information directing individuals to more reinforcers. In the words of Cowie and Davison (2016), "Behavior is... controlled by the likely future, as exemplified in the past"
Simple contingencies
All-or-nothing. Behavior was reinforced every time it occurred. For example, IF we press the app icon → THEN the app loads. IF the rat presses the lever → THEN a food reinforcer is provided. Such examples clearly illustrate the contingent relation between a response and a reinforcer, but our experiences in the real world tell us that reinforcement contingencies are often much more complex.
Break-and-run in humans
Although FR schedules produce reliable patterns of "break-and-run" responding in nonhuman species, the effect has not been consistently demonstrated in human laboratory studies. Humans typically do not pause after the delivery of the reinforcer. "Why" is uncertain, but it may be due to differences in the way human and animal experiments are conducted. For example, pecking a key to obtain a food reward is probably the most exciting thing that happens in the daily life of a laboratory pigeon, but for a human it is incredibly boring to repeatedly press a button to earn points on a computer screen - worst computer game ever! If the task and contingencies arranged are so mind-numbingly boring that humans do not attend to the consequences of their behavior, that behavior cannot be systematically influenced by those consequences. In our everyday lives, however, taking a break between large tasks, like a large FR requirement for a pigeon, is normal.
Mand
An infant may first learn to say "water" as an echoic, but will later emit this verbal operant to get a drink. In doing so, the function of the verbal response "water" has changed - it is influenced by different antecedent stimuli and different consequences. A mand is a verbal operant occasioned by an establishing operation and maintained by the verbally specified reinforcer. Less technically, manding is asking for something that will satisfy a current need - water, a turn with a toy, or attention. "Mand" derives from the word "demand." The mand places a demand on the listener - go get me a drink of water. [Establishing Operation (Antecedent)] --> [Mand (Behavior)] --> [Mand-Specified Reinforcer (Consequence)] Manding benefits the speaker (Skinner, 1957). When faced with a difficult task (establishing operation), we often mand "help," as this produces the reinforcer that meets our current need - assistance. Typically developing children acquire mands very early in the process of language learning, asking for food, water, toys, hugs, and a host of other momentarily effective reinforcers. Mands acquired by children with severe language deficits allow them to get the reinforcers they want, when they want them.
What is behavior analysis?
An objective, unbiased, scientific approach to discovering how behavior works
Examples of negative reinforcement- avoidance
Behavior → Aversive event prevented
Turning in an assignment on time → Avoid a late penalty
Putting on sunblock → Avoid getting a sunburn
Paying your electricity bill → Avoid a service interruption
Ordering medium-heat wings → Avoid burning your mouth
Saying "yes, that shirt looks good on you" → Avoid your friend getting upset
Withdrawing your hand from a growling dog → Avoid getting bitten
Using a condom → Avoid an unplanned pregnancy
Experiments
Behavior analysts turn variables ON and OFF to see if they systematically and reliably change behavior.
Why do we reject mentalistic explanations of motivation?
Behavior analysts reject circular explanations of behavior. Mentalistic explanations leave us powerless to positively influence behavior.
Conditioned punisher
By contrast, the punishing function of a conditioned punisher must be learned. The learning process is Pavlovian. A conditioned punisher is a contingent consequence that signals a delay reduction to a backup punisher. In the lab, we can establish a new conditioned punisher through Pavlovian conditioning. For example, if a mild electric shock (an unconditioned stimulus [US]) is delivered to the participant's leg once, on average, every 2 minutes, and a blue light precedes each shock by 2 seconds, the blue light will quickly acquire a conditioned-stimulus (CS) function. This happens because the blue light signals a very large reduction in the delay to the next shock. Primary and conditioned punishers are built into the popular underground fencing technologies that keep the family dog in the yard. These fences work by arranging primary and conditioned punishers. The primary punisher is an electric shock delivered by a special collar worn by the dog. The shock is delivered contingent on crossing over the underground fence installed at the perimeter of the yard. The conditioned punisher is a salient tone provided by the collar. The tone signals a delay reduction to shock. For the underground fencing to work, a skilled animal behaviorist will put the dog on a leash so the dog can learn the tone→shock sequence. The trainer will also positively reinforce turning away each time the dog hears the tone. By doing so, the dog avoids the shock (a negative reinforcer). All told, the dog learns that (1) the tone signals that "The shock is coming! The shock is coming!" and (2) turning around is an effective shock-avoidance response. When properly trained, the dog will rarely come close enough to the perimeter to activate the conditioned punisher and will almost never contact the primary punisher.
How did John Watson change advertising?
Changed from appealing to intellect to appealing to emotions. More effective. Used Pavlovian learning to make people love products like Coca-Cola.
What is therapy/intervention?
Changing functional variables to change behavior and improve lives
Pavlovian conditioning
Classical Conditioning or Respondent Conditioning. When a stimulus (a ticking metronome) reliably precedes the eliciting stimulus (food), that first stimulus will begin evoking the response (salivating).
Theories
Conceptual models of how the world works
Discrimination Training
Discrimination training is a procedure in which an operant response is reinforced in the presence of an SD and extinguished in the presence of an SΔ. This was the procedure used by Skinner (1933, 1938) - in the presence of the SD (light on) lever pressing was reinforced. In the presence of the SΔ (light off), the same response was never reinforced. Applied to training airport baggage screeners, the goal is to ensure that the scanned image of a knife functions as an SD for flagging the bag for further screening. At the same time, it is important that scanned images of pencils, toothbrushes, and tweezers are passed over for further screening. Discrimination training will involve (1) presenting scanned images of knives and reinforcing their detection and (2) presenting scanned images of pencils, tweezers, and other long narrow objects (SΔs) and withholding reinforcement if the item is erroneously flagged. When the image of a knife reliably evokes the flagging response, and no other image does, then the response is under tight stimulus control - it occurs when the SD is present and not when an SΔ is displayed.
Multiple-Baseline Designs
Evaluates the functional relation between an IV and behavior by conducting a series of time-staggered A-B comparisons either across behaviors, across situations, or across individuals. Used when the IV will have an irreversible effect or when it would be unethical to turn off the IV.
How can we address the replication crisis caused by subjective use of inferential statistics?
Greater transparency, including publishing plans for conducting the experiment and analyzing data in advance of the experiment.
What are the two approaches to conducting behavioral experiments? Which is preferred?
Group designs and single-subject designs. Single-subject is preferred
What are the 2 meanings of objective?
1. Humans are susceptible to biases that cloud how we evaluate evidence that supports and refutes our favorite theories. 2. Our current understanding of behavior is tentative and will be changed in the future as new findings support an alternative viewpoint that better allows us to predict and positively influence behavior.
Verbal Behavior, Rule-Following, and Clinical Behavior Analysis
Humans are one of the few species on the planet who learn and teach a complex language to their offspring and peers. Our linguistic capacity exceeds that of any other species (Snowdon, 1990). This chapter outlines a behavior-analytic account of verbal behavior - talking, listening, reading, and so on. This account has led to teaching methods that have proven useful in helping children with autism and other developmental disabilities to acquire language. Needless to say, these successes have made a huge difference in the lives of these individuals and their parents. Among language-able humans, an important category of verbal activity involves rules - creating rules, following rules, breaking the rules, and so on. Parents typically provide and enforce rules (no screen-time after 8:00 pm; be nice to grandma; don't drink milk out of the carton). That is, they verbally specify expected and forbidden behaviors and arrange consequences when the rules are followed or violated. Likewise, countries, cultures, and social groups provide verbally specified rules, norms, expectations, and laws. Those following/violating these rules will, likewise, experience the consequences. We also create our own rules, maxims, and personal codes of ethics. These verbally specified rules often make explicit contingencies of reinforcement ("You catch more bees with honey than vinegar") and punishment ("If you speed through 7th street, you'll get a ticket"). This chapter introduces two categories of rule-following. These are important in understanding human compliance with social norms and with authority figures. Finally, this chapter discusses an activity that we all do, every day, during most of our waking hours - we talk to ourselves. This self-talk is behavior (an individual living organism's activity, public or private, which may be influenced by external or internal stimulation). What we say to ourselves may be influenced by external stimulation ("When will this light turn green!") and internal stimulation ("My stomach hurts; why did I order the diablo wings?"). Our internal dialogue is such a constant presence that most of us regard the voice in our head as me. For some, that voice is persistently positive ("I am so dope"), but for others the voice is chronically negative ("I am ugly, stupid, and unlovable"). Our self-talk may voice rules that work well in the short run ("I don't want to feel anxious so I'm not going to the party") but not in the long run ("I have no friends"). When the verbal content of cognition has a significant negative impact on daily living, many seek the help of a psychologist or behavior therapist. Many of the behavior-analytic principles covered in this book have helped people who seek such therapy, and we have covered many of them already (e.g., graduated exposure therapy and contingency management of substance-use disorder). This chapter focuses on new principles that clinical behavior analysts use to improve the lives of those who seek their help. These principles were derived from the study of human language, so let's start the chapter there.
Behavioral Approaches to Language
When you were young, you learned to communicate with others who spoke the same language, be that English, Portuguese, or sign language. Later, when you went to school, you learned about the parts of speech (nouns, verbs, etc.), syntax, and grammar (we know, it was horrible). But before you ever entered a classroom, you had a functional understanding of language.
By functional, we mean that you were already able to "use your words" to obtain reinforcers and avoid punishers. You could ask for something you wanted ("Can I have a drink of water?"). You could imitate the words of others ("Daddy said 'bite me.'"). You could use your words to help others ("Watch out for that rock; it hurt my foot."). You could lie to avoid getting grounded ("No, I didn't do it."). B. F. Skinner (1957) recognized the importance of language, and provided a functional approach to understanding it. Where his contemporaries catalogued language by its structure, Skinner provided a taxonomy of functions; that is, verbal behaviors (talking, writing, signing, etc.) that are operant activities, influenced by different kinds of antecedent stimuli, and different kinds of consequences. A thorough summary of Skinner's theory of verbal behavior is beyond the scope of this chapter. However, his taxonomy has helped behavior analysts to identify and remediate language deficiencies in children with autism and other developmental disabilities (Sundberg & Sundberg, 2011). For that reason, we provide a brief introduction to Skinner's functional taxonomy.
The Token Economy
In the 1960s applied behavior analysts developed a conditioned-reinforcement system known as the "token economy". In this therapeutic system, tokens were used to reinforce prosocial and life-skill behaviors among hospitalized psychiatric patients, individuals with developmental disabilities, and delinquent youth. A token economy is a set of rules governing the delivery of response-contingent conditioned reinforcers (tokens, points, etc.) that may be later exchanged for one or more backup reinforcers. A backup reinforcer is the reinforcer provided after the conditioned reinforcer signals the delay reduction to its delivery. In a token economy, earning a token signals that the individual is nearer in time to a desired product or service (the backup reinforcer) than they were before the token was given to them.
The Liking Strategy
Identify things that the individual "likes." The logic is simple - things that we like should function as effective positive reinforcers. Use reinforcer surveys and stimulus preference assessments to find out what incentive will be an effective reinforcer.
How does the rate of reinforcement prior to extinction affect how quickly operant extinction occurs?
If behavior has been reinforced every time it occurs (high rate of reinforcement), then after extinction starts, behavior will quickly decrease to baseline levels. Conversely, if behavior was infrequently reinforced (low rate of reinforcement), then following the contingency change to extinction it will take longer for behavior to decrease to baseline levels. This direct relation between prior reinforcement rate and how quickly behavior undergoes extinction is called the partial reinforcement extinction effect (PREE). (Car starting vs Slot Machines)
Objection 3: Cheating
If cheating can produce the positive reinforcer more easily than engaging in the desired behavior, some people will succumb to the temptation.
If we prefer VR schedules over FR, and we are more productive under VR schedules than FR, then why aren't VR schedules used more often?
In our daily lives, the VR schedule is a rarity. Why is this such an underappreciated schedule of reinforcement? Perhaps we fear that VR schedules are a "gateway" to pathological gambling. But if that were true, we would have widespread prohibitions on all games of chance - no more Yahtzee for you! A more likely answer is that nobody outside the game-design and behavior-analytic world knows about the potential of the VR schedule. Your parents don't know about the VR, nor do your managers or teachers.
Why do many behavior analysts use inferential statistics to supplement visual analyses?
Inferential statistics is often considered the "gold standard" method for evaluating behavior change. People may reject visual analysis as unconvincing or too subjective. Most visually convincing behavior changes also pass an inferential statistics test.
How is the question "did behavior change?" answered in the behavioral sciences?
Inferential statistics. In theory it is objective, but in practice it can be quite subjective.
What theory explains the effects of extinction on operant behavior best?
Information Theory of Reinforcement (NOT Response Strengthening Theory)
Teaching Stimulus-Response Chains
Individuals with significant developmental disabilities and early language delays can struggle to acquire new stimulus-response chains. They often cannot understand instructions or imitate the behavior of others. Acquiring a stimulus-response chain such as using the toilet, making popcorn, or accessing a favorite game on a tablet can greatly enrich their life and that of their parents. Therefore, a behavioral technology for teaching stimulus-response chains represents one more opportunity to positively influence behavior. Consider the case of Emily, a 4-year-old diagnosed with autism spectrum disorder. At baseline, she spoke no more than one word at a time, frequently made throat clearing sounds, and often dropped to the floor while crying and screaming. The behavior analysts who worked with Emily sought to teach her how to use a tablet computer to verbally ask for something she wanted, like an Elmo phone (King et al., 2014). To accomplish this, they first conducted a task analysis; that is, they provided a precise specification of the sequence of antecedents, responses, and consequences that comprise a stimulus-response chain. For Emily, one of her stimulus-response chains was: IF see tablet elsewhere (SD) AND walk to and pick up tablet → THEN tablet in hand. IF tablet in hand (SD) AND search for Elmo icon → THEN find Elmo icon. IF see Elmo icon (SD) AND touch Elmo icon → THEN tablet says "Elmo phone" and Emily is given the Elmo phone to play with. Thus, Emily had to learn three behaviors, emit them in a precise sequence, and do so under tight antecedent stimulus control. This would be simple to teach to a typically developing 4-year-old, but Emily's developmental disability made this a challenge. To teach this stimulus-response chain, the behavior analysts used a training technique known as backward chaining. When using backward chaining, the final link in the stimulus-response chain is taught first and, once that link is mastered, additional links are added in reverse order. Forward chaining can also be done; it involves teaching the links in the stimulus-response chain in the order they will need to be emitted. Both techniques are effective (Slocum & Tiger, 2011). In Emily's case, backward chaining involved first teaching her to touch the Elmo icon on the tablet computer - the last response in the chain. When that response was under stimulus control, the preceding response in the chain - searching for the Elmo icon on a screen containing several reward-depicting icons - was taught. Note how the consequence of Emily's screen-searching behavior produced the SD (the Elmo icon) that then evoked the final response in the chain (touching the icon to earn some play time with the Elmo phone). Eventually Emily mastered the entire chain and, in so doing, acquired a bit more independence. Now when she wanted to play with her favorite toy, she could grab her tablet and tell her parents exactly what she wanted. Prior to acquiring this stimulus-response chain, Emily must have experienced frustration with her inability to communicate her desires; frustration that could lead to disruptive behavior.
What are the primary and other effects of operant extinction?
Primary: It returns behavior to its baseline (no-reinforcer) level. Other Effects: 1. Extinction-induced emotional behavior 2. Extinction burst 3. Extinction-induced variability 4. Extinction-induced resurgence
Reinforcer Size
Larger reinforcers are more effective than smaller reinforcers, and a large empirical literature supports this. The caveat to this rule is that the larger the reinforcer, the sooner satiation (an AO) will occur and, when it does, the behavior will come to an end. Thus, the behavior analyst hoping to initiate and maintain behavior will want to use a reinforcer of adequate size to get the behavior going, but not so large that the accumulated reinforcers have an AO effect.
clicker training with humans
Levy, when interviewed in 2018 by the Hidden Brain podcast, noted that clicker training has another, less easily quantified, advantage - using a clicker keeps the learner focused on the performance. Using verbal praise and criticism, the traditional teaching method, continuously draws the medical student's attention away from the surgical technique and places it on the teacher's evaluation of the student's competency as a person. Clickers, Levy noted, are "language free...baggage free...I'm quiet, you're quiet; we're just learning a skill." Learners find clicker feedback less disruptive and more fair than verbal feedback. A simple click keeps things objective, and better surgeons are the result. Using clickers to teach humans to execute complex behaviors has only begun to be studied, but the results are encouraging. Clicking marks correct responses at the moment they occur. This improves the skilled performances of athletes, dancers, and individuals with autism as they learn important life skills. They might also prove effective with patients in physical therapy, when learning to do the exercise their physical therapist is teaching. Here it is important that the exercise be done exactly correctly, lest the patient further injure themselves, or not benefit from the exercise as they should. Perhaps you too will think of an application of clicker training to positively influence human behavior.
Verbal Learning and conditioned punishers
Like conditioned reinforcers, conditioned punishers may be established through verbal learning. That is, explaining that the onset of a stimulus signals a delay reduction to a backup punisher will, in principle, establish that stimulus as a conditioned punisher.
Natural selection and primary reinforcers
Natural selection favors those who repeat behaviors that produce life-sustaining consequences. As an illustrative thought experiment, consider two infants living in Paleolithic (cave-person) times. One infant finds its mother's milk highly reinforcing - suckling behaviors that produce this milk are repeated again and again (Kron, 1967). The second infant's behavior is not reinforced by milk - the milk was sampled once and suckling was not repeated. The genes of the first infant have prepared it for survival. It did not need to learn anything new for mother's milk to reinforce its behavior; the first sip was enough to increase the future probability of successful suckling behaviors. The second infant's genes are, sadly, a death sentence in an age without modern medical technologies. An early death prevents these genes from reaching the next generation. The result of this natural selection process is that mother's milk functions as a reinforcer for virtually all human infants and other mammals. Individuals are more likely to survive if their genes establish life-sustaining consequences as reinforcers. Behavior analysts refer to these life-sustaining consequences as primary reinforcers.
noncontingent consequences in North Korea
North Korea. Communist. Everyone gets the same rewards whether they do their jobs well or not, so there is no encouragement to work hard. The result: reduced productivity, innovation, and creativity.
4 simple schedules of reinforcement
Note how the response rates are higher under the ratio schedules of reinforcement than the interval schedules. Faster responding produces reinforcers more quickly under a ratio schedule but not an interval schedule. Note also that post-reinforcement pausing occurs under the FR and FI schedules. The variable schedules (VR and VI) maintain steadier response rates. Finally, note how the VR schedule maintains the highest response rate of them all. Other than in the gambling and video-gaming industries, it truly is the most underappreciated of the schedules of reinforcement.
Two Term Contingency
The basic operant contingency. Two terms: response and consequence.
Percentile Schedule of Reinforcement
Offers a simple automated training technique incorporating the six principles of effective shaping. Behavior analysts have used percentile schedules of reinforcement to improve academic performance and social interactions among children with disabilities and to reduce cigarette smoking. Step 1: Determine the terminal behavior. Step 2: Collect 10 days of baseline data. Step 3: Choose the 60th percentile of the last 10 days of data as the goal that must be met or exceeded to earn that day's reinforcer (repeat each day).
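A minimal sketch of Step 3, assuming the behavior is measured on a single numeric dimension (e.g., minutes of studying per day) and that meeting or exceeding the 60th-percentile criterion earns that day's reinforcer. The data, function names, and nearest-rank percentile method are illustrative assumptions, not taken from the textbook:

```python
# Hypothetical illustration of a percentile-schedule criterion (Step 3 above).
def percentile_criterion(last_10_days, pct=60):
    """Nearest-rank percentile of the last 10 days of data."""
    ranked = sorted(last_10_days)
    index = max(0, int(round(pct / 100 * len(ranked))) - 1)
    return ranked[index]

def earns_reinforcer(todays_value, last_10_days, pct=60):
    """Today's reinforcer is delivered only if today's value meets the criterion."""
    return todays_value >= percentile_criterion(last_10_days, pct)

baseline = [12, 15, 10, 18, 14, 16, 11, 20, 13, 17]  # 10 days of baseline data (minutes)
print(percentile_criterion(baseline))   # today's goal: 15
print(earns_reinforcer(19, baseline))   # True -> deliver today's reinforcer
```

Each day the most recent 10 observations are used, so the criterion adjusts automatically as performance improves - the automated shaping the flashcard describes.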
Ways positive and negative reinforcers are different
Presentation vs. removal/reduction/prevention. If given a choice, people prefer positive reinforcers.
Principles for effective conditioned reinforcement
Principle 1: Use an effective backup reinforcer. Principle 2: Use a salient conditioned reinforcer. Principle 3: Use a conditioned reinforcer that signals a large delay reduction to the backup reinforcer. Principle 4: Make sure the conditioned reinforcer is not redundant.
Arranging effective conditioned punishers
The conditioned punisher will more effectively suppress behavior when (1) the backup punisher (US) is a phylogenetically important event, (2) the conditioned punisher (CS) is highly salient (noticeable), (3) the conditioned punisher signals a large delay reduction to the US, and (4) the conditioned punisher is not redundant with another stimulus that already signals the US is coming. Delay Reduction Ratio = (US→US interval) / (CS→US interval)
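A worked example using the numbers from the conditioned punisher entry above: shocks occur, on average, once every 2 minutes (US→US interval = 120 s) and the blue light comes on 2 s before each shock (CS→US interval = 2 s), so the Delay Reduction Ratio = 120 / 2 = 60. This large delay reduction is why the blue light quickly becomes an effective conditioned punisher.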
Functional Communication Training
Problematic demands for attention (e.g., tantruming) are extinguished while appropriate requests (e.g., "will you play with me please") are established and reinforced. When a child has learned the latter response, they can appropriately request attention when they want it. Functional Communication Training has proven effective in reducing inappropriate requests for social reinforcers in a variety of populations and settings.
Marking
The conditioned reinforcer immediately follows the response, and this helps the individual learn which response produced the backup reinforcer.
What behaviors are most likely to be punished?
Punishment decreases the future probability of behavior, so we are inclined to use it when it is imperative to decrease behavior now, before someone is hurt. Punishment is used in applied (clinical) settings when the behavior is dangerous to oneself or others. In addition, punishment is most often used only after other interventions have failed. For example, sometimes a reinforcement-based intervention, like differential reinforcement, will reduce problem behavior without punishment, but sometimes it will not. When these interventions do not reduce a behavior that is dangerous to oneself or others, effective punishment is needed. Indeed, it is the position of the Association for Behavior Analysis International that clients have a right to effective behavioral treatment, and sometimes that treatment will include punishment.
What has been the most prevalent means of decreasing problem behavior throughout history?
Punishment. Spanking/hitting misbehaving children. Criminal justice and religions often depend on punishment. Differential Reinforcement offers a better option for reducing problem behavior.
TACTIC 2: TRAIN DIVERSELY.
Rather than teaching a skill in one setting, with one teacher, and with one reinforcer, generalization can be promoted by training in multiple settings, with multiple teachers and with a variety of reinforcers. Each time the skills training moves to a new setting, new instructor, and so on, an opportunity for generalization is presented and, if it occurs and is reinforced, the individual may learn that the skill is applicable to many settings, not just the SD setting in which it was acquired.
Open Science Collaboration (2015)
Attempted to replicate 100 prominent findings and reported that more than 60% of the findings could not be replicated.
What advantages does visual analysis have over inferential statistics?
Requires that behavior analysts develop methods for producing large changes in behavior that are apparent to the naked eye. Recognizes the inherent subjectivity of science; doesn't try to hide behind claimed objectivity. Reduces biases by putting the behavior analyst's reputation on the line.
Differential Reinforcement of low-rate behavior (DRL)
Responding quickly is extinguished and responding slowly is reinforced. This happens naturally if you visit a foreign country with only a basic comprehension of their language. If the person giving you directions to the bus terminal speaks too quickly, you cannot understand them, so you look at them quizzically and withhold the usual reinforcers such as "merci." When the person repeats the directions slowly, you understand and then say, "merci beaucoup."
Escape Extinction
Responding that meets the negative reinforcement contingency no longer removes or reduces the aversive event. As a result, responding decreases to baseline (no-reinforcer) levels. In one food-refusal intervention, two contingency changes were made. The first, escape extinction, was designed to reduce food refusals: IF food refusal (tantrum) → THEN food is not removed. The second contingency change, negative reinforcement escape (SRE−), was designed to increase eating: IF food consumed → THEN child can choose to end the meal.
Applied Behavior Analysis
Rigorous scientific research focused on socially significant behavior in non-laboratory settings. Expected to demonstrate a level of effectiveness that is readily apparent to consumers. Practical interventions that produce meaningful improvements in socially significant behavior.
Do humans and animals prefer FR or VR schedules?
Several experiments have posed this FR vs. VR question to nonhuman animals and their answer is clear - as the price of the reinforcer increases (i.e., more work is required to get the reinforcer), subjects strongly prefer the VR schedule. This finding is illustrated in Figure 11.8. When these four pigeons could choose between an FR 3 and a VR 3 (the data point furthest to the left), they did not prefer one over the other - their choices were 50-50. But as the work requirements on the FR and VR schedules increased, all four pigeons came to strongly prefer the VR schedule. Humans have rarely been given this choice, but it appears that they also prefer VR schedules, particularly at high-ratio requirements and when they have the opportunity to occasionally "win" the reinforcer after a single response. Indeed, the reason VR schedules are preferred is closely linked to these occasional "wins" that are arranged on the VR, but never on the FR. That is, those few times when the individual obtains a reinforcer after very little effort, like a slot machine player who wins on the very first play. If these occasional wins don't occur under the VR schedule (but the ratio value remains the same), preference for the VR vanishes. The allure of gambling, of course, is the "win" - a big reward obtained after very little work. Casinos understand VR schedules and they use them to keep players spending because the very next wager could be a winner.
Stimulus Presentation
Something new is added to the environment
Alternating Treatments Design
The IV is turned ON and OFF rapidly to evaluate if this systematically and repeatedly changes behavior. Used to evaluate functional relations between behavior and one or more IVs (good if there is more than one treatment to test). This design should only be used if the effects of the treatment can be expected to occur quickly; if not, a reversal design would be better. Internal validity comes from repeatedly turning the IV ON and OFF.
6. Use a punisher in the goldilocks zone
This final characteristic of effective punishment is concerned with the intensity/size of the punishing stimulus. The humane use of punishment seeks to find the least restrictive punisher that reduces the future probability of behavior. That punisher exists in the so-called Goldilocks zone: Not too aversive and not so benign that it does not decrease behavior. One intuitive, but ineffective, strategy for finding that Goldilocks-zone punisher is to start with a low-intensity punisher and gradually increase its aversiveness until it works. Unfortunately, habituation learning renders this strategy ineffective. As a refresher, habituation occurs when an eliciting stimulus (e.g., lemon juice in the mouth) gradually elicits less phylogenetic behavior (salivation) when that stimulus is presented repeatedly. Applied to punishment, repeatedly presenting an aversive stimulus reduces negative reactions to that stimulus and decreases its efficacy as a punisher. In applied/clinical settings, behavior analysts have developed techniques for finding the Goldilocks-zone punisher. In one technique, potential punishers are presented briefly (to avoid habituation) and the client's response is recorded. Potentially effective punishers are those that elicit the beginnings of a negative emotional response and/or evoke escape behaviors such as turning away or holding up a hand to stop. Conducting these brief punisher assessments can increase the efficacy of punishment-based interventions. After all, one person's punisher might be another person's reinforcer.
Summary of Influencing Impulsive Choice
This chapter defined choice as a voluntary behavior occurring in a context in which alternative behaviors are possible. We reviewed four variables that tend to produce exclusive choice for one alternative over another. That is, we strongly prefer to obtain a reinforcer instead of nothing, to get a larger over a smaller reinforcer, to work less for the same reinforcer, and to receive reinforcers sooner rather than later. Choice is less exclusive when the outcomes of choice are uncertain. This describes the hunter-gatherer environment in which human behavior evolved. Whether hunting game or foraging for berries, our ancestors learned where food was more likely to be found, but there were no guarantees. In this uncertain environment, their behavior probably conformed to Herrnstein's matching equation, which holds that choice is influenced by relative rates of reinforcement. In modern times, the matching law can help us understand choices that are otherwise difficult to explain: terrorism, substance-use disorder, and white nationalism. In each case, the matching law suggests reducing these behaviors may be achieved by arranging alternative reinforcers that will substitute for the problematic reinforcer. These reinforcers will be more competitive for human behavior if they are more meaningful (and more immediate) than those reinforcers currently maintaining behaviors destructive to self and others. The chapter also explored the impulsive choices we all make but later regret. These preferences for a smaller-sooner over a larger-later reward can have a negative impact on our lives. We sleep longer than we planned, lapse from our diet, and may even cheat on our partner because the benefits of doing so are immediate and the benefits of doing otherwise are delayed and, therefore, discounted in value. Many studies conducted in many species have established that delayed rewards are discounted in value and that the shape of the delay-discounting function is a hyperbola. Hyperbolic discounting can explain why we repeatedly change our minds - preferring the larger-later reward most of the time, and defecting on these good intentions when the smaller-sooner reward may be obtained immediately. While steeply discounting the value of delayed rewards is correlated with substance-use disorders and other health-impacting behaviors, there are methods for reducing impulsive choice. One method is the commitment strategy - choosing the larger-later reward well in advance, when the smaller-sooner reward is not immediately available. To the extent that this commitment cannot be reversed, this greatly increases self-control choices. The other method, delay-exposure training, provides the individual lots of experience with delayed reinforcers. Delay-exposure training seeks to change the individual's learning history, thereby producing large and lasting reductions in impulsive choice.
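For reference, the two standard equations behind the terms named in this summary. Neither form appears in the flashcard itself; these are the commonly used textbook versions of Herrnstein's matching law and Mazur's hyperbolic discounting equation, where V is the present value of a reward of amount A delivered after delay D, and k indexes how steeply delay discounts value:

```latex
\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}
\qquad\qquad
V = \frac{A}{1 + kD}
```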
Summary of Four Variables Affecting Choice
This section has considered four variables that, all else being equal, will generate exclusive choice. When we say "all else being equal," we mean that the only difference between the two alternatives is that one, for example, produces the reinforcer faster than the other. If there are no other differences, then an exclusive choice will develop for the alternative that yields (1) a reinforcer (vs. one that does not), (2) a bigger/better reinforcer, (3) the same reinforcer less effortfully, and (4) the same reinforcer sooner. The remainder of this chapter will focus on nonexclusive choices. That is, situations in which we prefer BL, but we don't choose it all the time; sometimes we choose BR. For example, an angler may prefer to fish in lake BL but will sometimes fish in lake BR. Similarly, a shopper may prefer clothing store BL, but will sometimes shop in BR (to see if they have any bargains). In both of these situations, the consequences of choice are less certain than discussed earlier and, as a result, choice is nonexclusive; that is, we prefer one lake/store but search for reinforcers in both. Another form of nonexclusive choice is variously described as impulsivity, self-control, or delay of gratification. Sometimes we make choices that we describe as "impulsive" and then regret the decision. For example, we order a high-calorie dessert after eating a full meal, and after enjoying the immediate benefits of this choice, we regret our decision and berate ourselves for our lack of "self-control" or "willpower." Appealing to reifications like "impulsivity" and "willpower" reveals how little we understand the choices we make. As you will see later, nonexclusive choices between, for example, dieting and breaking from the diet, are predictable (one of the goals of behavior analysis). More importantly, some progress has been made in identifying functional variables that can be used to positively influence choice. As always, we hope readers of this book will further this progress and will help to reduce impulsive choices such as problem drug use, uncontrolled eating, and pathological gambling.
The second goal of behavior analysis
To discover functional variables that may be used to positively influence behavior
Fixed-Ratio Schedule
Under a fixed-ratio (FR) schedule the number of responses required per reinforcer is the same every time. For example, under a fixed-ratio 3 schedule (abbreviated FR 3), three responses are required to satisfy the reinforcement contingency. Under an FR 10 schedule, the reinforcer will not be delivered until the 10th response is made. Figure 11.2 depicts an FR 3 schedule of reinforcement. Each black circle represents a response and the green arrows indicate when reinforcers are delivered. Because this is an FR schedule, the reinforcer is provided after the third response, every time. It does not matter how much time passes between responses; reinforcers are delivered only after the third response is made. Said another way, the ratio requirement is three responses, every time. If a college student gives herself a break contingent upon answering six questions in her textbook, then she has arranged an FR 6 schedule of reinforcement.
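A minimal illustrative sketch (not from the textbook) of how an FR contingency operates: the schedule simply counts responses, delivers the reinforcer on the Nth response, and resets the counter to zero.

```python
# Hypothetical simulation of a fixed-ratio (FR) schedule.
def make_fr_schedule(ratio):
    count = 0
    def respond():
        nonlocal count
        count += 1                 # each response advances the counter
        if count == ratio:         # the Nth response satisfies the contingency
            count = 0              # counter resets to zero after the reinforcer
            return "reinforcer"
        return None                # earlier responses produce nothing
    return respond

fr3 = make_fr_schedule(3)
print([fr3() for _ in range(7)])
# [None, None, 'reinforcer', None, None, 'reinforcer', None]
```

The counter reset is the detail the next entry appeals to when explaining post-reinforcement pausing and procrastination.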
FR Schedule and procrastination
Understanding what causes the FR schedule's characteristic post-reinforcement pause may be helpful in understanding our tendency to procrastinate, or a child's tendency to become emotional or aggressive when asked to transition from one activity to the next. One theory of the post-reinforcement pause is that the individual has learned how the contingency of reinforcement operates - they have learned that after a reinforcer, the FR response counter is reset to zero. Therefore, the next reinforcer has never been further away, both in terms of the amount of work and the time to the next reinforcer. This distance to the next reinforcer is aversive and it can evoke negative emotions, aggression, and escape. It is not surprising, then, that when children are asked to transition from a reinforcing activity (playing with a friend) to an activity that requires more work and more time to reach the next reinforcer, emotional outbursts may occur. For adults who no longer have to do what their parents tell them to do, post-reinforcement "procrastination" is of greater interest. For example, when we finish writing a term paper and have the freedom to decide what to do next, most of us will take a break - we grab lunch with friends, watch a movie, or take the dog for a walk - we don't immediately begin working on the next term paper. Such instances of "procrastination" may occur because, once again, the FR response-counter has been reset to zero - the next reinforcer for task completion has never been further away. Undertaking the next big project is an aversive prospect and we avoid it by engaging in other behaviors that will produce a more immediate, less-effortful reinforcer, like lunch. Thus, procrastination is natural - we all do it, including our nonhuman friends. One strategy for reducing the harm of procrastination is to plan for it. That is, make concrete plans to take a break after completing a big task, but commit to making it a healthy break; an hour for lunch with a friend, a 20-minute walk outside to clear the head, a "nappuccino" - drink a cup of coffee, take a 25-minute nap, and then undertake the next task. This strategy sets an end-time on the break, so we are less likely to move continuously from one break to the next, endlessly procrastinating from the next task.
What do we learn from Masterpascua's 2009 rat experiment?
We learn about behavioral epigenetics. Rats whose mothers neglected them failed to have a specific gene turn on in their brains, which left them unable to respond well to stressful situations later in life. Then, when they became mothers, they also neglected their offspring, passing down the maladaptive stress reaction. A combination of nature and nurture.
Replication across labs or clinics
If results can't be replicated at different clinics or labs, perhaps behavior changes are due to an unrecognized confound rather than the independent variable.
Strategy 4 to replace bad habits with good habits
Experience the self-esteem-building intrinsic reinforcers that can only be sampled when we, even briefly, engage in the desired behavior.
Single Subject Experimental Designs
Expose individuals to baseline (IV OFF) and experimental (IV ON) phases to determine if the independent variable systematically and reliably changes behavior. Tailored interventions for the unique needs of the patient.
Verbal Learning and Conditioned Reinforcers
Parents explain to you that tickets win arcade prizes (that tickets are delay-reduction stimuli for prizes), which makes the tickets function as conditioned reinforcers for skillfully playing games. During verbal learning, information is provided indicating that the conditioned reinforcer signals a delay reduction to another reinforcer. That is, the Pavlovian CS→US contingency is verbally described. Laboratory studies reveal that humans are capable of verbally learning Pavlovian contingencies. For example, if humans are instructed that a red light will precede the delivery of an electric shock (US), then when that instructed CS is encountered, an involuntary physiological fear response occurs (the conditioned response [CR]), even when the shock does not (Phelps et al., 2001). Similarly, in your everyday life, if a friend tells you their dog bites in a vicious and unpredictable way (an event that would function as a highly aversive US), then when you see the dog approaching (CS), you will experience fear (CR), despite having never seen the dog bite anyone.
Stimulus Events
Things you see, hear, smell, taste, or feel which can occur outside or inside your body.
What are some examples of external and internal stimuli?
External - seeing a sign for my favorite restaurant. Internal - feeling hungry
Calculating IOA when using event recording
Can be calculated in the same way as with outcome recording. Or, to more precisely evaluate agreements and disagreements, you can calculate IOA within each of the 1-minute observation intervals. The second (interval-by-interval) method is preferred.
What are the two ways in which behavior can change from the baseline?
Can change in trend or level.
What potential dangers are there when behavior analysts evaluate whether behavior changed?
Can invite subjectivity, prejudice, and bias into the scientific process.
Functional Analysis of Self-Injurious Behavior
Conducting a functional analysis of behavior is critical in the treatment of self-injurious behavior. Is this behavior an operant? If so, what reinforcer could possibly maintain behavior that is so clearly harmful to the individual? Answers to these questions will influence the treatment prescribed. Therefore, it is good news that a functional analysis of behavior identifies the reinforcer maintaining problem behavior most of the time. Because the physical stimulation of self-injurious behavior is an automatic consequence of the response, if it functions as a reinforcer, it will be identified as an automatic reinforcer. To find out if self-injury is maintained by automatic reinforcement, the individual would, safety permitting, be given some time alone while the behavior analyst discreetly records the frequency of self-injury. If self-injury occurs at the usual rate while alone, then the only consequence that might maintain this behavior is the automatic outcome of that response, that is, the "painful" stimulation. If automatic reinforcers maintain a problem behavior, then extinction is impossible - the therapist cannot turn OFF the automatic stimulation experienced each time face slapping occurs (Vollmer et al., 2015). Preventing self-injury by restraining the client is not extinction - during extinction, the response occurs but the reinforcer does not. Let's assume that automatic reinforcement is not responsible for the self-injurious behavior. During a functional analysis of behavior, several other hypothesized reinforcers will be evaluated. Does problem behavior occur primarily when attention is the consequence of self-injurious behavior? If so, then extinction will involve no longer delivering this positive reinforcer contingent upon self-injury. However, it is often ethically impossible to extinguish attention-maintained self-injurious behavior. One cannot ignore self-injury if serious physical harm to the client is the outcome. Alternatively, a functional analysis may reveal that self-injurious behavior occurs because it allows the individual to escape from everyday tasks. For example, if self-injury occurs when the individual is asked to change clothes or transition to a new activity, then negative reinforcement (escape from the activity) is responsible for the problem behavior. In such cases, escape extinction can be effective in reducing self-injury; that is, clothes will be changed and transitions between activities will occur regardless of self-injurious behavior. But, as mentioned earlier, the behavior analyst should be prepared for extinction-induced emotions, physical aggression, and topographical variability in the self-injurious behavior, including an increase in the magnitude of the self-injurious response. Such reactions may make it impossible to use extinction alone.
Strategy 1 to replace bad habits with good habits
Find the antecedents that evoke bad habits. First, identify the antecedent stimuli that reliably evoke our bad habits. To do this, simply notice when and where you are when you find yourself engaged in the bad habit. After directly observing your behavior for a few days, consider whether a set of antecedent stimuli reliably sets the occasion for the bad habit. If so, consider if these stimuli can be hidden from view (or hearing, smell, etc.). For example, if you have a bad habit of snacking when you see the bag of chips on the couch, get rid of the chips. Throwing them away will work, but a less drastic strategy is to put them in a cabinet, well behind other foods that are better for your health. If you are less likely to run across the chips, you are less likely to mindlessly eat them.
Predicting Preference Reversals
Figure 13.12 illustrates a unique prediction of hyperbolic discounting - that there are predictable times when, from one moment to the next, we will change our minds about what is important in life. In the graph from the preceding section (Figure 13.11), when dessert was immediately available, our decision-maker preferred cake (SSR) over the delayed benefits of dieting (LLR). But in Figure 13.12, neither reward is immediately available at T2, where the stick-figure decision-maker contemplates its options. From this location in time, both rewards are hyperbolically discounted in value and, importantly, these subjective values have changed places. Now the subjective value of the LLR (red dot) is greater than the discounted value of the SSR (green dot). At T2 the decision-maker can clearly see the benefits of dieting and resolves to stick with it. Unfortunately, when an immediately available dessert is encountered again, moving the decision-maker from T2 to T1 (in Figure 13.11), a "change of mind" will occur. The subjective values reverse positions again, dessert is consumed, and self-loathing ensues. Our stick-figure decision-maker notes its lack of "willpower" and resolves to buy a self-help book (How to Stick to Your Decisions). Preference reversals have been extensively studied by behavioral scientists interested in human and nonhuman behavior; they predictably occur when the decision-maker is moved, in time, from T1 to T2 (or vice versa; Ainslie & Herrnstein, 1981; Green et al., 1994; Kirby & Herrnstein, 1995). One reason for the interest in preference reversals is that "changing your mind" from a self-control choice to an impulsive choice is maladaptive. Another reason is that it seems so irrational, and yet we all do it. For example, most of us set the alarm with good intentions to get up, but when the alarm goes off we snooze it at least once. This is a preference reversal. The night before (at T2 in Figure 13.12), the subjective value of getting up early (the red dot at T2) was higher (more valuable) than the discounted value of a little more sleep tomorrow morning (the green dot at T2). However, while sleeping, the decision-maker moves from T2 to T1. When the alarm goes off at T1, extra sleep is an immediate reinforcer and it is more valuable than the discounted value of the delayed benefits of getting up on time. A very similar preference reversal occurs when individuals with a substance-use disorder commit to outpatient therapy with the best of intentions to quit (Bickel et al., 1998). These choices are often made at T2, in the therapist's office when drugs are not immediately available. Here the subjective value of the delayed benefits of drug abstinence (the red dot at T2) is greater than the discounted value of drugs (the green dot at T2). A few days later when drugs are available immediately (T1), the individual finds that they cannot resist the immediate temptation of drug use. The subjective values of these choice outcomes have reversed, as has the decision to abstain from drugs.
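A quick numerical sketch (not from the text) can make the reversal concrete. It assumes the standard one-parameter hyperbolic form V = A / (1 + kD); the reward amounts, delays, and k value below are hypothetical, chosen only to show the two subjective values crossing as the decision point moves from T2 to T1.

```python
# Minimal sketch of a hyperbolic-discounting preference reversal, assuming the
# standard one-parameter form V = A / (1 + k*D). Amounts, delays, and k are
# hypothetical values chosen only to illustrate the T1/T2 reversal.

def discounted_value(amount, delay, k=0.5):
    """Subjective (discounted) value of a reward available after `delay`."""
    return amount / (1 + k * delay)

# Smaller-sooner reward (SSR, e.g., dessert) vs. larger-later reward (LLR, e.g., benefits of dieting)
SSR_AMOUNT, LLR_AMOUNT = 10, 30
SSR_DELAY_AT_T1, LLR_DELAY_AT_T1 = 0, 20  # at T1 the SSR is immediate

for label, extra_delay in [("T1 (SSR immediate)", 0), ("T2 (both rewards delayed)", 15)]:
    v_ssr = discounted_value(SSR_AMOUNT, SSR_DELAY_AT_T1 + extra_delay)
    v_llr = discounted_value(LLR_AMOUNT, LLR_DELAY_AT_T1 + extra_delay)
    choice = "SSR (impulsive)" if v_ssr > v_llr else "LLR (self-control)"
    print(f"{label}: V(SSR) = {v_ssr:.2f}, V(LLR) = {v_llr:.2f} -> prefers {choice}")
```

At T1 the immediate SSR dominates; adding a common front-end delay (T2) flips the ordering, which is the preference reversal described above.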
Strategy 5 to replace bad habits with good habits
Gradually increase the daily goal, remembering to keep it easy - the goal is to keep doing the behavior in the presence of the antecedent stimuli until, one day, it becomes habitual. It's just something we do, whether we feel like it or not. Following the shaping advice of Chapter 8, the next step in the sequence of successive approximations arranges the self-esteem-building reinforcer for walking just a little more - 45 seconds. The idea is not to get into shape right now; it is to get into the habit of exercising. Each time we see the treadmill, we get on it and we walk. We don't dread getting on it because it is a painless, positive-identity-building activity. Each day we are forming a good habit with physical and mental-health benefits that will compound over the lifetime.
What do people see and then say "the behavior is willed" or "the behavior is not willed"?
If the behavior appears to have no direct cause and the behavior appears to have a goal, people tend to say it is willed. But this appearance is misleading; from the behavior-analytic view, the behavior is still determined - free will is not real.
Positive punishment
Stimulus presented to decrease future probability of behavior. The contingent presentation of a consequence that decreases the future probability of the behavior below its no-punishment level. When a positive-punishing consequence occurs, it adds something to the environment; something that was not there before the punished behavior occurred. Example: drinking spoiled milk. Common in nature: stepping on a thorn, predator attacks.
Negative punishment (penalty)
Stimulus removed, reduced, or prevented to decrease future probability of behavior. The contingent removal, reduction, or prevention of a reinforcer; the effect of which decreases the future probability of the behavior below its no-punishment level. Hockey: referees remove players for violating safety rules. Nature examples: dropping food, aggressive play.
4 Defining Features of Single-Subject Designs
1. Focus is on behavior of individuals, not groups. 2. Each subject experiences the baseline and experimental (intervention) phases 3. Behavior is measured repeatedly in each phase until confident predictions about behavior may be made. 4. Internal validity is assessed through replication and evaluating the functional role of confounded variables.
Six characteristics of effective punishment interventions
1. Focus on reinforcement first 2. Combine punishment with extinction and/or differential reinforcement 3. Deliver punishers immediately 4. Deliver punishment contingently 5. Punish every time 6. Use a punisher in the Goldilocks zone
What are two stable environmental conditions?
1. Human infants are born to mothers who provide nutrition by breast feeding 2. Mothers carry infants from place to place
Commonly used punishers
1. Time-out from positive reinforcement 2. Response-cost punishment Both of these are negative-punishment contingencies. Because stakeholders find the removal of a positive reinforcer to be more acceptable than the presentation of a punishing stimulus, these punishment contingencies are widely used by applied behavior analysts and parents. With a little creativity, they can also be used by managers who need to reduce behaviors that put a company at risk.
3 kinds of replications built into single subject experimental designs
1. Within-individual replication 2. Across-individual replication 3. Replication across labs or clinics
4 Dimensions of a Reinforcer that affect its efficacy
1. contingency 2. size 3. quality 4. immediacy
How do we replace bad habits with good habits?
5 Strategies (listed below) So, if you are ready to form a good habit, (1) find the antecedents that evoke bad habits and (2) replace them with stimuli that will, one day, evoke a good habit. Then, (3) set the bar very low and (4) experience the self-esteem-building intrinsic reinforcers that can only be sampled when we, even briefly, engage in the desired behavior. Finally, (5) gradually increase the daily goal, remembering to keep it easy - the goal is to keep doing the behavior in the presence of the antecedent stimuli until, one day, it becomes habitual. It's just something we do, whether we feel like it or not.
Habit (dictionary)
A pattern of behavior that is repeated so often that it becomes almost involuntary. We habitually engage in these behaviors without explicitly considering their consequences and some of those consequences are not in our long-term best interests.
Time-out from positive reinforcement
A signaled response-contingent suspension of a positive-reinforcement contingency, the effect of which decreases the future probability of problem behavior. When the guidelines for using time-out are followed, time-out from positive reinforcement can be an effective way to negatively punish problem behavior. It is a more humane approach to punishment than the corporal punishment techniques (spanking, whipping, etc.) that were common prior to the 1960s, when time-out was first introduced as an alternative to these aversive consequences.
Calculating IOA when using outcome recording
Agreement: occurs when both independent observers agree that the outcome occurred. Disagreement: occurs when one observer scores the outcome as having occurred but the other observer does not.
How are causation and correlation related?
All causal relations are correlated, but not all correlations reveal causal relations
What is the purpose of a baseline phase?
Allows for accurate predictions of behavior if the independent variable is never turned ON. This increases confidence in predictions about future behavior.
Extinction
The extinction procedure is also simple: the response is never reinforced.
SΔ (pronounced "ess-delta")
An SΔ is an antecedent stimulus that decreases a specific operant response because the individual has learned that when the SΔ is present, that response will not be reinforced (extinction). In the trick-or-treating example, the SΔ was the dark house. The antecedent SΔ decreases behavior relative to the SD. The trick-or-treater's behavior is discriminated: their precious candy-collecting time will not be wasted on SΔ houses.
Comparison (A-B) Design
Arranges a baseline (A) phase (IV OFF) and an experimental (B) phase (IV ON). Can show correlations between intervention and behavior, but can't rule out confounds. Referred to as a "Quasi-Experimental Design"
What kind of learning is pavlovian?
Automatic and unconscious
How can we minimize bounce?
By minimizing between-session changes. Such as appointments at the same time every day, getting a good night's sleep before each session, preventing changes in physical environment.
Punisher
Consequences that decrease the probability of behaviors.
How does a teeth brushing habit get formed?
Consider the good habit of brushing your teeth in the morning. This behavior is reliably evoked by an antecedent stimulus - standing at the bathroom sink after we wake up. Long ago, we explicitly learned the sequence of individual operant responses - grabbing the toothbrush, applying toothpaste, moving the brush back and forth. They were all reinforced with predictable consequences (a loving parent praising our skillful brushing, fresh breath after brushing, etc.). But today, after the behavior has been reinforced thousands of times, the antecedent stimulus-situation reliably evokes habitual brushing behaviors regardless of our momentary motivation for the reinforcers. The ability of these antecedent stimuli to control our habitual tooth brushing is more evident when the stimuli are removed. For example, when we go camping, all of the antecedent stimuli in our bathroom are gone. Without these stimuli, we often "forget" to brush until, much later in the day, we are motivated by a negative reinforcer of the escape variety - ugh, my breath is horrible.
Latency
The interval of time between the opportunity to respond and the response itself. Latency is measured when we want to know how long it takes to initiate the target behavior.
Behavior Analysis
Defined by its goals, assumptions, and major activities.
Examples of negative reinforcement - Escape
Deleting an email - email removed from inbox
Taking an aspirin - headache goes away
Picking your nose - booger is removed
Soothing an infant - baby stops crying
Taking out the trash - garbage removed from apartment
Putting out a fire - fire stops burning
Agreeing to go to the bar - peer pressure ends
Seeing a behavior therapist - depression is reduced
Leaving a room - ending an unwanted conversation
Examples of establishing operations (EOs)
Depriving an individual of any primary reinforcer is an EO. For example, when swimming underwater we are deprived of oxygen - a primary reinforcer. This (1) greatly increases the reinforcing efficacy of oxygen and (2) increases the probability of swimming to the surface. Depriving us of water, heat, and so on has the same two effects: (1) temporarily increasing the efficacy of the primary reinforcer and (2) increasing behaviors that produce these reinforcers. Depriving an individual of an operant behavior (and the reinforcer that maintains it) is also an EO. Such response deprivation (1) increases the efficacy of access to that behavior/reinforcer and (2) increases the behaviors that will allow access to that behavior/reinforcer. For example, if we normally exercise for 40 minutes each day, then depriving us of exercise will (1) increase the reinforcing efficacy of the freedom to exercise when we want and (2) increase behaviors that will give us that freedom. But not all EOs involve restriction and deprivation. Consider what happens if we eat a tub of salty popcorn. Consuming all that salt does not deprive us of a drink of water, but this EO has the effect of (1) temporarily increasing the reinforcing value of a drink and (2) increasing the probability that we will go to the refrigerator to get a drink. Similarly, getting a sunburn does not deprive us of a bottle of aloe vera, but this EO (1) temporarily increases the reinforcing value of this pain-reducing lotion and (2) increases the probability that we will go to the drug store to buy some. Similarly, a headache is an EO that increases the reinforcing value of pain-reducing medications. A final point that can be made through examples is that not all EOs involve primary reinforcers. The services of a mechanic are not a primary reinforcer, but if your car breaks down on the side of the road, this EO will (1) temporarily increase the reinforcing value of the services of a mechanic and (2) increase the probability that you will call a mechanic for help.
The Response Strengthening Theory of Reinforcement
Each obtained reinforcer is hypothesized to strengthen the behavior it follows. The more frequently an operant behavior is followed by a reinforcer, the theory goes, the more firmly it is established, and the more difficult it will be to disrupt (Nevin & Grace, 2000). A metaphor for this hypothesis is a bucket of water - the heavier the bucket, the more response strength; the more response strength, the more probable is the response. Obtained reinforcers give the bucket its weight. Each reinforcer adds a bit more water to the bucket. Frequent or big reinforcers create a heavier bucket than infrequent or small reinforcers.
Response-Strengthening Theory of Reinforcement
Each time a reinforcer is obtained, the strength of the response it follows is increased. Operant responses with a lot of response strength are more likely to occur. This theory is inconsistent with the partial-reinforcement extinction effect (PREE) and with spontaneous recovery of operant behavior.
Respiratory occlusion reflex
Elicited by lack of oxygen. Infant pulls the head backward, brushes the face with the hands, and cries. (All to remove the breathing obstruction)
Parachute reflex
Elicited by the sensation of falling forward. Infant extends its arms in front, breaking the fall
Does ACT Work?
Empirical evaluations of ACT have often taken a bottom-up approach, testing the component parts of ACT to find out which ones work, which can be improved and which should be discarded as a waste of the client's time. These studies have revealed the efficacy of acceptance exercises in persisting with a therapeutic task even though it produces distressing thoughts and feelings (Villatte et al., 2016; Vowles et al., 2007). Likewise, responding flexibly to thoughts (as thoughts) reduces emotional reactions to negative thoughts about one's self (Masuda et al., 2009, 2010), and values-based components of ACT increase self-reported engagement in values-consistent behavior and quality of life (Villatte et al., 2016). Larger studies have evaluated multicomponent versions of ACT, as it is typically implemented by therapists outside of a research setting. These efficacy experiments use the group experimental designs described in Chapter 3. That is, treatment-seeking individuals are randomly assigned to an ACT group and others are assigned to a control group. The control group might receive a different type of psychotherapy or no therapy at all. Overall, substantial evidence reveals ACT is effective in treating anxiety and depression (Hacker et al., 2016) and other mental- and physical-health deficits, such as chronic pain (A-Tjak et al., 2015). ACT appears to work at least as well as other evidence-based psychotherapies (Hacker et al., 2016) with some studies suggesting ACT is modestly more effective (Lee et al., 2015; Ruiz, 2012). In sum, over 300 published studies have established the efficacy of ACT.
Group experimental design
Evaluate if the behavior of a treatment group (IV ON) is statistically significantly different from that of a control group (IV OFF). If so, then the difference is attributed to the independent variable.
Time out in the workplace
First, to arrange a time-out from positive reinforcement, we must identify a positive reinforcer in the workplace. For employees who are paid by the hour, being on the clock is presumably a positive reinforcer; if it were not, the employee would not come to work. To impose a time-out on rude employee behavior, we would simply instruct the employee to clock out. By doing so, we shift the employee from more monetary reinforcers to no monetary reinforcers. If this procedure reduces the employee's frequency of making rude comments to customers, then this time-out from positive reinforcement functions as a negative punisher.
Preference for positive reinforcement in distinguishing between positive and negative reinforcement
If individuals object to negative reinforcement contingencies but not positive reinforcement, then it is important to distinguish between these two types of reinforcement.
Interval Recording
If the duration of the target behavior varies meaningfully from one instance to the next and has no distinct, observable, and lasting product, AND we are interested in the frequency of behavior, we should use interval recording. The target behavior is observed during back-to-back intervals of time. We don't record the number of occurrences; only whether or not the behavior occurs within each interval. There are 2 types (partial-interval and whole-interval recording).
When is punishment humane?
If you plan to be a loving parent, you will want to use punishment as infrequently as possible. Likewise, if a government has an obligation to care for the life, liberty, and pursuit of happiness of its people, then it too should aim to use punishment as little as possible. Punishment can be used infrequently if it is used effectively. That is, if a single punisher produces a rapid and long-lasting reduction in the future probability of problem behavior, then this is more humane than using ineffective negative consequences three times a day, every day, 365 days a year. If the latter is your parental practice, then what your children will learn from your "punishment" is not that they should stop engaging in problem behavior, but that you are a reliable source of negative consequences. In short, your children will learn that you are cruel.
Response-cost punishment
In applied settings, negative punishers that involve the removal or reduction of a reinforcer are often referred to as response-cost punishers. We have already considered several examples of response-cost punishers. When a parent takes a toy away from a misbehaving child, that stimulus removal functions as a response-cost punisher if, and only if, the future probability of misbehavior decreases. Response-cost punishment has also been used to reduce problem behavior in workplace settings. Bateman and Ludwig, for example, reduced the number of food orders sent to the wrong restaurant by subtracting money from warehouse employees' bonus accounts that otherwise accumulated as they met weekly goals. The response-cost intervention reduced errors and saved the company thousands of dollars in lost food.
Reversal (A-B-A) Design
Individual's behavior is evaluated in repeatedly alternating baseline (A) and experimental (B) phases. Better equipped to rule out confounds. It is possible though that there is a confounded variable that is perfectly correlated with the independent variable.
What role does learning play in survival?
Individuals capable of learning are able to adapt to a changing environment in ways that their genes do not prepare them for. It helps individuals survive in an ever-changing environment
Using the Matching Law to Positively Influence Behavior
It should be clear by now that Herrnstein's matching equation may be used to accurately predict choices between substitute reinforcers available in uncertain environments. What about the other goal of behavior analysis? Can the matching equation identify functional variables that may be used to positively influence behavior? If you read Extra Boxes 1 and 2, then you will know the answer is "yes." This section explicitly identifies those functional variables and specifies how to positively influence choice. In the matching equation, the functional variables appear on the right side of the equals sign, that is, R1 and R2. By changing these variables, we can increase socially desirable behavior and decrease undesirable behavior. Let B1 serve as the frequency of socially desirable behavior. Given Herrnstein's equation, there are two ways to increase the proportion of behavior allocated to B1. The first technique is to increase R1. That is, ensure that the socially desirable behavior is reinforced more often. This advice is entirely consistent with everything presented in the book thus far - if we want to see more of B1, reinforce it more often. The top portion of Table 13.1 provides an example of this prediction. The second technique for increasing B1 is to decrease R2. Said another way, if we want to increase socially desirable behavior, get rid of other substitute reinforcers. If we reduce R2 to zero, then we will have arranged a differential reinforcement procedure. That is, B1 will be reinforced, but B2 will not. The matching equation predicts differential reinforcement, if perfectly implemented, will yield exclusive choice for B1, the reinforced behavior. But short of reducing R2 to zero, the matching equation predicts a nonexclusive shift toward B1, the socially desirable behavior. This is demonstrated in the lower portion of Table 13.1. The matching equation also prescribes two techniques for decreasing an undesired behavior (B2). The first technique is to decrease R2. As shown in the upper part of Table 13.2, when we decreased R2 from 10 reinforcers per minute to 2, the percentage of behavior allocated to B2, the undesired behavior, decreased from 50% to 17%. The second technique for decreasing an undesired behavior (B2) is to increase R1. This prediction is shown in the lower portion of Table 13.2. When the rate of reinforcers on R1 is increased (from 10/minute to 50/minute), the percentage of behavior allocated to inappropriate B2 behaviors decreases from 50% to 17%.
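To see the arithmetic behind the percentages above, here is a minimal sketch (not from the text) assuming the strict-matching form of Herrnstein's equation, B1/(B1 + B2) = R1/(R1 + R2). The reinforcer rates are the ones worked through in the passage: 10 vs. 10, then R2 dropped to 2, then R1 raised to 50.

```python
# Minimal sketch of strict matching, B1/(B1 + B2) = R1/(R1 + R2), using the
# reinforcer rates worked through in the passage (Tables 13.1 and 13.2).

def predicted_allocation(r1, r2):
    """Percentage of behavior allocated to B1 and B2 under strict matching."""
    b1 = 100 * r1 / (r1 + r2)
    return b1, 100 - b1

for r1, r2 in [(10, 10), (10, 2), (50, 10)]:
    b1, b2 = predicted_allocation(r1, r2)
    print(f"R1 = {r1}/min, R2 = {r2}/min -> B1 = {b1:.0f}%, B2 = {b2:.0f}%")

# Output: 50%/50%, then 83%/17% (R2 reduced), then 83%/17% (R1 increased),
# reproducing the ~17% figures cited for the undesired behavior B2.
```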
Pavlovian extinction of conditioned punishers
Like conditioned reinforcers, conditioned punishers will gradually lose their ability to influence behavior if they are presented repeatedly without the US - the backup punisher. The reason for this loss of function is Pavlovian extinction. That is, when the CS is repeatedly presented without the US, the individual learns that it no longer signals a delay reduction to the US. In the case of conditioned punishment, a threat is only effective if it actually signals a delay reduction to the backup punisher. Empty threats to punish, whether they come from parents or employers, are soon ignored by children and employees.
Principle 1 for effective conditioned reinforcement
Principle 1: Use an effective backup reinforcer. The first principle of Pavlovian conditioning was "Use an important US." The more important the US, the better the Pavlovian conditioning. Translated to conditioned reinforcement, this becomes: use an effective backup reinforcer. The better the backup reinforcer, the more effective the conditioned reinforcer will be. One strategy for arranging an effective conditioned reinforcer is to use a token that can be exchanged for a lot of different backup reinforcers. For example, a $100 bill is a highly effective conditioned reinforcer because its receipt signals a delay reduction to many different backup reinforcers (ice cream, concert tickets, a new pair of shoes, etc.). When a conditioned reinforcer signals a delay reduction to more than one backup reinforcer, it is referred to as a generalized conditioned reinforcer.
5. Punish every time
Punishers are more effective when delivered following every instance of the problem behavior. If a bully's aggressive behaviors are infrequently punished, it should not be surprising that bullying continues. Similarly, most people steal music and movies online because, if their behavior has been punished at all, it has been punished infrequently. At a societal level, when a government collapses and the police go unpaid, criminal acts are infrequently punished and criminal behavior spikes. Of course, punishing every instance of problem behavior can be impractical. Even the most diligent parent will be distracted by a riveting show on Netflix now and again, thereby missing instances of dangerous behavior that could be reduced, if punished. If an unpunished problem behavior leaves behind a permanent product - such as a crying little brother with a bloody nose - an effective punisher can still be delivered by clarifying the contingency between the problem behavior and the punisher.
Principle 4 for effective conditioned reinforcement
Principle 4: Make sure the conditioned reinforcer is not redundant. The final principle of Pavlovian conditioning was, "Make sure the CS is not redundant." Translated to conditioned reinforcement, we want our conditioned reinforcer to be the only stimulus signaling a delay reduction to the backup reinforcer. For example, imagine that you want to start using a clicker as a conditioned reinforcer when training your dog. Establishing "click-click" as a conditioned reinforcer will be easier if you avoid simultaneously presenting a stimulus that already functions as a conditioned reinforcer. So, if you normally praise your dog just before giving her a treat, then praise probably already functions as a conditioned reinforcer. If you click at the same time that you praise the dog, then the click is a redundant stimulus - the praise already signals a delay reduction to the treat. Because the "click-click" is redundant with praise, it is unlikely to acquire conditioned reinforcing properties. Instead, when establishing "click-click" as a conditioned reinforcer, you should withhold the praise throughout. This ensures that "click-click" is the only stimulus signaling a delay reduction to the treat.
Outcome recording
Record the distinct, observable, and lasting products of behavior, instead of the behavior itself. (ex. how much soap is missing) It's efficient, cost effective, and reduces reactivity. Should be the method given first consideration. Can only be used if the behavior produces a distinct, observable, and lasting product. May not be used if you want to stop or reward the behavior when it happens.
Replication
Repeating the experiment and obtaining the same outcome. Replication is the essence of believability. The most important way to evaluate if scientific discoveries are true.
Tact
Saying "water" has a different function still when it is a tact. A tact is a verbal operant occasioned by a nonverbal stimulus and maintained by a variety of social reinforcers. When "water" is a tact, it is said in the presence of water (a glass of water, a waterfall, a raindrop landing on the skin). When parents hear their toddler tact "water" in the presence of any of these water-related stimuli, they reinforce this behavior - "Yes, that is water! You're so smart." They do not reinforce saying "water" in the presence of other stimuli such as a toy, a chair, or a plate. [Nonverbal Stimulus [Antecedent)] --> [Tact (Behavior)] --> [Variety of socially mediated reinforcers (Consequence)] Note how "water" is a tact when occasioned by con tact with the nonverbal stimulus (the sight, taste, or feel of water) and a mand when initiated by an establishing operation (thirst). When tacting, one describes the environment to others ("Look, a falling star!") and the others (the listeners) reinforce this verbal behavior in a variety of ways ("Whoa, that was amazing!"). When we tact we share our experiences with others, often to their benefit. Tacts such as "The new Marvel movie is great!" or "Watch out!" benefit the listener who may otherwise miss these opportunities to experience positive and negative reinforcers. Teaching a child with a language deficit to tact also benefits that child. If the child can accurately tact, they become more helpful to those around them, their parents, their siblings, their future friends. Our friends and loved ones are people who help us out. Teaching a child to tact teaches them to help others in one specific way - by drawing the attention of others to important things in their environment - water, falling stars, good movies. When a child learns to tact, they learn a behavior that helps them to be a friend.
Shaping Principle 1
Shaping Principle 1 asks us to provide an objective definition of the terminal behavior. That is, describe the terminal behavior in enough detail that it can be objectively measured. The terminal behavior in Plants vs. Zombies - beating the final level of the game - involves a lot of skills that the game shapes up over time. Let's simplify things by focusing on just one of these skills - the expert player clicks things quickly. For example, when suns fall from the sky, the expert clicks them immediately after they appear. Likewise, when a desired seed packet is available, the expert player clicks it right away. When playing at the highest levels of the game, clicking slowly is disastrous - brains!
What type of inferential statistics are appropriate for time-series data?
Simulation Modeling Analysis (SMA) can test single-subject time-series data to detect changes in trend and level. Ideal to supplement visual analysis.
B.F. Skinner
Skinner box, operant conditioning
The Behavior of the Listener
Skinner's functional approach to human language focused on the behavior of the speaker. Less emphasis was placed on how listeners understand what people are saying, writing, or signing. Discoveries made in the late twentieth century rekindled behavior analysts' interest in the behavior of the listener (Sidman & Cresson, 1973; Sidman & Tailby, 1982) and refined the meaning of "verbal behavior." To have insights on this topic, we invite you to recall how you learned to understand the word "cow." Of course, you cannot remember this early learning, but enough research has been done that we can make some educated guesses. Even before your first words, your parents said "cow" while you looked at a picture book containing many barnyard animals, one of which was a cow. They pointed to the cow and said "cow," and later asked you to "Show me the cow." If you pointed to the correct animal, they were elated and, as shown in the left panel of Figure 14.1, provided social reinforcers ("That is a cow, you are so smart!"). Not to demean your accomplishment, but we could teach your dog something similar. If you don't believe us, check out the YouTube video called Dog Understands 1022 Words. There you will see how a dog named "Chaser" learned to "show me the cow" by finding a stuffed cow among a pile of toys. Although this is amazing, we would not consider Chaser's behavior verbal. We will point out the missing ingredient in a moment, but first notice what your parents did after they taught you to point to the cow (something we would never do with Chaser the dog); your parents showed you the picture of the cow and said, "What is this?" When you did not say "cow," they trained it first as an echoic ("Say cow") and then as an intraverbal tact. That is, you learned to say "cow" only when the picture of the cow was presented (tact) and when someone asked "What is this?" (intraverbal). When you correctly said "cow," your parents went out of their minds with joy, showered you with reinforcers, and asked you to repeat this display of genius for every visitor. The second of these explicitly trained behaviors is illustrated in the center panel of Figure 14.1 - saying "cow" when asked to say what you see. The right panel shows the end-product of this training: symmetric relational responding. Symmetric relational responding is defined as the behavior of relating two arbitrary stimuli as, in many ways, the same. That is, when you heard someone say "cow" you related that auditory stimulus as, in many ways, the same as a picture of a cow (right-pointing arrow in the right panel of Figure 14.1). When you saw a picture of a cow, you related that visual stimulus as, in many ways, the same as the sound "cow" (left-pointing arrow in the right panel of Figure 14.1). Chaser the dog could relate "cow" as the same as a stuffed cow in a pile of toys (i.e., he could find the cow toy), but he could not say "cow," so there was no way for him to show us if he could relate the stuffed cow as the same as the word "cow." Symmetric relational responding is the key ingredient in verbal behavior. When these two stimuli are related symmetrically, the child is responding verbally; the child understands what "cow" means. It is important to note a key term in the definition of symmetric relational responding - that word is "arbitrary." By this, we mean that neither the auditory stimulus "cow" nor the spoken word "cow" physically resembles a cow. "Cow" does not sound like a cow mooing, eating, or anything else that cows do.
There is nothing about "cow" that suggests it should be related symmetrically with a cow. The relation between these stimuli is completely arbitrary, such that a cow could just as easily be related symmetrically with "vaca" (which it is, in Spanish-speaking communities). Because "cow" and a cow do not in any way resemble each other, the principle of generalization (Chapter 12) cannot help us understand why a child relates "cow" with a cow. This symmetric relating of arbitrary stimuli is something unique to human speakers and listeners. Of course, as a child, you didn't just learn about "cows," you also learned about "pigs," "hens," "goats," and other farm animals; not to mention learning to symmetrically relate "mama" with your mother, "dada" with your father, "breakfast" with that food they fed you in the morning, and "music" with pleasant sounds that allowed you to do the diaper dance. Teaching an individual to symmetrically relate arbitrary stimuli, over and over again, with multiple examples, like those shown in Figure 14.2, is referred to as multiple-exemplar training. If you have children one day, you can informally watch your own behavior as you conduct multiple-exemplar training, teaching your child to symmetrically relate verbal stimuli with their referents. "Show me the pig...(child points to pig)...that is a pig!" "What is this...can you say pig?" You will teach this hundreds, if not thousands, of times until, one day, after your child acquires a new unidirectional relational response ("show me the emu" → point to the picture of the emu) the symmetric relational response will occur with no further training. That is, your child, "all by herself," points to the emu and says "emu" (see the green dashed arrow at the bottom of Figure 14.2). The young speaker, through prior training experiences (multiple-exemplar training), has learned that, when it comes to verbal behavior, symmetric relational responding is always reinforced (Barnes-Holmes et al., 2004; Greer et al., 2007; Luciano et al., 2007; Petursdottir & Carr, 2007). Nonverbal organisms, like Chaser the dog, are normally not taught to symmetrically relate stimuli in this way. Obviously, they cannot articulate a word like "emu." But if they were allowed to point at symbols instead of speaking, could they learn to relate stimuli symmetrically? A considerable amount of research has explored this question and the results thus far show that pigeons and sea lions can learn to do this, but only if they are given specialized training that they normally do not receive (Galizio & Bruce, 2018; Lionello-DeNolf, 2009). Other species such as rats, monkeys, baboons, and chimpanzees have thus far failed these tests of symmetric relational responding (Lionello-DeNolf, 2009; Sidman et al., 1982). Perhaps one day researchers will arrange better training that will allow these species to pass a test of symmetric relational responding (Urcuioli, 2008). But one thing is for sure, animals in the wild do not learn to relate arbitrary stimuli symmetrically; they are not verbal.
Within-individual Replication
The behavior of the individual is repeatedly observed after the IV is turned ON. For each new session, will behavior remain changed or will it revert to the pre-intervention baseline level? Used to evaluate if the intervention effect can be sustained.
First component of a behavioral experiment
The dependent variable is behavior
Interobserver agreement (IOA)
The extent to which two independent observers' data are the same after having directly observed the same behavior at the same time. IOA = Agreements/(Agreements + Disagreements) * 100 (Higher is better). If IOA is below 90%, the behavioral definition should be refined.
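A minimal worked example of the formula above (the two observers' interval-by-interval records below are hypothetical, not from the text):

```python
# Minimal sketch of the IOA formula: Agreements / (Agreements + Disagreements) * 100.
# The observers' records are hypothetical (True = behavior scored in that interval).

def ioa(observer_1, observer_2):
    """Percent agreement between two observers who scored the same intervals."""
    agreements = sum(a == b for a, b in zip(observer_1, observer_2))
    disagreements = len(observer_1) - agreements
    return 100 * agreements / (agreements + disagreements)

obs1 = [True, True, False, True, False, True, True, False, True, True]
obs2 = [True, True, False, False, False, True, True, False, True, True]
print(f"IOA = {ioa(obs1, obs2):.0f}%")  # 9 of 10 intervals agree -> 90%
```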
Contingency
The first dimension of a reinforcer that can influence its efficacy is the contingency according to which it may be obtained. As we have said many times before, reinforcers are more effective when delivered contingently and ineffective when provided noncontingently. Few will work for a reinforcer that may be obtained for free.
Edward L. Thorndike
The first scientist to demonstrate that reinforcers increase the probability of behavior. Widely known for the law of effect - the principle that rewarded behavior is likely to recur and punished behavior is unlikely to recur. This principle was the basis for B. F. Skinner's behavioral technology.
Pliance
The first category of rule-governed behavior occurs because of socially mediated positive and negative reinforcers. For example, if you followed your mother's instruction by loading the dishwasher, your rule-following contacted a negative reinforcer of the avoidance variety (SRA−); that is, you avoided her wrath. Alternatively, if you had a kind mother, then her socially mediated consequence was a positive reinforcer ("Thank you dear; you're such a help to me"). Rule-governed behavior occurring because of socially mediated positive or negative reinforcers is called pliance (as in compliance; Hayes et al., 1989). Pliance is common in everyday life. When children are given an instruction by parents, teachers, baby-sitters, and so on, there are often socially mediated reinforcing consequences for following those rules. For example, the baby-sitter may reinforce pliance by reading an extra bedtime story. Likewise, there are socially mediated consequences for complying with societal rules such as "Stand respectfully when the national anthem is played." Complying with this rule allows us to avoid a glaring look, a "tisk tisk" from someone nearby, or a reputational loss (SRA−). As evidence that these socially mediated negative reinforcers play a role in our pliance, consider that we never stand respectfully when the national anthem is played on TV (e.g., during the Olympics or before watching a baseball game) because the probability of a socially mediated consequence is zero. You may be familiar with the Stanley Milgram experiments on obedience. The studies were conducted at Yale University in the 1960s to better understand the atrocities of World War II. Specifically, how could German soldiers obey Nazi officers' orders to exterminate millions of predominantly Jewish concentration camp detainees? Milgrim wanted to know if Americans would similarly comply with the instructions of an authority figure (labeled E, for "experimenter," in the inset picture) to administer a series of increasingly painful electric shocks to another human being (labeled A, for "learner"). Of course, the shocks were not real, but the actor receiving them behaved as though they were painful and dangerous ("Let me out of here! My heart is bothering me!"). The results of the study were alarming: 100% of the subjects (T in the picture) pressed all of the sequentially arranged buttons on their panel until they reached the one labeled "300 volts" (which is a shock that can kill). Worse still, 65% of the subjects complied with all instructions provided by the authority figure, delivering the highest shock possible - 450 volts (Milgram, 1963). A contemporary derivation of the Milgram experiment is Derren Brown's Netflix special, The Push. A cleverly engineered series of instructions and gradually intensifying negative consequences for failing to comply produce an extreme form of pliance - following instructions to push someone off a building. Brown's results were similar to Milgram's; most participants reluctantly complied with the instruction to push the actor to his "death." The obedience of Milgram and Brown's participants could be described as pliance if they followed these instructions because they anticipated socially mediated positive and negative reinforcers for instruction-following. In Milgram's experiment, unmistakable stimuli gave the experimenter his authority (an individual wearing a lab coat and speaking authoritatively in an office at an Ivy League university). 
The close proximity of the authority figure appears also to have functioned as an SD signaling that important negative consequences could be avoided by following orders; much like the negative reinforcement contingencies employed by Nazi officers. Supporting this hypothesis, Milgram found that pliance dropped substantially when the SD was removed, when the experimenter provided instructions by telephone. But when the experimenter entered the room to provide the same instructions in person, pliance immediately increased with this reintroduction of the SD (Milgram, 1965). To summarize, pliance is common because we are surrounded by people who positively and negatively reinforce rule-following. Like other operant behaviors, pliance is more likely to occur in the presence of SDs that signal a high probability of socially mediated reinforcement. Thus, we comply with our boss' instructions, but are less likely to follow the same instruction given by a low-status employee who cannot give us a raise or terminate our employment. We are least likely to follow instructions in SΔ situations - pliance is improbable when an internet ad instructs us, "Don't delay! Order your vita-juicer today." Because the ad cannot deliver a socially mediated reinforcer, pliance does not occur. If you do order the vita-juicer, it is because your instruction-following is an instance of tracking, which is maintained by a different category of reinforcers.
Shaping Principle 4
The red dashed line in Figure 8.4 shows a reinforcement contingency that is neither too easy nor too difficult. If the player clicks things at least this quickly (anywhere in the reinforcement zone), the reinforcer will be obtained; that is, the player will defeat the zombies and win the level. Clicking slower than that (the extinction zone) is not reinforced. Reinforcing one response and extinguishing other, previously reinforced responses is, of course, differential reinforcement - Principle 4.
Unconditioned Response (UR)
The response (salivating) reliably elicited by the unconditioned stimulus
Assumption #2 of behavior analysis
The scientific method is a valid way to reveal the determinants of behavior
Loss Aversion
The strong tendency to regard losses as considerably more important than gains of comparable magnitude and, with this, a tendency to take steps (including risky steps) to avoid possible loss. As a result, negative reinforcers tend to influence behavior more effectively than positive reinforcers of comparable size.
Loss aversion in distinguishing between positive and negative reinforcement
The tendency for preventing the loss of a stimulus (SRA-) to influence behavior more than presenting the same stimulus (SR+). People value loss prevention more highly than an equivalent gain.
Direct instruction of reading skills
There are many approaches to teaching children to read. One approach is to show the child a whole word, like JUST, and teach them to say "just." The child may learn to read 10 new words in an afternoon, but mistakes will be frequent; for example, saying "just" when the word is actually JEST. A different approach is to teach more fundamental discriminated operant responses. Instead of teaching 10 whole words in an afternoon, the child will instead learn the sounds of 10 individual letters. For example, when the child sees the letter C (SD), saying "kuh" will be reinforced. Saying "kuh" when other letters are shown will be extinguished (SΔ). When, at the end of the discrimination-training session, the child is able to make 10 different phonetic sounds under the tight stimulus control of 10 different letters, they will have the fundamental skills necessary to begin sounding-out hundreds of three-letter words, like "cat" (Watkins & Slocum, 2004). This use of discrimination training to establish fundamental skills that will form the foundation of higher-order thinking is the strategy employed in Direct Instruction (DI). The DI approach uses empirically supported behavior-analytic principles to design a curriculum that (a) moves systematically from the fundamental to the complex, (b) ensures that every child is actively responding and learning from their successes and mistakes, and (c) tailors the curriculum to individual learning. The development of DI materials involves careful design and repeated testing to ensure student success. In the late 1960s, the US government conducted a massive study to identify the reading curriculum that produced the best results in impoverished communities across the country. Of the 20 curricula tested, only DI produced positive outcomes on all outcome measures (Kennedy, 1978). Despite these results, most school districts did not adopt the DI curriculum, often preferring a curriculum that had far worse learning outcomes. In the years that followed, the DI curriculum was improved and expanded to math, language, spelling, and so on. More than 300 published studies have since evaluated the efficacy of DI and the nearly 3,000 outcomes are impressive. Children who attend DI schools demonstrate gains in academic achievement that a recent literature review described as "large" and "huge" (Stockard et al., 2018). These outcomes have been consistent over 50 years and do not depend on grade, race/ethnicity, household income, or student at-risk status. Children who attend DI schools for more years have better outcomes than those with limited access. Despite all of these data, DI is still not widely used in US schools. Why this is so is hard to say. One reason may be that parents and administrators embrace "student-led" and "inquiry-based" forms of education - they sound great in the abstract (who doesn't want their child to be inquisitive and "self-motivated"?). Another reason may be that the active-learning style of DI doesn't appeal to teachers, parents, and administrators who prefer quiet classrooms. DI schools are noisy because everyone is actively responding and learning - no one raises their hand to answer a question. Others fear that DI involves rote memorization, which DI is explicitly designed not to do; hence, the better scores when testing for higher-order cognitive processes. So, it's a mystery why DI is not more popular. Perhaps one day you will be the parent who successfully lobbies to bring DI, and better learning outcomes, to your child's school.
Why is motivation as a mental essence not helpful in therapeutic settings?
This way of thinking does not help those suffering from "motivational deficits" because it fails to identify a functional variable that can be turned ON or OFF in an effort to improve performance. Viewing "motivation" as the product of MOs suggests Sherri's motivational decline may be due to an AO - an environmental or a biological factor that has (1) reduced the value of vocational reinforcers (e.g., money, positive feedback from her manager) and (2) reduced Sherri's on-the-job behaviors that normally produce these reinforcers.
What can reinforcers be used for?
To direct behavior toward actions that benefit the individual, those they interact with, and society at large.
Good Behavior Game
Token economies have also proven effective in schools. Perhaps, when you were growing up, your elementary-school teacher used a version of the "Good Behavior Game". In this game, students are assigned to teams and points, later exchangeable for goods and privileges, are given (and taken away) contingent upon appropriate and inappropriate behavior. Token economies have several attractive features:
Motivationally robust: Because tokens can be exchanged for many different backup reinforcers, motivation to earn them remains fairly constant. For example, when a psychiatric patient uses his tokens to buy a shirt, he is still motivated to earn more tokens because they can be exchanged for candy, movies, and so on.
Nondisruptive: Reinforcing an ongoing behavior with a token is easier than with a backup reinforcer that disrupts the performance. For example, providing a token when a patient buttons his shirt is less disruptive than taking him to the hospital's theater to watch a movie.
Fair compensation: In a token economy, it is easy to assign larger reinforcers to motivate challenging or less-preferred behaviors - simply assign a larger number of tokens to that behavior. This ensures fair compensation is provided for each activity.
Portability: Tokens are easy to keep on hand at all times and this allows reinforcement of appropriate behavior whenever it is observed. Likewise, points can be added to a team's score with nothing more than a whiteboard marker. This portability increases the probability that appropriate behavior will be reinforced.
Delay-bridging: If professor Dumbledore told the Hogwarts students that the best-behaved students would win the House Cup at the end of the year, they would soon forget about this delayed reward. However, by providing points immediately after a desirable response, the delay is bridged between good behavior and the awarding of the House Cup. Nonfictional evidence for the importance of delay-bridging comes from animal experiments in which responding stops when backup reinforcers are delayed by just a few minutes. When the same animals' responding produces an immediate conditioned reinforcer, their behavior continues even when the backup reinforcer is delayed by an hour or more.
Duration Recording
Used to measure both latency and duration of a target behavior. Should be used when the behavioral dimension of interest is related to time.
Confounds
Variables that influence behavior within an experiment, but are not controlled by the researcher
Conditioned Reinforcers
We have to learn something before these consequences will function as reinforcers. (Like Pavlovian conditioning) Conditioned reinforcers are those consequences that function as reinforcers only after learning occurs. Conditioned reinforcement is incredibly useful when teaching an individual to engage in new and complex behaviors. Therefore, those wishing to positively influence behavior will be more successful if they know precisely how to turn a neutral stimulus (e.g., a useless piece of paper) into an effective conditioned reinforcer (a $50 bill). Pavlovian learning is responsible for the transformation of a neutral consequence into a conditioned reinforcer.
When is alternating treatment design most often used?
When trying to understand why inappropriate behavior occurs.
Stable behavior
When, over repeated observations, there is little bounce and no systematic trend
Experimental Analysis of Behavior
Working to expand our ability to predict and influence behavior by conducting research in controlled laboratory settings. Not designed to be used therapeutically.
IF THOUGHTS ARE IMPORTANT, PREPARE TO SUFFER
While language allows our species to stand on the shoulders of giants and improve living conditions in ways unimaginable to our ancestors, it appears to come with some unwanted baggage. Humans are the only species on the planet who, when blessed with ample food, shelter, freedom from disease, and a partner who loves them, can nonetheless feel anxious, depressed, unloved, unworthy, and, in some cases, suicidal (Hayes et al., 1999, 2009). While it is tempting to blame our hectic lifestyles, social media, or income inequality, unwarranted human suffering is far older than these social ills. Suicide is ancient; its prevalence led the Greeks to debate it and the Bible to admonish against it. More than 2,400 years ago, the Buddha's teachings on psychological suffering resonated with his fellow iron-age people: "it is only suffering that I describe, and the cessation of suffering."2 More recently, Mark Twain is said to have quipped, "I am an old man and have known a great many troubles, but most of them never happened." This points to something older than social media and income inequality, something as old as human language. Where other species experience negative emotions only when aversive events are occurring, humans experience these emotions before, during, and after the aversive event. Indeed, as Twain notes, most of the thoughts that occasion negative emotions are thoughts about things that will never occur - we call this "worrying". We might know that worrying makes us miserable, but we feel powerless to stop it. When we master symmetric relational responding, we experience thoughts of the past and future as, in many ways, equivalent to the things thought about. Figure 14.4 illustrated how overt verbal stimuli can evoke negative emotions, and it appears the same principle applies to private verbal stimuli - self-talk (see inset diagram; Hayes et al., 1999, 2009; Skinner, 1953, 1969). Thus, the recollection of past events, or imagined future events, puts language-able humans in a position to experience psychological suffering like no other species on earth - congratulations. Evaluating whether language is responsible for human psychological suffering is difficult. Parents are understandably unwilling to have their children randomly assigned to a control group that never acquires language, despite the possibility that they might suffer less. Instead, we must look for "natural experiments" in which circumstances conspire to turn OFF verbal behavior. Such was the case for Dr. Jill Bolte Taylor, who experienced a massive stroke on the left side of her brain - the side involved in verbal behavior. What she was left with was the right hemisphere, which focuses on stimuli happening right now, not what has happened before, and not what one imagines will happen in the future. In a 2008 TED talk, Dr. Bolte Taylor describes how she gradually, over the course of a few hours, lost all ability to engage in verbal behavior; her self-talk went completely silent; her ability to read, write, or understand the verbal behavior of others was entirely lost. And what was her experience? Years later, after acquiring the verbal behavior necessary to describe the experience, she described it as peaceful, euphoric, liberating; hardly the stuff of human psychological suffering. Learning to skillfully live with this gift of verbal behavior, particularly private verbal behavior - cognition - is the goal of ACT therapy. The overarching goal is to adopt a more flexible approach to thinking.
Thoughts are not going away, so learning to have them, instead of them having you, is a skill worth mastering. The following sections continue to provide a brief overview of some of the techniques used by ACT therapists to help their clients "think different" about thoughts. Of course, space does not permit a complete description of ACT. If the approach described here is of interest, and developing a flexible approach to cognition sounds like a skill worth acquiring, there are several practical workbooks available for use outside of formalized therapy sessions. A few of these books are listed in the "Further Readings" section at the end of this chapter.
Reinforcer
a consequence that increases the operant behavior above its baseline level. Refers to the consequence itself - the consequence of the behavior.
Operant Behavior
behavior influenced by antecedent and consequence events. Changes the environment and in turn is influenced by that consequence. To find out how the consequence influences operant behavior, we need to turn ON and OFF the consequence (independent variable) and see if it affects the behavior. Example: Pressing the button. Operant behavior operates the environment by producing consequences.
Pavlovian generalization
conditioned responding to a novel stimulus that resembles the CS. Amount of generalization depends on how closely the novel stimulus resembles the CS.
Operant Behavior
generic class of responses influenced by antecedents, with each response in the class producing the same consequence (example: to open spotify, there are many ways you can press the button)
Palmar grasp reflex
if you stroke a baby's palm their fingers will close and their hand will grasp your finger (or whatever is stroking its palm). This reflex and the baby's grip strength can save the baby's life.
What makes video games so engaging?
just as quickly as one reinforcer is obtained, another one is needed. For example, when you've landed safely just past the pit of fire, now you need to execute another skillful sequence of button presses and joystick movements to avoid an incoming projectile. In this way, the game influences your behavior through a series of reinforcement contingencies.
swimming reflex
life saving (but imperfect) reflex that babies can often do
rooting reflex
moving the mouth toward an eliciting object touching the cheek. Only occurs when infant is hungry
noncontingent consequence
occurs after a response, but not because the response caused it to occur. No causal relationship between response and consequence.
superstitious behavior
occurs when the individual behaves as though a response-consequence contingency exists when, in fact, the relation between response and consequence is noncontingent.
Kamin Blocking
refers to failures of learning and/or the expression of classically conditioned responses (CRs) when a target conditioned stimulus (CS) is presented to an animal as part of a compound that includes another CS that had been used previously to establish the target CR.
Predicting Impulsive Choice
The first task for a behavioral science is, as always, to accurately predict behavior; in the present case, predicting impulsive choice. At the opening of this chapter, we discussed four variables affecting choice. You may have noticed that two of those variables - reinforcer size/quality and reinforcer delay - are relevant to impulsive and self-control choices. These two variables are pulling behavior in opposite directions. A larger-later reward is attractive because of its larger reinforcer size/quality, but it is a less effective reinforcer because it is delayed. The smaller-sooner reward has immediacy going for it, but it is a less effective reinforcer because of its small size or lower quality. These two functional variables pull behavior in opposite directions - we want more, but we also want it now. If the delayed benefits of adhering to our diet could be obtained right after refusing dessert ("No cake for me please." Poof! You lose 1 pound of unwanted fat), no one would ever eat dessert again. Unfortunately, these health benefits are delayed and, therefore, much of their value is lost on us. This reduction in the subjective value of a delayed reinforcer is illustrated in Figure 13.10. Instead of showing the value of delayed health outcomes, we show the subjective value of the delayed monetary reward that Rich Uncle Joe promised us earlier in the chapter. The red data points in the figure plot how someone might subjectively value a $1,000 gift promised 1 month, 1 year, 5 years, and 25 years in the future. How did you subjectively value these delayed gifts (the subjective values are those amounts that you wrote in each blank of the Rich Uncle Joe survey)? Go back to the survey and add your own subjective values to Figure 13.10. If you said that you would sell the $1,000 in 5 years for $300 in cash right now (and not a penny less), then you should add a data point at $300 on the y-axis and 5 years on the x-axis. The first thing to note about the data in Figure 13.10 is that the subjective value of a reinforcer does not decline linearly (i.e., according to a straight line). Instead, the line is curved in a way that reflects the fact that, for our hypothetical decision-maker whose data are shown, reinforcers lose most of their value at short delays (≤ 5 years in Figure 13.10). Another way to say this is that the value of the reinforcer was discounted steeply in this range of delays. At delays longer than 5 years, the curve declines less steeply. In this range, waiting longer has less impact on the discounting of the delayed gift. The shape of this delay discounting curve is called a hyperbola. Importantly, this hyperbolic shape is predicted by a version of the matching law whose quantitative details we will not explore - you've had enough math for one day.2 This predicted hyperbolic discounting of delayed reinforcers has been evaluated and confirmed in many experiments. Rats, pigeons, monkeys, and humans all hyperbolically discount the value of delayed events (Bickel & Marsch, 2001; Green et al., 2007; Laibson, 1997; Logue, 1988; Madden et al., 1999; Mazur, 1987; Rachlin et al., 1991). Now that we know delayed reinforcers are discounted hyperbolically, we can predict when impulsive choices will be made. Figure 13.11 shows another hyperbolic discounting curve, but this time the x-axis is reversed to show time passing from left to right.
If you are the stick-figure decision-maker in the figure, you must choose between a smaller-sooner reward (SSR; the short green bar) and a larger-later reward (LLR; the tall red bar). Because the smaller-sooner reward is available right now (e.g., that slice of cake), it maintains its full value, which is given by the height of the green bar. By contrast, the subjective value of the larger-later reward (the health benefits of eating a healthy diet) is discounted because this consequence is delayed. The hyperbolic discounting curve sweeping down from the red bar shows the discounted value at a wide range of delays. At T1, where the stick-figure decision-maker is making its choice, the discounted value of the larger-later reward is shown as a red dot. The red dot is lower than the height of the green bar; therefore, the value of the smaller-sooner reward is greater than the subjective value of the larger-later reward. Said another way, the smaller-sooner reward feels like it is worth more than the discounted value of the larger-later reward. Because choice favors the reward with the higher subjective value, our stick-figure decision-maker makes the impulsive choice; that is, they choose the smaller-sooner reward and forego the larger-later reward.
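For readers who want the quantitative details the chapter skips, the standard single-parameter form of the hyperbolic discounting equation is V = A / (1 + kD), where A is the amount of the delayed reward, D is its delay, and k is an individually fitted discounting-rate parameter (Mazur, 1987). The Python sketch below only illustrates that equation; the k value, the smaller-sooner amount, and the delays are made-up numbers chosen so the output roughly resembles the hypothetical examples above, not values taken from the figures.

# Illustrative sketch of hyperbolic delay discounting: V = A / (1 + k * D).
# The discounting-rate parameter k (0.5 per year) is hypothetical.

def discounted_value(amount, delay_years, k=0.5):
    """Subjective (discounted) value of a delayed reward."""
    return amount / (1 + k * delay_years)

# Rich Uncle Joe's $1,000 gift at the delays used in the survey.
for delay in (1 / 12, 1, 5, 25):
    value = discounted_value(1000, delay)
    print(f"$1,000 delayed {delay:>5.2f} years feels like about ${value:,.0f}")

# Figure 13.11 logic: the smaller-sooner reward (SSR) keeps its full value
# because it is available now; the larger-later reward (LLR) is discounted.
# Choice is assumed to favor whichever subjective value is higher.
ssr, llr, llr_delay = 400, 1000, 5          # hypothetical amounts and delay
if ssr > discounted_value(llr, llr_delay):
    print("Impulsive choice: take the smaller-sooner reward now.")
else:
    print("Self-control choice: wait for the larger-later reward.")

Running this sketch shows why the choice at T1 comes out impulsive: the hypothetical $400 available now exceeds the discounted value of $1,000 delayed by 5 years (about $286 with this made-up k).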
2 Interpretations of Motivation
1. A concise description of a long-term pattern of behavior. Allows us to make predictions about human behavior. 2. Motivation is inherent to the individual. Motivation is a mental essence existing inside the individual, explaining why they behave the way they do. Less useful definition. Mentalistic explanation of behavior.
The evidence that proves whether an independent variable influences behavior
1. A systematic and replicable change in behavior each time the independent variable is turned ON or OFF 2. No confounded variable can explain this change
Ethical guidelines for using punishment
1. The client freely consents to the use of punishment. (People prefer the intervention that works - and that's sometimes punishment.)
4 Types of single-subject experimental designs
1. Comparison (A-B) Design 2. Reversal (A-B-A) Design 3. Alternating-Treatments Design 4. Multiple-Baseline Designs
What are the 2 approaches to determining how reinforcement works?
1. Investigate the neurological events that occur when reinforcers are obtained. If reinforcers influence future behavior, they must do this by changing the biology of the individual. 2. Investigate this process at the level of the whole organism and how it interacts with its environment (without going into the neurological specifics that underlie the reinforcement process). Reveals in further detail how present behavior is influenced by past events. Most practical approach to positively influencing human behavior.
6 Principles of Effective Shaping
1. Objectively define the terminal behavior. 2. Along what dimension does the learner's current behavior fall short of the terminal behavior? 3. When mapping out the sequence of successive approximations, ensure that each one is neither too easy nor too difficult. 4. Differential reinforcement: Reinforce the current response approximation and extinguish everything else, including old response approximations. 5. Be sure the learner has mastered each response approximation before advancing to the next one. 6. If the next approximation proves too difficult (extinction), lower the reinforcement criterion until responding is earning reinforcers again.
4 guidelines for effectively using time-out from positive reinforcement
1. Provide no more than one verbal warning: Time-out from positive reinforcement is more effective when it is implemented following either one or no verbal warnings, such as "You need to stop or you will go to time-out". Providing a second or third warning makes time-out ineffective. The reason for this is straightforward from a Pavlovian learning perspective. An effective warning is a conditioned punisher (CS) that signals a delay reduction to a punishing US. Providing multiple verbal warnings without the US is Pavlovian extinction - a procedure that decreases the ability of the conditioned punisher to decrease the future probability of behavior. 2. Time-outs from positive reinforcement are effective when they significantly reduce access to reinforcers. If the child in time-out can play with a toy, self-stimulate, or joke with a friend, then the consequence is not a time-out from positive reinforcement, and it is unlikely to reduce problem behavior. Worse, if the "time-out" removes the child from something aversive - for example, providing a time-out from an aversive math assignment - it can function as a negative reinforcer of the escape variety (SRE−), thereby increasing problem behavior. To be effective, time-outs must be a time-out from positive reinforcement. 3. Time-out from positive reinforcement should end after no more than 5 minutes, even if the child is not sitting quietly. Again, intuition tells us that older children should get longer time-outs and that the child should have to calm down before they are allowed to return to the reinforcing activity, but the scientific evidence does not support these practices. So, set a timer for no more than 5 minutes and let the child return to positive reinforcement after it goes off. 4. Time-out from positive reinforcement works best when every instance of the problem behavior produces a time-out. Don't wait for the child to misbehave several times before implementing a time-out. An exception to this rule is that, after time-out is used consistently and it has significantly reduced the problem behavior, intermittent time-outs can be effective in maintaining these improvements.
Complex contingencies
2 types: contingencies that require more than just one response and contingencies that involve the passage of time. These complex contingencies are often referred to as schedules of reinforcement. A common type of complexity is that not every response is reinforced. For example, most applications on our computers will open only if we double-click them (IF two clicks → THEN app access). Similarly, a dirty pan requires a dozen or so scrubbing responses before it is clean (IF a dozen scrubs → THEN clean pan). And most employers pay a contracted employee only after the employee has made many more responses still (IF ~ 1,000 raking and leaf-bagging responses → THEN $75 payment). Another common type of complexity is that reinforcers are often unavailable until some interval of time has passed. For example, if our friend is away from their phone for 48 minutes, then trying to video chat with them will not be reinforced until 48 minutes have passed, no matter how many times we send them a chat request. As soon as they return, they will accept the request and then our behavior will be reinforced (IF 48 minutes pass and one chat request → THEN social reinforcers are obtained during the chat).
Primary Reinforcers
A consequence that functions as a reinforcer because it is important in sustaining the life of the individual or the continuation of the species. Primary reinforcers are those that work because of our genetic inheritance - these reinforcers help us survive. Human and nonhuman animals don't have to learn anything for primary reinforcers to work - we are phylogenetically prepared for these consequences to reinforce our behavior. The logic is the same across all these primary reinforcers: If this consequence did not function as a reinforcer, the individual would be less likely to survive. Without oxygen, we die; therefore, oxygen-seeking behaviors (swimming to the water's surface) are reinforced with access to this primary reinforcer.
Automatic Reinforcer
A consequence that is directly produced by the response - it is not provided by someone else - and which increases the behavior above a no-reinforcer baseline.
Punisher
A contingent consequence that decreases the future probability of behavior below its pre-punishment level. Thus, like reinforcers, punishers are consequences; it's just that they have the opposite effect on behavior. Instead of increasing behavior (reinforcers), punishers decrease behavior. Behavior analysts define punishers based on their function - they decrease the future probability of the behavior that produced the punisher. Punishers do more than prevent the behavior from occurring; they decrease the behavior when the individual is free to choose whether to make the punished response or not. Tickets for running red lights only function as punishers if they decrease the probability of running red lights in the future.
Reactivity
Behavior changes because the individual is aware they are being watched. Can make data useless.
Cumulative Record
A cumulative record is a graphical display of responding as it unfolds over time. The three panels of Figure 11.3 show examples of cumulative records. In each panel, time unfolds from left to right along the x-axis and cumulative responses increase as we move up the y-axis. The blue line in each panel is the record of responding over time.
Whole interval recording
A direct-observation method used to estimate how frequently behavior occurs. Observers record whether or not the behavior occurs throughout each in a series of contiguous intervals. In order to record a positive interval, the behavior has to occur throughout the entire interval. Used to measure target behaviors that should be long-lasting.
discriminative stimulus (SD)
A discriminative stimulus (SD) is an antecedent stimulus that can evoke a specific operant response because the individual has learned that when the SD is present, that response will be reinforced. Some of the earliest laboratory research on the SD was conducted by Skinner (1933, 1938). In one experiment, a rat was placed in an operant chamber equipped with a lever and a small light. When the light was turned on, pressing the lever produced a reinforcer (a food pellet). When the light was turned off, pressing never produced a pellet. By interacting with these antecedents (light on/off) and consequences (food/no-food), the rat learned to respond when the light was turned on and to not respond when the light was off - discriminated operant behavior. Skinner's experiment is replicated every Halloween, with human children instead of rats. As any experienced trick-or-treater will tell you, knocking on the doors of houses with the lights on and Halloween decorations in the front yard will be reinforced with candy. Conversely, knocking on the door of a house with the lights off and no decorations is almost never reinforced. Like Skinner's rats, first-time trick-or-treaters quickly learn that IF they knock at a well-lit, decorated house → THEN they will get some candy and IF they knock at a dark, undecorated house → THEN they will get no candy. The product of this learning - discriminated operant behavior - is shown in the inset graph in Figure 12.1. By the end of the first hour of candy-foraging, our first-time trick-or-treaters will almost always knock on the doors of well-lit, decorated houses, and almost never knock on the doors of dark, undecorated houses. When operant behavior is discriminated in this way, we can classify the antecedent stimulus that increases behavior as an SD.
Fixed-Interval (FI) Schedule
A fixed-interval (FI) schedule specifies a constant time interval that must elapse before a single response will produce the reinforcer. For example, under an FI 60-second schedule, the first response after 60 seconds is reinforced. Figure 11.10 shows the operation of an FI 60-s schedule of reinforcement. As the 60-second timer is ticking away, responses are occasionally made; however, they are never reinforced - they are wasted responses because they do not make the reinforcer come any sooner. Instead, when 60 seconds pass (red dotted lines), the very first response is reinforced. If we changed to an FI 4-minute schedule, the first response made after 4 minutes elapsed would be the one that was reinforced. Any responses made before 4 minutes would be wasted.
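As a rough sketch (not a description of any particular apparatus), the FI contingency can be written as a simple decision rule: a response is reinforced only if the fixed interval has elapsed since the last reinforcer. The response times below are invented for illustration.

# Minimal sketch of a fixed-interval (FI) 60-s contingency.
# A response is reinforced only if >= 60 s have elapsed since the last
# reinforcer; earlier responses are "wasted" (they produce nothing).

FI_SECONDS = 60
last_reinforcer_time = 0.0

def fi_contingency(response_time):
    """Return True if this response is reinforced under the FI schedule."""
    global last_reinforcer_time
    if response_time - last_reinforcer_time >= FI_SECONDS:
        last_reinforcer_time = response_time   # the interval timer restarts
        return True
    return False

# Hypothetical response times (in seconds), for illustration only.
for t in (10, 35, 59, 61, 70, 125):
    outcome = "reinforced" if fi_contingency(t) else "wasted"
    print(f"response at {t:>3} s -> {outcome}")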
The typical pattern of responding under a VR schedule
A high rate of responding with little or no post-reinforcement pause. There are three things to notice in the figure. First, note how the reinforcers (green lines) are irregularly spaced. This occurs because under a VR schedule the number of responses per reinforcer is variable; it is not the same each time. Second, note that there is little to no pause in responding right after the reinforcer. Instead, the pigeon immediately begins working for the next reinforcer after it eats the last one. Post-reinforcement pauses can occur under VR schedules, but they are always shorter than under a comparable FR schedule. Third, the VR response rate is higher than the FR rate that was shown in Figure 11.4. This high-rate pattern of responding, with only occasional pauses under a VR schedule, has proven replicable in many labs and with many species, including humans.
Behavioral Definition
A precise specification of the topography of the target behavior, allowing observers to reliably identify instances and non-instances. Make data collection objective and not influenced by bias. Must be observable and objective.
Differential Reinforcement
A procedure in which a previously reinforced behavior is placed on extinction while a second behavior is reinforced. Decreases 1st behavior and increases the 2nd. If working with alcohol addiction and Behavior 1 (working) becomes extinct, then Behavior 2 (drinking) will be reinforced. Better option than the more common approach of punishment. Differential reinforcement also (1) provides the opportunity to teach an adaptive behavior that will replace the problem behavior and (2) decreases the frequency of extinction bursts and extinction-induced aggression. Used to stop problem behavior of children (especially autistic children). Used in treating substance-use disorders. Behavior 1: Opiate Use -- Extinction Behavior 2: Opiate Refusal -- Reinforcement
Independent Variable
A publicly observable change controlled by the experimenter, which is anticipated to influence behavior in a specific way.
Ratio Schedule of Reinforcement
A ratio schedule of reinforcement specifies the number of responses that must be made in order for the reinforcer to be delivered. The term "ratio" refers to the ratio of responses to reinforcers. For example, the ratio might be 3:1; that is, IF 3 responses → THEN 1 reinforcer. When a ratio schedule of reinforcement is in operation, there is a direct relation between the number of responses and the number of reinforcers. The more responses we make, the more reinforcers we get. The faster we respond, the faster we get the reinforcer.
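To make the 3:1 contingency concrete, here is a minimal counter-based sketch of a fixed-ratio 3 schedule; the 12 responses are invented for illustration.

# Minimal sketch of a fixed-ratio 3 (FR 3) contingency:
# every 3rd response produces one reinforcer.

RATIO = 3
responses_since_reinforcer = 0
reinforcers_earned = 0

for response_number in range(1, 13):         # 12 hypothetical responses
    responses_since_reinforcer += 1
    if responses_since_reinforcer == RATIO:  # ratio requirement met
        reinforcers_earned += 1
        responses_since_reinforcer = 0
        print(f"response {response_number}: reinforcer #{reinforcers_earned}")

# The more (and the faster) the individual responds, the more (and the sooner)
# reinforcers are delivered - the direct relation described above.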
Differential Reinforcement of Variability
A unique contingency in which responses, or patterns of responses, that have either never been emitted before or have not been emitted in quite some time are reinforced, and repetition of recent response topographies is extinguished. Researchers in applied settings have begun to explore the utility of differentially reinforcing variability in children with autism, a diagnosis often characterized by rigid patterns of activity or speech.
Reward Devaluation
A reinforcer will be effective at increasing the frequency of a behavior in some cases, but less so in others. Like the Hogarth and Chase study with smokers pushing buttons for cigarettes or chocolate.
Schedule of Reinforcement
A schedule of reinforcement precisely specifies the nature of the contingent relation between a response and its reinforcer.1 Said another way, the schedule of reinforcement tells us exactly how the response-reinforcer contingency works. The simplest schedule of reinforcement is the continuous schedule shown on the left side of Figure 11.1. According to this schedule, IF one response → THEN one reinforcer. Undoubtedly you can think of other schedules of reinforcement. How about IF two responses → THEN one reinforcer, or perhaps IF one response → THEN two reinforcers. The possibilities are endless. For our purposes, we will discuss just two of the broad categories of schedules studied by Ferster and Skinner (1957) - ratio and interval schedules of reinforcement.
Objection 2: Performance-inhibiting properties of reinforcement
A second concern about using positive reinforcement is that reinforcement contingencies can actually inhibit performances. This performance decline takes two forms - reinforcers reduce creativity and reinforcers can lead to "choking under pressure." If creativity is an important component of the performance (as it is in any artistic endeavor, and in many problem-solving contexts), a contingent relation should be established between creativity and obtaining the reinforcer. Contingencies arranging very large rewards would appear to increase the probability of choking under pressure.
Response
A single instance of behavior.
Flow (Csikszentmihalyi)
A state in which one feels immersed in a rewarding activity and in which we lose track of time and self. Musicians report being in flow while playing a song. Rock climbers describe the flow state achieved when on the wall. Reading a page-turning novel can also produce a feeling of flow. To get a sense of this, consider a rock climber engaged in an afternoon ascent. Will she experience flow? Csikszentmihalyi's research suggests she will if the response-reinforcer contingencies naturally imposed by the rock face have three characteristics. First, flow will occur if the climb is neither too easy nor too difficult for the skills of our climber. If she has chosen a climb that is too easy, she will be bored. Conversely, if the climb is too difficult, she will feel frustrated, angry, and so on as her climbing fails to produce reinforcers (extinction). Flow will be achieved only in the "Goldilocks zone," where the challenge is neither too difficult nor too easy. It needs to be "just right" for her skill level. The responses required to meet the reinforcement contingency will require her complete attention. Second, our climber is more likely to experience flow if the rock wall naturally arranges what Csikszentmihalyi called "proximal goals." A proximal goal on the wall might be making it past a difficult set of holds and crossing over to the next section of the climb. This consequence signals a delay reduction to the ultimate reinforcer - getting to the top of the wall. Therefore, a "proximal goal" can be thought of as a contingency arranging a conditioned reinforcer along the way. Third, the flow state will be achieved if the wall provides immediate task-relevant consequences for the climber's behavior. Said another way, when a skillful or unskillful response occurs, she needs to know that right away. This immediate feedback is provided when the climber either loses a toehold (the immediate, contingent consequence of an unskillful response) or makes it to the next hold (contingent on a skillful response). These consequences are immediate; they mark the exact response that produced the outcome.
Typical pattern of responding under a VI schedule
A steady, moderate response rate with little to no post-reinforcement pause. There are four things to note about the VI response record. First, the line is not as steep as we saw with the VR schedule (Figure 11.6). Again, responding faster will not produce the reinforcer any sooner on an interval schedule, so why waste the effort? But because the pigeon cannot predict exactly when the reinforcer will become available, it pecks at a steady, moderate rate, so as to obtain the reinforcer soon after the VI timer elapses. Second, the response rate is fairly constant - no scalloping occurs under this schedule. Third, the response rate maintained by the VI 120-s schedule is higher than was maintained by the FI 120-s schedule in Figure 11.11. Again, under the VI, pecking is sometimes reinforced a few seconds after the last reinforcer, but these early responses are never reinforced under an FI schedule (so they don't happen). Thus, the VI schedule reinforces continuous, moderate-rate responding. Finally, notice that the delivery of reinforcers occurs at irregular intervals - 120 s is the average time between reinforcers, not the exact time between them.
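One common way to program a VI schedule in the laboratory is to sample each interreinforcer interval from a distribution with the scheduled mean. The sketch below uses exponentially sampled intervals purely as an illustration of that idea, and the 30-second response spacing is an invented stand-in for steady, moderate-rate responding; it is not a description of any particular experiment.

# Minimal sketch of a variable-interval (VI) 120-s contingency. After each
# reinforcer, a new interval is sampled (averaging 120 s); the first response
# after that interval elapses is reinforced.

import random

class VISchedule:
    def __init__(self, mean_interval_s=120):
        self.mean = mean_interval_s
        self.available_at = random.expovariate(1 / self.mean)

    def respond(self, time_s):
        """Return True if a response at time_s is reinforced."""
        if time_s >= self.available_at:
            # Reinforce, then start the next (unpredictable) interval.
            self.available_at = time_s + random.expovariate(1 / self.mean)
            return True
        return False

random.seed(1)                               # repeatable illustration
vi = VISchedule()
for t in range(0, 1200, 30):                 # a response every 30 s
    if vi.respond(t):
        print(f"reinforcer delivered at {t} s")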
Reinforcer Surveys
A structured interview or written survey that asks the individual to identify highly preferred activities. When completing reinforcer surveys, people sometimes struggle to think of things they enjoy doing. To aid in this, the survey will often include items designed to jog the person's memory. Beyond identifying behaviors that the individual currently likes to do, the reinforcer surveys will ask about wish-list activities. That is, something the person cannot afford to do (question 11) or doesn't know how to do (questions 13 and 14). Arranging lessons, training, and/or access to these activities may also function as an effective reinforcer. Reinforcer surveys are a simple, practical way to identify potential reinforcers. However, they are not foolproof. The individual may list things that are impractical (a new car), not in their own best interests (drugs), or things that do not actually function as reinforcers. Fortunately, the available research suggests that, in workplace settings, most employees are able to identify things that function as reinforcers.
What kind of learning does AI use and why?
AI uses trial-and-error learning and forced variability to solve complex problems. The problem is that if the AI exploited the first successful action (by exclusively engaging in that action) then it would settle upon a satisfactory solution, but not an optimal one. The thing that makes AI so effective in finding optimal solutions is that it is willing to continuously try something new. Every AI program is written with a forced-variability component. No matter how successful the AI is in solving the problem, its programming forces it to do something different some of the time.
Using Discrimination Training to Positively Influence Behavior
According to UNICEF, over 110 million landmines remain buried in 64 countries worldwide. Scores of innocent people and large animals (e.g., tigers, elephants) have been injured or killed when they strayed from well-worn paths and stepped on a mine. These countries can experience food shortages because uncleared fields cannot be farmed, and parents live with the fear that their child will be the next victim of these underground killers. Enter APOPO. Equipped with their knowledge of behavior-analytic principles, this group used discrimination training to teach the African pouched rats to sniff out landmines, which can then be disarmed by their human handlers. Relative to other mine-clearing methods, African pouched rats are inexpensive, they have a great sense of smell, and they are too light to explode the mines they step on. Training begins in a cage with three holes in the floor. Below the holes are small pots containing dirt that either contains TNT (the SD) or contains just dirt (the SΔ). When the rat lingers over the SD hole, a clicker is used to present a conditioned reinforcer, followed by a bit of tasty food. Between response opportunities, the location of the SD is randomly assigned. Rats soon learn to detect the smell of TNT, discriminating it from ordinary dirt. The rat earns an initial training certification when its operant behavior is highly discriminated - it only lingers over soil containing TNT. Next, the rats are trained in a small area in which perforated steel balls have been buried, some of which contain TNT. When the rats dig at the ground above these balls (SD), they earn a click and a treat. Digging anywhere else (SΔ) is extinguished. Eventually, after the rats earn more training certifications, they make their way to the field, where their handlers will systematically walk them through an area, disposing of the landmines these highly trained rats detect. By 2019, these rats had cleared mines in Angola, Cambodia, Mozambique, Tanzania, and Zimbabwe. Mark Shukuru, originally from Tanzania, and now the head rat-trainer for the Cambodia project, described its success, "Everyone was surprised, even me ... We have never missed anything with the rats. We're doing good."
Environmental Events
All of the things you experience through your senses.
What does flow have to do with shaping?
Although the climber is not a novice, she is constantly improving her skills. If she chooses to climb on a Goldilocks-zone rock wall, the contingencies inherent to the wall will differentially reinforce skillful responses that she finds challenging, but not impossible. Less skillful responses are not reinforced with progress up the wall. These skillful responses are approximations of the terminal skillset she desires - a skillset that would allow her to climb a wall that today she would find nothing but frustrating. Effective shaping, whether imposed naturally by a good rock wall or arranged artificially by a skilled video game designer or behavior analyst, will meet the learner where they are — offering challenges, immediately reinforcing skillful responses, and arranging conditioned reinforcers along the way. When achieved, the learner improves or acquires new skills, while achieving a state of flow in which all sense of time and self is lost.
VR schedule and humans
Among humans, a VR schedule operates when reinforcers are probabilistic.2 For example, if a wizard in a game of Dungeons and Dragons will cast a successful spell against a powerful opponent only if she rolls a 10 on a 10-sided die, then her probability of success is just 10%. That is, on average, her spellcasting will succeed 1 time in 10 attempts - a VR 10 schedule of reinforcement. Probabilistic reinforcers are the rule in games of chance. If a slot machine is programmed to pay out, on average, 1 time in 50, then that machine is operating according to a VR 50 schedule of reinforcement. As is true of all VR schedules, it is impossible for the player to predict whether the next slot-machine play will be reinforced or not. Most players have no idea what schedule of reinforcement they are operating under. All they know is that the very next play may be the one that pays off - so let's play!
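Because a VR schedule amounts to a probabilistic payoff, it can be sketched as a random draw on every response. The numbers of responses below are arbitrary, and the exact counts will vary from run to run, just as they do for the spellcaster or the slot-machine player.

# Sketch of a variable-ratio schedule treated as a probabilistic payoff.
# A VR 10 reinforces each response with probability 1/10; a VR 50 with 1/50.

import random

def play_vr(vr_value, n_responses):
    """Return how many of n_responses are reinforced under a VR schedule."""
    return sum(1 for _ in range(n_responses) if random.random() < 1 / vr_value)

random.seed(0)                                   # repeatable illustration
print("VR 10, 100 spell attempts:", play_vr(10, 100), "successes")
print("VR 50, 500 slot-machine plays:", play_vr(50, 500), "payouts")
# On average, about 1 in 10 and 1 in 50 responses pay off, but the player can
# never predict whether the very next response will be the reinforced one.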
Why Follow the Rules?
Among humans, rule-governed behavior is quite common. When your mother told you to "Put your dirty dishes in the dishwasher...now!", you complied with this command (or "mand" to use the taxonomy introduced earlier). When our mapping app tells us to turn left on 18th street, we follow this instruction too. These examples point to two reasons for following rules, both rooted in reinforcement. We will differentiate two kinds of rule-governed behaviors - pliance and tracking - based on the kinds of reinforcers that maintain these behaviors.
Discriminative Stimuli and Establishing Operations
An SD is a functional antecedent stimulus - it precedes behavior and it evokes a specific operant behavior (the one that has been reinforced when that stimulus is present). However, if the behavior is not habitual, the SD will evoke this response only if the consequence currently functions as a reinforcer. To illustrate, consider that a restaurant is an SD for food-purchasing behaviors only when a meal functions as a reinforcer. If we just finished eating lunch, the restaurant SD will not evoke food-purchasing behavior. Instead, we will not even notice the restaurant's existence. By contrast, if we are famished, we will notice the SD and may stop in for a meal. In Chapter 9 we discussed how an establishing operation (EO), like food-deprivation, has two effects: (1) it temporarily increases the efficacy of a reinforcer and (2) it increases operant behaviors that produce that reinforcer. Not eating for 6 hours is an EO because it increases the efficacy of food as a reinforcer and it increases the probability that we will buy and consume food. To do this, we will look for an SD - a stimulus that signals our food-seeking behaviors will be reinforced. If we did not look for SDs when hungry, we would be just as likely to order food in a nail salon as in a restaurant. Such de-stimulized people would go hungry. One more example may help to make the point about the relation between EOs and SDs. Imagine that we ordered a taco at a fast-food restaurant. When the bag of food is handed to us through the drive-thru window, we notice there is no salsa. This functions as an EO that temporarily increases the reinforcing efficacy of a packet of salsa and initiates our search for an SD; a stimulus that signals our salsa-requesting behavior will be reinforced. So, where, in our past experience, will our salsa-requesting behavior be reinforced? In a nearby park? At a gas station? Under the hood of our car? Obviously not. The EO will initiate our search for an employee of the restaurant - the SD. When we see one, we will wave and, when they open the window, politely request a packet of salsa. When an EO increases the efficacy of a reinforcer, we begin looking for SDs that signal the availability of that reinforcer.
Echoic
An echoic is a verbal operant in which the response resembles the verbal antecedent stimulus and is maintained with a variety of socially mediated reinforcers. Said less formally, when the speaker makes an echoic response, they approximately or exactly repeat what someone else just said - they echo someone else's response. New parents spend a lot of time teaching their infants echoics such as "mama" and "dada." Mom, for example, says "mama" clearly for the baby, exaggerating her lip movements and doing so in a way that holds the baby's attention. When the echoic finally occurs, it is showered with social reinforcers. [Verbal Stimulus (Antecedent)] --> [Echoic (form closely resembles the verbal antecedent) (Behavior)] --> [Variety of socially mediated reinforcers (Consequence)] Learning to make a vocal response that closely resembles a verbal antecedent stimulus is useful when teaching children with severe language deficits to speak. A child who has acquired the echoic operant will listen to the verbal stimulus ("cow"), attempt to say the same thing ("caa"), and will receive reinforcers from the teacher ("Yes! That's it! That is so good!"). Shaping may be used to improve the form of the response, moving from "caa" to "caw" to "cow." Mastering the echoic (i.e., repeating all verbal antecedent stimuli when asked to do so) is a foundational skill in learning to say new words. The next step is learning to say them under the right stimulus conditions.
Motivating Operation
An environmental and/or biological event that (1) temporarily alters the value of a specific reinforcer and (2) increases/decreases the probability of behaviors yielding that reinforcer. An MO is an observable event that can be objectively measured. The MO temporarily alters the value of a reinforcer. The MO increases or decreases behaviors yielding that reinforcer. Example: MO - Food Deprivation. Our food-seeking behaviors (buying, preparing) mostly occur when our stomach has been deprived of food. Food deprivation is both an environmental event and a biological event. When the MO occurs, the reinforcing value of food temporarily increases.
Abolishing Operation (AO)
An environmental and/or biological event that (1) temporarily decreases the value of a specific reinforcer and (2) decreases the probability of behaviors yielding that reinforcer. (a stomach full of food)
Establishing Operation (EO)
An environmental and/or biological event that (1) temporarily increases the value of a specific reinforcer and (2) increases the probability of behaviors yielding that reinforcer. (food deprivation)
Intraverbal
An intraverbal is a verbal response occasioned by a verbal discriminative stimulus, but the form of the response does not resemble that stimulus; intraverbals are maintained by a variety of social reinforcers. In everyday language, intraverbals allow us to meaningfully interact with other verbal humans. For example, when asked "How are you?" (a verbal SD), we respond with the intraverbal: "I'm doing well; how about you?" This everyday intraverbal is socially reinforced by the listener when they say, "I'm good too; thanks for asking." Intraverbals are the operants that make for meaningful conversations. For example, when we hear our friend say, "I think my boyfriend is cheating on me" (verbal SD), our intraverbal response - "You deserve better" - is reinforced when our friend replies, "Thanks; that means a lot to me." As these examples make clear, intraverbals are important behaviors in our social interactions. [Verbal SD (Antecedent)] --> [Intraverbal (form does not resemble the verbal antecedent) (Behavior)] --> [Variety of socially mediated reinforcers (Consequence)] Intraverbals are also important in education. Your responses on exams are controlled by the questions. For example, if an item on an upcoming exam asks, "What is an intraverbal?" this functions as a verbal SD that increases the probability of providing a specific intraverbal response, a correct response that will be socially reinforced with points on the exam. Thus, not only is the intraverbal important in a child's social development, but it is also important in their ability to participate in an educational setting (Carr & Firth, 2005; Goldsmith et al., 2007; Sundberg & Michael, 2001). For these reasons, behavior analysts will teach children with autism important intraverbals such as answering questions and responding appropriately in a conversation (Charlop-Christy & Kelso, 2003; Sherer et al., 2001).
Which ratio schedule is better when prices are high?
Another reason VR schedules might be advantageous is that, when prices are high, VR schedules can maintain more behavior than an FR schedule. When pigeons worked for their daily meals under a VR 768 (a high ratio-requirement), their peak response rates were twice as high as under an FR 768. The price was the same - 768 responses per food reinforcer - but the occasional VR "win" kept the pigeons responding where the FR did not; Ferster and Skinner (1957) reported very similar outcomes. Likewise, in a human study, VR schedules maintained higher peak levels of exercising in obese children than did FR schedules. Indeed, under the VR schedules, obese children gradually came to exercise at rates comparable to nonobese children; something not achieved under FR schedules.
Price of the reinforcer
Another way to think about ratio schedules is that they specify the price of the reinforcer. Instead of saying that the ratio of reinforcers to responses is 3:1, we could say that the price of the reinforcer is three responses. Will the individual pay this price (will they respond three times and "buy" a reinforcer) and if so, how many reinforcers will they buy - how much will they spend? By conceptualizing ratio schedules as the price of the reinforcer, behavior analysts have been able to explore topics of interest to economists, which launched important areas of behavioral-economic research
The Behavioral Economics of Ratio Schedules
As previously mentioned, a ratio schedule's response requirement can be thought of as the price of the reinforcer. For example, under an FR 20 schedule, the price of the reinforcer is 20 responses. Price is an intuitive way to think about ratio schedules - an FR 2 offers the reinforcer at a lower price than an FR 20, which in turn is a lower price than an FR 200. This relation between ratio schedules and economic prices has allowed behavior analysts to branch into behavioral economics and, interestingly, many of the laws of economics derived from studying human behavior have proven just as applicable to animal behavior. One example of this universality of behavioral-economic outcomes is shown in Figure 11.7. Here we can see what happens to behavior when the price of the reinforcer is changed. That is, as the FR requirement is increased from left to right along the x-axis of each graph, the price of the reinforcer increases. Regardless of the species, as prices increase so does the amount the individual works to obtain the reinforcer. However, there is a limit on this. At some point, the price increase will prove to be too much. After that, efforts to acquire the reinforcer will decline. To get a more intuitive understanding of these data, imagine what would happen to our gasoline-purchasing behavior if the price of gas changed. If gas was super cheap, say $0.10 a gallon, we would spend very little on gas in any given month - just like the subjects in Figure 11.7 responded very little to obtain their reinforcers at low prices. Later, when the price of gas goes up, we can use the data in Figure 11.7 to predict two things. First, moderate price increases will cause us to spend more on gas each month. No doubt we will be upset by these higher prices, but we'll pay them because, hey, we've got to get to work. Second, we can predict that very large price increases will cause us to eventually reduce our spending on gasoline. If the price of gas skyrocketed to $100 a gallon, most of us would have to leave our cars parked on the street while we take public transportation, walk, or maybe even hitchhike. These outcomes are predictable given the uniformity of behavior under ratio schedules of reinforcement.
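The rise-then-fall pattern described above can be illustrated with a toy demand function in which consumption declines as the FR price increases. The function and its parameters below are hypothetical, chosen only to show the qualitative pattern; they are not the equation behind Figure 11.7.

# Toy sketch of demand under ratio schedules: as the FR price (responses per
# reinforcer) rises, reinforcers consumed fall, and total responding
# (price x consumption) first rises, then falls.

def reinforcers_consumed(price, max_consumption=100.0, sensitivity=0.02):
    """Hypothetical demand curve: consumption declines as price increases."""
    return max_consumption / (1 + sensitivity * price ** 1.5)

for price in (1, 5, 20, 80, 320, 1280):
    consumption = reinforcers_consumed(price)
    total_responses = price * consumption    # the "amount spent" on the reinforcer
    print(f"FR {price:>4}: {consumption:5.1f} reinforcers, {total_responses:7.0f} responses")

With these made-up parameters, total responding climbs from FR 1 through roughly FR 20 and then declines at still higher prices, mirroring the gasoline example.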
Promoting Generalization and Maintenance
As this chapter has made clear, when an operant behavior is reinforced in the presence of an SD, that stimulus (and others that closely resemble it) will evoke that response. When an applied behavior analyst or behavior therapist helps their client to acquire a new, adaptive behavior, they will face the challenge of getting the client's behavior to generalize beyond the therapy setting, where it must be maintained over time if the intervention is to be successful. An illustrative example occurs in parenting classes. New parents, or parents of at-risk children, enroll in these courses to acquire behavior-analytic skills (reinforcement, extinction, punishment, etc.) that will help them to raise well-adjusted, emotionally well-regulated children (Eyberg et al., 2008). Parents acquire these skills in a classroom setting in which they learn behavioral principles and role-play appropriate parenting behaviors with feedback from the instructor. The utility of this training, however, entirely depends on generalization and maintenance over time. If parents only use these skills in the classroom when the instructor is watching and not at home as their children grow up, then the parent-training sessions would be a waste of time and a lost opportunity. Thus, a technology for promoting the generalization and maintenance of adaptive behavior is critical (Dass et al., 2018). Stokes and Osnes (1989) outlined several tactics for promoting such generalization and maintenance. Three of their most important tactics are described next. TACTIC 1: TEACH BEHAVIORS THAT WILL CONTACT NATURAL CONTINGENCIES OF REINFORCEMENT. TACTIC 2: TRAIN DIVERSELY. TACTIC 3: ARRANGE ANTECEDENT STIMULI THAT WILL CUE GENERALIZATION.
self-report measures
Ask the individual to recall if they have engaged in the behavior. May not be truthful. People tend to overreport on positive things. Memory may be flawed. Shouldn't be used unless they are empirically validated.
Why does private decision making not equate to behaving purely according to our own will?
Because choice IS a behavior. Choice is determined (by public events). And we come up with spurious reasons for the things we do.
Why do we prefer a bus that arrives at a predictable time? What is the utility in predicting the behavior of individuals?
Because predictability allows adaptive behavior.
The Rich Uncle Joe Experiment
Before we continue, we would like you to answer a few questions that we will revisit later in this chapter. It will not take you long to answer these questions. Answering them will give you greater insights about your own choices and the variables that affect them. If you are ready, grab a pen and read on. Imagine that a trusted member of your family, your rich Uncle Joe, says he is going to give every member of your immediate family $1,000 in cash in 5 years. Until that time, the money will be in a safety deposit box, where it will earn no interest. At the end of 5 years, Uncle Joe's lawyer will give you the cash. Next imagine that a friend of the family, Camille, says that she would like to purchase one of Uncle Joe's $1,000 gifts. Camille will pay in cash right now and, in 5 years, Uncle Joe's lawyer will give her the $1,000 gift that she purchases. Because everyone in the family is at least a little interested in selling their gift to Camille, she suggests everyone privately write down their selling price. Camille will buy the gift from the family member who writes down the lowest price. Please write your answer here. "I will sell Uncle Joe's gift ($1,000 in 5 years) if you give me $_______ in cash right now." Now erase that scenario from your mind. This time, imagine that your rich Uncle Joe is going to put your $1,000 gift into his safety deposit box for just 1 month. After that, Uncle Joe will give you the cash. Now what is the smallest amount of cash that you would sell this for? "I will sell Uncle Joe's gift ($1,000 in 1 month) if you give me $_______ in cash right now." If you are paying attention and answering honestly, the amount you just wrote down will be more than the amount you sold the gift for when it was delayed by 5 years. We will do this two more times and then be done. This time, imagine that Uncle Joe is going to put the money into his safety deposit box for 1 year. Now what is the smallest amount of cash that you would sell the gift for? "I will sell Uncle Joe's gift ($1,000 in 1 year) if you give me $_______ in cash right now." And finally, imagine that Uncle Joe's gift will sit in the safety deposit box for 25 years. Uncle Joe's law firm has instructions to give you the money in 25 years, should Uncle Joe die before the time elapses. Now what is the smallest amount of cash that you would sell the gift for? "I will sell Uncle Joe's gift ($1,000 in 25 years) if you give me $_______ in cash right now."
Summary
Behavior analysts have long recognized the importance of verbal behavior, whether it is spoken, signed, read, or merely thought. Skinner (1957) outlined a functional taxonomy of verbal operants, each occasioned by a different type of stimulus, and maintained by a different type of reinforcer. Later researchers observed that humans acquire verbal behavior at a young age, when they learn to symmetrically relate verbal and nonverbal stimuli, like "ball" and a toy ball, as, in many ways, the same. They acquire this relating behavior through multiple-exemplar training. With verbal behavior comes the ability to understand and follow rules. Two categories of rule-following are pliance and tracking; each is maintained by a different category of operant consequences. Pliance is maintained by social reinforcers and punishers for complying with the rule. Tracking occurs because it allows the rule-follower to obtain positive reinforcers and avoid aversive events (negative reinforcers). Pliance is most likely to occur when the rule is provided by an authority figure, as in the Milgram experiment. This tendency toward pliance speaks to the importance of recognizing power differentials. Authority figures must be careful to avoid taking advantage of the less powerful, who are more likely to follow their rules, instructions, and demands, even when it is not in their best interest to do so. Our unique verbal abilities give our species behavioral adaptations impossible in other species. But verbal behavior also comes with unwanted baggage - a tendency toward psychological suffering. Our verbal thoughts can occasion unwanted feelings, and maladaptive rules may hamper our ability to live a valuable life. ACT is a behavioral approach to psychotherapy. It can help clients learn new ways to interact with their thoughts - ways that are less literal, freeing us to focus on what is important - behaving in accord with our values.
Quantitative Behavior Analysis
Behavior analysts specify the behavior of interest with enough precision that its occurrence can be counted. Instances and non-instances can be discriminated from each other.
Behavioral Service Delivery
Behavior analysts use the discoveries of laboratory and applied behavior analysts to address the needs of patients.
Phylogenetically selected behaviors
Behaviors we inherit from our parents that are well suited to stable environmental conditions that haven't changed for eons in the history of the species. Common to all members of the species. Increase chance of survival in our evolutionary niche.
Scheduling Reinforcers to enhance human performance and happiness
By carefully arranging complex reinforcement contingencies, behavior analysts can enhance human performance and, when skillfully used, enhance human happiness. These successes occur in academic, play, and wellness behaviors. We highlight a few of these applications here. In schools, teachers face a range of classroom challenges that can have detrimental effects on students and the learning environment of their peers. Fortunately, educators have successfully used schedules of reinforcement to improve the performance of students of all ages and from diverse backgrounds. Interval and ratio schedules have been used to reduce undesirable behavior, increase appropriate behavior, and help children learn valuable academic skills. For example, when teachers used a VI schedule to quietly praise academic engagement, fourth graders spent more time vigilantly working on their assignments in class. Teachers have also used ratio schedules to improve student behavior when completing academic tasks. Thus, schedules of reinforcement have practical applications in schools and have been used to improve learning outcomes. At play, it has long been known that reinforcing skillful performances improves player performance in sports such as golf. But schedules of reinforcement have also been used to increase and then maintain exercising behaviors in non-athletes. In these efforts to address the obesity crisis, behavior analysts have successfully used FR, VR, FI, and VI schedules to increase physical activity and then to thin the frequency of reinforcement, so exercising can shift from an incentivized behavior to a daily habit. Such schedule-influenced changes in behavior have improved patients' weight, blood pressure, and fitness. Wellness apps use schedules of reinforcement too. Artem Petakov, President and co-founder of Noom®, recently described how his team combined artificial intelligence with behavioral psychology (i.e., the principles of behavior analysis) to develop their behavior-change and weight-loss product. Noom® uses positive reinforcement and shaping to gradually increase daily activity and control food consumption. Coaches within the app arrange variable schedules of reinforcement, though exactly how it works is unknown. The behavioral technology used is proprietary information, known only to the company's behavioral design engineers. In sum, schedules of reinforcement have practical utility when it comes to improving academic, play, and wellness behaviors.
The Premack Principle
Circularity problem with reinforcers: As shown in Figure 9.3, we have arranged a reinforcement contingency for Sherri - IF she completes a major assignment on time at work → THEN she is awarded two tickets to a professional football game. When her performance improves, we conclude that the tickets functioned as a reinforcer. But how do we know the tickets are a reinforcer? The only evidence for this - the performance improvement - is the very thing reinforcement is asked to explain. Said another way, there is no independent evidence that the tickets are reinforcers. To address the circularity problem, David Premack conducted a series of experiments in which rats were given free access to water and an exercise wheel for an hour. Premack found that rats spent more time running than drinking, so he classified running as the high-probability behavior and drinking as the low-probability behavior. Next, Premack made a prediction about what would function as a reinforcer - access to a high-probability behavior will function as a reinforcer when made contingent upon a low-probability behavior; this has come to be known as the Premack principle (Premack, 1959). Note how Premack made this prediction before he tested whether or not access to running (the high-probability behavior) would actually reinforce the low-probability behavior - drinking. In Premack's experiment, he next arranged this contingency: IF drink → THEN unlock the running wheel. That is, access to the high-probability behavior (running) was made contingent on emitting the low-probability behavior (drinking). As predicted, the rats increased their drinking - access to the running wheel had indeed functioned as a reinforcer. Everyday examples: IF I do all the dishes → THEN I can play my video game; IF I clip my nails for 90 seconds → THEN I can study for an hour. The key is making access to HIGH-probability behaviors contingent upon LOW-probability behaviors in order to increase the low-probability behaviors. If Sherri is a good observer of her own behavior, then she will tell us about her high-probability behaviors. Access to these behaviors will, according to the Premack principle, serve as effective reinforcers for her low-probability work behaviors.
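To make the direction of the contingency concrete, here is a minimal sketch (not from the textbook; the behaviors and minutes are hypothetical) that uses observed free-access time to classify behaviors and write out a Premack-style contingency:

# A minimal sketch, assuming we recorded how long Sherri engaged in each
# activity during an hour of free access (all names and minutes are made up).
observed_minutes = {
    "watching football highlights": 40,   # high-probability behavior
    "completing assignments": 5,          # low-probability behavior
}

high = max(observed_minutes, key=observed_minutes.get)
low = min(observed_minutes, key=observed_minutes.get)

# Premack principle: make access to the high-probability behavior contingent
# upon emitting the low-probability behavior.
print(f"IF {low} -> THEN access to {high}")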
The "Acceptance" in Acceptance and Commitment Therapy
Clients seeking the help of a behavior therapist often have developed avoidance rules that guide their behavior. For example, the rule - "IF I don't talk to people I am attracted to → THEN I will avoid feelings of anxiety and rejection" - specifies a strategy for avoiding unwanted thoughts and feelings. But the rule requires that the client close off a rather large part of the world. Thus, the price paid for safety from unwanted thoughts and feelings is a narrowing of the places one might go, the people one might interact with, and the behaviors one might attempt. Because attempts to suppress unwanted thoughts and feelings paradoxically increase these feelings (the white bear problem), ACT therapists embrace the opposite approach - acceptance. Instead of trying to push thoughts away and avoid situations in which they might arise, acceptance means approaching the thought, so as to examine it flexibly, with a sense of curiosity. The ACT therapist promotes such acceptance with exercises designed to undermine our default tendency to take thoughts literally. For example, the ACT therapist might encourage the client to repeat an unwanted thought over and over again ("I am worthless, I am worthless, ..."). After doing this for a few minutes, the words lose their literal meaning and become closer to what they really are - odd sounds emanating from the mouth, sounds that one can occasionally choose, when it would be adaptive to do so, not to symmetrically relate with worthlessness. A complementary approach is to practice observing thoughts as each one arises. The client is encouraged to sit quietly with eyes closed and, when feeling relaxed, direct their attention to discrete thoughts as they arise. Let's assume that the next thought is "I am worthless." When this verbal stimulus is observed, the client has a choice - they can take the thought literally (symmetric relational responding between "worthless" and actual worthlessness) or they can do something else, something more flexible. For example, the client might calmly note "I am having the thought that I am worthless." From this perspective, the thought is a bit of behavior not unlike noticing that their right hand is scratching an itch. Both may be calmly observed and, from this perspective, a range of flexible responses are available. With the thought, the client might be encouraged to imagine placing it on a leaf and watching it float down a stream. Alternatively, the thought might be inspected more closely. Is it a big thought or a small thought? Where is the thought? What color is the thought? The goal of such exercises is not to reduce the frequency of unwanted thoughts, but to acquire new behaviors that allow the client to have a thought instead of the thought having the client. Such flexible, nonliteral behaviors can help clients to have a thought while continuing to behave in a values-driven way.
Conditioned punishers for humans
Conditioned punishers are common in our everyday human lives as well. What they all have in common is that they signal a delay reduction to a backup punisher. For example, parents often call their children by their full name - "Robert Joseph Terrel, you get over here right now!" - just before a backup punisher is delivered. If the consequence of a mischievous act is hearing your full name shouted in this way, that consequence functions as a conditioned punisher if, and only if, it decreases the future probability of this mischievous action. Other conditioned punishers are more subtle. For example, a subtle facial expression, such as a raised eyebrow, can function as a conditioned punisher (decreasing the future probability of texting in class) if the individual has learned that it signals a delay reduction to a backup punisher (e.g., being shamed in front of the class).
Reinforcer Immediacy
Considerable empirical evidence (and our intuition) tells us that reinforcers are more effective when they are obtained immediately. One reason that delayed reinforcers are less effective than immediate reinforcers likely has to do with the stable environment of our evolutionary ancestors. In that environment, a lot of bad things could happen while waiting for a delayed reinforcer. The individual could be killed by a predator while waiting and, if that happened, the delayed reinforcer would be lost. Less dramatically, a bigger, stronger competitor could come along during the delay and steal the upcoming reinforcer. The reinforcers these ancients were waiting for were not football tickets or salted caramels; they were critical reinforcers needed for survival, for example, food, water, and mating opportunities. If waiting for these reinforcers translates to sometimes not getting them, then organisms would be at a survival advantage if their behavior was highly sensitive to reinforcer delays. Modern creatures, like you and me, have inherited the genes of these delay-sensitive survivors; hence, our aversion to waiting. A second reason why delayed reinforcers might lose their efficacy is that they make it more difficult to learn response-reinforcer contingencies.
Effective Methods of Discrimination Training
Considerable research efforts have been expended to discover the most effective and efficient methods of conducting discrimination training. The real-world utility of this is obvious. For example, efficient methods of establishing discriminated operant behavior are useful in helping children learn to read (see Extra Box 1). Likewise, if employee safety requires that employees quickly learn to read hazmat signs (SDs) so that they will strictly follow a safety protocol (response) and avoid serious injury (negative reinforcer), then the technology of discrimination training can literally save lives. Of course, discrimination training will be more effective if its defining properties are strictly adhered to: (1) the response is reinforced in the presence of the SD, and (2) it is not reinforced in the presence of the SΔ. There are two further points to be made here. First, if conditions change and the response is no longer reinforced in the presence of the SD, the function of that stimulus will gradually change - it will function as an SΔ and will no longer evoke the response. Thus, it is important that the contingencies be strictly adhered to in the presence of the SD (reinforcement) and the SΔ (extinction). Second, discrimination training arranges both the SD and SΔ contingencies because experience with both produces better learning than experience with the SD alone (Dinsmoor, 1995a, 1995b; Honig et al., 1959). Alternating randomly between the SD and the SΔ is also a good idea, as it ensures the learner is attending to the antecedent stimuli rather than the sequence in which they are presented. Therefore, if we want to quickly teach baggage screeners to flag knife-containing luggage for further screening, then the best strategy is to randomly alternate between images that contain a knife (SD) and other images of long, narrow objects that are not knives (SΔs). That is, it is just as important to know when not to flag a piece of luggage for further screening as it is to learn to positively identify knives. Indeed, some evidence suggests that having more experience with the SΔ than the SD produces faster learning and more accurate discriminations. That said, there is undoubtedly an upper limit on the relative number of SΔ images. If the SD is extraordinarily rare, the learner's attention will wane in a sea of SΔs. In US airports, for example, knives, guns, and explosives in carry-on luggage are extraordinarily rare and, perhaps as a result, baggage screeners fail to detect these items more often than they flag them for further screening (Frank, 2009; Levenson & Rose, 2019). Finding the optimal ratio of SD and SΔ images in discrimination training and the long-term maintenance of discriminated operant behavior is an important area for future research.
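As a rough illustration of the procedure described above, the sketch below randomly alternates SD and SΔ trials and reinforces the flagging response only in the presence of the SD. The 25% SD mix is an arbitrary placeholder, not the optimal ratio the passage says remains to be discovered.

import random

def make_trials(n_trials=20, p_sd=0.25):
    """Generate a randomly alternating sequence of SD and S-delta trials."""
    trials = []
    for _ in range(n_trials):
        if random.random() < p_sd:
            trials.append(("SD: image contains a knife", "flagging is reinforced"))
        else:
            trials.append(("S-delta: long, narrow non-knife object", "flagging is extinguished"))
    return trials

for stimulus, consequence in make_trials():
    print(stimulus, "->", consequence)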
Punishing agent observation
Considerable research supports our intuition that when punishing agents (parents, police, or Hogwarts professors) are watching, problem behavior is less likely to occur. For example, a number of applied studies have reported that problem behaviors displayed by individuals with intellectual disabilities or autism are less likely to occur when a "you're being watched by a punishing agent" stimulus is on display. Figure 10.9 shows, for individual clients in the Doughty et al. study, the percentage of observation intervals in which a problem behavior occurred. When the you're being watched stimulus was presented, problem behavior was infrequent. When the you're not being watched stimulus was presented, the frequency of problem behavior skyrocketed. In sum, when the watchful eye of the punisher is trained on us, we refrain from engaging in punishable behaviors. Social scientists have speculated that the evolution of gods into beings who punished the immoral acts of humans corresponded with the transition from hunter-gatherer life to living in cities. Where the gods of hunter-gatherers cared only about their own godly activities (creating storms, fires, and floods), the gods of city dwellers were constantly monitoring human behavior and, at times, angrily punishing those who sinned. Why were watchful, punishing gods needed in the cities but not among hunter-gatherers? One hypothesis is that city dwellers, like most readers of this book, are surrounded by strangers who mostly turn a blind eye to bad behavior ("I don't know that belligerent drunk guy, so I'm not getting involved"). By contrast, hunter-gatherers had to live with "that guy" day after day, so punishing his inappropriate behavior now paid off immediately (the belligerence ceases) and later (the future probability of belligerence decreases). By convincing city dwellers that they are always being monitored by gods who will punish their transgressions, everyone enjoys the benefits of punishment (a more civil society, more sharing, less belligerence) without having to do any punishing themselves. Did it work? The data suggest it did. In modern times, when people encounter stimuli that suggest they are being watched by the predominant god in their culture (e.g., seeing a church or a religious symbol), they are less likely to engage in antisocial behavior, even if they are not religious.
Choice
Decision-making (i.e., making choices) is one of the most significant and complicated things we do every day. Shall I get out of bed or snooze the alarm? How much shampoo should I use: a little or a lot? Should I walk, take the bus, or drive? Shall I stick with my healthy eating plan or splurge on junk food with friends? Do I scroll further through my social-media feed, or do I study instead? Should I study in the library or the coffee shop where I know I will be distracted by music and nearby conversations? Do I succumb to peer pressure and have another drink/smoke? Our verbal answers to these questions - "Yes, I'll get out of bed when my alarm goes off" - may be different than our behavioral answers. What we verbally decide to do and our actual choices (pressing the snooze button eight times before getting out of bed) may be very different. The choices we actually make, by behaving one way or the other, are much more important than what we say we will do. Our New Year's resolutions will not improve our lives; only by repeatedly choosing to adhere to them can we improve our health, well-being, and the health of the environment. What does behavior analysis have to say about choice? Can behavioral scientists accurately predict the choices people make? Have they identified functional variables that can be used to positively influence these decisions? Spoiler alert: We can predict some choices, but we are a long way from being able to predict all of them. Once again, this is where readers of this book are important. If you choose a career in the behavioral sciences, you could make discoveries that help to improve the accuracy of our predictions and, importantly, improve the efficacy of interventions designed to positively influence human decision-making. So, let's get started.
The role of reinforcement in the act of punishing
Delivering a punisher contingent upon problem behavior is, itself, an operant response - one that is maintained by reinforcers. One reinforcing consequence is that the problem behavior stops. Consider a teacher who observes a bully tormenting another child in class. When she informs the bully that his actions will land him in after-school detention, the bullying stops. This immediate cessation of bullying undoubtedly functions as a negative reinforcer of the escape variety (SRE−); that is, it increases the future probability that the teacher will use this punisher again. Laboratory studies of human cooperation reveal that negative attitudes about punishment (punishment bad, reinforcement good) soon give way to strong preferences for punishment when cheating is encountered. That is, when delivering punishers is reinforced with reductions in cheating and increases in cooperation, people warm up to punishment. A good example of this occurred in a cooperation/competition game arranged by Gurek et al. In the game, participants repeatedly decided to either altruistically cooperate (more money for everyone) or to cheat (more money for me, less for everyone else). After each round of play, participants could choose between two versions of the game - one in which they could punish cheaters, and another in which they could not. Figure 10.10 shows that most participants initially preferred the no-punishment version of the game. However, as they continued to play the game, their preference for punishment substantially increased. Choosing to punish was reinforced with two consequences: (1) cheating decreased (a negative reinforcer) and (2) income increased (a positive reinforcer). Neuroeconomists have studied the activity of the human brain when we punish unfair behavior. When unfair or greedy actions are detected, brain regions that are active during negative emotional experiences are activated. Norm violations elicit such neural activity and may function as an establishing operation that increases the value of punishing the violator as a reinforcer. Said less technically, when we see someone cheating, bullying, pulling into our parking space, and so on, it makes us angry and we really want to see them get what they deserve. Seeing their behavior get punished functions as a reinforcer. Indeed, when we punish such unfair behaviors, reward-processing regions of our brain are activated. These reinforcement processes increase our inclination to punish cheating when we detect it. This appears to have been critical in the evolutionary history of our species, when living in small hunter-gatherer communities required cooperation to survive.
Shaping
Differential reinforcement of successive approximations to a terminal behavior. Because the trainer asked for only a little more than before, the elephant succeeded in touching the target. This general strategy of asking for just a little bit more and gradually moving behavior toward a desired behavior is called "shaping." Let's dissect this definition so readers will fully understand shaping. Differential reinforcement involves reinforcing the desired behavior and extinguishing previously reinforced behaviors. For the elephant, the desired behavior is touching the trunk to the target stick. All other apple-seeking behaviors (e.g., trying to take apples from the trainer's bucket) are extinguished. Second, the terminal behavior is the performance you ultimately want. The trainer ultimately wants the elephant to walk to the target whenever and wherever it is presented. However, the trainer knows that if she asks for this performance at the beginning of training, the elephant will not be able to do it. If this happens, training fails, and the behavior of both the trainer and elephant is extinguished. Finally, shaping involves differentially reinforcing successive approximations to the terminal behavior. That is, we begin by reinforcing a first approximation of the terminal behavior; something simple, something we anticipate the elephant can already do. If the target is placed just an inch away from the trunk, the elephant will probably touch it. When the target is touched, the conditioned reinforcer is delivered and then the backup reinforcer is provided. The conditioned reinforcer marks the correct response and signals a delay reduction to the apple. All other apple-seeking behaviors are extinguished. Once the first approximation is mastered, shaping continues through a series of successive approximations of the terminal behavior. The next approximation is taking a step to reach the target. When the step is taken and the target is touched, the conditioned reinforcer marks the response and the backup reinforcer follows. The previous approximation, reaching for the target without taking a step, is extinguished, as are all other apple-seeking behaviors. Once the second approximation is mastered, the trainer moves on to the third (two steps must be taken to reach the target) and so on, until the terminal behavior is mastered.
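A bare-bones sketch of this shaping logic is given below. It is only an illustration, not the trainer's actual protocol; the step distances and the mastery criterion are invented for the example.

approximations = [0, 1, 2, 4, 8]   # hypothetical steps the elephant must take to reach the target
MASTERY = 5                        # consecutive reinforced responses required to advance

def run_shaping(trial_outcomes):
    """Advance the criterion only after the current approximation is mastered."""
    level, streak = 0, 0
    for met_criterion in trial_outcomes:   # True if the current approximation occurred
        if met_criterion:
            # conditioned reinforcer (marks the response), then backup reinforcer (apple)
            streak += 1
            if streak >= MASTERY and level < len(approximations) - 1:
                level += 1                 # ask for just a little bit more
                streak = 0
        else:
            streak = 0                     # extinction: other apple-seeking behaviors go unreinforced
    return approximations[level]

# Example: 12 reinforced trials carry the elephant from touching the target to walking 2 steps.
print(run_shaping([True] * 12))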
Overgeneralization
Every stimulus that remotely resembles the CS evokes the CR. Victims of PTSD show this.
Behavioral Epigenetics
Examines how nurture shapes nature. Environmental events can influence whether specific genes are turned on or off, which can affect the growth and activity of neurons and, in turn, behavior. These epigenetic changes can then be passed down to offspring, creating the same types of gene regulation/dysregulation in the next generation.
Examples of abolishing operations (AOs)
For example, sexual climax functions as an AO because it (1) temporarily decreases the value of sexual stimulation as a reinforcer and (2) decreases sex-seeking behaviors. Similarly, as we sleep at night, a biological AO is occurring that, in the morning, has the effect of (1) temporarily decreasing the value of sleep as a reinforcer (relative to how much we wanted to go to bed last night) and (2) decreasing the probability that we will engage in sleep-seeking operant behaviors (going to the bedroom, getting under the covers, finding our teddy bear). Many prescription and recreational drugs have well-established AO effects. For example, the weight-loss drug phentermine is a stimulant (like amphetamine or cocaine) that (1) decreases the reinforcing efficacy of food and (2) decreases food-seeking behaviors. Antianxiety medications such as Xanax and Valium also have AO effects. They (1) decrease the value of sex as a reinforcer and (2) decrease sex-seeking behaviors. Marijuana is widely thought to have an AO effect, reducing smokers' "motivation" to work, study, and so on. Despite this widespread belief, well-controlled human laboratory experiments do not robustly support this assumption. Although we cannot conclude that smoked marijuana has an AO effect, it does function as an EO, increasing (1) the reinforcing value of snack foods and (2) trips to the nearest convenience store to procure them.
How could VR schedules encourage eco-friendly behavior?
Here's a wild idea - if humans, like nonhuman animals, have a strong preference for VR over FR schedules, then perhaps we could encourage more eco-friendly behavior by using VR schedules. For example, instead of taking $0.10 off our grocery bill because we remembered to bring in our reusable bags (a meaningless consequence that probably has no reinforcer efficacy), why not put this environmentally conscious behavior on a VR schedule? Specifically, if we check out with our reusable bags, then there is a 1 in 100 chance that we will win $10 in cash - a VR 100. When someone in the store wins the cash prize, a celebration occurs, so everyone can see that winning happens all the time. A very similar technique is used in the casinos. This simple, cost-neutral change in the schedule of reinforcement may encourage more green behavior.
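A quick sketch of how the proposed contingency could be programmed appears below. It treats the VR 100 as a 1-in-100 chance on each checkout (a random-ratio approximation), and all amounts come from the hypothetical example above.

import random

def reusable_bag_checkout(p_win=1/100, prize=10.00):
    """One checkout with reusable bags on a random-ratio approximation of a VR 100."""
    return prize if random.random() < p_win else 0.00

# Across 1,000 checkouts, total payouts should average about $100 -
# the same cost as a guaranteed $0.10 discount per checkout, hence "cost-neutral."
winnings = sum(reusable_bag_checkout() for _ in range(1000))
print(f"Total paid out: ${winnings:.2f}")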
Herrnstein's Matching Equation
Herrnstein (1961) developed a simple equation that predicted how pigeons chose to allocate their behavior between pecking the left (BL) and right keys (BR). He hypothesized that these choices would be influenced by the reinforcers obtained on the left (RL) and the right (RR) keys: BL / (BL + BR) = RL / (RL + RR). Before we plug any numbers into Herrnstein's equation, note that behavior (pecking BL or BR) appears on the left side of the equation and reinforcers (RL and RR) appear on the right. The equals sign in the middle is important. The equals sign indicates that the proportion of BL responses should match (or be equal to) the proportion of reinforcers obtained on the left (RL). If you are more familiar with percentages than proportions, then Herrnstein's equation says the percentage of responses allocated to BL should match (or be equal to) the percentage of reinforcers obtained on the left (RL). To convert proportions to percentages, we simply multiply the proportion by 100. Let's plug some very simple numbers into the equation. In Herrnstein's (1961) experiment, the VI 90-second schedule arranged reinforcers to occur at a rate of 40 per hour (i.e., 1 reinforcer, on average, every minute and a half), whereas extinction arranged none. Thus, RL = 40 and RR = 0. Simple enough. Next, we plug these rates of reinforcement into the right side of the matching equation: RL / (RL + RR) = 40 per hour / (40 per hour + 0 per hour) = 40/40 = 1. By multiplying this proportion by 100, we obtain the percentage of reinforcers available on the left: 1 × 100 = 100% of the reinforcers are on the left key. If the right side of the equation is 100%, then Herrnstein's equation predicts that the percentage of responses allocated to the left key should match this number; that is, 100% of the responses should be made on BL (exclusive choice). Figure 13.5 shows this prediction graphically. On the x-axis is the percentage of reinforcers obtained on the left, and on the y-axis is the percentage of left-key pecks. The lone data point on the graph shows that when 100% of the reinforcers were arranged on the left, the pigeons should choose to allocate 100% of their pecks to that key. The pigeons conformed to this prediction, allocating over 99% of their behavior to BL.
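The arithmetic above can be captured in a couple of lines. The sketch below is a simple illustration (the function name is invented), using the reinforcement rates reported for this condition:

def predicted_left_allocation(r_left, r_right):
    """Percentage of responses the matching equation predicts on the left key."""
    return 100 * r_left / (r_left + r_right)

# VI 90-s on the left (about 40 reinforcers/hour) versus extinction on the right:
print(predicted_left_allocation(40, 0))   # 100.0 -> exclusive preference for the left key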
RESEARCH SUPPORT FOR HERRNSTEIN'S EQUATION
Herrnstein's matching equation has been studied in rats, pigeons, chickens, coyotes, cows, and monkeys; it does a good job of predicting how behavior will be allocated when the outcomes of choice are uncertain (Dallery & Soto, 2013; Davison & McCarthy, 1988). If you take more advanced coursework in behavior analysis, you will learn that Herrnstein's matching equation does not always make perfect predictions. For example, cows tend to deviate from predictions of the matching equation more than the pigeons in Figure 13.8 (Matthews & Temple, 1979). In these more advanced classes, you will consider exceptions to the matching equation and will study theories proposed to account for these exceptions. What about humans? Can the matching equation be used to predict actual choices made by humans? Early research supported the predictions of the matching equation: humans chose to allocate more of their behavior toward the richer of two VI schedules of reinforcement (e.g. Bradshaw et al., 1976, 1979). However, a series of important experiments conducted by Horne and Lowe (1993) revealed several problems with these early studies. For example, subtle cues informed participants about the frequency of reinforcers. When these cues were removed, very few participants behaved in accord with the matching equation (Horne & Lowe, 1993). More recent research suggests humans conform to the predictions of the matching equation, but only when participants are paying attention. Much of the early research on human choice and the matching equation arranged button-pressing tasks resembling those used with rats and pigeons. While these tasks engage nonhuman attention, humans find them boring and mostly press the buttons without paying attention to the consequences of their behavior. Madden and Perone (1999) explored the role of participant attention by changing the task from one in which choices could be made without paying much attention, to one that was more engaging. As expected, when human attention was elsewhere, choice did not closely conform to the predictions of the matching equation. However, as shown in Figure 13.9, the matching equation accurately predicted human choice when participants paid attention to the engaging task. Other experiments have reported comparable findings with tasks that effectively engage human attention (Baum, 1975; Buskist & Miller, 1981; Critchfield et al., 2003; Magoon & Critchfield, 2008; Schroeder & Holland, 1969). We have prepared a reading quiz to give you practice calculating and graphing the predictions of Herrnstein's matching equation. For readers who wonder why all of these graphs and calculations matter, we invite you to read Extra Box 1. The matching equation makes strong, empirically supported predictions about what controls human decision-making and, as such, has important things to say about how to solve existential threats that have at their core disordered human choice.
Response Variability
If a reinforcer never occurs, it cannot increase behavior. Behaving variably is critical to solving the puzzle of a new reinforcement contingency. When individuals are exposed to new contingencies of reinforcement, they tend to respond more variably - they explore in a trial-and-error way. When individuals learn how a reinforcement contingency works, they shift from exploration to exploitation - behaving efficiently and rarely trying something new.
Generalization
If a young reader has learned to say "just" when she sees the written word JUST, it will not surprise us when she mistakenly says "just" when the written word on the page is actually JEST. These two written words closely resemble each other, differing by just a single letter. We might say that the verbal response "just" has generalized to a new stimulus that very closely resembles the SD. That is, generalization occurs when a novel stimulus resembling the SD evokes the response, despite that response never having been reinforced in the presence of that novel stimulus. Saying "just" has never been reinforced in the presence of JEST (the novel stimulus), but this verbal response occurs because JEST so closely resembles JUST (the SD). This sort of generalization was first demonstrated in the laboratory by Guttman and Kalish (1956). They trained pigeons to peck a yellow response key by reinforcing those pecks on a VI schedule. As is typical of VI schedules, the birds pecked the yellow key at a constant rate. Next, during a test session in which no reinforcers were ever delivered, the key was lit with 11 different colors, presented individually in a random order. As you can see in Figure 12.3, the pigeons pecked at the highest rate during the portion of this test when the key was yellow (SD). When the key was lit with a novel color, responding decreased, and the amount of that decrease was an orderly function of the amount of the color change. As the key color looked less and less like the SD, the pigeons pecked it less and less. The approximately bell-shaped response curve in Figure 12.3 is called a generalization gradient; that is, a graph depicting increases in responding as the novel antecedent stimulus more closely resembles the SD. The bell-like shape of the generalization gradient has been replicated in laboratory settings many times, in many stimulus modalities (visual, auditory, etc.), and in many species, including humans. Outside the lab, a common experience of generalization is the "phantom vibration." A vibrating cellphone functions as a tactile SD, which leads us to check our phone. Sometimes we have a tactile sensation that is not exactly like the SD, but we check our phone anyway. When we find no new text, push, or news notification, we can attribute this hallucination to generalization - the tactile stimulus that we sensed was similar to the vibration produced by our phones.
COMMITMENT STRATEGIES
If we are more likely to choose the larger-later reward at T2 (Figure 13.12), when neither reward is immediately available, then an effective strategy for improving self-control is to make choices in advance, before we are tempted by an immediate reward. For example, making a healthy lunch right after finishing breakfast is a commitment strategy for eating a healthy lunch. After eating breakfast, we are full and, therefore, are not tempted by a beef quesarito. However, we know we will be tempted at lunch time, so it would be wise to use a commitment strategy now. By packing a healthy lunch, we commit to our diet, and the larger-later rewards it entails. Of course, we will be tempted to change our mind at T1, when lunch time approaches, but having the packed lunch can help in resisting the temptation to head to Taco Bell. Rachlin and Green (1972) first studied commitment strategies in pigeons. In their experiment, pigeons chose between pecking the orange and red keys at T2, as shown in Figure 13.14. At this point in time, no food reinforcers were immediately available, so hyperbolic discounting suggests the larger-later reward will be preferred. If the red key was pecked, a delay occurred until T1, at which time the pigeon chose between one unit of food now (smaller-sooner reward of pecking the green key) and three units of food after a delay (larger-later reward). Given these options, if the pigeons found themselves at T1, they nearly always made the impulsive choice. But note what happens when the pigeons chose the orange key at T2. Instead of waiting for another choice at T1, they were given no choice at all. By choosing the orange key, the pigeon committed itself to making the self-control choice at T1. Consistent with hyperbolic discounting of delayed reinforcers, Rachlin and Green's (1972) pigeons frequently committed themselves to this course of self-control. The Nobel prize-winning behavioral economist, Richard Thaler, used a similar commitment strategy to help employees save money for retirement (Thaler & Benartzi, 2004). New employees are reluctant to sign up for a retirement savings program, in part, because money put into savings is money not used to purchase smaller-sooner rewards. Thaler's insight was to ask employees to commit to saving money at T2, when the money to be saved is not immediately available. This was accomplished in the Save More Tomorrow program by asking employees to commit some of their next pay raise, which would not be experienced for many months, to retirement savings. At T2, with no immediate temptations present, the number of employees who signed up for retirement savings more than tripled. Although employees were free to reverse their decision at T1, when the pay raise was provided, very few took the time to do so. The program has been widely adopted in business and is estimated to have generated at least $7.4 billion in new retirement savings (Benartzi & Thaler, 2013). This is a wonderful example of how behavioral science can positively influence human behavior.
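To see why choices made at T2 can differ from choices made at T1, the toy calculation below assumes the standard hyperbolic form V = A / (1 + kD). The amounts, delays, and k value are made-up numbers chosen only to show the preference reversal; they are not values from Rachlin and Green's experiment.

def discounted_value(amount, delay, k=0.5):
    """Present value of a reward under hyperbolic discounting (illustrative k)."""
    return amount / (1 + k * delay)

def preferred(time_to_smaller):
    smaller_sooner = discounted_value(1, time_to_smaller)        # 1 unit of food
    larger_later = discounted_value(3, time_to_smaller + 10)     # 3 units, 10 s later
    return "larger-later" if larger_later > smaller_sooner else "smaller-sooner"

print(preferred(0))    # at T1, the small reward is immediate: "smaller-sooner" (impulsive choice)
print(preferred(10))   # at T2, both rewards are delayed: "larger-later" (commitment pays off)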
Reinforcement vs. No Consequence
In Figure 13.1, when you press the right button (BR) nothing happens, ever. When you press the left button (BL), your press is immediately accompanied by a "beep" and a display on a computer screen that says "25 cents have been added to your earnings," which we will assume functions as a reinforcer. You press the left button 24 more times and more money is added to your account each time. Score! So, which of these buttons would you choose to press going forward? Obviously, you will exclusively press the button that produced the reinforcer (BL). The reason is clear - pressing the left button is reinforced and pressing the other one is not. Every species of nonhuman animal tested so far prefers reinforcement over non-reinforcement. For example, in an experiment conducted by Rybarczyk et al. (2001), cows preferred to press the left lever (BL) because it was followed by a food reinforcer. They chose not to press the right lever (BR) after learning that it never provided a reinforcer. In sum, individuals choose reinforcement over no consequence.
Substitutes in Herrnstein's experiments
In Herrnstein's experiments and the hundreds of choice experiments that have replicated his findings, the animals and humans chose between two identical reinforcers. For rats, the choice was between food reinforcers earned by pressing the BL lever and the same food reinforcers available for pressing the BR lever; similarly, for humans it was a choice between money and money. However, most of the choices we make in our daily lives are between different reinforcers. For example, at snack time we might choose between having a slice of chocolate cake or chips and salsa. Can the matching law be used to predict and influence these choices as well? The answer is, "it depends"; it depends on whether the two reinforcers substitute for each other. Before we dive into the "it depends" answer, let's explain what we mean by substitute. When we say two reinforcers substitute for one another, it helps to think of ingredients in a recipe. If we are out of coriander, we might Google "What substitutes for coriander." The answer will be a spice that has a similar taste (equal parts cumin and oregano). If we cannot easily obtain one reinforcer (coriander), the reinforcer that will take its place (a cumin/oregano mix) is a substitute. The more technical definition of a substitute reinforcer is a reinforcer that is increasingly consumed when access to another reinforcer is constrained. For example, if the price of coffee increases substantially, our access to this reinforcer is constrained - we can't afford to drink as much coffee as we normally do. When coffee prices spike, we will substitute other caffeinated drinks like tea, caffeine shots, and energy drinks. Likewise, if your partner moves to another city, your access to intimate reinforcers is greatly constrained. Substitute reinforcers might include social reinforcers from friends, intimate reinforcers from a new partner, or drug reinforcers from a bottle. Now back to the "it depends" answer. The matching law can be used to predict and influence choice if the two different reinforcers substitute for one another. For example, clothes stocked by Moxx and Rass substitute for one another - if Rass raises their prices, we buy more clothes at Moxx - so the matching law applies. However, if the two reinforcers do not substitute for one another, then no, the matching law does not apply. For example, clothes and hamburgers do not substitute for one another. If the price of clothes goes up, you will not begin wearing hamburgers. The study of substitute reinforcers has revealed some important findings in the area of substance use disorders. This topic is explored further in Extra Box 2.
Therapeutic Utility of Motivating Operations
In clinical settings, applied behavior analysts have found EOs and AOs to be especially helpful. For example, when teaching children with autism how to ask for items such as blocks or crayons, researchers have found it useful to constrain the child's access to these items. A child who has not played with blocks for a while will find them much more valuable than if a block-playing session just ended. Block deprivation is an EO because it (1) increases the reinforcing value of blocks and (2) increases the probability that the child with autism will be "motivated" to practice saying, "Can I have the blocks". AOs are also used to improve clinical practice. For example, repeatedly reinforcing the verbal operant "Can I have the blocks" with access to blocks will have a gradual AO effect. That is, each block-playing session decreases the child's "motivation" for another round with the blocks. For this reason, behavior analysts working in clinical settings will often vary the reinforcers presented, thereby reducing the probability that an AO will occur with any single reinforcer. This reinforcer-variation strategy works well, as long as the reinforcers used are all highly preferred items. A related strategy for maintaining motivation is to allow the individual to choose which of several highly preferred reinforcers they would like to have now. Because the client is best positioned to know what reinforcer they want right now, this EO/AO-based strategy not only enhances motivation but is also preferred by the client.
More uncertainty about Herrnstein's experiment
In the next part of Herrnstein's experiment, uncertainty was increased by programming a VI schedule on both keys. A VI 3-minute schedule was arranged on the left key (RL = 20 reinforcers per hour) and a different VI 3-minute schedule was arranged on the right (RR = 20 reinforcers per hour too). In many ways, this resembles the reinforcement schedules operating when you choose between shopping at two comparable stores like Moxx and Rass. Reinforcers (bargain buys) are found intermittently; we cannot predict when the next one will be found, but our experience tells us that they are found about equally often at the two stores (RMoxx = RRass). If, in all other ways, the stores are the same (same distance from your apartment, same quality of clothes, etc.), then you should visit both stores equally often. This is what Herrnstein's equation predicts when a VI 3-minute schedule is programmed on the left and right keys: BL / (BL+BR) = 20 per hour / (20 per hour + 20 per hour) = 20/40 = 1/2*100 = 50% If 50% of the reinforcers are obtained by pecking the left key, then choice should match this percentage: half of the pecks should be directed toward that key. As shown in Figure 13.6, the pigeons behaved in accord with this prediction, allocating an average of 53% of their responses to BL. To make it easier for you to see deviations from predictions of the matching equation, we added a blue line to the figure. The matching equation predicts all of the data points should fall along this blue line. In another phase, Herrnstein's equation was tested by assigning a VI 9-minute schedule on the left key (RL = 6.7 reinforcers per hour) and a VI 1.8-minute schedule on the right key (RR = 33.3 reinforcers per hour). Now, reinforcers are obtained more often on the right key, the key the pigeons previously dispreferred. Plugging these reinforcement rates into the matching equation, BL / (BL+BR) = 6.7 / (6.7+33.3) = 6.7/40 = 0.1675*100 = 16.75% we can see that about 17% of the reinforcers are obtained by pecking the left key. Therefore, the matching equation predicts the pigeons will choose to allocate 17% of their behavior to the left key (and the other 83% to the right key). As shown in Figure 13.7, in Herrnstein's experiment this prediction was pretty accurate. Translating this to shopping, if we rarely find bargains at Rass but usually find them at Moxx, then Herrnstein's matching equation predicts we should spend most, but not all of our time shopping at Moxx. Makes sense, but imagine that our local Rass store gets a new acquisitions manager and now shopping there is reinforced twice as often as it is at Moxx. The matching equation predicts that we will switch our allegiance and spend two-thirds of our time shopping at Rass. As shown in Figure 13.8, when Herrnstein tested this prediction, his pigeons did exactly this. The additional green data points in the figure show choice data from other conditions in which Herrnstein tested the predictions of the matching equation. As you can see, the predictions were quite accurate.
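For completeness, plugging the reinforcement rates from these two conditions into the same illustrative function used earlier reproduces the predictions in the text (the function is repeated here so the snippet runs on its own):

def predicted_left_allocation(r_left, r_right):
    return 100 * r_left / (r_left + r_right)

print(predicted_left_allocation(20, 20))                 # 50.0  -> indifference between the two VI 3-min keys
print(round(predicted_left_allocation(6.7, 33.3), 2))    # 16.75 -> about 17% of pecks on the left key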
Why are single-subject experimental designs preferred for behavioral experiments?
Keeps the focus on the behavior of the individual and is transparent about presenting all the data to the jury. Doesn't allow computers to decide whether the behavior of a group changed; rather, it lets us directly see whether the behavior of the individual changed or not.
Choosing between Uncertain Outcomes
Later in the chapter we will return to the choices you just made, but for now consider the choice you make when leaving home and deciding where to go shopping for clothes. There are many options, but for simplicity we will confine our choices to two stores: Moxx or Rass. Which one will you choose? You cannot be certain which store will have that great bargain you are looking for, so you will have to decide based on your past experiences. That is, you will have to decide which one has had more bargains in the past. In the 1960s, Richard Herrnstein conducted some of the first operant learning experiments on choice. He studied choices made by rats and pigeons. As animals are uninterested in shopping for clothes, Herrnstein arranged food as the reinforcer. Rather than building two little grocery stores that the subjects could shop in, he arranged two response keys (pigeons) or two levers (rats) on the wall of the chamber; by pecking or pressing them they could occasionally obtain these reinforcers. The individual animals participating in these experiments were free to do whatever they chose, whenever they wanted. They could peck one key for a while and then the other, they could choose to press one lever exclusively, or they could do something else, such as walk around, groom themselves, or take a nap. In one experiment with pigeons, Herrnstein (1961) programmed a variable-interval (VI) 90-second schedule of reinforcement to control the availability of reinforcers on the left key (BL), whereas pecks on the right key (BR) were never reinforced. Note that the consequence of pecking BL is uncertain. Sure, pecking this key will be reinforced once, on average, every 90 seconds, but it is impossible to be certain which peck will be reinforced. Despite this uncertainty, the pigeons spent all of their time pecking the left key (another example of preference for reinforcement over no consequence).
Pavlovian Learning and Conditioned Reinforcers
As in all examples of Pavlovian conditioning, the neutral stimulus (ka-chunk) signals a reduction in the delay to the unconditioned stimulus (US). In this experiment, the US was food - a primary reinforcer. Because of this signaled delay reduction, ka-chunk comes to function as a conditioned stimulus (CS). In Figure 8.1, food is delivered once, on average, every 30 seconds. When the ka-chunk happens, the delay is reduced to a half-second; that's quite a delay reduction. Because of this large delay reduction, ka-chunk acquires a CS function - it evokes conditioned responding such as salivation and physiologically detectable excitation. If Pavlovian learning is necessary for a neutral consequence to become a conditioned reinforcer, then when the ka-chunk functions as a CS it should also function as a conditioned reinforcer. To test this, Skinner arranged an operant contingency between pressing a lever (newly introduced to the chamber) and the ka-chunk: IF lever press → THEN ka-chunk. Note that the only consequence of pressing the lever was a ka-chunk; no food was provided. If Skinner's rats learned to press the lever (which they had never seen before), then he could conclude that ka-chunk functions as a reinforcer - a conditioned reinforcer. Indeed, Skinner's rats learned to press the lever when the only consequence was the ka-chunk. However, as they continued to earn ka-chunks, they gradually decreased their rate of lever pressing. Why? Because of Pavlovian extinction. When the CS (ka-chunk) was repeatedly presented without the US (food), the CS no longer signaled a delay reduction to the US. Eventually ka-chunk stopped functioning as a CS or as a conditioned reinforcer.
Rules and Rule-Governed Behavior
Now that we have a basic understanding of the learning and relational responding that underlies human language, let us explore one of the key advantages conferred to those who have mastered verbal behavior. Humans, because of language, have a massive advantage over other, nonverbal species - they don't have to learn things in a trial-and-error fashion. Instead, they can learn from others who tell them what to do. You might recall Thorndike's cat from Chapter 5, who was tasked with escaping from a puzzle box. The cat no doubt made hundreds of futile attempts before stumbling upon the correct response that opened the door. By contrast, were any of us placed in the puzzle box, the simple verbal instruction "pull the string" would readily lead to the response that opens the door. Likewise, a larger set of instructions can guide us through the assembly of a FJÄLLBO (shelving unit) purchased at Ikea. Without these instructions, we may never correctly assemble the unit. Behavior acquired and maintained by interacting with the contingencies of reinforcement alone is called contingency-shaped behavior (Skinner, 1966). The behavior of Thorndike's cat was contingency shaped; it was acquired and maintained by interacting with the contingency of reinforcement (IF string-pull → THEN escape from the box). Some human behaviors are contingency shaped. For example, you learned to walk through a contingency-shaping process. Much like the cat in the puzzle box, you made hundreds of futile attempts to stand and walk before you eventually took your first successful step. Even if your parents said things like "take a step, take a step," such instructions were useless in specifying the motor movements needed to maintain balance. By interacting with the contingencies of punishment (falling to the floor when the behavior failed) and reinforcement (maintaining balance), gravitational contingencies shaped your first steps. Even today, our walking is contingency shaped. We cannot verbally specify all of the small muscular adjustments necessary to maintain balance while walking. The best we can do is to provide a gross description of what we can see, "Well, you put one foot in front of the other..." By contrast, rule-governed behavior describes behavior influenced by a verbal description of the operative three-term contingency (antecedent-behavior-consequence; Peláez & Moreno, 1999; Skinner, 1966). For example, the instructions provided with the FJÄLLBO describe antecedents ("When you have completed Step 12 ..."), behaviors ("... attach the shelf labeled D"), and consequences ("With the completion of this step, your FJÄLLBO is ready for use"). Following these instructions is reinforced with a useful shelving unit, a consequence the user may otherwise have never achieved.
Principle 2 for effective conditioned reinforcement
Principle 2: Use a salient conditioned reinforcer. The second principle of Pavlovian conditioning was "Use a salient CS." Translated to conditioned reinforcement, this becomes, Use a salient conditioned reinforcer. Simply put, a noticeable conditioned reinforcer will work better than one that is easily overlooked. For example, giving a child a token and having her place it in her token bank is more salient than dropping the token into the bank for her. If you are going to deliver a conditioned reinforcer, you want to be certain the individual observes it. If they don't observe the conditioned reinforcer, it will never positively influence behavior. To increase the salience of conditioned reinforcers, animal trainers often use clickers instead of saying something like, "good boy" (after all, animals don't speak English). Clickers, which are inexpensive and may be purchased at most pet supply stores, present a salient auditory stimulus "click-click." This sound is established as a conditioned reinforcer by ensuring that it signals a delay reduction to an effective backup reinforcer, like a favorite treat. The sound of the clicker is unique. Pets, farm animals, and captive animals in zoos have never heard a sound like this before. This novelty makes the sound salient, that is, something that's difficult to miss. A second advantage of the clicker is that its "click-click" sound is brief, much briefer than the time it would take to say, "good boy." This brevity is important because an effective conditioned reinforcer marks the desired response (and no other behavior) as it occurs (Urcuioli & Kasprow, 1988; Williams, 1999). When we say marking, we mean that the conditioned reinforcer immediately follows the response, and this helps the individual learn which response produced the backup reinforcer.
How is sharing encouraged in hunter-gatherer civilizations?
Punishment. When someone is greedy, they are punished immediately with ridicule and shaming. For serious infractions, the group may collectively impose harsher punishers - confiscating possessions, banishing the cheater, or killing them. Punishment, when combined with reinforcement of altruistic acts, plays an important role in decreasing greed and, instead, promoting cooperation, sharing, and group cohesion.
Strategy 2 to replace bad habits with good habits
Replace them with stimuli that will one day evoke a good habit. Second, replace those stimuli that evoke bad habits with stimuli that will evoke good ones. These stimuli will not work at first, but carefully selecting these stimuli now will pay dividends later. Let's assume that we want to replace snacking on the couch with walking on the treadmill. What antecedent stimuli could we arrange that would, in the short run, remind us to walk and, in the long run, would reliably evoke habitual walking behavior? Well, we could move the couch to where the treadmill is now, and move the treadmill into the prime real estate that the couch currently holds. Now the first thing you see as you enter the living room is the treadmill, not the couch. If our good habit is to be reliably evoked by an antecedent stimulus, then we must ensure we notice that stimulus every day.
PTSD
Shows the effects of fear conditioning. Previously neutral stimuli evoke unwanted emotions.
Strategy 3 to replace bad habits with good habits
Set the bar very low. Third, we must reinforce the desired behavior hundreds, if not thousands, of times. To accomplish this, the behavior needs to occur hundreds, if not thousands, of times. Uh...how do we do that? We set the bar low. That is, we arrange a reinforcement contingency that asks very little of us. That's right, we reinforce successive approximations of the target behavior - we use shaping. Applied to treadmill walking, when we head to the living room to watch a bit of TV, we see the treadmill in its new, central location and we hop onto it and walk for 30 seconds. That's the first approximation, and when we do it (and this is the fourth strategy - make sure the behavior is reinforced), we feel good about ourselves - today we met our goal on the way to becoming an exerciser (our new identity). We did not overdo it, leaving us sore tomorrow (and less likely to want to get on the treadmill). Instead, we began the long process of getting into the habit of seeing the treadmill and exercising. In addition, now that the treadmill holds that prime real estate in the living room, we notice how much easier it is to see the TV from up there (another reinforcer for exercising).
Shaping Principle 2
Shaping Principle 2 asks us to evaluate what the novice player can currently do and how that falls short of the terminal behavior. The designers of Plants vs. Zombies knew that novice players could click things - after all, they clicked the app that loaded the game. However, novices do not normally click apps the millisecond they become available. The dotted line shows how quickly the player will have to click things in order to win at the highest level of the game. This is the speed of the terminal behavior. Clearly, the current speed falls short. Identifying this gap between current and terminal performance helps the designers identify a dimension along which behavior needs to change - it needs to get faster (reduced latency between stimulus and response). Once this dimension is identified, the game designer can set the reinforcement contingency for the first response approximation.
Shaping Principle 3
Shaping Principle 3 provides advice for setting the reinforcement contingencies - the response approximations should challenge the player, but not so much that reinforcers cannot be obtained. Throughout shaping we need to hit that Goldilocks zone in which reinforcers can be earned, but only if the performance is a little better than before. If we set the reinforcement contingency at the superfast level shown in Figure 8.3, the player would never beat the level; that is, they would never obtain critical reinforcers in the game. Operant extinction would leave these players feeling frustrated and angry, and they would, in turn, leave negative reviews in the app store. Not a good thing if you are trying to make a living in video game design.
Shaping Principle 5
Shaping Principle 5 suggests we ensure the learner has mastered the current response approximation before moving on to the next one. This is achieved in Plants vs. Zombies by not letting players advance to the next level (where suns will have to be clicked a little faster) until they beat the current level. Beating a level demonstrates the player has mastered the current response approximation.
How is shaping used on human behavior?
Shaping is an effective way to help humans learn complex terminal behaviors, that is, those requiring skills that make it impossible for the novice to obtain a reinforcer. Shaping can transform the novice player into a zombie-killing machine. Imagine what would happen if the novice player's first experience with the game was the extreme challenge of the final level. The outcome would be disastrous. The brain-hungry zombies would quickly overrun the house and the reinforcer (defeating the level) would not be obtained. No matter how many times the novice tried to beat the final level, his game-playing behavior would be extinguished. The game's designers avoided this by using shaping: they arranged lots of conditioned reinforcers for behaviors that were mere approximations of the skills needed to win when playing the final level. If you've played the game, you may remember that the first approximation taught to novice players is clicking on a seed packet (they appear in the upper left corner of the screen). This response produces a conditioned reinforcer - the sound of the packet being torn open. This sound marks the correct response and signals a small delay reduction to the backup reinforcer - killing all the zombies and winning the first level. The next approximation is to click suns as they fall from the sky. The first time this is done, another auditory conditioned reinforcer marks the response and points are added to a counter. In this token economy, points (generalized reinforcers) may be exchanged for many different backup reinforcers - zombie-killing plants, zombie-blocking obstacles, and sunflowers that produce more suns/points. All told, the game differentially reinforces thousands of successive approximations before the terminal behavior is acquired. When framed as a protracted learning task in which thousands of new skills must be acquired, it sounds tedious. But because the game designers so effectively used shaping, we use other words to describe the game - "fun," "engaging," and "a place where I lose all track of time."
Motivating Operations Require Response-Reinforcer Contingency Learning
So, MOs (EOs and AOs) have two effects on behavior: (1) they temporarily alter the efficacy of a reinforcer and (2) they change the probability of behaviors that lead to that reinforcer. The second of these effects is possible only if the individual has learned the response-reinforcer contingency. As you have no doubt deduced, an inability to learn response-reinforcer contingencies would be highly maladaptive. If we were incapable of learning which of our many behaviors will address our critical biological needs (nutrition, water, heat, sex), those needs are likely to go unaddressed. When famished, we are just as likely to pick our noses as we are to order a pizza. When thirsty, we are just as likely to dig a hole and jump in as we are to get a drink of water. An individual who cannot learn response-reinforcer contingencies is unlikely to survive very long.
Breaking the Rules in Clinical Psychology
Some of the behavior analysts who studied rule-governed behavior in the late twentieth century were clinical psychologists who were also conducting talk therapy with clients experiencing depression, anxiety, obsessive-compulsive disorder, and so on. These researchers recognized the parallels between the maladaptive rule-following observed in the lab and the problems experienced in daily life by their clients (e.g., Zettle & Hayes, 1982). Specifically, clients appeared to be rigidly following a set of maladaptive rules; rules that they regarded as absolutely correct and never to be strayed from. For example, a rule like "IF I attend the party → THEN others will notice how awkward/ugly/stupid I am" is maladaptive because, when followed, it is unfalsifiable. That is, if the client stays home, they will never know that a good time might have been had at the party. Such maladaptive tracking can greatly constrict healthy living. Maladaptive pliance is also an issue in therapy. Individuals seeking the assistance of a therapist often rigidly follow rules because they place a high value on social positive reinforcers (praise, credit, etc.) and social negative reinforcers (the avoidance of blame, criticism, etc.). This social reinforcement contingency can undermine progress in therapy if, for example, the client follows the therapist's advice not because it leads to effective action (tracking) but because it pleases the therapist (pliance; Hayes et al., 1989). When therapy ends and social reinforcers are no longer provided by the therapist for adhering to the rules of therapy, it should be no surprise when pliance undergoes extinction and the benefits of therapy are lost. Recognizing these pitfalls of rule-following, these clinical researchers sought a new approach to conducting talk therapy; one informed by behavioral research (Hayes et al., 1999, 2004, 2009). To understand what was new about this therapy, it helps to know something about most other forms of psychotherapy, which are grounded in the commonly held intuition that thoughts are important - they play a causal role in human behavior (Beck, 2005). This "thoughts are important" hypothesis accords well with the client's belief that changing their thoughts is a prerequisite to valued behaviors such as behaving confidently, responsibly, and compassionately toward others. With this mutually agreed upon hypothesis, the traditional psychotherapist's job is to search for maladaptive thoughts, as though they revealed a "computer virus" in the brain. When the virus is detected in the thoughts voiced by the client, the therapist installs a "software patch," of sorts, by challenging the maladaptive thoughts and replacing them with more rational modes of thinking. This is intuitively appealing - if thoughts cause behavior, then "installing" better thoughts should change behavior. But, like all behavior analysts, Hayes et al. (1999, 2009) recognized the fallacy of this "thoughts cause behavior" hypothesis. Their new approach, called Acceptance and Commitment Therapy or ACT (pronounced "act"), was designed to therapeutically undermine the client's rules about the causal nature of thoughts. For example, the client might argue that their depressive thoughts caused them to stay home instead of attending the party. Is this true? Chapter 1 discussed the fallacy of arguing that one behavior (a depressive thought) causes a second behavior (staying home). The fallacy is revealed by asking, "And what caused the depressive thought; a third behavior? 
And what caused the third behavior, a fourth behavior?" Causal variables are to be found elsewhere, and this book has summarized a myriad of functional variables that influence behavior. In talk therapy, the ACT therapist might undermine the "thoughts cause behavior" rule by asking the client to walk around the room while repeating the thought, "I can't walk." As the client realizes it is possible to have this thought and behave in a way that contradicts it, the "thoughts cause behavior" hypothesis is held a little less firmly. The ACT therapist uses many such techniques to undermine the belief that thoughts are important and must be controlled in order to behave in accord with one's values (Levin et al., 2012). A second tactic used by ACT therapists is to point out why the client's "I must control my thoughts" rule is both impossible and leads to suffering. This tactic was inspired by the verbal behavior research summarized earlier in this chapter. In particular, the finding that what makes verbal behavior verbal is symmetric relational responding. When the client has the thought "IF I have depressive thoughts and feelings → THEN I cannot behave in accord with my values," they are presenting to themselves a verbal stimulus ("depressive thoughts and feelings") that is related symmetrically with, you guessed it, actual depressive thoughts and feelings. And when, at other times, actual depressive feelings arise, you will not be surprised to learn that depressive thoughts follow (Wenzlaff et al., 1991). Thus, rules about controlling thoughts and feelings serve only to prompt into existence that which the client is trying so desperately to suppress. This process may be experienced firsthand when we tell you, "Whatever you do, don't think of a white bear. No, really, your life depends on it - don't think of a white bear." When you read or hear this, what do you inevitably do? Because you long ago learned that "white bear" should be related symmetrically with images of white bears (see Figure 14.5), you cannot help but think of a white bear (Wegner & Schneider, 2003). Consistent with this inability to suppress unwanted thoughts and feelings, clients who try to control these private events suffer more than those who adopt a different coping strategy (Degenova et al., 1994; Hayes et al., 2006; Masedo & Esteve, 2007). Consider the panic-disorder client who feels a twinge of anxiety and tries to suppress this by thinking "I must not have a panic attack." Because the verbal stimulus "panic" is related symmetrically with feelings of anxiety, the privately stated rule serves only to induce more anxiety and perhaps a full-blown panic attack. For the same reason, the rule, "Just say no to drugs and alcohol," is a failing strategy for promoting drug abstinence. The implication for ACT therapy is that the path to behavior change does not involve trying to control thoughts and feelings. Indeed, from an ACT perspective, attempting such control is the problem, not the solution (Hayes & Wilson, 1994). These concepts are explored in Extra Box 1.
THE MATCHING LAW, TERRORISM, AND WHITE NATIONALISM
Some of the choices that are most difficult to understand are those that we find reprehensible. For example, violent terrorist acts that take the lives of innocents are difficult for us to comprehend. What could possess someone to do something so evil? Likewise, why would anyone join a neo-Nazi group and march through the streets of Charlottesville while chanting, "Jews will not replace us!"? If we could predict and influence such choices, we could reduce violence and hate. In her documentary films Jihad and White Right: Meeting the Enemy, the groundbreaking filmmaker Deeyah Khan explores the variables that lead young men to make these choices. In Jihad, Khan interviews young Muslim men who leave their homes in the West to join Jihadi movements in the Middle East. In most cases, these men are struggling to succeed in the West. They are failing to establish friendships and romantic relationships; they are not meeting the expectations of their parents or their culture. Most of their interactions with Westerners left them feeling ashamed and humiliated. In a word, they perceived themselves as failures. Then came their first interactions with a Jihadi recruiter. While no one in the West was giving them any positive reinforcement, the recruiter spent hours online interacting with them, telling them how important they were, describing how they could be part of a life-changing movement, and how, if they died in the movement, they would be handsomely rewarded in the afterlife. The matching law makes clear predictions about the choices these young men are susceptible to making. First, we define BWest as the behavior of interacting with people in the Western world, and RWest as the reinforcers obtained in these pursuits. If BWest is rarely reinforced (i.e., RWest approaches zero), then the matching law predicts that interacting with the Jihadi recruiter (BJihad) will occur if this behavior is richly reinforced with the kind of attention the Jihadi recruiters regularly provide (RJihad); the matching-law equation sketched at the end of this box makes the prediction explicit. These recruiters not only richly reinforce these online interactions, they shape increasingly hardline opinions when the recruits begin to mimic the recruiter's radical ideas. In extreme cases, the most alienated young men (i.e., those least likely to obtain social reinforcers from anyone in the West) leave their homes and join the extremist movement. Consistent with this analysis, Deeyah Khan reported that these young men are not drawn to the movement by religion; they are more loyal to their recruiters than to their religion. These recruits are escaping alienation (i.e., a low value of RWest) in favor of a rich source of social reinforcement (RJihad). In her movie White Right: Meeting the Enemy, Khan found that young white men with a similar profile were drawn to the neo-Nazi movement. Those without friends, gainful employment, or supportive families were drawn to the reinforcers obtained by joining the gang of "brothers" who agree with one another, march with one another, and make headlines with one another. Not only does the matching law predict these outcomes, it tells us how to prevent them. According to the matching law, the answer is to identify those who are failing in life and offer them assistance, friendship, and a source of reinforcement for engaging in socially appropriate behavior. In other words, increase the value of RWest. Could this work?
In her films, Khan documents several cases of white nationalists and Jihadis who come to reject their radical beliefs when they begin interacting with those they hate - when they are befriended by these professed enemies, and when they are accepted even in the face of their hateful speech. Likewise, organizations that work with incarcerated white nationalists (e.g., Life After Hate) report that such undeserved kindness from others can play an important role in turning these individuals away from hate. Of course, case studies fall short of experimental, scientific evidence, but these conversions from hate to love are predicted by the matching law. They suggest we should approach our enemies and shape their behavior with love, not revile them from afar.
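For reference, the prediction described in this box can be stated with the strict form of the matching law, sketched here in the notation defined above (B for rates of behavior, R for rates of reinforcement obtained); this is the standard relative-rate statement of the law, not a quotation from the films:

```latex
% Strict matching law: relative allocation of behavior matches
% relative reinforcement obtained from each alternative.
\frac{B_{\mathrm{Jihad}}}{B_{\mathrm{Jihad}} + B_{\mathrm{West}}}
  = \frac{R_{\mathrm{Jihad}}}{R_{\mathrm{Jihad}} + R_{\mathrm{West}}}
```

As RWest approaches zero, the right side of the equation approaches 1, so nearly all behavior is predicted to flow toward the recruiter; raising RWest shifts the allocation back toward BWest.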
Discrimination
The CS is the ONLY stimulus that evokes the CR. Other stimuli that closely resemble the CS are largely ineffective
What two factors influence how quickly behavior decreases to baseline levels under an operant-extinction contingency?
The first factor is the rate of reinforcement prior to extinction - the higher the rate of reinforcement, the faster extinction will work. The second is the individual's motivation to acquire the reinforcer - the more the reinforcer is needed, the more persistent behavior will be during extinction.
Reinforcer Delay
The final variable that produces exclusive choice is reinforcer delay; that is, how long you have to wait for the reinforcer after having made your choice. For example, if two online retailers offer the headphones you want for $50, but one of them (BL) can ship them twice as quickly as the other one (BR), then you will choose BL because of the difference in reinforcer delay (see Figure 13.4). Once again, many consumers exclusively buy online products from Amazon.com because they ship them to customers faster than other retailers. Delays to reinforcers strongly influence our daily decision-making. Consider that many iPhone users spent large sums of money on a new phone in 2017 when Apple slowed the processing speed of their older phones. Customers on Reddit accused Apple of using behavioral technology (reinforcer delays) to induce customers to buy a new iPhone. Apple denied the charge and offered to fix the cause of the spontaneous shut-downs (older batteries) at a reduced price.
EXPANDING THE VERBAL REPERTOIRE
The horizontal beige arrow in the top panel of Figure 14.3 illustrates prior training in which a child was explicitly taught to point to a horse when they hear "horse." That is, "horse" is the antecedent stimulus (e.g., "show me the horse") and the child points to a horse, standing in a field. The dashed green arrow pointing in the opposite direction shows the emergent (untrained) symmetric relational response. That is, because of prior multiple-exemplar training (Figure 14.2) the child spontaneously (without additional training) is able to say "horse" while looking at a horse. In the same panel of Figure 14.3, the vertical beige arrow shows that the child was subsequently taught to say "horse" when they hear "colt" (e.g., the parent asks, "What is a colt?" and the child replies "A horse."). Once again, the dashed green arrow pointing in the opposite direction reveals the emergent (untrained) symmetric relational response; saying "colt" when they hear "horse." Notice that in the upper panel the child has not been taught to relate "colt" and an actual horse as the same. That is, the child was never asked to "show me the colt" nor to say "colt" while looking at a horse. The blue and orange dashed arrows in the lower panel of Figure 14.3 show these two untrained relating responses. If the child correctly says "colt" when they see a horse in the field (blue arrow) and points to the horse in the field when they hear "colt" (orange arrow), then we might say the child understands what "colt" means. Figure 14.3 makes explicit the relational responding that understanding is composed of. Behavior analysts refer to this understanding as stimulus equivalence. After explicitly teaching a unidirectional relation between three or more arbitrary stimuli (e.g., "horse" → real horse, "colt" → "horse"), symmetric relational responding is demonstrated between all stimuli (real horse → "horse", "horse" → "colt", real horse → "colt", "colt" → real horse). That is, the individual relates all of the stimuli, in many ways, as equivalent to one another. Not surprisingly, only humans have demonstrated stimulus equivalence; not pigeons, not sea lions, not even a colt. In human infants, the ability to demonstrate stimulus equivalence is correlated with the acquisition of language (Peláez et al., 2000). That is, children with greater language proficiency do better on tests of stimulus equivalence. In short, human language and the ability to understand words is fundamentally about relational responding, that is, relating arbitrary stimuli as, in many ways, equivalent to one another.
Duration
The interval of time between the start and the end of the behavior. Important when we are interested in how long a target behavior lasts.
What is the independent variable in therapeutic settings?
The intervention/treatment
Objection 1: Intrinsic Motivation
The natural drive to engage in a behavior because it fosters a sense of competence. (Extrinsic reinforcers: reinforcers that are not automatically obtained by engaging in the behavior; instead, they are artificially arranged.) The objection holds that extrinsic reinforcers undermine intrinsic motivation. However, studies have shown that extrinsic reinforcers do NOT decrease intrinsic motivation to engage in behavior. Extrinsic reinforcers can help people discover automatic reinforcers that they would otherwise never have experienced. Verbal extrinsic reinforcers enhance intrinsic motivation. Tangible extrinsic reinforcers can have a temporary negative impact, but these negative effects can be avoided if the reinforcer comes as a surprise.
Punishment
The process or procedure whereby a punisher decreases the future probability of an operant response. For example, a hunter-gatherer community harnesses the power of punishment when it arranges an IF hoarding → THEN shaming contingency. When parents talk of punishing their child, they rarely consider whether the consequence decreases problem behavior. Instead, the parent is angry and arranges a negative consequence that seems "just." This is not a behavior-analytic approach. Behavior analysts argue that putting the emphasis on the function of the consequence is more humane. The job of a punisher is to decrease the future probability of the problem behavior, not to exact retribution. Defining punishment functionally keeps us connected to one of the core goals of behavior analysis - to discover functional variables that may be used to positively influence behavior.
Conditioned Response (CR)
The response evoked by the conditioned stimulus. This may not be the same as the unconditioned response
Are We Hopelessly Compliant?
The results of the Milgram experiments and Derren Brown's Netflix special are a little depressing. Although their findings make us wonder if humans are hopelessly compliant, this is the wrong question to ask. A better question is, what conditions increase and decrease pliance? Such a question is consistent with the second goal of behavior analysis - to identify functional variables that can influence behavior. We have already identified one functional variable that increases and decreases pliance - if the individual or group giving the instruction controls powerful reinforcers, pliance is likely to occur; if they do not, pliance is less likely. Another pliance-decreasing variable is punishment. Specifically, when pliance produces an easily discriminated, highly negative consequence, pliance decreases (Cerutti, 1991; Galizio, 1979). For example, if a coach instructs a player to cheat and the compliant athlete is disqualified from the competition (an easily discriminated negative consequence of pliance), complying with the coach's unethical instructions is unlikely to happen again. A similar outcome occurred among the Hasidic Jews of Brooklyn. As a measles epidemic spread through this small community (an easily detectable, highly negative consequence), parents began rejecting the rules provided by their rabbi. Because the negative consequences of following anti-vax rules are more difficult to see in other communities (where most children are vaccinated and rates of measles, mumps, and rubella are low), anti-vax pliance continues. To summarize, rules, instructions, and norms are useful. When they accurately describe operant contingencies operating in the world (IF you smoke cigarettes → THEN you are likely to die of cancer), they can literally save our lives. When we follow these adaptive rules, we are tracking. Other instances of rule-following - pliance - occur because of social consequences (IF you sit in that chair, young man → THEN you will be in trouble). In daily life, tracking contingencies are sometimes pitted against pliance contingencies. That is, tracking pulls our behavior in one direction (IF I vaccinate my child → THEN we will avoid measles, mumps, and rubella) and pliance pulls in the opposite direction (IF I vaccinate my child → THEN my close community of anti-vaxxers will shun me). What we choose to do often depends on the choice-influencing variables discussed in Chapter 13 - reinforcer immediacy, frequency, size, and so on.
Functional Analysis of Behavior
The scientific method used to (1) determine if a problem behavior is an operant and (2) identify the reinforcer that maintains that operant. The functional analysis is a brief experiment in which consequences that might be reinforcers are turned ON and OFF, while the effects of these manipulations on problem behavior are recorded. If the problem behavior occurs at a higher rate when one of these consequences is arranged, then we may conclude that the consequence functions as a reinforcer. If no experimenter-controlled consequences function as reinforcers, then either the behavior is maintained by an automatic reinforcer or the problem behavior is not an operant behavior. The results of a functional analysis of behavior are useful. If the problem behavior is not an operant, then an operant-based intervention may not be the right approach (perhaps a Pavlovian intervention, like graduated-exposure therapy, would work better). But if the problem behavior is an operant, then a consequence-based intervention can help to reduce the behavior.
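The decision logic of a functional analysis can be sketched in a few lines of code. The condition names, rates, and the doubling rule below are hypothetical illustrations, not a clinical protocol:

```python
# Minimal sketch of functional-analysis logic: a candidate consequence is
# flagged as a likely reinforcer if problem behavior is substantially more
# frequent when that consequence is turned ON than in a control condition.
# All numbers are hypothetical (responses per minute).
control_rate = 0.4  # consequence OFF (control/play condition)

test_rates = {
    "attention ON": 3.1,
    "escape from demands ON": 0.5,
    "tangible item ON": 0.6,
}

def likely_reinforcers(rates, control, ratio=2.0):
    """Flag consequences whose ON condition at least doubles the control rate."""
    return [name for name, rate in rates.items() if rate >= ratio * control]

print(likely_reinforcers(test_rates, control_rate))
# ['attention ON'] - problem behavior appears to be maintained by attention
```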
Tracking
The second category of rule-following occurs because doing so allows us to access reinforcers that would otherwise be difficult to obtain. For example, if we open the box containing the 59 parts needed to assemble our new FJÄLLBO, but no assembly instructions are found, our chances of success are very low. Likewise, trying to find 3436 West Sierra St. in a new city would be next to impossible without instructions provided by a navigation app, a resident of the city, or our friend who lives at that address. Without instructions, we would drive for hours in a trial-and-error fashion before eventually finding the street or giving up. Tracking may be defined as rule-following occurring because the instructions appear to correctly describe operant contingencies (reinforcement, extinction, or punishment) that operate in the world (Hayes et al., 1989). The words and diagrams in the FJÄLLBO instructions appear to correctly describe how to build the shelving unit. Thus, we follow the instructions to obtain the positive reinforcer (a fully assembled FJÄLLBO). Similarly, broken down on the side of the road we access a YouTube video instructing us how to change a flat tire. Tracking follows because it appears that following the steps described will lead to the specified reinforcer - a changed tire. We track the instructions provided by our mapping app because doing so is reinforced, not by the app, but by the natural contingencies operating in the world - we arrive at our destination. It is important to note that one cannot discriminate between pliance and tracking based on the topography of rule-following. These categories are based on function, not on form. That is, a functional relation exists between pliance and pliance-contingent socially mediated positive and negative reinforcers. Likewise, a functional relation exists between tracking and operant contingencies that naturally operate in the world.
DELAY-EXPOSURE TRAINING
The second method for reducing delay discounting and impulsive choice takes a different approach. Instead of arranging an environment in which impulsive choice is less likely to occur, delay-exposure training teaches the individual by giving them a lot of experience with delayed reinforcers. The logic is simple - if we are used to waiting (we do it all the time) - then waiting for a larger-later reward is nothing out of the ordinary; why not choose the larger-later reward? By contrast, if we are used to getting what we want when we want it, then delays are unusual, aversive, and may signal that we are not going to get what we want. Delay-exposure training has been studied in several different species and several methodologies have been employed. For example, Mazur and Logue (1978) exposed pigeons to delayed reinforcers for several months and found that this experience produced long-lasting reductions in impulsive choice. Similar findings have been reported in laboratory studies with rats and humans (Binder et al., 2000; Eisenberger & Adornetto, 1986; Stein et al., 2015). A form of delay-exposure training has been used in preschools to improve self-control choice and reduce the problem behaviors that can occur when preschoolers don't get what they want when they want it (Luczynski & Fahmie, 2017).
Variable-Ratio Schedules
The second type of ratio schedule of reinforcement is the variable-ratio schedule. Under a variable-ratio (VR) schedule, the number of responses required per reinforcer is not the same every time. Under a VR schedule, the numerical value corresponds to the average number of responses required per reinforcer. For example, under a VR 3, reinforcers are delivered after an average of 3 responses. This is depicted in Figure 11.5. The first reinforcer is delivered after the very first response, but after that, reinforcers are delivered after 6 responses, then 2 responses, and then 3 responses. The average number of responses per reinforcer is 3 (that is, [1 + 6 + 2 + 3]/4 = 3). Under a VR schedule, one cannot predict which response will be reinforced. VR schedules are common in nature. For example, the average number of cheetah attacks per meal (the reinforcer) is about 2; that is, when a cheetah spots prey and initiates an attack, it will succeed about half the time. Thus, its hunting behavior operates under a VR 2 schedule of reinforcement.
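A VR schedule is easy to simulate: draw each response requirement at random from a set whose mean equals the schedule value. The set of possible ratios below is invented for illustration; a minimal sketch:

```python
import random

# Minimal sketch of a VR 3 schedule: the response requirement changes after
# every reinforcer, but the requirements average out to 3, and no single
# response can be predicted to be the reinforced one.
possible_ratios = [1, 2, 3, 6]  # hypothetical requirements; their mean is 3

def average_responses_per_reinforcer(n_reinforcers=10_000):
    requirements = [random.choice(possible_ratios) for _ in range(n_reinforcers)]
    return sum(requirements) / n_reinforcers

print(average_responses_per_reinforcer())  # approximately 3, as in [1 + 6 + 2 + 3]/4 = 3
```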
Reinforcer Size/Quality
The second variable that strongly influences choice is the size or the quality of the reinforcer. Given a choice between a small ($0.05) and a large ($5.00) reinforcer, all else being equal, the larger one will be chosen exclusively (see Figure 13.2). To be sure that this outcome was influenced by the difference in reinforcer size ($0.05 vs. $5.00) and not a preference for the button on the right, we could switch the contingencies so the smaller reinforcer is assigned to the right button (BR) and the $5 reinforcer to the left (BL). When behavior shifts toward BL, we would conclude that reinforcer size strongly influences choice. Because reinforcer quality has the same effect as reinforcer size, we can use choice to answer the question, which reinforcer is qualitatively better? For example, animal-welfare advocates can give farm animals choices between different living arrangements. Such behavioral research has shown that cows prefer quiet settings over noisy ones, a plant-based over a meat-based diet, and that chickens prefer smaller cages than humans might expect (Foster et al., 1996; Lagadic & Faure, 1988). If you are an animal lover, you should know that this work is underappreciated and represents an important area for future growth in behavior analysis (see Abramson & Kieson, 2016 for a discussion of this potential). But for now, just remember that individuals choose larger (higher-quality) reinforcers over smaller (lower-quality) reinforcers.
TACTIC 3: ARRANGE ANTECEDENT STIMULI THAT WILL CUE GENERALIZATION.
There are a variety of ways to arrange antecedents that will cue generalization. One strategy is to take a salient stimulus from the setting to which the behavior needs to generalize and incorporate it into training. For example, a parent might be encouraged to bring their child (a salient stimulus from the home setting) to the next parent-training session. During that session, role-playing of appropriate parenting skills may be practiced while the child is present. Thereafter, the child may function as an SD, evoking good-parenting responses at home, where it counts. Another strategy in this vein is to arrange reminders to emit the good-parenting response. For example, the parent may set an alarm to go off at random times, twice per day. At each alarm, the parent will stop what they are doing and look for an opportunity to demonstrate a good-parenting skill, such as noticing something their child is doing that can be positively reinforced - "I like the way you two are cooperating. Thank you." These three tactics for promoting generalization will also help to maintain the newly acquired adaptive behavior over time. The behaviors that we emit every day of our lives are those that are consistently reinforced by natural contingencies operating in our environments. These natural contingencies are so reliable that our behavior can become habitual. As you may recall from Chapter 9, habitual behavior occurs under tight antecedent stimulus control, even when we have no real motivation to engage in the behavior. If a parent habitually deploys behavior-analytic skills in a compassionate way, that parent is likely to raise children who will do good in the world themselves. Therefore, applied behavior analysts will be tactical in promoting generalization of adaptive skills that are so compatible with the natural contingencies of reinforcement that those skills become habitual.
Why do we use FI Schedules?
There are not many real-world examples of FI schedules, so the reader may wonder why we study them. The answer is that they can tell us about an individual's perception of the passage of time. Time perception can impact our everyday behavior. For example, if we perceive that time is passing very slowly, then we may be less patient when asked to wait - "what's taking so long!". The FI schedule allows us to observe a behavior influenced by time perception. As noted by Ferster and Skinner (1957), if an individual had perfect time perception, they would not waste their efforts by responding prior to the passage of an FI 60-s timer. Instead, they would wait for 60 seconds and then make a single response. Adult humans are capable of this highly efficient response pattern, but nonhuman animals and preverbal infants (who cannot count) are not.
What do all experimental designs have in common?
They all turn the IV on and off.
Ways positive and negative reinforcers are the same
They are both consequences. They both increase behavior above baseline (no-reinforcement) level
Stimulus Preference Assessments
This technique is used when working with nonhuman animals or individuals with limited or no expressive language capabilities. During a stimulus preference assessment, a rank-ordered list of preferred stimuli is obtained by observing choices between those stimuli. That is, the potential reinforcers (referred to as "stimuli") are presented concurrently (at the same time) and the individual chooses the one they like the most. Prior to conducting the stimulus preference assessment, the behavior analyst will identify several stimuli that might function as reinforcers. To narrow the list, parents and caregivers might be asked to complete a reinforcer survey. Preference hierarchy - a list of stimuli rank ordered from most to least preferred. To conduct the stimulus preference assessment described in Table 9.2, all six candies would be arranged on a table and the child would be asked to choose one (and then allowed to eat it). If the first stimulus selected is the salted caramel, that one is placed at the top of the preference hierarchy; it is the most preferred of the stimuli available. Next, the individual gets to choose from the remaining stimuli, with each item chosen placed sequentially in the preference hierarchy. This continues until all of the stimuli are consumed. The assumption underlying this technique is that the stimuli at or near the top of the preference hierarchy will function as the most effective reinforcers.
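The assessment procedure described above amounts to building a rank order from repeated choices among the remaining items. The sketch below is purely illustrative; the candy names and the choose() stand-in for the child's selection are hypothetical:

```python
# Minimal sketch of a stimulus preference assessment without replacement:
# each chosen item is consumed, removed from the array, and placed in the
# next slot of the preference hierarchy.
true_preference = ["salted caramel", "chocolate", "gummy bear",
                   "licorice", "mint", "jawbreaker"]  # hypothetical ordering

def choose(available):
    """Stand-in for the child's choice: returns their most preferred remaining item."""
    return min(available, key=true_preference.index)

def preference_assessment(stimuli):
    remaining = list(stimuli)
    hierarchy = []
    while remaining:
        picked = choose(remaining)   # the child selects (and eats) one item
        hierarchy.append(picked)     # it takes the next slot in the hierarchy
        remaining.remove(picked)
    return hierarchy

print(preference_assessment(["mint", "jawbreaker", "salted caramel",
                             "licorice", "chocolate", "gummy bear"]))
```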
Stimulus-Response Chains
Thus far, we have discussed SDs that evoke reinforced behaviors. For example, a push-notification on our phone evokes a phone-checking behavior because, in our past experience, these notifications signal that checking our phone will be reinforced. What's lost in this summary is that a behavior like phone-checking is composed of a sequence of responses that must be executed in a precise order. When the phone presents the SD, this initiates our reaching for the phone. This response has a consequence - the tactile feel of the phone in our hand. This consequence undoubtedly functions as a reinforcer: IF push-notification (SD) AND reach for phone → THEN phone in hand. But the push notification's interesting news story has not yet been accessed. To see the story, two more responses are needed, and they need to be made in the right order: IF phone in hand (SD) AND unlock phone → THEN news icon visible, followed by IF news icon visible (SD) AND tap icon → THEN news story presented. Note how the consequence of reaching for the phone (having the phone in hand) functions as a reinforcing consequence for that response and as an SD for the next response in the chain. This dual SD/reinforcer functioning happens a second time when the consequence of unlocking the phone (the news icon is visible on the face of the phone) functions both as a reinforcer and as an SD for the next response - tapping the icon. This fixed sequence of operant responses, each evoked by a response-produced SD, is referred to as a stimulus-response chain. These common, everyday stimulus-response chains were acquired many years ago and most of us cannot recall how we learned them. Perhaps someone told us how to do them; perhaps we imitated the behavior of someone else. Whatever their origin, the correct execution of stimulus-response chains is an important part of daily living.
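The dual reinforcer/SD role of each consequence can be made concrete with a small data structure in which each link's consequence becomes the SD for the next link. The link names follow the phone example above; the code itself is only an illustrative sketch:

```python
# Minimal sketch of a stimulus-response chain: each tuple is
# (SD, response, consequence); each consequence reinforces the response that
# produced it and serves as the SD for the next response in the chain.
phone_chain = [
    ("push-notification", "reach for phone", "phone in hand"),
    ("phone in hand",     "unlock phone",    "news icon visible"),
    ("news icon visible", "tap icon",        "news story presented"),
]

def run_chain(chain, start_stimulus):
    stimulus = start_stimulus
    for sd, response, consequence in chain:
        assert stimulus == sd        # the prior consequence now functions as the SD
        print(f"{sd} -> {response} -> {consequence}")
        stimulus = consequence       # ...and as the reinforcer for the response just emitted
    return stimulus

run_chain(phone_chain, "push-notification")
```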
1. Focus on reinforcement first
To effectively reduce a problem behavior, it is helpful to first identify the reinforcer(s) maintaining that behavior. Identifying the reinforcer is important because the efficacy of a punisher is partially determined by the efficacy of the reinforcer maintaining the problem behavior. Strong reinforcers (contingent, large, high-quality, and immediate) need highly effective punishers. A humane approach to punishing behavior will seek to weaken the reinforcer rather than unleash a strong punisher. For example, an abolishing operation (AO) can reduce the efficacy of the reinforcer, which will decrease motivation to engage in the punishable response. In nonhuman experiments, punishing a food-reinforced response (like lever pressing) is more effective when the animal is less food deprived. When the rat's King Kong hunger was reduced (AO), the punisher more effectively decreased the future probability of behavior. To know how strong the reinforcer is, you must identify the reinforcer. Use a functional analysis of behavior: arrange several consequences, one at a time, to see if problem behavior is more frequent when each one is ON. If turning the consequence ON and OFF turns the problem behavior on and off, then that consequence functions as a reinforcer. Once the reinforcer is identified, the behavior analyst will evaluate its efficacy - is it immediate, contingent, and so on? If it is, the "focus on reinforcement first" strategy will look for ways to reduce the strength of the reinforcer. This can be done by delaying the reinforcer or by repeatedly delivering it noncontingently (an AO).
Training Verbal Operants
Typically developing children acquire echoics, mands, tacts, and intraverbals with seeming ease. These verbal operants are useful in acquiring new words (echoics), in recruiting the aid of others (mands), in helping others (tacts), and in interacting socially (intraverbals). Among children with autism and other developmental disabilities, these verbal operants may be limited in number and usage. This not only prevents them from successfully integrating into society, but it also impairs their ability to acquire the reinforcers available to language-able individuals. For example, an inability to address an establishing operation by manding evokes feelings of frustration in the child and their parents, who are helpless to address the unspecified need. The child's frustration can take the form of problem behaviors, such as aggression or self-injury; these could be avoided if only the child could mand (Greer et al., 2016). Behavior analysts have developed successful techniques for teaching verbal operants to children with autism and other developmental disabilities. There are too many of these techniques to list here, so we note just four of the most commonly used techniques, all of which will be familiar to you. Readers wanting more are referred to one of the many excellent review articles on this topic (DeSouza et al., 2017; Heath et al., 2015; Lorah et al., 2015; Sundberg & Sundberg, 2011). The first of the commonly used techniques in teaching verbal operants is contact with the antecedent stimulus. Because verbal operants are influenced by antecedent stimuli, it is important that the individual observe those stimuli (Kisamore et al., 2013). For example, if a child does not hear the verbal stimulus "mama," an echoic response (saying "mama") is impossible. Likewise, if the individual does not see the cow in the field, tacting "cow" cannot happen. The second technique is prompting and fading the correct verbal response (Leaf et al., 2016; Thomas et al., 2010). That is, when the antecedent stimulus does not evoke the correct response, it is prompted; "Are you hungry?" prompts the child to observe internal stimuli that are then tacted as "hungry." When the verbal response is later influenced by the stimulus alone ("I feel hungry mommy"), prompts are faded out. The third technique is shaping (Newman et al., 2009). That is, when an approximation of the correct verbal response is emitted ("hungwy"), it is reinforced. Later, the individual will be asked for a closer approximation to obtain the reinforcer. The final technique is to arrange an effective reinforcer (Hartman & Klatt, 2005), like lunch for a "hungwy" child. Readers are referred to Chapter 9 for a review of how to identify and arrange effective reinforcers.
Variable-Interval Schedules
Under a variable-interval (VI) schedule, the amount of time that must elapse before the first response is reinforced is not the same every time. With a VI schedule, the numerical value of the schedule specifies the average interval of time that separates reinforcer availabilities. For example, under a VI 60-second schedule, the timer that controls reinforcer availability will be set to a different value each time, with the average of all the VI values equal to 60 seconds. This is shown in Figure 11.12. The first VI value is 30 seconds, and the first response after that time elapses is reinforced. Subsequently, reinforcers become available after 70 seconds, 55 seconds, and then 85 seconds. The average of all these intervals is 60 seconds ([30 + 70 + 55 + 85]/4 = VI 60 seconds). Because of the irregular intervals, under a VI schedule it is impossible to predict exactly when the next reinforcer will be available. Whereas FI schedules are rare in our daily lives, VI schedules are much more common. Consider social-media reinforcers. They become available after the passage of time and it's impossible to predict exactly when the next interesting event (a great meme, a retweet, a new follower, or the "liking" of one of our posts) will become available inside the app. In this way, posts that reinforce our social-media checking behavior are available on a VI schedule. The unpredictability of these reinforcer availabilities makes us feel like we are missing out on something good, if only we could check in and see. Many of us check social media too often, doing so even when we are not supposed to - when with a friend who needs us, in a class that requires our attention, or while driving. Alas, we are all beholden to the power of the VI schedule to maintain our constant attention. Those who enjoy hunting and fishing should know that their love of these sports is also a product of the VI schedule of reinforcement. Bagging the deer or catching the fish are the reinforcers that maintain the behavior of the hunter/fisher as they vigilantly search for game or angle for a fish. Increasing the rate of these behaviors will not produce reinforcers any faster; if deer or fish are not present, hunting/fishing behaviors will not be reinforced. Instead, one must wait for the game to arrive and one cannot predict exactly when that will happen. As a result, like the social-media maven who fears missing out, the hunter does not want to leave the field lest they miss out on a trophy buck. We are all beholden to the power of the VI schedule.
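Because reinforcer availability is governed by a timer, responding faster under a VI schedule does little to speed up reinforcement. The interval values and response rates below are hypothetical; a minimal sketch:

```python
import random

# Minimal sketch of a VI 60-second schedule: an unpredictable interval must
# elapse, and then a single response produces the reinforcer.
possible_intervals = [30, 70, 55, 85]  # seconds; hypothetical values with a mean of 60

def seconds_per_reinforcer(responses_per_second, n_reinforcers=10_000):
    total_time = 0.0
    for _ in range(n_reinforcers):
        total_time += random.choice(possible_intervals)  # wait out the timer
        total_time += 1 / responses_per_second           # the next response is reinforced
    return total_time / n_reinforcers

print(seconds_per_reinforcer(0.5))  # about 62 s per reinforcer
print(seconds_per_reinforcer(5.0))  # about 60 s - responding 10 times faster barely helps
```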
Measuring Reinforcer Efficacy
We know that reinforcers are effective if they increase behavior above a baseline level, but we also know that some reinforcers (1 billion dollars) are more effective than others (1 cent). One way to measure reinforcer efficacy is to let the individual choose between them. No doubt, everyone reading this chapter will select 1 billion dollars, so we may conclude that it is the relatively more effective reinforcer. The stimulus preference assessment uses choice as its measure of reinforcer efficacy. Stimuli at the top of the preference hierarchy are judged to be the most effective of the available reinforcers. Another approach to measuring reinforcer efficacy is to evaluate the maximum amount of behavior the reinforcer will maintain, that is, the breakpoint. Figure 9.4 shows the results of an experiment that compared the breakpoints of two stimulant drugs: chlorphentermine and cocaine (Griffiths et al., 1978). Cocaine's breakpoint was about four times higher than chlorphentermine's; that is, the baboons participating in this study worked four times as hard for a cocaine reinforcer as for a chlorphentermine reinforcer. Therefore, we may conclude that cocaine is four times as reinforcing - it maintained four times as much behavior. Breakpoint tells us what choice cannot - it tells us how much more reinforcing one stimulus is than another. Because it is often impractical to actually measure breakpoints in clinical or workplace settings, researchers developed the purchase task. In this task, participants indicate how many reinforcers they would hypothetically purchase if they were available at a range of prices. Under this task, the breakpoint is the highest price at which the individual continues to purchase the reinforcer.
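In a purchase task, the breakpoint falls out of the data directly: it is the highest price at which reported consumption stays above zero. The price/consumption pairs below are hypothetical; a minimal sketch:

```python
# Minimal sketch of extracting a breakpoint from hypothetical purchase-task data:
# keys are prices, values are the number of reinforcers the participant reports
# they would purchase at that price.
purchase_data = {0.50: 10, 1.00: 8, 2.00: 5, 4.00: 2, 8.00: 1, 16.00: 0}

def breakpoint(data):
    purchased = [price for price, amount in data.items() if amount > 0]
    return max(purchased) if purchased else None

print(breakpoint(purchase_data))  # 8.0 - the highest price at which purchasing continues
```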
The Dark Side of Tracking
When Skinner (1969) discussed the many advantages of rules and rule-following, he also discussed their dark side. When given an accurate set of instructions, a code of conduct, or a rule of the road (to name a few), tracking ensues, reinforcers are earned, and punishers avoided. However, there is a price to be paid for this efficiency - rules can suppress variability, and this leaves behavior unprepared for the inevitable changes in contingencies of reinforcement and punishment. To take an extreme example, the behavior of a robotic arm that welds parts together on an auto assembly line strictly follows the rules of its programming: detect object entering welding zone, move left, weld, move right, repeat. This produces amazing safety and efficiency until the contingencies change - an employee accidentally enters the welding zone - and the arm behaves as instructed. Human behavior, established by a simple rule like "press slowly," can produce lockstep tracking that, like the robotic arm's, lacks the variability that prepares it for a contingency change (Hayes et al., 1986; Matthews et al., 1977). Outside the lab, an assembly-line worker may have been given the rule, "Hard work is always rewarded" and tracked the rule in a lockstep fashion for years. While this resulted in regular paychecks and occasional promotions, the employee may have been more successful if a different rule were followed: "Work hard when the boss is around; the rest of the time take it easy." The latter adaptively conserves energy and may reduce injuries on the job. The dark side of tracking is that this form of rule-following constrains behavioral variability. Why explore an alternative behavior if rule-following is working? Recognizing this dark side of tracking, innovators encourage us to "Question everything" (Euripides), "Live dangerously ... Send your ships into uncharted seas!" (Friedrich Nietzsche), or "Don't let your mind set limits that are not really there" (Zadie Smith). As discussed in Chapter 5, artificial intelligence arrives at more effective solutions by sending its ships into uncharted seas, occasionally trying something new, and carefully evaluating if the outcome is superior. So, here's a maxim to live by, "If a rule is working, mostly follow it; but periodically try something different and carefully note the consequences. If that something else works better, do that more often." Not as catchy as Apple's "Think different," but its specificity is inspired by behavioral and computer science.
Evoked response
When a behavior is influenced by a stimulus, that stimulus is said to evoke the response. Evoked responses become evoked through operant or Pavlovian learning; unlike elicited reflexes, they are not unlearned ("natural") responses.
2. Combine punishment with extinction and/or differential reinforcement
When a functional analysis of behavior identifies the reinforcer maintaining problem behavior, that behavior can sometimes be reduced through extinction alone (ensuring that the problem behavior is no longer reinforced) or differential reinforcement of an alternative response (extinction of the problem behavior + reinforcement of an alternative behavior). However, some instances of problem behavior cannot be reduced to a safe level using extinction or differential reinforcement alone. When that is true and the problem behavior puts the client or others at risk of injury or death, then punishment should be used in combination with extinction and/or differential reinforcement. It is the ethical stance of applied and practicing behavior analysts that, when possible, punishment should be combined with reinforcement-based interventions designed to teach an appropriate behavior that can access either the functional reinforcer or a substitute reinforcer. This practice is widely adhered to, for example, when differential reinforcement is combined with punishment. However, it is often impractical or unethical to use differential reinforcement. If, for example, the problem behavior is maintained by a socially unacceptable reinforcer - aggressive behavior reinforced by seeing others suffer - then it would be inappropriate to teach the client an alternative way to access this reinforcer. Instead, it will be necessary to identify a substitute reinforcer.
What Is Choice?
When a judge considers what punishment is appropriate to a crime, an important question is whether the individual who committed the crime chose to do it. If the criminal action was an involuntary response (e.g., an elicited startle-reflex caused harm to another person), then the judge would dismiss the case. For punishment to be imposed, the individual needs to have chosen to commit the crime. Choice may be defined as voluntary behavior occurring in a context in which alternative behaviors are possible. Two parts of this definition are noteworthy. First, choice is voluntary behavior, which is to say choice is not a phylogenetically determined reflex response or a Pavlovian response evoked by a conditioned stimulus (see Chapter 4). We don't choose when to begin digesting a meal, whether to salivate, vomit, or startle; these are involuntary reflexes either elicited by unconditioned stimuli or evoked by conditioned stimuli. Readers should not equate "voluntary behavior" with "willed behavior," that is, a behavior that we feel like we initiate. Instead, by "voluntary" all we mean is nonreflexive. Second, choice occurs in a context in which alternative behaviors are possible. After putting your money into the vending machine, several alternative behaviors are possible - you could press one button to get a bag of chips, another button to select a sleeve of cookies, and another button still to obtain the healthier granola bar. Similarly, when considering which colleges to apply to, deliberations occur in a choice context - one in which alternative behaviors are possible. When you think about it, almost all of our voluntary actions are choices. When walking, consider that we could be running, crawling, or balancing on one leg while sipping tea. Alternative behaviors are almost always possible. You began reading this chapter in a choice context - alternative behaviors were possible. Indeed, alternative behaviors are possible right now. You can always stop reading and engage in literally thousands of other behaviors at any time. But here you are reading. Why? While there are many theories of choice in the psychological and social sciences, we will begin by focusing on four simple variables that strongly influence choice. These variables will be familiar to you, as we discussed them in previous chapters.
TACTIC 1: TEACH BEHAVIORS THAT WILL CONTACT NATURAL CONTINGENCIES OF REINFORCEMENT.
When a new response is acquired in a therapeutic setting, artificial (therapist-provided) reinforcers will be arranged. For example, instructors of parent-training classes will reinforce correct parenting behaviors when they occur during role-playing sessions. For obvious reasons, these artificial reinforcers will not be provided when the parent implements one of their good-parenting skills at home. Therefore, if the skill is to be consistently used at home, it needs to contact a natural contingency of reinforcement. Said less technically, the skill will have to work. If it does, the behavior acquired in a classroom setting will generalize to the home (and beyond) and will be maintained over time. To facilitate contact with these natural contingencies of reinforcement, parent-training instructors will promote generalization by giving parents homework assignments. These assignments ask parents to implement their newly acquired skills at home and collect direct-observation data, so they will see when their skills were effective and when they need further refinement in class. Such assignments put parents in contact with the natural contingencies operating at home. By teaching behaviors that will contact natural contingencies of reinforcement, applied behavior analysts are ensuring that the skills they teach are skills that will serve the client well in daily life.
Elicited Response
When a specific stimulus occasions a specific reflex response, the stimulus elicits the response
Internal validity
When an experiment provides clear evidence that a functional relation exists between the independent variable and behavior change
Milk Let-Down reflex
When an infant suckles at the nipple, the mother will release milk into the infant's mouth.
Occam's Law of Parsimony
When applied to behavior, holds that, all else being equal, the best explanations of behavior are the simplest explanations.
Systematic Behavior Analysis
When behavior analysts conduct an intervention designed to influence behavior, they make sure that it is implemented exactly as it is supposed to be.
Event Recording
When each instance of behavior is recorded at the moment it occurs. Direct observation. Useful for studying frequency or magnitude. Should be used when there is no lasting product of the behavior. Should only be used if the duration of each response is about the same each time.
3. Deliver punishers immediately
In one experiment, participants were given the opportunity to press a button to earn money; their baseline (pre-punishment) level of pressing is represented by the dashed line at the top of the graph. Next, the experimenters introduced a positive-punishment contingency: IF press → THEN electric shock delivered to the fingers. As you can see, when shocks were immediate (0-second delay), button pressing decreased by 80%. However, delaying the shock by as little as 30 seconds greatly reduced the efficacy of the punisher; when delayed by a minute or more, the punisher was ineffective - it produced virtually no reduction in behavior. Similar effects are observed when immediate and delayed negative-punishment contingencies are used and when punishers are delayed in applied/clinical settings. Practical issues often make it impossible to deliver immediate punishment. For example, we might discover that a child has stolen candy from a store several hours after the behavior occurred. Although the most effective punisher would be imposed right after the theft occurred, delayed punishers can still decrease behavior if the contingent relation between the problem behavior (stealing) and the punisher is clarified. This may be accomplished (1) verbally ("because you did ____, your punishment is ____"), (2) by playing an audio or video recording of the inappropriate behavior before it is punished, or (3) by having the individual re-enact the problem behavior just prior to delivering the punisher. While these contingency-clarifying procedures can help, some delays to punishment are undoubtedly too long to deter much problem behavior. For example, how many believers languish in guilt at their lack of "willpower," when a contributing factor to their problem behavior is that the punishment for their sins is too delayed - occurring (as their beliefs suggest) decades later, after they die.
Contingency Management
When it is time to change behavior, to really change it, we draw upon behavior-analytic principles. Paramount among those principles are complex schedules of reinforcement, the building blocks of which are the ratio and interval schedules covered in this chapter. An example of behavior analysts using complex contingencies to address maladaptive human behavior is contingency management of substance-use. Here, complex contingencies of reinforcement are arranged to promote drug abstinence among individuals whose lives have been negatively impacted by illicit drugs. Considerable research has revealed that contingency management is among the most successful approaches to reducing substance-use disorders, including heroin dependence. One approach to contingency management of substance-use employs a VR schedule of reinforcement. Rather than a pigeon pecking a key at a high rate, getting the occasional "win" as in Figure 11.6, drug abstinence is occasionally reinforced when an abstinent patient draws a lucky slip of paper out of a fishbowl and wins a prize. This unpredictable response-based schedule of reinforcement incentivized continuous abstinence because (a) draws from the fishbowl were contingent upon drug-free urine samples, (b) more abstinence yields more reinforcement, and (c) as is true of all VR schedules, the very next instance of abstinence always has a chance of earning the big prize. This VR approach to contingency management reduces cocaine and heroin use significantly more than the other widely used treatment approaches, and these better outcomes are sustained over time. Indeed, the VR approach was so successful, the US Veterans Administration adopted it in their treatment of service-people struggling with substance abuse.
Verbal behavior and emotions
When listening to an engaging podcast filled with tense moments, inviting love stories, or compelling descriptions of the unfairness of others, we experience emotions as if we were in direct contact with these events. Objectively we realize we are just driving and listening to words, but our emotions of fear, tenderness, or anger reveal something else is going on; something not experienced by the family dog, riding along with its head out the window. Our emotional responses to verbal stimuli are such a common experience that we rarely realize how uniquely human it is or wonder why it occurs. As you have probably guessed, the answer has something to do with verbal behavior and stimulus equivalence. The psychological function of verbal stimuli refers to the emotion-evoking function of verbal stimuli, despite those stimuli having never acquired Pavlovian conditioned-stimulus (CS) function. Figure 14.4 will walk us through this definition using an example from Chapter 4. You may recall from that chapter a young girl, Annora, who had a negative experience with a dog and subsequently experienced a debilitating fear of all dogs. The dog that originally attacked her acquired a Pavlovian CS function. That is, the sight of the dog signaled a delay reduction to the unconditioned stimulus (US) event - a vicious attack. Because of Pavlovian learning, the mere sight of the dog (the CS alone) evoked Annora's fear (the conditioned response, CR); see the top panel of Figure 14.4. Because Annora had previously learned to relate the auditory stimulus "dog" symmetrically with actual dogs, this verbal stimulus readily evoked a fear response. For example, when her older brother tormented her by yelling "dog!", Annora displayed fear despite no dog being present. When Annora learned that "perro" means "dog" in Spanish (vertical arrows in Figure 14.4), this auditory stimulus was readily added to form a three-member stimulus equivalence class composed of "dog," "perro," and actual dogs. Because the three stimuli are treated, in many ways, as equivalent to one another, when her brother yelled "perro!", Annora displayed a fear response, despite "perro" never having signaled a delay-reduction to a canine attack (Dougher et al., 1994). As you might predict, when Annora's fear response was extinguished through Pavlovian extinction-based therapy (see Chapter 4 for details), the verbal stimuli related as equivalent with dogs also lost their ability to evoke Annora's fear response. Bad news for her tormenting brother, but good news for Annora, who, to this day, has no lingering fear of actual dogs, "dogs," or "perros."
Shaping Principle 6
When mastery is achieved, the red dotted line in Figure 8.4 will be shifted a little further to the right. Clicking at this slightly faster rate will be reinforced, but slower clicking will not. If the individual struggles to obtain reinforcers at this next level, Principle 6 instructs us to lower the criterion for reinforcement. The new criterion has asked for too much from this learner; the sequence of successive approximations will need to proceed more gradually. Plants vs. Zombies follows Principle 6 by constantly monitoring the player's performance during the initial levels. If the player is clicking things too slowly, the game lowers the criterion for reinforcement a little. Zombies appear a bit more slowly, and the level is extended in duration to give the player more practice with the new contingency. By lowering the requirement for reinforcement, the game keeps the reinforcers flowing and prevents the player from encountering extinction. It also gives the player more practice, so they can continue to advance toward the terminal behavior. Keeping the game just a little bit challenging keeps players in flow.
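A minimal sketch of Principle 6's adjust-the-criterion logic, assuming hypothetical session measures and step sizes (Python; illustrative only, not the game's actual code):

    # Principle 6 sketch: raise the criterion after mastery; lower it if the learner
    # stops earning reinforcers at the new level (the criterion asked for too much).
    def update_criterion(criterion, mastery_reached, reinforcers_last_session,
                         step_up=2, step_down=1):
        if mastery_reached:
            return criterion + step_up               # shift the criterion a little further
        if reinforcers_last_session == 0:
            return max(1, criterion - step_down)     # proceed more gradually; keep reinforcers flowing
        return criterion                             # otherwise, keep practicing at this level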
Extinction-Induced Resurgence
When one operant behavior is extinguished, other (different) behaviors that were previously reinforced are emitted again; that is, they become "resurgent." In one study of infant care, resurgence of previously successful behaviors was a good thing - it helped the caretakers soothe a crying infant. Individuals who have recovered from a substance-use disorder often relapse to drug use when they lose a significant source of positive reinforcement. For example, an abstinent person with alcohol-use disorder is at risk of resurgent drinking if they lose their job or their spouse - both are sources of significant positive reinforcers.
reification fallacy
When people treat an abstract belief or hypothetical concept as if it represented a concrete event or physical entity. Examples: treating an IQ score as an actual measure of intelligence; treating the concept of race as a concrete entity (even though some genetic attributes exist). Reification fallacies often involve circular logic.
Social validity (of a behavioral definition)
When the consumer of the intervention or an expert in the field indicates that the behavioral definition accurately reflects the behavior of interest. Should be assessed before the study begins.
Differential Reinforcement of Other Behavior (DRO)
When this procedure is used, reinforcement is provided contingent upon abstaining from the problem behavior for a specified interval of time; presumably while "other behavior" is occurring. By setting a short time interval (e.g., IF the individual abstains from the problem behavior for 5 seconds → THEN the reinforcer will be delivered), the DRO procedure can arrange a high rate of therapeutic reinforcement, which can effectively compete with the reinforcement contingency maintaining problem behavior. As the patient succeeds in abstaining from the problem behavior, the contingency can be modified to require gradually longer intervals before the therapeutic reinforcer is delivered again. This strategy can be effective, although it does not teach a specific activity to replace the problem behavior, and increasing intervals between reinforcers can produce resurgence of problem behavior.
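A minimal sketch of a resetting DRO timer of the kind described above, assuming hypothetical callbacks for detecting the problem behavior and delivering the reinforcer (Python; a simplified polling loop, not a clinical implementation):

    import time

    def run_dro(problem_behavior_occurred, deliver_reinforcer,
                interval_s=5.0, growth=1.2, session_s=600.0):
        """Resetting DRO: reinforce abstaining from the problem behavior for a full interval.

        Any occurrence of the problem behavior restarts the clock; after each earned
        reinforcer, the required interval is lengthened a little.
        """
        session_start = time.monotonic()
        interval_start = session_start
        while time.monotonic() - session_start < session_s:
            if problem_behavior_occurred():          # hypothetical non-blocking check
                interval_start = time.monotonic()    # problem behavior resets the DRO interval
            elif time.monotonic() - interval_start >= interval_s:
                deliver_reinforcer()                 # abstained for the whole interval
                interval_s *= growth                 # gradually require longer abstinence
                interval_start = time.monotonic()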
Mentalistic Explanations of Behavior
When we explain one behavior by appealing to a second, private behavior (e.g., "I felt..."), we are appealing to private sensations as causes of behavior. Behavior analysts reject mentalistic explanations of behavior.
How do behavior analysts positively influence behavior?
When maladaptive behavior is predicted, a behavior analyst can use knowledge of functional variables to increase the probability that the individual will choose a more adaptive behavior.
Prompting and Fading
When working with children who have limited verbal and nonverbal skills, showing them an Elmo icon will not automatically evoke an icon-touching response. If we are to establish the Elmo button as an SD that evokes this response, then we will need the response to occur and to be reinforced. When Emily did not spontaneously press the Elmo button, King et al. (2014) used a procedure known as prompting and fading. A prompt is an antecedent stimulus that facilitates or guides the desired response when it is not happening under appropriate discriminative-stimulus control. A variety of prompts are possible; we will mention three: physical, stimulus, and modeling prompts. For Emily, a physical prompt was used - her finger was gently guided to the Elmo icon. When she pressed it, Emily was allowed to play with the Elmo phone. Later, when Emily wanted to play with the Elmo phone again, the physical prompt was withheld for a few seconds, to allow her to press the icon independently. If that did not happen, then the behavior analyst offered the minimum amount of physical prompting needed, always giving Emily the opportunity to finish the response herself. The gradual removal of a prompt as the response is increasingly emitted under discriminative-stimulus control is called fading. A second prompting strategy is to arrange a stimulus prompt that provides a "hint" at the desired response. An example is provided in Figure 12.6. Here the Elmo icon on the tablet computer is much larger than the other icons and is framed in red to make it stand out. As Emily correctly presses the Elmo icon, we can fade out this stimulus prompt by removing the red frame and gradually reducing the size of the icon, making sure that Emily's responding remains accurate throughout. Some evidence suggests that these stimulus prompts are more effective than physical (response-guiding) prompts. The third type of prompt that behavior analysts have explored is the modeling prompt. That is, the therapist clearly demonstrates how to make the desired response while the client watches. Video modeling has proven to be an effective way to prompt the desired response under SD control (Bellini & Akullian, 2007; Shipley-Benamou et al., 2002). For example, Emily could watch a video of another child touching the Elmo icon and then getting to play with the Elmo phone. Prompting the desired behavior and gradually fading the prompt as the learner gains independence is an effective strategy in establishing antecedent stimulus control of adaptive behaviors (Cengher et al., 2018).
Contingent relation between response and consequence
When you press the elevator button (response), the elevator begins to operate (consequence). If you don't press the elevator button, the elevator will not operate. When the consequence is ON, there is a contingent relation between response and consequence: IF the behavior occurred, THEN the consequence followed. When the consequence is OFF, there is no response-contingent relation. In short, a contingent relation describes the causal (IF → THEN) relation between an operant behavior and its consequence.
Reinforcer
Whether positive or negative, reinforcers increase operant behavior above its no-reinforcer baseline level.
Phylogenetic and Pavlovian Stimulus Control
You will recall that some of our behaviors are phylogenetic reflexes elicited by antecedent stimuli. That is, an unconditioned stimulus (US) elicits the unconditioned response (UR). The object placed in an infant's hand is an antecedent US that elicits the palmar grasp reflex (UR). A loss of support to the infant's body (US) elicits the Moro reflex (arms out, palms forward, fingers at the ready, should there be something to grasp onto; UR). These unlearned reflexes help infants stay alive, but only if they do them at the right time. An infant that made Moro-reflex responses at random times would be ill equipped to survive. Natural selection has prepared our species to react reflexively to important antecedent stimuli, like a loss of support. The other functional antecedent stimulus that you already learned about is the Pavlovian conditioned stimulus (CS). After the individual learns that the CS signals a delay reduction to the US ("The US is coming! The US is coming!"), this antecedent stimulus will evoke a conditioned response (CR).
Partial Interval Recording
a direct-observation method used to estimate how frequently behavior occurs. Observers record whether or not the behavior occurs during any portion of each interval in a series of contiguous intervals. If the behavior occurs at all (one or more times) during an interval, it is scored as a positive interval; if it does not occur, it is scored as a negative interval.
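A small worked example of scoring partial-interval data, assuming hypothetical observation times and a 10-second interval length (Python):

    def partial_interval_summary(event_times_s, session_s, interval_s=10.0):
        """Return the percentage of contiguous intervals scored positive.

        An interval is positive if the behavior occurred at all (1 or more times)
        during any portion of it; otherwise it is negative.
        """
        n_intervals = int(session_s // interval_s)
        positive = {int(t // interval_s) for t in event_times_s if t < n_intervals * interval_s}
        return 100.0 * len(positive) / n_intervals

    # Behavior observed at 3 s, 4 s, and 41 s of a 60-s session, scored in 10-s intervals:
    # the first and fifth intervals are positive, so 2 of 6 intervals (about 33%) are positive.
    print(partial_interval_summary([3, 4, 41], session_s=60, interval_s=10))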
Typical Pattern of Responding Under an FR Schedule of Reinforcement
a post-reinforcement pause followed by a high, constant rate of responding that ends with a reinforcer. That is, following each reinforcer, the pigeon whose record is shown takes a break. Some of these breaks are brief, others are longer. When the pigeon resumes working, pecking occurs at a high rate, with no interruptions. This response pattern is called a "break-and-run" pattern because the subject takes a break before the response "run" to the reinforcer. Perhaps you can empathize with a rat, pigeon, or monkey that decides to take a break before starting into the next FR run.
Typical FI response pattern of nonhuman subjects
a post-reinforcement pause gives way to an accelerating response rate that terminates with a reinforcer. This curved pattern of an accelerating response rate is referred to as a "scallop" because it resembles the curved shell of a sea scallop. If the post-reinforcement pause lasted exactly 120 seconds, then the pigeon would not waste a single response - the pigeon would make one well-timed response, and it would be reinforced every time. However, the nonverbal pigeon is incapable of such precise timing. It cannot silently count, "one one-thousand, two one-thousand," and so on. Instead, it waits until the time since the last reinforcer is perceived as resembling prior intervals that have separated reinforcer deliveries. As the waiting time is perceived as increasingly similar to these prior intervals, the pigeon's response rate increases. In this way, scalloped response patterns inform us of the pigeon's perception of time. Ferster and Skinner (1957) found that rats and pigeons had scalloped response patterns under FI schedules. Similarly, human preverbal infants have scalloped FI performances, but as they age and acquire language (and the ability to count) their response patterns shift toward those of adult humans - either very efficiently responding once after the FI timer elapses or not pausing at all after each reinforcer. Why adult humans sometimes respond at a high rate on FI schedules is unclear at this time. Perhaps they perceive that the experimenter wants them to keep responding and not take a break, or perhaps they never notice that regular time intervals separate the delivery of each reinforcer. Whatever the reason, it is clear that verbal humans behave differently than nonhuman animals under FI schedules of reinforcement.
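The FI contingency itself can be stated as a simple rule. A minimal sketch, assuming hypothetical callbacks for detecting a response and delivering the reinforcer (Python; FI 120 s to match the example above):

    import time

    def fixed_interval_session(response_made, deliver_reinforcer,
                               fi_s=120.0, session_s=1800.0):
        """FI contingency: only the first response after fi_s seconds have elapsed is reinforced.

        Responses emitted before the interval elapses have no programmed consequence,
        no matter how many are made or how rapidly.
        """
        session_start = time.monotonic()
        interval_start = session_start
        while time.monotonic() - session_start < session_s:
            if response_made() and time.monotonic() - interval_start >= fi_s:
                deliver_reinforcer()
                interval_start = time.monotonic()    # the next fixed interval begins now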
Antecedent Stimulus
an observable stimulus that is present before the behavior occurs. The ringtone was present just before we sprang into action, trying to silence its embarrassing sound. Any stimulus that precedes behavior is an antecedent stimulus. But in behavior analysis, we are really only interested in those that have a behavioral function. Said another way, we only care about those stimuli that influence behavior. This functional focus is consistent with the two goals of our science. If we want to accurately predict behavior, then it would be nice to know which antecedent stimuli increase and which decrease behavior. Such antecedent stimuli are functional variables that may be used by behavior analysts to positively influence behavior.
Extinction-Induced Emotional Behavior
emotional responses induced by extinction - anger, frustration, violence. Long-term extinction contingencies placed on behaviors that previously produced important reinforcers can lead to debilitating emotions such as depression (e.g., long-term unemployment → depression).
Discriminated Operant Behavior
operant behavior that is systematically influenced by antecedent stimuli. Returning to an earlier example, walking through a busy intersection is operant behavior. It is not phylogenetically elicited by the walk sign, nor do we begin walking because we have learned that the sign signals a delay reduction to some unspecified US. Instead, walking is operant behavior because it is influenced by an antecedent stimulus (the walk sign) and it is influenced by consequences; two of them actually: (1) we get to the other side of the intersection and (2) we avoid being hit by a car. To understand why antecedent stimuli influence operant behavior, answer this question: Is walking through an intersection always reinforced with these two consequences? If you are having a hard time seeing the point of such a question, imagine an intersection in New York City with hundreds of cars passing through. If, upon arrival, you immediately walk into the intersection, will the consequences be (1) getting to the other side and (2) avoiding being hit by a car? No. Walking into the intersection as soon as you arrive is a good way to get yelled at, honked at, and possibly killed. Therefore, it is very important that our street-crossing be discriminated operant behavior; that is, operant behavior that is systematically influenced by antecedent stimuli. When a walk sign is lit, pedestrians will walk through the busy intersection - their discriminated operant street-crossing behavior is systematically influenced by this antecedent stimulus.
Reinforcement
refers to the process or procedure whereby a reinforcer increases operant behavior above its baseline level; in other words, the whole process of the consequence increasing the behavior above baseline.
Dependent variable
the objectively measured target behavior
Differential Reinforcement of Alternative Behavior (DRA)
works like the other differential-reinforcement procedures; the only difference is that the reinforced response can be any adaptive behavior (it need not be topographically incompatible with the problem behavior, as in DRI). A teacher who calls on children when they raise their hand appropriately, and ignores children when they yell out the answer, is using DRA to encourage appropriate classroom behavior.