Lecture 1: Ch 5: Intro to Operant Conditioning
What is the behavior systems theory?
1) According to behavior systems theory, when an animal is food deprived and is in a situation where it might encounter food, its feeding system becomes activated, and it begins to engage in foraging and other food-related activities.
What is the activity deficit hypothesis? What is the attention deficit hypothesis? What is stimulus relations in conditioning? What are Safety-Signal feedback cues?
1) According to the activity deficit hypothesis, animals in Group Y show a learning deficit following exposure to inescapable shock because inescapable shocks encourage animals to become inactive or freeze. As we discussed in Chapter 3, freezing is a common response to fear. 2) According to the attention deficit hypothesis, exposure to inescapable shock reduces the extent to which animals pay attention to their own behavior, and that is why these animals show a learning deficit. 3) What is it about the ability to make an escape response that makes exposure to shock less debilitating? The act of performing any skeletal response also provides internal sensory feedback cues. For example, you can feel that you are raising your hand even if your eyes are closed. Because of these response-produced internal cues, you don't have to see your arm go up to know that you are raising your arm. 4) Safety-signal feedback cues are cues that signal the animal that this action is safe, or this chamber is safe. However, no such safety signals exist for animals given yoked, inescapable shock because for them, shocks and shock-free periods are not predictable. Therefore, contextual cues of the chamber in which shocks are delivered are more likely to become conditioned to elicit fear with inescapable shock.
What are the basic phenomena in operant conditioning? Explain each.
1) Acquisition, Extinction, and Spontaneous Recovery. -Step 1: Acquisition: When we get a reward, the reward increases the likelihood of our response, so our rate of responding increases to a maximum point. -Step 2: Extinction: Then, when we take away the reward for that response (like when we stop giving the cat food when it pushes the lever), you get an initial increase in that response, then it goes down. -Step 3: Spontaneous recovery: Then you give the animal some rest, and even though you are not rewarding it, there is some spontaneous recovery of that response.
What is an appetitive stimulus? What is an aversive stimulus?
1) An appetitive stimulus is a pleasant or satisfying stimulus that can be used to positively reinforce an instrumental response. 2) An aversive stimulus is an unpleasant or annoying stimulus that can be used to punish an instrumental response.
What is differential reinforcement of other behavior?
1) An instrumental conditioning procedure in which a positive reinforcer is periodically delivered only if the participant does something other than the target response.
What is the behavioral regulation mechanism for operant or instrumental conditioning?
1) Behavioral Regulation Mechanism: -ENVIRONMENTAL OUTPUT (such as a bad influence of friends or the home environment) contributes to/influences the PROPERTIES OF THE INDIVIDUAL (like their characteristics, feelings, determination, laziness, etc.), which leads to some sort of BEHAVIORAL OUTPUT (like not doing well in school), which leads to the CONSEQUENCE OF THE ACTION (not doing well in school could lead to dropping out of school), which then feeds back to influence the PROPERTIES OF THE INDIVIDUAL, so in the future the individual could behave differently. -The CONSEQUENCE that is produced by the BEHAVIORAL OUTPUT is the focus: the CONSEQUENCE contributes to the CHANGE in us, CONDITIONING us to that change, so in the future we function/operate differently.
How is instrumental conditioning used to produce new behavior from old behavior? How is instrumental conditioning used to produce responses unlike anything the trainee ever did before?
1) By teaching the rat how to combine familiar responses into a new activity (it knows how to sniff and how to get up on its hind legs; combine these to press a lever). 2) By variability: the creation of new responses by shaping depends on the inherent variability of behavior.
What are the characteristics of extinction? Give an example. According to Walter, what is some practical advice?
1) Characteristics of extinction: -An initial increase in response rate, then an initial increase in variability, then an initial increase in emotionality, then a decrease in responding. -Picture someone who puts money in a vending machine, pushes the button, and nothing happens; they keep pushing, and nothing happens (initial increase in rate). -Then they start pushing different buttons (increase in variability). -Then they start cursing and hitting the machine (increase in emotionality). -Then they eventually just give up and stop responding. 2) To praise and ignore.
What is clicker training? Give example. Who used this to shape animal behavior?
1) Clicker training is when the reinforcer is paired with the sound of the clicker, so the clicker becomes a CONDITIONED reinforcer. -You click during the desired behavior and immediately deliver the reinforcer, so the reinforcer COMES WITH the clicker. Example: Provide a stimulus paired with the reward; the stimulus could be saying "good dog," and it comes with food, so the dog starts treating "good dog" as a reward too, called a secondary reward. Once the clicker sound is paired with the food, you don't need food for the dog to do something. 2) Karen Pryor
What is contiguity? What is superstitious behavior and what was Skinners experiment? What is adventitious reinforcement?
1) Contiguity is the occurrence of two events, such as a response and a reinforcer, at the same time or very close together in time. 2) Superstitious behavior is behavior that increases in frequency because of accidental pairings of the delivery of a reinforcer with occurrences of the behavior. -Skinner placed pigeons in separate experimental chambers and set the equipment to deliver a bit of food every 15 seconds irrespective of what the pigeons were doing. The birds were not required to peck a key or perform any other response to get the food. After some time, Skinner returned to see what his birds were doing. He described some of what he saw as follows: In six out of eight cases the resulting responses were so clearly defined that two observers could agree perfectly in counting instances. One bird was conditioned to turn counterclockwise about the cage, making two or three turns between reinforcements. Another repeatedly thrust its head into one of the upper corners of the cage. A third developed a "tossing" response, as if placing its head beneath an invisible bar and lifting it repeatedly. 3) Adventitious reinforcement refers to the accidental pairing of a response with delivery of the reinforcer.
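Skinner's fixed-time procedure is mechanical enough to mimic in a few lines. Below is a minimal Python sketch of adventitious reinforcement, assuming a toy "rich-get-richer" model of behavior; the behavior names, probabilities, and weight update are invented for illustration, not taken from the lecture or textbook.

```python
import random

def superstition(seconds=600, interval=15, seed=42):
    """Fixed-time schedule: food arrives every `interval` seconds regardless
    of behavior. Whatever behavior happens to precede food is 'strengthened'
    (its future probability goes up), illustrating adventitious reinforcement."""
    random.seed(seed)
    behaviors = ["turn", "head-thrust", "toss", "peck-floor"]
    weights = {b: 1.0 for b in behaviors}
    for t in range(1, seconds + 1):
        # The pigeon emits one behavior this second, sampled by current strength.
        current = random.choices(behaviors, weights=[weights[b] for b in behaviors])[0]
        if t % interval == 0:
            weights[current] += 1.0  # accidental response-reinforcer pairing
    return weights

print(superstition())  # one behavior typically snowballs, like Skinner's pigeons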
What was Thorndike's experiment? What is his law of effect?
1) Experiment: -Thorndike had a puzzle box and put a cat in the puzzle box. The cat had to figure out how to get out of the box and get to the cat food. The way out of the box could be a latch, lever, etc. -Thorndike recorded the latency for the cat to get out of the box. The longest latency (in trial 1) was 160 seconds, and the shortest latency (in trial 23) was 6 seconds. This shows that over successive trials, the latency for the cat to escape from the box decreased. -As a result, when the cat did this, the underlying learning was a stimulus-response connection between the stimuli in the environment (being in the puzzle box) and the response the cat made. 2) Thorndike's law of effect is: -If a response R in the presence of a stimulus S is followed by a satisfying event, the association between the stimulus S and response R becomes strengthened. If response R is followed by an annoying event, the association is weakened. a) Behavior followed by a pleasant state of affairs will be repeated (the cat likes cat food, so the cat repeatedly pressed the lever to get food). b) Behavior followed by an unpleasant state of affairs will become less likely. c) Rewards create/stamp in a stimulus-response connection.
What are the two types of social interactions? Give examples of each.
1) Reciprocal (returning the favor) social interactions: -The daughter throws a tantrum because she wants candy, dad says no, she screams even louder, dad gives her candy, she stops screaming. This is a reciprocal exchange because dad gave in and gave her candy: the daughter's tantrum is positively reinforced with candy, and dad's giving in is NEGATIVELY REINFORCED because the screaming stops. -Walter is very disruptive in class. In return, he gets attention from the kids and attention from the teacher (sources of reward). The teacher thought that by yelling she was delivering positive punishment, but it actually worked as positive reinforcement because Walter loved the attention, creating a reciprocal interaction. 2) Non-reciprocal (do not return the favor) social interactions: -The daughter throws a tantrum; dad ignores her, waits for her to stop, and DOES NOT give her candy. -Walter is disruptive; the teacher whispers to him to stop so as not to disrupt the class, NOT GIVING HIM THE REWARD.
What are free-operant procedures? Who invented this procedure? What is an example of a typical skinner box? What is an operant response?
1) Free-operant procedures allow the animal to repeat the instrumental response over and over again without constraint and without being taken out of the apparatus until the end of the experimental session. 2) Skinner. 3) Like we did in lab: a typical Skinner box is a small chamber that contains a lever that the rat can push down repeatedly. The chamber also has a reinforcer, like food. Every time the rat presses the lever, food is delivered. 4) An operant response, such as lever pressing, is a response defined by the effect it produces in the environment, such as the delivery of food. It does not matter how we press the lever (could be with hands, tail, etc.); as long as it is pressed, it results in food delivery.
How does quantity and quality of a reinforcer affect positive reinforcement?
1) If a reinforcer is very small and of poor quality, it will not increase instrumental responding. Indeed, studies conducted in straight-alley runways generally show faster running with larger and more palatable reinforcers.
What is instrumental behavior? Who began Lab and Theoretical analyses of instrumental conditioning?
1) Instrumental behavior is an activity that occurs because it is effective in producing a particular consequence or reinforcer. 2) Thorndike
Courseworks: CH 5 Study Questions: 1. What is instrumental or operant conditioning? 2. How did Thorndike study instrumental learning? 3. Explain the law of effect. 4. What is the distinction between discrete trial and free-operant procedures? 5. Explain shaping by successive approximations. 6. Give an example of shaping. 7. Define and give examples of the 4 types of consequences. 8. What is differential reinforcement of other behavior? How could you use this idea in an elementary school classroom? 9. How could you use Neuringer's research findings about variability to design a more creative classroom experience? 10. Why is variability in behavior important?
1) Instrumental or operant conditioning is the study of how the consequences of behavior affect the future occurrence of that behavior. 2) Thorndike studied instrumental learning by looking at how a stimulus produces a response. He built the puzzle box, put an animal like a cat in it, and if the cat could get out of the box, it got food. Thorndike did this over multiple trials and recorded the latency for the cat to get out of the box on each trial. Over successive trials, the latency decreased. 3) Thorndike's law of effect is that if a response R in the presence of a stimulus S produces a satisfying event, the relation between the stimulus and the response is strengthened. For example, if pressing a lever (response R) while in the puzzle box (stimulus S) results in a satisfying event (the box opens so the cat can get food), the S-R relationship is strengthened. However, behavior followed by an unpleasant state of affairs becomes less likely, so if response R in the presence of stimulus S results in an unsatisfying event, the S-R relationship is weakened. -Rewards stamp in the stimulus-response relation. 4) Discrete-trial procedures are, for example, mazes, where once the required response is complete, the animal is removed from the apparatus: put the animal in the maze, wait until the animal finds the food, record the time, take the animal out of the maze. -Free-operant procedures, on the other hand, are when animals can perform the instrumental response without constraint for as long as they like for the duration of the experimental period. For example, a rat can press a lever as often as it wants for the experimental duration. 5) Shaping an animal is when you first determine the desired final response, then determine how far away the animal is from that response, then determine how to get the animal from the initial response to the final response. -To use successive approximations, you would first enter the preliminary phase, in which the animal associates a sound with the delivery of food; for example, the sound of the dipper results in the delivery of a milk drop. -Then you enter the training phase. In this phase, you want to get the animal to press the lever, which results in the machine sound and a drop of milk. To do this, you first reinforce the animal as it approaches the lever, then reinforce the animal as it touches the lever, then reinforce the animal as it presses the lever. Now the animal is performing the desired response via shaping. 6) An example of shaping could be teaching a child to read. First, you reward the child for sounding out letters. Then you reward the child for sounding out words. Then you reward the child for sounding out sentences. Now the child knows how to read a book. 7) The 4 types of consequences are positive reinforcement, which introduces an appetitive stimulus; negative reinforcement, which removes or prevents an aversive stimulus; positive punishment, which introduces an aversive stimulus; and negative punishment, which removes or prevents an appetitive stimulus. Examples: -Positive reinforcement: You do your homework, you get candy. -Negative reinforcement: It is raining and you don't want to get wet, so you open an umbrella, preventing the rain from hitting you. -Positive punishment: You talk back to me, I scream at you and say STOP! -Negative punishment: You misbehave, so you can't play with your friend but have a time-out and have to sit in the corner with no access to toys or books. 8) Differential reinforcement of other behavior is when you reinforce EVERY BEHAVIOR EXCEPT the behavior you are trying to get rid of.
-To apply this in an elementary school classroom, say there is a kid who is always screaming and acting out. Every time the kid is quiet, doing homework, on task, listening, etc., you reinforce the kid and say good job, well done, etc. However, when the kid is screaming and acting out, you ignore it. 9) Neuringer found that positively reinforcing variability increases response variability. For example, if a big triangle is reinforced on one trial and only a different (say, small) triangle is reinforced on the next, requiring variation increases the variation in responses. -To apply this in designing a more creative classroom, take a math problem like 4*4=16 and ask: how else could you use the number 4 and other numbers to make 16? One reinforced answer could be 4 squared, another could be 12+4, etc. This builds creativity in how to reach the number 16 using the number 4. 10) Variability in behavior is important because it teaches you that there are many responses you could perform to reach the same result, and it increases creativity and thinking.
What is avoidance? How is response variability increased? How were Thorndike and Skinner wrong in their theories regarding variability? What is belongingness and who came up with the term? What is instinctive drift?
1) Avoidance is when an instrumental response prevents the delivery of an aversive stimulus. 2) Response variability is increased if the instrumental reinforcement procedure requires variable behavior. 3) Thorndike and Skinner argued that responding becomes more stereotyped with continued instrumental conditioning and that this outcome was inevitable. However, new response forms can ALSO be produced by instrumental conditioning if response variation is a requirement of reinforcement. 4) Proposed by Thorndike, belongingness is when an organism's evolutionary history makes certain responses fit, or belong, with certain reinforcers. For example, during mating season, male fish fight other male fish and court female fish. A group of male fish was required to bite a rod to obtain a reinforcer. When the reinforcer was the sight of a male fish, biting behavior increased. When it was a female fish, biting behavior did not increase. However, female fish presentation did reinforce behavior like swimming through a ring. So biting behavior belongs with territorial defense and can be reinforced by the presentation of a rival male. 5) Instinctive drift is when there is a gradual drift away from the responses required for reinforcement because animals instead perform the activities they would instinctively perform when obtaining food, like moving food along the ground. Natural food-related responses compete with the training procedures.
What is learned-helplessness?
1) Interference with the learning of new instrumental responses as a result of exposure to inescapable and unavoidable aversive stimulation.
What is negative reinforcement? Give example. What is negative punishment? Give example.
1) Negative reinforcement is when something already present (like wet hands) is removed as a result of a behavior (drying hands on a towel), leading to an INCREASE in the behavior of drying hands on a towel when they are wet. The instrumental response turns off an aversive stimulus. -Another example is brushing teeth. The more you brush your teeth, the less bad breath and the fewer cavities you have, leading to an increase in the frequency of teeth brushing. -Basically, by doing something, something is taken away, making the BEHAVIOR MORE LIKELY IN THE FUTURE. The thing taken away is BAD (aversive). 2) Negative punishment is when something you have access to is removed, leading to a DECREASE in that behavior. For example, you are playing with toys and start fighting with your sister. As a result, your mom takes the toys away. So in the future, you fight with your sister less. The instrumental response results in the removal of an appetitive stimulus. -Basically, your action causes the removal of something you LIKE, leading to a DECREASE of that action in the future.
What does operant/instrumental conditioning mean?
1) Operant/instrumental conditioning is the study of how the consequences of behavior affect the future occurrence of that behavior. -Responses are studied in terms of how they operate on, or are instrumental in, changing the environment.
What is a positive and negative correlation? give examples of each.
1) Positive correlation: The more you do something, the more you get. For example, if Brussels sprouts are a REWARD, then "the more you take notes, the more Brussels sprouts you get" is a POSITIVE correlation. 2) Negative correlation: The more you do it, the less you get. The consequence of your action is to take something away. For example, you are at a subway stop and a train goes by. You cover your ears. As a result of covering your ears, the loud sound is taken away. The more you cover your ears, the less sound you get.
What is a positive reinforcement? Give example. What is positive punishment? Give example.
1) Positive reinforcement is when an action (like listening to the teacher) leads to a reward (like getting a sticker) that INCREASES the frequency of the response (you listen to the teacher more). The instrumental response produces an appetitive stimulus. -Basically, you do something, you get something you want, increasing the frequency of you doing that thing more. The MORE you do something (like listen), the MORE REWARD you get (like stickers), leading you to listen more. 2) Positive punishment is when an action (like a child fighting) leads to a punishment (like detention) that DECREASES the frequency of the response (you fight less). The instrumental response produces an aversive stimulus. -Basically, the MORE you do something (like fight), the MORE PUNISHMENT you get (like detention), leading you to fight less.
Give examples of Positive reinforcement, positive punishment, negative reinforcement, negative punishment. Explain each.
1) Positive reinforcement: -Take a rat and POSITIVELY reinforce it. Whenever the rat presses the bar, it gets a reward, so bar pressing increases; this is a REWARD procedure. 2) Positive punishment: The rat presses the bar and gets an electric shock, which reduces the rat's bar pressing = POSITIVE PUNISHMENT. 3) Negative reinforcement: Turn on the shock, so the rat is being shocked. If the rat presses the bar, the shock turns off, so the more it presses the bar, the less shock it gets = NEGATIVE REINFORCEMENT. 4) Negative punishment (omission): Give a bunch of rewards, but when the rat presses the bar, the rewards stop coming, making it less likely to press the bar.
So, in sum, if the stimulus is present and leads to an increase in behavior, what is this called? Present stimulus leads to a decrease in behavior? Removal of stimulus leads to increase in behavior? Removal of stimulus leads to decrease in behavior?
1) Present stimulus leads to an increase in behavior: Positive reinforcement. 2) Present stimulus leads to a decrease in behavior: Positive punishment. 3) Removal of stimulus leads to an increase in behavior: Negative reinforcement. 4) Removal of stimulus leads to a decrease in behavior: Negative punishment.
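The four cases above form a simple 2x2 classification (stimulus presented vs. removed, behavior increases vs. decreases). As a self-check, here is a minimal Python sketch of that table; the function name and boolean encoding are illustrative assumptions, not from the lecture.

```python
def classify_contingency(stimulus_presented: bool, behavior_increases: bool) -> str:
    """Classify an operant contingency from two facts: whether the consequence
    presents or removes a stimulus, and whether the behavior becomes more or
    less frequent."""
    if stimulus_presented:
        # Something is added after the response ("positive" procedures).
        return "positive reinforcement" if behavior_increases else "positive punishment"
    # Something is removed or prevented after the response ("negative" procedures).
    return "negative reinforcement" if behavior_increases else "negative punishment"

# Quick checks against the four cases above:
assert classify_contingency(True, True) == "positive reinforcement"
assert classify_contingency(True, False) == "positive punishment"
assert classify_contingency(False, True) == "negative reinforcement"
assert classify_contingency(False, False) == "negative punishment"
```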
What is the behavioral contrast effect?
1) Prior experience with a lower valued reinforcer increases reinforcer value (positive behavioral contrast), and prior experience with a higher valued reinforcer reduces reinforcer value (negative behavioral contrast).
What are discrete-trial procedures? What is running speed? What is response latency?
1) Procedures in which each trial begins with putting the animal in the apparatus and ends with removing the animal after the instrumental response has been performed. (These days, these often involve mazes; the animal completes the trial when it finds the food in the maze.) Discrete because after reaching the goal box, the animal is removed from the apparatus (the maze). 2) Running speed is how fast an animal moves down a runway. 3) Latency is the time between the start of a trial (or the onset of the stimulus) and the instrumental response.
Mindset Chapter 5: 1) Thorndike's law of effect involves ________. 2) A ________ allows animal participants to repeat an instrumental response over and over again without being taken out of the apparatus until the end of an experimental session. 3) A behavior researcher is observing how long it takes a rat to figure out how to navigate a maze in order to reach a food source. After each time the rat completes the maze, the researcher removes the rat from the apparatus. This researcher is using which approach to instrumental conditioning? 4) Ryan's mother gives him a time-out in his room during a play date because he is not being a good listener. The time-out is an example of ________. 5) Contrary to Thorndike and Skinner's beliefs, studies have found that instrumental conditioning ________. 6) An animal participant in an instrumental conditioning study receives a food reward for performing a particular action, but over time, the animal performs the desired action less frequently because the response behavior is eclipsed by the behaviors the animal usually performs to find food. This situation best illustrates which phenomenon? 7) The extent to which an instrumental response is necessary and sufficient to produce the reinforcer is known as the ________. 8) Why do time delays between the execution of a behavior and the receipt of a reinforcer disrupt instrumental conditioning so much? 9) Lab rats in an experiment are given a food pellet every 20 seconds, regardless of what they are doing. Over time each rat begins to incorrectly believe that its own actions are prompting the food—the rats have become accidentally conditioned to expect food when performing whatever action they happened to be doing when they misinterpreted the delivery of food as a reward for their actions. What does this situation illustrate? 10) A response that occurs at the end of an interval between successive reinforcements that are presented at fixed intervals is known as a(n) ________. 11) The learned-helplessness effect occurs when participants ________. 12) According to the learned-helplessness hypothesis, how does learned helplessness affect learning? 13) According to the activity deficit theory, animals in Group Y of a learned-helplessness study show a learning deficit because ________. 14) A major line of research that challenges the learned-helplessness hypothesis focuses on ________.
1) S-R (Stimulus-Response) learning 2) Free-operant procedure 3) Discrete-trial procedure 4) Negative punishment 5) Can be used to elicit varied responses 6) Instinctive drift 7) Response-reinforcer contingency 8) Participants may have difficulty figuring out which response deserves credit for receiving the reinforcer. 9) Superstitious behavior (adventitious reinforcement) 10) Terminal response 11) Cannot avoid an aversive stimulus 12) It undermines participants' ability to learn a new instrumental response. 13) Inescapable shocks cause the animals to freeze in fear 14) Differences in safety cues rather than whether or not a shock is escapable
What is shaping? What is the process of shaping? Give example of rat. Then give real-life examples.
1) Shaping is the process of developing new behavior using reinforcement and extinction. 2) The process of shaping is: a) Reinforcement, which strengthens a response. b) Extinction, which generates variability. c) Differential reinforcement of closer and closer approximations to a target response. Example: a) Let's say I'm trying to teach my rats to press a bar. The rat does not just automatically press the bar, so you have to use a reward to shape that behavior. b) So first, every time the rat walks toward the bar, you give it a reward (positive reinforcement). c) Then you withhold the reward until the rat gets closer to the bar (extinction: withholding the reward generates variability, including moving closer to the bar). d) Then wait until the rat touches the bar, and give it a reward (positive reinforcement). e) Then stop rewarding it until the rat pushes the bar down a little bit, and it gets a reward. So it is this succession of approximations to the target response that you're rewarding (a toy simulation of this loop appears below). 3) Real-life examples: -Self-help skills, handwriting, the professor and the thermostat (the professor faced the wall and talked to the thermostat while teaching because students rewarded him for it), animal training, and the importance of conditioned reinforcement.
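Since shaping is basically a loop (reinforce the current approximation, then tighten the criterion), here is a toy Python simulation of the rat example above. The random-walk model of the rat's behavior and all the numbers are assumptions made up for illustration, not a description of a real training procedure.

```python
import random

def shape_rat(target=0.0, start=10.0, steps=2000, seed=1):
    """Toy shaping loop: the 'rat' emits behaviors at random distances from
    the bar; we reinforce any behavior within the current criterion, then
    tighten the criterion (differential reinforcement of successive
    approximations)."""
    random.seed(seed)
    criterion = start   # how close to the bar counts as "good enough"
    position = start    # the rat's typical distance from the bar
    for _ in range(steps):
        behavior = max(0.0, random.gauss(position, 2.0))  # behavioral variability
        if behavior <= criterion:
            # Reinforcement: behavior near the bar becomes more typical.
            position = 0.8 * position + 0.2 * behavior
            # Require a slightly closer approximation next time.
            criterion = max(target, criterion * 0.95)
        # Otherwise: extinction (no reward), leaving variability to work with.
    return position

print(shape_rat())  # typical distance shrinks toward 0 (the bar) over training
```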
What did Skinner do? What is reinforcement? What is punishment?
1) Skinner gave functional definitions for rewards and punishers to avoid the problem of trying to identify what was pleasant or unpleasant. -Define whether something is a reward or a punisher in terms of its effect on behavior. (For example, if you're taking notes and I give you a Brussels sprout and you become more likely to take notes, increasing the frequency of the behavior, then Brussels sprouts are a reward. And vice versa.) 2) Reinforcement is a process or procedure in which the future probability of a response is INCREASED, or strengthened. 3) Punishment is a process or procedure in which the future probability of a response is DECREASED, or weakened.
What is magazine training? What is response training?
1) An animal (like a rat) does not just learn to press the lever and get food; a series of steps is required. The preliminary stage of instrumental conditioning is magazine training. -Magazine training is when a stimulus is repeatedly paired with the reinforcer to enable the participant to learn to go and get the reinforcer when it is presented. For example, every time the rat hears the sound of the dipper (the machine), it learns to go get a drop of milk. 2) Response training is reinforcement of successive approximations to a desired instrumental response. For example, first reinforcing the rat every time it approaches the lever, then reinforcing the rat every time it touches the lever, etc.
What is the temporal relation? What is temporal contiguity? What is the response-reinforcer contingency?
1) Temporal relation is the time interval between an instrumental response and the reinforcer. 2) Temporal contiguity is delivery of the reinforcer immediately after the response. 3) Response-reinforcer contingency is the extent to which the instrumental response is necessary and sufficient to produce the reinforcer.
What are terminal responses? What are interim responses? What is the explanation of the periodicity of interim and terminal responses?
1) A terminal response is a response that is most likely at the end of the interval between successive reinforcements that are presented at fixed intervals. 2) An interim response is a response that has its highest probability in the middle of the interval between successive presentations of a reinforcer, when the reinforcer is not likely to occur. 3) In the middle of the interval between food deliveries (when the subjects are least likely to get food), general search responses are evident that take the animal away from the food cup. -As the time for the next food delivery approaches, the subject exhibits focal search responses that are again concentrated near the food cup.
What is the triadic design and what does it involve? What is the finding?
1) The triadic design involves two phases: an exposure phase and a conditioning phase. -During the exposure phase, one group of rats (E, for escape) is exposed to periodic shocks that can be terminated by performing an escape response (e.g., rotating a small wheel or tumbler). -Each subject in the second group (Y, for yoked) is assigned a partner in Group E and receives the same duration and distribution of shocks as its Group E partner. However, animals in Group Y cannot turn off the shocks. For them, the shocks are inescapable. -The third group (R, for restricted) receives no shocks during the exposure phase but is restricted to the apparatus for as long as Groups E and Y. -During the conditioning phase, all three groups receive escape-avoidance training. This is usually conducted in a shuttle apparatus that has two adjacent compartments (see Figure 10.3). The animals have to go back and forth between the two compartments to avoid shock (or escape any shocks that they failed to avoid). 2) The impact of aversive stimulation during the exposure phase depends on whether or not the shock is escapable. -Exposure to uncontrollable shock (Group Y) produces a disruption in escape-avoidance learning: Group Y shows a deficit in subsequent learning in comparison to Group E.
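The yoking relationship is the crux of the design: Group Y's shocks are copied from Group E, so only controllability differs between them. Here is a minimal Python sketch of the exposure phase, assuming a coin-flip model of the escape response; all probabilities and trial counts are invented for illustration.

```python
import random

def exposure_phase(n_trials=50, seed=0):
    """Sketch of the triadic design's exposure phase. Group E can terminate
    each shock; Group Y is yoked to E (identical shock durations, no control);
    Group R gets no shock, only the same time in the apparatus."""
    random.seed(seed)
    e_durations = []
    for _ in range(n_trials):
        duration = 0
        while True:
            duration += 1
            if random.random() < 0.3:  # Group E performs the escape response
                break                  # the shock ends *because of* E's behavior
        e_durations.append(duration)
    y_durations = list(e_durations)    # yoked: same shocks, but nothing Group Y
                                       # does changes them
    r_durations = [0] * n_trials       # restricted: same apparatus, no shock
    return e_durations, y_durations, r_durations

e, y, r = exposure_phase()
assert e == y  # E and Y get identical shock exposure; only controllability differs
```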
What is the learned-helplessness hypothesis?
1) The proposal that exposure to inescapable and unavoidable aversive stimulation reduces motivation to respond and disrupts subsequent instrumental conditioning because participants learn that their behavior does not control outcomes.
What three components does successful shaping of behavior involve?
1) Three components: a) Clearly define the final response you want the subject to perform. b) Clearly assess the starting level of performance. c) Divide the progression from the starting point to the final response into successive approximations.
What are some complications of punishment to consider?
1) When people have the opportunity, they will go somewhere else, so you have to do more than just give negative feedback if you don't want people to run away. Positive punishment is when a stimulus is delivered contingent on a response and this consequence decreases the probability of the response. -But with positive punishment, the person is more likely to run away from the punisher (escape, which is itself negatively reinforced).
How is instrumental conditioning sensitive to a delay of reinforcement? How do you overcome delay of reinforcement?
1) With delayed reinforcement, it is difficult to figure out which response deserves the credit for the delivery of the reinforcer. To associate R1 with the reinforcer, the participant has to have some way to distinguish R1 from the other responses it performs during the delay interval. 2) One way is a conditioned reinforcer: a conditioned stimulus that was previously associated with the reinforcer. For example, the sound of a clicker is paired with food delivery, so the sound of the clicker becomes an effective reinforcer. The click can then be delivered immediately after the desired response, even though the primary food reinforcer is delayed. -Another way is the marking procedure: a procedure in which the instrumental response is immediately followed by a distinctive event (the participant is picked up or a flash of light is presented) that makes the instrumental response more memorable and helps overcome the deleterious effects of delayed reinforcement.
11. How does relevance or belongingness affect instrumental conditioning? 12. How does behavior systems theory provide an account of belongingness? 13. How does a progressive ratio schedule work? How does the amount and quality of reward affect performance on the progressive ratio? 14. Define Positive and Negative Behavioral Contrast. Think of an example of where a shift in reward affected your own performance. 15. Explain how contiguity and contingency apply to the analysis of the effects of reinforcement. 16. How does delay of reward affect performance? 17. Define conditioned reinforcement. How does it help bridge delays? How would this apply to shaping a rat to press the bar? 18. What are some of the reasons why unsignaled delays might interfere with operant conditioning? 19. Explain how superstitious behavior could arise from adventitious reinforcement. Make up an example. 20. What is interim behavior and what causes it?
11) Belongingness is when an organism's evolutionary history ties certain responses to certain reinforcers. For example, pigs like to drag food along the floor when scavenging. This can largely interfere with instrumental behavior: when trying to train a pig to drop a coin in a piggy bank to receive food, it will instead perform the intrinsic behavior of dragging the coin along the floor and not put it in the piggy bank, because this is its evolutionary way of foraging for food. 12) Behavior systems theory is the theory that when an animal is food deprived and is in a situation where it might encounter food, its feeding system becomes activated and it begins to engage in foraging and other food-related activities. This explains why belongingness can interfere with instrumental conditioning: instead of performing the required response (dropping the coin in the piggy bank) that will result in food, the animal starts dragging the coin or performing other food-related activities, even though no food is yet available, because its feeding system is activated. 13) A progressive ratio schedule works such that the number of responses required to receive a reinforcer increases by a certain amount following the delivery of the previous reinforcer. For example, if pressing the button 1 time results in food, the next time I have to press the button 3 times to get food, then 5 times, then 7, etc. (see the sketch after this answer block). -As for the amount and quality of reward: the larger or better the reward, the longer the participant persists on the progressive ratio. For example, if the reward is 10 sweets, I might last up to maybe PR 20; if the reward is 30 sweets, I might last up to PR 40, etc. This was seen in the experiment with the autistic child (Chad) who got attention for either 5, 10, or 120 seconds. 14) A behavioral contrast is when prior experience affects the value of the reinforcer. For example, if I am used to getting $25 a week in allowance, and the reinforcer for, say, running around the track is $10 a week, I will be less likely to do it. This is negative behavioral contrast: prior experience with a higher valued reinforcer reduces reinforcer value. -However, if I am used to $10 a week and offered $25 a week for running around the track, I am more likely to do it. This is positive behavioral contrast: prior experience with a lower valued reinforcer increases reinforcer value. -Personal example: babysitting. I am used to getting paid $20/hour for babysitting, so if a new family offers a job that pays $10/hour, I am less likely to accept it, because I am used to $20/hour. 15) Contiguity is delivery of the reinforcer immediately after the response, and contingency is the extent to which the instrumental response is necessary to produce the reinforcer. Both matter for the effects of reinforcement. 16) Delay of reward largely affects performance because it confuses the animal about which response produced the reward. For example, a rat presses a lever, but food is delivered 30 seconds later. In those 30 seconds, the rat performs many responses, like walking around, grooming, standing on its hind legs, etc. So now the rat has no idea which response resulted in the delivery of food, because of the delay. This in turn causes the animal to not associate pressing the lever with the delivery of food, and it largely reduces the animal's performance of pressing the lever. 17) Conditioned reinforcement is when a secondary stimulus is paired with a primary stimulus.
For example, the primary stimulus could be food, and the conditioned stimulus could be the sound of a machine. The rat gets accustomed to both and learns that every time the sound of the machine is heard, a drop of milk is produced, so it associates the sound of the machine with the delivery of food. -So now, if the rat presses the lever, which produces the conditioned stimulus of the machine sound, which in turn results in delivery of food 30 seconds later, then because the rat associates the conditioned stimulus (the sound of the machine) with food, it continues to press the lever, and this helps bridge the delay. 18) Previously answered. 19) Superstitious behavior is behavior that increases in frequency because of the belief that the behavior results in the delivery of a reward. Adventitious reinforcement is an accidental pairing between a behavior and the delivery of reinforcement. -Superstitious behavior arises from adventitious reinforcement because if there is an accidental pairing between a behavior and the delivery of reinforcement, the animal will think that the behavior leads to the reinforcement, so the animal will increase the frequency of that behavior, leading to superstitious behavior. -For example, every 30 seconds I smile at my 4-year-old. My 4-year-old happens to be jumping a lot and so accidentally thinks that every time she jumps, I smile. So the number of times she jumps increases, even though I am really just smiling at her every 30 seconds. 20) Interim behavior is behavior that occurs in the middle of the interval between the prior delivery of the reinforcer and the next one. It is caused by the general search mode of the feeding system, which takes the animal away from the food cup when food is unlikely.
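Since the progressive ratio rule in item 13 is just "the requirement steps up after each reinforcer," here is a minimal Python sketch of it. The step size and the "effort tolerance" stand-in for reward size are assumptions for illustration, not the actual procedure used with Chad.

```python
def progressive_ratio(step=2, effort_tolerance=20):
    """Toy progressive ratio schedule: the required number of responses grows
    by `step` after each earned reinforcer (e.g., 1, 3, 5, 7, ...). The
    subject quits (the 'breakpoint') once the requirement exceeds what the
    reinforcer is worth to it."""
    requirement = 1
    reinforcers_earned = 0
    while requirement <= effort_tolerance:
        # The subject emits `requirement` responses and earns one reinforcer.
        reinforcers_earned += 1
        requirement += step
    return reinforcers_earned, requirement  # breakpoint = first unmet requirement

# A bigger or better reinforcer supports a higher breakpoint, as in Chad's data:
print(progressive_ratio(effort_tolerance=10))  # small reward: quits early
print(progressive_ratio(effort_tolerance=40))  # large reward: persists longer
```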
21. What is the behavior systems account of interim and terminal behavior? 22. What induces learned helplessness? 23. What are the symptoms of learned helplessness? 24. What does each group of the triadic design control for? 25. How does a consideration of safety signals offer an alternative account of learned helplessness? 26. From the point of view of reward and punishment, what makes for a good relationship? 27. What maintains the behavior of parent and child when the child regularly throws tantrums? 28. How could you use "praise and ignore" in a classroom, and why does it work? 29. Why does shaping work? 30. What would we need to know to turn shaping from an art into a science?
21) Terminal behavior is behavior that occurs toward the end of the interval, as the next reinforcement is about to be delivered. The behavior systems account: in the middle of the interval between food deliveries (when food is least likely), the feeding system's general search mode produces interim responses that take the animal away from the food cup; as the time for the next food delivery approaches, focal search responses concentrated near the food cup appear as terminal responses. 22) Learned helplessness is induced by exposing a subject to aversive stimuli (like repeated shock) that it cannot escape or avoid, so it learns that nothing it does prevents the aversive event. 23) The symptoms of learned helplessness are mainly that the subject can freeze, is slower to respond, and, in the future when escape is available, is slower to figure that out. 24) Group E (escape) shows the effect of controllable shock; Group Y (yoked) receives exactly the same shocks as Group E but without control, so it controls for shock exposure per se; Group R (restricted) receives no shock and controls for exposure to the apparatus. 25) Making an escape response produces internal feedback cues that follow shock offset and signal a period of safety. Yoked animals get no such safety signals because for them shocks are unpredictable, so the contextual cues of the chamber become conditioned to elicit fear. On this account, the Group Y deficit reflects differences in safety signals rather than learning that responding is ineffective. 26) What makes for a good relationship is positive reinforcement (every time a student does well, praise them or give them a sticker that says good job), combined with negative punishment (every time a student is disruptive, remove attention by ignoring them). 27) When the child regularly throws tantrums, the child is receiving some kind of reward for the tantrums, so positive reinforcement maintains the tantrum throwing; the parent's giving in is maintained by negative reinforcement, since it stops the screaming. 28) See 26. 29) Shaping works because reinforcement strengthens the current approximation to the target response, while extinction of earlier approximations generates the variability from which closer approximations can be selected and differentially reinforced. 30) We would need controls, careful experimental design, different groups, and laboratory settings that tell us how to set and advance the criteria for reinforcement.
What is the response-outcome contingency and the result of the procedure for the following: a) Positive reinforcement b) Positive punishment c) Negative reinforcement d) Negative punishment
A) Positive reinforcement: -Response-outcome contingency: Positive; the response produces an appetitive stimulus. -Result of procedure: Reinforcement, or an increase in response rate. B) Positive punishment: -Response-outcome contingency: Positive; the response produces an aversive stimulus. -Result of procedure: Punishment, or a decrease in response rate. C) Negative reinforcement: -Response-outcome contingency: Negative; the response eliminates or prevents the occurrence of an aversive stimulus. -Result of procedure: Reinforcement, or an increase in response rate. D) Negative punishment: -Response-outcome contingency: Negative; the response eliminates or prevents the occurrence of an appetitive stimulus. -Result of procedure: Punishment, or a decrease in response rate.
Explain Chad's experiment to show how the amount and quality of reward affect performance on a progressive ratio.
1) Chad's average number of reinforcers earned per session was measured as the response requirement increased from 1 to 40 (the maximum possible was two reinforcers per session at each response requirement). Responding was maintained much more effectively in the face of increasing response requirements when the reinforcer was 120 seconds long, showing that a larger/longer reinforcer supports more persistence on a progressive ratio schedule.
General questions: Give an example of shaping a behavior in a friend. Explain how contrast might affect performance in a classroom Explain learned helplessness and how it can be prevented.
Do this; you know it.
What was Neuringer's experiment and what did he discover?
Numerous experiments with laboratory rats, pigeons, and human participants have shown that response variability increases if variability is the response dimension required to earn reinforcement (Neuringer, 2004; Neuringer & Jensen, 2010). In one study, college students were asked to draw rectangles on a computer screen (Ross & Neuringer, 2002). They were told they had to draw rectangles to obtain points but were not told what kind of rectangles they should draw. For one group of participants, a point was dispensed if the rectangle drawn on a given trial differed from other rectangles the student previously drew. The new rectangle had to be novel in size, shape, and location on the screen. This group was designated VAR for the variability requirement. Students in another group were paired up or yoked to students in group VAR and received a point on each trial that their partners in group VAR were reinforced. However, the YOKED participants had no requirements about the size, shape, or location of their rectangles. The results of the experiment are shown in Figure 5.8. Students in group VAR showed considerably greater variability in the rectangles they drew than participants in group YOKED. This shows response variability can be increased if the instrumental reinforcement procedure requires variable behavior. Another experiment by Ross and Neuringer (2002) demonstrated that different aspects of drawing a rectangle (the size, shape, and location of the rectangle) can be controlled independently of one another by contingencies of reinforcement. For example, participants who are required to draw rectangles of the same size will learn to do that but will vary the location and shape of the rectangles they draw. In contrast, participants required to draw the same shape rectangle will learn to do that while they vary the size and location of their rectangles. These experiments show that response variability can be increased with instrumental conditioning. Such experiments have also shown that in the absence of explicit reinforcement of variability, responding becomes more stereotyped with continued instrumental conditioning.
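One common way this literature makes variability itself the reinforced dimension is a lag-style criterion: reinforce a response only if it differs from the previous N responses. Here is a minimal Python sketch of that idea; the list-of-labels representation of responses and the lag value are assumptions for illustration, not the exact criterion Ross and Neuringer used.

```python
def reinforce_if_varied(responses, lag=5):
    """Lag-style variability contingency: a response earns reinforcement only
    if it differs from each of the previous `lag` responses. Returns a flag
    per trial indicating whether that trial was reinforced."""
    reinforced = []
    for i, response in enumerate(responses):
        recent = responses[max(0, i - lag):i]
        reinforced.append(response not in recent)
    return reinforced

# Repetitive responding earns little; varied responding earns a lot:
print(reinforce_if_varied(["A", "A", "A", "B", "A"]))       # mostly False
print(reinforce_if_varied(["A", "B", "C", "D", "E", "F"]))  # all True
```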