Operant Conditioning
Skinner’s views were slightly less extreme than Watson’s (1913). Skinner believed that
we do have such a thing as a mind, but that it is simply more productive to study
observable behavior rather than internal mental events.
The work of Skinner was rooted in the view that classical conditioning was far too
simplistic to be a complete explanation of complex human behavior. He believed that
the best way to understand behavior is to look at the causes of an action and its
consequences. He called this approach operant conditioning.
How It Works
Skinner is regarded as the father of Operant Conditioning, but his work was based
on Thorndike’s (1898) Law of Effect. According to this principle, behavior that is
followed by pleasant consequences is likely to be repeated, and behavior followed by
unpleasant consequences is less likely to be repeated.
Skinner introduced a new term into the Law of Effect – Reinforcement. Behavior that is
reinforced tends to be repeated (i.e., strengthened); behavior that is not reinforced
tends to die out or be extinguished (i.e., weakened).
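The strengthening and weakening described above can be sketched numerically. The following is a minimal illustration, not Skinner's own model: it assumes a hypothetical "response strength" between 0 and 1 and a simple linear update rule (the names `update_strength`, `run_trials`, and the `rate` parameter are all invented for this example).

```python
def update_strength(strength, reinforced, rate=0.2):
    """Move response strength toward 1 if reinforced, toward 0 otherwise.

    The linear rule and the `rate` value are illustrative assumptions.
    """
    target = 1.0 if reinforced else 0.0
    return strength + rate * (target - strength)


def run_trials(strength, outcomes, rate=0.2):
    """Apply update_strength once per trial and return the full history."""
    history = [strength]
    for reinforced in outcomes:
        strength = update_strength(strength, reinforced, rate)
        history.append(strength)
    return history


# Ten reinforced trials (strengthening) followed by ten unreinforced trials (extinction).
history = run_trials(0.1, [True] * 10 + [False] * 10)
print([round(s, 2) for s in history])
```

Under these assumptions the strength climbs while the behavior is reinforced and decays again once reinforcement stops, which mirrors the verbal description of strengthening and extinction.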
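Skinner identified three types of responses, or operants, that can follow behavior.
• Neutral operants: responses from the environment that neither increase nor
decrease the probability of a behavior being repeated.
• Reinforcers: responses from the environment that increase the probability of a
behavior being repeated.
• Punishers: responses from the environment that decrease the probability of a
behavior being repeated.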
• We can all think of examples of how our own behavior has been affected by
reinforcers and punishers. As a child, you probably tried out a number of
behaviors and learned from their consequences.
• For example, when you were younger, if you tried smoking at school, and the
chief consequence was that you got in with the crowd you always wanted to hang
out with, you would have been positively reinforced (i.e., rewarded) and would be
likely to repeat the behavior.
• If, however, the main consequence was that you were caught, caned, suspended
from school, and your parents became involved, you would most certainly have
been punished, and you would consequently be much less likely to smoke now.
• Positive Reinforcement
Primary reinforcers are stimuli that are naturally reinforcing because they are not
learned and directly satisfy a need, such as food or water.
Secondary reinforcers, such as money or school grades, are stimuli that become
reinforcing through their association with a primary reinforcer. They do not directly
satisfy an innate need but may be the means of doing so, which is why a secondary
reinforcer can be just as powerful a motivator as a primary reinforcer.
Skinner showed how positive reinforcement worked by placing a hungry rat in his
Skinner box. The box contained a lever on the side, and as the rat moved about the box,
it would accidentally knock the lever. As soon as it did so, a food pellet would
drop into a container next to the lever.
The rats quickly learned to go straight to the lever after being put in the box a few times.
The consequence of receiving food, if they pressed the lever, ensured that they would
repeat the action again and again.
This method incentivizes the less desirable behavior by associating it with a desirable
outcome, thus strengthening the less favored behavior.
• Negative Reinforcement
For example, if you do not complete your homework, you give your teacher £5. You will
complete your homework to avoid paying £5, thus strengthening the behavior of
completing your homework.
Skinner showed how negative reinforcement worked by placing a rat in his Skinner box
and then subjecting it to an unpleasant electric current which caused it some
discomfort. As the rat moved about the box it would accidentally knock the lever.
As soon as it did so, the electric current would be switched off. The rats quickly
learned to go straight to the lever after being put in the box a few times. The
consequence of escaping the electric current ensured that they would repeat the action
again and again.
In fact, Skinner even taught the rats to avoid the electric current by turning on a light
just before the electric current came on. The rats soon learned to press the lever when
the light came on because they knew that this would stop the electric current from being
switched on.
These two learned responses are known as Escape Learning and Avoidance Learning.
• Punishment
There are two distinct methods of punishment, positive and negative. Both are used to
decrease the likelihood of a specific behavior occurring again, but they involve
different types of consequences:
1. Positive Punishment:
2. Negative Punishment:
• It aims to weaken the target behavior by taking away something the individual values or
enjoys.
• Example: A teenager loses their video game privileges (a desirable stimulus) for not
completing their chores. This is intended to decrease the likelihood of the teenager
neglecting their chores in the future.
• Creates fear that can generalize to undesirable behaviors, e.g., fear of school.
• Does not necessarily guide you toward desired behavior – reinforcement tells you
what to do, and punishment only tells you what not to do.
1. Positive Reinforcement: Suppose you are a coach and want your team to
improve their passing accuracy in soccer. When the players execute accurate
passes during training, you praise their technique. This positive feedback
encourages them to repeat the correct passing behavior.
5. Negative Punishment: If a teenager stays out past their curfew, their parents
might take away their gaming console for a week. This makes the teenager
more likely to respect their curfew in the future to avoid losing something they
value.
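The four procedures named in this article (positive and negative reinforcement, positive and negative punishment) differ along two questions: is a stimulus added or removed, and does the behavior become more or less likely? A small sketch of that two-by-two distinction follows; the `Procedure` enum and the `classify` function are hypothetical names introduced only for illustration.

```python
from enum import Enum


class Procedure(Enum):
    POSITIVE_REINFORCEMENT = "positive reinforcement"
    NEGATIVE_REINFORCEMENT = "negative reinforcement"
    POSITIVE_PUNISHMENT = "positive punishment"
    NEGATIVE_PUNISHMENT = "negative punishment"


def classify(stimulus_added: bool, behavior_increases: bool) -> Procedure:
    """Name the procedure from the two defining questions."""
    if behavior_increases:
        if stimulus_added:
            return Procedure.POSITIVE_REINFORCEMENT
        return Procedure.NEGATIVE_REINFORCEMENT
    if stimulus_added:
        return Procedure.POSITIVE_PUNISHMENT
    return Procedure.NEGATIVE_PUNISHMENT


# Praise is added and accurate passing becomes more likely.
print(classify(stimulus_added=True, behavior_increases=True).value)
# The gaming console is removed and staying out late becomes less likely.
print(classify(stimulus_added=False, behavior_increases=False).value)
```

The two calls correspond to the coaching and curfew examples and print "positive reinforcement" and "negative punishment" respectively.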
• A student who dislikes history but loves art might earn extra time in the art studio for
each history chapter reviewed.
• For every 10 minutes a person spends on household chores, they can spend 5 minutes on
a favorite hobby.
• For each successful day of healthy eating, an individual allows themselves a small piece of
dark chocolate at the end of the day.
• A child can choose between taking out the trash or washing the dishes. Giving them the
choice makes them more likely to complete the chore willingly.
B.F. Skinner conducted several experiments with pigeons to demonstrate the principles
of operant conditioning.
2. They were placed in a cage with a food hopper that could be presented for five
seconds at a time.
3. Instead of the food being given as a result of any specific action by the pigeon,
it was presented at regular intervals, regardless of the pigeon’s behavior.
Observation:
1. Over time, Skinner observed that the pigeons began to associate whatever
random action they were doing when food was delivered with the delivery of
the food itself.
2. This led the pigeons to repeat these actions, believing (in anthropomorphic
terms) that their behavior was causing the food to appear.
Findings:
2. These behaviors did not appear until the food hopper was introduced and
presented periodically.
3. These behaviors were not initially related to the food delivery but became
linked in the pigeon’s mind due to the coincidental timing of the food
dispensing.
5. The rate of reinforcement (how often the food was presented) played a
significant role. Shorter intervals between food presentations led to more rapid
and defined conditioning.
Superstitious Behavior:
The pigeons began to act as if their behaviors had a direct effect on the presentation of
food, even though there was no such connection. This is likened to human superstitions,
where rituals are believed to change outcomes, even if they have no real effect.
For example, a card player might have rituals to change their luck, or a bowler might
make gestures believing they can influence a ball already in motion.
Conclusion:
This experiment demonstrates that behaviors can be conditioned even without a direct
cause-and-effect relationship. Just like humans, pigeons can develop “superstitious”
behaviors based on coincidental occurrences.
This study not only sheds light on the intricacies of operant conditioning but also draws
parallels between animal and human behaviors in the face of random reinforcements.
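The logic of the experiment can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the behavior names, the weighting scheme, and the `boost` parameter are invented here and are not taken from Skinner's procedure. Food arrives on a fixed timer, and whatever behavior happens to be occurring at that moment becomes slightly more likely afterwards.

```python
import random


def simulate_superstition(behaviors, seconds, interval, boost=1.0, seed=0):
    """Return final behavior weights after food is delivered on a fixed timer.

    Each second the simulated pigeon emits one behavior, chosen in proportion
    to its current weight. Food never depends on which behavior is chosen.
    """
    rng = random.Random(seed)
    weights = {b: 1.0 for b in behaviors}
    for t in range(1, seconds + 1):
        names = list(weights)
        current = rng.choices(names, weights=[weights[n] for n in names])[0]
        # Food arrives every `interval` seconds regardless of behavior;
        # the behavior that coincides with it is accidentally reinforced.
        if t % interval == 0:
            weights[current] += boost
    return weights


weights = simulate_superstition(
    ["turn", "peck corner", "head toss", "hop"], seconds=3000, interval=15
)
total = sum(weights.values())
for name, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {w / total:.2f}")
```

All behaviors start out equally likely, yet in runs of this toy model the shares usually end up uneven, because an early coincidence makes a behavior more likely to coincide with food again.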
Schedules Of Reinforcement
1. The Response Rate – The rate at which the rat pressed the lever (i.e., how hard the
rat worked).
2. The Extinction Rate – The rate at which lever pressing dies out (i.e., how soon the
rat gave up).
Skinner found that the type of reinforcement which produces the slowest rate of
extinction (i.e., people will go on repeating the behavior for the longest time without
reinforcement) is variable-ratio reinforcement. The type of reinforcement which has the
quickest rate of extinction is continuous reinforcement.
Fixed Ratio Reinforcement
Behavior is reinforced only after it occurs a specified number of times, i.e., one
reinforcement is given after every so many correct responses (e.g., after every 5th
response). For example, a child receives a star for every five words spelled correctly.
Fixed Interval Reinforcement
One reinforcement is given after a fixed time interval, provided at least one correct
response has been made. An example is being paid by the hour. Another example would
be a pellet delivered every 15 minutes (or half hour, hour, etc.), provided at least one
lever press has been made, after which food delivery is shut off.
Variable Interval Reinforcement
Provided one correct response has been made, reinforcement is given after an
unpredictable amount of time has passed, e.g., on average every 5 minutes. An example
is a self-employed person being paid at unpredictable times.
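The three schedules described above can be written as small rules that decide whether a given response earns reinforcement. This is a sketch, not a standard library or an account of Skinner's apparatus: the class names are hypothetical, time is measured in arbitrary units, and the variable schedule assumes waits drawn uniformly between zero and twice the mean.

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class FixedInterval:
    """Reinforce the first response made once `interval` time units have passed."""

    def __init__(self, interval):
        self.interval = interval
        self.available_at = interval

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.interval
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but each wait is random around a mean."""

    def __init__(self, mean_interval, seed=0):
        self.mean = mean_interval
        self.rng = random.Random(seed)
        self.available_at = self._draw()

    def _draw(self):
        # Assumed distribution: uniform with the requested mean.
        return self.rng.uniform(0, 2 * self.mean)

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self._draw()
            return True
        return False


# One response per minute for an hour on each schedule.
for schedule in (FixedRatio(5), FixedInterval(15), VariableInterval(5)):
    rewards = sum(schedule.respond(t) for t in range(1, 61))
    print(type(schedule).__name__, rewards)
```

On the interval schedules, responding faster does not earn more reinforcement, whereas on the ratio schedule each extra response brings the next reward closer.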
Applications In Psychology
Token Economy
Token economy is a system in which targeted behaviors are reinforced with tokens
(secondary reinforcers), which are later exchanged for rewards (primary reinforcers).
Tokens can be in the form of fake money, buttons, poker chips, stickers, etc., while the
rewards can range from snacks to privileges or activities. For example,
teachers use token economy at primary school by giving young children stickers to
reward good behavior.
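The bookkeeping behind a token economy can be sketched as a simple ledger. Everything named here (the `TokenEconomy` class, the behaviors, the token values, and the reward costs) is a hypothetical illustration rather than a description of any real program.

```python
class TokenEconomy:
    """Minimal ledger: target behaviors earn tokens, tokens buy rewards."""

    def __init__(self, token_values, reward_costs):
        self.token_values = token_values  # target behavior -> tokens earned
        self.reward_costs = reward_costs  # reward -> tokens required
        self.balances = {}                # person -> current token balance

    def record_behavior(self, person, behavior):
        """Credit tokens for a target behavior; other behaviors earn nothing."""
        earned = self.token_values.get(behavior, 0)
        self.balances[person] = self.balances.get(person, 0) + earned
        return earned

    def exchange(self, person, reward):
        """Trade tokens for a reward; return False if the balance is too low."""
        cost = self.reward_costs[reward]
        if self.balances.get(person, 0) < cost:
            return False
        self.balances[person] -= cost
        return True


economy = TokenEconomy(
    token_values={"raised hand": 1, "finished homework": 3},
    reward_costs={"extra playtime": 4},
)
economy.record_behavior("Alex", "finished homework")
print(economy.exchange("Alex", "extra playtime"))  # not enough tokens yet
economy.record_behavior("Alex", "raised hand")
print(economy.exchange("Alex", "extra playtime"))  # balance now covers the cost
```

Keeping the token values and reward costs in one shared table is one way to make sure every member of staff applies the same rules.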
Token economy has been found to be very effective in managing psychiatric patients.
However, the patients can become over-reliant on the tokens, making it difficult for
them to adjust to society once they leave prison, hospital, etc.
Staff implementing a token economy program have a lot of power. It is important that
staff do not favor or ignore certain individuals if the program is to work. Therefore, staff
need to be trained to give tokens fairly and consistently even when there are shift
changes such as in prisons or in a psychiatric hospital.
Behavior Shaping
Skinner argues that the principles of operant conditioning can be used to produce
extremely complex behavior if rewards and punishments are delivered in such a way as
to move an organism closer and closer to the desired behavior each time.
To do this, the conditions (or contingencies) required to receive the reward should shift
each time the organism moves a step closer to the desired behavior.
According to Skinner, most animal and human behavior (including language) can be
explained as a product of this type of successive approximation.
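Successive approximation can be sketched as a loop in which the criterion for reward shifts each time the organism meets it. The model below is an assumption made for illustration only: behavior is reduced to a single number, responses vary randomly around a "typical" value, and the parameter names (`step`, `noise`, `learning_rate`) are invented.

```python
import random


def shape(target, step=1.0, noise=2.0, learning_rate=0.5, max_trials=10_000, seed=0):
    """Reward closer and closer approximations until `target` is reached.

    Returns (trial number on which the target response was rewarded, typical
    response at that point), or (None, typical response) if never reached.
    """
    rng = random.Random(seed)
    typical = 0.0      # the organism's current typical response
    criterion = step   # what a response must reach to earn a reward right now
    for trial in range(1, max_trials + 1):
        response = rng.gauss(typical, noise)
        if response >= criterion:
            # Reinforcement pulls the typical response toward what was rewarded.
            typical += learning_rate * (response - typical)
            if criterion >= target:
                return trial, typical
            # Shift the contingency one step closer to the desired behavior.
            criterion = min(target, criterion + step)
    return None, typical


trial, typical = shape(target=10.0)
print(f"target behavior rewarded on trial {trial}; typical response is now {typical:.1f}")
```

In this toy model a response as far from the starting point as the target would almost never occur by chance, but each intermediate criterion stays within reach of the current typical response, so the target is approached step by step.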
2. Educational Applications
This is not an easy task, as the teacher may appear insincere if he/she thinks too much
about the way to behave.
Learning Type
While both types of conditioning involve learning, classical conditioning is passive
(automatic response to stimuli), while operant conditioning is active (behavior is
influenced by consequences).
• Classical conditioning links an involuntary response with a stimulus. It
happens passively on the part of the learner, without rewards or punishments.
An example is a dog salivating at the sound of a bell associated with food.
Learning Process
Classical conditioning involves learning through associating stimuli resulting in
involuntary responses, while operant conditioning focuses on learning through
consequences, shaping voluntary behaviors.
Summary
Looking at Skinner’s classic studies of pigeon and rat behavior, we can identify some of
the major assumptions of the behaviorist approach.
• Psychology should be seen as a science, to be studied in a scientific manner. Skinner’s
study of behavior in rats was conducted under carefully controlled laboratory
conditions.
• Behaviorism is primarily concerned with observable behavior, as opposed to internal
events like thinking and emotion. Note that Skinner did not say that the rats learned to
press a lever because they wanted food. He instead concentrated on describing the
easily observed behavior that the rats acquired.
• The major influence on human behavior is learning from our environment. In the
Skinner study, because food followed a particular behavior the rats learned to repeat
that behavior, i.e., operant conditioning.
• There is little difference between the learning that takes place in humans and that in
other animals. Therefore research (e.g., operant conditioning) can be carried out on
animals (rats and pigeons) as well as on humans. Skinner proposed that the way humans
learn behavior is much the same as the way the rats learned to press a lever.
So, if your layperson’s idea of psychology has always been of people in laboratories
wearing white coats and watching hapless rats try to negotiate mazes in order to get to
their dinner, then you are probably thinking of behavioral psychology.
Behaviorism and its offshoots tend to be among the most scientific of the psychological
perspectives. The emphasis of behavioral psychology is on how we learn to behave in
certain ways.
We are all constantly learning new behaviors and how to modify our existing behavior.
Behavioral psychology is the psychological approach that focuses on how this learning
takes place.
Critical Evaluation
Operant conditioning can be used to explain a wide variety of behaviors, from the process of
learning, to addiction and language acquisition. It also has practical applications (such as token
economy) which can be applied in classrooms, prisons and psychiatric hospitals.
Researchers have found innovative ways to apply operant conditioning principles to promote
health and habit change in humans.
In a recent study, operant conditioning using virtual reality (VR) helped stroke patients use their
weakened limb more often during rehabilitation. Patients shifted their weight in VR games by
maneuvering a virtual object. When they increased weight on their weakened side, they received
rewards like stars. This positive reinforcement conditioned greater paretic limb use (Kumar et
al., 2019).
Another study utilized operant conditioning to assist smoking cessation. Participants earned
vouchers exchangeable for goods and services for reducing smoking. This reward system
reinforced decreasing cigarette use. Many participants achieved long-term abstinence (Dallery et
al., 2017).
Through repeated reinforcement, operant conditioning can facilitate forming exercise and eating
habits. A person trying to exercise more might earn TV time for every 10 minutes spent working
out. An individual aiming to eat healthier may allow themselves a daily dark chocolate square
for sticking to nutritious meals. Providing consistent rewards for desired actions can instill new
habits (Michie et al., 2009).
Apps like Habitica apply operant conditioning by gamifying habit tracking. Users earn points
and collect rewards in a fantasy game for completing real-life habits. This virtual reinforcement
helps ingrain positive behaviors (Eckerstorfer et al., 2019).
Operant conditioning also shows promise for managing ADHD and OCD. Rewarding
concentration and focus in ADHD children, for example, can strengthen their attention skills
(Rosén et al., 2018). Similarly, reinforcing OCD patients for resisting compulsions may diminish
obsessive behaviors (Twohig et al., 2018).
However, operant conditioning fails to take into account the role of inherited and cognitive
factors in learning, and thus is an incomplete explanation of the learning process in humans and
animals.
For example, Kohler (1924) found that primates often seem to solve problems in a flash of
insight rather than by trial-and-error learning. Also, social learning theory (Bandura, 1977)
suggests that humans can learn automatically through observation rather than through personal
experience.
The use of animal research in operant conditioning studies also raises the issue of extrapolation.
Some psychologists argue we cannot generalize from studies on animals to humans, as animals’
anatomy and physiology differ from those of humans, and animals cannot think about their
experiences and invoke reason, patience, memory or self-comfort.