


CogLab # 48 (Statistical Learning) due next Wednesday (1/20/16) before lecture.

CogLab # 40 (Categorical Perception: Identification) due next Friday (1/22/16) before lecture.

To see which CogLab assignments you have turned in: within CogLab, go to 'Home > Access Account'. Under the "Lab Information" tab there will be a drop-down menu showing which labs have been turned in.

Quiz 2 (released over weekend) due Wednesday (1/20/16) before lecture.

Don't fall behind or underestimate the material


Today's schedule

Associative learning: Operant conditioning

Brain model of classical conditioning

Operant conditioning overview

Factors influencing conditioning

Examples of paradigms

Reality check

So far, the pace of the lectures has been:

A. Way too fast
B. A little fast
C. Just about right
D. A little slow
E. Way too slow

Last time: Classical conditioning

In aversive conditioning, you learn to avoid or minimize the effect of an expected unpleasant event.

A common paradigm for this is eyeblink conditioning. The eyeblink reflex is a UR to a puff of air (US).

If you pair the airpuff with a tone (CS), the rabbit (or human) will learn that the tone predicts the airpuff.

After training, the tone (CS) will cause an eyeblink (CR).

Interestingly, the CS needs to precede the US by about 500 ms or less for the association to be made reliably.

Conditioned Stimulus - the tone
Unconditioned Stimulus - the airpuff

Prediction of the US

Both extinction and blocking suggest that the CS produces a CR as a means of preparing for the US.

The US (airpuff) always produces the UR (eyeblink). If you suspect the US is coming (cued by CS), then you will blink sooner (CR).

The CR will start to occur before the US.

The CS allows the body to prepare for (or adapt to) the US.

The CS produces a CR as a way of preparing for the US. In terms of the rabbit: once it hears the tone (the conditioned stimulus), it knows that the airpuff is going to come, and thus it's going to blink (so, the CS will occur BEFORE the US).
The CS needs to occur right before the US.


Models of classical conditioning

The earliest models assumed that the CS takes the place of the US. However, this substitution model was too

Problems with this:

The CS generally never attains the same strength as the US.

The CR doesn't always match the UR.

Obal (1966) showed that administering the drug dinitrophenol (US) caused an increase in oxygen consumption and increased temperature (UR).

The body wants to maintain homeostasis, so will counteract the effects of the drug.

The syringe is a cue for the drug (CS), so when a placebo is administered, the CR is a decrease in oxygen consumption and decrease in body temperature.

Here, the CR is the opposite of the UR and is adaptive.

Body wants to maintain homeostasis
The body is attempting to prepare (adapt) because it knows what is coming.
The CR is to the syringe.
You give a person a drug, and their natural response is an increase in oxygen consumption and body temp. (UR)
The body naturally wants to maintain homeostasis (wants to keep internal temp at a constant level).
It was thought that the natural response of the body once the drug was administered was to counteract the effect of the drug (if it increases temp, then the body will want to decrease the temp).
The syringe (CS) is a cue for the drug (US); the syringe is a warning that the drug is coming, and so the body adapts and lowers the body temperature (CR) instead of having it increase when responding to the drug.
Thus, the CR is opposite of the UR and is ADAPTIVE.

Compensatory response model

The compensatory response model uses this adaptive role to explain drug tolerance.

Needles, smells, and settings act as the CS during drug use.

They predict the arrival of the drug, so the body prepares to counteract the effects (CR).

It had been observed that heroin overdoses often occur in unusual settings (Gutierrez-Cebollada et al., 1994).

More than 50% of heroin addicts admitted to the hospital for overdoses had taken their last dose in an unusual location (no CS).

Other addicts admitted for reasons unrelated to the drug took their last dose in the usual location.

The setting (CS) had increased tolerance (CR) to the heroin (US).

It has been suggested that overcoming drug addictions might involve extinguishing the CR to drug paraphernalia (CS).

Studies show that heroin overdose often happens in a different environment than where the drug is normally taken, and the dosage is usually the same as normal. What this means is that the body cannot take the same precautions it normally would to maintain homeostasis because of the new environment; the body is not used to it, so it can't prepare in advance.

It all has to do with expectation: how effective the cue is will determine how the body responds.
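
The tolerance argument above can be sketched as simple arithmetic: the net effect of a dose is the drug's direct effect minus whatever compensatory response the cue triggers. This is a toy illustration only. The function name `net_drug_effect`, the `compensation` fraction, and the arbitrary effect units are assumptions made for the example; they are not values from the lecture or from any study.

```python
def net_drug_effect(dose_effect, cue_present, compensation=0.6):
    """Toy model: net effect = direct drug effect minus the cued CR.

    dose_effect  -- direct effect of the drug (US), arbitrary units
    cue_present  -- True if the usual setting/paraphernalia (CS) is present
    compensation -- hypothetical fraction of the effect the CR cancels
    """
    # The compensatory CR only occurs when the CS predicts the drug.
    cr = compensation * dose_effect if cue_present else 0.0
    return dose_effect - cr


# Same dose, two settings.
usual_setting = net_drug_effect(10.0, cue_present=True)
unusual_setting = net_drug_effect(10.0, cue_present=False)
print(usual_setting, unusual_setting)
```

With the same dose, the unusual setting produces the larger net effect because no CS is there to trigger the compensatory CR, which is the pattern the overdose data suggest.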


Biological model for classical conditioning

Thompson (1986) discovered that damage in the cerebellum could permanently prevent new classically conditioned responses and eliminate previously conditioned responses:

This was based on patients with cerebellar lesions (and confirmed in studies with rabbits).

Single cell recordings from the cerebellum have helped establish how classical conditioning is developed and retained.

Classical conditioning is an example of associative learning. It can be appetitive (i.e. satisfying a desire) or aversive (i.e. avoiding an unpleasant event).

Through paired association, the conditioned stimulus becomes predictive of the unconditioned stimulus.

After training, if the CS is no longer paired with the US, then the CR will decline in frequency (extinction).

If an already paired CS and US are joined by a second CS, the new CS will not cause the CR. This is because it has no additional predictive value (blocking).

The compensatory response model illustrates how classical conditioning is an adaptive response, helping the body maintain homeostasis. It suggests that sometimes the CR will be the opposite of the UR.

We have good evidence that the cerebellum is largely responsible for establishing and maintaining classical conditioning.
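
The ideas of predictive value, extinction, and blocking can be illustrated with a small simulation. The sketch below uses an error-driven update in the style of the Rescorla-Wagner model, which the lecture does not mention; it is swapped in here only as one common way to make "no additional predictive value" concrete. The function names, the learning rate, and the trial counts are all assumptions chosen for illustration.

```python
def update(strengths, present, us_present, rate=0.3, max_strength=1.0):
    """One trial of an error-driven update (Rescorla-Wagner style, assumed).

    strengths  -- dict mapping each CS name to its associative strength
    present    -- list of CS names presented on this trial
    us_present -- True if the US (e.g. airpuff) follows the CS
    """
    prediction = sum(strengths[cs] for cs in present)
    target = max_strength if us_present else 0.0
    error = target - prediction  # how surprising the outcome was
    for cs in present:
        strengths[cs] += rate * error
    return strengths


def run_demo(trials=50):
    strengths = {"tone": 0.0, "light": 0.0}

    # Acquisition: tone alone is paired with the airpuff.
    for _ in range(trials):
        update(strengths, ["tone"], us_present=True)
    after_acquisition = dict(strengths)

    # Blocking: tone + light paired with the airpuff. The tone already
    # predicts the US, so there is almost no error left for the light.
    for _ in range(trials):
        update(strengths, ["tone", "light"], us_present=True)
    after_blocking = dict(strengths)

    # Extinction: tone presented without the airpuff.
    for _ in range(trials):
        update(strengths, ["tone"], us_present=False)
    after_extinction = dict(strengths)

    return after_acquisition, after_blocking, after_extinction


acquisition, blocking, extinction = run_demo()
print(acquisition, blocking, extinction)
```

In this sketch the tone gains strength during acquisition, the added light gains almost none (blocking), and the tone's strength falls back toward zero once the US stops following it (extinction).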


Operant conditioning

Operant conditioning (also known as instrumental conditioning and trial-and-error learning) is associating a voluntary behavior (operation on the environment) with an outcome.

Law of effect: Animals learn that a behavior (or class of similar behaviors) predicts a particular outcome (Thorndike,

Behaviors with good outcomes increase; behaviors with bad outcomes decrease.

Cat opens the puzzle box and is reinforced with food reward. That escape behavior becomes more likely (and faster) in the future. [Discrete trial paradigm]

B. F. Skinner refined the method to allow the animal to respond repeatedly. This free-operant paradigm allowed the animal to control the rate of responding.

Behaviors could be automatically recorded in a Skinner box: count number of behaviors and outcomes.



Operant conditioning

Basic elements of the paradigm:

You need a discriminative stimulus (S) that helps you select the appropriate behavior (e.g. rat sees the lever).

A behavioral response (R), or class of similar responses, is performed in response to the stimulus (e.g. rat pushes lever with either paw).

An outcome (O) follows that either reinforces or punishes the behavior (e.g. rat gets food, good outcome).

Through repeated trials, the animal learns that the outcome is contingent upon the appropriate response.

Operant conditioning outcomes

Outcomes that increase the behavior are reinforcers.

Primary reinforcers meet some innate need (e.g. food, water, sleep, and sex). Approval is likely primary for social animals. Note that these are not always reinforcing (i.e. you won't work for water if already satiated). This can be contextual, depending on the state of the organism (not hungry -> won't do more for food).

Secondary reinforcers have no intrinsic value, but predict or are associated with primary reinforcers (e.g. money, good grades, gold stars, etc.).

Outcomes that decrease the behavior are punishers.

Primary punisher: Pain (shock), nausea, loud noises, social disapproval, loss of freedom. Anything that the organism would naturally try to avoid.

Secondary punisher: Monetary fines, demerits, bad grades,




Operant conditioning paradigms

If an outcome/consequence is added, this is positive (+) conditioning.

If an outcome/consequence is removed, this is negative (-) conditioning.

Note: this has nothing to do with good or bad.

Progress check

You are about to press a button on your iClicker. When you see that you got the correct answer to the question, that acts as a ______________.

A. Primary reinforcer
B. Secondary reinforcer
C. Primary punisher
D. Secondary punisher
E. You just blew my mind.



Operant conditioning paradigms

Examples of the four basic conditioning paradigms for learning operant associations:

Positive reinforcement: Eat all your vegetables -> get some

Positive punishment: Scratch the couch -> get sprayed with water; tease your sibling -> parental scolding.

Negative reinforcement: Shut off the alarm clock -> loss of an aversive stimulus; take ibuprofen -> reduce a headache.

Negative punishment: Commit armed robbery -> loss of freedom; stay out too late -> lose your car.

Operant conditioning

The timing and context are critical for forming the association.

If the outcome is delayed, the association is not learned as well.

Example: a food reward for pressing a lever loses effectiveness rapidly with time.
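
The four basic paradigms come from crossing two questions: is something added or removed, and does the behavior increase or decrease? A minimal sketch of that two-by-two lookup follows. The function name `classify_paradigm` and its string arguments are hypothetical, chosen only for this illustration.

```python
def classify_paradigm(stimulus_change, behavior_change):
    """Name the operant paradigm from two yes/no style facts.

    stimulus_change -- "added" or "removed" (positive vs. negative)
    behavior_change -- "increases" or "decreases"
                       (reinforcement vs. punishment)
    """
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{sign} {kind}"


# Give a treat, and the behavior becomes more likely.
print(classify_paradigm("added", "increases"))
# Scratch the couch, get sprayed with water, scratch less.
print(classify_paradigm("added", "decreases"))
# Take ibuprofen, headache is reduced, take it again next time.
print(classify_paradigm("removed", "increases"))
# Stay out too late, lose your car, stay out late less often.
print(classify_paradigm("removed", "decreases"))
```

Note that "positive" and "negative" only describe whether something is added or removed; whether the outcome is pleasant is captured separately by reinforcement versus punishment.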




Operant conditioning

Reinforcement schedules

Reinforcement schedules (i.e. how often you get the outcome) can affect the rate at which the associations are learned.

When you get a reward after every behavior, this is a continuous reinforcement schedule.

In a partial reinforcement schedule, the outcome follows less than 100% of the time. For example, in a fixed-ratio schedule, the reward comes on a regular basis.

A powerful form of partial reinforcement is the variable-ratio schedule: the exact timing can't be predicted.
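
The difference between these schedules is easiest to see by listing which responses earn a reward. The sketch below is a rough illustration under stated assumptions: the function names, the ratio of 5, and the choice to draw the variable-ratio requirement uniformly between 1 and `2 * mean_ratio - 1` are all made up for the example and are not from the lecture.

```python
import random


def continuous(n_responses):
    """Continuous reinforcement: every response is rewarded."""
    return [True] * n_responses


def fixed_ratio(n_responses, ratio):
    """Fixed ratio: every `ratio`-th response is rewarded."""
    return [(i + 1) % ratio == 0 for i in range(n_responses)]


def variable_ratio(n_responses, mean_ratio, seed=0):
    """Variable ratio: reward after an unpredictable number of responses.

    The required count is redrawn after each reward, uniformly from
    1 to 2 * mean_ratio - 1 (an assumed distribution), so it averages
    mean_ratio but cannot be predicted on any given run.
    """
    rng = random.Random(seed)
    rewards = []
    needed = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    for _ in range(n_responses):
        count += 1
        if count == needed:
            rewards.append(True)
            count = 0
            needed = rng.randint(1, 2 * mean_ratio - 1)
        else:
            rewards.append(False)
    return rewards


print(continuous(10))
print(fixed_ratio(10, 5))
print(variable_ratio(10, 5))
```

In the fixed-ratio output the rewards fall at regular positions, while in the variable-ratio output the gaps between rewards change from one reward to the next.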

More examples with people?

Progress check
Sheldon gave Penny chocolate each time she did something to please him. What kind of paradigm is this?
A. Positive reinforcement
B. Negative reinforcement
C. Positive punishment
D. Negative punishment

Progress check
Sheldon sprayed water on Leonard when he disagreed. What kind of paradigm is this?
A. Positive reinforcement
B. Negative reinforcement
C. Positive punishment
D. Negative punishment

Problems with punishers?

Reinforcers and punishers can be equally effective at producing behavior in laboratory conditions; however, punishers can run into problems in the real world.

1. If you punish a behavior, you may encourage cheating/circumvention. ("Don't speed" becomes "Don't get caught speeding.")
2. Concurrent reinforcement may undermine the punishment. (Student punished for talking in class may be reinforced with approval by other students.)
3. Punishment can lead to more variable behavior. (If a specific behavior is decreased, what replaces it?)
4. The initial punisher needs to be fairly intense (otherwise you may get habituation).
5. Punishment can lead to stress and anxiety, which is associated with other undesirable behaviors.



Complex behaviors

How do animals get trained to do complex (and sometimes stupid)

You can't simply reinforce a complex behavior as it may not be done accidentally.

Use chaining to create a series of reinforced behaviors.

S (See platform)
R (Stand on platform)
O (Food reward)

S (See platform)
R (Stand on platform); S (See handles)
R (Place paws on
O (Food reward)

Operant vs. classical conditioning

Classical conditioning:
Passive: environment works on animal.
Stimulus evokes a response.
Animal learns that the CS predicts the US.
Typically simple

Operant conditioning:
Active: animal operates on environment.
A behavioral response produces an outcome.
Animal learns that behavior predicts an outcome.
More flexible and powerful, producing more complexity.

However, the two often work together (e.g. primary and secondary reinforcers can become associated classically).



Models for operant conditioning?

Operant conditioning likely involves the interaction
of several neural systems.


Operant conditioning began with the Law of effect, stating that animals make associations between voluntary behaviors and contingent outcomes.

Reinforcers make a behavior more likely; punishers make a behavior less likely. Both can be due to intrinsic preferences (primary) or learned associations with intrinsic preferences (secondary).

When you add something to the outcome (give a treat or shock), that is positive. When you take away something (pain or freedom), that is negative.

Punishers may not always be as effective as reinforcers in the real world, but are equally effective in the lab.

Classical and operant conditioning can be distinguished by what is being associated and whether the process is more active or passive.

For next time

CogLab # 48 due Tuesday (1/20/15) before midnight.

Quiz 2 due Wednesday (1/21/15) before midnight.