1/15/2016

Reminders

CogLab #48 (Statistical Learning) is due next Wednesday (1/20/16) before lecture.

CogLab #40 (Categorical Perception: Identification) is due next Friday (1/22/16) before lecture.

To see which CogLab assignments you have completed: within CogLab, go to 'Home > Access Account'. Under the "Lab Information" tab there will be a drop-down menu showing which labs have been turned in.

Quiz 2 (released over the weekend) is due Wednesday (1/20/16) before lecture.

Don't fall behind or underestimate the material.

Today's schedule

Reality check
Associative learning: Operant conditioning
Brain model of classical conditioning
Operant conditioning overview
Factors influencing conditioning
Examples of paradigms

Reality check

So far, the pace of the lectures has been:
A. Way too fast
B. A little fast
C. Just about right
D. A little slow
E. Way too slow

Last time: Classical conditioning

In aversive conditioning, you learn to avoid or minimize the effect of an expected unpleasant event.

A common paradigm for this is eyeblink conditioning. The eyeblink reflex is a UR to a puff of air (US).

If you pair the airpuff with a tone (CS), the rabbit (or human) will learn that the tone predicts the airpuff (US).
Conditioned Stimulus: the tone
Unconditioned Stimulus: the airpuff

After training, the tone (CS) will cause an eyeblink (CR).
The CR will start to occur before the US.

Interestingly, the CS needs to precede the US by about 500 ms or less for the association to be made reliably.

Prediction of the US

Both extinction and blocking suggest that the CS produces a CR as a means of preparing for the US.

The US (airpuff) always produces the UR (eyeblink). If you suspect the US is coming (cued by the CS), then you will blink sooner (CR).

The CS allows the body to prepare for (or adapt to) the US.

Notes: The CS produces a CR as a way of preparing for the US. In terms of the rabbit, once it hears the tone (the CS) it knows that the airpuff (US) is coming, and so it blinks early; the CR occurs BEFORE the US. The CS needs to occur shortly before the US for the association to be made.


Models of classical conditioning

The earliest models assumed that the CS takes the place of the US. However, this substitution model was too simplistic.

Problems with this:
The CS generally never attains the same strength as the US.
The CR doesn't always match the UR.

Obal (1966) showed that administering the drug dinitrophenol (US) caused an increase in oxygen consumption and increased temperature (UR).

The body wants to maintain homeostasis, so it will counteract the effects of the drug.

The syringe is a cue for the drug (CS), so when a placebo is administered, the CR is a decrease in oxygen consumption and a decrease in body temperature.

Here, the CR is the opposite of the UR and is adaptive: the body is attempting to prepare for (adapt to) what it knows is coming.

Compensatory response model

The compensatory response model uses this adaptive role to explain drug tolerance.

Needles, smells, and settings act as the CS during drug use. They predict the arrival of the drug, so the body prepares to counteract the effects (CR).

It had been observed that heroin overdoses often occur in unusual settings (Gutierrez-Cebollada et al., 1994).
More than 50% of heroin addicts admitted to the hospital for overdoses had taken their last dose in an unusual location (no CS).
Other addicts admitted for reasons unrelated to the drug took their last dose in the usual location.
The setting (CS) had increased tolerance (CR) to the heroin (US).

It has been suggested that overcoming drug addictions might involve extinguishing the CR to drug paraphernalia (CS), e.g. the CR to the syringe.

Notes:
You give a person a drug, and the natural response is an increase in oxygen consumption and body temperature (UR).
The body wants to maintain homeostasis (to keep internal temperature at a constant level), so once the drug is administered the body counteracts its effect: if the drug increases temperature, the body will try to decrease it.
The syringe (CS) is a cue for the drug (US): it is a warning that the drug is coming, so the body adapts and lowers body temperature (CR) instead of letting it increase as it would in response to the drug alone (UR).
Thus, the CR is the opposite of the UR and is ADAPTIVE.
Studies show that heroin overdoses usually happen in a different environment than where the drug is normally taken, even though the dose is about the same as usual. This means the body cannot take the precautions it normally would to maintain homeostasis: in the new environment the cue is missing, so the body cannot pre-plan.
It all has to do with expectation: how effective the cue is determines how the body will respond.

Biological model for classical conditioning

Thompson (1986) discovered that damage in the cerebellum could permanently prevent new classically conditioned responses and eliminate previously conditioned responses.
This was based on patients with cerebellar lesions (and confirmed in studies with rabbits).
Single-cell recordings from the cerebellum have helped establish how classical conditioning is developed and retained.

Summary

Classical conditioning is an example of associative learning. It can be appetitive (i.e. satisfying a desire) or aversive (i.e. avoiding an unpleasant event).

Through paired association, the conditioned stimulus becomes predictive of the unconditioned stimulus.

After training, if the CS is no longer paired with the US, then the CR will decline in frequency (extinction).

If an already paired CS and US are joined by a second CS, the new CS will not cause the CR. This is because it has no additional predictive value (blocking).

The compensatory response model illustrates how classical conditioning is an adaptive response, helping the body maintain homeostasis. It suggests that sometimes the CR will be the opposite of the UR.

We have good evidence that the cerebellum is largely responsible for establishing and maintaining classical conditioning.

Operant conditioning

Operant conditioning (also known as instrumental conditioning and trial-and-error learning) is associating a voluntary behavior (an operation on the environment) with an outcome.

Law of effect: Animals learn that a behavior (or class of similar behaviors) predicts a particular outcome (Thorndike, 1911).
Behaviors with good outcomes increase; behaviors with bad outcomes decrease.

A cat opens the puzzle box and is reinforced with a food reward. That escape behavior becomes more likely (and faster) in the future. [Discrete-trial paradigm]

Operant conditioning

B. F. Skinner refined the method to allow the animal to respond repeatedly. This free-operant paradigm allowed the animal to control the rate of responding.
Behaviors could be automatically recorded in a Skinner box, counting the number of behaviors and outcomes.


Operant conditioning

Basic elements of the paradigm:
You need a discriminative stimulus (S) that helps you select the appropriate behavior (e.g. the rat sees the lever).
A behavioral response (R), or class of similar responses, is performed in response to the stimulus (e.g. the rat pushes the lever with either paw).
An outcome (O) follows that either reinforces or punishes the behavior (e.g. the rat gets food, a good outcome).
S -> R -> O
Through repeated trials, the animal learns that the outcome is contingent upon the appropriate response.

Operant conditioning outcomes

Outcomes that increase the behavior are reinforcers.
Primary reinforcers meet some innate need (e.g. food, water, sleep, and sex). Approval is likely primary for social animals. Note that these are not always reinforcing (i.e. you won't work for water if already satiated); reinforcement is contextual, depending on the state of the organism (an animal that isn't hungry won't work more for food).
Secondary reinforcers have no intrinsic value, but predict or are associated with primary reinforcers (e.g. money, good grades, gold stars, etc.).

Outcomes that decrease the behavior are punishers.
Primary punishers: pain (shock), nausea, loud noises, social disapproval, loss of freedom; anything the organism would naturally try to avoid.
Secondary punishers: monetary fines, demerits, bad grades, etc.

Operant conditioning outcomes

If an outcome/consequence is added, this is positive (+) conditioning.
If an outcome/consequence is removed, this is negative (-) conditioning.
Note: this has nothing to do with good or bad.

Progress check

You are about to press a button on your iClicker. When you see that you got the correct answer to the question, that acts as a ______________.
A. Primary reinforcer
B. Secondary reinforcer
C. Primary punisher
D. Secondary punisher
E. You just blew my mind.

Operant conditioning paradigms

Examples of the four basic conditioning paradigms for learning operant associations (see the sketch below):
Positive reinforcement: Eat all your vegetables -> get some dessert.
Positive punishment: Scratch the couch -> get sprayed with water; tease your sibling -> parental scolding.
Negative reinforcement: Shut off the alarm clock -> removal of an aversive stimulus; take ibuprofen -> reduce a headache.
Negative punishment: Commit armed robbery -> loss of freedom; stay out too late -> lose your car.

Operant conditioning

The timing and context are critical for forming the association.
If the outcome is delayed, the association is not learned as well.
Example: a food reward for pressing a lever loses effectiveness rapidly with time.
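Not from the slides: a minimal Python sketch of the 2 x 2 classification in the paradigm examples above, assuming each scenario can be described by whether a stimulus is added or removed and whether the target behavior then becomes more or less likely. The function name and example tuples are hypothetical illustrations, not course material.

```python
# Toy illustration (not from the slides): classify an operant outcome by the
# 2 x 2 scheme above. "positive"/"negative" = stimulus added vs. removed;
# "reinforcement"/"punishment" = behavior becomes more vs. less likely.

def classify_paradigm(stimulus_added: bool, behavior_increases: bool) -> str:
    sign = "positive" if stimulus_added else "negative"
    effect = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {effect}"

# Hypothetical examples mirroring the slide:
examples = [
    ("eat vegetables -> get dessert",       True,  True),   # positive reinforcement
    ("scratch couch -> sprayed with water", True,  False),  # positive punishment
    ("shut off alarm -> noise removed",     False, True),   # negative reinforcement
    ("armed robbery -> lose freedom",       False, False),  # negative punishment
]

for description, added, increases in examples:
    print(f"{description}: {classify_paradigm(added, increases)}")
```

The point of the sketch is simply that the positive/negative axis tracks what happens to the stimulus, while the reinforcement/punishment axis tracks what happens to the behavior.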


Operant conditioning

Reinforcement schedules

Reinforcement schedules (i.e. how often you get the outcome) can affect the rate at which the associations are learned.

When you get a reward after every behavior, this is a continuous reinforcement schedule.

In a partial reinforcement schedule, the outcome follows less than 100% of the time. For example, in a fixed-ratio schedule, the reward comes on a regular basis.

A powerful form of partial reinforcement is the variable-ratio schedule: the exact timing can't be predicted. (A simulation sketch follows below.)
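Also not from the slides: a toy Python simulation contrasting the three schedules just described, assuming a schedule is simply a rule that decides whether a given response earns the outcome. The ratio of 5 and the probability of 0.2 are arbitrary values chosen for illustration.

```python
# Toy sketch of the reinforcement schedules from the slide above.
# A "schedule" here is just a function deciding whether response number n is rewarded.
import random

def continuous(n: int) -> bool:
    return True                      # every response is rewarded

def fixed_ratio(n: int, ratio: int = 5) -> bool:
    return n % ratio == 0            # reward arrives on a regular basis (every 5th response)

def variable_ratio(n: int, p: float = 0.2) -> bool:
    return random.random() < p       # reward timing can't be predicted (about 1 in 5 on average)

def count_rewards(schedule, n_responses: int = 100) -> int:
    return sum(schedule(n) for n in range(1, n_responses + 1))

random.seed(0)  # reproducible toy run
for name, schedule in [("continuous", continuous),
                       ("fixed-ratio 5", fixed_ratio),
                       ("variable-ratio ~5", variable_ratio)]:
    print(f"{name}: {count_rewards(schedule)} rewards per 100 responses")
```

On a run like this, the two partial schedules deliver roughly one reward per five responses; they differ only in whether the animal can predict which response will be rewarded.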

More examples with people?


Progress check

Sheldon gave Penny chocolate each time she did something to please him. What kind of paradigm is this?
A. Positive reinforcement
B. Negative reinforcement
C. Positive punishment
D. Negative punishment

More examples with people?

Progress check

Sheldon sprayed water on Leonard when he disagreed. What kind of paradigm is this?
A. Positive reinforcement
B. Negative reinforcement
C. Positive punishment
D. Negative punishment

Problems with punishers?

Reinforcers and punishers can be equally effective at producing behavior in laboratory conditions; however, punishment can run into problems in the real world.

1. If you punish a behavior, you may encourage cheating/circumvention. ("Don't speed" becomes "Don't get caught speeding.")
2. Concurrent reinforcement may undermine the punishment. (A student punished for talking in class may be reinforced with approval by other students.)
3. Punishment can lead to more variable behavior. (If a specific behavior is decreased, what replaces it?)
4. The initial intensity of the punisher needs to be fairly high (otherwise you may get habituation).
5. Punishment can lead to stress and anxiety, which is associated with other undesirable behaviors.


Complex behaviors

How do animals get trained to do complex (and sometimes stupid) things?

You can't simply reinforce a complex behavior, as it may never be performed accidentally.

Use chaining to create a series of reinforced behaviors (a sketch follows after the comparison below):
S (See platform) -> R (Stand on platform) -> O (Food reward)
S (See platform) -> R (Stand on platform); S (See handles) -> R (Place paws on handles) -> O (Food reward)

Operant vs. classical conditioning

Classical conditioning is passive: the environment works on the animal. Operant conditioning is active: the animal operates on the environment.
Classical: a stimulus (US) evokes a response. Operant: a behavioral response produces an outcome.
Classical: the animal learns that the CS predicts the US. Operant: the animal learns that a behavior predicts an outcome.
Classical: typically simple associations. Operant: more flexible and powerful, producing more complexity.

However, the two often work together (e.g. primary and secondary reinforcers can become associated classically).
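Not from the slides: a small Python sketch of the chaining idea from the Complex behaviors slide above, treating a chain as an ordered list of (discriminative stimulus, response) links with the primary reinforcer delivered only when the whole chain is completed. The link descriptions are taken from the slide's dolphin-style example; the function name is hypothetical.

```python
# Toy sketch of chaining: each link is (discriminative stimulus, response),
# and only completing the whole chain delivers the primary reinforcer.

chain = [
    ("see platform", "stand on platform"),
    ("see handles",  "place paws on handles"),
]

def run_chain(links, reinforcer="food reward"):
    for stimulus, response in links:
        print(f"S: {stimulus} -> R: {response}")
    print(f"O: {reinforcer}")

# Training builds the chain one reinforced link at a time:
run_chain(chain[:1])   # first, S (platform) -> R (stand) -> O (food)
run_chain(chain)       # later, the full chain is required before the food arrives
```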

Models for operant conditioning?

Operant conditioning likely involves the interaction of several neural systems.

Summary

Operant conditioning began with the Law of effect, stating that animals make associations between voluntary behaviors and contingent outcomes.

Reinforcers make a behavior more likely; punishers make a behavior less likely. Both can be due to intrinsic preferences (primary) or learned associations with intrinsic preferences (secondary).

When you add something to the outcome (give a treat or a shock), that is positive. When you take away something (pain or freedom), that is negative.

Punishers may not always be as effective as reinforcers in the real world, but are equally effective in the lab.

Classical and operant conditioning can be distinguished by what is being associated and whether the process is more active or passive.

For next time

CogLab #48 due Tuesday (1/20/16) before midnight.
Quiz 2 due Wednesday (1/21/16) before midnight.