
Dealing with Uncertainty

Reasoning Under Uncertainty


Monotonic Reasoning

• A reasoning process that moves in one direction only.
• The number of facts in the knowledge base is always increasing.
• The conclusions derived are valid deductions, and they remain so.

A reasoning process applied to practical, everyday problems must
recognize uncertainty:
• Available information is frequently incomplete.
• Conditions change over time.
• There is frequently a need to make an efficient but possibly
incorrect guess when reasoning reaches a dead end.
Reasoning Under Uncertainty
Non-monotonic Reasoning

Non-monotonic reasoning (NMR) is based on augmenting
absolute truth with beliefs.

These tentative beliefs are generally based on default
assumptions that are made in the absence of evidence.

An NMR system tracks a set of tentative beliefs and revises
those beliefs when new knowledge is observed or derived.
Reasoning Under Uncertainty

• Uncertainty may lead to bad treatment in medicine, or to the loss
of money in business.

• Classic examples of successful expert systems which
deal with uncertainty are MYCIN for medical diagnosis
and PROSPECTOR for mineral exploration.

• In the case of medicine, delaying treatment for more tests
(for more exact knowledge) may add considerable cost;
the patient may die.
Reasoning Under Uncertainty
Many different types of errors can contribute to
uncertainty.

1. data might be missing or unavailable


2. data might be ambiguous or unreliable due to
measurement errors
3. the representation of data may be imprecise or
inconsistent
4. data may just be user's best guess (random)
5. data may be based on defaults, and defaults may
have exceptions
Reasoning Under Uncertainty
Given these sources of errors, most knowledge base
systems incorporate some form of uncertainty
management.

There are three issues to be considered:


1. How to represent uncertain data.
2. How to combine two or more pieces of uncertain data.
3. How to draw inferences using uncertain data.
Reasoning Under Uncertainty
Errors and Induction

Deduction goes from the general to the specific:

All men are mortal
Socrates is a man
therefore Socrates is mortal

Induction tries to generalize from the specific:

My disk has never crashed
therefore my disk will never crash   (inductive)
Reasoning Under Uncertainty
Inductive arguments can never be proven correct (except mathematical
induction). Inductive arguments can provide some degree of
confidence that the conclusion is correct.

Deductive errors or fallacies may also occur


If p implies q
q is true
therefore p

Example
If the valve is in good condition then the output is normal
The output is normal
Therefore, the valve is in good condition

Uncertainty is a major problem in knowledge elicitation, especially when
the expert's knowledge must be quantified as rules.
Approaches in Dealing with Uncertainty

Quantitative (numerically oriented) approaches:
• Bayes' Rule
• Certainty Factors
• Dempster-Shafer theory
• Fuzzy Sets

Symbolic approaches:
• Non-monotonic reasoning
• Cohen's Theory of Endorsements
• Fox's semantic systems
Classical Probability

• This is also called a priori probability. It is


assumed that all possible events are known and
that each event is equally likely to happen
(rolling a die).
• Prior or unconditional probability is the one
before the evidence is received.
• Posterior or conditional probability is the one
after the evidence is received.
Theory of Probability
A formal theory of probability can be built on three axioms:

Axiom 1:  0 ≤ P(E) ≤ 1

Axiom 2:  Σi P(Ei) = 1, where the Ei, 1 ≤ i ≤ n, are mutually
exclusive (and together cover all possible outcomes);
in particular, P(E) + P(E') = 1

Axiom 3:  P(E1 ∪ E2) = P(E1) + P(E2) − P(E1 ∩ E2)
Theory of Probability
Experimental or Subjective Probabilities
• In contrast to the a priori approach, experimental probability
defines the probability of an event, P(E), as the limit of a
frequency distribution:

P(E) = lim (N → ∞) f(E)/N

This type of probability is known as a posteriori probability.

A subjective probability is a belief or opinion expressed as a
probability, rather than a probability based on axioms or
empirical measurements. It is applied to decisions about
non-repeatable events.
Theory of Probability
Compound Probabilities

• What is the probability of rolling a die with an outcome that is
both even and divisible by 3?

Event A = {2, 4, 6}
Event B = {3, 6}
A ∩ B = {6}

P(A ∩ B) = n(A ∩ B) / n(S) = 1/6

P(A ∩ B) = P(A) P(B)     (here 1/6 = 1/2 × 1/3)
Theory of Probability
The two events are called stochastically independent
events if and only if the above formula is true.

"Stochastic" is a Greek word meaning "guess". It is
commonly used as a synonym for probability, randomness or
chance.

The probability of rolling a die with an outcome that is even
or divisible by 3:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
         = 3/6 + 2/6 − 1/6 = 4/6
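A small sketch (mine, not from the slides) that checks these die probabilities by simple enumeration:

```python
from fractions import Fraction

# One roll of a fair die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even outcomes
B = {3, 6}      # outcomes divisible by 3

def prob(event):
    """Classical (a priori) probability: favourable outcomes over total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

p_and = prob(A & B)                          # P(A ∩ B) = 1/6
p_or = prob(A) + prob(B) - prob(A & B)       # P(A ∪ B) = 2/3
independent = p_and == prob(A) * prob(B)     # True: 1/6 == 1/2 × 1/3

print(p_and, p_or, independent)              # 1/6 2/3 True
```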
Theory of Probability

Conditional Probabilities
• The probability of an event A, given that event B
occurred, is called a conditional probability and
indicated by P(A|B).

P( A  B)
P( A | B)  forP( B)  0
P( B)
P( A | B) P( B)  P( A  B)
Bayes' Theorem

Bayes' theorem, in terms of evidence E and hypotheses Hi:

P(Hi | E) = P(E ∩ Hi) / Σj P(E ∩ Hj)
          = P(E | Hi) P(Hi) / Σj P(E | Hj) P(Hj)
          = P(E | Hi) P(Hi) / P(E)
Bayes' Theorem

The conditional probability, P(A|B), states the probability of
event A given that event B occurred. The inverse problem
is to find the inverse probability, which states the probability
of an earlier event given that a later one occurred.

Example: the probability of having chosen brand X, given that
the drive has crashed.

This is an inverse, or posterior, probability.
Example
The table below shows hypothetical probabilities of disk crashes for
brand X (X) and non-brand X (X') drives within one year:

               X      X'     Row total
Crash C       0.6    0.1     0.7
No crash C'   0.2    0.1     0.3
Column total  0.8    0.2     1.0

P(C|X) = ?
P(C|X)  = P(C ∩ X)  / P(X)  = 0.6 / 0.8 = 0.75
P(C|X') = P(C ∩ X') / P(X') = 0.1 / 0.2 = 0.50

P(X|C) = ?
P(X|C) = P(C ∩ X) / P(C) = 0.6 / 0.7 = 6/7
P(X|C) = P(C|X) P(X) / P(C) = 0.75 × 0.8 / 0.7 = 0.6 / 0.7
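As an illustration (the representation of the table as a Python dict is my own choice), the conditional and posterior probabilities above can be computed directly from the joint table:

```python
# Joint probabilities from the table above, keyed by (outcome, brand).
joint = {
    ("crash", "X"): 0.6, ("crash", "X'"): 0.1,
    ("no crash", "X"): 0.2, ("no crash", "X'"): 0.1,
}

def marginal(outcome=None, brand=None):
    """Sum the joint probabilities consistent with any fixed values."""
    return sum(p for (o, b), p in joint.items()
               if (outcome is None or o == outcome) and (brand is None or b == brand))

p_c_given_x = joint[("crash", "X")] / marginal(brand="X")        # 0.6 / 0.8 = 0.75
p_x_given_c = joint[("crash", "X")] / marginal(outcome="crash")  # 0.6 / 0.7 ≈ 0.857

# The same posterior via Bayes' theorem: P(X|C) = P(C|X) P(X) / P(C)
p_x_given_c_bayes = p_c_given_x * marginal(brand="X") / marginal(outcome="crash")

print(p_c_given_x, round(p_x_given_c, 3), round(p_x_given_c_bayes, 3))
```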
Example
Suppose statistics show that a brand X drive crashes within one year with
probability 75%, and a non-brand X drive with probability 50%. The inverse
question is: what is the probability of a crashed drive being brand X or
non-brand X?
Hypothetical Reasoning and Backward
Induction

• Bayesian decision making is used in


PROSPECTOR to decide favorable sites for
mineral exploration.
• Generally conditional probability is forward in
time, while a posterior probability is backward in
time.
• Example of Bayesian decision making under
uncertainty.
Oil exploration

• If there is no evidence for or against, we may guess that
P(O) = P(O') = 0.5

• Or we may believe that the chances of finding oil are better:
P(O) = 0.6, P(O') = 0.4

• Assume the probabilities for the outcomes of a seismic test
for oil exploration are:
P(+|O) = 0.8,  P(−|O) = 0.2  (false negative)
P(+|O') = 0.1 (false positive),  P(−|O') = 0.9

Using the above conditional (prior) probabilities we can
construct the initial probability tree.
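As a sketch of the Bayesian update the tree represents (values taken from the bullets above, function name mine), the posteriors after a positive or negative seismic test are:

```python
# Priors and test characteristics from the oil-exploration example above.
p_oil, p_no_oil = 0.6, 0.4
p_pos_given_oil, p_neg_given_oil = 0.8, 0.2        # false negative rate 0.2
p_pos_given_no_oil, p_neg_given_no_oil = 0.1, 0.9  # false positive rate 0.1

def posterior(p_e_given_h, p_h, p_e_given_not_h, p_not_h):
    """Bayes' theorem for a two-hypothesis problem: P(H|E)."""
    p_e = p_e_given_h * p_h + p_e_given_not_h * p_not_h
    return p_e_given_h * p_h / p_e

print(round(posterior(p_pos_given_oil, p_oil, p_pos_given_no_oil, p_no_oil), 3))  # 0.923
print(round(posterior(p_neg_given_oil, p_oil, p_neg_given_no_oil, p_no_oil), 3))  # 0.25
```

A positive test raises the belief in oil from 0.6 to about 0.92, while a negative test lowers it to 0.25.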
Advantages and Disadvantages of
Bayesian Methods
• Bayesian methods have support of probability theory and
have well defined semantics for decision making.

Disadvantages are
• They require significant amount of probability data to
construct a knowledge base.
• If the probabilities are statistical, sample size must be
sufficient. If they are provided by an expert then their
comprehensiveness and consistency must be queried.
• Reducing associations between the hypotheses and the
evidences to simple numerical values removes relevant
information necessary for reasoning (explanation of how a
conclusion is reached).
Reasoning with Certainty Factors
During the development of MYCIN, researchers developed the certainty
factor formalism for the following reasons:

• The medical domain lacks the large quantities of data and/or the numerous
approximations required by Bayes' theorem.

• There is a need to represent medical knowledge and heuristics
explicitly, which cannot be done when using probabilities alone.

• Physicians reason by capturing evidence that supports or denies a
particular hypothesis.
Certainty Factor (CF) Formalism
Example MYCIN rule:
IF the stain of the organism is gram positive
AND the morphology of the organism is coccus
AND the growth of the organism is chains
THEN there is evidence that the organism is streptococcus (CF 0.7)

Given the evidence, a doctor only partially believes the
conclusion.

• General form
IF E1 AND E2 … THEN H, CF = CFi
where the Ei are evidence and H is the conclusion
Certainty Factor (CF) Formalism
• A measure of belief, MB(h, e) indicates the degree to
which our belief in hypothesis, h, is increased based
on the presence of evidence, e
• A measure of disbelief, MD(h, e), indicates the
degree to which our disbelief in hypothesis, h, is
increased based on the presence of evidence, e.

When P(h | e) = 0:  MB(h, e) = 0,  MD(h, e) = 1
When P(h | e) = 1:  MB(h, e) = 1,  MD(h, e) = 0
Certainty Factor (CF) Formalism

CF interpretation
Certainty Factor (CF) Formalism
• CF(h | e) = MB(h, e) − MD(h, e)
  −1 ≤ CF ≤ 1

• When there is total belief in the hypothesis
  – CF = 1
• When there is total disbelief in the hypothesis
  – CF = −1
• When there is no evidence on which to make a judgment
  – CF = 0

The CF scale runs from −1 (false) through 0 to +1 (true): negative
values form the range of disbelief, positive values the range of belief.
Certainty Factor (CF) Formalism

• Composite CF can be calculated as follows:


CFcomp(h, e) = MBcomp(h, e) - MDcomp(h, e) 

• For premises P1 and P2 of a rule,

CF(P1 AND P2) = MIN(CF(P1), CF(P2))
CF(P1 OR P2)  = MAX(CF(P1), CF(P2))
Certainty Factor (CF) Formalism

For example, consider a rule in a knowledge base:

• IF (P1 AND P2) OR P3 THEN R1 (CF 0.7) and R2 (CF 0.3)

• If the CFs for P1, P2, and P3 are 0.6, 0.4, and 0.2,
respectively, then R1 and R2 may be concluded with
CFs 0.28 and 0.12, respectively:

CF(P1(0.6) AND P2(0.4)) = MIN(0.6, 0.4) = 0.4
CF((P1 AND P2)(0.4) OR P3(0.2)) = MAX(0.4, 0.2) = 0.4
CF(R1) = 0.7 × 0.4 = 0.28
CF(R2) = 0.3 × 0.4 = 0.12
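A minimal sketch (function names are my own) of the MIN/MAX premise combination and the rule attenuation used in this example; the same helpers are reused in the sketches further below:

```python
def cf_and(*cfs):
    """CF of a conjunction of premises (MIN)."""
    return min(cfs)

def cf_or(*cfs):
    """CF of a disjunction of premises (MAX)."""
    return max(cfs)

def conclude(rule_cf, premise_cf):
    """CF of a conclusion: the rule's CF attenuated by the combined premise CF."""
    return rule_cf * premise_cf

p1, p2, p3 = 0.6, 0.4, 0.2
premise = cf_or(cf_and(p1, p2), p3)      # max(min(0.6, 0.4), 0.2) = 0.4
print(round(conclude(0.7, premise), 2))  # R1: 0.28
print(round(conclude(0.3, premise), 2))  # R2: 0.12
```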
Certainty Factor (CF) Formalism

Two properties that are required of the combination


operation are:
Commutative – The value should not depend on the
order in which the rules are taken.
Asymptotic – The more evidence we have for the
belief in a conclusion the higher should be the
certainty factor, but if it is not absolutely certain,
then it should remain below 1.
Certainty Factor (CF) Formalism
Propagation of Certainty Factors

When there are two or more rules supporting the same
conclusion, CFs are propagated as follows:

CFrevised = CFold + CFnew(1 − CFold)                      if both CFold and CFnew > 0
          = CFold + CFnew(1 + CFold)                      if both CFold and CFnew < 0
          = (CFold + CFnew) / (1 − min(|CFold|, |CFnew|)) otherwise
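A minimal sketch of this propagation rule (function name is mine); the printed values are the ones that appear in the murder-trial example that follows:

```python
def cf_combine(cf_old, cf_new):
    """Combine the CFs of two rules that bear on the same conclusion."""
    if cf_old > 0 and cf_new > 0:
        return cf_old + cf_new * (1 - cf_old)
    if cf_old < 0 and cf_new < 0:
        return cf_old + cf_new * (1 + cf_old)
    # Mixed signs (or a zero CF): damp by the smaller magnitude.
    return (cf_old + cf_new) / (1 - min(abs(cf_old), abs(cf_new)))

print(round(cf_combine(0.0, 0.675), 4))       # 0.675
print(round(cf_combine(0.675, 0.30), 4))      # 0.7725
print(round(cf_combine(0.7725, -0.76), 3))    # 0.052
```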
Certainty Factor Example
In a murder trial the defendant is accused of first-degree
murder (the hypothesis). The jury must balance the evidence presented
by the prosecutor and the defense attorney to decide whether the suspect is
guilty.

RULE001 IF the defendant's fingerprints are on the weapon,
        THEN the defendant is guilty. CF = 0.75
RULE002 IF the defendant has a motive,
        THEN the defendant is guilty. CF = 0.60
RULE003 IF the defendant has an alibi,
        THEN he is not guilty. CF = -0.80
Certainty Factor Example

We start with CF = 0.0 for the defendant being
guilty.
• After submission of evidence 1 (fingerprints on the weapon,
asserted with CF = 0.90):
CFcon1 = CFrule1 × CFevid1
       = 0.75 × 0.90 = 0.675
CFrevised = CFold + CFnew (1 − CFold)
          = 0.0 + 0.675 (1 − 0.0) = 0.675
Example of CFs Propagation

[Propagation diagram: the evidence "fingerprints on weapon" (CFevid1 = 0.90)
feeds RULE 1 (CFrule1 = 0.75); CFcon1 = CFnew = 0.675 is combined with
CFold = 0.0 to give CFrevised = 0.0 + 0.675 × (1 − 0.0) = 0.675.]

RULE 1. IF the defendant's fingerprints are on the weapon
        THEN the defendant is guilty

CFcon1 = CFevid1 × CFrule1   (single-premise rule)
       = 0.9 × 0.75 = 0.675
Certainty Factor Example

The defendant's mother-in-law says that he had a motive
for the slaying; this evidence is asserted with CF = 0.50.
CFnew = CFcon2 = CFrule2 × CFevid2
      = 0.60 × 0.50 = 0.30

CFrevised = CFold + CFnew (1 − CFold)
          = 0.675 + 0.30 (1 − 0.675) = 0.7725
[Propagation diagram: the evidence "motive exists" (CFevid2 = 0.50)
feeds RULE 2 (CFrule2 = 0.60); CFcon2 = CFnew = 0.30 is combined with
CFold = 0.675 to give CFrevised = 0.675 + 0.30 × (1 − 0.675) = 0.7725.]

RULE 2. IF the defendant has a motive
        THEN the defendant is guilty of the crime

CFcon2 = CFevid2 × CFrule2   (single-premise rule)
       = 0.50 × 0.60 = 0.30
Certainty Factor Example
A respected judge testifies to the alibi, so a CF of 0.95 is
assigned to this evidence.

CFcon3 = CFevid3 × CFrule3
       = 0.95 × (−0.80) = −0.76

CFrevised = (CFold + CFnew) / (1 − min(|CFold|, |CFnew|))
          = (0.7725 − 0.76) / (1 − 0.76) = 0.052
[Propagation diagram: the evidence "alibi found" (CFevid3 = 0.95)
feeds RULE 3 (CFrule3 = −0.80); CFcon3 = CFnew = −0.76 is combined with
CFold = 0.7725 using
CFrevised = (CFold + CFnew) / (1 − min(|CFold|, |CFnew|))
          = (0.7725 − 0.76) / (1 − 0.76) = 0.052.]

RULE 3. IF the defendant has an alibi
        THEN he is not guilty

CFcon3 = CFevid3 × CFrule3
       = 0.95 × (−0.80) = −0.76
Certainty Factor Example

The certainty factor for the guilty verdict after the introduction of all
the evidence is 0.052, i.e. very close to unknown (CF = 0).
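Running the three pieces of evidence through the propagation sketch from earlier (this continues that sketch and reuses its cf_combine and conclude helpers):

```python
# (rule CF, evidence CF) for RULE001, RULE002 and RULE003 in order.
evidence = [
    (0.75, 0.90),   # fingerprints on the weapon
    (0.60, 0.50),   # motive
    (-0.80, 0.95),  # alibi
]

cf_guilty = 0.0
for rule_cf, evid_cf in evidence:
    cf_guilty = cf_combine(cf_guilty, conclude(rule_cf, evid_cf))
    print(round(cf_guilty, 4))   # 0.675, then 0.7725, then roughly 0.052
```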
Advantages of Certainty Factors
• It is a simple computational model that permits experts to
estimate their confidence in conclusions being drawn.
• It permits the expression of belief and disbelief in each
hypothesis, allowing the expression of the effect of
multiple sources of evidence.
• It allows knowledge to be captured in a rule
representation while allowing the quantification of
uncertainty.
• The gathering of the CF values is significantly easier
than the gathering of values for the other methods. No
statistical base is required – you merely have to ask the
expert for the values.
Difficulties
Deep Inference Chains

If we have a chain of inference such as:

IF A THEN B   CF = 0.8
IF B THEN C   CF = 0.9

then, because of the multiplication of CFs, the resulting CF
decreases.
For example, if CF(A) = 0.8, then
CF(B) = 0.8 × 0.8 = 0.64 and CF(C) = 0.8 × 0.8 × 0.9 = 0.576 ≈ 0.58
With a long chain of inferences the final CF may become very
small.
Difficulties

Many Rules with same Conclusion

The more rules with the same conclusion the


higher the CF value. If there are many rules then
CF can become artificially high.
Difficulties
Conjunctive Rules
If a rule has a number of conjunctive premises, the overall CF may be
reduced too much.

IF sky dark AND temperature dropping
THEN will rain   CF 0.9

If CF(sky dark) = 1 and
CF(temperature dropping) = 0.1, then
CF(will rain) = min(1, 0.1) × 0.9 = 0.09, whereas if we had

IF sky dark THEN will rain   CF 0.7
IF temperature dropping THEN will rain   CF 0.5

CF1 = 1 × 0.7 = 0.7,  CF2 = 0.1 × 0.5 = 0.05
CF(will rain) = 0.7 + 0.05 × (1 − 0.7) = 0.7 + 0.015 = 0.715
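A quick check of the two formulations, reusing the cf_and, conclude and cf_combine sketches from earlier:

```python
# One conjunctive rule with two AND-ed premises.
cf_conjunctive = conclude(0.9, cf_and(1.0, 0.1))                 # 0.09

# Two single-premise rules supporting the same conclusion.
cf_split = cf_combine(conclude(0.7, 1.0), conclude(0.5, 0.1))    # 0.715

print(round(cf_conjunctive, 3), round(cf_split, 3))
```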
Fuzzy Logic
In everyday speech we use vague or imprecise terms to
describe properties.

Fuzzy logic was developed by Zadeh to deal with these


imprecise values in a mathematical way.
Fuzzy Logic

• It allows us to deal with fuzzy rules such as:

IF the temperature is cold
THEN the motor speed stops

IF the speed is slow
THEN make the acceleration high
Fuzzy Sets
• In ordinary set theory, an element from the domain is
either in a set or not in a set.
• In fuzzy sets, a number in the range 0-1 is attached to an
element – the degree to which the element belongs to
the set.
• A value of 1 means the element is definitely in the set
• A value of 0 means the element is definitely not in the
set
• Other values are grades of membership.
• Formally, a fuzzy set A over a domain X is given by its membership
function, which has type
μA : X → [0, 1]
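A minimal sketch of such a membership function (the trapezoid shape and the breakpoints are my own illustration, not taken from the slides' curves):

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside [a, d], 1 on [b, c], linear in between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Hypothetical fuzzy set "small" over men's heights in feet.
def small(height):
    return trapezoid(height, -1.0, 0.0, 4.5, 5.5)   # fully "small" up to 4.5 ft

print(small(4.0), small(5.0), small(6.0))   # 1.0 0.5 0.0
```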
Fuzzy Sets

[Figures: fuzzy set of small men; a simpler piecewise-linear curve for small men]
Fuzzy Sets
• The following figure shows the representation of
three fuzzy sets for small, medium and tall men.
We see that a man of height 4.8 feet is
considered both small and medium to some
degree.
Boolean Operations
The Boolean operations of union, intersection, and complement can be
defined in the straightforward manner.

Complement
The complement of a fuzzy set A is given by
μA'(x) = 1 − μA(x)
Boolean Operations

Intersection
The intersection of two fuzzy sets A and B is given by
μA∩B(x) = min(μA(x), μB(x))

Union
The union of two fuzzy sets A and B is given by
μA∪B(x) = max(μA(x), μB(x))
Fuzzy Reasoning

In this section, fuzzy rules and how inference is


performed on these rules is presented.

This will be illustrated by a fuzzy system used to


control an air conditioner. The variables to be used
(with fuzzy values) are temperature (of the room)
and speed (of the fan motor).
Fuzzy Reasoning
The rules are given as follows:
• IF the temperature is cold THEN motor speed stops
• IF the temperature is cool THEN motor speed slows
• IF the temperature is just right THEN motor speed medium
• IF the temperature is warm THEN motor speed fast
• IF the temperature is hot THEN motor speed blast

[Figures: temperature fuzzy sets; speed fuzzy sets]
Fuzzy Reasoning

• In a fuzzy system all the rules fire in parallel,


although in the end many will not contribute to
the output.
• What we need to determine, in the above
system is, given a particular value of the
temperature how do we calculate the motor
speed.
Fuzzy Reasoning
• Now, the temperature can be measured fairly accurately,
but it will lie in several fuzzy sets. For example if the
temperature were 17C then from the figure we see that it
is about 25% cool and 80% just right.
Fuzzy Reasoning
• This means that rules 2 and 3 will contribute to the output
speed of the motor.

• The fuzzy set for the output can be calculated by multiplying


the slow graph by .25 and the medium graph by .80
assuming the contribution is proportional to the fuzzy values
of the input temperature
Fuzzy Reasoning
• One way to amalgamate two sets is to sum the values
(with a maximum of 1).

[Figure: amalgamated output sets and their average]
Fuzzy Reasoning
• Other ways of amalgamation (e.g. taking maximum) are
possible.

• Now we need to determine the actual speed of the motor.
This can be done by finding the average value of the
curve, i.e. the position where the areas on either side of
the perpendicular through this point are equal (the centroid).
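A rough, self-contained sketch of the whole air-conditioner inference (triangular membership functions with made-up breakpoints, so the firing strengths do not reproduce the 25%/80% figures from the slides exactly); it scales each output set by its rule's firing strength, sums the sets, and defuzzifies with the centroid:

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

temp = 17.0
cool_degree = tri(temp, 10.0, 15.0, 20.0)        # firing strength of the "cool" rule
just_right_degree = tri(temp, 15.0, 20.0, 25.0)  # firing strength of the "just right" rule

speeds = np.linspace(0.0, 100.0, 501)            # candidate fan speeds (percent)
slow = tri(speeds, 0.0, 25.0, 50.0)
medium = tri(speeds, 25.0, 50.0, 75.0)

# Scale each output fuzzy set by its rule's strength and amalgamate by summing (capped at 1).
output = np.minimum(cool_degree * slow + just_right_degree * medium, 1.0)

# Defuzzify: the centroid is the point that balances the area under the curve.
motor_speed = float(np.sum(speeds * output) / np.sum(output))
print(round(motor_speed, 1))
```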
Acknowledgement

• Phil Grant: University of Wales Swansea
