Fall 2013 Bayesian Data Analysis — L03b Bayesian Analysis

Independent Events

Two events are statistically independent if the occurrence of one event does not depend on the occurrence of the other. If A and B are independent:

P(A|B) = P(A)

where P(A|B) is the probability of A given the occurrence of B (read "A given B").
The probability that independent events A and B occur (or exist) simultaneously is the product of the individual probabilities:

P(AB) = P(A) P(B)
If A and B are statistically dependent, the probability that A and B occur simultaneously is:
To begin an application, state the general expression for P(AB) in either of the two equivalent forms, and then consider whether A and B can be approximated as independent.
Intersection of Dependent A and B

P(AB) = P(A|B) P(B)   — solve for P(A|B)
P(AB) = P(B|A) P(A)   — solve for P(B|A)

Either of the two dependent (conditional) probabilities can be solved for: P(A|B) when A is uncertain and B is observed, or P(B|A) when B is uncertain and A is observed.

Example 3.4 (RERA 2.3)

Suppose that Vendor 1 provides 40% and Vendor 2 provides 60% of the circuit boards used in a computer. It is known that 2.5% of Vendor 1's supplies are defective and only 1% of Vendor 2's supplies are defective.
What is the probability that a unit is both defective and supplied by Vendor 1? What is the same probability for Vendor 2?

Theorem of Total Probability

The probability of an event A generally cannot be quantified directly by itself, because the occurrence of A depends on the occurrence of other events E_i, i = 1, 2, …, n. P(A) is made up of the conditional probabilities P(A|E_i) weighted by the P(E_i):

P(A) = P(A|E_1)P(E_1) + … + P(A|E_n)P(E_n),

where the conditioning events E_i are mutually exclusive and (collectively) exhaustive (MEE) in representing P(A). Note how the sample space is reduced by the intersection of each E_i with A, as shown earlier in class exercises.
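Example 3.4 can be checked numerically. The sketch below (variable names are illustrative, not from the slides) computes the two joint probabilities and, as a bonus, applies the theorem of total probability to get the overall defect rate:

```python
# Example 3.4: joint probability of "defective AND from vendor i",
# P(D and V_i) = P(D | V_i) * P(V_i)
p_v1, p_v2 = 0.40, 0.60                    # vendor shares (priors)
p_d_given_v1, p_d_given_v2 = 0.025, 0.01   # defect rates (likelihoods)

p_d_and_v1 = p_d_given_v1 * p_v1   # 0.025 * 0.40 = 0.010
p_d_and_v2 = p_d_given_v2 * p_v2   # 0.01  * 0.60 = 0.006

# Theorem of total probability: P(D) sums the joints over the MEE vendors
p_d = p_d_and_v1 + p_d_and_v2      # 0.016

print(p_d_and_v1, p_d_and_v2, p_d)
```

So a randomly selected board is both defective and from Vendor 1 with probability 0.010, and from Vendor 2 with probability 0.006.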
[Figure: sample space of A partitioned by the events E_i — Theorem of Total Probability]

Total Probability of River Flooding, P(F)

Flooding of a river in spring depends on snow accumulation in the mountains.
F = occurrence of flooding
H = heavy snow, P(H) = 0.2 : P(F|H) = 0.9
N = normal snow, P(N) = 0.5 : P(F|N) = 0.4
L = light or no snow, P(L) = 0.3 : P(F|L) = 0.1

The snow events are mutually exclusive and exhaustive; their probabilities sum to 1. The probability of flooding during the following spring is:

P(F) = P(F|H)P(H) + P(F|N)P(N) + P(F|L)P(L) = (0.9)(0.2) + (0.4)(0.5) + (0.1)(0.3) = 0.41

Bayes' Theorem

P(AB) = P(B|A) P(A)
P(AB) = P(A|B) P(B)

Solve for P(A|B) or P(B|A). These are equivalent expressions based on the joint distribution P(AB) = P(A,B), where P(A) is the total probability of A and P(B) is the total probability of B.
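The flood calculation above is a direct application of the total probability theorem; a minimal sketch (the dictionary layout is illustrative):

```python
# Flood example: total probability of F over MEE snow scenarios
scenarios = {            # name: (P(E_i), P(F | E_i))
    "heavy":  (0.2, 0.9),
    "normal": (0.5, 0.4),
    "light":  (0.3, 0.1),
}
# MEE check: scenario probabilities must sum to 1
assert abs(sum(p for p, _ in scenarios.values()) - 1.0) < 1e-12

p_f = sum(p_e * p_f_given_e for p_e, p_f_given_e in scenarios.values())
print(p_f)  # ≈ 0.41
```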
Bayes' theorem defines the conditional probability P(A|B) or P(B|A).

Bayes Equation

B depends on mutually exclusive and collectively exhaustive events that can occur in one of n ways, A_1, A_2, …, A_n (discrete):
P(A_j|B) = P(B|A_j) P(A_j) / Σ_i P(B|A_i) P(A_i)

B is partitioned by the A_i, and the denominator follows from the Law of Total Probability (LTP): it is the numerator expression summed over all A_i values.

P(A_j) is the prior probability, based on generic data, expert judgment, or previous data, before revising/updating using current information or tests. P(B|A_j) is the likelihood function, the factor by which the prior probability is updated based on current data, e.g., the likelihood of a particular test outcome B given A_j. The relative likelihood is the likelihood normalized by the total probability of B, P(B), so that P(A_j|B) is a proper probability distribution that sums to 1 over all j.
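The Bayes equation for a discrete set of MEE events can be sketched as a small function (names are illustrative):

```python
def bayes_update(priors, likelihoods):
    """Posterior P(A_j | B) from priors P(A_j) and likelihoods P(B | A_j).

    The normalizing denominator is the total probability of B
    (Law of Total Probability).
    """
    joints = [p * l for p, l in zip(priors, likelihoods)]
    p_b = sum(joints)                      # normalizing factor P(B)
    return [j / p_b for j in joints]

# Sanity check with the vendor numbers from Example 3.4:
# the posterior is a proper distribution (sums to 1)
post = bayes_update([0.40, 0.60], [0.025, 0.01])
print(post, sum(post))  # ≈ [0.625, 0.375], 1.0
```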
Bayes Equation (continued)

The posterior probability P(A_j|B) is the updated value of P(A_j) given the observation of B, i.e., revised based on new information, especially test data. The posterior probability mass function P(A_j|B) is based on the joint distribution P(B|A_j) P(A_j), i.e., on the current information together with the prior information. P(A_j|B) contains all of the previous and current information, for updating parameters and functions and to support decision making.

Example 3.5
Let A be the set of all objects containing an A, B the set of objects containing a B, and Black the set of all black objects.
Calculate the probability that an object containing an A is black.

Example 3.5, solution
From Bayes,

P(Black|A) = P(A|Black) P(Black) / P(A)
and the same result is obtained as when calculated directly using POI: P(Black|A) = 3/5.
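The direct count can be sketched with a hypothetical object set chosen to match the slide's result of 3/5 (the actual objects are shown only in the slide's figure, so these counts are an assumption):

```python
# Hypothetical objects as (label, color): five objects carry an "A",
# three of which are black, reproducing P(Black|A) = 3/5
objects = [("A", "black"), ("A", "black"), ("A", "black"),
           ("A", "white"), ("A", "white"),
           ("B", "black"), ("B", "white")]

a_objects = [o for o in objects if o[0] == "A"]
p_black_given_a = sum(1 for _, c in a_objects if c == "black") / len(a_objects)
print(p_black_given_a)  # 0.6
```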
Example 3.6 (RERA 2.6)

Two experts assess the reliability of a component near the end of its useful life. Expert 1: R estimate = 0.98; Expert 2: R estimate = 0.60. Assume both experts have equal credibility, with 0.50 probability that each expert is correct. Calculate the probability that each expert is correct when:
1. The component is tested and it works (success).
2. Two tests result in success.
3. One test is performed and it fails.
4. Two tests, with one success and one failure.

Example 3.6, solution 1

A_1 = event that Expert 1 is correct; A_2 = event that Expert 2 is correct. B = event that the test is successful (the unit does not fail).
P(A_1) = P(A_2) = 0.50, as stated, based on history.
P(B|A_1) = 0.98; P(B|A_2) = 0.60.

Begin with P(A_2|B):

P(A_2|B) = (0.60)(0.50) / [(0.98)(0.50) + (0.60)(0.50)] = 0.30/0.79 ≈ 0.38

The denominator, P(B) = 0.79, is the prior predictive probability of B. Then P(A_1|B) = 0.49/0.79 ≈ 0.62, the complement; use the Bayes equation to check. Based on this test, the credibility of Expert 1 increases and that of Expert 2 decreases. Each posterior can be updated using new data.

Example 3.6, solution 2

B_1 = event that Test 1 is successful; B_2 = event that Test 2 is successful.
Assuming independent tests:

P(B_1 B_2|A_1) = P(B_1|A_1) P(B_2|A_1) = (0.98)^2
P(B_1 B_2|A_2) = P(B_1|A_2) P(B_2|A_2) = (0.60)^2

So the credibility of Expert 1 is increased further and that of Expert 2 decreased; use the Bayes equation to check.

Example 3.6, solution 3
B = event that the component failed the test.
P(B|A_1) = 1 − 0.98 = 0.02; P(B|A_2) = 1 − 0.60 = 0.40
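The solution-3 update can be checked numerically (a sketch; variable names are illustrative):

```python
# Example 3.6, solution 3: posterior credibility after one failed test
priors = [0.50, 0.50]       # P(A1), P(A2)
like_fail = [0.02, 0.40]    # P(fail | A1), P(fail | A2)

joints = [p * l for p, l in zip(priors, like_fail)]   # [0.01, 0.20]
p_fail = sum(joints)                                  # P(B) = 0.21
post = [j / p_fail for j in joints]
print(post)  # ≈ [0.048, 0.952]
```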
The component fails a single test. Due to the failed test, the credibility of Expert 1 is decreased greatly, and that of Expert 2 is increased greatly.

Example 3.6, solution 4
B_1 = event that the first test is successful; B_2 = event that the second test is a failure.
P(B_2|A_1) = 0.02; P(B_2|A_2) = 0.40
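A sketch of the solution-4 arithmetic, treating the two tests as independent:

```python
# Example 3.6, solution 4: one success (B1) then one failure (B2)
priors = [0.50, 0.50]
like = [0.98 * 0.02, 0.60 * 0.40]   # P(B1 B2 | A1), P(B1 B2 | A2)

joints = [p * l for p, l in zip(priors, like)]
post = [j / sum(joints) for j in joints]
print(post)  # ≈ [0.076, 0.924]
```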
The component fails one test and passes one test. Because of the one failed and one successful test, the results fall between those of #2 (two successful tests) and #3 (one failed test).

Example 3.7 (RERA 2.7)

Vendor 1 supplies 70% and Vendor 2 supplies 30% of the chips used by a computer firm [use for the prior probabilities]. 99% of chips from V_1 and 88% from V_2 are not defective [use for the likelihoods, based on observation].
a. If a chip selected at random from the company inventory is defective, find the probability that the chip was from V_1, P(V_1|D).
b. If a chip is selected at random from the company inventory, calculate the probability that the chip is defective, P(D).

Example 3.7, solution a

A_1 = chip from Vendor 1; A_2 = chip from Vendor 2. D = event that a chip is defective. D|A_1 = defective chip, given the chip is from V_1; A_1|D = chip from V_1, given the chip is defective.
P(A_1) = 0.70; P(A_2) = 0.30
P(D|A_1) = 1 − 0.99 = 0.01; P(D|A_2) = 1 − 0.88 = 0.12
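A numerical sketch of solution a (the dictionary keys are illustrative):

```python
# Example 3.7, solution a: P(V1 | defective) via the Bayes equation
priors = {"V1": 0.70, "V2": 0.30}
p_def = {"V1": 0.01, "V2": 0.12}     # P(D | vendor)

p_d = sum(priors[v] * p_def[v] for v in priors)    # total P(D) = 0.043
p_v1_given_d = priors["V1"] * p_def["V1"] / p_d
print(p_v1_given_d)  # ≈ 0.163
```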
The prior probability that V_1 was the supplier, 0.70, is updated to a posterior probability of about 0.16 by the data that the unit is defective.

Example 3.7, solution b

The probability that the selected chip is observed to be defective is P(D). P(D), the total probability of D, is the prior predictive probability of the observed event D, and the normalization factor of the Bayes expression for the posterior probability:

P(D) = P(D|A_1)P(A_1) + P(D|A_2)P(A_2) = (0.01)(0.70) + (0.12)(0.30) = 0.043
Bayesian Parameter Estimation: Discrete Case

From Axiom 2 of probability,
P(x) = the prior predictive probability of observing the data x, and the normalizing factor for P(θ|x); P(θ|x) is a discrete pmf (probability mass function) or a continuous pdf (probability density function) normalized to 1. P(x) is the prior predictive probability of the data for the observed quantity x. The observable is often designated by x.

Total Probability P(B) of Event B

Characteristics of P(B) in the Bayes model:
- Total probability of observing B
- Marginal distribution of the observable B (sum over all other variables)
- Normalizing factor for the posterior pmf, P(θ|B)
- Prior predictive probability of the observed B, obtained by summing over all prior values of the uncertain variable

How it is calculated, when the θ_i are mutually exclusive and exhaustive in representing B:

P(B) = Σ_i P(B|θ_i) P(θ_i)

Example 3.8: Probability of Having a Disease?

Have the disease = a; do not have the disease = ā; D = test positive.
Laboratory test data: the test is 98.6% reliable for those who have the disease, so P(D|a) = 0.986; false negative rate = P(not D | a) = 0.014; false positive rate = P(D|ā) = 0.023.
Disease incidence, from a random sample of the population: P(a) = 10⁻⁴, or 1 in 10,000, and P(ā) = 0.9999.
Example 3.8, case a

Bayes model and calculation:

P(a|D) = P(D|a)P(a) / [P(D|a)P(a) + P(D|ā)P(ā)] = (0.986)(0.0001) / [(0.986)(0.0001) + (0.023)(0.9999)] ≈ 0.0043
The probability of having the disease, about 0.4%, is much lower than the 98.6% (the test reliability) that is often perceived.
In words:

Pr(you have it | + test) = [Pr(+ test | you have it) × Pr(you have it)] / [Pr(+ test | you have it) × Pr(you have it) + Pr(+ test | you don't have it) × Pr(you don't have it)]

Example 3.8, case b

Find the effect on the posterior distribution of reducing the false positive rate (FPR) from 2.3% to 0.2%: P(D|ā) = 0.002. Bayes calculation:
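A sketch of the case a/b arithmetic (the function name and signature are illustrative):

```python
def p_disease_given_positive(p_a, sens, fpr):
    """P(a | D): posterior probability of disease after one positive test.

    p_a  = prior incidence P(a), sens = P(D|a), fpr = P(D|not a).
    """
    num = sens * p_a                        # P(D|a) P(a)
    return num / (num + fpr * (1.0 - p_a))  # denominator = total P(D)

print(p_disease_given_positive(1e-4, 0.986, 0.023))  # case a: ≈ 0.0043
print(p_disease_given_positive(1e-4, 0.986, 0.002))  # case b: ≈ 0.047
```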
The probability of having the disease, 4.7%, is now higher, given the more accurate detection with the lower FPR, P(D|ā). Note: as the FPR approaches 0, P(a|D) approaches 1.

Example 3.8, case c

You obtained a 1st positive test with an FPR of 2.3% (P(D|ā) = 0.023), resulting in the posterior P(a|D) = 0.0042. You then had a 2nd positive test with an FPR of 0.2% (P(D|ā) = 0.002). The new data are combined with the first posterior P(a|D), now serving as the prior, to calculate an updated posterior P(a|D):
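The case c sequential update can be sketched by reusing the first posterior as the new prior:

```python
def update(prior, sens, fpr):
    """One Bayes update of P(a) for a positive test result."""
    num = sens * prior
    return num / (num + fpr * (1.0 - prior))

post1 = update(1e-4, 0.986, 0.023)   # 1st positive test, FPR = 2.3%
post2 = update(post1, 0.986, 0.002)  # 2nd positive test, FPR = 0.2%
print(post1, post2)  # ≈ 0.0043, ≈ 0.68
```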
From the two positive tests, the probability of having the disease is about 68%: higher than with just one test, but still less than 98.6%.

Case Study Conclusions

In the absence of a sufficient amount of observed data, expert judgment information is more critically needed. An increasing amount of observed, system-specific data eventually dominates prior generic data and expert estimates. How rapidly the observed data come to dominate depends on the amount and quality of the data compared to the amount and quality of the prior information.