
# EMSE 269/6020 Assignment #2: Probability Review

EXERCISE 1. Explain why probability is important in decision analysis. Explain also, in your own words, what an uncertain quantity or random variable is. Why is the idea of an uncertain quantity important in decision analysis?

EXERCISE 2. Flip the probability tree shown in Figure 7.21 on p. 284 of Clemen and Reilly.

Solution. First, we have to find the marginal probability of obtaining a positive test:

$$\Pr(TP) = \Pr(TP \mid DP)\Pr(DP) + \Pr(TP \mid DA)\Pr(DA) = 0.95 \times 0.02 + 0.005 \times 0.98 = 0.0239$$

Now, we find the marginal probability of obtaining a negative test:

$$\Pr(TN) = \Pr(TN \mid DP)\Pr(DP) + \Pr(TN \mid DA)\Pr(DA) = 0.05 \times 0.02 + 0.995 \times 0.98 = 0.9761$$

Next, we obtain the conditional probabilities that the disease is present given each test result. This is done using Bayes' theorem:

$$\Pr(DP \mid TP) = \frac{\Pr(DP)\Pr(TP \mid DP)}{\Pr(DP)\Pr(TP \mid DP) + \Pr(DA)\Pr(TP \mid DA)} = \frac{0.02 \times 0.95}{0.02 \times 0.95 + 0.98 \times 0.005} = 0.7949$$

and

$$\Pr(DP \mid TN) = \frac{\Pr(DP)\Pr(TN \mid DP)}{\Pr(DP)\Pr(TN \mid DP) + \Pr(DA)\Pr(TN \mid DA)} = \frac{0.02 \times 0.05}{0.02 \times 0.05 + 0.98 \times 0.995} = 0.0010$$

The flipped probability tree is shown in Figure 1:

Pr(TP) = 0.0239, followed by Pr(DP | TP) = 0.7949 and Pr(DA | TP) = 0.2051

Pr(TN) = 0.9761, followed by Pr(DP | TN) = 0.0010 and Pr(DA | TN) = 0.9990

Figure 1. "Flipped" probability tree for medical test performance
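The flipping procedure used in this solution can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `flip_tree` and the dictionary keys are hypothetical, and the inputs are the probabilities quoted in the solution (prior 0.02, Pr(TP|DP) = 0.95, Pr(TP|DA) = 0.005).

```python
def flip_tree(prior_dp, p_tp_given_dp, p_tp_given_da):
    """Flip a two-state disease/test tree using Bayes' theorem.

    prior_dp       -- Pr(DP), prior probability the disease is present
    p_tp_given_dp  -- Pr(TP | DP), positive test when disease is present
    p_tp_given_da  -- Pr(TP | DA), positive test when disease is absent
    """
    prior_da = 1.0 - prior_dp

    # Marginal test probabilities (law of total probability).
    p_tp = p_tp_given_dp * prior_dp + p_tp_given_da * prior_da
    p_tn = 1.0 - p_tp

    # Posterior probabilities of disease given each test result.
    p_dp_given_tp = p_tp_given_dp * prior_dp / p_tp
    p_dp_given_tn = (1.0 - p_tp_given_dp) * prior_dp / p_tn

    return {
        "Pr(TP)": p_tp,
        "Pr(TN)": p_tn,
        "Pr(DP|TP)": p_dp_given_tp,
        "Pr(DA|TP)": 1.0 - p_dp_given_tp,
        "Pr(DP|TN)": p_dp_given_tn,
        "Pr(DA|TN)": 1.0 - p_dp_given_tn,
    }


# Values taken from the solution text above.
flipped = flip_tree(prior_dp=0.02, p_tp_given_dp=0.95, p_tp_given_da=0.005)
for label, value in flipped.items():
    print(f"{label} = {value:.4f}")
```

The results agree with the tree to within 0.0001; the unrounded value of Pr(DP | TP) is about 0.79498, so the last digit depends on whether the figure is rounded or truncated.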

EXERCISE 3. Figure 7.22 on p. 284 of Clemen and Reilly shows part of an influence diagram for a chemical that is considered potentially carcinogenic. How would you describe the relationship between the test results and the field results?

Solution. The test results and field results are conditionally independent. Their relationship is mediated by the carcinogenic potential of the chemical. Thus, we might expect that learning a positive test result increases our belief that a subsequent field result will turn up positive, and vice versa.

EXERCISE 4. Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice and also participated in antinuclear demonstrations. Use your judgment to rank the following statements by their probability, using 1 for the most probable statement and 8 for the least probable:

a. Linda is a teacher in an elementary school.

b. Linda works in a bookstore and takes yoga classes.

c. Linda is active in the feminist movement.

d. Linda is a psychiatric social worker.

e. Linda is a member of the League of Women Voters.

f. Linda is a bank teller.

g. Linda is an insurance salesperson.

h. Linda is a bank teller and is active in the feminist movement.

The description and these statements often elicit responses that are not consistent with the requirements of probability. If you are like most people, you ranked statement h as more probable than statement f. Explain why statement h must be less probable than statement f.

Solution. Statement h is a joint event: it is the conjunction of statement f with a second event (Linda is active in the feminist movement). The probability of a joint event can never exceed the marginal probability of either of its component events, because every outcome in which h is true is also an outcome in which f is true. Hence Pr(h) ≤ Pr(f).

EXERCISE 5. Refer to exercises 7.23 and 7.24 on p. 284 of Clemen and Reilly. Calculate p(FR Positive) and p(FR Positive | TR Positive). Compare p(FR Positive) and p(FR Positive | TR Positive). Would you say that the test results and field results are independent? Why or why not? Discuss the difference between conditional independence and regular independence.

Solution. First, the calculation of Pr(FR+) is done using the law of total probability (LOTP):

$$\Pr(FR^+) = \Pr(FR^+ \mid CP^{hi})\Pr(CP^{hi}) + \Pr(FR^+ \mid CP^{lo})\Pr(CP^{lo}) = 0.95 \times 0.27 + 0.17 \times 0.73 = 0.3806$$

Next, we calculate Pr(FR+|TR+). I have elected to use the structure of the influence diagram to guide the calculation. First, I compute the probabilities of CPhi and CPlo given that we have observed a positive test result, which yields Pr(CPhi|TR+) and Pr(CPlo|TR+). Next, I use these conditional probabilities to compute Pr(FR+|TR+) using LOTP. These steps are as follows:

Step 1: Compute Pr(CPhi|TR+) and Pr(CPlo|TR+)

$$\Pr(CP^{hi} \mid TR^+) = \frac{\Pr(CP^{hi})\Pr(TR^+ \mid CP^{hi})}{\Pr(CP^{hi})\Pr(TR^+ \mid CP^{hi}) + \Pr(CP^{lo})\Pr(TR^+ \mid CP^{lo})} = \frac{0.27 \times 0.82}{0.27 \times 0.82 + 0.73 \times 0.21} = 0.5909$$

and

$$\Pr(CP^{lo} \mid TR^+) = \frac{\Pr(CP^{lo})\Pr(TR^+ \mid CP^{lo})}{\Pr(CP^{hi})\Pr(TR^+ \mid CP^{hi}) + \Pr(CP^{lo})\Pr(TR^+ \mid CP^{lo})} = \frac{0.73 \times 0.21}{0.27 \times 0.82 + 0.73 \times 0.21} = 0.4091$$

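Step 2: Compute Pr(FR+|TR+) using LOTP

$$
\begin{aligned}
\Pr(FR^+ \mid TR^+) &= \sum_{CP} \Pr(FR^+ \mid CP, TR^+)\Pr(CP \mid TR^+) \\
&= \sum_{CP} \Pr(FR^+ \mid CP)\Pr(CP \mid TR^+) \\
&= \Pr(FR^+ \mid CP^{hi})\Pr(CP^{hi} \mid TR^+) + \Pr(FR^+ \mid CP^{lo})\Pr(CP^{lo} \mid TR^+) \\
&= 0.95 \times 0.5909 + 0.17 \times 0.4091 \\
&= 0.6309
\end{aligned}
$$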
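The two-step calculation can be checked numerically. The sketch below assumes the probabilities quoted in the solution; the helper names `total_probability` and `posterior` are hypothetical and chosen only for illustration.

```python
# Probabilities quoted in the solution.
p_cp = {"hi": 0.27, "lo": 0.73}      # Pr(CP)
p_fr_pos = {"hi": 0.95, "lo": 0.17}  # Pr(FR+ | CP)
p_tr_pos = {"hi": 0.82, "lo": 0.21}  # Pr(TR+ | CP)


def total_probability(likelihood, weights):
    """Law of total probability: sum of likelihood[s] * weights[s] over states."""
    return sum(likelihood[s] * weights[s] for s in weights)


def posterior(prior, likelihood):
    """Bayes' theorem over a discrete set of states."""
    evidence = total_probability(likelihood, prior)
    return {s: prior[s] * likelihood[s] / evidence for s in prior}


# Marginal probability of a positive field result.
pr_fr = total_probability(p_fr_pos, p_cp)

# Step 1: update the carcinogenic potential on a positive test result.
cp_given_tr = posterior(p_cp, p_tr_pos)

# Step 2: LOTP at the field result using the updated probabilities.
pr_fr_given_tr = total_probability(p_fr_pos, cp_given_tr)

print(f"Pr(FR+)       = {pr_fr:.4f}")
print(f"Pr(CPhi|TR+)  = {cp_given_tr['hi']:.4f}")
print(f"Pr(CPlo|TR+)  = {cp_given_tr['lo']:.4f}")
print(f"Pr(FR+|TR+)   = {pr_fr_given_tr:.4f}")
```

Step 2 reuses Pr(FR+ | CP) unchanged, which is valid only because the field result is conditionally independent of the test result given the carcinogenic potential.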

We can visualize this calculation using the influence diagram directly. Recall that an influence diagram without decision or consequence nodes is a Belief Network, Bayesian Belief Network, or Causal Network. The causal network reflecting this problem is shown in Figure 2.

Figure 2. Prior condition of the Exercise 5 carcinogen monitoring network.

The network as shown is in its prior condition, so called because no findings have been entered. Suppose the monitoring activities lead to a positive test result, as indicated in the problem. We can now enter the positive finding at the Test_Result node:

Figure 3. Positive findings for the test result.

This corresponds to Step 1 of our calculations above. Using Bayes' rule, we can estimate the probability that the chemical is actually carcinogenic given a positive test result. Now, we use the updated Carcinogenic_Potential node to update the state of the Field_Result node:

Figure 4. Updating the "Field_Result" node given the new probabilities at "Carcinogenic_Potential"

This corresponds to an application of LOTP at the Field_Result node using the probabilities updated at Carcinogenic_Potential, because no findings have actually been entered at Field_Result. We have shown this calculation in Step 2 above. Because of the structure of the network, Carcinogenic_Potential contains all of the information relevant to Field_Result from Test_Result; that is, the two results are conditionally independent given Carcinogenic_Potential. They are not independent in the regular (marginal) sense, however, because learning about Test_Result makes a positive Field_Result more likely (0.6309 versus 0.3806). In sum, the joint posterior condition is shown in Figure 5:

Figure 5. Joint Posterior condition, updating the knowledge base with positive findings at "Test_Result."
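The network update can also be reproduced by brute-force enumeration of the joint distribution, which makes the difference between conditional and regular independence explicit. This is a minimal sketch, not the program used to draw the figures; it assumes the network factorizes as Pr(CP) Pr(TR | CP) Pr(FR | CP), and the helper names `cond` and `prob` are hypothetical.

```python
from itertools import product

# Probabilities quoted in the solution.
p_cp = {"hi": 0.27, "lo": 0.73}      # Pr(CP)
p_tr_pos = {"hi": 0.82, "lo": 0.21}  # Pr(TR+ | CP)
p_fr_pos = {"hi": 0.95, "lo": 0.17}  # Pr(FR+ | CP)


def cond(p_positive, outcome):
    """Probability of a binary outcome ('+' or '-') given Pr('+')."""
    return p_positive if outcome == "+" else 1.0 - p_positive


# Joint distribution implied by the network structure (assumed factorization).
joint = {
    (cp, tr, fr): p_cp[cp] * cond(p_tr_pos[cp], tr) * cond(p_fr_pos[cp], fr)
    for cp, tr, fr in product(p_cp, "+-", "+-")
}


def prob(predicate):
    """Sum the joint probability over all outcomes matching the predicate."""
    return sum(p for key, p in joint.items() if predicate(*key))


pr_fr = prob(lambda cp, tr, fr: fr == "+")
pr_tr = prob(lambda cp, tr, fr: tr == "+")
pr_fr_and_tr = prob(lambda cp, tr, fr: fr == "+" and tr == "+")
pr_fr_given_tr = pr_fr_and_tr / pr_tr

# Regular independence would require Pr(FR+, TR+) = Pr(FR+) Pr(TR+).
marginally_independent = abs(pr_fr_and_tr - pr_fr * pr_tr) < 1e-12

# Conditional independence: given CP = hi, the joint factorizes.
lhs = prob(lambda cp, tr, fr: cp == "hi" and tr == "+" and fr == "+") / p_cp["hi"]
rhs = p_tr_pos["hi"] * p_fr_pos["hi"]

print(f"Pr(FR+) = {pr_fr:.4f}, Pr(FR+|TR+) = {pr_fr_given_tr:.4f}")
print("marginally independent:", marginally_independent)
print("factorizes given CP=hi:", abs(lhs - rhs) < 1e-12)
```

The factorization check passes by construction, since the joint was built from that assumption. The informative result is that the same joint fails the regular independence check.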

While this problem may be solved using the analytical tools of joint and conditional probability directly, I hope that visualizing the conditional independence relationships encoded in an influence diagram illustrates the simplification of analysis that might be possible when these networks are used to structure and elicit probabilities. [If you are interested in the program used in this analysis, please contact me for an appointment next week.]

EXERCISE 6. In the John Hinckley trial example (Clemen and Reilly, pp. 278-280), reproduce Figure 7.20. Show the sensitivity of posterior beliefs to the prior beliefs (e.g., Figure 7.20), and to the sensitivity and specificity of the CAT scan test (e.g., p(A|S_bar) and p(A|S)).

Solution. First, we show the posterior probability that John Hinckley is schizophrenic based on his CAT scan analytically using Bayes' theorem:

$$\Pr(S \mid A) = \frac{\Pr(S)\Pr(A \mid S)}{\Pr(S)\Pr(A \mid S) + \Pr(\bar{S})\Pr(A \mid \bar{S})}$$

In other words, the posterior belief that John Hinckley is schizophrenic equals the prior belief weighted by the likelihood of a positive CAT scan if he is, in fact, schizophrenic, divided by the sum of the evidence's likelihoods under both hypotheses (schizophrenic or not schizophrenic), each weighted by its prior. The individual pieces of information we need are as follows:

$$\Pr(S \mid A) = \frac{\text{prior} \times \Pr(CAT^+ \mid \text{schizophrenic})}{\text{prior} \times \Pr(CAT^+ \mid \text{schizophrenic}) + (1 - \text{prior}) \times \Pr(CAT^+ \mid \text{not schizophrenic})}$$

In detection theory, Pr(CAT+|schizophrenic) and Pr(CAT+|not schizophrenic) are called, respectively, the true positive rate (or sensitivity) of the test and the false positive rate (or 1 − specificity) of the test. Writing Bayes' theorem one more time using the language of the problem statement:

$$\Pr(S \mid A) = \frac{\text{prior} \times \text{sensitivity}}{\text{prior} \times \text{sensitivity} + (1 - \text{prior})(1 - \text{specificity})}$$

Note that:

$$\text{sensitivity} = \Pr(A \mid S), \qquad \text{specificity} = 1 - \Pr(A \mid \bar{S})$$

Sensitivity tells us how well the test correctly identifies individuals with the disease, while specificity tells us how well the test avoids positively identifying individuals who do not have the disease. Now we are ready to draw the three lines requested. I will show only the requisite equation with the parameters I've assumed for each case; the plots are shown in Figure 6.

Case 1. Posterior probability varying with prior probability of schizophrenia.

$$\Pr(S \mid A) = \frac{\text{prior} \times 0.30}{\text{prior} \times 0.30 + (1 - \text{prior})(0.02)}$$

This equation reproduces Figure 7.20. From this equation, we can see that schizophrenia is so rare in the population that our prior beliefs about the presence of schizophrenia dominate our posterior beliefs. If a juror is knowledgeable about the background rate of schizophrenia in the population (e.g., they know that only 1.5% of the people in the US are schizophrenic) and their beliefs are updated according to Bayes' theorem, they will not be convinced by the CAT scan evidence.

Case 2. Posterior probability varying with test sensitivity.

$$\Pr(S \mid A) = \frac{0.015 \times \text{sensitivity}}{0.015 \times \text{sensitivity} + (1 - 0.015)(0.02)}$$

This equation shows that, even if the test identifies all positive cases perfectly, the disease is so rare that the test can only modestly increase our posterior beliefs (all other things remaining the same as the original condition). Nearly perfect specificity is needed to make us confident in our belief about schizophrenia even after observing a positive CAT scan.

Case 3. Posterior probability varying with test specificity.

$$\Pr(S \mid A) = \frac{0.015 \times 0.30}{0.015 \times 0.30 + (1 - 0.015)(1 - \text{specificity})}$$

This equation shows a corollary to the conclusion of case 2. While perfect rejection of negative individuals increases the value of the test in terms of our posterior beliefs, the specificity of the test needs to be virtually perfect since schizophrenia is so rare in the population.

Figure 6. Illustration of the sensitivity of posterior beliefs to test sensitivity, specificity, and agent's prior beliefs about schizophrenia.
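The three curves can be tabulated with a short Python sketch. It assumes the baseline parameters used in the equations above: prior 0.015, sensitivity 0.30, and specificity 0.98 (that is, one minus the 0.02 false positive rate). The names `posterior`, `BASE`, and `sweep`, and the choice of grid, are hypothetical; plotting is omitted.

```python
def posterior(prior, sensitivity, specificity):
    """Pr(S | A) from Bayes' theorem in sensitivity/specificity form."""
    true_pos = prior * sensitivity
    false_pos = (1.0 - prior) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


# Baseline parameters taken from the case equations above.
BASE = {"prior": 0.015, "sensitivity": 0.30, "specificity": 0.98}


def sweep(parameter, values):
    """Vary one parameter over `values`, holding the others at baseline."""
    rows = []
    for v in values:
        args = dict(BASE, **{parameter: v})
        rows.append((v, posterior(**args)))
    return rows


grid = [i / 10 for i in range(1, 11)]  # 0.1, 0.2, ..., 1.0 (illustrative grid)
curves = {name: sweep(name, grid) for name in BASE}

baseline = posterior(**BASE)
perfect_sensitivity = posterior(0.015, 1.0, 0.98)

print(f"baseline posterior            = {baseline:.4f}")
print(f"posterior at sensitivity 1.0  = {perfect_sensitivity:.4f}")
for name, rows in curves.items():
    print(name, [round(p, 3) for _, p in rows])
```

Holding two parameters at baseline while sweeping the third mirrors the three cases; a finer grid near specificity 1 would show how sharply the posterior rises only when the false positive rate is almost zero.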