
Definitions

Probability Experiment: Process which leads to well-defined results called outcomes.
Outcome: The result of a single trial of a probability experiment.
Sample Space: Set of all possible outcomes of a probability experiment.
Event: One or more outcomes of a probability experiment.
Classical Probability: Uses the sample space to determine the numerical probability that an event will happen. Also called theoretical probability.
Equally Likely Events: Events which have the same probability of occurring.
Complement of an Event: All the outcomes in the sample space except the given event.
Empirical Probability: Uses a frequency distribution to determine the numerical probability. An empirical probability is a relative frequency.
Subjective Probability: Uses probability values based on an educated guess or estimate. It employs opinions and inexact information.
Mutually Exclusive Events: Two events which cannot happen at the same time.
Disjoint Events: Another name for mutually exclusive events.
Independent Events: Two events are independent if the occurrence of one does not affect the probability of the other occurring.
Dependent Events: Two events are dependent if the first event affects the outcome or occurrence of the second event in a way that changes the probability.
Conditional Probability: The probability of an event occurring given that another event has already occurred.
Bayes' Theorem: A formula which allows one to find the probability that an event occurred as the result of a particular previous event.
Factorial: A positive integer factorial is the product of each natural number up to and including the integer.
Permutation: An arrangement of objects in a specific order.
Combination: A selection of objects without regard to order.
Tree Diagram: A graphical device used to list all possibilities of a sequence of events in a systematic way.

Introduction to Probability

Sample Spaces

A sample space is the set of all possible outcomes. However, some sample spaces are better than others. Consider the experiment of flipping two coins. It is possible to get 0 heads, 1 head, or 2 heads. Thus, the sample space could be {0, 1, 2}. Another way to list the outcomes is { HH, HT, TH, TT }. The second way is better because each outcome is equally likely to occur as any other. When writing the sample space, it is highly desirable to have events which are equally likely. Another example is rolling two dice. The sums are { 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }. However, these are not equally likely. The only way to get a sum of 2 is to roll a 1 on both dice, but you can get a sum of 4 by rolling a 1-3, 2-2, or 3-1. The following table illustrates a better sample space for the sum obtained when rolling two dice.
                   Second Die
First Die    1    2    3    4    5    6
    1        2    3    4    5    6    7
    2        3    4    5    6    7    8
    3        4    5    6    7    8    9
    4        5    6    7    8    9   10
    5        6    7    8    9   10   11
    6        7    8    9   10   11   12

Classical Probability The above table lends itself to describing data another way -- using a probability distribution. Let's consider the frequency distribution for the above sums.
Sum   Frequency   Relative Frequency
 2        1            1/36
 3        2            2/36
 4        3            3/36
 5        4            4/36
 6        5            5/36
 7        6            6/36
 8        5            5/36
 9        4            4/36
10        3            3/36
11        2            2/36
12        1            1/36

If just the first and last columns were written, we would have a probability distribution. The relative frequency of a frequency distribution is the probability of the event occurring. This is only true, however, if the events are equally likely. This gives us the formula for classical probability. The probability of an event occurring is the number in the event divided by the number in the sample space. Again, this is only true when the events are equally likely. A classical probability is the relative frequency of each event in the sample space when each event is equally likely.

P(E) = n(E) / n(S)
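For those who want to check the formula by direct enumeration, here is a minimal Python sketch (the helper name classical_probability is just for illustration). It builds the 36 equally likely outcomes for two dice and applies P(E) = n(E) / n(S):

```python
from fractions import Fraction
from itertools import product

# Equally likely sample space: all 36 ordered pairs for two dice
sample_space = list(product(range(1, 7), repeat=2))

def classical_probability(event):
    # P(E) = n(E) / n(S), valid because each outcome is equally likely
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

print(classical_probability(lambda d: d[0] + d[1] == 2))  # 1/36
print(classical_probability(lambda d: d[0] + d[1] == 4))  # 1/12
```

The Fraction type keeps the answers exact, matching the 1/36 and 3/36 entries in the table above.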


Empirical Probability Empirical probability is based on observation. The empirical probability of an event is the relative frequency of a frequency distribution based upon observation.

P(E) = f / n
Probability Rules

There are two rules which are very important.

All probabilities are between 0 and 1 inclusive:

0 <= P(E) <= 1


The sum of all the probabilities in the sample space is 1.

There are some other rules which are also important. The probability of an event which cannot occur is 0. The probability of any event which is not in the sample space is zero.

The probability of an event which must occur is 1. The probability of the sample space is 1. The probability of an event not occurring is one minus the probability of it occurring. P(E') = 1 - P(E)

Probability Rules

"OR" or Unions Mutually Exclusive Events Two events are mutually exclusive if they cannot occur at the same time. Another word that means mutually exclusive is disjoint. If two events are disjoint, then the probability of them both occurring at the same time is 0. Disjoint: P(A and B) = 0

If two events are mutually exclusive, then the probability of either occurring is the sum of the probabilities of each occurring. Specific Addition Rule Only valid when the events are mutually exclusive. P(A or B) = P(A) + P(B) Example 1: Given: P(A) = 0.20, P(B) = 0.70, A and B are disjoint I like to use what's called a joint probability distribution. (Since disjoint means nothing in common, joint is what they have in common -- so the values that go on the inside portion of the table are the intersections or "and"s of each pair of events). "Marginal" is another word for totals -- it's called marginal because they appear in the margins.

           A      A'     Marginal
B         0.00   0.70     0.70
B'        0.20   0.10     0.30
Marginal  0.20   0.80     1.00

The values given in the problem are the marginals P(A) = 0.20 and P(B) = 0.70 and the zero for the disjoint intersection; the grand total is always 1.00. The rest of the values are obtained by addition and subtraction.

Non-Mutually Exclusive Events

In events which aren't mutually exclusive, there is some overlap. When P(A) and P(B) are added, the probability of the intersection (and) is added twice. To compensate for that double addition, the intersection needs to be subtracted.

General Addition Rule

Always valid. P(A or B) = P(A) + P(B) - P(A and B)

Example 2: Given P(A) = 0.20, P(B) = 0.70, P(A and B) = 0.15
           A      A'     Marginal
B         0.15   0.55     0.70
B'        0.05   0.25     0.30
Marginal  0.20   0.80     1.00
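The general addition rule and the subtraction steps that fill in the table can be sketched in a few lines of Python (the variable names are illustrative):

```python
# Example 2 worked numerically: P(A) = 0.20, P(B) = 0.70, P(A and B) = 0.15
p_a, p_b, p_a_and_b = 0.20, 0.70, 0.15

# General addition rule -- always valid
p_a_or_b = p_a + p_b - p_a_and_b
print(round(p_a_or_b, 2))  # 0.75

# The remaining cells of the joint probability table, by subtraction
p_b_and_not_a = p_b - p_a_and_b   # 0.55
p_a_and_not_b = p_a - p_a_and_b   # 0.05
p_neither = 1.00 - (p_a_and_b + p_b_and_not_a + p_a_and_not_b)
print(round(p_neither, 2))  # 0.25
```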

Interpreting the table

Certain things can be determined from the joint probability distribution. Mutually exclusive events will have a zero for the probability of their intersection. All inclusive events will have a zero opposite the intersection. All inclusive means that there is nothing outside of those two events: P(A or B) = 1.
           A                          A'                         Marginal
B          A and B are mutually       .                          .
           exclusive if this
           value is 0
B'         .                          A and B are all inclusive  .
                                      if this value is 0
Marginal   .                          .                          1.00

"AND" or Intersections Independent Events Two events are independent if the occurrence of one does not change the probability of the other occurring. An example would be rolling a 2 on a die and flipping a head on a coin. Rolling the 2 does not affect the probability of flipping the head. If events are independent, then the probability of them both occurring is the product of the probabilities of each occurring. Specific Multiplication Rule Only valid for independent events P(A and B) = P(A) * P(B) Example 3: P(A) = 0.20, P(B) = 0.70, A and B are independent.
           A      A'     Marginal
B         0.14   0.56     0.70
B'        0.06   0.24     0.30
Marginal  0.20   0.80     1.00

The 0.14 is because the probability of A and B is the probability of A times the probability of B: 0.20 * 0.70 = 0.14.

Dependent Events

If the occurrence of one event does affect the probability of the other occurring, then the events are dependent.

Conditional Probability

The probability of event B occurring given that event A has already occurred is read "the probability of B given A" and is written: P(B|A)

General Multiplication Rule

Always works. P(A and B) = P(A) * P(B|A)

Example 4: P(A) = 0.20, P(B) = 0.70, P(B|A) = 0.40

A good way to think of P(B|A) is that 40% of A is B. 40% of the 20% which was in event A is 8%, thus the intersection is 0.08.
           A      A'     Marginal
B         0.08   0.62     0.70
B'        0.12   0.18     0.30
Marginal  0.20   0.80     1.00

Independence Revisited

The following four statements are equivalent:
1. A and B are independent events
2. P(A and B) = P(A) * P(B)
3. P(A|B) = P(A)
4. P(B|A) = P(B)

The last two are because if two events are independent, the occurrence of one doesn't change the probability of the occurrence of the other. This means that the probability of B occurring, whether A has happened or not, is simply the probability of B occurring.

Conditional Probability

Recall that the probability of an event occurring given that another event has already occurred is called a conditional probability. The probability that event B occurs, given that event A has already occurred, is

P(B|A) = P(A and B) / P(A)

This formula comes from the general multiplication principle and a little bit of algebra. Since we are given that event A has occurred, we have a reduced sample space. Instead of the entire sample space S, we now have a sample space of A, since we know A has occurred. So the old rule about the probability being the number in the event divided by the number in the sample space still applies: it is the number in A and B (which must be in A, since A has occurred) divided by the number in A. If you then divide the numerator and denominator of the right hand side by the number in the sample space S, you have the probability of A and B divided by the probability of A.

Examples

Example 1: The question, "Do you smoke?" was asked of 100 people. Results are shown in the table.
        Male   Female   Total
Yes      19      12       31
No       41      28       69
Total    60      40      100

What is the probability of a randomly selected individual being a male who smokes? This is just a joint probability: the number of "Male and Smoke" divided by the total = 19/100 = 0.19.

What is the probability of a randomly selected individual being a male? This is the total for male divided by the total = 60/100 = 0.60. Since no mention is made of smoking or not smoking, it includes all the cases.

What is the probability of a randomly selected individual smoking? Again, since no mention is made of gender, this is a marginal probability: the total who smoke divided by the total = 31/100 = 0.31.

What is the probability of a randomly selected male smoking? This time, you're told that you have a male -- think of stratified sampling. What is the probability that the male smokes? Well, 19 males smoke out of 60 males, so 19/60 = 0.31666...

What is the probability that a randomly selected smoker is male? This time, you're told that you have a smoker and asked to find the probability that the smoker is also male. There are 19 male smokers out of 31 total smokers, so 19/31 = 0.6129 (approx).
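Those same lookups can be reproduced in Python with exact fractions (a sketch; the dictionaries simply mirror the survey table):

```python
from fractions import Fraction

# Counts from the "Do you smoke?" survey table
yes = {"Male": 19, "Female": 12}
no = {"Male": 41, "Female": 28}
total = sum(yes.values()) + sum(no.values())   # 100 people

# Joint probability: P(Male and Smokes)
p_male_and_smokes = Fraction(yes["Male"], total)
print(p_male_and_smokes)  # 19/100

# Conditional probabilities come from reduced sample spaces
p_smokes_given_male = Fraction(yes["Male"], yes["Male"] + no["Male"])     # all males
p_male_given_smoker = Fraction(yes["Male"], yes["Male"] + yes["Female"])  # all smokers
print(p_smokes_given_male, p_male_given_smoker)  # 19/60 19/31
```

Notice how each conditional probability just changes the denominator to the count of the given (reduced) sample space.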

After that last part, you have just worked a Bayes' Theorem problem. I know you didn't realize it -- that's the beauty of it. A Bayes' problem can be set up so it appears to be just another conditional probability. In this class we will treat Bayes' problems as another conditional probability and not involve the large messy formula given in the text (and every other text).

Example 2: There are three major manufacturing companies that make a product: Aberations, Brochmailians, and Chompieliens. Aberations has a 50% market share, and Brochmailians has a 30% market share. 5% of Aberations' product is defective, 7% of Brochmailians' product is defective, and 10% of Chompieliens' product is defective. This information can be placed into a joint probability distribution.
Company         Good                   Defective             Total
Aberations      0.50-0.025 = 0.475     0.05(0.50) = 0.025    0.50
Brochmailians   0.30-0.021 = 0.279     0.07(0.30) = 0.021    0.30
Chompieliens    0.20-0.020 = 0.180     0.10(0.20) = 0.020    0.20
Total           0.934                  0.066                 1.00

The percent of the market share for Chompieliens wasn't given, but since the marginals must add to 1.00, they have a 20% market share. Notice that the 5%, 7%, and 10% defective rates don't go into the table directly. This is because they are conditional probabilities and the table is a joint probability table. These defective probabilities are conditional upon which company was given. That is, the 7% is not P(Defective), but P(Defective|Brochmailians). The joint probability P(Defective and Brochmailians) = P(Defective|Brochmailians) * P(Brochmailians). The "good" probabilities can be found by subtraction as shown above, or by multiplication using conditional probabilities. If 7% of Brochmailians' product is defective, then 93% is good: 0.93(0.30) = 0.279.

What is the probability a randomly selected product is defective? P(Defective) = 0.066 What is the probability that a defective product came from Brochmailians? P(Brochmailian|Defective) = P(Brochmailian and Defective) / P(Defective) = 0.021/0.066 = 7/22 = 0.318 (approx). Are these events independent? No. If they were, then P(Brochmailians|Defective)=0.318 would have to equal the P(Brochmailians)=0.30, but it doesn't. Also, the P(Aberations and Defective)=0.025 would have to be P(Aberations)*P(Defective) = 0.50*0.066=0.033, and it doesn't.

The second question asked above is a Bayes' problem. Again, my point is, you don't have to know Bayes formula just to work a Bayes' problem.

Bayes' Theorem

However, just for the sake of argument, let's say that you want to know what Bayes' formula is. Let's use the same example, but shorten each event to its one-letter initial, i.e., A, B, C, and D instead of Aberations, Brochmailians, Chompieliens, and Defective.

P(D|B) is not a Bayes problem; it is given in the problem. Bayes' formula finds the reverse conditional probability P(B|D). It is based on the fact that the given event (D) is made of three parts: the part of D in A, the part of D in B, and the part of D in C.
P(B|D) = P(B and D) / [ P(A and D) + P(B and D) + P(C and D) ]

Inserting the multiplication rule for each of these joint probabilities gives
P(B|D) = P(D|B)*P(B) / [ P(D|A)*P(A) + P(D|B)*P(B) + P(D|C)*P(C) ]

However, and I hope you agree, it is much easier to take the joint probability divided by the marginal probability. The table does the adding for you and makes the problems doable without having to memorize the formulas.
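Here is a Python sketch of that table approach for Example 2 -- build the joint probabilities, sum them to get the marginal, then divide (the names are illustrative):

```python
# Priors (market shares) and conditionals (defect rates) from the example
share = {"Aberations": 0.50, "Brochmailians": 0.30, "Chompieliens": 0.20}
defect_rate = {"Aberations": 0.05, "Brochmailians": 0.07, "Chompieliens": 0.10}

# Joint probabilities: P(company and D) = P(D | company) * P(company)
joint = {company: defect_rate[company] * share[company] for company in share}

# The marginal P(D) is the sum of the three joints -- the table adds for you
p_defective = sum(joint.values())
print(round(p_defective, 3))  # 0.066

# Bayes as joint divided by marginal -- no big formula needed
p_broch_given_defective = joint["Brochmailians"] / p_defective
print(round(p_broch_given_defective, 3))  # 0.318
```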

Counting Techniques

Fundamental Theorems

Every branch of mathematics has its fundamental theorem or theorems.

Fundamental Theorem of Arithmetic: Every integer greater than one is either prime or can be expressed as a unique product of prime numbers.
Fundamental Theorem of Algebra: Every polynomial function in one variable of degree n > 0 has at least one real or complex zero.
Fundamental Theorem of Linear Programming: If there is a solution to a linear programming problem, then it will occur at a corner point or on a boundary between two or more corner points.

Fundamental Counting Principle

In a sequence of events, the total possible number of ways all events can be performed is the product of the possible number of ways each individual event can be performed.

Factorials

If n is a positive integer, then
n! = n (n-1) (n-2) ... (3)(2)(1) n! = n (n-1)!

A special case is 0!
0! = 1

Permutations A permutation is an arrangement of objects without repetition and where order is important. Another definition of permutation is the number of arrangements that can be formed. Permutations using all the objects A permutation of n objects, arranged into one group of size n, without repetition, and order being important is:
nPn = P(n,n) = n!

Example: Find all permutations of the letters "ABC"


ABC ACB BAC BCA CAB CBA

Permutations of some of the objects A permutation of n objects, arranged in groups of size r, without repetition, and order being important is:
nPr = P(n,r) = n! / (n-r)!

The calculator can be used to find the number of such permutations. On the TI-82 or TI-83, the permutation key is found under the Math, Probability menu.

Example: Find all two-letter permutations of the letters "ABC"


AB AC BA BC CA CB
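If you'd rather check this on a computer than a TI-82, Python's itertools can generate the permutations directly (a short sketch):

```python
from itertools import permutations
from math import factorial

# All two-letter permutations of "ABC" -- order matters
perms = ["".join(p) for p in permutations("ABC", 2)]
print(perms)  # ['AB', 'AC', 'BA', 'BC', 'CA', 'CB']

# The count agrees with P(n,r) = n! / (n-r)!
print(factorial(3) // factorial(3 - 2))  # 6
```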

Shortcut formula for finding a permutation

Assuming that you start at n and count down to 1 in your factorials ... P(n,r) = first r factors of n factorial

Distinguishable Permutations

Sometimes letters are repeated and all of the permutations aren't distinguishable from each other. Example: Find all permutations of the letters "BOB". To help you distinguish, I'll write the second "B" as "b".
BOb BbO OBb ObB bBO bOB

If you write both "B"s identically, however ...


BOB BBO OBB OBB BBO BOB

There are really only three distinguishable permutations here.


BOB BBO OBB

If a word has N letters, k of which are unique, and you let n1, n2, n3, ..., nk be the frequencies of each of the k letters, then the total number of distinguishable permutations is given by:

N! / (n1! * n2! * ... * nk!)

Consider the word "STATISTICS". Here are the frequencies of each letter: S=3, T=3, A=1, I=2, C=1; there are 10 letters total.

Permutations = 10! / (3! 3! 1! 2! 1!) = 3628800 / 72 = 50400
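The same count works for any word with a short Python helper (the function name is just for illustration):

```python
from collections import Counter
from math import factorial

def distinguishable_permutations(word):
    # N! divided by the factorial of each letter's frequency
    denominator = 1
    for frequency in Counter(word).values():
        denominator *= factorial(frequency)
    return factorial(len(word)) // denominator

print(distinguishable_permutations("BOB"))         # 3
print(distinguishable_permutations("STATISTICS"))  # 50400
```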

You can find distinguishable permutations using the TI-82.

Combinations

A combination is an arrangement of objects without repetition and where order is not important. Note: The difference between a permutation and a combination is not whether there is repetition or not -- there must not be repetition with either, and if there is repetition, you can not use the formulas for permutations or combinations. The only difference in the definition of a permutation and a combination is whether order is important. A combination of n objects, arranged in groups of size r, without repetition, and order not being important is:
nCr = C(n,r) = n! / ( (n-r)! * r! )

Another way to write a combination of n things, r at a time is using the binomial notation: Example: Find all two-letter combinations of the letters "ABC"
AB = BA AC = CA BC = CB

There are only three two-letter combinations.
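Again, itertools provides a direct check (a Python sketch), along with the symmetry property discussed below:

```python
from itertools import combinations
from math import comb

# All two-letter combinations of "ABC" -- order does not matter
combos = ["".join(c) for c in combinations("ABC", 2)]
print(combos)  # ['AB', 'AC', 'BC']

# C(n,r) = n! / ((n-r)! r!), and it is symmetric: C(n,r) = C(n,n-r)
print(comb(3, 2))                  # 3
print(comb(10, 4) == comb(10, 6))  # True
```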


Shortcut formula for finding a combination

Assuming that you start at n and count down to 1 in your factorials ... C(n,r) = first r factors of n factorial divided by the last r factors of n factorial

Pascal's Triangle

Combinations are used in the binomial expansion theorem from algebra to give the coefficients of the expansion (a+b)^n. They also form a pattern known as Pascal's Triangle.
1
1  1
1  2  1
1  3  3  1
1  4  6  4  1
1  5 10 10  5  1
1  6 15 20 15  6  1
1  7 21 35 35 21  7  1

Each element in the table is the sum of the two elements directly above it. Each element is also a combination. The n value is the number of the row (start counting at zero) and the r value is the element in the row (start counting at zero). That would make the 20 in the next to last row C(6,3) -- it's in row #6 (7th row) and position #3 (4th element).

Symmetry

Pascal's Triangle illustrates the symmetric nature of a combination.

C(n,r) = C(n,n-r)

Example: C(10,4) = C(10,6) or C(100,99) = C(100,1)


Shortcut formula for finding a combination

Since combinations are symmetric, if n-r is smaller than r, then switch the combination to its alternative form and then use the shortcut given above. C(n,r) = first r factors of n factorial divided by the last r factors of n factorial TI-82 You can use the TI-82 graphing calculator to find factorials, permutations, and combinations.

Tree Diagrams Tree diagrams are a graphical way of listing all the possible outcomes. The outcomes are listed in an orderly fashion, so listing all of the possible outcomes is easier than just trying to make sure that you have them all listed. It is called a tree diagram because of the way it looks.

The first event appears on the left, and then each sequential event is represented as branches off of the first event. A tree diagram for flipping two coins shows the possible outcomes. The final outcomes are obtained by following each branch to its conclusion. From top to bottom, they are:
HH HT TH TT

Probability Distributions

Definitions

Random Variable: Variable whose values are determined by chance.
Probability Distribution: The values a random variable can assume and the corresponding probabilities of each.
Expected Value: The theoretical mean of the variable.
Binomial Experiment: An experiment with a fixed number of independent trials. Each trial can only have two outcomes, or outcomes which can be reduced to two outcomes. The probability of each outcome must remain constant from trial to trial.
Binomial Distribution: The outcomes of a binomial experiment with their corresponding probabilities.
Multinomial Distribution: A probability distribution resulting from an experiment with a fixed number of independent trials. Each trial has two or more mutually exclusive outcomes. The probability of each outcome must remain constant from trial to trial.
Poisson Distribution: A probability distribution used when a density of items is distributed over a period of time. The sample size needs to be large and the probability of success small.
Hypergeometric Distribution: A probability distribution of a variable with two outcomes when sampling is done without replacement.

Probability Distributions

Probability Functions A probability function is a function which assigns probabilities to the values of a random variable.

All the probabilities must be between 0 and 1 inclusive.
The sum of the probabilities of the outcomes must be 1.

If these two conditions aren't met, then the function isn't a probability function. There is no requirement that the values of the random variable only be between 0 and 1, only that the probabilities be between 0 and 1.

Probability Distributions

A listing of all the values the random variable can assume with their corresponding probabilities makes a probability distribution.

A note about random variables. A random variable does not mean that the values can be anything (a random number). Random variables have a well defined set of outcomes and well defined probabilities for the occurrence of each outcome. The random refers to the fact that the outcomes happen by chance -- that is, you don't know which outcome will occur next.

Here's an example probability distribution that results from the rolling of a single fair die.
x      1     2     3     4     5     6     sum
p(x)   1/6   1/6   1/6   1/6   1/6   1/6   6/6 = 1

Mean, Variance, and Standard Deviation

Consider the following. The definitions for population mean and variance used with an ungrouped frequency distribution were:

mu = sum( x * f ) / N
sigma^2 = sum( (x - mu)^2 * f ) / N

Some of you might be confused by only dividing by N. Recall that this is the population variance; the sample variance, which is the unbiased estimator for the population variance, is the one divided by n-1. Using algebra, the variance is equivalent to:

sigma^2 = sum( x^2 * (f/N) ) - mu^2

Recall that a probability is a long term relative frequency, so every f/N can be replaced by p(x). This simplifies to:

sigma^2 = sum( x^2 * p(x) ) - [ sum( x * p(x) ) ]^2

What's even better is that the last portion of the variance is the mean squared. So, the two formulas that we will be using are:

mu = sum( x * p(x) )
sigma^2 = sum( x^2 * p(x) ) - mu^2

Here's the example we were working on earlier.


x     p(x)      x * p(x)     x^2 * p(x)
1     1/6       1/6          1/6
2     1/6       2/6          4/6
3     1/6       3/6          9/6
4     1/6       4/6          16/6
5     1/6       5/6          25/6
6     1/6       6/6          36/6
sum   6/6 = 1   21/6 = 3.5   91/6 = 15.1667

The mean is 7/2 or 3.5. The variance is 91/6 - (7/2)^2 = 35/12 = 2.916666... The standard deviation is the square root of the variance: 1.7078 (approx). Do not use rounded off values in the intermediate calculations; only round off the final answer. You can learn how to find the mean and variance of a probability distribution using lists with the TI-82 or using the program called PDIST.
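If you'd rather not use the TI-82, the same computation is a few lines of Python with exact fractions (a sketch):

```python
from fractions import Fraction

# Distribution for a single fair die: p(x) = 1/6 for x = 1..6
distribution = {x: Fraction(1, 6) for x in range(1, 7)}

# mu = sum(x * p(x));  sigma^2 = sum(x^2 * p(x)) - mu^2
mean = sum(x * p for x, p in distribution.items())
variance = sum(x**2 * p for x, p in distribution.items()) - mean**2

print(mean)      # 7/2
print(variance)  # 35/12
print(round(float(variance) ** 0.5, 4))  # 1.7078
```

Keeping everything as a Fraction until the final square root avoids the rounding problem the note above warns about.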

Binomial Probabilities

Binomial Experiment

A binomial experiment is an experiment which satisfies these four conditions:

A fixed number of trials.
Each trial is independent of the others.
There are only two outcomes.
The probability of each outcome remains constant from trial to trial.

These can be summarized as: an experiment with a fixed number of independent trials, each of which can only have two possible outcomes. The fact that each trial is independent actually means that the probabilities remain constant.

Examples of binomial experiments

Tossing a coin 20 times to see how many tails occur.
Asking 200 people if they watch ABC news.
Rolling a die to see if a 5 appears.
Asking 500 die-hard Republicans if they would vote for the Democratic candidate. (Just because something is unlikely doesn't mean that it isn't binomial. The conditions are met: there's a fixed number [500], the trials are independent [what one person does doesn't affect the next person], and there are only two outcomes [yes or no].)

Examples which aren't binomial experiments


Rolling a die until a 6 appears (not a fixed number of trials).
Asking 20 people how old they are (not two outcomes).
Drawing 5 cards from a deck for a poker hand (done without replacement, so not independent).

Binomial Probability Function Example: What is the probability of rolling exactly two sixes in 6 rolls of a die? There are five things you need to do to work a binomial story problem. 1. Define Success first. Success must be for a single trial. Success = "Rolling a 6 on a single die" 2. Define the probability of success (p): p = 1/6

3. Find the probability of failure: q = 5/6 4. Define the number of trials: n = 6 5. Define the number of successes out of those trials: x = 2 Anytime a six appears, it is a success (denoted S) and anytime something else appears, it is a failure (denoted F). The ways you can get exactly 2 successes in 6 trials are given below. The probability of each is written to the right of the way it could occur. Because the trials are independent, the probability of the event (all six dice) is the product of each probability of each outcome (die)
 1  FFFFSS     6  FFSSFF    11  SFFFFS
 2  FFFSFS     7  FSFFFS    12  SFFFSF
 3  FFFSSF     8  FSFFSF    13  SFFSFF
 4  FFSFFS     9  FSFSFF    14  SFSFFF
 5  FFSFSF    10  FSSFFF    15  SSFFFF

Each S contributes a factor of 1/6 and each F a factor of 5/6, so every one of these arrangements has probability (1/6)^2 * (5/6)^4.

Notice that each of the 15 probabilities is exactly the same: (1/6)^2 * (5/6)^4. Also, note that the 1/6 is the probability of success and you needed 2 successes. The 5/6 is the probability of failure, and if 2 of the 6 trials were successes, then 4 of the 6 must be failures. Note that 2 is the value of x and 4 is the value of n-x. Further note that there are fifteen ways this can occur. This is the number of ways 2 successes can occur in 6 trials without repetition and order not being important, or a combination of 6 things, 2 at a time. The probability of getting exactly x successes in n trials, with the probability of success on a single trial being p, is:

P(X=x) = nCx * p^x * q^(n-x)

Example: A coin is tossed 10 times. What is the probability that exactly 6 heads will occur?

1. Success = "A head is flipped on a single coin"

2. p = 0.5
3. q = 0.5
4. n = 10
5. x = 6

P(x=6) = 10C6 * 0.5^6 * 0.5^4 = 210 * 0.015625 * 0.0625 = 0.205078125

Mean, Variance, and Standard Deviation

The mean, variance, and standard deviation of a binomial distribution are extremely easy to find.

mu = n * p
sigma^2 = n * p * q
sigma = sqrt(n * p * q)

Another way to remember the variance is mu * q (since np is mu). Example: Find the mean, variance, and standard deviation for the number of sixes that appear when rolling 30 dice. Success = "a six is rolled on a single die"; p = 1/6, q = 5/6. The mean is 30 * (1/6) = 5. The variance is 30 * (1/6) * (5/6) = 25/6. The standard deviation is the square root of the variance: 2.041241452 (approx).
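The binomial formulas above fit in a short Python sketch (binomial_pmf is an illustrative helper, not a library function):

```python
from math import comb, sqrt

def binomial_pmf(n, x, p):
    # P(X = x) = nCx * p^x * q^(n-x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(binomial_pmf(10, 6, 0.5))           # 0.205078125 (the coin example)
print(round(binomial_pmf(6, 2, 1/6), 4))  # 0.2009 (two sixes in six rolls)

# Mean, variance, and standard deviation for 30 dice, success = rolling a six
n, p = 30, 1/6
mean, variance = n * p, n * p * (1 - p)
print(round(mean, 4), round(variance, 4), round(sqrt(variance), 4))  # 5.0 4.1667 2.0412
```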
Other Discrete Distributions

Multinomial Probabilities A multinomial experiment is an extended binomial probability. The difference is that in a multinomial experiment, there are more than two possible outcomes. However, there are still a fixed number of independent trials, and the probability of each outcome must remain constant from trial to trial.

Instead of using a combination, as in the case of the binomial probability, the number of ways the outcomes can occur is done using distinguishable permutations. An example here will be much more useful than a formula. The probability that a person will pass a College Algebra class is 0.55, the probability that a person will withdraw before the class is completed is 0.40, and the probability that a person will fail the class is 0.05. Find the probability that in a class of 30 students, exactly 16 pass, 12 withdraw, and 2 fail.
Outcome    x    p(outcome)
Pass       16   0.55
Withdraw   12   0.40
Fail        2   0.05
Total      30   1.00

The probability is found using this formula:


P = 30! / (16! * 12! * 2!) * 0.55^16 * 0.40^12 * 0.05^2
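A Python sketch of the same computation (the helper name is illustrative); note how the coefficient is just a distinguishable-permutation count:

```python
from math import factorial

def multinomial_probability(counts, probabilities):
    # Distinguishable-permutation coefficient times the product of each p^x
    n = sum(counts)
    coefficient = factorial(n)
    probability = 1.0
    for x, p in zip(counts, probabilities):
        coefficient //= factorial(x)
        probability *= p**x
    return coefficient * probability

# 16 pass, 12 withdraw, 2 fail in a class of 30
print(round(multinomial_probability([16, 12, 2], [0.55, 0.40, 0.05]), 4))
```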

You can do this on the TI-82. The multinomial experiment will be used later when we talk about the chi-square goodness of fit test. Poisson Probabilities Named after the French mathematician Simeon Poisson, Poisson probabilities are useful when there are a large number of independent trials with a small probability of success on a single trial and the variables occur over a period of time. It can also be used when a density of items is distributed over a given area or volume.

p(x; lambda) = lambda^x * e^(-lambda) / x!

Lambda in the formula is the mean number of occurrences. If you're approximating a binomial probability using the Poisson, then lambda is the same as mu, or n * p. Example: If there are 500 customers per eight-hour day in a check-out lane, what is the probability that there will be exactly 3 in line during any five-minute period? The expected value during any one five-minute period would be 500 / 96 = 5.2083333. The 96 is because there are 96 five-minute periods in eight hours. So, you expect about 5.2 customers in 5 minutes and want to know the probability of getting exactly 3.
p(3;500/96) = e^(-500/96) * (500/96)^3 / 3! = 0.1288 (approx)
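The Poisson computation in Python (a sketch; poisson_pmf is an illustrative helper):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    # p(x; lambda) = lambda^x * e^(-lambda) / x!
    return lam**x * exp(-lam) / factorial(x)

lam = 500 / 96   # expected customers in one five-minute period
print(round(lam, 4))                  # 5.2083
print(round(poisson_pmf(3, lam), 4))  # 0.1288
```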

Hypergeometric Probabilities Hypergeometric experiments occur when the trials are not independent of each other and occur due to sampling without replacement -- as in a five card poker hand. Hypergeometric probabilities involve the multiplication of two combinations together and then division by the total number of combinations. Example: How many ways can 3 men and 4 women be selected from a group of 7 men and 10 women?

The answer is C(7,3) * C(10,4) / C(17,7) = 7350/19448 = 0.3779 (approx)

Note that the numbers in the numerator combinations add to give the numbers used in the combination in the denominator (7 + 10 = 17 on top and 3 + 4 = 7 on the bottom). This can be extended to more than two groups, and is then called an extended hypergeometric problem. You can use the TI-82 to find hypergeometric probabilities.
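A quick Python check of those numbers:

```python
from math import comb

# 3 men from 7 and 4 women from 10, drawing 7 people from all 17
numerator = comb(7, 3) * comb(10, 4)
denominator = comb(17, 7)

print(numerator, denominator)             # 7350 19448
print(round(numerator / denominator, 4))  # 0.3779
```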

Normal Distribution

Definitions

Central Limit Theorem: Theorem which states that as the sample size increases, the sampling distribution of the sample means will become approximately normally distributed.
Correction for Continuity: A correction applied to convert a discrete distribution to a continuous distribution.
Finite Population Correction Factor: A correction applied to the standard error of the means when the sample size is more than 5% of the population size and the sampling is done without replacement.
Sampling Distribution of the Sample Means: Distribution obtained by using the means computed from random samples of a specific size.
Sampling Error: Difference which occurs between the sample statistic and the population parameter due to the fact that the sample isn't a perfect representation of the population.
Standard Error of the Mean: The standard deviation of the sampling distribution of the sample means. It is equal to the standard deviation of the population divided by the square root of the sample size.
Standard Normal Distribution: A normal distribution in which the mean is 0 and the standard deviation is 1. It is denoted by z.
Z-score: Also known as z-value. A standardized score in which the mean is zero and the standard deviation is 1. The z-score is used to represent the standard normal distribution.

Normal Distributions

Any Normal Distribution


Bell-shaped
Symmetric about the mean
Continuous
Never touches the x-axis
Total area under the curve is 1.00
Approximately 68% of the area lies within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations of the mean. This is the Empirical Rule mentioned earlier.
Data values are represented by x, which has mean mu and standard deviation sigma.
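The Empirical Rule percentages can be reproduced from the normal curve itself; this stdlib-only Python sketch uses `math.erf`, since the area within k standard deviations of the mean is erf(k / sqrt(2)):

```python
from math import erf, sqrt

# Area under any normal curve within k standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(erf(k / sqrt(2)), 4))
# prints 0.6827, 0.9545, 0.9973 -- the 68%, 95%, 99.7% of the rule
```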

Probability function given by

f(x) = 1 / ( sigma * sqrt( 2 pi ) ) * e^( -(x - mu)^2 / (2 sigma^2) )

Standard Normal Distribution

Same as a normal distribution, but also ...


Mean is zero
Variance is one
Standard deviation is one
Data values are represented by z.

Probability function given by

f(z) = 1 / sqrt( 2 pi ) * e^( -z^2 / 2 )


Standard Normal: Mean = 0 and Variance = 1
Non-Standard Normal: Mean is not 0 or Variance is not 1

Normal Probabilities This table has not been verified against the book, please use the table out of your textbook. Comprehension of this table is vital to success in the course! There is a table which must be used to look up standard normal probabilities. The z-score is broken into two parts: the whole number and tenths digit are looked up along the left side, and the hundredths digit is looked up across the top. The value at the intersection of the row and column is the area under the curve between zero and the z-score looked up. Because of the symmetry of the normal distribution, look up the absolute value of any z-score.

Computing Normal Probabilities There are several different situations that can arise when asked to find normal probabilities.
Situation: Between zero and any number
Instructions: Look up the area in the table.

Situation: Between two positives, or between two negatives
Instructions: Look up both areas in the table and subtract the smaller from the larger.

Situation: Between a negative and a positive
Instructions: Look up both areas in the table and add them together.

Situation: Less than a negative, or greater than a positive
Instructions: Look up the area in the table and subtract it from 0.5000.

Situation: Greater than a negative, or less than a positive
Instructions: Look up the area in the table and add it to 0.5000.
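The situations above can be sketched in Python; a stdlib helper `area_0_to_z` (a hypothetical name, built on `math.erf`) plays the role of the printed table:

```python
from math import erf, sqrt

def area_0_to_z(z):
    # The table entry: area under the standard normal curve
    # between 0 and |z|.
    return 0.5 * erf(abs(z) / sqrt(2))

# Between a negative and a positive: add the two areas.
p1 = area_0_to_z(-1.23) + area_0_to_z(0.45)   # about 0.564

# Between two positives: subtract the smaller area from the larger.
p2 = area_0_to_z(1.23) - area_0_to_z(0.45)    # about 0.217

# Greater than a positive: subtract the area from 0.5000.
p3 = 0.5 - area_0_to_z(0.45)                  # about 0.326
print(p1, p2, p3)
```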

This can be shortened into two rules.

1. If there is only one z-score given, use 0.5000 for the second area; otherwise look up both z-scores in the table.
2. If the two numbers have the same sign, then subtract; if they have different signs, then add. If there is only one z-score, then use the inequality to determine the second sign (< is negative, and > is positive).

Finding z-scores from probabilities This is more difficult, and requires you to use the table inversely. You must look up the area between zero and the value on the inside part of the table, and then read the z-score from the outside. Finally, decide if the z-score should be positive or negative, based on whether it was on the left side or the right side of the mean. Remember, z-scores can be negative, but areas or probabilities cannot be.

Situation: Area between 0 and a value
Instructions: Look up the area in the table. Make negative if on the left side.

Situation: Area in one tail
Instructions: Subtract the area from 0.5000. Look up the difference in the table. Make negative if in the left tail.

Situation: Area including one complete half (less than a positive or greater than a negative)
Instructions: Subtract 0.5000 from the area. Look up the difference in the table. Make negative if on the left side.

Situation: Within z units of the mean
Instructions: Divide the area by 2. Look up the quotient in the table. Use both the positive and negative z-scores.

Situation: Two tails with equal area (more than z units from the mean)
Instructions: Subtract the area from 1.000. Divide the area by 2. Look up the quotient in the table. Use both the positive and negative z-scores.
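The inverse lookup can also be sketched in Python. The stdlib has no inverse for `erf`, so this sketch inverts P(0 < Z < z) by bisection (the function name and search bounds are illustrative choices):

```python
from math import erf, sqrt

def z_from_area(area):
    # Find z >= 0 with P(0 < Z < z) = area, by bisection on the
    # increasing function 0.5 * erf(z / sqrt(2)).
    lo, hi = 0.0, 6.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 0.5 * erf(mid / sqrt(2)) < area:
            lo = mid
        else:
            hi = mid
    return lo

# 'Area in one tail': P(Z > z) = 0.05, so look up 0.5000 - 0.05 = 0.4500.
print(round(z_from_area(0.45), 3))  # 1.645
```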

Using the table becomes proficient with practice, work lots of the normal probability problems!

Standard Normal Probabilities


z     0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1  0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2  0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3  0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4  0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5  0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6  0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7  0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8  0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9  0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0  0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1  0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2  0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3  0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4  0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5  0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6  0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7  0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8  0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9  0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0  0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1  0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2  0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3  0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4  0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5  0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6  0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7  0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8  0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9  0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0  0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990

The values in the table are the areas between zero and the z-score. That is, P(0 < Z < z-score).
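Since the note above warns the table is unverified, individual entries can be checked with Python's `math.erf`: the table entry for a given z is 0.5 * erf(z / sqrt(2)). For example, the z = 1.0 row:

```python
from math import erf, sqrt

# Rebuild the z = 1.0 row of the table: P(0 < Z < z) for z = 1.00..1.09.
row = [round(0.5 * erf((1.0 + h / 100) / sqrt(2)), 4) for h in range(10)]
print(row)  # [0.3413, 0.3438, 0.3461, ..., 0.3621]
```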

Central Limit Theorem

Sampling Distribution of the Sample Means Instead of working with individual scores, statisticians often work with means. Several samples are taken, the mean is computed for each sample, and then those means are used as the data, rather than the individual scores. The resulting collection of sample means is called the sampling distribution of the sample means. Examples

Example 1: Sampling Distribution of Values (x) Consider the case where a single, fair die is rolled. Here are the values that are possible and their probabilities.
Value        1    2    3    4    5    6
Probability  1/6  1/6  1/6  1/6  1/6  1/6

Here are the mean, variance, and standard deviation of this probability distribution.

Mean, mu = sum [ x * p(x) ] = 3.5
Variance, sigma^2 = sum [ x^2 * p(x) ] - mu^2 = 35/12
Standard deviation, sigma = sqrt ( variance ) = sqrt ( 35/12 )

Example 2: Sampling Distribution of Sample Means (x-bar) Consider the case where two fair dice are rolled instead of one. Here are the sums that are possible and their probabilities.

Sum   2     3     4     5     6     7     8     9     10    11    12
Prob  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

But, we're not interested in the sum of the dice, we're interested in the sample mean. We find the sample mean by dividing the sum by the sample size.
Mean  1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0   5.5   6.0
Prob  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

Computing the mean, variance, and standard deviation, we get ...

Mean, mu = sum [ x * p(x) ] = 3.5
Variance, sigma^2 = sum [ x^2 * p(x) ] - mu^2 = 35/24
Standard deviation, sigma = sqrt ( variance ) = sqrt ( 35/24 )
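Example 2 can be verified by brute force in Python, enumerating all 36 equally likely rolls (exact arithmetic via `fractions` keeps 35/24 recognizable):

```python
from fractions import Fraction
from itertools import product

# Sample mean of each of the 36 equally likely two-dice outcomes.
means = [Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2)]

mu = sum(means) / 36                            # 7/2, i.e. 3.5
var = sum(m * m for m in means) / 36 - mu**2    # 35/24
print(mu, var)  # 7/2 35/24
```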

Properties of the Sampling Distribution of the Sample Means When all of the possible sample means are computed, then the following properties are true:

1. The mean of the sample means will be the mean of the population.
2. The variance of the sample means will be the variance of the population divided by the sample size.
3. The standard deviation of the sample means (known as the standard error of the mean) will be smaller than the population standard deviation and will be equal to the standard deviation of the population divided by the square root of the sample size.
4. If the population has a normal distribution, then the sample means will have a normal distribution.
5. If the population is not normally distributed, but the sample size is sufficiently large, then the sample means will have an approximately normal distribution. Some books define sufficiently large as at least 30 and others as at least 31.

The formula for a z-score when working with the sample means is:

z = ( x-bar - mu ) / ( sigma / sqrt(n) )

Finite Population Correction Factor If the sample size is more than 5% of the population size and the sampling is done without replacement, then a correction needs to be made to the standard error of the means. In the following, N is the population size and n is the sample size. The adjustment is to multiply the standard error by the square root of the quotient of the difference between the population and sample sizes and one less than the population size:

corrected standard error = ( sigma / sqrt(n) ) * sqrt( (N - n) / (N - 1) )

For the most part, we will be ignoring this in class.
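Both formulas fit in one small Python helper (the function name and the example numbers are made up for illustration):

```python
from math import sqrt

def z_for_sample_mean(xbar, mu, sigma, n, N=None):
    # Standard error of the mean: sigma / sqrt(n), times the finite
    # population correction sqrt((N - n) / (N - 1)) when n is more
    # than 5% of a population of size N sampled without replacement.
    se = sigma / sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))
    return (xbar - mu) / se

# n = 36 from an effectively infinite population, mu = 100, sigma = 12:
print(z_for_sample_mean(103, 100, 12, 36))  # 1.5
```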

Normal Approximation to Binomial

Recall that according to the Central Limit Theorem, the sample mean of any distribution will become approximately normal if the sample size is sufficiently large. It turns out that the binomial distribution can be approximated using the normal distribution if np and nq are both at least 5. Furthermore, recall that the mean of a binomial distribution is np and the variance of the binomial distribution is npq.

Continuity Correction Factor There is a problem with approximating the binomial with the normal: the binomial distribution is a discrete distribution while the normal distribution is a continuous distribution. The basic difference here is that with discrete values we are talking about heights but no widths, while with the continuous distribution we are talking about both heights and widths. The correction is to either add or subtract 0.5 of a unit from each discrete x-value. This fills in the gaps to make the distribution continuous. This is very similar to expanding the limits to form class boundaries, which we did with grouped frequency distributions. Examples
Discrete    Continuous
x = 6       5.5 < x < 6.5
x > 6       x > 6.5
x >= 6      x > 5.5
x < 6       x < 5.5
x <= 6      x < 6.5

As you can see, whether or not the "equal to" is included makes a big difference in the discrete distribution and the way the conversion is performed. However, for a continuous distribution, equality makes no difference.

Steps to working a normal approximation to the binomial distribution

1. Identify success, the probability of success, the number of trials, and the desired number of successes. Since this is a binomial problem, these are the same things which were identified when working a binomial problem.
2. Convert the discrete x to a continuous x. Some people would argue that step 3 should be done before this step, but go ahead and convert the x before you forget about it and miss the problem.
3. Find the smaller of np or nq. If the smaller one is at least five, then the larger must also be, so the approximation will be considered good. When you find np, you're actually finding the mean, mu, so denote it as such.
4. Find the standard deviation, sigma = sqrt (npq). It might be easier to find the variance and just stick the square root in the final calculation - that way you don't have to work with all of the decimal places.
5. Compute the z-score using the standard formula for an individual score (not the one for a sample mean).
6. Calculate the probability desired.
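The six steps can be sketched as a stdlib-only Python function (`math.erf` replaces the table lookup in step 6; the function name and the coin-flip numbers are illustrative):

```python
from math import erf, sqrt

def normal_approx_binomial_le(n, p, x):
    # P(X <= x) for X ~ Binomial(n, p), by the normal approximation.
    q = 1 - p
    assert min(n * p, n * q) >= 5   # step 3: approximation considered good
    mu = n * p                      # np is also the mean
    sigma = sqrt(n * p * q)         # step 4
    x_cont = x + 0.5                # step 2: continuity correction for <=
    z = (x_cont - mu) / sigma       # step 5: individual-score z formula
    return 0.5 * (1 + erf(z / sqrt(2)))  # step 6: look up the probability

# P(X <= 55) heads in 100 fair coin flips: z = (55.5 - 50) / 5 = 1.1
print(round(normal_approx_binomial_le(100, 0.5, 55), 4))  # 0.8643
```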

Importance of the Normal Distribution

Parametric Hypothesis Testing All parametric hypothesis testing that we're going to perform requires normality in some sense.

Population Mean: Either the population was normally distributed, the sample size was large enough (so the central limit theorem applied and the distribution was approximately normal), or the population was approximately normal and the Student's t was used.

Population Proportion: The binomial distribution (the one that really applies) was approximated using the normal as long as np and nq were at least five. That is another way of saying the expected frequency of each category (success and failure) is at least five.

Population Variance: It was required that the population be normally distributed.

Correlation and Regression: The pairs of data had to have a bivariate normal distribution.

Multinomial Experiment: The expected frequency of each category had to be at least five. This is analogous to approximating the binomial using the normal.

Independence: The expected frequency of each cell had to be at least five. This is analogous to approximating the binomial using the normal.

Distributions The distributions have normality in them somewhere, too.

Normal Distribution: Well, obviously this one requires normality.

Student's t Distribution: Had to be approximately normal. As the sample size increases, the Student's t approaches the normal distribution.

Chi-squared Distribution: Required a normal population. There is another interesting relationship between the normal and chi-square distributions. If you take a critical value from the normal distribution and square it, you will get the corresponding chi-square value with one degree of freedom, but twice the area in the tails. Example: z(0.05)^2 = 1.645^2 = 2.706 = chi-square(1, 0.10).

F Distribution: Since F is the ratio of two independent chi-squared variables divided by their respective degrees of freedom, and the chi-squares require a normal distribution, the F distribution is also going to require a normal distribution.

Binomial Distribution: Obviously, the binomial doesn't require a normal population, but it can be approximated using a normal distribution if the expected frequency of each category is at least five.

Multinomial Distribution: Same as with the binomial, the multinomial can be approximated using the normal if the expected frequency of each category is at least five.

As stated in class and in the lecture notes ... your comprehension of the normal distribution is vital for success in the class.
