Lecture 7 focuses on probability. There's some review here, but a lot
of this material is new.
As you may recall, aleatory just means 'pertaining to luck'.
But an aleatory probability generally means one that you can
calculate because you have perfect information about the system.
0 <= Pr(E) <= 1
In other words, the probability of any event is less than or equal to 1--
but greater than or equal to 0.
How would we express probability in general?
We would write:
Pr (E) = (number of ways in which E can occur) / (number of ways in which all equally likely outcomes can occur, including E)
That is, the probability that an "event" (E) might occur can be calculated as
the number of ways in which the event can occur--divided by the total number of
equally likely ways in which all the possible outcomes (including E) can occur.
The example that we used in lecture 6 was of a six-sided fair die. Keep it in
mind as you work through the rules that follow.
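Before we get to those rules, here is a minimal Python sketch of the basic definition above, applied to the fair die (the helper function is our own illustration, not part of the lecture):

    # Sketch: aleatory probability as (ways E can occur) / (all equally likely outcomes).
    from fractions import Fraction

    def probability(event_outcomes, all_outcomes):
        # Pr(E) = number of ways E can occur / number of equally likely outcomes
        return Fraction(len(event_outcomes), len(all_outcomes))

    die = [1, 2, 3, 4, 5, 6]                  # the six equally likely outcomes
    print(probability([3], die))              # rolling a 3 -> 1/6
    print(probability([2, 4, 6], die))        # rolling an even number -> 1/2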
Pr (A or B) = Pr (A) + Pr (B) . . . for mutually exclusive events

That is, if A and B cannot both happen, the probability that one or the other occurs is the sum of their individual probabilities (for instance, the probability of rolling a 1 or a 2 on a single throw is 1/6 + 1/6 = 1/3).
Pr (A, B) = Pr(A) * Pr(B) . . . for independent events
That is, the probability of A and then B is equal to the probability of A multiplied by the probability of B.
Note that independent means that the two probabilities do not
influence each other--the likelihood that one will happen does not influence
the likelihood that the other will happen. The outcome of the first throw
does not influence the outcome of the second throw. In real life, beyond
throwing dice, outcomes are often not independent.
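As a quick check on the multiplication rule, here is a minimal Python sketch (our own, not from the lecture) that enumerates two independent throws of a fair die:

    # Sketch: verify Pr(A, B) = Pr(A) * Pr(B) for two independent die throws.
    # Example events (our choice): A = first throw is a 6, B = second throw is a 6.
    from fractions import Fraction
    from itertools import product

    throws = list(product(range(1, 7), repeat=2))    # all 36 equally likely (first, second) pairs

    pr_a       = Fraction(sum(1 for a, b in throws if a == 6), len(throws))             # 1/6
    pr_b       = Fraction(sum(1 for a, b in throws if b == 6), len(throws))             # 1/6
    pr_a_and_b = Fraction(sum(1 for a, b in throws if a == 6 and b == 6), len(throws))  # 1/36

    print(pr_a_and_b == pr_a * pr_b)                 # True: 1/36 = 1/6 * 1/6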
Let's think about independence a bit more thoroughly.
Imagine one roll of a die--and consider two possible events.
Pr (A, B) = Pr (A) * Pr (B | A)
We say this: "The probability of A and then B is equal to the probability of A
multiplied by the probability of B given A."
Essentially, what we're doing is breaking the joint probability down into a sequence: first A,
and then B given that A has already happened. We don't want to multiply by the unconditional
probability of B--because we're only interested in the probability of B if A happens.
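Here is a minimal Python sketch of the same idea for a single roll of a fair die; the particular events (A = the roll is even, B = the roll is greater than 3) are just illustrative choices of ours:

    # Sketch: Pr(A, B) = Pr(A) * Pr(B | A) for one roll of a fair die.
    from fractions import Fraction

    die = range(1, 7)
    a       = {r for r in die if r % 2 == 0}          # hypothetical event A: the roll is even -> {2, 4, 6}
    b       = {r for r in die if r > 3}               # hypothetical event B: the roll is greater than 3 -> {4, 5, 6}
    a_and_b = a & b                                   # {4, 6}

    pr_a         = Fraction(len(a), 6)                # 1/2
    pr_b_given_a = Fraction(len(a_and_b), len(a))     # 2/3 -- only look at outcomes where A happened
    pr_a_and_b   = Fraction(len(a_and_b), 6)          # 1/3

    print(pr_a_and_b == pr_a * pr_b_given_a)          # True: 1/3 = 1/2 * 2/3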
_______________________________________
A couple of other notes:
1. P (B | A) = P (A and B) / P (A)
2.
P (A or B) = P(A) + P(B) - P(A and B)
Note that this works for events that are mutually exclusive as well--because P(A and B) = 0.

Click here to look at one of many websites available giving an overview of
binomial probability.
A few things that are relevant about binomial probabilities: there is a fixed number of trials (n); each trial has only two possible outcomes ("success" or "failure"); the probability of success (p) is the same on every trial; and the trials are independent of one another.
The formula for binomial probabilities is:

Pr (x) = [n! / (x! (n-x)!)] * p^x * (1-p)^(n-x)

where n is the number of trials, x is the number of "successes" you are interested in, and p is the probability of success on any one trial.
What does all this mean? Let's talk about the components of the formula,
and then apply it to an example: the website example of the probability
that 2 (and only 2) skiers out of 5, each going down the hill
one time, will break a leg.
Note that the first part of the right-hand side--n! divided by
[x! (n-x)!]--is called the combinations formula. It tells you how
many combinations / ways there are in which the event you are interested in can
happen.
Let's think about the ski slope example. How can the combinations formula tell us
in how many combinations of trials two and only two skiers will break a leg?
So, what is n! in the ski slope example? Click
here for the answer.
What is the numerator for the above formula?
Click here
for the right answer.
What is the denominator for the above formula?
Click here for the answer.
So, what is the # of possible combinations--where two skiers out of
five total could (in any order) break their leg?
Click
here for the correct answer.
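If you want to check the arithmetic, here is a minimal Python sketch (ours, not from the lecture) of the combinations formula for the ski example (n = 5, x = 2):

    # Sketch: the combinations formula n! / [x! (n-x)!] for the ski example.
    from math import factorial

    n, x = 5, 2
    numerator   = factorial(n)                        # n! = 5! = 120
    denominator = factorial(x) * factorial(n - x)     # x! (n-x)! = 2! * 3! = 12
    print(numerator // denominator)                   # 10 ways to pick which two skiers break a leg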
We can also think of this in terms of a tree diagram, where we can
outline all the possible combinations. NB stands for "not broken"--
"B" stands for broken. Each line below represents a possible
combination of the five skiers. Skier one is represented by
the first letter--he can either break his leg or not break his leg. So,
the first 16 cases represent the 16 possible combinations if skier
one breaks his leg. The second 16 cases represent the 16 possible combinations
if skier one doesn't break his leg. What happens to skier 2? Well, the
first eight cases represent the possible combinations if skier 2 breaks his leg--
given that skier 1 broke his leg. The second eight cases represent the
possible combinations if skier 2 doesn't break his leg--given that
skier 1 broke his leg. The third eight cases represent the possible
combinations if skier 2 breaks his leg--given that skier 1 didn't
break his leg. And the last (fourth) eight cases represent the possible
combinations if skier 2 doesn't break his leg -- given that skier 1 didn't break
his leg.
B --> B --> B --> B --> B
B --> B --> B --> B --> NB
B --> B --> B --> NB --> B
B --> B --> B --> NB --> NB
B --> B --> NB --> B --> B
B --> B --> NB --> B --> NB
B --> B --> NB --> NB --> B
B --> B --> NB --> NB --> NB

B --> NB --> B --> B --> B
B --> NB --> B --> B --> NB
B --> NB --> B --> NB --> B
B --> NB --> B --> NB --> NB
B --> NB --> NB --> B --> B
B --> NB --> NB --> B --> NB
B --> NB --> NB --> NB --> B
B --> NB --> NB --> NB --> NB

NB --> B --> B --> B --> B
NB --> B --> B --> B --> NB
NB --> B --> B --> NB --> B
NB --> B --> B --> NB --> NB
NB --> B --> NB --> B --> B
NB --> B --> NB --> B --> NB
NB --> B --> NB --> NB --> B
NB --> B --> NB --> NB --> NB

NB --> NB --> B --> B --> B
NB --> NB --> B --> B --> NB
NB --> NB --> B --> NB --> B
NB --> NB --> B --> NB --> NB
NB --> NB --> NB --> B --> B
NB --> NB --> NB --> B --> NB
NB --> NB --> NB --> NB --> B
NB --> NB --> NB --> NB --> NB
Count up how many combinations have two (and only two) skiers with
broken legs. What is the answer? And why? Click
here for the correct
answer.
As an aside, note that a general formula for the total number of combinations of outcomes is the number of possible individual outcomes raised to the power of the number of trials--in this case, 2 to the power of 5: 2 because there are always two possible outcomes, and the power of 5 because there are five skiers going down the hill; that is, five trials.
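A minimal Python sketch (ours) that enumerates the tree above and counts the relevant branches:

    # Sketch: enumerate all 2**5 = 32 branches and count those with exactly two "B"s.
    from itertools import product

    branches = list(product(["B", "NB"], repeat=5))   # every possible outcome for the five skiers
    print(len(branches))                              # 32 = 2 to the power of 5
    print(sum(1 for branch in branches if branch.count("B") == 2))   # 10 branches with exactly two broken legs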
What is p, in this example of skiing? We know that p = .2. So, we can plug in p in the following way to the rest of the formula:

p^x * (1-p)^(n-x) = (.2)^2 * (.8)^3

So, after plugging everything in, our new formula for this skiing case
looks like:

Pr (2) = [5! / (2! 3!)] * (.2)^2 * (.8)^3
Which brings us to....

Pr (2) = 10 * (.04) * (.512)

Which finally brings us to....

Pr (2) = .2048
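As a check on that arithmetic, here is a minimal Python sketch (ours) of the whole binomial calculation for the ski case:

    # Sketch: binomial probability that exactly 2 of 5 skiers break a leg, with p = 0.2 per skier.
    from math import comb

    n, x, p = 5, 2, 0.2
    print(comb(n, x) * p**x * (1 - p)**(n - x))   # 0.2048 (up to floating-point rounding)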
Let's think of a slightly more straightforward example. What is the probability of getting one and only one head if you toss a fair coin three times?
Note: you can get the total number of equally likely combinations using the
formula listed above: two (because for each trial there are two possible
outcomes: heads or tails) to the power of three (because there are three
trials) = 8.
So, what would be the probability of getting one and only one head out of the
three tosses? Try it by just counting up the relevant possible events--and
try it by using the formula for binomial probabilities. Click
here
for the answer and explanation.
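Here is a minimal Python sketch (ours) that gets the answer both ways--by counting the relevant outcomes and by the binomial formula:

    # Sketch: probability of exactly one head in three tosses of a fair coin.
    from math import comb
    from itertools import product

    tosses = list(product("HT", repeat=3))                          # the 8 equally likely sequences
    by_counting = sum(1 for t in tosses if t.count("H") == 1) / len(tosses)
    by_formula  = comb(3, 1) * 0.5**1 * 0.5**2                      # 3 * (1/2)^3

    print(by_counting, by_formula)                                  # both 0.375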
Let's think again about the toss of the fair coin. What are the possible
overall distributions of heads and tails--if you're not concerned
about order? Well, you could come up with one head and two tails. Or,
you could come up with two heads and one tail. Or, you could come up with
three tails. Or, you could come up with three heads. Those four
mutually exclusive outcomes exhaust the possibilities.
Either by calculating or by looking at the tree diagram, we can ask ourselves:
what is the probability of each of those four outcomes? Three tails: 1/8.
One head (and two tails): 3/8. Two heads (and one tail): 3/8. Three heads: 1/8.
These probabilities form a binomial distribution.
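Here is a minimal Python sketch (ours) that computes that whole distribution with the binomial formula:

    # Sketch: the binomial distribution for the number of heads in 3 tosses of a fair coin.
    from math import comb

    n, p = 3, 0.5
    for x in range(n + 1):
        print(x, "heads:", comb(n, x) * p**x * (1 - p)**(n - x))    # 0.125, 0.375, 0.375, 0.125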
Calculating probabilities when there are many ways that an
event can happen--and not happen--can be quite difficult, because there
are many factorial expansions (such as n!). So, sometimes
we use an approximation of the binomial distribution--which
is the Poisson distribution. This distribution is
particularly useful for rare events with a large sample size--
that is, when you're trying to predict the (very small) likelihood that
something unusual will happen.
The formula for the Poisson probability is:

Pr (x) = [e^(-np) * (np)^x] / x!

where np is the expected number of occurrences (the number of trials multiplied by the probability of the event on any one trial), x is the number of occurrences you are interested in, and e is the mathematical constant, approximately 2.718.
Let's go through this step-by-step.
Let's think about that example of finding 15 red cars in a parking lot with 200 cars (given that the expected frequency of red cars is 10%).
So, plugging in the numbers (np = 200 * .10 = 20, x = 15) would give us

Pr (15) = [e^(-20) * 20^15] / 15! = [(2.061 x 10^-9) * (3.2768 x 10^19)] / (1.3077 x 10^12)

The 10^19 and the 10^-9 collapse down to 10^10,
which leads us to

Pr (15) = (6.754 x 10^10) / (1.3077 x 10^12)

The 10^10 divided by the 10^12 reduces down to 1 / 10^2,
or 1 / 100, and the 6.754 / 1.3077 = 5.165, which leads us to...

Pr (15) = 5.165 / 100 = .05165, or about .052
That's a pretty close approximation to what we would get if we used the
binomial distribution formula.
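Here is a minimal Python sketch (ours) that compares the Poisson approximation to the exact binomial calculation for this example:

    # Sketch: Poisson approximation vs. the exact binomial for 15 red cars out of 200, p = 0.1.
    from math import comb, exp, factorial

    n, p, x = 200, 0.1, 15
    mu = n * p                                        # expected number of red cars = 20

    poisson  = exp(-mu) * mu**x / factorial(x)        # about .052
    binomial = comb(n, x) * p**x * (1 - p)**(n - x)   # about .050

    print(poisson, binomial)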
Poisson probabilities are particularly useful in genetics because you might
want to calculate the probabilities for events which are rare compared to a
large potential sample size--e.g., a particular genotype might be expected to
occur only a few times in the population. Given the size of the population,
it would be almost impossible to calculate out the binomial probability--but
the Poisson is often used, because it is easier to work with (given all the
cancelling out that we saw above).
Recall our earlier point about aleatory probabilities--
probabilities based on a known system.
Much more often, we are modeling probabilities based on observation.
Modeling means that we calculate probabilities for a range of outcomes--
based on some summary statistic and our knowledge of some underlying distribution.
So, for instance, suppose we wanted to build a frequency table for "red cars in
parking lots with 200 cars". We could visit a representative parking lot
(that is, take a sample) and calculate out the likelihood that a car was red.
If, in our sample, we found (as above) that 10% of the 200 cars were red,
and we had reason to believe that the underlying distribution was a Poisson distribution,
we could then calculate out the Poisson distribution.
Based on the example above, we can calculate out the distribution for anywhere
from 0 up to 100 red cars.
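A minimal Python sketch (ours) of that calculation:

    # Sketch: Poisson probabilities for the number of red cars (expected number np = 20)
    # in a lot of 200 cars, for counts from 0 up to 100.
    from math import exp, factorial

    mu = 200 * 0.10                                   # expected number of red cars = 20
    for x in range(101):                              # counts of red cars from 0 up to 100
        print(x, round(exp(-mu) * mu**x / factorial(x), 6))   # probabilities shrink to essentially 0 well before 100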
Notice that this Poisson probability distribution is skewed to the right--which is to say that
once you move into the right-hand three-quarters of the range (more than about 25 red cars),
the probabilities are very, very small. In the case of finding red cars in a parking lot with
200 cars, the probability of finding 50 or more red cars is almost 0. The Poisson
probability formula is appropriate to use in this case.
These are empirical probabilities--quite different from the aleatory
probabilities we discussed earlier, where we had
full information (to the degree possible) about the
fair coin or die.