Appendix A - Basic Probability Concepts

From Qunet
Revision as of 12:59, 3 March 2013 by Ddghunter (talk | contribs)

In this appendix, definitions and some example calculations are presented that will aid in our discussions. This is not meant to be a comprehensive introduction to the topic. It is primarily meant to serve as a means for introducing notation and terminology for the course.

By definition, probability is the chance of a certain event occurring from a set of events that could possibly occur. Let us start with the most primitive example of a probability: flipping a coin. We know the set of possible outcomes is heads or tails. Since only these two events can occur, and since there is an equal chance for each of them, we say that the probability for each occurring is $1/2$, i.e. $P(\text{heads}) = 1/2$ and $P(\text{tails}) = 1/2$, because the probabilities for every possible outcome of an event must sum to 1, i.e. $P(\text{heads}) + P(\text{tails}) = 1$.

In probability, the Boolean operator ''and'' can be somewhat counterintuitive at first. For instance, if someone were to tell you that he/she has 5 apples and just received 3 more, the operation that takes place in your head is addition: he/she has $5 + 3 = 8$ apples. But when working with probabilities, the Boolean ''and'' corresponds to multiplication. For example, say the probability that Bob stays and works through his lunch hour is $1/6$, and the probability that Kathy stays and works through lunch is $5/6$. Now if I were to ask, "What is the probability that Bob and Kathy stay and work through lunch?", you would not want to add the probabilities, because $1/6 + 5/6 = 1$. This would imply that both will certainly work through lunch, which doesn't make sense because, from the knowledge that we have, we cannot guarantee that both will do so. Instead, let us multiply their respective probabilities (assuming the two decisions are independent): $\frac{1}{6} \cdot \frac{5}{6} = \frac{5}{36}$. Since the answer is lower than the probability for each individual, it makes much more sense: intuitively, the more uncertainty in a system (i.e. more probabilities $< 1$), the less certain we are of success.

Now that we have examined the Boolean ''and'', let's take a look at ''or''. For mutually exclusive outcomes, ''or'' corresponds to addition, which follows directly from the condition that all probabilities for the outcomes of an event must add up to 1. Revisiting the example of flipping a coin, the two possible outcomes are that you obtain heads or you obtain tails: $P(\text{heads or tails}) = P(\text{heads}) + P(\text{tails}) = \frac{1}{2} + \frac{1}{2} = 1$.
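The two rules above can be sketched in a few lines of Python (the language choice is ours; the text itself contains no code):

```python
# "and" -> multiply (for independent events);
# "or"  -> add (for mutually exclusive outcomes).

p_bob = 1 / 6    # probability Bob works through lunch
p_kathy = 5 / 6  # probability Kathy works through lunch

# Probability that Bob AND Kathy both work through lunch,
# assuming their decisions are independent:
p_both = p_bob * p_kathy
print(p_both)  # 5/36, about 0.1389

# Probability that a fair coin lands heads OR tails
# (mutually exclusive outcomes that exhaust all possibilities):
p_heads, p_tails = 0.5, 0.5
p_either = p_heads + p_tails
print(p_either)  # 1.0
```

Note that `p_both` is smaller than either individual probability, matching the intuition in the text.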


(This example is a variation of one given in David J. Griffiths' ''Introduction to Quantum Mechanics'' [4].)

Example: Suppose that in some room, there are four people with the following heights:

  1. 1 person is 1.5 meters tall
  2. 1 person is 1.6 meters tall
  3. 2 people are 1.8 meters tall

Let $N$ stand for the total number of people, and let $N(h)$ denote the number of people of height $h$. We might write the number of people with certain heights as $N(1.5) = 1$, $N(1.6) = 1$, $N(1.8) = 2$.

The total number of people is

\[ N = \sum_h N(h), \]

where $h$ runs over all values of the height. It is easily seen that $N = 4$.

Now if I draw a name out of a hat that contains each person's name once, I will get the name of a person who is 1.6 meters tall with probability $1/4$. (We assume that each person has a unique name and that it appears once and only once in the hat.) We write this as

\[ P(1.6) = \frac{N(1.6)}{N} = \frac{1}{4}, \]

and we would generally write, for any value $h$,

\[ P(h) = \frac{N(h)}{N}. \]

Now since we are going to get someone's name when we draw, we must have

\[ \sum_h P(h) = 1, \]

which is easy enough to check.
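The check is indeed easy; here is a small Python sketch of the distribution built from the heights above:

```python
from collections import Counter

# Heights (in meters) of the four people in the room
heights = [1.5, 1.6, 1.8, 1.8]

counts = Counter(heights)   # N(h): number of people with each height
N = sum(counts.values())    # total number of people, N = 4

# P(h) = N(h) / N
P = {h: n / N for h, n in counts.items()}
print(P)                # {1.5: 0.25, 1.6: 0.25, 1.8: 0.5}
print(sum(P.values()))  # 1.0 -- the probabilities sum to one
```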

There are several aspects of this probability distribution that we might like to know. Here are some that are particularly useful:

  1. The most probable value (or mode) for the height is 1.8 meters.
  2. The median is 1.7 meters (two people above and two below).
  3. The average (or mean) is given by

\[ \langle h \rangle = \sum_h h\, P(h) = \frac{(1.5)(1) + (1.6)(1) + (1.8)(2)}{4} = 1.675 \text{ meters}. \]

(A.1)

Note that the mean and the median do not have to be the same. If there is an odd number of values, the median is the middle number in the list; if even, it is the mean of the two middle values. Here the two are close but not equal: the median is 1.7 meters, while the mean is 1.675 meters. The bracket, $\langle \, \rangle$, is the standard notation for finding the average value of a function. This is done by calculating

\[ \langle f(h) \rangle = \sum_h f(h)\, P(h). \]

For the average height this is just

\[ \langle h \rangle = \sum_h h\, P(h). \]

Note: The average value is called the ''expectation value'' in quantum mechanics. This can be misleading because it is neither the most probable value nor ''what to expect.''
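The three quantities above can be checked with Python's standard library (again, the language choice is ours):

```python
from statistics import mean, median, mode

heights = [1.5, 1.6, 1.8, 1.8]

print(mode(heights))              # 1.8 -- the most probable value
print(round(median(heights), 3))  # 1.7 -- two values above, two below
print(round(mean(heights), 3))    # 1.675 -- the average, <h>
```

Note that the mean, 1.675, indeed need not coincide with the median.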

When one would like to discuss the properties of a particular probability distribution, describing it takes some effort. It is not enough to know the average, median, and most probable values; a lot of details of the probability distribution remain unknown to us if these are all we are given. What else would one like to know? Without describing it entirely, one may like to know more about the ''shape'' of the distribution. For example, how spread out is it?

The most important measure of this is the variance, which is the standard deviation squared ($\sigma^2$). The variance is defined as (in terms of our variable $h$)

\[ \sigma^2 = \langle (\Delta h)^2 \rangle \]

(A.2)

where $\Delta h = h - \langle h \rangle$. This can also be written as

\[ \sigma^2 = \langle h^2 \rangle - \langle h \rangle^2 \]

(A.3)
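That the two forms of the variance agree can be verified numerically on the height example (a Python sketch; the equivalence follows by expanding $\langle (h - \langle h \rangle)^2 \rangle$):

```python
heights = [1.5, 1.6, 1.8, 1.8]
N = len(heights)

mean_h = sum(heights) / N  # <h>

# (A.2): variance as the mean squared deviation from the mean
var_a2 = sum((h - mean_h) ** 2 for h in heights) / N

# (A.3): variance as <h^2> - <h>^2
mean_h2 = sum(h ** 2 for h in heights) / N
var_a3 = mean_h2 - mean_h ** 2

print(round(var_a2, 6))  # 0.016875
print(round(var_a3, 6))  # 0.016875 -- same value, as (A.2) = (A.3)
```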

Stirling's Formula

For large $n$, the following approximation is quite useful:

\[ n! \approx n^n e^{-n} \sqrt{2\pi n}. \]
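A quick Python check shows how good the approximation already is for modest $n$ (the ratio of the approximation to the exact factorial approaches 1 as $n$ grows):

```python
import math

def stirling(n):
    """Stirling's approximation: n! ~ n^n e^{-n} sqrt(2 pi n)."""
    return n ** n * math.exp(-n) * math.sqrt(2 * math.pi * n)

for n in (5, 10, 20):
    exact = math.factorial(n)
    approx = stirling(n)
    # ratio climbs toward 1 as n increases
    print(n, round(approx / exact, 4))
```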