28.1 Discrete random variables

28.1.1 The linearity of expectation

Example 28.1.1

Let $X$ be the sum of three fair dice, and let $A$ be the event "$X$ is even". Then $\operatorname{E}[X|A]=\operatorname{E}[X]$.

This follows from the linearity of expectation. Let $X_{1},X_{2},X_{3}$ be the respective dice rolls; then we know that

\begin{align}
\operatorname{E}[X|A] &= \operatorname{E}[X_{1}+X_{2}+X_{3}|A] \tag{28.1} \\
&= \operatorname{E}[X_{1}|A]+\operatorname{E}[X_{2}|A]+\operatorname{E}[X_{3}|A] \tag{28.2}
\end{align}

We have three cases here, but they are all symmetrical, so we can just consider

\begin{align}
\operatorname{E}[X_{3}|A] &= \sum_{1\leq i\leq 6}i\times\Pr(X_{3}=i|X\text{ is even}) \tag{28.3}
\end{align}

Then of course the question is: what is $\Pr(X_{3}=i|X\text{ is even})$? Here we can apply the definition of conditional probability

\begin{align}
\Pr(X_{3}=i|X\text{ is even}) &= \frac{\Pr(X_{3}=i\cap X\text{ is even})}{\Pr(X\text{ is even})} \tag{28.4}
\end{align}

and the easy thing to compute here is the denominator, the probability that $X$ is even, which is $\frac{1}{2}$. The slightly harder thing to compute is the numerator; clearly $X_{3}=i$ with probability $\frac{1}{6}$.
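The claim in this example can be checked by brute force. The sketch below is not part of the argument above; it simply enumerates all 216 outcomes, assuming the three dice are fair, six-sided, and independent, and uses exact fractions. The variable names are illustrative only. It also suggests how the remaining step goes: given $X_{3}=i$, the sum is even exactly when $X_{1}+X_{2}$ has the same parity as $i$, which happens half the time.

```python
from fractions import Fraction
from itertools import product

# All 216 equally likely outcomes (x1, x2, x3) of three fair six-sided dice.
# Assumption: the dice are independent.
outcomes = list(product(range(1, 7), repeat=3))

# The event A: the sum of the three dice is even.
even = [o for o in outcomes if sum(o) % 2 == 0]

# Pr(A), computed by counting.
pr_A = Fraction(len(even), len(outcomes))

# Pr(X3 = i | A) for each face i, by counting outcomes inside A.
pr_x3_given_A = {
    i: Fraction(sum(1 for o in even if o[2] == i), len(even))
    for i in range(1, 7)
}

# E[X | A] and the unconditional E[X], both as exact fractions.
e_x_given_A = Fraction(sum(sum(o) for o in even), len(even))
e_x = Fraction(sum(sum(o) for o in outcomes), len(outcomes))

print(pr_A)              # 1/2
print(pr_x3_given_A)     # every face has the same conditional probability
print(e_x_given_A, e_x)  # 21/2 21/2
```

Every face of the third die keeps probability $\frac{1}{6}$ after conditioning on $A$, so $\operatorname{E}[X_{3}|A]=\frac{7}{2}$ and, by symmetry and equation (28.2), $\operatorname{E}[X|A]=\frac{21}{2}=\operatorname{E}[X]$.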

28.1.2 Variance of a discrete random variable

The "variance" of a discrete random variable\footnote{Note: you're not imagining things, I still need to add the section I have written defining these.} is a measure of "spread" (how far apart the values in a distribution are). It is the expected value of the squared distance of the observed values (in the outcome space) from the mean (the expected value of the distribution). That's a mouthful to say, so it can be easier to write this as a formula.

\begin{align}
\operatorname{Var}[X] &= \operatorname{E}\left[\left(X-\operatorname{E}[X]\right)^{2}\right] \tag{28.5}
\end{align}
\footnote{If it's not clear why $X-\operatorname{E}[X]$ gives the signed distance between $X$ and $\operatorname{E}[X]$, take a look at the "vectors" chapter.}
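As a concrete illustration of the definition, here is a minimal sketch that computes the variance of a single fair six-sided die directly from $\operatorname{E}\left[\left(X-\operatorname{E}[X]\right)^{2}\right]$. Representing the distribution as a dictionary from values to probabilities, and the helper names `expectation` and `variance`, are choices made for this example only.

```python
from fractions import Fraction


def expectation(pmf, f=lambda x: x):
    """E[f(X)] for a discrete X given as a {value: probability} dict."""
    return sum(p * f(x) for x, p in pmf.items())


def variance(pmf):
    """Var[X] = E[(X - E[X])^2], computed directly from the definition."""
    mean = expectation(pmf)
    return expectation(pmf, lambda x: (x - mean) ** 2)


# A fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expectation(die))  # 7/2
print(variance(die))     # 35/12
```

Using `Fraction` rather than floats keeps the arithmetic exact, which makes the result easy to compare with a calculation done by hand.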

There is an equivalent way in which the variance can be expressed which is a bit easier to use when trying to calculate the variance of a discrete random variable by hand:

\begin{align}
\operatorname{Var}[X] &= \operatorname{E}\left[\left(X-\operatorname{E}[X]\right)^{2}\right] \tag{28.6} \\
&= \operatorname{E}\left[X^{2}-2X\operatorname{E}[X]+\operatorname{E}[X]^{2}\right] && \text{(step 1)} \tag{28.7} \\
&= \operatorname{E}[X^{2}]-2\operatorname{E}[X]\operatorname{E}[X]+\operatorname{E}[X]^{2} && \text{(step 2)} \tag{28.8} \\
&= \operatorname{E}\left[X^{2}\right]-\left(\operatorname{E}[X]\right)^{2} \tag{28.9}
\end{align}

When we went from step 1 to step 2, we took advantage of the fact that $\operatorname{E}[X]$ is constant; in effect, we grouped the middle term as $\operatorname{E}\left[\left(2\operatorname{E}[X]\right)X\right]$,\footnote{Bear in mind that $\operatorname{E}[X]$ is a constant.} and then used the linearity of expectation\footnote{If this means nothing to you, please be aware that I have yet to write this section.} to rewrite it as $\left(2\operatorname{E}[X]\right)\operatorname{E}[X]$. For the same reason, the expectation of the constant $\operatorname{E}[X]^{2}$ is just $\operatorname{E}[X]^{2}$.
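A quick numerical check can make the identity more believable. The sketch below compares the definition of the variance with the form $\operatorname{E}[X^{2}]-(\operatorname{E}[X])^{2}$ on a hypothetical loaded four-sided die; the probabilities are made up purely for illustration and do not come from the text.

```python
from fractions import Fraction

# Hypothetical loaded four-sided die, chosen only for illustration.
pmf = {
    1: Fraction(1, 2),
    2: Fraction(1, 4),
    3: Fraction(1, 8),
    4: Fraction(1, 8),
}

mean = sum(p * x for x, p in pmf.items())          # E[X]
mean_sq = sum(p * x ** 2 for x, p in pmf.items())  # E[X^2]

# Definition: E[(X - E[X])^2]
var_definition = sum(p * (x - mean) ** 2 for x, p in pmf.items())

# Equivalent form: E[X^2] - (E[X])^2
var_shortcut = mean_sq - mean ** 2

print(mean, mean_sq)                 # 15/8 37/8
print(var_definition, var_shortcut)  # 71/64 71/64
```

The second form needs only two sums, $\operatorname{E}[X]$ and $\operatorname{E}[X^{2}]$, which is why it tends to be the more convenient one for hand calculation.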