Definition 38.9.5 Let $X$ be a random variable. Its mean is defined as $\mu \equiv E(X)$. The variance is defined as $E\left((X-\mu)^2\right)$. The mean is a weighted average. It is what you would expect to see if you took many random samples from this distribution and averaged them. (In fact there is a theorem which says this.) The variance is a description of how spread out the probability density is. If the variance is small, then the random variable will be close to $\mu$ with high probability, and if it is large, then it is not as certain that the random variable is close to $\mu$.
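For instance, if $X$ is the outcome of a fair six-sided die, so that each of the values $1,\ldots,6$ has probability $1/6$, then
\[
\mu = E(X) = \frac{1+2+\cdots+6}{6} = \frac{7}{2}, \qquad
E\left((X-\mu)^2\right) = \frac{1}{6}\sum_{j=1}^{6}\left(j-\tfrac{7}{2}\right)^2 = \frac{35}{12}.
\]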
Now with this definition of mean and variance, why is the normal distribution so important? It is because of the central limit theorem. Suppose $E\left(X_k^2\right) < \infty$ where $X_k$ is a random variable.
Theorem 38.9.6 Let $\{X_k\}_{k=1}^{\infty}$ be random variables satisfying $E\left(X_k^2\right) < \infty$, which are independent and identically distributed with mean $\mu = E(X_k)$ and positive variance $0 < \sigma^2 \equiv E\left((X_k-\mu)^2\right)$. Let
\[
Z_n \equiv \sum_{j=1}^{n} \frac{X_j-\mu}{\sigma\sqrt{n}} = \frac{\sqrt{n}\left(\bar{X}-\mu\right)}{\sigma} \tag{38.4}
\]
where $\bar{X}$ is the average of the $X_k$, $\frac{1}{n}\sum_{k=1}^{n} X_k$. Then for $Z$ a normally distributed random variable having mean $0$ and variance $1$,
\[
\lim_{n\to\infty} P\left(Z_n \in A\right) = P\left(Z \in A\right)
\]
where $A$ is a suitable set.
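To see the theorem numerically, the following is a minimal simulation sketch, not part of the text: it draws many realizations of $Z_n$ from (38.4), taking the $X_k$ to be exponential with rate $1$ (so $\mu = \sigma = 1$; this distribution, the value of $n$, and the number of replications are arbitrary illustrative choices), and compares the empirical probabilities $P(Z_n \le a)$ with the standard normal probabilities $P(Z \le a)$.
\begin{verbatim}
import numpy as np
from math import erf, sqrt

# Illustrative sketch of the central limit theorem in (38.4).
# The X_k are exponential with rate 1, so mu = sigma = 1; the values
# of n and reps are arbitrary choices made only for this demonstration.
rng = np.random.default_rng(0)
n, reps = 1000, 20000
mu, sigma = 1.0, 1.0

samples = rng.exponential(scale=1.0, size=(reps, n))
x_bar = samples.mean(axis=1)              # the average of the X_k
z_n = sqrt(n) * (x_bar - mu) / sigma      # the statistic Z_n in (38.4)

# Compare P(Z_n <= a) with P(Z <= a) for sets A = (-infinity, a],
# using the empirical proportion and the standard normal CDF.
for a in (-2.0, -1.0, 0.0, 1.0, 2.0):
    empirical = np.mean(z_n <= a)
    normal_cdf = 0.5 * (1.0 + erf(a / sqrt(2.0)))
    print(f"a = {a:+.1f}: P(Z_n <= a) ~ {empirical:.4f}, P(Z <= a) = {normal_cdf:.4f}")
\end{verbatim}
Even though the exponential distribution is quite skewed, the empirical probabilities should already be close to the normal ones for moderate $n$, which is the content of the theorem.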
Of course this begs the question: What are $\mu,\sigma$? Much that is done in statistics has to do with determination of these or other parameters. They both give interesting information if they can be estimated.
How does independence relate to moment generating functions?
Proposition 38.9.7 Let $X_k$ be a random vector with values in $\mathbb{R}^{m_k}$. Let $X = \left(\, X_1 \; \cdots \; X_p \,\right)$ and let the moment generating function for $X$ exist,
\[
M(t) \equiv M(t_1,\cdots,t_p) \equiv E\left(e^{t\cdot X}\right) = E\left(\exp\left(\sum_{k=1}^{p} t_k \cdot X_k\right)\right).
\]
Then the $X_k$ are independent if and only if
\[
M(t) = \prod_{k=1}^{p} M\left(0,\cdots,0,t_k,0,\cdots,0\right) \tag{38.5}
\]
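As a quick check of (38.5), not from the text, let $p = 2$ and let $X_1, X_2$ be independent standard normal scalars. Then
\[
M(t_1,t_2) = E\left(e^{t_1 X_1 + t_2 X_2}\right) = e^{t_1^2/2}\, e^{t_2^2/2} = M(t_1,0)\, M(0,t_2),
\]
so the moment generating function factors exactly as the proposition asserts.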
Proof: First suppose the $X_k$ are independent. Then the density function for $X$ is of the form
\[
f(x) = f_1(x_1)\, f_2(x_2) \cdots f_p(x_p)
\]