
Observation 38.7.12 In every case, if a, b are numbers, then if everything makes sense, and X, X̂ are two random variables having the same probability distribution, meaning that P(X ∈ F) = P(X̂ ∈ F) for all F an interval, then

E(aX + bX̂) = aE(X) + bE(X̂)
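As a quick sanity check of this linearity, here is a minimal Monte Carlo sketch (mine, not the text's). It assumes X and X̂ are both exponential with rate 1, drawn independently with Python's random.expovariate, so they share a distribution and E(X) = E(X̂) = 1.

# Monte Carlo sketch of Observation 38.7.12 (illustration only).
# Assumption: X and X̂ are independent exponential(1) draws, so E(X) = E(X̂) = 1.
import random

random.seed(0)
a, b, n = 2.0, -3.0, 200_000

x = [random.expovariate(1.0) for _ in range(n)]      # samples of X
x_hat = [random.expovariate(1.0) for _ in range(n)]  # samples of X̂

lhs = sum(a * u + b * v for u, v in zip(x, x_hat)) / n  # estimates E(aX + bX̂)
rhs = a * sum(x) / n + b * sum(x_hat) / n               # estimates aE(X) + bE(X̂)
print(lhs, rhs)  # both should be near a + b = -1

The observation itself does not require independence; independent draws are used here only because they are easy to generate.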

Suppose you had many random variables X_i, each having the same distribution, with the collection of random variables independent, as explained below. If you averaged the X_i over all i, what you would get is probably close to E(X). This is why taking the expectation is of interest. I will give a brief explanation why this is so.
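To see this behavior numerically before the explanation, here is a short sketch with an assumed distribution: X is taken uniform on [0, 2], so E(X) = 1, and the sample mean of n independent draws is printed for increasing n.

# Sample means of i.i.d. draws settling near E(X) (illustration only).
# Assumption: X is uniform on [0, 2], so E(X) = 1.
import random

random.seed(1)
for n in (10, 1_000, 100_000):
    xs = [random.uniform(0.0, 2.0) for _ in range(n)]
    print(n, sum(xs) / n)  # drifts toward 1 as n grows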

Where do independent random variables come from? In practice, you have independent observations from an underlying probability space, meaning that it makes sense to ask for the probability that a random variable is in suitable subsets of ℝ or ℝⁿ. These observations are independent in the sense that the outcome of an observation does not depend on the outcome of the others. Then the numerical values are called independent random variables. A more precise description is given below.

First, here is an important formula. I will be considering only the case of a continuous distribution in explaining this inequality, but it all works in general. The inequality is called the Chebychev inequality.

Proposition 38.7.13 Let X be a random variable and let g be a function for which E(|g(X)|) makes sense. Then for ε > 0,

P(|g(X)| ≥ ε) ≤ (1/ε) E(|g(X)|)

Proof: By definition of what is meant by a distribution function, if E ≡ |g|⁻¹([ε,∞)) ≡ {x : |g(x)| ≥ ε}, then on this set |g(x)|/ε ≥ 1, and off this set |g(x)| f(x) ≥ 0. Thus

P(|g(X)| ≥ ε) = ∫_E f(x) dx ≤ ∫_E (|g(x)|/ε) f(x) dx ≤ (1/ε) ∫_ℝ |g(x)| f(x) dx = (1/ε) E(|g(X)|). ■
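A numerical check of the inequality, under assumed choices of the pieces (X standard normal via random.gauss and g(x) = x²), looks like this; the estimated probability should not exceed the estimated bound, up to sampling error.

# Numerical check of Proposition 38.7.13 (illustration only).
# Assumptions: X ~ N(0,1) and g(x) = x^2, so |g(X)| = X^2 and E(|g(X)|) = 1.
import random

random.seed(2)
n, eps = 200_000, 1.5
g_vals = [random.gauss(0.0, 1.0) ** 2 for _ in range(n)]

prob = sum(1 for v in g_vals if v >= eps) / n  # estimates P(|g(X)| ≥ ε)
bound = (sum(g_vals) / n) / eps                # estimates E(|g(X)|)/ε
print(prob, bound)  # roughly 0.22 versus 0.67 for these choices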

Now suppose you have random variables X_i, each having distribution function f(x), and suppose µ = E(X) and σ² = E((X − µ)²) both exist. Suppose the X_i, i = 1, ..., are independent as in the next definition.

Definition 38.7.14 Let there be random variables X_1, ..., having well defined mean µ ≡ E(X_k) and variance σ² = E((X_k − µ)²). Then to say these are independent implies that E((X_i − µ)(X_j − µ)) = E(X_i − µ)E(X_j − µ) = 0 whenever i ≠ j. The more complete meaning of independence is as follows: for each m,

P(X_i ∈ E_i for each i ≤ m) = ∏_{i=1}^{m} P(X_i ∈ E_i).

Here the E_i can be considered intervals. The idea is that what happens in terms of probability involving each X_j for j ≠ i does not affect probability involving X_i. Also, if you have X_1, ..., independent, then if g is some continuous function, then g(X_1), ... will also be independent. This is because g(X_i) ∈ E_i if and only if X_i ∈ g⁻¹(E_i). Then there is a significant observation.
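Before moving on, both facets of Definition 38.7.14 can be checked numerically for two independent draws; the sketch below assumes X_i and X_j are uniform on [0, 1] (so µ = 0.5) and uses the intervals E_1 = [0, 0.3] and E_2 = [0.4, 1] purely as examples.

# Checking the product rule and the vanishing cross term (illustration only).
# Assumptions: X_i, X_j independent uniform on [0, 1], so µ = 0.5.
import random

random.seed(3)
n, mu = 200_000, 0.5
xi = [random.random() for _ in range(n)]
xj = [random.random() for _ in range(n)]

# Product rule with E_1 = [0, 0.3], E_2 = [0.4, 1]:
joint = sum(1 for u, v in zip(xi, xj) if u <= 0.3 and v >= 0.4) / n
prod = (sum(1 for u in xi if u <= 0.3) / n) * (sum(1 for v in xj if v >= 0.4) / n)
print(joint, prod)  # both near 0.3 * 0.6 = 0.18

# E((X_i − µ)(X_j − µ)) should be near 0 for independent X_i, X_j:
cross = sum((u - mu) * (v - mu) for u, v in zip(xi, xj)) / n
print(cross)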
