840 CHAPTER 39. STATISTICAL TESTS
$p_{i\cdot}$ the marginal probability $\sum_j p_{ij}$. Thus $p_{\cdot j} = P(Z \in A_j) = P(A_j)$ and $p_{i\cdot} = P(Z \in B_i) = P(B_i)$.
The problem of interest is whether the events $A_j$ and $B_i$ are independent. Is it the case that
$$P(B_i \cap A_j) = P(B_i)P(A_j)?$$
In other words, is $p_{ij} = p_{\cdot j}\, p_{i\cdot}$? If you knew each $p_{ij}$ this would be no problem, but you have no idea about the $p_{ij}$.
The null hypothesis will be that $p_{ij} = p_{\cdot j}\, p_{i\cdot}$. Then the $p_{\cdot j}, p_{i\cdot}$ are to be considered as parameters. The first item is to find the maximum likelihood estimates for these based on a random sample of size $n$, recorded as counts $X_{ij}$, $i = 1,2$ and $j = 1,2,3$. Here the sample has been indexed according to which “cell” $B_i \cap A_j$ contains the sample point, so $X_{ij}$ is the number of sample points in that cell and $\sum_i \sum_j X_{ij} = n$. Then, under the null hypothesis, the task is to maximize the likelihood
$$\prod_{i=1}^{2} \prod_{j=1}^{3} p_{\cdot j}^{X_{ij}}\, p_{i\cdot}^{X_{ij}}$$
subject to the constraint that $\sum_i p_{i\cdot} = 1$ (and likewise $\sum_j p_{\cdot j} = 1$). As usual, it works best to maximize the ln of the above. Thus maximize
$$\sum_{i=1}^{2} \sum_{j=1}^{3} \left( X_{ij} \ln(p_{\cdot j}) + X_{ij} \ln(p_{i\cdot}) \right), \qquad \sum_i p_{i\cdot} = 1$$
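The objective above is easy to evaluate numerically for any feasible choice of the parameters. The following is a minimal sketch, not part of the original development: the function name `log_likelihood` and the table of counts are hypothetical and chosen only for illustration. It compares a uniform choice of marginals with one that matches the observed row and column proportions.

```python
import math

def log_likelihood(counts, p_row, p_col):
    """Sum over all cells of X_ij * (ln p_col[j] + ln p_row[i])."""
    return sum(
        x * (math.log(p_col[j]) + math.log(p_row[i]))
        for i, row in enumerate(counts)
        for j, x in enumerate(row)
    )

# Hypothetical 2x3 table of counts X_ij (illustrative numbers, n = 100).
counts = [[10, 20, 30],
          [20, 10, 10]]

# Two feasible parameter choices; each set of marginals sums to 1.
uniform = log_likelihood(counts, [0.5, 0.5], [1 / 3, 1 / 3, 1 / 3])
matched = log_likelihood(counts, [0.6, 0.4], [0.3, 0.3, 0.4])

print(uniform, matched)
```

For this table the row totals are 60 and 40 and the column totals are 30, 30, and 40, so the second choice uses the observed proportions, and it gives the larger value of the objective.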
First consider the $p_{i\cdot}$. This amounts to maximizing
$$\sum_{i=1}^{2} S_i \ln(p_{i\cdot}), \qquad \sum_i p_{i\cdot} = 1$$
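where $S_i \equiv \sum_j X_{ij}$. Using the method of Lagrange multipliers, we need
$$\begin{pmatrix} S_1 / p_{1\cdot} \\ S_2 / p_{2\cdot} \end{pmatrix} = \lambda \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$
thus $\lambda p_{1\cdot} = S_1$, $\lambda p_{2\cdot} = S_2$. Thus $\frac{S_2}{\lambda} + \frac{S_1}{\lambda} = 1$ and so $\frac{\sum_i \sum_j X_{ij}}{\lambda} = 1$ and so $\lambda = n$. Then
$$\hat{p}_{i\cdot} = \frac{S_i}{n} = \frac{\sum_j X_{ij}}{n}$$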
where this is the maximum likelihood estimate for $p_{i\cdot}$. Similar reasoning shows that
$$\hat{p}_{\cdot j} = \frac{\sum_i X_{ij}}{n}.$$
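The similar reasoning can be spelled out as follows. This is only a sketch of the same Lagrange multiplier argument applied to the columns; the symbols $T_j$ (column totals) and $\mu$ (the multiplier) are introduced here for illustration and are not notation from the text. One maximizes
$$\sum_{j=1}^{3} T_j \ln(p_{\cdot j}), \qquad \sum_j p_{\cdot j} = 1, \qquad T_j \equiv \sum_i X_{ij}.$$
Setting the gradient equal to $\mu$ times the gradient of the constraint gives
$$\frac{T_j}{p_{\cdot j}} = \mu, \quad j = 1,2,3, \qquad \text{so} \qquad p_{\cdot j} = \frac{T_j}{\mu}.$$
Summing over $j$ and using the constraint,
$$1 = \sum_j p_{\cdot j} = \frac{\sum_j T_j}{\mu} = \frac{\sum_j \sum_i X_{ij}}{\mu} = \frac{n}{\mu},$$
so $\mu = n$ and $\hat{p}_{\cdot j} = T_j / n$.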
Now form
$$D \equiv \sum_{i,j} \frac{\left( X_{ij} - n \hat{p}_{i\cdot} \hat{p}_{\cdot j} \right)^2}{n \hat{p}_{i\cdot} \hat{p}_{\cdot j}}$$
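By what was explained above, $D$ is approximately distributed as $\mathcal{X}^2((2 \times 3 - 1) - 3)$ for large $n$. The reason there is a 3 there rather than a 5 is that there are only 3 unknown parameters: the five quantities $p_{1\cdot}, p_{2\cdot}, p_{\cdot 1}, p_{\cdot 2}, p_{\cdot 3}$ are tied together by the two constraints $\sum_i p_{i\cdot} = 1$ and $\sum_j p_{\cdot j} = 1$. In general, if the table is $r \times s$, the above expression would be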
$$\mathcal{X}^2((rs - 1) - (r + s - 2))$$
Note that $(rs - 1) - (r + s - 2) = (r - 1)(s - 1)$.
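The statistic $D$ and its degrees of freedom can be computed directly from a table of counts. The following is a minimal sketch under stated assumptions: the function name `independence_statistic` and the 2×3 table are hypothetical, chosen only for illustration, and every row and column total is assumed to be positive so that no expected count is zero.

```python
import math

def independence_statistic(counts):
    """Return (D, degrees of freedom) for an r x s table of counts X_ij."""
    n = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]        # S_i = sum_j X_ij
    col_totals = [sum(col) for col in zip(*counts)]  # sum_i X_ij

    d = 0.0
    for i, row in enumerate(counts):
        for j, x in enumerate(row):
            # n * p_hat_i. * p_hat_.j with p_hat_i. = S_i / n, p_hat_.j = col total / n
            expected = row_totals[i] * col_totals[j] / n
            d += (x - expected) ** 2 / expected

    r, s = len(counts), len(counts[0])
    dof = (r * s - 1) - (r + s - 2)  # equals (r - 1) * (s - 1)
    return d, dof

# Hypothetical 2x3 table of counts (illustrative numbers, n = 100).
counts = [[10, 20, 30],
          [20, 10, 10]]

d, dof = independence_statistic(counts)

# With exactly 2 degrees of freedom the chi-squared upper tail probability has
# the closed form exp(-d / 2); other degrees of freedom need a statistics library.
p_value = math.exp(-d / 2) if dof == 2 else None

print(d, dof, p_value)
```

For a 2×3 table the degrees of freedom come out to 2, matching $(2 \times 3 - 1) - 3$. A small tail probability would count as evidence against the null hypothesis of independence.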
This justifies the following proposition.