Kenneth Kuttler

840 CHAPTER 39. STATISTICAL TESTS

$p_{i\cdot}$ the marginal probability $\sum_j p_{ij}$. Thus $p_{\cdot j} = P(Z \in A_j) = P(A_j)$ and $p_{i\cdot} = P(Z \in B_i) = P(B_i)$.

The problem of interest is whether the events $A_j$ and $B_i$ are independent. Is it the case that

$$P(A_j \cap B_i) = P(A_j)\,P(B_i)?$$

In other words, is $p_{ij} = p_{\cdot j}\,p_{i\cdot}$? If you knew each $p_{ij}$, this would be no problem, but the $p_{ij}$ are unknown.
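To make the question concrete, here is a minimal sketch of the check one could run if the joint probabilities were known. The function name `is_independent` and both example tables are hypothetical, chosen only for illustration; rows are indexed by $i$ and columns by $j$.

```python
import math

def is_independent(p, tol=1e-12):
    """Return True if p[i][j] equals (row marginal i) * (column marginal j) in every cell."""
    row = [sum(r) for r in p]            # p_{i.}: sum over j
    col = [sum(c) for c in zip(*p)]      # p_{.j}: sum over i
    return all(
        math.isclose(p[i][j], row[i] * col[j], abs_tol=tol)
        for i in range(len(p))
        for j in range(len(p[0]))
    )

# Hypothetical 2 x 3 joint tables (each sums to 1).
independent = [[0.4 * q for q in (0.2, 0.3, 0.5)],
               [0.6 * q for q in (0.2, 0.3, 0.5)]]
dependent = [[0.10, 0.10, 0.20],
             [0.30, 0.20, 0.10]]

print(is_independent(independent))
print(is_independent(dependent))
```

In practice the $p_{ij}$ are not available, which is why a statistical test based on a sample is needed.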

The null hypothesis will be that $p_{ij} = p_{\cdot j}\,p_{i\cdot}$. Then the $p_{\cdot j}$, $p_{i\cdot}$ are to be considered as parameters. The first item is to find the maximum likelihood estimates for these based on a random sample of size $n$, recorded as counts $X_{ij}$, $i = 1,2$ and $j = 1,2,3$. Here the sample has been indexed according to which “cell” $B_i \cap A_j$ contains the sample point, so $X_{ij}$ is the number of sample points in that cell. Under the null hypothesis, the likelihood to maximize is

$$\prod_{i=1}^{2}\prod_{j=1}^{3} p_{\cdot j}^{X_{ij}}\, p_{i\cdot}^{X_{ij}}$$

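subject to the constraints $\sum_i p_{i\cdot} = 1$ and $\sum_j p_{\cdot j} = 1$. As usual, it works best to maximize the $\ln$ of the above. Thus maximize

$$\sum_{i=1}^{2}\sum_{j=1}^{3}\left(X_{ij}\ln(p_{\cdot j}) + X_{ij}\ln(p_{i\cdot})\right),\qquad \sum_i p_{i\cdot} = 1,\quad \sum_j p_{\cdot j} = 1.$$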
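As a quick numerical illustration, the sketch below evaluates both the product form and its logarithm for one choice of marginals and confirms that they agree. The counts, the trial marginals, and the function names are all hypothetical.

```python
import math

def likelihood(counts, p_row, p_col):
    """Product over cells of p_col[j]**X_ij * p_row[i]**X_ij."""
    prod = 1.0
    for i, row in enumerate(counts):
        for j, x in enumerate(row):
            prod *= p_col[j] ** x * p_row[i] ** x
    return prod

def log_likelihood(counts, p_row, p_col):
    """Sum over cells of X_ij*ln(p_col[j]) + X_ij*ln(p_row[i])."""
    return sum(
        x * math.log(p_col[j]) + x * math.log(p_row[i])
        for i, row in enumerate(counts)
        for j, x in enumerate(row)
    )

# Hypothetical small 2 x 3 table of counts and trial marginals.
counts = [[1, 2, 3], [2, 2, 2]]
p_row = [0.5, 0.5]
p_col = [0.2, 0.3, 0.5]

direct = math.log(likelihood(counts, p_row, p_col))
via_sum = log_likelihood(counts, p_row, p_col)
print(direct, via_sum)
```

Because $\ln$ is increasing, both forms are maximized at the same parameter values; the sum is simply easier to differentiate.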

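First consider the $p_{i\cdot}$. The terms involving $p_{\cdot j}$ do not depend on the $p_{i\cdot}$, so this amounts to maximizing

$$\sum_{i=1}^{2} S_i \ln(p_{i\cdot}),\qquad \sum_i p_{i\cdot} = 1$$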

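where $S_i \equiv \sum_j X_{ij}$. Using the method of Lagrange multipliers, we need

$$\begin{pmatrix} S_1/p_{1\cdot} \\ S_2/p_{2\cdot} \end{pmatrix} = \lambda \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$

thus $\lambda p_{1\cdot} = S_1$, $\lambda p_{2\cdot} = S_2$. Thus $\frac{S_2}{\lambda} + \frac{S_1}{\lambda} = 1$ and so $\frac{\sum_i \sum_j X_{ij}}{\lambda} = 1$ and so $\lambda = n$. Then

$$\hat{p}_{i\cdot} = \frac{S_i}{n} = \frac{\sum_j X_{ij}}{n}$$

where this is the maximum likelihood estimate for $p_{i\cdot}$. Similar reasoning shows that

$$\hat{p}_{\cdot j} = \frac{\sum_i X_{ij}}{n}.$$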
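The estimates are just row and column totals divided by the sample size. A minimal sketch, assuming the data arrive as a list of rows of counts (the name `marginal_mles` and the example table are hypothetical):

```python
def marginal_mles(counts):
    """Return (p_row, p_col): row totals / n and column totals / n."""
    n = sum(sum(row) for row in counts)              # n = sum of all X_ij
    p_row = [sum(row) / n for row in counts]         # estimate of p_{i.}
    p_col = [sum(col) / n for col in zip(*counts)]   # estimate of p_{.j}
    return p_row, p_col

# Hypothetical 2 x 3 table of counts, n = 120.
counts = [[10, 20, 30],
          [20, 20, 20]]
p_row, p_col = marginal_mles(counts)
print(p_row)
print(p_col)
```

Each list of estimates sums to 1, as the constraints require.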

Now form

$$D \equiv \sum_{i,j} \frac{\left(X_{ij} - n\hat{p}_{i\cdot}\hat{p}_{\cdot j}\right)^2}{n\hat{p}_{i\cdot}\hat{p}_{\cdot j}}$$

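By what was explained above, this is $\mathcal{X}^2((2\times 3 - 1) - 3)$, that is, $\mathcal{X}^2(2)$. The reason there is a 3 there rather than a 5 is that there are only 3 unknown parameters: the five marginal probabilities $p_{1\cdot}, p_{2\cdot}, p_{\cdot 1}, p_{\cdot 2}, p_{\cdot 3}$ satisfy the two constraints $\sum_i p_{i\cdot} = 1$, $\sum_j p_{\cdot j} = 1$. In general, if the table is $r \times s$, the above expression would be

$$\mathcal{X}^2((rs - 1) - (r + s - 2)) = \mathcal{X}^2((r-1)(s-1)).$$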
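The statistic and its degrees of freedom can be computed directly from a table of counts. The following is a sketch under stated assumptions: the function name and the example table are hypothetical, and every row and column total is assumed positive so that no expected count is zero.

```python
import math

def independence_statistic(counts):
    """Return (D, dof) for an r x s table of counts.

    D   = sum over cells of (X_ij - n*p_i*p_j)**2 / (n*p_i*p_j)
    dof = (r*s - 1) - (r + s - 2)
    Assumes every row total and column total is positive.
    """
    r, s = len(counts), len(counts[0])
    n = sum(sum(row) for row in counts)
    p_row = [sum(row) / n for row in counts]
    p_col = [sum(col) / n for col in zip(*counts)]
    d = 0.0
    for i in range(r):
        for j in range(s):
            expected = n * p_row[i] * p_col[j]   # estimated count under independence
            d += (counts[i][j] - expected) ** 2 / expected
    dof = (r * s - 1) - (r + s - 2)
    return d, dof

# Hypothetical 2 x 3 table of counts.
counts = [[10, 20, 30],
          [20, 20, 20]]
d, dof = independence_statistic(counts)

# With exactly 2 degrees of freedom the chi-squared upper tail is exp(-x/2);
# this shortcut does not apply to other degrees of freedom.
p_value = math.exp(-d / 2) if dof == 2 else None
print(d, dof, p_value)
```

A large value of $D$, equivalently a small tail probability, counts as evidence against independence. For tables of other sizes the tail probability would come from a chi-squared table or a statistics library.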

This justifies the following proposition.