39.7. CONTINGENCY TABLES
Proposition 39.7.1 Let there be an $r \times s$ contingency table such that the random variable is in exactly one of $B_i \cap A_j$ for $i = 1, \cdots, r$, $j = 1, \cdots, s$. If $P(B_i \cap A_j) = P(B_i)P(A_j)$ for all $i, j$, then if a sample is taken of size $n$ and $X_{ij}$ is the observed number in $B_i \cap A_j$, then when $n$ is large,
$$D \equiv \sum_{i,j} \frac{\left(X_{ij} - n\hat{p}_{i\cdot}\,\hat{p}_{\cdot j}\right)^2}{n\hat{p}_{i\cdot}\,\hat{p}_{\cdot j}}$$
is distributed as $\mathcal{X}^2\left((rs-1)-(r+s-2)\right) = \mathcal{X}^2(rs-s-r+1)$.
Assuming the null hypothesis that the events $B_i$ and $A_j$ are independent, one can now test this hypothesis by using a graph or table for $\mathcal{X}^2(rs-s-r+1)$.
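The statistic in the proposition can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square_statistic` is hypothetical, and it assumes that $\hat{p}_{i\cdot}$ and $\hat{p}_{\cdot j}$ are the observed row and column totals divided by $n$, so that $n\hat{p}_{i\cdot}\hat{p}_{\cdot j}$ is (row total)(column total)/$n$.

```python
def chi_square_statistic(table):
    """Return (D, degrees of freedom) for an r x s table of observed counts.

    Assumption: p-hat_{i.} and p-hat_{.j} are the row and column totals
    divided by n, so the expected count is row_total * col_total / n.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    d = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            d += (observed - expected) ** 2 / expected

    r, s = len(table), len(table[0])
    return d, r * s - s - r + 1


# A made-up 2 x 2 table whose rows are exactly proportional,
# so every observed count equals its expected count.
print(chi_square_statistic([[10, 20], [30, 60]]))  # (0.0, 1)
```

A table whose rows are exactly proportional gives $D = 0$, the smallest possible value. Large values of $D$ are the ones that count against independence.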
Example 39.7.2 You have a $3 \times 2$ contingency table, three rows and two columns. Also the number in a random sample is 900. The numbers of observations found in the various positions are illustrated in the following.
$$\begin{array}{|c|c|}
\hline
120 & 300 \\ \hline
180 & 80 \\ \hline
100 & 120 \\ \hline
\end{array}$$
Determine whether the events in the underlying contingency table could be independent. Take independence as the null hypothesis and reject it if, under that hypothesis, the probability of a value of $D$ at least as large as the one observed is no more than .01. Otherwise conclude that the events might be independent.
In the above, $n = 900$. Now let's find the $\hat{p}$.
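$$\hat{p}_{1\cdot} = \frac{420}{900},\quad \hat{p}_{2\cdot} = \frac{260}{900},\quad \hat{p}_{3\cdot} = \frac{220}{900}$$
$$\hat{p}_{\cdot 1} = \frac{400}{900},\quad \hat{p}_{\cdot 2} = \frac{500}{900}$$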
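These estimates are just the row and column totals of the table divided by $n$. A small sketch using exact fractions makes this explicit. The variable names are illustrative, and `Fraction` reduces each value to lowest terms, so $420/900$ appears as $7/15$.

```python
from fractions import Fraction

# Observed counts from the example: three rows, two columns.
observed = [[120, 300], [180, 80], [100, 120]]

n = sum(sum(row) for row in observed)
row_p_hat = [Fraction(sum(row), n) for row in observed]        # p-hat_{i.}
col_p_hat = [Fraction(sum(col), n) for col in zip(*observed)]  # p-hat_{.j}

print(n)                            # 900
print([str(p) for p in row_p_hat])  # ['7/15', '13/45', '11/45']
print([str(p) for p in col_p_hat])  # ['4/9', '5/9']
```

Each list sums to 1, as it must for a set of estimated probabilities.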
Now assemble D.
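$$\begin{aligned}
D ={}& \frac{\left(120-900\left(\frac{420}{900}\right)\left(\frac{400}{900}\right)\right)^2}{900\left(\frac{420}{900}\right)\left(\frac{400}{900}\right)} + \frac{\left(300-900\left(\frac{420}{900}\right)\left(\frac{500}{900}\right)\right)^2}{900\left(\frac{420}{900}\right)\left(\frac{500}{900}\right)} \\
&+ \frac{\left(180-900\left(\frac{260}{900}\right)\left(\frac{400}{900}\right)\right)^2}{900\left(\frac{260}{900}\right)\left(\frac{400}{900}\right)} + \frac{\left(80-900\left(\frac{260}{900}\right)\left(\frac{500}{900}\right)\right)^2}{900\left(\frac{260}{900}\right)\left(\frac{500}{900}\right)} \\
&+ \frac{\left(100-900\left(\frac{220}{900}\right)\left(\frac{400}{900}\right)\right)^2}{900\left(\frac{220}{900}\right)\left(\frac{400}{900}\right)} + \frac{\left(120-900\left(\frac{220}{900}\right)\left(\frac{500}{900}\right)\right)^2}{900\left(\frac{220}{900}\right)\left(\frac{500}{900}\right)}
\end{aligned}$$
Now compute this.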
D = 107.64
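The arithmetic can be checked term by term. The following sketch lists the six contributions to $D$ in the order top row first, left to right. The names are illustrative only.

```python
# Observed counts from the example.
observed = [[120, 300], [180, 80], [100, 120]]

n = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

contributions = []
for i, row in enumerate(observed):
    for j, x in enumerate(row):
        # n * p-hat_{i.} * p-hat_{.j} equals row_total * col_total / n
        expected = row_totals[i] * col_totals[j] / n
        contributions.append((x - expected) ** 2 / expected)

d = sum(contributions)
print([round(c, 2) for c in contributions])  # [23.81, 19.05, 35.94, 28.75, 0.05, 0.04]
print(round(d, 2))                           # 107.64
```

Nearly all of $D$ comes from the first two rows. The third row is close to what independence would predict.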
This is far too big to accept the null hypothesis. The events are not independent. The statistic is distributed as $\mathcal{X}^2(2)$, and a table gives probability greater than .99 that the variable is less than 10. Yet $D$ is larger than 100.
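With two degrees of freedom no table is needed, because the density is $\frac{1}{2}e^{-x/2}$ and so $P(X > x) = e^{-x/2}$. The sketch below uses this closed form, which holds only for two degrees of freedom. The function name is hypothetical, and the value of $D$ is copied from the computation above.

```python
import math


def chi2_two_df_upper_tail(x):
    """P(X > x) when X is chi-square with exactly 2 degrees of freedom."""
    return math.exp(-x / 2.0)


d = 107.64    # value of D found in the example
alpha = 0.01  # rejection level stated in the example

critical_value = -2.0 * math.log(alpha)  # solves exp(-x/2) = alpha
p_value = chi2_two_df_upper_tail(d)
reject = p_value <= alpha

print(round(critical_value, 2))                  # 9.21
print(round(chi2_two_df_upper_tail(10.0), 4))    # 0.0067
print(reject)                                    # True
```

Any value of $D$ above about 9.21 would lead to rejection at the .01 level, and the observed value exceeds this by a wide margin.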