StatXact Example 6

Cochran-Armitage Exact Test of Trend for Imbalanced Data

Data from a recent study of the effect of maternal alcohol consumption on congenital sex organ malformations (Graubard and Korn, Biometrics, 1988) are tabulated below:

                  Maternal Alcohol Consumption (drinks/day)
Malformation        0      < 1    1-2    3-5    > 6    Total
Absent          17066    14464    788    126     37    32481
Present            48       38      5      1      1       93
Total           17114    14502    793    127     38    32574

To determine if increasing levels of maternal alcohol consumption place the fetus at increasing risk of malformations, one may use the trend test (Cochran, Biometrics, 1954; Armitage, Biometrics, 1955). StatXact is the only package currently available that can compute an exact trend test with any choice of column scores. Graubard and Korn strongly recommend using the average number of drinks/day (0, 0.5, 1.5, 4, 8) as the column scores, whereas most statistical packages default to the equally spaced scores (1, 2, 3, 4, 5). StatXact reveals that the choice of scores and the use of the exact option can make a difference to the statistical inference. 

                      Choice of Column Scores
P-Values      Average Drinks Per Day    Equally Spaced Scores
Exact             .0159 (.0158)             .1792 (.1047)
Asymptotic        .0078 (.0039)             .1765 (.0882)

Two-sided p-values are listed first, with one-sided p-values in parentheses.
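
The asymptotic row of the table above can be checked with a short sketch. It assumes the usual binomial-variance form of the Cochran-Armitage statistic and a normal reference distribution; StatXact's exact variance convention is not stated here, so the last digit may differ. The function name asymptotic_trend_test is hypothetical, chosen only for this illustration.

```python
import math

# Counts from the malformation table, columns ordered by alcohol consumption.
cases = [48, 38, 5, 1, 1]               # malformation present
totals = [17114, 14502, 793, 127, 38]   # column totals


def asymptotic_trend_test(cases, totals, scores):
    """Return (z, one_sided_p, two_sided_p) for the asymptotic trend test."""
    n_total = sum(totals)
    p_bar = sum(cases) / n_total
    # Observed minus expected score sum under the hypothesis of no trend.
    t = sum(s * (x - n * p_bar) for s, x, n in zip(scores, cases, totals))
    # Null variance of the score sum (binomial form, an assumption of this sketch).
    sum_ns = sum(n * s for s, n in zip(scores, totals))
    sum_ns2 = sum(n * s * s for s, n in zip(scores, totals))
    var = p_bar * (1 - p_bar) * (sum_ns2 - sum_ns ** 2 / n_total)
    z = t / math.sqrt(var)
    one_sided = 0.5 * math.erfc(z / math.sqrt(2))    # upper tail P(Z >= z)
    two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, one_sided, two_sided


for label, scores in [("average drinks/day", [0, 0.5, 1.5, 4, 8]),
                      ("equally spaced", [1, 2, 3, 4, 5])]:
    z, p1, p2 = asymptotic_trend_test(cases, totals, scores)
    print(f"{label}: z = {z:.3f}, one-sided p = {p1:.4f}, two-sided p = {p2:.4f}")
```

With the average drinks/day scores this gives a one-sided p-value of about .0039 and a two-sided value of about .0078. With equally spaced scores the values are about .088 and .176, close to the asymptotic row of the table.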

The main points from the above analysis are:

  1. The "average drinks per day" scores, unlike the "equally spaced" scores, reveal that the risk of malformations increases significantly with increasing alcohol consumption.
  2. The asymptotic p-values are inaccurate even though the sample size is very large. Since there are only 93 birth defects in a sample of 32,574, the distribution of the trend statistic is highly skewed. Asymptotic one- and two-sided p-values cannot capture the skewness of the underlying exact distribution. For example, with the unequally spaced scores, the asymptotic one-sided p-value (.0039) underestimates the exact one-sided p-value (.0158) by about a factor of four.
  3. Despite the large sample size, StatXact computes the exact p-values in about three minutes on an IBM-AT. In their original analysis, Graubard and Korn used several hours of CPU time on an IBM mainframe computer.
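
To illustrate what an exact conditional trend test computes, the sketch below enumerates the null distribution of the score sum with both table margins held fixed, using a column-by-column dynamic program. This is an illustrative technique only, not StatXact's own algorithm, which this page does not describe. The function name exact_trend_distribution is hypothetical, and the average drinks/day scores are doubled to (0, 1, 3, 8, 16) so that score sums are integers.

```python
from collections import defaultdict
from math import comb

cases = [48, 38, 5, 1, 1]               # malformation present
totals = [17114, 14502, 793, 127, 38]   # column totals
int_scores = [0, 1, 3, 8, 16]           # (0, 0.5, 1.5, 4, 8) doubled


def exact_trend_distribution(totals, int_scores, n_cases):
    """Map each attainable score sum to the number of ways to reach it."""
    # State (k, t): k cases placed so far, t = accumulated score sum.
    states = {(0, 0): 1}
    for n, s in zip(totals[:-1], int_scores[:-1]):
        ways = [comb(n, x) for x in range(n_cases + 1)]
        nxt = defaultdict(int)
        for (k, t), w in states.items():
            for x in range(min(n, n_cases - k) + 1):
                nxt[(k + x, t + s * x)] += w * ways[x]
        states = nxt
    # The last column receives whatever cases remain.
    n, s = totals[-1], int_scores[-1]
    dist = defaultdict(int)
    for (k, t), w in states.items():
        x = n_cases - k
        if x <= n:
            dist[t + s * x] += w * comb(n, x)
    return dist


n_cases = sum(cases)
dist = exact_trend_distribution(totals, int_scores, n_cases)
total = comb(sum(totals), n_cases)      # number of ways to place all cases
observed = sum(s * x for s, x in zip(int_scores, cases))

# Exact one-sided p-value: probability of a score sum at least as large
# as the observed one, given the margins.
p_upper = sum(w for t, w in dist.items() if t >= observed) / total
print(f"observed score sum = {observed / 2}, exact one-sided p = {p_upper:.4f}")
```

If these conventions match the ones StatXact uses, the result should be comparable to the exact one-sided value of .0158 reported in the table. Integer arithmetic is used throughout so that the tail probability carries no rounding error until the final division.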

 

 


