StatXact Example 6
Cochran-Armitage Exact Test of Trend for Imbalanced Data
Data from a recent study of the effect of maternal alcohol consumption on congenital sex organ malformations (Graubard and Korn, Biometrics, 1988) are tabulated below:
Maternal Alcohol Consumption (drinks/day)

| Malformation | 0 | < 1 | 1-2 | 3-5 | > 6 | Total |
|---|---|---|---|---|---|---|
| Absent | 17066 | 14464 | 788 | 126 | 37 | 32481 |
| Present | 48 | 38 | 5 | 1 | 1 | 93 |
| Total | 17114 | 14502 | 793 | 127 | 38 | 32574 |
To determine whether increasing levels of maternal alcohol consumption place the fetus at increasing risk of malformations, one may use the Cochran-Armitage trend test (Cochran, Biometrics, 1954; Armitage, Biometrics, 1955). StatXact is the only package currently available that can compute an exact trend test with any choice of column scores. Graubard and Korn strongly recommend using the average number of drinks/day (0, 0.5, 1.5, 4, 8) as the column scores, whereas most statistical packages default to the equally spaced scores (1, 2, 3, 4, 5). StatXact reveals that both the choice of scores and the use of the exact option can change the statistical inference.
P-values by choice of column scores, shown as two-sided (one-sided)

| P-Values | Average Drinks Per Day | Equally Spaced Scores |
|---|---|---|
| Exact | .0159 (.0158) | .1792 (.1047) |
| Asymptotic | .0078 (.0039) | .1765 (.0882) |
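The asymptotic row can be approximated with a few lines of Python. This is a minimal sketch, not StatXact's implementation: the function name `cochran_armitage` is hypothetical, and the variance formula is the usual binomial form of the Cochran-Armitage statistic, which is assumed here rather than taken from the source.

```python
from math import erfc, sqrt

def cochran_armitage(cases, totals, scores):
    """Asymptotic Cochran-Armitage trend test (illustrative helper).

    Returns (z, one_sided_p, two_sided_p), where the one-sided p-value
    tests for an increasing trend in the proportion of cases.
    """
    n_total = sum(totals)
    p_bar = sum(cases) / n_total  # pooled malformation rate
    # Score-weighted sum of observed minus expected cases
    t_stat = sum(s * (x - n * p_bar) for s, x, n in zip(scores, cases, totals))
    sum_sn = sum(s * n for s, n in zip(scores, totals))
    sum_s2n = sum(s * s * n for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (sum_s2n - sum_sn ** 2 / n_total)
    z = t_stat / sqrt(variance)
    one_sided = 0.5 * erfc(z / sqrt(2))   # upper normal tail
    two_sided = erfc(abs(z) / sqrt(2))    # both normal tails
    return z, one_sided, two_sided

# Counts from the malformation table above
cases = [48, 38, 5, 1, 1]
totals = [17114, 14502, 793, 127, 38]

for label, scores in [("average drinks/day", [0, 0.5, 1.5, 4, 8]),
                      ("equally spaced", [1, 2, 3, 4, 5])]:
    z, p1, p2 = cochran_armitage(cases, totals, scores)
    print(f"{label}: z = {z:.3f}, one-sided p = {p1:.4f}, two-sided p = {p2:.4f}")
```

Under these assumptions the printed p-values should land close to the asymptotic entries in the table; small differences in the last digit are possible because packages differ in the variance convention they use.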
The main points from the above analysis are:
- The "average drinks per day" scores, unlike the "equally spaced" scores, reveal that the risk of malformations increases significantly with increasing alcohol consumption.
- The asymptotic p-values are inaccurate even though the sample size is very large. Since there are only 93 birth defects in a sample of 32,574, the distribution of the trend statistic is highly skewed. Asymptotic one- and two-sided p-values cannot capture the skewness of the underlying exact distribution. For example, with the unequally spaced scores the asymptotic one-sided p-value (.0039) underestimates the exact one-sided p-value (.0158) by about a factor of four.
- Despite the large sample size, StatXact computes the exact p-values in about three minutes on an IBM-AT. In their original analysis, Graubard and Korn used several hours of CPU time on an IBM mainframe computer.
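To make the idea of an exact one-sided p-value concrete, the sketch below enumerates the conditional (fixed-margins) distribution of the score statistic by dynamic programming over the columns. It is an illustration under stated assumptions, not the algorithm StatXact uses: the function name `exact_trend_pvalue` is hypothetical, scores must be supplied as integers (the average-drinks scores are doubled to 0, 1, 3, 8, 16, which leaves the p-value unchanged), and only the one-sided tail is computed. It may take several seconds to run in plain Python.

```python
from math import comb

def exact_trend_pvalue(cases, totals, int_scores):
    """One-sided exact p-value P(T >= t_obs), T = sum(score_i * cases_i).

    The null distribution conditions on both table margins, so the case
    counts across columns follow a multivariate hypergeometric law.
    """
    n_cases = sum(cases)
    t_obs = sum(s * x for s, x in zip(int_scores, cases))
    # dist maps (cases placed so far, statistic so far) -> number of ways
    dist = {(0, 0): 1}
    for n_i, s_i in zip(totals, int_scores):
        ways_in_column = [comb(n_i, x) for x in range(n_cases + 1)]
        new_dist = {}
        for (c, t), ways in dist.items():
            for x in range(min(n_i, n_cases - c) + 1):
                key = (c + x, t + s_i * x)
                new_dist[key] = new_dist.get(key, 0) + ways * ways_in_column[x]
        dist = new_dist
    total_ways = comb(sum(totals), n_cases)
    tail_ways = sum(w for (c, t), w in dist.items()
                    if c == n_cases and t >= t_obs)
    return tail_ways / total_ways  # exact integer ratio, rounded to float

# Counts from the malformation table; scores are 2 x (0, 0.5, 1.5, 4, 8)
cases = [48, 38, 5, 1, 1]
totals = [17114, 14502, 793, 127, 38]
p_exact = exact_trend_pvalue(cases, totals, [0, 1, 3, 8, 16])
print(f"exact one-sided p (average drinks/day scores): {p_exact:.4f}")
```

Because the counts are held as Python integers, the tail probability is computed without floating-point accumulation error; the result can be compared with the exact one-sided entry in the p-value table.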
Example Submissions
Over the years, most of the examples have come from in-house statisticians, but we would like our examples to better reflect the actual uses customers have for our products.
Therefore, we would like to ask for data from our customers for use as examples. Any data we use would be properly cited, and with your permission, could be used for examples in our manuals, in marketing material, or even for one of our live webcasts.
This is a chance to have your hard work published and made known to many other statisticians in your field and in other disciplines.