This is the first of a three-part post in which we will consider (i) the improvements to trial quality that result from bundling data management with biostatistics, (ii) the reductions in cost and study length that result from the same bundling, and (iii) the contributions of statistical innovation to clinical data management, such as those by Cytel Board member Professor Marvin Zelen (Research Professor and former Chair of Biostatistics at the Harvard School of Public Health).
“To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination.” This well-known observation from Ronald Fisher comes with a caveat when applied to clinical data management – the remains presented to statisticians are themselves complex, disorderly, replete with missing data, and generally difficult to examine. Given that thousands of lives (and possibly millions of dollars) are at stake in drawing accurate statistical conclusions from clinical data, it is worth considering the benefits of collaboration between data management teams and statisticians.
In one sense, trial statisticians are the natural consumers of well-managed data. Whether data collected from sites are useful depends in large part on whether they are easy to clean and evaluate. When biostatistics is bundled with data management, statisticians are positioned to anticipate complexities that might arise during analysis and to ensure that data sets arrive properly formatted.
Efficient Data Verification and Validation:
Data verification often takes the form of data management teams laboring through all of the available data to ensure that clinicians have followed protocol adequately. In a perfectly functioning system this would improve the data, but studies show that the process suffers from two types of inefficiency. First, some data are more critical to a study’s findings than others, so distributing attention equally across all data points is an unnecessary expense of several million dollars. Second, and more worrisome, the time spent on uncritical variables makes management teams less likely to give adequate attention to the data that must be precise for the study’s findings. As a result, studies that could have had near-perfect data sets, had statisticians been available to differentiate between critical and less critical data, often have to tackle the problem of missing data or unlock closed databases.
Management of Critical Variables:
Even though all of the information gathered over the course of a clinical trial might inform the trial’s conclusions, the accurate reporting of some critical variables is typically far more important for trial success than that of others. For example, small inaccuracies in the reported height or weight of a patient may have less effect on a trial’s conclusions than similar inaccuracies in reported blood pressure. Since improper reporting of critical data can be highly detrimental to a trial’s quality, it is crucial that study oversight take particular account of how these critical variables are managed. Collaboration between statisticians and data management encourages such oversight, since statisticians know which variables are critical and what statistical hurdles would result from inaccurate reporting.
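To make the idea of prioritizing critical variables concrete, here is a minimal sketch of a verification plan in Python. It assumes a simple policy that is not taken from any particular trial: every record is checked for fields a statistician has marked as critical, while each remaining field is checked only for a random sample of records. The function name `plan_verification`, the field names, and the 10% sampling fraction are all hypothetical choices made for illustration.

```python
import random


def plan_verification(record_ids, fields, critical_fields,
                      sample_fraction=0.1, seed=0):
    """Return the set of (record_id, field) pairs to check against source data.

    Critical fields are checked for every record. Each non-critical field is
    checked for a random sample of records (hypothetical policy).
    """
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    record_ids = list(record_ids)
    n_sample = max(1, round(sample_fraction * len(record_ids))) if record_ids else 0

    plan = set()
    for field in fields:
        if field in critical_fields:
            chosen = record_ids  # verify every record
        else:
            chosen = rng.sample(record_ids, n_sample)  # verify a sample only
        plan.update((rid, field) for rid in chosen)
    return plan


# Illustrative data: 200 records, three fields, one marked critical.
fields = ["systolic_bp", "height_cm", "weight_kg"]
plan = plan_verification(range(1, 201), fields, critical_fields={"systolic_bp"})
print(len(plan), "checks planned instead of", 200 * len(fields))
```

In this toy setup the plan covers every blood pressure value but only a sample of heights and weights, so reviewer time is concentrated where an error would matter most. A real plan would set the critical fields and sampling rates from the protocol and the statistical analysis plan.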
Uncompromised Data Quality:
Statisticians who collaborate with data management are positioned to recognize bias when it is accidentally introduced into a trial. Statistical bias can appear even after the use of established randomization procedures, if unanticipated correlations happen to exist among study participants or if there are problems with equipment at a trial site. Without statisticians involved in data management, such bias is very difficult to detect until a study is published. Additionally, as Colin Baigent et al. point out, early discovery of bias not only ensures stronger statistical conclusions but can also improve participant safety in trials with ongoing enrollment.
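As one illustration of how a site-level equipment problem might surface during a trial, the sketch below compares each site's mean reading with the pooled readings from all other sites and flags large differences. This is a deliberately crude screen, assumed here for illustration only, and not a method described by Baigent et al. or a substitute for formal statistical monitoring. The function name `flag_outlying_sites`, the threshold of 3, and the blood pressure readings are all hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def flag_outlying_sites(readings_by_site, threshold=3.0):
    """Flag sites whose mean reading is far from the mean of all other sites.

    For each site, computes a z-like score: the difference between the site
    mean and the pooled mean of the other sites, divided by the standard
    error implied by the other sites' spread and this site's sample size.
    """
    flagged = {}
    for site, values in readings_by_site.items():
        # Pool the readings from every other site.
        others = [v for s, vals in readings_by_site.items() if s != site
                  for v in vals]
        if len(values) < 2 or len(others) < 2:
            continue  # too little data to say anything
        se = stdev(others) / sqrt(len(values))
        if se == 0:
            continue  # no spread in the comparison data
        z = (mean(values) - mean(others)) / se
        if abs(z) > threshold:
            flagged[site] = z
    return flagged


# Made-up systolic blood pressure readings; site C reads consistently high.
readings = {
    "site_A": [120, 122, 118, 121, 119],
    "site_B": [119, 121, 120, 122, 118],
    "site_C": [135, 137, 134, 136, 138],
}
flagged = flag_outlying_sites(readings)
print(sorted(flagged))
```

With these made-up numbers only the third site is flagged. A flag is a prompt to investigate, for example by checking calibration records, and is not evidence of bias on its own, since real differences in patient populations between sites can produce the same pattern.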