Bayesian Methods for Contending with Homogeneity and Heterogeneity in Real World Data

February 25, 2020

Over the past decade, a new trend has emerged that is changing the way clinical trials are conducted. While placebo-controlled randomized controlled trials (RCTs) remain the gold standard, in some situations single arm trials have become an accepted way of assessing a new treatment intervention. Single arm trials establish clinical benefit by demonstrating the positive effects of a new therapy or treatment without using placebo or standard of care as a concurrent control. Instead, alternative approaches to establishing the comparison are used; these have become known as external controls or synthetic control arms (SCAs henceforth) and include approaches that leverage real world data from various sources or evaluations of historical clinical trial data.

The potential simplification of clinical development by using single arm trials and SCAs does not mean, however, that applying the approach is straightforward. It requires a careful assessment of the trade-offs between the content of the data and their availability, as well as a determination of the optimal methodology. One retrospective study found that single arm trials in oncology often do not meet criteria for clinical benefit as established by the European Medicines Agency.[1] Others have questioned whether non-randomized trials are equipped to deliver unbiased comparisons, or whether they are simply a way to deal with unmet recruitment targets or the over-bureaucratization of the regulatory process.[2]

On the other hand, there is a case for making the most of available real world and historical trial data if the alternative is to conduct experiments on patients that only marginally improve the robustness of the findings. There are also legitimate cases, such as rare diseases and/or genetic conditions, where recruitment is a challenge. Finally, enrolling patients into a control arm might be viewed as unethical (e.g., in pediatric conditions or some oncology indications). SCAs provide a solution to these issues, but require appropriate use.

SCAs are best suited to situations in which a single arm trial is run in a molecularly defined patient population, allowing a clearly defined historical or real-world control group to be created. They are also useful where RCTs opt to enroll some, but fewer, patients into the control arm (e.g., a 4:1 allocation ratio of historical data to newly enrolled patients). Advanced statistical methods are applied to historical trial and/or real world data to build the SCA in a way that allows for an appropriate comparison with data gathered during the single arm trial. As such, this approach to comparing the new and currently available treatments requires a substantial amount of scientific and operational rigour, and there is a need for continuing education to understand the various ways SCAs are constructed.

Homogeneity and Heterogeneity

 

When combining historical datasets to create an SCA, researchers might discover that the historical data involve patients who are either homogeneous or heterogeneous relative to those enrolling in the new trial. Homogeneity and heterogeneity describe how closely external data match the needs of the current trial. Pfizer’s Ibrance (for men with breast cancer) was the result of a synthetic control arm composed of a post-market extension of a similar therapy for women, combined with results from three databases.[3] Because men had already been using it in these post-market settings, the external data reflected similarities in clinical endpoint, population demographics and other variables (i.e., they were highly homogeneous to the data that would have been collected had there been a placebo-controlled study). On the other hand, sometimes external data reflect a slightly different population sample (e.g., in age, sex or race), or perhaps identify somewhat different clinical endpoints. In these instances of heterogeneity, further statistical adjustment must be incorporated. Note that while homogeneous data carry less bias than a heterogeneous dataset might, both are susceptible to some bias for which the analysis must account.

When real world data reflect a patient population homogeneous to those currently enrolling, these data are easier to combine and the comparison can proceed as any other comparative analysis. When there is heterogeneity, the historical data may be biased relative to the current trial. The solution, in such cases, is to add weights to the historical datasets, using a Bayesian hierarchical model. Suppose we have two datasets, one that reflects the desired population and endpoint, and another that reflects the desired endpoint but not the population. The conclusions of the study must account for the fact that the first dataset gives information better suited to the study’s needs, so statisticians would not treat the two datasets the same way. Weighting is therefore a technique that gives greater influence to information from the dataset better suited to the study.
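To make the two-dataset example concrete, here is a minimal sketch of down-weighting for a binary endpoint. It uses a fixed-weight power prior with a Beta-Binomial model, which is a simpler stand-in for the hierarchical model mentioned above (a hierarchical model would estimate the amount of borrowing from the data rather than fixing it). The dataset names, patient counts and weights are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class HistoricalDataset:
    name: str
    responders: int   # patients who met the endpoint (hypothetical counts)
    n: int            # total patients in the dataset
    weight: float     # 0 = ignore the dataset, 1 = pool it fully


def weighted_beta_posterior(datasets, a0=1.0, b0=1.0):
    """Start from a Beta(a0, b0) prior and add each dataset's
    responders and non-responders, discounted by its weight."""
    a, b = a0, b0
    for d in datasets:
        a += d.weight * d.responders
        b += d.weight * (d.n - d.responders)
    return a, b


# Assumption: dataset A matches population and endpoint, so it is pooled fully;
# dataset B matches the endpoint only, so it is discounted to 30%.
historical = [
    HistoricalDataset("A: matching population and endpoint", 30, 100, 1.0),
    HistoricalDataset("B: matching endpoint only", 50, 100, 0.3),
]

a, b = weighted_beta_posterior(historical)
posterior_mean = a / (a + b)
effective_n = sum(d.weight * d.n for d in historical)

print(f"Posterior for the control response rate: Beta({a:.1f}, {b:.1f})")
print(f"Posterior mean: {posterior_mean:.3f}")
print(f"Effective number of historical patients: {effective_n:.0f}")
```

With these illustrative weights, dataset B contributes the equivalent of 30 patients rather than 100, so the estimate stays closer to the response rate seen in dataset A than simple pooling would allow.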

Bayesian Models

Bayesian models offer a flexible way of incorporating historical controls in the analysis of trial data (whether single arm or randomized). In the context of SCAs, one popular application of Bayesian models is Bayesian Dynamic Borrowing. An SCA is constructed by combining data from patients newly enrolled in the control arm of the trial with historical data using Bayesian methods. This allows fewer patients to be enrolled into the control group and makes better use of data already collected.

What is the ideal ratio of new to historical patients when constructing a synthetic control group? Using Bayesian Dynamic Borrowing, Dron et al. re-analysed a non-small cell lung cancer trial and found that changes in the ratio of new to historical patients have little effect on the statistical rigor of the trial.[4] The original trial recruited 734 patients to the control arm. Dron et al.’s findings suggest that the trial could have enrolled 440 new patients (294 fewer, a reduction of roughly 40%) and still achieved similar results. Situations with heterogeneous historical datasets require more new patients, but even in what the authors viewed as high-risk combinations of historical and new enrollments, the number of new enrollments needed decreased significantly.[5]

This approach can create efficiencies in clinical trials: fewer patients are randomized to control than to the experimental treatment, and external controls supplement the sparse concurrent control information to provide the same statistical efficiency. The historical or external control datasets are incorporated via a Bayesian prior, and the final control arm is completed by enrolling some new patients into the study.
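As a rough illustration of how a prior can borrow "dynamically", the sketch below uses a two-component mixture prior for a binary control endpoint: an informative Beta component built from historical controls and a vague Beta(1, 1) component. This is one common way to implement dynamic borrowing, not necessarily the model used by Dron et al., and the historical counts, the 50% prior weight and the two concurrent-control scenarios are hypothetical. The posterior weight on the historical component rises when the new control patients resemble the historical ones and falls when they conflict.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal(y, n, a, b):
    """Log marginal likelihood of y responders in n patients under a
    Beta(a, b) prior. The binomial coefficient is omitted because it is
    identical for both mixture components and cancels in the weights."""
    return log_beta(a + y, b + n - y) - log_beta(a, b)


def mixture_update(y, n, hist_a, hist_b, prior_weight=0.5, vague=(1.0, 1.0)):
    """Return (posterior weight on the historical component,
    posterior mean of the control response rate)."""
    log_inf = log(prior_weight) + log_marginal(y, n, hist_a, hist_b)
    log_vag = log(1.0 - prior_weight) + log_marginal(y, n, vague[0], vague[1])
    top = max(log_inf, log_vag)  # subtract the max for numerical stability
    w_inf = exp(log_inf - top) / (exp(log_inf - top) + exp(log_vag - top))
    mean_inf = (hist_a + y) / (hist_a + hist_b + n)
    mean_vag = (vague[0] + y) / (vague[0] + vague[1] + n)
    return w_inf, w_inf * mean_inf + (1.0 - w_inf) * mean_vag


# Hypothetical historical controls: 30 responders, 70 non-responders.
HIST_A, HIST_B = 30.0, 70.0

# Scenario 1: 40 new control patients who resemble the historical data.
w_similar, mean_similar = mixture_update(12, 40, HIST_A, HIST_B)
# Scenario 2: 40 new control patients who conflict with the historical data.
w_conflict, mean_conflict = mixture_update(28, 40, HIST_A, HIST_B)

print(f"Similar data:     weight on history = {w_similar:.3f}, mean = {mean_similar:.3f}")
print(f"Conflicting data: weight on history = {w_conflict:.3f}, mean = {mean_conflict:.3f}")
```

The design choice here is that the amount of borrowing is not fixed in advance: when the concurrent controls disagree with the historical data, the vague component takes over and the historical information is largely discounted.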

It should be noted that no statistical model can offset poor data quality. One of the key first steps when considering an SCA to support a single arm submission is therefore the evaluation of the available data. A data source should not be combined into the SCA if it does not meet the needs of the comparison with the single arm trial results. A continuous dialogue is needed between real world data (RWD), statistical and clinical development experts when planning this part of the clinical development program.

 

 

 

[1] Tibau, Ariadna, et al. "Magnitude of clinical benefit of cancer drugs approved by the US Food and Drug Administration." JNCI: Journal of the National Cancer Institute 110.5 (2018): 486-492.

[2] Collins, Rory, et al. "The Magic of Randomization versus the Myth of Real-World Evidence." New England Journal of Medicine 382.7 (2020): 674-678.

[3] https://www.pfizer.com/news/press-release/press-release-detail/u_s_fda_approves_ibrance_palbociclib_for_the_treatment_of_men_with_hr_her2_metastatic_breast_cancer

[4] Dron L, Golchi S, Hsu G, Thorlund K. Minimizing control group allocation in randomized trials using dynamic borrowing of external control data - An application to second line therapy for non-small cell lung cancer. Contemporary Clinical Trials Communications 2019; 16: 100446.

[5] Ibid.