
Reinforcement Learning: A Promising Tool for Predicting Optimal Treatment in Complex Diseases

Written by Fei Tang and Evie Merinopoulou


Reinforcement learning (RL), a branch of machine learning (ML), is a framework for identifying a sequence of actions that increases the likelihood of achieving a predetermined goal. RL has been widely used in gaming and robotics; a notable example is a general RL algorithm that defeated world-champion programs in chess, shogi, and Go [1]. In recent years, RL has gained significant attention in healthcare, particularly in clinical decision-making and the optimization of treatment regimens.


Reinforcement learning in healthcare decision-making: What to do for best possible patient outcomes

In healthcare, treatment regimens often involve a dynamic, sequential decision-making process in which the order of interventions can significantly influence patient outcomes. For example, a clinician chooses a treatment based on a patient’s health, observes the response to that treatment, and then repeats the process. The potential of RL in clinical decision-making is gaining recognition. As an example, a novel RL framework developed by Yala et al. to support personalized breast cancer screening was found to outperform the standard regimen used in clinical practice [2]. Additionally, using real-world medical registry data, Liu et al. employed an RL framework to predict the sequence of prevention and treatment steps for acute and chronic graft-versus-host disease, and demonstrated its potential to enhance patients’ long-term outcomes by learning from expert actions [3].
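The clinician's loop described above maps onto the standard RL formulation, sketched here in conventional textbook notation (the symbols are assumptions for illustration and are not taken from the cited studies): the patient's health at decision point t is a state s_t, the chosen treatment is an action a_t, and the observed response is summarized as a numerical reward r_t. A treatment policy π maps states to actions, and the goal is to find the policy with the highest expected cumulative (optionally discounted) reward over the course of care:

```latex
\pi^{*} \;=\; \arg\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{T} \gamma^{t} \, r_{t} \right],
\qquad a_{t} \sim \pi(\cdot \mid s_{t}), \quad 0 \le \gamma \le 1
```

Here T is the number of decision points and γ is a discount factor that controls how strongly later outcomes are weighted relative to earlier ones.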

RL provides the capacity to adapt and refine diagnostic and treatment strategies in response to evolving patient conditions and characteristics, ensuring the continuous pursuit of an effective course of action. In scenarios where numerous treatment options exist, RL’s ability to navigate this complex decision space is invaluable.

To illustrate, consider a patient with cancer undergoing treatment: RL can assist in determining the sequence of regimens, dosages, and timing that maximizes the effectiveness of the treatment. This process typically involves a series of decisions made at each treatment stage, where RL algorithms can learn from the patient’s response and adapt the treatment plan accordingly, making it a truly personalized approach.
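The idea of learning a treatment sequence from observed responses can be sketched with tabular Q-learning, one of the simplest RL algorithms. The example below is a toy illustration only: the health states, the two therapy options, the improvement probabilities, and the reward values are all hypothetical and carry no clinical meaning, and it is not the method used in any of the studies cited here.

```python
import random

# Hypothetical toy model: every state, action, probability and reward below
# is invented for illustration and has no clinical meaning.
STATES = ["severe", "moderate", "mild", "remission"]  # "remission" is terminal
ACTIONS = ["standard", "intensive"]
TERMINAL = 3

# P_IMPROVE[action][state]: assumed chance of moving one state closer to remission.
P_IMPROVE = {
    0: [0.2, 0.4, 0.6],  # standard therapy
    1: [0.6, 0.6, 0.7],  # intensive therapy
}
TOXICITY_COST = {0: 0.0, 1: 1.0}  # assumed extra burden of intensive therapy


def step(state, action, rng):
    """Simulate one treatment cycle and return (next_state, reward)."""
    improved = rng.random() < P_IMPROVE[action][state]
    next_state = state + 1 if improved else state
    reward = -1.0 - TOXICITY_COST[action]  # each cycle has a cost
    if next_state == TERMINAL:
        reward += 10.0  # bonus for reaching remission
    return next_state, reward


def q_learning(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.1,
               seed=0, max_steps=50):
    """Learn a Q-table with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(TERMINAL)]
    for _ in range(episodes):
        state = rng.randrange(TERMINAL)  # simulated patient starts in a random state
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))  # explore
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])  # exploit
            next_state, reward = step(state, action, rng)
            # No future value is added once the terminal state is reached.
            future = 0.0 if next_state == TERMINAL else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            if next_state == TERMINAL:
                break
            state = next_state
    return q


q_table = q_learning()
# Greedy policy: the action with the highest learned value in each state.
policy = [ACTIONS[max(range(len(ACTIONS)), key=lambda a: q_table[s][a])]
          for s in range(TERMINAL)]
for s in range(TERMINAL):
    print(f"{STATES[s]:>8}: {policy[s]}")
```

The learned table assigns a value to each state and action pair, and the printed policy picks the highest-valued therapy for each health state. Real applications differ substantially: patient states are high-dimensional, data are observational rather than simulated, and learned policies require careful validation before any clinical use.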


A case for the treatment of multiple myeloma

In our upcoming presentation at ISPOR Europe, we will showcase a case study that uses an RL framework to estimate optimal personalized treatment sequencing for patients with multiple myeloma (MM). The framework may provide a robust and clinically meaningful approach for using real-world claims data to estimate personalized treatment regimens that can improve patient outcomes and address heterogeneity in MM and similar disease settings.

In chronic disease management, RL can help optimize care by creating individualized care plans. In pharmacogenomics, RL can aid in the interpretation of large and complex datasets. By analyzing an individual’s genetic makeup and how it influences their response to treatments, RL can help clinicians make informed decisions about the most suitable therapies and dosages, and about potential side effects, for a given patient.


The future of reinforcement learning

RL has emerged as a promising tool in the healthcare sector, particularly in clinical decision-making. Through data-driven decision-making and continuous adaptation, RL contributes to a healthcare landscape where treatments and interventions are tailored to each patient’s biology and circumstances. Its ability to identify optimal treatment sequences, adapt to evolving patient conditions, and personalize treatment plans makes it an asset in the pursuit of improved patient care and outcomes.

Despite its promise, more work is needed before RL can be implemented at scale. If used incorrectly, RL frameworks could suggest practices that are not supported by evidence, which may harm patients. In addition, relevant variables that are unmeasured in real-world datasets may reduce the prediction accuracy of RL frameworks.

Future research should focus on developing guidance to facilitate the validation of RL algorithms intended for healthcare applications, leveraging extensive, prospectively collected datasets, and incorporating comprehensive static and time-varying features into appropriate RL systems to ensure prediction accuracy.







  1. Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., ... & Hassabis, D. (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, 362(6419), 1140-1144.
  2. Yala, A., Mikhael, P. G., Lehman, C., Lin, G., Strand, F., Wan, Y. L., ... & Barzilay, R. (2022). Optimizing risk-based breast cancer screening policies with reinforcement learning. Nature Medicine, 28(1), 136-143.
  3. Liu, Y., Logan, B., Liu, N., Xu, Z., Tang, J., & Wang, Y. (2017, August). Deep reinforcement learning for dynamic treatment regimes on medical registry data. In 2017 IEEE International Conference on Healthcare Informatics (ICHI) (pp. 380-385). IEEE.
