
Under wraps: the importance of patient privacy


About the Author: Manjusha Gode has over 28 years of IT experience spanning delivery management, quality management, software testing, people management, process improvement, and multi-locational operations. She is a pioneering member of Cytel's clinical programming team.

Clinical data transparency improves decisions for all healthcare stakeholders, including patients, caregivers, healthcare providers, payers, and regulators.

How clinical trial data can be effectively anonymized and disclosed for use by researchers has become a prominent topic in recent years. In this blog post, we discuss the broader context of patient privacy and the key considerations for biopharmaceutical companies in this area.

A recent book by industry experts Khaled El Emam and Luk Arbuckle, ‘Anonymizing Health Data’ (1), uses the phrase ‘People are Private’, and this statement sums up the crux of the issue. In clinical trials, we gather a host of information about patients to help support the development of new medicines. It is critical that we protect the confidentiality and privacy of the data that these individuals have entrusted to us. This priority must stay front of mind as we consider the most appropriate techniques for risk assessment and data anonymization, so that patient data is protected while the value of clinical trial knowledge is maximized through data sharing.

The topic has become a particular focus of discussion within the pharma industry in recent years for a number of reasons.

First, the increase in electronic processing of data in settings such as hospitals and insurance companies means that more patient information is being generated and handled across the board. Second, the regulatory environment is a key driver. The European Medicines Agency (EMA)’s Policy 0070 on Clinical Trial Disclosure came into effect in January 2015 (2). The policy covers the proactive release of clinical data, supplementing rather than replacing Policy 0043, which handles the reactive release of data on written request. The policy is being implemented in two phases: Phase 1 (the current status), which covers the disclosure of Clinical Study Reports (CSRs) only, and Phase 2, which will handle controlled access to Individual Patient Data (IPD). All trials must be disclosed to the public regardless of whether the product is ultimately authorized, rejected or withdrawn, and the EMA is now handling a backlog of applications dating back to 1 January 2015, when the policy came into effect.

The sheer volume of data means that developing efficient techniques for anonymization is vital. ClinicalTrials.gov currently lists 246,603 studies (3), with locations in all 50 US states and in 200 countries. The data released for the drugs Kyprolis and Zurampic alone totaled roughly 260,000 pages of information from more than 100 clinical study reports (4).


Key aspects to consider

Before anonymizing or redacting clinical data for Clinical Trial Disclosure purposes, sponsors should put other measures in place to handle patient data privacy. The most obvious example is executing the right contractual terms with any partners handling data, to ensure access is properly controlled.

Once we do arrive at the point of thinking about anonymization, the primary area for any company to consider is the context in which the data is going to be shared.

There are two broad situations in which data must be handled. One is the primary use of the information. For example, a caregiver must access and use data about a patient before treating them, and an insurer must review a patient’s data when settling a claim. What we typically mean when we talk about the need for anonymization is the secondary use of the data: those situations in which information is going to be used for further analysis, to draw certain conclusions, or for marketing purposes.

Understanding the context in which the data is being used helps to define which technique can best ensure the data is protected to the right level.

After defining the context, it is also necessary to consider the risk of re-identification that remains in every situation after data has been anonymized: that is, the risk that a link could somehow be re-established between the anonymized data and an individual patient. Once we decide exactly where the data is going to be released after anonymization, a full risk assessment can be conducted and suitable risk thresholds defined.
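
As a rough illustration of what such a risk assessment can look like in practice, the sketch below groups records that share the same quasi-identifier values (an "equivalence class") and treats each record's re-identification risk as one divided by the size of its class. This is one commonly used way to quantify risk, not necessarily the method any particular sponsor or regulator applies. The records, the choice of quasi-identifiers, and the threshold value are all made-up assumptions for the example.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Return (max_risk, avg_risk) for a list of dict records.

    A record's risk is 1 / (size of its equivalence class), where an
    equivalence class is the set of records that share the same values
    on every quasi-identifier.
    """
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    risks = [1 / class_sizes[k] for k in keys]
    return max(risks), sum(risks) / len(risks)


# Hypothetical, made-up records for illustration only.
records = [
    {"age_band": "40-49", "sex": "F", "country": "UK"},
    {"age_band": "40-49", "sex": "F", "country": "UK"},
    {"age_band": "40-49", "sex": "F", "country": "UK"},
    {"age_band": "70-79", "sex": "M", "country": "UK"},  # unique record
]

# Illustrative threshold; a real value depends on the release context.
RISK_THRESHOLD = 0.5

max_risk, avg_risk = reidentification_risk(
    records, ["age_band", "sex", "country"]
)
needs_more_anonymization = max_risk > RISK_THRESHOLD
print(max_risk, avg_risk, needs_more_anonymization)
```

Here the single 70-79 male record forms a class of size one, so the maximum risk is 1.0 and the dataset would need further generalization or suppression before it met the illustrative threshold. A public release would normally call for a stricter threshold than a controlled-access release, which is why the context has to be defined first.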


Industry standards

Since 2014, the PhUSE organization has been working to define de-identification standards for CDISC data models, beginning with SDTM, specifying a consistent approach for tackling particular variables. For example, if a dataset referred to the specific hospital where a study was conducted and you wanted to de-identify the hospital’s name, which method should you use?

Should the details be redacted, i.e. blacked out completely? This approach would close off the possibility of any analysis on this variable. Alternatively, should the details be generalized, so that, for example, Addenbrooke’s Hospital in Cambridge, UK would be referred to as a generic hospital in the UK? To generalize one step further, it might be referred to as a hospital in Europe.
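
The difference between the two options can be sketched in a few lines of code. The example below is a minimal illustration under stated assumptions: the hierarchy table, the function names, and the "REDACTED" placeholder are hypothetical and are not taken from the PhUSE standard.

```python
# Hypothetical generalization hierarchy, ordered from the first
# generalization step to the broadest one.
SITE_HIERARCHY = {
    "Addenbrooke's Hospital, Cambridge, UK": [
        "Hospital in the UK",
        "Hospital in Europe",
    ],
}


def redact(value):
    """Remove the value entirely; no analysis on it is possible afterwards."""
    return "REDACTED"


def generalize(value, level, hierarchy=SITE_HIERARCHY):
    """Replace a value with a broader category.

    level 0 returns the original value, level 1 the first generalization,
    and so on. Levels beyond the hierarchy return the broadest category.
    Values with no known hierarchy fall back to redaction.
    """
    if level <= 0:
        return value
    steps = hierarchy.get(value)
    if steps is None:
        return redact(value)
    return steps[min(level, len(steps)) - 1]


site = "Addenbrooke's Hospital, Cambridge, UK"
print(redact(site))         # REDACTED
print(generalize(site, 1))  # Hospital in the UK
print(generalize(site, 2))  # Hospital in Europe
```

Generalization keeps some analytical value (for example, comparisons by country or region) at the cost of a higher residual risk than full redaction, which is the trade-off a de-identification rule has to settle for each variable.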

The PhUSE standard (5) provides guidelines to support this process. The document states: ‘Each domain and variables holding potentially Personally Identifying Information (PII) have been rated in terms of impact on data privacy. Based on that rating the variables are allocated standard rules of de-identification, along with the rationale and the impact on data utility being documented.’

So, what can Cytel do to help the process of data de-identification?

Cytel has worked on several de-identification projects, and our team brings together a number of core competencies that are integral to success in this often complex area. Our software expertise, combined with a large and experienced programming team, allows us to leverage technology effectively and create automated solutions for de-identification. In addition, our heritage and background in statistics provide the necessary tools to make judgments in the risk assessment process. For example, rare features or events observed in a patient profile can put the patient at risk of being re-identified. Strong in-house statistical, medical and therapy area domain knowledge is key to pinpointing where the risks of re-identification lie.



1) Khaled El Emam and Luk Arbuckle, Anonymizing Health Data

2) European Medicines Agency, Policy on publication of clinical data for medicinal products for human use

3) ClinicalTrials.gov, Trends

4) EMA Transparency: New Clinical Reports go live, October 2016 (raps.org)

5) PhUSE Data Transparency Working Group, De-Identification Standards for CDISC Data Models

