
It’s Time to Move, Time to Move to Define-XML 2.1


As of March 2023, the FDA recommends the use of Define-XML 2.1 for the submission of SEND, SDTM, and ADaM packages, specifically for any study started on or after March 15, 2023 [1]; this is not yet the case for PMDA [2, 3, 4]. However, since no end date has been published for the Define v2.0 requirement, submitting data using Define v2.0 is still acceptable.

The CDISC Define-XML 2.1 landing page lists six major updates:

  1. Updated approach to def:Origin
  2. Identification of the standards and controlled terminology
  3. Added support for sub-classes (for ADaM)
  4. Improved SENDIG support
  5. v2.0 errata fixes
  6. Additional updates

In this article, I will focus on items 1, 2, 3, and 6.


Updated Approach to def:Origin

In my opinion, the most significant of the six changes is the modification of the "Origin" element. It is noteworthy because it not only introduces a new attribute but also replaces the previous method of describing the origin of a variable or value-level item. In the previous version of Define-XML, only the "Type" attribute was available, whereas the Origin element now includes an additional attribute called "Source." According to the Define-XML 2.1 standard document, both attributes are required, except when the Type is "Predecessor," in which case Source is not required.

The Define-XML standard document contains two tables that illustrate how these two attributes should be used to identify various sources for SDTM and ADaM, respectively. This change does not significantly impact ADaM, where the previous Type remains the same and the Source is always "Sponsor," but it does require some additional effort for SDTM Define-XML.

In Define-XML 2.0, an SDTM variable or value-level item could have one of the following Origin Type values:

    • CRF
    • eDT
    • Protocol
    • Assigned
    • Derived

In Define-XML 2.1, however, each of these is replaced by one of the combinations of Origin Type and Origin Source shown in the following table.


Table 1: Change to "Origin" Element

| Define 2.0 Origin Type | Define 2.1 Origin Type | Define 2.1 Origin Source | Comment |
| --- | --- | --- | --- |
| CRF | Collected | Investigator | Collected directly in the CRF |
| eDT | Collected | Subject | Collected by the subject through an instrument |
| eDT | Collected | Vendor | Received from a central lab, e.g., labs, ECG |
| Derived | Derived | Vendor | Calculated within an EDC (e.g., BMI) or by an instrument (e.g., questionnaire section scores within an ePRO) |
| Derived | Derived | Sponsor | Calculated in SDTM, e.g., EPOCH |
| Assigned | Assigned | Vendor | Coding terms such as MedDRA for AE, or assigned through a third-party adjudication process, e.g., best tumor response in an oncology trial |
| Assigned | Assigned | Sponsor | Assigned in SDTM, e.g., --TESTCD, DOMAIN |
| Protocol | Protocol | Sponsor | Not directly collected, but it could be assigned by protocol, e.g., VSPOS (Vital Signs Position) |
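
As a rough sketch of what Table 1 means for a migration script, the mapping can be written as a lookup from each Define-XML 2.0 origin to its candidate 2.1 Type and Source pairs. The names `ORIGIN_CANDIDATES`, `candidate_origins`, and `needs_review` are hypothetical and are not part of any CDISC tool. The point is that several 2.0 origins map to more than one 2.1 combination, so a script can only propose candidates.

```python
# Candidate Define-XML 2.1 (Type, Source) pairs for each Define-XML 2.0 origin,
# transcribed from Table 1. Names in this sketch are illustrative only.
ORIGIN_CANDIDATES = {
    "CRF": [("Collected", "Investigator")],
    "eDT": [("Collected", "Subject"), ("Collected", "Vendor")],
    "Derived": [("Derived", "Vendor"), ("Derived", "Sponsor")],
    "Assigned": [("Assigned", "Vendor"), ("Assigned", "Sponsor")],
    "Protocol": [("Protocol", "Sponsor")],
}


def candidate_origins(old_type):
    """Return the possible 2.1 (Type, Source) pairs for a 2.0 origin type."""
    return ORIGIN_CANDIDATES.get(old_type, [])


def needs_review(old_type):
    """True when the 2.0 origin does not map to exactly one 2.1 pair."""
    return len(candidate_origins(old_type)) != 1


if __name__ == "__main__":
    for old in ORIGIN_CANDIDATES:
        status = "manual review" if needs_review(old) else "automatic"
        print(old, candidate_origins(old), status)
```

Only CRF and Protocol resolve automatically here; eDT, Derived, and Assigned need someone who knows where the data actually came from to pick the Source.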


The Origin Type terms were modified in the Define-XML controlled terminology, and Origin Source was added, starting with version 2020-03-27. Origin Type also allows "Not Available" and "Other," which were added in version 2021-03-26.

To illustrate the differences in the identification of the "origin" for certain variables in the DM domain, Figures 1 and 2 show two examples extracted from the sample Define-XML provided with the Define-XML standard versions 2.0 and 2.1, as rendered by the Define-XML stylesheet.


Figure 1: SDTM DM Portion of Define-XML Using Version 2.0



Figure 2: SDTM DM Portion of Define-XML Using Version 2.1



The following example shows the portion of Define-XML that describes the Origin element of SUBJID in version 2.0 versus version 2.1.


Figure 3: Example of How Origin=CRF Is Now Defined in Define-XML 2.1

Define-XML 2.0: <def:Origin Type="CRF"/>
Define-XML 2.1: <def:Origin Type="Collected" Source="Investigator"/>
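
The SUBJID change in Figure 3 could be scripted along the following lines. This is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function `upgrade_origin` and the `TYPE_RENAMES` lookup are hypothetical, the Source value still has to be supplied by someone who knows the data flow, and child elements of the original Origin (such as document references) are deliberately not carried over.

```python
import xml.etree.ElementTree as ET

# Namespaces of the two Define-XML versions.
DEF_20 = "http://www.cdisc.org/ns/def/v2.0"
DEF_21 = "http://www.cdisc.org/ns/def/v2.1"

# 2.0 Type -> 2.1 Type, following Table 1; other types keep their name.
TYPE_RENAMES = {"CRF": "Collected", "eDT": "Collected"}


def upgrade_origin(origin_20, source):
    """Build a 2.1 def:Origin from a 2.0 one (hypothetical helper).

    Only the Type attribute is read from the old element; child elements
    are not copied in this sketch.
    """
    old_type = origin_20.get("Type")
    new_type = TYPE_RENAMES.get(old_type, old_type)
    return ET.Element(f"{{{DEF_21}}}Origin", {"Type": new_type, "Source": source})


# Serialize the 2.1 namespace with the conventional "def" prefix.
ET.register_namespace("def", DEF_21)

old = ET.fromstring(f'<def:Origin xmlns:def="{DEF_20}" Type="CRF"/>')
new = upgrade_origin(old, source="Investigator")
print(ET.tostring(new, encoding="unicode"))
```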


Identification of the Standards and Controlled Terminology

With Define-XML 2.0, you could specify the standard used in the whole package through the StandardName and StandardVersion attributes of the MetaDataVersion element; see Figure 4, taken from the sample package included with Define-XML 2.0.


Figure 4: Identification of Standard Used with Version 2.0



Prior to Define-XML 2.1, the CDISC controlled terminology version used in a CDISC package, whether SDTM, SEND, or ADaM, was typically specified in the reviewer guide or, in SDTM, it was assumed to be the version referenced in the TS domain through the TSVCDREF and TSVCDVER variables.

Version 2.1 deprecates the two attributes highlighted in Figure 4 and introduces a new def:Standards element. Figure 5 is an example of how the new element is displayed in a new section of the rendered Define-XML.


Figure 5: New “Standards” Section in Define-XML 2.1



Furthermore, the datasets’ metadata, through the ItemGroupDef element, now includes a new attribute called def:StandardOID. This attribute allows you to reference one of the previously declared standards. The example in Figure 6 shows that in an SDTM package, in addition to standard domains from the referenced SDTM IG, we also made use of additional SDTM domains specified in a separate IG, the Medical Device IG. We also have one domain, XS, that is not standard; this is specified through a new attribute, def:IsNonStandard (set to “Yes” in this case, indicating that XS is not a standard domain in any SDTM IG).


Figure 6: Datasets’ Metadata Portion Indicating for Each Domain which Standard Was Used
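
To make the relationship between the declared standards and the datasets more concrete, here is a hedged sketch that resolves each dataset to its standard. It assumes the dataset-level reference attribute is def:StandardOID, as in the Define-XML 2.1 schema. The fragment is heavily trimmed and not schema-valid, and the OIDs, the DI dataset, and the version numbers are illustrative assumptions rather than values taken from the sample package. The function `dataset_standards` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "odm": "http://www.cdisc.org/ns/odm/v1.3",
    "def": "http://www.cdisc.org/ns/def/v2.1",
}
DEF = "{http://www.cdisc.org/ns/def/v2.1}"

# Trimmed, illustrative fragment: OIDs and versions are assumptions.
SAMPLE = """<MetaDataVersion xmlns="http://www.cdisc.org/ns/odm/v1.3"
    xmlns:def="http://www.cdisc.org/ns/def/v2.1" OID="MDV.1" Name="Example">
  <def:Standards>
    <def:Standard OID="STD.1" Name="SDTMIG" Type="IG" Version="3.2" Status="Final"/>
    <def:Standard OID="STD.2" Name="SDTMIG-MD" Type="IG" Version="1.0" Status="Final"/>
  </def:Standards>
  <ItemGroupDef OID="IG.DM" Name="DM" def:StandardOID="STD.1"/>
  <ItemGroupDef OID="IG.DI" Name="DI" def:StandardOID="STD.2"/>
  <ItemGroupDef OID="IG.XS" Name="XS" def:IsNonStandard="Yes"/>
</MetaDataVersion>"""


def dataset_standards(mdv):
    """Map each dataset name to the standard it references (hypothetical helper)."""
    declared = {
        std.get("OID"): f'{std.get("Name")} {std.get("Version")}'
        for std in mdv.findall("def:Standards/def:Standard", NS)
    }
    resolved = {}
    for group in mdv.findall("odm:ItemGroupDef", NS):
        if group.get(DEF + "IsNonStandard") == "Yes":
            resolved[group.get("Name")] = "non-standard"
        else:
            oid = group.get(DEF + "StandardOID")
            resolved[group.get("Name")] = declared.get(oid, "unresolved")
    return resolved


if __name__ == "__main__":
    for name, standard in dataset_standards(ET.fromstring(SAMPLE)).items():
        print(name, "->", standard)
```

A check like this is also a cheap way to catch datasets that neither reference a declared standard nor are flagged as non-standard.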



Similarly, code lists could reference standard terms from more than one CDISC controlled terminology version, although I don’t see a reason to use more than one CDISC controlled terminology version in the same CDISC package.


Added Support for Sub-Classes

A new def:SubClass element, a child of def:Class, was added to further classify or group datasets. This is currently only applicable to ADaM. The latest Define-XML controlled terminology version (2022-12-16) defines sub-class terminology for three ADaM classes, as described in Table 2.


Table 2: The New Sub-Class Terminology

Class Sub-Class


Additional Updates

Define-XML 2.1 contains other updates, which are well documented in section 1.1.3, “Relationship to Prior Define-XML Specifications.”

For example, a new attribute, def:HasNoData, can be specified in the metadata for datasets (ItemGroupDef element; see Figure 7) and for variables (ItemRef element; see Figure 8).


Figure 7: HasNoData for Datasets



Figure 8: HasNoData for Variables
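
As an illustration of how the flag might be consumed, the sketch below lists the datasets and variables marked with def:HasNoData="Yes". The fragment is trimmed and not schema-valid, and the OIDs and the choice of CM and VSPOS as the empty items are assumptions made for the example. The function `find_no_data` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}
HAS_NO_DATA = "{http://www.cdisc.org/ns/def/v2.1}HasNoData"

# Trimmed, illustrative fragment: OIDs and flagged items are assumptions.
SAMPLE = """<MetaDataVersion xmlns="http://www.cdisc.org/ns/odm/v1.3"
    xmlns:def="http://www.cdisc.org/ns/def/v2.1" OID="MDV.1" Name="Example">
  <ItemGroupDef OID="IG.CM" Name="CM" def:HasNoData="Yes"/>
  <ItemGroupDef OID="IG.VS" Name="VS">
    <ItemRef ItemOID="IT.VS.VSTESTCD"/>
    <ItemRef ItemOID="IT.VS.VSPOS" def:HasNoData="Yes"/>
  </ItemGroupDef>
</MetaDataVersion>"""


def find_no_data(mdv):
    """Return (dataset names, variable OIDs) flagged as having no data."""
    datasets, variables = [], []
    for group in mdv.findall("odm:ItemGroupDef", NS):
        if group.get(HAS_NO_DATA) == "Yes":
            datasets.append(group.get("Name"))
        for ref in group.findall("odm:ItemRef", NS):
            if ref.get(HAS_NO_DATA) == "Yes":
                variables.append(ref.get("ItemOID"))
    return datasets, variables


if __name__ == "__main__":
    empty_datasets, empty_variables = find_no_data(ET.fromstring(SAMPLE))
    print("Datasets with no data:", empty_datasets)
    print("Variables with no data:", empty_variables)
```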



The addition of the HasNoData attribute contradicts a “historical” recommendation provided in the SDTM IG and in particular the following sentence:

“In the event that no records are present in a dataset (e.g., a small PK study where no subjects took concomitant medications), the empty dataset should not be submitted and should not be described in the Define-XML document. The annotated CRF will show the data that would have been submitted had data been received; it need not be reannotated to indicate that no records exist.”

However, that sentence has been removed in the latest version of the SDTM Implementation Guide (SDTM IG 3.4), aligning it with the recommendation provided in the CDISC SDTM Metadata Submission Guideline v2.0 [5].


Define-XML 2.1 contains some good improvements for both human and machine “readability.”

The ADaM-specific Analysis Results Metadata (ARM) [6] are not yet incorporated in the main Define-XML 2.1. Therefore, when using Define-XML 2.1, you should still reference the ARM standard in the ODM element, as shown in Figure 9.


Figure 9: Referencing the ARM Standard
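
One way to confirm that a define.xml still references ARM is to look at the namespaces declared on the ODM root. The sketch below does that with the standard library. The root element is trimmed to its namespace declarations only (a real ODM element carries further required attributes), and the helper `declared_namespaces` is hypothetical.

```python
import io
import xml.etree.ElementTree as ET

ARM_NS = "http://www.cdisc.org/ns/arm/v1.0"

# Trimmed root element showing only the namespace declarations.
SAMPLE = (
    '<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3" '
    'xmlns:def="http://www.cdisc.org/ns/def/v2.1" '
    'xmlns:arm="http://www.cdisc.org/ns/arm/v1.0"/>'
)


def declared_namespaces(xml_text):
    """Return a {prefix: uri} dict of the namespaces declared in the document."""
    stream = io.BytesIO(xml_text.encode("utf-8"))
    events = ET.iterparse(stream, events=("start-ns",))
    return {prefix: uri for _, (prefix, uri) in events}


if __name__ == "__main__":
    namespaces = declared_namespaces(SAMPLE)
    print("ARM referenced:", namespaces.get("arm") == ARM_NS)
```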



Most off-the-shelf software already has integrated support for the new version of Define-XML, including the necessary metadata. For internally developed software that relies on ad hoc metadata repositories and tools like SAS macros, adopting Define-XML 2.1 may require additional metadata and the deprecation of certain elements from the previous version.

For those planning to move a study that has already started under Define-XML 2.0 to Define-XML 2.1 (migration is not required if the study started before March 15, 2023), migrating from 2.0 to 2.1 is fairly straightforward. That, at least, is our experience so far with a number of sponsors we have supported in adopting or migrating to Define-XML 2.1.


I would like to express my gratitude to Steve Wong from my team for his valuable contributions and efforts in conducting the “investigations” discussed in this article.




[1] In the CDISC and data submission context, the study start date is defined as the date of the first signed informed consent (SSTDTC parameter in TS SDTM dataset).

[2] “CDISC Define-XML 2.1.”

[3] “Define-XML v2.1 CDISC guideline and PHUSE WG,” S. Faini, CDISC Italian UN 2019.

[4] “Upgrading from Define-XML 2.0 to 2.1,” T. Markus, PharmaSUG 2022.

[5] “SDTM Metadata Submission Guidelines v2.0.”

[6] “Analysis Results Metadata (ARM) v1.0 for Define-XML v2.0.”
