DM Domain

The document provides a comprehensive guide on the Study Data Tabulation Model (SDTM) and its Demographics (DM) domain, emphasizing its importance in clinical trials for regulatory compliance, data consistency, and integration. It details the key features of SDTM, including standard domains, clear definitions, and controlled terminology, while highlighting the significance of the DM domain in identifying subjects and providing baseline characteristics. Additionally, the document outlines the required variables within the DM domain and their roles in clinical trial datasets.

Guide to SDTM Demographics (DM) Domain
(Note: SDTMIG v3.3 and v3.4 were referenced in creating this guide.)
Overview of SDTM and Its Purpose in Clinical Trials
What Is SDTM?
The Study Data Tabulation Model (SDTM) is a standardized
framework developed by the Clinical Data Interchange
Standards Consortium (CDISC). Its primary purpose is to
organize and present clinical trial data in a consistent
structure. By using SDTM, clinical trial data becomes easier to
understand, integrate, and submit to regulatory authorities
like the U.S. Food and Drug Administration (FDA) and the
European Medicines Agency (EMA).

Why Is SDTM Important?


Regulatory Compliance:

SDTM helps ensure that data submissions meet regulatory
requirements, allowing for smoother reviews by agencies like
the FDA. Without SDTM, regulatory authorities may reject or
delay the review of clinical trial data due to inconsistencies or
lack of standardization.

Data Consistency:

Standardizing data ensures that all stakeholders, including
sponsors, researchers, and regulators, interpret data in the
same way. This reduces errors and confusion when working
with large datasets from multiple clinical trials.

Integration and Reuse:

SDTM allows datasets from different clinical trials to be easily
combined and analyzed. For example, a pharmaceutical
company studying a new drug can compare data across
multiple studies to assess its safety and efficacy.

Improved Efficiency:

By adhering to SDTM standards, data preparation, analysis,
and submission processes become faster. Researchers no
longer need to reformat or reorganize data to meet specific
requirements.

Key Features of SDTM


Standard Domains:

SDTM organizes data into domains, such as DM
(Demographics), AE (Adverse Events), and LB (Laboratory
Tests). Each domain represents a specific type of data
collected during the trial.

Clear Definitions:

SDTM provides precise definitions for each variable, ensuring
that everyone uses the same terminology and structure.

Traceability:

SDTM datasets are designed to maintain traceability,
meaning it’s easy to track how a variable was derived or
where a specific piece of data originated.

Controlled Terminology:

Many SDTM variables use controlled terminology, that is,
predefined lists of acceptable values (e.g., for SEX, the values
include "M" for Male, "F" for Female, and "U" for Unknown).
This ensures consistency across datasets.
Role and Significance of the DM Domain
What Is the DM Domain?

The Demographics (DM) domain is one of the most
fundamental components of SDTM. It contains key
demographic information about each subject who
participated in the clinical trial. This includes data like the
subject's age, sex, race, and country of participation.
The DM domain serves as the backbone of the clinical trial
dataset, providing the foundation upon which other domains
are built.

Why Is the DM Domain Important?

Identification of Subjects:

Each subject in the study is assigned a unique identifier in the
DM domain (e.g., USUBJID), which links their data across all
other domains.

Without this identifier, it would be impossible to connect
information about a subject’s adverse events (AE), laboratory
tests (LB), or treatments (EX).

Baseline Characteristics:

The DM domain provides baseline demographic details such
as age, sex, and ethnicity, which are crucial for analyzing the
trial results.
For example, researchers may want to compare how a drug
performs across different age groups or between males and
females.

Regulatory Submissions:

Regulatory authorities use the DM domain to assess the
diversity and representation of the study population.

The data helps ensure that the trial included enough
participants from various demographics to draw reliable
conclusions.

Centralized Linking:

The DM domain acts as a central hub, connecting data across
all other SDTM domains.

For instance, a subject’s DM record ensures their adverse
events (AE domain) and medical history (MH domain) are
linked correctly.

Why the DM Domain Is the Foundation of Clinical Trial Datasets
Universal Inclusion

Every subject in the study is represented in the DM domain.
Whether the subject completed the trial, dropped out, or
experienced adverse events, their record remains in the DM
dataset.
Cross-Domain Consistency

The DM domain ensures consistency across all other SDTM
domains. For example:

If a subject’s USUBJID is incorrectly entered in the AE domain,
it won’t match any identifier in the DM domain, and validation
checks will flag the mismatch as an error.
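
A minimal sketch of such a cross-domain check is shown below. The library name sdtm and the dataset names sdtm.dm and sdtm.ae are assumptions made for illustration; validation tools perform equivalent checks, and this is not a substitute for them.

```sas
/* Sketch: list AE records whose USUBJID has no matching DM record. */
/* "sdtm" is an assumed libref pointing at the SDTM datasets.       */
proc sql;
  create table ae_orphans as
  select ae.usubjid, ae.aeseq
  from sdtm.ae as ae
  where ae.usubjid not in (select usubjid from sdtm.dm);
quit;
```

An empty ae_orphans table means every AE record links back to a subject in DM.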

Key for Analysis

Most clinical trial analyses begin with the DM domain
because it provides the context needed to interpret other
datasets. For instance:

A high number of adverse events in older adults may indicate
the drug is less safe for that age group.

Regulatory Requirement

The DM domain is a required component of any SDTM
submission. Regulatory agencies like the FDA use it to verify
the study population and assess the trial’s reliability.
DM DOMAIN ALL VARIABLES

1. STUDYID

Label: Study Identifier


Type: Character
Length: 20
Control Terminology: Not applicable
Role: Identifier
Core: Required (Req)
Comment: Unique identifier for a clinical study.
Origin: Defined by the sponsor.

Logic:

Assign a unique alphanumeric code to represent the study.
For instance, STUDYID = 'ABC123' ensures each study is
uniquely identifiable across the database.

2. DOMAIN

Label: Domain Abbreviation


Type: Character
Length: 2
Control Terminology: Fixed to "DM" for the Demographics domain.
Role: Identifier
Core: Required (Req)
Comment: Represents the dataset the variable belongs to.
Origin: Automatically assigned by the system or predefined.

Logic:

Set as DOMAIN = 'DM' to indicate the dataset’s role within the
SDTM framework.
3. USUBJID

Label: Unique Subject Identifier


Type: Character
Length: 50
Control Terminology: Not applicable
Role: Identifier
Core: Required (Req)
Comment: A globally unique identifier for each subject across all studies.
Origin: Derived from the concatenation of STUDYID, SITEID, and SUBJID.
Logic:

USUBJID = CATX('-', STUDYID, SITEID, SUBJID);

This ensures that the combination of study, site, and subject
ID creates a unique identifier for each participant.

4. SUBJID

Label: Subject Identifier for the Study


Type: Character
Length: 20
Control Terminology: Not applicable
Role: Topic
Core: Required (Req)
Comment: Unique within the study; often the ID recorded on the CRF.
Origin: Defined during data collection.

Logic:

Use the subject’s unique ID as recorded in the case report
forms to maintain consistency.
5. RFSTDTC

Label: Subject Reference Start Date/Time


Type: Character
Length: ISO 8601 format (e.g., YYYY-MM-DD).
Control Terminology: ISO 8601
Role: Record Qualifier
Core: Expected (Exp)
Comment: Typically the date of first exposure to treatment.
Origin: Derived from the Exposure (EX) domain.

Logic:

Assign the earliest EXSTDTC value from the Exposure domain
to identify the initiation of treatment.

6. RFENDTC

Label: Subject Reference End Date/Time


Type: Character
Length: ISO 8601 format
Control Terminology: ISO 8601
Role: Record Qualifier
Core: Expected (Exp)
Comment: The date of the last exposure or end of the trial.
Origin: Derived from EXENDTC or disposition information.

Logic:

Assign the latest EXENDTC value from the Exposure domain
or the trial end date to capture the subject’s end of
participation.
7. RFXSTDTC

Label: Date/Time of First Study Treatment


Type: Character
Length: ISO 8601 format
Control Terminology: ISO 8601
Role: Record Qualifier
Core: Expected (Exp)
Comment: The first exposure to any protocol-specified treatment.
Origin: Derived from EXSTDTC.

Logic:

Assign the earliest EXSTDTC value to identify the start of
study treatment.

8. RFXENDTC

Label: Date/Time of Last Study Treatment


Type: Character
Length: ISO 8601 format
Control Terminology: ISO 8601
Role: Record Qualifier
Core: Expected (Exp)
Comment: The last exposure to any protocol-specified treatment.
Origin: Derived from EXENDTC.

Logic:

Assign the latest EXENDTC value to mark the conclusion of
study treatment.
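
The earliest/latest exposure logic described for RFXSTDTC and RFXENDTC could be sketched as follows. The dataset name sdtm.ex is an assumption, and the sketch also assumes that EXSTDTC and EXENDTC hold complete ISO 8601 dates, so that character ordering matches chronological ordering; partial dates would need extra handling.

```sas
/* Sketch: first and last exposure date per subject from EX.      */
/* Assumes complete ISO 8601 values, which sort chronologically   */
/* as character strings. Rows with a missing EXSTDTC are skipped. */
proc sql;
  create table ex_dates as
  select usubjid,
         min(exstdtc) as rfxstdtc length=19,
         max(exendtc) as rfxendtc length=19
  from sdtm.ex
  where not missing(exstdtc)
  group by usubjid;
quit;
```

The resulting one-row-per-subject table can then be merged onto DM by USUBJID.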
9. RFICDTC

Label: Date/Time of Informed Consent


Type: Character
Length: ISO 8601 format
Control Terminology: ISO 8601
Role: Record Qualifier
Core: Expected (Exp)
Comment: The date when the subject signed the informed consent form.
Origin: Collected from clinical trial documentation.

Logic:

Assign the exact date when informed consent was signed to
confirm the subject’s eligibility to participate.

10. RFPENDTC

Label: Date/Time of End of Participation


Type: Character
Length: ISO 8601 format
Control Terminology: ISO 8601
Role: Record Qualifier
Core: Expected (Exp)
Comment: Date when the subject ended participation in the trial.
Origin: Derived from disposition or follow-up data.

Logic:

Assign the last known contact date or follow-up date to mark
the completion of participation.
11. DTHDTC

Label: Date/Time of Death


Type: Character
Length: ISO 8601 format
Control Terminology: ISO 8601
Role: Record Qualifier
Core: Expected (Exp)
Comment: Date when the subject died.
Origin: Derived from the clinical database.

Logic:

Assign the recorded death date. If unavailable, leave as null to
indicate missing information.

12. DTHFL

Label: Subject Death Flag


Type: Character
Length: 1
Control Terminology: Controlled terms are "Y" (Yes) or null.
Role: Record Qualifier
Core: Expected (Exp)
Comment: Indicates whether the subject has died.
Origin: Derived from DTHDTC.

Logic:

If DTHDTC is populated, set DTHFL = 'Y'; otherwise, leave as
null.
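
A minimal sketch of this rule in a DATA step follows. The input dataset name dm_in is hypothetical. Note that a death may be known even when its date is not, in which case the flag would have to come from another source (for example, disposition data) rather than from DTHDTC alone.

```sas
/* Sketch: set the death flag from the death date.            */
/* dm_in is a hypothetical dataset that already holds DTHDTC. */
data dm_dthfl;
  length dthfl $1;
  set dm_in;
  if not missing(dthdtc) then dthfl = 'Y';
  else dthfl = '';            /* null when no death date is recorded */
run;
```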
13. BRTHDTC

Label: Date/Time of Birth


Type: Character
Length: ISO 8601 format
Control Terminology: ISO 8601
Role: Record Qualifier
Core: Permissible (Perm)
Comment: Subject’s date of birth.
Origin: Collected from the subject’s records.

Logic:

Record the exact date of birth, where collected, to enable age
calculation for analysis.

14. SEX

Label: Sex
Type: Character
Length: 1
Control Terminology: Controlled Terminology (e.g., M, F, U)
Role: Record Qualifier
Core: Required (Req)
Comment: Biological sex of the subject.
Origin: Collected during screening.

Logic:

Use predefined codes for male (M), female (F), or unknown
(U).
15. RACE

Label: Race
Type: Character
Length: 200
Control Terminology: Controlled Terminology
Role: Record Qualifier
Core: Expected (Exp)
Comment: Subject’s racial background.
Origin: Collected during screening or self-reported.

Logic:

Record the race as per the controlled terminology guidelines.

16. ETHNIC

Label: Ethnicity
Type: Character
Length: 200
Control Terminology: Controlled Terminology
Role: Record Qualifier
Core: Permissible (Perm)
Comment: Subject’s ethnicity.
Origin: Collected during screening or self-reported.

Logic:

Use standardized codes or terms to indicate ethnicity.


17. ARM

Label: Description of Planned Arm


Type: Character
Length: 200
Control Terminology: Not applicable
Role: Synonym Qualifier
Core: Expected (Exp)
Comment: Planned treatment or intervention group for the subject.
Origin: Derived from the protocol.

Logic:

Record the treatment group as defined in the study protocol.

18. ARMCD

Label: Planned Arm Code


Type: Character
Length: 20
Control Terminology: Not applicable (sponsor-defined codes)
Role: Record Qualifier
Core: Expected (Exp)
Comment: Code for the planned treatment or intervention group.
Origin: Derived from the protocol.

Logic:

Assign the code representing the planned arm.


19. COUNTRY

Label: Country
Type: Character
Length: 3
Control Terminology: ISO 3166-1 alpha-3
Role: Record Qualifier
Core: Required (Req)
Comment: Country of the study site at which the subject participated.
Origin: Derived from site information.

Logic:

Use the three-letter ISO code for the country.

20. DTHDTC

Label: Date/Time of Death


Type: Character
Length: ISO 8601 format
Control Terminology: ISO 8601
Role: Record Qualifier
Core: Expected (Exp)
Comment: Date of death for the subject, if applicable.
Origin: Derived from adverse events or follow-up data.

Logic:

Record the exact date of death for accurate reporting and
analysis.
21. ETHNIC

Label: Ethnicity
Type: Character
Length: 20
Control Terminology: Controlled by CDISC ethnicity terms
(e.g., "HISPANIC OR LATINO", "NOT HISPANIC OR LATINO").
Role: Record Qualifier
Core: Permissible (Perm)
Comment: Ethnic group of the subject based on protocol requirements.
Origin: Collected during data entry.

Logic:

Assign the ethnicity specified in the CRF or leave as null if
unavailable.

22. ARM

Label: Description of Planned Arm


Type: Character
Length: 20
Control Terminology: Not applicable
Role: Synonym Qualifier
Core: Expected (Exp)
Comment: Describes the planned arm of the subject in
the study (e.g., "Placebo", "Treatment A").
Origin: Derived from the protocol's planned arm assignments.

Logic:

Assign the planned treatment arm for the subject based on
the protocol.
23. ARMCD

Label: Planned Arm Code


Type: Character
Length: 20
Control Terminology: Not applicable
Role: Record Qualifier
Core: Expected (Exp)
Comment: Short code representing the planned arm of
the study (e.g., "PLA", "TRTA").
Origin: Derived from the protocol's planned arm codes.
Logic:

Assign the short code for the planned arm based on the
protocol.

24. ACTARM

Label: Description of Actual Arm


Type: Character
Length: 20
Control Terminology: Not applicable
Role: Synonym Qualifier
Core: Expected (Exp)
Comment: Describes the arm the subject actually followed,
which may differ from the planned arm.
Origin: Derived from the actual arm assignments.

Logic:

Assign the actual arm description the subject participated in
based on the study records.
25. ACTARMCD

Label: Actual Arm Code


Type: Character
Length: 20
Control Terminology: Not applicable
Role: Record Qualifier
Core: Expected (Exp)
Comment: Short code representing the actual arm the subject participated in.
Origin: Derived from the actual arm assignments.

Logic:

Assign the short code for the actual arm the subject was in.

26. COUNTRY

Label: Country of Participation


Type: Character
Length: 3
Control Terminology: ISO 3166-1 alpha-3 country codes (e.g., "USA", "IND").
Role: Record Qualifier
Core: Required (Req)
Comment: Represents the country where the subject participated in the study.
Origin: Derived from site information.

Logic:

Assign the country code corresponding to the subject's
participation site.
27. VISITNUM

Label: Visit Number


Type: Numeric
Length: Integer
Control Terminology: Not applicable
Role: Timing
Core: Required (Req)
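Comment: Indicates the visit sequence number for the study.
Note that VISITNUM is a timing variable used in visit-based
domains (e.g., SV, VS, LB); DM holds one record per subject, and
SDTMIG v3.3 and v3.4 do not list VISITNUM among the DM variables.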
Origin: Defined in the protocol schedule.

Logic:

Assign sequential numbers starting from 1 for each visit.

28. VISIT

Label: Visit Name


Type: Character
Length: 20
Control Terminology: Not applicable
Role: Timing
Core: Required (Req)
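Comment: Describes the name of the visit (e.g., "Screening",
"Baseline"). Like VISITNUM, VISIT belongs to visit-based
domains and is not listed among the DM variables in SDTMIG
v3.3 and v3.4.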
Origin: Derived from the protocol schedule.

Logic:

Assign the visit name based on the visit schedule in the


protocol.
29. VISITDY

Label: Planned Study Day of Visit


Type: Numeric
Length: Integer
Control Terminology: Not applicable
Role: Timing
Core: Expected (Exp)
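Comment: Indicates the planned study day of the visit relative
to the start of treatment. Like VISITNUM, VISITDY belongs to
visit-based domains and is not listed among the DM variables
in SDTMIG v3.3 and v3.4.
Origin: Derived from the protocol schedule.
Logic:

Take the planned study day of the visit from the protocol
schedule, where days are counted relative to the reference
start date (RFSTDTC).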

30. ARMU

Label: Arm Units


Type: Character
Length: 20
Control Terminology: Not applicable
Role: Variable Qualifier
Core: Permissible (Perm)
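Comment: Units of measurement for the treatment arm if
applicable. Note that ARMU is not a variable listed for the DM
domain in SDTMIG v3.3 or v3.4; treat it as sponsor-specific
and confirm against the implementation guide before use.
Origin: Defined in the protocol.

Logic:

Assign the units of measurement or leave as null if not
applicable.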
Structure of the DM Domain
The DM domain consists of one record per subject. It contains
various variables that help in describing each subject’s
baseline characteristics and participation in the study.

Category            | Variable | Core | Description
Identifier          | STUDYID  | Req  | Unique study identifier
Identifier          | DOMAIN   | Req  | Fixed value DM for the DM domain
Identifier          | USUBJID  | Req  | Unique subject identifier built from STUDYID, SITEID, and SUBJID
Timing              | RFSTDTC  | Exp  | Reference start date/time
Timing              | RFENDTC  | Exp  | Reference end date/time
Demographics        | AGE      | Exp  | Age of the subject at the start of the study
Demographics        | SEX      | Req  | Sex of the subject (M, F, U)
Demographics        | RACE     | Exp  | Race of the subject based on controlled terminology
Demographics        | ETHNIC   | Perm | Ethnicity of the subject (HISPANIC OR LATINO, NOT HISPANIC OR LATINO)
Trial Participation | ARM      | Exp  | Name of the treatment group (arm)
Trial Participation | ARMCD    | Exp  | Code for the treatment arm (P, TA, etc.)

Key Variables in the DM Domain


1. Identifier Variables

STUDYID: Identifies the clinical study.
DOMAIN: Always set to DM for the Demographics domain.
USUBJID: A unique identifier for each subject, typically
created by concatenating the study ID, site ID, and subject ID.
SUBJID: A unique identifier used within the study.
2. Timing Variables

RFSTDTC: Reference start date/time: sponsor-defined, often the
date of the subject's first exposure to study treatment.
RFENDTC: Reference end date/time: usually the last date
the subject participated in the study.
RFXSTDTC: Date/time of first study treatment.
RFXENDTC: Date/time of last study treatment.

3. Demographic Variables

AGE: Age of the subject at the time of study entry.
AGEU: Units for age (e.g., YEARS, MONTHS).
SEX: Sex of the subject.
RACE: Race of the subject, based on a predefined list of
categories.
ETHNIC: Ethnicity of the subject.

4. Trial Participation Variables

ARM: The name of the treatment or intervention group
assigned to the subject.
ARMCD: A code for the treatment group.
ACTARM: Actual treatment arm the subject followed, which
may differ from the planned arm.
ACTARMCD: Code for the actual arm.

5. Stratification and Group Variables

BRTHDTC: Birth date of the subject.
DTHFL: Death flag indicating if the subject has died
during the trial.
DTHDTC: Date of death, if applicable.
SITEID: The site identifier for where the subject was
enrolled.
COUNTRY: Country of the subject’s participation.
Deriving Variables in the DM Domain
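1 Unique Subject Identifier (USUBJID)

The USUBJID combines the STUDYID, SITEID, and SUBJID to create
a unique identifier for each subject across the study. The exact
components are a sponsor convention; what matters is that the
result is unique and is applied consistently in every domain.

Example:
USUBJID = catx("-", STUDYID, SITEID, SUBJID);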

2 Age Calculation and Units (AGE, AGEU)

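The AGE variable is typically derived based on the difference
between the subject’s birth date (BRTHDTC) and the reference
start date (RFSTDTC). INTCK('year', ...) with its default method
counts calendar-year boundaries crossed rather than completed
years, so the example uses YRDIF with the 'AGE' basis instead.
Example:
/* Completed years between birth date and reference start date */
AGE = floor(yrdif(input(BRTHDTC, yymmdd10.), input(RFSTDTC, yymmdd10.), 'AGE'));
AGEU = "YEARS";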

3 Trial Arm Assignment (ARM, ARMCD)

The planned treatment arm (ARM) can be directly mapped from
the trial protocol. If the subject receives a different arm due to
protocol deviations, the ACTARM and ACTARMCD should reflect
the actual treatment received.

4 Reference Dates (RFSTDTC, RFENDTC)

The RFSTDTC and RFENDTC fields capture the dates that mark
the start and end of a subject’s participation. These are often
derived from the trial database.
Controlled Terminology
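Variable | Value                     | Description
SEX      | M                         | Male
SEX      | F                         | Female
SEX      | U                         | Unknown
RACE     | ASIAN                     | Asian
RACE     | BLACK OR AFRICAN AMERICAN | Black or African American
RACE     | WHITE                     | White
RACE     | OTHER                     | Other
ETHNIC   | HISPANIC OR LATINO        | Hispanic or Latino
ETHNIC   | NOT HISPANIC OR LATINO    | Not Hispanic or Latino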

Common Issues and Solutions


1 Missing Data

If demographic variables like RACE or SEX are missing, use the
appropriate controlled term for unknown values (U for SEX,
UNKNOWN for RACE) where appropriate.
For age, if AGE is missing, it may need to be derived from other
available data, such as the birth date and trial reference dates.

2 Incorrect Date Formats

Dates should be in ISO 8601 format (YYYY-MM-DD). If the date is
provided in another format, ensure it is properly converted before
populating the RFSTDTC or RFENDTC variables.
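
As an illustration, the sketch below converts a raw date in DATE9. style to ISO 8601. The variable name raw_date and its value are hypothetical; the informat would need to match whatever format the source data actually uses.

```sas
/* Sketch: convert a raw DDMONYYYY date string to ISO 8601. */
data dates_iso;
  length rfstdtc $10;
  raw_date = '15JAN2024';                 /* hypothetical raw value */
  sas_date = input(raw_date, ?? date9.);  /* ?? suppresses invalid-data messages */
  if not missing(sas_date) then
    rfstdtc = put(sas_date, yymmdd10.);   /* yields 2024-01-15 */
run;
```

Values that cannot be read by the informat leave RFSTDTC blank, so they can be reviewed rather than silently mis-converted.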

3 Handling Multiple Arms

If a subject is assigned to more than one treatment arm over the
course of the study, ensure ARM and ACTARM accurately reflect this
with appropriate codes and descriptions.
Best Practices for Creating the DM
Domain

Traceability: Ensure clear documentation of data sources and
transformations from raw data to SDTM-compliant variables.

Controlled Terminology: Always use the appropriate controlled
terminology for variables like SEX, RACE, and ETHNIC.

Data Validation: Validate the dataset using tools like
Pinnacle 21, which check compliance with SDTM standards.

Subject-Level Data Integrity: Ensure no duplicate USUBJID values
and confirm that all required variables are populated.

Proper Formatting: Ensure the date variables are in ISO 8601 format
and adhere to the CDISC SDTM Implementation Guide.
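
The subject-level integrity checks above could be sketched as follows. The dataset name sdtm.dm is an assumption, and the list of required variables reflects the Req variables in SDTMIG v3.3/v3.4 (STUDYID, DOMAIN, USUBJID, SUBJID, SITEID, SEX, COUNTRY); these checks complement a validation tool rather than replace it.

```sas
/* Sketch 1: USUBJID values that appear more than once in DM. */
proc sql;
  create table dup_usubjid as
  select usubjid, count(*) as n_records
  from sdtm.dm
  group by usubjid
  having count(*) > 1;
quit;

/* Sketch 2: DM records with any Required variable missing.   */
/* CMISS counts missing character or numeric arguments.       */
data dm_missing_req;
  set sdtm.dm;
  if cmiss(studyid, domain, usubjid, subjid, siteid, sex, country) > 0;
run;
```

Both output tables should be empty for a clean DM dataset.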
Example SAS Code for DM Domain Creation
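
The following is a minimal sketch of how the pieces described in this guide might fit together, not a complete or validated program. The inputs are hypothetical: raw.demog is assumed to hold one row per subject (studyid, siteid, subjid, brthdtc, sex, race, country, arm, armcd), and ex_dates is assumed to hold one row per subject with usubjid, rfxstdtc, and rfxendtc. Setting the reference dates equal to the first and last exposure dates is one common convention, assumed here.

```sas
/* Step 1: build identifiers from the raw demographics data. */
data dm_base;
  set raw.demog;                          /* hypothetical raw dataset */
  length domain $2 usubjid $50;
  domain  = 'DM';
  usubjid = catx('-', studyid, siteid, subjid);
run;

proc sort data=dm_base;  by usubjid; run;
proc sort data=ex_dates out=ex_sorted; by usubjid; run;

/* Step 2: add reference dates and derive age. */
data dm;
  merge dm_base(in=in_dm) ex_sorted;
  by usubjid;
  if in_dm;                               /* keep every DM subject */
  length rfstdtc rfendtc $19 ageu $10;
  rfstdtc = rfxstdtc;                     /* assumed convention */
  rfendtc = rfxendtc;

  birth_dt = input(brthdtc, ?? yymmdd10.);
  ref_dt   = input(rfstdtc, ?? yymmdd10.);
  if not missing(birth_dt) and not missing(ref_dt) then do;
    age  = floor(yrdif(birth_dt, ref_dt, 'AGE'));   /* completed years */
    ageu = 'YEARS';
  end;
  drop birth_dt ref_dt;
run;
```

Subjects without exposure records keep blank reference dates and a missing AGE, which makes them easy to review separately.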
Variable Metadata for DM Domain (VARNUM Order)

Thank you for exploring this guide to the SDTM DM
Domain. I hope it has provided valuable insights and
practical knowledge to help you navigate and
implement the DM domain effectively.

If you have any questions or feedback, feel free to
connect with me on LinkedIn or reach out directly.
Together, we contribute to advancing clinical trial
data quality and global healthcare.

Wishing you success!

Saurabh Patil
Clinical SAS Programmer
