Introduction

Clinical trials have historically lacked equitable representation of people identifying as women and members of racial or ethnic minority groups1. Recognizing the issue, the National Institutes of Health (NIH), World Health Organization (WHO), and the United States (US) Food and Drug Administration (FDA) have improved reporting and inclusion of minorities, aiming for medical research to better reflect the shifting US demographics2,3,4,5,6. Nevertheless, significant disparities in representation persist1,7,8,9.

Prior systematic reviews and meta-analyses have analyzed gender, ethnic, and racial demographics in clinical trials for specific disease areas (e.g., glaucoma, acute coronary syndrome, rheumatoid arthritis, dementia, congestive heart failure, cardiovascular disease, oncology, dyslipidemia), as well as for trials sponsored by select pharmaceutical groups1,10,11,12,13,14,15,16,17,18,19,20. However, data remain sparse on the inclusion of gender, ethnic, and racial groups in US trials overall, as well as by study phase, size, institutional status, and masking, and on trends in representation over time. Furthermore, past systematic reviews have included multi-institutional studies with international locations, which can limit their ability to accurately reflect US demographics10,15,21.

In addition, despite policies that seek to address enrollment and recruitment in clinical trials, longitudinal data regarding inclusion of women and minorities in trials overall have not been assessed since the passage of these initiatives1,2,3,4,5,6. Using available ClinicalTrials.gov demographic data, our study assessed whether adult women and minorities were underrepresented in US phase 2 and 3 randomized clinical trials between 2008 and 2019, comparing demographic proportions overall and within disease categories (e.g., psychiatric disorders, obstetric/gynecologic, neurological, cardiovascular), by study phase, trial quality tier, institutional status, level of masking, and study start year.

Methods

Using the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines and the Cochrane Handbook for Systematic Reviews of Interventions, we conducted a systematic review through MEDLINE, Embase, Cochrane Central Register of Controlled Trials (CENTRAL), and ClinicalTrials.gov (Fig. 1).

Figure 1 PRISMA flowchart.

Data collection

Clinical trials were identified utilizing the US Clinical Trial Registry website (https://clinicaltrials.gov/). Clinical trials (N = 5388) were reviewed based on the following criteria: study type (interventional and randomized), minimum participant age (18 years old), study location (US, including Puerto Rico), phase (2 and/or 3), study status (completed or active/not recruiting new participants), start date (01/01/2008 to 12/30/2019), and results date (on or before 11/18/2020). Each clinical trial was screened independently by two reviewers. Multi-center studies including international locations were excluded. Discrepancies between reviewers for inclusion of specific trials were resolved through discussion and a third reviewer.
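The screening criteria above can be expressed as a single predicate. The following is a minimal sketch only: the `TrialRecord` type, its field names, and the status strings are hypothetical and do not reflect the registry's actual schema, and the minimum-age criterion is assumed to mean a minimum eligible age of at least 18 years.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Optional

@dataclass
class TrialRecord:
    """Hypothetical trial record; field names are illustrative only."""
    study_type: str               # e.g. "interventional"
    randomized: bool
    min_age_years: int            # minimum eligible participant age
    countries: FrozenSet[str]     # every country with a study site
    phases: FrozenSet[int]        # e.g. frozenset({2}) or frozenset({2, 3})
    status: str                   # e.g. "completed"
    start_date: date
    results_date: Optional[date]  # None if no results were posted

US_LOCATIONS = frozenset({"United States", "Puerto Rico"})
ALLOWED_STATUS = {"completed", "active, not recruiting"}

def meets_inclusion_criteria(t: TrialRecord) -> bool:
    """Return True if the trial satisfies every screening criterion."""
    return (
        t.study_type == "interventional"
        and t.randomized
        and t.min_age_years >= 18
        # all sites must be in the US; any international site excludes the trial
        and bool(t.countries) and t.countries <= US_LOCATIONS
        and bool(t.phases & {2, 3})
        and t.status in ALLOWED_STATUS
        and date(2008, 1, 1) <= t.start_date <= date(2019, 12, 30)
        and t.results_date is not None
        and t.results_date <= date(2020, 11, 18)
    )
```

In the study itself, screening was performed manually by two independent reviewers; a predicate like this would only pre-filter candidates for that review.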

Data were then extracted from the clinical trials that met inclusion criteria (N = 2977). Extraction was performed independently by two reviewers and then compared for consensus. Data extracted from each trial included: start date, phase, level of masking, total number of participants, multi-center status (multi-center or single institution), number of participants of each gender (male, female, or transgender), ethnicity group (Hispanic or non-Hispanic), and racial group (American Indian and Alaskan Native [AIAN], Asian, Native Hawaiian and Pacific Islander [NHPI], Black and African American [Black], White, or more than one race [multiracial]), as well as whether race and ethnicity data were reported by the trial. Studies that were characterized as combined phases were categorized to the higher phase (i.e., trials having completed phases 1 and 2 were categorized as phase 2; trials having completed phases 2 and 3 were categorized as phase 3)21.
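The rule for combined-phase trials can be sketched as follows. The function name and the label format (for example "Phase 1/Phase 2") are assumptions made for illustration; the only rule taken from the text is that a combined-phase trial is assigned to its higher phase.

```python
import re

def categorize_phase(phase_label: str) -> int:
    """Map a phase label to a single phase, choosing the higher phase
    when the label lists more than one (assumed label format)."""
    numbers = [int(n) for n in re.findall(r"\d", phase_label)]
    if not numbers:
        raise ValueError(f"no phase number found in {phase_label!r}")
    return max(numbers)
```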

A two-tier assessment was performed for each trial based on the following criteria (one point for each parameter): (1) multi-center; (2) ≥ 200 participants; (3) reports ethnicity; (4) reports race. Trials scoring 0, 1, or 2 points were assigned as tier 1. Trials scoring 3 or 4 points were assigned as tier 2. Tier assessment was based on Cochrane Library guidelines and the Hoy Risk of Bias Tool, but modified in that tier assigned was based on objective data available from ClinicalTrials.gov22,23. Discrepancies in individual assessments of tier were resolved through discussion and involvement of a third reviewer.
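The scoring rule described above is simple enough to state as code. This is a minimal sketch with a hypothetical function name and parameter names; the four criteria and the cut-off between tiers are taken directly from the text.

```python
def assign_tier(multi_center: bool, n_participants: int,
                reports_ethnicity: bool, reports_race: bool) -> int:
    """Score one point per criterion met; 0-2 points is tier 1, 3-4 points is tier 2."""
    score = sum([
        multi_center,
        n_participants >= 200,
        reports_ethnicity,
        reports_race,
    ])
    return 2 if score >= 3 else 1
```

For example, a multi-center trial with 250 participants that reports ethnicity but not race scores 3 points and falls in tier 2.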

Trial participants labeled as unknown, other, missing, or not reported within the gender (individual participants, n = 190), race (n = 7437), and ethnicity (n = 2468) categories were excluded from each group’s total proportion. Because only 0.10% of trials (N = 3) included gender minorities, transgender trial participants (n = 5) were excluded from each trial’s total gender proportion for statistical analysis.
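One reading of the exclusion rule above is that unclassified participants are removed from the denominator before each group's proportion is computed. The sketch below illustrates that reading only; the function name, the label set, and the `also_exclude` parameter are hypothetical.

```python
UNCLASSIFIED = {"unknown", "other", "missing", "not reported"}

def group_proportions(counts, also_exclude=()):
    """Compute per-group proportions for one trial after dropping
    unclassified participants (and any extra excluded labels) from the total."""
    dropped = UNCLASSIFIED | {label.lower() for label in also_exclude}
    known = {g: n for g, n in counts.items() if g.lower() not in dropped}
    total = sum(known.values())
    if total == 0:
        return {}  # nothing classifiable was reported for this trial
    return {g: n / total for g, n in known.items()}
```

Under this reading, a trial reporting 30 female, 60 male, and 10 unknown participants yields a female proportion of 30/90 rather than 30/100.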

Statistical analysis

For full dataset analysis, as well as disease strata, we compared trial median proportions for gender, race, and ethnicity to 2010 official US Census proportions matched by age (≥ 18 years old) using Wilcoxon rank sum analysis24. For analysis of gender, random effects modeling meta-analyses were performed for female representation overall and for female representation excluding disease categories with significant gender skew. A secondary analysis was performed comparing gender, race, and ethnicity proportions by trial phase, multi-institutional status, trial tier, masking, and study start year. Trial median proportions for gender, ethnicity, and race trended by study start year were compared to the reference year of 2008 using Wilcoxon rank sum analysis and random effects modeling for the meta-analysis. Funnel plots were developed to examine scatter patterns of trial proportions surrounding the summary proportion by number of trial participants for gender, ethnicity, and race. All analyses were conducted via the R Statistical Software (R Foundation for Statistical Computing, Vienna, Austria). The systematic review and meta-analysis were registered on PROSPERO, with the identifier of CRD42021238101.
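To make the random-effects pooling step concrete, the following sketch pools trial-level proportions with the DerSimonian-Laird estimator on the logit scale. The text does not state which estimator or transformation was used, and the analysis itself was run in R, so the estimator, the logit transform, the 0.5 continuity correction, and the function name are all assumptions made for illustration.

```python
import math

def pooled_proportion_random_effects(events, totals):
    """Pool proportions across trials (DerSimonian-Laird, logit scale).

    events[i] is the number of participants in the group of interest in
    trial i and totals[i] is that trial's classified total.
    Returns (estimate, lower 95% bound, upper 95% bound) as proportions.
    """
    y, v = [], []
    for e, n in zip(events, totals):
        # continuity correction keeps the logit finite when e == 0 or e == n
        if e in (0, n):
            e_adj, n_adj = e + 0.5, n + 1.0
        else:
            e_adj, n_adj = float(e), float(n)
        y.append(math.log(e_adj / (n_adj - e_adj)))       # logit of the proportion
        v.append(1.0 / e_adj + 1.0 / (n_adj - e_adj))     # approximate variance of the logit

    w = [1.0 / vi for vi in v]                            # fixed-effect weights
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0  # between-trial variance

    w_re = [1.0 / (vi + tau2) for vi in v]                # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def inv_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se)
```

The returned interval can then be set beside a reference value such as a Census proportion; the significance tests reported in this study were Wilcoxon rank sum comparisons, which this sketch does not implement.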

Results

Gender

Three of the 2977 clinical trials (0.10%) reported inclusion of transgender participants (total participants, n = 5). These trials examined treatment of major depressive disorder, anal neoplasms, and human immunodeficiency virus (HIV).

Female representation

After excluding disease categories with significant gender recruitment skew (e.g., pregnancy, prostate cancer), females were underrepresented (48.3%, 95% CI 47.2–49.3) overall when compared to the US Census proportion (51.5%, p < 0.0001). However, between 2008 and 2018 representation increased (p = 0.0005), with females being overrepresented in 2018 (64.0%, 95% CI 56.5–71.2; p = 0.0012) (Fig. 2).

Figure 2 Female and Male Representation Trended Between 2008–2019. Median proportions of gender groups (%) by year (N = 2693). The dashed line is the male proportion from Census 2010. *P < 0.05 (Wilcoxon rank sum test comparing annual female proportion with the 2008 clinical trial proportion).

When examining clinical trial phases, females were underrepresented in phase 2 (46.2%, 95% CI 44.9–47.5; p < 0.0001), yet accurately represented in phase 3 (52.3%, 95% CI 50.5–54.0; p = 0.41), with the proportion significantly increasing from phase 2 to phase 3 (p < 0.0001). Regardless of a trial’s institutional status, females were underrepresented in both single-institution (47.2%, 95% CI 45.7–48.7; p < 0.0001) and multi-institutional (49.2%, 95% CI 47.8–50.6; p = 0.0017) trials, with similar representation in the two groups (p = 0.065). Tier 2 trials exhibited appropriate representation (50.5%, 95% CI 48.5–52.5; p = 0.33), unlike tier 1 trials (47.2%, 95% CI 45.7–48.7; p < 0.0001). For masking status, females were underrepresented in trials with no (42.6%, 95% CI 40.2–45.0; p < 0.0001), single (46.6%, 95% CI 42.8–50.4; p = 0.012), and quadruple (48.7%, 95% CI 46.8–50.6; p = 0.0032) masking, while accurately represented in trials with double (51.0%, 95% CI 48.9–53.1; p = 0.63) and triple (50.0%, 95% CI 47.7–52.4; p = 0.22) masking. Relative to trials with no masking, female proportions were greater in trials with double (p < 0.0001), triple (p = 0.0001), and quadruple (p = 0.0001) masking (Table 1).

Table 1 Female proportion estimates overall, by phase, multi-institutional status, tier assessment, masking, and year. 284 studies from diseases related to prostate cancer (54 trials), breast cancer (60 trials), gynecologic cancer (21 trials), female genitourinary diseases (73 trials), male genitourinary diseases (36 trials), and pregnancy (40 trials) were excluded from this analysis (N = 2693).

Ethnicity

Of the 2977 trials, 35.7% (N = 1062) reported ethnicity, with 0.4% of participants (n = 2468) having their ethnicity reported as unknown.

Hispanic

In trials reporting ethnicity, Hispanics were underrepresented (11.6%, 95% CI 10.8–12.4; p < 0.0001) overall, relative to the Census proportion (14.2%) (Table 2). Yet between 2008 and 2016 representation significantly increased (p < 0.0001), with Hispanics overrepresented in trials started in 2016 (18.7%, 95% CI 15.8–21.7; p = 0.0003) (Fig. 3A).

Table 2 Hispanic proportion estimates overall, by phase, multi-institutional status, tier assessment, masking, and year.

Figure 3 Ethnicity and Race Trended Between 2008–2019. Median proportions (%) of Hispanics (N = 1062) and race categories (N = 1589) by year, with the x-axis representing the year (2008–2019) and the y-axis the ethnicity/race proportion (%). (A): Hispanics, *P < 0.05 (Wilcoxon rank sum test comparing annual proportion relative to 2008 clinical trial proportion); (B): Blacks; (C): Whites; (D): American Indians and Alaskan Natives; (E): Asians; (F): Native Hawaiians and Pacific Islanders; (G): Multiracial individuals. The dashed line represents the corresponding demographic proportion from Census 2010.

While Hispanics were underrepresented in phase 2 trials (10.6%, 95% CI 9.63–11.6; p < 0.0001), the proportion significantly increased from phase 2 to phase 3 (p = 0.0012), with appropriate representation in phase 3 (13.5%, 95% CI 12.0–15.0; p = 0.972). For institutional status, there was underrepresentation in both single-institution (10.8%, 95% CI 9.51–12.2; p < 0.0001) and multi-institutional (12.1%, 95% CI 11.0–13.1; p = 0.0093) trials. Similarly, regardless of trial tier, Hispanics were underrepresented in both tier 1 (10.3%, 95% CI 8.96–11.7; p < 0.0001) and tier 2 (12.2%, 95% CI 11.2–13.3; p = 0.017), yet representation was greater in tier 2 than in tier 1 (p = 0.03). For masking, Hispanics were underrepresented in trials with no (10.3%, 95% CI 8.61–12.1; p = 0.0008), double (11.0%, 95% CI 9.42–12.7; p = 0.0039), and quadruple masking (11.9%, 95% CI 10.4–13.4; p = 0.041), while accurately represented in trials with single (11.1%, 95% CI 8.17–14.3; p = 0.14) and triple masking (13.6%, 95% CI 11.6–15.7; p = 0.93).

Race

From the 2977 clinical trials, 53% reported race (N = 1589), with 1% of participants having their race reported as unknown (n = 7437). Between 2008 and 2019, the proportions of all racial strata did not significantly change (Table 27).

American Indian and Alaska Native

AIAN were underrepresented (0.19%, 95% CI 0.15–0.23; p < 0.0001) in trials overall, relative to the Census proportion (1.10%) (Table 3). AIAN were underrepresented in both phase 2 (0.13%, 95% CI 0.10–0.18; p < 0.0001) and phase 3 (0.30%, 95% CI 0.23–0.38; p < 0.0001) trials, yet representation increased from phase 2 to phase 3 (p < 0.0001). While AIAN were underrepresented regardless of the institutional status of the trial, representation was significantly greater (p < 0.0001) in multi-institutional (0.26%, 95% CI 0.21–0.31) than in single-institution (0.09%, 95% CI 0.10–0.18) studies. For trial tier, AIAN were underrepresented in both tier 1 (0.09%, 95% CI 0.05–0.13; p < 0.0001) and tier 2 trials (0.29%, 95% CI 0.23–0.35; p < 0.0001); however, representation was significantly greater in tier 2 than in tier 1 (p < 0.0001). Regardless of masking status, AIAN remained underrepresented (p < 0.0001), with similar representation across all degrees of masking.

Table 3 American Indian and Alaska Native proportion estimates overall, by phase, multi-institutional status, tier assessment, masking, and year.

Asian

Relative to the Census (5.01%), Asians were underrepresented (1.27%, 95% CI 1.13–1.42; p < 0.0001) in clinical trials overall, regardless of trial phase, institutional status, tier, and masking classification (Table 4). While representation was similar between phase 2 and 3 (p = 0.98), as well as single and multi-institutional status (p = 0.31), trials classified as tier 2 exhibited greater representation than tier 1 (p = 0.0045), and in trials with triple masking representation was greater than those with no masking (p = 0.045).

Table 4 Asian proportion estimates overall, by phase, multi-institutional status, tier assessment, masking, and year.

Native Hawaiian and Pacific Islander

NHPI were overrepresented (0.76%, 95% CI 0.71–0.82; p < 0.0001) in trials overall (Census: 0.20%), regardless of phase, institutional status, tier, and masking classification (Table 5). The NHPI proportion was significantly lower in phase 3 versus 2 (p < 0.0001), multi-institutional versus single (p < 0.0001), and in tier 2 versus 1 (p < 0.0001). NHPI proportion was significantly greater in trials with single masking, relative to none (p = 0.017).

Table 5 Native Hawaiian and Pacific Islander proportion estimates overall, by phase, multi-institutional status, tier assessment, masking, and year.

Black

In relation to the Census proportion (12.3%), Blacks were overrepresented (17.0%, 95% CI 15.9–18.1; p < 0.0001) overall in clinical trials, regardless of phase, institutional status, trial tier, and masking classification (p < 0.0001) (Table 6). Single institutional trials had greater representation than multi-institutional (p = 0.0002). Meanwhile, compared with no masking, representation was significantly greater in trials with single (p = 0.0005), triple (p = 0.0065), and quadruple (p = 0.0077) masking.

Table 6 Black proportion estimates overall, by phase, multi-institutional status, tier assessment, masking, and year.

White

Overall, Whites were underrepresented in clinical trials (77.6%, 95% CI 76.4–78.8; p < 0.0001) when compared to the Census proportion (79.8%), irrespective of trial phase or tier (Table 7). However, multi-institutional trials exhibited appropriate representation (80.0%, 95% CI 78.5–81.5; p = 0.52), unlike single-institutional trials, where Whites were underrepresented (74.1%, 95% CI 74.1–76.1; p < 0.0001). Whites were also underrepresented in trials with single (71.8%, 95% CI 66.7–76.6; p = 0.0003), triple (74.0%, 95% CI 71.0–76.9; p < 0.0001), and quadruple masking (77.2%, 95% CI 75.0–79.4; p = 0.0031), while they were accurately represented in trials with no (81.6%, 95% CI 79.1–83.9; p = 0.396) and double masking (78.7%, 95% CI 76.3–81.0; p = 0.136).

Table 7 White proportion estimates overall, by phase, multi-institutional status, tier assessment, masking, and year.

Multiracial

Relative to the Census (1.56%), multiracial participants were underrepresented (0.25%, 95% CI 0.21–0.31; p < 0.0001) in clinical trials overall, regardless of trial phase, institutional status, tier, and masking classification (Table 8). Of note, multiracial representation was greater in trials with triple masking than in trials with no masking (p = 0.025). Meanwhile, multiracial representation was significantly lower in multi-institutional trials than in single-institutional trials (p = 0.0001).

Table 8 Multiracial proportion estimates overall, by phase, multi-institutional status, tier assessment, masking, and year.

Representation by disease strata: gender, ethnicity, and race

The 2977 clinical trials were also stratified by 44 disease categories and subcategories to examine variations in representation by gender, ethnicity, and race (Table 9; Supplemental Table 1).

Table 9 Representation of Gender, Ethnicity, and Race in Clinical Trials by Disease Categories and Sub-Categories.

Females were underrepresented in 18 disease strata and overrepresented in ten, relative to the Census proportion (51.5%, p < 0.05). Hispanics were underrepresented in 18 disease strata and overrepresented in three (p < 0.05). Asians were underrepresented in 36 disease strata, appropriately represented in eight, and overrepresented in none. AIAN were underrepresented in 32 disease strata and overrepresented in one (p < 0.05). Multiracial participants were underrepresented in 25 disease strata (p < 0.05) and overrepresented in five. Whites were underrepresented in 17 disease strata and overrepresented in eight. Blacks were underrepresented in four disease strata and overrepresented in 20. NHPI were overrepresented in 37 disease strata (p < 0.05) and statistically underrepresented in none.

Discussion

Over the last several decades, the issue of equitable clinical trial recruitment of women and minorities in the United States has garnered various degrees of attention2,3,4,7. In 1993, Congress passed the NIH Revitalization Act, which mandated inclusion of women and racial/ethnic minority groups in clinical trials4,7. The policy was then updated in 2000, 2001, and 2017 to require standardized minimum inclusion of sex, gender, and racial/ethnic minority groups in phase 3 clinical trials, with mandated reporting of demographic data to ClinicalTrials.gov2,3,4.

Despite FDA recommendations, our results indicate that many studies did not comply with guidance on reporting demographic characteristics. Failure to report race and ethnicity data was prevalent in US clinical trials conducted between 2008 and 2019, a phenomenon reported in other reviews1,21. Likewise, the inclusion of sexual and gender minorities in clinical trials is nearly non-existent25,26,27,28. The lack of inclusion may be explained by incomplete and ambiguous gender reporting on clinical trial recruitment servers, resulting in difficulty or failure in recruiting from the population of sexual and gender minorities29,30.

The trials that did report demographics did not accurately represent the nation's demographics2,3,4. When trials with significant gender skew were excluded from analysis, females remained underrepresented, a historically consistent observation31. The disparity likely arose from a combination of research bias and categorization of women as a vulnerable population32,33,34,35. Nevertheless, there was a significant improvement in female representation from 2008 to 2018.

While female representation improved with time, such was not the case for underrepresented racial/ethnic groups, including Hispanics, AIAN, Asians, and multiracial populations. Low levels of minority enrollment can potentially be explained by historical racial injustices, subject burden (e.g., transportation limitations, perceived interference with work/family obligations), lower socioeconomic status, communication barriers, and divergent cultural attitudes between investigators and participants36,37,38,39,40,41,42.

In contrast, Blacks and NHPI were found to be overrepresented in clinical trials overall. These trends corroborate findings that people of color are more willing to participate in trials than commonly perceived, and as willing as Whites43,44,45,46. Meanwhile, NHPI overrepresentation potentially reflects magnification of a small population, with participants likely enrolling in trials in regions with significant NHPI density (e.g., Hawaii)47.

When examining clinical trial phase, females, Hispanics, and AIAN all exhibited greater representation in phase 3 than in phase 2. Phase 3 trials may more readily attain diversity, as these investigations are typically more robust, with larger financial resources and up to 1000 patients, whereas phase 2 trials may have around a hundred participants48,49. On the other hand, the lower female proportion in phase 2 trials may arise because phase 2 investigations often exclude participants on the basis of child-bearing potential50.

Hispanic, AIAN, Asian, and White groups had greater representation in tier 2 than in tier 1 trials, a trend possibly explained by increased trial size and multi-regionality. The difference in racial/ethnic representation by trial size and multi-center status potentially highlights a tendency for minorities to be differentially recruited based on trial characteristics, an issue raised in prior literature1.

When stratifying clinical trials by disease categories, our results suggested recruitment patterns often paralleled the baseline demographics of the particular illness. For instance, males were overrepresented in clinical trials investigating infectious diseases (e.g., HIV, hepatitis C), schizophrenia, cardiovascular diseases, stroke, and diabetes, while females were overrepresented in trials of musculoskeletal, gastrointestinal, obesity, and depression/mental health disorders51,52,53,54,55,56,57,58,59,60,61,62,63,64,65. Regarding race and ethnicity, representation paralleling the disease demographics was observed with overrepresentation of: Hispanics in diabetes and renal trials; AIAN in substance use disorder trials; Blacks in trials of infectious disease (e.g., HIV and hepatitis C), hypertension, stroke, obesity, hematology, musculoskeletal, and renal disease; and Whites in gastrointestinal trials66,67,68,69,70,71,72,73,74,75.

Limitations

Overall, the findings should be considered in the context of several limitations. First, given non-compliance with data reporting on ClinicalTrials.gov, our investigation was unable to appropriately conduct analyses stratified by age, and studies omitted from the meta-analysis may have exhibited demographic proportions divergent from the observed trends. Second, reporting of race on ClinicalTrials.gov occasionally utilized non-standard categorization, requiring inference of race or exclusion of the data. Furthermore, given government policies to enhance reporting of race/ethnicity over the years for phase 3 clinical trials, some of the trends observed may represent improved reporting rather than changes in demographic representation. Finally, when examining funnel plots for gender, race, and ethnicity (Fig. 4), there appears to be a potential bias in which the sample size of the study influences the multiracial and NHPI proportions.

Figure 4 Funnel plots: Proportions of the subgroups (%) by sample size. The horizontal line is the summary proportion.

Conclusion

The results of this study indicate persistence of gender, ethnic, and racial disparities in phase 2 and 3 randomized clinical trial recruitment of US adults. While representation of women and Hispanics improved over the 2008 to 2019 study period, and Blacks and NHPI were generally overrepresented, the overall representation of several racial minorities (Asians, AIAN, and multiracial individuals) has remained static, despite systems-based initiatives aimed at improving diversity. Overall, randomized clinical trials may not reflect the demographics of the populations they are intended to serve.