Demographic recruitment bias of adults in United States randomized clinical trials by disease categories between 2008 to 2019: a systematic review and meta-analysis

To promote health equity within the United States (US), randomized clinical trials should strive for unbiased representation. Thus, there is impetus to identify demographic disparities overall and by disease category in US clinical trial recruitment, by trial phase, level of masking, and multi-center status, relative to national demographics. A systematic review and meta-analysis were conducted using MEDLINE, Embase, CENTRAL, and ClinicalTrials.gov, covering 01/01/2008 to 12/30/2019. Clinical trials (N = 5,388) were identified based on the following inclusion criteria: study type, location, phase, and participant age. Each clinical trial was independently screened by two researchers. Data were pooled using a random-effects model. Median proportions for gender, race, and ethnicity of each trial were compared to the 2010 US Census proportions, matched by age. A second analysis was performed comparing gender, race, and ethnicity proportions by trial phase, multi-institutional status, quality, masking, and study start year. A total of 2977 trials met inclusion criteria (participants, n = 607,181) for data extraction. Ethnicity was reported by 36% of trials and race by 53%. Three trials (0.10%) included transgender participants (n = 5). Compared with 2010 US Census data, females (48.3%, 95% CI 47.2–49.3, p < 0.0001), Hispanics (11.6%, 95% CI 10.8–12.4, p < 0.0001), American Indians and Alaskan Natives (AIAN, 0.19%, 95% CI 0.15–0.23, p < 0.0001), Asians (1.27%, 95% CI 1.13–1.42, p < 0.0001), Whites (77.6%, 95% CI 76.4–78.8, p < 0.0001), and multiracial participants (0.25%, 95% CI 0.21–0.31, p < 0.0001) were under-represented, while Native Hawaiians and Pacific Islanders (0.76%, 95% CI 0.71–0.82, p < 0.0001) and Blacks (17.0%, 95% CI 15.9–18.1, p < 0.0001) were over-represented. Inequitable representation was mirrored in analysis by phase, institutional status, quality assessment, and level of masking. Between 2008 and 2019, representation improved only for females and Hispanics.
Analysis stratified by 44 disease categories (e.g., psychiatric, obstetric, neurological) exhibited significant yet varied disparities, with Asians, AIAN, and multiracial individuals the most under-represented. These results demonstrate disparities in US randomized clinical trial recruitment between 2008 and 2019, with neither the reporting of demographic data nor the representation of most minorities having improved over time.

Clinical trials (N = 5388) were reviewed based on the following criteria: study type (interventional and randomized), minimum participant age (18 years old), study location (US, including Puerto Rico), phase (2 and/or 3), study status (completed or active/not recruiting new participants), start date (01/01/2008 to 12/30/2019), and results date (on or before 11/18/2020). Each clinical trial was screened independently by two reviewers. Multi-center studies including international locations were excluded. Discrepancies between reviewers regarding inclusion of specific trials were resolved through discussion and a third reviewer.
Data were then extracted from the clinical trials that met inclusion criteria (N = 2977). Extraction was performed independently by two reviewers and then compared for consensus. Data extracted from each trial included: start date, phase, level of masking, total number of participants, multi-center status (multi-center or single institution), number of participants of each gender (male, female, or transgender), ethnicity group (Hispanic or non-Hispanic), and racial group (American Indian and Alaskan Native [AIAN], Asian, Native Hawaiian and Pacific Islander [NHPI], Black and African American [Black], White, or more than one race [multiracial]), as well as whether race and ethnicity data were reported by the trial. Studies characterized as combined phases were assigned to the higher phase (i.e., trials having completed phases 1 and 2 were categorized as phase 2; trials having completed phases 2 and 3 were categorized as phase 3) 21.
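The combined-phase rule can be illustrated with a minimal sketch. The function name `categorize_phase` and the representation of phases as integers are assumptions made for illustration and do not come from the study's own extraction procedure.

```python
def categorize_phase(completed_phases):
    """Return one phase label per trial, taking the higher phase when phases are combined.

    `completed_phases` is a hypothetical list of integer phase numbers, e.g. [1, 2].
    """
    if not completed_phases:
        raise ValueError("at least one phase is required")
    # A combined-phase trial is assigned to its highest phase.
    return max(completed_phases)


if __name__ == "__main__":
    print(categorize_phase([1, 2]))  # combined phase 1/2 trial
    print(categorize_phase([2, 3]))  # combined phase 2/3 trial
    print(categorize_phase([3]))     # single-phase trial is unchanged
```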
A two-tier assessment was performed for each trial based on the following criteria (one point for each parameter): (1) multi-center; (2) ≥ 200 participants; (3) reports ethnicity; (4) reports race. Trials scoring 0, 1, or 2 points were assigned to tier 1. Trials scoring 3 or 4 points were assigned to tier 2. Tier assessment was based on Cochrane Library guidelines and the Hoy Risk of Bias Tool, modified so that the assigned tier relied on objective data available from ClinicalTrials.gov 22,23. Discrepancies in individual tier assessments were resolved through discussion and involvement of a third reviewer.
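The scoring rule described above can be expressed as a short sketch. The `Trial` record and its field names are hypothetical, chosen only to mirror the four criteria; they are not the study's data schema.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    # Hypothetical field names mirroring the four scoring criteria.
    multi_center: bool
    n_participants: int
    reports_ethnicity: bool
    reports_race: bool


def tier_score(trial):
    """One point for each criterion that the trial satisfies (0 to 4)."""
    return sum([
        trial.multi_center,
        trial.n_participants >= 200,
        trial.reports_ethnicity,
        trial.reports_race,
    ])


def assign_tier(trial):
    """Scores of 0-2 map to tier 1; scores of 3-4 map to tier 2."""
    return 2 if tier_score(trial) >= 3 else 1


if __name__ == "__main__":
    example = Trial(multi_center=True, n_participants=250,
                    reports_ethnicity=False, reports_race=True)
    print(tier_score(example), assign_tier(example))
```

Because every criterion is a yes/no fact available from a registry record, the score needs no reviewer judgment beyond the data extraction itself.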
Trial participants labeled as unknown, other, missing, or not reported within the gender (individual participants, n = 190), race (n = 7437), and ethnicity (n = 2468) categories were excluded from each group's total proportion. Because only 0.10% of trials (N = 3) included gender minorities, transgender trial participants (n = 5) were excluded from each trial's total gender proportion for statistical analysis.
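As a rough illustration of how such exclusions affect the denominator, the sketch below computes group proportions after dropping the excluded labels. The function name, the label spellings, and the counts are all made up for illustration; for gender, an additional "transgender" label could be passed through `extra_excluded` to mirror the exclusion described above.

```python
EXCLUDED_LABELS = {"unknown", "other", "missing", "not reported"}


def group_proportions(counts, extra_excluded=()):
    """Proportion of each group among participants with a usable label.

    `counts` maps a group label to a participant count. Labels in
    EXCLUDED_LABELS (plus any in `extra_excluded`) are removed from both
    the numerator and the denominator.
    """
    excluded = EXCLUDED_LABELS | {label.lower() for label in extra_excluded}
    kept = {g: n for g, n in counts.items() if g.lower() not in excluded}
    total = sum(kept.values())
    if total == 0:
        return {}  # nothing usable was reported for this trial
    return {g: n / total for g, n in kept.items()}


if __name__ == "__main__":
    # Hypothetical counts for a single trial.
    race_counts = {"White": 70, "Black": 20, "Asian": 10, "Unknown": 25}
    print(group_proportions(race_counts))
```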

Statistical analysis.
For the full dataset, as well as for each disease stratum, we compared trial median proportions for gender, race, and ethnicity to 2010 official US Census proportions matched by age (≥ 18 years old) using Wilcoxon rank sum analysis 24. For analysis of gender, random-effects meta-analyses were performed for female representation overall and for female representation excluding disease categories with significant gender skew. A secondary analysis was performed comparing gender, race, and ethnicity proportions by trial phase, multi-institutional status, quality, masking, and study start year.

Female representation. After [...]
Hispanic. In trials reporting ethnicity, Hispanics were underrepresented (11.6%, 95% CI 10.8-12.4; p < 0.0001) overall, relative to the Census proportion (14.2%) (Table 2). Yet between 2008 and 2016 representation increased significantly (p < 0.0001), and Hispanics were over-represented in trials started in 2016 (18.7%, 95% CI 15.8-21.7; p = 0.0003) (Fig. 3A).
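To make the comparison against a fixed Census proportion concrete (for example, the 14.2% Hispanic share), the following is a minimal sketch rather than the authors' analysis code. The text names a Wilcoxon rank sum analysis. Because the reference here is a single fixed value, the sketch substitutes the one-sample Wilcoxon signed-rank test with a normal approximation and no tie correction, and that substitution is an assumption. The function name and the example proportions are hypothetical.

```python
import math


def signed_rank_vs_reference(values, reference):
    """One-sample Wilcoxon signed-rank test against a fixed reference value.

    Uses the normal approximation without a tie correction, so it is only a
    rough guide for small samples. Returns (W+, z, two-sided p-value).
    """
    diffs = [v - reference for v in values if v != reference]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all values equal the reference")
    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return w_plus, z, p_value


if __name__ == "__main__":
    # Hypothetical per-trial Hispanic proportions, compared with 0.142.
    trial_props = [0.10, 0.08, 0.12, 0.09, 0.11, 0.20, 0.07, 0.13, 0.05, 0.10]
    print(signed_rank_vs_reference(trial_props, 0.142))
```

A negative z indicates that trial proportions tend to fall below the reference (under-representation), and a positive z indicates the opposite. For production analyses, an established statistics package with exact or tie-corrected p-values would be preferable.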
Race. Of the 2977 clinical trials, 53% reported race (N = 1589), and 1% of participants had their race reported as unknown (n = 7437). Between 2008 and 2019, the proportions of all racial strata did not change significantly (Tables 2-7).

Representation by disease strata: gender, ethnicity, and race
The 2977 clinical trials were also stratified by 44 disease categories and subcategories to examine variations in representation by gender, ethnicity, and race (Table 9; Supplemental Table 1). Females were underrepresented in 18 disease strata and overrepresented in ten, relative to the Census proportion (51.5%, p < 0.05). Hispanics were underrepresented in 18 disease strata and overrepresented in three (p < 0.05). Asians were underrepresented in 36 disease strata, appropriately represented in eight, and overrepresented in none. AIAN were underrepresented in 32 disease strata and overrepresented in one (p < 0.05). Multiracial participants were underrepresented in 25 disease strata (p < 0.05) and overrepresented in five. Whites were underrepresented in 17 disease strata and overrepresented in eight. Blacks were underrepresented in four disease strata and overrepresented in 20. NHPI were overrepresented in 37 disease strata (p < 0.05) and statistically underrepresented in none.

Discussion
Over the last several decades, equitable clinical trial recruitment of women and minorities in the United States has received varying degrees of attention [2][3][4]7. In 1993, Congress passed the NIH Revitalization Act, which mandated inclusion of women and racial/ethnic minority groups in clinical trials 4,7. The policy was then updated in 2000, 2001, and 2017 to require standardized minimum inclusion of sex, gender, and racial/ethnic minority groups in phase 3 clinical trials, with mandated reporting of demographic data to ClinicalTrials.gov [2][3][4]. Despite FDA recommendations, our results indicate that many studies did not comply with guidance on reporting demographic characteristics. Failure to report race and ethnicity data was prevalent in US clinical trials conducted between 2008 and 2019, a phenomenon reported in other reviews 1,21. Likewise, the inclusion of sexual and gender minorities in clinical trials is nearly non-existent [25][26][27][28]. The lack of inclusion may be explained by incomplete and ambiguous gender reporting on clinical trial recruitment servers, resulting in difficulty or failure to recruit from the population of sexual and gender minorities 29,30.
Trials that reported demographics did not accurately represent the nation's demographics [2][3][4]. When trials with significant gender skew were excluded from analysis, females remained underrepresented, a historically consistent observation 31. The disparity likely arose from a combination of research bias and categorization of women as a vulnerable population [32][33][34][35]. Nevertheless, there was a significant improvement in female representation from 2008 to 2018.
While female representation improved with time, this was not the case for underrepresented racial/ethnic groups, including Hispanics, AIAN, Asians, and multiracial populations. Low minority enrollment can potentially be explained by historical racial injustices, subject burden (e.g., transportation limitations, perceived interference with work/family obligations), lower socioeconomic status, communication barriers, and divergent cultural attitudes between investigators and participants [36][37][38][39][40][41][42].
In contrast, Blacks and NHPI were overrepresented overall in most clinical trials. These trends corroborate findings that people of color are more willing to participate in trials than is commonly perceived, and as willing as Whites [43][44][45][46]. Meanwhile, NHPI overrepresentation potentially reflects the magnification of a small population, likely participating in trials from regions with significant NHPI density (e.g., Hawaii) 47.
When examining clinical trial phase, females, Hispanics, and AIAN all exhibited greater representation in phase 3 than in phase 2. Phase 3 trials may attain diversity more readily, as these investigations are typically more robust, with larger financial resources and up to 1000 patients, whereas phase 2 trials may have around a hundred participants 48,49. On the other hand, the lower female proportion in phase 2 trials may arise because phase 2 investigations often exclude participants on the basis of child-bearing potential 50.

Limitations
Overall, the findings should be considered in the context of several limitations. First, given non-compliance with data reporting on ClinicalTrials.gov, our investigation was unable to appropriately conduct analyses stratified by age, and studies omitted from the meta-analysis may have exhibited demographic proportions divergent from the observed trends. Second, reporting of race on ClinicalTrials.gov occasionally used non-standard categorization, requiring inference of race or exclusion of the data. Furthermore, given government policies to enhance reporting of race/ethnicity for phase 3 clinical trials over the years, some of the trends observed may reflect improved reporting rather than changes in demographic representation. Finally, funnel plots for gender, race, and ethnicity (Fig. 4) suggest a potential bias in which study sample size influences the proportions of multiracial and NHPI participants.

Conclusion
The results of this study indicate persistence of gender, ethnic, and racial disparities in phase 2 and 3 randomized clinical trial recruitment of US adults. While representation of women and Hispanics improved between 2008 and 2019, and Blacks and NHPI were generally overrepresented, the overall representation of several racial minorities (Asians, AIAN, and multiracial individuals) remained static, despite systems-based initiatives aimed at improving diversity. Overall, randomized clinical trials may not reflect the demographics of the populations they are meant to serve.
www.nature.com/scientificreports/