Introduction

According to a 2018 report by the World Ovarian Cancer Coalition1, the worldwide number of ovarian cancer cases is predicted to increase to 371,000 per year by 2035, and deaths are predicted to increase by 67% to 254,000. In developed countries, ovarian cancer remains the deadliest gynecologic malignancy2,3. About 20% of women with epithelial ovarian cancer (EOC) are diagnosed at an early stage, when the 5-year survival rate can be as high as 90%. In contrast, the 5-year survival rate is around 17–39% when the disease is diagnosed at an advanced stage4. Among all the biomarkers, cancer antigen-125 (CA-125) has been the biomarker of choice for managing ovarian cancer patients. However, the serum CA-125 level has low sensitivity for early diagnosis and is affected by conditions such as menstruation, pelvic inflammation, endometriosis, pregnancy, smoking, and caffeine intake5,6. Other biomarkers, including glycoprotein human epididymis protein 4, transthyretin, apolipoprotein A1, beta2-microglobulin, and transferrin, have shown unsatisfactory predictive value5,7,8. Therefore, novel biomarkers with high sensitivity and specificity are urgently needed for ovarian cancer diagnosis.

Biomarkers are more informative and may be more specific when alterations in their expression are directly caused by the genetic and molecular mechanisms underlying the pathophysiological development of the disease. Mechanism-based biomarkers have utility in identifying the origin of the disease, differentiating disease types, and applying more effective therapies9,10. MicroRNAs (miRNAs) play an active role in physiological and pathological processes and can reflect physiological alterations more directly11. Indeed, the microRNA profiles in blood exhibited a strong correlation with those in the tumor12. In addition, miRNAs are generally more stable than mRNAs in blood11 and are good candidates for mechanism-based biomarkers.

The microRNA-200 (miR-200) family is a master regulator of mesenchymal–epithelial transition that controls the expression of the epithelial marker E-cadherin13. We have previously shown that the expression of miR-200 and E-cadherin is elevated in human ovarian cancer14,15, as well as in the laying hen model of spontaneous ovarian cancer16; the laying hen is the only non-primate animal that develops ovarian adenocarcinoma in nature. The human miR-200 family consists of five members (miR-200a, miR-200b, miR-200c, miR-429, and miR-141) located in two separate chromosomal clusters17. Interestingly, only the miR-200a, miR-200b, and miR-429 cluster is present in the chicken genome. We have completed a functional study of miR-200, demonstrating that this miRNA family governs inclusion cyst formation in endosalpingiosis, a benign lesion associated with serous ovarian neoplasms, and the collective movement involved in ovarian cancer spread18. Intriguingly, ectopic expression of miR-200 in normal human ovarian surface epithelial cells induced upregulation of steroid hormone pathway genes18, which, in conjunction with the frequent ovulation of laying hens, is consistent with the incessant ovulation hypothesis for ovarian cancer development19 and highlights the critical role miR-200 plays in ovarian carcinogenesis20. Taken together, these findings suggest that miR-200 can be an excellent mechanism-based biomarker for ovarian cancer diagnosis and prognosis.

Multiple studies have investigated miR-200 in various diseases such as endometriosis21, renal fibrosis22, pleural effusion23 and colon cancer24. A few studies have also evaluated miR-200 as a diagnostic or prognostic marker for ovarian cancer (Table S1). Taylor et al. used a microarray platform to compare the miRNA profiles of tumor-derived EpCAM+ exosomes isolated from the sera of American women and found that the levels of miR-141, miR-200a, miR-200c, miR-200b, miR-203, and miR-205 were elevated in patients with serous papillary adenocarcinoma compared to those in the benign and healthy samples12. The second study, conducted by Kan et al. in Australia, found that miR-200a, miR-200b and miR-200c were highly expressed in sera from serous cancer patients compared to healthy controls, with miR-200c differing most significantly between the two groups25. In that study, miR-103 was selected from four tested small RNAs as the internal control based on its detectability in sera and the absence of significant differences between the case and control groups. The third study, by Zheng et al. in China, found nine miRNAs, including miR-200a and miR-141, that were detected at higher levels in plasma from ovarian cancer cases than in healthy controls26. The fourth study, by Zuberi et al. in India, found that the levels of miR-200a, miR-200b and miR-200c were higher in 70 EOC serum samples than in 70 serum samples from healthy controls. The expression of miR-200a was significantly up-regulated in the mucinous and serous subtypes, while the levels of miR-200a and miR-200c were positively correlated with EOC stage and metastasis27. The fifth study, by Gao et al. in China, found that miR-200c and miR-141 were significantly elevated in EOC patient sera compared to healthy controls7. Racial disparity and differences in sample size, form of blood specimen, and methods of blood collection and detection are factors that could contribute to the differing results of the above studies.

In this study, we compared the levels of all five miR-200 family members in the blood specimens from patients with EOC and the healthy controls collected in the USA, Hong Kong and China. We also evaluated if there are racial disparities in the blood miR-200 levels, as racial differences in microRNA expression were frequently reported for various diseases28,29,30. As laying hen ovarian cancers express high levels of miR-200 similar to human ovarian cancers15,16,18, the miR-200 levels in the plasma samples collected from the healthy and cancerous laying hens were also compared to those in the human blood specimens.

Results

Elevated levels of miR-200 members in human ovarian cancer blood specimens and ethnic group difference analysis

Peripheral blood samples were initially collected from a total of 230 female subjects, including 112 healthy controls with a median age of 55 and 118 ovarian cancer patients with a median age of 57 (Table 1). Clinical characteristics of the ovarian cancer patients are presented in Table S2. The samples were contributed by two US institutions and three sources in Hong Kong/China. All blood samples were processed, and reverse transcription-quantitative real-time polymerase chain reactions (RT-qPCR) were performed, as described in Materials and Methods. To choose the endogenous normalizer for miRNA expression, we tested two miRNAs recommended by the manufacturer and, like Kan et al.25, chose the same miR-103a-3p as the endogenous reference miRNA to normalize all our RT-qPCR results, as it did not show significant variation in a preliminary test of 40 normal and cancer plasma samples (Figure S1). We noted that serum and whole blood samples were provided by Hong Kong MedTimes and CloudHealth Genomics Ltd, respectively, while the blood samples provided by the other institutions were plasma. We compared the miRNA levels in 10 MedTimes specimens that were available in all three forms. Statistical analysis of the qPCR data showed that the miRNA levels in most serum and plasma specimens corroborated each other, whereas the whole blood samples were significantly different from the other two forms (Figure S2). Glinge et al. have shown that long-term storage of whole blood samples might change miRNA stability31. We therefore excluded the 16 CloudHealth whole blood samples from the final analysis. miR-103a-3p normalized − ΔCt values32 were used to represent miR-200 levels for all the plasma samples as well as the 60 serum samples from Hong Kong MedTimes. In total, there were 96 normal and 118 cancer human specimens for the final analysis.

Table 1 Sample distribution in this study.

The initial analysis of all 214 human samples using linear regression models with adjustment for institutional effects showed that miR-200a, miR-200b, and miR-200c levels were significantly higher in cancer samples than in normal samples (Table 2 and Figure S3). However, further analysis showed significant differences between American and Chinese samples. Within the 84 American samples, cancer samples had significantly higher miR-200a, miR-200b, miR-200c, and miR-141 levels (on average 18.6-fold, 21.7-fold, 23.2-fold, and 22.5-fold, respectively) than normal samples. Within the 130 Chinese samples, only miR-200b and miR-429 levels (64.1-fold and 19.9-fold, respectively) were significantly higher in cancer samples than in normal samples, and the levels of miR-141 in cancer blood samples were actually significantly lower (0.05-fold) than in normal samples (Table 2). Within the normal samples, all four miR-200 family members other than miR-429 were significantly higher in Chinese samples than in their American counterparts; within the cancer samples, the miR-200b, miR-200c, and miR-429 levels were significantly higher in Chinese samples than in American samples (Table S3).
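Because Table 2 reports levels as − ΔCt while the text quotes fold changes, it can help to move between the two scales. The following is a minimal sketch, assuming the quoted fold changes equal 2 raised to the difference in mean − ΔCt between cancer and normal samples (the usual comparative Ct convention); the function and dictionary names are illustrative and not part of the original analysis.

```python
import math

def fold_to_neg_dct_difference(fold):
    """Difference in mean -ΔCt (in PCR cycles) implied by a fold change.

    Assumes fold = 2 ** difference, i.e. the comparative Ct convention.
    """
    return math.log2(fold)

# Fold changes quoted in the text (cancer versus normal samples)
reported_folds = {
    "American miR-200a": 18.6,
    "American miR-200b": 21.7,
    "American miR-200c": 23.2,
    "American miR-141": 22.5,
    "Chinese miR-200b": 64.1,
    "Chinese miR-429": 19.9,
    "Chinese miR-141": 0.05,
}

cycle_differences = {
    name: fold_to_neg_dct_difference(fold) for name, fold in reported_folds.items()
}

for name, diff in cycle_differences.items():
    # A positive value means higher levels in cancer; negative means lower.
    print(f"{name}: {diff:+.2f} cycles")
```

Under this assumption, a fold change below 1 (such as the 0.05-fold miR-141 result) maps to a negative cycle difference, which is why it indicates lower levels in cancer samples.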

Table 2 Normalized miR-200 family levels (estimated means of − ΔCt, 95% CI) in (a) All human, (b) American-only, or (c) Chinese-only blood samples.

Correlation analysis of levels of miR-200 family members in human blood samples

To account for the observed racial differences, we compared the correlations among the levels of miR-200 family members in all samples and within different subgroups. When analyzed in all 214 samples, the levels of most miR-200 family members were significantly correlated with each other, except between miR-429 and miR-141 (Table S4). We saw a similar correlation pattern in the analysis of all 118 cancer cases, with an insignificant correlation between miR-429 and miR-141 and higher positive correlation coefficients among the other members (Table S5). However, in the analysis within all 96 normal controls, the correlations were much weaker except between miR-200c and miR-141, with miR-429 showing either insignificant or negative correlations with the other members (Table S6). When the analysis was performed within the American samples (N = 84), we saw significant correlations between miR-200c and both miR-200a and miR-200b, as well as between miR-141 and all miR-200 members except miR-429 (Table S7). The same correlation pattern was also observed in the analysis within American cancer samples (N = 49), with miR-429 not showing any significant correlation with the other members (Table S8). Within American normal samples (N = 35), miR-200b was positively correlated only with miR-200c and miR-141, and negatively correlated with miR-429 (Table S9). When the analysis was performed within the Chinese group (N = 130), all miR-200 family members showed significant correlations with each other except between miR-429 and miR-141 (Table S10), similar to what we observed in the analysis of all samples (Table S4). Like their American counterparts, Chinese cancer samples (N = 69) showed the same pattern as all Chinese samples, with even higher significant correlation coefficients (Table S11), suggesting that the correlation pattern observed with all samples was primarily driven by the elevated levels of the miR-200 family in cancer samples. In the analysis within Chinese normal samples (N = 61), only the correlations between miR-200a and miR-200b, and between miR-141 and both miR-200b and miR-200c, were weakly significant (Table S12).
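For readers who want to reproduce this kind of pairwise analysis, the sketch below computes a Pearson coefficient (used for the human samples according to the Statistical analysis section) and a Spearman coefficient (used for the laying hen samples) in plain Python. It is an illustration only, not the SPSS procedure used in the study; the helper names and the example values are hypothetical, and no P values are computed.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical -ΔCt values for two family members in the same five samples
mir_200a = [-6.1, -4.8, -3.9, -2.5, -1.7]
mir_200b = [-5.5, -5.0, -3.1, -2.9, -0.8]

print(f"Pearson r = {pearson(mir_200a, mir_200b):.3f}")
print(f"Spearman rho = {spearman(mir_200a, mir_200b):.3f}")
```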

Elevated levels of miR-429 in human endometrioid and laying hen ovarian cancers

We then compared miR-200 levels across the histological subtypes of ovarian cancer. We analyzed only the Chinese samples, which had comparable numbers of samples from the different histological subtypes. Using Dunnett's multiple comparison test, we found that serous and mucinous ovarian cancer blood samples had significantly higher levels of miR-200b (670.5-fold and 267.1-fold, respectively) than normal blood samples; serous cancer blood samples also had significantly higher levels of miR-200c (198.2-fold) than normal; miR-429 levels in clear cell and endometrioid cancer blood samples were significantly higher (64.9-fold and 88.8-fold, respectively) than normal; whereas miR-141 levels in clear cell, endometrioid, and mucinous cancer blood samples were significantly lower (0.01-fold, 0.01-fold, and 0.03-fold, respectively) than in normal blood samples (Table 3). Interestingly, when we examined miR-200 levels in plasma samples from laying hens, only miR-429 was significantly elevated (17.3-fold) in cancer plasma compared to control (Table 4 and Figure S4). Correlation analysis showed that chicken miR-200a and miR-200b levels had the highest correlation (Spearman correlation coefficient = 0.817, P < 0.001). MiR-429 also showed moderately high correlations with miR-200a (0.535, P = 0.005) and miR-200b (0.661, P < 0.001) (Table S13). As endometrioid ovarian cancer is the major histological type in the chicken33, the laying hen finding is consistent with the significant increase of miR-429 levels in human endometrioid blood samples.

Table 3 Significant miR-200 family levels in human blood samples from different histological types of ovarian cancer versus normal blood samples.
Table 4 Comparison of miR-200 family levels between ovarian cancer and normal laying hen blood samples.

Multilayer perceptron neural network models to predict different miR-200 combinations for American and Chinese ovarian cancer patients

Multilayer perceptron (MLP) neural network predictive models34 were estimated separately for individual miR-200 members and for combinations of members to differentiate cancer from control blood samples, as described in Materials and Methods. The best neural network model for American women, consisting of miR-200a/200b/429/141, showed an AUC value of 0.904 in distinguishing cancer patients from controls, whereas the best model for Chinese ovarian cancer prediction, consisting of miR-200b/200c/429/141, showed an AUC value of 0.901 (Fig. 1).
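To make the AUC values concrete, the sketch below shows one standard way to compute an AUC from model output scores: as the fraction of cancer/control pairs in which the cancer sample receives the higher score, with ties counted as one half. This is not the SPSS neural network procedure used in the study; the scores are hypothetical and the function name is illustrative.

```python
def auc_from_scores(cancer_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the probability that a randomly chosen cancer sample scores
    higher than a randomly chosen control (ties count as 0.5).
    """
    wins = 0.0
    for c in cancer_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cancer_scores) * len(control_scores))

# Hypothetical predicted probabilities of cancer from a classifier
cancer_scores = [0.9, 0.8, 0.6, 0.4]
control_scores = [0.7, 0.3, 0.2, 0.1]

print(auc_from_scores(cancer_scores, control_scores))  # 14 of 16 pairs ranked correctly: 0.875
```

An AUC of 0.5 corresponds to scores that do not separate the groups at all, and 1.0 to perfect separation.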

Figure 1 Receiver operating characteristic (ROC) curves for the best multilayer perceptron neural network predictive models to distinguish cancers from non-cancer controls for (a) American patients; and (b) Chinese patients.

Discussion

We examined the levels of miR-200 family members in normal and cancer blood specimens collected from American and Chinese institutions. Statistical analyses revealed that miR-200a, miR-200b, and miR-200c were significantly elevated in cancer specimens relative to controls in the analysis of all specimens, which is similar to the findings of Taylor et al.12, Kan et al.25, and Zuberi et al.27. Correlation analysis showed that the levels of these three miR-200 family members were consistently more highly correlated with each other than with miR-429 and miR-141, and that the latter two showed discordant correlation with each other.

Further analysis of the American plasma samples showed that all miR-200 family members except miR-429 were elevated in cancer relative to control plasma samples (Table 2). This is consistent with another American miRNA biomarker discovery study using tumor serum-derived exosomes, reported by Taylor et al.12, and supports the view that plasma and serum specimens have similar miRNA levels and that these four miR-200 family members are consistently elevated in American ovarian cancer patients. Elevated miR-200a, miR-200b, and miR-200c levels in cancer serum samples were also found in the Australian study25 and the Indian study27. However, our analysis of Chinese specimens revealed that Chinese cancer blood samples had significantly higher levels of miR-200b and miR-429 and lower levels of miR-141 than normal samples (Table 2), which contradicts two Chinese studies, one showing increased plasma miR-200a and miR-141 levels (Zheng et al.26) and the other showing increased serum miR-200c and miR-141 levels (Gao et al.7) in cancer relative to normal blood samples. The differences might be due to the use of different endogenous normalizers, the form of the blood samples, and the widely variable miR-141 levels observed in our study.

Correlation analysis of the miR-200 family showed that the levels of miR-200 members were highly correlated with each other in the cancer samples, with the exception of miR-429 in American cancer samples (Tables S5, S8, and S11). The cancer samples also reproduced the patterns observed in the combined cancer and normal samples (Tables S4, S7, and S10) more closely than the normal samples did (Tables S6, S9, and S12). This observation implies that normal blood samples usually have low and variable miR-200 levels, whereas cancer samples have consistently elevated levels of miR-200 members and dominate the patterns seen in the combined samples.

The significant positive association of clear cell and endometrioid ovarian cancers with miR-429 and their negative association with miR-141 in our histological subtype analysis (Table 3) may account for the differences between American and Chinese cancer samples (Table 2) and the differences in correlation analysis (Tables S8 and S11). Because our American cancer samples are primarily of the serous type and have higher levels of all miR-200 family members except miR-429, whereas our Chinese samples include a significant contribution from clear cell and endometrioid cancers, we find a significant positive contribution of miR-429 in Chinese cancer cases (Table 2 and Table S11) but not in American cases (Table 2 and Table S8). Future studies with more balanced numbers of histological subtypes in both ethnic groups will be needed to confirm the racial disparity in miR-200 blood levels, as racial differences in microRNA expression have been reported in other diseases28,29,30. More importantly, the human endometrioid cancer-specific miR-429 association is also observed in the cancer plasma samples of the laying hen (Table 4), a species in which ovarian tumors are predominantly endometrioid. This finding not only supports the incessant ovulation hypothesis, but also highlights the importance of miR-200-mediated ovarian carcinogenesis in both species15,16,18.

In conclusion, we show that the miR-200 family is a highly relevant and informative mechanistic biomarker for ovarian cancer, with AUC values above 0.90 in our neural network modeling. Our study also shows racial disparity in blood miR-200 levels, which might contribute to the different miR-200 combinations identified in our neural network models for American and Chinese populations (Fig. 1). More importantly, miR-200 family members provide histological subtype-dependent information in both human and laying hen blood samples and have important implications for ovarian carcinogenesis. Longitudinal laying hen blood samples collected before and after tumor development may be useful for ovarian cancer biomarker discovery.

Materials and methods

Human and laying hen blood specimens

All methods were carried out in accordance with relevant guidelines and regulations for both humans and animals. Informed consent was obtained from all human subjects. All patient-derived biologic specimens were collected and archived under protocols approved by the respective Human Subjects Committee of the institutions, namely, the Human Subjects Committee of the Brigham and Women's Hospital, the Institutional Review Board of The MD Anderson Office of Human Subjects Protection, The Joint CUHK-NTEC Clinical Research Ethics Committee, the Medtimes Medical Group Ethics Review Board, and the CloudHealth Medical Group Clinical Research Review Board. For the American samples, 34 banked plasma samples (10 normals and 24 cancers) at Brigham and Women’s Hospital (BWH) were selected through a Biobank Portal query tool and requested from Partners HealthCare Biobank (https://personalizedmedicine.partners.org/Biobank/Default.aspx). Fifty plasma specimens (25 normals and 25 cancers) were from MD Anderson Cancer Center, obtained from the Blood Specimen Research Resource (BSRR) (https://www.mdanderson.org/research/departments-labs-institutes/programs-centers/center-for-translational-and-public-health-genomics/resources.html). A search of the databases showed that none of the specimens was from an Asian patient. For the Chinese samples, 70 plasma samples were collected at the Department of Obstetrics and Gynecology, Prince of Wales Hospital, Chinese University of Hong Kong. Sixteen whole blood samples from healthy Chinese female individuals with no known diseases were collected at CloudHealth Genomics Limited. In addition, 60 serum samples from healthy Chinese females were collected at MedTimes Molecular Laboratory Limited. Of these 60 samples, 10 had accompanying plasma and whole blood for comparison. Peripheral blood samples were collected in color-coded Vacutainer tubes (BD Biosciences, San Jose, CA) using routine phlebotomy techniques, fractionated into multiple aliquots, and stored at − 80 °C.

Single-comb White Leghorn hens were maintained as previously described35,36, with review and approval of the Institutional Animal Care and Use Committees at the University of Illinois at Urbana-Champaign and Southern Illinois University at Carbondale. The plasma samples tested were from age-matched laying hens fed with standard commercial layer feed.

Handling of blood specimens

All blood specimens were sent to Brigham and Women’s Hospital on dry ice and processed by the first author, except for the whole blood and serum samples from healthy Chinese women. The whole blood samples from CloudHealth Genomics were processed by the fifth author in Shanghai because of the strict policy on exporting biological samples, while the serum samples were processed at MedTimes Molecular Laboratory by the first author. Nevertheless, the same RNA extraction and RT-qPCR procedures were followed for all plasma, serum and whole blood specimens.

RNA extraction and reverse transcription

All samples were centrifuged at 2000 rpm for 5 min to remove remaining blood cells. One hundred thirty microliters of human blood sample or 50 μl of chicken plasma was transferred to a 1.5 ml tube and topped up to 200 μl with RNase-free water. RNA extraction was performed according to the miRCURY™ RNA Isolation Kit-Biofluids Instruction manual v1.7 (#300112 and #300113, Exiqon, Woburn, MA). Briefly, lysis buffer was added to the samples, followed by protein precipitation solution. After centrifugation, the supernatant was transferred to a new collection tube. RNA was precipitated with glycerol-isopropanol mix. The microRNAs were captured on a microRNA Mini Spin Column and eluted with RNase-free water. Samples were stored at − 80 °C until reverse transcription was performed.

Reverse transcription was performed according to miRCURY LNA™ Universal RT microRNA PCR Instruction manual v6.1 (#203301, 203351, Exiqon, Woburn, MA). Spike-in RNA control (UniSp6) was included in the reverse transcription reaction to reveal the efficiency of reverse transcription and the subsequent PCR reactions. Normalization was done by subtracting the Ct value obtained with the spike-in RNA (UniSp6). Two microliters of RNA template and 8 μl of working master mix solution containing reaction buffer, enzyme for reverse transcription and synthetic RNA spike-in control were mixed together and incubated at 42 °C for 60 min, followed by 5 min at 95 °C.

Quantitative PCR reactions

We employed Exiqon (Woburn, MA) miRCURY LNA™ Universal RT microRNA PCR primer sets because of their exceptional sensitivity and specificity and low background. cDNA was diluted 40-fold with nuclease-free water before PCR amplification. cDNA from human samples was mixed with PCR master mix containing primers for the different miR-200 members, UniSp6, or the endogenous reference miRNAs. cDNA from chicken samples was mixed with PCR master mix containing primers for compatible human miR-200a or miR-103a-3p, chicken (gga)-miR-200b or miR-429, or UniSp6. All PCR reactions were performed in a StepOne Plus or a 7500 real-time PCR instrument with 60 amplification cycles (95 °C for 10 s, followed by 60 °C for 1 min). Melt curve analysis was performed to detect primer dimer formation or non-specific binding. The Ct value of each reaction was obtained with the threshold set to 5000. Ct values larger than 60 were counted as 60. All reactions were performed in duplicate, and the average Ct values were used to compute the normalized miRNA level for each human and chicken sample.
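The Ct handling described above (capping at 60, then averaging duplicate reactions) can be sketched as follows. This is a minimal illustration with hypothetical Ct values; treating an undetermined well (represented here as `None`) as 60 is an assumption of this sketch, since the text only states that Ct values larger than 60 were counted as 60.

```python
from statistics import mean

CT_CAP = 60.0  # equals the number of amplification cycles

def average_ct(replicates, cap=CT_CAP):
    """Cap each replicate Ct at `cap`, then average the replicates.

    Assumption of this sketch: a well with no Ct call (None) is treated
    as the cap value.
    """
    capped = [cap if ct is None else min(ct, cap) for ct in replicates]
    return mean(capped)

# Hypothetical duplicate wells
detected = average_ct([31.8, 32.4])        # both wells detected
partly_missing = average_ct([None, 58.0])  # one well without a Ct call
above_cap = average_ct([65.0, 61.0])       # both values capped at 60

print(detected, partly_missing, above_cap)
```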

Statistical analysis

The comparative − ΔCt (negative delta Ct) method was used to calculate the normalized miRNA level for each sample32. Normalized delta Ct value was calculated as [Ct(target) − Ct(UniSp6) − Ct(103a)]. Mean levels of miR-200 family members between groups were compared using the independent-sample t test and regression analysis. Mean fold-changes of miR-200 expression in various histological types of cancer relative to normal samples were calculated using the 2^(−ΔΔCt) method. Levels of miR-200 members in the laying hen groups were compared using the Mann–Whitney test because of the small sample size. Histological subtype-specific levels of miR-200 members were analyzed using ANOVA with the Dunnett post-hoc test comparing each subtype versus normal samples. Pearson correlation coefficients were calculated for correlation analysis in human samples and Spearman correlation coefficients for the laying hen samples. Multilayer perceptron (MLP) neural network predictive models34, separately for each marker and for a combination of markers, were estimated to differentiate between cancer and control blood samples. Sensitivity analyses were conducted to compute the importance of each marker in determining the neural network predictive models. Validation of the predictive models was performed by randomly dividing the samples into a training data set and a test data set in a ratio of 7:3. The receiver operating characteristic (ROC) curve for each predictive model was obtained and the corresponding area under the curve (AUC) was reported. Statistical analyses were performed using IBM SPSS-25 (IBM, Chicago, IL). For all statistical comparisons, a level of P < 0.05 was accepted as statistically significant.
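The comparative Ct calculation can be illustrated with a short sketch. It uses the standard definition ΔCt = Ct(target) − Ct(reference) with a single reference miRNA; the UniSp6 spike-in adjustment described in the text is deliberately left out, so this is not a reproduction of the exact normalization used in the study. All Ct values and function names are hypothetical.

```python
from statistics import mean

def neg_delta_ct(ct_target, ct_reference):
    """-ΔCt = -(Ct(target) - Ct(reference)); larger values mean more target miRNA."""
    return -(ct_target - ct_reference)

def fold_change(neg_dct_cases, neg_dct_controls):
    """2^(-ΔΔCt), where -ΔΔCt = mean(-ΔCt of cases) - mean(-ΔCt of controls)."""
    neg_ddct = mean(neg_dct_cases) - mean(neg_dct_controls)
    return 2.0 ** neg_ddct

# Hypothetical (Ct target, Ct reference) pairs, one per sample
cancer_pairs = [(30.0, 28.0), (31.0, 28.0)]
normal_pairs = [(34.0, 28.0), (35.0, 28.0)]

cancer_levels = [neg_delta_ct(t, r) for t, r in cancer_pairs]  # [-2.0, -3.0]
normal_levels = [neg_delta_ct(t, r) for t, r in normal_pairs]  # [-6.0, -7.0]

# Mean -ΔCt is -2.5 in cancer and -6.5 in normal, a 4-cycle difference
print(fold_change(cancer_levels, normal_levels))  # 2 ** 4 = 16.0
```

Working on the − ΔCt scale keeps the sign intuitive: a higher value corresponds to a higher miRNA level, and each additional cycle of difference doubles the fold change.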