Prediction of coronary heart disease incidence in a general male population by circulating non-coding small RNA sRNY1-5p in a nested case–control study

During the development of atherosclerotic lesion, s-RNYs (small RNAs of about 24/34 nucleotides) are derived by the processing of long Ro-associated non-coding RNAs (RNYs) in macrophages. The levels of serum s-RNYs have been found significantly upregulated in patients with coronary heart disease (CHD) compared to age-matched CHD-free individuals. The present study aimed to examine the predictive value of serum s-RNYs for CHD events in the general male population. Within the frame of nested-case–control study, the GENES study, we measured the absolute expression of a RNY-derived small RNA, the s-RNY1-5p, in the serum of individuals (without CHD at baseline) who encountered a CHD event within 12 years of follow-up (n = 30) (Cases) and compared them to individuals who remained event-free (Controls) (n = 30). The expression of s-RNY1-5p in serum was significantly upregulated in Cases compared to Controls (p = 0.027). The proportion of CHD event-free was significantly higher among individuals with serum s-RNY1-5p below the median value (631 molecules/mL). In a multivariable model adjusted for age, smoking, hypertension, diabetes and dyslipidemia, the risk of CHD events increased more than fourfold in individuals with serum s-RNY1-5p above the median value (HR, 4.36; 95% CI 1.22–15.60). A positive association with CHD events was also observed when considering s-RNY1-5p as a continuous variable (p = 0.022). Based on our results, we conclude that serum s-RNY1-5p is an independent predictor of CHD events in a general male population and might be a relevant biomarker for early detection of cardiovascular diseases.


Methods
Sampling frame. The "Genetique et ENvironnement en Europe du Sud" (GENES) study is a case-control study designed to assess the role of gene-environment interactions in the occurrence of CHD 5 . All participants signed an informed consent form. Biological samples and clinical/physical information were collected according to the principles expressed in the Declaration of Helsinki. The study protocol was approved by the local ethics committee (CCPPRB, Toulouse/Sud-Ouest, file #1-99-48, Feb 2000). A collection of biological samples has been constituted (declared as DC-2008-463 #1 to the Ministry of Research and to the regional Health authority) 6 . Control individuals analyzed in the frame of the present work were part of the 834 male individuals from the GENES study, free of CHD and stroke, and aged 45-75 at baseline, randomly selected from the general population using electoral rolls 5 . Those control individuals were followed-up and provided the cases and controls for the nested case-control study of the present work.
Finally, the whole study was performed according to STROBE guidelines (http://www.strob e-state ment.org).
Follow-up and case ascertainment. Control individuals from the GENES study were contacted dur- ing the year 2016 either by mail or by telephone and completed self-report questionnaires on clinical events. Among individuals with possible events, clinical information was obtained from medical records in hospital and general practitioners. All available data were gathered concerning hospital admission and type of interventions. Deceased patients' families and practitioners were contacted to clarify death circumstances, and death certificates were checked to complement clinical and postmortem information. Coronary events (unstable angina, coronary revascularization, non-fatal myocardial infarction), stroke events and the occurrence of peripheral vascular disease were adjudicated by an independent committee. The median follow-up period was 11.9 ± 4.1 years.
Nested case-control study. In the present work, case patients (n = 31) were individuals who were CHDfree at baseline and developed either fatal or non-fatal acute coronary events during the follow-up. The control subjects (n = 31), matched for age (± 2 years) and date of recruitment (± 3 months), were participants who remained free of cardiovascular diseases and stroke during follow-up. One Control failed for quantification of s-RNY1-5p in baseline serum and was thus excluded with its paired Case. Analyses on s-RNY1-5p were thus based on a nested case-control population of 30 case patients and 30 control participants from the GENES study.
The study workflow is illustrated in Fig. 1.
Baseline assessment. At the baseline survey, between 2001 and 2004, participants were interviewed and examined in the same health center administered by the national health insurance system. Age, environmental characteristics and information on cardiovascular risk factors were collected through standardized face-to-face interviews, performed by a single physician. Smoking status was classified as current smokers, smokers having quitted for more than 3 years and non-smokers. Alcohol consumption was assessed using a typical week pattern. The total amount of pure alcohol consumption was calculated as the sum of different types of drinks and was expressed as grams per day. Physical activity was investigated through a standardized questionnaire 7 and categorized into two levels, namely, "high" as physical activity for at least 20 min and at least twice a week or "low" physical activity for less than 20 min for once a week or less. Medical history was collected. Hypertension was defined as treatment with drugs, or by systolic blood pressure ≥ 160 mmHg or diastolic blood pressure ≥ 95 mmHg. Diabetes was defined as treatment with drugs, or fasting blood glucose ≥ 7.8 mmol/L. Dyslipidemia was defined as treatment with drugs or fasting serum total cholesterol ≥ 2.40 g/L 8 . Anthropometric measurements included waist circumference, height and body weight, and body mass index was calculated (BMI, kg/m 2 ). Blood pressure and resting heart rate were measured with an automatic Laboratory assays. Blood was collected at inclusion after an overnight fast. Serum samples aliquots were subsequently stored at − 80 °C until biological analyses. The following biomarkers were assayed with enzymatic reagents on automated analyzers (Hitachi 912 and Cobas 8000, Roche Diagnostics, Meylan, France): serum total cholesterol (TC), HDL-C, triglycerides (TG), fasting glucose, and gamma-glutamyl transferase (γGT). Highsensitive C-reactive protein (hs-CRP) was determined on the same analyzer by immunoturbidimetry assays. Apolipoprotein (Apo) A1, ApoB, and lipoprotein (a) [Lp(a)] were also assayed with an immunoturbidimetric method on an automated analyzer (Roche Diagnostics, France). LDL-C was calculated using the Friedewald formula, with VLDL-cholesterol (VLDL-C) (g/L) = TG (g/L)/5, as long as TG concentration was below 4 g/L 9 .
Absolute quantification of s-RNY1-5p in the serum by quantitative RT-PCR. Absolute quantification of s-RNY1-5p was performed in baseline serum samples from all participants included in the nested case-control study. Before RNA extraction from 0.2 mL human serum, hemolysis was evaluated by measuring through spectrophotometer the oxyhemoglobin at 414 nm and referring to a standard curve made by introducing haemolysis by serially diluting lysed red blood cells in non-haemolysed plasma from a healthy donor. The serum samples showing an optical density (OD) corresponding to > 0.002% of red blood cells (v/v) were discarded. 0.5 mL of Trizol LS Reagent was added to 0.2 mL of blood serum. Then (1) 5 pg of synthetic miRNA-39 from Caenorhabditis elegans (miScript miRNA Mimic syn-cel-miR-39-5p, Qiagen) were added as a spike-in control for purification efficiency and (2) 2 μL of glycogen (5 mg/mL) were added to enhance the efficiency of RNA column binding. Purification of extracted total RNA was performed with High Pure miRNA Isolation kit (Roche) according to the manufacturer's instructions. RNA was eluted in 50 μL of Elution Buffer. The quality of the extracted RNA was checked by ratio between the absorbance values at 260 and 280 nm and between 260 and 230 nm using NanoDrop Technologies ND-1000 spectrophotometer (Supplemental Table 1). Five-fold dilutions of synthetic s-RNY1-5p ranging from 10 9 to 10 5 molecules per RT reaction were prepared freshly in Ultrapure water (Invitrogen) before each experiment to create standard curves to use in each quantitative PCR (qPCR) plate. Reverse transcriptase (RT) reaction was performed according to Repetto et al. 2

for the
Control individual from the GENES study: 834 males selected at random from the general population were screened and included at baseline 14 subjects were excluded because of CVD at baseline 820 subjects without CVD were followed for a period of 11. Cases (n = 31) were male subjects from the GENES study (n = 834) who experienced a Coronary Heart Disease (CHD) events during the follow-up period (11.9 ± 4.1 years). The control subjects (n = 31), matched for age and date of recruitment, were participants who remained free of cardiovascular diseases and stroke during follow-up. One Control was excluded because quantification of s-RNY1-5p in baseline serum failed. Analyses were thus based on a nested case-control population of 30 case patients and 30 control participants from the GENES study. www.nature.com/scientificreports/ detection of s-RNY1-5p and with TaqMan (Life Technologies) for the cel-miR-39. For s-RNY1-5p detection, RT was performed with the GoScript Reverse Transcriptase Kit (Promega), following the manufacturer's instructions. Briefly, reagents were freshly mixed before each experiment in a total volume of 20 μL, containing 0.5 mM dNTPs, 0.5 μL reverse transcriptase, 4 μL 5 × buffer, 20 units RNase inhibitor, 6.6 μL nuclease-free H2O, 2 μM RT stem-loop primer, and 4 μL of serum RNA or the synthetic s-RNY1-5p. Every step was performed on ice. The reaction was performed at 25 °C for 5 min followed by 42 °C for 60 min and 70 °C for 15 min before being held at 4 C. For the synthetic cel-miR-39 detection, RT was performed with the TaqMan microRNA Reverse Transcription Kit (Applied Biosystems), using 1 μL of serum RNA and following the manufacturer's instructions. The resulting cDNAs were stored at − 80 °C, till use for the qPCR experiments. qPCR amplifications were performed in a total volume of 10 μL, containing 2 μL of freshly prepared RT reaction, 1 μL of a mix containing the specific s-RNY1-5p oligonucleotide primer at 1 μM and the universal oligonucleotide primer at 1 μM (see below for the sequence of the oligonucleotides), 2 μL of nuclease-free H2O, and 5 μl of Fast SYBR Green qPCR Master Mix (Applied Biosystems). Cel-miR-39 detection was performed according to the manufacturer's instructions (Applied Biosystems) in a final volume of 10 μL. Expression was considered undetectable with threshold cycle (Ct) value ≥ 40. Each sample was run in technical triplicate and a RT negative control without serum RNA was added in each qPCR plate. In addition, in each RT-qPCR plate, we performed s-RNY1-5p standard curve. For each sample, both cel-miR-39 and s-RNY1-5p RT-qPCR reactions were carried out (Supplemental Tables 2 and 3).
The cel-miR-39 spike-in measured by RT-qPCR did not vary in all samples (Supplemental Table 2), suggesting an equal yield of RNA purification. These results allowed us to normalize the s-RNY1-5p qPCR data by volume of serum (Supplemental Table 3). Copy number quantification in each sample relied on the standard curve method (Supplemental Fig. 1). Standard curve was generated by the StepOne software (ThermoFisher Scientific, MA) based on the concentration of synthetic s-RNY1-5p (Supplemental Fig. 1). The automatic threshold function was applied to determine the Ct for each sample and calculate the absolute number of copies. Data were normalized to the initial serum volume (Supplemental Table 3). None value was identified as outlier with the ROUT method (Q = 0.1%). Primer sequences used in this study are shown in the Supplemental Table 4.
Statistical analysis. Categorical (qualitative) variables are presented as proportions. Continuous variables are displayed as means and standard deviations (SD). Kurtosis and skewness tests were used to test for the assumption of normality and log-transformation was performed for some variables to improve the normality of the distribution. We first described and compared characteristics of participants according to the case-control status. Categorical variables were compared between groups using the McNemar's test. Student's t-test was used to compare the distributions of normally distributed continuous data. Wilcoxon signed rank sum test was used when distribution departed from normality, or when homoscedasticity was rejected.
Cumulative risk of patients for coronary event (fatal and non-fatal) was determined by the Kaplan-Meier method and compared using the Log-rank test for the individual endpoints of coronary events.
The relation between baseline levels of s-RNY1-5p and the case-control status was assessed using Cox proportional hazards regression analysis. Cox regression analyses were performed with adjustment on classical cardiovascular risk factors (age, smoking, dyslipidemia, hypertension and diabetes as well as for systolic and diastolic blood pressure, total cholesterol and HDL-C). We tested the proportionality assumption using cumulative sums of martingale-based residuals. We performed regression analysis with polynomial models (quadratic and cubic) to examine for possible non-linear relationship between s-RNY1-5p and occurrence of coronary event.
All statistical analyses were carried out using the SAS statistical software package 9.4 (SAS Institute, Cary, NC) or STATA statistical software, release 14.2 (STATA Corp, College Station, TX). All tests were two-sided and were considered significant at a p value < 0.05.

Results
The Nested Case-Control study we used here, comprised 62 men from the GENES study, CHD-free at baseline. The median follow-up period was 11.9 ± 4.1 years.
Comparison of individual's data at the recruitment in the cohort is given in Table 1. Results are presented distinguishing two groups: the ''Controls group'' (individuals with no coronary event during the follow-up, n = 31) and the ''Cases group'' (individuals who developed an acute coronary event during the follow-up, n = 31). During the follow-up period, half of the Cases (n = 16) had a fatal event while a non-fatal acute coronary syndrome such as unstable angina and non-fatal myocardial infarction was reported for the other half (n = 15). There were no significant differences between biochemical and physiological variables among the Cases and Controls, except for TG levels that were significantly higher in Cases (p = 0.03). No significant difference between Cases and Controls was noted with regards to the presence of diabetes and dyslipidemia at baseline. The proportion of Cases diagnosed with hypertension tended to be higher than in Controls (35.5% versus 19.4%, p = 0.14). Similar observations were made according to the treatment status. Overall, the present cohort population was representative of the general male population of this age range in the early 2000s in Southern France in terms of biological variables and treatments 10 .
One Control and its paired Case were excluded from sRNY1-5p analyses (see "Methods" section). As shown in Fig. 2 and Table 2, serum levels of s-RNY1-5p at baseline, expressed in number of molecules per mL, were significantly higher in Cases ( Fig. 2 and Table 2). Accordingly, the percentage of individuals with s-RNY1-5p above the median value (631 molecules/mL) was also significantly higher in Cases compared to Controls (70.0% versus 33.3%, p = 0.022). The relation between baseline levels of s-RNY1-5p and the case-control status was assessed using Cox proportional hazards regression analysis. As reported in  www.nature.com/scientificreports/ increased more than fourfold in individuals with serum s-RNY1-5p above the median value (HR = 4.33 [95% CI 1.23-15.30]). Significant association with CHD events was also observed when considering s-RNY1-5p as a continuous variable. Indeed, the adjusted HR for coronary event per increase of 10 molecules/mL was 1.03 [95% CI 1.01-1.05], p = 0.022. This positive association of s-RNY1-5p with CHD events was maintained following further adjustments for systolic and diastolic blood pressure, total cholesterol and HDL-C (Model 2, HR = 1.04 [95% CI 1.01-1.08], p = 0.039). Then, cumulative risk of patients for coronary event (fatal and non-fatal) was determined by the Kaplan-Meier method according to the median s-RNY1-5p value (631 molecules/mL) and compared, using the Log-rank test for the individual endpoints of CHD events (Fig. 3). This analysis indicated that the proportion of CHD event-free individuals was significantly lower in individuals with s-RNY1-5p level above the median value (Fig. 3, Log-rank test: Chi 2 = 4.8, p = 0.017). The difference between the two groups was particularly evident for the follow-up period > 7.5 years. Thus, high levels of s-RNY1-5p are associated with an increased risk of developing a first coronary event.

Discussion
s-RNYs-5p represent a large proportion of all small RNAs expressed in human serum and plasma 11 . They circulate as part of a complex with a mass between 100 and 300 kDa 11 , which contains the protein Ro60 2 . Specific subtypes of s-RNYs-5p have been found dysregulated in the serum of patients with breast cancer and have been proposed as novel biomarkers for diagnosis in this pathology 12 , as well in head and neck squamous cell carcinoma 13 . In 2015, we found that s-RNY1-5p is produced by macrophages and that its relative expression levels in serum, normalized by endogenous or spike-in controls with the 2 −∆∆Ct method, was significantly upregulated in CHD patients compared to age-matched CHD-free individuals 2 .
The goal of the present study was to evaluate whether serum s-RNY levels could be useful for primary prevention of CHD. To this aim, for the first time, we have developed and validated a method to quantify the absolute expression of s-RNY1-5p in human serum. Based on a nested case-control study of CHD-free individuals at baseline and followed for a median duration of 12 years, we observed that serum s-RNY1-5p levels are higher in Table 2. Serum levels of sRNY1-5p at baseline. Serum sRNY1-5p levels were measured at baseline (i.e. when individuals were first included in the GENES cohort). Values are expressed as mean ± SD or percentage. a Paired Student's t-test. b McNemar's test. Log-rank test chi2=4.8 p=0.017 www.nature.com/scientificreports/ individuals who developed an acute coronary event during the follow-up. Notably, we observed that s-RNY1-5p levels were positively associated to CHD events, independently of classical risk factors. Interestingly, s-RNY1-5p seems to predict long-term and not short-term incidence of CHD. Since the expression of s-RNY1-5p in macrophages was induced by pro-apoptotic and atherogenic stimuli, we hypothesized that s-RNY1-5p levels might reflect the local inflammatory condition of the vascular cells 2 . Thus, s-RNY1-5p might be associated to a low-grade inflammatory state of the vascular wall having deleterious effect in the long-term. Among the small non-coding RNAs, microRNAs (miRNAs) have also become potential biomarkers for human pathologies, including cardiovascular disorders 14 . miRNAs are ~ 20 nucleotides long endogenous noncoding RNAs abundantly expressed in virtually all human cells and associated with AGO proteins to repress gene expression at the post-transcriptional level 15 . miRNAs are localized in mostly all cellular organelles and are also present in the extracellular environment 16 . More than 100 miRNAs have been detected in serum and designed as circulating miRNAs 14 . Because of their extracellular stability, measurement of circulating miRNA dysregulation has been proposed as novel method for the diagnosis of several human pathologies, including CHD 14 . However, to date, none of the dysregulated circulating miRNAs may predict cardiovascular accidents in long-range forecast. To the best of our knowledge, the present study is the first one demonstrating the potential role of a small noncoding RNA in predicting CHD events, underlining therefore the originality and importance of our findings.
The prevalence of CHD is increasing and the burden on the healthcare system stems from late diagnosis. In this context, the importance of early identification and preventative intervention is being recognized worldwide. The recommended method for risk screening according to the European Society of Cardiology (ESC) Guidelines for the diagnosis and management of chronic coronary syndromes is the Systematic COronary Risk Evaluation (SCORE) 17 . SCORE assesses a combination of clinical parameters (age, gender, smoking status, systolic blood pressure) and blood-based biomarkers (lipid panel of total cholesterol and HDL-C) to estimate a 10-year risk of fatal CHD. However, both the ECS and American College of Cardiology/American Heart Association (ACC/ AHA) recognize that current risk evaluation methods are unreliable and have limitations. The strongest pain point experienced by the healthcare system is diagnosis of patients classified as 'moderate risk' following SCORE evaluation. These patients represent a large portion of the population, commonly those who are 35-50 years and diabetics. Primary care physicians then need to reclassify these patients as high or low risk by referring them to clinics or hospitals for expensive cardiac diagnostic testing. As a result, often the prevalence of CHD is overestimated, wasting costly and strained healthcare resources. Therefore, a more accurate biomarker that complements the lipid panel and SCORE assessment would improve the overall predictive performance of CHD risk evaluation. In this respect, our present results indicated that sRNYs, and particularly s-RNY1-5p, could address this unmet need for an early non-invasive biomarker that increase accuracy of CHD risk assessment.
One of the strengths of our study was the use of a matched control group of individuals who did not develop CHD during the follow-up period. However, several limitations of the present study must be noted. Firstly, the small sample size of the study is a limitation. Particularly, results presented here do not allow to address how much s-RNY1-5p adds to the risk prediction of CHD beyond a clinically used risk score. Thus, further measurements are foreseen within the framework of a large prospective cohort to establish more firmly the predictive value of s-RNY1-5p determinations. Yet, our study is the first to report the positive association between s-RNY1-5p upregulated levels with the prediction of CHD events. Secondly, our study was designed only with men, which has the advantage of recording a larger number of events than in a mixed all-gender cohort. Particularly, this cohort is composed of participants living in Toulouse area (South-west France), in which there is low cardiovascular mortality rate in women before 65 years of age 18 , thus limiting the translatability of our results to women.
To conclude, we reported here that s-RNY1-5p was positively and independently associated with CHD events in a general male population. Those results argue in favor of s-RNY1-5p being a novel predictive molecular biomarker for cardiovascular events. The present study represents a step forward towards a critical assessment of circulating small non-coding RNAs as biomarkers for prediction of CHD. In the future, measurement of s-RNY1-5p expression levels and other 5′ s-RNY, such as s-RNY4-5p 2 , could be used in clinical practice in addition to classical risk factors to identify those high-risk individuals who might benefit from prevention medicine.