Introduction

“The cure of many diseases remains unknown to the physicians because they do not study the whole person.”

  • Socrates

Suicides are preventable tragedies that continue to occur, and in fact have increased in number in recent years [1]. In the US, suicide rates increased approximately 36% between 2000 and 2021. Suicide was responsible for 48,183 deaths in 2021, which is about one death every 11 minutes (CDC, 2023). The number of people who think about or attempt suicide is even higher. In 2021, an estimated 12.3 million American adults seriously thought about suicide, 3.5 million planned a suicide attempt, and 1.7 million attempted suicide (SAMHSA, 2022).

Our previous studies pioneered the identification of blood gene expression biomarkers for suicidality [2,3,4,5,6,7], and other groups have validated this blood-based approach as well [8]. These gene expression studies are complementary to other genetic studies in the field [9,10,11], and in fact we integrate these different lines of work into our approach, as a convergent prioritization second step. We also developed and tested a 22-item, simple-to-administer, quantitative risk evaluation and mitigation questionnaire called Convergent Functional Information for Suicidality (CFI-S), which focuses on social determinants and other known risk factors, and does not ask about current suicidal ideation [12, 13]. We wanted to expand upon those studies, as a way of deriving future scientific and practical insights that would move these precision medicine approaches towards widespread utilization in clinical practice. First, we used larger cohorts of psychiatric patients for our biomarker studies (discovery, validation, and testing). Second, we used larger literature-derived databases for our convergent prioritization approaches, and for the annotation of our biomarkers for drug modulatory effects. Third, we now have a longer duration of follow-up on subjects, and also used longitudinal, not just cross-sectional, calculations for predictions. Fourth, we used newer methodologies, such as RNA sequencing, as well as machine learning [14], to generate and analyze data. Fifth, we derived from the data a deeper biological, clinical, and therapeutic understanding of suicidality. Lastly, we show examples of practical applications, including bio-socio-psychological integration. We propose that such approaches could and should be used in clinical practice, to stem and reverse the tide of suicides.

Materials and methods

Ethics approval and consent to participate

All methods were performed in accordance with the relevant guidelines and regulations.

All subjects understood and signed informed consent forms detailing the research goals, procedure, caveats and safeguards, per Indiana University IRB approved protocol (“Mood State and Other Biomarkers: a Discovery-Based Approach”, protocol no. 1011004024). No identifiable information or images are included in the publication.

Cohorts

We used three independent cohorts: discovery (a live psychiatric subjects cohort), validation (a postmortem coroner’s office suicide completers cohort), and testing (an independent live psychiatric subjects test cohort for predicting suicidal ideation, and for predicting future hospitalizations for suicidality) (Fig. 1A).

Fig. 1: Steps 1-3: discovery, prioritization, validation and testing of biomarkers for suicidality.

A Cohorts used in study, depicting flow of discovery, prioritization, and validation of biomarkers from each step. B Prioritization using Convergent Functional Genomics (CFG). C Validation: biomarkers are assessed for stepwise change from discovery subjects with no symptoms, to discovery subjects with high symptoms, to the validation subjects whose samples were collected from suicide completers, using ANOVA. The histograms depict a top increased (I) and a top decreased (D) biomarker. D Number of probesets and scoring at each of the Steps. Step 1, Discovery: probesets are identified based on their score for tracking symptoms and ranked 33.3% (2 pt), 50% (4 pt) and 80% (6 pt). Step 2, Prioritization: CFG for prior evidence of involvement in suicidality. Maximum of 12 pt. Genes scoring at least 6 pt out of a maximum possible of 18 pt after Discovery and Prioritization are carried forward to the validation step. Step 3, Validation: in an independent cohort of suicide completers. We selected the markers with a top CFE score ≥8 (n = 2340) for further testing and characterization. E Predictions for State: High Suicidality. Top cross-sectional and longitudinal markers are shown in all subjects, males, and females. Table below displays number of significant markers within each prediction group by AUC. F Predictions for Trait: Hospitalizations in the First Year. Top cross-sectional and longitudinal markers are shown in all subjects, males, and females. Table below displays number of significant markers within each prediction group by AUC. G Predictions for Trait: All Future Hospitalizations. Top cross-sectional and longitudinal markers are shown in all subjects, males, and females. Table below displays number of significant markers within each prediction group by odds ratio.

Similar to our previous studies [2,3,4, 6], the live psychiatric subjects are part of a larger longitudinal cohort of adults that we are continuously collecting, the Indy500+ cohort. Subjects are recruited from the patient population at the Indianapolis VA Medical Center and Indiana University School of Medicine. All subjects understood and signed informed consent forms detailing the research goals, procedure, caveats and safeguards, per Indiana University IRB approved protocol (“Mood State and Other Biomarkers: a Discovery-Based Approach”, protocol no. 1011004024). Subjects completed up to nine testing visits, 3–6 months apart or whenever a new psychiatric hospitalization occurred. At each testing visit, they were administered a series of psychiatric rating scales, including the Convergent Functional Information for Suicidality (CFI-S) and the Hamilton Rating Scale for Depression-17, which includes a suicidal ideation (HAMD-SI) rating item (Fig. S1), and their blood was drawn. We collected whole blood (5 ml) in two RNA-stabilizing PAXgene tubes, labeled with an anonymized ID number, and stored them at −80 °C in a locked freezer until the time of future processing. Whole-blood RNA was extracted from the PAXgene tubes for microarray and RNA sequencing gene expression studies, as detailed below.

For this study, our within-subject discovery cohort, from which the biomarker data were derived, consisted of 90 subjects (69 males, 21 females) with psychiatric disorders and multiple testing visits, who each had at least one diametric change in SI scores from no SI to high SI (HAMD-SI ≥ 2), or vice versa, from one testing visit to another. There were 4 subjects with 6 visits each, 8 subjects with 5 visits each, 7 subjects with 4 visits each, 39 subjects with 3 visits each, and 32 subjects with 2 visits each resulting in a total of 273 blood samples for subsequent gene expression studies (Fig. 1, Table S1).

Our postmortem cohort (n = 101), in which the top biomarker findings were validated for behavior, consisted of 83 male and 18 female suicide completers obtained through the Marion County coroner’s office (Table S1). We required a last observed alive postmortem interval of 24 h or less, and the majority of cases selected completed suicide by means other than overdose, since overdose could potentially affect gene expression. Of these cases, 59 completed suicide by gunshot to head or chest, 27 by hanging, 4 by jumping, 2 by slit wrist, 2 by train, 1 by asphyxiation, 1 by electrocution, 2 by carbon monoxide poisoning, and only 2 by overdose. Next of kin signed informed consent at the coroner’s office for donation of blood for research.

Our independent test cohort for predicting suicidal ideation (Table S1) consisted of 330 male and 76 female subjects with psychiatric disorders, with one or multiple testing visits in our lab, with either no SI, intermediate SI, or high SI, resulting in a total of 820 blood samples in which whole-genome blood gene expression data were obtained (Fig. 1, Table S1).

Our test cohort for predicting future hospitalizations within one year of testing (Fig. 1, Table S1) is a subset (279 males, 47 females) of the independent test cohort for which we had longitudinal follow-up with electronic medical records for at least 365 days resulting in a total of 685 samples. The subjects’ subsequent number of psychiatric hospitalizations, with or without suicidality (planning, ideation or attempt), was tabulated from electronic medical records. Subjects were evaluated for the presence of future hospitalizations for suicidality within one year. A hospitalization was deemed to be without suicidality if suicidality was not listed as a reason for admission, and no SI was described in the admission and discharge medical notes. Conversely, a hospitalization was deemed to be due to suicidality if suicidal acts or intent were listed as a reason for admission, and/or SI was described in the admission and discharge medical notes.

Our test cohort for predicting all future hospitalizations (Fig. 1, Table S1) is a subset (310 males, 55 females) of the independent test cohort for which we also had longitudinal follow-up with electronic medical records resulting in a total of 745 samples. The subjects’ subsequent psychiatric hospitalizations were tabulated in the same way as for hospitalizations in the first year, but for all time following their testing visit. Subjects also were evaluated for the presence of future hospitalizations for suicidality, and for the frequency of such hospitalizations.

Medications

The subjects in the discovery cohort were all diagnosed with various psychiatric disorders and had various medical co-morbidities. Their medications were listed in their electronic medical records and documented by us at the time of each testing visit. Medications can have a strong influence on gene expression. However, there was no consistent pattern of any particular class of medication. Our subjects were on a wide variety of different medications, psychiatric and non-psychiatric. Furthermore, the gene expression data of the independent validation and testing cohorts were Z-scored by gender before being combined, to normalize for any such effects. Some subjects may be non-compliant with their treatment and may have changes in medications or drugs of abuse not reflected in their medical records. Our goal is to find biomarkers that track suicidality, regardless of whether it is due to internal biology or driven by exogenous substances or medication. In fact, one would expect some of these biomarkers to be targets of medications, as we show in this paper. Moreover, the prioritization step that occurs after discovery is based on a field-wide convergence with literature that includes genetic data, which are unrelated to medication effects. Overall, with our design, the discovery, validation, and replication by testing in independent cohorts of the biomarkers occurs despite the subjects having different genders and diagnoses, being on various medications, and differing in other variables.

Blood gene expression experiments

RNA extraction

Whole blood (2.5 ml) was collected into each PaxGene tube by routine venipuncture. PaxGene tubes contain proprietary reagents for the stabilization of RNA. Total RNA was extracted and processed as previously described [2,3,4, 6].

Microarrays

Microarray work was carried out using previously described methodology on a subset of subjects (n = 794) [2,3,4, 6].

Of note, all genomic data was normalized (RMA for technical variability, then z-scoring for biological variability—by gender), before being combined and analyzed.

RNA sequencing

Next-generation RNA sequencing studies were carried out on the rest of the subject samples collected more recently (n = 248). We then endeavored to integrate the two platforms.

Biomarker analyses

Step 1: Discovery

Analyses were done separately for Affymetrix data and for RNAseq data. Results were then integrated for a final discovery score.

For the Affymetrix dataset, a differential expression (DE) analysis was run at the probeset level, generating a raw score for each marker, and points were given when the probeset expression accurately corresponded to any changes in suicidality (from no to high, or high to no). The DE analysis detects how gradual changes in gene expression track suicidality.

For the RNAseq dataset, a DE analysis was run at the transcript level. Similarly, this analysis created a raw score for each transcript, which corresponded to its ability to track suicidality. For the RNAseq data, we substituted the 0 values with the next lowest non-zero values, so that the fold change calculations from visit to visit did not break in case of two consecutive zeros. Ensembl 110 was then used to match transcripts to probeset(s). The average score from all transcripts corresponding to a probeset was generated. We only carried forward probesets that mapped to transcripts (n = 46,600), thus discarding probesets that did not map to transcripts as not being suitable for future practical applications, and transcripts that did not have probesets (n = 53,002) due to the hybrid integrated nature of our current approach. We anticipate that in the future, when our studies consist of RNAseq data only, we will not have to discard any transcripts and will thus have a richer dataset.
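The zero-substitution step described above can be illustrated with a minimal Python sketch. It assumes that the "next lowest non-zero value" is the smallest non-zero value in the series being processed; the function names and the per-series scope are illustrative assumptions, not the code used in the study.

```python
def replace_zeros(values):
    """Replace zeros with the smallest non-zero value in the series.

    Assumption: the substitution scope is the series passed in
    (for example, one transcript across a subject's visits).
    """
    nonzero = [v for v in values if v != 0]
    if not nonzero:
        raise ValueError("series has no non-zero values to substitute")
    floor = min(nonzero)
    return [floor if v == 0 else v for v in values]


def fold_changes(values):
    """Visit-to-visit fold change (current / previous) after zero substitution."""
    filled = replace_zeros(values)
    return [curr / prev for prev, curr in zip(filled, filled[1:])]
```

Without the substitution, two consecutive zeros would produce a 0/0 fold change; with it, the ratio is simply 1.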

Within each probeset, the score for each probeset from the microarray discovery cohort was summed with the average score of the transcripts corresponding to each probeset from the RNA seq discovery cohort, creating a final score for each probeset. A negative score indicated a decrease in expression biomarker, and a positive score an increase in expression biomarker.

A value percentile (separate for increased and decreased markers) was then assigned to each probeset based on its final score. The percentile scores were given points by thresholds (≥80%: 6 pts; ≥50%: 4 pts; ≥33.3%: 2 pts; <33.3%: 0 pts) (Fig. 1). Only probesets with greater than 2 pts move on to the next step, Prioritization.
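As a hedged illustration of the threshold scoring, the Python sketch below maps final scores to discovery points. It assumes that the value percentile is the magnitude of a probeset's final score expressed as a percentage of the largest magnitude among markers changed in the same direction; that definition and all function names are assumptions made for the example.

```python
def value_percentiles(final_scores):
    """Express each score as a percentage of the top score in its direction.

    Assumption: increased (positive) and decreased (negative) markers are
    each scaled against their own maximum magnitude.
    """
    max_up = max((s for s in final_scores.values() if s > 0), default=0.0)
    max_down = max((-s for s in final_scores.values() if s < 0), default=0.0)
    percentiles = {}
    for probeset, score in final_scores.items():
        if score > 0:
            percentiles[probeset] = 100.0 * score / max_up
        elif score < 0:
            percentiles[probeset] = 100.0 * (-score) / max_down
        else:
            percentiles[probeset] = 0.0
    return percentiles


def discovery_points(percentile):
    """Map a value percentile (0-100) to discovery points per the thresholds."""
    if percentile >= 80.0:
        return 6
    if percentile >= 50.0:
        return 4
    if percentile >= 33.3:
        return 2
    return 0
```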

Step 2: Prioritization

Probeset to gene mapping for CFG prioritization step

After the discovery step, we mapped probesets to corresponding genes through Ensembl 110. The majority of probesets map to a single gene. However, some probesets map to multiple genes. Each probeset-gene pairing was carried separately through the subsequent analyses. Probesets that did not have a gene match from Ensembl were further queried through the NetAffy database. Furthermore, if multiple or no gene matches were identified with NetAffy, those probesets were run through the UCSC Genome Browser (https://genome.ucsc.edu/). Finally, all genes thus identified as corresponding to probesets were checked against the GeneCards database to ensure the gene symbols were up to date.

Databases

We have established in our laboratory (Laboratory of Neurophenomics, www.neurophenomics.info) manually curated databases of the human gene expression/protein expression studies (postmortem brain, peripheral tissue/fluids: CSF, blood and cell cultures), human genetic studies (association, copy number variations and linkage), and animal model gene expression and genetic studies, published to date on psychiatric disorders. Only findings deemed significant in the primary publication, by the study authors, using their particular experimental design and thresholds, are included in our databases. Our databases include only primary literature data and do not include review papers or other secondary data integration analyses to avoid redundancy and circularity. We also favored unbiased discovery studies over candidate genes hypothesis-driven studies. These large and constantly updated databases have been used in our CFG cross validation and prioritization platform (Fig. 1).

Data from 551 papers on suicidality were present in the databases at the time of the CFG analyses (human genetic studies-231, human brain studies-204, human peripheral tissue/fluids-116). We have developed in our lab a computerized CFG Wizard to automate and score in bulk large lists of genes by integrating evidence from these large databases, checked against manual scoring. Analyses were performed as previously described. Points were assigned to each gene (Human Brain Expression Evidence – 6pts, Human Peripheral Expression Evidence– 4pts, Human Genetic Evidence -2pts). Points were added to the Discovery score (0-6). There were n = 2436 markers with a combined score (discovery + prioritization) of 6 or above, which were progressed to the next step, Validation (Fig. 1).
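The point scheme above amounts to a simple additive score. The following Python sketch is illustrative only: the evidence labels and function names are hypothetical, and it is not the CFG Wizard itself.

```python
# Points per line of prior evidence, as listed in the text.
CFG_POINTS = {
    "human_brain": 6,
    "human_peripheral": 4,
    "human_genetic": 2,
}


def cfg_score(evidence):
    """Sum prioritization points over the evidence types found for a gene."""
    return sum(CFG_POINTS[kind] for kind in set(evidence))


def combined_score(discovery_pts, evidence, threshold=6):
    """Return (discovery + prioritization score, carried forward to Validation?)."""
    total = discovery_pts + cfg_score(evidence)
    return total, total >= threshold
```

A gene with all three lines of evidence receives the maximum of 12 prioritization points, so the maximum combined score after the first two steps is 18.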

Step 3: Validation

Affy and RNAseq data integration

Validation analyses were run on a hybrid dataset of microarray and RNAseq data compiled into one. For the RNAseq samples, we used a sum of transcript TPM approach for all the transcripts corresponding to each probeset. This virtual probeset metric represents the equivalent of the expression of the probeset, allowing for adequate integration of RNAseq and microarray data into one dataset.

Housekeeping gene selection and data normalization

To normalize and account for technical variance, as well as for potential post-mortem effects in the postmortem Validation cohort, a housekeeping gene was selected to use for normalization. We compiled a list of the most used candidate housekeeping genes in the literature and all their corresponding probesets (n = 35). Not all classic housekeeping genes are “housekeeping”, i.e. invariant or biologically not involved in the disorder, depending on the phenotype. An example of that is GAPDH [15]. So an empirical approach for each disease/phenotype/tissue is needed. The probeset with the lowest score after Discovery and integration of Affymetrix and RNAseq data (indicating the most invariance) was used as our housekeeping probeset (ACTB-224594_x_at). For the RNAseq samples, the TPM counts from all 14 transcripts of this probeset were summed up to generate a virtual probeset with a single value.

The microarray expression data probesets and the RNAseq expression data virtual probesets were divided by the housekeeping probeset/virtual probeset expression levels.
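A minimal Python sketch of the virtual probeset and housekeeping normalization follows. The transcript identifiers and function names are hypothetical, and treating a transcript that is absent from a sample as 0 TPM is an assumption.

```python
def virtual_probeset(tpm_by_transcript, transcripts):
    """Sum TPM over all transcripts mapped to one probeset.

    Assumption: transcripts missing from the sample count as 0 TPM.
    """
    return sum(tpm_by_transcript.get(t, 0.0) for t in transcripts)


def normalize_to_housekeeping(expression, housekeeping_expression):
    """Divide a (virtual) probeset level by the housekeeping level in the same sample."""
    if housekeeping_expression <= 0:
        raise ValueError("housekeeping expression must be positive")
    return expression / housekeeping_expression
```

For example, a sample with transcripts at 1.5 and 2.5 TPM mapped to one probeset has a virtual probeset value of 4.0; with a housekeeping virtual probeset of 8.0, the normalized level is 0.5.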

Scaling factor

The final step to integrate Affymetrix and RNAseq data involves a scaling factor generation between the two platforms, to be able to bring data quantitatively to the same level/scale. Separate scaling factors were created for males and females. This process was done by taking the average expression for each probeset in the microarray data, as well as for each corresponding virtual probeset in the RNAseq data. The microarray average was then divided by the RNAseq average to generate a scaling factor. The RNAseq expression levels of each individual virtual probeset in each individual subject sample were then multiplied by their corresponding gender scaling factor.

$${\rm{Scaling\; Factor\; for\; each\; probeset}}({\rm{by\; gender}})=\frac{{\rm{Microarray\; Average}}\left({\rm{Intensity}}\right)}{{\rm{RNAseq\; Average}}\left({\rm{TPM}}\right)}$$
$${\rm{Scaling\; Factor}}\times {\rm{Virtual\; Probeset\; Expression}}\left({\rm{TPM}}\right){\rm{in\; a\; subject}}={\rm{Scaled\; Expression}}$$

These equations depict how the Scaling Factor converts RNAseq expression TPM counts into the same intensity-based levels as the microarray data.
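The scaling step can be sketched in Python as below. This is an illustration under the stated definitions, with hypothetical function names; in the study the factor is computed per probeset and separately for males and females, so the inputs here would be the values for one probeset within one gender.

```python
from statistics import mean


def scaling_factor(microarray_intensities, rnaseq_tpm):
    """Microarray average (intensity) divided by RNAseq average (TPM).

    Inputs are the values for one probeset within one gender.
    """
    return mean(microarray_intensities) / mean(rnaseq_tpm)


def scale_rnaseq(virtual_probeset_tpm, factor):
    """Bring one subject's virtual probeset TPM onto the microarray scale."""
    return factor * virtual_probeset_tpm
```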

Validation analysis

Three groups were used for Validation Analyses: the No SI (HAMD-SI = 0) and High SI (HAMD-SI ≥ 2) groups from the Discovery cohort, along with the suicide completers from the Coroner’s Office.

Expression levels from the hybrid dataset of samples run on Affymetrix (probesets) and RNAseq (virtual probesets) were normalized using the housekeeping gene and scaling factors, and then z-scored by gender across the 3 groups. We carried out an ANOVA in the biomarkers that were stepwise changed in expression from No SI to High SI to Suicide Completion. Biomarkers that survived Bonferroni correction for number of biomarkers tested received 6 points, those nominally significant 4 points, and those that were just stepwise 2 points. The rest were 0 points. These points contributed to each marker’s CFE3 score (discovery + prioritization + validation) (Fig. 1).
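The validation scoring rule can be written compactly. The Python sketch below takes an ANOVA p-value as input rather than computing it, and assumes a nominal significance level of 0.05, which is not stated in the text; function and parameter names are illustrative.

```python
def is_stepwise(no_si_mean, high_si_mean, completers_mean, direction):
    """Check for a monotonic change from No SI to High SI to suicide completion."""
    if direction == "increased":
        return no_si_mean < high_si_mean < completers_mean
    if direction == "decreased":
        return no_si_mean > high_si_mean > completers_mean
    raise ValueError("direction must be 'increased' or 'decreased'")


def validation_points(stepwise, p_value, n_tested, alpha=0.05):
    """6 pts if Bonferroni significant, 4 if nominal, 2 if only stepwise, else 0.

    Assumption: alpha = 0.05 is the nominal threshold.
    """
    if not stepwise:
        return 0
    if p_value <= alpha / n_tested:
        return 6
    if p_value <= alpha:
        return 4
    return 2
```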

Top candidate biomarkers (after the first 3 steps)

Adding the scores from the first three steps into an overall convergent functional evidence (CFE) score (Fig. 1), we ended up with a list of 2340 top candidate biomarkers for suicidality that had a CFE score of 8 or greater (1/3 of the possible maximum score of 24 after the first 3 steps). These top candidate biomarkers were carried forward into additional testing for clinical utility (Step 4).

Testing for clinical utility in independent cohorts

We tested in independent cohorts of psychiatric patients the ability of each of the top candidate biomarkers to assess state severity (measured by HAMD-SI ≥ 2), and predict trait risk (future hospitalizations for suicidality in the first year of follow-up, and in all future years of follow-up). We conducted our analyses across all patients, as well as personalized by gender.

The test cohorts for predicting suicidality severity (state), and the test cohorts for predicting future hospitalizations with suicidality (trait), were assembled out of hybrid datasets of Affymetrix and RNAseq samples that were integrated as described above for the Validation step (housekeeping gene normalization, scaling factor and Z-scoring by gender). The cohorts were completely independent from the discovery and validation cohorts; there was no subject overlap with them. Individual biomarkers used for predictions were normalized as described above to avoid potential artefacts due to different ranges of expression by gene expression platform and gender, and to be able to combine different biomarkers into panels. Predictions were performed using RStudio. For cross-sectional analyses, we used biomarker expression levels. For longitudinal analyses, we combined four measures: biomarker expression levels, slope (defined as ratio of levels at current testing visit vs. previous visit, divided by time between visits), maximum levels (at any of the current or past visits), and maximum slope (between any adjacent current or past visits), as described in previous studies [16,17,18]. For decreased biomarkers, we used the minimum rather than the maximum for level calculations.
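The four longitudinal measures can be sketched in Python as follows. The sketch assumes time is measured in days and that, for decreased biomarkers, the minimum is also used for the slope; the text specifies the minimum only for levels, so the slope handling is an assumption, as are the function and key names.

```python
def longitudinal_measures(levels, days, decreased=False):
    """Compute level, slope, extreme level and extreme slope up to the current visit.

    levels: expression at each visit, oldest first, current visit last.
    days:   visit times in days (assumed unit), strictly increasing.
    """
    if len(levels) != len(days) or len(levels) < 2:
        raise ValueError("need at least two visits with matching times")
    slopes = []
    for i in range(1, len(levels)):
        elapsed = days[i] - days[i - 1]
        if elapsed <= 0:
            raise ValueError("visit times must be strictly increasing")
        # Slope: ratio of current to previous level, divided by time between visits.
        slopes.append((levels[i] / levels[i - 1]) / elapsed)
    pick = min if decreased else max
    return {
        "level": levels[-1],
        "slope": slopes[-1],
        "extreme_level": pick(levels),
        "extreme_slope": pick(slopes),
    }
```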

Predicting state-suicidality severity

Receiver-operating characteristic (ROC) analyses between marker levels and suicidality state were performed by assigning subject visits with a HAMD-SI score ≥2 to the high suicidality category vs. the rest of the subject visits in this independent test cohort (406 subjects, 820 visits). We used the pROC package of R (Table 1 and Fig. 1). Additionally, a one-tailed t-test was performed between the high suicidality group vs. the rest, and Pearson R (one-tail) was calculated between suicidality scores and biomarker levels.
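For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen high-suicidality visit has a higher marker level than a randomly chosen visit from the rest, with ties counted as one half. The plain-Python sketch below illustrates that definition; it is not the pROC implementation used in the study, and the function names are illustrative.

```python
def is_high_suicidality(hamd_si):
    """High suicidality category as defined in the text (HAMD-SI >= 2)."""
    return hamd_si >= 2


def roc_auc(marker_levels, hamd_si_scores):
    """AUC as the fraction of (high, rest) visit pairs ranked correctly."""
    high = [m for m, s in zip(marker_levels, hamd_si_scores) if is_high_suicidality(s)]
    rest = [m for m, s in zip(marker_levels, hamd_si_scores) if not is_high_suicidality(s)]
    if not high or not rest:
        raise ValueError("both categories must contain at least one visit")
    wins = 0.0
    for h in high:
        for r in rest:
            if h > r:
                wins += 1.0
            elif h == r:
                wins += 0.5
    return wins / (len(high) * len(rest))
```

For a decreased biomarker, the same function can be applied to the negated marker levels.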

Table 1 Top biomarkers after 4 Steps of convergent functional evidence (CFE).

Predicting trait- future psychiatric hospitalization due to suicidality as a symptom/reason for admission

We conducted analyses for predicting future psychiatric hospitalizations with suicidality as a symptom/reason for admission in the first year following each testing visit, in subjects that had at least one year of follow-up in the VA system, in which we have access to complete electronic medical records (326 subjects, 685 visits). ROC analyses between biomarker measures (cross-sectional, longitudinal) at a specific testing visit and future hospitalizations were performed as described above, based on whether or not subjects had been admitted to the hospital with suicidality. Additionally, a one-tailed t-test with unequal variance was performed between groups of subject visits with and without future hospitalization with suicidality. Pearson R (one-tail) correlation was performed between hospitalization frequency (number of hospitalizations with suicidality divided by duration of follow-up) and marker levels. A Cox regression was performed using the time in days from the testing visit date to first hospitalization date in the case of patients who had been hospitalized, or 365 days for those who had not. The odds ratio was calculated such that a value greater than 1 always indicates increased risk for hospitalization, regardless of whether the biomarker is increased or decreased in expression.

We also conducted Cox regression and Pearson R analyses for all future hospitalizations with suicidality (365 subjects, 745 visits), including those occurring beyond one year of follow-up (up to 17.2 years, average: 7.8 years). The Cox regression was performed using the time in days from visit date to first hospitalization date in the case of patients who had hospitalizations with suicidality, or from visit date to last note date in the electronic medical records for those who did not. These calculations, unlike the ROC and t-test, account for the actual length of follow-up, which varied from subject to subject. The ROC and t-test might in fact, if used, under-represent the power of the markers to predict, as the more severe psychiatric patients are more likely to move geographically and/or be lost to follow-up.
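The time and event inputs for the two Cox regressions described above can be assembled as in the Python sketch below. It only prepares the data and does not fit the model. The function name is hypothetical, and treating a hospitalization that occurs after the one-year window as censored at 365 days is an assumption.

```python
from datetime import date


def time_to_event(visit, first_hospitalization, last_note, cap_days=None):
    """Return (time in days, event indicator) for one testing visit.

    cap_days=365 gives the first-year analysis; cap_days=None gives the
    all-future analysis, censored at the last note in the medical record.
    """
    if first_hospitalization is not None:
        days = (first_hospitalization - visit).days
        if cap_days is None or days <= cap_days:
            return days, 1
    if cap_days is not None:
        return cap_days, 0
    return (last_note - visit).days, 0
```

For example, a visit on 1 January 2020 followed by a hospitalization with suicidality on 1 March 2020 yields 60 days with an event in both analyses.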

Step 4 Predictions Scoring

Biomarkers that are nominally significant (for ROC AUC for State and First Year hospitalizations predictions, Cox Regression Odds Ratio for All Future Hospitalizations predictions) receive 4 points if they are predictive in all subjects in the cohort, and 2 points if they are only predictive within a gender. Scores are capped at 4, and the maximum score between cross-sectional and longitudinal predictions for each biomarker is carried forward. These points are then added to each biomarker’s CFE score to create a final score (discovery + prioritization + validation + state predictions + trait predictions) indicative of each marker’s ability to track and predict suicidality (Table 1). The maximum possible CFE4 score is 36 (6 + 12 + 6 + 12). We display in Table 1 the top biomarkers for suicidality with a CFE4 score of 26 and above (n = 30), a threshold chosen so that a marker cannot qualify with just maximum evidence from the first three steps; it has to have some evidence from Step 4 also (Fig. 1, Table 1). This overall score provides a degree of certainty that the high-scoring biomarkers are indeed involved in the disease, although the performance might be better if the biomarkers were discovered, validated and tested separately by gender [3, 4]. In view of that, for the reports (Fig. 2), we chose a panel of the best predictive markers by gender (Table S4B, C).
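The Step 4 scoring can be sketched in Python as below. The sketch reads "capped at 4" as 2 points per gender in which the marker is predictive, up to 4 per prediction type; that reading, the data layout and the names are assumptions made for illustration.

```python
def prediction_points(sig_all, sig_male=False, sig_female=False):
    """Points for one prediction type in one mode (cross-sectional or longitudinal)."""
    if sig_all:
        return 4
    # Assumption: 2 points per gender, capped at 4.
    return min(4, 2 * (int(sig_male) + int(sig_female)))


def step4_points(results):
    """Sum over prediction types of the better of the two modes.

    results maps prediction type -> mode -> (sig_all, sig_male, sig_female).
    """
    total = 0
    for by_mode in results.values():
        total += max(prediction_points(*flags) for flags in by_mode.values())
    return total


def cfe4_score(discovery, prioritization, validation, step4):
    """Final convergent functional evidence score after four steps (maximum 36)."""
    return discovery + prioritization + validation + step4
```

With three prediction types (state, first-year hospitalizations, all future hospitalizations) at up to 4 points each, Step 4 contributes at most 12 points.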

Fig. 2: Prototype reports and population Radar Plot.

Subject phchp328 (female, 37 years old) died by suicide by overdose a year after being tested by us. Subject phchp385 (male, 47 years old) died by suicide by hanging three years after being tested by us. A Prototype Report for phchp328v1. B Prototype Report for phchp385v1. Reports are based on panels of top predictive biomarkers for that gender. C, D Radar plots of Hospitalizations in the First Year following testing. Our individual subject scores (black line), as well as average scores for high risk subjects (red, n = 768) and average scores for low risk subjects (blue, n = 176).

Biological understanding

Pathway analyses

IPA (Ingenuity Pathway Analysis, version 107193442, Qiagen) and DAVID Functional Annotation Analysis (National Institute of Allergy and Infectious Diseases) (v2023q3) were used to analyze the biological roles, including top canonical pathways and diseases (Table 2A, B). We performed the pathway analyses for the 30 biomarkers for suicidality that were the top scoring CFE biomarkers after discovery, prioritization, validation, and testing.

Table 2 Biological Analyses A. Pathways. B. Diseases. C. Upstream Regulators. D. Genomic Co-Morbidity for Top Biomarkers. E. Therapeutics for Top Biomarkers.

Networks

For network analyses we performed STRING Interaction Network (https://string-db.org) by inputting the 30 genes into the search window, and performed Multiple Proteins Homo sapiens analysis (Fig. S3).

CFG beyond suicide: evidence for involvement in other psychiatric and related disorders

We also used a CFG approach to examine evidence from other psychiatric and related disorders, as exemplified for the list of top biomarkers after Steps 1–4 (Table S3). This was not used to prioritize genes, but rather to understand the molecular basis of co-morbidities. We also calculated genomic co-morbidity percentages based on the number of genes on our list that matched to different other disorders (Table 2D).

Therapeutics

Pharmacogenomics

We analyzed which of the top biomarkers for suicidality after Steps 1–4 (n = 30) are known to be changed in expression by existing drugs in a direction opposite to the one in disease, using our CFG databases (Table S4). These drugs and nutraceuticals are potential treatments and preventatives for patients with suicidality, and are used in the prototype reports (described below) to demonstrate personalized medicine. Drugs are also listed individually by biomarker affected (Table 1 and S4).

Drug repurposing using the connectivity map

Following biomarker identification and validation, the Connectivity Map was used to identify potential pharmaceuticals that alter the gene expression signature of the top biomarkers in a manner that opposes their alteration in suicidality. A Connectivity Map query was performed using the selected top biomarkers (n = 30) and L1000 parameters with the latest dataset [19]. Results were converted into a matrix using cMapR [20]. The results from the query were analyzed and sorted based on normalized connectivity score. Drugs that are experimental were removed (Table S7).

Report generation

We present examples of how reports to doctors might look, using the above insights. We chose as case studies two patients who were tested by us, and who we learned subsequently had died by suicide. We used a panel of the top predictive biomarkers for state and trait, by gender.

Step 5 - Generalizability

All biomarkers that were nominally significant in the predictions in Step 4 Testing were retested for predictive ability on the whole database (n = 1127), consisting of male (n = 893) and female (n = 234) groups. Subsequently, to create our biomarker panel for the reports, for each gender we took the 12 best predictive biomarkers for state, first year hospitalizations, and all future hospitalizations, resulting in a male and a female panel of 36 biomarkers each.

Scores generation

The raw expression values of the biomarkers in our whole gene expression dataset (n = 1127) were Z-scored by gender. For state score, the Z-scored expression value of each increased biomarker was compared to the average value for the biomarker in the high suicidality group in the database, resulting in scores of 1 or 0 respectively, and 0.5 if it is in between. The reverse was done for decreased biomarkers. For trait chronic risk score, we calculated the average expression value for a biomarker in the first-year hospitalizations for suicidality group, and in the not hospitalized in the first-year group, and for all future hospitalizations for suicidality group, and no future hospitalizations group. We then compared the biomarkers for the subject of interest to these reference levels. For increased biomarkers, if a biomarker was higher than the average of the high group it got a 1, if it was below the average of the no group it got a 0, and if it was in between, it got a 0.5. For decreased biomarkers, if it was lower than the average of the high group it got a 1, if it was higher than the average of the no group it got a 0, and if it was in between it got a 0.5. These digitized scores for each biomarker are multiplied by the CFE4 score of each biomarker as a weight, to account for the totality of evidence, then summed into a polygenic risk score and then divided by the sum of all CFE4 scores.

The suicidality state risk score is the average score of all the state biomarkers multiplied by 100, generating 3 risk categories: high (red), intermediate (yellow), and low (green). The chronic suicidality risk score was calculated the same way using biomarkers for first year and for all future hospitalizations due to suicidality. These percentile scores of the patient are provided in the report (Fig. 2).

The digitized biomarkers are also used for matching with existing psychiatric medications and alternative treatments (nutraceuticals and others). We use our large datasets and literature databases to match biomarkers to medications that have effects on gene expression opposite to their expression in high suicidality. The evidence for these medication effects is from gene expression data in human and animal models. Each medication matched to a biomarker gets the biomarker’s score of 1, 0.5, or 0. The scores for each medication are added, and divided by the number of biomarkers that were 1 or 0.5 in that patient, resulting in a percentile match. Thus, psychiatric medications are matched to the patient and ranked in order of impact on the panel.
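The matching arithmetic can be illustrated as follows. This is a hedged sketch under stated assumptions: the drug names, gene names, and the drug-to-biomarker mapping are hypothetical placeholders, and `medication_match` is an illustrative name rather than a function from the study.

```python
def medication_match(digitized, drug_targets):
    """Rank drugs by percentage match to the patient's altered biomarkers."""
    # Biomarkers scored 1 or 0.5 in this patient form the denominator.
    altered = [gene for gene, score in digitized.items() if score > 0]
    if not altered:
        return []
    ranking = []
    for drug, genes in drug_targets.items():
        # Each matched biomarker contributes its digitized score (1, 0.5 or 0).
        total = sum(digitized.get(gene, 0.0) for gene in set(genes))
        ranking.append((drug, 100.0 * total / len(altered)))
    return sorted(ranking, key=lambda item: item[1], reverse=True)


# Hypothetical patient panel and drug-to-biomarker annotations.
digitized = {"GENE_A": 1.0, "GENE_B": 0.5, "GENE_C": 0.0, "GENE_D": 1.0}
drug_targets = {
    "drug_x": ["GENE_A", "GENE_B"],
    "drug_y": ["GENE_D"],
    "drug_z": ["GENE_C"],
}
ranking = medication_match(digitized, drug_targets)
for drug, percent in ranking:
    print(f"{drug}: {percent:.1f}%")
```

Here three biomarkers are altered in the patient, so drug_x, which matches scores of 1 and 0.5, reaches a 50% match and ranks first.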

Hierarchical clustering and subtypes

A two-way unsupervised hierarchical clustering (Fig. S2) was done using subjects in the discovery cohort with high suicidal ideation (HAMD-SI ≥ 2, n = 103). Clustering was done on 4 psychiatric dimensions, using quantitative instruments: stress (SSS4) [21], anxiety (SAS4) [17], mood (SMS7) [16], and psychosis (PANSS Positive) [18]. Subjects’ measures were classified as high (red) if above average, and low (blue) if below average, for that scale. The hierarchical clustering revealed 16 distinct subtypes. The average frequency of suicide-related hospitalizations was then calculated for each subtype within the independent testing cohort (Fig. S2). This frequency likely reflects a combination of self-perceived need for help, and ability to seek help and get hospitalized.
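As a simplified illustration of the high/low classification (not of the hierarchical clustering itself), the sketch below binarizes four measures against the cohort average and enumerates the possible high/low patterns; four binary dimensions allow 2^4 = 16 combinations. The subject values are hypothetical, and treating a value exactly equal to the average as low is our assumption.

```python
from itertools import product
from statistics import mean

DIMENSIONS = ("stress", "anxiety", "mood", "psychosis")


def binarize(subjects):
    """Label each measure 'high' if above the cohort average, else 'low'."""
    averages = {d: mean(s[d] for s in subjects) for d in DIMENSIONS}
    return [
        tuple("high" if s[d] > averages[d] else "low" for d in DIMENSIONS)
        for s in subjects
    ]


# Hypothetical scale values for three subjects.
subjects = [
    {"stress": 80, "anxiety": 70, "mood": 30, "psychosis": 10},
    {"stress": 20, "anxiety": 30, "mood": 60, "psychosis": 12},
    {"stress": 50, "anxiety": 20, "mood": 45, "psychosis": 8},
]
patterns = binarize(subjects)
all_patterns = list(product(("high", "low"), repeat=len(DIMENSIONS)))
print(patterns)
print(len(all_patterns))
```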

Machine learning (ML) analyses

We compared different ML approaches for predicting occurrence of hospitalizations for suicidality in the year following testing, and time to first hospitalization. We used a comprehensive bio-socio-psychological input into the models, consisting of panels of best predictive biomarkers by gender (the same ones used in the Reports in Fig. 2), as well as the CFI-S scale items, and the HAMD-SI item. The ML investigation was designed as follows: 1) the gene expression blood biomarkers, CFI-S items, overall CFI-S score, and HAMD-SI from each patient were considered as input features of the ML architectures; 2) the occurrence of hospitalization and the time to 1st hospitalization results were converted to binary indicators.

We used and compared the following ML approaches: 1. Support Vector Machine (SVM) is a supervised learning algorithm that finds the hyperplane which best separates different classes in the feature space; 2. Random Forest (RF) is an ensemble learning method that operates by constructing a multitude of decision trees at training time to output the class that is the mode of the classes of the individual trees; 3. XGBoost stands for eXtreme Gradient Boosting and represents a scalable and accurate implementation of gradient boosting machines, which are used for supervised learning problems by optimizing differentiable loss functions; 4. Transformer is a deep learning model that uses self-attention mechanisms to process sequential data, and it is widely used in natural language processing tasks; and 5. Deep Neural Networks (DNN) refer to a deep learning technique consisting of several artificial neurons (i.e., non-linear computational units) organized in multiple layers (i.e., an input layer matching in size the input feature data, several hidden layers of various sizes that are decided based on a neural architecture search to minimize specific loss functions, and an output layer whose size is determined by the desired ML classification or prediction task) for extracting discriminative features from a given dataset.

In comparing the strengths and weaknesses of various ML algorithms, each brings unique advantages and challenges to the table. SVM approaches offer robust performance in high-dimensional spaces and are effective in cases where the number of dimensions exceeds the number of samples. However, they struggle with large datasets and require careful tuning of parameters. RF approaches are known for their simplicity and ability to run efficiently on large datasets, but they can overfit in cases of noisy data. XGBoost excels in handling sparse data and is faster and more efficient than traditional Gradient Boosting, but tuning its hyperparameters can be complex and time-consuming. Transformers show exceptional performance on sequential data, particularly in natural language processing tasks, due to their self-attention mechanism; however, they require significant amounts of data and computational power, potentially limiting their applicability in resource-constrained settings. DNNs are highly flexible and capable of learning complex patterns from large amounts of data, making them suitable for a wide range of applications, including image and speech recognition. Their primary drawbacks are the need for extensive computational resources and the risk of overfitting, which can be mitigated with techniques such as dropout.

The DNN approach turned out to be the most predictive, and was chosen to be used also for the subsequent feature importance analysis described below.

In more detail, we designed a comprehensive DNN framework for predicting the occurrence of hospitalization and the time to 1st hospitalization for suicidality. We provide details about the constructed gender-specific DNN models in the following paragraph.

Deep neural network training and testing

A total of 261 subjects were designated as a training cohort, to train and tune the hyper-parameters of the DNN during the neural architecture search. Due to the size and structure of the available data, the male and female models were trained and tested separately. The male model was trained with 217 subjects and tested on 570 subjects. The female model was trained with 44 subjects and tested on 115 subjects.

In females, for Occurrence of Hospitalizations in the First Year, the input layer had 32 neurons, with three hidden layers, with 128, 32 and 32 neurons each. For Time to First Hospitalization, the input layer had 32 neurons, with four hidden layers with 128, 128, 64 and 64 neurons each.

In males, for Occurrence of Hospitalizations in the First Year, the input layer had 32 neurons, with three hidden layers, with 128, 32 and 32 neurons each. For Time to First Hospitalization, the input layer had 32 neurons, with four hidden layers with 128, 64, 64 and 64 neurons each.

Each hidden layer was followed by a batch normalization layer and a dropout layer with a dropout rate of 0.2, and the ReLU activation function was applied on all dense layers; both the batch normalization and dropout layers help to avoid overfitting. The output layer had 2 output neurons with the “sigmoid” activation function. The DNN models used an optimizer with a learning rate of 0.001 and a binary cross-entropy loss function. Grid and random searches determined suitable hyperparameter values for all ML models in this work (e.g., batch size, kernel size, weight decay). An n-dimensional grid was defined to map the n hyperparameters and to identify their ranges. We examined all possible DNN configurations to identify optimal values for each hyperparameter. Of note, results of predictions in the testing cohorts were used for this hyperparameter selection. As such, unlike our four-step biomarker identification, prioritization, validation, and testing studies, the ML approach is not based on completely independent cohorts.
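To make the reported layer sizes concrete, the sketch below records the four architectures as lists of layer widths and counts the weights and biases of the dense layers only. This is an illustrative calculation, not part of the study's pipeline: the dictionary keys and the function name are our own, and batch normalization parameters are deliberately left out of the count.

```python
# Layer widths as reported: input, hidden layers, output.
ARCHITECTURES = {
    ("female", "occurrence"): [32, 128, 32, 32, 2],
    ("female", "time_to_first"): [32, 128, 128, 64, 64, 2],
    ("male", "occurrence"): [32, 128, 32, 32, 2],
    ("male", "time_to_first"): [32, 128, 64, 64, 64, 2],
}


def dense_parameter_count(layer_sizes):
    """Count weights plus biases over consecutive fully connected layers."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


parameter_counts = {
    key: dense_parameter_count(sizes) for key, sizes in ARCHITECTURES.items()
}
for key, count in parameter_counts.items():
    print(key, count)
```

Setting these counts against the training cohort sizes (217 males, 44 females) gives a sense of why regularization through dropout and batch normalization matters in this setting.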

Feature importance analysis (saliency value)

Next, we computed the saliency value for each input. For an input $X_0$ and a DNN model with a score function $S(X)$, we ranked the features (genotypes) in $X_0$ based on their importance to $S(X)$. We considered the linear score model $S(X) = w^{T}X + b$, where the input $X$, weight $w$ and bias $b$ are in one-dimensional (vectorized) form. Since the DNN model and score function are highly nonlinear functions of $X$, the linear score model cannot be applied directly. We therefore approximated $S(X)$ in the neighborhood of $X_0$ using the first-order Taylor expansion $S(X) \approx w_0^{T}X + b_0$, where $w_0 = \left.\partial S/\partial X\right|_{X_0}$ is the partial derivative of $S(X)$ at $X_0$ and $b_0$ is the bias at $X_0$; $w_0$ represents the saliency value for each input in $X_0$.
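The idea can be illustrated with a small, self-contained example. The sketch below is not the study's implementation: it uses a toy nonlinear score function in place of a trained DNN, and it estimates the gradient by central finite differences rather than by backpropagation, which is what a deep learning framework would normally provide. All names and values are hypothetical.

```python
import math


def score(x):
    """Toy nonlinear score function standing in for a trained DNN."""
    return math.tanh(0.5 * x[0] - 2.0 * x[1] + 0.1 * x[2] * x[0])


def saliency(score_fn, x0, eps=1e-5):
    """Estimate w0 = dS/dX at x0 with central finite differences."""
    gradient = []
    for i in range(len(x0)):
        up = list(x0)
        down = list(x0)
        up[i] += eps
        down[i] -= eps
        gradient.append((score_fn(up) - score_fn(down)) / (2 * eps))
    return gradient


x0 = [1.0, 0.2, -0.5]  # hypothetical input features for one subject
w0 = saliency(score, x0)
# Rank feature indices by absolute saliency, most important first.
ranking = sorted(range(len(x0)), key=lambda i: abs(w0[i]), reverse=True)
print(w0, ranking)
```

For this toy function the second feature has the largest absolute gradient at `x0`, so it is ranked first; the ranking is local to `x0` and can differ for another subject.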

Results

In Step 1 Discovery, we identified candidate blood gene expression biomarkers that: 1. change in expression in blood between no and high suicidality states, 2. track the state across visits in a subject, and 3. track the suicidality states in multiple subjects. We used as a quantitative measure for suicidality the Suicidal Ideation item in the Hamilton Rating Scale for Depression (HAMD-SI) (Fig. S1). At a phenotypic level, this item quantifies suicidality state at a particular moment in time, and over the week prior to testing (Fig. S1).

For the discovery step, we used a powerful within-subject and then across-subject design in a longitudinally followed cohort of subjects (for the Affymetrix samples, n = 68 subjects, with 221 visits; for the RNAseq samples, n = 22 subjects, with 52 visits) who displayed at least a change in the suicidality measure (from 0 to 2 and above, and vice-versa) between at least two consecutive testing visits, to identify differentially expressed genes that track suicidality state.

The data from the two platforms were integrated as described in Methods. Using our 33% of maximum raw score threshold (internal score of 2 pt) [3, 4], there were 9184 unique probesets with corresponding transcripts.

In Step 2 Prioritization, we used a Convergent Functional Genomics (CFG) approach to prioritize the candidate biomarkers identified in the discovery step (33% cutoff, internal score of ≥2 pt.) by using prior published literature evidence (genetic, gene expression and proteomic), from human studies, for involvement in suicidality (Fig. 1 and Table S2). There were 2438 probesets with corresponding transcripts that had a total score (combined discovery score and prioritization CFG score) of 6 and above. These were carried forward to the validation step.

In Step 3 Validation, we validated the prioritized candidate biomarkers for change in suicide completers (n = 101, n = 45 Affymetrix and n = 56 RNAseq). We assessed which biomarkers were stepwise changed in expression from no suicidality in the discovery cohort, to high suicidality in the discovery cohort, to suicide completers (Fig. 1). Of the 2438 probesets after the prioritization step, 739 were nominally significant, and of these, 382 were Bonferroni significant.

Adding the scores from the first three steps into an overall convergent functional evidence (CFE) score (Fig. 1), we ended up with a list of 2340 top candidate biomarkers for suicidality, that had a CFE3 score ≥8, better than 33% of the maximum possible score of 24 after the first three steps, which we decided to use as an empirical cutoff. These top candidate biomarkers were tested in Step 4 for clinical utility/predictive ability in additional independent cohorts (Fig. 1 and Table 1).

Testing for clinical utility

In Step 4 Testing, we examined, in cohorts independent from the ones used for discovery or validation, whether the top candidate biomarkers after the first three steps can assess high suicidality states, as well as predict future psychiatric hospitalizations due to suicidality (Fig. 1 and Table S1), using electronic medical records follow-up data of our study subjects (up to 17.2 years from initial visit at the time of the analyses). The gene expression data in the test cohorts were normalized (Z-scored) by gender before those groups were combined. This permits the groups to be combined, and reduces bias from larger groups. We used as predictors cross-sectional information about biomarker levels, as well as expanded longitudinal information about biomarker levels at multiple visits. We tested the biomarkers in all subjects in the independent test cohort, as well as in a more personalized fashion by gender (Fig. 1).
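A minimal sketch of Z-scoring within each gender group before pooling is shown below. The expression values are hypothetical, `zscore_by_group` is an illustrative name, and the use of the population standard deviation is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, pstdev


def zscore_by_group(values, groups):
    """Z-score each value against the mean and SD of its own group."""
    z_scores = [0.0] * len(values)
    for group in set(groups):
        indices = [i for i, g in enumerate(groups) if g == group]
        group_values = [values[i] for i in indices]
        mu = mean(group_values)
        sd = pstdev(group_values)  # assumption: population SD
        if sd == 0:
            raise ValueError(f"no variance in group {group!r}")
        for i in indices:
            z_scores[i] = (values[i] - mu) / sd
    return z_scores


# Hypothetical raw expression values with different baselines by gender.
values = [10.0, 12.0, 14.0, 100.0, 110.0, 120.0]
groups = ["M", "M", "M", "F", "F", "F"]
z_scores = zscore_by_group(values, groups)
print(z_scores)
```

After this step both groups are centered on zero with unit spread, so the pooled data are not dominated by baseline differences between groups or by the larger group.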

Convergent functional evidence (CFE)

For the top candidate biomarkers (n = 2340), we combined into a convergent functional evidence (CFE) score all the evidence from discovery (up to 6 points), CFG prioritization (up to 12 points), validation (up to 6 points), and testing (predicting state suicidality, first-year hospitalizations with suicidality, and all future hospitalizations with suicidality: up to 4 points each if the biomarker significantly predicts in all subjects, 2 points if by gender). The total score can be up to 36 points: 24 from our own new data, and 12 from literature data used for CFG. We weigh our new data more than the literature data, as it is functionally related to suicidality in 3 independent cohorts (discovery, validation, testing). The goal is to highlight, based on the totality of our data and of the evidence in the field to date, biomarkers that have all-around evidence: they track suicidality, have convergent evidence for involvement in suicidality, and predict suicidality state and future clinical events (Table 1).

The 30 top blood biomarkers with the strongest overall convergent functional evidence (CFE) for tracking and predicting suicidality, after all four steps (Table 1) were, in order of CFE4 score: SLC6A4 (Solute Carrier Family 6 Member 4), TINF2 (TERF1 Interacting Nuclear Factor 2), INSR (Insulin Receptor), CLN5 (CLN5 Intracellular Trafficking Protein), PKP4 (Plakophilin 4), SLC49A4 (Solute Carrier Family 49 Member 4), SKP1 (S-Phase Kinase Associated Protein 1), ECHDC1 (Ethylmalonyl-CoA Decarboxylase 1), BCL2 (BCL2 Apoptosis Regulator), SELENOF (Selenoprotein F), SYNE2 (Spectrin Repeat Containing Nuclear Envelope Protein 2), NDFIP1 (Nedd4 Family Interacting Protein 1), VTI1B (Vesicle Transport Through Interaction With T-SNAREs 1B), E2F1 (E2F Transcription Factor 1), CTIF (Cap Binding Complex Dependent Translation Initiation Factor), MTCH2 (Mitochondrial Carrier 2), PRKAR2B (Protein Kinase CAMP-Dependent Type II Regulatory Subunit Beta), ANGPT1 (Angiopoietin 1), KLF12 (KLF Transcription Factor 12), CDH4 (Cadherin 4), APOE (Apolipoprotein E), MYH10 (Myosin Heavy Chain 10), UBL3 (Ubiquitin Like 3), CALD1 (Caldesmon 1), APC (APC Regulator Of WNT Signaling Pathway), MAP3K7 (Mitogen-Activated Protein Kinase Kinase Kinase 7), MAOA (Monoamine Oxidase A), LINC01432 (Long Intergenic Non-Protein Coding RNA 1432), S100A10 (S100 Calcium Binding Protein A10), and AGO2 (Argonaute RISC Catalytic Component 2).

SLC6A4, the overall top biomarker for suicidality in this study, is the serotonin transporter, which plays an essential role in the mechanism of action of serotonin-based antidepressants. Abnormalities in serotonin may be central to suicidality [22]. SLC6A4, increased in expression in blood in high suicidality in our work, has previous, convergent evidence for involvement in suicidality. It is increased in expression in hippocampus in suicides [23]. There is also previous human genetic evidence [10, 24]. SLC6A4 in our studies modestly predicts high suicidality state in all patients in the independent testing cohort (AUC 59%, p = 0.02) using longitudinal analyses, with results being somewhat better in females (AUC 68%, p = 0.02). It also predicts future hospitalizations with suicidality in females, in the first year (AUC 79%, p = 0.0007), and in future years (OR 2.71, p = 0.002). SLC6A4 activity is blocked by antidepressants, but SLC6A4 may be increased in expression by SSRI treatment itself, in a physiological negative feedback loop. Medication non-compliance and withdrawal may expose the patient to high levels of SLC6A4, which decrease serotonin and increase impulsivity and suicidality. In particular, short half-life SSRIs used in non-compliant populations like children and adolescents may increase the risk of suicidality. This may be why long half-life SSRIs like fluoxetine are safer and thus preferentially used in those under age 25.

TINF2 (TERF1 Interacting Nuclear Factor 2), another top biomarker, is a key component of the shelterin complex (telosome) that is involved in the regulation of telomere length and protection. A decrease in TINF2, as we see in suicidality, would lead to telomere shortening.

TINF2 in our studies predicts high suicidality state in females in the independent testing cohort (AUC 72%, p = 0.005) using longitudinal analyses. It also predicts future hospitalizations with suicidality in females, in the first year (AUC 67%, p = 0.03), and modestly predicts hospitalizations in all in future years (OR 1.2, p = 0.005). TINF2 has also been shown to be decreased in expression in blood in previous studies we did in stress [25], low memory [26], and hallucinations [26], suggestive of a stress-driven neuropathological component.

INSR (Insulin Receptor), another top biomarker, is a receptor tyrosine kinase which mediates the pleiotropic actions of insulin. In vivo and in several cell models, the expression of the insulin receptor and/or its mRNA is under positive regulation by glucocorticoid hormones and negative regulation by insulin. Glucocorticoid hormones stimulate receptor gene transcription and receptor protein synthesis. INSR is increased in expression in suicidality in our work, consistent with a high stress state. INSR in our studies modestly predicts high suicidality state in all patients in the independent testing cohort (AUC 60%, p = 0.01) as well as future hospitalizations (OR 1.38, p = 0.002), using longitudinal analyses, with results being somewhat better in females (OR 1.86, p = 0.02). It also has previous genetic evidence [27], and human postmortem brain evidence of being increased in the hippocampus in suicides [28]. INSR has also been shown to be decreased in expression in blood in previous studies we did in stress [25], anxiety [17], depression [16], low memory [26], and hallucinations [26], suggestive of a stress-driven neuropathological component. It is decreased in expression by lithium [29], valproate [30], and antidepressants [31].

CLN5 (CLN5 Intracellular Trafficking Protein), another top biomarker, is involved in the degradation of post-translationally modified proteins in lysosomes. CLN5 is decreased in suicidality in our work, consistent with a lysosomal accumulation. It modestly predicts high suicidality state in all patients in the independent testing cohort (AUC 66%, p = 0.0002), with results being somewhat better in females (AUC 74%, p = 0.004), using longitudinal analyses, as well as future hospitalizations in all (OR 1.85, p = 0.004). CLN5 is also decreased in the blood in our biomarker studies on stress [25], anxiety [17], and depression [16], as well as in excitatory neurons in dementia [32], suggestive of a stress-driven neuropathological component. It is increased in expression by lithium [29].

BCL2 (BCL2 Apoptosis Regulator) is involved in apoptosis. BCL2 was decreased in expression in suicidality in our work, consistent with anti-survival mechanisms/increased cell death. Suicidality may be a whole-organism apoptosis [2, 33], and we have described in our previous work suicidality as the opposite of longevity at a molecular level, the negative direction of a biological “Life Switch”. BCL2 modestly predicts high suicidality state in all patients in the independent testing cohort (AUC 60%, p = 0.01), with results being somewhat better in females (AUC 77%, p = 0.0008), using longitudinal analyses, as well as future hospitalizations in the first year in females (AUC 72%, p = 0.007). It also has previous independent convergent evidence from human postmortem brain studies of being decreased in the pre-frontal cortex in suicide completers [34], particularly females [35], and in white blood cells [36]. BCL2 is also decreased in expression in the blood in our biomarker studies on pain [37], depression [16], and in studies by others on aging [38, 39], suggestive of an adversity-driven pathological component. BCL2 is increased in expression by lithium [40, 41], clozapine [41], as well as SNRIs (venlafaxine [42], duloxetine [43]), bupropion [44], and doxepin [45]. It is also increased in expression by the nutraceuticals omega-3 fatty acids [46], CoQ10 [47], curcumin [48], fisetin [49], and CBD [50].

APOE (Apolipoprotein E) is involved in lipid transport, as well as the repair and regeneration of neurons. APOE was decreased in expression in suicidality in our work, consistent with decreased neuronal repair and growth. Suicidality may be a form of dementia. APOE predicts high suicidality state in all patients in the independent testing cohort (AUC 74%, p = 1.30E−22), with results being somewhat better in females (AUC 88%, p = 7.35E−06). It also predicts first year hospitalizations in all (AUC 67%, p = 9.04E−09), with results better in females (AUC 88%, p = 1.07E−05), and all future hospitalizations in all (OR 1.38, p = 1.26E−05), with results again better in females (OR 1.9, p = 1.60E−03). It also has previous independent convergent evidence from human postmortem brain studies of being decreased in the pre-frontal cortex in suicide completers [51], particularly females [35]. APOE is also decreased in expression in aging [52] and increased in longevity [53], potentially being involved in the Life Switch described by us [33]. It is decreased in expression in the blood in our biomarker studies on depression [16], as well as psychosis [54], memory disorders [26], and in studies by others on ASD [55], suggestive of a cognitive component. APOE is increased in expression by escitalopram [56] and by nortriptyline [56], as well as the nutraceuticals ginseng [57] and magnesium [58].

MAOA (Monoamine Oxidase A), a biomarker increased in expression in our studies, is an enzyme involved in neurotransmitter degradation, and is the target of a class of broad-spectrum antidepressants that inhibit it. MAOA modestly predicts high suicidality state in females (AUC 65%, p = 0.04) in longitudinal analyses, as well as future hospitalizations in the first year (AUC 74%, p = 0.003), and all future years (OR 2.42, p = 0.001). It also has previous independent convergent evidence from genetic studies [59, 60], and of being increased in expression in human neuronal progenitor cell studies of suicidality [61]. MAOA is also increased in expression in the blood in our biomarker studies on stress [25], pain [37], anxiety [62], depression [16], and in studies by others on panic disorders [63] and on depression [64], consistent with an adversity-driven pathological component. MAOA is inhibited/decreased in expression by antidepressants [65,66,67], as well as by ketamine [68], the nutraceutical olive extract [67], and by psychotherapy [69].

Biological understanding

Biological pathways

We carried out biological pathway analyses using the list of top biomarkers for suicidality (n = 30 genes). The top pathways were related to apoptosis and neurotransmitter clearance (Table 2A). Major depressive disorder and sleep disorders were the top diseases identified by the pathway analyses using DAVID, pointing to a molecular underpinning for these well-known clinical co-morbidities (Table 2B).

Networks and interactions

We carried out a STRING analysis (Fig. S3) of the top candidate biomarkers that revealed groups of interacting proteins. BCL2 is at the nexus of three networks: one containing SLC6A4, MAOA, and APOE; one containing INSR and ANGPT1; and one containing AGO2 and E2F1. These networks may have biological significance and could be targeted therapeutically.

Therapeutics

Overall, lithium (26.7%) had the best evidence for broad efficacy in suicidality (Table 2E), followed by clozapine (23.3%) and ketamine (20%). Interestingly, these are the only medications in psychiatry approved for suicidality, and their emergence from our empirical, hypothesis-driven work is a strong validation of our approach. Omega-3 fatty acids (13.3%) were the top nutraceutical and may be a widely deployable preventive treatment, with minimal side-effects, including in women who are or may become pregnant.

A number of individual top biomarkers are known to be modulated by medications in current clinical use, such as clozapine (used for treating schizophrenia and suicidality) and lithium and ketamine (used for treating mood disorders and suicidality), as well as by the nutraceutical omega-3 fatty acids (Tables 1 and S4). This is of potential utility in pharmacogenomic approaches for matching patients with suicidality to the right medications, and for monitoring response to treatment.

Best predictive biomarkers

In Step 4, we identified the best predictive biomarkers for suicidality state (HAMD-SI ≥ 2) and trait (first year, and all future hospitalizations, for suicidality), using cross-sectional and longitudinal methodology. In an additional Step 5, all the nominally significant biomarkers were tested for ability to predict using the whole population used in the study (n = 1127), to avoid an overfit to the testing cohort. The best predictive biomarkers in all, and for each gender, male and female, can be combined in panels to generate reports for doctors, as shown in Fig. 2.

Machine learning

Machine learning (ML) models (e.g., XGB, RF, and SVM) were trained on the discovery cohort, and tested in the independent test cohort for predicting hospitalizations in the first year following testing. Deep Neural Networks (DNN) performed best. The model used a combination of the top biomarkers, CFI-S, and HAMD-SI. Its predictive ability was compared to, and shown to be lower than, that of our simple additive model of the three scores (biomarker panel score, CFI-S score, and HAMD-SI score) (Table 3). In general, the model predicted better in females than in males, consistent with all of our other results. The top predictive features for occurrence of hospitalizations for suicidality in the first year following testing were HAMD-SI (suicidality intensity) and JOSD1 in females, and JOSD1 and THY1 in males. For imminence of hospitalizations, the top features were Feeling Useless and HAMD-SI in females, and Medical Problems and Age in males. These represent readily addressable targets for preventive approaches.

Table 3 Genomic and clinical predictions comparison and integration.

JOSD1 (Josephin Domain Containing 1) is involved in deubiquitination. It is decreased in expression in blood in high suicidality in our studies, and also has convergent evidence of being decreased in expression in postmortem brain studies from suicide completers [35]. The decrease in JOSD1 may be associated with increased ubiquitination and apoptosis, which is cellular suicide [70].

Discussion

We describe novel and comprehensive efforts to advance precision medicine approaches for suicidality. The top blood biomarkers were discovered, validated and tested in independent cohorts to evaluate predictive ability and clinical utility. These biomarkers also open a window into understanding the biology of suicidality, as well as indicate new and more precise therapeutic approaches.

Current clinical practice and the need for biomarkers

Assessment of a person’s internal subjective perceptions and thoughts, along with more objective external ratings of actions and behaviors, is used in clinical practice to assess suicidality. Such an approach is insufficient, and lags behind those used in other medical specialties. Moreover, individuals do not always report or want to share when they are suicidal, leading to missed opportunities to intervene and help. Blood biomarkers related to suicidality, if used as part of routine mental health clinical visits and even primary care annual exams, would provide a critical objective measurement to inform clinical assessments and treatment decisions.

Advantages of biomarkers

Blood biomarkers offer real-world clinical practice advantages. As the brain cannot be readily biopsied in live individuals, and CSF is less easily accessible than blood, we have endeavored over the years to identify blood biomarkers for neuropsychiatric disorders. A whole-blood approach facilitates field deployment of sample collection. The assessment of gene expression changes focuses our approach on immune cells. The ability to identify peripheral gene expression changes that reflect brain activities is likely due to the fact that the brain and immune system have developmental commonalities, marked by shared reactivity and ensuing gene expression patterns. There is also a bi-directional interaction between the brain and immune system. Not all changes in expression in peripheral cells are reflective of or germane to brain activity. By carefully tracking a phenotype with our within-subject design in the discovery step, and then using convergent functional genomics prioritization, we are able to extract the peripheral changes that do track and are relevant to the brain activity studied, in this case suicidality. Subsequent validation and testing in independent cohorts narrow the list to the best markers. In the end, we do not expect to recapitulate in the blood all that happens in the brain. We just want to have good accessible peripheral biomarkers, “liquid biopsies”, as they are called in cancer.

Comprehensiveness

In this current work, we carried out extensive blood gene expression studies in male and female subjects with major psychiatric disorders, an enriched population in terms of co-morbidity with suicidality. The potential molecular-level co-morbidity between other psychiatric disorders and suicidality is underlined by the fact that medications for mood disorders (lithium) and psychosis (clozapine) are also used to treat suicidality. Our primary goal was to discover and validate biomarkers for suicidality, that are transdiagnostic. Secondarily, we aimed to understand their universality vs. their specificity by gender.

Our studies were arranged in a stepwise fashion. First, we endeavored to discover blood gene expression biomarkers for suicidality using a longitudinal design, looking at differential expression of genes in the blood of male and female subjects with major psychiatric disorders (bipolar disorder, major depressive disorder, schizophrenia/schizoaffective, and post-traumatic stress disorder (PTSD)), high-risk populations prone to suicidality, which constitute an enriched pool in which to look for biomarkers. We compared no suicidality states to high suicidality states using a powerful within-subject design [2,3,4, 71], to generate a list of differentially expressed genes. Second, we used a comprehensive Convergent Functional Genomics (CFG) approach with the whole body of knowledge in the field to prioritize from the list of differentially expressed genes/biomarkers of relevance to suicidality. CFG integrates multiple independent lines of evidence (genetic, gene expression, and protein data, from brain and periphery, from human studies) as a Bayesian strategy for identifying and prioritizing findings, reducing the false-positives and false-negatives inherent in each individual approach. Third, we examined if the expression levels of the top biomarkers identified by us as tracking suicidality state are changed even more strongly in blood samples from an independent cohort of subjects who died by suicide, to validate these biomarkers. Fourth, the biomarkers thus discovered, prioritized, and validated were tested in corresponding independent cohorts of psychiatric subjects. Fifth, we used the biomarkers to match to existing psychiatric medications, as well as to identify and potentially repurpose drugs for suicidality treatment and prevention, using bioinformatics analyses. Sixth, we used bio-socio-psychological approaches, with and without machine learning, to identify best predictors. The series of studies was a systematic and comprehensive approach to move the field forward towards precision medicine.

Power

We used a systematic discovery, prioritization, validation, and testing approach, as we have done over the years for suicidality and other disorders [6, 16, 17, 21, 37, 72]. For discovery, we used a hard to accomplish but powerful within-subject design, with an N of 90 subjects with 273 visits. A within-subject design factors out genetic variability, as well as some medications, lifestyle, and demographic effects on gene expression, permitting identification of relevant signal with Ns as small as 1 [71]. Another benefit of a within-subject design may be accuracy/consistency of psychiatric symptoms (“phene expression”), as it is the same person reporting different states. This is similar in rationale to the signal detection benefits it provides in gene expression.
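The logic of a within-subject contrast can be sketched in a few lines of Python. This is a minimal illustration, not the analysis pipeline used in the study: the function name `within_subject_changes`, the tuple layout, and the data are all hypothetical. Each subject's expression level in the high suicidality state is compared with that same subject's level in the no suicidality state, so differences in baseline between subjects drop out.

```python
from collections import defaultdict
from statistics import mean

def within_subject_changes(visits):
    """Return {subject_id: mean(high-state expression) - mean(no-state expression)}.

    `visits` is an iterable of (subject_id, state, expression) tuples, where
    state is "none" or "high". Subjects lacking either state are skipped,
    because a within-subject contrast needs both.
    """
    by_subject = defaultdict(lambda: {"none": [], "high": []})
    for subject_id, state, expression in visits:
        if state in ("none", "high"):
            by_subject[subject_id][state].append(expression)

    changes = {}
    for subject_id, states in by_subject.items():
        if states["none"] and states["high"]:
            changes[subject_id] = mean(states["high"]) - mean(states["none"])
    return changes

# Hypothetical data: baselines differ widely between subjects,
# but each subject shows a rise in the high-suicidality state.
visits = [
    ("A", "none", 10.0), ("A", "high", 12.0),
    ("B", "none", 50.0), ("B", "high", 53.0), ("B", "none", 52.0),
    ("C", "none", 30.0),  # no high-state visit: skipped
]
changes = within_subject_changes(visits)
```

In this toy example subjects A and B have very different baselines, yet both show the same within-subject change, which an inter-subject comparison of raw levels would obscure.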

Based on our work of over two decades in genetics and gene expression, along with the results of others in the field, we estimate that using a quantitative phenotype is up to 1 order of magnitude more powerful than using a categorical diagnosis. The within-subject longitudinal design, by factoring out all genetic and some environmental variability, is up to 3 orders of magnitude more powerful than an inter-subject case-control cross-sectional design. Moreover, gene expression, by integrating the effects of many SNPs and environment, is up to 3 orders of magnitude more powerful than a genetic study. Combined, our approach may be up to 6 orders of magnitude more powerful than a GWAS, even prior to the CFG literature-based prioritization step, which encompasses all the independent work in the field prior to our studies and may add up to 1 order of magnitude as well. In addition, the Validation and the Testing steps each add 1 additional order of magnitude of power. As such, our approach might be up to 10 orders of magnitude more powered to detect signal than most current genetic study designs as used in GWAS.
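One way to read the tally behind the final figure is sketched below. It assumes that the stated gains are independent, that they multiply, and that each is taken at its upper bound; these are assumptions of the sketch, not results reported here.

```latex
\[
\underbrace{10^{1}}_{\text{quantitative phenotype}}
\times
\underbrace{10^{3}}_{\text{within-subject design}}
\times
\underbrace{10^{3}}_{\text{gene expression}}
\times
\underbrace{10^{1}}_{\text{CFG}}
\times
\underbrace{10^{1}}_{\text{validation}}
\times
\underbrace{10^{1}}_{\text{testing}}
= 10^{\,1+3+3+1+1+1} = 10^{10}
\]
```

Under this reading, the intermediate figure of 6 orders of magnitude corresponds to the within-subject and gene expression factors alone (3 + 3). That is an interpretation, as the text does not itemize it.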

Reproducibility

We reproduced and expanded our earlier biomarker findings [6]. 86% of our candidate biomarkers from the discovery step in the current work reproduce those in our 2017 study. Moreover, 29 out of our 30 top biomarkers after Step 4 in the current study passed the discovery step in the 2017 study. 26 out of our 30 current top biomarkers were Bonferroni significant after the validation step, compared to 4 of them in the earlier smaller 2017 study.

Additionally, there is reproducibility of our candidate biomarkers from discovery with findings generated by other independent studies as part of the Step 2 Prioritization using Convergent Functional Genomics (see Table S2). This independent reproducibility of findings between our studies and these other studies, which are done in independent cohorts from ours, with independent methodologies, is reassuring, and provides strong convergent evidence for the validity and relevance of our approach and of their approaches. Our work also provides functional evidence for some of their top genetic hits.

Pathophysiology

Top biological pathways have to do with apoptosis (cellular suicide) and neurotransmitter clearance (Table 2A). Suicidality may be a whole-body apoptosis in response to an adverse environment.

The majority of top blood biomarkers we have identified have prior evidence in human brain data from suicides, which indicates their relevance to the pathophysiology of suicidality (Table S2). The co-directionality of blood changes in our work and brain changes reported in the literature needs to be interpreted with caution, as it may depend on brain region.

The top candidate biomarkers also had prior evidence of involvement in other psychiatric and related disorders (Tables 1 and S3), providing a molecular basis for co-morbidity, and the possible predisposing effects of some of these disorders on suicidality. In particular, over 80% of top biomarkers identified by us overlap with genes implicated in alcoholism, depression, and stress (Table 1, Table 2D), consistent with the known clinical co-morbidity. These are common, treatable and preventable disorders.

Six of the top 30 genes (20%) are involved in the circadian clock, an enrichment over the 7% of the genome that is involved in circadian mechanisms [16]. Circadian clock genes in general are core to mood and to levels of activity of the organism. Thyroid hormones were among the top upstream regulators (Table 2C). Thyroid hormones are known to have a profound impact on mood and levels of activity.
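The size of the circadian enrichment can be checked with a short calculation. The sketch below computes the fold enrichment from the figures given above (6 of 30 genes against a 7% background) and a one-sided binomial tail probability. The binomial test is an illustrative choice that treats the 30 genes as independent draws from the genome. It is not a test reported in this work, and the function name `binomial_upper_tail` is hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Figures taken from the text: 6 circadian genes among the top 30,
# against a genome-wide background of 7%.
observed, total, background = 6, 30, 0.07
fold_enrichment = (observed / total) / background
p_value = binomial_upper_tail(observed, total, background)
print(f"fold enrichment: {fold_enrichment:.2f}, one-sided binomial p: {p_value:.4f}")
```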

Phenomenology

In addition to using the standard HAMD-SI item, we had previously developed and used in this study a scale to assess psycho-social determinants of suicidality risk. The Convergent Functional Information for Suicidality (CFI-S) scale is a 22-item checklist of risk factors and social impairment that notably does not ask about suicidal ideation. The CFI-S has been shown in previous studies [3, 4, 6, 12, 13], and confirmed in the current one, to be predictive of suicidality state and trait.

We have also looked at subtypes of suicidality in the subjects from the discovery cohort while they were in a high suicidality state. We identified 16 subtypes, based on two-way unsupervised hierarchical clustering on measures of stress, anxiety, mood and psychosis (Fig. S2). The three subtypes with the most subsequent hospitalizations in the year following testing had in common high anxiety.
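For readers unfamiliar with two-way hierarchical clustering, the idea is to cluster the rows (subjects) and the columns (measures) of the same data matrix. The sketch below is a minimal pure-Python version using average linkage and Euclidean distance. Both choices, the function names, and the toy ratings are assumptions made for illustration, as the linkage and distance settings are not stated here.

```python
from math import dist

def agglomerate(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of item indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        # Average Euclidean distance over all cross-cluster pairs.
        return sum(dist(vectors[i], vectors[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (x < y) and merge y into x.
        x, y = min(
            ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]

def two_way_clusters(matrix, n_row_clusters, n_col_clusters):
    """Cluster rows (subjects) and columns (measures) of the same matrix."""
    columns = [list(col) for col in zip(*matrix)]
    return agglomerate(matrix, n_row_clusters), agglomerate(columns, n_col_clusters)

# Hypothetical z-scored ratings; columns: stress, anxiety, mood, psychosis.
ratings = [
    [1.2, 1.1, -0.2, -0.9],
    [1.0, 1.3, -0.1, -1.0],
    [-0.8, -0.9, 1.1, 0.2],
    [-1.0, -1.1, 0.9, 0.1],
]
row_groups, col_groups = two_way_clusters(ratings, 2, 2)
```

In this toy matrix the first two subjects group together, as do stress and anxiety. A real analysis would typically use an established library and inspect the full dendrogram rather than a fixed number of clusters.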

Biomarkers vs. Scales

In general, the best predictive biomarkers were better than the rating scales (CFI-S, HAMD-SI) at predicting trait suicidality in all patients, in males, and in females. In females, the biomarkers were also better than the scales for state suicidality. This may reflect the fact that these are difficult phenotypes for clinicians to assess, and reinforces the need for using objective blood biomarkers to assess suicidality (Table S6). The biomarkers and scales are also synergistic (Table 3).

Diagnostics

For the biomarkers identified by us, combining all the available evidence from this current work into a convergent functional evidence (CFE) score brings to the fore biomarkers that have clinical utility for objective assessment and risk prediction for suicidality (Table 1). These biomarkers should be tested individually, as well as in polygenic panels, in future clinical studies and practical clinical applications in the field. They may make it possible to distinguish, upon an initial clinical presentation of suicidality, whether the person is in fact severely suicidal and at chronic risk (Fig. 2). The integration of phenomic data, such as CFI-S done yearly, and repeated measures of HAMD-SI (perhaps via a phone app in a daily fashion), can further substantiate and elucidate suicidality risk, distinguishing between an intermittent type, such as transient suicidal ideation, and a continuous type, such as chronic suicidality. We demonstrated by using a bio-socio-psychological approach that there was synergy between the components. Machine learning did not perform better, but it highlighted the importance of certain individual features (markers). Predictions of occurrence and imminence of hospitalizations for suicidality were stronger in women than in men (Fig. 3 and Table 3).

Fig. 3: Machine learning analysis.

A, C, E, G Positive predictive value and ROC AUC of occurrence of hospitalizations, as well as time to first hospitalization, for various machine learning models utilizing all three aspects of the biopsychosocial model (biomarkers, CFI-S, and HAMD-SI). B, D, F, H Salience analysis of which features of the model are the most important.

In general, our predictive results with biomarkers were stronger in females than in males, on the order of 10–20 percentage points of AUC. While some of this difference may be biological, in terms of immune system reactivity and brain-blood interplay being perhaps higher in women, it is also possible that men are not as accurate as women in terms of reporting suicidality symptoms (affecting our results on state predictions), and do not seek help as much (affecting our results on future hospitalization predictions). If so, this under-reporting makes the use of objective biomarker tests in men even more necessary. Of note, the rate of death by suicide is approximately four times higher in men than in women.
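To make the comparison of AUCs concrete, the sketch below computes ROC AUC from its rank interpretation: the probability that a randomly chosen case with the outcome scores higher than a randomly chosen case without it. The function name `roc_auc` and all scores and outcomes are hypothetical. They were chosen only to show how a difference in percentage points of AUC between two subgroups is obtained, and are not data from this study.

```python
def roc_auc(scores, labels):
    """ROC AUC as the probability that a random positive outranks a random negative.

    Ties count as half. `labels` are 1 (event, e.g. future hospitalization) or 0.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical biomarker scores and outcomes for two subgroups.
auc_females = roc_auc([0.9, 0.8, 0.4, 0.3, 0.2], [1, 1, 0, 1, 0])
auc_males = roc_auc([0.9, 0.5, 0.6, 0.3, 0.2], [1, 0, 1, 0, 1])
difference_in_points = 100 * (auc_females - auc_males)
```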

In regard to how our biomarker discoveries might be applied in clinical laboratory settings, we suggest that panels of top biomarkers for suicidality risk be used (Fig. 2). In practice, every new patient tested would be normalized against the database of similar patients already tested, and compared to them for ranking and risk prediction purposes, regardless of whether a platform like microarrays, RNA sequencing, or a more targeted one like PCR is used in the end clinically. As databases get larger, normative population levels can and should be established, similar to any other laboratory measures. Moreover, longitudinal monitoring of changes in biomarkers within an individual, measuring most recent slope of change, maximum levels attained, and maximum slope of change attained in the past, may be even more informative than simple cross-sectional comparisons of levels within an individual with normative population levels, as we have shown in our studies. For future point-of-care approaches, research and development should focus on top individual biomarkers, including at a protein level. One might look at a combination of the best universal biomarkers (that are predictive in all), for reliability, and of the best personalized biomarkers (that are predictive by gender, and even diagnosis), for higher accuracy.
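The two ideas in this paragraph, normalizing a new patient against previously tested patients and summarizing longitudinal change within an individual, can be sketched as follows. This is an illustration under stated assumptions rather than the scoring used in our reports. The function names, the choice of a z-score and percentile rank for normalization, and the example values are all hypothetical.

```python
from statistics import mean, pstdev

def z_score(value, reference):
    """Standardize a new value against a reference set of previously tested patients."""
    spread = pstdev(reference)
    return 0.0 if spread == 0 else (value - mean(reference)) / spread

def percentile_rank(value, reference):
    """Percentage of reference values at or below the new value."""
    return 100.0 * sum(r <= value for r in reference) / len(reference)

def longitudinal_features(times, levels):
    """Summarize one patient's visit series (at least two visits, increasing times).

    Returns the most recent slope, the maximum level, and the maximum slope
    across consecutive visits.
    """
    if len(times) != len(levels) or len(levels) < 2:
        raise ValueError("need matching series with at least two visits")
    slopes = [
        (levels[i + 1] - levels[i]) / (times[i + 1] - times[i])
        for i in range(len(levels) - 1)
    ]
    return {"latest_slope": slopes[-1], "max_level": max(levels), "max_slope": max(slopes)}

# Hypothetical reference database and one patient's visits (time in months).
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
rank = percentile_rank(4.0, reference)
z = z_score(4.0, reference)
features = longitudinal_features([0, 3, 6, 12], [2.0, 3.5, 3.2, 4.4])
```

The cross-sectional functions place a single measurement within the reference population, while `longitudinal_features` captures the three within-individual quantities named above.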

Treatment

Biomarkers may also be useful for matching patients to medications and measuring response to treatment (pharmacogenomics) (Fig. 2, Tables 2E and S4), as well as for new drug discovery clinical trials and drug repositioning (Table S7). From the pharmacogenomics analyses, lithium was the top hit, followed by clozapine. Other interesting matches were ketamine, valproate, magnesium, omega-3 fatty acids, citalopram, and escitalopram. All these drugs and nutraceuticals are relatively safe if used appropriately, and have been used in clinical practice for other indications for decades, which facilitates the direct translation of our findings to clinical practice. The fact that the top medication matches were lithium, clozapine, and ketamine is striking: lithium has long-standing clinical evidence of anti-suicidal effects, clozapine is FDA-approved for reducing the risk of suicidal behavior in schizophrenia and schizoaffective disorder, and ketamine has been a recent addition to the list of medications studied for this indication.

Drug repurposing analyses identified inhibitors of the renin-angiotensin system (lisinopril, losartan, ramipril) and of the cyclooxygenase system (celecoxib, pranoprofen, tenoxicam) as potential choices, as well as the SSRI sertraline. Our earlier 2017 study identified metformin as a potential anti-suicidal compound, and this study identified phenformin, a predecessor of metformin with more side-effects.

Conclusions

Overall, this work is a major step forward towards better understanding, diagnosing, and treating suicidality. Taken together, our data support the possibility that biologically, suicidality is an extreme stress-driven form of active aging/death. Stress needs to be actively addressed and mitigated in high-risk individuals and circumstances, in both men [73] and women [74]. We hope that our trait biomarkers for future risk may be useful in preventive approaches, before full-blown suicidality manifests itself (or re-occurs). The two cases of subjects who completed our testing and died later by suicide illustrate the power of our approach to identify risk (Fig. 2). Prevention could be accomplished with biological interventions (i.e., early targeted use of medications or nutraceuticals), social measures to help with integration in society using the risks identified by CFI-S, and psychological support. Given the fact that suicidality is on the increase in the US and worldwide, that suicidality can severely affect quality of life and lead to shortened lifespans, and that not all patients respond to current treatments, the need for and importance of efforts such as ours cannot be overstated.