Evidence-based umbrella review of 162 peripheral biomarkers for major mental disorders

The literature on non-genetic peripheral biomarkers for major mental disorders is broad, with conflicting results. We conducted an umbrella review of meta-analyses of non-genetic peripheral biomarkers for Alzheimer’s disease, autism spectrum disorder, bipolar disorder (BD), major depressive disorder, and schizophrenia, including first-episode psychosis. We included meta-analyses that compared alterations in peripheral biomarkers between participants with mental disorders and controls (i.e., between-group meta-analyses) and that assessed biomarkers after treatment (i.e., within-group meta-analyses). Evidence for association was hierarchically graded using a priori defined criteria against several biases. The Assessment of Multiple Systematic Reviews (AMSTAR) instrument was used to investigate study quality. A total of 1161 references were screened, of which 110 met inclusion criteria, relating to 359 meta-analytic estimates and 733,316 measurements, on 162 different biomarkers. Only two estimates met a priori defined criteria for convincing evidence (elevated awakening cortisol levels in euthymic BD participants relative to controls and decreased pyridoxal levels in participants with schizophrenia relative to controls). Of 42 estimates which met criteria for highly suggestive evidence, only five biomarker aberrations occurred in more than one disorder. Only 15 meta-analyses had a power >0.8 to detect a small effect size, and most (81.9%) meta-analyses had high heterogeneity. Although some associations met criteria for either convincing or highly suggestive evidence, overall the vast literature of peripheral biomarkers for major mental disorders is affected by bias and is underpowered. No convincing evidence supported the existence of a trans-diagnostic biomarker. Adequately powered and methodologically sound future large collaborative studies are warranted.


Introduction
One of the overarching goals of the emerging field of precision psychiatry is to incorporate advanced technologies to provide an objective, data-driven, personalized approach to the diagnosis and treatment of mental disorders 1,2. However, unlike other medical fields, there is an acknowledged 'translational gap' in psychiatry 1,3. In parallel, the field of biological psychiatry, which aims to provide a neurobiological basis for current mental disorders, has provided contrasting results, even for pivotal biomarkers 4. Hence, the diagnosis and clinical management of major mental disorders is still entirely based on psychopathological knowledge, while the treatment of mental disorders remains predominantly based on 'trial and error', albeit within the confines of fitting evidence-based prescription to a clinical profile 5.
Over the past two decades, the field has witnessed a remarkable increase in interest in biomarkers for mental disorders 6. In particular, the literature on non-genetic peripheral biomarkers has grown exponentially, with the publication of several systematic reviews and meta-analyses 7-12. The identification and validation of biomarkers for mental disorders are thought to be crucial steps in the development of precision and biological psychiatry, and their ultimate incorporation into the current landscape of psychiatric care is expected to follow 1. However, this growth is not translating into meaningful modifications in clinical practice.
Several reasons may contribute to the contrast between the overall volume of this literature and the limited applicability of peripheral biomarkers in current psychiatric practice. For instance, it has been proposed that conventional psychiatric diagnoses based, for example, on the Diagnostic and Statistical Manual of Mental Disorders (DSM) may lack biological validity 2,13. In this respect, it has been proposed that, similarly to genetic 14 and neuroimaging 15,16 biomarkers, alterations in peripheral biomarkers for major mental disorders may be shared across distinct diagnostic categories, and thus may have a transdiagnostic nature 6. However, what constitutes a transdiagnostic construct in psychiatry remains debated, and no study has properly assessed the transdiagnostic nature of any biomarker with a methodologically sound approach 17.
In addition to the lack of consensus on how to define a transdiagnostic construct, a core reason for this translational gap, even in a single disorder, may be the presence of several biases, including large heterogeneity, an excess significance bias, and selective reporting of statistically significant (i.e., 'positive') findings without proper adjustment for multiple confounders. An umbrella review systematically evaluates and collects information from multiple systematic reviews and meta-analyses on all outcomes of a given topic for which these have been performed 18. Umbrella reviews are particularly suited to uncover these biases 19, as previously demonstrated with respect to peripheral biomarkers for depression 20, bipolar disorder (BD) 20, and schizophrenia 21. However, those previous umbrella reviews have only addressed studies that compared participants with a specific mental disorder and healthy controls, and not changes in peripheral biomarkers following treatment for these disorders. Moreover, those umbrella reviews focused on only one mental disorder each.
Thus, the current work provides a comprehensive umbrella review of meta-analyses of peripheral biomarkers for major mental disorders related to high prevalence and burden, namely Alzheimer's disease (AD), autism spectrum disorder (ASD), BD, major depressive disorder (MDD), and schizophrenia, including the first-episode psychosis (FEP) stage. We aimed to re-assess the presence of bias in this literature and identify biomarkers supported by the most convincing evidence. In addition, we aimed to identify shared and unique alterations in biomarkers for those major mental disorders among those supported by either convincing or highly suggestive evidence. In the current analysis, we considered both studies that investigated abnormalities in peripheral biomarkers of mental disorders compared to controls (i.e., between-group meta-analyses) and ones that assessed alterations in the levels of peripheral biomarkers after treatment (i.e., within-group meta-analyses).

Literature search
We conducted an umbrella review, which is a systematic collection of multiple systematic reviews and meta-analyses done in a specific research topic 22. The PubMed/MEDLINE database was searched from inception to February 17, 2019 for all available meta-analyses of non-genetic peripheral biomarkers for major mental disorders. This search strategy was augmented through (1) hand-searching the reference lists of included articles and (2) tracking citations of included articles through the Google Scholar database. The search string used in the current umbrella review was developed by a professional librarian and is available in the Supplementary Online material. The searches, screening, data extraction, and methodological quality appraisal were independently conducted by at least two investigators. Disagreements were resolved through consensus. When a consensus could not be reached, a third investigator (AFC) made the final decision. An a priori defined protocol was followed (available upon reasonable request to the corresponding author of the current manuscript).

Eligibility criteria
We included meta-analyses published in peer-reviewed journals that assessed and synthesized studies on peripheral biomarkers for adults with AD, ASD, BD, MDD, or schizophrenia, including FEP. We included studies in which biomarkers were assayed in participants with a specific mental disorder compared to controls (i.e., between-group meta-analyses), as well as ones which assessed changes in peripheral biomarkers in any of those disorders after treatment (i.e., within-group meta-analyses). Studies published in English were considered for inclusion. This decision was made because most well-designed systematic reviews and meta-analyses are published in English. We included studies in which diagnoses of mental disorders were conducted by means of a validated structured interview based on standard diagnostic criteria such as the International Classification of Diseases (ICD) or the Diagnostic and Statistical Manual of Mental Disorders (DSM). We also considered studies in which a probable diagnosis of a major depressive episode was established through a validated screening questionnaire, as well as studies in which a diagnosis of FEP was based on clinical assessment by a mental health care provider. We excluded the following types of studies: (1) systematic reviews without a meta-analytic synthesis of the evidence; (2) animal studies; (3) studies of other types of biomarkers (for example, genetic biomarkers); (4) studies that included participants with two or more diagnoses; (5) studies that included participants with other primary psychiatric diagnoses (e.g., anxiety disorders); (6) studies that investigated biomarkers for other purposes (for example, biomarkers of risk, stage or prognosis) 23; and (7) studies conducted in pediatric samples (except for ASD and FEP). In addition, (8) if there was more than one meta-analysis for the same biomarker in the same population, we considered only the largest meta-analysis (i.e., the one with the largest number of included individual studies).

Data extraction
For each eligible reference, we extracted the first author, year of publication, specific diagnoses assessed, as well as the number of included studies. We also extracted the summary effect size (ES) measure of each meta-analysis considering the ES used in each study. When available, the following variables were extracted at a study-level: number of cases, number of controls, sample size, ES, and study design. In each eligible reference, we only included the primary analyses due to the expected large amount of evidence. However, when included references provided details on the mood state of participants (e.g. mania or bipolar depression), we also extracted this information at an individual-study level.

Statistical analysis and methodological quality appraisal
Data were analyzed from March 1, 2019 to October 10, 2019. We estimated ESs and 95% confidence intervals (CIs) using both fixed and random-effects modeling 24. Due to the anticipated high heterogeneity in meta-analyses of peripheral biomarkers for major mental disorders, random-effects calculations were considered in this review. When ESs were not provided as standardized mean difference (SMD) metrics (e.g., odds ratio), we converted the primary ESs to SMD 25. We also estimated the 95% prediction interval, which accounts for between-study heterogeneity and assesses the uncertainty of the effect that would be expected in a new study addressing the same association 26. For the largest study included in each meta-analytic estimate, we calculated the standard error (SE) of the ES. If the SE of the ES is <0.1, then the half-width of the 95% CI will be <0.20 (i.e., less than the magnitude of a small ES). We calculated the I² metric to quantify between-study heterogeneity. Values ≥50% and ≥75% are indicative of large and very large heterogeneity, respectively 27. To assess evidence of small-study effects, we used the asymmetry test developed by Egger et al. 28. A P-value <0.10 in Egger's test, together with the ES of the largest study being more conservative than the summary random-effects ES of the meta-analysis, was considered indicative of small-study effects 20. We also annotated whether the association reported in each meta-analytic estimate was nominally significant at a P < 0.05 level as well as at a P < 0.005 level. The level of P < 0.005 has been proposed as a more stringent level of significance that could increase the reproducibility of many fields 29.
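The pooling steps described above can be sketched as follows. This is a minimal illustration in Python, not the STATA code used for the analyses. It assumes the DerSimonian-Laird estimator for the between-study variance and the logistic (√3/π) approximation for converting odds ratios to SMDs, neither of which is specified in the text, and all function names are hypothetical. The t critical value for the prediction interval (k − 2 degrees of freedom, where k is the number of studies) is supplied by the caller.

```python
import math


def or_to_smd(odds_ratio):
    """Convert an odds ratio to an SMD with the logistic approximation
    d = ln(OR) * sqrt(3) / pi (an assumed conversion, see lead-in)."""
    return math.log(odds_ratio) * math.sqrt(3) / math.pi


def random_effects(effects, ses):
    """DerSimonian-Laird random-effects pooling of study-level SMDs.

    effects: list of study effect sizes; ses: their standard errors.
    Returns the pooled ES, its SE, the 95% CI, tau^2 and I^2 (in %).
    """
    k = len(effects)
    w = [1.0 / se ** 2 for se in ses]                      # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0        # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0  # heterogeneity in %
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]          # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_mu = math.sqrt(1.0 / sum(w_re))
    ci = (mu - 1.96 * se_mu, mu + 1.96 * se_mu)
    return {"es": mu, "se": se_mu, "ci": ci, "tau2": tau2, "i2": i2}


def prediction_interval(mu, se_mu, tau2, t_crit):
    """95% prediction interval: mu +/- t_crit * sqrt(tau^2 + SE(mu)^2).

    t_crit is the 0.975 quantile of a t distribution with k - 2 degrees
    of freedom, looked up by the caller (e.g., 3.182 for k = 5 studies).
    """
    half = t_crit * math.sqrt(tau2 + se_mu ** 2)
    return (mu - half, mu + half)


def largest_study_is_precise(se_largest):
    """True when the 95% CI half-width (1.96 * SE) of the largest study
    is below 0.20, the magnitude of a small ES."""
    return 1.96 * se_largest < 0.20
```

With two hypothetical studies reporting SMDs of 0.0 and 1.0, each with SE 0.1, the sketch gives a pooled ES of 0.5 and an I² of 98%, which would count as very large heterogeneity under the thresholds above.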
We also determined whether the meta-analysis had a statistical power ≥ 80% to detect either a small (i.e., ES ≥ 0.2) or a medium (i.e., ES ≥ 0.5) ES. We used the method described in detail elsewhere 30. Finally, we also assessed evidence of excess of significance bias with the Ioannidis test 31. Briefly, this test estimates whether the number of studies with nominally significant results (i.e., P < 0.05) among those included in a meta-analysis is too large considering their power to detect significant effects at an alpha level of 0.05. First, the power of each study is estimated with a non-central t distribution. The sum of all power estimates provides the expected (E) number of datasets with nominal statistical significance. The actual observed (O) number of statistically significant datasets is then compared to the E number using a χ²-based test 31. Since the true ES of a meta-analysis cannot be precisely determined, we considered the ES of the largest dataset as the plausible true ES. This decision was based on the fact that simulations indicate that the most appropriate assumption is the ES of the largest dataset included in the meta-analysis 32. Excess significance for a single meta-analysis was considered if P < 0.10 in the Ioannidis test and O > E 20. We graded the credibility of each association according to the following categories: convincing (class I), highly suggestive (class II), suggestive (class III), weak evidence (class IV), and non-significant associations (Table S1).
For associations supported by either class I or class II evidence, we used credibility ceilings, which is a method of sensitivity analysis that accounts for potential methodological limitations of observational studies that might lead to spurious precision of combined effect estimates. In brief, this method assumes that every observational study has a probability c (credibility ceiling) that the true ES is in a different direction from the one suggested by the point estimate 33. The pooled ESs were estimated considering a wide range of credibility ceilings. All analyses were conducted in STATA/MP 14.0 (StataCorp, USA) with the metan package.
The methodological quality of included systematic reviews and meta-analyses was also appraised using the Assessment of Multiple Systematic Reviews (AMSTAR) instrument, which has been validated for this purpose 34,35. Scores range from 0 to 11, with higher scores indicating greater quality. The AMSTAR tool involves dichotomous scoring (i.e., 0 or 1) of 11 items that assess the methodological rigor of systematic reviews and meta-analyses (e.g., comprehensive search strategy, publication bias assessment). AMSTAR scores are graded as high (8-11), medium (4-7) and low quality (0-3) 34.
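The logic of the excess significance test can be sketched as below. This is a simplified illustration with hypothetical function names, not the procedure run in STATA. In particular, study power is approximated here with a normal distribution rather than the non-central t distribution named in the text, so the numbers would differ slightly for small studies. The decision rule (P < 0.10 and O > E) follows the text.

```python
import math
from statistics import NormalDist


def study_power(true_es, n_cases, n_controls, alpha=0.05):
    """Approximate two-sided power of a two-group study to detect an SMD of
    true_es (normal approximation; an assumption made for this sketch)."""
    nd = NormalDist()
    se = math.sqrt(1.0 / n_cases + 1.0 / n_controls)  # approximate SE of an SMD
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(true_es) / se
    return (1.0 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)


def excess_significance(observed, powers):
    """Compare the observed number of significant studies (O) with the
    expected number (E = sum of study powers) using a chi-square statistic
    with 1 degree of freedom. Returns (E, chi2, p)."""
    n = len(powers)
    expected = sum(powers)
    if not 0.0 < expected < n:
        raise ValueError("expected count must lie strictly between 0 and n")
    diff2 = (observed - expected) ** 2
    chi2 = diff2 / expected + diff2 / (n - expected)
    p = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df
    return expected, chi2, p


def has_excess_significance(observed, powers, threshold=0.10):
    """Apply the rule stated in the text: P < 0.10 and O > E."""
    expected, _, p = excess_significance(observed, powers)
    return p < threshold and observed > expected
```

In use, `true_es` would be set to the ES of the largest dataset in the meta-analysis, as described above, and `powers` would hold one `study_power` value per included study.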
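The AMSTAR scoring and grading bands described above amount to a simple mapping, sketched here in Python. The function names are hypothetical; the cut-offs are the ones stated in the text.

```python
def amstar_score(items):
    """Sum the 11 dichotomous AMSTAR items (each scored 0 or 1)."""
    if len(items) != 11:
        raise ValueError("AMSTAR has 11 items")
    if any(item not in (0, 1) for item in items):
        raise ValueError("each item must be scored 0 or 1")
    return sum(items)


def amstar_grade(score):
    """Map a total score to the quality bands used in this review:
    high (8-11), medium (4-7), low (0-3)."""
    if not 0 <= score <= 11:
        raise ValueError("score must be between 0 and 11")
    if score >= 8:
        return "high"
    if score >= 4:
        return "medium"
    return "low"
```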

Results
Our search strategy identified 1161 unique references, of which 991 were excluded after title/abstract screening and 170 underwent full-text review (Fig. 1). Of these, 110 references met inclusion criteria 7-11, and 60 references were excluded with reasons (Table S2). In the 110 included references, there were 81 between-group meta-analytic estimates for MDD, 79 for AD, 62 for schizophrenia, 45 for ASD, 37 for BD, and 15 for FEP. In addition, there were 25 within-group meta-analytic estimates for MDD, 13 for schizophrenia, and 2 for BD (mania) (Table S3). In total, there were 247,678 biomarker measurements in cases and 476,340 in controls across between-group meta-analyses, while there were 9298 biomarker measurements across within-group meta-analytic estimates (Table S3). One hundred and ninety meta-analytic estimates were statistically significant at a P-value < 0.05, whilst 109 were significant at a P-value < 0.005 (Table S3).
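As a quick consistency check, the per-disorder counts reported above can be summed and compared with the totals given in the abstract (359 meta-analytic estimates and 733,316 measurements). The short Python snippet below does only that arithmetic; the variable names are illustrative.

```python
# Between-group and within-group meta-analytic estimates, as reported above.
between_group = {"MDD": 81, "AD": 79, "schizophrenia": 62,
                 "ASD": 45, "BD": 37, "FEP": 15}
within_group = {"MDD": 25, "schizophrenia": 13, "BD (mania)": 2}

total_estimates = sum(between_group.values()) + sum(within_group.values())

# Measurements in cases, in controls, and across within-group estimates.
total_measurements = 247_678 + 476_340 + 9_298

print(total_estimates)     # 359
print(total_measurements)  # 733316
```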

Power of meta-analyses
Fifteen between-group meta-analytic estimates had an estimated power >0.8 to detect a small ES, and 145 meta-analyses (126 between-group meta-analyses) had an estimated power >0.8 to detect a medium ES (Table S3).

Small-study effects and excess significance bias
Evidence of small-study effects, which is an indication of publication bias, was observed in 38 (10.6%) meta-analyses, whilst evidence of excess of significance bias was verified in 74 (20.6%) meta-analytic estimates (Table S3).

Grading of the evidence
Only 2 (0.5%) meta-analytic estimates exhibited class I evidence (83,119). In euthymic BD participants there was an increase in basal cortisol awakening levels (Hedges' g = 0.25; 95% CI: 0.15-0.35, P < 0.005) compared to controls 87. Participants with schizophrenia presented decreased vitamin B6 (pyridoxal) levels relative to controls 123. In addition, 42 (11.7%) meta-analytic estimates were supported by class II evidence, of which 3 were derived from within-group meta-analyses (Table 1). Among those estimates, C-reactive protein levels were increased in euthymic BD, bipolar mania, and in MDD relative to controls 80,102. In addition, soluble interleukin (IL)-2 receptor (sIL-2R) levels were increased in MDD and in schizophrenia relative to controls 7,8. Moreover, levels of antibodies against the N-methyl-D-aspartate receptor (NMDA-R) were elevated in BD and in schizophrenia relative to controls 85. Brain-derived neurotrophic factor (BDNF) levels were decreased in AD and in MDD 44,110. Furthermore, levels of insulin-like growth factor-1 (IGF-1) were elevated in bipolar mania and in MDD relative to controls 84. The remaining findings supported by class II evidence were unique to a single disorder (Table 1).

Qualitative methodological appraisal of eligible meta-analyses
Qualitative methodological appraisal of eligible meta-analyses through the AMSTAR tool revealed that 49 references were classified as high, 58 as medium, and 3 as low methodological quality, respectively (Table S4). The overall methodological quality of included references was high according to the AMSTAR (median: 8; IQR = 2 [7-9]) (Table S4).

Discussion
Our umbrella review provided an updated synthesis of the literature of non-genetic peripheral biomarkers for major mental disorders. We included data from 733,316 biomarker measurements. However, in this vast literature only two associations met a priori defined criteria for convincing evidence, whilst 42 meta-analytic estimates met criteria for highly suggestive evidence. This collaborative effort found compelling evidence that overall the literature on non-genetic peripheral biomarkers has a high prevalence of different types of bias. In addition, this umbrella review provides relevant insights for the conduct of further studies to investigate the associations supported by the most convincing evidence. It should also be noted that overall the methodological quality of eligible meta-analyses as assessed with the AMSTAR tool was high, which provides further credibility to our quantitative grading of findings.
Associations supported by convincing evidence merit discussion. First, euthymic participants with BD exhibited a high cortisol awakening response relative to controls 87. This finding indicates that the hypothalamic-pituitary-adrenal (HPA) axis is disrupted in BD on a trait-like basis. This suggests that the HPA axis could be targeted in BD 140 to improve cognitive function, which may be compromised even during euthymic states 141,142. In addition, participants with schizophrenia exhibited decreased vitamin B6 (pyridoxal) levels compared to controls 123. This suggests that individuals with schizophrenia may present aberrations in the one-carbon cycle, in which pyridoxal is a main metabolic component. An alternative explanation might be the poor nutrition which frequently affects people with schizophrenia 98. This finding is consistent with a recent systematic review and meta-analysis which provided preliminary evidence that adjunctive pharmacological interventions targeting the one-carbon cycle may improve negative symptoms in schizophrenia (although the clinical significance of this improvement may remain questionable) 143, and aligns with recent evidence showing that adjunctive treatment with B-vitamins may improve symptomatic outcomes in the treatment of psychotic disorders 144,145.
Importantly, only five biomarkers were found to be significantly associated with more than one mental disorder. Also, the highest class of evidence for these biomarkers was II. Moreover, no study applied a methodologically solid approach to assess the transdiagnostic nature of any biomarker 17. We found peripheral elevation of the acute phase reactant, CRP, in BD (both during euthymia and mania) as well as in MDD, providing evidence that these disorders are at least partly associated with peripheral inflammation. In addition, sIL-2R was increased in both MDD and schizophrenia relative to controls. It is noteworthy that IL-2 is a key cytokine involved in the development, survival and function of regulatory T cells (Tregs) 146,147, and it has been recently proposed that aberrations in "fine tuning" immune-regulatory mechanisms may contribute to the pathophysiology of both MDD and schizophrenia 148,149. Antibodies against the NMDA-R were increased in BD and schizophrenia. This finding is consistent with the existence of autoantibodies against the GluN1 subunit of this receptor in patients with psychotic manifestations 150,151. Furthermore, lower serum BDNF levels were observed in participants with MDD and AD relative to controls. This finding is consistent with the "neurotrophic hypothesis" of depression 152, while parallel lines of evidence suggest that aberrations in BDNF signaling may contribute to neurodegeneration in AD 153. Finally, higher levels of IGF-1 were observed in bipolar mania and MDD compared to controls. This finding is consistent with the modulatory role of glucose-related signaling, including the trophic molecule IGF-1, in hippocampal plasticity 154. In addition, preclinical evidence suggests that IGF-1 may be involved in the pathophysiology of affective disorders 155,156.
There is an emerging body of literature investigating the putative role of non-genetic peripheral biomarkers for the prediction of treatment response in major mental disorders. Surprisingly, no such biomarkers met criteria for convincing evidence, while only three biomarkers met criteria for class II evidence. Adiponectin levels in schizophrenia decreased after treatment with second-generation antipsychotics. This is an interesting finding since hypoadiponectinemia has been associated with a wide range of metabolic diseases, which are common untoward effects of these drugs 157,158. In addition, IL-6 levels decreased after treatment with antidepressants. These data are consistent with preclinical findings which show that antidepressants have anti-inflammatory properties and may also inhibit M1 microglia polarization 159. Finally, lipid peroxidation markers increased after antidepressant drug treatment for MDD.
It is worth noting that only 15 meta-analytic estimates had a power >0.80 to detect a small ES. In addition, previous umbrella reviews indicate that the vast majority of peripheral biomarker studies are substantially underpowered 20. This may undermine the progress and reliability of this particular field and of neuroscience in general through the generation of spurious findings 160. The "true" ESs of most non-genetic peripheral biomarkers may be expected to be small, similarly to those reported in the genetic literature. Therefore, the design of large, multicenter studies with an open pre-registered protocol, or the creation of consortia, may be a crucial step to assess the role of peripheral biomarkers in the diagnosis and treatment of major mental disorders within the framework of precision psychiatry 1, following the model adopted by the ENIGMA neuroimaging group 161 or other large collaborative initiatives 162. Likewise, the creation of biomarker scores using a rationale similar to that used for polygenic risk scores may ultimately be a next step in this field.

Strengths and limitations
It should also be noted that large statistical heterogeneity was verified in most included meta-analytic estimates (81.9%). Although this is considered a relevant indicator of bias in this literature, it may also reflect genuine heterogeneity, which may occur both within and between major diagnostic categories 163. In addition, methodological differences of individual studies included in the assessed meta-analyses may also contribute to heterogeneity. Those include, for example, the time of sample selection as well as measurement properties of the assays (e.g., intra-assay and inter-assay coefficients of variation). Guidelines to standardize the collection and measurement of peripheral biomarkers in psychiatry have been recently proposed 164. Furthermore, differences in sample selection across individual studies might have contributed to the observed heterogeneity in some meta-analytic estimates. For example, illness stage and disorders in which mixed presentations are common (e.g., bipolar disorder) might have contributed to heterogeneity across some included meta-analyses. In addition, approaches to subtype major mental disorders according to frameworks such as the NIMH Research Domain Criteria may help to decrease the heterogeneity of this literature in the future through the study of biologically valid and more homogeneous phenotypes 13,163,165.
Table 1 footnote. Symbols: *Euthymia, **Mania, # Prospective study, $ Source: Red blood cell. BDNF brain-derived neurotrophic factor, IGF insulin-like growth factor, IL interleukin, INF interferon, KynA kynurenic acid, LDL low-density lipoproteins, MDA malondialdehyde, NMDAR N-methyl-D-aspartate receptor antibody seropositivity, NGF nerve growth factor, NT neurotrophin, QUIN quinolinic acid, sIL-2 Receptor soluble interleukin 2 receptor, TGF transforming growth factor, TNF tumor necrosis factor, 3HK 3-hydroxykynurenine.

Conclusion
This umbrella review of non-genetic peripheral biomarkers for major mental disorders revealed that this literature is fraught with several biases and is underpowered. Nevertheless, two associations supported by convincing evidence and 42 associations supported by highly suggestive evidence were identified. Most associations supported by either convincing or highly suggestive evidence pertained to a single disorder. Future multicenter studies with a priori publicly available protocols, an ad hoc methodology to assess the transdiagnostic nature of biomarkers 17, subtyping of these disorders into more biologically valid phenotypes, and sufficient statistical power may improve the reliability and reproducibility of this field, which is of relevance for the translation of biological and precision psychiatry into practice.