Efficacy and safety of Sihogayonggolmoryeo-tang (Saikokaryukotsuboreito, Chai-Hu-Jia-Long-Gu-Mu-Li-Tang) for post-stroke depression: A systematic review and meta-analysis

This systematic review and meta-analysis aimed to analyze the efficacy and safety of Sihogayonggolmoryeo-tang (SGYMT), a classical herbal medicine consisting of 11 herbs, for treatment of post-stroke depression (PSD). Thirteen databases were comprehensively searched from their inception dates until July 2019. Only randomized controlled trials (RCTs) using SGYMT as a monotherapy or adjunctive therapy for PSD patients were included. Where appropriate data were available, meta-analysis was performed and presented as risk ratio (RR) or mean difference (MD) with 95% confidence intervals (CIs). We assessed the quality of RCTs using the Cochrane risk of bias tool and the Jadad scale. The quality of evidence for each main outcome was evaluated using the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach. Twenty-one RCTs with 1,644 participants were included. In the comparison between the SGYMT and antidepressants groups, the SGYMT group scored significantly lower on both the Hamilton Depression Scale (HAMD) (8 studies; MD −2.08, 95% CI −2.62 to −1.53, I2 = 34%) and the National Institutes of Health Stroke Scale (NIHSS) (2 studies; MD −0.84, 95% CI −1.40 to −0.29, I2 = 19%), and significantly higher on the Barthel index (3 studies; MD 4.30, 95% CI 2.04 to 6.57, I2 = 66%). Moreover, the SGYMT group was associated with significantly fewer adverse events (6 studies; RR 0.13, 95% CI 0.05 to 0.37, I2 = 0%) than the antidepressants group. In the subgroup analysis, SGYMT treatment consistently reduced HAMD scores within the first 8 weeks of treatment, but thereafter this difference between groups disappeared. Comparisons between SGYMT combined with antidepressants, and antidepressants alone, showed significantly lower scores in the combination group for both HAMD (7 studies; MD = −6.72, 95% CI = −11.42 to −2.01, I2 = 98%) and NIHSS scores (4 studies; MD −3.03, 95% CI −3.60 to −2.45, I2 = 87%). 
In the subgroup analysis, the reductions of HAMD scores in the SGYMT combined with antidepressants group were consistent within 4 weeks of treatment, but disappeared thereafter. The quality of RCTs was generally low and the quality of evidence evaluated by the GRADE approach was rated mostly “Very low” to “Moderate.” The main causes of low quality ratings were the high risk of bias and imprecision of results. Current evidence suggests that SGYMT, used either as a monotherapy or an adjuvant therapy to antidepressants, might have potential benefits for the treatment of PSD, including short-term reduction of depressive symptoms, improvement of neurological symptoms, and few adverse events. However, since the methodological quality of the included studies was generally low and there were no large placebo trials to ensure reliability, it remains difficult to draw definitive conclusions on this topic. Further well-designed RCTs addressing these shortcomings are needed to confirm our results.

Inclusion criteria. Types of studies. This method was carried out as described previously 31 . We included only RCTs, and excluded quasi-RCTs using inappropriate random sequence generation methods. Studies using the expression "randomization" (随机) without descriptions of randomization methods were included. We included both parallel and crossover studies. In crossover designs, only first-phase data were used to calculate the effect size and in the meta-analysis. Other designs such as in vivo, in vitro, case reports, retrospective studies, and non-randomized controlled trials were excluded.
Participant characteristics. This method was carried out as described previously 31 . We included studies on patients diagnosed with depression following stroke using standardized diagnostic tools such as the DSM-5, regardless of sex, age, or race. Studies were excluded if the participants had drug allergies or other serious illnesses such as cancer, liver disease, or kidney disease.
Intervention types. This method was carried out as described previously 31 . We included studies using SGYMT, i.e. 11 kinds of herbs including Bupleuri Radix, Pinelliae Rhizoma, Ramulus Cinnamomi, Poria, Scutellariae Radix, Jujubae Fructus, Ginseng Radix or Codonopsis Radix, Ostreae Concha, Fossilia Ossis Mastodi, Zingiberis Rhizoma Recens, and Rhei Rhizoma. Given that HMs, such as SGYMT, are also known as so-called "modified HM, " which allow some modifications of their compositions to achieve increased efficacy [32][33][34] , we also included studies using modified SGYMT, which was defined in this review as SGYMT containing more than 50% of the original prescription composition (i.e. HM designated as "modified SGYMT", which contained 6 or more of the 11 basic components). We allowed the use of any form of SGYMT. Studies combining SGYMT with other therapies as treatment interventions were included, if the other therapies were used equally in both the treatment and control groups. For the control intervention, we included studies that used placebos, no treatment, and conventional medical treatments. We excluded studies using HM as the control intervention because these studies could not yield the net effect of SGYMT. There were no other restrictions regarding the control intervention.
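The eligibility rule for modified SGYMT (6 or more of the 11 basic components) can be sketched as a simple set check. This is a minimal illustration, not the reviewers' actual screening procedure. The function names and the example formula are hypothetical, and treating Ginseng Radix and Codonopsis Radix as one interchangeable slot is an assumption based on the "or" in the component list.

```python
# The 11 basic components as listed in the review. Ginseng Radix and
# Codonopsis Radix are treated here as alternatives for a single slot
# (an assumption drawn from the "Ginseng Radix or Codonopsis Radix" wording).
BASIC_COMPONENTS = [
    {"Bupleuri Radix"},
    {"Pinelliae Rhizoma"},
    {"Ramulus Cinnamomi"},
    {"Poria"},
    {"Scutellariae Radix"},
    {"Jujubae Fructus"},
    {"Ginseng Radix", "Codonopsis Radix"},
    {"Ostreae Concha"},
    {"Fossilia Ossis Mastodi"},
    {"Zingiberis Rhizoma Recens"},
    {"Rhei Rhizoma"},
]


def count_basic_components(formula):
    """Count how many of the 11 basic slots are covered by a formula."""
    herbs = set(formula)
    return sum(1 for slot in BASIC_COMPONENTS if herbs & slot)


def is_eligible_sgymt(formula, minimum=6):
    """Return True if the formula keeps at least `minimum` basic components."""
    return count_basic_components(formula) >= minimum


# Hypothetical modified formula: 6 basic components plus 1 added herb.
example_formula = [
    "Bupleuri Radix", "Pinelliae Rhizoma", "Poria", "Ostreae Concha",
    "Fossilia Ossis Mastodi", "Codonopsis Radix", "Curcumae Radix",
]
print(count_basic_components(example_formula), is_eligible_sgymt(example_formula))
```

Added herbs do not affect the count; only the overlap with the basic components decides eligibility.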
Outcome measures. This method was carried out as described previously 31 . The primary outcome measures were (1) the post-treatment degree of depression measured by the Hamilton Depression Scale (HAMD) 35 or the Beck Depression Inventory (BDI) 36 and (2) adverse events (AEs) measured by the Treatment Emergent Symptom Scale (TESS) 37 or by their incidence. The secondary outcome measures included the total effective rate (TER), a non-validated outcome measure that is processed secondarily according to certain evaluation criteria such as clinical symptom improvement, or the improvement rates of other quantified outcomes. In the assessment of TER, participants are generally classified as "cured", "markedly improved", "improved", or "non-responder" after treatment. TER is calculated consistently using the following formula: TER = (N1 + N2 + N3)/N, where N1, N2, and N3 are the numbers of patients who are cured, markedly improved, and improved, respectively, and N is the total sample size. As further secondary outcome measures, we evaluated post-treatment neurological function using the National Institutes of Health Stroke Scale (NIHSS), a tool used to quantify stroke-related impairment 38 ; activities of daily living (ADL) using the Barthel index, a tool used to describe ADL and mobility 39 ; and quality of life using the 36-Item Short Form Health Survey, a patient-reported survey of the respondent's own health 40 .
Study selection. After removing duplicates, two researchers (CY Kwon and B Lee) independently screened the titles and abstracts of all searched studies for relevance and then evaluated the full texts of the eligible studies for final inclusion. Any disagreement about study selection was resolved through discussion with other researchers, as previously reported 31 .
Data extraction. This method was carried out as described previously 31 .
Two researchers (CY Kwon and B Lee) independently performed and crosschecked the data extraction using a standardized data collection form (Excel 2007, Microsoft, Redmond, WA, USA). Discrepancies were resolved through discussion with other researchers. The extracted items included the first author's name; year of publication; country; sample size and number of dropouts; details about the participants, HM, control intervention, and comparisons; duration of the intervention; outcome measures; and AEs associated with interventions. We contacted the corresponding authors of the included studies by e-mail to request additional information if the data were insufficient or ambiguous. Quality assessment. This method was carried out as described previously 31 . Two researchers (CY Kwon and B Lee) independently assessed the methodological quality of all included studies, and the quality of evidence for each main finding. We resolved discrepancies through discussion with other researchers.
The methodological quality of the included studies was evaluated using both the Cochrane Collaboration's risk of bias tool 41 and the Jadad scale 42 . Using the Cochrane risk of bias tool, the following domains were assessed for each included study: random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessments, incomplete outcome data, selective reporting, and other potential biases. Each domain was categorized into one of three groups: "low risk," "unclear," or "high risk." In the random sequence generation domain, we assessed a study as high risk of bias if the expression "randomization" was mentioned without a description of randomization methods. We assessed other potential sources of bias with particular emphasis on possible baseline imbalances arising from a priori selection characteristics for treatment and control groups, such as mean participant age, or baseline depression level. Baseline imbalance arising from selection characteristics that are strongly related to outcome measures may bias the estimation of intervention effects in RCTs 41 . The Jadad scale assesses the randomization method, blinding, and descriptions of withdrawals and dropouts, and the total score is presented on a scale of 0-5.
The quality of evidence for each main outcome was evaluated by using the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach 43 . Using the online program GRADEpro (https://gradepro.org/), we assessed the risk of bias; inconsistency, indirectness, and imprecision of the results; and the probability of publication bias, and rated the quality of evidence on a four-level scale ("Very low", "Low", "Moderate", or "High").
Data synthesis and analysis. This method was carried out as described previously 31 . We used Review Manager version 5.3 software (Cochrane, London, UK) for data synthesis and analysis. Descriptive analyses of details of the participants, interventions, and outcomes were conducted for all included studies. Meta-analysis was performed for studies using the same types of intervention, comparison, and outcome measure. We pooled continuous outcomes as the mean difference (MD) with 95% confidence intervals (CIs), and dichotomous outcomes as a risk ratio (RR) with 95% CIs. Heterogeneity of effect measures between studies was assessed using both the chi-squared test and the I-squared statistic (I²). We considered I² values greater than 50% and 75% indicative of substantial and high heterogeneity, respectively. Our pre-registered plan was to use a random-effects model when the heterogeneity was significant (I² > 75%) and a fixed-effects model when it was non-significant. During the review process, however, we learned that this practice was no longer supported and that a random-effects model was preferable, given the potential heterogeneity in true treatment effects due to differences in the treatment components, research groups, and patient selection criteria among the included studies. Therefore, we reported both the results of the models that were pre-registered and those of potentially more appropriate random-effects models. However, we used only fixed-effects models when fewer than 5 studies were included in the meta-analysis, because estimates of between-study variance have poor accuracy in that situation 44,45 .
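The pooling and heterogeneity statistics described above can be illustrated with a short sketch. It assumes inverse-variance weighting and the DerSimonian-Laird estimator of between-study variance, which are common choices for MD outcomes but are not spelled out in the text. The function name and the input data are hypothetical and are not taken from the included trials.

```python
import math


def pool_inverse_variance(effects, std_errors):
    """Pool study effects with fixed- and random-effects (DerSimonian-Laird) models."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w

    # Cochran's Q and the I-squared statistic (percentage of variability
    # attributed to heterogeneity rather than chance).
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    total_re = sum(re_weights)
    random = sum(w * y for w, y in zip(re_weights, effects)) / total_re

    return {
        "fixed": fixed,
        "fixed_se": math.sqrt(1.0 / total_w),
        "random": random,
        "random_se": math.sqrt(1.0 / total_re),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }


# Hypothetical mean differences and standard errors from three trials.
result = pool_inverse_variance([-2.5, -1.8, -2.1], [0.6, 0.5, 0.7])
low = result["random"] - 1.96 * result["random_se"]
high = result["random"] + 1.96 * result["random_se"]
print(f"random-effects MD {result['random']:.2f} (95% CI {low:.2f} to {high:.2f}), "
      f"I2 = {result['I2']:.0f}%")
```

When the estimated tau² is zero, the two models give the same pooled estimate; they diverge as heterogeneity grows, which is why reporting both can be informative.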
If the necessary data were available, we conducted a subgroup analysis to account for the heterogeneity or to assess whether the treatment effects vary between subgroups according to the following criteria: (1) the treatment period; (2) the dosage form of SGYMT, such as decoctions or granules; (3) the presence or absence of a placebo; (4) the severity of depression; and (5) the types of antidepressants used. In addition, we performed sensitivity analyses to assess the robustness of the meta-analysis results by excluding (1) studies with high risks of bias, (2) studies with missing data, and (3) outliers that are numerically distant from the rest of the data. If more than 10 trials were included in the meta-analysis, reporting biases such as publication bias were assessed using funnel plots. When reporting bias was implied by funnel plot asymmetry, we attempted to explain possible reasons for this. Additionally, we used Egger's linear regression analysis and Begg and Mazumdar's rank correlation analysis to assess publication bias with Stata/MP version 15.1 software 46,47 .
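Egger's linear regression, mentioned above, regresses each study's standardized effect (effect divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero suggests funnel plot asymmetry. The sketch below is a plain ordinary least squares version written for illustration. It is not the Stata routine used in the review, the function name and inputs are hypothetical, and it stops at the t statistic because a p-value would need a t distribution with n − 2 degrees of freedom. For dichotomous outcomes such as RR, the effects would be supplied on the log scale.

```python
import math


def egger_regression(effects, std_errors):
    """Return the Egger intercept, its standard error, slope, and t statistic."""
    n = len(effects)
    if n < 3:
        raise ValueError("at least 3 studies are needed")
    x = [1.0 / se for se in std_errors]                    # precision
    y = [e / se for e, se in zip(effects, std_errors)]     # standardized effect

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance with n - 2 degrees of freedom.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (n - 2)
    intercept_se = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / intercept_se if intercept_se > 0 else float("nan")

    return {"intercept": intercept, "intercept_se": intercept_se,
            "slope": slope, "t": t_stat, "df": n - 2}


# Hypothetical effects and standard errors, for illustration only.
print(egger_regression([2.0, 1.5, 5.0 / 3.0], [1.0, 0.5, 1.0 / 3.0]))
```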

Results
Description of included studies. We identified a total of 101 records through database searching. After screening of titles and abstracts, 38 articles were considered to be relevant. After full-text review, 17 of these were excluded: 1 review article, 4 non-RCTs or quasi-RCTs, 5 studies that did not describe the diagnostic criteria of PSD, and 7 that did not describe the contents of the conventional medication prescribed. In total, 21 RCTs with 1,644 participants were included in this review and meta-analysis (Fig. 1).
The general characteristics of the included studies are summarized in Table 1. All RCTs were conducted in China. One was a thesis 50 , 1 was a conference proceedings 48 , and the remaining 19 were journal articles. Thirteen RCTs compared SGYMT to antidepressants [48][49][50][51][52][53][54][55][56][57][58][59][60] , and the other 8 compared SGYMT combined with antidepressants to antidepressants alone [61][62][63][64][65][66][67][68] . We were unable to find any placebo-controlled trials. Sample sizes ranged from 48 to 165 with a median of 70, and treatment periods ranged from 14 to 90 days with a median of 42 days. Five studies 48,52,59,60,64 recruited participants with specific traditional Chinese medicine (TCM) symptom patterns; this approach enables individual treatment by categorizing the signs and symptoms of patients into a series of syndrome concepts 69 : four 48,52,60,64 were associated with stagnation of the liver or qi, and the remaining one 59 was a liver-kidney yin deficiency. As control interventions, a total of three types of antidepressants were used: selective serotonin reuptake inhibitors in nine 50,[52][53][54]56,57,60,63,68 , TCA in three 49,55,58 , and flupentixol/ melitracen in nine 48,51,59,61,62,[64][65][66][67] . In most cases, routine care for stroke (RCS) using pharmaceutical anti-platelet, anti-coagulation, and neurotrophic agents, and vasodilators, was performed for both groups. In one study 53 , psychotherapy was performed with the RCS for both groups. The most frequently used outcome was TER in 18 studies [48][49][50][51][52][54][55][56][57][58][59][60][61][62][63][64][65]67 , followed by HAMD in 15 48 64 was based on both the clinical symptoms and the TCM symptom score. Two studies reported the approval of institutional review board (IRB) 51,68 , and 11 studies reported that they had received consent from the participants 51-53,56,59,60,62,64,66-68 . Methodological quality. 
Based on analysis using the Cochrane risk of bias tool, eight studies 48,51,52,54,58,63,66,68 using appropriate methods of random sequence generation, such as computerized random number tables, were considered to have a low risk of bias on the random sequence generation domain. The remaining 13 studies 49,50,53,[55][56][57][59][60][61][62]64,65,67 were considered to have a high risk of bias because they did not describe their random sequence generation methods. No studies reported allocation concealment, or blinding of participants, personnel, and outcome assessors. The domain of participant and personnel blinding was rated as a high risk of bias in all studies, given that no study used placebos. For 2 studies that reported dropout 54,58 , the domains of incomplete outcome data were rated as low and high risk of bias respectively, according to the processing method for missing data that was intent-to-treat analysis 54 , or per-protocol analysis 58 . None of the included RCTs had published study protocols. Four studies that reported only TER as an outcome 49,54,55,58 , 1 that did not report the result of outcomes that were nonetheless described in the Methods section 59 , 1 that assessed HAMD but did not report the raw data 60 , and 1 that did not report depression-related outcomes 68 , were rated with a high risk of bias in the selective reporting domain. Although we contacted the corresponding authors of 2 of these studies via e-mail to obtain raw data 54,60 , we received no replies. All studies reported no significant baseline difference in demographic data between the two groups, and were rated as having low risk of bias in the other potential sources of bias domains (Figs 2 and 3). Based on the Jadad scale, the mean score was 2.38 (SD 0.50); 8 studies 48,51,52,54,58,63,66,68 had a total score of 3 and 13 49,50,53,[55][56][57][59][60][61][62]64,65,67 had a total score of 2 (Table 1 and Supplemental Digital Content 2).
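The Jadad summary above can be re-derived from the reported score distribution (8 studies scoring 3 and 13 scoring 2). The short check below assumes the reported SD is the sample standard deviation (n − 1 denominator), which is the convention that reproduces the reported value of 0.50.

```python
import statistics

# Score distribution reported in the text: 8 studies scored 3, 13 scored 2.
jadad_scores = [3] * 8 + [2] * 13

mean_score = statistics.mean(jadad_scores)
sample_sd = statistics.stdev(jadad_scores)   # n - 1 denominator (assumed)

print(f"n = {len(jadad_scores)}, mean = {mean_score:.2f}, SD = {sample_sd:.2f}")
```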
Details of SGYMT administration. The decoction dosage form was used in all studies except for 2 using granules 48,60 . Apart from 2 studies that did not report medication frequency 55,58 , the remaining 19 instructed patients to take the prescription twice a day. Twenty-five types of herbs were used in addition to the 12 types of basic components. Except for Ginseng Radix (28.57%), used as a substitute for Codonopsis Radix, the remaining 11 basic herbs were used at frequencies of 61.90-100% in the included studies. In particular, Bupleuri Radix, Pinelliae Rhizoma, and Fossilia Ossis Mastodi were used in all studies (all, 100%), and Poria and Ostreae Concha were used in 20 studies (both, 95.24%). The 25 additional herbs were used at frequencies of 4.76-42.86% depending on the type; Curcumae Radix and Glycyrrhizae Radix were the most frequent (42.86% each), followed by Astragali Radix, Hoelen cum Pini Radix, and Angelicae Gigantis Radix (28.57% each) (Supplemental Digital Content 3, which describes the details of SGYMT and herbs added to the original SGYMT formulation).
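The percentages above are consistent with counts out of the 21 included studies. The sketch below shows the conversion. Only the counts of 20 and 21 studies are stated in the text; the other counts (1, 6, 9, and 13) are inferred from the reported percentages and should be read as assumptions.

```python
N_STUDIES = 21  # number of included RCTs


def frequency_percent(n_using, n_total=N_STUDIES):
    """Percentage of studies using a herb, rounded to two decimals."""
    return round(100 * n_using / n_total, 2)


# Counts other than 20 and 21 are inferred from the reported percentages.
for count in (1, 6, 9, 13, 20, 21):
    print(f"{count:>2} of {N_STUDIES} studies -> {frequency_percent(count):.2f}%")
```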

SGYMT versus antidepressants.
Efficacy. The meta-analysis showed that, relative to the antidepressants group, the SGYMT group had significantly lower HAMD scores (8 studies 48,[50][51][52][53]56,57,59 ; MD −2.08, 95% CI −2.62 to −1.53, I² = 34%) (Fig. 4) and higher TERs based on depression scales (11 studies 48-52,54-58,60 ; RR 1.11, 95% CI 1.06 to 1.17, I² = 0%). Subgroup analysis showed that when the treatment period was longer than 8 weeks, these significant between-group differences disappeared for the depression scales including HAMD (2 studies 50,57 ; MD −0.66, 95% CI −2.11 to 0.78, I² = 0%), and for TERs based on depression scales (3 studies 50,54,57 ; RR 1.05, 95% CI 0.91 to 1.21, I² = 0%). To confirm the robustness of these results, sensitivity analyses were performed after excluding low quality RCTs, defined as those rated as low risk of bias in 3 or fewer of the 7 domains of the risk of bias tool. The superior effectiveness of SGYMT demonstrated by the depression scales including HAMD, and the TER, was consistent within 8 weeks of treatment (Supplemental Digital Content 4).
51 reported modified Edinburgh-Scandinavian stroke scales and functional independence measures respectively as their outcomes, with the SGYMT group showing significantly better results relative to the control group (p < 0.05 for both studies). Moreover, Liu et al. 53 reported significantly lower serum levels of interleukin-1β and tumor necrosis factor-α in the SGYMT group after 28 days of treatment (p < 0.05 for both comparisons).
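Whether a pooled estimate is statistically significant can be read from its 95% CI: an MD is significant when the interval excludes 0, and an RR when it excludes 1. As a rough check, the sketch below recovers an approximate standard error, z statistic, and two-sided p-value from a reported estimate and CI, assuming a normal approximation (on the log scale for RR). The function name is hypothetical, and because the published figures are rounded, the outputs are approximations only.

```python
import math
from statistics import NormalDist


def z_from_ci(estimate, lower, upper, log_scale=False, level=0.95):
    """Approximate SE, z, and two-sided p from an estimate and its CI."""
    if log_scale:  # ratio measures such as RR are analysed on the log scale
        estimate, lower, upper = (math.log(v) for v in (estimate, lower, upper))
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p


# Values reported in the Efficacy paragraph above.
print(z_from_ci(-2.08, -2.62, -1.53))                 # HAMD, all studies
print(z_from_ci(-0.66, -2.11, 0.78))                  # HAMD, > 8 weeks
print(z_from_ci(1.11, 1.06, 1.17, log_scale=True))    # TER, all studies
print(z_from_ci(1.05, 0.91, 1.21, log_scale=True))    # TER, > 8 weeks
```

The two overall estimates give small p-values, while the two subgroup estimates for treatment longer than 8 weeks do not reach significance, in line with the text.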
Safety. There were significantly fewer AEs associated with SGYMT (6 studies 48,50-52,56,59 ; RR 0.13, 95% CI 0.05 to 0.37, I² = 0%) than with antidepressants (Fig. 5). In the subgroup analysis, significant differences between these two groups disappeared when the treatment period was longer than 8 weeks (1 study 50 ; RR 0.29, 95% CI 0.07 to 1.21), or when SGYMT was administered as granules (1 study 48 ; RR 0.07, 95% CI 0.00 to 1.13). However, sensitivity analysis performed by excluding low quality RCTs showed no significant difference between the two groups when the treatment period was shorter than 4 weeks (1 study 48 ; RR 0.07, 95% CI 0.00 to 1.13) or when the type of antidepressant consisted of flupentixol/melitracen (2 studies 48,51 ; RR 0.07, 95% CI 0.00 to 1.13) (Supplemental Digital Content 4).
Table 1. Characteristics of included studies. ¶ Among three groups in this study, data for the control group undergoing psychotherapy combined with RCS was removed, as this was considered an irrelevant intervention. § Both groups showed no significant abnormality in blood and urine test, kidney function, and electrocardiogram. ※ An approach of some East Asian traditional medicines, including TCM, which enables individual treatment by categorizing the signs and symptoms of patients into a series of syndrome concepts. '*' and '+' mean significant differences between two groups, p < 0.05 and p < 0.01, respectively. 'N.S' means no significant difference between two groups, p
Interestingly, significant differences in HAMD between the combination group and the antidepressants alone group disappeared when the treatment period was longer than 4 weeks (4 studies 62,63,65,67 ; MD −7.86, 95% CI −16.50 to 0.77, I² = 99%) (Fig. 6). Sensitivity analysis performed by excluding low quality RCTs showed that the combination treatment was consistently more effective when the treatment lasted less than 4 weeks (1 study 66 ; MD −4.04, 95% CI −6.51 to −1.57).
In addition, the extremely high heterogeneity (I² = 98%) in the HAMD scores was reduced to 0% as a result of the sensitivity analysis performed by excluding low-quality RCTs (Supplemental Digital Content 4).
Liu and Wang 65 calculated TER using both depression and stroke scales, and reported that the two groups showed similar efficacies (29/30 for the combination group, 27/30 for the control group, no P-value reported). Lai et al. 62 and Liu 66 reported the Barthel index and generic quality of life inventory-74 as their outcomes. Using these measures, the combination group showed significantly better results than did the antidepressants alone group (p < 0.05 and p < 0.01, respectively).

Safety.
No studies reported outcomes related to safety in this comparison.
Quality of evidence. In the comparison of SGYMT and antidepressants, the qualities of evidence were graded as "Very low" to "Moderate" (Table 2). Likewise, in the comparison of SGYMT combined with antidepressants and antidepressants alone, the qualities of evidence were graded as "Very low" to "Moderate" (Table 3). There was no high quality of evidence. The main reason for downgrading was the high risk of bias in the RCTs included in each meta-analysis. In addition, most findings were judged to have low precision because they did not satisfy the optimal sample size and had wide CIs. The indirectness of outcome measure also lowered the quality of evidence.

Publication bias.
No evidence of publication bias (distinct asymmetry) emerged from the funnel plots of TER based on depression scales comparing the efficacy of SGYMT with that of antidepressants alone. In addition, publication bias could not be proven using Egger's method (P value for bias: 0.174) or Begg's method (continuity corrected Z score: 0.78, continuity corrected P value: 0.436) (Fig. 7).

Discussion
This review aimed to evaluate the effectiveness and safety of SGYMT as a monotherapy or adjunctive therapy to antidepressants for PSD. A comprehensive search yielded 21 RCTs that were suitable for inclusion in our review.
The findings of our analysis were as follows: (1) In the comparison between SGYMT and antidepressants, SGYMT monotherapy significantly improved depression outcomes relative to pharmaceutical antidepressants, as measured by HAMD (MD −2.08, 95% CI −2.62 to −1.53, I² = 34%) and by TER based on depression scales (RR 1.11, 95% CI 1.06 to 1.17, I² = 0%). However, subgroup analysis of treatment periods showed that such differences on HAMD (≤4 weeks: MD −1.98, 95% CI −3.13 to −0.83, I² = 34%; >4 weeks, ≤8 weeks: MD −2.48, 95% CI −3.04 to −1.93, I² = 0%) and TER (≤4 weeks: RR 1.11, 95% CI 1.04 to 1.18, I² = 16%; >4 weeks, ≤8 weeks: RR 1.21, 95% CI 1.06 to 1.39, I² = 0%) were only evident for treatment periods shorter than 8 weeks, a result consistent with that of the sensitivity analysis performed after exclusion of low quality RCTs. Additionally, the SGYMT group showed significant improvement of neurological functions evaluated by TER based on stroke scale (RR 1.31, 95% CI 1.15 to 1.49, I² = 89%), NIHSS (MD −0.84, 95% CI −1.40 to −0.29, I² = 19%), and CSS (MD −5.37, 95% CI −6.60 to −4.15, I² = 43%). Differences that emerged from this comparison were sustained when treatment periods were longer than 4 or 8 weeks, for the TER (>8 weeks: RR 1.80, 95% CI 1.37 to 2.37, I² = 0%) and NIHSS outcome measures (>4 weeks, ≤8 weeks: MD −1.05, 95% CI −1.71 to −0.39). These results suggest that the effectiveness of SGYMT for treatment of PSD has a different time trajectory relative to that of antidepressants.
(2) In the comparison between SGYMT combined with antidepressants and antidepressants alone, the combined treatment also significantly improved depression evaluated by HAMD (MD −6.72, 95% CI −11.42 to −2.01, I² = 98%) and TER based on depression scale (RR 1.66, 95% CI 1.40 to 1.97, I² = 94%); however, the benefits assessed using the HAMD were sustained only for treatment periods shorter than 4 weeks. These results are consistent with comparisons between SGYMT monotherapy and antidepressants, suggesting that SGYMT may alleviate the symptoms of PSD more rapidly than do pharmaceutical antidepressants. Moreover, the combination treatment group showed more marked improvement of neurological function evaluated by NIHSS (MD −3.03, 95% CI −3.60 to −2.45, I² = 87%) than did the group treated with antidepressants alone. (3) Regarding the safety data, only six RCTs 48,[50][51][52]56,59 comparing SGYMT with antidepressants reported the incidence of AEs. The SGYMT group showed significantly fewer AEs than did the antidepressants group (RR 0.13, 95% CI 0.05 to 0.37, I² = 0%), regardless of the types of antidepressants compared (SSRI: RR 0.14, 95% CI 0.03 to 0.65, I² = 30%; flupentixol/melitracen: RR 0.07, 95% CI 0.01 to 0.53, I² = 0%). However, this difference disappeared when treatment periods were longer than 8 weeks (RR 0.29, 95% CI 0.07 to 1.21), or when SGYMT was administered as granules (RR 0.07, 95% CI 0.00 to 1.13). Additionally, sensitivity analysis performed by excluding low quality RCTs showed that the significant difference disappeared when the treatment period was shorter than 4 weeks. Altogether, these results suggest that SGYMT may be consistently more effective and safer than antidepressants over treatment periods of 4 to 8 weeks. (4) The methodological quality of the included studies and the strength of evidence were generally poor.
The Cochrane risk of bias tool showed that only 8 of 21 included trials used and reported appropriate methods of random sequence generation. Moreover, no studies reported allocation concealment; blinding of participants, personnel, and outcome assessors; or use of placebo designs. In addition, because none of the studies available for our meta-analysis had previously published a study protocol, their results may be selectively reported and/or biased. We also assessed the quality of the included RCTs by using the Jadad scale; the mean score was 2.38, which indicated that the quality of the studies included in this review was generally low. The quality of evidence assessed by GRADE was "Very low" to "Moderate" and there was no "High" quality evidence. This means that the evidence comparing SGYMT and antidepressants would be significantly improved by future additions of high quality research. Although a definite conclusion could not be drawn due to the low quality of the included studies and the evidence, our findings suggested the following implications of SGYMT use. First, as an alternative or adjunctive therapy, SGYMT might have antidepressant effects, especially within the first 4 to 8 weeks of treatment. Second, SGYMT probably improves neurological function and the ADL of PSD patients, which are difficult to improve with conventional antidepressants 17 . Third, SGYMT was associated with fewer AEs, especially when administered between 4 and 8 weeks after the start of treatment. However, all these implications are hypothetical and cannot be confirmed by our results.
As a modality of complementary and alternative medicine, HM has been regarded as a potential replacement or supplement for conventional medicine when applied to various pathological conditions including psychiatric disorders such as depression, insomnia, and schizophrenia [70][71][72][73] .
The underlying mechanism by which SGYMT, one of the famous classical herbal medicines, serves as treatment for PSD is not fully understood; however, for some key herbs of SGYMT, relevant underlying mechanisms have been reported. For example, Bupleuri Radix, a key component of the SGYMT prescription, is known to reduce neuro-inflammation 74 and oxidative stress 75 , and increase concentrations of nerve growth factor and brain-derived neurotrophic factor 76 . All these mechanisms are associated with the etiology of depression. Scutellariae Radix, another key component of this prescription, alleviates depression through several complex molecular mechanisms 77 , thereby complementing the action of Bupleuri Radix. Some HMs such as Chai Hu Shu Gan San and Xiao Yao San, which include Bupleuri Radix as a key component, have significant therapeutic effects on depression 70,78 . Other components of SGYMT, including Ginseng Radix, also have antidepressant effects [79][80][81] . Moreover, the multiple components of HM may exert a complex effect on multiple molecular targets 23 . Thus, HM such as SGYMT may help to improve neurological symptoms in addition to alleviating the symptoms of depression in PSD patients.
The following limitations should be kept in mind when interpreting the results of this meta-analysis. First, because all studies reviewed were conducted in China, general applicability of the results may be limited. Second, the quality of the included studies is generally low, particularly with respect to the lack of placebo-controlled trials. Therefore, the possibility that our study overestimated the effectiveness of SGYMT cannot be ruled out. The low quality of the included studies implies that the reliability of our results is very low. In other words, our results should be interpreted with great caution considering that they may change markedly according to the results of future rigorous research. Furthermore, the popularity of HM in China may have elevated Chinese participants' expectations of SGYMT. In studies comparing SGYMT combined with antidepressants with antidepressants alone, participants are likely to have high expectations of the former treatment, possibly increasing the placebo effect. Third, our protocol planned a subgroup analysis according to the severity of depression, but this could not be carried out because too few studies included criteria assessing the severity of depression. Fourth, only five of the included studies recruited PSD patients with a specific TCM pattern. The TCM pattern can be used in conjunction with the diagnosis of the disease; this so-called "disease-syndrome combination" can be used to fully exploit the advantages of the HM 82 , which is advantageous for individual-specific treatment. Finally, in our review, the control groups of the included studies were prescribed antidepressants regardless of their type, which led to distinct clinical heterogeneity. Although we conducted careful subgroup analyses according to each type of drug, the number of studies included was not sufficient to quantify the comparative effect size of SGYMT compared to each type of antidepressant and to explain the heterogeneity adequately.
Suggestions for future research are as follows. Further high-quality RCTs on the efficacy of SGYMT for reducing PSD are needed, particularly in countries other than China, where wide acceptance of HM for the treatment of PSD may positively bias the results of comparisons with pharmaceutical antidepressants. Accordingly, when planning these studies, it is necessary to consider stratified randomization or post-correction that reflects expectations for HM to avoid potential placebo effects. Moreover, placebo-controlled trials are essential to assess the efficacy and safety of SGYMT objectively. To optimize the use of SGYMT in PSD treatment, future studies should characterize participants in greater detail than was possible in our analysis, particularly the severity of their depression, and their TCM patterns. In particular, individual characteristics are an important component of HM practice, so it is necessary to establish a subgroup of PSD patients with personalized medicine profiles suitable for the administration of SGYMT. TCM patterns may be useful in this selection process. Furthermore, it is important to obtain ethical approval from an IRB before conducting clinical research to protect the dignity, rights, and welfare of research participants, which is in line with World Health Organization guidelines 83 . It is important to explain to the participants the purpose, content, and method of the research, as well as its potential benefits and risks; informed consent should also be obtained from participants in all clinical research studies. In addition, studies using health insurance data in China, Japan, Korea, and Taiwan, where health insurance for HM is applied, may enlarge the database and help specify the indications for SGYMT. Finally, the multi-compound multi-target aspect of HM has potential to contribute to the improvement of both neurological function and depressive symptoms. 
A comprehensive experimental study of the underlying molecular mechanism of action of SGYMT is needed.
In conclusion, current evidence suggests that SGYMT, either as a monotherapy, or as an adjuvant therapy combined with antidepressants, might have potential benefits for the treatment of PSD. However, since the methodological quality of the included studies was poor and there were no large, placebo-controlled trials to ensure freedom from bias, the results of the meta-analysis may be overestimated; thus, it remains difficult to draw definitive conclusions on this topic. Further well-designed RCTs are needed to confirm these results.