Introduction

Obsessive-compulsive disorder (OCD) is a prevalent and highly heterogeneous disorder of unknown etiology. Similar to other psychiatric disorders, OCD probably originates from a complex interaction of genetic and environmental risk factors [1,2,3].

Since the beginning of the twentieth century, family studies have consistently reported that OCD is a familial disorder. However, the sample sizes and methodological rigor of these studies have been mixed [4, 5]. Consequently, these studies have limited external validity [6]. More recently, population-based studies have increased sample sizes to several thousand probands and relatives but are limited by less precise diagnostic procedures [7,8,9,10,11,12].

Since OCD is a heterogeneous disorder, certain OCD phenotypes (e.g., early-onset or tic-related OCD) may be more familial/heritable than others [13,14,15,16,17,18]. However, not all studies had sufficient statistical power to confirm this familial pattern.

The first systematic review and meta-analysis of OCD genetic epidemiology was published approximately 20 years ago [19]. This study reported that first-degree relatives (FDRs) of OCD probands had a four-fold higher risk for OCD than the FDRs of non-affected control probands. Since then, more than 20 relevant and high-quality original family studies [5, 7,8,9,10,11,12,13,14,15,16,17,18, 20,21,22,23,24,25,26,27,28], including two re-analyses of previous data [4, 29] and six population-based studies [7,8,9,10,11,12], have been published. More than 25 high-quality twin studies [7, 30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56] have also been published.

Considering the relevance of the studies published since 2001, the current study aimed to update the state-of-the-art knowledge in the field by conducting a systematic review and meta-analysis of OCD family and twin studies.

Methods

Search strategy and selection criteria

The present meta-analysis was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [57], and the study protocol was registered in PROSPERO (registration number: CRD42019118317).

The PECO strategy [58] was used to frame the systematic review procedure criteria, as follows. Participants: family members and/or twins of OCD probands; Exposure: family history of OCD; Controls: family members and/or twins of probands without OCD; and Outcome: OCD rates in relatives and/or twins of OCD probands.

For this systematic review and meta-analysis, we considered all studies published up to September 30, 2021 that examined familial loading through family aggregation rates and/or twin resemblance. The following databases were searched: CENTRAL (Cochrane Library); MEDLINE by PubMed (US National Library of Medicine); EMBASE; BVS (Biblioteca Virtual em Saúde); and OpenGrey (for gray literature). The search strategy was designed using DeCS headings and adapted to the indexing vocabulary of each database (e.g., Medical Subject Headings [MeSH] for MEDLINE). No restrictions were placed on language or date of publication. Specific details of the search strategy for each database, including the uniterms used, are provided in Box S1.

A specific search strategy was used to locate previous reviews. The reference lists of all previously published reviews and meta-analyses, and of the articles selected for full-text review, were carefully screened for additional relevant studies.

Studies were included if the following criteria were met: OCD diagnosis was assessed using standardized and validated instruments, or by a population database record (based on the Diagnostic and Statistical Manual of Mental Disorders [DSM-III, DSM-III-R, DSM-IV, or DSM-5] or the International Classification of Diseases, Eighth, Ninth, or Tenth Revision [ICD-8, ICD-9, or ICD-10] diagnostic criteria); the probands were diagnosed with OCD by direct interviews; relatives were directly interviewed; the study design was cohort or case-control; and a control group was included.

Review studies, case reports, expert consensus statements, letters to the editor, opinion papers, segregation analysis studies, molecular genetic studies, studies reporting obsessive-compulsive personality disorder as the outcome, and animal model studies were excluded.

A flowchart illustrating the study search and selection process is presented in Fig. 1.

Fig. 1: PRISMA flow diagram.

From the 4022 studies screened, 19 family studies and 29 twin studies were selected for the systematic review and meta-analysis.

Two independent OCD experts (T.B.V. and L.M.) performed the screening procedure for the databases to determine which studies met the eligibility criteria. First, duplicate publications were screened and excluded (n = 101). Next, the screening process was applied to titles and abstracts (n = 4022), and potentially eligible full-text articles were selected (n = 95), followed by full-text review (T.B.V. and L.M.). Disagreements were discussed and resolved by consensus, or by consulting a third expert (M.C.R.). The Cohen’s kappa coefficient of agreement between the two reviewers was excellent (0.90). The screening process was performed using the Rayyan software [59].
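As an illustration of how the inter-reviewer agreement statistic above is obtained, the following is a minimal sketch of Cohen's kappa for two reviewers making include/exclude decisions. The function name and the counts are hypothetical and are not the review's screening data.

```python
def cohens_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two raters making binary include/exclude decisions."""
    n = both_include + only_a + only_b + both_exclude
    # Observed proportion of records on which the raters agree
    observed = (both_include + both_exclude) / n
    # Marginal inclusion proportions for rater A and rater B
    a_incl = (both_include + only_a) / n
    b_incl = (both_include + only_b) / n
    # Agreement expected by chance from the marginals
    expected = a_incl * b_incl + (1 - a_incl) * (1 - b_incl)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only (NOT the review's data)
kappa = cohens_kappa(both_include=40, only_a=5, only_b=5, both_exclude=950)
print(round(kappa, 2))
```

Kappa corrects the raw agreement for the agreement expected by chance, which matters here because the large majority of screened records are excluded by both reviewers.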

During the study selection procedure, the reviewers concluded that all twin studies were based on community samples in which obsessive-compulsive symptoms (OCS) were not systematically assessed by standardized instruments and/or direct interviews. Under the original criteria, twin and population-based studies would therefore have been excluded from the analysis, despite presenting extremely relevant information. Instead, we made the post hoc decision to waive this inclusion criterion for those studies (i.e., probands and FDRs were not directly interviewed).

From the 95 full-text records evaluated, 58 were excluded. A list of all the excluded studies after a full-text review and reasons for exclusion appears in the Supplement (Table S1). Of the remaining articles, 14 were family studies and 23 were twin studies. From the reference list search, five additional family and six additional twin studies were found and added to the meta-analysis. No studies were found in the gray literature.

The Newcastle-Ottawa Scale (NOS) [60] was used to assess the risk of bias in the included case-control observational studies. The NOS contains eight items, categorized into three domains: selection, comparability, and outcome/exposure. For each study type (i.e., family or twin), the items were adapted, and a series of response options were provided. A star scoring system was used, ranging from zero to nine stars. Details of the risk of bias assessment according to the NOS for each included study are provided in Table S2.

Data analyses

For the coding process, a standardized data extraction form was used by the reviewers consisting of the following items: author(s) and year of publication; study location; sample recruitment procedure; inclusion and exclusion criteria applied for sample selection; OCD diagnostic criteria coding resource; and assessment tools used.

For family studies, the form also included: case definition methods, including whether the best estimate method was adopted; matching procedures for proband comparison group selection; blinding of the interviewer to proband status; number of probands, controls, and relatives with OCD (definite and subthreshold/probable); and mean age of the proband and relative samples.

For family studies, the analysis unit was the FDR(s) of the OCD patients or controls, and the outcomes of interest were: (a) the familial recurrence rates of definite and probable/subthreshold OCD in FDRs of OCD probands compared with FDRs of control probands; and (b) the familial recurrence rates of definite and probable/subthreshold OCD in FDRs of early-onset (before 18 years old) OCD probands compared with FDRs of late-onset (after 18 years old) OCD probands. We conducted these analyses three times: once for all studies, once for studies involving children/adolescents, and once for studies involving adults.

For each outcome of interest reported by ≥2 family studies, we performed a random-effects Mantel-Haenszel meta-analysis to derive pooled effect estimates. Because all data under analysis were dichotomous outcome variables, we summarized the effects using odds ratios (ORs) and their 95% confidence intervals (CIs).
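To make the pooling step concrete, the following is a minimal sketch of a random-effects meta-analysis of odds ratios from 2×2 tables. It uses inverse-variance weights with the DerSimonian-Laird between-study variance estimator, which is swapped in here for simplicity and is not necessarily identical to the Mantel-Haenszel-based procedure used in the analysis. The function names, the 0.5 continuity correction, and the example counts are assumptions for illustration only.

```python
import math


def log_or_and_var(a, b, c, d):
    """Log odds ratio and its variance for one study.

    a, b: affected / unaffected relatives of OCD probands
    c, d: affected / unaffected relatives of control probands
    A 0.5 continuity correction is applied if any cell is zero (assumption).
    """
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pooled_or_random_effects(tables):
    """DerSimonian-Laird random-effects pooled OR with a 95% CI."""
    ys, vs = zip(*(log_or_and_var(*t) for t in tables))
    w = [1 / v for v in vs]
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    # Cochran's Q around the fixed-effect estimate
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))
    k = len(ys)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in vs]
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return math.exp(y_re), math.exp(y_re - 1.96 * se), math.exp(y_re + 1.96 * se)


# Hypothetical 2x2 tables for illustration only (NOT data from the included studies)
example_tables = [(12, 188, 3, 197), (20, 280, 4, 296), (8, 142, 2, 148)]
pooled, lower, upper = pooled_or_random_effects(example_tables)
print(f"OR = {pooled:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

Pooling is done on the log scale because the sampling distribution of the log odds ratio is approximately normal; the estimate and its limits are exponentiated only at the end.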

For twin studies, the number of monozygotic (MZ) and dizygotic (DZ) pairs and the twin resemblance correlations according to zygosity were systematically extracted. When the studies only reported separate correlations for males and females, we transformed the correlation coefficients to Fisher z-values (see below), averaged them, and back-transformed the resulting z-value to a correlation coefficient.
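The averaging of sex-specific correlations described above can be sketched as follows. The function name is hypothetical, and an unweighted mean is assumed because the text does not state whether the z-values were weighted by the number of pairs.

```python
import math


def average_correlations(r_values):
    """Average correlations on the Fisher z scale and back-transform."""
    z_values = [math.atanh(r) for r in r_values]  # Fisher z = atanh(r)
    mean_z = sum(z_values) / len(z_values)        # unweighted mean (assumption)
    return math.tanh(mean_z)                      # back to the correlation scale


# Hypothetical male and female MZ correlations, for illustration only
r_combined = average_correlations([0.30, 0.60])
print(round(r_combined, 3))
```

Averaging on the z scale rather than on the raw correlations gives a slightly different result (here a little above the arithmetic mean of 0.45) because the transformation is nonlinear.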

For the twin studies, the analysis unit was the twin pair. The outcome of interest was the correlation of OCS in monozygotic compared with dizygotic twins. We tested two hypotheses: (a) that the OCS correlation in monozygotic twins is equal to the OCS correlation in dizygotic twins; and (b) that the OCS correlation in monozygotic twins is twice the OCS correlation in dizygotic twins. We conducted these analyses three times: once for all studies, once for child/adolescent samples only, and once for adult samples only.

For each outcome of interest reported by ≥2 twin studies, we transformed the correlation coefficients to Fisher z-values and performed a random-effects meta-analysis to derive pooled effect estimates. This transformation was beneficial because the standard error of a correlation depends on the correlation itself, so that larger correlations appear more precise and thus receive more weight. In contrast, the standard error of the Fisher-transformed value depends only on the sample size. We conducted these analyses twice: once assuming that the standard error was that of the Pearson correlation and once assuming that it was that of the tetrachoric correlation. The latter assumes that the presence of OCS reflects latent variables that follow a bivariate normal distribution. To use the standard error of the tetrachoric correlation, we first estimated the number of concordant and discordant twin pairs for the presence of OCS assuming a 2.3% lifetime prevalence of OCD [61], then used the “polychor” function to derive the standard error [62], and finally calculated the “effective” sample size to perform the meta-analysis, as described in Polderman et al. [63].
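The Pearson-based variant of this step can be sketched as follows, using the standard result that the standard error of a Fisher z-value is 1/sqrt(n − 3) for n pairs. The function names and the pair counts are hypothetical; the tetrachoric variant, which relies on the “polychor” function and an effective sample size, is not reproduced here.

```python
import math


def fisher_ci(r, n_pairs, z_crit=1.96):
    """95% CI for a correlation via the Fisher z transformation."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n_pairs - 3)  # depends only on the number of pairs
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def fisher_z_difference(r_mz, n_mz, r_dz, n_dz, z_crit=1.96):
    """Difference between MZ and DZ correlations on the Fisher z scale, with CI."""
    diff = math.atanh(r_mz) - math.atanh(r_dz)
    se = math.sqrt(1 / (n_mz - 3) + 1 / (n_dz - 3))
    return diff, diff - z_crit * se, diff + z_crit * se


# Correlations as in the pooled totals (rMZ = 0.47, rDZ = 0.23);
# the pair counts are hypothetical and chosen only for illustration.
diff, diff_lower, diff_upper = fisher_z_difference(0.47, 503, 0.23, 503)
print(round(diff, 2), diff_lower > 0)
```

With correlations of 0.47 and 0.23, the difference on the z scale is approximately 0.28; a confidence interval that excludes zero argues against the hypothesis that the MZ and DZ correlations are equal.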

Finally, we used tetrachoric correlations to estimate the A (additive genetic), C (shared environmental), and E (non-shared environmental) components based on the following definitions:

$$r_{MZ} = \mathrm{A} + \mathrm{C}$$
$$r_{DZ} = 0.5\,\mathrm{A} + \mathrm{C}$$
$$var = \mathrm{A} + \mathrm{C} + \mathrm{E}$$
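Solving these three definitions, and assuming a standardized total variance (var = 1), gives A = 2(r_MZ − r_DZ), C = 2r_DZ − r_MZ, and E = 1 − r_MZ. A minimal sketch of this calculation follows; the function name and the input correlations are hypothetical, and the sketch does not constrain the components to be non-negative.

```python
def ace_from_twin_correlations(r_mz, r_dz):
    """Solve r_MZ = A + C, r_DZ = 0.5*A + C, and A + C + E = 1 for A, C, E."""
    a = 2 * (r_mz - r_dz)  # additive genetic component
    c = 2 * r_dz - r_mz    # shared environmental component
    e = 1 - r_mz           # non-shared environmental component
    return a, c, e


# Hypothetical correlations for illustration only
a, c, e = ace_from_twin_correlations(r_mz=0.50, r_dz=0.30)
print(round(a, 2), round(c, 2), round(e, 2))
```

When the MZ correlation is about twice the DZ correlation, the C term is close to zero and A is approximately equal to the MZ correlation.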

We measured the heterogeneity between studies with the I2 statistic, which describes the percentage of the variability in effect estimates attributable to heterogeneity rather than chance. We considered I2 values <50% acceptable. When the I2 value exceeded this threshold, the studies were excluded one by one from the analyses to identify and analyze the outlier.
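The heterogeneity check and the one-by-one exclusion described above can be sketched as follows, using the usual definition of I2 from Cochran's Q, namely max(0, (Q − df)/Q) × 100. The function names and the example effects are hypothetical, and the sketch is an illustration rather than the routine implemented in the “meta” package.

```python
def i_squared(effects, variances):
    """I2 (in percent) from study effects and their sampling variances."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled effect
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if df == 0 or q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


def leave_one_out_i_squared(effects, variances):
    """I2 recomputed after dropping each study in turn."""
    results = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        results.append(i_squared(rest_effects, rest_variances))
    return results


# Hypothetical log odds ratios and variances; the third study is an outlier
effects = [0.1, 0.1, 2.0]
variances = [0.01, 0.01, 0.01]
print(round(i_squared(effects, variances), 1))
print(leave_one_out_i_squared(effects, variances))
```

In this toy example, I2 drops to zero only when the third study is removed, which is how an outlying study would be identified.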

All statistical analyses were conducted using R with the packages “meta” (for meta-analysis) [64] and “polycor” (for tetrachoric correlations) [62].

Results

Family studies

Nineteen eligible family studies were included in the meta-analysis (Table 1). Almost all exclusions of family studies were due to the absence of a control group or the lack of an OCD diagnosis confirmation using standardized instruments and/or direct interviews of probands or relatives (see Table S1). Four studies (comprising three samples) assessed families of child/adolescent probands [17, 20,21,22]. Another 14 studies, comprising nine samples, reported data on family risk for adult OCD probands [4, 5, 16, 18, 23,24,25,26,27,28,29, 65,66,67]. One study did not compare OCD familial risk between case and control probands (probands without OCD) but was included because it compared OCD familial risk between early- and late-onset OCD probands [68]. Thus, of the 19 included publications, 18 studies [4, 5, 16,17,18, 20,21,22,23,24,25,26,27,28,29, 65,66,67] (comprising 12 different samples) were considered for the main analyses.

Table 1 OCD family studies.

Data from 5053 directly interviewed FDRs of 176 child and adolescent and 899 adult OCD probands and 522 control probands (95 from children/adolescents and 427 from adult samples) were pooled for meta-analysis. The analyses combining pediatric and adult probands showed that FDRs of OCD probands had higher risks for definite OCD (OR = 7.18, 95% CI 4.13–12.47, p < 0.00001) than the FDRs of controls (Fig. 2).

Fig. 2: Definite OCD prevalence.

First-degree relatives of OCD probands had an odds ratio of 7.18 (95% CI 4.13–12.47, p < 0.00001) for definite OCD in comparison to controls.

The comparison between pediatric and adult studies indicated that OCD was significantly more familial in children/adolescents than in adults. First-degree relatives of the child and adolescent OCD probands had a 16 times higher risk of definite OCD compared to the control FDRs (OR = 16.44, 95% CI 4.57–59.17, p < 0.00001). The FDRs of adult OCD probands had an approximately 6 times higher risk of definite OCD compared to the control FDRs (OR = 6.02, 95% CI 3.16–11.46, p < 0.00001) (Fig. 3).

Fig. 3: Definite/subthreshold OCD prevalence.

First-degree relatives of definite and subthreshold OCD probands had a 4.6-times higher risk of OC symptoms compared to controls.

For the above analyses, the degree of heterogeneity among the studies was acceptable (overall I2 = 37%; adult probands, I2 = 48%; child and adolescent probands, I2 = 0).

Regarding definite and subthreshold OCD, the pooled data among adult samples revealed slightly lower familial loading (OR = 4.06, 95% CI 2.91–5.66, p < 0.00001) than the analyses including only definite OCD. There was no heterogeneity between the studies pooled for this analysis (I2 = 0%).

Data on the occurrence of definite or definite/subthreshold OCD in FDRs of probands according to the age of symptom onset were reported for eight different samples in 12 publications [5, 16, 18, 23, 25,26,27,28,29, 66, 68]. However, eight studies did not report the number of relatives included, leaving only four samples for the statistical analyses [16, 18, 26, 68]. Together, the studies showed high heterogeneity for the analyzed outcomes (definite OCD, I2 = 71%; definite/subthreshold OCD, I2 = 95%) (Fig. 4).

Fig. 4: Age of onset.

Only four samples remained for OCD family recurrence rate statistical analysis. The studies showed a high heterogeneity for the analyzed outcome.

The tic-related OCD family aggregation analyses were not reported because of the insufficient number of studies included or the high heterogeneity of the extracted data.

Risk of bias

The Newcastle-Ottawa Scale (NOS) scores of the included family studies were generally high, indicating a low risk of bias (Table S2). The only exceptions were studies [65] and [20], which lacked an interviewer blinding procedure regarding proband/relative status, and studies [13], [69], and [68], which did not use the best estimate diagnosis method and lacked control samples.

Twin studies

Twenty-nine twin papers were included in the meta-analysis [7, 30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56, 70] (Table 2), comprising 26 different samples. These studies involved self-report of symptoms using a range of validated questionnaires. Table 2 and Figs. 5 and 6 depict the MZ and DZ twin correlations from the 29 twin studies (eight with child/adolescent samples) and the meta-analytic summary estimates. The meta-analysis revealed similar correlations across the age subgroups and the total sample, and showed that MZ twins had significantly higher correlations for OCD than DZ twins (children/adolescents: rMZ = 0.52, 95% CI 0.37–0.65, p < 0.01; rDZ = 0.27, 95% CI 0.18–0.36, p < 0.01; adults: rMZ = 0.43, 95% CI 0.40–0.45, p < 0.01; rDZ = 0.20, 95% CI 0.17–0.22, p < 0.01; total: rMZ = 0.47, 95% CI 0.40–0.54, p < 0.01; rDZ = 0.23, 95% CI 0.19–0.27, p < 0.01). Using the standard error of the tetrachoric correlation did not meaningfully change the above results.

Table 2 OCD twin studies.
Fig. 5: Monozygotic twin OCS resemblance.

The pooled OCS correlation in monozygotic twin pairs was approximately 0.47.

Fig. 6: Dizygotic twin OCS resemblance.

The pooled OCS correlation in dizygotic twin pairs was approximately 0.23.

The Fisher z-score analysis confirmed that MZ twins showed considerably higher correlations for OCS than DZ twins (children/adolescents: Diff Z = 0.30, 95% CI 0.17–0.42, p < 0.01; adults: Diff Z = 0.26, 95% CI 0.21–0.31, p < 0.01; total: Diff Z = 0.28, 95% CI 0.21–0.34, p < 0.01) (Fig. 7). Based on the tetrachoric correlations, we estimated that A = 0.46, C = 0, and E = 0.54 for the total sample; A = 0.50, C = 0, E = 0.50 for children/adolescents; and A = 0.45, C = 0, and E = 0.55 for adults.

Fig. 7: Fisher z-score difference between monozygotic and dizygotic twins.

MZ twins had considerably higher correlations for OCS than DZ twins.

Of note, there was a high heterogeneity index across the twin analyses. However, it is important to mention that despite the high heterogeneity, almost all results were statistically significant in the same direction.

Population-based cohorts (post hoc)

Six large nationwide register-based cohorts [7,8,9,10,11,12] did not meet our initial inclusion criteria but, given their superior statistical power and relevance, they were included in the current paper and their results are narratively described below.

From a cohort of more than 13.5 million people who were born or lived in Sweden between 1969 and 2009, Mataix-Cols et al. [7] found that the FDRs of the 24,768 individuals with OCD were more likely to also have OCD (OR = 5.03, 95% CI = 4.49–5.64 for siblings; OR = 4.70, 95% CI = 4.09–5.40 for parents; and OR = 4.56, 95% CI = 3.97–5.24 for the offspring); that this risk decreased proportionally to the degree of genetic relatedness; and that the risk tended to be higher among FDRs of early-onset OCD individuals.

Brander et al. [10] studied the same population between 1967 and 2007. They reported that full siblings of individuals with tic-related OCD (HR = 10.63, 95% CI = 7.92–14.27) had a higher risk for OCD than the relatives of individuals with non-tic-related OCD (HR = 4.52, 95% CI = 4.06–5.02).

Mahjani et al. [11] also analyzed the OCD familial risk in the Swedish population. The cohort comprised 822,843 people born between 1982 and 1990, of whom 7184 (0.87%) were diagnosed with OCD. The relative recurrence risks (RRRs) confirmed the OCD familial pattern (RRR = 4.82, 95% CI = 4.03–5.62 for full siblings; RRR = 1.85, 95% CI = 0–3.27 for maternal half-siblings; RRR = 1.09, 95% CI = 0–1.96 for paternal half-siblings; RRR = 1.85, 95% CI = 1.29–2.41 for maternal cousins; and RRR = 1.59, 95% CI = 1.13–2.08 for other cousins).

Steinhausen et al. [8] investigated a population sample from 1969 until 2009 in Denmark. The data indicated that an early-onset of OCS (N = 2057) significantly increased the risk of having OCD in the FDRs compared with controls (N = 6055) (OR = 34.4, 95% CI = 4.42–262.39 for paternal OCD; OR = 11.21, 95% CI = 3.14–0.04 for maternal OCD; OR = 6.19, 95% CI = 3.65–10.49 when a sibling had OCD; and an OR = 4.54, 95% CI = 1.28–16.13 if one of the children had OCD).

Browne et al. [9] analyzed data from the same Danish population (born from 1980 to 2007). The individuals with an older sibling (RRR = 4.89, 95% CI = 3.45–6.93) or a parent (RRR = 6.25, 95% CI = 4.81–8.11) diagnosed with OCD exhibited higher risk of having OCD.

Finally, Huang et al. [12] investigated the risk of OCD among 89,500 FDRs of OCD patients in Taiwan from 2001 until 2010. Compared to the general population, the FDRs of OCD patients had a higher risk of OCD (RR = 8.11, 95% CI = 7.68–8.57). More specifically, the relative risks were: RR = 7.64, 95% CI = 7.10–8.23 for parents; RR = 7.18, 95% CI = 6.65–7.75 for the offspring; RR = 8.95, 95% CI = 8.44–9.49 for siblings; and RR = 60.76, 95% CI = 49.12–75.16 for twins.

Discussion

The current meta-analysis included all OCD family and twin studies published until September 2021. The results update and extend the findings of the previous meta-analysis published more than 20 years ago [19]. The main findings were that OCD is highly familial, particularly in children and adolescents; that the heritability of OCS in twin samples is approximately 0.5; and that the higher OCS correlations between MZ twins were mainly due to additive genetic or to non-shared environmental components. These results are relevant for future genetic and clinical studies and reinforce the need for the development of specific guidelines for the screening of OCS in the FDRs of OCD subjects and the early referral for treatment when needed.

According to the 18 OCD family studies included in the analyses, OCD was 7.2 times more frequent in OCD families than in control families. This estimate is almost twice as high as that from the last meta-analysis, published in 2001 [19]. One conceivable explanation for this difference is that the current study included studies with larger samples, interviewed with validated assessment tools and based on reliable diagnostic criteria. Furthermore, considering the secrecy that characterizes OCD, the higher rates may be due to the fact that the current analyses included subjects who were directly interviewed.

Of note, the OCD rates among control relatives (2.3%) were very similar to the lifetime prevalence rates in the general population, ranging from 0.7% to 3% [61]. These findings suggest that the current results may be generalized to other samples and reinforce the robustness of the current estimates.

Additional analyses of very large population-based studies, primarily conducted in Scandinavian countries and Taiwan, support the estimates from the family studies [7,8,9,10,11,12]. In line with the studies that did meet our inclusion criteria, the risk for OCD in these population-based studies varied from 4.70 [7] to 7.64 [12] for parents, 4.82 [11] to 8.95 [12] for full siblings, and 4.54 [8] to 7.18 [12] for the offspring. Because these population studies had superior statistical power and less risk of selection bias, we conclude that the familial risk estimates are generalizable to the general population.

The twin studies demonstrated that OCD, or at least its dimensional representation, is not only familial but also heritable, with twin correlations of 0.52 and 0.43 in MZ twins compared with 0.27 and 0.20 in DZ twins (in child/adolescent and adult samples, respectively). These findings are in line with previous reports [1, 71, 72] and indicate that both genetic and environmental characteristics are important in the etiology of OCS. The analyses of the specific roles of the additive genetic (A) and non-shared environment (E) components of the ACE model in the etiology of OCD were also in line with previous results [1], with A and E accounting for 46% and 54% of the variance, respectively. Interestingly, the single-nucleotide polymorphism-based heritability of OCD is still considerably lower, in the region of 30% [73], which indicates that further research is needed to understand the “missing heritability”. It is plausible that while the majority of inherited liability for OCD is due to common genetic variation, rare variation may also contribute to some extent. Thus, future genetic studies should focus on common as well as rare genetic variants as a way to capture more of the unexplained phenotypic heritability.

Notably, the shared environment component (C) did not contribute to the etiology of OCS in this study. Taylor [1] has also previously reported that the shared environment makes a weak contribution to the OCS phenotypic variance. This finding is particularly relevant to the clinical field because it suggests that family environment (e.g., learning) is unlikely to have a major role in the etiology of the disorder. Instead, future studies should focus on the impact of specific environmental factors that are not shared between siblings or twins. Discordant sibling and twin designs are particularly suited to move the field forward because they effectively adjust for shared genetic factors and unmeasured confounders [3]. Using such designs, researchers have recently confirmed a dose-response relationship between perinatal complications and risk of OCD in the offspring [74].

Some limitations of the present study should be highlighted. The data were not analyzed according to the gender of the probands or the FDRs, specific OCS subtypes or dimensions, symptom severity, or treatment response rates. These analyses could not be performed due to insufficient detail in many of the studies. For example, it seems likely that the tic-related subtype of OCD is particularly familial and heritable, but limited data exist [75]. In addition, it would have been important to have more studies describing the recurrence risks for OCD according to the age of onset of OCS. Furthermore, twin studies were based on self-reported questionnaires rather than on directly interviewed individuals, but their results were largely compatible with those of the controlled family and population-based studies.

Despite these limitations, the current systematic review and meta-analysis represents a much-needed update on the genetic epidemiology of OCD. The familial and heritable nature of OCD is now indisputable. In addition to large-scale gene-searching efforts, more needs to be done to understand environmental risk factors that are potentially modifiable, and how this newly gained knowledge can be used to improve the health of individuals with OCD and their relatives.