Introduction

Idiopathic nephrotic syndrome (INS) develops when the renal glomerular filtration barrier is damaged, allowing leakage of albumin into the urine (albuminuria), causing oedema and hypoalbuminaemia. The condition may be categorised according to initial steroid response (steroid sensitive nephrotic syndrome (SSNS) or steroid resistant nephrotic syndrome (SRNS))1,2. Most children will have steroid responsive disease with histology showing ‘minimal change’3. In patients with SRNS, focal segmental glomerulosclerosis (FSGS) is the most common histological finding. On a global scale, the incidence of FSGS is estimated at 0.2–1.8 cases per 100,000 people per year and contributes significantly to the burden of stage 5 chronic kidney disease (CKD) in children and young people4. Around 70% of SRNS patients have no identified mutation in a known podocyte gene (figures reaching as high as 90% in unselected cohorts (NEPTUNE for example), where referral bias is less of a concern) and are thought to have an unidentified ‘circulating factor’ that plays a causative role5,6,7,8. These patients receive intensive immunosuppression to control their disease but have a poor outcome overall, with relapse after renal transplant seen in up to 60% of cases9,10. Improving the molecular stratification of INS will be key to advancing therapeutic strategies and predicting recurrence risk post-transplant for this group of patients11.

The advancement of genomic sequencing technologies has identified the next largest subset of SRNS as being caused by single gene (monogenic) disorders, which account for up to 30% of cases12,13,14. Monogenic SRNS is associated with a more aggressive renal decline and little response to intensified immunosuppression. In addition, monogenic patients do not tend to have disease recurrence post-transplant9.

There are currently around 70 genes, mostly with AR inheritance, associated with monogenic SRNS9,14,15,16. Targeted gene panels have been developed based on this knowledge, facilitating rapid turnaround of results for affected patients16, with paediatric SRNS patients in the UK now recommended to have genetic testing by next generation sequencing (NGS)17. The ‘proteinuric renal disease’ panel covers 55 ‘green’ genes as outlined in the National Test Directory centrally commissioned by NHS England18,19. Rare variants in the ‘green’ or definitive SRNS genes are assessed for their significance using standard guidelines developed by the American College of Medical Genetics (ACMG)20. However, single rare heterozygous predicted-pathogenic variants (HPPVs) within AR genes that create uncertainty may be identified21, a problem recognised by Nephrologists. In such situations, the possibility that current sequencing technology is not able to reliably detect a second variant where one exists, or alternatively that a second variant resides either within the non-coding genome or in another podocytopathy gene, may lead to an assumption about pathogenicity.

A typical human genome contains 5 million variants of which 40,000 to 200,000 are designated ‘rare’ (MAF < 0.5%)22. Data from an exome-based study from Europe also shows that individuals carry at least 2 pathogenic variants in autosomal recessive (AR) genes23. This estimate can be used to identify couples at risk of having a child with an AR disease. Identification of an HPPV in a nephrotic gene, may therefore be incidental.

Another recent large-scale exome study involving sporadic and familial cases of FSGS in 662 individuals, showed that the there was a non-trivial rate of apparently ‘disease-causing’ (loss of function, non-synonymous) rare variants in known FSGS genes in healthy controls24.

There have been over 5000 diseases/traits found to have a monogenic cause25. Understanding the relevance of finding potentially pathogenic rare variants from the genomes of affected patients may involve a comparison with the frequency/distribution of similar variants in a control population26. A study in patients with Joubert’s syndrome (ciliopathy) for example, where predicted pathogenic variants can be detected in 2 or more Joubert-associated genes, used the comparison with an ExAC population database ‘control’ group to conclude that the distribution of variants within diseased individuals and healthy individuals was similar, and therefore not supporting the role for digenic inheritance in the condition27.

Continuing within the spectrum of genetic renal conditions, the study of rare variants has yielded information about new gene-disease associations and modifiers, in chronic kidney disease (CKD) for example. A comparison of the exome data on 3150 patients with chronic kidney disease (CKD) and 9563 healthy controls, observed an increased rare variant burden in several genes previously not associated with CKD (CPT2 for example)28. In the recessive salt-losing tubulopathy, Gitleman’s syndrome, 10% of patients are identified as having only one pathogenic variant in SLC12A3 using current sequencing methodologies. Long read sequencing of the gene in 67 such cases detected a second pathogenic/likely pathogenic variant in 45 (67%) of these cases29. This may be an additional tool to consider for patients with SRNS in similar circumstances.

There is increasing evidence that there is a spectrum of ‘recessiveness’ in heterozygous carriers of traditionally recessive genes. For some conditions, it has become apparent that there is a phenotype in carriers that is often milder, while for other conditions, it appears that they are properly recessive30,31,32,33,34. Two studies from the Finnish Biobank (FinnGen) suggest that there are HPPV effects in patients with mono-allelic NPHS1 (one of the commonest genes implicated in childhood nephrotic syndrome) variants31,32,34,35. Our study was not designed to assess such mild phenotypic HPPV effects and the absence of reporting such effects does not exclude the possibility of their relevance in paediatric SRNS.

For this study, we hypothesised that in paediatric SRNS, the frequency of HPPVs in recessive podocytopathy linked genes (Supplementary Table 1) would be similar to the frequency seen in a large European cohort from the gnomAD population database.

Methods

The study population was drawn from the UK Registry for Rare Kidney Disease database (RADAR; www.ukkidney.org), a national registry into which patients with SRNS are recruited33 as well as Evelina London, Guys and St Thomas’ Hospital Trust. All data usage and analysis were performed in accordance with the regulations and guidelines from RADAR and the University of Bristol. The details of this study cohort (Fig. 1) have previously been described, and the report includes details of the precise pathogenic variants in its supplementary pages9. From 187 patients with SRNS, 133 were of white Caucasian ethnicity. Of the 44 confirmed to have monogenic SRNS, 3 had an identical twin sibling and were therefore excluded to leave only 1 of each twin pair in the cohort. This left 130 white Caucasian patients for the analysis, 41 with a confirmed genetic diagnosis (‘monogenic SRNS’) and 89 in whom a genetic diagnosis had not been identified.

Figure 1
figure 1

The UK Paediatric SRNS cohort.

Our control population was drawn from the European (Non-Finnish) population in the Genome Aggregation Database (gnomAD (http://gnomad.broadinstitute.org version 2.1.1)34.

Exome data from 59 AR ‘nephrotic’ genes was systematically filtered to leave a list of rare (MAF < 0.01) heterozygous variants in the patient and control populations for comparison. 47 genes in the non-monogenic patients, and 28 genes in the monogenic patients were identified to contain at least 1 rare heterozygous variant in an AR podocytopathy gene.

A series of further filtering steps (outlined below) were employed to compile a final list of HPPVs in patients and controls (of note, this filtering was not designed to identify causative ‘mutations’ as might occur in a clinical analysis). For the patients already confirmed as monogenic (causative mutation in a podocytopathy linked gene identified previously), the causative pathogenic variants were filtered out of the data prior to comparison of remaining variants with the control group.

The GnomAD database does not provide data per individual, and therefore comparisons of variant frequencies were performed for the group as a whole.

Four final groups were established for comparison as follows: Rare heterozygous variants (independent of pathogenicity predictions) in confirmed monogenic patients vs controls/rare heterozygous variants (independent of pathogenicity predictions) in patients with no genetic diagnosis identified vs controls/HPPVs in monogenic patients vs controls/HPPVs in non-monogenic patients vs controls.

Step 1 (filtering):

  • Remove intronic variants (outside ± 10 base pairs from intron/exon junction).

  • Remove variants seen as homozygous.

  • Remove variants with MAF > 0.01.

Step 2 (variants must meet the following criteria):

  • Variant likely to disrupt the splice site (± 2 base pairs from intron/exon junction).

  • Variant was an insertion/deletion (frameshift/non-frameshift).

  • Variant resulted in a gain or loss of a stop codon.

  • Variant was missense and predicted to be pathogenic by in-silico tools (high and low stringency settings were applied for comparison*).

*Low stringency = any 4 of the 6 in-silico tools used agreed.

*High stringency = CADD and REVEL and 3 out of 4 remaining in-silico tools agreed.

Step 3 (pathogenicity predictions):

  • The Ensembl Variant Effect Predictor (VEP) was used to assign pathogenicity predictions to the list of remaining variants. A set of 6 in-silico prediction tools were used (CADD36/Revel35/MutPred37/Fathmm-XL38/Polyphen39/SIFT40).

Taking each gene in turn, the total number of (a) rare heterozygous variants and (b) HPPVs, along with corresponding allele counts (AC) and allele numbers (AN), were recorded for both patients and controls. For the control population, the number of individuals was estimated by taking an average of the ANs recorded in gnomAD for each variant and dividing it by 2 (which is why the number of individuals (denominator in Table 1) in the control population varies).

Table 1 Results summary showing number of variants per patient/control per gene for monogenic and non-monogenic SRNS.

Statistical analysis

The hypothesis that the allele count (AC) in the patient and control groups are statistically consistent (null hypothesis) was tested by calculating the Chi-squared (\({\chi }^{2} )\) statistic. In the present study the \({\chi }^{2}\) statistic was computed including the Yates’ correction, which is considered appropriate given the relatively small sample size of the patient group. A Bonferroni correction was applied to adjust the p-values considering the multiple (59 genes) comparisons. For comparisons in Table 1 we applied sampling theory for the proportions to calculate p values (comparing variants per patient/control per gene).

Ethics approval

South West-Central Bristol Ethics committee (09/H0106/80).

Our cohort of paediatric SRNS cases was recruited via the United Kingdom Registry for Rare Kidney Diseases (RaDaR) and included all tertiary paediatric nephrology centres in the UK. Detailed phenotypic information was entered online (https://www.radar.nhs.uk/), and laboratory data were automatically populated via links to the UK Renal Registry (www.renalreg.org). Appropriate informed consent from parents and/or carers was collected, and assent for collection of data and genetic analysis obtained. The study was approved by the South West research ethics committee and the institutional review board at each recruiting centre.

Results

Rare heterozygous variants (independent of pathogenicity predictions) were identified in 47 genes from the patients with no genetic diagnosis identified and in 28 genes from the confirmed monogenic patients (Supplementary Tables 2, 3). HPPVs were identified in 10 genes from the group with no genetic diagnosis and in 3 genes from the confirmed monogenic patients (Tables 2, 3), using a high-stringency in-silico approach. HPPVs were identified in 17 genes from the patients with no genetic diagnosis and in 7 genes from the confirmed monogenic patients (Supplementary Tables 4, 5), using a low-stringency in-silico approach. Table 1 provides a summary. The denominator for the gnomAD group changes because the allele number (AN) in gnomAD is different for each variant (as explained above).

Table 2 Heterozygous predicted-pathogenic variants (HPPVs) in patients with no confirmed genetic diagnosis (N = 89) versus controls using high stringency prediction tools (as defined in text).
Table 3 Heterozygous predicted-pathogenic variants (HPPVs) in confirmed monogenic patients (N = 41) versus controls using high stringency prediction tools (as defined in text).

In MMACHC (Metabolism of Cobalamin Associated C), we observed significantly more rare heterozygous variants (independent of pathogenicity predictions) in patients with no genetic diagnosis identified versus controls (9/89 versus 1355/58,114 (corrected P = 0.00059)). This difference was not observed in the confirmed monogenic patients (MMACHC corrected p value for comparison of monogenic vs control = 1) nor was it observed when variants were filtered using in-silico pathogenicity predictions.

Patients without a genetic diagnosis

158 rare heterozygous variants were found in 59 recessive genes from 89 unrelated patients. 125,083 rare heterozygous variants in the same 59 genes were identified in 55,773 individuals from gnomAD (Table 1).

Using a low stringency in-silico approach (Supplementary Table 4), 30 HPPVs (19% of the original 158) were found in 59 genes from the 89 patients without a genetic diagnosis compared with 28,361 in the same 59 genes from 55,844 individuals in gnomAD (controls). When the high stringency approach (Table 2) was used, 13 variants (8.2% of the original 158) were found in 59 genes in the patients, versus 13,108 in 55,796 individuals from gnomAD (controls). Comparing each gene in turn, the number of these HPPVs was not significantly different in patients versus controls.

Confirmed monogenic patients

43 rare heterozygous variants were found in 59 genes from 41 unrelated genetic patients compared with 125,083 in 55,773 individuals in gnomAD (controls) (Table 3).

Using a low stringency in-silico approach (Supplementary Table 5), 8 HPPVs (18.6% of the original 43) were found in 59 genes from 41 patients, compared with 28,361 variants in 55,844 individuals in gnomAD (controls). With a high stringency approach (Table 3), 4 HPPVs (9.3% of the original 43) were found in 59 genes from 41 patients compared with 13,108 variants in 55,796 individuals in gnomAD (controls).

In the ARHGDIA (Rho GDP Dissociation Inhibitor Alpha) and TP53RK (TP53 Regulating Kinase) genes, we observed a significantly higher number of HPPVs in the control population compared with patients (24/50,216 vs. 0/41 (corrected P = 0.034928) and 19/52,497 vs 0/41 (corrected P = 0.003953) respectively, Table 3) using the high stringency approach.

Comparing the remaining 57 genes in turn, the number of HPPVs were not significantly different in patients versus gnomAD controls.

Discussion

Whole exome sequencing (WES) in SRNS has been shown to be a helpful tool for diagnosis and also for identification of novel candidate genes41. This is also true in other renal phenotypes, for example in cohorts of children with increased renal echogenicity and chronic kidney disease, WES has been shown to identify a causative mutation in up to 63% of cases42.

Childhood onset SRNS is a disease that makes a large contribution to the burden of end stage renal disease seen in paediatrics. Based on our current understanding, a monogenic cause is identified in the minority of cases and there remains uncertainty about missing genetic data in the remainder43,44. The relevance of finding rare heterozygous variants in recessive podocytopathy-linked genes in children with SRNS is currently unknown. Our data has shown that there was no statistically significant difference in the prevalence of rare heterozygous variants identified in AR podocytopathy-linked genes between patients and controls for all but one gene (MMACHC in the patients with no confirmed genetic diagnosis), and when pathogenicity predictions were applied to these rare heterozygous variants, so that only those variants assigned ‘likely pathogenic’ (HPPV) remained, the prevalence was statistically significantly higher in the control population versus patients in the ‘confirmed monogenic SRNS’ group in only 2 genes (ARGHDIA and TP53RK), which we suggest may because we had removed all ‘causative mutations’ in the ‘confirmed monogenic’ patient group prior to comparing the frequencies of HPPV.

Mutations in the MMACHC (metabolism of Cobalamin Associated C) gene are responsible for Cobalamin C deficiency45. Renal manifestations of Cobalamin C defects include thrombotic microangiopathy, tubulointerstitial nephritis, proximal renal tubular acidosis and nephrotic syndrome45. We see a higher prevalence (corrected P = 0.00059) of rare heterozygous variants (independent of pathogenicity predictions) in MMACHC in the patients with SRNS in whom a genetic diagnosis has not been confirmed (Supplementary Table 2). This may be because these patients have been wrongly assigned as ‘non-genetic’ and a second unidentified pathogenic variant resides elsewhere in the genome, or, that these variants are relevant via some other mechanism—for example, through their ability to change/modify the expression of other genes involved in the production of a ‘circulating factor’ for example.

The ‘modifier’ concept has previously been described in association with renal disease progression relating to NPHS2 nephrotic syndrome46 and more recently, in an Indian cohort of paediatric SRNS, where a potential role for collagen gene modifiers was suggested, noting the high incidence of COL4A variants identified in their patients47.

This statistically different result in MMACHC is not seen after pathogenicity predictions are applied to the analysis and we find comparable results for the prevalence of predicted pathogenic variants (HPPV) in both monogenic and ‘genetic diagnosis not confirmed’ SRNS groups for this gene.

There were significantly more HPPVs (identified using prediction tools at high stringency) in controls compared with ‘confirmed monogenic’ SRNS patients in ARGHDIA and TP53RK. These genes are the smallest of the group analysed (ARHGDIA = 3658 bases and TP53RK = 5080 bases) and the number of HPPVs correspondingly small also in the gnomAD data set (24 and 19 respectively). There were no HPPVs identified in any of the patients for these genes. The gene size compounds the effects of the small sample size for this sub-group of patients. This significant difference is not evident in the ’no genetic diagnosis confirmed’ patients (n = 89), nor when prediction tools were applied to the filtering with a lower level of stringency.

In a mouse model of acute kidney injury, rare variants in the TNFAIP3 (TNF alpha-induced protein 3) gene were identified as having anti-inflammatory effects, and as such they were deemed ‘protective’48. The potential to use the understanding of the mechanisms through which such protective variants may exert their effects can be used in drug development, and as such is of high importance.

Comparing rare heterozygous variants in AR genes between patients and general population databases to determine their significance has been done before. For example, in Parkinson’s disease, the recessive gene PINK1 is associated with monogenic forms of the condition. A study involving 1100 patients with Parkinson’s disease demonstrated that rare heterozygous variants in PINK1 were seen at an equal frequency in patients (1.8%) and healthy controls (1.5%), with the conclusion that these rare variants are likely to play only a minor susceptibility role in the context of a multifactorial disease model49,50. Perhaps rare variants in the MMACHC gene act by increasing the ‘susceptibility’ to SRNS.

Our understanding of ‘recessiveness’ in disease is also expanding such that we are now aware that the presence of rare heterozygous variants in recessive genes may be associated with a phenotype in certain conditions51. For example, in the case of the collagen genes, where patients with rare heterozygous variants in COL4A3/COL4A4 often have persistent microscopic haematuria. The rate of progression to end stage renal disease is much lower in patients harbouring a single pathogenic variant in the recessive COL4A4 gene when compared with those found to have bi-allelic COL4A4 pathogenic variants52,53.

While there are likely to be several mechanisms that contribute to the expression of variants, there are cis-regulatory elements and eQTLs (expression quantitative trait locus) that control transcription, playing a role in transcription from the relevant allele54. In the case of a single pathogenic variant in a traditionally recessive disease, the regulation of whether the wild type or potentially pathogenic rare variant is expressed might contribute to pathogenicity. The control of transcription by eQTLs is cell type dependent, and we are only just beginning to have the tools to map these in the kidney55. We do not have this data for the patients in this study, and although there are similar numbers of HPPVs in our cases and controls, it is possible that other elements controlling whether these pathogenic variants are expressed could contribute to the phenotype.

The presence of a rare variant per se, does not predict disease and our data indicates that in common with other rare diseases, rare variants are found throughout the genome in disease-related genes from healthy individuals53,56. We see examples of this from other studies, for example, in an exome sequencing case–control study of 363 patients with familial and sporadic FSGS, 5.5% of the control group were found to have a rare variant that would have been deemed ‘causal’ if found in a patient24. A large exome analysis of 7974 healthy adults was assessed for the presence of pathogenic variants in 626 genes known to be associated with mendelian kidney and urinary tract disorders found, after stringent filtering, that 1.4% of them had a candidate pathogenic variant57.

Interpreting the significance of heterozygous variants that are predicted to be pathogenic (HPPV) in recessive disorders is challenging. We have approached this in a comprehensively phenotyped and genotyped cohort of paediatric patients with SRNS, thus ruling out incorrect phenotype being an issue. In the case of heterozygous changes in AR genes which are of unknown significance, other mechanisms to consider to avoid misinterpretation, are whether a second pathogenic variant resides within the non-coding region, or within the exome, but is missed due to the constraints of existing sequencing technology. A second pathogenic variant could also be present in a novel SRNS gene or possibly part of digenic inheritance including a novel SRNS gene58,59. However, the majority of gene-negative patients from this cohort were also subsequently whole genome sequenced, and this did not increase the rate of detection of mutations within the non-coding space60. Equally with advances in NGS technology and good coverage across all genes tested, false negatives, unless residing in hidden parts of the genome and requiring long read sequencing, become less of an issue. Our study therefore suggests that these possibilities will in practice only be relevant for a minority of cases, and as such, in the absence of evidence for other genetic mechanisms, heterozygous changes in AR podocyte genes should be reported as incidental rather than causal.

We propose that this study serves as a pilot for future larger scale analyses and that such analyses include additional sub-groups (circulating factor disease, patients responding to Rituximab etc.), where identification of enriched rare heterozygous variants in specific genes from patients may help to define new and important molecular pathways.

Limitations

There are a number of limitations to this study which are important to address. The sample size is relatively small, even if deeply phenotyped. The generalisability of the findings is limited to people of European ancestry. It was also not possible to assess the true number of HPPVs per individual in controls, as the data was only available for the population. Finally, our assessment of ‘predicted pathogenicity’ was rather more basic than that used for clinical reporting (we used a modified version of the standard ACMG criteria for variant classification), and it is possible therefore, that some of the variants were mis-assigned.

Conclusions

This study set out to investigate a previously unanswered question about the relevance of rare predicted-pathogenic heterozygous variants in recessive ‘podocytopathy-linked’ genes in a UK paediatric SRNS cohort. We conclude, whilst numbers were limited, there appears to be no significant difference in the number of HPPVs detected in paediatric SRNS compared with ancestrally-matched controls in the patients with no confirmed genetic diagnosis and for the majority of genes (57/59) in the confirmed monogenic patients. Despite limitations and the possibility of missed pathogenic changes in coding/non-coding regions as well as novel genes, our data provides evidence to support the observation that HPPVs are infrequently causal and associated with a pathogenic genotype in children with SRNS. This should reassure clinicians, and emphasises that whilst genetics may be a powerful tool to establish a diagnosis, clinical interpretation of less definitive results should be more cautious and may not provide the answer. One caveat does however remain for the interpretation of UK SRNS Panel testing through the National test directory. Since only 55 of a possible 70 genes linked to podocytopathy satisfy the criteria of being a ‘green gene’ or one with incontrovertible proof of pathogenicity, a negative result may not necessarily exclude the presence of genetic disease and in this instance the clinical presentation may be the more accurate.