Introduction

RCCX copy number variation (CNV) is a complex (it contains another CNV), multiallelic (there are more than two frequent CNV alleles in a population) and tandem (CNV segments are next to each other) CNV.1, 2, 3 Two full-length genes, complement component 4 (C4) and steroid 21-hydroxylase (CYP21), are located in RCCX CNV. C4 has two types with slightly different immune functions,4 C4A and C4B, and it contains an additional CNV, HERV-K(C4) CNV, in its intron 9 (Figure 1), derived from an ancient insertion of a human endogenous retrovirus. RCCX CNV in one chromosome is referred to as a haplotypic structure, and is traditionally described by the copy number of the segment, and, per segment, by the types of C4 and the alleles of HERV-K(C4) CNV (see the explanation of more specialized genetic terms in Table 1). A haplotypic RCCX CNV structure usually contains one functional CYP21 gene (CYP21A2) in the 3′-segment, and zero, one or two disabled pseudogenes (CYP21A1P) in the segments towards the 5′-direction (Figure 1). CYP21A2 is expressed uniquely in the adrenal cortex and encodes the steroid 21-hydroxylase enzyme (cytochrome P450c21), which is one of the enzymes responsible for the biosynthesis of the two principal steroid hormones, aldosterone and cortisol.5 Aldosterone regulates blood pressure by acting on kidney functions, and cortisol suppresses immune functions, mediates the response to stress and takes part in the intermediary metabolism of different tissues. 
Two chromosomal copies of disabled CYP21A2 cause congenital adrenal hyperplasia (CAH; we use this broader, but more prevalent, term instead of 21-hydroxylase deficiency), an autosomal recessive disorder accompanied by a partial or complete lack of aldosterone and cortisol.6 Special genetic rearrangements that are characteristic of multiallelic CNVs, such as non-allelic gene conversion and unequal crossover,7 are responsible for the vast majority of CYP21A2 mutations leading to CAH;8 therefore, the complex genetic structure of RCCX CNV indirectly manifests in CAH.

Figure 1

Scaled representation of the organization of human RCCX CNV depicted by RCCX structures with one, two and three segments. Each segment (repeat) is abbreviated with two letters: the first represents the allele of the HERV-K(C4) CNV (L – long (insertion) allele or S – short (deletion) allele), and the second symbolizes the type of C4 gene (A or B). The repetition of these two letters indicates RCCX structures with more than one segment. Dotted lines indicate the segment boundaries. The variable region of the RCCX with two segments contains two pairs of full-length genes, complement component 4 (C4A and C4B) and steroid 21-hydroxylase (CYP21A1P and CYP21A2). Two other gene pairs that partially reside in RCCX CNV, TNXA-TNXB and STK19-STK19P, are not illustrated.
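
The two-letter segment notation described in this legend can be read mechanically. The following Python sketch is illustrative only: the function name `parse_rccx` and the returned labels are hypothetical and are not part of the published methods. It assumes that segments are written in the 5′ to 3′ direction, as in the structure names used in the text (for example, the SB at the end of LBSASB is the 3′-segment).

```python
from typing import List, Tuple

# Letter codes taken from the figure legend.
HERV_ALLELES = {"L": "long (insertion)", "S": "short (deletion)"}
C4_TYPES = {"A": "C4A", "B": "C4B"}


def parse_rccx(structure: str) -> List[Tuple[str, str]]:
    """Split a structure label such as 'LBSASB' into per-segment
    (HERV-K(C4) allele, C4 type) pairs, listed from 5' to 3'."""
    if not structure or len(structure) % 2 != 0:
        raise ValueError("expected a non-empty label with two letters per segment")
    segments = []
    for i in range(0, len(structure), 2):
        herv, c4 = structure[i], structure[i + 1]
        if herv not in HERV_ALLELES or c4 not in C4_TYPES:
            raise ValueError(f"unrecognized segment code: {herv}{c4}")
        segments.append((HERV_ALLELES[herv], C4_TYPES[c4]))
    return segments


if __name__ == "__main__":
    for label in ("SB", "LASB", "LBSASB"):
        parsed = parse_rccx(label)
        print(label, "->", len(parsed), "segment(s):", parsed)
```

The label encodes only the HERV-K(C4) allele and the C4 type of each segment; whether a segment carries CYP21A2 or CYP21A1P cannot be derived from it and has to be determined experimentally.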

Table 1 Glossary of genetic abbreviations and key terms used in the text of the current study

Difficulties in the molecular diagnosis of CAH due to the complicated genetic nature of RCCX CNV have been recognized for a long time,9 and the best-known example is an LSS haplotypic RCCX CNV structure with three segments harboring two copies of CYP21A2 and the c.955C>T (p.(Q319*), rs7755898) variant. (For numbering the genetic variants, we use the most recent human reference gene and the corresponding mRNA sequence (RefSeqGene NG_007941.3 and NM_000500.7), which contains the insertion (CTG) allele of the c.28_30delCTG (rs61338903) variant. Therefore, our numbering differs from the studies using a CYP21A2 sequence (eg, former RefSeqGene NG_007941.2) with the deletion allele of c.28_30delCTG. The p.(Q319*) variant name is used with parentheses at the protein level according to the nomenclature recommendation of the Human Genome Variation Society (http://varnomen.hgvs.org/), which considers a protein variant without experimental evidence at RNA or protein level as a predicted one; however, the segregation patterns of classical (severe form) CAH families8, 10, 11, 12 (especially patients from consanguineous families with family history having homozygous c.955T genotypes without any other CAH mutations in exons and exon/intron boundaries and without reported duplicated CYP21A213) have provided clinical evidence for the pathogenicity since the initial report of c.955C>T.14) This unique RCCX structure was found for the first time in three patients with CAH, and the c.293−13C>R (rs6467) CAH mutation accompanied the c.955C>T variant.10 The unique RCCX structure with c.955C>T and c.293−13C>R has been confirmed in a Dutch CAH population,15 but the same RCCX structure without the c.293−13C>R has been observed in the healthy subjects of a Dutch population as well.16 After the initial findings, several studies on the topic have been published. 
However, healthy and CAH populations have rarely been examined simultaneously, and the determination of underlying RCCX structures has often been incomplete or has not been documented (Table 2). Nevertheless, new information on the unique haplotypic RCCX structure has been accumulated. In a healthy Spanish population, two CYP21A2 gene copies in one chromosome are associated with three genetic variants in complete linkage disequilibrium (LD): c.955C>T, c.293−79G>A (rs114414746) and c.*12C>T (rs150697472) variants. They occur with a moderate allele frequency (2.77%), and c.955C>T is located in the 3′-segment of CYP21A2, whereas a wild-type CYP21A2 is in the middle segment.17 An incomplete association among c.955C>T, c.293−79G>A and c.*12C>T has also been found in a later study,18 but, unfortunately, the diagnoses of CAH in the studied individuals have not been clearly documented, and therefore it cannot be known if the presence or absence of the association has been assigned to patients with CAH or healthy subjects.

Table 2 Published literature on the topic of the c.955C>T variant and two CYP21A2 gene copies

In the current study, a healthy and a CAH population with the same European ancestry were used to assess the occurrences and association of the haplotypic RCCX structure with three segments, two CYP21A2 in the same RCCX structure, c.955C>T, c.293−79G>A and c.*12C>T, and to offer a solution for the diagnostic problem. A recent set of experimental and bioinformatic methods1, 19 was applied to decipher the exact haplotypes of RCCX structure and CYP21A2, and to translate the evolutionary origin of the studied genetic variants into clinically useful information. Furthermore, patients with non-functioning adrenal incidentaloma (NFAI), which is a hormonally inactive adrenal mass unintentionally discovered by medical imaging,20 and the adrenocortical tumor specimens of NFAI patients, were used to compare blood hormone levels and to examine the mRNA levels. The rationale for studying NFAI patients was as follows: (i) the baseline hormone profiles of NFAI subjects do not differ from those of healthy subjects, (ii) a broader hormone profile was available for the NFAI subjects and (iii) their tumor specimens are available.

Materials and methods

Subjects and clinical data

To identify carriers of the c.955C>T variant, 96 healthy subjects (50% women) from a family study,21 68 healthy subjects (65.2% women) and 125 patients (76.0% women) with NFAI from a recent study,22 and 100 patients (58.0% women) with classical CAH due to 21-hydroxylase deficiency from the Second Department of Internal Medicine, Semmelweis University (Budapest, Hungary) and the Second Department of Pediatrics, Semmelweis University (Budapest, Hungary) were screened. In addition, one patient with supposed classical CAH was referred to us by the Department of Pediatrics, University of Pecs (Pecs, Hungary) for reexamination of the original genetic diagnosis. All subjects were genetically unrelated Hungarians of European ancestry. The exclusion criteria for NFAI patients were described recently.22 The diagnosis was confirmed in all CAH patients by hormonal and genetic testing (except for the one who was reexamined) as described.23, 24 Furthermore, 34 adrenocortical tumor specimens of NFAI patients were obtained as described,25 and screened for the c.955C>T variant. All subjects gave written informed consent. The study protocol was approved by the National Scientific and Ethical Committee, Medical Research Council of Hungary (TUKEB, ETT), and was executed according to the Declaration of Helsinki principles.

Genetic analyses of RCCX CNV and CYP21 genes

The c.955T allele, full haplotypic RCCX CNV structures, and full-length CYP21A1P and CYP21A2 haplotypes were determined as described previously.1, 19, 22, 23, 24 Briefly, a combined molecular and inferred haplotyping approach was used with two stages. C4 type-specific quantitative PCRs (qPCRs), CYP21 gene-specific qPCR, HERV-K(C4) CNV allele-specific qPCR, a set of CYP21 gene-specific nested polymerase chain reactions (PCRs) resting on a set of allele-specific long-range (ASLR) PCRs, and the Sanger sequencing of the nested PCR products were carried out in the first, experimental stage. One smaller modification was set up using a new CYP21A1P-specific nested PCR primer for one individual, CYP21A1P_2R (Supplementary Table 1), because a mutation of CYP21A1P decreased the binding efficacy of the original primer, CYP21A1P_R.1 The experimentally determined new CYP21A1P and CYP21A2 haplotypes are available in GenBank (www.ncbi.nlm.nih.gov/genbank/) under KU302771–KU302776 IDs, and variants (IDs in Supplementary Table 2) in dbSNP (www.ncbi.nlm.nih.gov/projects/SNP/). The above-mentioned haplotypic genetic features of RCCX CNV and CYP21 genes, which could not be determined in some subjects, were inferred by bioinformatic haplotype reconstruction in the second stage. To map the origin of haplotypic RCCX structures and CYP21A2 haplotypes, a genealogical haplotype network was constructed using the new and published CYP21A2 haplotypes (Supplementary Table 2 and 3) as described.1

Functional studies on CYP21A2

Total RNA was isolated from adrenocortical tumor specimens with the RNeasy Lipid Tissue Kit (Qiagen, Hilden, Germany), cDNA was synthesized with the ProtoScript II Kit (New England Biolabs, Ipswich, MA, USA) and the CYP21A2 mRNA level was determined by qPCR using a combined approach as described.19, 26 Briefly, the forward primer was designed on the boundary of exons 1 and 2 of the CYP21 genes, the reverse primer was located on the c.332_339delGAGACTAC variant distinguishing CYP21A1P and CYP21A2, and the TaqMan minor groove binder probe (Applied Biosystems, Foster City, CA, USA) on the boundary of exons 2 and 3 was labeled with the fluorescent dye 6-FAM (Supplementary Table 1). Reactions were carried out on a 7500 Fast Real-Time PCR instrument with TaqMan Fast Advanced PCR master mix (Applied Biosystems). The human OAZ1 gene was used as an internal control.27 Full-length CYP21 cDNA was amplified and cloned with primers designed for mRNA according to a previous study.28 Briefly, CYP21A2 cDNA was amplified by Phusion DNA polymerase (New England Biolabs), the PCR product was directionally cloned into a pTZ57R vector and subjected to Sanger sequencing. The methods of hormone measurements for NFAI patients and the software for statistics were recently described.22
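
The text states only that OAZ1 served as the internal control; the exact quantification formula is given in the cited references and is not reproduced here. As a hedged illustration of how a target transcript is commonly normalized to an internal control, the Python sketch below uses the widely used 2^(−ΔCt) calculation, which assumes roughly 100% amplification efficiency for both assays. The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Relative amount of a target transcript normalized to a reference gene,
    using 2 ** -(Ct_target - Ct_reference).

    Assumption: both assays double the product in every cycle."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only (not data from the study).
    ct_cyp21a2 = 24.0
    ct_oaz1 = 22.0
    print(relative_expression(ct_cyp21a2, ct_oaz1))
```

A target that crosses the threshold two cycles later than the reference is estimated at one quarter of the reference level under this assumption.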

Results

A unique haplotypic RCCX CNV structure with three segments, LBSASB, harboring two CYP21A2 gene copies and the c.955C>T variant

The c.955C>T variant was detected in 7 healthy subjects out of 164 (allele frequency 2.1%), in 5 NFAI patients out of 125 (2.0%), and in 6 classical CAH patients out of 100 patients (3.0%). To find out which haplotypic RCCX CNV structure and CYP21A2 haplotype harbor the c.955T allele, a set of molecular haplotyping methods1 was used for the experimental dissection in the first stage. Then, the haplotypic information based on molecular haplotyping from the first stage and the genotypic information of all studied healthy subjects and patients with NFAI were used for bioinformatic haplotype reconstruction in the second stage (see detailed description of how the experimental and bioinformatic methods work elsewhere1). The first, experimental stage revealed the following genetic features: (i) All the healthy subjects and all the NFAI patients who had the c.955T allele (12 individuals) carried a haplotypic RCCX CNV structure on one chromosome harboring two CYP21A2 gene copies (they had three CYP21A2 gene copies as a whole). (ii) The extra CYP21A2 gene copy in the 5′-segment or the middle segment of RCCX CNV was always related to a C4A gene in the 5′-direction, and the CYP21A2 haplotype of these segments could be determined in all 12 individuals (Figure 2a). Accordingly, two almost identical and unique (not observed in former studies1, 19) CYP21A2 haplotypes were located in the 5′-segment or the middle segment, the h63 (in 1 individual) and h64 (in 11 individuals). (iii) Of these 11 individuals with the h64 haplotype, the middle segment of an LBSASB RCCX structure harbored h64 in two individuals. (iv) Two individuals with c.955C>T carried a suitable RCCX structure combination for the CYP21A2 haplotype determination in the 3′-segment (Figure 2b), and they had a unique, previously undescribed h65 CYP21A2 haplotype there. 
In addition, both CYP21A2 haplotypes in one haplotypic RCCX structure were also determined in a 1-year-old CAH patient with c.955C>T, whose DNA sample was received to reexamine the original genetic diagnosis based on parental genotypes, because the required dose of steroid substitution was unchanged in spite of weight gain. In this putative CAH patient, the 3′-segment also harbored an h65 haplotype, and an h64 haplotype was carried as a second CYP21A2 copy.

Figure 2

The experimental determination (molecular haplotyping) of haplotypic RCCX CNV structures, CYP21A2 haplotypes and the resultant CYP21A2 haplotypes of LBSASB RCCX structure. A-S ASLR-PCR indicates allele-specific long-range PCR only from the C4A gene in the 5′-segment or the middle segment, A-T ASLR-PCR indicates ASLR-PCR only from the C4A gene in the 3′-segment, and B-T ASLR-PCR indicates ASLR-PCR only from the C4B gene in the 3′-segment. Each haplotypic RCCX segment is abbreviated with two letters: the first represents the allele of HERV-K(C4) CNV (L – long (insertion) allele or S – short (deletion) allele), and the second symbolizes the type of C4 gene (A or B). The number of two-letter pairs in a structure indicates the segment number. Two other gene pairs that partially reside in RCCX CNV, TNXA-TNXB and STK19-STK19P, are not illustrated. (a) The haplotypic RCCX structures and the CYP21A2 haplotype in the middle segment can be determined from the depicted diploid combination of the haplotypic RCCX structures, but the CYP21A2 haplotype in the 3′-segment cannot. (b) All CYP21A2 haplotypes can be determined from the depicted diploid combination of the haplotypic RCCX structures, because ASLR-PCR can separate the two 3′-segments from the different haploid copies of chromosome 6. (c) CYP21A2 haplotypes of the LBSASB RCCX structure. Only variants with at least moderate frequencies according to a recent study19 are presented. Specific genetic variants of these haplotypes are indicated by black triangles.

The second, bioinformatic stage resolved a unique LBSASB RCCX structure harboring very similar h63 or h64 CYP21A2 haplotypes in the middle segment and very similar h65, h66, h67 or h68 haplotypes in the 3′-segment above a 0.99 confidence probability threshold in all 12 individuals and the reexamined CAH patient (Figure 2c). A specific allele, c.*12T, occurred only in h63 and h64 haplotypes of the middle segment and not in any other haplotypes of healthy subjects and patients with NFAI, and a specific allele, c.293−79A, occurred only in h65–h68 haplotypes of the 3′-segment (Supplementary Table 2). Consequently, these two variants and c.955C>T had the same allele frequency as the LBSASB RCCX structure harboring them and were in complete LD (r²=1); furthermore, there was no significant difference between the allele frequencies in healthy subjects and patients with NFAI (Fisher’s exact test, P=1.0000).
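
The arithmetic behind these statements can be retraced with a short Python sketch. It is not the statistical software used in the study, and the function names are hypothetical. It assumes that every carrier is heterozygous for one LBSASB chromosome (as reported above), that allele frequencies are computed per chromosome (two per individual), and that Fisher's exact test was applied to chromosome counts; the test is implemented here in its usual two-sided form, summing all tables that are no more probable than the observed one.

```python
from math import comb


def allele_frequency(carrier_alleles: int, n_individuals: int) -> float:
    """Allele frequency in a diploid sample (two chromosomes per individual)."""
    return carrier_alleles / (2 * n_individuals)


def r_squared(n_ab: int, n_aB: int, n_Ab: int, n_AB: int) -> float:
    """LD measure r^2 from the four haplotype counts of two biallelic variants
    (a/A at the first variant, b/B at the second)."""
    n = n_ab + n_aB + n_Ab + n_AB
    p_ab = n_ab / n
    p_a = (n_ab + n_aB) / n
    p_b = (n_ab + n_Ab) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {x: comb(row1, x) * comb(row2, col1 - x) / total
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Small relative tolerance guards against floating-point ties.
    p = sum(px for px in probs.values() if px <= p_obs * (1 + 1e-9))
    return min(1.0, p)


if __name__ == "__main__":
    # Carrier counts from the text: 7 of 164 healthy subjects, 5 of 125 NFAI patients.
    print(f"healthy: {allele_frequency(7, 164):.1%}")
    print(f"NFAI:    {allele_frequency(5, 125):.1%}")
    # Complete LD: 12 chromosomes carry both minor alleles, 566 carry neither.
    print("r^2 =", r_squared(12, 0, 0, 566))
    # Chromosomes with / without LBSASB in healthy subjects and NFAI patients.
    print("Fisher P =", fisher_exact_two_sided(7, 321, 5, 245))
```

When two variants always occur on the same chromosomes, the off-diagonal haplotype counts are zero and r² equals 1 regardless of the allele frequency.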

CYP21A2 haplotypes of classical CAH patients with c.955C>T

In total, seven CYP21A2 haplotypes with the c.955T allele were found in six patients out of the studied 100 patients with classical CAH (Table 3). The c.*12T allele did not occur in these six CAH patients, but the c.293−79A allele was observed in three patients. From the seven haplotypes with the c.955T allele, three CYP21A2 haplotypes were experimentally determined; a h65 haplotype harbored by an LASA RCCX CNV structure, a haplotype (h69), which had an extended CYP21A1P sequence approximately from intron 6 to the 3′ end of the gene including c.955T, and a haplotype (h70), which did not harbor any alleles characterizing CYP21A1P apart from c.955T. Bioinformatic haplotype reconstruction can accurately resolve haplotypes when they recurrently occur in a studied population, but CYP21A2 haplotypes with CAH variants not determined by experiments did not satisfy this requirement. However, it was highly probable that a CYP21A2 haplotype (in patient 2) was identical with the h67 haplotype of LBSASB RCCX structure, and a haplotype (in patient 5) also had an extended CYP21A1P sequence at the 3′ end of the gene. Furthermore, the CYP21A2 haplotypes of CAH patients were not assigned to LBSASB or a chromosome harboring a second copy of CYP21A2, although the presence of the c.293−79A allele in classical CAH patients implied that the CYP21A2 haplotypes harboring c.293−79A had derived from the CYP21A2 in the 3′-segment of LBSASB.

Table 3 Haplotypes and genotypes of classical CAH patients with the c.955C>T mutation

Origin of CYP21A2 haplotypes with c.955C>T in healthy and CAH individuals

To assess the evolutionary origin of CYP21A2 haplotypes, a genealogical (evolutionary) haplotype network was applied as described recently.1 Briefly, haplotypes that are more similar to each other are located close to each other in the haplotype network, and are considered to be closer relatives.29
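
The principle that more similar haplotypes sit closer together can be illustrated by counting allele differences between haplotypes. The Python sketch below is a simplified illustration of that principle only; it does not reproduce the network construction method of the cited studies. The function names and the toy haplotypes (hapA, hapB, hapC) are hypothetical and do not correspond to any haplotype in this study; each string position stands for the allele at one variant site.

```python
from itertools import combinations
from typing import Dict, Tuple


def allele_differences(h1: str, h2: str) -> int:
    """Number of variant sites at which two haplotypes carry different alleles."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must be typed at the same variant sites")
    return sum(a != b for a, b in zip(h1, h2))


def pairwise_distances(haps: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Allele-difference counts for every unordered pair of haplotypes."""
    return {(x, y): allele_differences(haps[x], haps[y])
            for x, y in combinations(sorted(haps), 2)}


def nearest_neighbour(name: str, haps: Dict[str, str]) -> str:
    """Haplotype with the fewest allele differences from `name`
    (ties are broken alphabetically)."""
    others = [o for o in haps if o != name]
    return min(others, key=lambda o: (allele_differences(haps[name], haps[o]), o))


if __name__ == "__main__":
    # Toy haplotypes over six hypothetical variant sites.
    toy = {"hapA": "CGTACC", "hapB": "CGTACT", "hapC": "TATGTC"}
    for pair, dist in pairwise_distances(toy).items():
        print(pair, dist)
    print("closest to hapB:", nearest_neighbour("hapB", toy))
```

In this toy set, hapA and hapB differ at a single site and would be drawn next to each other, whereas hapC differs from both at several sites and would be placed further away.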

The CYP21A2 haplotypes (h63 and h64) in the middle segment of the LBSASB RCCX structure clustered together and were far from the group of the haplotypes (h65–h68) in the 3′-segment of LBSASB on the haplotype network (Figure 3a). This supports the idea that the haplotypes in each group were closely linked, but a distant relationship was observed between the two groups. The distant relationship implies that an unequal crossover event between two distantly related RCCX structures generated the LBSASB structure. However, the SB and LASB RCCX structure groups next to the RCCX structure assigned to h63–h64 and h65–h68 groups do not suggest a one-step mechanism for this structure generation. Nevertheless, a potential multistep generation with unequal crossover or an alternative explanation of multiple non-allelic gene conversion events equally indicates a relatively ancient origin of the LBSASB structure. Not only the multistep mechanism but the presence of LBSASB-specific variants also confirms this.

Figure 3

Evolutionary processes of RCCX CNV structure harboring the c.955C>T variant in CYP21A2. Each haplotypic RCCX segment is abbreviated with two letters: the first represents the allele of HERV-K(C4) CNV (L – long (insertion) allele or S – short (deletion) allele), and the second symbolizes the type of C4 gene (A or B). The number of two-letter pairs in a structure indicates the segment number. (a) Genealogical haplotype networks of CYP21A2 haplotypes with projected haplotypic RCCX CNV structures. Numbers in gray circles indicate the known CYP21A2 haplotypes, and numbers in orange, green, and dark green circles represent the CYP21A2 haplotypes published here for the first time. Orange circles indicate CYP21A2 haplotypes with the c.955C>T variant in classical CAH patients, green circles represent CYP21A2 haplotypes in the middle segment of the LBSASB RCCX structure, and dark green circles indicate CYP21A2 haplotypes with the c.955C>T in the 3′-segment of LBSASB. The length of gray lines between the CYP21A2 haplotypes reflects the nucleotide (allele) identity: greater length implies more different alleles between haplotypes and lower identity. (b) Both simple (equal) crossover between LASA and LBSASB RCCX structures and unequal crossover between LASALB and LBSASB structures could generate the LASA structure harboring an h65 CYP21A2 haplotype in the 3′-segment.

One experimentally determined CYP21A2 haplotype with the c.955T allele from classical CAH patients was in the haplotype group of the 3′-segment of the LBSASB, indicating a common descent. The LASA RCCX structure harboring the LBSASB-identical h65 haplotype was generated by simple (equal) or unequal crossover1, 30 between LBSASB and existing RCCX structures in the studied population.1, 21 In the case of (equal) crossover, the breakpoint would have been after the C4A in the 3′-segments of the LASA structure and between the C4B and CYP21A2 in the 3′-segment of the LBSASB harboring the h65 haplotype (Figure 3b), while it would have been after the C4A in the middle segments of the LASALB structure and in the same place of the LBSASB structure in the case of unequal crossover. In both cases, the functional CYP21A2 in the middle segment of LBSASB vanished from the emerged LASA RCCX structure.

Two other experimentally determined CAH-causing CYP21A2 haplotypes with the c.955T allele (h69 and h70) were located separately from both haplotype groups of LBSASB and from each other, supporting origins independent of the LBSASB RCCX structure (probably by non-allelic gene conversions from the CYP21A1P of the same RCCX structure where c.955T occurs). The connections of these two CYP21A2 haplotypes to the stem of the haplotype network were confirmed by the harboring RCCX structures of the connected haplotypes (h69 harbored by LALB RCCX structure and h70 harbored by LASB RCCX structure).

Gene expression and function of CYP21A2 in the carriers of LBSASB RCCX structure with the c.955C>T variant

The LBSASB RCCX structure with the c.955T allele was observed in 1 out of 34 adrenocortical tumor specimens of patients with NFAI. The CYP21A2 mRNA level of this c.955C>T carrier patient did not significantly differ (t-test, P=0.3312) from those of non-carrier patients (Figure 4a). Furthermore, this c.955C>T carrier patient had an h64 CYP21A2 haplotype in the middle segment of LBSASB, an h65 haplotype in the 3′-segment, and an h28 haplotype in the SB RCCX structure of the other chromosome (the exact RCCX structures of this patient are shown in Figure 2a). The transcripts of these three CYP21A2 haplotypes harbored distinctive alleles on c.*12C>T and c.*52C>T variants; CC alleles for h65, CT alleles for h28 and TT alleles for h64 (Figure 4b). To ascertain the frequencies of the different transcripts derived from these three different CYP21A2 haplotypes, cDNA was generated from total mRNA, and the CYP21A2 transcripts were cloned and sequenced in five independent experiments. The h64 transcript was present, indicating that CYP21A2 in the middle segment was transcriptionally active. The h65 transcript was not detected, whereas the frequency of the h28 transcript from the other chromosome was significantly lower than that of the h64 transcript (Figure 4c). The hormone profiles of the carriers of LBSASB with c.955C>T (N=5) and non-carriers (N=102) of NFAI patients from our recent study22 were compared, but no significant difference was detected in morning cortisol, midnight cortisol, adrenocorticotropic hormone (ACTH)-induced cortisol, morning aldosterone, ACTH-induced aldosterone, morning 17-OH-progesterone, morning corticosterone, ACTH-induced corticosterone, metyrapone-blocked 11-deoxycortisol, morning dehydroepiandrosterone sulfate, ACTH, or metyrapone-blocked ACTH levels (Supplementary Table 4).

Figure 4

The results of the functional studies on adrenocortical specimens of NFAI patients. One asterisk indicates significant difference at P<0.05, whereas ns represents a nonsignificant difference, as calculated by t-test. (a) CYP21A2 mRNA level in the adrenal specimens of c.955T allele (on LBSASB RCCX structure) carrier NFAI patient (N=1) and non-carrier patients (N=3). (b) The forward and reverse chromatogram of h28 and h64 transcripts from the c.955T allele carrier NFAI patient, where the distinctive alleles on c.*12C>T and c.*52C>T variants are indicated (CT alleles for h28 and TT alleles for h64). (c) The abundance of transcripts expressed from three CYP21A2 haplotypes (h28, h64, and h65) of the NFAI patient with c.955T allele on LBSASB RCCX structure.

Discussion

A moderately frequent, unique haplotypic RCCX CNV structure with three segments harboring two CYP21A2 gene copies, abbreviated to LBSASB, was determined in 13 individuals (seven healthy subjects, five patients with NFAI and one reexamined CAH patient) with normal steroid levels by a set of experimental and bioinformatic methods published recently.1, 19 This set of methods is capable of not only a more exact determination of RCCX structure than some current methods but also the segment-specific sequencing of CYP21A2: thus the haplotypes of the two harbored CYP21A2 in the middle and the 3′-segments of the RCCX CNV were revealed. The CYP21A2 haplotypes in the middle segment harbored the c.*12T allele, whereas the CYP21A2 haplotypes in the 3′-segment harbored the c.293−79A and c.955T alleles. Both c.293−79G>A and c.*12C>T variants are in complete LD with the c.955C>T variant in the healthy subjects of two European (Spanish and Italian) populations,17, 31 suggesting a straightforward usage of these variants to distinguish between pathogenic and non-pathogenic genomic contexts of the c.955C>T variant in the genetic diagnosis of CAH, but unfortunately no information is available on c.293−79G>A and c.*12C>T in the CAH patients of these populations. Our findings on the healthy subjects and the patients with NFAI and CAH of the Hungarian population indicated that only the c.*12T allele is unique to the individuals with normal steroid hormone levels. Therefore, the genetic examination of c.*12C>T allows us to avoid the genetic misinterpretation of p.(Q319*).

The ability of the c.*12C>T to discriminate between CYP21A2 copies with c.955C>T in non-pathogenic and pathogenic genomic contexts can be readily built into molecular diagnosis approaches,32, 33 which seems to be especially important for populations with high c.955C>T frequency from North Africa and Asia.34, 35 Before its inclusion in the routine genetic diagnosis of CAH, the occurrence pattern of c.*12C>T should be investigated in both healthy subjects and CAH patients belonging to the target population. The pioneer study on c.955C>T in RCCX structure with three segments and two CYP21A210 obviously illustrates the importance of validation in other populations: it is probable that the c.*12T and c.293−13G CAH alleles are simultaneously harbored in the middle segment in three CAH patients. However, the nationality and ancestry of these patients have not been documented, and thus a conclusion with regard to the origin of this RCCX structure unfortunately cannot be drawn.

Nevertheless, the three CAH-causing mutations in one patient harboring c.*12C>T implied that one of the mutations could occur in the CYP21A2 of the middle segment, disabling its functionality. Our evolutionary analysis provided a new insight into the underlying mechanism that generates such a clinical case. The CYP21A2 haplotypes with the c.955T allele are derived from the 3′-segment of the LBSASB RCCX structure or from CYP21A1P, according to the genealogical haplotype network. However, the second, functional copy of CYP21A2 in a chromosome was never carried by classical CAH patients with c.955C>T, independent of whether the c.955T allele of CYP21A2 haplotypes was derived from the 3′-segment of LBSASB or elsewhere. This putative rearrangement mechanism of CYP21A2 haplotypes with the c.955T derived from the LBSASB RCCX structure sheds light on why the c.*12T allele did not occur in Hungarian CAH patients carrying a CYP21A2 haplotype with c.293−79A and c.955T; if a rearranged RCCX structure loses the functional CYP21A2 from the middle segment of LBSASB, it will become pathogenic, and if a rearranged RCCX structure retains the functionally intact CYP21A2 with c.*12T from the middle segment, it will rescue its carrier from CAH. Nevertheless, it cannot be excluded that a rearrangement such as non-allelic gene conversion deletes or transfers the c.*12T allele from its original genomic context, or transfers a pathogenic mutation to CYP21A2 in the middle segment (such an event would be very rare, because it requires a rare rearrangement within the only moderately frequent LBSASB). Therefore, careful consideration is always a basic requirement in the genetic diagnosis of CAH, and the sequencing of full-length CYP21A2 from the segment-specific PCR product of the middle segment is needed after the finding of the c.*12T allele in the extremely rare case when more than two CAH mutations (including c.955C>T) are detected in one individual.
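
The reasoning of this paragraph can be summarized as a small decision sketch in Python. It is an illustration of the logic described in the text, not a validated diagnostic algorithm: the class, the function and the result labels are hypothetical, and the rule presupposes that the occurrence pattern of c.*12C>T has been confirmed in the target population.

```python
from dataclasses import dataclass

NOT_APPLICABLE = "c.955T not detected: rule not applicable"
POSSIBLY_PATHOGENIC = "c.955T without c.*12T: pathogenic context cannot be excluded"
SEQUENCE_MIDDLE_SEGMENT = (
    "c.*12T with more than two CAH mutations: "
    "sequence full-length CYP21A2 from the middle segment"
)
COMPATIBLE_WITH_LBSASB = (
    "c.955T with c.*12T: compatible with LBSASB carrying "
    "a second, functional CYP21A2"
)


@dataclass
class Cyp21a2Findings:
    has_c955T: bool        # c.955T allele detected
    has_c12T: bool         # c.*12T allele detected
    n_cah_mutations: int   # CAH mutations found in the individual, including c.955C>T


def interpret_c955T(findings: Cyp21a2Findings) -> str:
    """Map genotyping findings to the interpretation outlined in the text."""
    if not findings.has_c955T:
        return NOT_APPLICABLE
    if not findings.has_c12T:
        return POSSIBLY_PATHOGENIC
    if findings.n_cah_mutations > 2:
        # A mutation may reside in the middle-segment CYP21A2 copy.
        return SEQUENCE_MIDDLE_SEGMENT
    return COMPATIBLE_WITH_LBSASB


if __name__ == "__main__":
    # Hypothetical individual: c.955T plus one further CAH allele, c.*12T present.
    print(interpret_c955T(Cyp21a2Findings(True, True, 2)))
```

The order of the checks mirrors the text: the presence of c.*12T is evaluated first, and the count of CAH mutations only decides whether segment-specific sequencing of the middle-segment copy is additionally required.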

The significance of the c.955C>T variant in the molecular diagnosis of CAH is demonstrated by a recent case study on a misdiagnosed c.955C>T variant.36 Such cases are not extremely rare, as one of our own cases illustrates. The original genetic diagnosis of a 1-year-old patient with classical CAH was reexamined, because the required dose of hormone substitution was unchanged in spite of weight gain. The father had a known history of classical CAH due to CYP21A2 deficiency and the healthy mother carried a c.955T allele. The patient inherited the maternal c.955T allele besides a paternal RCCX structure without CYP21A2, and was therefore treated with mineralocorticoids and glucocorticoids from birth. The reexamination demonstrated that an intact CYP21A2 was harbored by the middle segment of LBSASB next to the c.955C>T variant in the 3′-segment. On the basis of this finding, the steroid supplementation was gradually reduced and then completely discontinued, without any clinical signs of adrenocortical insufficiency.

Several studies based on the normal steroid levels of subjects with two carried CAH variants have found indirect evidence proving that one of the two CYP21A2 associated with the c.955C>T variant in one chromosome expresses active steroid 21-hydroxylase enzyme.36, 37, 38, 39 However, direct evidence for the functional CYP21A2 in the middle segment was provided by the current study for the first time. In the adrenal specimen of a patient with NFAI, the CYP21A2 in the middle segment of the LBSASB RCCX structure produced an mRNA transcript differing from the transcripts of the CYP21A2 copies in the 3′-segment and the other chromosome, and this middle segment-specific CYP21A2 transcript was detected. The CYP21A2 transcript from the middle segment was significantly more abundant than the other two transcripts, but neither CYP21A2 mRNA expression levels nor hormone levels in blood significantly differed between the LBSASB carriers and the non-carriers in NFAI patients. It should be noted that the quantitative analysis of the abundance of different CYP21A2 transcripts was not absolutely reliable because only one adrenal specimen was examined (however, we have no reason to question the existence of the CYP21A2 transcript expressed from the middle segment of LBSASB), and the sample size of NFAI patients was not sufficient for the detection of subtle hormonal differences in blood. In fact, several hundreds of adrenal specimens or NFAI patients should be examined to collect enough LBSASB carriers for the more reliable analyses on the abundance of CYP21A2 mRNAs or the subtle hormonal differences due to the moderate prevalence of LBSASB.

The current study is in tune with the results that have been presented so far (Table 2). In addition, the allele frequencies (~1–2%) and the almost complete LD between c.293−79G>A and c.*12C>T in the European populations of the 1000 Genomes project40 also imply the presence of LBSASB RCCX structure. However, the next-generation sequencing (NGS)-based methods cannot reveal the exact internal content of multiallelic CNVs,41 and the bioinformatic analysis of NGS data is still prone to errors;42 hence, c.*12C>T is documented by dbSNP as a variant of CYP21A2 in the 3′-segment. According to the current finding, the c.*12 nucleotide position is never polymorphic in the 3′-segment, but it is in the middle segment. Therefore, this phenomenon cannot be described by the traditional variant definition of non-duplicated genomic regions, and the usage of the variant categories for duplicated genes19, 43 should be considered.