Introduction

The α7 nicotinic acetylcholine receptor gene (CHRNA7) is located at 15q13–q14, a region strongly linked to the P50 sensory gating deficit, an endophenotype of schizophrenia1, 2 and bipolar disorder.3, 4 The peak LOD score (5.3) occurs at a marker in CHRNA7, with linkage also supported by pharmacological evidence.5 Attempts to demonstrate linkage of this region to either schizophrenia or bipolar disorder have had mixed results, with one study showing linkage to bipolar disorder6 and several studies showing only weak evidence for linkage to schizophrenia.1, 2, 7, 8 There is also evidence for association of CHRNA7 with schizophrenia and bipolar disorder.9 Together, these results suggest that P50 may be modulated by variant(s) in the CHRNA7 region as one of many genetic defects that increases susceptibility to the major psychoses. This region has also been linked to two idiopathic epilepsies.10, 11

Genetic analysis of 15q13–q14 is complicated by a large duplication of part of CHRNA7.12 We have previously examined the sequence relationships of this and other duplications in this region and showed that the partial duplication of CHRNA7 (CHRFAM7A) is a hybrid of CHRNA7 and an unrelated sequence FAM7A, of which there are several copies.13 Both FAM7A and CHRFAM7A are transcribed, but translation is uncertain.

Our map of 15q13–q1413 (Figure 1, bottom) showed that CHRNA7 and CHRFAM7A are in opposite orientations in the DNA sequence database (Build 36), which is mostly represented by the RP11 library. This suggests that an inversion of CHRFAM7A may have taken place after the partial duplication that generated it.13 Later assemblies of the human sequence (NT_010194), after earlier incorrect assemblies, support this map. There is now a continuous tiling path of clones between the two ends of the map, confirming our finding that CHRNA7 and CHRFAM7A are in opposite orientation in the RP11 individual.14

Figure 1

Model for generation of the duplicon containing CHRFAM7A. Homologous segments are indicated by the same letters, with positions and orientation of genes CHRNA7, CHRFAM7A and FAM7A shown below. In the model, two minor variants of the ancestral structure Ia (structures Ib and Ic) undergo a duplication mediated by non-allelic homologous recombination (NAHR) at the direct repeats QRAZRM to generate an intermediate structure, which undergoes non-homologous deletion to bring segment A adjacent to segment H in structure IIa. (See Supplementary Figure 2 for alternative scheme in which structure IIa is generated from structures Ib and Ic in one step.) Inversion of segments HSRZ (in the CHRFAM7A duplicon) needs to occur to produce the RP11 structure (structure IIb), with deletion of the most telomeric segments AZ occurring here or at an earlier step, both occurring by NAHR. Approximate sizes and positions of the three fluorescent probes are shown above appropriate segments for structures IIa and IIb (yellow, RP11-143J24; green, 13H18; red, 30H17). Positions of duplicons associated with CHRNA7 and CHRFAM7A and the breakpoints 4 and 5 (BP4 and 5), involved in inv dup (15) supernumerary marker chromosomes are shown below structure IIb.

The partial duplication of CHRNA7 is a recent event unique to humans, with only CHRNA7 and not CHRFAM7A in other higher primates.15 Chromosomes both with and without CHRFAM7A have been identified in several human populations, indicating a copy number variant (CNV) with respect to the duplicated part common to CHRNA7 and CHRFAM7A. We have recently shown association between this CNV and the major psychoses.16 This study found that the homozygous CHRFAM7A null genotype was rare, but the heterozygote occurred in 24% of psychosis patients compared to 16% of controls (P=0.04). We have deduced the likely structure of chromosomes with this null CHRFAM7A allele,14 which may represent a persisting ancestral chromosome. Other more recent ancestral chromosomal structures predating the putative inversion may also exist. In this paper, we demonstrate examples of both pre- and post-inversion forms showing that such an ancestral chromosome does persist and is very common.

Materials and methods

Identification of chimpanzee clones

Orthologous Pan troglodytes clones were identified using BLASTN against fully sequenced clones (nr/nt) or against genome survey sequences (gss) for clone end sequences, and aligned with human contig NT_010194.16 using Blast Two Sequences (www.ncbi.nlm.nih.gov/blast/Blast.cgi).

Sample

A total of 12 fresh blood samples, treated with lithium heparin, were obtained from the First Onset of Psychosis Study (Institute of Psychiatry). Clinical data are summarized in Table 1.

Table 1 FISH analysis of CHRFAM7A duplicon

DNA preparation and assays

DNA was prepared from 1 ml of blood using the QIAamp DNA blood midi kit (Qiagen). The 2 bp deletion genotype and copy number of CHRFAM7A were determined as described previously.16

Probe selection for fluorescence in situ hybridization

The BAC clone RP11-143J24 (AC087455; http://bacpac.chori.org/) was selected as the fluorescence in situ hybridization (FISH) probe for segment B. For duplicated segments S and H (Figure 1), cosmid clones were used as probes because their smaller size (approximately 40 kb) enabled more precise targeting. The PCR primer pairs GACCTAGATCCACAGTAAG and CAGGTGGAGATTCCAAGAGC (segment S, from RP13-395E19, AC139426) and TATCTATCAGCCCATCTGAG and CACGCACGATGAGCACCTCC (segment H, from RP11-382B18, AC019322) were used to amplify genomic DNA for cosmid library screening. PCR products were random primer labeled with [α-³²P]dCTP (specific activity, 1–2 × 10⁶ c.p.m. per ml hybridization mixture) using the RediprimeII kit (Amersham Biosciences). LA15NC01, a chromosome 15 cosmid library (www.hgmp.mrc.ac.uk/geneservice), was screened using Express-hyb buffer (Amersham Biosciences). Individual positive clones 30H17 (segment H) and 13H18 (segment S) were confirmed by PCR using the original and additional flanking PCRs (see Figure 1 for locations of all three probes).

Chromosome studies

Molecular cytogenetic studies were performed on chromosomes taken from peripheral blood cells. Chromosome preparations were obtained according to standard protocols.17 FISH was performed as previously described.18 Cosmid clones were labeled with biotin or digoxigenin by nick translation (Invitrogen). Biotin was detected with an avidin-fluorescein isothiocyanate system (green) and digoxigenin was detected with anti-digoxigenin-rhodamine (red). The yellow signal was obtained by labeling the BAC clone independently with both biotin and digoxigenin, so that the colocalized green and red signals appear yellow. Slides were mounted in Vectashield (Vector Laboratories) containing 4′,6-diamidino-2-phenylindole as counterstain. Hybridization signals were visualized and analyzed using an Isis FISH Imaging System (Metasystems).

Linkage disequilibrium estimates

We calculated r² estimates of linkage disequilibrium (LD) between the inversion polymorphism and the 2 bp deletion polymorphism using Gene Counting19 and 2LD.20 Because of the occasional presence of the CHRFAM7A null allele, a few samples were haploid for either polymorphism. To overcome this complication, haplotype frequencies for the two polymorphisms were initially estimated for the diploid samples using Gene Counting. These frequency estimates were then adjusted to include the haplotypes for the haploid samples, where the phase is always unambiguous. We obtained r² and standard error estimates from the adjusted estimated haplotype frequencies using 2LD.
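The calculation described above can be illustrated with a minimal sketch. It is not the Gene Counting or 2LD implementation: it uses the standard definition of r² for two biallelic loci, and the pooling step (converting the diploid-sample frequency estimates to expected counts and adding the phase-known haploid counts) is an assumption about how the adjustment might be done. The function names and the counts in the usage lines are hypothetical.

```python
def combine_counts(diploid_freqs, n_diploid_chromosomes, haploid_counts):
    """Pool estimated and directly observed haplotypes (assumed scheme).

    diploid_freqs: estimated frequencies of the four haplotypes among
        diploid samples, in a fixed order (ab, aB, Ab, AB).
    n_diploid_chromosomes: number of chromosomes behind those estimates.
    haploid_counts: observed counts of the same four haplotypes in
        samples that carry a single copy (phase is unambiguous).
    """
    return [f * n_diploid_chromosomes + h
            for f, h in zip(diploid_freqs, haploid_counts)]


def r_squared(n_ab, n_aB, n_Ab, n_AB):
    """Standard r^2 between two biallelic loci from haplotype counts."""
    total = n_ab + n_aB + n_Ab + n_AB
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_ab = n_ab / total
    p_a = (n_ab + n_aB) / total   # frequency of allele a at locus 1
    p_b = (n_ab + n_Ab) / total   # frequency of allele b at locus 2
    d = p_ab - p_a * p_b          # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("a locus is monomorphic; r^2 is undefined")
    return d * d / denom


# Hypothetical numbers, for illustration only.
counts = combine_counts([0.5, 0.0, 0.0, 0.5], 20, [1, 0, 0, 1])
ld = r_squared(*counts)
```

With these made-up counts the two polymorphisms are perfectly correlated, so `ld` evaluates to 1; a single discordant haplotype would lower it.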

Results

A model for partial duplication of CHRNA7

A map showing the duplication structure of 15q13–q14 for the RP11 individual is shown as structure IIb (Figure 1, bottom), with CHRNA7 and CHRFAM7A in opposite orientations. This is a modified version of a map shown previously.14 Segments that share >5 kb with >95% sequence identity (mostly >99% identity) are shown in color, with homologous segments sharing the same letter. Duplicons associated with the duplication of CHRNA7 run from segments HS (in blue) to segments RZARQ (in yellow), where the sequences diverge in segment Q.

To derive a model for the creation of CHRFAM7A, we considered its likely ancestral structure. The CHRFAM7A null allele is probably part of a chromosome structure that predates the creation of CHRFAM7A and may represent a close approximation to this ancestral structure. Structure Ia (Figure 1, top) represents our best estimate of the ancestral human structure for this region, combining our data on RP11, the CHRFAM7A null allele and chimpanzee sequence (Supplementary Figure 1). Figure 1 shows a plausible series of steps whereby the RP11 structure (IIb) might have arisen from the presumed ancestral structure (Ia). Two key events in the creation of CHRFAM7A are a 2.1 Mb duplication followed by a 1.6 Mb deletion, via an intermediate structure containing two complete copies of CHRNA7. It is possible, however, that these two steps might have occurred as a single meiotic event (Supplementary Figure 2). Whether or not these two steps were combined, our proposed scheme predicts the existence of structure IIa. Generating the RP11 structure (IIb) from structure IIa requires inversion of segments HSRZ, probably by non-allelic homologous recombination (NAHR) between the two inverted repeats that flank segments H and S (defining a 320 kb region). (As presented in Figure 1, a 60 kb deletion of segments AZ is also required, although this might have occurred at an earlier step.) An important question, therefore, is whether there is any evidence for structure IIa as well as the known structure IIb, thereby identifying a structural polymorphism with CHRFAM7A in either orientation. Because the putative inversion of segments HSRZ contains only duplicated segments with no junction unique to either orientation (Figure 1), it is not possible to investigate its orientation by examination of small regions, such as by PCR.

Fluorescent labeling and FISH

To determine the orientation of the CHRFAM7A duplicon, three fluorescent probes, located in segments B, S and H (shown on structures IIa and IIb in Figure 1), were used. FISH results were visualized as a string of fluorescent signals at interphase, with two alternative patterns of the three probes adjacent to the CHRFAM7A duplicon and red and green signals only for the distal CHRNA7 duplicon (Figure 2). Because of the large distance between the two duplicons (approximately 1.8 Mb) it was only occasionally possible to observe all five signals together in the same string (Figure 2a and b). Where two copies of CHRFAM7A were present, some nuclei showed two strings of signals, but it was rarely possible to observe more than one informative string per nucleus in the same focal plane, such as in Figure 2a. This result clearly demonstrates that the orientation of CHRFAM7A is polymorphic. A few experiments were also performed with an alternative labeling pattern to control for possible labeling artifacts, but the resulting order of segments was identical (not shown).

Figure 2

Examples of informative interphase chromosomes. (a) A rare example of two informative alleles in the same nucleus: allele 2 (yellow-green-red, CHRFAM7A inverted with respect to CHRNA7 as in RP11 structure) and allele 3 (yellow-red-green (red-green), CHRFAM7A in same orientation as CHRNA7), (b) a rare unambiguous example of all five signals on same string: allele 2 (yellow-green-red-red-green), (c) allele 2, (d–f) allele 3. See Figure 1 for location of probes.

A total of 12 samples were analyzed by careful examination of a minimum of 50 individual interphases per sample (Table 1). We determined the orientation of the CHRFAM7A inversion according to the order of the three signals, assigning allele 2 to yellow-green-red and allele 3 to yellow-red-green. We also determined copy number of CHRFAM7A for each sample, enabling us to identify all three alleles: allele 1 (CHRFAM7A null) or alleles 2 or 3 (the two alternative orientations of CHRFAM7A). Alleles 2 and 3 appear to be of similar frequency and all three alleles were observed in both white Caucasian and black subjects.
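The scoring rule above (yellow-green-red for allele 2, yellow-red-green for allele 3) can be written as a small sketch. This is an illustration only, not the procedure used for Table 1: the function names are hypothetical, and the assumption that a string read from the opposite end should be reversed so that the yellow (segment B) signal comes first is ours.

```python
from collections import Counter

# Signal order next to the CHRFAM7A duplicon -> orientation allele.
ALLELE_BY_ORDER = {
    ("yellow", "green", "red"): 2,  # CHRFAM7A inverted relative to CHRNA7 (RP11-like)
    ("yellow", "red", "green"): 3,  # CHRFAM7A in the same orientation as CHRNA7
}


def call_orientation(signals):
    """Return 2 or 3 for an informative string of signals, else None.

    signals: colour names in the order seen along the string. A string
    that ends in yellow is assumed to have been read from the other end
    and is reversed. Only the first three signals are scored, so a
    five-signal string including the CHRNA7 duplicon is also accepted.
    """
    signals = tuple(signals)
    if signals and signals[-1] == "yellow":
        signals = signals[::-1]
    return ALLELE_BY_ORDER.get(signals[:3])


def tally_sample(interphases):
    """Count orientation calls over the informative interphases of a sample."""
    calls = (call_orientation(s) for s in interphases)
    return Counter(c for c in calls if c is not None)


# Hypothetical observations, for illustration only.
example = [
    ["yellow", "green", "red"],
    ["green", "red", "yellow"],                  # read from the other end
    ["yellow", "green", "red", "red", "green"],  # all five signals
    ["red", "green"],                            # uninformative
]
summary = tally_sample(example)
```

Comparing such a tally with the independently measured CHRFAM7A copy number is what lets a sample with two copies be called homozygous or heterozygous for orientation.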

Designation of each orientation was initially performed blind to CHRFAM7A copy number. For the nine samples with a copy number of 2, we analyzed further interphases to minimize error, although for sample 200 the quality of the preparations did not allow us to interpret more than 14 interphases. For samples 117, 161 and 205, which are apparently heterozygous for orientation but show unequal numbers of the two orientations, we re-examined the interphases showing the less frequent orientation to confirm their validity.

The 2 bp deletion polymorphism is a surrogate marker for the inversion polymorphism

It is possible that due to lack of recombination within the inversion, genetic isolation of this DNA may have occurred. To investigate this, all 12 samples were genotyped for a polymorphism specific to CHRFAM7A, the 2 bp deletion polymorphism within exon 6. We genotyped this polymorphism by a combination of two assays as described previously16 to identify all three alleles: allele 1 (CHRFAM7A null, as above), allele 2 (wt CHRFAM7A), allele 3 (CHRFAM7A with 2 bp deletion). Comparison of the two polymorphisms in chromosomes containing CHRFAM7A (Table 1) revealed strong LD between them (r² = 0.82, CI 0.53–1.00, P = 0.00003), with the 2 bp deletion almost always occurring when CHRFAM7A is in the same orientation as CHRNA7. Only sample 240 prevents perfect LD between the two polymorphisms. In the RP11 individual, where CHRFAM7A is in the opposite orientation to CHRNA7,14 the 2 bp deletion is absent in both relevant BAC clones (RP11-382B18, AC019322; RP11-40J8, AC010799), which is consistent with this result.

Discussion

We have shown here that the CNV at 15q13–q14, which includes part of CHRNA7, contains a duplicon that frequently exists in either orientation and therefore contains a very common polymorphic inversion. We have previously reported that this CNV has a null allele frequency of around 10%.16 Thus, the remaining 90% of chromosomes containing CHRFAM7A are approximately equally divided between the two alternative orientations.

The proximal end of chromosome 15 contains many segmental duplications, which are probably responsible for some of the genomic rearrangements known to occur in this region.21, 22 In a recent paper we presented a comprehensive study of the arrangement of segmental duplications in the individual whose DNA was used to construct the RP11 library, which accounts for the bulk of the sequence information in the public human sequence databases.14 In Figure 1 and the supplementary figures we propose a plausible way by which the only fully known structure for this region might have been generated. We already had evidence for the existence of a chromosome structure without the CHRFAM7A duplicon and adjacent duplicon, which could be represented by structures Ia, Ib and/or Ic in our proposed scheme. We now have evidence for a structure with the CHRFAM7A duplicon in opposite orientation to that found in the RP11 individual, which is consistent with structure IIa. These observations therefore support the existence of a pathway for the evolution of 15q13–q14, similar to that proposed, and for the persistence of some of the proposed ancestral structures in present human populations.

We have found that the CHRFAM7A inversion polymorphism is in strong LD with the 2 bp deletion polymorphism within exon 6 of CHRFAM7A. A similar situation has arisen with the well-studied 900 kb polymorphic inversion at 17q21.31, each orientation of which is strongly associated with one of two haplotypes H1 and H2 that appear to have diverged around 3 Myr ago.23 Other studies have described similarly strong patterns of LD between CNVs and SNPs, where the CNV is ancestral as evident from the same strong LD in different ethnicities.24, 25, 26 Strong LD has also been observed between other structural variants and SNPs.27 These strong LD relationships have suggested that these genomic variants have arisen as single ancestral events on a particular haplotype background rather than from repeated mutational events. Where it occurs with inversions, such strong LD is likely to persist for a very long time, due to lack of recombination within the inversion. The 2 bp deletion is therefore likely to be part of a haplotype associated with the CHRFAM7A inversion and supports the notion of an ancestral relationship between the two alternative orientations.

Both the 2 bp deletion and the orientation of CHRFAM7A may have functional consequences. The most likely effect of the 2 bp deletion would be to disrupt the reading frame to prevent translation of full-length CHRFAM7A protein. However, it is unknown whether CHRFAM7A is translated. Alternatively, CHRFAM7A mRNA may affect CHRNA7 expression by competition for transcription factors. If so, the inversion could affect expression of CHRFAM7A mRNA, modulating CHRNA7 expression. One of the presumed breakpoints of the inversion (indicated above structure IIa, Figure 1) lies around 25 kb upstream of CHRFAM7A, possibly altering the location or effect of regulatory elements. However, for CHRFAM7A gene products to influence expression of CHRNA7, both genes must be expressed in the same cell, but, with different promoter and 5′ regions, this is far from certain.

There may be biological consequences of the orientation of CHRFAM7A that are independent of expression. Some inversions are known to affect the risk of genomic rearrangements in meiosis and therefore affect the next generation. One example is an inversion at 15q11–q13 that is strongly overrepresented in mothers of Angelman syndrome (AS) patients28 and which we recently reanalyzed in light of our improved map of this region of chromosome 15.14 This and other examples at 4p16, 5q35, 7q11.23, 8p23 and Yp11.2 have been recently reviewed.29

When CHRFAM7A and CHRNA7 are in opposite orientation, there is likely to be a small increased risk of inv dup (15) syndrome.30 This syndrome occurs in around 1 in 10 000 live births, with phenotypes including autism and seizures, and arises from a maternal duplication on a supernumerary marker chromosome that includes the critical imprinted region deleted in Prader–Willi and Angelman syndromes (PWS/AS). More than half of cases involve NAHR between inverted repeats in 15q13–q14 (breakpoints BP4 and 5, under structure IIb in Figure 1),31 which in the RP11 structure can partly be accounted for by the inverted relationship between the CHRNA7 and CHRFAM7A duplicons.14 If NAHR occurred between these repeats within the same chromatid, an inversion of the region between BP4 and 5 would occur. Interestingly, a common polymorphic inversion in 15q13.3 has been recently reported.32 Although its exact location was not determined, it is clear from the location of the probes used that it is not the same as the smaller CHRFAM7A inversion within BP4 described here and is consistent with inversion of the region between BP4 and 5. It will be interesting to know the limits of this inversion and its LD relationship with the CHRFAM7A inversion.

When CHRFAM7A and CHRNA7 are in the same orientation, there is likely to be a small increased risk of deletions or duplications of CHRNA7 and flanking regions. Individuals with CHRNA7 deletions are likely to be viable as CHRNA7 null mice are without a major phenotype33 and, in humans, a small proportion of PWS/AS patients have large deletions that extend to CHRNA7.34 Individuals with CHRNA7 duplications would also be expected to be viable as patients with inv dup (15) syndrome often have three copies of CHRNA7 as do largely asymptomatic subjects with the equivalent paternal duplications.31 Our 2 bp deletion data suggest infrequent CHRNA7 deletions (<0.6%), as our assay would have detected such a deletion in patients homozygous for the 2 bp deletion in CHRFAM7A.16 Other copy number studies have indicated a higher frequency of deletions/duplications of CHRNA7 (11%35 and 20%,36 mainly due to deletions), but most of these appear to be much smaller deletions/duplications than BP4–5. However, in a recent study of about 2000 mental retardation cases, nine individuals (six unrelated) were identified with BP4–5 deletions and had phenotypes that also included seizures, abnormal EEG and/or dysmorphic features.32 When around 1000 controls were investigated in the same study, one BP4–5 duplication was detected but no similar deletions. It was suggested that these rare deletions and duplications of BP4–5 might be linked to the large BP4–5 inversion. Whether or not such a link is established, our data strongly suggest a link with the smaller CHRFAM7A inversion involving NAHR between CHRNA7 and CHRFAM7A duplicons when in direct orientation.

Because of the strong LD between them, the 2 bp deletion polymorphism can be used as a surrogate for the CHRFAM7A inversion polymorphism. We can therefore reassess previous studies involving the 2 bp deletion polymorphism. A French study reported an association between the 2 bp deletion and abnormal P50.37 A more recent study reported association between the 2 bp deletion and poor episodic memory, another endophenotype of schizophrenia.38 It is therefore possible that the CHRFAM7A inversion is associated with some endophenotypes of schizophrenia. In our earlier study, we found no association of the 2 bp deletion with schizophrenia, bipolar disorder or psychosis, although we did find an association between heterozygotes for the CHRFAM7A null allele and psychosis.16 The same French group described above also failed to find association of the 2 bp deletion with schizophrenia.37 A Chinese group investigating the 2 bp deletion polymorphism without considering the CHRFAM7A CNV reported no association with schizophrenia39 but found association of the 2 bp deletion with bipolar disorder.40 These differences may be due to greater statistical power required to detect association with genetically more complex phenotypes compared to endophenotypes. Such a pattern was seen with the CHRNA7 region of 15q13–q14, which showed strong linkage to the P50 endophenotype but only weak linkage to schizophrenia within the same study.1

We have shown that, where present, CHRFAM7A exists in either orientation. It therefore appears that at least two versions of this region of 15q13–q14 ancestral to the version presented in the database have persisted at a significant frequency. It remains to be determined whether any of these common genomic variants is involved in psychosis. Such genetic studies have been made feasible by our observation that the CHRFAM7A inversion polymorphism is in strong LD with the 2 bp deletion polymorphism in exon 6 of CHRFAM7A: the 2 bp deletion is more practical than FISH for large-scale genotyping and can therefore be used as a surrogate marker. We plan to investigate association of these variants with P50 and other endophenotypes of schizophrenia in a large sample of psychosis patients and their unaffected relatives.