Introduction

Complement factor I (CFI), or C3b/C4b inactivator, is a regulatory serine protease of the complement cascade and is responsible for cleaving the alpha-chains of C4b and C3b in the presence of the cofactors C4-binding protein and factor H, respectively (Nagasawa and Stroud 1977; Pangburn et al. 1977). The human CFI gene (GenBank GeneID: 3426; accession numbers: NM_000204.2 and NT_000004.10) spans 63 kb on chromosome 4q25 and consists of 13 exons, encoding a 583-amino acid polypeptide as an unprocessed precursor with a signal peptide of 18 amino acids (Catterall et al. 1987; Goldberger et al. 1987; Shiang et al. 1989; Vyse et al. 1994). The protein is synthesized predominantly in the liver. Prior to secretion, the CFI proprotein is cleaved to yield the mature protein, a heterodimeric glycoprotein composed of heavy and light chains (Mr 50,000 and 38,000) linked by disulfide bonds. It circulates in plasma at a concentration of 35 μg/ml (Pangburn et al. 1977; Goldberger et al. 1984). CFI deficiency is associated with a propensity to pyogenic infections and an increased incidence of immune complex diseases due to impaired complement-mediated functions (Vyse et al. 1996; Baracho et al. 2003). Some nonsynonymous mutations result in atypical hemolytic uremic syndrome (aHUS), which is characterized by acute renal failure, microangiopathic hemolytic anemia, and thrombocytopenia (Fremeaux-Bacchi et al. 2004; Kavanagh et al. 2008). The genetic polymorphism of CFI was first discovered by isoelectric focusing of desialylated plasma samples followed by immunoblotting (Nakamura and Abe 1985). The CFI polymorphism is controlled by two major alleles, CFI*A and CFI*B, and a few rare variant alleles (Nakamura et al. 1990). CFI*A was rare in Europeans, but was observed at frequencies of more than 0.10 in East Asians, suggesting that this allele is Asian-specific (Yuasa et al. 1988).
In this study, we elucidated the molecular basis of the CFI polymorphism and investigated the distribution of haplotypes in more than 2,400 people from 20 African and Eurasian populations.

Subjects and methods

Both DNA and serum were obtained from 174 unrelated Japanese individuals living in Tottori. DNA samples extracted from unrelated individuals living in various areas of Eurasia were used for a population study. Most of these samples were from the same set as those in a previous study (Yuasa et al. 2007). This study was approved by the Ethical Committee at the Faculty of Medicine, Tottori University. CFI phenotyping of desialylated serum samples was performed by isoelectric focusing and immunoblotting as reported previously (Yuasa et al. 1988). Mutations were identified by direct sequencing of the products for all 13 exons, obtained by the polymerase chain reaction (PCR) with primers designed by Kavanagh et al. (2005). Nucleotide and amino acid numbering begins from the ATG initiation codon and includes the 18-residue signal peptide. PCR products for exons 4 and 11 were digested with MboII and TaiI (Fermentas, Glen Burnie, MD), respectively, according to the supplier’s instructions. For the population study, the three mutations found in this study and two known single nucleotide polymorphisms (SNPs), rs2298749 (c.804G>A, S268S) and rs11098044 (c.898G>A, A300T), were simultaneously typed by pentaplex PCR based on the amplified product length polymorphism (APLP) method. This method requires three primers for the amplification of DNA fragments at a locus: two allele-specific primers differing in length and one common primer on the opposite DNA strand. Noncomplementary nucleotides were introduced into the primers to give a difference in length between the two PCR products, to enhance the specificity of the primers, and to optimize their annealing temperature (Watanabe et al. 1997; Yuasa et al. 2007). The nucleotide sequence and amount of each primer are shown in the online Table S1. The PCR cocktail consisted of 100 μl of Multiplex PCR Master Mix from a Multiplex PCR Kit (Qiagen, Hilden, Germany), 18 μl of 15 primers with a concentration of 100 pmol/μl, and 82 μl of water.
PCR was performed in a volume of 8 μl containing 7.5 μl of the PCR cocktail and 0.5 μl of a solution containing about 20 ng of genomic DNA. The cycle conditions were 95°C for 15 min, then 30 cycles of 94°C for 10 s, 56°C for 10 s, 72°C for 10 s, and a final extension step of 15 min at 72°C. The products were separated using a polyacrylamide gel (9%T, 5%C), then visualized by ethidium bromide staining. Allele and haplotype frequencies were estimated, and the Hardy-Weinberg equilibrium was tested using Arlequin program version 3.11 (Excoffier et al. 2005). Clustal W and TreeView were used to investigate the phylogenetic relationships of the haplotypes.
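The volumes given for the PCR cocktail and the individual reactions can be tallied with a few lines of arithmetic. The sketch below assumes that the 18 μl of primers is the combined volume of all 15 primers (the amount of each primer is given only in Table S1), so the per-reaction primer figure is a total across primers, not a per-primer value. The function and variable names are illustrative only.

```python
# Volumes (in microlitres) of the PCR cocktail as stated in the text.
MASTER_MIX_UL = 100.0
PRIMER_MIX_UL = 18.0   # ASSUMPTION: combined volume of all 15 primers
WATER_UL = 82.0

COCKTAIL_PER_REACTION_UL = 7.5
DNA_PER_REACTION_UL = 0.5
PRIMER_STOCK_PMOL_PER_UL = 100.0


def cocktail_summary():
    """Return basic quantities implied by the stated volumes."""
    total = MASTER_MIX_UL + PRIMER_MIX_UL + WATER_UL
    # Number of complete 7.5-ul aliquots that one batch of cocktail provides.
    reactions = int(total // COCKTAIL_PER_REACTION_UL)
    # Total primer amount in the batch, summed over all 15 primers.
    total_primer_pmol = PRIMER_MIX_UL * PRIMER_STOCK_PMOL_PER_UL
    primer_per_reaction = total_primer_pmol * COCKTAIL_PER_REACTION_UL / total
    return {
        "cocktail_total_ul": total,
        "reactions_per_batch": reactions,
        "reaction_volume_ul": COCKTAIL_PER_REACTION_UL + DNA_PER_REACTION_UL,
        "primer_pmol_per_reaction": primer_per_reaction,
    }


if __name__ == "__main__":
    for key, value in cocktail_summary().items():
        print(f"{key}: {value}")
```

Under these assumptions, one batch of cocktail is 200 μl, which is enough for 26 reactions of 8 μl each.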

Results and discussion

Figure 1 shows the banding patterns of CFI after isoelectric focusing of the serum samples. Three common phenotypes and one anodal variant were observed in 174 Japanese individuals. Judging from previously reported data (Nakamura et al. 1990), the variant band appeared to be very similar in isoelectric point to the CFI A1 band but to differ from it in intensity, the CFI A1 band being much less intense than the CFI A and B bands. A direct comparison of these two variants was not carried out, because the CFI A1 sample was unavailable. The variant found in this study was tentatively designated CFI Aj. The CFI A phenotype was found in 4 samples, CFI AB in 38 samples, CFI B in 131 samples, and CFI AjB in 1 sample. The allele frequencies for CFI*A, CFI*B, and CFI*Aj were calculated to be 0.1322, 0.8649, and 0.0029, respectively. The observed distribution was in good agreement with the Hardy-Weinberg law (= 0.58). These allele frequencies were consistent with the previous data on western Japan (Yuasa et al. 1988).
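The allele frequencies quoted above follow directly from the phenotype counts by gene counting. The short sketch below reproduces them and also lists the genotype counts expected under Hardy-Weinberg proportions. It is an illustration only: the study itself used Arlequin for the Hardy-Weinberg test, and no test statistic is computed here. The function names are hypothetical.

```python
from collections import Counter

# Phenotype (genotype) counts reported for the 174 Japanese individuals.
PHENOTYPE_COUNTS = {
    ("A", "A"): 4,
    ("A", "B"): 38,
    ("B", "B"): 131,
    ("Aj", "B"): 1,
}


def allele_frequencies(counts):
    """Estimate allele frequencies by direct gene counting."""
    alleles = Counter()
    for (first, second), n in counts.items():
        alleles[first] += n
        alleles[second] += n
    total = sum(alleles.values())  # two alleles per individual
    return {allele: c / total for allele, c in alleles.items()}


def expected_genotype_counts(freqs, n_individuals):
    """Genotype counts expected under Hardy-Weinberg proportions."""
    names = sorted(freqs)
    expected = {}
    for i, a in enumerate(names):
        for b in names[i:]:
            p = freqs[a] * freqs[b]
            # Heterozygotes can arise in two ways, hence the factor of 2.
            expected[(a, b)] = n_individuals * (p if a == b else 2 * p)
    return expected


if __name__ == "__main__":
    n = sum(PHENOTYPE_COUNTS.values())
    freqs = allele_frequencies(PHENOTYPE_COUNTS)
    for allele, f in sorted(freqs.items()):
        print(f"CFI*{allele}: {f:.4f}")
    for genotype, e in expected_genotype_counts(freqs, n).items():
        print(genotype, f"{e:.2f}")
```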

Fig. 1

Banding patterns of desialylated complement factor I (CFI) obtained by isoelectric focusing and immunoblotting. Anode at top. Lanes 1, 2, 4, 7 CFI B phenotype, 3, 8 CFI AB, 5 CFI A, 6 CFI AjB

To elucidate the molecular basis of CFI*A, we sequenced all 13 exons in individuals with the CFI AB phenotype. CFI*A arose from a G-to-A transition at nucleotide position 1217 in exon 11 (Fig. 2b), resulting in the substitution of histidine (CAT) for arginine (CGT) at amino acid position 406. This substitution was previously found in two heterozygotes among 100 healthy French control samples in a study of aHUS and was shown not to be responsible for aHUS (Fremeaux-Bacchi et al. 2004). The frequency in those controls was comparable to that (0.006) in a French population (Montpellier) obtained by an isoelectric focusing study (Yuasa et al. 1988).

Fig. 2

Automated DNA sequence electropherograms of CFI gene mutations. Sequence analysis demonstrating the presence of the mutation in exon 4 from an individual with CFI AB phenotype (a), exon 11 from another individual with CFI AB phenotype (b), and exon 12 from an individual with CFI AjB phenotype (c)

This G-to-A transition abolished a TaiI restriction site. In some samples, however, the results of the restriction fragment length polymorphism (RFLP) analysis were inconsistent with those of isoelectric focusing. The frequency of the allele carrying this transition, designated CFI*Ah, was estimated to be 0.1034. When the CFI AB samples that had not lost the restriction site were sequenced, an additional substitution was found: an A-to-C transversion at nucleotide position 603 in exon 4 (Fig. 2a), leading to a change from arginine (AGA) to serine (AGC) at amino acid position 201. This transversion abolished an MboII restriction site. RFLP analysis for this site was consistent with the data from isoelectric focusing. The frequency of this additional new allele, designated CFI*As, was calculated to be 0.0316. Thus, CFI*A was divided into two suballeles, CFI*Ah and CFI*As, which could not be distinguished by isoelectric focusing. Desialylated CFI bands have isoelectric points near 7. In general, the resolving power of isoelectric focusing is diminished in the higher pH range.

Sequencing of the CFI AjB sample revealed, in addition to c.1217G>A, a G-to-T transversion at nucleotide position 1505 in exon 12 (Fig. 2c), resulting in the replacement of arginine (CGT) by leucine (CTT) at amino acid position 502. The isoelectric point of the CFI Aj band indicated that these two mutations were located on the same chromosome. We have thus identified three mutations: one in exon 4, in the scavenger receptor cysteine-rich (SRCR) domain, and the others in exons 11 and 12, in the serine protease (SP) domain (Vyse et al. 1994; Kavanagh et al. 2005). The three mutations identified here were not registered in the Entrez SNP database.
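Because numbering starts at the ATG initiation codon, the codon number and the position within the codon follow from the nucleotide position alone. The sketch below checks that the three reported nucleotide changes are consistent with the stated codons and amino acid positions. Only the five codons quoted in the text are included in the lookup table, and the function name is illustrative.

```python
# Only the codons quoted in the text are needed for this check.
CODON_TO_AA = {"CGT": "R", "CAT": "H", "AGA": "R", "AGC": "S", "CTT": "L"}
PURINES = {"A", "G"}

# (cDNA position, reference base, variant base, reference codon, residue)
MUTATIONS = [
    (603, "A", "C", "AGA", 201),
    (1217, "G", "A", "CGT", 406),
    (1505, "G", "T", "CGT", 502),
]


def describe(pos, ref, alt, codon, residue):
    """Return (cDNA change, protein change, substitution type)."""
    codon_number = (pos - 1) // 3 + 1   # 1-based codon index from the ATG
    offset = (pos - 1) % 3              # 0-based position within the codon
    if codon_number != residue or codon[offset] != ref:
        raise ValueError(f"c.{pos} is inconsistent with codon {residue}")
    mutant = codon[:offset] + alt + codon[offset + 1:]
    # A transition keeps the base class (purine or pyrimidine); a
    # transversion switches it.
    same_class = (ref in PURINES) == (alt in PURINES)
    kind = "transition" if same_class else "transversion"
    protein = f"{CODON_TO_AA[codon]}{residue}{CODON_TO_AA[mutant]}"
    return f"c.{pos}{ref}>{alt}", protein, kind


if __name__ == "__main__":
    for mutation in MUTATIONS:
        print(*describe(*mutation))
```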

Figure 3 shows the band patterns of the products obtained by pentaplex PCR. The nucleotide substitutions were clearly and unambiguously detected as PCR products of different sizes. The sizes ranged from 60 bp for c.603C to 100 bp for c.898A.

Fig. 3

Simultaneous genotyping of the five SNPs by pentaplex PCR based on the APLP method. Lanes M 25-bp ladder, 1 haplotypes H1/H4, 2 H1/H6, 3 H1/H3, 4 H3/H5, 5 H5, 6 H2/H3, 7 H2, 8 H1, 9 H1/H2. Six haplotypes, H1–H6 estimated from the five SNPs are shown in Fig. 4

Haplotypes were estimated from the phase-unknown genotype data of 2,471 individuals by the EM algorithm. Six haplotypes were observed (Fig. 4), indicating that no recombination had occurred. This region forms a haploblock according to the data of the International HapMap Project. The distribution of haplotypes is shown in Table 1. Two main haplotypes, H1 and H2, distinguished by c.804G>A, were observed in every population. The frequencies of this mutation in Africans, Europeans, Chinese, and Japanese were similar to those in the four HapMap populations. Haplotype H1 was deduced to be ancestral, because the chimpanzee has the same haplotype (GeneID: 471271; accession number: XM_526653). The other haplotypes were derived from these two haplotypes.
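The idea behind EM-based haplotype estimation from phase-unknown genotypes can be sketched in a few lines. The code below is a minimal generic illustration for biallelic SNPs, not Arlequin's implementation. Genotypes are coded as the number of variant alleles (0, 1, or 2) at each site, and all names are hypothetical. In the E-step each genotype is distributed over its compatible haplotype pairs in proportion to the current haplotype frequencies; in the M-step the frequencies are re-estimated from those fractional counts.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """All unordered haplotype pairs consistent with a genotype."""
    het = [i for i, g in enumerate(genotype) if g == 1]
    # Homozygous sites are fixed: 0 -> allele 0, 2 -> allele 1.
    base = [g // 2 for g in genotype]
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = base[:], base[:]
        for site, bit in zip(het, bits):
            h1[site], h2[site] = bit, 1 - bit
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotypes(genotypes, n_iter=200):
    """Estimate haplotype frequencies by expectation-maximization."""
    pair_lists = [compatible_pairs(g) for g in genotypes]
    haps = sorted({h for pairs in pair_lists for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting point
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in pair_lists:
            # E-step: posterior weight of each phase resolution.
            weights = [freq[a] * freq[b] for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: update frequencies from the expected counts.
        freq = {h: counts[h] / n_chrom for h in haps}
    return freq


if __name__ == "__main__":
    # Toy data (not from the study): two sites, ten individuals.
    toy = [(0, 0)] * 6 + [(2, 2)] * 2 + [(1, 1)] * 2
    for hap, f in em_haplotypes(toy).items():
        print(hap, round(f, 4))
```

In the toy data the double heterozygotes are the only ambiguous individuals, and the estimates settle on the two haplotypes that are already supported by the homozygotes. The same logic explains why a small number of haplotypes can account for five SNPs when there is no evidence of recombination.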

Fig. 4

Five SNPs in the CFI gene and phylogenetic relationships of estimated haplotypes. a Gene map and five SNPs in the CFI gene on chromosome 4q25. Coding exons are marked by blue blocks and 5′- and 3′-UTR by white blocks. The first base of transcription start site is denoted as +1. b Phylogenetic relationships of six haplotypes (c) investigated using Clustal W and TreeView programs. c Six haplotypes estimated from the five SNPs using Arlequin program

Table 1 Distribution of CFI haplotypes in various populations

Haplotype H4 was characteristic of the African population including Nigerians and Ghanaians. The CFI*300T frequency was very similar to that in Yoruba of the HapMap data (0.15).

Haplotype H3, which is characterized by CFI*As, was restricted to Japanese and Koreans. It reached polymorphic frequencies only in Japanese from the main island of Japan, whereas it was rare in Okinawa and Korea. Koreans and Japanese from the main island of Japan and Okinawa have high genetic affinity with each other (Tokunaga et al. 1996; Omoto and Saitou 1997). They share common alleles, e.g., haplogroups O-SRY465 and O-47z on the Y chromosome (Hammer et al. 2006). HLA haplotype B44-BFF-C4A3-C4B1-DR13 is shared by Koreans and main island Japanese, but is low or rare in Okinawa and other parts of East Asia (Tokunaga et al. 1996). Y-haplogroups C-M8 and D-M55* and mtDNA-haplogroup M7a are observed at high frequencies in Japanese, but at rare or lower frequencies in Koreans (Hammer et al. 2006; Tanaka et al. 2004). These distributions could be explained by a dual structure model (Hanihara 1991), in which modern Japanese are the result of an intermixture between the upper Paleolithic native population of Japan (Jomon people) and migrants from northeast Asia who arrived through the Korean peninsula during the Neolithic Yayoi period (300 BC-300 AD). Yayoi people made less contribution to Japanese in Okinawa. However, haplotype H3 is found at polymorphic frequencies only in the main island Japanese. Haplotype H3 must therefore have arisen in the main island of Japan rather than being a remnant of the Jomon people. The fusion gene (se fus) at the ABO-secretor locus (FUT2) was found in the Japanese populations with a relatively high frequency of 0.057, but in the Korean population at a rare frequency of 0.006 (Liu et al. 1999). AHSG*5 in the alpha2-HS-glycoprotein gene is polymorphic in Okinawa with a frequency of 0.026, whereas it is quite rare or absent in other populations (Yuasa et al. 1985; Tamaki 1998). These two variant alleles are characteristic of the Japanese in the main island of Japan and in Okinawa. Haplotype H3 (CFI*As) is also specific for the main island Japanese.
This restricted distribution suggests that, in populations other than those of Far-East Asia, all CFI*A alleles detected by isoelectric focusing correspond to CFI*Ah.

Haplotype H5, characterized by CFI*Ah, prevails mainly in East Asian populations. The highest frequency was observed in Han Chinese from Changsha and the second highest in Thais. An isoelectric focusing study showed that Chengdu had the highest frequency of CFI*A (0.153) among the data obtained to date (Zhang et al. 1999). This haplotype spread into Japan and Korea, where it occurs at fairly high frequencies. According to an HLA study (Tokunaga et al. 1996), there are multiple migration and dispersal routes in East Asia. This haplotype may have dispersed to Japan like HLA haplotypes B46-DR9 and B54-DR4. Khalha and Buryat in Mongolia showed low frequencies among East Asian populations. The frequencies of haplotype H5 were higher in the southern part of East Asia than in the northern part. There was a significant correlation between allele frequency and latitude in 15 East Asian populations (r = −0.872, P < 0.001), showing a geographical gradient that declines from south to north. The Indian population investigated here belongs to the Gujars, who are an important population of northern India. The frequency in the Gujars was similar to that (0.058) in the Newars from Nepal (Yuasa et al. 1988). It is noteworthy that haplotype H5 occurred at fairly high frequencies in the northern part of South Asia. The distribution of haplotype H5 may reflect the migration of ancient southeastern Asians. Haplotype H6 was observed in only one Japanese individual.
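The significance level reported for the latitude correlation can be checked from r and the number of populations alone. The sketch below assumes that r is a Pearson correlation coefficient tested with the usual t statistic on n − 2 degrees of freedom; the text does not state which test was used. The critical value is taken from standard Student's t tables.

```python
import math

R = -0.872           # correlation coefficient reported in the text
N_POPULATIONS = 15   # number of East Asian populations

# Two-sided critical value of Student's t for alpha = 0.001 and 13 degrees
# of freedom, taken from standard tables.
T_CRIT_0001_DF13 = 4.221


def t_statistic(r, n):
    """t statistic for testing a Pearson correlation against zero."""
    # ASSUMPTION: Pearson's r with n - 2 degrees of freedom.
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


if __name__ == "__main__":
    t = t_statistic(R, N_POPULATIONS)
    print(f"t = {t:.2f} on {N_POPULATIONS - 2} degrees of freedom")
    print("P < 0.001:", abs(t) > T_CRIT_0001_DF13)
```

Under this assumption the magnitude of t is about 6.4, well beyond the tabulated critical value, which agrees with the reported P < 0.001.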

In conclusion, this study has presented molecular evidence of the CFI polymorphism. Haplotypes characterized by CFI*As and CFI*Ah are unique in their distribution. It would be interesting to examine whether the frequency of haplotype H3 differs among Japanese living in various areas, including Kyushu and the Nansei Islands. East Asians are divided into northern and southern populations (Tokunaga et al. 1996; Xue et al. 2006). Haplotype H5 is characteristic of the southern populations. However, this haplotype was also detected in South Asians. There may be some populations with higher frequencies in Tibet and the western part of China.