Introduction

Classical osteogenesis imperfecta (OI), or “brittle bone disease,” is a connective tissue disorder characterized by susceptibility to bone fractures, blue sclerae, and growth deficiency.1 Most (85%) individuals with OI have autosomal dominant mutations in COL1A1 or COL1A2 (OMIM nos. 166200, 166210, 166220, and 259420) that alter the structure or synthesis of type I collagen.2 Collagen mutations cause a range of OI phenotypes, from mild to perinatally lethal, described by the Sillence Classification.3 Autosomal dominant OI has an incidence of ~1/15,000–20,000 births, with >90% of cases resulting from de novo mutations.3,4,5 Mutation “hotspots” in COL1A1 and COL1A2 are associated with independent recurrences of mutations, rather than founder mutations.2

Type I collagen is a heterotrimer consisting of two α1(I) and one α2(I) chains that is subject to posttranslational modification during chain synthesis and folding by prolyl 4-hydroxylase and lysyl hydroxylase, with subsequent glycosylation of some hydroxylysine residues.6 An additional posttranslational modification system exists for types I, II, and V collagen, which results in 3-hydroxylation of the α1(I) and α1(II) Pro986 residues.7 Partial modification of the α1(II) Pro944 and α2(I) Pro707 residues has also been reported.8 These modifications occur in the endoplasmic reticulum by means of the collagen prolyl 3-hydroxylation complex, consisting of prolyl 3-hydroxylase 1 (encoded by LEPRE1), cartilage-associated protein (CRTAP), and cyclophilin B (encoded by PPIB). Deficiency of proteins comprising the collagen prolyl 3-hydroxylation complex causes autosomal recessive OI. Null mutations in CRTAP (type VII OI, OMIM no. 610682), LEPRE1 (type VIII OI, OMIM no. 610915), and PPIB (type IX OI, OMIM no. 259440) cause a moderate to lethal osteochondrodystrophy that overlaps phenotypically with dominant types II, III, and IV OI but has several distinctive clinical features,9,10,11,12,13,14 including white sclerae, rhizomelia, and extreme bone undermineralization. Surviving children with type VIII OI also present with extreme growth deficiency and bulbous metaphyses.13,14 Recessive OI caused by deficiency of components of the collagen 3-hydroxylation complex accounts for about 5–7% of severe cases of OI in North America.9,15

Among the first cases of type VIII OI reported, we identified homozygosity for the same mutation within several African American and West African émigré families in North America.13 This mutation accounts for about one-third (21/62) of all mutant alleles currently reported in type VIII OI.16,17 The mutation (LEPRE1 c.1080+1G>T) results in multiple alternatively spliced transcripts, each containing a premature termination codon (PTC). Homozygosity for this mutation is perinatal lethal, whereas compound heterozygosity with a different LEPRE1 mutation is compatible with survival into the second decade of life, and heterozygous carriers are clinically unaffected.13 We hypothesized that the LEPRE1 c.1080+1G>T allele arose in West Africa prior to the African diaspora and was introduced to the Americas during the Atlantic slave trade. We screened contemporary African American and African cohorts to determine the prevalence of carriers and illuminate the history of this mutation. Haplotype analysis of mutant alleles demonstrated one shared core haplotype surrounding the LEPRE1 gene on chromosome 1. Our findings are consistent with a founder mutation in West Africa more than 650 years ago that was transported to the Americas during the transatlantic slave trade.

Materials and Methods

Cohort selection and acquisition

African American cohorts from Pennsylvania (NeoGen)18 and Maryland (Maryland Department of Health and Mental Hygiene) comprised randomly chosen, anonymized newborn metabolic screening cards with parental racial designation. Because samples were anonymized, it was not possible to exclude twins or multiple samples within a family. Leukocyte genomic DNA from African Americans and contemporary Africans enrolled in the African American Diabetes Mellitus Study19 and the Howard University Family Study of Hypertension20 was obtained from the District of Columbia and the West African countries of Ghana and Nigeria. Related individuals in the African American Diabetes Mellitus Study and the Howard University Family Study of Hypertension were excluded by pedigree analysis. Genomic DNA from individuals originating from the Central African countries of Cameroon, Chad, Central African Republic, and Congo was collected in Cameroon for an African genetic diversity study.21 Senegalese samples were collected from apparently healthy individuals for a prostate cancer susceptibility study.22 All samples were collected through institutional review board–approved protocols.

Sample screening

DNA was extracted from newborn metabolic screening cards and screened for the LEPRE1 mutation by PCR and restriction enzyme digestion with BslI (New England Biolabs, Ipswich, MA); the resulting restriction products were analyzed by 10% polyacrylamide gel electrophoresis. In brief, a 6-mm punch from each card was rehydrated in Tris-EDTA buffer and then treated with 5% Chelex-100 resin (BioRad, Hercules, CA) for 25 min at 55 °C to remove heme groups. Samples were centrifuged to pellet the resin, and the supernatant was used directly for PCR as follows: 50-µl reactions contained 15 µl of dried blood extract, 1% Perfect Match (Stratagene, La Jolla, CA), 3% dimethyl sulfoxide, 15 pmol LEPRE1 I4 sense (5′-GGCCATCATGTTAAGTAGCAGGCACCAGCT-3′) and LEPRE1 I5 antisense (5′-CTCCCTGTGCTCCCTTCTCCTCTGAATAAC-3′) primers with 1.0 U High Fidelity Platinum Taq polymerase (Invitrogen, Carlsbad, CA). After an initial 5-min denaturation at 94 °C, reactions were incubated for 40 cycles of 94 °C for 1 min, 63 °C for 1 min, and 72 °C for 1 min, followed by 7 min at 72 °C.

Genomic DNA was isolated from leukocytes (District of Columbia, Ghana, Nigeria, Cameroon, Chad, Central African Republic, and Congo) or buccal swabs (Senegal), and whole-genome amplification (WGA) was performed using the Illustra GenomiPhi DNA Amplification kit (GE Healthcare, Piscataway, NJ). Amplification products were quantitated with PicoGreen (Invitrogen, Carlsbad, CA) using a standard curve generated from normal control genomic DNA. For each sample, 20 ng of whole-genome amplified DNA was screened using a custom TaqMan genomic single-nucleotide polymorphism assay (Applied Biosystems, Foster City, CA) to detect the mutation with probes specific for each allele. A 91-bp PCR product corresponding to nucleotides g.9130–9221 of the LEPRE1 gene was amplified under standard conditions on an ABI 7000 Sequence Detection System using forward (LEPRE1-AFR_F, 5′-TTGGCCTATTATGCAGCTATGCTT-3′) and reverse (LEPRE1-AFR_R, 5′-GGCAGCTGTCATAACAGAAGGAA-3′) primers at a concentration of 0.9 µmol/l. PCR products were detected by including 0.2 µmol/l minor groove-binding (MGB) probes specific for the normal “G” allele (LEPRE1-AFR_V, 5′-VIC-CCCGTGAGGTGAGAGA-3′) or the mutant “T” allele (LEPRE1-AFR_M, 5′-FAM-CCCGTGAGTTGAGAGA-3′), and were compared with normal (G/G), heterozygous (G/T), and homozygous (T/T) control samples. The TaqMan allelic discrimination assay was used for automated allele calling of the measured fluorescence. Samples designated as carriers by this assay were confirmed by independent PCR and restriction analysis, as described above.
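The allele-calling step can be illustrated with a minimal sketch. It is not the algorithm used by the instrument software. It assumes that each sample has been reduced to two normalized fluorescence values (VIC for the normal “G” allele, FAM for the mutant “T” allele) and classifies a sample by the angle of its point in the VIC/FAM plane. The function name, the signal floor, and the angle thresholds are hypothetical values chosen only for illustration.

```python
import math


def call_genotype(vic, fam, min_signal=0.2, het_low_deg=30.0, het_high_deg=60.0):
    """Classify one sample from its two probe signals.

    vic: normalized signal of the probe for the normal "G" allele.
    fam: normalized signal of the probe for the mutant "T" allele.
    All thresholds are illustrative assumptions, not instrument settings.
    """
    # Samples with little total signal (e.g., no-template controls) are not called.
    if math.hypot(vic, fam) < min_signal:
        return "no call"
    # Angle from the VIC axis: near 0 degrees is mostly "G", near 90 is mostly "T".
    angle = math.degrees(math.atan2(fam, vic))
    if angle < het_low_deg:
        return "G/G"
    if angle > het_high_deg:
        return "T/T"
    return "G/T"


if __name__ == "__main__":
    # Hypothetical signal pairs, not measured data.
    for vic, fam in [(1.0, 0.05), (0.8, 0.8), (0.05, 1.0), (0.05, 0.05)]:
        print(vic, fam, call_genotype(vic, fam))
```

In practice the cluster boundaries would be set from the G/G, G/T, and T/T control samples run on the same plate rather than fixed in advance.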

Haplotype analysis

To confirm the origin and determine the age of the mutation, the size of the conserved haplotype on the mutant allele was determined using microsatellites and short tandem repeats (STRs). Fourteen markers covering a 4.2-Mb region surrounding the LEPRE1 gene on chromosome 1p were chosen using the UCSC Genome Browser (http://genome.ucsc.edu). PCR amplification of the microsatellite markers was performed using 200 ng of genomic DNA or whole-genome amplified DNA in 25-µl reactions using the HotStar PCR amplification System (Qiagen, Germantown, MD). Amplification was performed using 15 pmol of fluorescently labeled primers and MgCl2 concentrations as described (Supplementary Table S1 online). After an initial 15-min denaturation at 95 °C, reactions were incubated for 35 cycles of 94 °C for 30 s, a primer-specific annealing temperature of 52–57 °C (see TA in Supplementary Table S1 online) for 30 s, and 72 °C for 30 s, followed by 72 °C for 10 min. Amplification products were separated by capillary electrophoresis on an ABI 3730 Automated Sequencer and analyzed using Genemapper software (Applied Biosystems). Allele calls were binned to confirm the expected distribution of fragment sizes.

Calculation of the mutation age in generations was performed with the method of Rannala and Slatkin23,24 using the following formula:

$$t = \frac{1}{\Theta}\,\ln\left[\frac{n\,(1 - p_0)}{Y_0 - n\,p_0}\right]$$

where (t) is the mutation age in generations, (n) is the number of mutation-bearing chromosomes sampled, (Y0) is the number of chromosomes carrying both the mutation and the ancestral marker allele, Θ is the recombination fraction between the marker and the mutation, and (p0) is the frequency of the ancestral marker allele in the normal chromosomes (non-mutation-carrying chromosomes). We assumed a 20-year time period per generation.
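The calculation can be sketched in a few lines, assuming the formula takes the linkage-disequilibrium decay form t = (1/Θ) ln[n(1 − p0)/(Y0 − n·p0)], which is consistent with the variable definitions above. The function names are hypothetical, and the example inputs are invented for illustration; they are not marker data from this study.

```python
import math

YEARS_PER_GENERATION = 20  # generation time assumed in the text


def mutation_age_generations(n, y0, theta, p0):
    """Age of a mutation in generations from one linked marker.

    n:     number of mutation-bearing chromosomes sampled
    y0:    number of those chromosomes carrying the ancestral marker allele
    theta: recombination fraction between marker and mutation
    p0:    frequency of the ancestral marker allele on normal chromosomes
    Assumed form: t = (1/theta) * ln[n(1 - p0) / (y0 - n*p0)].
    """
    # Excess of the ancestral allele on mutant chromosomes over its background frequency.
    delta = (y0 / n - p0) / (1.0 - p0)
    if not 0.0 < delta <= 1.0:
        raise ValueError("ancestral allele is not enriched on mutant chromosomes")
    return -math.log(delta) / theta


def mutation_age_years(n, y0, theta, p0):
    """Convert the age in generations to years."""
    return mutation_age_generations(n, y0, theta, p0) * YEARS_PER_GENERATION


if __name__ == "__main__":
    # Invented example: 16 of 20 mutant chromosomes keep the ancestral allele.
    print(mutation_age_years(n=20, y0=16, theta=0.005, p0=0.1))
```

A marker at which every mutant chromosome still carries the ancestral allele gives an age of zero, so only markers with at least one historical recombinant are informative for dating.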

Results

Identification of recurring mutation in African Americans

To determine the carrier frequency of the LEPRE1 c.1080+1G>T mutation among African Americans, we screened genomic DNA from African Americans from the Mid-Atlantic United States (Table 1). DNA from 1,429/1,594 (89.6%), 631/757 (83.4%), and 995/1,002 (99.3%) individuals from Pennsylvania, Maryland, and the District of Columbia, respectively, could be genotyped (Figure 1a). Twelve carriers were identified among the 3,055 genotyped African Americans and confirmed by BslI digestion of independent PCR products (Figure 1b). Heterozygous carriers (Table 1) were detected in 5 of 1,429 (0.35%) samples from Pennsylvania, 2 of 631 (0.32%) samples from Maryland, and 5 of 995 (0.50%) samples from the District of Columbia, supporting a 0.39 ± 0.11% (12/3,055) overall carrier frequency for the LEPRE1 mutation among Mid-Atlantic African Americans. These data predict that cases of type VIII OI due to homozygosity for the c.1080+1G>T mutation will occur in 1 in 160,000–400,000 African American births.
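The arithmetic behind these figures can be reproduced with a short sketch. It assumes that the ± value is the binomial standard error of the pooled proportion and that the predicted incidence follows from Hardy-Weinberg proportions with the allele frequency taken as half the carrier frequency; both are assumptions made here, and the function names are hypothetical.

```python
import math


def carrier_frequency(carriers, genotyped):
    """Return the carrier proportion and its binomial standard error."""
    p = carriers / genotyped
    se = math.sqrt(p * (1.0 - p) / genotyped)
    return p, se


def births_per_homozygote(carrier_freq):
    """Expected number of births per homozygous birth.

    For a rare allele the allele frequency q is about half the carrier
    frequency, and homozygotes occur at q**2 under Hardy-Weinberg proportions.
    """
    q = carrier_freq / 2.0
    return 1.0 / q**2


if __name__ == "__main__":
    p, se = carrier_frequency(12, 3055)  # pooled Mid-Atlantic cohort
    print(f"pooled carrier frequency: {100 * p:.2f}% +/- {100 * se:.2f}%")
    # Highest and lowest site-specific carrier frequencies reported in the text.
    for site, carriers, genotyped in [("District of Columbia", 5, 995), ("Maryland", 2, 631)]:
        freq, _ = carrier_frequency(carriers, genotyped)
        print(f"{site}: about 1 in {births_per_homozygote(freq):,.0f} births")
```

Under these assumptions the two site-specific extremes bracket the predicted incidence range, and the pooled frequency gives an intermediate value.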

Table 1 LEPRE1 c.1080+1G>T carrier frequency in African Americans
Figure 1

Screening of genomic DNA for the LEPRE1 c.1080+1G>T mutation. (a) Scatter plot of the results from a representative genotyping of whole-genome amplified DNA using a custom genomic single-nucleotide polymorphism assay. Amplification of the normal “G” allele (VIC-labeled probe) is plotted along the x-axis and amplification of the mutant “T” allele (FAM-labeled probe) is plotted along the y-axis. Individual results are plotted as x’s. NTC, no template control, squares; TT, homozygous mutant genomic DNA control sample, diamonds; G/T, heterozygous genomic DNA control sample, triangles; GG, homozygous normal DNA control sample, circles. (b) Genotyping of genomic DNA extracted from newborn metabolic screening cards by polyacrylamide gel electrophoresis analysis of BslI-digested PCR products. The mutation eliminates a BslI site and results in the presence of a 112-bp fragment (lane 6) after restriction digestion. STD, 25-bp DNA size standard; lanes 1–8, reactions from eight anonymized samples. FAM, 6-carboxyfluorescein; VIC, 4,7,2’-trichloro-7’-phenyl-6-carboxyfluorescein.

Determination of West African origin and frequency of LEPRE1 c.1080+1G>T mutation

We previously noted that carriers of the c.1080+1G>T mutation were African Americans or African immigrants of West African origin.13 Therefore, we investigated the prevalence of carriers among contemporary West Africans. Using a genomic single-nucleotide polymorphism (SNP) assay, 1,665/1,690 (98.5%) whole-genome amplified DNA samples representing seven groups related by language from Ghana and Nigeria were genotyped (Table 2). Within the cohort, 381 samples were eliminated because they were from related individuals according to pedigree analysis. We identified 9 carriers among 453 samples (1.99%) from Ghana, including 2/142 (1.4%) Ga, 3/280 (1.1%) Akan, and 1/28 (3.6%) Ewe, as well as among the Gonja and Hausa ethnic groups, for which a frequency cannot be meaningfully calculated because of the small number of samples. The 818 Nigerian samples yielded 10 carriers (1.22%), including 7/635 (1.10%) Yoruba and 3/183 (1.64%) Ibo samples. Combined with 13 samples not affiliated with a specific West African ethnic group, our screening identified 19/1,284 (1.48%) carriers of the LEPRE1 mutation among contemporary West Africans. The high carrier frequency predicts that this mutation alone will cause recessive type VIII OI in ~1/18,260 births in West Africa, equivalent to the incidence of de novo dominant OI.

Table 2 LEPRE1 c.1080+1G>T carrier frequency in contemporary Africans

Pan-African screening

To determine whether the LEPRE1 mutation was localized specifically to the Ghana/Nigeria region of West Africa or whether it was Pan-African, we screened DNA samples for the mutation in individuals from regions east, west, and south of Ghana and Nigeria (Table 2). Among the 1,647 genotyped African samples, we did not identify a single carrier of the mutation among 561 samples from Senegal, 864 samples from Cameroon, 95 samples from Chad, 69 samples from Central African Republic, 34 samples from Congo, or 24 samples collected in Cameroon but of unspecified origin.
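A result of zero carriers still leaves room for a low carrier frequency, and the size of that margin depends on the sample size. The sketch below is not part of the reported analysis. It assumes independent sampling and uses the exact one-sided binomial bound for zero observed events, 1 − (1 − confidence)^(1/n); the function name is hypothetical.

```python
def upper_bound_zero_observed(n, confidence=0.95):
    """Largest carrier frequency still compatible with 0 carriers in n samples.

    Solves (1 - p)**n = 1 - confidence for p (one-sided exact binomial bound).
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


if __name__ == "__main__":
    west_african_frequency = 19 / 1284  # pooled Ghana/Nigeria carrier frequency
    cohorts = {
        "all non-Ghana/Nigeria samples": 1647,
        "Senegal": 561,
        "Cameroon": 864,
        "Chad": 95,
        "Central African Republic": 69,
        "Congo": 34,
    }
    for name, n in cohorts.items():
        bound = upper_bound_zero_observed(n)
        excluded = bound < west_african_frequency
        print(f"{name}: upper bound {100 * bound:.2f}%, "
              f"excludes Ghana/Nigeria frequency: {excluded}")
```

Under these assumptions the pooled sample and the two largest cohorts are large enough to exclude a carrier frequency as high as that found in Ghana and Nigeria, whereas the smallest cohorts taken alone are not.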

Haplotype analysis and estimated age of LEPRE1 c.1080+1G>T mutation

We next determined the haplotype surrounding this mutation in African American and West African alleles from 15 families (Supplementary Figure S1 online). In the 12 unrelated West African families with 16 independent mutant alleles, we found a conserved haplotype of ~63–425 kb surrounding LEPRE1 (Table 3), extending from between D1S2861 and the LEPRE1 gene to the region between markers STR3 and STR5. In one African family (080), the mutation chromosome could not be determined with certainty for the three markers immediately surrounding LEPRE1, although the genotype of the mutant allele carrier in this family is compatible with the ancestral haplotype. Because of this uncertainty, this individual was excluded from calculations. Two (nos. 3F4 and 3G4) of 25 independent normal chromosomes in these African families carried the conserved haplotype, suggesting that the mutation occurred on an ancestral haplotype that exists with an undetermined, but possibly nonnegligible, frequency among West Africans.

Table 3 LEPRE1 c.1080+1G>T allele haplotypes in West Africans

For the three African American families with four independent mutant alleles, the conserved region surrounding LEPRE1 was ~770 kb–1.2 Mb (Table 4) and extended from between markers D1S2645 and D1S463 to the region between markers STR3 and STR5. Furthermore, the conserved region around African American mutant alleles corresponds to the conserved region from West African families 026, 080, and 223. It is noteworthy that the full haplotype was not found in nine African Americans unrelated to our study families who are not carriers of the mutation, although one of these individuals (African American cohort, no. 30) shared the ancestral haplotype markers around LEPRE1 from D1S2861 to STR5.

Table 4 LEPRE1 c.1080+1G>T allele haplotypes in African Americans

The age of the mutation was estimated using linked markers on chromosome 1 haplotypes from carrier families. Based on marker assignments for all individuals in the 15 combined African and African American families, we estimate the mutation age between 416 and 1,066 years at markers STR3 and D1S2861, respectively. However, calculations using independent mutant chromosomes from the combined African and African American families narrowed the mutation age range to 458–548 years. The number of independent African American carriers was insufficient for a separate calculation.

When analysis was limited to our 12 West African triads, calculations including all family members yielded an estimated age of 622–2,340 years. The mutation age range was further refined to 648–894 years using only independent mutant chromosomes at markers STR3 and D1S2861. Thus, the West African haplotypes date this mutation to the 12th–14th century, prior to the transatlantic slave trade, which occurred from the 16th through 19th centuries.

Discussion

We have identified a founder mutation in the gene encoding prolyl 3-hydroxylase 1 that causes autosomal recessive type VIII OI, a severe to lethal bone dysplasia. The LEPRE1 c.1080+1G>T mutation originated in West Africa, where nearly 1.5% of contemporary Ghanaians and Nigerians are carriers. The mutant allele was presumptively brought to North America via the Atlantic slave trade, resulting in a carrier frequency among African Americans of 0.32–0.50%. Type VIII OI joins sickle cell disease and end-stage kidney disease (ESKD) resulting from apolipoprotein L-1 variants as examples of single-gene defects transported from Africa to the Americas.25 Previously reported founder mutations for recessive OI have occurred in genetically isolated populations, including a First Nations Tribe of Quebec with type VII OI,10,26 and the Irish Traveler population with type VIII OI.14,27

The high carrier frequency of this LEPRE1 mutation among West Africans results in distinctive OI inheritance patterns in West Africa as compared with North America and Europe. Although severe to lethal OI has been reported in West Africa,28,29 this study is the first to address the prevalence of OI in West African populations. Our data predict that the incidence of recessive OI in West Africa due to this mutation is about 1/20,000 births, equivalent to the incidence of de novo dominant OI in North America and Europe. In comparison, recessive OI accounts for ~5–7% of severe OI cases in North America and Europe, and type VIII OI is calculated to occur in ~1/250,000 African American births.

The occurrence of LEPRE1 c.1080+1G>T within a conserved haplotype confirms a founder effect, rather than the presence of a mutation “hotspot.” Furthermore, no individuals from outside this region of West Africa were identified as carriers, suggesting that LEPRE1 c.1080+1G>T is not a Pan-African mutation. The carrier frequency is higher in Ghana (1.99%) than in Nigeria (1.22%), consistent with a mutation that began in Ghana and was carried eastward. The mutation expanded among groups with similar languages; all carriers except one belong to ethnic groups that speak languages of the Kwa or Volta-Niger branches of the Niger-Congo family. Haplotype analysis also dates this mutation to ~648–894 years ago, placing its origin within the 12th–14th centuries. This date is prior to the Atlantic slave trade, which peaked from the late 17th to mid-19th centuries.30 Of note, combining the West African and African American haplotype data decreases the estimated age of the mutation, as compared with West African haplotypes alone (458–548 years vs. 648–894 years), consistent with descent of the African American alleles from a limited number of West African founders who did not represent the full African allelic variation. In fact, the conserved region around LEPRE1 in African American carriers corresponds to a subset of contemporary West African haplotypes. However, we cannot rule out incomplete ascertainment because only four independent African American mutant alleles were analyzed. It is also possible that the West African haplotypes represent continued recombination events since the end of the slave trade.

During the nearly 400 years of the Atlantic slave trade, more than 12 million slaves were transported to the New World, about 10.7 million of whom survived to disembark.30,31 More than a third (38.1%) of these Africans originated from the region of West Africa then known as the Gold Coast, the Bight of Benin, and the Bight of Biafra.31,32 These three areas included the territories of the Akan, Ewe, Ga, Yoruba, and Ibo, whose contemporary members are carriers of the LEPRE1 mutation. Thus, it is not surprising to identify mutation carriers among African Americans of the Mid-Atlantic United States. Assuming a 1.48% carrier incidence among the ~130,000 slaves from West Africa who disembarked in the American colonies,31 it is possible that as many as 1,900 carriers were transported to North America. This number of carriers is more than sufficient to make a second founder effect unlikely, as would have occurred if only a few carriers were transported. The incidence of carriers among African Americans would not have been decreased by immigration from the Caribbean after World War II33 or by direct immigration from Africa to the United States during the past two decades, which have both included a high proportion of West Africans.34
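The estimate of transported carriers is a simple expectation, sketched below. The sketch assumes that the contemporary West African carrier frequency applied at the time of the slave trade and that individuals can be treated as independent draws from that population; the binomial standard deviation is an addition made here to show how far the count would be expected to vary under those assumptions. The function name is hypothetical.

```python
import math


def expected_carriers(population, carrier_freq):
    """Mean and binomial standard deviation of the number of carriers."""
    mean = population * carrier_freq
    sd = math.sqrt(population * carrier_freq * (1.0 - carrier_freq))
    return mean, sd


if __name__ == "__main__":
    disembarked = 130_000      # approximate figure cited in the text
    carrier_freq = 19 / 1284   # pooled Ghana/Nigeria carrier frequency
    mean, sd = expected_carriers(disembarked, carrier_freq)
    print(f"expected carriers: {mean:,.0f} (binomial SD about {sd:.0f})")
```

Under these assumptions the expected count is in the low thousands with a spread of only a few dozen, which is why the arrival of only a handful of carriers is treated as unlikely.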

Only about 3.6% of all slaves transported in the Atlantic slave trade were sent to the future United States. The majority of slaves were transported to the Caribbean and South and Central America, with their origins and destinations reflecting the European colonial regions of influence.30,31 As the majority of Africans sent to Brazil by the Portuguese originated in West Central and Southeast Africa (Angola, Congo, and Mozambique), we do not expect a significant number of carriers for this LEPRE1 mutation among Brazilians of African descent. In contrast, the British, French, and Spanish traders obtained a large proportion of their slaves from the current Ghana/Nigeria area. Over 1.5 million West Africans, 39% of all West Africans transported during the slave trade, were sent to the British Caribbean, including Jamaica, Barbados, Antigua, Trinidad, and the Grenadines, where they constituted 68% of disembarking slaves. The French and Spanish areas of the Caribbean each received over 10% of all West African slaves, with nearly half of the slaves in Martinique, Cuba, and Spanish Central America originating in West Africa.31 Therefore, the carrier frequency of this mutation on multiple Caribbean islands may be higher than among Mid-Atlantic African Americans. The LEPRE1 mutation may be useful for African ancestry studies among Caribbean populations because of its specific region of origin and simple, rapid analysis.

A high carrier frequency for a lethal mutation is rare within a population that is not geographically isolated. Disease founder alleles have expanded within nongeographically isolated ethnic groups such as Eastern European Jews, with Tay-Sachs and Gaucher disease carrier incidences of 1:31 and 1:20,35,36 and among Finnish isolates, who have carrier frequencies for multiple recessive diseases as high as 1:45 or 1:60.37 There are several possible causes for the high West African carrier frequency of the LEPRE1 allele. First, despite its lethality in a homozygous state, the LEPRE1 mutation may act as a neutral variant in the heterozygous state. In fact, the heterozygous parents of our patients are apparently healthy. A second possibility is that heterozygosity for this mutation may provide some protection against one of the infectious diseases endemic to this region, for example, by maintaining the integrity of connective tissues. The best-known disease mutation that confers selective advantage is the sickle-cell trait (HbS), which provides resistance to malarial infection and is found in 8% of African Americans, and as much as a quarter of the West African population.38,39 In addition, variants of apolipoprotein L-1 (APOL1), which are associated with lysis of T. brucei rhodesiense, are common in Africa and are now linked to susceptibility to ESKD in African Americans.25 The APOL1 gene cluster occurs on a region of chromosome 22 that shows strong evidence of evolutionary selection. However, the LEPRE1 carrier frequency is lower than expected for a mutation conferring selective advantage, and a protective role against West African endemic diseases is not apparent. A third possibility is that the LEPRE1 lethal allele may be linked to a neighboring gene on chromosome 1 that is undergoing selection, as a so-called “hitchhiker,” a possibility requiring further sequencing to investigate.

Disclosure

The authors declare no conflict of interest.