Pharmacogenetics and genomics research has advanced greatly over the past decade, as witnessed by the completion of the human genome sequence in 2003 (www.genome.gov/HGP). The field has been driven by the belief that understanding the human genome, the genomes of pathogens, and interindividual genetic variability would result in radical advances in medicine. Anticipated measurable end points include a larger number of targets against which drug discovery campaigns could be initiated, and a better understanding of human susceptibility to disease and of variability in drug response. This understanding should in turn allow the development of diagnostic tools for individualized treatment, in which drugs are given to patients in whom they are predicted to work and at doses predicted to be safe.1 Toward the realization of this biomedical paradigm, biobanking and pharmacogenetics databasing have become well established in developed countries (www.biobanks.se, www.icelandbio.com, www.ukbiobank.ac.uk). Little has, however, been done in developing countries.2 Starting with a workshop on Pharmacogenetics of Drug Metabolism organized by the African Institute of Biomedical Science and Technology (www.aibst.com) in Nairobi, Kenya, in 2003, a number of African scientists initiated a consortium for the biobanking and pharmacogenetics databasing of African populations. We here report the results of the first phase of this initiative, in which research groups from five African countries, with collaborative support from leading experts in Europe and America, established a biobank of blood and DNA from nine ethnic groups from across the African continent. The biobank of anonymous samples has been used to establish the baseline frequency distribution of SNPs in genes important in drug metabolism, and hence to initiate a pharmacogenetics database (http://www.aibst.com/biobank.html).

Ethical approval for the study was obtained in each of the countries from which samples were collected. Blood samples were collected from 50 to 100 adult volunteers from ethnic groups in Nigeria, Kenya, Tanzania, Zimbabwe and South Africa (Figure 1). Portions of each blood sample were used to prepare DNA, blotted on filter paper or stored at −80°C. The biobank consists of 1488 DNA samples from nine ethnic groups (Yoruba, Hausa, Ibo, Luo, Kikuyu, Maasai, Shona, San and Venda). The utility of the biobank was illustrated by studying the frequency distribution of some polymorphisms with known phenotypic characteristics (www.imm.ki.se/CYPalleles, http://louisville.edu/medschool/pharmacology/Human.NAT2.pdf) in six genes (Table 1) important in drug metabolism (CYP2B6, 2C19, 2D6, GSTM, GSTT and NAT-2). Genotyping was carried out using established PCR–RFLP methods.4, 5, 6, 7, 8, 9, 10 A database cataloging the samples and the genotype results was designed using the Microsoft Access and Visual Basic software packages (http://www.aibst.com/biobank.html); access is currently limited to authorized individuals involved in the research projects.
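As an illustration of how genotype results of this kind translate into the allele frequencies reported in Table 1, the following minimal Python sketch counts variant alleles from diploid genotype counts. The function name and the example counts are hypothetical and are not taken from the study's database or from Table 1.

```python
def allele_frequency(n_wild_hom, n_het, n_variant_hom):
    """Return the variant allele frequency from diploid genotype counts."""
    n_individuals = n_wild_hom + n_het + n_variant_hom
    if n_individuals == 0:
        raise ValueError("no genotyped individuals")
    # Each individual carries two alleles: heterozygotes contribute one
    # variant copy and variant homozygotes contribute two.
    return (n_het + 2 * n_variant_hom) / (2 * n_individuals)


# Hypothetical counts for one polymorphism in one group (illustration only).
freq = allele_frequency(n_wild_hom=40, n_het=45, n_variant_hom=15)
print(round(freq, 3))  # 0.375
```

The same calculation applies to any biallelic polymorphism scored by PCR–RFLP, provided that all three genotype classes can be distinguished.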

Figure 1

Samples currently in the biobank. The number of available DNA samples from each group is shown. Black dots show the locations at which samples from each ethnic group were collected. The Hausa are found mostly in the northern part of Nigeria, the Ibo in the east and the Yoruba in the west. The Maasai of Kenya are located mostly toward the border with Tanzania, the Luo in the west and the Kikuyu in the south. The Venda are located at the northeast border of South Africa and Zimbabwe. The Shona are the major ethnic group in Zimbabwe and the San reside on the border of Zimbabwe and Botswana.

Table 1 Allele frequencies in the African populations in this study and other ethnicities or populations

The stratification of populations based on allele frequency data was evaluated by principal component analysis using the program SIMCA P+ (www.umetrics.com).
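For readers without access to SIMCA P+, the idea behind this analysis can be sketched in a few lines of Python. The sketch below is not the procedure used in this study: it assumes mean-centering without scaling, extracts only the first principal component by power iteration, and uses invented allele frequencies for six hypothetical groups.

```python
import math


def first_principal_component(rows, iterations=500):
    """Return (loadings, scores) for the first principal component.

    rows: one list of allele frequencies per population, one value per
    polymorphism. Columns are mean-centered but not scaled (an assumption).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix between polymorphisms (p x p).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration converges to the dominant eigenvector of cov.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Project each centered population onto the component.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


# Invented allele frequencies (rows: groups, columns: three polymorphisms).
names = ["A1", "A2", "B1", "B2", "C1", "C2"]
freqs = [
    [0.40, 0.20, 0.30],
    [0.42, 0.22, 0.28],
    [0.25, 0.05, 0.45],
    [0.27, 0.03, 0.47],
    [0.15, 0.30, 0.10],
    [0.17, 0.28, 0.12],
]
loadings, scores = first_principal_component(freqs)
for name, score in zip(names, scores):
    print(name, round(score, 3))
```

Groups with similar allele frequency profiles receive similar scores, so clusters on a score plot indicate stratification. The sign of a component is arbitrary, so only the relative positions of the groups are meaningful.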

The frequency distribution of the polymorphisms analyzed (Table 1) stratified the major populations, Caucasian (European and North American), Oriental (Asian) and African, into distinct clusters (Supplementary Figure 2). This is in agreement with the current evolutionary understanding of the stratification of these major populations based on other genetic markers.11, 12 We further analyzed the data for possible stratification of the African ethnic groups alone. No distinct differentiation was observed. Studies using other genetic markers have demonstrated that genetic diversity is greater among African populations than among Caucasian or Oriental populations.12 These preliminary findings could therefore indicate that the number and/or type of genes and/or SNPs analyzed here do not have sufficient resolving power to capture the genetic diversity of African populations reported in other studies.

The current biobank represents a significant contribution to efforts to jump-start pharmacogenetic and genomics research in African populations, in that it contains samples from the most diverse representation of African populations to date: from north-west Africa, the Hausa, Yoruba and Ibo; from central-eastern Africa, the Kikuyu, Luo and Maasai; and from southern Africa, the Shona, Venda and San. These regional groups represent some of the major known ethnicities based on the linguistic classifications generally used to identify the populations (http://www.ethnologue.com). This work complements biobanking projects on samples from West Africa (Nigeria, Cameroon and Gambia)13, 14, 15 and African Americans.16 It also adds to the increasing international biomedical resources for genetics research (http://www.p3gconsortium.org, www.pharmgkb.org, http://hgvbase.cgb.ki.se, www.hapmap.org). In the execution of these studies, we opted to anonymize the samples while a consensus document is being drafted on how to handle ethical issues, an area in which most African ethics review boards do not have clear guidelines and recommendations. Anonymization, however, has the limitation that an individual cannot be traced should an interesting genetic finding be made in a particular sample. During sample collection, assigning ethnicity was challenging because it can be complicated by interethnic marriages. In this study, ethnicity was therefore assigned on the basis of the volunteers' statement that their parents and grandparents were of the same self-identified ethnic group.

Extrapolation of the possible clinical implications of the baseline frequency distribution of some of the alleles, based on their established phenotypic characteristics, could guide doctors in drug prescription decisions and/or help explain ethnic-specific adverse effects in metropolitan medical practice. The high frequency of individuals who are homozygous for the CYP2B6*6 allele (18–25%) in African populations is predictive of a reduced capacity to metabolize and dispose of efavirenz compared to Caucasians, in whom lower genotype frequencies of 5–10% have been reported.17, 18 This is in agreement with clinical observations in which Africans have been reported to have significantly higher plasma concentrations of the drug than Caucasians when given the standard dose of 600 mg per day.19 Ongoing studies in our laboratory in HIV/AIDS patients taking the anti-HIV drug efavirenz indicate the need to lower the dose of this drug in people of African origin who are homozygous for the CYP2B6*6 allele (Nyakutira et al, unpublished). This could go a long way toward reducing adverse effects and hence increasing treatment compliance. It could also reduce drug costs for individuals requiring lower doses, savings that could partly offset the diagnostic costs of identifying such individuals. For the African-specific CYP2D6*17 variant,20 similar clinical effects could be explored in the use of antipsychotic drugs21, 22 in African populations, given the high frequency of the variant (14–34%) observed across the major African populations reported in this study (Table 1). Though the current pharmacogenetics database carries no direct phenotype information linked to carriers of the variants, clinical extrapolations from the well-established in vivo effects of these polymorphisms give an important baseline that can be the basis for population-specific clinical trial design for the optimal use of some medicines.
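The relationship between homozygote and allele frequencies used in such extrapolations can be illustrated with a short calculation. The sketch below assumes Hardy–Weinberg equilibrium, which this report does not test, and treats the locus as biallelic. Under that assumption, the homozygote frequencies of 18–25% quoted above would correspond to variant allele frequencies of roughly 0.42 to 0.50. The function names are hypothetical.

```python
import math


def allele_freq_from_homozygotes(hom_freq):
    """Variant allele frequency q implied by a homozygote frequency q**2.

    Assumes Hardy-Weinberg equilibrium at a biallelic locus.
    """
    if not 0.0 <= hom_freq <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    return math.sqrt(hom_freq)


def expected_genotype_freqs(q):
    """Expected genotype frequencies for variant allele frequency q."""
    p = 1.0 - q
    return {
        "wild/wild": p * p,
        "wild/variant": 2 * p * q,
        "variant/variant": q * q,
    }


# Homozygote frequencies quoted in the text for CYP2B6*6 in Africans.
for hom in (0.18, 0.25):
    q = allele_freq_from_homozygotes(hom)
    print(hom, round(q, 2), expected_genotype_freqs(q))
```

A marked departure of observed genotype counts from these expected proportions would suggest that the equilibrium assumption does not hold for the group in question.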

The use of the biobank, the expatriation of samples and intellectual property issues can be both sensitive and controversial. The steering committee of the consortium is working on a guideline document in consultation with the ethics review boards of the relevant countries. In the meantime, for samples collected in this study, a working position has been taken that no samples will be permanently expatriated from the African Institute of Biomedical Science and Technology (AiBST, the project's technical headquarters) or national laboratories and that, should samples be temporarily expatriated for specialized analysis, approval must be sought from the steering committee. To encourage North–South and South–South collaboration and maximal utilization of the biobank, scientists are welcome to visit the AiBST laboratories to carry out studies on the samples, provided the research has been approved by the steering committee. It is hoped that this will encourage both scientific and technology transfer, as international scientists interested in genetic studies in African populations will have to carry out their research at AiBST or at the laboratories of consortium members, and will hence share their skills and possibly invest in equipment.

In future, the Biobank and Pharmacogenetics Databasing project of African populations will aim to collect samples driven by hypotheses based on phenotypes relevant to health-care challenges in Africa, such as malaria, TB and HIV/AIDS infections and side effects of anti-infectives. Such projects, with nonanonymous sampling, will require rigorous ethical considerations that are being explored by the steering committee, but they are necessary as they yield the most informative genotype–phenotype results. To increase the number of samples in the biobank, there will be a move from blood sampling to buccal swabs, which are faster to collect, pose less risk of infection to collectors and are easy to transport. This effort will hopefully spur the strengthening of pharmacogenetics and genomics research capacity at African institutions through investment in molecular biology platforms, in the form of automated DNA sequencers, real-time PCR thermocyclers and bioinformatics tools, and through the training of biomedical scientists.