Original Article

The Pharmacogenomics Journal (2004) 4, 374–378. doi:10.1038/sj.tpj.6500268 Published online 10 August 2004

Identification of a pharmacogenetic effect by linkage disequilibrium mapping

C-F Xu1, K F Lewis1, A J Yeo1, L C McCarthy1, M F Maguire1, Z Anwar1, T M Danoff2, A D Roses3 and I J Purvis1

  1. 1Discovery and Pipeline Genetics, GlaxoSmithKline Medicines Research Centre, Stevenage, Hertfordshire, UK
  2. 2Clinical Pharmacology and Discovery Medicine, GlaxoSmithKline, Philadelphia, PA, USA
  3. 3Genetics Research, GlaxoSmithKline, Research Triangle Park, NC, USA

Correspondence: Dr C-F Xu, Discovery and Pipeline Genetics, GlaxoSmithKline Medicines Research Centre, Gunnels Wood Road, Stevenage, Hertfordshire, SG1 2NY, UK. Tel: +44 1438 768392; Fax: +44 1438 764231; E-mail: cfx74267@gsk.com

Received 26 November 2003; Revised 25 May 2004; Accepted 11 June 2004; Published online 10 August 2004.

Abstract

A practical limitation to the identification of genetic profiles predictive of drug-induced adverse events is the number of patients with the adverse event that can be tolerated before the drug is withdrawn. Whole genome screening for regions of linkage disequilibrium (LD) associated with a particular phenotype may provide the mechanism to rapidly discover specific and sensitive profiles. We have used data from a large phase III clinical trial of tranilast and typed 76 SNPs over a 2.7 megabase region flanking the uridine diphosphate glucuronosyltransferase 1A1 gene. Three SNPs within one LD block showed strong association with tranilast-induced hyperbilirubinemia (P < 10^-13). Our data illustrated that a genome-wide LD scan of 100 000–200 000 SNPs is sufficient to identify a pharmacogenetic association with a drug-induced adverse event.

Keywords:

pharmacogenetics, linkage disequilibrium, tranilast, hyperbilirubinemia, UGT1A1

Abbreviations:

AE, adverse event; LD, linkage disequilibrium; SNP, single-nucleotide polymorphism; UGT1A1, uridine diphosphate glucuronosyltransferase 1A1

INTRODUCTION

Whole genome linkage disequilibrium (LD) screening using single-nucleotide polymorphisms (SNPs) is a practical reality.1, 2, 3, 4 The seminal questions are: how few patients with adverse events (AE) does it take to recognize an LD pattern, how many patients are required to confirm the pattern, and what is the practical SNP density for genome-wide LD mapping? Prediction parameters can then be applied prospectively to identify patients at risk who could be advised not to take the drug.

There is much debate around the number of SNPs required to construct an SNP map with acceptable power to perform genome-wide association studies. A theoretical prediction suggested that 500 000 SNPs would be required,3 while others suggested that a map of 100 000–200 000 SNPs would be sufficient for a genome-wide association scan.1, 4, 5, 6 Recent examination of the LD patterns on chromosome 22 and chromosome 19 showed many peaks and valleys of LD, highlighting the complexity of LD in the genome and the challenges of defining a genome-wide association SNP set.2, 7 The arrival of results of the HapMap project, which defines patterns of LD across the genome, will assist scientists to develop an LD-based association scan map.8 Sets of tagging SNPs (tSNPs) selected on the basis of LD to represent genetic variation in the genome or in the regions of interest may well be used in future association studies.5, 9 However, as such genome-wide tSNPs are not currently available, most previous or current LD studies have been adopting an 'evenly-spaced SNP' approach, or selecting SNPs randomly and/or subjectively, or using all available SNPs in the database.

We have used the data from a large phase III clinical trial of tranilast, where approximately 12% of patients developed drug-induced hyperbilirubinemia.10, 11 We demonstrated that a Gilbert's syndrome variant, a TA repeat polymorphism in the uridine diphosphate glucuronosyltransferase 1A1 (UGT1A1) gene, predicted genetic susceptibility to the drug-induced adverse effect.10, 12 We used this defined adverse event and an SNP scan across the genomic region containing the UGT1A1 gene to test the feasibility of using a genome-wide LD scan to identify a pharmacogenetic effect. Based on the availability of SNPs in the database and the understanding of LD at the time this study was designed, we used an evenly spaced SNP LD scan approach.

RESULTS

Associations Between SNPs and Tranilast-Induced Hyperbilirubinemia

The SNPs employed in this study were identified using dbSNP (build 101). The initial set consisted of 76 SNPs spanning 2.7 Mb, encompassing the UGT1A1 gene on chromosome 2. Where possible, SNPs were chosen for their evenly distributed locations across the region, with a target spacing of 1 SNP every 30 kb to represent the density of a genome-wide scan with 100 000 SNPs (the realized average was approximately 1 SNP every 36 kb). The SNP map constructed was not optimal, reflecting the SNP distribution in the database at the time of our selection (dbSNP database, build 101). The SNP database has improved since our initial SNP selection, and a recent examination of the SNPs available in the region revealed no remaining gaps in SNP coverage (dbSNP database, build 115).

We genotyped the 76 SNPs on 1231 subjects consisting of 146 cases and 1085 controls. Significant association was found between 10 SNPs and tranilast-induced hyperbilirubinemia (P<0.05), with three SNPs showing significant levels at P<0.001 (P = 3.6 × 10^-19 for rs869283, P = 7.4 × 10^-15 for rs871514, and P = 1.9 × 10^-13 for rs1875263) (Figure 1a). These three SNPs spanned a region of 8 kb, located 40–48 kb upstream of the TA repeat polymorphism in the UGT1A1 promoter. This initial observation was reinforced upon genotyping further SNPs from a defined area of 459 kb encompassing UGT1A1. A total of 99 SNPs were analyzed within this region, giving an average density of 1 SNP every 4.6 kb. In all, 16 SNPs showed significant (P<0.05) association with tranilast-induced hyperbilirubinemia, including rs887829, which showed the strongest association at P = 9.6 × 10^-23 (Figure 1b). This marker is located 324 bp upstream of the TA repeat in UGT1A1, and the two polymorphisms are in complete LD (r=1). It is known that if, as in a pharmacogenetic setting, cases are difficult to ascertain, a study can borrow strength from the use of a large control set. Our data provided an opportunity to demonstrate the point empirically. When sequentially decreasing the number of cases from 146 to 10 while keeping the number of controls constant at 1085, significant association was still detected between rs869283 and the phenotype (P<0.05). Although the significance levels in this investigation reflect the large genetic effect under study, our data illustrate that the likelihood of identifying a pharmacogenetic association with relatively few patients is increased considerably by using a larger control cohort.
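The effect of holding the control set fixed while the case set shrinks can be sketched numerically. The Python sketch below is illustrative only. It collapses genotypes to a 2 × 2 carrier/non-carrier table, whereas the study itself used exact tests on R × C genotype tables, and the carrier frequencies are hypothetical values chosen to mimic a large effect; they are not the genotype counts observed for rs869283. The function names are ours.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_prob(x):
        # Hypergeometric log-probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return (log_comb(row1, x) + log_comb(row2, col1 - x)
                - log_comb(total, col1))

    observed = log_prob(a)
    p_value = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        lp = log_prob(x)
        # Sum every table that is no more probable than the observed one.
        if lp <= observed + 1e-7:
            p_value += exp(lp)
    return min(1.0, p_value)


# Hypothetical inputs (assumptions, not study data).
N_CONTROLS = 1085            # control count taken from the text
CASE_CARRIER_FREQ = 0.40     # assumed carrier frequency among cases
CONTROL_CARRIER_FREQ = 0.08  # assumed carrier frequency among controls


def p_for_case_count(n_cases):
    """P-value when n_cases cases are compared with the full control set."""
    case_carriers = round(n_cases * CASE_CARRIER_FREQ)
    control_carriers = round(N_CONTROLS * CONTROL_CARRIER_FREQ)
    return fisher_exact_2x2(case_carriers, n_cases - case_carriers,
                            control_carriers, N_CONTROLS - control_carriers)


for n_cases in (146, 100, 50, 20, 10):
    print(f"{n_cases:>3} cases vs {N_CONTROLS} controls: "
          f"P = {p_for_case_count(n_cases):.2e}")
```

Under these assumed frequencies the P-value remains below 0.05 even with 10 cases. A weaker genetic effect would be expected to need more cases to reach the same threshold.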

Figure 1.

Association between SNPs and tranilast-induced hyperbilirubinemia in 146 cases and 1085 controls for (a) 76 SNPs spanning 2.7 Mb and (b) 99 SNPs spanning 459 kb flanking UGT1A1. The X-axis represents the physical distance (kb), and the Y-axis represents P-values in logarithmic scale. The solid black circles indicate the P-value estimates for association between each SNP and tranilast-induced hyperbilirubinemia.

Correlation of SNP Density and LD Coverage

Analyzing LD (D') for the initial set of 76 SNPs identified four LD blocks, covering 10% of the 2.7 Mb region (Figure 2a). The three most significant markers associated with the adverse event were all located in one of these LD blocks (block d) containing UGT1A1. Increasing marker density in the subsequent experiment had a dramatic effect on LD block coverage (Figure 2b). In this case, approximately 70% of the 459 kb region lay within blocks of LD. Again, four of the five markers strongly associated with hyperbilirubinemia were found in an LD block of 46 kb (block D, Figure 2b), which also contained the TA repeat polymorphism of UGT1A1. There were also SNPs less significantly associated with the AE located outside this LD block (Supplementary Table 1).

Figure 2.

Diagrammatic representation of pairwise LD (D') (a) between 76 SNPs spanning a 2.7 Mb region and (b) between 99 SNPs across a 459 kb region flanking UGT1A1. SNPs are ordered using the NCBI map build 30. Red squares indicate high LD (absolute D' ≥ 0.8), clear squares indicate D' < 0.8. The double arrows along the top of the figure show LD blocks. The single arrow within the figure points to the location of UGT1A1. The long diagonal black line along the figure indicates the physical length of the region, with short black lines showing the position of SNPs.

DISCUSSION

A major question for pharmacogenetic profile screening is whether the SNP density is appropriate for the employment of whole genome LD mapping using a practical number of SNPs. The SNP density used in our experiment was designed to reflect the expected SNP density when scanning the whole genome with 100 000 SNPs. We demonstrated that a pharmacogenetic effect, tranilast-induced hyperbilirubinemia, would have been identified using such a genome scan. Lessons drawn from this example should be tempered by the understanding that the impact of UGT1A1 mutations on the observed phenotype is extremely powerful. Nearly 40% of the cases carried the defective gene, with an odds ratio of 8.2 (ref. 10). Other strong pharmacogenetic effects reported include the association between HLA-B57 and hypersensitivity reactions to abacavir, with an odds ratio of 23.6 (ref. 13). There is little doubt that the strength of such genetic associations is so startling that it would immediately focus attention on the relevant chromosomal regions following a genome scan. It is equally fair to highlight that pinning down 'culprit' region(s) for less significant effects would be more of a challenge.14, 15, 16, 17 Variable levels of phenocopy, penetrance and genetic heterogeneity may all contribute to decreasing the possibility of a successful SNP genome scan in other pharmacogenetic observations.

We applied the 'evenly spaced' approach to select the initial marker set due to the absence of LD information between these markers at the time of selection. There is much discussion in the literature of the relative merits of 'evenly spaced' and tagging SNP LD scans.5, 9 To address the question of whether tSNPs will adequately represent other genetic variants to allow the localization of regions associated with a genetic effect, we applied two SNP selection procedures18 to our high-density SNP set. We obtained 20–30 tSNPs from the original set of 99 SNPs to represent the 459 kb region. These tSNPs were equally successful in locating the region containing UGT1A1 (data not shown). Scaling up this finding, we estimated that 150 000–200 000 tSNPs would be required for whole genome LD scans, which may well provide a good balance between the cost and risk for association studies.
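The scaling step can be approximated with a simple linear extrapolation. The sketch below is a back-of-envelope check, not the authors' calculation: it assumes a haploid genome of roughly 3 Gb and that the LD structure of the 459 kb region is typical of the genome as a whole. The function name and constants are ours.

```python
GENOME_KB = 3_000_000   # assumption: haploid genome of roughly 3 Gb
REGION_KB = 459         # region length given in the text
TSNP_RANGE = (20, 30)   # tSNPs needed to represent the region (from the text)


def scale_to_genome(n_tsnps, region_kb=REGION_KB, genome_kb=GENOME_KB):
    """Linearly extrapolate a regional tSNP count to the whole genome."""
    return round(n_tsnps * genome_kb / region_kb)


low, high = (scale_to_genome(n) for n in TSNP_RANGE)
print(f"about {low:,} to {high:,} tSNPs genome-wide")
```

Under these assumptions the extrapolation gives roughly 130 000–200 000 tSNPs, the same order as the 150 000–200 000 quoted above. How the lower bound of the published estimate was derived is not stated.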

Patterns of LD blocks are known to be associated with marker density.5, 7, 9 We observed that the probability of detecting LD blocks was highly correlated with the SNP density employed. At low marker density (1 SNP every 36 kb), 10% of the region was represented by LD blocks, whereas at high marker density (1 SNP every 4.6 kb), 70% of the region was represented by LD blocks (Figure 2). A low-density SNP scan allowed us to locate the 'region of interest', while subsequent high-density mapping provided a better understanding of the LD pattern in this region and ultimately led to the identification of the causative polymorphism. Our data support the approach of fine mapping the 'region of interest' with high-density SNPs or all SNPs to locate the functional polymorphism, once an initial positive association is identified in a low-density LD scan.
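The two marker densities follow directly from the region lengths and SNP counts reported in the Results. The short sketch below reproduces them and converts each spacing into the size of a genome-wide scan at the same density. The 3 Gb genome length is an assumption, and the function names are ours.

```python
GENOME_KB = 3_000_000  # assumption: haploid genome of roughly 3 Gb


def spacing_kb(region_kb, n_snps):
    """Average marker spacing in kb for n_snps spread over region_kb."""
    return region_kb / n_snps


def genome_scan_size(spacing, genome_kb=GENOME_KB):
    """Number of SNPs a genome-wide scan would need at the given spacing."""
    return round(genome_kb / spacing)


scans = (("low density", 2700, 76), ("high density", 459, 99))
for label, region_kb, n_snps in scans:
    s = spacing_kb(region_kb, n_snps)
    print(f"{label}: 1 SNP every {s:.1f} kb, "
          f"equivalent to {genome_scan_size(s):,} SNPs genome-wide")
```

The low-density set works out to about 35.5 kb per SNP, which rounds to the 36 kb quoted above and is somewhat sparser than the 30 kb spacing of a 100 000-SNP scan.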

Pharmacogenetic analyses designed for rapid identification of patients' genetic profiles correlated with either efficacy or AEs during the course of drug development are often limited by the number of patients available. This is particularly true for AEs, where the question of how few patients it takes to recognize a highly sensitive and specific pattern associated with the phenotype is an important consideration. Schork et al19 demonstrated with statistical models that increasing the number of matched controls could increase the power of a study. The results presented here are consistent with analytical calculations concerning power, and, although the extreme significance levels reflect the magnitude of the genetic effect under study, the data illustrate the power increases gained from using a large control set, when cases are few. This is particularly useful in studies of uncommon adverse events when the number of patients exhibiting this phenotype should be kept as low as possible. Bowman20 recently proposed a simple diagnostic likelihood ratio-based, empirical Bayes Factor framework for pharmacogenetic surveillance, which uses the practice of taxonomy as a model. This supervised, sequential, multi-point simulation and counting method estimates/visualizes the average objective positive evidence a reported individual gives in classifying adverse event cases as genetically different from controls. The tranilast data support this mathematical framework and suggest that only a limited number of adverse event patients, with a larger set of controls, are sufficient to recognize an area of the SNP profile associated with the phenotype. Once determined, the SNP pattern was as 'diagnostic' as the TA repeat, thereby mimicking the situation for disease genes where population controls are historical in discovery publications and are not performed again with each new patient who presents with the phenotype. 
It should also be noted that, had tranilast been approved for marketing, a simple bilirubin test before receiving the drug and at a month after taking the drug would have clinically defined this genetic risk, in this case for a phenotype that is not associated with major liver complications. Multiple loci identified by SNP mapping might well be developed clinically into simple tests and not necessarily DNA profiles.

In summary, the availability of high-throughput SNP genotyping methods will contribute towards making genome-wide SNP scans much more feasible. An SNP map constructed with LD information will allow standard maps to be developed and reduce the cost of genome-wide association studies. Furthermore, rapid and automated statistical tools remain to be developed to analyze the large data sets that would be generated. Additional candidate-gene and candidate-region studies will help researchers gain a better idea of the sample sizes required to provide enough power, and thus confidence, in genome-wide association results, which may help to identify the genetic basis of common diseases and medicine responses. The data presented here suggest that a genome scan of 100 000–200 000 SNPs would have allowed identification of the polymorphism predisposing some individuals to tranilast-induced hyperbilirubinemia.

MATERIAL AND METHODS

Samples

Written informed consent to perform genetic analysis was obtained from 1231 Caucasian individuals who had participated in the PRESTO (prevention of restenosis with tranilast and its outcomes) clinical trial.21 Patients over the age of 18 years were recruited in the United States. DNA was extracted using a nucleon DNA extraction and purification kit (Tepnel Life Sciences PLC, UK). In all, 146 individuals had greater than 2.0 mg/dl of total bilirubin following administration of tranilast and were defined as cases. A total of 1085 individuals had total bilirubin equal to or less than 2.0 mg/dl following administration of tranilast and were defined as controls in this study.10

SNP Genotyping

The Amplifluor™ SNP genotyping system (Serologicals Corporation®) was employed to validate and type these SNPs. Reagents, including dNTPs, PCR reaction mix, and FAM and JOE labeled Amplifluor™ Uniprimers™ were supplied by Serologicals Corporation®. Metabion GmbH (Germany) supplied the oligonucleotides. An ABI 7900HT was used to read fluorescence signals for each sample.

Statistical Analysis

Associations between the genotypes and phenotypes were assessed using Fisher's exact test. PROC FREQ (ref. 22) computed exact P-values for R × C tables using a network algorithm developed by Mehta and Patel.23

Pairwise LD between SNPs was measured using both the D' statistic24 and the correlation coefficient r.25 An LD block was defined as a set of at least three markers in which more than 90% of marker pairs had an absolute D' value ≥ 0.8.
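The two LD measures and the block definition above can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study: it assumes that two-locus haplotype frequencies are already known (in practice they must be estimated from unphased genotypes), and the function names and marker labels are hypothetical.

```python
from itertools import combinations
from math import sqrt


def ld_stats(p_ab, p_a, p_b):
    """Return (D', r) for two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A at the first
    locus and allele B at the second; p_a and p_b are the allele frequencies.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r = d / sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r


def is_ld_block(markers, abs_d_prime, threshold=0.8,
                min_markers=3, min_fraction=0.9):
    """Apply the block rule: at least min_markers markers, with more than
    min_fraction of marker pairs having |D'| >= threshold.

    abs_d_prime maps frozenset({m1, m2}) to the absolute D' of that pair.
    """
    if len(markers) < min_markers:
        return False
    pairs = list(combinations(markers, 2))
    strong = sum(1 for pair in pairs
                 if abs_d_prime[frozenset(pair)] >= threshold)
    return strong / len(pairs) > min_fraction


# Two loci with identical allele frequencies, always inherited together.
print(ld_stats(0.3, 0.3, 0.3))
# A rare allele found only on one background: D' is still 1, but r is not.
print(ld_stats(0.1, 0.1, 0.5))
```

The second call illustrates why both statistics are reported: D' reaches 1 whenever at least one of the four haplotypes is absent, whereas r reaches 1 only when the two markers are fully interchangeable.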

Notes

DUALITY OF INTEREST

None declared.

References

  1. Carlson CS, Newman TL, Nickerson DA. SNPing in the human genome. Curr Opin Chem Biol 2001; 5: 78–85.
  2. Dawson E, Abecasis GR, Bumpstead S, Chen Y, Hunt S, Beare DM et al. A first-generation linkage disequilibrium map of human chromosome 22. Nature 2002; 418: 544–548.
  3. Kruglyak L. Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nat Genet 1999; 22: 139–144.
  4. Lai E, Bowman C, Bansal A, Hughes A, Mosteller M, Roses AD. Medical applications of haplotype-based SNP maps: learning to walk before we run. Nat Genet 2002; 32: 353.
  5. Goldstein DB, Ahmadi KR, Weale ME, Wood NW. Genome scans and candidate gene approaches in the study of common diseases and variable drug responses. Trends Genet 2003; 19: 615–622.
  6. Lai E, Riley J, Purvis I, Roses A. A 4-Mb high-density single nucleotide polymorphism-based map around human APOE. Genomics 1998; 54: 31–38.
  7. Phillips MS, Lawrence R, Sachidanandam R, Morris AP, Balding DJ, Donaldson MA et al. Chromosome-wide distribution of haplotype blocks and the role of recombination hot spots. Nat Genet 2003; 33: 382–387.
  8. Couzin J. Human genome. HapMap launched with pledges of $100 million. Science 2002; 298: 941–942.
  9. Cardon LR, Abecasis GR. Using haplotype blocks to map human complex trait loci. Trends Genet 2003; 19: 135–140.
  10. Danoff TM, Campbell DA, McCarthy LC, Lewis KF, Repasch MH, Saunders AM et al. A Gilbert's syndrome UGT1A1 variant confers susceptibility to tranilast-induced hyperbilirubinemia. Pharmacogenomics J 2004; 4: 49–53.
  11. Holmes Jr DR, Savage M, LaBlanche JM, Grip L, Serruys PW, Fitzgerald P et al. Results of Prevention of REStenosis with Tranilast and its Outcomes (PRESTO) trial. Circulation 2002; 106: 1243–1250.
  12. Roses AD. Genome-based pharmacogenetics and the pharmaceutical industry. Nat Rev Drug Discov 2002; 1: 541–549.
  13. Hetherington S, Hughes AR, Mosteller M, Shortino D, Baker KL, Spreen W et al. Genetic variations in HLA-B region and hypersensitivity reactions to abacavir. Lancet 2002; 359: 1121–1122.
  14. Arranz MJ, Collier DA, Munro J, Sham P, Kirov G, Sodhi M et al. Analysis of a structural polymorphism in the 5-HT2A receptor and clinical response to clozapine. Neurosci Lett 1996; 217: 177–178.
  15. Goldstein DB, Tate SK, Sisodiya SM. Pharmacogenetics goes genomic. Nat Rev Genet 2003; 4: 937–947.
  16. Masellis M, Basile V, Meltzer HY, Lieberman JA, Sevy S, Macciardi FM et al. Serotonin subtype 2 receptor genes and clinical response to clozapine in schizophrenia patients. Neuropsychopharmacology 1998; 19: 123–132.
  17. Murphy Jr GM, Kremer C, Rodrigues HE, Schatzberg AF. Pharmacogenetics of antidepressant medication intolerance. Am J Psychiatry 2003; 160: 1830–1835.
  18. Meng Z, Zaykin DV, Xu CF, Wagner M, Ehm MG. Selection of genetic markers for association analyses, using linkage disequilibrium and haplotypes. Am J Hum Genet 2003; 73: 115–130.
  19. Schork NJ, Nath SK, Fallin D, Chakravarti A. Linkage disequilibrium analysis of biallelic DNA markers, human quantitative trait loci, and threshold-defined case and control subjects. Am J Hum Genet 2000; 67: 1208–1218.
  20. Bowman C. Classification using SNP profiles. Royal Statistical Society: Leuven, Belgium 2003; Statistical Genetics and Bioinformatics, July 14–17.
  21. Holmes D, Fitzgerald P, Goldberg S, LaBlanche J, Lincoff AM, Savage M et al. The PRESTO (Prevention of restenosis with tranilast and its outcomes) protocol: a double-blind, placebo-controlled trial. Am Heart J 2000; 139: 23–31.
  22. SAS. SAS Release 8. SAS Institute: Cary, NC 2000.
  23. Mehta C, Patel N. A network algorithm for performing Fisher's exact test in r × c contingency tables. J Am Stat Assoc 1983; 78: 427–434.
  24. Lewontin R. The interaction of selection and linkage. I. General considerations; heterotic models. Genetics 1964; 49: 49–67.
  25. Weir BS. Disequilibrium. Genetic Data Analysis II. Sinauer Associates Inc.: Sunderland, MA 1996: 137.

Acknowledgements

We thank all the scientists within Discovery Genetics, GlaxoSmithKline for their scientific contributions. In addition, we would like to thank Pete Boyd and Aruna Basal for statistical advice and critical reading of the manuscript, and Mike Barnes for SNP mapping assistance. This study is wholly funded by GSK Research and Development.

Supplementary Information

Supplementary Information accompanies the paper on The Pharmacogenomics Journal website (http://www.nature.com/tpj)
