Harnessing gene expression to identify the genetic basis of drug resistance
Bo-Juen Chen1,2,3, Helen C Causton1, Denesy Mancenido1, Noel L Goddard4, Ethan O Perlstein5 & Dana Pe'er1,3
1. Department of Biological Sciences, Columbia University, New York, NY, USA
2. Department of Biomedical Informatics, Columbia University, New York, NY, USA
3. Center for Computational Biology and Bioinformatics, Columbia University, New York, NY, USA
4. Department of Physics and Astronomy, Hunter College, 695 Park Avenue, 1225 Hunter North, New York, NY, USA
5. Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
Correspondence to: Dana Pe'er1,3 Department of Biological Sciences, Columbia University, 2960 Broadway, 607D Fairchild Center, MC 2461, New York, NY 10027, USA. Tel.: +1 212 854 4397; Fax: +1 212 865 8246; Email: dpeer@biology.columbia.edu
Received 20 March 2009; Accepted 24 August 2009; Published online 13 October 2009
Article highlights
- Camelot (CAusal Modelling with Expression Linkage for cOmplex Traits) is a method that integrates genotype, gene expression and phenotype data from individuals and uses it to build models that predict complex quantitative phenotypes and identify genes that actively influence them. Gene expression data measured in a reference condition (drug-free) empowers prediction of the drug response and identification of genes that actively influence this response.
- We applied our method to segregants obtained from a cross between two diverse strains of Saccharomyces cerevisiae. Camelot accurately predicts the response of the segregants to 87/94 drugs and identifies genes (both inside and outside linked regions) that actively influence the phenotype. 25/27 of the gene-drug interactions predicted were confirmed. The integration of gene expression data was critical for achieving the performance reported.
- In addition to identifying linked regions, Camelot pinpoints the causal gene within the region. For example, Camelot identified GPB2 as the gene responsible for linkage to 3 drugs within the Chromosome I:1-55329 region. GPB2's influence on drug resistance was subsequently validated for all 3 drugs. While the mechanism of action of these drugs is unknown, identification of GPB2 suggests that the PKA pathway is likely involved in the response to these drugs.
- PHO84 was identified and validated as a causal gene for 5 drug responses. BY strains containing only a single coding nucleotide polymorphism from RM grew at a similar rate to RM strains in the presence of drug and expressed PHO84 at a similar level, likely due to negative feedback. We believe that variation in gene expression serves as an indicator of variation in protein function, which explains why expression helps identify the role of the gene in the response to drug.
Synopsis
We want to understand how differences in genotype account for the wide range of phenotypic diversity between individuals. Most traits are determined by multiple genes, so the challenge of predicting an individual's phenome (i.e., spectrum of traits) from its genome requires both identification of the genes that influence the trait and models that describe how they interact to determine the trait (Gabriel et al, 2002; Maller et al, 2006). We developed a computational framework, Camelot (CAusal Modelling with Expression Linkage for cOmplex Traits), which combines genotype and gene expression data to associate genetic factors with phenotype. Our premise is that gene expression is useful because it integrates information from multiple loci that are individually too weak to detect, but which, in combination, contribute significantly to the phenotype.
We applied Camelot to a data set containing genotype, gene expression (profiled in the absence of drug) and phenotype (growth in the presence of drug) data from segregants obtained from a cross between two diverse strains of S. cerevisiae (BY and RM; Brem and Kruglyak, 2005; Perlstein et al, 2007). The genetic diversity between strains manifests in extensive phenotypic diversity. We obtain genotype and gene expression data for each segregant grown in the absence of drug, to derive a quantitative prediction of the strain's phenome. Camelot identifies a small set of features: markers at a genetic locus, or transcripts, that actively influence growth in the presence of each drug and explain the observed differences between segregants. Camelot uses these features to accurately predict the growth of 'unseen' segregants that have an entirely different genotype.
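The idea of selecting a few markers and transcripts and then predicting growth for unseen segregants can be illustrated with a small sparse-regression sketch. This is not the Camelot implementation: the synopsis does not specify the algorithm, so the example assumes a plain elastic-net regression fitted by coordinate descent, and all data, effect sizes, feature counts and function names (`fit_elastic_net`, `simulate_segregant`) are hypothetical.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, lam=0.05, alpha=0.5, n_iter=200):
    """Elastic-net coordinate descent on centered X and y (assumed settings)."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    resid = list(y)  # residual y - X.coef, starting from coef = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of feature j with the residual that excludes feature j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * coef[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new - coef[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                coef[j] = new
    return coef


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


# Hypothetical data: 5 markers (0 = BY allele, 1 = RM allele) and 5 transcripts
# measured without drug; growth depends on marker 0 and transcript 2 only.
rng = random.Random(0)
N_MARKERS, N_TRANSCRIPTS = 5, 5


def simulate_segregant():
    markers = [float(rng.randint(0, 1)) for _ in range(N_MARKERS)]
    transcripts = [rng.gauss(0, 1) for _ in range(N_TRANSCRIPTS)]
    growth = 1.5 * markers[0] - 2.0 * transcripts[2] + rng.gauss(0, 0.1)
    return markers + transcripts, growth


data = [simulate_segregant() for _ in range(80)]
train, test = data[:60], data[60:]  # the last 20 segregants stay unseen

# Center features and phenotype using training segregants only
x_mean = [sum(col) / len(train) for col in zip(*[x for x, _ in train])]
y_mean = sum(g for _, g in train) / len(train)
X_train = [[v - m for v, m in zip(x, x_mean)] for x, _ in train]
y_train = [g - y_mean for _, g in train]

coef = fit_elastic_net(X_train, y_train)

# The two features with the largest absolute coefficients
top_two = sorted(sorted(range(len(coef)), key=lambda j: -abs(coef[j]))[:2])

# Predict growth for unseen segregants and compare with the simulated truth
predicted = [y_mean + sum(c * (v - m) for c, v, m in zip(coef, x, x_mean))
             for x, _ in test]
observed = [g for _, g in test]
r = pearson(predicted, observed)
```

In this toy setting, `top_two` holds the indices of one marker and one transcript (index 7 is transcript 2, because transcripts follow the five markers), and `r` measures how well the fitted model predicts growth for segregants that were not used for fitting.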
Camelot accurately predicted the drug response for 87 of 94 drugs. Gene expression data measured under an unrelated condition (no drug) contribute significantly to the accuracy of prediction and to the ability of Camelot to detect causal genes involved in this response.
Two statistical methods, the triangle test and zoom-in score, use gene expression to identify genes that actively influence the phenotype. The triangle test is used when the selected feature is a transcript, whereas the zoom-in score pinpoints causal variants within large linked regions (when the selected feature is a marker). Experimental validation demonstrates the outstanding performance of these two methods, with 7/9 predictions validated for the triangle test and 18/18 predictions validated for the zoom-in score.
Camelot's triangle test predicted DHH1 as a gene that actively influences growth in the presence of each of six drugs, including hydrogen peroxide (Figure 3A, B and D). We tested the prediction by measuring the growth yield of wild-type and dhh1Δ strains in hydrogen peroxide. As predicted, the dhh1Δ strain grew better than the wild type, confirming that DHH1 negatively influences drug resistance (Figure 3C). We subsequently validated the role of DHH1 in resistance to three additional drugs (Figure 3D). Although the drugs linked to DHH1 are diverse and include an antibiotic and an antipsychotic drug, they all affect mitochondrial function (Nicolson et al, 1999; Evans et al, 2000; Nulton-Persson and Szweda, 2001; Lee et al, 2005; Lee et al, 2008; Safiulina et al, 2006; Yip et al, 2006; Sancho et al, 2007). Dhh1 post-transcriptionally regulates genes involved in mitochondrial biogenesis (Lee et al, 2009), suggesting that mitochondrial function is important in the response to these drugs.
Figure 3
Causal role of DHH1. (A) Growth yield in the presence of H2O2 compared with model prediction from linkage analysis, elastic-net L model and Camelot, represented as in Figure 1C, demonstrating superior prediction by Camelot. Camelot chose a Chromosome XIII locus (227 254–243 624) and expression of DHH1 as features to predict the drug response; the values for each segregant are represented in the same order within the panel. (B) The full prediction function obtained from Camelot for response to H2O2. DHH1 is selected as a feature and confirmed by the triangle test; the Chromosome XIII marker is selected as a feature and the zoom-in score identifies ERG6 as the causal gene within the region, fitting with reports that overexpression of ERG6 leads to decreased resistance to hydrogen peroxide (Khoury et al, 2008). The Chromosome XIV locus is at position 449 639–486 861. Some notation for all the figures: Green rectangles (such as ERG6) represent expression of a gene within a linked region. (C) Averaged OD600 absorbance growth measurements of BY (red) and BY dhh1Δ mutant (blue) plotted against twofold dilution series of H2O2. The error bars represent the standard error of the mean for all growth yield data. These data confirm the causal effect of DHH1. (D) DHH1 is a hub passing the triangle test for six drugs (left column). Five of these were tested; validated causal effects are in green, with one false positive listed in red. To assess the drug specificity of DHH1-mediated effects, four negative controls were tested (right column); confirmed negative predictions are listed in green and one false negative in red. See Supplementary Figure 2 for drug response curves for each of the drugs tested, as represented in Figure 3C.
The zoom-in score identified GPB2, a gene not previously implicated in the phenotypic differences of the parental strains. We validated Camelot's prediction that GPB2 plays a causal role in the response to three drugs (Figure 5A).
Figure 5
Causal role of GPB2 in response to drugs. (A) Strains were grown overnight in YPD medium, diluted to OD600 0.2 and plated with 10-fold dilution on YPD+drug media (see section Materials and methods). The top three panels are photos of YPD plates containing DMSO (control), E6 berbamine or gliotoxin. The bottom panels are photos of YPD plates containing DMSO or haloperidol. The results show a large difference in drug sensitivity between BY and RM. The AS strain (BY GPB2-RM) grows at a rate similar to the RM strain. (B) Camelot identifies two loci (Chromosome I: 1–55 329 and Chromosome XIII: 27 644–33 681) and causal genes encoded within these loci, GPB2 and PHO84, that are responsible for the response to haloperidol. (C) Analysis shows that GPB2 and PHO84 interact with each other to influence growth in the presence of haloperidol. Shown are the genotypes for PHO84 and GPB2, and growth in the presence of haloperidol. Segregants with both the PHO84-RM and GPB2-BY alleles have significantly better resistance (P-value from Wilcoxon rank-sum test) to haloperidol compared with other segregants.
The zoom-in score also identified PHO84 as a causal variant for multiple drugs. We used the zoom-in score to distinguish which drugs are causally influenced by PHO84 and validated these predictions by growing wild-type BY and the allele-swapped (AS) strain (BY PHO84-RM) in the presence of one of nine drugs. The AS and RM strains behaved as Camelot predicted 9 out of 9 times.
Both PHO84 and GPB2 were identified as causal genes for the variation of growth in haloperidol, and strains carrying both RM-PHO84 and BY-GPB2 grow better than strains with other combinations of the two alleles, suggesting that PHO84 and GPB2 may function through a common pathway (Figure 5B and C). Both genes are involved in the cAMP/PKA pathway, which suggests a possible mechanism of action for haloperidol.
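The comparison between segregants carrying both PHO84-RM and GPB2-BY and all other segregants (Figure 5C) relies on a Wilcoxon rank-sum test. The sketch below shows how such a comparison could be set up. The growth values are invented for illustration and are not the study's data, the helper names are hypothetical, and the P-value uses the normal approximation without a tie correction, which is an assumption rather than the authors' stated procedure.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test using the normal approximation."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n_a])  # rank sum of group_a
    mean_w = n_a * (n_a + n_b + 1) / 2
    sd_w = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (w - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical segregants: (PHO84 allele, GPB2 allele, growth in haloperidol)
segregants = [
    ("RM", "BY", 0.92), ("RM", "BY", 0.88), ("RM", "BY", 0.95),
    ("RM", "BY", 0.85), ("RM", "BY", 0.90),
    ("RM", "RM", 0.55), ("RM", "RM", 0.60), ("RM", "RM", 0.48),
    ("BY", "BY", 0.52), ("BY", "BY", 0.58), ("BY", "BY", 0.45),
    ("BY", "RM", 0.40), ("BY", "RM", 0.50), ("BY", "RM", 0.62),
    ("BY", "RM", 0.47),
]

# Split by allele combination: PHO84-RM with GPB2-BY versus everything else
favoured = [g for pho84, gpb2, g in segregants if (pho84, gpb2) == ("RM", "BY")]
others = [g for pho84, gpb2, g in segregants if (pho84, gpb2) != ("RM", "BY")]

z, p_value = rank_sum_test(favoured, others)
```

A positive `z` with a small `p_value` indicates that the PHO84-RM/GPB2-BY group tends to grow better than the remaining segregants in this invented data set. With real measurements, a library routine such as SciPy's `scipy.stats.ranksums` would be the more usual choice.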
To better understand why gene expression helps us identify causal variants, we monitored PHO84 abundance in BY, RM and AS (BY PHO84-RM) strains. Although the AS strain contains cis- and trans-regulatory factors from BY, the presence of the RM coding region brought the expression of PHO84 down to that of the RM strain. It is likely that the difference in expression results from negative feedback that acts through the Pho84 protein (Wykoff et al, 2007). Additional data suggest that negative feedback is stronger in the RM and AS strains. We propose that variation in gene expression serves as an indicator of the variation in protein function, and that the difference in protein function is responsible for the observed differences in drug sensitivity between strains.
In conclusion, we systematically demonstrate Camelot's performance in predicting phenotypes and in identifying genes responsible for the variation in the growth in the presence of drug. It is intriguing that a gene expression profile measured in the absence of drugs empowers the prediction of traits under novel conditions (+drugs). Camelot provides another step towards the realization of personalized medicine and highlights the power to be gained by exploiting gene expression data for this application.
Acknowledgements
This research was supported by the National Institutes of Health Roadmap Initiative, NIH Director's New Innovator Award Program, through Grant number 1-DP2-OD002414-01 and National Centers for Biomedical Computing Grant 1U54CA121852-01A1. DP holds a Career Award at the Scientific Interface from the Burroughs Wellcome Fund. NLG is supported by NIH G12 RR003037-24-2245476. We thank Ron Davis for the kind gift of YAD350 and Fred Winston for FY1333. We also wish to thank Oren Litvin, Itsik Pe'er, Aviv Regev, Eran Segal, Olga Troyanskaya, Lyle Ungar and Dennis Wykoff for valuable comments.
Author contributions: BJC, HCC, NLG and DP designed research; BJC and DP designed the Camelot method; BJC implemented the Camelot method; BJC, HCC and DP analysed the data; EOP performed the drug validation for DHH1, PHO84 and MKT1; DM and HCC constructed the GPB2 allele swap; BJC and HCC performed all experiments related to PHO84 feedback and carried out the drug validation for GPB2; and BJC, HCC and DP wrote the paper.
References
- Brem RB, Kruglyak L (2005) The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci USA 102: 1572–1577
- Evans GB, Furneaux RH, Gainsford GJ, Murphy MP (2000) The synthesis and antibacterial activity of totarol derivatives. Part 3: modification of ring-B. Bioorg Med Chem 8: 1663–1675
- Gabriel SB, Salomon R, Pelet A, Angrist M, Amiel J, Fornage M, Attie-Bitach T, Olson JM, Hofstra R, Buys C, Steffann J, Munnich A, Lyonnet S, Chakravarti A (2002) Segregation at three loci explains familial and population risk in Hirschsprung disease. Nat Genet 31: 89–93
- Khoury CM, Yang Z, Li XY, Vignali M, Fields S, Greenwood MT (2008) A TSC22-like motif defines a novel antiapoptotic protein family. FEMS Yeast Res 8: 540–563
- Lee CS, Park SY, Ko HH, Song JH, Shin YK, Han ES (2005) Inhibition of MPP+-induced mitochondrial damage and cell death by trifluoperazine and W-7 in PC12 cells. Neurochem Int 46: 169–178
- Lee IH, Kim HY, Kim M, Hahn JS, Paik SR (2008) Dequalinium-induced cell death of yeast expressing alpha-synuclein–GFP fusion protein. Neurochem Res 33: 1393–1400
- Lee S-I, Dudley AM, Drubin D, Silver PA, Krogan NJ, Pe'er D, Koller D (2009) Learning a prior on regulatory potential from eQTL data. PLoS Genet 5: e1000358
- Maller J, George S, Purcell S, Fagerness J, Altshuler D, Daly MJ, Seddon JM (2006) Common variation in three genes, including a noncoding variant in CFH, strongly influences risk of age-related macular degeneration. Nat Genet 38: 1055–1059
- Nicolson K, Evans G, O'Toole PW (1999) Potentiation of methicillin activity against methicillin-resistant Staphylococcus aureus by diterpenes. FEMS Microbiol Lett 179: 233–239
- Nulton-Persson AC, Szweda LI (2001) Modulation of mitochondrial function by hydrogen peroxide. J Biol Chem 276: 23357–23361
- Perlstein EO, Ruderfer DM, Roberts DC, Schreiber SL, Kruglyak L (2007) Genetic basis of individual differences in the response to small-molecule drugs in yeast. Nat Genet 39: 496–502
- Safiulina D, Veksler V, Zharkovsky A, Kaasik A (2006) Loss of mitochondrial membrane potential is associated with increase in mitochondrial volume: physiological role in neurones. J Cell Physiol 206: 347–353
- Sancho P, Galeano E, Nieto E, Delgado MD, Garcia-Perez AI (2007) Dequalinium induces cell death in human leukemia cells by early mitochondrial alterations which enhance ROS production. Leuk Res 31: 969–978
- Wykoff DD, Rizvi AH, Raser JM, Margolin B, O'Shea EK (2007) Positive feedback regulates switching of phosphate transporters in S. cerevisiae. Mol Cell 27: 1005–1013
- Yip KW, Mao X, Au PY, Hedley DW, Chow S, Dalili S, Mocanu JD, Bastianutto C, Schimmer A, Liu FF (2006) Benzethonium chloride: a novel anticancer agent identified by using a cell-based small-molecule screen. Clin Cancer Res 12: 5557–5569


