The liver pharmacological and xenobiotic gene response repertoire
Georges Natsoulis1,a, Cecelia I Pearson1, Jeremy Gollub1, Barrett P Eynon1, Joe Ferng1, Ramesh Nair1, Radha Idury1, May D Lee1,a, Mark R Fielden1,a, Richard J Brennan1,a, Alan H Roter1 & Kurt Jarnagin1
1 Iconix Biosciences (now Entelos), Foster City, CA, USA
Correspondence to: Stanford Genome Technology Center, 855 California Avenue, Palo Alto, CA 94304, USA. Tel.: +510 260 4973; Email: natsoulis@gmail.com or Email: georgesn@stanford.edu
Received 28 August 2007; Accepted 23 January 2008; Published online 25 March 2008
aPresent address: Stanford Genome Technology Center, Palo Alto, CA, USA
aPresent address: Limerick NeuroSciences, South San Francisco, CA, USA
aPresent address: Roche Palo Alto LLC, Palo Alto, CA, USA
aPresent address: GeneGo, St Joseph, MI, USA
The array data used in this study have been deposited in the Gene Expression Omnibus (GEO accession GSE8858).
Article highlights
- Systematic exploration of a large microarray database derived from compound-treated rats identifies 34 signatures for pharmacological and toxicological endpoints.
- As few as 200 genes are sufficient to classify all resolvable endpoints.
- Many different signatures composed of non-overlapping gene sets can be identified for a given phenotype.
- Analyzing these signatures as a group can be used to derive a model of the underlying biology, as exemplified here for liver fibrosis.
Synopsis
We have used a supervised classification approach (El Ghaoui et al, 2003; Natsoulis et al, 2005) to systematically mine a large microarray database derived from livers of compound-treated rats (Ganter et al, 2005). More than 5000 rats were treated with 344 compounds in multiple doses, for multiple time points and in biological triplicate. Extensive blood chemistry and histopathology were performed in parallel on the same animals. In total, 34 distinct gene expression-based signatures (classifiers) for pharmacological and toxicological end points were identified. While it is intuitively obvious that 'more data' is better, we show that our signatures have a better overall classification performance than many diagnostic tests in widespread use such as prostate-specific antigen, pap smear, Ames test and others. Moreover, deriving our signatures from such a large database ensures that the cross-validated classification performance reported here is more predictive of the forward validation results obtained on future data sets.
Analysis of the genes present in the 34 unique signatures reveals that some genes contribute disproportionately to the overall classification potential (i.e. they appear in many different signatures). These genes tend to be enriched in xenobiotic and acute phase response genes as well as unannotated genes, indicating that not all key genes in the liver xenobiotic responses have been characterized. Just 200 genes are sufficient to classify all end points and can form the basis of a small toxicogenomic array. This last observation has been used to create a high-information-content array with a reduced gene number and a software tool useful for preclinical toxicology and pharmacology (www.ToxFX.com).
We show that signature genes are not appreciably enriched in genes showing large amplitude of regulation or high levels of expression; we also show that aggressive gene pre-selection by amplitude of expression change or statistical significance reduces classifier quality. Our approach also identifies examples of very different signatures for a single end point. Similar results have been reported before and have often been regarded as problematic for the studies themselves or for the field in general (Michiels et al, 2005). Not only is this possible, but we also describe here a method to identify all the genes needed to define a classifier for a given phenotype at a chosen quality threshold (the necessary gene set, or NGS). Briefly, we derive a first signature for a given end point and strip from the data set all genes appearing in that signature. We then repeat the process on the remaining genes until no valid classifier is obtained. The union of all sets of stripped genes forms the NGS. It is a naturally ranked gene list, and analyzing it for GO term enrichment can reveal which pathways are most characteristic of a given phenotype.
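The stripping procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the classifier used in the study: `derive_signature` is a hypothetical stand-in that simply keeps the genes with the largest difference in class means, `loo_accuracy` is an assumed quality measure (leave-one-out accuracy of a nearest-centroid rule), and the toy expression values, the signature size and the 0.9 threshold are all invented for the example.

```python
from statistics import mean


def class_gap(values, labels):
    """Absolute difference between the two class means for one gene."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return abs(mean(pos) - mean(neg))


def derive_signature(expr, labels, genes, size):
    """Toy stand-in for a sparse classifier (assumption): keep the `size`
    genes with the largest class gap."""
    ranked = sorted(genes, key=lambda g: class_gap(expr[g], labels), reverse=True)
    return ranked[:size]


def loo_accuracy(expr, labels, signature):
    """Leave-one-out accuracy of a nearest-centroid rule on `signature`.
    Gene selection itself is not re-run per fold; this is a simplification."""
    n = len(labels)
    correct = 0
    for i in range(n):
        dist = {}
        for cls in (0, 1):
            idx = [j for j in range(n) if j != i and labels[j] == cls]
            dist[cls] = sum(
                (expr[g][i] - mean(expr[g][j] for j in idx)) ** 2
                for g in signature
            )
        correct += int(min(dist, key=dist.get) == labels[i])
    return correct / n


def necessary_gene_set(expr, labels, size=2, threshold=0.9):
    """Derive a signature, strip its genes, and repeat until the best
    remaining signature falls below `threshold`. Returns one gene list per
    cycle, so the result is naturally ranked by cycle."""
    remaining = set(expr)
    cycles = []
    while remaining:
        signature = derive_signature(expr, labels, sorted(remaining), size)
        if loo_accuracy(expr, labels, signature) < threshold:
            break
        cycles.append(signature)
        remaining -= set(signature)
    return cycles


# Invented toy data: six samples, four informative genes, two noise genes.
LABELS = [0, 0, 0, 1, 1, 1]
EXPR = {
    "g1": [0.0, 0.1, 0.2, 1.0, 1.1, 1.2],
    "g2": [0.2, 0.0, 0.1, 0.9, 1.0, 1.1],
    "g3": [0.0, 0.2, 0.1, 0.8, 0.9, 1.0],
    "g4": [0.1, 0.0, 0.2, 0.8, 0.7, 0.9],
    "n1": [0.5, 0.1, 0.9, 0.4, 0.2, 0.8],
    "n2": [0.3, 0.7, 0.2, 0.6, 0.1, 0.5],
}

if __name__ == "__main__":
    for cycle, genes in enumerate(necessary_gene_set(EXPR, LABELS), start=1):
        print(cycle, genes)
```

In this toy run, two cycles yield valid signatures before only the noise genes remain; the union of the per-cycle gene lists plays the role of the NGS, and the cycle index provides the ranking.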
Taking the NGS into consideration is more informative biologically than any individual signature considered in isolation. We illustrate the potential of this analysis using a signature for liver fibrosis. The 1380 unique probes forming the fibrosis NGS set are statistically enriched (P-value <0.05) in GO terms such as cell–matrix adhesion, amino-acid transporter activity, fatty acid biosynthetic process, cellular defense response, chemokine activity, organic anion transporter activity, sulfate transport, positive regulation of transcription and carbohydrate transport, most of which are affected during injury and subsequent fibrosis and bile duct hyperplasia. Other terms such as serotonin receptor activity, sensory perception and brain development were also enriched, indicating that local innervation and paracrine regulation of liver functions are remodeled during fibrosis. Many of these enrichments are not observed until the later signature cycles (cell–matrix adhesion and serotonin transporter activity, for example) and could be missed with more conventional methods of analysis. Downregulation of a number of liver-specific genes may signal a loss of function of and/or an actual loss of the major parenchymal cells in the liver, hepatocytes, which comprise 80% of the normal liver cell population. Genes that are preferentially downregulated include those that are involved in amino-acid metabolism, organic anion and amino-acid transport and metabolism, and several sulfotransferases and cytochrome P450s. Genes that are preferentially upregulated in contrast include those involved in cell adhesion, cytoskeleton organization, cell–cell signaling, proliferation, xenobiotic metabolism and the immune response. Molecules that are upregulated induce or promote cell and remodeling of the actin cytoskeleton. Many of the upregulated genes (PDGF and endothelin 1, for instance) are expressed in rare cell types that are activated during liver injury and fibrosis, such as HSCs and Kupffer cells (Figure 5B).
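To make the notion of GO term enrichment concrete, the sketch below computes a one-sided hypergeometric tail probability for a single term. The text does not state which statistical test was used, so the hypergeometric test is an assumption here, as are the function name `hypergeom_pvalue` and the counts in the example; no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_pvalue(population, annotated, drawn, hits):
    """P(X >= hits) when `drawn` genes are sampled without replacement from
    `population` genes, `annotated` of which carry the GO term."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Invented counts: 20 genes on the array, 5 annotated with the term,
    # 5 genes in the gene set, 3 of which carry the term.
    p = hypergeom_pvalue(population=20, annotated=5, drawn=5, hits=3)
    print(f"p = {p:.4f}")
```

Applied per cycle or to the cumulative gene list, a test of this kind would show at which signature cycle a term first becomes enriched.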
This analysis of the liver fibrosis NGS suggests a model in which xenobiotic insult leads to loss of certain gene expression, apparently secondary to hepatocyte cell death through necrosis and apoptosis, and leads to the upregulation of weakly expressed genes, probably due to activation and expansion of less abundant cell types, such as HSCs and Kupffer cells.
This study illustrates that a comprehensive approach can distill a complex and broad issue into a definable set of answers, increase our knowledge and yield useful signatures and diagnostics.
On a more fundamental level, the systematic exploration of such a large homogeneous xenobiotic response data set reveals relationships between pathological end points, groups of genes capable of resolving these end points and the classes of drugs inducing these pathologies. This type of approach has the potential to elucidate the multiple modes of action, the frequent cross-talk of many classes of drugs and the convergent pathways leading to common pathologies. A study aiming at a similar goal, albeit supported by a much smaller in vitro data set, has recently received considerable attention (Lamb et al, 2006). We believe the scope of our data set and the novel methods used to analyze it will prove critical in achieving the goal of mapping the system of biological changes that underlie liver biology, toxicology and pharmacology.
References
- El Ghaoui L, Lanckriet GRG, Natsoulis G (2003) Robust classifiers with interval data. Report no. UCB/CSD-03-1279 Computer Science Division (EECS), University of California, Berkeley, CA
- Ganter B, Tugendreich S, Pearson CI, Ayanoglu E, Baumhueter S, Bostian KA, Brady L, Browne LJ, Calvin JT, Day GJ, Breckenridge N, Dunlea S, Eynon BP, Furness LM, Ferng J, Fielden MR, Fujimoto SY, Gong L, Hu C, Idury R et al (2005) Development of a large-scale chemogenomics database to improve drug candidate selection and to understand mechanisms of chemical toxicity and action. J Biotechnol 119: 219–244
- Lamb J, Crawford E, Peck D, Modell J, Blat I, Wrobel M, Lerner J, Brunet J, Subramanian A, Ross K (2006) The connectivity map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313: 1929
- Michiels S, Koscielny S, Hill C (2005) Prediction of cancer outcome with microarrays: a multiple random validation strategy. Lancet 365: 488–492
- Natsoulis G, El Ghaoui L, Lanckriet GR, Tolley AM, Leroy F, Dunlea S, Eynon BP, Pearson CI, Tugendreich S, Jarnagin K (2005) Classification of a large microarray data set: algorithm comparison and analysis of drug signatures. Genome Res 15: 724–736


