Synopsis

Subject Categories: Functional genomics | Computational methods

Molecular Systems Biology 2 Article number: 2006.0029  doi:10.1038/msb4100067
Published online: 6 June 2006
Citation: Molecular Systems Biology 2:2006.0029

Adaptively inferring human transcriptional subnetworks

There is a News and Views associated with this document.

Debopriya Das1,a, Zaher Nahlé2 & Michael Q Zhang1

  1. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
  2. Department of Internal Medicine, Center for Human Nutrition, Washington University in St Louis, St Louis, MO, USA

Correspondence to: Michael Q Zhang1, Cold Spring Harbor Laboratory, 1 Bungtown Road, Hershey Building, Cold Spring Harbor, New York, NY 11274, USA. Tel.: +1 516 367 8393; Fax: +1 516 367 8461; E-mail: mzhang@cshl.edu

Received 23 September 2005; Accepted 28 March 2006; Published online 6 June 2006

aPresent address: Life Sciences Division, Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

Article highlights

  • This article presents an unsupervised computational approach to determine active mammalian transcriptional subnetworks.
  • By avoiding the need to cluster gene expression profiles, this approach naturally allows modeling of condition-specific regulation, in which the same transcription factor regulates different genes under different conditions.
  • Our predictive framework leads to significantly improved determination of direct targets of transcription factors, as opposed to their indirect targets.
  • Examples on tissue-restricted (human liver) and temporal (human cell-cycle) regulation data clearly demonstrate the strength of this method in identifying novel transcriptional subnetworks.

Synopsis

The importance of achieving an accurate quantitative understanding of gene regulation in humans can hardly be overstated. Deregulation of gene expression is a recurring theme in the development and progression of several diseases, including cancer. The emergence of new experimental platforms that probe transcription globally promises a comprehensive view of these fundamental biological processes in a large number of mammalian systems about whose transcriptional regulation very little is known. By integrating expression profiles with genomic sequence information computationally, it is now possible to obtain a snapshot of the active transcriptional subnetworks in lower eukaryotes with reasonable accuracy (Das et al, 2004; Wang et al, 2005). We define a transcriptional subnetwork as the set of transcription factors (TFs), as represented by the combinations of cognate cis-regulatory motifs, their target genes and the physiological processes they regulate (Figure 2). The generalization of such approaches to mammals remains challenging, however (see, e.g., Figure 1 in Tompa et al, 2005). This is due to multiple factors, including the enhanced degeneracy of TF binding sites, the significantly elevated role of interactions between TFs in promoter recognition and the multicellular architecture of mammals. Current computational methods, which are primarily clustering-based, do not adequately address these complicating factors. Moreover, many genes do not cluster tightly enough for their regulatory motifs to be discovered reliably. There is also marked subjectivity in how targets are determined. This work presents a minimally biased approach, motivated by the switch-like behavior of transcriptional response, that overcomes the aforementioned limitations. It identifies potentially active motif combinations in proximal promoters by examining their correlation with mRNA expression levels across the genes. In this approach, both the active motif combinations and their target genes are learnt directly from the expression data in a condition-specific manner, and thus, adaptively. We demonstrate that this method can systematically infer transcriptional subnetworks in mammals from expression data with accuracy similar to that obtained for lower eukaryotes.
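To make the idea of a switch-like response concrete, the following minimal sketch fits expression to a single motif score through a hinge term that is zero below a threshold and linear above it, and picks the threshold that explains the most variance. It is an illustration under stated assumptions, not the authors' implementation: the hinge form, the grid search over candidate thresholds, the function names and the toy data are all hypothetical, and a single motif stands in for the motif combinations described above.

```python
def hinge(score, threshold):
    """Switch-like response: zero below the threshold, linear above it."""
    return max(0.0, score - threshold)


def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0.0 or syy == 0.0:
        return mean_y, 0.0, 0.0  # no variation to explain
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, 1.0 - sse / syy


def learn_threshold(scores, expression, candidates):
    """Pick the candidate threshold whose hinge term best explains expression."""
    best = None
    for t in candidates:
        x = [hinge(s, t) for s in scores]
        a, b, r2 = fit_line(x, expression)
        if best is None or r2 > best[3]:
            best = (t, a, b, r2)
    return best


# Toy data (invented): expression switches on once the motif score exceeds 0.5.
scores = [0.0, 0.2, 0.4, 0.5, 0.6, 0.8, 1.0]
expression = [2.0 * hinge(s, 0.5) for s in scores]

threshold, intercept, slope, r2 = learn_threshold(
    scores, expression, candidates=[0.1, 0.3, 0.5, 0.7]
)
# Genes whose motif score lies above the learnt threshold are the predicted targets.
targets = [i for i, s in enumerate(scores) if s > threshold]
```

Because the threshold is chosen from the expression data rather than fixed in advance, the same motif can yield a different target list for each expression profile it is fitted to.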

Figure 2: Schematic representation of our analysis. A snapshot of the tissue-specific transcriptional subnetworks discovered from microarray data on adult human liver under a normal condition.

We applied our algorithm to the expression profile of adult human liver measured under a normal condition (Su et al, 2004) and discovered three functional liver-specific motif combinations. The inferred model was used to obtain their target genes; a Gene Ontology enrichment analysis of these targets then revealed the over-represented biological pathways, yielding the transcriptional subnetworks active in the profiled sample (Figure 2). HNF-1, the pleiotropic regulator of liver-specific genes, is among the three liver-specific combinations. We observe, a posteriori, that >70% of our predicted HNF-1 targets have been previously validated in biochemical assays. The other two liver-specific combinations are novel: one regulates sugar metabolism pathways, and the other regulates lipid transport and metabolism. This approach has certain other advantages. For instance, we are able to identify the mRNA mixing effects present in tissue samples derived from a whole organ such as the liver. Additionally, we notice that several targets achieve their maximum expression in a tissue different from the one in which the motif combination has its maximal regulatory effect. This suggests that genes are coregulated across multiple tissues, as one would expect in a synexpression group (Niehrs and Pollet, 1999). A distinct advantage of our method is that expression profiles from only a few conditions are necessary to reach this conclusion.
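An enrichment analysis of this kind is commonly scored with a one-sided hypergeometric test, which asks how surprising it is to see a given number of genes from one Gene Ontology category among the predicted targets. The sketch below assumes that test; the text does not state which statistic was used, and the function name and the numbers in the usage example are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric tail probability P(X >= hits).

    population: number of genes considered in total
    annotated:  genes in the population carrying the category annotation
    sample:     number of predicted target genes
    hits:       predicted targets carrying the annotation
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 8 of 50 predicted targets fall in a category
# that annotates 200 of 20000 genes overall.
p = enrichment_p_value(population=20000, annotated=200, sample=50, hits=8)
```

In practice one p-value is computed per category, so some correction for multiple testing would be needed before calling a pathway over-represented.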

TFs regulate genes in a condition-specific manner. Hence, a particular TF can activate different sets of genes under different conditions (Zhu et al, 2005a). Application to human cell-cycle data (Whitfield et al, 2002) revealed that this technique can model such condition-specific gene regulation. Specifically, the predicted targets of E2F in the G1/S and G2/M phases are significantly different, as one would expect biologically (Zhu et al, 2005a). This is a natural outcome of the fact that the TF binding thresholds are learnt directly from response data in our approach. Many identified targets have been previously characterized as direct E2F targets, and the novel targets display strong E2F binding characteristics. We experimentally examined two G2/M-specific novel targets involved in hepatocellular carcinomas, CDC16 and DLG7, using an inducible E2F system. Our experiments confirmed that DLG7 and CDC16 are indeed direct E2F targets, in accordance with our computational predictions. The role of E2F as a G2/M activator has been demonstrated only recently. Our findings reaffirm and extend this hypothesis. Furthermore, a strong correlation between mitotic control and tumor progression in liver has been previously noted (Tsou et al, 2003). Intriguingly, our study suggests that part of this control may be exerted by E2F.
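The effect of condition-specific thresholds on target lists can be illustrated with a small sketch that derives one target set per condition and measures their overlap. Everything here is invented for illustration: the gene names, motif scores and threshold values are hypothetical, and in the approach described above the thresholds would be learnt from expression data rather than typed in.

```python
def targets(motif_scores, threshold):
    """Genes whose motif score lies above the condition's threshold."""
    return {gene for gene, score in motif_scores.items() if score > threshold}


def jaccard(a, b):
    """Overlap between two target sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical motif scores for one TF across four promoters.
scores = {"geneA": 0.9, "geneB": 0.7, "geneC": 0.55, "geneD": 0.3}

# Hypothetical thresholds; one per condition.
thresholds = {"G1/S": 0.6, "G2/M": 0.4}

by_phase = {phase: targets(scores, t) for phase, t in thresholds.items()}
overlap = jaccard(by_phase["G1/S"], by_phase["G2/M"])
```

A low overlap between phases would indicate that the same TF is predicted to regulate largely different genes in each condition.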

One of the major challenges in modern biology is to comprehensively decipher the regulatory network architecture within humans, both for the advancement of fundamental understanding and for the development of novel therapeutics. We have presented an unsupervised algorithm to achieve this goal. It directly accounts for degeneracy and interactions among cis-regulatory elements, the two key impediments to modeling mammalian transcription, and leads to concrete testable hypotheses for transcriptional end points of pathways that are active under any specific biological condition. We find that, on average, the predicted models can explain approximately 24% of the variation in expression data for tissue-restricted profiles and approximately 21% for temporal profiles, comparable to the values achieved in lower eukaryotes (Das et al, 2004). In addition, as our study clearly demonstrates, correlation with expression is a promising way to identify direct TF targets. By contrast, it is quite challenging to discriminate direct targets from indirect ones using previous approaches (Kirmizis and Farnham, 2004). Furthermore, our method for target determination does not depend on the arbitrary fold cutoffs invoked by many comparable methods. Consequently, we can also detect bona fide targets that undergo very subtle changes in expression. Autoregulatory loops, prevalent in biological networks, can also be easily discovered. The algorithm is equally applicable to both expression and ChIP-chip data. In summary, we think this approach will help accelerate systematic understanding of how phenotypic complexity is regulated at the transcription level in a wide range of mammalian systems and create scope for novel therapeutic interventions.

Acknowledgements

We thank Terri Pietka and Mike Hsieh for excellent technical assistance, X Shirley Liu for providing the updated version of MDscan, Kristian Helin for E2F-1 expression vectors and Joe W Gray, Josh Huang, Matteo Pellegrini, Nilanjana Banerjee, Nada Abumrad, Fang Zhao, Dustin Schones, Gengxin Chen, Andrew Smith, Aaron Boudreau and Zhenyu Xuan for helpful discussions. This work was supported by NIH grant HG001696 (MQZ), CSHL Association Fellowship (DD) and a grant from the Philip Morris USA External Research Program (ZN).

References

  1. Das D, Banerjee N, Zhang MQ (2004) Interacting models of cooperative gene regulation. Proc Natl Acad Sci USA 101: 16234–16239
  2. Kirmizis A, Farnham PJ (2004) Genomic approaches that aid in the identification of transcription factor target genes. Exp Biol Med (Maywood) 229: 705–721
  3. Niehrs C, Pollet N (1999) Synexpression groups in eukaryotes. Nature 402: 483–487
  4. Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G, Cooke MP, Walker JR, Hogenesch JB (2004) A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci USA 101: 6062–6067
  5. Tompa M, Li N, Bailey TL, Church GM, De Moor B, Eskin E, Favorov AV, Frith MC, Fu Y, Kent WJ, Makeev VJ, Mironov AA, Noble WS, Pavesi G, Pesole G, Regnier M, Simonis N, Sinha S, Thijs G, van Helden J, Vandenbogaert M, Weng Z, Workman C, Ye C, Zhu Z (2005) Assessing computational tools for the discovery of transcription factor binding sites. Nat Biotechnol 23: 137–144
  6. Tsou AP, Yang CW, Huang CY, Yu RC, Lee YC, Chang CW, Chen BR, Chung YF, Fann MJ, Chi CW, Chiu JH, Chou CK (2003) Identification of a novel cell cycle regulated gene, HURP, overexpressed in human hepatocellular carcinoma. Oncogene 22: 298–307
  7. Wang W, Cherry JM, Nochomovitz Y, Jolly E, Botstein D, Li H (2005) Inference of combinatorial regulation in yeast transcriptional networks: a case study of sporulation. Proc Natl Acad Sci USA 102: 1998–2003
  8. Whitfield ML, Sherlock G, Saldanha AJ, Murray JI, Ball CA, Alexander KE, Matese JC, Perou CM, Hurt MM, Brown PO, Botstein D (2002) Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol Biol Cell 13: 1977–2000
  9. Zhu W, Giangrande PH, Nevins JR (2005a) Temporal control of cell cycle gene expression mediated by E2F transcription factors. Cell Cycle 4: 633–636
