Target analysis by integration of transcriptome and ChIP-seq data with BETA

Abstract

The combination of ChIP-seq and transcriptome analysis is a compelling approach to unravel the regulation of gene expression. Several recently published methods combine transcription factor (TF) binding and gene expression for target prediction, but few of them provide an efficient software package for the community. Binding and expression target analysis (BETA) is a software package that integrates ChIP-seq of TFs or chromatin regulators with differential gene expression data to infer direct target genes. BETA has three functions: (i) to predict whether the factor has activating or repressive function; (ii) to infer the factor's target genes; and (iii) to identify the motif of the factor and its collaborators, which might modulate the factor's activating or repressive function. Here we describe the implementation and features of BETA and demonstrate its application to several data sets. BETA requires 1 GB of RAM, and the procedure takes 20 min to complete. BETA is available as open-source software at http://cistrome.org/BETA/.
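To make function (ii) concrete, the following is a minimal sketch of how binding and differential expression evidence might be combined to rank candidate target genes. It is not the BETA implementation. The abstract does not specify a scoring scheme, so the exponential distance-decay weight, the 100 kb window, the rank-product combination, and all function and variable names here are illustrative assumptions. For simplicity, all genes and peaks are assumed to lie on a single chromosome.

```python
import math


def regulatory_potential(tss, peak_positions, max_dist=100_000):
    """Sum distance-weighted contributions of peaks near a gene's TSS.

    Assumption: each peak within max_dist contributes
    exp(-(0.5 + 4 * d / max_dist)), so nearer peaks count more.
    """
    score = 0.0
    for pos in peak_positions:
        dist = abs(pos - tss)
        if dist <= max_dist:
            score += math.exp(-(0.5 + 4.0 * dist / max_dist))
    return score


def rank_descending(values):
    """Map each key to its 1-based rank, largest value first."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {key: i + 1 for i, key in enumerate(ordered)}


def rank_targets(tss_by_gene, peak_positions, diff_score_by_gene):
    """Order genes by the product of their binding and expression ranks.

    diff_score_by_gene holds a signed differential expression statistic
    per gene (hypothetical input); only its magnitude is used here.
    """
    potentials = {
        gene: regulatory_potential(tss, peak_positions)
        for gene, tss in tss_by_gene.items()
    }
    binding_rank = rank_descending(potentials)
    expr_rank = rank_descending(
        {gene: abs(score) for gene, score in diff_score_by_gene.items()}
    )
    n = len(tss_by_gene)
    rank_product = {
        gene: (binding_rank[gene] / n) * (expr_rank[gene] / n)
        for gene in tss_by_gene
    }
    # A smaller rank product means stronger combined evidence.
    return sorted(rank_product, key=rank_product.get)


if __name__ == "__main__":
    tss_by_gene = {"A": 1_000, "B": 500_000, "C": 50_000}
    peaks = [1_200, 1_500, 60_000]
    diff_scores = {"A": 3.0, "B": 0.1, "C": -1.0}
    print(rank_targets(tss_by_gene, peaks, diff_scores))
```

In this toy input, gene A has two peaks close to its TSS and the largest expression change, so it ranks first; gene B has no peak within the window and ranks last. A real analysis would also need per-chromosome handling and a significance estimate, both omitted here.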

Figure 1: BETA workflow.
Figure 2: BETA output of activating/repressive function prediction and motif analysis of AR.
Figure 3: Activating and repressive function prediction of Tet1 in mouse ES cells and ESR1 in MCF-7 cells.
Figure 4: Screenshots of summarized BETA-plus analysis of ESR1 motifs in html format.

Accession codes

Gene Expression Omnibus

Acknowledgements

This project was supported by the National Basic Research (973) Program of China (2010CB944904), the National Natural Science Foundation of China (31329003) and the US National Institutes of Health (HG4069 and U41 HG007000).

Author information

Contributions

S.W., C.A.M., Q.T. and X.S.L. designed the method; S.W., H.S., J.M., C.W., C.Z., J.W. and X.S.L. implemented the algorithm; S.W. performed the data analysis; and S.W. and X.S.L. wrote the initial manuscript. All authors contributed to the discussion and writing of the final manuscript.

Corresponding authors

Correspondence to Yong Zhang or X Shirley Liu.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

About this article

Cite this article

Wang, S., Sun, H., Ma, J. et al. Target analysis by integration of transcriptome and ChIP-seq data with BETA. Nat Protoc 8, 2502–2515 (2013). https://doi.org/10.1038/nprot.2013.150

  • DOI: https://doi.org/10.1038/nprot.2013.150
