Abstract
Identification of proteins by tandem mass spectrometry requires a reference protein database, but these are only available for model species. Here we demonstrate that, for a non-model species, the sequencing of expressed mRNA can generate a protein database for mass spectrometry–based identification. This combination of high-throughput sequencing and protein identification technologies allows detection of genes and proteins. We use human cells infected with human adenovirus as a complex and dynamic model to demonstrate the robustness of this approach. Our proteomics informed by transcriptomics (PIT) technique identifies >99% of over 3,700 distinct proteins identified using traditional analysis that relies on comprehensive human and adenovirus protein lists. We show that this approach can also be used to highlight genes and proteins undergoing dynamic changes in post-transcriptional protein stability.
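The database-building idea in the abstract (assembled transcripts translated into candidate proteins that a mass spectrometry search engine can use) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which the supplementary listing indicates relied on Trinity and getorf. The function names, the `min_len` threshold and the FASTA header layout are all assumptions made for this example, and an open reading frame is simplified here to any stop-free stretch in one of the six reading frames.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_orfs(transcript, min_len=30):
    """Yield (strand, frame, protein) for stop-free stretches of >= min_len residues.

    min_len is a hypothetical cut-off chosen for illustration only.
    """
    seq = transcript.upper()
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for segment in translate(strand_seq[frame:]).split("*"):
                if len(segment) >= min_len:
                    yield strand, frame, segment


def protein_database(transcripts, min_len=30):
    """Build FASTA text from a {name: sequence} dict of assembled transcripts."""
    records = []
    for name, seq in transcripts.items():
        orfs = six_frame_orfs(seq, min_len)
        for n, (strand, frame, protein) in enumerate(orfs, 1):
            header = f">{name}_orf{n} strand={strand} frame={frame}"
            records.append(f"{header}\n{protein}")
    return "\n".join(records)
```

Calling `protein_database({"t1": "ATGGCCAAATGA"}, min_len=3)` returns one FASTA record per reading frame for this toy transcript. In a real analysis the resulting file would be handed to a peptide search engine in place of a curated species-specific protein list.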
Accessions
Primary accessions
ArrayExpress
Referenced accessions
NCBI Reference Sequence
Acknowledgements
We thank P. Kellam and A. Palser for help and advice throughout and C. Trapnell, J. Goeks, J. Jackson, P. Cingolani, B. Haas, T. Wu and J. Robinson for informative and helpful discussions by email. We especially thank I. Goodfellow for discussions on using proteomic data from baby hamster kidney and CHO cells. We are grateful to R.T. Hay (University of Dundee) and J. Blenis (Harvard Medical School) for antibodies to DBP and POLDIP3, respectively. In addition, we thank J. Blenis for the HA-tagged POLDIP3 expression plasmid and K. Leppard (University of Warwick) for the dl366 adenovirus. We also thank the members of the University of Bristol Transcriptomics facility (especially J. Coghill) and University of Bristol Wolfson Bioimaging facility for their help. D.A.M. and V.C.E. receive funding from the Wellcome Trust (grant no. 083604). C.B. and J.F. receive funding from the Biotechnology and Biological Sciences Research Council (grant BB/I00095X/1).
Author information
Affiliations
School of Cellular and Molecular Medicine, University of Bristol, Bristol, UK.
- Vanessa C Evans
- David A Matthews
School of Biological Sciences, University of Bristol, Bristol, UK.
- Gary Barker
School of Biochemistry, University of Bristol, Bristol, UK.
- Kate J Heesom
Bioinformatics Group, Cranfield Health, Cranfield University, Cranfield, Bedfordshire, UK.
- Jun Fan
- Conrad Bessant
Contributions
V.C.E. cowrote the manuscript, prepared infected cells, performed western blots and assisted with immunofluorescence. G.B. cowrote the manuscript, wrote software and assisted with handling the RNA-seq data. K.J.H. performed the mass spectrometry and assisted with analysis of the MS/MS data. J.F. helped with the BLAST analysis and wrote some of the BLAST search software. C.B. cowrote the manuscript and assisted with the MS/MS analysis and BLAST database searches. D.A.M. conceived of the experiments and PIT analysis pipeline, led the manuscript writing, wrote software, assisted with the immunofluorescence and the preparation of infected cells, and carried out manual curation, analysis and integration of the data.
Competing interests
The authors declare no competing financial interests.
Corresponding author
Correspondence to David A Matthews.
Supplementary information
PDF files
- 1. Supplementary Text and Figures
Supplementary Figures 1–3 and Supplementary Tables 4 and 8
Zip files
- 1. Supplementary Data
Supplementary Data 1–12: 1. Trinity transcripts derived from human RNA-seq data. 2. Trinity-derived protein data set from human Trinity transcripts data. 3. Sequence alignment map (SAM) file of Trinity transcripts with peptide data added from human PIT analysis. 4. Gene feature format (GFF3) file of the peptides associated with the Trinity transcripts from the human PIT analysis. 5. FASTA file of the longest open-reading frames in the human PIT analysis for which MS/MS data indicate that a protein is being made from a Trinity-predicted transcript. 6. Trinity transcripts derived from Chinese hamster ovary (CHO) RNA-seq data. 7. Trinity-derived protein list from CHO Trinity transcriptome data. 8. SAM file of Trinity transcripts mapped to the Cricetulus griseus genome with peptide data added. 9. GFF3 file of the peptides associated with the Trinity transcripts on the Cricetulus griseus genome. 10. FASTA file of proteins identified as altered by SNPeffect. 11. FASTA file of proteins identified as altered by SNPeffect and corrected to reflect the nonsynonymous change. 12. SNPeffect output file containing a list of proteins that have been affected by nonsynonymous changes and the location and type of change.
- 2. Supplementary Tables
Supplementary Tables 1–3, 5–7, 9 and 10: 1. Combined data on gene expression and changes in protein abundance. 2. Peptides identified by searching distinct data sets. 3. Results of BLAST analysis of proteins predicted by Trinity assembly and confirmed by MS/MS. 5. Analysis of a hamster proteome using a Trinity-derived list of hamster proteins, presenting peptides identified. 6. BLAST analysis of open-reading frames in the CHO-based Trinity list of which at least one peptide has been identified by MaxQuant. 7. Searching for SNPs within the proteome. 9. Analysis of transcript and ORF length generated by Trinity and getorf from human RNA-seq data. 10. BLAST analysis of all the ORFs present in the Trinity list derived from the human RNA-seq data and used by MaxQuant to search for peptides.
- 3. Supplementary Software
A ZIP file (Scripts.zip) of the Supplementary Software (9 items) and the README file.