TranscriptSNPView: a genome-wide catalog of mouse coding variation

Cunningham, Fiona; Rios, Daniel; Griffiths, Mark; Smith, James; Ning, Zemin; Cox, Tony; Flicek, Paul; Marin-Garcin, Pablo; Herrero, Javier; Rogers, Jane; van der Weyden, Louise; Bradley, Allan; Birney, Ewan; Adams, David J

doi:10.1038/ng0806-853a

Correspondence
Published: 01 August 2006

Fiona Cunningham¹,
Daniel Rios²,
Mark Griffiths¹,
James Smith¹,
Zemin Ning¹,
Tony Cox¹,
Paul Flicek²,
Pablo Marin-Garcin¹,
Javier Herrero²,
Jane Rogers¹,
Louise van der Weyden¹,
Allan Bradley¹,
Ewan Birney² &
David J Adams¹

Nature Genetics volume 38, page 853 (2006)

To the Editor:

With the recent release of the genome-wide sequence for multiple inbred mouse strains¹, and with resequencing data for a large number of additional strains entering the public domain (http://www.niehs.nih.gov/crg/cprc.htm), we are one step closer to being able to identify the underlying genetic variants responsible for the trait characteristics that define each strain. Here, we describe a genome-wide catalog of coding variation in the mouse genome that was developed using an extensive collection of mouse DNA sequence reads, including those recently released by Celera, data from dbSNP² and resequencing data generated by Perlegen Sciences for the US National Institute of Environmental Health Sciences (NIEHS). To display these data, we developed a new software tool, TranscriptSNPView, which has been integrated into the Ensembl Genome Browser to take advantage of the evolving mouse genome assembly and the latest Ensembl³ and Vega gene predictions⁴. TranscriptSNPView can be accessed via the Ensembl Genome Browser (http://www.ensembl.org/Mus_musculus/transcriptsnpview).

TranscriptSNPView displays coding SNP data from 48 mouse strains (Supplementary Table 1 online). Using the SNP calling algorithm ssahaSNP⁵, we computed over 50 million SNPs from the common laboratory Mus musculus strains A/J, DBA/2J, 129X1/SvJ and 129S1/SvImJ from whole-genome shotgun sequence reads generated by Celera, and from C3HeB/FeJ and NOD BAC-end sequence reads generated by the Wellcome Trust Sanger Institute. We also generated SNP calls from the Mus musculus molossinus strain MSM/Ms using sequence reads generated by RIKEN⁶ (Supplementary Table 1). Collectively, these SNP calls have been designated 'Sanger SNPs'. The 25 million DNA sequence reads used to generate the Sanger SNP collection represent 7.32-fold coverage of the NCBI mouse build 35 genome assembly and are available via the Ensembl trace repository (http://trace.ensembl.org).
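The relationship between read count and fold coverage quoted above can be illustrated with a short calculation. This is only a sketch: the letter gives the number of reads (25 million) and the coverage (7.32-fold) but not the genome size or read length, so the genome size below (`ASSUMED_GENOME_SIZE`, a round 2.6 Gb) is an assumption made for illustration, and the function names are hypothetical.

```python
def fold_coverage(n_reads, mean_read_length, genome_size):
    """Fold coverage = total sequenced bases / genome size."""
    return n_reads * mean_read_length / genome_size


def implied_mean_read_length(coverage, n_reads, genome_size):
    """Invert the coverage formula to get the mean read length it implies."""
    return coverage * genome_size / n_reads


# Figures taken from the letter.
N_READS = 25e6
COVERAGE = 7.32

# ASSUMPTION: a round figure for the mouse genome size, not stated in the letter.
ASSUMED_GENOME_SIZE = 2.6e9

mean_len = implied_mean_read_length(COVERAGE, N_READS, ASSUMED_GENOME_SIZE)
print(f"Implied mean read length: {mean_len:.0f} bases")
```

Under the assumed genome size, the quoted figures imply a mean read length of roughly 760 bases. A different genome size, or a coverage figure based only on aligned bases, would change this estimate.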

The Sanger SNP calls were distilled to 6.87 million nonredundant genome-wide SNP features and were combined with an additional 6.4 million dbSNP entries (version 126), providing data for an additional 41 mouse strains. By merging these data sets and mapping them against the Ensembl 38.35 mouse gene build, we collated 726,462 coding SNP variants across all strains and computed their amino acid consequences to identify 249,996 nonsynonymous coding changes and 2,667 stop codons. Coding SNP figures for each strain are provided in Supplementary Table 1. We also identified instances where stop codons had been lost, and we predicted mutations in introns, invariant intronic splice sites and in untranslated and regulatory regions. These predictions, which can be used as a basis for identifying functional SNP variants, are displayed in TranscriptSNPView. A detailed description of all of the features of TranscriptSNPView is provided in the Supplementary Note online.
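The idea of computing the amino acid consequence of a coding SNP can be sketched as follows. This is a minimal illustration using the standard genetic code, not the Ensembl implementation: the function name `classify_coding_snp`, the category labels, and the example codons are all hypothetical, and the sketch assumes the reference codon and its reading frame are already known.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon, offset, alt_base):
    """Classify a single-base change at `offset` (0, 1 or 2) within a codon."""
    ref_codon = ref_codon.upper()
    alt_base = alt_base.upper()
    if ref_codon[offset] == alt_base:
        raise ValueError("alternative base matches the reference base")
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "nonsynonymous"


# Hypothetical example SNPs: (reference codon, offset in codon, alternative base).
example_snps = [("GAA", 2, "G"), ("GAA", 1, "T"), ("TGG", 2, "A"), ("TAA", 0, "C")]
tally = Counter(classify_coding_snp(*snp) for snp in example_snps)
print(dict(tally))
```

A tally of this kind over all coding SNPs in a gene build is what would yield totals such as the nonsynonymous and stop codon counts reported above. A real pipeline would also need to handle strand, transcript structure and multiple transcripts per gene.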

A data collection of this quality and depth is unprecedented and will provide the means to obtain a high-resolution picture of coding variation in the mouse genome. TranscriptSNPView represents a powerful new tool for functional analysis of the mouse genome and will become a central repository for mouse coding variation data.

Note: Supplementary information is available on the Nature Genetics website.

References

1. Marris, E. Nature 435, 6 (2005).
2. Sherry, S.T. et al. Nucleic Acids Res. 29, 308–311 (2001).
3. Birney, E. et al. Nucleic Acids Res. 34, D556–D561 (2006).
4. Ashurst, J.L. et al. Nucleic Acids Res. 33, D459–D465 (2005).
5. Ning, Z. et al. Genome Res. 11, 1725–1729 (2001).
6. Abe, K. et al. Genome Res. 14, 2439–2447 (2004).

Author information

Authors and Affiliations

The Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, Cambridgeshire, UK
Fiona Cunningham, Mark Griffiths, James Smith, Zemin Ning, Tony Cox, Pablo Marin-Garcin, Jane Rogers, Louise van der Weyden, Allan Bradley & David J Adams
The European Bioinformatics Institute, Hinxton, CB10 1SD, Cambridgeshire, UK
Daniel Rios, Paul Flicek, Javier Herrero & Ewan Birney

Corresponding author

Correspondence to Ewan Birney.

Supplementary information

Supplementary Table 1

TranscriptSNPView: a genome-wide catalog of coding variation.

Supplementary Note

About this article

Cite this article

Cunningham, F., Rios, D., Griffiths, M. et al. TranscriptSNPView: a genome-wide catalog of mouse coding variation. Nat Genet 38, 853 (2006). https://doi.org/10.1038/ng0806-853a

Issue Date: 01 August 2006
DOI: https://doi.org/10.1038/ng0806-853a
