the mouse genome
Nature 420, 563-573 (5 December 2002) | doi:10.1038/nature01266; Received 19 September 2002; Accepted 28 October 2002
Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs
The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase I & II Team*
Abstract
Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 'transcriptional units', contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense–antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan;
- Genome Science Laboratory, Discovery and Research Institute, RIKEN Wako Main Campus, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan;
- NTT Software Corporation, 223-1 Yamashita-cho, Naka-ku, Yokohama, Kanagawa, 231-8554, Japan;
- Division of Genomic Information Resources, Science of Biological Supramolecular Systems, Graduate School of Integrated Science, Yokohama City University, 1-7-29 Suehiro-Cho, Tsurumi-Ku, Yokohama, 230-0045, Japan;
- Institute for Advanced Biosciences, Keio Univ, 403-1 Tsuruoka-city, Yamagata, 997-0017, Japan;
- Institute of Basic Medical Sciences, University of Tsukuba, 1-1-1 Tennoudai, Tsukuba, Ibaraki, 305-8577, Japan;
- Biomedical Knowledge Discovery Team, Bioinformatics Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan;
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8540, Japan;
- Mouse Genome Informatics Group, The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine 04609, USA;
- Institute for Molecular Bioscience and ARC Special Research Centre for Functional and Applied Genomics, University of Queensland, Queensland 4072, Australia;
- The Institute for Genomic Research (TIGR), 9712 Medical Center Drive, Rockville, Maryland 20850, USA;
- National Center for Biotechnology Information, NIH, Bldg 38A, 8600 Rockville Pike, Bethesda, Maryland 20894, USA;
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK;
- Graduate School of Information Science and Technology, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka, 560-8531, Japan;
- Genomics Institute of the Novartis Research Foundation (GNF), 10675 John Jay Hopkins Drive, San Diego, California 92121, USA;
- Boys Town National Research Hospital, 555 North 30th Street, Omaha, Nebraska 68131, USA;
- Laboratories for Information Technology, 21, Heng Mui Keng Terrace, Singapore, 119613, Singapore;
- Structural Studies, MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK;
- LNCIB, Functional Genomics, AREA Science Park, Padriciano, 99 Trieste, 34012, Italy;
- Istituto Tumori Milano, Milano, 20133, Italy;
- The Scripps Research Institute, 10550 N. Torrey Pines Road, La Jolla, California 92037, USA;
- The Zebrafish International Resource Center, University of Oregon, Eugene, Oregon 97403-5274, USA;
- Laboratory of Computational Genomics, The Rockefeller University, 1230 York Avenue, New York, New York 10021-6399, USA;
- Università di Milano, Milano, 20133, Italy;
- The Burnham Institute, 10901 N. Torrey Pines Road, La Jolla, California 92037, USA;
- Department of Neurobiology, Harvard Medical School, 220 Longwood Avenue, Boston, Massachusetts 02115, USA;
- Graduate School of Medicine, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan;
- MRC Human Genetics Unit, Crewe Road, Edinburgh, UK;
- Duke University Medical Center, Department of Neurobiology, Box 3209, Durham, North Carolina, 27710, USA;
- Howard Hughes Medical Institute, Department of Molecular Genetics, University of Texas Southwestern Medical Center at Dallas, 5323 Harry Hines Boulevard, Dallas, Texas 75390-9050, USA;
- Center for Genomics and Bioinformatics, Karolinska Institutet, Berzelius väg 35, 17177 Stockholm, Sweden;
- JDRF/WT Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, Addenbrooke's Hospital, Hills Road, Cambridge CB2 2XY, UK;
- National Human Genome Research Institute, National Institutes of Health, 49/4A82 49 Convent Drive, MSC4472, Bethesda, Maryland 20892-4472, USA;
- Autoimmunity Research Unit, The Canberra Hospital, Yamba Drive, Woden, ACT 2606, Australia;
- Applied Genomics, Inc., 525 Del Rey Avenue, Sunnyvale, California 94085, USA;
- Hirakata Ryoikuen, 2-1-1 Tsudahigashi, Hirakata, Osaka, 565-0874, Japan;
- Institute for Advanced Medical Sciences, Hyogo College of Medicine, Mukogawa 1-1, Nishinomiya, Hyogo, 663-8501, Japan;
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK;
- University of California, San Diego School of Medicine, Pediatrics/Medicine, 9500 Gilman Drive, La Jolla, California 92093-0627, USA;
- Department of Psychiatry, University of Bonn, Sigmund-Freud-Strasse 25, Bonn, 53105, Germany;
- Genome Sequencing Center, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, St Louis, Missouri 63108, USA;
- Whitehead Institute/MIT Center for Genome Research, 320 Charles Street, Cambridge, Massachusetts 02141, USA
Correspondence and requests for materials should be addressed to Y.H. (e-mail: yosihide@gsc.riken.go.jp). Details of methods, and accession numbers for all 60,770 cDNA clones, are available as Supplementary Information. Information on access to the FANTOM2 clones is available at http://www.riken.go.jp/.
Authors' contributions: Y. Okazaki, M. Furuno, T. Kasukawa, C. Schönbach, R. Baldarelli, D. P. Hill, C. Bult, D. A. Hume, J. Quackenbush, L. M. Schriml, A. Kanapin and Y. Hayashizaki are core authorship members; P. Carninci and J. Kawai are team organizers.
