Article

Nature 447, 799-816 (14 June 2007) | doi:10.1038/nature05874; Received 2 March 2007; Accepted 23 April 2007

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project

The ENCODE Project Consortium

  1. EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
  2. Department of Genome Sciences, 1705 NE Pacific Street, Box 357730, University of Washington, Seattle, Washington 98195, USA.
  3. Department of Biochemistry and Molecular Genetics, Jordan 1240, Box 800733, 1300 Jefferson Park Ave, University of Virginia School of Medicine, Charlottesville, Virginia 22908, USA.
  4. Genomic Bioinformatics Program, Center for Genomic Regulation,
  5. Research Group in Biomedical Informatics, Institut Municipal d'Investigació Mèdica/Universitat Pompeu Fabra, c/o Dr. Aiguader 88, Barcelona Biomedical Research Park Building, 08003 Barcelona, Catalonia, Spain.
  6. Affymetrix, Inc., Santa Clara, California 95051, USA.
  7. Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.
  8. Bioinformatics Program, Boston University, 24 Cummington St., Boston, Massachusetts 02215, USA.
  9. Biomedical Engineering Department, Boston University, 44 Cummington St., Boston, Massachusetts 02215, USA.
  10. Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, Connecticut 06520, USA.
  11. Department of Molecular Biophysics and Biochemistry, Yale University, PO Box 208114, New Haven, Connecticut 06520, USA.
  12. The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
  13. Division of Medical Genetics, 1705 NE Pacific Street, Box 357720, University of Washington, Seattle, Washington 98195, USA.
  14. Division of Genetics, Brigham and Women's Hospital and Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115, USA.
  15. Department of Chemistry and Program in Bioinformatics, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, USA.
  16. Genetics Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.
  17. Department of Biology and Carolina Center for Genome Sciences, CB# 3280, 202 Fordham Hall, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA.
  18. Allen Institute for Brain Sciences, 551 North 34th Street, Seattle, Washington 98103, USA.
  19. Program in Gene Function and Expression and Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, Massachusetts 01605, USA.
  20. Institute for Genome Sciences & Policy and Department of Pediatrics, 101 Science Drive, Duke University, Durham, North Carolina 27708, USA.
  21. Center for Integrative Genomics, University of Lausanne, Genopode building, 1015 Lausanne, Switzerland.
  22. Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland.
  23. Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan.
  24. Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria.
  25. Department of Biomolecular Engineering, University of California, Santa Cruz, 1156 High Street, Santa Cruz, California 95064, USA.
  26. Center for Biomolecular Science and Engineering, Engineering 2, Suite 501, Mail Stop CBSE/ITI, University of California, Santa Cruz, California 95064, USA.
  27. Department of Biological Chemistry & Molecular Pharmacology, Harvard Medical School, 240 Longwood Avenue, Boston, Massachusetts 02115, USA.
  28. Department of Genetics, Facultat de Biologia, Universitat de Barcelona, Av Diagonal, 645, 08028, Barcelona, Catalonia, Spain.
  29. Bioinformatics Institute, 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Singapore.
  30. Bioinformatics Group, Department of Computer Science,
  31. Interdisciplinary Center of Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany.
  32. Fraunhofer Institut für Zelltherapie und Immunologie - IZI, Deutscher Platz 5e, D-04103 Leipzig, Germany.
  33. Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA.
  34. Genome Institute of Singapore, 60 Biopolis Street, Singapore 138672, Singapore.
  35. Laboratory for Computational Genomics, Washington University, Campus Box 1045, Saint Louis, Missouri 63130, USA.
  36. Department of Mathematics and Computer Science, University of California, Berkeley, California 94720, USA.
  37. Spanish National Cancer Research Centre, CNIO, Madrid, E-28029, Spain.
  38. Department of Epidemiology and Public Health, Imperial College, St Mary's Campus, Norfolk Place, London W2 1PG, UK.
  39. Department of Applied Science & Technology, University of California, Berkeley, California 94720, USA.
  40. Genome Science Laboratory, Discovery and Research Institute, RIKEN Wako Institute, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan.
  41. Department of Genetics, Yale University School of Medicine, 333 Cedar Street, New Haven, Connecticut 06510, USA.
  42. Department of Pediatrics, University of Massachusetts Medical School, 55 Lake Avenue, North Worcester, Massachusetts 01605, USA.
  43. Department of Statistics, University of California, Berkeley, California 94720, USA.
  44. Institute for Molecular Bioscience, University of Queensland, St. Lucia, QLD 4072, Australia.
  45. The Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, New Mexico 87501, USA.
  46. Department of Computer Science, Yale University, PO Box 208114, New Haven, Connecticut 06520-8114, USA.
  47. Program in Computational Biology & Bioinformatics, Yale University, PO Box 208114, New Haven, Connecticut 06520-8114, USA.
  48. NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.
  49. Department of Computer Science and Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802, USA.
  50. Center for Comparative Genomics and Bioinformatics, Huck Institutes for Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA.
  51. Division of Extramural Research, National Human Genome Research Institute, National Institutes of Health, 5635 Fishers Lane, Suite 4076, Bethesda, Maryland 20892-9305, USA.
  52. Office of the Director, National Human Genome Research Institute, National Institutes of Health, 31 Center Drive, Suite 4B09, Bethesda, Maryland 20892-2152, USA.
  53. Department of Computer Science, Stanford University, Stanford, California 94305, USA.
  54. Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, 6720 MSC, 1300 University Ave, Madison, Wisconsin 53706, USA.
  55. Department of Zoology and Animal Biology, Faculty of Sciences, University of Geneva, 1205 Geneva, Switzerland.
  56. Department of Statistics, Stanford University, Stanford, California 94305, USA.
  57. Department of Bioengineering, University of California, Berkeley, California 94720-1762, USA.
  58. National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland 20894, USA.
  59. Department of Biochemistry and Molecular Biology, Huck Institutes of Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA.
  60. Howard Hughes Medical Institute, University of California, Santa Cruz, California 95064, USA.
  61. Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA.
  62. Department of Pathology, Stanford University School of Medicine, Stanford, California 94305, USA.
  63. Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA.
  64. Genome Sequencing Center, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, Saint Louis, Missouri 63108, USA.
  65. Broad Institute of Harvard University and Massachusetts Institute of Technology, 320 Charles Street, Cambridge, Massachusetts 02141, USA.
  66. Whitehead Institute for Biomedical Research, 9 Cambridge Center, Cambridge, Massachusetts 02142, USA.
  67. Children's Hospital Oakland Research Institute, BACPAC Resources, 747 52nd Street, Oakland, California 94609, USA.
  68. Ludwig Institute for Cancer Research, 9500 Gilman Drive, La Jolla, California 92093-0653, USA.
  69. The Linnaeus Centre for Bioinformatics, Uppsala University, BMC, Box 598, SE-75124 Uppsala, Sweden.
  70. Department of Pharmacology and the Genome Center, University of California, Davis, California 95616, USA.
  71. Institute for Cellular & Molecular Biology, The University of Texas at Austin, 1 University Station A4800, Austin, Texas 78712, USA.
  72. NimbleGen Systems, Inc., 1 Science Court, Madison, Wisconsin 53711, USA.
  73. University of Wisconsin Medical School, Madison, Wisconsin 53706, USA.
  74. Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, SE-75185 Uppsala, Sweden.
  75. University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA.
  76. Genes and Disease Program, Center for Genomic Regulation, c/o Dr. Aiguader 88, Barcelona Biomedical Research Park Building, 08003 Barcelona, Catalonia, Spain.
  77. Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA.
  78. Center for Statistical Genetics, Department of Biostatistics, SPH II, 1420 Washington Heights, Ann Arbor, Michigan 48109-2029, USA.
  79. Department of Statistics, University of Oxford, Oxford OX1 3TG, UK.
  80. Universitat Pompeu Fabra, c/o Dr. Aiguader 88, Barcelona Biomedical Research Park Building, 08003 Barcelona, Catalonia, Spain.
  †Present addresses: Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA (G.M.C.); Department of Biological Statistics & Computational Biology, Cornell University, Ithaca, New York 14853, USA (A.S.); Faculty of Life Sciences, University of Manchester, Michael Smith Building, Oxford Road, Manchester, M13 9PT, UK (S.W.); SwitchGear Genomics, 1455 Adams Drive, Suite 2015, Menlo Park, California 94025, USA (N.D.T.; S.F.A.).
  A list of authors and their affiliations appears at the end of the paper.

Correspondence to: Ewan Birney (1), John A. Stamatoyannopoulos (2), Anindya Dutta (3), Roderic Guigó (4, 5), Thomas R. Gingeras (6), Elliott H. Margulies (7), Zhiping Weng (8, 9), Michael Snyder (10, 11), Emmanouil T. Dermitzakis (12). Correspondence and requests for materials should be addressed to the co-chairs of the ENCODE analysis groups (listed in the Analysis Coordination group): E. Birney (Email: birney@ebi.ac.uk); J. A. Stamatoyannopoulos (Email: jstam@u.washington.edu); A. Dutta (Email: ad8q@virginia.edu); R. Guigó (Email: rguigo@imim.es); T. R. Gingeras (Email: Tom_Gingeras@affymetrix.com); E. H. Margulies (Email: elliott@nhgri.nih.gov); Z. Weng (Email: zhiping@bu.edu); M. Snyder (Email: michael.snyder@yale.edu); E. T. Dermitzakis (Email: md4@sanger.ac.uk); or collectively (Email: encode_chairs@ebi.ac.uk).

We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
