Supplementary information

From the following article:

Generation and annotation of the DNA sequences of human chromosomes 2 and 4

LaDeana W. Hillier, Tina A. Graves, Robert S. Fulton, Lucinda A. Fulton, Kymberlie H. Pepin, Patrick Minx, Caryn Wagner-McPherson, Dan Layman, Kristine Wylie, Mandeep Sekhon, Michael C. Becker, Ginger A. Fewell, Kimberly D. Delehaunty, Tracie L. Miner, William E. Nash, Colin Kremitzki, Lachlan Oddy, Hui Du, Hui Sun, Holland Bradshaw-Cordum, Johar Ali, Jason Carter, Matt Cordes, Anthony Harris, Amber Isak, Andrew van Brunt, Christine Nguyen, Feiyu Du, Laura Courtney, Joelle Kalicki, Philip Ozersky, Scott Abbott, Jon Armstrong, Edward A. Belter, Lauren Caruso, Maria Cedroni, Marc Cotton, Teresa Davidson, Anu Desai, Glendoria Elliott, Thomas Erb, Catrina Fronick, Tony Gaige, William Haakenson, Krista Haglund, Andrea Holmes, Richard Harkins, Kyung Kim, Scott S. Kruchowski, Cynthia Madsen Strong, Neenu Grewal, Ernest Goyea, Shunfang Hou, Andrew Levy, Scott Martinka, Kelly Mead, Michael D. McLellan, Rick Meyer, Jennifer Randall-Maher, Chad Tomlinson, Sara Dauphin-Kohlberg, Amy Kozlowicz-Reilly, Neha Shah, Sharhonda Swearengen-Shahid, Jacqueline Snider, Joseph T. Strong, Johanna Thompson, Martin Yoakum, Shawn Leonard, Charlene Pearman, Lee Trani, Maxim Radionenko, Jason E. Waligorski, Chunyan Wang, Susan M. Rock, Aye-Mon Tin-Wollam, Rachel Maupin, Phil Latreille, Michael C. Wendl, Shiaw-Pyng Yang, Craig Pohl, John W. Wallis, John Spieth, Tamberlyn A. Bieri, Nicolas Berkowicz, Joanne O. Nelson, John Osborne, Li Ding, Rekha Meyer, Aniko Sabo, Yoram Shotland, Prashant Sinha, Patricia E. Wohldmann, Lisa L. Cook, Matthew T. Hickenbotham, James Eldred, Donald Williams, Thomas A. Jones, Xinwei She, Francesca D. Ciccarelli, Elisa Izaurralde, James Taylor, Jeremy Schmutz, Richard M. Myers, David R. Cox, Xiaoqiu Huang, John D. McPherson, Elaine R. Mardis, Sandra W. Clifton, Wesley C. Warren, Asif T. Chinwalla, Sean R. Eddy, Marco A. Marra, Ivan Ovcharenko, Terrence S. Furey, Webb Miller, Evan E. Eichler, Peer Bork, Mikita Suyama, David Torrents, Robert H. Waterston and Richard K. Wilson

Nature 434, 724-731 (7 April 2005)

doi:10.1038/nature03466

Supplementary Notes

Supplementary Notes and Supplementary Methods with additional relevant references.

Supplementary Figure S1

Comparison of mapped positions of deCODE STSs and their locations within the (a) chromosome 2 and (b) chromosome 4 sequences. The apparent breaks at 90 Mb (a) and 50 Mb (b) reflect the position of the centromere.

Supplementary Figure S2

Sequence identity distribution of segmental duplications. For all pairwise alignments, the total number of aligned bases was calculated and binned by percent sequence identity. Sequence identity distributions for interchromosomally (red) and intrachromosomally (blue) duplicated bases are shown.
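
The binning described in this caption can be illustrated with a minimal sketch. It assumes each pairwise alignment is available as a tuple of aligned bases, percent identity and a label ("inter" or "intra"); the function name `bin_aligned_bases`, the 1% bin width and the example values are hypothetical and are not taken from the article.

```python
from collections import defaultdict


def bin_aligned_bases(alignments, bin_width=1.0):
    """Sum aligned bases per (kind, percent-identity bin).

    alignments: iterable of (aligned_bases, percent_identity, kind),
    where kind is "inter" or "intra" (an assumed labelling).
    """
    totals = defaultdict(int)
    for aligned_bases, percent_identity, kind in alignments:
        # Floor the identity to the lower edge of its bin.
        bin_start = (percent_identity // bin_width) * bin_width
        totals[(kind, bin_start)] += aligned_bases
    return dict(totals)


# Made-up alignments, for illustration only.
example = [
    (12000, 98.7, "intra"),
    (5000, 98.2, "intra"),
    (20000, 91.5, "inter"),
]
histogram = bin_aligned_bases(example)
```

Plotting the "inter" and "intra" totals separately against the bin edges would give a two-series histogram of the kind the caption describes.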

Supplementary Figure S3

Pattern of recent segmental duplications on chromosome 2 and chromosome 4. Large (>10 kb), highly similar (>95%) intrachromosomal (blue) and interchromosomal (red) segmental duplications are shown for chromosome 2 (a) and chromosome 4 (b).

Supplementary Figure S4

Location and percent identity of segmental duplications on chromosome 2. Interchromosomal (red bars) and intrachromosomal (blue bars) duplications are represented along the horizontal line (0.2 Mb increments).

Supplementary Figure S5

Location and percent identity of segmental duplications on chromosome 4. Interchromosomal (red bars) and intrachromosomal (blue bars) duplications are represented along the horizontal line (0.2 Mb increments).

Supplementary Table S1

Contiguous sequence accessions and lengths for chromosome 2 and 4 sequences. Gap sizes, as estimated by FISH and by orthologous mouse and rat sequences, are also included.

Supplementary Table S2

Brief summary of repetitive content of chromosomes 2 and 4.

Supplementary Table S3

Recombination rates for human chromosomes 2 and 4 based on deCODE and Genethon genetic maps.

Supplementary Table S4

Known mRNAs that had no match at all to the mouse or rat genomes or their protein sets. For each mRNA, the table gives the chromosome, GI identifier, number of coding exons, amino acid length, InterPro domains, comments, and information on the GIs for which we obtained sequences for the mRNA in multiple primates.

Supplementary Table S5

Additional details supporting Table 1 in the main text, relating to polymorphic mRNAs detected on human chromosomes 2 and 4.

Supplementary Table S6

Fraction of duplicated basepairs. The percentages of non-redundant duplications (>90% sequence identity; >1 kb in length) were based on a total genome size of 2,865,069,170 bp, a chromosome 2 size of 237,541,603 bp and a chromosome 4 size of 186,841,959 bp.

Supplementary Table S7

Accessions identifying regions that contain highly polymorphic overlaps. The chromosome and the start and end positions of these highly polymorphic overlap regions are provided for human genome build 35 (hg17). General observations on sequence features in each region, including gene content, are also provided.

Supplementary Table S8

Positions of possible human deletions based on fosmid-end placements.

Supplementary Table S9

Polymorphisms detected from fosmid paired-end placements and verified by fosmid clone-based sequencing. The Broad Institute library fosmid name, fosmid clone accession, and a description of each polymorphism are provided.

Supplementary Table S10

Distribution of non-redundant known genes and spliced ESTs within duplicated regions of chromosomes 2 and 4. The non-redundant transcript set was screened for confirmatory transcriptional support by best genomic placement of ESTs. Spliced ESTs and transcripts were required to have at least two exons as evidence of splicing. In addition, each gene feature was binned as duplicated if at least 50 bp overlapped a duplicated region; exons shorter than 50 bp were therefore not included in this analysis. The "Known gene + EST" category is composed of non-redundant known genes and spliced EST transcripts. Duplicated known genes and their coordinates are listed.
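
The 50 bp overlap rule in this caption can be sketched as follows. This is only an illustration under stated assumptions: coordinates are taken to be half-open intervals, a feature counts as duplicated when it overlaps any single duplicated region by at least 50 bp, and the names `overlap_length` and `is_duplicated` and all coordinates are hypothetical rather than the authors' implementation.

```python
def overlap_length(a_start, a_end, b_start, b_end):
    """Number of bases shared by half-open intervals [a_start, a_end) and [b_start, b_end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def is_duplicated(feature, duplications, min_overlap=50):
    """Return True if the feature overlaps any one duplicated region by >= min_overlap bp.

    Assumption: the threshold applies per region, not to the summed overlap.
    """
    start, end = feature
    return any(
        overlap_length(start, end, d_start, d_end) >= min_overlap
        for d_start, d_end in duplications
    )


# Made-up coordinates, for illustration only.
duplications = [(1000, 2000)]
long_exon = (1900, 2100)   # overlaps the duplication by 100 bp
edge_exon = (1960, 2100)   # overlaps the duplication by only 40 bp
short_exon = (1200, 1240)  # 40 bp exon lying entirely inside the duplication

results = [is_duplicated(f, duplications) for f in (long_exon, edge_exon, short_exon)]
```

The last case shows why exons shorter than 50 bp drop out of the analysis: even when fully contained in a duplicated region, they cannot reach the overlap threshold.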
