Bamboo represents the only major lineage of grasses that is native to forests and is one of the most important non-timber forest products in the world. However, no species in the Bambusoideae subfamily has been sequenced. Here, we report a high-quality draft genome sequence of moso bamboo (Phyllostachys heterocycla var. pubescens). The 2.05-Gb assembly covers 95% of the genomic region. Gene prediction identified 31,987 genes, most of which are supported by cDNA and deep RNA sequencing data. Analyses of clustered gene families and gene collinearity show that bamboo underwent whole-genome duplication 7–12 million years ago. Identification of gene families that are key to cell wall biosynthesis suggests that the whole-genome duplication event generated more gene duplicates involved in bamboo shoot development. RNA sequencing analysis of bamboo flowering tissues suggests a potential connection between drought-responsive and flowering genes.
Supplementary information: Supplementary Note, Supplementary Figures 1–16 and Supplementary Tables 1–19