Draft genome sequence of the oilseed species Ricinus communis

Journal name: Nature Biotechnology
Volume: 28
Pages: 951–956
DOI: 10.1038/nbt.1674

Abstract

Castor bean (Ricinus communis) is an oilseed crop that belongs to the spurge (Euphorbiaceae) family, which comprises ~6,300 species that include cassava (Manihot esculenta), rubber tree (Hevea brasiliensis) and physic nut (Jatropha curcas). It is primarily of economic interest as a source of castor oil, used for the production of high-quality lubricants because of its high proportion of the unusual fatty acid ricinoleic acid. However, castor bean genomics is also relevant to biosecurity as the seeds contain high levels of ricin, a highly toxic, ribosome-inactivating protein. Here we report the draft genome sequence of castor bean (4.6-fold coverage), the first for a member of the Euphorbiaceae. Whereas most of the key genes involved in oil synthesis and turnover are single copy, the number of members of the ricin gene family is larger than previously thought. Comparative genomics analysis suggests the presence of an ancient hexaploidization event that is conserved across the dicotyledonous lineage.

Figures

    Figure 1: Reciprocal best BLAST matches between castor bean genes.

    Strings of paralogous genes that correspond to triplicated regions are highlighted in the same color. The 30 pairs of scaffolds that contained the highest numbers of paralogous gene pairs are shown.

    Figure 2: Collinearity between three paralogous castor bean genomic regions and their putative orthologs in other dicot genomes.

    (a) An example of a conserved paralogous triplication in the castor bean genome. (b–e) Putative orthologous gene pairs are shown as colored lines connecting the castor bean scaffolds (noted as Rc:scaffold number) to chromosomes or scaffolds in the other dicot genomes. In most cases, one copy of the paralogous castor bean genes corresponds to two genes in poplar (b), one gene in grapevine (c) and four genes in A. thaliana (d). The castor bean–papaya relationship (e) is inconclusive. Numbers around the circles correspond to linkage group numbers (b), chromosome numbers (c and d) or scaffold numbers (e). Grapevine scaffolds that were mapped to chromosomes but whose exact location is unknown are noted with an 'r' (random). The size of the castor bean genomic regions is proportional in all circles. Additional castor bean paralogous regions and their corresponding orthologs from other dicots are shown in Supplementary Figure 3.

    Figure 3: Schematic representation of the members of the ricin/RCA lectin gene family in castor bean.

    Ricin protein domains are represented at the top by blue boxes, and gray boxes represent protein sequences from this gene family aligned to the ricin precursor protein sequence used as reference. The ruler indicates the amino acid coordinates. The ricin and RCA genes are indicated, and the amino acid sequence length for each gene model is shown in parentheses. Pairs of adjacent gene models that could belong to a single pseudogene are shown in gray.
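
The reciprocal best BLAST matches named in the Figure 1 caption can be illustrated with a minimal Python sketch. It assumes a self-versus-self search whose hits have already been reduced to (query, subject, bit score) tuples. The gene identifiers, scores and function names below are hypothetical, and the score cutoffs and filtering used in the actual analysis are not stated on this page.

```python
def best_hits(hits):
    """Map each query gene to its highest-scoring non-self subject gene."""
    best = {}
    for query, subject, bitscore in hits:
        if query == subject:
            continue  # skip self-matches, which always win in a self-vs-self search
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_pairs(hits):
    """Return sorted gene pairs in which each gene is the other's best hit."""
    best = best_hits(hits)
    pairs = set()
    for gene_a, gene_b in best.items():
        if best.get(gene_b) == gene_a:
            pairs.add(tuple(sorted((gene_a, gene_b))))
    return sorted(pairs)


# Toy hit table: hypothetical gene IDs and bit scores, not data from the paper.
toy_hits = [
    ("geneA", "geneA", 500.0),
    ("geneA", "geneB", 300.0),
    ("geneA", "geneC", 120.0),
    ("geneB", "geneA", 295.0),
    ("geneB", "geneC", 150.0),
    ("geneC", "geneB", 150.0),
]

print(reciprocal_best_pairs(toy_hits))  # [('geneA', 'geneB')]
```

In this toy table geneC's best hit is geneB, but geneB's best hit is geneA, so only the geneA/geneB pair is reciprocal. Strings of such pairs lying in the same order on two scaffolds are the kind of signal the caption describes as paralogous regions.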

Author information

  1. These authors contributed equally to this work.

    • Agnes P Chan &
    • Jonathan Crabtree

Affiliations

  1. J. Craig Venter Institute (JCVI), Rockville, Maryland, USA.

    • Agnes P Chan,
    • Qi Zhao,
    • Hernan Lorenzi,
    • Admasu Melake-Berhan &
    • Pablo D Rabinowicz
  2. Institute for Genome Sciences (IGS), University of Maryland School of Medicine, Baltimore, Maryland, USA.

    • Jonathan Crabtree,
    • Joshua Orvis,
    • Kristine M Jones,
    • Julia Redman,
    • Jennifer R Wortman,
    • Claire M Fraser-Liggett,
    • Jacques Ravel &
    • Pablo D Rabinowicz
  3. Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, USA.

    • Daniela Puiu
  4. United States Department of Agriculture, Agricultural Research Service, Western Regional Research Center, Crop Improvement and Utilization, Albany, California, USA.

    • Grace Chen
  5. Center for Plant Science Innovation and Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, Nebraska, USA.

    • Edgar B Cahoon
  6. International Institute of Tropical Agriculture, Ibadan, Oyo State, Nigeria.

    • Melaku Gedil
  7. Institut für Mikrobiologie und Genetik, Abteilung Bioinformatik, Universität Göttingen, Göttingen, Germany.

    • Mario Stanke
  8. Broad Institute of the Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts, USA.

    • Brian J Haas
  9. Department of Biochemistry and Molecular Biology, University of Maryland School of Medicine, Baltimore, Maryland, USA.

    • Pablo D Rabinowicz

Contributions

A.P.C., J.C., H.L., B.J.H. and J.R.W. performed genomic analyses. Q.Z., J.O. and M.S. conducted genome annotation. D.P. worked on the genome assembly. A.M.-B., K.M.J. and J.R. made DNA preparations, library constructions, and closure work. G.C., E.B.C. and M.G. performed manual annotations. C.M.F.-L. and J.R. conceived the project. P.D.R. conceived and directed the project.

Competing financial interests

The authors declare no competing financial interests.

Supplementary information

PDF files

  1. Supplementary Text and Figures (12M)

    Supplementary Tables 1–5 and Supplementary Figs. 1–3
