Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication

Nature Biotechnology
Published online


Cultivated citrus are selections from, or hybrids of, wild progenitor species whose identities and contributions to citrus domestication remain controversial. Here we sequence and compare citrus genomes—a high-quality reference haploid clementine genome and mandarin, pummelo, sweet-orange and sour-orange genomes—and show that cultivated types derive from two progenitor species. Although cultivated pummelos represent selections from one progenitor species, Citrus maxima, cultivated mandarins are introgressions of C. maxima into the ancestral mandarin species Citrus reticulata. The most widely cultivated citrus, sweet orange, is the offspring of previously admixed individuals, but sour orange is an F1 hybrid of pure C. maxima and C. reticulata parents, thus implying that wild mandarins were part of the early breeding germplasm. A Chinese wild 'mandarin' diverges substantially from C. reticulata, thus suggesting the possibility of other unrecognized wild citrus species. Understanding citrus phylogeny through genome analysis clarifies taxonomic relationships and facilitates sequence-directed genetic improvement.

  1. A selection of mandarin, pummelo and orange fruits, including cultivars sequenced in this study.

    Pummelos (1,2 in outline on left) are large trees that produce very large fruit with white, pink or red flesh color (2) and yellow or pink rinds. Most cultivars have large leaves with petioles with prominent wings. Apomictic reproduction is absent, and most selections are self-incompatible. Mandarins (3–7) are smaller trees bearing smaller fruit with orange flesh (9,11) and rind color. Mandarins have both apomictic and zygotic reproduction, and some are self-compatible. Oranges (8,10) are generally intermediate in tree and fruit size; the flesh (10) and rind color is commonly orange, and apomictic reproduction is always present. (The sour orange shown (12) is immature.)

  2. Nucleotide-diversity distribution in citrus.

    (a) Nucleotide heterozygosity distribution computed in overlapping 100-kb windows (with 5-kb step size) across the low-acid (LAP) and Chandler (CHP) pummelo genomes and between the nonshared haplotypes of this parent-child pair (LAP/CHP). The peak at ~6 heterozygous sites/kb in all three pairwise comparisons represents the characteristic nucleotide diversity of the species C. maxima; the peak near ~1 heterozygous site/kb reflects a bottleneck in the ancestral C. maxima population after divergence from C. reticulata (Supplementary Note 10). (b) Nucleotide heterozygosity for the traditional Willowleaf mandarin (WLM) plotted along chromosome 6, computed in overlapping windows of 200 kb (with 100-kb step size). This chromosome shows an example of the clear discontinuity in single-nucleotide-variant heterozygosity levels between ~5/kb in the M/M segment (orange bar) and ~17/kb in the M/P segment (blue bar). (c) Nucleotide heterozygosity distribution computed in overlapping 500-kb windows (with 5-kb step size) in Ponkan (PKM, solid line) and Willowleaf (WLM, dashed line) mandarins. Genomic segments are designated M/M, M/P or P/P on the basis of a set of 1,537,264 SNPs that differentiate C. reticulata (M) from C. maxima (P). Both mandarins contain admixed segments from C. maxima introgression (M/P) as well as M/M segments, and these are plotted and normalized separately for easy comparison. (d) Nucleotide heterozygosity distribution computed in overlapping 500-kb windows (with 5-kb step size) for sweet orange (SWO) and sour orange (SSO). The three different genotypes of the sweet-orange genome (M/M, P/P and M/P) and the sour-orange genotype M/P are normalized and plotted separately.

  3. Admixture patterns and nucleotide diversity in cultivated citrus.

    For each of the three groups of sequenced citrus, variation in nucleotide diversity (averaged over 500-kb windows with step size 250 kb) is shown across the genome for one representative cultivar above genotype maps (horizontal bars). Green, C. maxima/C. maxima; blue, C. maxima/C. reticulata; orange, C. reticulata/C. reticulata; gray, unknown. The nine chromosomes are numbered at top. (a) Sweet orange (SWO) nucleotide diversity with genotype maps for sweet orange and sour orange (SSO), indicating the C. maxima/C. maxima genotype (green segments present on chromosomes 2 and 8) in sweet orange. (b) Willowleaf mandarin (WLM) nucleotide diversity and genotype maps for three traditional mandarins (Ponkan mandarin (PKM), Willowleaf mandarin (WLM) and Huanglingmiao (HLM)) and three recent mandarin types (Clementine (CLM), W. Murcott mandarin (WMM) and haploid Clementine reference (HCR)). For the haploid Clementine reference sequence, orange and green segments indicate C. reticulata and C. maxima haplotypes, respectively. All five mandarin types show pummelo introgressions (blue or green segments). (c) Low-acid pummelo (LAP) nucleotide diversity and genotype maps for two pummelos (low-acid pummelo and Chandler pummelo (CHP)).

  4. Mangshan mandarin is a species distinct from C. maxima and C. reticulata.

    (a) Midpoint-rooted neighbor-joining phylogenetic tree of citrus chloroplast genomes. (b) Frequency distributions of the pairwise sequence divergences (across 100-kb windows) between Mangshan mandarin (CMS) and C. maxima (green), CMS and C. reticulata (orange), C. reticulata and C. maxima (light blue) as well as the distinctly lower CMS intrinsic nucleotide diversity (dashed blue). Ret, C. reticulata; max, C. maxima; het, heterozygous. (c) The first two coordinates of principal coordinate analysis of the citrus nuclear genomes, based on pairwise distances and metric multidimensional scaling. The C. maxima–C. reticulata axis (principal coordinate 1, 47.5% of variance) separates pummelos (green) from mandarins (orange), with oranges (blue) lying in between; principal coordinate 2 (19.6% of variance) separates CMS (purple) from the others.
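
The windowed heterozygosity statistic that the captions for Figures 2 and 3 rely on (heterozygous sites per kb, computed in overlapping windows with a fixed step size) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `windowed_heterozygosity`, its parameters and the assumption that every base in a window is callable are hypothetical simplifications chosen for illustration.

```python
from typing import Iterable, List, Tuple


def windowed_heterozygosity(
    het_positions: Iterable[int],
    chrom_length: int,
    window: int = 500_000,
    step: int = 250_000,
) -> List[Tuple[int, float]]:
    """Return (window_start, heterozygous sites per kb) for overlapping windows.

    Assumptions (not from the paper): positions are 0-based coordinates of
    heterozygous sites on one chromosome, and every base in a window counts
    toward the denominator (no masking of uncallable sequence).
    """
    positions = sorted(het_positions)
    results: List[Tuple[int, float]] = []
    lo = 0  # index of the first site at or after the window start
    hi = 0  # index of the first site at or after the window end
    start = 0
    while start < chrom_length:
        # The last windows are truncated at the chromosome end.
        end = min(start + window, chrom_length)
        while lo < len(positions) and positions[lo] < start:
            lo += 1
        while hi < len(positions) and positions[hi] < end:
            hi += 1
        span_kb = (end - start) / 1000.0
        results.append((start, (hi - lo) / span_kb))
        start += step
    return results
```

Calling `windowed_heterozygosity(sites, chrom_length, window=200_000, step=100_000)` would mirror the window and step sizes quoted for the chromosome 6 panel. A histogram of the per-window values is what gives a distribution of the kind shown in the heterozygosity panels, although a real analysis would also have to correct for uncallable or repetitive sequence.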

G.A.W., development and application of methods to analyze citrus genetic diversity, population history and ancestry; S. Prochnik, genome annotation and initial analysis of genetic diversity; J.J., J.G. and J.C., sequence assembly and map integration of haploid Clementine reference; J. Salse and F.M., analysis of synteny and genome evolution; U.H., analysis of population history and ancestry; K.L., J.P.-P., A.C., J.P., D.B. and K.J., dideoxy shotgun sequencing and analysis of haploid Clementine reference; S.S., S. Pinosio, A.Z., C.D.F., X.P. and M. Ruiz, analysis of sequencing and resequencing data, and repetitive sequence annotation and analysis; F.C., Sanger and Illumina sequencing; A.L., P.B. and M.B., sweet-orange gene model predictions; C.C. and W.G.F., 454 sequencing of sweet orange and Illumina sequencing of Siamese Sweet pummelo; C.C., contributions to sweet-orange transcriptome, annotation and strategic rationale for comparative analyses; P.A., J.P.-P. and L.N., haploid Clementine DNA; J.P.-P. and D. Ramón, haploid Clementine transcriptome; J.T., F.R.T., L.H.E., J.V.M.-S., V.I., A.H.-O. and M.T., generation of BAC clones of the haploid Clementine and contribution of genome sequences of sweet orange, Ponkan, diploid Clementine and Willowleaf mandarins; B.D., C.K., M. Mohiuddin, T.H. and K.F., sweet-orange 454 transcriptome and genome sequencing and assembly; M.A.M. and M.A.T., Ponkan shotgun sequence; M. Roose, W. Murcott shotgun sequence; M. Morgante, Chandler pummelo and Seville sour-orange shotgun sequence; G.R., J.F.-A., F.Q., L.N., F.L. and M. Roose, project coordination; D. Rokhsar, F.G., G.A.W. and S. Prochnik, writing of the paper with substantial input from M.T., P.O., M. Mohiuddin, O.J. and M. Roose; F.G., D. Rokhsar, O.J., P.O., M.A.M., M. Morgante, M.T., J. Schmutz and P.W., project coordination and scientific leadership.

