Chloroplast gene organization deduced from complete sequence of liverwort Marchantia polymorpha chloroplast DNA


Chloroplasts contain their own autonomously replicating DNA genome. The majority of proteins present in chloroplasts are encoded by nuclear DNA, but the rest are encoded by chloroplast DNA and synthesized by the chloroplast transcription–translation machinery (refs 1–4). Although the nucleotide sequences of many chloroplast genes from various plant species have been determined, the entire gene organization of the chloroplast genome has not yet been elucidated for any plant species. To improve our understanding of the chloroplast gene system, we have determined the complete sequence of the chloroplast DNA from a liverwort, Marchantia polymorpha, and deduced its gene organization. As reported here, the liverwort chloroplast DNA contains 121,024 base pairs (bp), consisting of a pair of large inverted repeats (IRA and IRB, each of 10,058 bp) separated by a small single-copy region (SSC, 19,813 bp) and a large single-copy region (LSC, 81,095 bp). We detected 128 possible genes throughout the liverwort chloroplast genome, including coding sequences for four kinds of ribosomal RNAs, 32 species of transfer RNAs and 55 identified open reading frames (ORFs) for proteins, which are separated by short A+T-rich spacers (Fig. 1). Twenty genes (8 encoding tRNAs, 12 encoding proteins) contain introns in their coding sequences. These introns can be classified as belonging to either group I or group II, as described for mitochondria (ref. 5). Interestingly, seven of the identified ORFs show high homology to unidentified reading frames (URFs) found in human mitochondria (refs 6, 7).
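
As a quick consistency check on the region sizes quoted above, the following minimal Python sketch sums the four regions and compares the result with the reported genome length. The region lengths and the total are taken from the text; the names `REGION_LENGTHS_BP`, `genome_length` and `region_fractions` are illustrative assumptions and are not part of the original work, and the ordering of the regions in the dictionary is for readability only.

```python
# Region lengths in base pairs, as quoted in the text.
# Dictionary order is illustrative, not a statement about the physical map.
REGION_LENGTHS_BP = {
    "LSC": 81_095,  # large single-copy region
    "IRA": 10_058,  # inverted repeat A
    "SSC": 19_813,  # small single-copy region
    "IRB": 10_058,  # inverted repeat B
}

# Total genome length as quoted in the text.
REPORTED_TOTAL_BP = 121_024


def genome_length(regions):
    """Return the summed length (bp) of all regions."""
    return sum(regions.values())


def region_fractions(regions):
    """Return each region's share of the summed genome length."""
    total = genome_length(regions)
    return {name: length / total for name, length in regions.items()}


if __name__ == "__main__":
    total = genome_length(REGION_LENGTHS_BP)
    print(f"Sum of regions: {total} bp (reported: {REPORTED_TOTAL_BP} bp)")
    for name, fraction in region_fractions(REGION_LENGTHS_BP).items():
        print(f"{name}: {fraction:.1%} of the genome")
```

The four regions sum to 121,024 bp (81,095 + 19,813 + 2 × 10,058), in agreement with the stated total.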

