In their Perspective (A unified classification system for eukaryotic transposable elements. Nature Rev. Genet. 8, 973–982 (2007))1, Wicker et al. attempt to introduce the 'first universal classification scheme'. Here, we would like to point out a similar universal hierarchical classification system that was developed earlier by us and implemented in Repbase2.

Repbase is a database of eukaryotic repetitive and transposable elements (TEs) that has been under development since 1990. It was first published as a collection of the consensus sequences and sequence fragments of human TEs, as well as satellite DNA, that were available at the time3. Subsequently, repetitive and transposable sequences from other animal and plant species were added, and in the mid-1990s Repbase became available online for downloading and for sequence analysis using the computer tools Censor server4 and RepeatMasker. At the same time, a systematic classification of repetitive elements based on their origin from different classes of TEs was developed and implemented in parallel in Repbase and RepeatMasker. Since 2001, Repbase has been routinely used in conjunction with RepeatMasker to analyze and annotate entire genomes. The resulting information is available through major international genome browsers, including the University of California, Santa Cruz (UCSC) browser and the Ensembl browser.

The current version of Repbase, known as Repbase Update, contains >7,600 sequences of TEs and other repeats, including those that are reported in the literature and those that are only reported in Repbase. Since 2001, all new information on TEs compiled in Repbase is first published in an electronic journal, Repbase Reports2,4.

In 2005, Repbase was converted to a relational database, which permitted us to implement our universal classification of TEs. According to this classification (Fig. 1), all eukaryotic TEs belong to one of two types (retrotransposons and DNA transposons), which are divided into five major classes: long terminal repeat (LTR) retrotransposons, non-LTR retrotransposons, cut-and-paste DNA transposons, rolling-circle DNA transposons (Helitrons) and self-synthesizing DNA transposons (Polintons). This classification is based on enzymology, structural similarities and sequence relationships5,6,7,8,9,10,11,12,13,14. Each class of TE is composed of a small number of superfamilies or clades5,6,8,9,10,11,15 (see the 40 superfamilies in Fig. 1). Each superfamily consists of numerous families of TEs. Ancient families are represented in Repbase by consensus sequences approximating the active TEs from which these families were derived (consensus sequences of any two families are less than 75% identical to each other).
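The family-level criterion stated above (consensus sequences of any two families are less than 75% identical) can be illustrated with a minimal Python sketch. It is not the Repbase implementation. It assumes that the two consensus sequences have already been aligned by some external tool, that gaps are written as "-", and that identity is counted only over columns in which neither sequence has a gap. The function names and the gap handling are hypothetical choices made for illustration; only the 75% threshold comes from the text.

```python
# Illustrative sketch only: the 75% threshold is from the text; the way
# identity is computed here (ignoring gapped columns) is an assumption.

FAMILY_IDENTITY_THRESHOLD = 75.0  # percent


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0  # nothing to compare
    return 100.0 * matches / compared


def are_distinct_families(aligned_a: str, aligned_b: str,
                          threshold: float = FAMILY_IDENTITY_THRESHOLD) -> bool:
    """True if the two consensus sequences fall below the identity threshold."""
    return percent_identity(aligned_a, aligned_b) < threshold
```

Under these assumptions, two aligned consensus sequences that differ at one of eight positions (87.5% identity) would not be treated as separate families, whereas two that differ at three of eight positions (62.5% identity) would.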

Figure 1: The universal classification and nomenclature of eukaryotic transposable elements.

Different classes of transposable elements (TEs) are differently coloured. Penelope and DIRS can be viewed as two additional classes of retrotransposons. An asterisk indicates that the lengths of target-site duplications (TSDs) by short interspersed nuclear elements (SINEs) depend on non-LTR retrotransposons being involved in their transpositions. Alternative names for the superfamilies are shown in parentheses. LTR, long terminal repeat; TA, TpA dinucleotide.

For instance, the class of LTR retrotransposons is composed of the Gypsy, Copia, BEL and DIRS superfamilies, plus the ERV1, ERV2 and ERV3 superfamilies of endogenous retroviruses6,13,15. The class of non-LTR retrotransposons is composed of the CR1, CRE, I, Jockey, L1, NeSL, Penelope, R2, R4, RandI, Rex1, RTE and Tx1 superfamilies (also known as clades)8,15. It also includes the SINE1, SINE2 and SINE3 superfamilies of short interspersed nuclear elements (SINEs), which are viewed as non-autonomous non-LTR retrotransposons7. The class of cut-and-paste DNA transposons consists of 15 superfamilies, including those described only in Repbase (Mirage, Rehavkus, Nobosib, Kolobok, ISL2EU and Chapaev). Autonomous TEs from each of these superfamilies encode superfamily-specific transposases, in the sense that transposases from different superfamilies are not similar to each other (that is, the E-value in BLASTP or PSI-BLAST is greater than 0.01).
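The hierarchy described above (type, class, superfamily) can be sketched as a simple lookup table in Python. This is an illustration, not the Repbase database schema. It contains only the superfamilies named in the text, so the cut-and-paste class lists 6 of its 15 superfamilies and the Helitron and Polinton classes are left empty. The names `CLASSIFICATION`, `classify` and `is_significant_hit` are hypothetical, and treating an E-value of exactly 0.01 as significant is an assumption, because the text only states that values greater than 0.01 indicate no similarity.

```python
from typing import Optional, Tuple

# Type -> class -> superfamilies named in the text (not the full set of 40).
CLASSIFICATION = {
    "retrotransposon": {
        "LTR retrotransposon": [
            "Gypsy", "Copia", "BEL", "DIRS", "ERV1", "ERV2", "ERV3",
        ],
        "non-LTR retrotransposon": [
            "CR1", "CRE", "I", "Jockey", "L1", "NeSL", "Penelope", "R2",
            "R4", "RandI", "Rex1", "RTE", "Tx1", "SINE1", "SINE2", "SINE3",
        ],
    },
    "DNA transposon": {
        # Only the 6 superfamilies named in the text, out of 15.
        "cut-and-paste DNA transposon": [
            "Mirage", "Rehavkus", "Nobosib", "Kolobok", "ISL2EU", "Chapaev",
        ],
        # The text names no superfamilies for these two classes.
        "rolling-circle DNA transposon (Helitron)": [],
        "self-synthesizing DNA transposon (Polinton)": [],
    },
}

EVALUE_CUTOFF = 0.01  # from the text: E-value > 0.01 means "not similar"


def classify(superfamily: str) -> Optional[Tuple[str, str]]:
    """Return (type, class) for a superfamily name, or None if it is not listed."""
    for te_type, classes in CLASSIFICATION.items():
        for te_class, superfamilies in classes.items():
            if superfamily in superfamilies:
                return te_type, te_class
    return None


def is_significant_hit(evalue: float) -> bool:
    """Assumption: E-values at or below the cutoff count as significant."""
    return evalue <= EVALUE_CUTOFF
```

For example, `classify("Gypsy")` returns the retrotransposon type and the LTR retrotransposon class, and a transposase comparison with an E-value of 0.5 would not count as evidence that two elements belong to the same superfamily.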

Based on a system that was established over a decade ago by Smit and ourselves13,16, non-autonomous DNA transposons are routinely classified based on significant similarities of their terminal inverted repeats and target-site duplications to those in known autonomous DNA transposons. Analogously, structural and sequence similarities are used for the classification of non-autonomous LTR and non-LTR retrotransposons.

Although the Repbase interface does not directly display the hierarchical classification scheme, it corresponds to the scheme published in the literature. According to the published information, the eukaryotic DNA transposons identified so far belong to three classes, characterized by the cut-and-paste, rolling-circle and self-synthesizing mechanisms of transposition11,15.

During the last 4 years, thousands of families of transposable elements in genomes of several eukaryotic species have been identified, classified and named based on the classification scheme and nomenclature shown in Fig. 1, including those from protozoans (diatom Thalassiosira pseudonana and green alga Chlamydomonas reinhardtii)17,18, fungi (Aspergillus nidulans, Aspergillus oryzae and Aspergillus fumigatus)19, cnidarians (starlet sea anemone Nematostella vectensis)20 and mammals (opossum Monodelphis domestica)21.

In April 2006, we presented the above classification scheme at the first international conference and workshop on the Genomic Impact of Eukaryotic Transposable Elements, which also included a session devoted to the unified classification and nomenclature of TEs. During this conference, which was attended by 150 scientists working in the field, an International Committee on the Classification of Transposable Elements was constituted.