Credit: P. Morgan/Macmillan Publishers Limited

The domestication of class 2 CRISPR–Cas systems has revolutionized molecular biology by allowing targeted editing of specific sequences in the genome. However, the CRISPR–Cas technologies currently in use are derived from cultured bacterial isolates, even though these 'bacterial immune systems' are widespread in nature. Now, Burstein et al. use metagenomics to identify novel CRISPR–Cas systems across bacterial and archaeal species.

The authors analysed a terabase-scale metagenomic data set, sampled from microbial communities in groundwater, sediment, biofilms and the gut, to identify novel class 2 CRISPR–Cas systems. They focused on genes located close to both a CRISPR array and the gene encoding the Cas1 integrase. In doing so, the group identified the first Cas9 proteins to be found in non-bacterial species, the nanoarchaea ARMAN-1 and ARMAN-4. Intriguingly, phylogenetic analysis suggests that these archaeal systems may represent a fusion between canonical type II-C and type II-B systems. The protospacer sequences of the ARMAN-1 system suggest a defence function against transposons, whereas the ARMAN-4 system has no identifiable target, which may indicate an alternative role for this system.
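The search strategy described above, looking for genes that sit near both a CRISPR array and cas1, can be illustrated with a minimal sketch. This is not the authors' pipeline: the Feature record, the function names and the 10-kb window are all hypothetical choices made for illustration, and real input would come from gene-prediction and CRISPR-array detection tools run on assembled contigs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A hypothetical annotated feature (gene or CRISPR array) on a contig."""
    contig: str
    start: int
    end: int
    name: str


def gap(a: Feature, b: Feature) -> int:
    """Distance in bp between two features on one contig (0 if they overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def candidates_near_array_and_cas1(genes, arrays, window=10_000):
    """Return genes within `window` bp of both a CRISPR array and a cas1 gene.

    The window size is an assumed value, not one taken from the study.
    """
    cas1_genes = [g for g in genes if g.name == "cas1"]
    hits = []
    for gene in genes:
        if gene.name == "cas1":
            continue  # cas1 is the anchor, not a candidate effector
        near_array = any(
            a.contig == gene.contig and gap(gene, a) <= window for a in arrays
        )
        near_cas1 = any(
            c.contig == gene.contig and gap(gene, c) <= window for c in cas1_genes
        )
        if near_array and near_cas1:
            hits.append(gene)
    return hits


# Toy example with made-up coordinates
genes = [
    Feature("contig_1", 1_000, 4_000, "unknown_A"),
    Feature("contig_1", 4_500, 5_400, "cas1"),
    Feature("contig_1", 50_000, 52_000, "unknown_B"),
    Feature("contig_2", 100, 900, "unknown_C"),
]
arrays = [Feature("contig_1", 6_000, 6_800, "CRISPR_array")]

candidates = candidates_near_array_and_cas1(genes, arrays)
print([g.name for g in candidates])
```

In this toy input, only the uncharacterized gene lying next to both the array and cas1 is retained; genes far from the array or on a contig without an array are discarded. Candidates found this way would still need the phylogenetic and experimental follow-up that the study describes.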

In addition to the archaeal systems, the team identified an uncharacterized bacterial protein, which they term CasX, in Deltaproteobacteria and Planctomycetes; its presence in both phyla may be the result of cross-phylum gene transfer. CasX has weak similarities to transposases and contains a RuvC domain, but the majority of the protein has no similarity to other known proteins. Nonetheless, as the authors demonstrate, CasX functions as a DNA-targeting, dual-RNA-guided CRISPR-associated protein.

The group also identified a CRISPR–CasY system in symbiotic candidate phyla radiation bacteria. Sequence analysis indicated that, of the six CasY proteins identified, four had some similarity to C2c3 effector proteins. The authors synthesized CRISPR–CasY and showed that, in transformed Escherichia coli cells, the system is capable of recognizing and depleting targeted DNA sequences.

By mining metagenomic data sets, the team identified new CRISPR–Cas systems in bacterial and archaeal species that are difficult to culture with traditional techniques. Combining metagenomics with experimental analysis could allow virtually any environment to be explored for novel CRISPR–Cas systems. The compact loci identified in this study suggest potential new avenues for the development of genome editing tools.