Pseudogenes are genomic sequences that carry mutations preventing the production of a functional protein. They arise through several processes, including retrotransposition, duplication followed by inactivation of the gene copy, and acquisition of a disabling mutation in a previously functional gene. Because pseudogenes are markers of genome remodelling, they are an important tool for studying genome evolution. Sisu et al.¹ leveraged genomic data generated by the ENCODE project to annotate and investigate mammalian pseudogenes. Using the mouse reference genome and 18 inbred mouse strains, the authors combined manual curation with automatic pipelines to produce a comprehensive catalogue of mouse and human pseudogenes. Integrating genomic and functional data further allowed them to investigate the mechanisms of pseudogene biogenesis and activity, providing evidence that pseudogene biogenesis is similar in human and mouse. The authors then focused on unitary pseudogenes, which arise when a functional gene is inactivated by mutation at its original coding locus, and annotated 165 mouse and 303 human unitary pseudogenes. Finally, analysis of RNA sequencing datasets from mouse development allowed the authors to investigate whether pseudogenes are generated during gametogenesis and early embryonic development. The catalogue is publicly available at http://mouse.pseudogene.org/, a resource that will aid the investigation of genome evolution and its regulation.
1. Sisu, C. et al. Transcriptional activity and strain-specific history of mouse pseudogenes. Nat. Commun. https://doi.org/10.1038/s41467-020-17157-w (2020).