GenePRIMP: a gene prediction improvement pipeline for prokaryotic genomes

We present the 'gene prediction improvement pipeline' (GenePRIMP), a computational process that performs evidence-based evaluation of gene models in prokaryotic genomes and reports anomalies including inconsistent start sites, missed genes and split genes. We found that manual curation of gene models using the anomaly reports generated by GenePRIMP improved their quality, and demonstrate the applicability of GenePRIMP in improving finishing quality and comparing different genome-sequencing and annotation technologies.

Figure 1: GenePRIMP analysis of gene calls in the M. palustris genome by three gene callers.
Figure 2: The GenePRIMP processing pipeline.


We acknowledge the help and support of I. Anderson, K. Mavromatis, X. Zhao and V. Markowitz. GenePRIMP was developed under the auspices of the US Department of Energy's Office of Science, Biological and Environmental Research Program and by the University of California, Lawrence Berkeley National Laboratory under contract DE-AC02-05CH11231, Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 and Los Alamos National Laboratory under contract DE-AC02-06NA25396. Validation and improvement of the system was supported by US National Institutes of Health Data Analysis and Coordination Center contract U01-HG004866. The work conducted by the US Department of Energy Joint Genome Institute is supported by the Office of Science of the US Department of Energy under contract DE-AC02-05CH11231.

Author information

N.N.I. and N.C.K. conceived the initial approach. N.N.I. and A.P. designed the system. A.P. implemented the GenePRIMP code base and web portal. S.D.H. contributed to the development of the web portal. N.N.I., N.M., G.O. and A.L. manually curated the genomes sequenced at the Department of Energy Joint Genome Institute and contributed to testing and validation.

Correspondence to Amrita Pati.

Pati, A., Ivanova, N., Mikhailova, N. et al. GenePRIMP: a gene prediction improvement pipeline for prokaryotic genomes. Nat Methods 7, 455–457 (2010) doi:10.1038/nmeth.1457

