Fig. 2: Progress in annotating the human genome. | Nature

From: Perspectives on ENCODE

a, Improvement of gene annotations in the past 15 years by GENCODE, an international gene annotation group that uses ENCODE data (ref. 42). b, ENCODE annotations in 2012 with phase II data. Bars show the percentages of the mappable human genome (3.1 billion nucleotides; hg19) that were annotated as open chromatin by DNase-seq data, enriched in four types of active histone mark according to ChIP–seq data, and annotated as transcription factor binding sites (TFBSs) according to ChIP–seq data. Also shown are percentages of the genome assigned as transcription start sites (TSSs), enhancers and the insulator-binding protein (CTCF) by combining ChromHMM and Segway genome segmentations (ref. 7). c, ENCODE annotations in 2019 with ENCODE 2, Roadmap, and ENCODE 3 data. The registry of cCREs developed during phase III defines 0.3%, 1.1%, 5.8%, 0.2% and 0.4% of the human genome as cCREs with promoter-like signatures (PLS), proximal enhancer-like signatures (pELS), distal enhancer-like signatures (dELS), with high DNase, high H3K4me3 and low H3K27ac signals (DNase-H3K4me3), and bound by CTCF, respectively. d, A UCSC genome browser view of GENCODE genes (V7) coloured by transcript annotation (blue for coding, green for noncoding, and red for problematic) and combined genome segmentation (TSSs in red, enhancers in orange, weak enhancers in yellow, transcription in green, repressed in grey) at the CTCF locus on the hg19 human genome. e, The UCSC genome browser view of GENCODE genes (V28, coloured as in d) and cCREs at the CTCF locus on the hg38 human genome (ref. 8). Promoter-like, enhancer-like, and CTCF-only cCREs annotated in B cells are in red, yellow, and blue, respectively. The last four tracks show the DNase, H3K4me3, H3K27ac, and CTCF signals in B cells.
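
To give a rough sense of scale for the panel c percentages, the short Python sketch below sums them and converts each into an approximate nucleotide count. It is only an illustration: the dictionary keys and the helper `approx_nucleotides` are hypothetical names, and the genome size reuses the 3.1 billion nucleotide figure quoted for panel b (hg19), which is an assumption because the caption does not state the denominator used for panel c.

```python
# Percentages of the human genome per cCRE group, as quoted for panel c.
CCRE_PERCENT = {
    "PLS": 0.3,            # promoter-like signatures
    "pELS": 1.1,           # proximal enhancer-like signatures
    "dELS": 5.8,           # distal enhancer-like signatures
    "DNase-H3K4me3": 0.2,  # high DNase, high H3K4me3, low H3K27ac
    "CTCF": 0.4,           # bound by CTCF
}

# Assumption: the 3.1 billion nucleotide figure given for panel b (hg19)
# is reused here; the caption does not give a genome size for panel c.
GENOME_SIZE_NT = 3_100_000_000


def approx_nucleotides(percent, genome_size=GENOME_SIZE_NT):
    """Convert a percentage of the genome into an approximate nucleotide count."""
    return round(genome_size * percent / 100)


# Sum of the five quoted percentages, rounded to the caption's one-decimal precision.
total_percent = round(sum(CCRE_PERCENT.values()), 1)

if __name__ == "__main__":
    for name, pct in CCRE_PERCENT.items():
        print(f"{name}: {pct}% is roughly {approx_nucleotides(pct):,} nt")
    print(f"All cCRE groups combined: {total_percent}%")
```

Because the quoted percentages are rounded to one decimal place, the resulting counts are order-of-magnitude estimates rather than exact registry sizes.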
