Mutations are not distributed evenly within human cancer genomes. Comparing changes in mutation rate with the distribution of epigenetic marks throughout the genome has revealed some of the mechanisms underlying mutational heterogeneity in tumours. However, cancer types differ in their mutation rates and in the distribution of mutations within their genomes, as well as in their patterns of chromatin accessibility, histone modifications and gene expression. Therefore, understanding which factors contribute to mutational heterogeneity requires analysing the relationship between mutation patterns and epigenetic marks in a cell type-specific manner.

Credit: Lara Crow / NPG

Polak et al. analysed the distribution of cancer-associated mutations in a total of 173 cancer genomes from 8 different cancer types (including melanoma, liver cancer, multiple myeloma, lung adenocarcinoma and glioblastoma). They then compared the regional distribution of mutations in these cancer genomes (which seemed similar, although not identical) with the distribution of 424 epigenetic features from more than 100 normal cell types, taken from the epigenome maps produced by the Roadmap Epigenomics Program.

The authors observed that the genomic distribution of chromatin features corresponding to the likely cell type of origin was more strongly associated with local mutation density than the distribution of features found in unrelated cell types. For example, even though the regions enriched in histone H3 lysine 4 monomethylation (H3K4me1) in melanocytes and hepatocytes are similar, the distribution of mutations in liver cancer followed the levels of H3K4me1 in hepatocytes but not in melanocytes, whereas melanoma mutations correlated with the levels of H3K4me1 in melanocytes but not in hepatocytes. This suggested that the effect of chromatin structure on local mutation density is highly cell type-specific.

The authors quantified the extent to which the epigenetic signature contributed to the regional mutation density by using Random Forest regression, a non-parametric machine-learning method that combines the output of an ensemble of regression trees to predict the value of a continuous response variable. Averaging over many regression trees reduces both the risk of over-fitting and the noise of any single tree. The results of this analysis confirmed that the density and distribution of cancer mutations are strongly linked to a cell type-specific epigenomic signature. Interestingly, the epigenomic signatures of cancer cell lines were poor predictors of the mutation profile of the different cancers analysed. For example, prediction of liver cancer mutation density using chromatin features of the liver cancer cell line HepG2 was less accurate than prediction using features from hepatocytes; likewise, chromatin accessibility in melanocytes gave higher prediction accuracy for melanoma than chromatin accessibility in the COLO829 melanoma cell line.
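To make the idea of an ensemble of regression trees concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each genomic window is described by a row of chromatin feature values and a mutation density; the function names (`build_tree`, `fit_forest`, `predict_forest`), the tree depth, the number of trees and the toy data are all hypothetical choices made for illustration.

```python
import random
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, depth=0, max_depth=3, min_leaf=2):
    """Grow one regression tree; each split considers a random subset of features."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return mean(y)                        # leaf: predict the mean response
    n_features = len(X[0])
    k = max(1, n_features // 3)               # assumed heuristic for subset size
    best = None
    for f in rng.sample(range(n_features), k):
        for threshold in sorted({row[f] for row in X}):
            left = [i for i, row in enumerate(X) if row[f] <= threshold]
            right = [i for i, row in enumerate(X) if row[f] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, f, threshold, left, right)
    if best is None:
        return mean(y)                        # no valid split: make a leaf
    _, f, threshold, left, right = best

    def grow(idx):
        return build_tree([X[i] for i in idx], [y[i] for i in idx],
                          rng, depth + 1, max_depth, min_leaf)

    return (f, threshold, grow(left), grow(right))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's mean response."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node

def fit_forest(X, y, n_trees=50, seed=0):
    """Fit each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """The ensemble prediction is the average of the individual tree predictions."""
    return mean(predict_tree(tree, row) for tree in forest)

# Toy data (made up, not from the study): 60 windows, 3 hypothetical features.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(60)]
# Only feature 0 influences the simulated mutation density.
y = [10.0 - 2.0 * row[0] + data_rng.gauss(0, 0.1) for row in X]

forest = fit_forest(X, y)
pred_low = predict_forest(forest, [0.1, 0.5, 0.5])    # low level of feature 0
pred_high = predict_forest(forest, [0.9, 0.5, 0.5])   # high level of feature 0
```

In this toy setting the forest should predict a higher mutation density for the window with a low level of the informative feature than for the window with a high level. In practice an established library implementation would normally be preferred over a hand-written forest.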

Finally, the cell type specificity of the association between chromatin features and mutation density suggests that the cell of origin of an individual tumour sample could be predicted from its mutation pattern. Indeed, a predictor based on enrichment of epigenomic variables from a single cell type was able to link 88% of cancer genomes from 6 different types of cancer to their cell type of origin. This could be relevant for the identification of the cell type of origin in cancers of unknown primary.
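The logic of assigning a tumour to a cell type of origin can be illustrated with a deliberately simplified stand-in for the predictor described above. Instead of the enrichment-based predictor used in the study, this sketch simply picks the cell type whose chromatin profile correlates most strongly (in absolute value) with the tumour's regional mutation density. The function names, cell-type labels and all numbers are made-up assumptions for illustration.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def predict_cell_of_origin(mutation_density, profiles):
    """Return the best-matching cell type and the score of every candidate.

    The score is the absolute correlation between mutation density and the
    chromatin feature, so the sign of the association does not matter.
    """
    scores = {cell: abs(pearson(mutation_density, marks))
              for cell, marks in profiles.items()}
    return max(scores, key=scores.get), scores

# Toy values for six genomic windows (invented, not data from the study).
tumour_mutation_density = [12.0, 9.0, 15.0, 4.0, 7.0, 11.0]
chromatin_profiles = {
    "hepatocyte":      [2.0, 3.5, 1.0, 6.0, 4.5, 2.5],
    "melanocyte":      [3.0, 2.0, 3.5, 4.0, 1.5, 5.0],
    "lung_epithelium": [4.0, 4.0, 2.0, 3.0, 5.0, 1.0],
}

best_cell, scores = predict_cell_of_origin(tumour_mutation_density,
                                           chromatin_profiles)
print(best_cell)
```

With these invented numbers the hepatocyte profile tracks the mutation density far more closely than the other two, so the sketch assigns the toy tumour to hepatocytes. A real predictor would draw on many features per cell type and would need to be validated on tumours of known origin.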