Accurate pathological diagnosis is crucial for optimal management of patients with cancer. For the approximately 100 known tumour types of the central nervous system, standardization of the diagnostic process has been shown to be particularly challenging—with substantial inter-observer variability in the histopathological diagnosis of many tumour types. Here we present a comprehensive approach for the DNA methylation-based classification of central nervous system tumours across all entities and age groups, and demonstrate its application in a routine diagnostic setting. We show that the availability of this method may have a substantial impact on diagnostic precision compared to standard methods, resulting in a change of diagnosis in up to 12% of prospective cases. For broader accessibility, we have designed a free online classifier tool, the use of which does not require any additional onsite data processing. Our results provide a blueprint for the generation of machine-learning-based tumour classifiers across other cancer entities, with the potential to fundamentally transform tumour pathology.
Gene Expression Omnibus
We thank U. Lass, A. Habel and I. Oezen for technical and administrative support, the Microarray unit of the Genomics and Proteomics Core Facility (DKFZ) for methylation services, and the German Glioma Network and the Neuroonkologische Arbeitsgemeinschaft for sharing their data. This research was supported by the DKFZ-Heidelberg Center for Personalized Oncology (DKFZ-HIPO_036), the German Childhood Cancer Foundation (DKS 2015.01), an Illumina Medical Research Grant, the DKTK joint funding project ‘Next Generation Molecular Diagnostics of Malignant Gliomas’, the A Kids’ Brain Tumour Cure (PLGA) Foundation, the Brain Tumour Charity (UK) for the Everest Centre for Paediatric Low-Grade Brain Tumour Research, the Friedberg Charitable Foundation and the Sohn Conference Foundation (to M. Snuderl and M. Karajannis), the RKA-Förderpool (Project 37) and Stichting Kinderen Kankervrij and Stichting AMC Foundation (to E. Aronica), NIH/NCI 5T32CA163185 (to A.O.), NIH/NCI Cancer Center Support Grant P30 CA008748 to MSKCC, the Luxembourg National Research Fund (FNR PEARL P16/BM/11192868 to M.M.) and the National Institute for Health Research (NIHR) UCLH/UCL Biomedical Research Centre (S.Bra.).
Extended data figures
Overview of reference methylation class characteristics. This table gives an overview of the main characteristics of the 82 tumour and 9 non-tumour methylation classes, including the full name of each methylation class, its association with a methylation class family, number of cases per class, class age characteristics, male/female ratio, tumour localization, most frequent pathological diagnoses and a running text summarizing typical class features. The hex colour code used for each reference class throughout this manuscript is also provided.
Case by case list of reference cohort. This table gives case-by-case details of the n=2801 biologically independent samples constituting the reference cohort including the Sentrix ID (.idat), tissue source, clinical data, methylation class and technical specifications.
Single class sensitivity and specificity. This table provides single-class specificity and sensitivity at a calibrated classifier score of ≥0.9 for methylation class families and for methylation classes that are not assigned to a methylation class family. In addition, single-class specificity and sensitivity are provided at a calibrated classifier score of ≥0.5 for methylation classes that are part of a methylation class family; this lower threshold can be used for subclassification, that is, for identifying individual family members. The data were generated using n=2801 biologically independent samples.
Case by case list of prospective validation cohort. This table gives case-by-case details of the n=1104 biologically independent samples constituting the prospective clinical cohort including information on the tissue source, clinical data, methylation class prediction (Classifier version V11b2), interpretation of classification and technical specifications.
Case by case list of discordant cases. This table gives case-by-case details of the n=139 biologically independent samples with discordant results between pathological diagnosis and methylation profiling. The cases are categorized as either reclassified ("establishing new diagnosis", n=129) or misleading profile (n=10). Information on the orthogonal methods used for reassessment, as well as the key information resulting in reclassification, is provided.
Case by case list of external diagnostic cohort. This table gives case-by-case details of the n=401 biologically independent samples constituting the external centre diagnostic cohort including clinical data, original pathological diagnosis, methylation class prediction, interpretation of classification and the final pathological diagnosis (after integration with classifier result).
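As an illustration of the quantities reported in the single-class sensitivity and specificity table, the following minimal Python sketch computes both values for one class at a given calibrated-score cutoff. It is not the authors' pipeline. It assumes that a sample counts as a positive call for a class only when that class is the top prediction and its calibrated score reaches the cutoff; the function name `single_class_metrics`, the class labels "A" and "B", and the toy scores are all hypothetical.

```python
def single_class_metrics(true_labels, predicted_labels, scores, target, threshold=0.9):
    """Return (sensitivity, specificity) for one class at a score cutoff.

    Assumed rule (not taken from the paper): a sample is a positive call for
    `target` only if it is predicted as `target` AND its calibrated score is
    at least `threshold`. Everything else counts as a negative call.
    """
    tp = fp = tn = fn = 0
    for truth, pred, score in zip(true_labels, predicted_labels, scores):
        called = (pred == target) and (score >= threshold)
        if truth == target:
            if called:
                tp += 1
            else:
                fn += 1
        else:
            if called:
                fp += 1
            else:
                tn += 1
    # Guard against classes with no positives or no negatives.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: six samples, two classes.
truth = ["A", "A", "A", "B", "B", "B"]
pred = ["A", "A", "B", "B", "A", "B"]
score = [0.95, 0.60, 0.92, 0.99, 0.91, 0.40]

sens_09, spec_09 = single_class_metrics(truth, pred, score, "A", threshold=0.9)
sens_05, spec_05 = single_class_metrics(truth, pred, score, "A", threshold=0.5)
```

In this toy example, lowering the cutoff from 0.9 to 0.5 recovers one additional true class "A" sample without adding false positives, which mirrors why a lower cutoff can be acceptable for subclassification within a family.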