MethylSPWNet and MethylCapsNet: Biologically Motivated Organization of DNAm Neural Networks, Inspired by Capsule Networks

Levy, Joshua J.; Chen, Youdinghuan; Azizgolshani, Nasim; Petersen, Curtis L.; Titus, Alexander J.; Moen, Erika L.; Vaickus, Louis J.; Salas, Lucas A.; Christensen, Brock C.

doi:10.1038/s41540-021-00193-7

Technology Feature
Open access
Published: 20 August 2021

npj Systems Biology and Applications volume 7, Article number: 33 (2021)

Abstract

DNA methylation (DNAm) alterations have been heavily implicated in carcinogenesis and the pathophysiology of diseases through upstream regulation of gene expression. DNAm deep-learning approaches are able to capture features associated with aging, cell type, and disease progression, but lack incorporation of prior biological knowledge. Here, we present modular, user-friendly deep-learning methodology and software, MethylCapsNet and MethylSPWNet, that group CpGs into biologically relevant capsules—such as gene promoter context, CpG island relationship, or user-defined groupings—and relate them to diagnostic and prognostic outcomes. We demonstrate these models’ utility on 3,897 individuals in the classification of central nervous system (CNS) tumors. MethylCapsNet and MethylSPWNet provide an opportunity to increase DNAm deep-learning analyses’ interpretability by enabling a flexible organization of DNAm data into biologically relevant capsules.

Introduction

DNA methylation (DNAm) is a key epigenetic regulator of gene expression in health and disease states, processes of aging and cellular differentiation/stemness, and response to environmental exposures^1,2,3. DNAm of cytosine in the context of cytosine–guanine dinucleotide (CpG) sites can be measured with standardized genome-scale oligonucleotide bead arrays at hundreds of thousands of sites^4,5. Though a CpG is either unmethylated or methylated, fluorescence signal intensities from array measures of bulk biospecimen DNA are used to derive a beta-value measure that approximates the proportion of methylated DNA copies. Gene promoter CpG island methylation is associated with repression of transcription, whereas unmethylated CpG islands are permissive to gene transcription. Alterations to DNAm have a well-established role in carcinogenesis and tumor progression, including inactivation of tumor suppressor genes, aberrant oncogene expression, and loss of repression of repetitive element sequences that contribute to genomic instability^6,7.
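The beta value described above can be illustrated with a minimal sketch. It assumes the commonly used array formula, in which the methylated intensity is divided by the sum of both intensities plus a small stabilizing offset; the offset of 100 and the function name `beta_value` are assumptions for illustration and are not stated in this text.

```python
def beta_value(methylated: float, unmethylated: float, offset: float = 100.0) -> float:
    """Approximate the proportion of methylated DNA copies at one CpG.

    `offset` is an assumed stabilizing constant that keeps the ratio
    well-defined when both intensities are close to zero.
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)


# Hypothetical intensities for a mostly methylated site.
example_beta = beta_value(9000.0, 900.0)
```

Because the offset sits in the denominator, the value stays strictly below 1 even for a fully methylated site, which is one reason beta values are treated as approximations of the true methylated proportion.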

The World Health Organization Central Nervous System (CNS) tumor classification includes over 38 tumor types defined by histopathological features⁸. Most of the 38 can be grouped into the broader glioma, ependymoma, and embryonal tumor types. Within those three categorizations, over 80 further delineations are specified by molecular subtyping. DNAm alterations have been heavily implicated in the development and prognosis of CNS tumors. For instance, epigenetic silencing of MGMT is associated with an improved response to chemotherapy in glioblastoma patients through the deactivation of crucial DNA repair mechanisms⁹. IDH mutations are associated with improved survival in glioma patients through subsequent global hypermethylation of CpG island promoters, known as induction of the CpG island methylator phenotype (CIMP)^10,11,12,13. Other examples include hypermethylation of Wnt and Shh pathways in medulloblastoma patients¹⁴. The success of differential methylation analyses in characterizing CNS tumors has recently led to the development of DNAm classifiers of brain tumors as companion diagnostic tools to understand and correctly diagnose challenging histologic cases and for the selection of targeted therapies⁸.

While the development of this methylation-based brain-tumor machine-learning classifier has been heralded as an improvement to the existing diagnostic framework, clinically applicable classifiers use only a small subset of measured CpGs (e.g., 10,000)¹⁵. Incorporating additional CpG predictors may allow for the resolution of tumor classes otherwise not identified and help understand relationships with outcomes¹⁶. Because of the prohibitive dimensionality of the data, this problem may be better approached using machine-learning analyses. Deep-learning algorithms are a subclass of machine-learning approaches that are based on the use of artificial neural networks (ANN)^17,18,19. Multilayer perceptrons (MLP) represent a subclass of neural networks that treat the input data as a one-dimensional vector and then pass the information from one set of nodes to a subsequent set of nodes through fully connected layers of weights/parameters. The information at the subsequent layer of nodes is transformed using nonlinear transforms/activations/link functions. These types of analyses are common for deep-learning analyses of DNAm data, where the input data are a list of beta values for each subject²⁰.
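The multilayer perceptron described above can be sketched in a few lines. This is an illustrative forward pass only, assuming ReLU activations in the hidden layers and a softmax output; the layer sizes and weights are made up, and it is not the architecture of any model in this work.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: one output node per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(values):
    """Nonlinear activation applied node by node."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Convert output scores into class probabilities."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mlp_forward(betas, layers):
    """Pass a one-dimensional vector of beta values through (weights, bias) layers."""
    hidden = betas
    for weights, bias in layers[:-1]:
        hidden = relu(dense(hidden, weights, bias))
    weights, bias = layers[-1]
    return softmax(dense(hidden, weights, bias))


# Toy network: 3 CpGs -> 2 hidden nodes -> 2 classes (all weights hypothetical).
layers = [
    ([[1.0, -1.0, 0.5], [0.0, 2.0, -1.0]], [0.0, 0.1]),
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
]
probs = mlp_forward([0.9, 0.1, 0.5], layers)
```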

DNAm deep-learning frameworks, e.g., MethylNet, can accurately characterize tissue, disease states, and infer subject age and cell-type proportions through unsupervised embedding, generation, classification, and regression tasks^{20,21,22,23,24}. They also attempt to ascribe important methylated loci using model interpretability frameworks such as SHAP²⁵ or LIME²⁶. While the inclusion of more CpGs presents an opportunity to expand the space of biologically testable hypotheses²⁰, statistical challenges (e.g., multicollinearity) with interpretations and generation of associations with pathways remain understudied²⁷.

Multicollinearity, the unusually high correlation between features, can be addressed with careful feature selection or grouping^1,28. Feature-selection methods and statistical learning methods, such as sparse Group LASSO and network regularization, have identified important CpGs in highly complex data^{29,30,31,32,33}. More recent work has called for a greater understanding of DNAm–DNAm interactions’ implications through the incorporation of Gaussian graphical models, canonical correlation analysis, and module discovery through weighted gene comethylation networks^{34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50}. There is growing support for the use of novel deep-learning methods to aggregate, group, and select CpGs by their local context (e.g., genes) to connect and interpret the data with clinical outcomes^51,52,53. Incorporation of prior biological knowledge improves the transparency and interpretability of the modeling approach and reduces noise while increasing the signal by meaningfully pruning redundant relationships between predictors⁵⁴.

Capsule networks have served as inspiration for methods that group CpGs to harness their statistical interactions and relate predictors’ groupings to clinical and biological outcomes²⁷. Capsule networks explicitly model the relationships between constituent parts/groups of predictors, or capsules, through parameterizing pose matrices (affine transformations) and then hierarchically associate each of these parts independently to higher-order targets of interest. While capsule networks are primarily featured in the computer vision domain, evolving methods within different biomedical specialties often utilize grouped organization of predictors in the neural network design⁵⁵.

Here we provide a deep-learning framework for methylation data that draws inspiration from capsule networks. We investigated the organization of CpG features into DNAm capsules, which represent local contexts that can be related to one another. MethylSPWNet and MethylCapsNet organize sets of CpGs into a series of capsules defined by higher-order genomic contexts and perform classification tasks (Fig. 1A, B). To bring additional interpretability to existing deep-learning approaches while capturing hierarchical association networks, we propose and explore MethylSPWNet (Sparse Pathway Network) and MethylCapsNet as deep-learning analogs of traditional enrichment approaches, both of which serve to highlight pertinent disease-related regulatory contexts. We provide recommendations for developing these capsule and network-deriving models and provide open-source software for training these models. The MethylCapsNet framework aims to broaden the utility of these tools by allowing end users to construct their own capsules, representing an array of biologically plausible contexts that further explain their target of interest.

**Fig. 1: Description of modeling approaches.**

Results

To illustrate the potential utility of capsule-inspired neural network approaches, we revisited the Capper et al.⁸ dataset used to train a model that differentiates CNS tumors⁵⁶. CNS tumor histology is largely characterized by the presence or absence of morphologically distinct cells of origin, including neuronal, astrocytic, microglial, oligodendrocytic, and Schwann cells. We aimed to predict the 38 histological subtypes of CNS tumors (39 classes, including controls) as a test case for the capsule-inspired neural network approaches. While distinct cell types may characterize these histological subtypes, it was not our aim to classify these cell types through this modeling approach, as methods for brain cell-type estimation using DNAm data are still under development. We compare the MethylCapsNet and MethylSPWNet frameworks for capsule organization with the existing MethylNet framework (which does not account for capsule-organized information²⁰) and a Random Forest model fit on 10k important CpGs derived using a previously established method (Random Forest 10k). Additionally, we provide a Random Forest model on the capsule-organized information extracted from MethylSPWNet (Random Forest Capsules). Additional details of modeling approaches, fitting procedures, and capsule selection are in the “Methods” section.

Capsule generation for CNS tumor prediction

Capsules may be supplied to the neural network approaches in the form of annotations and/or gene sets from MSigDB and GSEA: (1) genes, (2) sites upstream/downstream of the gene, (3) the following Illumina methylation array annotations—UCSC_RefGene_Name, UCSC_RefGene_Accession, UCSC_RefGene_Group, UCSC_CpG_Islands_Name, Relation_to_UCSC_CpG_Island, Phantom, DMR, Enhancer, HMM_Island, Regulatory_Feature_Name, Regulatory_Feature_Group, and DHS—and (4) the following GSEA gene sets: C5.BP, C6, C1, H, C3.MIR, C2.CGP, C4.CM, C5.CC, C3.TFT, C5.MF, C7, C2.CP, and C4.CGN. Importantly, users can also input custom capsules into the pipeline through a dictionary that maps CpG to a context name of choice. Finally, capsule generation has been integrated with BedTools⁵⁷ (genomic_binned selection), which can break up the entire hg19 genome into overlapping windows of fixed width. CpGs in these windows will belong to these capsules. We utilized gene capsules for the primary classification study, though alternative methods for capsule formation are explored in the section “Exploration of alternative capsule formations and cancer subtypes.”
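Two of the capsule definitions above, the custom CpG-to-context dictionary and the fixed-width genomic bins, can be sketched as follows. The function names, CpG identifiers, and coordinates are hypothetical, and the binning here uses non-overlapping windows as a simplification; this does not reproduce the software's own interface or its BedTools integration.

```python
from collections import defaultdict


def capsules_from_mapping(cpg_to_context):
    """Invert a CpG -> context-name dictionary into context -> sorted CpG list."""
    capsules = defaultdict(list)
    for cpg, context in cpg_to_context.items():
        capsules[context].append(cpg)
    return {name: sorted(cpgs) for name, cpgs in capsules.items()}


def capsules_from_bins(cpg_positions, bin_width):
    """Assign each CpG to a fixed-width, non-overlapping genomic bin."""
    capsules = defaultdict(list)
    for cpg, (chrom, pos) in cpg_positions.items():
        start = (pos // bin_width) * bin_width
        capsules[f"{chrom}:{start}-{start + bin_width}"].append(cpg)
    return {name: sorted(cpgs) for name, cpgs in capsules.items()}


# Hypothetical CpG identifiers, gene names, and coordinates.
mapping = {"cg0001": "GENE_A", "cg0002": "GENE_A", "cg0003": "GENE_B"}
gene_capsules = capsules_from_mapping(mapping)

positions = {
    "cg0001": ("chr1", 150_000),
    "cg0002": ("chr1", 950_000),
    "cg0003": ("chr1", 1_200_000),
}
binned_capsules = capsules_from_bins(positions, 1_000_000)
```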

Classification study

We trained each of the modeling approaches to differentiate 38 histological subtypes of CNS tumors and compared their classification performance via a 1000-iteration nonparametric bootstrap of F1 scores over the test set; the F1 score balances precision and recall (sensitivity) and reduces the bias in output. Our results indicate that MethylNet, MethylSPWNet, and MethylCapsNet can achieve very similar high performance on a common data set (Table 1). The neural network approaches achieved marginally better performance than the Random Forest approaches. A breakdown of classification scores for the capsule-inspired models has been included in the supplementary material (Supplementary Table 1). Since all three neural network approaches offer similar performance on classifying brain tumors, we next sought to uncover overlap or complementary insights provided by each modeling approach based on their data organization. The high predictive accuracy of both capsule approaches provided grounds for exploring the factors related to their decision-making processes for increased transparency and validation of our approach.
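The bootstrap comparison above can be sketched under stated assumptions: the averaging scheme for the multiclass F1 is not given in this section, so the example uses the macro average, and the helper names and seed are hypothetical.

```python
import random


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (harmonic mean of precision and recall)."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def bootstrap_f1(y_true, y_pred, n_iter=1000, seed=0):
    """Resample test-set indices with replacement; return mean and SD of F1."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(macro_f1([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    mean = sum(stats) / n_iter
    sd = (sum((s - mean) ** 2 for s in stats) / (n_iter - 1)) ** 0.5
    return mean, sd


# Toy labels only; not data from the study.
toy_true = ["a", "b", "a", "b"]
toy_pred = ["a", "b", "b", "b"]
toy_f1 = macro_f1(toy_true, toy_pred)
toy_mean, toy_sd = bootstrap_f1(toy_true, toy_pred, n_iter=200)
```

Resampling only the held-out predictions, rather than refitting the model, yields an interval for the test-set score at little computational cost.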

Table 1 Classification results for random forest approaches, MethylNet, MethylSPWNet, and MethylCapsNet^a.

Clustering gene-level brain cancer embeddings

Until this point, our unit of analysis has been individual CpGs. Summarizing gene-level methylation using median or mean methylation is generally not appropriate. The relationship between methylation state and gene expression can vary, depending on the genomic context (e.g., promoter and gene body). However, while training MethylSPWNet to predict tumor histological subtype, the model learns to generate gene-level summaries of methylation by updating the weight of each CpG when aggregating beta values of CpGs on the gene-level (see Methods “Description of MethylSPWNet”). This gene-level aggregation can transform a design matrix of samples by CpGs into samples by genes. Gene-level embeddings correlated with outcome (here tumor histological subtype), can then be interrogated for their relationship with known pathways and gene networks. To visualize gene-level embeddings, we generated cluster heatmaps, where rows constitute observations and columns comprise genes, showing plots of the top 2000 variable genes from neural network gene-level embeddings (Supplementary Figs. 1-2).
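The transformation from a samples-by-CpGs matrix to a samples-by-genes matrix can be illustrated with a minimal sketch. It assumes each gene-level value is a plain weighted sum of that gene's CpG beta values; the actual layer is specified in the Methods ("Description of MethylSPWNet") and may include bias terms or nonlinearities. All names and numbers below are hypothetical.

```python
def gene_level_embedding(sample_betas, capsules, weights):
    """Collapse one sample's CpG beta values into one value per gene capsule.

    Assumption: each gene value is a weighted sum of its CpGs, with one
    weight per CpG (learned during training in the real model).
    """
    return {
        gene: sum(weights[cpg] * sample_betas[cpg] for cpg in cpgs)
        for gene, cpgs in capsules.items()
    }


# Hypothetical sample, capsule membership, and CpG weights.
sample = {"cg0001": 0.8, "cg0002": 0.2, "cg0003": 0.5}
capsules = {"GENE_A": ["cg0001", "cg0002"], "GENE_B": ["cg0003"]}
weights = {"cg0001": 1.0, "cg0002": -0.5, "cg0003": 2.0}

embedding = gene_level_embedding(sample, capsules, weights)
```

Unlike a mean or median summary, the per-CpG weights allow sites in different contexts of the same gene to contribute with opposite signs.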

To assess the representation capacity of MethylNet, MethylCapsNet, and MethylSPWNet embeddings, we clustered embeddings with histologic tumor subtype, cell of origin, and histological subtype with the molecular subclass (Supplementary Table 2). Preliminary clustering of the observations demonstrates, for instance, the inability to differentiate IDH mutant subtypes of glioma when defined by median methylation versus the neural network parameterization. There is observed concordance between hierarchical clustering in this embedding space and the brain cancer subtypes. This concordance is defined by their molecular subtypes, the original histological subtypes that the model was trained on, or higher-order cells of origin (e.g., mesenchymal, ependymal, and neuroglial origin). MethylSPWNet had the highest degree of concordance with histological and molecular subtypes within the gene-level embedding space⁵⁸ (V-Measure 0.72 ± 0.0059; Supplementary Table 2). The ability to recapitulate relevant histological subtypes of CNS tumors through the embeddings alone is further corroborated by embedding plots⁵⁸. In these, MethylCapsNet appears to generate the best separation and differentiation of subtypes (Silhouette Score: 0.52 ± 0.0048), followed by MethylNet (Silhouette Score: 0.25 ± 0.01), and MethylSPWNet (Silhouette Score: 0.1 ± 0.0087), estimated using a 1000-sample nonparametric bootstrap. Since these three approaches were trained to recognize histological subtypes, the signal of the cell of cancer origin and molecular subtypes were less well captured.
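For readers unfamiliar with the V-Measure reported above, the following sketch computes it as the harmonic mean of homogeneity and completeness from paired class and cluster labels. It is a plain standard-library illustration with toy labels, not the code used to produce the reported values.

```python
import math
from collections import Counter


def _entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def _conditional_entropy(labels, given):
    """H(labels | given) computed from paired label lists."""
    n = len(labels)
    joint = Counter(zip(given, labels))
    given_counts = Counter(given)
    return -sum((c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(classes, clusters):
    """Harmonic mean of homogeneity and completeness."""
    h_c, h_k = _entropy(classes), _entropy(clusters)
    homogeneity = 1.0 if h_c == 0 else 1.0 - _conditional_entropy(classes, clusters) / h_c
    completeness = 1.0 if h_k == 0 else 1.0 - _conditional_entropy(clusters, classes) / h_k
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Toy labels: clusters that match the classes exactly, up to renaming.
perfect = v_measure(["a", "a", "b", "b"], [1, 1, 0, 0])
# Toy labels: a single cluster that carries no class information.
merged = v_measure(["a", "a", "b", "b"], [0, 0, 0, 0])
```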

Gene-level and modularity enrichment analyses

Next, we aimed to evaluate the utility of the group-regularized deep-learning approach for capsule-organized summaries of DNAm on the gene level for pathways and gene network analyses. We performed a preliminary analysis of pathway and module detection based on the extraction of hypervariable genes across the neural network embeddings and Louvain clustering of networks of genes based on the pairwise correlation between the genes. Further description of the methods and results is provided in the supplementary material (Supplementary Table 3; Supplementary Figs. 3, 4).

A description and flowchart showing an overview of methods for pathways and gene network analysis downstream from MethylSPWNet can be found in the Methods section “Description of Potential Downstream Analyses.” We focused our presentation of results on three specific CNS tumor subtypes: glioblastoma (GBM), low-grade glioma (LGG), and medulloblastoma (MB). Gene-level embeddings (gene by sample) and pathways and gene network analysis results (derived from those embeddings) are shown in Figs. 2, 3, and 4, respectively. The results on pathways and gene network analyses for these three CNS tumor subtypes (GBM, LGG, and MB) are provided in Supplementary data files 1–3. A description of the supplementary data files may be found in the supplementary materials (section “Description of Supplementary Data”).

**Fig. 2: Clustermap of gene-level embeddings for 2000 most variable genes in.**

**Fig. 3: Example output from pathway enrichment analyses for.**

**Fig. 4: Example output from WGCNA analyses for.**

Subtype-specific pathways discovery

Using the gene-level MethylSPWNet embedding values, we sought to calculate differentially embedded genes between disease and controls (empirical Bayes) and determine the associations of these genes with some of their correspondent pathways (see Methods “Description of potential downstream analyses”). A visualization of pathway networks and output of gene-subtype associations and pathway enrichment statistics are provided in Fig. 3. The empirical Bayes results of the differential embedding analyses for each subtype are provided as Supplementary Data Files.

Our analysis of the three selected CNS tumor subtypes (GBM, LGG, and MB) found that many top differential genes have been implicated in these subtypes in prior literature. For instance, TACC and FGFBP2 (Fig. 3B) are differentially embedded between GBM and controls and have been implicated in a tumorigenic gene fusion event (TACC-FGFR3^59,60,61). RASGRF2 (Fig. 3B) has been linked to congenital GBM⁶². LGI2 and NPY5R (Fig. 3B) have both been related to changes in Sox2 expression⁶³, which promotes cellular plasticity in GBM. An interesting associated pathway for GBM was type 1 diabetes mellitus (Fig. 3A and B), an autoimmune disease; this spurious association could be related to associations found with several immune-evading markers. For LGG, TLK1 (Fig. 3D), a serine–threonine kinase associated with replication, focal adhesion, and cell cycle, has previously been implicated in gliomas⁶⁴. Similarly, low ELL2 expression (Fig. 3D), regulated by microRNA (miRNA)-mediated gene silencing, was reported to be a marker for poorer survival in GBM patients⁶⁵. GNL1 (Fig. 3D) was found in our analysis to be associated with LGG and has been identified as being related to cell proliferation, given its role in the phosphorylation of Rb⁶⁶. We also found associations with the opioid-signaling pathway and G-alpha (i) signaling events⁶⁷, and the tyrosine kinase receptor pathway VEGFR (vascular endothelial growth factor receptor) and downstream signaling pathway ERK (Fig. 3C and D), largely involved with proliferation and angiogenesis⁶⁸. Regarding MB, as examples, we uncovered NRBP2 (nuclear receptor binding protein 2; Fig. 3F), which has been shown to be downregulated in MB⁶⁹, and SOX14 (Fig. 3F), part of the SOX family, which largely determines cell fate and is thus heavily implicated across many CNS tumors⁷⁰. Additionally, pathways such as muscle contraction (Fig. 3E and F) have been associated with specific molecular subtypes of MB^14,71.

Grouped-subtype pathways discovery

Additionally, we investigated associations uncovered by grouping together a few select disease subtypes. We would expect, at minimum, differences between these subtypes and healthy controls to be related to pathways that are specialized to those larger histological groupings. First, we compared melanoma-related CNS tumors (MELAN/MELCYT) to controls by performing enrichment analyses of the top 40 differentially embedded genes, as defined by the ranked p-values. As a few examples of potentially enriched gene sets across multiple databases after Bonferroni adjustment, we found potential enrichment for MITF transcription factor targets (TRRUST; p = 0.06) and neural crest differentiation (Wikipathways; p = 0.07), BMP signaling (GO Biological Processes; p = 0.03), and IL23-mediated signaling (NCI-Cancer; p=0.05). Of interest from the ependymal tumors (EPN/SUBEPN) was that the top 40 genes had demonstrated an overlap with genes related to the spinal cord (Human Gene Atlas; p=0.03).
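The kind of enrichment test and Bonferroni adjustment referred to above can be sketched as follows. The statistical test behind each database query is not specified in this section, so the one-sided hypergeometric test here is an assumption, and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_set, n_top, n_overlap):
    """P(overlap >= n_overlap) when n_top genes are drawn without replacement
    from a universe of n_universe genes, n_set of which belong to the gene set."""
    upper = min(n_set, n_top)
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_top - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_universe, n_top)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical: 40 top genes, a 150-gene set, a 20,000-gene universe, 4 shared genes.
raw_p = hypergeom_enrichment_p(20_000, 150, 40, 4)
adjusted = bonferroni([raw_p, 0.5, 0.2])
```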

Derivation of weighted gene co-embedding networks

To investigate the gene-level embeddings for each of the 38 brain cancer subtypes (paired to normal controls), we derived disease-specific modules of genes using the Weighted Gene Correlation Network Analysis (WGCNA) R package (see Methods “Description of potential downstream analyses”)⁴⁸.
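The first step of a weighted correlation network analysis, turning pairwise correlations between gene-level embeddings into a soft-thresholded adjacency, can be sketched as below. This is a standard-library illustration of an unsigned adjacency, not the WGCNA R package itself; the power of 6 and the toy embeddings are assumptions, and module detection and hub-gene selection are omitted.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def soft_threshold_adjacency(embeddings, power=6):
    """Unsigned adjacency a_ij = |cor(gene_i, gene_j)| ** power for each gene pair."""
    genes = sorted(embeddings)
    return {
        (g, h): abs(pearson(embeddings[g], embeddings[h])) ** power
        for g in genes
        for h in genes
        if g < h
    }


# Hypothetical gene-level embeddings across four samples.
toy_embeddings = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.0, 6.0, 8.0],
    "G3": [1.0, -1.0, 1.0, -1.0],
}
adjacency = soft_threshold_adjacency(toy_embeddings)
```

Raising the absolute correlation to a power suppresses weak correlations while keeping strong ones near 1, so the network stays weighted rather than being cut at a hard threshold.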

We derived 606 modules of genes across the 38 subtypes (37 networks were derived, one subtype was omitted for low sample count), 297 of which were significantly associated with subtype (all P-values < 0.05). We have included as Supplementary Data Files the module membership of each of the genes, module expression across the samples for the three example subtypes (GBM, LGG, and MB), hub genes for each module (genes located most centrally in each subnetwork), and statistics that relate each module to the subtype. The connectivity of individual genes from the generated WGCNA modules for GBM, LGG, and MB subtypes is shown in Fig. 4. Tables of top hub genes from selected modules strongly associated with each subtype are shown.

Some hub genes of the WGCNA modules were consistent with prior knowledge about their respective subtypes. In GBM, RASGRF2 (blue module; Fig. 4A and B) was previously featured in Fig. 3B. The silencing of MGMT (green module; Fig. 4A and B) plays a significant role in the progression of GBM through inactivation of its DNA repair mechanisms⁹. MIR33B (brown module; Fig. 4A and B) is also related to GBM progression by regulating cell proliferation, invasion, migration, and MYC signaling^72,73. Finally, platelet factors (PF4; blue module) and CpG island hypermethylation (homeobox gene BARHL2; turquoise module) have previously been implicated in GBM^74,75. Examples of hub genes in LGG include NRP1 (black module; Fig. 4D)⁷⁶, PTPRZ1 (pink module; Fig. 4D)⁷⁷, and COL6A3 (green module; Fig. 4D). NRP1 has been shown to be related to poor prognosis in gliomas and signals through microglia/macrophages. PTPRZ1 has previously been related to malignant growth in GBM. COL6A3 belongs to a group of genes that serve to form the tumor vasculature. Of interest in MB were SOX14 and SOX17 (green module, Fig. 4F)⁷⁰, CD4 (green–yellow module, Fig. 4F), SLIT3 and GFPT2 (black module, Fig. 4F), and SYNPO (blue module, Fig. 4F). CD4 is a gene that codes for the membrane glycoprotein of the CD4 T cell, where its characterization could be corroborated by the immune-infiltration patterns of the stroma for MB. SOX14 and SOX17 are pertinent for cell-fate lineage. SLIT3 is a gene involved in axon guidance and consequently in tumor growth, migration, and angiogenesis. Interestingly, higher expression of GFPT2 (amino acid metabolism) has been implicated in lower GBM survival⁷⁸. SYNPO, central to the blue module of MB, was also central to the blue module of GBM.

Enrichment of neural network CpGs for gene and island contexts

As Weighted Gene Correlation Network Analysis (WGCNA) identifies significant associations with known pathways and novel gene–gene comethylation networks, we sought to investigate the CpG-specific parameters that corresponded to producing the embeddings to understand better why the neural network decided to upweight some CpGs, but not others. To elucidate the genomic contexts that MethylSPWNet found to be important, we explored the CpG island context and spatial relationship to the transcriptional start site (TSS) (Methods section “CpG island/gene context analysis”). CpG islands (CGI) are CpG-dense regions. Approximately 60% of gene promoters contain CpG islands⁷⁹. CpG shores immediately flank the CGIs by up to 2 kb, shelves flank the shores by an additional 2 kb, as regional CpG density decreases. Variables for the spatial relationship to the TSS include TSS1500 and TSS200, within 1500 bp and 200 bp of the TSS, respectively. Additional TSS variables are the 5′UTR immediately downstream of the TSS, the first exon, gene body, and 3′UTR.
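The island, shore, and shelf definitions above can be expressed as a small classifier. This sketch considers a single CpG island on the same chromosome and uses hypothetical coordinates; array annotation files already provide these labels, so the function is illustrative only.

```python
def island_context(cpg_pos, island_start, island_end, flank=2000):
    """Classify a CpG relative to one CpG island on the same chromosome.

    Shores extend up to `flank` bp beyond the island; shelves extend a
    further `flank` bp beyond the shores; anything else is open sea.
    """
    if island_start <= cpg_pos <= island_end:
        return "island"
    if cpg_pos < island_start:
        distance = island_start - cpg_pos
    else:
        distance = cpg_pos - island_end
    if distance <= flank:
        return "shore"
    if distance <= 2 * flank:
        return "shelf"
    return "open_sea"


# Hypothetical island spanning positions 10,000 to 11,000.
contexts = [island_context(pos, 10_000, 11_000) for pos in (10_500, 9_000, 7_000, 20_000)]
```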

Of note, we found that CpGs with positive weights (rank-ordered) were depleted for promoter island regions (defined as having TSS1500/TSS200 annotation and not open sea) (OR = 0.69; p = 0.04) as compared with sites not included in promoter-island regions (OR = 1.45; p = 0.04). However, when limiting the set of CpGs to only promoter regions (i.e., TSS1500/TSS200), we noted that positive-weight CpGs were enriched for island context (i.e., not in an open-sea region) (OR = 1.21; p = 0.03), while negative-weight CpGs were depleted for the CGI context (OR = 0.89; p = 0.02). Furthermore, both the positive and negative weights of intragenic CpGs were depleted for association with the correspondent methylated promoters (as compared with unmethylated promoters) for their respective genes (positive weight OR = 0.54, p < 0.01; negative weight OR = 0.44, p < 0.01). We have included tables for the relationship between CpG weight and independently considered contexts in the Supplementary Materials (Supplementary Table 4; Supplementary Fig. 4).
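The odds ratios reported above compare how often a set of CpGs falls inside versus outside a genomic context. A minimal sketch of that calculation from a 2x2 table follows; the counts are invented for illustration, and the test used to obtain the reported p-values is described in the Methods rather than here.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    a: CpGs of interest inside the context, b: CpGs of interest outside it,
    c: remaining CpGs inside the context,   d: remaining CpGs outside it.
    """
    if b == 0 or c == 0:
        raise ZeroDivisionError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Hypothetical counts: an odds ratio below 1 indicates depletion in the context.
example_or = odds_ratio(30, 70, 50, 50)
```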

MethylCapsNet module enrichment

From the embedding module discovery analysis and further contextualization of the neural network CpG weights, we observed that information encoded in MethylSPWNet corresponds to key pathways associated with various CNS tumors and important genomic contexts. MethylCapsNet offers the ability to infer more granular relationships between capsules on the individual sample level when we can reduce the number of parameters specified. The primary capability and emphasis of the capsule-inspired network approach is to compare capsules to each other and directly relate them to particular outcomes of interest via the dynamic construction of a bipartite network (gene-subtype relationships) as part of the training and prediction process. For the MethylCapsNet analysis, we preselected a subset of genes previously shown to be implicated in various types of brain cancer (see “Selection of capsules for MethylCapsNet and MethylSPWNet” in Methods). As such, we believe it would not be appropriate to test for enrichment of these genes, because the preselection procedure introduces a bias. Instead, we derived modules of genes that the neural network deemed to have a coordinated DNAm response in elucidating particular subtypes. Our modularity analysis projects the estimated bipartite graph (gene-subtype) across samples into a unipartite graph (gene–gene), then clusters the graph using Louvain modularity to yield four modules of genes (green, red, blue, and yellow) (Fig. 5).
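The projection of a gene-subtype bipartite graph onto a gene–gene graph can be sketched as follows. The exact projection used in this work is not specified in this section, so the example assumes a simple weighted one-mode projection in which two genes are linked by the sum, over shared subtypes, of the product of their weights. The routing values are hypothetical, and the Louvain clustering step is omitted.

```python
def project_bipartite(gene_subtype):
    """Project gene -> {subtype: weight} onto gene-gene edges via shared subtypes."""
    genes = sorted(gene_subtype)
    edges = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            shared = set(gene_subtype[g]) & set(gene_subtype[h])
            weight = sum(gene_subtype[g][s] * gene_subtype[h][s] for s in shared)
            if weight:
                edges[(g, h)] = weight
    return edges


# Hypothetical routing weights between gene capsules and subtypes.
routing = {
    "GENE_A": {"GBM": 0.9, "LGG": 0.1},
    "GENE_B": {"GBM": 0.8},
    "GENE_C": {"MB": 1.0},
}
gene_gene = project_bipartite(routing)
```

Genes that route to the same subtypes end up strongly connected, which is what allows a community-detection step to group them into modules.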

**Fig. 5: MethylCapsNet-derived gene network.**

Here, we offer an example of the kinds of inferences that can be made from the resultant unipartite network and subsequent clustering. For instance, the yellow module implicates relationships between WNT3A and EGFR, heavily implicated in Igf and Wnt signaling. The red module features the relationship between FRZ and APC, both of which are heavily involved in WNT signaling (APC forms the complex to inhibit the accumulation of β-catenin, while WNT binding to frizzled family receptors may degrade this inhibition and permit cell proliferation⁸⁰). Of the green module, IDH3G and NPR3 (linked to energy metabolism, gene fusion, and chromatin remodeling) were related to both LRDD (proapoptotic MAPK pathway) and WIF1 (both previously implicated WNT signaling suppressors) from the yellow module. KIAA1549, related to astrocytomas and fused to BRAF for its progression to oncogenesis, was implicated with WNT1 in the blue module⁸¹. Insulin-like growth factor binding protein 2 (IGFBP2, glioma oncogene) of the red module appeared to be negatively correlated with many of the genes across the yellow and green modules⁸². TP53 (which lacked consequential methylation patterns) and MYC⁸³, MGMT, and TERT share relationships with each other but not with the other modules, perhaps highlighting how ubiquitous these somatic alterations are for oncogenesis.

Despite having highlighted potentially interesting relationships, we acknowledge that there will be a future opportunity to increase the search space of possible relationships between genes and their coordinated response for bringing about subtypes of brain cancer. In the Supplementary Material, we provide the routing matrix that was used to align each gene to a correspondent CNS subtype, which is the predicted gene-subtype bipartite network averaged across individuals (Supplementary Figure 5). We have also provided a locally deployable web application that allows the user to interrogate their uncovered capsules and form networks on the individual level and aggregated across patient subgroups or disease subtypes using gene-capsule-specific embeddings (Supplementary Figure 6). For instance, in our supplemental material, we demonstrate how the MethylCaps web application can be used to derive individual networks for LGG and GBM, identifying ANO9 and EGFR, respectively, as implicated in these conditions (Supplementary Figures 7-8). These genes and their associated pathways have been heavily implicated in tumorigenesis and gliomas^84,85.

Exploration of alternative capsule formations and cancer subtypes

In this section, we briefly present results from a few of the many alternative means of forming capsules. In particular, we consider the following scenarios for CNS tumor classification: (1) MethylCapsNet is fit when (a) only half the genes are retained from the original list (selected randomly) and (b) none of the genes are retained and genes are instead randomly sampled from all genes, and (2) MethylCapsNet is fit using binned genomic regions, utilizing CpGs encapsulated in 1-Mb bins. In addition to these analyses, we also explore an integrated breast cancer dataset utilized for PAM50 molecular classification^20,86,87 using a few different capsule configurations: (1) MethylCapsNet is fit using a curated list of genes, (2) MethylCapsNet is fit using binned genomic regions, utilizing CpGs encapsulated in 700-kb bins, and (3) MethylSPWNet is fit using capsules organized by CpG island promoters, formed by intersecting CpG island with gene promoter annotations from the Illumina 450k database. A complete set of results can be found in the supplementary material (Supplementary Tables 5-6; Supplementary Fig. 9).

Discussion

Recent reviews and initial explorations discussed the potential utility of capsule-inspired networks to relate biologically organized capsules to each other and known disease outcomes^27,88,89,90. In this work, we set out to perform a preliminary evaluation of the feasibility and suitability of DNA methylation capsules for deep-learning analyses as a means to organize CpG information into higher-order contexts, improving prediction and transparency while uncovering instances of coordinated gene-level methylation patterns. In our analyses, we compared several state-of-the-art predictive modeling methods for DNA methylation classification of brain tumors. We demonstrated that capsule-based deep-learning approaches could achieve performance on par with existing deep-learning models and marginally better than the Random Forest approaches we evaluated as traditional machine-learning baselines. Our work demonstrated the potential for new insights compared with other existing methylation-based tumor classification schemes currently used, which are often based on a small subset of CpGs and lack built-in interpretation of the loci selected^8,91. We demonstrated the efficacy of increasing the use of the available CpGs on the Illumina 450k Array, ultimately using 200,000 loci before subsetting by context.

DNA methylation capsules focused on the gene level can disentangle important CpGs that might otherwise be downweighted in a feature-by-feature deep-learning unsupervised or supervised learning approach. These CpGs demonstrated substantial overlap with genes known to be related to tumorigenesis in the brain, such as NOTCH1, PTEN, and GNAS. This is consistent with previous studies that demonstrated mutations common in brain tumors, such as IDH1, are correlated with disruptions in methylation^{10,92,93,94,95,96,97}.

The context-specific CpG weight enrichment analyses suggest that within promoter regions, island context is important for differentiating different CNS tumor subtypes, but taken as a whole, regions outside of CpG promoter islands are important for capturing this heterogeneity. Furthermore, outside of the promoter context (supposedly regions that better capture tumor heterogeneity), the ability of intragenic CpGs to distinguish tumor subtypes is still dependent on the promoter methylation status of the respective gene. Clustering of CpGs with the highest weights at CpG islands, shores, gene bodies, and transcription start sites will help us understand where the most diagnostically relevant sites are in the genome, but demands additional investigation.

We also presented a few examples of the potential downstream applications of capsule-based approaches. In particular, our framework demonstrated the ability to relate derived gene-level measures of MethylSPWNet to known disease pathways via differential methylation analysis of the gene-level embeddings and gene–gene comethylation networks via WGCNA. Additionally, we provided a preliminary interpretation of bipartite (gene subtype) and unipartite (gene–gene) networks, which can be derived by the MethylCapsNet web framework. Finally, we explored alternative means by which to form capsules. We expected the curation of genes to lead to more accurate models. Contrary to our initial hypothesis, for CNS classification, random selection of capsules still appeared to produce a highly accurate model. These results suggest either the potential to uncover novel associations between genes and subtype or that these genes may be comethylated with other genes that have well-established relationships. The binned genomic region capsules, fit using MethylCapsNet for classification tasks in brain and breast, performed similarly to the leading methods. The island promoter capsules slightly underperformed, suggesting that this selection of capsules alone does not contain enough information to distinguish PAM50 molecular subtypes.

There are a few limitations to this work, presenting room for future improvements in the analytical method. First, the selection of included CpG loci is biased toward the limited set of sites available on the Illumina Array platform. At present, the utilization of an Illumina methylation array platform is more tractable, owing to the lower technology costs and expertise required⁹⁸, than whole genome bisulfite sequencing (WGBS), which requires substantial sequencing depth on the order of 100x for comparable precision⁹⁹.

Second, a reduced set of genes were fit using the capsule-inspired network. It remains challenging to run MethylCapsNet at scale due to the heavy computational demand and the large number of free parameters in its current formulation. Since MethylCapsNet can only analyze approximately one-thousand capsules at a time, the capsule-selection step is critical to the method’s successful application. This parameter space should be reduced by finding some marriage between the scalability enabled by MethylSPWNet and perhaps greater transparency offered by MethylCapsNet. Presently, we advise end users to utilize MethylSPWNet when the number of contexts under evaluation is large (≥1000 capsules), or if the number of CpGs per gene is small, and to utilize MethylCapsNet when the number of contexts under consideration is smaller (<1000 capsules). Uncurated gene sets can be analyzed using MethylSPWNet, while curated gene sets are best suited to MethylCapsNet, e.g., regions of the genome fragmented by consistent windows or larger DNAm CpG modules that have been uncovered through methods such as WGCNA. In addition, the adoption of capsule-inspired approaches that explicitly form networks via their routing mechanisms presents a future area of research⁸⁹. It is also assumed that MethylCapsNet capsules that are more closely embedded are interacting, but it is not entirely clear the nature of these interactions without incorporating gene expression data, methylated quantitative trait loci analyses, and other pertinent omics modalities (e.g., ATAC-seq, Hi-C). In realistic analysis settings, performing a MethylCapsNet analysis on both marker genes and genes not associated with the disease of interest may yield genes that may interact through means that are not disease specific. 
To rank interactions for disease relevance, other potential sources of confounding (e.g., cell composition) should be controlled for and incorporation of expression data may provide the means to establish causal disease-association pathways.

While inspired by capsule networks, we also emphasize that these methods are not analogous to capsule networks featured in computer vision tasks. While the new capsule-based approaches were as accurate as fully connected approaches, this was achieved under the constraint of sparse connections, which points to the validity of imposing these constraints. Given the nature of the problem (classification among dozens of histological subtypes), intermediate embeddings may reflect a more linearly separable subspace to the subtypes of origin. Such a subspace may require additional exploration/penalization to avoid potential biases pertaining to the minimal redundant set of predictors to produce a subspace optimal for prediction. The application of such methods does not preclude the potential for selection of genes due to technical reasons such as noise, batch effects, and weight initializations, which are common to many domains of application of neural networks. We attempted to account for such biases through preprocessing methods on the data such as functional normalization and note that strict interpretation of threshold cutoffs for methods devised for differential gene expression may not be applicable. Thus, relaxation of the scope of features’ input into a pathway and other enrichment analyses may potentially reduce bias so long as limitations are appropriately stated. Additionally, differences in tissue preparation (frozen, permanent) were not accounted for; however, given the high concordance between these preparation methods, we felt that such adjustment was not necessary¹⁰⁰.

While DNAm deep-learning methods with built-in interpretability do not yet exist, we hope these methods, though constrained by potential limitations in design choice, may spur further research into more interpretable capsule methods. Here, there is also an opportunity to further apply concepts from topological data analysis (TDA), such as Mapper^{101,102,103,104,105}, to distill the key functional relationships from high-dimensional, complex data.

Existing classification frameworks currently used in the clinical setting for aiding brain cancer diagnosis only utilize a small subset of the total possible set of CpGs that can be measured. Current modeling approaches can be difficult to trust or use to study new network biology until they can consider a larger, more complete set of predictors. However, it is also important to note that doing so would introduce additional noise into the modeling approach, but the incorporation of prior biological knowledge can potentially help reduce noise while improving the detection of biologically relevant signals. We note that underperformance could suggest selecting capsules that may not be optimally aligned to the target task/dataset. By demonstrating the organization of CpGs into their respective genomic contexts, we present further opportunity to reduce the feature space and disentangle correlation and collinearity between CpG sites to create a new class of transparent, clinically tractable models. For instance, future classifiers should include brain cell-type classification using DNAm data and incorporate it as covariates in the prediction model, yet brain cell-type differentially methylated regions for deconvolution by DNAm patterns are not well-established. The opportunity space of epigenetics research questions is ample and poised to grow substantially as the field moves to expand reference-based approaches to cell-type deconvolution, include tandem assessments of other cytosine modifications (hydroxymethylcytosine), and apply DNA methylation age clocks to questions of biological aging. Despite having demonstrated the promising downstream analyses that users may readily adopt through our framework, we acknowledge that there is ample opportunity to develop related methods and their use cases further.

In this work, we have demonstrated the feasibility and utility of DNAm-based capsules for performing disease classification and potentially determining dysregulated genes for these diseases. We found that DNA methylation capsule methods can predict brain cancer subtypes with high accuracy and present convenient means for organizing data over traditional techniques for studying DNA methylation data. As such, we advocate for the organization of well-defined DNAm capsules as a means to improve the accuracy, transparency, and broad applicability of DNAm deep-learning models. Future deep-learning prognostic models that reimagine the formation and incorporation of DNA methylation capsules, paired with cell-type inference, gene expression, and/or corroborating chromatin capture, may serve as grounds for the derivation of unknown heterogeneity.

Methods

Overview of framework

The MethylCapsNet methodology presents an extension of the MethylNet framework²⁰ and is implemented as a command-line interface that allows the user to group CpGs into capsules and then dynamically route the capsules to make a prediction and interpret the results. While this approach draws inspiration from capsule networks featured in computer vision tasks, MethylCapsNet is not explicitly a capsule network, as defined in previous works in this domain.

MethylCapsNet utilizes separate MLPs for every set of CpGs (one set per context) to derive context-specific embeddings (separate context embeddings per each individual), and dynamic routing processes force information from child capsules into disease/categorical outcomes (Fig. 1A). The information is hierarchical because each child capsule may only align with one parent capsule. Once the capsule-inspired network is fit, graph structures that describe the relationships between each individual’s contexts can be derived by thresholding the correlation between pairwise n-dimensional context embeddings. Highly conserved biological networks can be derived by thresholding the number of individuals that share the same edge between the contexts. The simplest genomic contexts considered are the genes to which the CpGs annotate. Other capsules can be defined by, for instance, genomic region or pathway/biological process annotation. MethylSPWNet is a specialized neural network architecture that routes beta values from the CpGs in each context into a single node representing the context (Fig. 1B). Each CpG is given a weight based on the importance of its contribution, both on the gene level and toward the classification task as a whole. This information passes through additional neural network layers that dynamically relate latent sets of predictors to outcomes of interest, whether they be prognostic or diagnostic^53,106. Much like Group LASSO approaches, group L1 penalization can be utilized on the CpG weights routed to each gene to select relevant genes of interest.
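The two thresholding steps described above (per-individual correlation between context embeddings, then the number of individuals sharing an edge) can be illustrated with a minimal sketch. The function names, the input layout (one dictionary of context embeddings per individual), and the use of Pearson correlation are assumptions made for illustration and are not taken from the MethylCapsNet code base.

```python
from itertools import combinations
from math import sqrt


def pearson(u, v):
    """Pearson correlation between two equal-length vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0


def individual_edges(embeddings, corr_threshold):
    """Edges between contexts whose embeddings correlate at or above the threshold.

    embeddings: dict mapping context name -> n-dimensional embedding (hypothetical layout)
    """
    return {
        (a, b)
        for a, b in combinations(sorted(embeddings), 2)
        if pearson(embeddings[a], embeddings[b]) >= corr_threshold
    }


def conserved_edges(cohort, corr_threshold, min_individuals):
    """Keep edges that appear in at least `min_individuals` individuals."""
    counts = {}
    for embeddings in cohort:
        for edge in individual_edges(embeddings, corr_threshold):
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: n for edge, n in counts.items() if n >= min_individuals}
```

Both thresholds are free parameters in this sketch; in practice they would need to be tuned to the embedding dimension and cohort size.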

The software implementation (Fig. 6) comprises modules pertaining to prediction and interpretation tasks, which take into account the relationships and embeddings derived through the training process.

Data preparation

DNAm data from CNS tumors (n = 3897) were accessed from the GEO archive (GSE109381), preprocessed using PyMethylProcess²¹, and divided into 70%/10%/20% training, validation, and testing sets (MethylationArray objects) via PyMethylProcess. The 200,000 most variable CpG loci across the training samples were retained for analysis. Sets of CpGs were tracked to genes, which were then selected to form capsules. The original set of 200,000 CpGs was used as features for the MethylNet approach; the complete set of intersecting gene capsules with more than five associated CpGs (n = 10,341; 139,028 CpGs) was used for the MethylSPWNet and Group LASSO approaches; and a reduced set of capsules (n = 55) was utilized for MethylCapsNet after manual curation and a hyperparameter search (see “Selection of capsules for MethylCapsNet and MethylSPWNet”).
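As an illustration of the feature-selection and capsule-formation steps, the following sketch ranks CpGs by variance across training samples and groups the retained CpGs by gene, keeping genes with more than a minimum number of CpGs. It is a simplified stand-in, not the PyMethylProcess implementation; the dictionary-based inputs and function names are hypothetical.

```python
from statistics import pvariance


def most_variable_cpgs(beta, n_top):
    """Return the n_top CpG IDs with the largest variance across training samples.

    beta: dict mapping CpG ID -> list of beta values, one per training sample
    """
    ranked = sorted(beta, key=lambda cpg: pvariance(beta[cpg]), reverse=True)
    return ranked[:n_top]


def build_gene_capsules(cpg_to_genes, kept_cpgs, min_cpgs):
    """Group retained CpGs by gene, keeping genes with more than min_cpgs CpGs.

    cpg_to_genes: dict mapping CpG ID -> list of annotated gene names
    """
    kept = set(kept_cpgs)
    capsules = {}
    for cpg, genes in cpg_to_genes.items():
        if cpg in kept:
            for gene in genes:
                capsules.setdefault(gene, []).append(cpg)
    return {g: sorted(c) for g, c in capsules.items() if len(c) > min_cpgs}
```

With `n_top=200000` and `min_cpgs=5` this mirrors the thresholds stated above, under the assumption that variance is computed on the training split only.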

Description of potential downstream analyses

After fitting a MethylSPWNet model (and MethylCapsNet), the user may further interrogate the gene-level embeddings, depending on the research question being addressed. The user may explore how each gene relates to each outcome or how they relate to one another, the details of which have been included in an informative flow diagram (Fig. 7) (separate text describing information that can be extracted after fitting MethylCapsNet can be found in the section “Description of capsule-inspired neural network”).

**Fig. 7: Flow diagram for possible downstream applications of *MethylSPWNet*.**

Differentially embedded genes (the extent to which gene-level embeddings vary between subtypes vs. normal) from gene-level values derived by the neural network embeddings in each of GBM, LGG, and MB, were identified using the limma¹⁰⁷ package. This package compares tumor to nontumor control tissue through least-squares regression and empirical Bayes moderated F-tests, yielding FDR-adjusted p-values and log-odds ratios for the degree of differential embedding. We profiled functionally enriched pathways using the g:Profiler package¹⁰⁸ after selection of genes below an FDR-adjusted significance threshold and visualized the results (relating pathways by the number of shared genes, clustering into higher-order pathways via Markov clustering) using EnrichmentMap, as part of the Cytoscape network visualization framework^109,110.

The pairwise correlation between MethylSPWNet-derived gene methylation was calculated using Pearson’s correlation coefficient. Weighted adjacency matrices were calculated from the pairwise correlation matrices for each of the subtypes using the power adjacency function, which takes the comethylation to a power specified separately for each subtype. To further cluster the genes, the weighted adjacency is transformed into a topological overlap matrix (TOM), defined by the extent to which two genes share a third common gene. Finally, hierarchical clustering is applied to derive the final modules of genes. To relate each module with the disease subtype via the aforementioned least squares and empirical Bayes differential analysis methods, we calculated eigengenes (first principal component) for the genes in each module to further reduce the design matrix (samples by modules)¹⁰⁷. A large number of genes (on the order of five thousand) for such a summary gene network plot may make plotting the individual genes cumbersome and hard to understand, so we utilized Mapper^101,103,104, a tool from Topological Data Analysis, to further summarize and portray the relationships between the genes in the network summary plot.
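The adjacency and topological overlap steps can be sketched as follows. The sketch assumes the unsigned power adjacency and the standard topological overlap formula from the WGCNA literature; whether the analysis here used the signed or unsigned variant is not stated, so this choice is an assumption.

```python
def power_adjacency(corr, power):
    """Unsigned power adjacency: a_ij = |cor_ij| ** power, with a zero diagonal."""
    n = len(corr)
    return [[0.0 if i == j else abs(corr[i][j]) ** power for j in range(n)]
            for i in range(n)]


def topological_overlap(adj):
    """TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij).

    l_ij = sum over third genes u of a_iu * a_uj (shared neighbors);
    k_i is the connectivity (row sum) of gene i.
    """
    n = len(adj)
    k = [sum(row) for row in adj]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
                tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom
```

Hierarchical clustering would then be run on the dissimilarity `1 - TOM`; that step is omitted here.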

CpG island/gene context analysis

Using the MethylSPWNet, each CpG is assigned a weight that relates the CpG to its associated gene or genomic context. These CpG weights are learned by the neural network and can be used to rank genes based on their relative importance (rank assigned by maximum absolute CpG weight), an alternative measurement to the modularity analysis. Inspection of the weights of CpGs within each gene can provide insight into sites and contexts that are important for predicting brain cancer subtypes. Further, investigation of weights spatially across the genome may give rise to important patterns and motifs that could warrant future investigation. In the supplementary material, we first considered the contexts mentioned above independently and did not consider the joint impact of context (e.g., did not associate with island-promoter regions, which are generally considered to be regions more causally related to changes in their expression). We then considered sites that were associated with island promoter regions (including shore and shelf context, more causally associated with gene expression) and separately compared the overlap of the CpGs correspondent to top positive and negative weights to the CpGs that were unassociated with this context (open sea and not TSS200/1500). We separately considered CpGs within the promoter regions. Finally, we considered intragenic CpGs and whether or not their corresponding gene’s promoter was methylated or unmethylated (as operationalized by calculating a beta-value methylation cutoff via local minima in the distribution of beta-values. The beta-value distribution, bimodally distributed, reflects the distribution of proportion of methylated alleles across a bulk mixture of cells for individual CpG sites. This distribution across CpG sites is typically estimated per individual(s). Beta values can take on values between 0 and 1, but particularly concentrate closer to 0 or 1 to reflect that a site is either “methylated” or “unmethylated”. 
The intermediate proportions reflect scenarios in which around half of the cells of the mixture are methylated at that site, which is uncommon; the local minimum at these intermediate values was used as the threshold to denote whether a site is methylated. Any CpG with a beta value above the threshold was considered methylated. CpG methylation was averaged across each promoter and subjected to the threshold to determine methylation status). We calculated odds ratios for enrichment/depletion in these contexts using Fisher’s exact tests. Without matching gene expression information, we could not make any causal claims/inferences about how these contexts modify gene expression to bring about these disease states.
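For readers unfamiliar with the enrichment test, a minimal sketch of the odds ratio and a two-sided Fisher's exact test on a 2x2 table is given below. The table layout (top-weight versus other CpGs in rows, inside versus outside the context in columns) is an assumed framing, and the two-sided convention (summing all tables no more probable than the observed one) is one common choice that may differ from the software used in the analysis.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio of the 2x2 table [[a, b], [c, d]] (inf if b * c == 0)."""
    return float("inf") if b * c == 0 else (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x first-row counts in the first column
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more probable than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
```

An odds ratio above 1 would indicate enrichment of top-weight CpGs in the context, and below 1 depletion.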

Description of capsule-inspired neural network

The capsule-inspired network featured in this work operates by first finding representations of the given CpG sets as denoted by the primary capsule formation. The features, CpGs, of the CpG sets are fed into parallel implementations of a multi-layer perceptron, f_j, where the output dimensions of each of the neural networks are the same. Thus, the dimensionality of the primary capsules reflects the number of output neurons, a latent representation of each CpG set, times the number of capsules, per individual. The mathematical formulation of this transformation is presented below:

$${\overrightarrow{{z}_{genej}}} = f_{j}\left( {\overrightarrow{{x}_j}} \right)$$

(1)

For a single individual, the capsules, represented by row vectors, are stacked to form a capsule matrix:

$$\overleftrightarrow{Z} = \left[ {\begin{array}{l} {\overrightarrow{{z}_{gene1}}} \\ \quad\vdots \\ {\overrightarrow{{z}_{genen}}} \end{array}} \right]$$

(2)

An affine transformation, ${\overleftrightarrow{W}}$, a set of learnable parameters that seek to rotate, scale, and shift the data, transforms the primary capsules to encode information pertaining to the interactions between capsules:

$${\overleftrightarrow{{Z}_{*}}} = {\overleftrightarrow{W}}{\overleftrightarrow{Z}}$$

(3)

Each primary child capsule’s information is then dynamically routed to parent hidden or output capsules:

$$\overleftrightarrow{Y} = {\upsigma}\left(\overleftrightarrow {C}{\overleftrightarrow{{Z}_{*}}} \right) = sq\left( \left[ {\begin{array}{l} {\overrightarrow{{y}_{class\, 1}}} \\ \quad \vdots \\ \overrightarrow{{y}_{class\,m}} \end{array}} \right] \right)$$

(4)

where:

$$\overrightarrow {y_{class\,j}} = sq\left( {\mathop {\sum}\nolimits_i {C_{gene\,i,\,class\,j}\overrightarrow {z_{gene\,i}} } } \right)$$

(5)

Dynamic routing aims to force the information encoded into each child to align with one parent capsule, thus utilized to calculate $\overleftrightarrow{C}= \{ C_{ij}\}$, a bipartite network relating the child-parent-capsule. A vector of the same length represents each child as the output of the parent capsule. Analogous to the nonlinear transformation of the sum of the information output from the previous layer of neurons for traditional neural networks, for each parent capsule, the child-capsule values are summed, and then a nonlinear transform called a squash function, $sq\left( {\vec{x}} \right) = \frac{{\left\| {\vec{x}} \right\|^2}}{{1 + \left\| {\vec{x}} \right\|^2}}\frac{{\vec{x}}}{{\left\| {\vec{x}} \right\|}}$, is applied to effectively zero-out, or squash, child capsules that do not agree with parent capsules.
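As a brief worked example of the squash function (the input norms are chosen purely for illustration), note that it preserves the direction of $\vec{x}$ and rescales its length to $\left\| \vec{x} \right\|^2 / (1 + \left\| \vec{x} \right\|^2)$, so long vectors approach unit length while short vectors shrink toward zero:

$$\left\| sq\left( \vec{x} \right) \right\| = \frac{\left\| \vec{x} \right\|^2}{1 + \left\| \vec{x} \right\|^2}: \qquad \left\| \vec{x} \right\| = 3 \Rightarrow \frac{9}{10} = 0.9, \qquad \left\| \vec{x} \right\| = 0.1 \Rightarrow \frac{0.01}{1.01} \approx 0.0099$$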

Each child’s contributions to a parent are weighted, and two constraints are imposed. First, the weights from each child to its parents must sum to 1. Second, alignment of a child with exactly one parent is rewarded through a dynamic routing-by-agreement mechanism: an iterative process updates the weights between the child and parent by adding their dot product. The update mechanism for calculating $\overleftrightarrow{C}$ is recapitulated below. After initializing ${\overrightarrow{{C}_{i}}} = softmax\left( {\overrightarrow{{\upbeta }_{{\mathrm{i}}}} = \vec{0}} \right)$ for $r \in \{ 1,2,3, \ldots \}$ iterations:

$${\overrightarrow{{C}_{i}}} = softmax\left( {\overrightarrow{{{\upbeta }}_{{{\mathrm{i}}}}}} \right)$$

(6)

$$\overrightarrow {Y_j} = sq\left( {\mathop {\sum}\nolimits_i {C_{gene\,i,\,class\,j}\overrightarrow {z_{gene\,i}} } } \right)$$

(7)

$${\upbeta}_{ij} = {\upbeta}_{ij} + \overrightarrow {Y_{class\,j}} \cdot \overrightarrow {z_{gene\,i}}$$

(8)

This formula is simplified from its original derivation and utilizes a few notational shortcuts.
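The update in Eqs. 6–8 can be sketched in plain Python as follows. One assumption is made explicit here: following the original routing-by-agreement formulation, each child holds a separate prediction vector for each parent (`u[i][j]`, obtained after the affine transformation), which the simplified notation above collapses into $\overrightarrow{z_{gene\,i}}$. The function names and nested-list layout are hypothetical.

```python
from math import exp, sqrt


def softmax(v):
    m = max(v)
    e = [exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def squash(v):
    """sq(v) = (|v|^2 / (1 + |v|^2)) * v / |v|; a zero vector maps to zeros."""
    sq_norm = sum(x * x for x in v)
    if sq_norm == 0.0:
        return [0.0] * len(v)
    scale = sq_norm / (1.0 + sq_norm) / sqrt(sq_norm)
    return [scale * x for x in v]


def dynamic_routing(u, n_iter=3):
    """Route child predictions u[i][j] (vector of child i for parent j).

    Returns (C, Y): C[i][j] is the coupling of child i to parent j (rows sum
    to 1) and Y[j] is the squashed output of parent j. Requires n_iter >= 1.
    """
    n_children, n_parents, dim = len(u), len(u[0]), len(u[0][0])
    beta = [[0.0] * n_parents for _ in range(n_children)]
    for _ in range(n_iter):
        # Eq. 6: couplings of each child over its candidate parents
        C = [softmax(row) for row in beta]
        # Eq. 7: squashed, coupling-weighted sum of child predictions per parent
        Y = [squash([sum(C[i][j] * u[i][j][k] for i in range(n_children))
                     for k in range(dim)])
             for j in range(n_parents)]
        # Eq. 8: raise beta where a child's prediction agrees with the parent
        for i in range(n_children):
            for j in range(n_parents):
                beta[i][j] += sum(y * x for y, x in zip(Y[j], u[i][j]))
    return C, Y
```

Children whose predictions agree with a parent's output accumulate larger routing logits for that parent, so their couplings concentrate on it over successive iterations.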

Applying this operation for r iterations per batch per training epoch effectively prunes the other connections between the child and its parents as it converges on a single parent from which to send its information. Each output capsule per individual, $\overrightarrow {Y_j}$, is represented by a vector in some n-dimensional space. The output capsule with the highest L2 norm, $\left\| {\overrightarrow {Y_j} } \right\|_2$, is selected as the predicted class, and a margin loss is applied to penalize the model when it fails to either concretely have a very high ($m^ + = 0.9$) or very low probability ($m^ - = 0.1$) of prediction.

$$L_{margin:\,class\,j} = {\updelta}_{y_i,\,j}max\left( {0,m^ + - \left\| {\overrightarrow {Y_j} } \right\|} \right)^2 + {\uplambda}\left( {1 - {\updelta}_{y_i,\,j}} \right)max\left( {0,\left\| {\overrightarrow {Y_j} } \right\| - m^ - } \right)^2$$

(9)

The Kronecker’s delta ${\updelta}_{y_i,j}$ is equal to one when the outcome for individual i is equal to the jth class, thereby activating the left-hand margin loss that penalizes the model if the probability is below $m^ + = 0.9$. On the right-hand of the equation, ${\updelta}_{y_i,j}$ is equal to zero when the outcome for individual i is not equal to the jth class, thereby activating the right-hand margin loss that penalizes the model if the probability is above $m^ - = 0.1$.
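A minimal sketch of Eq. 9, summed over classes for one individual, is shown below. The default down-weighting factor `lam=0.5` is an assumption borrowed from the original capsule network literature; the value used in this work is not stated in the text.

```python
def margin_loss(norms, true_class, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Margin loss for one individual.

    norms: L2 norms of the output capsules, one per class
    true_class: index of the observed class
    lam: down-weighting of the absent-class term (assumed value)
    """
    total = 0.0
    for j, norm in enumerate(norms):
        if j == true_class:
            # Present class: penalize norms below m_plus
            total += max(0.0, m_plus - norm) ** 2
        else:
            # Absent classes: penalize norms above m_minus
            total += lam * max(0.0, norm - m_minus) ** 2
    return total
```

For example, norms of 0.5 for the true class and 0.3 for one other class give $0.4^2 + 0.5 \times 0.2^2 = 0.18$, whereas norms of 0.95 and 0.05 incur no penalty.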

The model is also penalized based on how much the original methylation array could be constructed from the true class’s output capsule via a decoder neural network ${{{\hat{\mathrm X}}}} \sim p_\phi \left( {\overrightarrow X {{{\mathrm{|}}}}\overrightarrow {Y_j} } \right)$:

$$L_{reconstruct} \propto \left( {\hat X - \overrightarrow X } \right)^2$$

(10)

Of the most interest to a biologist may be the primary capsule embeddings per individual, $\overleftrightarrow{Z_*}$, which demonstrate interactions between these biological hypotheses and how the outcome of interest is separable within a certain genomic context, and the weights between the primary and output capsules, $\overleftrightarrow{C}$, a bipartite graph that demonstrates how these genomic regions are related hierarchically and have implications for parent processes. The coordinated response of capsules can also be derived through a bipartite projection of $\overleftrightarrow{C}$ into a unipartite network of capsules. Of secondary importance are the concatenation of the primary capsules, which demonstrates overall class separation, and the decoded output. Tweaking the embeddings or L2 norm of the output capsules and decoding could potentially generate methylation data conditionally on outcomes of interest and interpolate between purified states, though this aspect was unexplored due to prohibitive dimensionality.

Description of MethylSPWNet

MethylSPWNet is the deep-learning analog of a Group LASSO Regression model. The beta values for the CpGs for each gene, $\overrightarrow {x_j}$, are transformed into a single value, $z_{gene\,j}$, through multiplication by a gene-specific vector of CpG weights, $\overrightarrow {w_j}$. These weights are updated throughout the training process to minimize the divergence between observed and expected outcomes. The magnitude of the weights dictates how much information from each CpG should be considered. The final gene-level summary value is given by

$$z_{gene\,j} = {\upsigma}\left( {\overrightarrow{w_j} \cdot \overrightarrow {x_j} } \right)$$

(11)

The gene-level summary values are concatenated to form an array of gene-level summaries ($\vec z \in {\mathbb{R}}^n$):

$$\overrightarrow{z} = \left[ {z_{gene\,1},\, z_{gene\,2},\, z_{gene\,3},\, \ldots ,\, z_{gene\,n}} \right]$$

(12)

The final prediction for the network can be obtained using the following transformation via an MLP, f:

$$\hat y = f\left( {\vec z} \right)$$

(13)

$$\hat p = softmax\left( {\hat y} \right)$$

(14)
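Eqs. 11–14 can be illustrated with a small forward-pass sketch. Two assumptions are made: $\upsigma$ is taken to be the logistic sigmoid, and the MLP $f$ is reduced to a single linear layer. Neither choice is specified in the text, and the function names are hypothetical.

```python
from math import exp


def sigmoid(t):
    return 1.0 / (1.0 + exp(-t))


def gene_summaries(x_by_gene, w_by_gene):
    """Eqs. 11-12: z_j = sigma(w_j . x_j), collected in sorted gene order."""
    return [sigmoid(sum(w * x for w, x in zip(w_by_gene[g], x_by_gene[g])))
            for g in sorted(x_by_gene)]


def predict(z, weights, biases):
    """Eqs. 13-14, with f taken to be one linear layer (an assumption)."""
    y_hat = [sum(w * v for w, v in zip(row, z)) + b
             for row, b in zip(weights, biases)]
    m = max(y_hat)
    e = [exp(v - m) for v in y_hat]
    s = sum(e)
    return [v / s for v in e]
```

Each gene contributes exactly one value to $\vec z$ regardless of how many CpGs it contains, which is what keeps the downstream layers small.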

In the classification case, this predicted outcome is compared to the expected outcome via

$$L_{CE} = \frac{{ - \mathop {\sum }\nolimits_i \mathop {\sum }\nolimits_c y_{i,c}log\left( {\widehat {p_{i,c}}} \right)}}{N}$$

(15)

We applied group L1 regularization to these weights to cause certain genes to drop out, returning genes important for the prediction of the cancer subtypes. The final LASSO penalty is given by

$$L_{L1} = \mathop {\sum}\nolimits_j {\sqrt {d_j} \left\| {\overrightarrow {w_{gene\,j}} } \right\|_2}$$

(16)

where $d_j$ is the number of CpGs assigned to that gene. An intermediate layer of the neural network, $\overrightarrow{z}$, stores gene-level summaries of DNAm information, and $\overrightarrow {w_{gene\,j}}$ contains the importance of each CpG for a particular gene.
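The penalty in Eq. 16 can be computed directly from the per-gene weight vectors, as in the sketch below (the dictionary layout is hypothetical). Because the L2 norm is taken within each gene and summed across genes, the penalty is reduced most by shrinking a gene's entire weight vector to zero, which is how whole genes drop out.

```python
from math import sqrt


def group_lasso_penalty(w_by_gene):
    """Eq. 16: sum over genes of sqrt(d_j) * ||w_j||_2, with d_j = len(w_j)."""
    return sum(sqrt(len(w)) * sqrt(sum(v * v for v in w))
               for w in w_by_gene.values())
```

In training, this term would be added to the cross-entropy loss with a scaling coefficient, the value of which is a hyperparameter not specified here.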

Here, we contrast this summary measure, $z_{gene\,j} = {\upsigma}\left( {\overrightarrow{w_j} \cdot \overrightarrow{x_j} } \right)$ to a more traditional summary measure, such as the median or mean methylation (mean displayed on the right): ${\upbeta}_{gene\,j} = \frac{{\mathop {\sum }\nolimits_k {\upbeta}_{jk}}}{{d_j}}$. Assuming the mean as our measurement, for simplification, it can be seen that each CpG is given equal weight $\frac{1}{{d_j}}$, while for MethylSPWNet, each CpG is given weight $w_{jk}$, which is learnable and reflective of the relative contribution of the given methylation beta value to the aggregate measure. A comparison between $z_{genej}$ and ${\upbeta}_{genej}$ can be found in the supplementary materials.

Hyperparameter scans

MethylCapsNet includes a hyperparameter optimization scheme, accessible through the methylcaps-hypscan module. The package can currently scan a number of hyperparameters, including the number of training epochs, length of the genomic region, the minimum number of CpGs to constitute a capsule, weighting schemes for reconstruction loss and survival loss, and learning rate, in addition to other focused hyperparameters. Additionally, the MethylNet search over randomized neural network topologies was replaced by a framework that searches for the ideal number of neurons per neural network layer, conditional upon the choice of the number of layers. There are three search strategies for optimization, including randomized searches and Bayesian optimization techniques. This scheme differs from MethylNet, as both the neural network topology and a set of hyperparameters can be optimized through the application of successive Gaussian processes to update some prior of losses over the set of hyperparameters. However, the results presented in this paper utilized the randomized search design. The jobs can be launched in parallel and scaled to meet the demands of a larger compute cluster.
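To make the randomized search design concrete, a generic sketch is given below. It does not reproduce the methylcaps-hypscan interface; the search-space keys, the log-uniform sampling of continuous ranges, and the objective function are all hypothetical.

```python
import random


def sample_config(space, rng):
    """Draw one configuration.

    Lists are sampled uniformly; (lo, hi) tuples are sampled log-uniformly.
    """
    config = {}
    for name, spec in space.items():
        if isinstance(spec, tuple):
            lo, hi = spec
            config[name] = lo * (hi / lo) ** rng.random()
        else:
            config[name] = rng.choice(spec)
    return config


def random_search(objective, space, n_trials, seed=0):
    """Return the (loss, config) pair with the lowest loss over n_trials draws."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        cfg = sample_config(space, rng)
        trials.append((objective(cfg), cfg))
    return min(trials, key=lambda t: t[0])


# Hypothetical search space; names do not correspond to actual CLI options.
EXAMPLE_SPACE = {
    "n_epochs": [50, 100, 200],
    "min_capsule_len": [5, 10, 20],
    "learning_rate": (1e-4, 1e-2),
}
```

In a real scan, `objective` would train a model with the sampled configuration and return its validation loss; because trials are independent, they can be dispatched in parallel.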

Capsule generation

Capsules specify the groupings of CpGs of the MethylationArray object. Capsule selection has been incorporated into the hyperparameter_scan and the methylcaps-model subcommands. Application Programming Interface (API) access to capsule selection and building is available through the build_capsules script. As mentioned in the “Results” section, prespecified capsules include the following Illumina methylation array annotations: UCSC_RefGene_Name, UCSC_RefGene_Accession, UCSC_RefGene_Group, UCSC_CpG_Islands_Name, Relation_to_UCSC_CpG_Island, Phantom, DMR, Enhancer, HMM_Island, Regulatory_Feature_Name, Regulatory_Feature_Group, and DHS. Additionally, the following GSEA gene sets may be queried: C5.BP, C6, C1, H, C3.MIR, C2.CGP, C4.CM, C5.CC, C3.TFT, C5.MF, C7, C2.CP, and C4.CGN. Users can also specify their own capsules by supplying a pickled dictionary containing a DataFrame that maps each CpG to a context name of choice. Capsule generation may also be accomplished by breaking up the entire hg19 genome into overlapping windows of fixed width⁵⁷ (genomic_binned selection). We recommend the utilization of the Circos tool¹¹¹ for visualization of derived capsule relationships using the genomic_binned option.
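The genomic binning idea can be illustrated with the sketch below, which assigns CpGs to fixed-width windows and allows overlap when the step is smaller than the width. This is not the genomic_binned implementation; the function name, window naming scheme, half-open coordinate convention, and input layout are assumptions for illustration.

```python
def genomic_bin_capsules(cpg_positions, width, step=None, min_cpgs=1):
    """Assign CpGs to fixed-width windows; step < width yields overlapping windows.

    cpg_positions: dict mapping CpG ID -> (chromosome, base-pair position)
    Returns a dict mapping "chrom:start-end" -> sorted list of CpG IDs, where
    each window covers the half-open interval [start, end).
    """
    step = step or width
    capsules = {}
    for cpg, (chrom, pos) in cpg_positions.items():
        # Start from the last window beginning at or before pos, then walk
        # back through earlier windows that still cover pos.
        k = pos // step
        while k >= 0 and k * step + width > pos:
            name = f"{chrom}:{k * step}-{k * step + width}"
            capsules.setdefault(name, []).append(cpg)
            k -= 1
    return {n: sorted(c) for n, c in capsules.items() if len(c) >= min_cpgs}
```

For instance, `width=1_000_000` with the default step gives non-overlapping 1-Mb bins, while `step=500_000` places every CpG in two adjacent windows.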

Selection of capsules for MethylCapsNet and MethylSPWNet

For the training of MethylSPWNet, we utilized all genes that overlapped with the 200,000 most variable CpGs across the CNS tumors. For MethylCapsNet, we could not utilize the complete set of genes due to the number of free parameters. Instead, a list of 650 genes was manually curated for MethylCapsNet that included genes related to WNT, SHH, DKK1, beta-catenin, SFRP, and NPR3, among others. This list was reduced to 55 genes via recognition of genes by domain experts and thresholding of the minimum number of CpGs. As a further description, for both approaches, hyperparameter scans were utilized for pruning genes that did not contain a minimum number of CpGs (this threshold was varied via the hyperparameter scan), resulting in a lower number of genes than originally specified (n = 10,341). In future iterations of the capsule-inspired network-based approach, gene-selection constraints will be lifted via reduction of free parameters and the adoption of explicit network building approaches.

Comethylation embedding modules

MethylSPWNet-derived gene-level methylation summaries/embeddings were correlated to each other and within their own set of top genes. To identify modules of gene comethylation patterns and understand how they relate to the underlying pathways, we selected the 2000 most variably methylated genes across the 38 brain cancer subtypes as defined by gene median methylation and SPW-derived gene-level methylation. Louvain modularity was performed on a k-nearest neighbor graph of MethylSPWNet gene-level embeddings to establish preliminary coembedding modules and then tested for enrichment after combining the two largest modules. The genes that were identified in the largest two modules were selected for enrichment analysis. Results for the preliminary module analysis may be found in the supplementary material, section “Preliminary Pathways and Module Analysis”. For MethylSPWNet, the final gene comethylation/embedding analysis was carried out on a subtype-specific basis on all genes, done so through the use of WGCNA.

For MethylCapsNet, capsule-level embeddings were averaged across all individuals to form overall embeddings. These approaches can equally be extended to capsule-level embeddings at the individual level or aggregated across meaningful subgroups. To derive the final measures of coordinated response between capsules, we averaged the routing matrix coefficients across individuals to form a weighted bipartite graph and calculated a bipartite projection of this graph to form a unipartite graph of capsules. We utilized the Louvain modularity algorithm to discover hubs in this network and performed pathway-level enrichment analyses using Enrichr¹¹² to describe these hubs.
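The averaging and projection steps can be sketched as follows. The sketch assumes that each individual's routing matrix has one row per gene capsule and one column per output class, and that the projected weight between two capsules is the sum over classes of the product of their averaged routing coefficients. The projection weighting, the function names, and the toy matrices are assumptions made for illustration, not the weighting used in the study.

```python
def mean_routing(routing_per_individual):
    """Average capsule-by-class routing matrices across individuals."""
    n = len(routing_per_individual)
    rows = len(routing_per_individual[0])
    cols = len(routing_per_individual[0][0])
    return [
        [sum(m[i][c] for m in routing_per_individual) / n for c in range(cols)]
        for i in range(rows)
    ]


def project_onto_capsules(bipartite):
    """Capsule-capsule weights: sum over classes of products of edge weights."""
    n = len(bipartite)
    return [
        [
            0.0 if i == j else sum(a * b for a, b in zip(bipartite[i], bipartite[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Two individuals, three capsules, two classes (hypothetical coefficients).
routing = [
    [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]],
    [[1.0, 0.0], [0.6, 0.4], [0.0, 1.0]],
]
avg = mean_routing(routing)          # weighted bipartite graph (capsule x class)
proj = project_onto_capsules(avg)    # unipartite capsule graph (capsule x capsule)
for row in proj:
    print([round(w, 3) for w in row])
```

Capsules that route toward the same classes receive a large projected weight, so hubs in the projected graph correspond to groups of capsules with a coordinated response.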

Random Forest approaches in comparison

As a comparison to MethylSPWNet and MethylCapsNet, we adapted the Random Forest scheme featured in a DNAm machine-learning classification study. We selected 10 k CpGs by first fitting 100 random forest models, each fit on 10 k randomly selected CpGs. SHapley Additive exPlanations (SHAP)¹¹³ was employed to determine the top CpGs from each random forest run. The 10 k CpGs with the highest average rank across the 100 random forest models were selected for the final RF model. We note that we did not have access to the original set of 10 k CpGs featured in the previous classifier development study. The previous study also utilized probability calibration methods to boost model sensitivity and specificity, which we avoided to ensure a proper comparison between methods.
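The rank-aggregation idea can be sketched without fitting any models. In the sketch below, each run is represented by a dictionary of per-CpG importance scores that stand in for SHAP-derived importances. Because every run sees only a random subset of CpGs, the mean rank is taken over the runs in which a CpG appears, which is an assumption about how missing CpGs are handled. The function name and all values are hypothetical.

```python
from collections import defaultdict


def average_rank_selection(runs, n_select):
    """Select the CpGs with the best (lowest) mean rank across runs.

    `runs` is a list of dicts mapping CpG -> importance score for the CpGs
    sampled in that run; rank 1 is the most important CpG within a run.
    """
    ranks = defaultdict(list)
    for scores in runs:
        ordered = sorted(scores, key=lambda cpg: (-scores[cpg], cpg))
        for rank, cpg in enumerate(ordered, start=1):
            ranks[cpg].append(rank)
    mean_rank = {cpg: sum(r) / len(r) for cpg, r in ranks.items()}
    return sorted(mean_rank, key=lambda cpg: (mean_rank[cpg], cpg))[:n_select]


# Three toy runs, each on a different random subset of CpGs.
runs = [
    {"cg1": 0.9, "cg2": 0.5, "cg3": 0.1},
    {"cg1": 0.7, "cg3": 0.6, "cg4": 0.2},
    {"cg2": 0.8, "cg4": 0.3, "cg1": 0.1},
]
selected = average_rank_selection(runs, n_select=2)
print(selected)
```

In this toy case cg2 has mean rank 1.5 and cg1 has mean rank 5/3, so they are selected ahead of cg3 and cg4 (mean rank 2.5 each).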

Analysis of CpG weights derived from MethylSPWNet

MethylSPWNet derives CpG-specific weights, $\overrightarrow w$, that relate each CpG to its respective gene. We rank-ordered, reverse rank-ordered, and absolute-value reverse rank-ordered these lists to yield CpGs that were important for differentiating the tumor types. We subset the first 1000 CpGs, marked which genes they corresponded to, and tested for enrichment using Enrichr in our preliminary weight analysis. The results for the preliminary weight analysis may be found in the supplementary material, section “Preliminary Pathways and Module Analysis”. Finally, weights were also rank-ordered and reverse rank-ordered to yield the sets of top negative and top positive weights, respectively; the top-ranked CpGs (with the number selected to highlight tendencies of enrichment and depletion) were related to CpG island and gene context.
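The three orderings can be made concrete with a small sketch. The function name `top_weighted_cpgs`, the dictionary keys, and the weight values are hypothetical; the sketch only shows how ascending, descending, and absolute-value orderings of the same weight vector select different CpGs.

```python
def top_weighted_cpgs(weights, n_top):
    """Return the top CpGs under three orderings of the learned weights."""
    ascending = sorted(weights, key=weights.get)
    by_magnitude = sorted(weights, key=lambda cpg: abs(weights[cpg]), reverse=True)
    return {
        "most_negative": ascending[:n_top],            # rank-ordered
        "most_positive": ascending[::-1][:n_top],      # reverse rank-ordered
        "largest_magnitude": by_magnitude[:n_top],     # absolute-value ordering
    }


# Toy CpG-specific weights (hypothetical values).
w = {"cg1": -0.9, "cg2": 0.1, "cg3": 0.7, "cg4": -0.2, "cg5": 0.4}
top = top_weighted_cpgs(w, n_top=2)
print(top)
```

The selected CpGs can then be mapped to their genes or to their island and gene context for enrichment testing.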

Method to cluster gene-level brain cancer embeddings by samples

Recall that embeddings for individuals for MethylSPWNet were given in the following form (the design matrix is of dimensionality samples by genes):

$$\overrightarrow{z} = \left[ {z_{gene\,1},\; z_{gene\,2},\; z_{gene\,3},\; \ldots,\; z_{gene\,n}} \right]$$

(17)

For MethylNet, the embeddings are derived using the encoder:

$$\vec{z} = f\left( {\vec{x}} \right)$$

(18)

Embeddings for individuals using the MethylCapsNet approach (of dimensionality samples by genes by latent dimensions) can be obtained by either averaging or concatenating (an aggregation, or AGG operator) the gene-level embeddings:

$$\overleftrightarrow{z} = AGG\left( {\left[ {\begin{array}{*{20}{c}} {\overrightarrow{{z}_{gene\,1}} } \\ \vdots \\ {\overrightarrow {{z}_{gene\,n}} } \end{array}} \right]} \right)$$

(19)
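A minimal sketch of the AGG operator for a single individual follows. It assumes that the gene-level embeddings are given as a list of equal-length vectors and that averaging is taken element-wise across genes; the function name `agg` and the toy values are hypothetical.

```python
def agg(gene_embeddings, mode="mean"):
    """Aggregate gene-level embedding vectors for one individual.

    "mean" averages element-wise across genes (output length = latent dims);
    "concat" joins the vectors end to end (output length = genes * latent dims).
    """
    if mode == "mean":
        n = len(gene_embeddings)
        return [sum(col) / n for col in zip(*gene_embeddings)]
    if mode == "concat":
        return [value for vec in gene_embeddings for value in vec]
    raise ValueError(f"unknown aggregation mode: {mode}")


# Three genes with two latent dimensions each (hypothetical values).
sample = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(agg(sample, "mean"))
print(agg(sample, "concat"))
```

Averaging keeps the per-individual vector short, whereas concatenation preserves gene identity at the cost of a much wider design matrix.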

Stacking these vectors for individuals would yield a design matrix that can be clustered using methods such as hierarchical clustering. We implemented hierarchical clustering using scikit-learn (>0.22) and found 14 clusters, which we compared against the true labels of cell-of-origin, histological subtype, and histological and molecular subtypes using the v-measure statistic; cluster separation was assessed using the Silhouette coefficient.
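For readers unfamiliar with the v-measure, the sketch below computes it from scratch as the harmonic mean of homogeneity and completeness, using the standard entropy-based definitions with equal weighting. It is written for illustration and is not the scikit-learn implementation used in the analysis; the function names and toy labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label list."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """H(target | given) estimated from paired label lists."""
    n = len(target)
    joint = Counter(zip(target, given))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (_, g), c in joint.items()
    )


def v_measure(true_labels, cluster_labels):
    """Harmonic mean of homogeneity and completeness."""
    h_true, h_clust = entropy(true_labels), entropy(cluster_labels)
    homogeneity = 1.0 if h_true == 0 else (
        1.0 - conditional_entropy(true_labels, cluster_labels) / h_true
    )
    completeness = 1.0 if h_clust == 0 else (
        1.0 - conditional_entropy(cluster_labels, true_labels) / h_clust
    )
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


subtypes = ["A", "A", "B", "B"]
print(v_measure(subtypes, [1, 1, 0, 0]))  # clusters match subtypes up to relabeling
print(v_measure(subtypes, [0, 1, 0, 1]))  # clusters carry no subtype information
```

The score is invariant to how cluster labels are numbered, which makes it suitable for comparing an unsupervised clustering against several alternative ground-truth labelings.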

Web application

We have developed a web application for the submission and investigation of MethylCapsNet outputs. The web application features three modules. The first is the network-projection model, in which capsules are related to each other across subtypes and users can change the network configuration by adjusting the relationships between the capsules and conservation. The second module displays routing information, and the third module displays embedding information. Usage is detailed in the wiki.

Analysis hardware and software

The analyses run for this work were optimized utilizing K80 GPUs at the Dartmouth Research Computing Cluster. The algorithms were designed using Python 3.7, PyTorch version 1.1, and CUDA 9.0.

Dataset preprocessing

We acquired data from GEO accession GSE109381, preprocessed the data using PyMethylProcess, and, after functional normalization was applied, subselected the 200 K most hypervariable CpGs (to focus on CpGs that may better differentiate CNS tumor subtypes). SNP and nonautosomal (sex chromosome) probes were omitted. Preprocessing steps are detailed in the PyMethylProcess pipeline²¹.
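The subselection of hypervariable CpGs can be sketched as a ranking by variance across samples. The sketch below is not the PyMethylProcess implementation: the function name, the use of population variance as the variability measure, and the beta values are assumptions chosen for illustration.

```python
from statistics import pvariance


def most_variable_cpgs(beta, n_keep):
    """Return the `n_keep` CpGs with the largest variance across samples.

    `beta` maps each CpG to its list of beta values, one per sample.
    """
    variances = {cpg: pvariance(values) for cpg, values in beta.items()}
    return sorted(variances, key=lambda cpg: (-variances[cpg], cpg))[:n_keep]


# Toy beta values for four CpGs across four samples (hypothetical).
beta = {
    "cg1": [0.10, 0.90, 0.50, 0.20],
    "cg2": [0.48, 0.50, 0.52, 0.50],
    "cg3": [0.05, 0.95, 0.05, 0.95],
    "cg4": [0.30, 0.40, 0.35, 0.45],
}
kept = most_variable_cpgs(beta, n_keep=2)
print(kept)
```

CpGs whose methylation barely changes across samples, such as cg2 here, carry little information for separating subtypes and are dropped first.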

Data availability

Data used in this study were acquired from GEO accessions GSE109381, GSE84207, and GSE75067. Test data are available in our GitHub repository.

Code availability

The software (MethylCapsNet, MethylSPWNet) is open source and can be found on GitHub at https://github.com/Christensen-Lab-Dartmouth/MethylCapsNet, on PyPI under the tag methylcapsnet, and on Docker at joshualevy44/methylcapsnet. New features may continue to be developed for the MethylCapsNet framework, and community contributions are welcome in the form of GitHub pull requests and issues. A test pipeline is available in the software implementation; a Wiki, help documentation, and example R scripts for possible downstream analyses can be found in the GitHub repository.

References

Bell, C. G. et al. DNA methylation aging clocks: challenges and recommendations. Genome Biol. 20, 249 (2019).
Khavari, D. A., Sen, G. L. & Rinn, J. L. DNA methylation and epigenetic control of cellular differentiation. Cell Cycle 9, 3880–3883 (2010).
Christensen, B. C. et al. Aging and environmental exposures alter tissue-specific DNA methylation dependent upon CpG island context. PLoS Genet. 5, e1000602 (2009).
Dedeurwaerder, S. et al. Evaluation of the infinium methylation 450K technology. Epigenomics 3, 771–784 (2011).
Moran, S., Arribas, C. & Esteller, M. Validation of a DNA methylation microarray for 850,000 CpG sites of the human genome enriched in enhancer sequences. Epigenomics 8, 389–399 (2016).
Heyn, H. & Esteller, M. DNA methylation profiling in the clinic: applications and challenges. Nat. Rev. Genet. 13, 679–692 (2012).
Dor, Y. & Cedar, H. Principles of DNA methylation and their implications for biology and medicine. Lancet 392, 777–786 (2018).
Capper, D. et al. DNA methylation-based classification of central nervous system tumours. Nature 555, 469–474 (2018).
Hegi, M. E. et al. MGMT gene silencing and benefit from temozolomide in glioblastoma. N. Engl. J. Med. 352, 997–1003 (2005).
Turcan, S. et al. IDH1 mutation is sufficient to establish the glioma hypermethylator phenotype. Nature 483, 479–483 (2012).
Noushmehr, H. et al. Identification of a CpG island methylator phenotype that defines a distinct subgroup of glioma. Cancer Cell 17, 510–522 (2010).
Christensen, B. C. et al. DNA methylation, isocitrate dehydrogenase mutation, and survival in glioma. J. Natl Cancer Inst. 103, 143–153 (2011).
Dabrowski, M. J. & Wojtas, B. Global DNA methylation patterns in human gliomas and their interplay with other epigenetic modifications. Int. J. Mol. Sci. 20, 3478 (2019).
Cavalli, F. M. G. et al. Intertumoral heterogeneity within medulloblastoma subgroups. Cancer Cell 31, 737–754 (2017). e6.
Maros, M. E. et al. Machine learning workflows to estimate class probabilities for precision cancer diagnostics on DNA methylation microarray data. Nat. Protoc. 15, 479–512 (2020).
Rauschert, S., Raubenheimer, K., Melton, P. E. & Huang, R. C. Machine learning and clinical epigenetics: a review of challenges for diagnosis and classification. Clin. Epigenetics 12, 51 (2020).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Hinton, G. E. Connectionist learning procedures. Artif. Intell. 40, 185–234 (1989).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. in Advances in Neural Information Processing Systems 25 (eds. Pereira, F., Burges, C. J. C., Bottou, L. & Weinberger, K. Q.) 1097–1105 (Curran Associates, Inc., 2012).
Levy, J. J. et al. MethylNet: an automated and modular deep learning approach for DNA methylation analysis. BMC Bioinforma. 21, 108 (2020).
Levy, J. J., Titus, A. J., Salas, L. A. & Christensen, B. C. PyMethylProcess - convenient high-throughput preprocessing workflow for DNA methylation data. Bioinformatics (2019) https://doi.org/10.1093/bioinformatics/btz594.
Titus, A. J., Wilkins, O. M., Bobak, C. A. & Christensen, B. C. Unsupervised deep learning with variational autoencoders applied to breast tumor genome-wide DNA methylation data with biologic feature extraction. bioRxiv 433763 (2018) https://doi.org/10.1101/433763.
Titus, A. J., Bobak, C. A. & Christensen, B. C. A new dimension of breast cancer epigenetics - applications of variational autoencoders with DNA methylation. in Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018) - Volume 3: BIOINFORMATICS 140–145 (SCITEPRESS, 2018).
Angermueller, C., Lee, H. J., Reik, W. & Stegle, O. DeepCpG: accurate prediction of single-cell DNA methylation states using deep learning. Genome Biol. 18, 67 (2017).
Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. in Advances in Neural Information Processing Systems 30 (eds. Guyon, I. et al.) 4765–4774 (Curran Associates, Inc., 2017).
Ribeiro, M. T., Singh, S. & Guestrin, C. ‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier. arXiv:1602.04938 [cs, stat] (2016).
Camacho, D. M., Collins, K. M., Powers, R. K., Costello, J. C. & Collins, J. J. Next-generation machine learning for biological networks. Cell 173, 1581–1592 (2018).
Horvath, S. DNA methylation age of human tissues and cell types. Genome Biol. 14, R115 (2013).
Handl, L., Jalali, A., Scherer, M., Eggeling, R. & Pfeifer, N. Weighted elastic net for unsupervised domain adaptation with application to age prediction from DNA methylation data. Bioinformatics 35, i154–i163 (2019).
Sun, H. & Wang, S. Penalized logistic regression for high-dimensional DNA methylation data with case-control studies. Bioinformatics 28, 1368–1375 (2012).
Zhou, W. & Lo, S.-H. Analysis of genotype by methylation interactions through sparsity-inducing regularized regression. BMC Proc. 12, 40 (2018).
Choi, J., Kim, K. & Sun, H. New variable selection strategy for analysis of high-dimensional DNA methylation data. J. Bioinform Comput Biol. 16, 1850010 (2018).
Dong, N. T. & Khosla, M. Revisiting Feature Selection with Data Complexity. bioRxiv 754630 (2019) https://doi.org/10.1101/754630.
Sun, L., Namboodiri, S., Chen, E. & Sun, S. Preliminary analysis of within-sample co-methylation patterns in normal and cancerous breast samples. Cancer Inf. 18, 1176935119880516 (2019).
Rickabaugh, T. M. et al. Acceleration of age-associated methylation patterns in HIV-1-infected adults. PLoS ONE 10, e0119201 (2015).
Zhang, J. & Huang, K. Pan-cancer analysis of frequent DNA co-methylation patterns reveals consistent epigenetic landscape changes in multiple cancers. BMC Genomics 18, 1045 (2017).
Gomez, L. et al. coMethDMR: accurate identification of co-methylated and differentially methylated regions in epigenome-wide association studies with continuous phenotypes. Nucleic Acids Res. 47, e98–e98 (2019).
Lien, T. G., Borgan, Ø., Reppe, S., Gautvik, K. & Glad, I. K. Integrated analysis of DNA-methylation and gene expression using high-dimensional penalized regression: a cohort study on bone mineral density in postmenopausal women. BMC Med. Genomics 11, 24 (2018).
Ng, B., Jafarzadeh, S., Cole, D., Goldenberg, A. & Mostafavi, S. DNA methylation network estimation with sparse latent gaussian graphical model. bioRxiv https://doi.org/10.1101/367748 (2018).
Davies, M. et al. Functional annotation of the human brain methylome identifies tissue-specific epigenetic variation across brain and blood. Genome Biol. 13, R43 (2012).
Cui, Z.-J., Zhou, X.-H. & Zhang, H.-Y. DNA methylation module network-based prognosis and molecular typing of cancer. Genes 10, 571 (2019).
Mallona, I., Aussó, S., Díez-Villanueva, A., Moreno, V. & Peinado, M. A. Modular dynamics of DNA co-methylation networks exposes the functional organization of colon cancer cells’ genome. bioRxiv 428730 (2018) https://doi.org/10.1101/428730.
Tremblay, B. L., Guénard, F., Lamarche, B., Pérusse, L. & Vohl, M.-C. Network analysis of the potential role of DNA methylation in the relationship between plasma carotenoids and lipid profile. Nutrients 11, 1265 (2019).
Mallik, S. & Bandyopadhyay, S. WeCoMXP: weighted connectivity measure integrating co-methylation, co-expression and protein-protein interactions for gene-module detection. IEEE/ACM Trans Comput Biol Bioinform (2018) https://doi.org/10.1109/TCBB.2018.2868348.
Wang, F., Xu, H., Zhao, H., Gelernter, J. & Zhang, H. DNA co-methylation modules in postmortem prefrontal cortex tissues of European Australians with alcohol use disorders. Sci. Rep. 6, 1–11 (2016).
Bartlett, T. E., Olhede, S. C. & Zaikin, A. A DNA methylation network interaction measure, and detection of network oncomarkers. PLoS ONE 9, e84573 (2014).
Horvath, S. et al. Aging effects on DNA methylation modules in human brain and blood tissue. Genome Biol. 13, R97 (2012).
Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinforma. 9, 559 (2008).
Akulenko, R. & Helms, V. DNA co-methylation analysis suggests novel functional associations between gene pairs in breast cancer samples. Hum. Mol. Genet. 22, 3016–3022 (2013).
Affinito, O. et al. Nucleotide distance influences co-methylation between nearby CpG sites. Genomics 112, 144–150 (2020).
Hao, J., Kim, Y., Kim, T.-K. & Kang, M. PASNet: pathway-associated sparse deep neural network for prognosis prediction from high-throughput data. BMC Bioinforma. 19, 510 (2018).
Hao, J., Masum, M., Oh, J. H. & Kang, M. Gene- and pathway-based deep neural network for multi-omics data integration to predict cancer survival outcomes. in Bioinformatics Research and Applications (eds. Cai, Z., Skums, P. & Li, M.) 113–124 (Springer International Publishing, 2019).
Borisov, V., Haug, J. & Kasneci, G. CancelOut: a layer for feature selection in deep neural networks. in Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning (eds. Tetko, I. V., Kůrková, V., Karpov, P. & Theis, F.) 72–83 (Springer International Publishing, 2019).
Crawford, J. & Greene, C. S. Incorporating biological structure into machine learning models in biomedicine. Curr. Opin. Biotechnol. 63, 126–134 (2020).
Xie, G. et al. Group Lasso regularized deep learning for cancer prognosis from multi-omics and clinical features. Genes 10, 240 (2019).
Barthel, F. P., Johnson, K. C., Wesseling, P. & Verhaak, R. G. W. Evolving insights into the molecular neuropathology of diffuse gliomas in adults. Neurol. Clin. 36, 421–437 (2018).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Artemenkov, A. & Panov, M. NCVis: Noise Contrastive Approach for Scalable Visualization. arXiv:2001.11411v1 (2020).
Szklarczyk, D. et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).
Babic, I. & Mischel, P. S. Multiple functions of a glioblastoma fusion oncogene. J. Clin. Invest 123, 548–551 (2013).
Parker, B. C. et al. The tumorigenic FGFR3-TACC3 gene fusion escapes miR-99a regulation in glioblastoma. J. Clin. Invest 123, 855–865 (2013).
Macy, M. E. et al. Clinical and molecular characteristics of congenital glioblastoma. Neuro Oncol. 14, 931–941 (2012).
Berezovsky, A. D. et al. Sox2 promotes malignancy in glioblastoma by regulating plasticity and astrocytic differentiation. Neoplasia 16, 193–206 (2014). e25.
Ibrahim, K., Abdul Murad, N. A., Harun, R. & Jamal, R. Knockdown of Tousled‑like kinase 1 inhibits survival of glioblastoma multiforme cells. Int. J. Mol. Med. 46, 685–699 (2020).
Huang, Q. et al. Up-regulated microRNA-299 corrected with poor prognosis of glioblastoma multiforme patients by targeting ELL2. Jpn J. Clin. Oncol. 47, 590–596 (2017).
Krishnan, R., Boddapati, N. & Mahalingam, S. Interplay between human nucleolar GNL1 and RPS20 is critical to modulate cell proliferation. Sci. Rep. 8, 11421 (2018).
Friesen, C. et al. Opioid receptor activation triggering downregulation of cAMP improves effectiveness of anti-cancer drugs in treatment of glioblastoma. Cell Cycle 13, 1560–1570 (2014).
Pearson, J. R. D. & Regad, T. Targeting cellular pathways in glioblastoma multiforme. Signal Transduct. Target Ther. 2, 17040 (2017).
Xiong, A. et al. Nuclear receptor binding protein 2 is downregulated in medulloblastoma, and reduces tumor cell survival upon overexpression. Cancers 12, 1483 (2020).
de la Rocha, A. M. A., Sampron, N., Alonso, M. M. & Matheu, A. Role of SOX family of transcription factors in central nervous system tumors. Am. J. Cancer Res. 4, 312–324 (2014).
Rivero-Hinojosa, S. et al. Proteomic analysis of Medulloblastoma reveals functional biology with translational potential. Acta Neuropathol. Commun. 6, 48 (2018).
Qi, Y. & Gao, Y. Clinical significance of miR-33b in glioma and its regulatory role in tumor cell proliferation, invasion and migration. Biomark. Med. 14, 539–548 (2020).
Wang, X. et al. MYC-regulated mevalonate metabolism maintains brain tumor initiating cells. Cancer Res. 77, 4947–4960 (2017).
Marx, S. et al. The role of platelets in cancer pathophysiology: focus on malignant glioma. Cancers 11, 569 (2019).
Wu, X. et al. CpG island hypermethylation in human astrocytomas. Cancer Res. 70, 2718–2727 (2010).
Caponegro, M. D., Moffitt, R. A. & Tsirka, S. E. Expression of neuropilin-1 is linked to glioma associated microglia and macrophages and correlates with unfavorable prognosis in high grade gliomas. Oncotarget 9, 35655–35665 (2018).
Xia, Z. et al. The expression, functions, interactions and prognostic values of PTPRZ1: a review and bioinformatic analysis. J. Cancer 10, 1663–1674 (2019).
Panosyan, E. H., Lin, H. J., Koster, J. & Lasky, J. L. In search of druggable targets for GBM amino acid metabolism. BMC Cancer 17, 162 (2017).
Rodríguez-Paredes, M. & Esteller, M. Cancer epigenetics reaches mainstream oncology. Nat. Med. 17, 330–339 (2011).
Chenn, A. Wnt/β-catenin signaling in cerebral cortical development. Organogenesis 4, 76–80 (2008).
Testa, U., Castelli, G. & Pelosi, E. Genetic abnormalities, clonal evolution, and cancer stem cells of brain tumors. Med. Sci. 6, 85 (2018).
Li, T. et al. IGFBP2: integrative hub of developmental and oncogenic signaling network. Oncogene 39, 2243–2257 (2020).
Ishak, G. et al. Deregulation of MYC and TP53 through genetic and epigenetic alterations in gallbladder carcinomas. Clin. Exp. Med. 15, 421–426 (2015).
Jun, I. et al. ANO9/TMEM16J promotes tumourigenesis via EGFR and is a novel therapeutic target for pancreatic cancer. Br. J. Cancer 117, 1798–1809 (2017).
Hatanpaa, K. J., Burma, S., Zhao, D. & Habib, A. A. Epidermal growth factor receptor in glioma: signal transduction, neuropathology, imaging, and radioresistance. Neoplasia 12, 675–684 (2010).
Fleischer, T. et al. DNA methylation at enhancers identifies distinct breast cancer lineages. Nat. Commun. 8, 1379 (2017).
Holm, K. et al. An integrated genomics analysis of epigenetic subtypes in human breast tumors links DNA methylation patterns to chromatin states in normal mammary cells. Breast Cancer Res. 18, 27 (2016).
Sabour, S., Frosst, N. & Hinton, G. E. Dynamic Routing Between Capsules. arXiv:1710.09829 [cs] (2017).
Venkatraman, S., S, B. & Sarma, R. Building Deep, Equivariant Capsule Networks. arXiv:1908.01300 [cs.LG] (2019).
Wang, L., Miao, X., Zhang, J. & Cai, J. MultiCapsNet: a interpretable deep learning classifier integrate data from multiple sources. bioRxiv 570507 (2019) https://doi.org/10.1101/570507.
Danielsson, A. et al. MethPed: a DNA methylation classifier tool for the identification of pediatric brain tumor subtypes. Clin. Epigenetics 7, 62 (2015).
Hovestadt, V. et al. Decoding the regulatory landscape of medulloblastoma using DNA methylation sequencing. Nature 510, 537–541 (2014).
Baeza, N., Weller, M., Yonekawa, Y., Kleihues, P. & Ohgaki, H. PTEN methylation and expression in glioblastomas. Acta Neuropathol. 106, 479–485 (2003).
Capaccione, K. M. & Pine, S. R. The Notch signaling pathway as a mediator of tumor survival. Carcinogenesis 34, 1420–1430 (2013).
Fan, X. et al. Notch pathway inhibition depletes stem-like cells and blocks engraftment in embryonal brain tumors. Cancer Res. 66, 7445–7452 (2006).
Li, J. et al. PTEN, a putative protein tyrosine phosphatase gene mutated in human brain, breast, and prostate cancer. Science 275, 1943–1947 (1997).
He, X. et al. The G protein α subunit Gαs is a tumor suppressor in Sonic hedgehog-driven medulloblastoma. Nat. Med. 20, 1035–1042 (2014).
Pidsley, R. et al. Critical evaluation of the Illumina MethylationEPIC BeadChip microarray for whole-genome DNA methylation profiling. Genome Biol. 17, 208 (2016).
Zhou, L. et al. Systematic evaluation of library preparation methods and sequencing platforms for high-throughput whole genome bisulfite sequencing. Sci. Rep. 9, 10383 (2019).
Moran, S. et al. Validation of DNA methylation profiling in formalin-fixed paraffin-embedded samples using the Infinium HumanMethylation450 Microarray. Epigenetics 9, 829–833 (2014).
Bodnar, C., Cangea, C. & Liò, P. Deep Graph Mapper: Seeing Graphs through the Neural Lens. arXiv:2002.03864 [cs, stat] (2020).
van Veen, H. J., Saul, N., Eargle, D. & Mangham, S. W. Kepler Mapper: A flexible Python implementation of the Mapper algorithm. J. Open Source Softw. 4, 1315 (2019).
Wang, T., Johnson, T., Jie, Z. & Huang, K. Topological methods for visualization and analysis of high dimensional single-cell RNA sequencing data. Pac. Symp. Biocomput. 24, 350–361 (2019).
Rizvi, A. H. et al. Single-cell topological RNA-Seq analysis reveals insights into cellular differentiation and development. Nat. Biotechnol. 35, 551–560 (2017).
Lum, P. Y. et al. Extracting insights from the shape of complex data using topology. Sci. Rep. 3, 1236 (2013).
Singh, D. & Yamada, M. FsNet: Feature Selection Network on High-dimensional Biological Data. arXiv:2001.08322 [cs, stat] (2020).
Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47–e47 (2015).
Raudvere, U. et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 47, W191–W198 (2019).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
Gustavsen, J. A., Pai, S., Isserlin, R., Demchak, B. & Pico, A. R. RCy3: Network biology using cytoscape from within R. F1000Res 8, 1774 (2019).
Krzywinski, M. et al. Circos: An information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).
Chen, E. Y. et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinforma. 14, 128 (2013).
Lundberg, S. M. et al. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2, 56–67 (2020).

Acknowledgements

We would like to acknowledge Christian Haudenschild, Hildreth Robert Frost, and A. James O’Malley for their thoughtful discussions. This work was supported by NIH grants R01CA216265, R01CA253976, and P20GM104416 to BCC, Dartmouth College Neukom Institute for Computational Science CompX awards to BCC and LJV, and training fellowship support for AJT from T32LM012204. CLP and JJL are supported through the Burroughs Wellcome Fund Big Data in the Life Sciences at Dartmouth.

Author information

Authors and Affiliations

Program in Quantitative Biomedical Sciences, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
Joshua J. Levy & Youdinghuan Chen
Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
Joshua J. Levy, Youdinghuan Chen, Nasim Azizgolshani, Curtis L. Petersen, Lucas A. Salas & Brock C. Christensen
Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, USA
Joshua J. Levy & Louis J. Vaickus
The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA
Curtis L. Petersen & Erika L. Moen
Department of Life Sciences, University of New Hampshire, Manchester, NH, USA
Alexander J. Titus
Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
Erika L. Moen
Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
Lucas A. Salas & Brock C. Christensen
Department of Community and Family Medicine, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
Brock C. Christensen

Contributions

The conception and design of the study were contributed by JJL and BCC. Implementation, programming, data acquisition, and analyses were by JJL. All authors contributed toward refining the analytic plan and direction. All authors contributed to the writing and editing of the paper. CLP and JJL tested the pipeline.

Corresponding author

Correspondence to Joshua J. Levy.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Levy, J.J., Chen, Y., Azizgolshani, N. et al. MethylSPWNet and MethylCapsNet: Biologically Motivated Organization of DNAm Neural Networks, Inspired by Capsule Networks. npj Syst Biol Appl 7, 33 (2021). https://doi.org/10.1038/s41540-021-00193-7

Received: 17 September 2020
Accepted: 01 July 2021
Published: 20 August 2021
DOI: https://doi.org/10.1038/s41540-021-00193-7
