Epigenetic mechanisms constitute an essential mode of gene regulation and act as an interface between environmental exposures, cellular response, and pathological processes. DNA methylation level and its location constitute important gene regulatory mechanisms.1, 2, 3, 4 Abnormal epigenetic marks, including DNA methylation alterations, are a hallmark of most human diseases. Importantly, epigenetic modifications are reversible, and represent potential targets for disease prevention and therapy.2, 3, 4, 5, 6 There are other epigenetic mechanisms of gene regulation, such as non-coding RNAs, including microRNAs.7, 8, 9, 10, 11, 12, 13 Gene expression levels are consequences of epigenetic regulation; however, there exists a challenge in accurate quantification of transcript levels in archival tissues.14 As most studies that have examined host exposures and epigenetic alterations used DNA methylation as a biomarker, our discussion of previous data mostly addresses DNA methylation.

Accumulating evidence suggests that epigenetic aberrations induced by environmental, dietary, lifestyle, and microbial factors contribute to specific disease processes.15, 16, 17, 18, 19, 20, 21, 22 To examine the complex relationships between etiological factors, molecular alterations, and disease evolution, ‘molecular pathology’ and ‘epidemiology’ have recently become integrated, generating the interdisciplinary field of ‘molecular pathological epidemiology (MPE)’.21, 22, 23

As clinical molecular pathology testing is becoming more and more common, we anticipate that molecular pathology data can be accumulated into disease registries around the world. This enables MPE to become routine epidemiology and pathology research practice. Thus, the role of pathologists as educators for epidemiologists will increase. We must emphasize the pivotal roles of modern pathologists in broader transdisciplinary biomedical and public health sciences, as well as in the clinical decision-making process.

In this article, we provide an overview of the MPE paradigm, and proceed to illustrate the contribution made by epigenetic research. While we exploit MPE data on neoplastic disorders, MPE approaches and paradigms can conceptually extend to the study of non-neoplastic diseases.

Tissue and Cellular Heterogeneity: Challenges in Epigenetic Research

In many non-neoplastic, non-hematological, non-dermatological diseases, access to diseased cells is limited by current technologies.24, 25, 26, 27 Even if tissues affected by a non-neoplastic disease (eg, inflamed liver) can be obtained, those tissues consist of many different cell types with varying epigenomes. Therefore, epigenetic analysis of non-neoplastic diseases faces a fundamental challenge of heterogeneity in tissue, cells, and epigenomes, which is often overlooked or underestimated in research studies and proposals (not limited to MPE).

Notably, the epigenome differs between specific cell types (even within a single organ). In a single cell, the epigenome changes, as the cell responds to the microenvironmental changes over time. A single organ (or tissue) consists of numerous cell types with different epigenomes.
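The consequence of this heterogeneity for bulk tissue analysis can be illustrated with a minimal Python sketch. It assumes, as a simplification, that the methylation level measured in a bulk sample at one CpG site is the average of the cell-type-specific levels weighted by cell-type proportions. The cell types, proportions, and methylation levels below are hypothetical and chosen only for illustration.

```python
def bulk_methylation(fractions, levels):
    """Return the fraction-weighted mean methylation at one CpG site.

    fractions: cell type -> proportion of cells in the sample (sums to 1)
    levels:    cell type -> methylation level (0 to 1) in that cell type
    """
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("cell-type fractions must sum to 1")
    return sum(fractions[cell] * levels[cell] for cell in fractions)


# Hypothetical methylation levels at one CpG site; identical in both samples.
levels = {"hepatocyte": 0.80, "lymphocyte": 0.20, "stellate_cell": 0.50}

# Hypothetical cell compositions of a normal and an inflamed liver sample.
normal_mix = {"hepatocyte": 0.80, "lymphocyte": 0.05, "stellate_cell": 0.15}
inflamed_mix = {"hepatocyte": 0.60, "lymphocyte": 0.30, "stellate_cell": 0.10}

normal_bulk = bulk_methylation(normal_mix, levels)
inflamed_bulk = bulk_methylation(inflamed_mix, levels)
print(f"normal: {normal_bulk:.3f}, inflamed: {inflamed_bulk:.3f}")
```

In this sketch the bulk measurement differs between the two samples although no cell type has changed its methylation level; the difference arises entirely from the shift in cell composition.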

A particular human disease process is caused by dysfunction of a specific cell type, or multiple cell types (in one organ or across multiple organ systems). Thus, the optimal approach is to analyze molecular changes in the afflicted cell types specific to the disease process. To study a psychiatric or neuronal disease, it would be best to analyze disordered neurons (within the context of local microenvironment), rather than blood leukocytes or brain tissue as a mixture of different cell types.28, 29, 30 Clearly, we must analyze specific cell types in each tissue in a particular microenvironmental and disease context, using techniques such as laser capture microdissection and flow cytometry.31, 32

Epigenomic differences between cell types may be present in a small part of the genome (with overall similar epigenomic status); however, those minor differences are critical in specific cell-type function and disease pathogenesis, and are unlikely to be inferred from examining the epigenomes of other cell types. Although many epigenetic studies rely on blood leukocytes as a surrogate for alterations in other diseased cell types,24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 35, 36 there is very little evidence supporting the validity of inferring epigenetic mechanisms of non-hematological disorders from epigenetic analysis of blood leukocytes.

In contrast to non-neoplastic diseases, neoplastic diseases are characterized by uncontrolled cellular proliferation, which can provide abundant amounts of diseased cells for epigenetic analyses. We nevertheless should note: (1) that a tumor consists of many different cell types (transformed neoplastic cells and various non-transformed cells, such as fibroblasts, endothelial cells, smooth muscle cells, and inflammatory cells), and (2) that, even within a single tumor, neoplastic cells are heterogeneous.37 While we should be aware of these caveats, neoplastic diseases still give opportunities to study epigenetic alterations in diseased cells, by providing a relatively enriched population of diseased cells.

Basic Characteristics of Disease: The Unique Disease Principle

Human diseases are typically very complex processes (Figure 1), involving alterations in epigenomes, transcriptomes, proteomes, metabolomes, microbiomes, and interactomes. Because each of us has a unique genome, and distinct combinations of exposome,38 epigenomes, transcriptomes, proteomes, and metabolomes in specific cell types, as well as unique microbiomes and interactomes in the tissue microenvironment, each disease process in each human must surely be unique, and distinct from what is nominally the same disease process in other individuals. This concept embodies the ‘unique disease principle’.

Figure 1

A variety of endogenous and exogenous etiological factors contribute to epigenetic changes leading to heterogeneity of disease processes, which is implicated by the ‘unique disease principle’. To simplify, only selected examples of those etiological factors are demonstrated. There are numerous interactions between the factors, which are not depicted for simplicity.

Measuring the contribution of each of the numerous molecular changes to disease pathogenesis will require an enormous number of functional and correlative studies using both systems pathobiology and MPE approaches. The increasing role of pathologists in such efforts cannot be overemphasized.

The ‘unique disease principle’ poses a challenge to epidemiological research, which is founded on the premise that we can predict disease occurrence and evolution, by inference from individuals with the disease with the same name. MPE, taking into account the unique disease principle, asserts that we can predict, to some extent, occurrence and evolution of a specific disease subtype by proper inference, which we will discuss further.

MPE: Integrative Science

MPE has evolved through the integration of molecular pathology and epidemiology.21, 22, 23 The MPE paradigm has widely been utilized in a number of original and review articles,39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62 while further conceptual development of MPE remains active.63, 64, 65, 66, 67

MPE differs from conventional molecular epidemiology. MPE addresses the fundamental heterogeneity of disease processes, while conventional molecular epidemiology generally treats a given disease as a single entity.65, 66 The conceptual framework of MPE resembles that of systems biology,68, 69 and MPE integrates analyses of populations and the macroenvironment, with those of molecules and microenvironment. Importantly, the MPE paradigm encompasses all human diseases.

The MPE research approach allows investigators to examine the relationships between potential etiological factors and disease subtypes based on molecular signatures. In addition, MPE permits the assessment of interactive effects of environmental influences and disease molecular signatures on disease progression.70, 71, 72, 73, 74 Thus, MPE research can provide insights into disease pathogenesis by demonstrating how specific etiological factors influence mechanistic pathways in disease evolution and progression.
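A minimal Python sketch of a subtype-specific association analysis follows. It assumes a simple unmatched case-control design with shared controls and no covariate adjustment. All counts, the exposure, and the subtype labels are hypothetical and do not come from any cited study. The sketch also shows that the ratio of the two subtype-specific odds ratios equals the case-case odds ratio, which is one common way to quantify etiological heterogeneity.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio for exposure, comparing cases with controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical counts: exposure status in controls and in two molecular subtypes.
controls = {"exposed": 300, "unexposed": 500}
subtype_a = {"exposed": 60, "unexposed": 40}    # e.g. marker-positive tumors
subtype_b = {"exposed": 90, "unexposed": 110}   # e.g. marker-negative tumors

# Subtype-specific odds ratios, each computed against the same controls.
or_a = odds_ratio(subtype_a["exposed"], subtype_a["unexposed"],
                  controls["exposed"], controls["unexposed"])
or_b = odds_ratio(subtype_b["exposed"], subtype_b["unexposed"],
                  controls["exposed"], controls["unexposed"])

# Heterogeneity between subtypes: ratio of odds ratios.
ratio_of_ors = or_a / or_b

# Case-case odds ratio: exposure odds in subtype A relative to subtype B.
case_case_or = (subtype_a["exposed"] * subtype_b["unexposed"]) / (
    subtype_a["unexposed"] * subtype_b["exposed"])

print(f"OR subtype A: {or_a:.2f}, OR subtype B: {or_b:.2f}")
print(f"ratio of ORs: {ratio_of_ors:.2f}, case-case OR: {case_case_or:.2f}")
```

In practice, such analyses would normally use regression models that adjust for confounders and provide confidence intervals; the sketch only illustrates the underlying comparison.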

MPE possesses key advantages over traditional epidemiological or pathological research. Firstly, relationships can be uncovered between specific etiological factors and molecular subtypes, supporting causality,21, 22 and etiological heterogeneity.67, 75 Secondly, the risk of developing a specific disease subtype can be more accurately estimated.21, 22 Thirdly, for individuals with susceptibility to a specific disease subtype or for patients with a specific disease subtype, personalized treatment and lifestyle modification strategies may be developed;21, 22 examples include aspirin use for PIK3CA-mutant colorectal cancer patients,73 and physical activity recommendations for CTNNB1-negative colorectal cancer patients.72 These advantages are possible only with an integrated MPE approach.

In the following sections, we describe how epigenetics has contributed to MPE, with an emphasis on major disease epigenotypes.

Disease Epigenotypes

While the unique disease principle emphasizes the individuality of each disease process, molecular disease classification attempts to identify commonality in disease features, subgroup disease based on these shared characteristics, and predict disease evolution, progression, and therapeutic response.64, 76 Epigenotyping can successfully classify cancers in various organs into distinct groups with different clinical, pathological, and molecular characteristics. Currently, clinical utility of epigenotyping remains limited compared with conventional pathological assessment. However, accumulating evidence indicates distinct molecular signatures and phenotypes associated with specific epigenotypes, which cannot be discerned by pathological examination alone. Therefore, epigenotyping and pathological assessments should complement each other in the future.

The CpG Island Methylator Phenotype

A specific tumor phenotype appears to exist that is characterized by the propensity of tumor cells to acquire widespread CpG island hypermethylation. This phenotype was named the CpG island methylator phenotype (CIMP), and was first described in colorectal cancer.77 The CIMP concept is important because it draws attention to the presence of an epigenome-wide driving force for CpG island hypermethylation. Therefore, when assessing CpG island methylation at a particular locus, CIMP must always be considered as a potential confounder (Figure 2).

Figure 2

Why is examining global molecular phenomena so important? (a) A typical study examining the relationship between gene X hypermethylation and disease Y, or response to treatment. Global molecular features, such as CpG island methylator phenotype (CIMP) status, are not often considered. (b) In reality, the significant relationship, if any, between gene X hypermethylation and disease, or treatment response, may reflect the association between CIMP and disease Y, or treatment response. CIMP status always needs to be considered as a potential confounder when examining locus-specific gene promoter hypermethylation and clinical outcome.
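The confounding scenario depicted in Figure 2 can be illustrated numerically. The following Python sketch uses entirely hypothetical counts, constructed so that gene X hypermethylation has no association with the outcome within either CIMP stratum. Because CIMP-high tumors in this example show both more frequent gene X hypermethylation and a more frequent outcome, the crude (unstratified) odds ratio nevertheless suggests an association. The Mantel-Haenszel estimator is used here as one standard way to pool stratum-specific odds ratios; it is our choice for illustration, not a method prescribed by the studies discussed.

```python
def odds_ratio(a, b, c, d):
    """a, b: outcome yes/no with gene X methylated; c, d: yes/no with gene X unmethylated."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata given as (a, b, c, d) tuples."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts per CIMP stratum: (a, b, c, d) as defined above.
cimp_high = (40, 40, 10, 10)      # gene X methylation common, outcome common
cimp_negative = (4, 16, 16, 64)   # gene X methylation rare, outcome rare
strata = [cimp_high, cimp_negative]

# Crude table: sum the strata cell by cell, ignoring CIMP status.
crude_table = tuple(sum(cells) for cells in zip(*strata))

crude_or = odds_ratio(*crude_table)
stratum_ors = [odds_ratio(*table) for table in strata]
adjusted_or = mantel_haenszel_or(strata)

print(f"crude OR: {crude_or:.2f}")
print("stratum-specific ORs:", [round(value, 2) for value in stratum_ors])
print(f"CIMP-adjusted (Mantel-Haenszel) OR: {adjusted_or:.2f}")
```

With these counts, the crude odds ratio is above 2, whereas the odds ratio is 1 within each CIMP stratum and after adjustment, which is the pattern described in panel (b).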

Normal differentiation state (including tissue of origin) likely influences epigenomic aberrations during neoplastic evolution.78 Although CIMP has been described in a variety of tumors,79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96 our discussion here focuses on colorectal cancer, where CIMP has been most extensively characterized.39, 44, 76, 97, 98, 99 The pathogenic basis of CIMP remains elusive, and CIMP may represent a multifactorial phenomenon.22 Although CIMP-high (high-level CIMP) is strongly associated with BRAF mutation in colorectal cancer,100, 101, 102, 103, 104 whether BRAF mutation causes CIMP remains uncertain.105, 106 DNA methyltransferase 3B (DNMT3B) overexpression has been implicated in CIMP-high.107, 108, 109, 110, 111, 112 CIMP-high colorectal cancer has been thought to arise from serrated precursor lesions such as sessile serrated polyp/adenoma.52, 113, 114, 115, 116, 117 In addition to CIMP-high, a third CIMP category (‘CIMP-low’) in colorectal cancers was found to be associated with KRAS mutations,118 which was confirmed by multiple studies.119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129 Features of CIMP-low colorectal cancers have been characterized.118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134 Prognostic and predictive roles of CIMP remain uncertain.39, 44, 76, 132, 135, 136, 137, 138, 139

CIMP-high is the cause of most colorectal cancers displaying high levels of microsatellite instability (MSI-high), which occur as a result of epigenetic inactivation of the mismatch repair gene MLH1.100, 101, 102, 140 CIMP-high in colorectal cancer is associated with older age, female sex, proximal colonic location,100, 103, 141, 142 poor tumor differentiation, mucinous and signet ring cell histology,143, 144, 145, 146 immune and lymphocytic reactions,144, 147, 148, 149 wild-type TP53,100, 150 PTGS2 (cyclooxygenase-2) negativity,150 CDKN1A (p21) expression,151 loss of CDKN1B (p27) expression,152 CTNNB1 (β-catenin) membrane localization,153 wild-type APC,154 SIRT1 expression,155 PTGER2 expression,156 high levels of LINE-1 methylation,103, 157 loss of CDX2 expression,158, 159 and low-level chromosomal instability.160, 161, 162, 163 Tumor invasiveness and budding phenotype are inversely associated with MSI-high, rather than CIMP-high.164, 165 Because BRAF mutation and MSI-high in colorectal cancer are associated with worse and better prognosis, respectively,121, 122, 123, 132, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182 it requires a large sample size and appropriate analyses to decipher prognostic significance of CIMP.132 CIMP-high is common in synchronous colorectal cancers (ie, multiple separate primary cancers in a single patient),183, 184, 185 indicating that colorectal epithelial cells may be predisposed to CpG island methylation due to genetic and/or environmental factors.183, 186

Despite the importance of CIMP phenomena, several caveats exist in CIMP research.44, 76, 187, 188, 189, 190 Firstly, there has been a relative lack of consensus and validation in CIMP analysis methodologies.44, 188 Validation of each component of analytical procedures is essential,191, 192, 193, 194 as is replication of study findings in independent studies.102, 103, 195 Secondly, most previous studies on CIMP in human cancer specimens relied on convenience cohorts196 with small sample sizes, risking spurious findings, weak statistical support, and limited generalizability.22, 23, 196, 197

LINE-1 Methylation Epigenotypes

Owing to the relative simplicity of the assay, methylation levels in the long interspersed nucleotide element-1 (LINE-1; also called long interspersed nuclear element-1; long interspersed element-1; L1) have commonly been used as a surrogate measurement of cellular global DNA methylation level.198, 199, 200 In addition to its role as a surrogate marker of global DNA hypomethylation, LINE-1 hypomethylation by itself may have functional implications.201, 202 Activation of LINE-1 retrotransposons may lead to the transcription of adjacent genes, gene disruptions, chromosomal instability, or the generation of transcripts that regulate gene expression.203, 204, 205, 206, 207, 208

LINE-1 methylation level in colorectal cancer is widely distributed.157, 209, 210 LINE-1 hypomethylation has been associated with poor outcome in several cancer types,211, 212, 213, 214, 215 including colon cancer.216, 217, 218 LINE-1 hypomethylation is common in esophageal squamous cell carcinoma,45 metastatic prostate cancer,219 and metastatic pancreatic endocrine tumors.220 Average LINE-1 methylation levels in colorectal tumors decline as tumors progress from adenomas to invasive cancers, to highly invasive cancers.111, 221 LINE-1 hypomethylation in colorectal cancer is associated with chromosomal instability,210, 222 IGF2BP3 overexpression,223 and hypomethylation at the IGF2 differentially methylated region-0,224 which may be a somatic event in carcinogenesis.225 LINE-1 hypomethylated colorectal cancers are associated with family history of colorectal cancer and younger age of onset.210, 226, 227, 228 Given that synchronous colorectal cancers show concordant LINE-1 hypomethylation patterns,183 there may be a predisposition to LINE-1 hypomethylation in colorectal epithelial cells.

Interplay of Genetic and Epigenetic Changes

When we consider epigenetic alterations, which may have importance in any stage of tumor development,229 we also need to consider genetic changes,230, 231, 232, 233, 234 which may be causes or consequences of epigenetic alterations. Genetic changes are increasingly studied on routine formalin-fixed, paraffin-embedded archival tissue using next-generation sequencing technologies,235 which is opening up opportunities for pathology research.

Epigenetic silencing of the DNA repair gene MGMT is implicated in somatic G>A mutations in various genes, including KRAS, PIK3CA, APC, and TP53.120, 236, 237, 238, 239, 240, 241, 242 Furthermore, CIMP-high in colorectal cancer is causally linked to MSI through epigenetic inactivation of mismatch repair gene MLH1,100, 101, 102, 140 and then, MSI increases the rate of genome-wide mutational events,104, 243, 244, 245, 246 and influences cell cycle regulators.150, 151, 152, 247, 248, 249 There are causally uncertain relations between BRAF mutations and CIMP-high,100, 101, 102, 103 and between KRAS mutations and CIMP-low.118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128

MPE of Etiologies and Epigenetics

Traditional epidemiology research has uncovered lifestyle, dietary, and environmental exposures that are positively or negatively associated with disease risk. In terms of cancer risk, these include smoking, some nutrients, alcohol consumption, energy metabolism status (energetics), aspirin use, hormone therapy, and some infectious/inflammatory conditions. However, how these exposures influence disease pathogenesis generally remains poorly understood. Lifestyle, dietary, and environmental factors likely influence the pathogenic process by altering the local tissue microenvironment, and epigenetics has a key role in the cellular response to microenvironmental change.

Various studies have adopted an MPE design to address the roles of potential etiological factors. With the advent of technologies that enable analysis of genome-wide DNA methylation targets in archival tissue,250, 251, 252 we expect enormous opportunities to investigate environmental influences on somatic epigenetic alterations. Although this work is currently in its infancy, we highlight some examples, and the potential insights yielded, in the following sections.

MPE of Cigarette Smoking and Epigenetics

Cigarette smoking has been associated with CIMP-high,253, 254, 255, 256 MSI-high,253, 254, 255, 256, 257, 258, 259, 260 and BRAF-mutant subtypes253, 254, 255, 256, 261 of colorectal cancer. Duration of smoking cessation is associated with a reduction of risk for CIMP-high colorectal cancer.256 Similar to colorectal cancer, smoking has been associated with hypermethylation of specific genes262, 263 and CIMP81 in lung cancer.

MPE of One-Carbon Metabolism and Epigenetics

The methyl (CH3-) groups for DNA methylation are derived from one-carbon methyl donors, suggesting an intrinsic link between one-carbon nutrients and epigenetic alterations. However, the relationship between one-carbon nutrients and somatic molecular alterations appears complex.16 In most epidemiological studies, low folate intake has been associated with an increased risk of colorectal cancer and adenomas.264 However, there have been concerns that supplementation with folic acid may have tumor-promoting effects.264, 265, 266 In mouse models, folate supplementation may promote epigenomic and microbiomic changes, and intestinal tumor formation.267, 268 Examining molecular changes in tumor cells in relation to folate intake may provide insights into the role of one-carbon metabolism in carcinogenesis.22

Altered levels of intracellular folate metabolites have been linked to aberrant DNA methylation patterns.269, 270 The relationship between folate/alcohol intake and aberrant promoter hypermethylation in colorectal cancer remains unclear.124, 271, 272, 273, 274, 275, 276, 277 The MTHFR rs1801131 polymorphism (codon 429) may (or may not124, 278) be associated with CIMP-high cancer.279, 280

With regard to LINE-1, or global DNA methylation level, experimental evidence suggests a link between folate deficiency and global DNA hypomethylation in the colonic epithelium.281 Folate supplementation increases global DNA methylation levels in glioma and colon cancer cells.282, 283 Folate deficiency (or excess alcohol consumption) has been associated with increased risk of TP53-mutated,284 and LINE-1-hypomethylated colon cancers.285 Randomized trials suggest that folic acid supplementation may (or may not286) increase global DNA methylation levels in normal colonic mucosa.287 Collectively, there is suggestive evidence of a link between one-carbon nutrients and global DNA (LINE-1) hypomethylation, which may lead to carcinogenesis.

MPE of Energetics and Epigenetics

Energetics has been implicated in metabolic diseases and cancers.288, 289, 290, 291, 292, 293 Nonetheless, analyses of potential links between energetics and epigenetic alterations in specific diseased cells are in their infancy.

Evidence suggests that caloric restriction in early life is associated with a lower risk of CIMP-high colorectal cancer.294 In contrast, obesity in adults has been associated with non-MSI colorectal cancer,49, 295, 296, 297 and CIMP-low/negative colorectal cancer.274 A recent prospective study has shown that obesity is associated with an increased risk of fatty acid synthase (FASN)-negative colorectal cancer.298 FASN overexpression has been implicated in carcinogenesis,299, 300 and associated with MSI-high colorectal cancer.301 The relationship between obesity and non-MSI (or CIMP-low/negative) cancer might be explained by the link between obesity and FASN-negative tumors.21

Energetic status has been shown to interact with tumor molecular signatures to modify the behavior of colorectal cancer. Such tumor markers include expression of CTNNB1 (β-catenin),72 FASN,70 PRKAA (AMP-activated protein kinase),302 STMN1,303 CDKN1A (p21),304 and CDKN1B (p27).305, 306 This type of interaction analysis represents an emerging paradigm in MPE,21, 22 and may inform the design of new clinical trials to assess lifestyle or pharmacological interventions.
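The logic of such an interaction analysis can be sketched in Python as follows. The sketch assumes a simplified setting in which the outcome is death within a fixed follow-up period, the exposure is physical activity (active versus inactive), and patients are stratified by a binary tumor marker. All counts are hypothetical. A full analysis would normally use a regression model with an exposure-by-marker interaction term and adjustment for covariates; here, a Wald test on the difference of log risk ratios stands in for that model.

```python
import math


def log_risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Return the log risk ratio and its approximate (large-sample) variance."""
    risk_ratio = (events_exposed / n_exposed) / (events_unexposed / n_unexposed)
    variance = (1 / events_exposed - 1 / n_exposed
                + 1 / events_unexposed - 1 / n_unexposed)
    return math.log(risk_ratio), variance


# Hypothetical counts: deaths / patients among active and inactive patients.
log_rr_neg, var_neg = log_risk_ratio(10, 100, 30, 100)   # marker-negative tumors
log_rr_pos, var_pos = log_risk_ratio(20, 100, 20, 100)   # marker-positive tumors

rr_neg = math.exp(log_rr_neg)
rr_pos = math.exp(log_rr_pos)

# Multiplicative interaction: do the two risk ratios differ?
z = (log_rr_neg - log_rr_pos) / math.sqrt(var_neg + var_pos)
p_interaction = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value

print(f"RR (marker-negative): {rr_neg:.2f}, RR (marker-positive): {rr_pos:.2f}")
print(f"z = {z:.2f}, two-sided p for interaction = {p_interaction:.3f}")
```

In this hypothetical example, physical activity is associated with lower mortality only among patients with marker-negative tumors, and the interaction test asks whether that difference between strata is larger than expected by chance.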

Endogenous and Exogenous Hormones, and Epigenetics

Hormone therapy has been linked to lower risk of certain cancers, but epigenetic and other molecular mechanisms remain poorly understood. In breast cancer, ESR1 (estrogen receptor 1) and PGR (progesterone receptor) expression status has been associated with methylome alteration patterns.307 Hormone therapy has been associated with ESR1 and PGR promoter methylation in colorectal cancer.308 However, recent prospective cohort studies failed to demonstrate a clear relationship between hormone therapy and CIMP status.47, 48, 309 Hormone therapy may be associated with a decreased risk of colorectal cancer lacking CDKN1A (p21) expression.309 There are possible interrelations between the vitamin D pathway, the RAS-PI3K-AKT pathway, and epigenetic modulations in colorectal cancer.310, 311

Microbiota, Inflammation, Immunity, and Epigenetics

Microorganisms, such as viruses, bacteria, and parasites, have been increasingly implicated in human health and chronic disease.19, 20, 312 Recently, the interplay of microenvironment, microbiota, immunity, inflammation, cellular epigenetic alterations, and various chronic diseases has attracted increasing attention.15, 17, 18, 19, 20, 23, 59, 313, 314, 315, 316 Helicobacter pylori infection has been associated with epigenetic changes in gastric epithelial cells,317, 318 and Enterobacteriaceae and Tenericutes have been associated with epigenetic changes in head and neck squamous cell carcinoma.319

The role of viral infection in epigenetic changes has been extensively reviewed elsewhere.320 Evidence indicates roles of HBV, HPV, and EBV in epigenetic alterations and carcinogenesis.320, 321, 322, 323 In colorectal cancer, a role of JCV in epigenetic alterations has been controversial.324, 325, 326

Inflammation appears to have a crucial role in carcinogenesis,327, 328, 329 and has been linked to energetics,330, 331 and epigenetics.332 Cellular epigenetic changes may be induced by inflammation and associated oxidative damage,333, 334, 335 while cellular epigenetic aberrations may cause inflammatory diseases.336 A study has demonstrated that the inflammatory mediator, prostaglandin E2, upregulates DNMT3B, resulting in promoter CpG island hypermethylation and promotion of intestinal tumorigenesis in mice.335 Regular use of anti-inflammatory drugs such as aspirin, an inhibitor of PTGS2 (cyclooxygenase 2), has been associated with a decrease in cancer incidence and mortality.57, 329 The cancer-preventive effect of aspirin is apparent against PTGS2-positive colorectal cancer.71, 337, 338 Moreover, aspirin appears to be very effective in treating PIK3CA-mutated colorectal cancer, suggesting PIK3CA mutation as a predictive tumor biomarker for clinical use.73

The importance of tumor–host interactions, encompassing microbiota and inflammation, has been highlighted by the recent discovery of a continuum in the frequency of molecular features (including CIMP-high, MSI-high, and BRAF mutation) in colorectal cancers along subsites in the proximal–distal axis of the bowel.142, 339 Luminal microbial contents and immune infiltrates appear to change gradually along the bowel.142, 339, 340, 341 Taken together, local host and environmental factors, such as luminal contents, microbiome, inflammation, and the innate immune response, likely contribute to the development of specific molecular subtypes of colorectal cancer.142

Implications of MPE in Disease Prevention and Therapy

Despite the ‘uniqueness’ of each disease process, molecular disease classification exploits shared molecular features of disease processes in multiple patients, on the premise that disease evolution and progression can be, to some extent, generalized to other patients with the same molecular subtype, and that appropriate (‘personalized’ or ‘precise’) treatment measures can be initiated for the patients with the particular disease subtype.64, 76 Essentially, pathology testing can guide treatment decision-making. This is relevant not only to pharmacological and immunological interventions23, 73 but also to lifestyle modifications.72, 306

With regard to disease prevention, MPE research may identify risk factors for a specific disease subtype. For individuals who are susceptible to the specific disease subtype, appropriate preventive measures (such as avoiding the identified risk factors) can be taken, or early detection can be attempted. For example, genetic susceptibility and familial clustering have been suggested for colorectal cancer with LINE-1 hypomethylation,210, 226, 227, 228 which is an aggressive subtype,216, 217, 218 but may be preventable by adequate folate intake and avoidance of alcohol.285

Analysis of Normal Tissue, Stool, Blood, or Other Body Fluids in MPE Context

To date, most epigenetic studies on non-neoplastic diseases have relied on blood leukocytes as a surrogate for molecular processes in diseased cells,24, 25, 26, 27, 29, 30, 33, 34 although there is very little evidence supporting the validity of this approach in non-hematological diseases. Thus, the following discussion focuses on epigenetic or other molecular analyses of normal tissues or blood in the context of the MPE of cancer epigenetics. The ability to detect cancer or estimate cancer risk from normal tissue, stool, peripheral blood, or other body fluids (such as sputum and urine) has become the holy grail of biomarker discovery.342 Biomarkers in normal tissue, stool, or body fluids can represent: (1) a pathological outcome, analogous to established serum tumor markers; (2) a surrogate or shared indicator of etiological exposure and disease predisposition; or (3) an intermediary in a causal pathway from etiology to downstream outcome (cancer incidence or behavior). Analysis of normal tissue, stool, and body fluids can expand the scope of, and add a novel dimension to, MPE research (Figure 3).

Figure 3

Biomarker analysis in normal tissue, stool, blood, and body fluids adds new dimensions to molecular pathological epidemiology (MPE) research. Analyses of interactions among etiological factors and biomarkers can be performed to test a specific research hypothesis.22, 23, 72, 73

Bearing in mind the limitations of epigenetic and other molecular analyses on plasma or peripheral blood leukocytes, blood can conceivably carry a pathological molecular signature, or diseased cells or cellular constituents, from any part of the body (eg, bone marrow343). As a result of its abundance, ease of specimen collection, and practicality as a future clinical test, blood has been a common specimen type for studies on epigenetic changes in ‘normal’ cells. Global DNA or LINE-1 methylation in leukocytes has attracted much interest as a potential cancer biomarker.344 Alterations in leukocyte DNA methylation have been associated with risk for a variety of solid tumors, including colon, bladder, stomach, and breast cancers.344, 345, 346, 347 LINE-1 methylation level and other epigenetic changes in leukocytes appear to be influenced by a variety of exposures, including smoking and early-life or prenatal events.27, 345, 346, 347, 348 LINE-1 methylation in leukocytes and normal tissues is less variable than in tumors, and may not correlate with tumor LINE-1 methylation levels.194, 209, 210, 263, 349

The study of interrelations between epidemiological exposures, molecular changes in normal tissue or biological specimens, and cancer and other chronic diseases is an evolving field.350, 351 Notably, several caveats must be applied to inferences drawn from study findings to date. Most studies have been relatively small and cross-sectional or retrospective in design, and replication in larger prospective studies is required. The type of assay employed and specific cell type analyzed appear to influence critically the determination of cellular DNA methylation levels.346, 347, 348, 349, 351, 352, 353, 354 There is a relative lack of uniformity and robust validation data across laboratories in cell or nucleic-acid isolation protocols, downstream processing, and assays. It remains a challenge to define exact biological mechanisms to account for the associations between epigenetic alterations in normal blood and pathobiological changes in specific cell types (eg, breast duct epithelial cells that give rise to neoplasia). This has limited the potential for insights into disease pathogenesis and causality. Although blood biomarkers have the capability to reflect a disease process at a distant site in the body, epigenetic biomarkers, such as global DNA methylation in leukocytes, are often rather nonspecific, and may be associated with a variety of different cancer types, as well as non-malignant conditions.27, 345, 346, 347

Role of MPE in Post-GWAS Era

Over the past decade, genome-wide association studies (GWAS) have identified many germline genetic variants associated with numerous multifactorial diseases,355 and next steps such as fine locus mapping, gene–environment interaction analysis, family history analysis, and population structure analysis have been initiated.38, 355, 356, 357, 358, 359, 360, 361 However, GWAS findings have had virtually no impact on clinical medicine and public health. Major shortcomings of existing GWAS approaches include insufficient consideration of disease heterogeneity,22 and relative lack of follow-up functional analyses of risk variants.362, 363

Recently, the ‘GWAS-MPE approach’ was proposed22 to take disease heterogeneity into account following GWAS analyses. In a typical GWAS design, a disease of interest is regarded as a single entity without consideration of etiological and biological heterogeneity. Epigenetic analysis of diseased cells can provide ample opportunities to study the biological significance of GWAS findings. By employing the MPE approach, molecular disease classification can help to identify a specific disease subtype that is more strongly associated with a given risk variant than other subtypes of the same disease. The ‘GWAS-MPE approach’22 has been applied in epidemiological research,364, 365, 366 and has several advantages: (1) it may provide a possible causal link between the risk variant and molecular signatures in diseased cells; (2) it can refine risk estimates for each molecular subtype; and (3) it may identify new variant–subtype relationships, which may otherwise be obscured in conventional GWAS that deal only with overall disease risk.
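
The idea that an overall risk estimate can obscure a subtype-specific association can be illustrated with a minimal sketch. The counts below are hypothetical and chosen only for illustration; the names `odds_ratio`, `subtype_A`, and `subtype_B` are assumptions introduced here, and the simple 2x2 odds ratio with a Wald interval stands in for whatever regression model an actual study would use.

```python
import math


def odds_ratio(carrier_cases, noncarrier_cases, carrier_controls, noncarrier_controls):
    """Return (odds ratio, lower, upper) with a 95% Wald interval for a 2x2 table."""
    estimate = (carrier_cases * noncarrier_controls) / (noncarrier_cases * carrier_controls)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(
        1 / carrier_cases + 1 / noncarrier_cases
        + 1 / carrier_controls + 1 / noncarrier_controls
    )
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


# Hypothetical counts (illustration only, not data from any study).
controls = {"carriers": 300, "noncarriers": 700}
cases_by_subtype = {
    "subtype_A": {"carriers": 90, "noncarriers": 110},   # e.g. marker-positive tumors
    "subtype_B": {"carriers": 130, "noncarriers": 270},  # e.g. marker-negative tumors
}

# Conventional analysis: all cases pooled into a single disease entity.
overall_cases = {
    key: sum(counts[key] for counts in cases_by_subtype.values())
    for key in ("carriers", "noncarriers")
}
overall_or = odds_ratio(
    overall_cases["carriers"], overall_cases["noncarriers"],
    controls["carriers"], controls["noncarriers"],
)

# Subtype-stratified analysis: each molecular subtype against the same controls.
subtype_or = {
    name: odds_ratio(
        counts["carriers"], counts["noncarriers"],
        controls["carriers"], controls["noncarriers"],
    )
    for name, counts in cases_by_subtype.items()
}

# Case-case comparison: does carrier status differ between the two subtypes?
a, b = cases_by_subtype["subtype_A"], cases_by_subtype["subtype_B"]
heterogeneity_or = odds_ratio(a["carriers"], a["noncarriers"], b["carriers"], b["noncarriers"])

print(f"overall:    OR={overall_or[0]:.2f} ({overall_or[1]:.2f}-{overall_or[2]:.2f})")
for name, (est, lo, hi) in subtype_or.items():
    print(f"{name}:  OR={est:.2f} ({lo:.2f}-{hi:.2f})")
print(f"case-case:  OR={heterogeneity_or[0]:.2f} "
      f"({heterogeneity_or[1]:.2f}-{heterogeneity_or[2]:.2f})")
```

With these hypothetical counts, the pooled odds ratio is about 1.35, whereas the stratified estimates are about 1.91 for subtype A and 1.12 for subtype B; the pooled figure is an average that understates the association in one subtype and overstates it in the other. The case-case odds ratio (about 1.70) equals the ratio of the two subtype-specific estimates and is one simple way to express heterogeneity between subtypes.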

Problems in Epigenome-Wide Association Study

Following a decade of flourishing GWAS research, epigenome-wide association studies are attracting increasing attention.367 However, we must be aware of significant flaws and caveats associated with current epigenome-wide association study design.368 In essence, each human being possesses innumerable different epigenomes. Epigenomes differ between cell types even within a single organ (which consists of numerous different cell types). Even in a single cell, the epigenome changes over time in response to a dynamic, time-varying macro- and microenvironmental milieu. Despite the presence of numerous epigenomes in each individual, current epigenome-wide association study design assumes one representative epigenome for each individual. Furthermore, inferring epigenomic variants in specific non-leukocyte cells from epigenomic analysis of leukocytes is unlikely to be a fully valid approach.
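
One consequence of cell-type-specific epigenomes can be sketched numerically. The example below assumes, as a simplification, that the methylation level measured in a mixed specimen at a single CpG site is the proportion-weighted average of the levels in its constituent cell types. The function name `bulk_methylation`, the two cell types, and all numbers are hypothetical and serve only as an illustration.

```python
def bulk_methylation(proportions, methylation_by_cell_type):
    """Proportion-weighted average methylation at one CpG site in a mixed specimen."""
    total = sum(proportions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("cell-type proportions must sum to 1")
    return sum(
        proportions[cell_type] * methylation_by_cell_type[cell_type]
        for cell_type in proportions
    )


# Hypothetical methylation fractions at one CpG site, identical in both individuals.
methylation_by_cell_type = {"neutrophil": 0.20, "lymphocyte": 0.80}

# Hypothetical leukocyte compositions that differ between two individuals.
individual_1 = {"neutrophil": 0.60, "lymphocyte": 0.40}
individual_2 = {"neutrophil": 0.75, "lymphocyte": 0.25}

bulk_1 = bulk_methylation(individual_1, methylation_by_cell_type)
bulk_2 = bulk_methylation(individual_2, methylation_by_cell_type)

print(f"individual 1 bulk methylation: {bulk_1:.2f}")
print(f"individual 2 bulk methylation: {bulk_2:.2f}")
```

In this sketch the two individuals have identical methylation within each cell type, yet their bulk values differ (0.44 versus 0.35) solely because their cell-type mixtures differ. A single bulk measurement therefore cannot distinguish a change in the epigenome of a given cell type from a change in specimen composition.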

At this juncture, we must be prudent, and first develop the required technologies and biosensors capable of interrogating epigenomic variations and interactomes in different cell types (preferably, in vivo) in one human being. This is a prerequisite for launching very expensive, resource-intensive epigenome-wide association study consortia, which at present erroneously assume a single epigenome to be representative of all epigenomes in an individual.

Conclusions and Future Perspectives

Achieving the goal of personalized medicine and prevention requires the integration of molecular medicine and population health sciences, and the willingness to explore beyond conventional disease classification. Personalized medicine holds the promise of biomarkers that will help stratify patients and guide decisions on optimal disease treatment and prevention.369 However, there is an increasing gap between basic scientific discoveries and real impact on population health.370, 371 Recently, MPE has emerged as an evolving transdisciplinary science that can help fill this gap.21, 22, 65, 66 MPE integrates molecular pathology and epidemiology, in an attempt to decipher disease at the molecular, cellular, organ, individual, and population levels. The application of molecular pathology is feasible in existing cohort studies with large amounts of accumulated multidimensional data on dietary, lifestyle and environmental exposures, and clinical outcomes. This represents a very cost-efficient research approach to advance our understanding of disease and improve medicine and public health.49, 63, 372, 373, 374, 375 Advancing this integrated science requires the cooperation of all practicing pathologists, because it is necessary to gather tissue specimens (from various community and academic pathology laboratories, to minimize selection bias) within well-defined cohort populations for molecular analyses.

Most chronic diseases are complex, multifactorial, genetic, and epigenetic diseases. Epigenetic research is promising because epigenetic mechanisms have critical roles in the regulation of cellular growth, differentiation, and behavior, and epigenetic changes are potentially modifiable targets for therapy and chemoprevention. Analysis of normal tissue and biological specimens adds further dimensions to MPE research, though we must keep in mind the caveats of analyzing those specimens. Epigenetic analyses continue to contribute substantially to biomedical and population health sciences. The future widespread application of methylome and epigenome analyses250, 376, 377, 378 to paraffin-embedded archival tissues represents a powerful investigative tool, capable of enhancing our understanding of disease heterogeneity and host–disease interactions. In the future, application of in vivo real-time molecular pathology, encompassing genomic, epigenomic, transcriptomic, proteomic, metabolomic, microbiomic, and interactomic analyses, will further transform biomedical and population health sciences.

Ultimately, molecular disease classification should be the primary pathophenotypic datum of entry in population registries and databases around the world; this will further advance integrated population health science. Modern pathology has made, and will continue to make, an increasing contribution to broader public health sciences, which attests to the pivotal roles of pathologists in the integrated science directed towards our ultimate goal of personalized medicine and prevention.

Note added in proof

Spitz et al379 have used the term ‘integrative epidemiology’ to describe the integration of molecular analyses (of exposures and tumors) into epidemiology. Integrative epidemiology encompasses MPE and conventional molecular epidemiology. MPE differs from conventional epidemiology because MPE takes disease heterogeneity into account in its analyses.