Looking backward in time to define the chronology of metastasis

The timing of cancer metastasis has implications for treatment and prevention. Traditional forward-time views of metastasis assume that it occurs late in tumor evolution. However, looking backward in time reveals that metastasis often occurs before clinical detection of the primary tumor.

Linear progression-major clone model. Genomic data from paired primary tumors and metastases are at odds with the minor clone hypothesis. In particular, if metastasis is seeded by a minor subclone that has acquired additional mutations, these alterations are unlikely to be detected by bulk sequencing of the primary tumor and will appear "metastasis-specific" in the sequencing data (Fig. 1c, scenario (I)). Detection of such low-frequency mutations is further hampered by the fact that typically only a small fraction of the total tumor volume is biopsied and subjected to sequencing. Consequently, if the minor clone model dominates, metastases should possess private driver mutations that are not detected in the paired primary tumor (Fig. 1c, scenario (I)). Instead, genomic data across a variety of cancer types have documented high driver gene concordance between primary tumors and metastases, with few metastasis-private driver mutations identified 1,5,6. For instance, in our recent study 6, we analyzed exome sequencing data from 457 paired primary tumor and metastatic samples from 136 patients with colorectal, lung, or breast cancer and found that 84%, 86%, and 59% of driver mutations, respectively, were shared between primary tumor and metastases in these common cancer types. Of note, the lower prevalence of shared driver mutations in breast cancer is associated with adjuvant treatment, which likely selects for drug-resistant mutations in a minor clone in the primary tumor, leading to the higher prevalence of metastasis-specific driver mutations in treated metastases. In this regard, metastasis-private mutations are not drivers of cancer spread but instead are associated with drug resistance. Taken together, the low genomic divergence between primary tumors and metastases across diverse epithelial tumors suggests that metastases commonly derive from a major clone in the primary tumor (Fig. 1a-c, scenario (II)).
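As an illustration of how driver concordance between a primary tumor and a paired metastasis might be tallied, the following is a minimal sketch and not the pipeline used in the cited studies. It assumes that each sample has already been reduced to a set of driver mutation labels and that "shared" is measured against the union of drivers seen in either sample; the function names and the gene labels in the example are hypothetical.

```python
def shared_driver_fraction(primary_drivers, metastasis_drivers):
    """Fraction of all observed drivers that are present in both samples.

    Assumption: the denominator is the union of drivers detected in either
    sample. Other denominators are possible and would give other values.
    """
    primary = set(primary_drivers)
    metastasis = set(metastasis_drivers)
    union = primary | metastasis
    if not union:
        return float("nan")  # no drivers detected in either sample
    return len(primary & metastasis) / len(union)


def metastasis_private_drivers(primary_drivers, metastasis_drivers):
    """Drivers detected in the metastasis but not in the primary tumor."""
    return set(metastasis_drivers) - set(primary_drivers)


# Hypothetical patient: three shared drivers, one metastasis-private driver.
primary = {"APC", "KRAS", "TP53"}
metastasis = {"APC", "KRAS", "TP53", "PIK3CA"}

fraction = shared_driver_fraction(primary, metastasis)
private = metastasis_private_drivers(primary, metastasis)
```

Under the major clone model, the shared fraction is expected to be high and the metastasis-private set small; under the minor clone model, the metastasis-private set would more often contain drivers that bulk sequencing of the primary tumor misses.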
The origin of metastasis from a major mutant clone in the primary tumor is in line with the clonal dominance hypothesis proposed by Kerbel et al. 7. This model assumes that once a metastatic subclone emerges within a primary tumor, cells from this subclone with high fitness will outcompete and dominate the tumor mass itself, resulting in genetic and phenotypic similarities between the primary tumor and metastasis. This hypothesis was originally derived from xenograft experiments using the integration of foreign DNA as a means to genetically tag diverse tumor cell populations 7. These studies suggest that the same cell lineage dominates both the primary site and metastasis, consistent with a competitive growth advantage for the metastatic lineage in the primary tumor. A recent study by Chen et al. 8 employed a CRISPR/Cas9 screen to identify metastatic drivers in mouse models. These data indicate that very few metastasis-competent lineages survive and ultimately dominate both the primary tumor and metastatic sites. The clonal dominance model is also consistent with early observations that gene expression profiling of primary tumors can predict distant recurrence 9.
Parallel progression. The parallel progression model assumes that metastatic seeding occurs early during tumorigenesis, typically before emergence of the cancerous founder cell (or the most recent common ancestor, MRCA) of the primary tumor 10. After dissemination, both the primary tumor and metastasis independently evolve and acquire additional driver mutations, leading to genomic divergence (Fig. 1a-c, scenario (III)). In summary, parallel progression is characterized by early metastatic spread, relatively short truncal branches, and the occurrence of private driver mutations in both primary and metastatic lineages. While this pattern has been observed in some studies (reviewed in ref. 10), primary tumor-private and metastasis-private driver mutations are rarely observed together in bulk genomic data from the same patient 1,6.

Backward-time models of metastatic seeding
The time from emergence of a cancerous founder cell to clinical detection of the primary tumor has been estimated to be on the order of a few years, typically less than ten 2,3,6. Although this window is much shorter than pre-malignant evolution, which can occur over decades (Fig. 1a), metastasis can still occur early relative to clinical detection. For instance, a metastasis might be seeded several years prior to diagnosis of the primary tumor. In other words, metastasis can occur late relative to tumor initiation, but early relative to clinical detection (Fig. 1b).
Early studies (reviewed in ref. 11) analyzed tumor doubling times across human cancers and metastases and extrapolated back using a mathematical model of tumor growth to estimate metastatic timing. These estimates suggest that metastasis occurs years before diagnosis of the primary tumor. Hanin et al. 12 developed a stochastic mathematical model of the sizes of multiple metastases in an individual patient, and estimated that metastasis in breast cancer can occur early, when the primary tumor is <1 mm in diameter. Intriguingly, these results suggest a long latency between dissemination and metastatic regrowth 12. Haeno et al. 13 used a similar mathematical framework to model metastatic burden estimates based on imaging from the time of diagnosis to death in pancreatic cancers obtained via rapid autopsy and reasoned that all primary cancers are likely to harbor metastatic clones at diagnosis.
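The back-extrapolation idea behind the doubling-time estimates can be sketched with simple arithmetic. The example below is a minimal illustration, not the model used in any of the cited studies: it assumes constant exponential growth with a fixed volume doubling time (ignoring the latency noted above), and it assumes roughly 10^9 cells per cm³ so that a single founder cell occupies about 10^-9 cm³. All numerical inputs are hypothetical.

```python
import math

DAYS_PER_YEAR = 365.25
# Assumption: about 1e9 tumor cells per cm^3, so one cell is roughly 1e-9 cm^3.
SINGLE_CELL_VOLUME_CM3 = 1e-9


def doublings_between(volume_start_cm3, volume_end_cm3):
    """Number of volume doublings needed to grow from start to end volume."""
    return math.log2(volume_end_cm3 / volume_start_cm3)


def years_to_grow(volume_start_cm3, volume_end_cm3, doubling_time_days):
    """Elapsed time, in years, under constant exponential growth."""
    doublings = doublings_between(volume_start_cm3, volume_end_cm3)
    return doublings * doubling_time_days / DAYS_PER_YEAR


# Hypothetical metastasis: 1 cm^3 at detection, 60-day volume doubling time.
metastasis_age_years = years_to_grow(
    SINGLE_CELL_VOLUME_CM3, volume_end_cm3=1.0, doubling_time_days=60.0
)
print(f"Estimated time since seeding: {metastasis_age_years:.1f} years")
```

Because growth from one cell to 1 cm³ takes about 30 doublings under these assumptions, even moderately fast doubling times place the seeding event years before the lesion becomes detectable, which is the qualitative point of the early doubling-time studies.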
Recent studies have performed phylogenetic analysis of genomic data to calibrate the timing of metastatic divergence, yielding genomic support for clinically early dissemination. For instance, Zhao et al. 14 reconstructed the phylogenies of primary tumors and matched metastases (n = 2-7 per patient) in 40 patients with 13 cancer types and found that in nearly 90% of patients, genetic divergence of the first metastatic lineage had occurred by the time of diagnosis of the primary tumor, while some lineages diverged years earlier. Noorani et al. 15 performed phylogenetic analysis of paired primary and metastatic samples in a cohort of esophageal adenocarcinomas and showed that in 90% of patients, multiple subclones spread from the primary site to form multiple metastases. Interestingly, the metastatic lineages exhibited a depletion of the clock-like mutational signature (predominantly C>T transitions in the NpCpG trinucleotide context), consistent with rapid spread after emergence of the malignant founder clone.
It should be noted, however, that the chronological time of metastatic dissemination cannot readily be resolved by phylogenetic inference, since the divergence time on the phylogenetic tree is not equivalent to the time of dissemination, when metastasis founder cells leave the primary tumor and seed the metastases 5. Additionally, phylogenetic approaches do not account for potential biases caused by limited tissue sampling, spatial structure within solid tumors, clonal selection, and mutation rate variation. To address these issues, we developed a spatial computational model of tumor growth and metastasis and leveraged a statistical inference framework to quantify the size (age) of the primary tumor at the time of metastatic dissemination 5. Applying this method to a cohort of paired primary colorectal cancers and metastases revealed early metastatic seeding when the primary tumor was small and clinically undetectable (typically <0.1 cm³). In a follow-up study, we analyzed genomic data from colorectal, breast, and lung cancers and estimated that metastatic dissemination commonly occurs 2-4 years prior to diagnosis of the primary tumor 6. Of note, adjuvant treatment was associated with the presence of private driver mutations in metastases and later metastatic seeding, suggesting that treatment selects for drug-resistant metastatic clones and can delay metastatic outgrowth. Taken together, accumulating lines of evidence suggest that metastatic spread can occur rapidly following the emergence of invasive malignant cells, often several years prior to the diagnosis of the primary tumor. In colorectal cancer, candidate biomarkers associated with early metastasis have been identified 5, and it will be important to further validate these findings as well as to define such associations in other tumor types. Additionally, it will be crucial to understand factors that prevent metastatic outgrowth, including the contribution of immune surveillance.

Conclusions
The traditional forward-time view of metastasis has favored the assumption that systemic spread is a late event in a tumor's lifespan. This is certainly the case if the clock is assumed to start at tumor initiation. Indeed, it is appreciated that tumorigenesis occurs over many years (potentially decades) through a multi-step process in which driver mutations can precede diagnosis of the primary tumor by years. At the same time, inference of metastatic timing from cancer genomic data suggests that metastatic seeding can occur years prior to clinical detection of the primary tumor. Genomic data further suggest that in many cancers metastases are commonly seeded by a major mutant clone in the primary tumor. These observations thus help to resolve the controversy regarding early versus late metastatic seeding and the attendant clinical implications.