Special 05 February 2020

Pan-Cancer Analysis of Whole Genomes

Cancer is a disease of the genome, caused by a cell's acquisition of somatic mutations in key cancer genes. These mutations alter pathways involved in regulating cellular growth and interactions with the tissue environment. Until recently, research on the cancer genome was focused on protein-coding genes, which together account for only 1% of the genome. To address this gap, the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Project performed whole-genome sequencing and integrative analysis on over 2,600 primary cancers and their matching normal tissues across 38 distinct tumor types. This study revealed the extensive role played by large-scale structural mutations in cancer, identified previously unknown cancer-related mutations in gene regulatory regions, inferred tumor evolution across multiple cancer types, illuminated the interactions between somatic mutations and the transcriptome, and studied the role of germline genetic variants in modulating mutational processes. This collection comprises papers describing the core set of analyses conducted by the PCAWG Consortium, and showcases data, tools, and other resources useful for those who seek to further explore this legacy data set.

Browse the PCAWG publications and associated content, including News and Views, Comment, and Nature editorial. This dedicated collection compiles the PCAWG datasets, other resources and community-generated content.

Research

Pan-cancer analysis of whole genomes

The flagship paper of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium describes how the integrative analyses of 2,658 cancer whole genomes and their matching normal tissues across 38 tumour types were generated, the structures for international data sharing and standardized analyses, and the main scientific findings from across the consortium studies.
- The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium
Article | Open Access | 5 Feb 2020 | Nature

Patterns of somatic structural variation in human cancer genomes

Whole-genome sequencing data from more than 2,500 cancers of 38 tumour types reveal 16 signatures that can be used to classify somatic structural variants, highlighting the diversity of genomic rearrangements in cancer.
- Yilong Li
- Nicola D. Roberts
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature

The repertoire of mutational signatures in human cancer

The characterization of 4,645 whole-genome and 19,184 exome sequences, covering most types of cancer, identifies 81 single-base substitution, doublet-base substitution and small-insertion-and-deletion mutational signatures, providing a systematic overview of the mutational processes that contribute to cancer development.
- Ludmil B. Alexandrov
- Jaegil Kim
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature

The evolutionary history of 2,658 cancers

Whole-genome sequencing data for 2,778 cancer samples from 2,658 unique donors across 38 cancer types is used to reconstruct the evolutionary history of cancer, revealing that driver mutations can precede diagnosis by several years to decades.
- Moritz Gerstung
- Clemency Jolly
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature

Genomic basis for RNA alterations in cancer

Integrative analyses of transcriptome and whole-genome sequencing data for 1,188 tumours across 27 types of cancer are used to provide a comprehensive catalogue of RNA-level alterations in cancer.
- PCAWG Transcriptome Core Group
- Claudia Calabrese
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature

Analyses of non-coding somatic drivers in 2,658 cancer whole genomes

Analyses of 2,658 whole genomes across 38 types of cancer identify the contribution of non-coding point mutations and structural variants to driving cancer.
- Esther Rheinbay
- Morten Muhlig Nielsen
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature

Comprehensive molecular characterization of mitochondrial genomes in human cancers

Analysis of mitochondrial genomes (mtDNA) by using whole-genome sequencing data from 2,658 cancer samples across 38 cancer types identifies hypermutated mtDNA cases, frequent somatic nuclear transfer of mtDNA and high variability of mtDNA copy number in many cancers.
- Yuan Yuan
- Young Seok Ju
- PCAWG Consortium
Analysis | Open Access | 5 Feb 2020 | Nature Genetics

Disruption of chromatin folding domains by somatic genomic rearrangements in human cancer

A pan-cancer genomic analysis reports the effects of structural variations on chromatin domains (TADs). Most TAD disruptions do not result in appreciable changes in expression of nearby genes.
- Kadir C. Akdemir
- Victoria T. Le
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature Genetics

Pan-cancer analysis of whole genomes identifies driver rearrangements promoted by LINE-1 retrotransposition

An analysis of 2,954 genomes from 38 cancer subtypes identified 19,166 retrotransposition events in 35% of samples. Aberrant LINE-1 retrotranspositions can lead to the deletion of tumor-suppressor genes as well as the amplification of oncogenes.
- Bernardo Rodriguez-Martin
- Eva G. Alvarez
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature Genetics

The landscape of viral associations in human cancers

Viral pathogen load in cancer genomes is estimated through analysis of sequencing data from 2,656 tumors across 35 cancer types using multiple pathogen-detection pipelines, identifying viruses in 382 genomic and 68 transcriptome datasets.
- Marc Zapatka
- Ivan Borozan
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature Genetics

Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing

Analysis of whole-genome sequencing data across 2,658 tumors spanning 38 cancer types shows that chromothripsis is pervasive, with a frequency of more than 50% in several cancer types, contributing to oncogene amplification, gene inactivation and cancer genome evolution.
- Isidro Cortés-Ciriano
- Jake June-Koo Lee
- PCAWG Consortium
Analysis | Open Access | 5 Feb 2020 | Nature Genetics

Butler enables rapid cloud-based analysis of thousands of human genomes

Efficient, large-scale genomic analysis is facilitated on the cloud by a computational tool with error-diagnosing and self-healing capabilities.
- Sergei Yakneen
- Sebastian M. Waszak
- PCAWG Consortium
Brief Communication | Open Access | 5 Feb 2020 | Nature Biotechnology

Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis

Joana Carlevaro-Fita, Andrés Lanzós et al. present the Cancer LncRNA Census (CLC), a manually curated dataset of 122 long noncoding RNAs (lncRNAs) with experimentally validated functions in cancer based on data from the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. CLC lncRNAs have unique gene features, and a number display evidence for cancer-driving functions that are conserved from humans to mice.
- Joana Carlevaro-Fita
- Andrés Lanzós
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Communications Biology

Integrative pathway enrichment analysis of multivariate omics data

Multi-omics datasets pose major challenges to data interpretation and hypothesis generation owing to their high-dimensional molecular profiles. Here, the authors develop the ActivePathways method, which uses data fusion techniques for integrative pathway analysis of multi-omics data and candidate gene discovery.
- Marta Paczkowska
- Jonathan Barenboim
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature Communications

Pathway and network analysis of more than 2500 whole cancer genomes

Understanding deregulation of biological pathways in cancer can provide insight into disease etiology and potential therapies. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, the authors present pathway and network analysis of 2583 whole cancer genomes from 27 tumour types.
- Matthew A. Reyna
- David Haan
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature Communications

A deep learning system accurately classifies primary and metastatic cancers using passenger mutation patterns

Some cancer patients first present with metastases where the location of the primary is unidentified; these are difficult to treat. In this study, using machine learning, the authors develop a method to determine the tissue of origin of a cancer based on whole-genome sequencing data.
- Wei Jiao
- Gurnit Atwal
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature Communications

High-coverage whole-genome analysis of 1220 cancers reveals hundreds of genes deregulated by rearrangement-mediated cis-regulatory alterations

In this study, the authors consider the structural variants (SVs) present within cancer cases of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. They report hundreds of genes, including known cancer-associated genes, for which the nearby presence of an SV breakpoint is associated with altered expression.
- Yiqun Zhang
- Fengju Chen
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature Communications

Genomic footprints of activated telomere maintenance mechanisms in cancer

In somatic cells the mechanisms maintaining the chromosome ends are normally inactivated; however, cancer cells can re-activate these pathways to support continuous growth. Here, the authors characterize the telomeric landscapes across tumour types and identify genomic alterations associated with different telomere maintenance mechanisms.
- Lina Sieverling
- Chen Hong
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature Communications

Combined burden and functional impact tests for cancer driver discovery using DriverPower

Analysis of cancer genome sequencing data has enabled the discovery of driver mutations. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, the authors present DriverPower, a software package that identifies coding and non-coding driver mutations within cancer whole genomes via consideration of mutational burden and functional impact evidence.
- Shimin Shuai
- PCAWG Drivers and Functional Interpretation Working Group
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature Communications

Inferring structural variant cancer cell fraction

The authors present SVclone, a computational method for inferring the cancer cell fraction of structural variants from whole-genome sequencing data.
- Marek Cmero
- Ke Yuan
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature Communications

Divergent mutational processes distinguish hypoxic and normoxic tumours

Many tumours exhibit hypoxia (low oxygen), and hypoxic tumours often respond poorly to therapy. Here, the authors quantify hypoxia in 1188 tumours from 27 cancer types, showing that elevated hypoxia is linked to increased mutational load and directs evolutionary trajectories.
- Vinayak Bhandari
- Constance H. Li
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature Communications

Reconstructing evolutionary trajectories of mutation signature activities in cancer using TrackSig

Cancers evolve as they progress under differing selective pressures. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, the authors present TrackSig, a method that estimates evolutionary trajectories of somatic mutational processes from single bulk tumour data.
- Yulia Rubanova
- Ruian Shi
- PCAWG Consortium
Article | Open Access | 5 Feb 2020 | Nature Communications

Sex differences in oncogenic mutational processes

There’s an emerging body of evidence to show how biological sex impacts cancer incidence, treatment and underlying biology. Here, using a large pan-cancer dataset, the authors further highlight how sex differences shape the cancer genome.
- Constance H. Li
- Stephenie D. Prokopec
- PCAWG Consortium
Article | Open Access | 28 Aug 2020 | Nature Communications

Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

With the generation of large pan-cancer whole-exome and whole-genome sequencing projects, a question remains about how comparable these datasets are. Here, using The Cancer Genome Atlas samples analysed as part of the Pan-Cancer Analysis of Whole Genomes project, the authors explore the concordance of mutations called by whole exome sequencing and whole genome sequencing techniques.
- Matthew H. Bailey
- William U. Meyerson
- PCAWG Consortium
Article | Open Access | 21 Sep 2020 | Nature Communications

Related content

Pan-Cancer Analysis of Whole Genomes

The era of massive cancer sequencing projects has reached a turning point

Genomics: data sharing needs an international code of conduct

Global genomics project unravels cancer’s complexity at unprecedented scale
