Pan-Cancer Analysis of Whole Genomes

Cancer is a disease of the genome, caused by a cell's acquisition of somatic mutations in key cancer genes. These mutations alter pathways involved in regulating cellular growth and interactions with the tissue environment. Until recently, research on the cancer genome focused on protein-coding genes, which together account for only 1% of the genome. To address this gap, the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Project performed whole-genome sequencing and integrative analysis on over 2,600 primary cancers and their matching normal tissues across 38 distinct tumor types. This study revealed the extensive role played by large-scale structural mutations in cancer, identified previously unknown cancer-related mutations in gene regulatory regions, inferred tumor evolution across multiple cancer types, illuminated the interactions between somatic mutations and the transcriptome, and studied the role of germline genetic variants in modulating mutational processes. This collection comprises papers describing the core set of analyses conducted by the PCAWG Consortium, and showcases data, tools, and other resources useful for those who seek to further explore this legacy data set.

Browse the PCAWG publications and associated content, including News and Views, Comment, and Nature editorial. This dedicated collection compiles the PCAWG datasets, other resources and community-generated content.

Research

The flagship paper of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium describes the generation of the integrative analyses of 2,658 cancer whole genomes and their matching normal tissues across 38 tumour types, the structures for international data sharing and standardized analyses, and the main scientific findings from across the consortium studies.

Article | Open Access | Nature

Whole-genome sequencing data from more than 2,500 cancers of 38 tumour types reveal 16 signatures that can be used to classify somatic structural variants, highlighting the diversity of genomic rearrangements in cancer.

Article | Open Access | Nature

The characterization of 4,645 whole-genome and 19,184 exome sequences, covering most types of cancer, identifies 81 single-base substitution, doublet-base substitution and small-insertion-and-deletion mutational signatures, providing a systematic overview of the mutational processes that contribute to cancer development.

Article | Open Access | Nature

Whole-genome sequencing data for 2,778 cancer samples from 2,658 unique donors across 38 cancer types are used to reconstruct the evolutionary history of cancer, revealing that driver mutations can precede diagnosis by several years to decades.

Article | Open Access | Nature

Integrative analyses of transcriptome and whole-genome sequencing data for 1,188 tumours across 27 types of cancer are used to provide a comprehensive catalogue of RNA-level alterations in cancer.

Article | Open Access | Nature

Viral pathogen load in cancer genomes is estimated through analysis of sequencing data from 2,656 tumors across 35 cancer types using multiple pathogen-detection pipelines, identifying viruses in 382 genomic and 68 transcriptome datasets.

Article | Open Access | Nature Genetics

Joana Carlevaro-Fita, Andrés Lanzós et al. present the Cancer LncRNA Census (CLC), a manually curated dataset of 122 long noncoding RNAs (lncRNAs) with experimentally validated functions in cancer, based on data from the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. CLC lncRNAs have unique gene features, and a number display evidence for cancer-driving functions that are conserved from humans to mice.

Article | Open Access | Communications Biology

Multi-omics datasets pose major challenges to data interpretation and hypothesis generation owing to their high-dimensional molecular profiles. Here, the authors develop the ActivePathways method, which uses data fusion techniques for integrative pathway analysis of multi-omics data and candidate gene discovery.

Article | Open Access | Nature Communications

Understanding deregulation of biological pathways in cancer can provide insight into disease etiology and potential therapies. Here, as part of the PanCancer Analysis of Whole Genomes (PCAWG) consortium, the authors present pathway and network analysis of 2583 whole cancer genomes from 27 tumour types.

Article | Open Access | Nature Communications

In this study, the authors consider the structural variants (SVs) present within cancer cases of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. They report hundreds of genes, including known cancer-associated genes, for which the nearby presence of an SV breakpoint is associated with altered expression.

Article | Open Access | Nature Communications

In somatic cells, the mechanisms maintaining the chromosome ends are normally inactivated; however, cancer cells can re-activate these pathways to support continuous growth. Here, the authors characterize the telomeric landscapes across tumour types and identify genomic alterations associated with different telomere maintenance mechanisms.

Article | Open Access | Nature Communications

Analysis of cancer genome sequencing data has enabled the discovery of driver mutations. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, the authors present DriverPower, a software package that identifies coding and non-coding driver mutations within cancer whole genomes by considering mutational burden and functional impact evidence.

Article | Open Access | Nature Communications

Cancers evolve as they progress under differing selective pressures. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, the authors present TrackSig, a method that estimates evolutionary trajectories of somatic mutational processes from single bulk tumour data.

Article | Open Access | Nature Communications

Related content