The Pan-Cancer Analysis of Whole Genomes (PCAWG) project generated a vast amount of whole-genome cancer sequencing resource data. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we provide a user’s guide to the five publicly available online data exploration and visualization tools introduced in the PCAWG marker paper. These tools are ICGC Data Portal, UCSC Xena, Chromothripsis Explorer, Expression Atlas, and PCAWG-Scout. We detail use cases and analyses for each tool, show how they incorporate outside resources from the larger genomics ecosystem, and demonstrate how the tools can be used together to understand the biology of cancers more deeply. Together, the tools enable researchers to query the complex genomic PCAWG data dynamically and integrate external information, enabling and enhancing interpretation.
The Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium aggregated whole-genome sequencing (WGS) data from 2658 cancers across 38 tumor types generated by the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) projects. These sequencing data were re-analyzed with standardized, high-accuracy pipelines to align to the human genome (reference build hs37d5) and identify germline variants and somatically acquired mutations, as described in the PCAWG marker paper1. Here we provide a user guide to five tools introduced in the PCAWG marker paper: the ICGC Data Portal, UCSC Xena, Chromothripsis Explorer, Expression Atlas, and PCAWG-Scout. Each of them was created or extended to explore PCAWG data resources1. All of the tools aim to streamline analysis and visualization by pre-loading the PCAWG data, so that users do not need to locate, curate, or manage the data, and by making the tools accessible through a web interface. Each of these five tools also integrates other genomics datasets and tools that provide context and insight for interpreting patterns in the PCAWG data, helping this resource fully realize its potential. The integrated datasets and tools include the UCSC Genome Browser2, Ensembl3, drug target compendia4, COSMIC5, and large complementary sequencing efforts such as GTEx6. Intuitive access to these additional tools and datasets is provided either by showing their data side by side or by providing context-dependent URL links.
The five resources in this paper each provide a different perspective and focus on the PCAWG data (Table 1). The ICGC Data Portal serves as the main entry point for accessing all PCAWG data and also enables exploration of PCAWG consensus simple somatic mutations, including point mutations and small indels, each by their frequencies, patterns of co-occurrence, mutual exclusivity, and functional associations. UCSC Xena integrates diverse types of genomic and phenotypic/clinical information at the sample level across a large number of samples, enabling rapid examination of patterns within and across data types. The Chromothripsis Explorer visualizes genome-wide mutational patterns, with a focus on complex genomic events, e.g., chromothripsis and kataegis. This is achieved through interactive Circos plots for each tumor with different tracks that correspond to allele-specific copy number variants, somatic structural variations, simple somatic mutations, indels, and clinical information. The Expression Atlas focuses on RNA-seq data, supporting queries in either a baseline context (e.g., finding genes that are expressed in prostate adenocarcinoma samples) or in a differential context (e.g., finding genes that are under- or over-expressed in prostate adenocarcinomas compared to adjacent normal prostate samples). PCAWG-Scout allows users to run their own analyses on-demand, including prediction of cancer-driver genes, differential gene expression, recurrent structural variations, survival, pathway enrichment, mutations as visualized on a protein structure, mutational signatures, and possible recommended therapies (based on the in-house PanDrugs resource; Supplementary Fig. 1). Each of the five tools offers different visualizations and analyses of the PCAWG data resource, each with its own strengths, and each enabling different insights into the data. When employed together, they provide the user with a deeper understanding of the cancer’s biology (Fig. 1).
More information about the tools can be found at the PCAWG Landing Page (http://docs.icgc.org/pcawg).
ICGC Data Portal and a use case
As a main entry point, the ICGC Data Portal (https://dcc.icgc.org, Zhang7) provides an intuitive graphical interface for browsing, searching, and visualizing PCAWG datasets (Fig. 1a). Uniformly aligned sequencing BAM files and variant calling VCF files, although physically residing in multiple repositories globally, can be centrally searched via the ICGC Data Portal (https://icgc.org/ZEA). Users can readily find specific datasets of interest with a few mouse clicks using various facet terms to narrow their search. Other downstream analysis results generated by PCAWG working groups are available at https://dcc.icgc.org/releases/PCAWG. Close to 23 million open access PCAWG consensus simple somatic mutations have been annotated with consequences for protein structure, affected pathways, targeting cancer drugs, gene ontology terms, and clinical parameters. The portal’s Advanced Search (https://icgc.org/ZzP) tool allows users to perform complex queries, for example, to retrieve the most frequently mutated targets of drugs in stage 2 liver cancers (https://icgc.org/ZHe). Analytic tools, including access to a Jupyter Notebook sandbox for advanced users, support exploration of potential associations between molecular abnormalities and phenotypic observations such as donor survival (https://dcc.icgc.org/analysis). The ICGC Data Portal publicly displays non-identifiable, aggregated analysis results from protected data.
The ICGC Data Portal is best for users who are seeking to download PCAWG data for their own analyses. It also includes the richest resources and functionality for users interested in single-nucleotide variants (SNVs), including patterns of co-occurrence, mutual exclusivity, and functional associations. Figure 1a shows an example use case that demonstrates how bioinformaticians and other tool creators can download results from the portal and then run their own analyses or offer their own visualizations of the data.
UCSC Xena and a use case
UCSC Xena’s (https://pcawg.xenahubs.net) adaptable visualizations, fast performance, and flexible data format make the full power of the PCAWG resource available to all researchers8. It displays data mapped to coding and non-coding regions of the genome, including introns, promoters, enhancers, and intergenic regions. Xena can display tens of thousands of data points on thousands of samples, all within seconds. The Xena Browser excels at integrating the diverse datasets generated by the PCAWG Consortium using the Xena Visual Spreadsheet, which enables users to view multiple types of data side by side (Fig. 1b). In addition to the Visual Spreadsheet, Xena offers survival analyses, the ability to compare and contrast dynamically built subgroups, statistical tests such as analysis of variance, and URLs to live visualizations for sharing with collaborators or others. Xena’s hub-browser architecture enables users to view the protected consensus simple somatic mutations, including non-coding mutations, by loading the dataset into a user’s local private Xena hub (Fig. 2, Supplementary Fig. 2). The Xena Browser seamlessly integrates data from multiple hubs, allowing users who have access to the protected mutation data to visualize it in conjunction with other PCAWG data publicly available on the PCAWG Xena Hub (https://pcawg.xenahubs.net).
UCSC Xena is best for integrating diverse PCAWG data types, including simple mutations, gene expression levels, and gene fusions, as well as less common types such as alternative splicing1 events, promoter usage, and mutational signature scores, all from the same set of samples (Supplementary Note 1). It also provides a mechanism for viewing protected non-coding SNVs either separately or in conjunction with other PCAWG data. Figure 2 shows an example use case, exploring alterations in the TERT gene. Both public data (structural variants (SVs)) and private data (SNVs) on the TERT gene are shown. The data are integrated in the browser, keeping private data protected. Even though the data are distributed across multiple hubs with different access controls, they appear to the user to come from a unified dataset, allowing easy visualization and data integration. Figure 2 shows alterations by SNV and alterations by larger structural variation that are mutually exclusive. We also see that there are significant differences in the type of alteration in different cancer types (chi-square, one-sided, F = 426.2, p < 0.001).
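The comparison of alteration type across cancer types described above is a test of independence on a contingency table. As an illustration only, the following Python sketch computes a Pearson chi-square statistic from first principles. The counts are hypothetical placeholders, not PCAWG results, and the function names are illustrative rather than part of UCSC Xena.

```python
from typing import List


def chi_square_statistic(table: List[List[int]]) -> float:
    """Pearson chi-square statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def degrees_of_freedom(table: List[List[int]]) -> int:
    """Degrees of freedom for the test of independence."""
    return (len(table) - 1) * (len(table[0]) - 1)


# HYPOTHETICAL counts (not PCAWG data): rows are cancer types, columns are
# samples with a TERT SNV, a TERT structural variant, or neither.
counts = [
    [30, 5, 65],
    [4, 20, 76],
    [10, 10, 80],
]

stat = chi_square_statistic(counts)
dof = degrees_of_freedom(counts)
print(f"chi-square = {stat:.2f}, degrees of freedom = {dof}")
```

A p-value would then be read from the chi-square distribution with the computed degrees of freedom, for example with `scipy.stats.chi2_contingency` if SciPy is available.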
Chromothripsis Explorer and a use case
Chromothripsis refers to a mutational process characterized by massive de novo rearrangements that affect one or multiple chromosomes9. The whole-genome dataset assembled by PCAWG permitted us to characterize chromothripsis patterns on a large scale at single-base resolution across >30 cancer types10. Although chromothripsis is generally identified by statistical metrics11, visual inspection still remains essential to dismiss false-positive cases generated by other mechanisms of genome instability10,12. The Chromothripsis Explorer (http://compbio.med.harvard.edu/chromothripsis/) is an open source R Shiny application that visualizes chromothripsis patterns detected using WGS data1,10.
The Chromothripsis Explorer provides tools for exploration of chromothripsis frequencies and patterns across tumor types (Fig. 3a). Specifically, it provides interactive Circos plots13 for each tumor, allowing researchers to explore large-scale alterations such as chromosome arm deletions and complex mutational patterns such as chromothripsis and chromoplexy (Fig. 3b). Each Circos plot is divided into seven tracks that display, from outer to inner rings: (i) hg19 cytobands; (ii) inter-mutation distance and location for pathogenic (i.e., non-synonymous, stop-gain, and stop-loss) and nonpathogenic SNVs, as well as frame-shift and in-frame indels; (iii) chromothripsis regions; (iv) total copy number; (v) minor copy number profiles, defined as the least amplified allele, to visualize loss of heterozygosity (LOH) regions; (vi) gene annotation track, and (vii) structural variations displayed according to read orientations at the breakpoints (duplication-like SVs in blue, deletion-like SVs in orange, head-to-head inversions in black, and tail-to-tail inversions in green). By hovering over a Circos plot, the user can obtain information about a mutation of interest at single-base resolution and also see gene annotations and functional effect predictions. In addition to the genomic data, clinical and histo-pathological information are provided for all tumors in the form of customizable tables that enable the user to map tumor identifiers across cancer projects (e.g., TCGA to ICGC IDs; Fig. 3b).
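The color coding of track (vii) can be summarized as a small lookup from breakpoint orientation to SV class. The Python sketch below is a minimal illustration and not code from the Chromothripsis Explorer. The colors come from the description above, whereas the mapping from strand pairs to SV classes is an assumption based on a commonly used convention for intrachromosomal SVs, and the function name `classify_sv` is hypothetical.

```python
# Colors of track (vii), as described in the text.
SV_COLORS = {
    "duplication-like": "blue",
    "deletion-like": "orange",
    "head-to-head inversion": "black",
    "tail-to-tail inversion": "green",
}

# ASSUMPTION: strand pairs at the two breakpoints map to SV classes as below.
# Check the convention of your SV caller before relying on this table.
ORIENTATION_TO_CLASS = {
    ("+", "-"): "deletion-like",
    ("-", "+"): "duplication-like",
    ("+", "+"): "head-to-head inversion",
    ("-", "-"): "tail-to-tail inversion",
}


def classify_sv(strand1: str, strand2: str) -> str:
    """Return the SV class for a pair of breakpoint strands."""
    try:
        return ORIENTATION_TO_CLASS[(strand1, strand2)]
    except KeyError:
        raise ValueError(f"unknown strand pair: {strand1!r}, {strand2!r}")


for strands in ORIENTATION_TO_CLASS:
    sv_class = classify_sv(*strands)
    print(strands, sv_class, SV_COLORS[sv_class])
```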
The Chromothripsis Explorer is best for users who are looking for a global picture of somatic alterations in a tumor (e.g., large-scale aneuploidies or translocations). It also provides visualizations of the point mutations, as well as small insertions and deletions, on a genome-wide scale. A representative use case for Chromothripsis Explorer is the exploration of complex rearrangements in one or more human cancers, as shown in Fig. 3b for ColoRect-AdenoCA tumor ICGC ID: DO9034. By selecting the chromosomes that harbor massive rearrangements, in this case chromosomes 5, 8, 10, 11, and 19, the user can investigate the consequences of complex rearrangements such as LOH across chromosome 8 and copy number amplifications in multiple locations.
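The reading of LOH and amplification from the copy number tracks can be expressed as a simple rule on segment-level values. The following Python sketch assumes that a segment shows LOH when its minor copy number is zero while some copies remain, which follows from the track definition above. The amplification threshold, the segment coordinates, and all names are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    total_cn: int   # total copy number of the segment
    minor_cn: int   # copy number of the less amplified allele


# ASSUMPTION: an arbitrary cutoff used here to call a segment "amplified".
AMPLIFICATION_THRESHOLD = 5


def is_loh(seg: Segment) -> bool:
    """LOH: one allele is absent although the segment is still present."""
    return seg.total_cn > 0 and seg.minor_cn == 0


def is_amplified(seg: Segment, threshold: int = AMPLIFICATION_THRESHOLD) -> bool:
    """Amplified: total copy number at or above the chosen threshold."""
    return seg.total_cn >= threshold


# HYPOTHETICAL segments; these are not taken from donor DO9034.
segments: List[Segment] = [
    Segment("8", 1, 40_000_000, 1, 0),
    Segment("8", 40_000_001, 90_000_000, 6, 0),
    Segment("8", 90_000_001, 145_000_000, 2, 1),
]

loh_segments = [s for s in segments if is_loh(s)]
amplified_segments = [s for s in segments if is_amplified(s)]
print(len(loh_segments), "LOH segments;", len(amplified_segments), "amplified segments")
```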
Expression Atlas and a use case
Expression Atlas (https://www.ebi.ac.uk/gxa/experiments/E-MTAB-5200/, Petryszak14) is an added-value database and web service that enables the user to assess gene expression in different tissues, cell types, diseases, and developmental stages. It collects, annotates, re-analyses, and displays gene, transcript, and protein expression data. It supports two types of study design: baseline and differential. Baseline studies involve quantitation of genes by tissue type, developmental stage, cell line, cancer type, or other factors. Differential studies perform expression comparisons between different samples, for example, disease vs. healthy tissue (Fig. 4). In addition to the PCAWG datasets, selected expression studies from archives such as ArrayExpress, GEO (Gene Expression Omnibus) and ENA (European Nucleotide Archive) also underwent further curation and processing. Data curation is semi-automated and involves identifying the experimental factors, such as diseases or perturbations, annotating metadata with Experimental Factor Ontology (EFO) terms, and describing the experimental comparisons for further processing. Currently, Expression Atlas provides results from >3500 experiments that include about 120,000 assays from >60 different organisms. The datasets cover >100 cell types from the Cell Ontology and >700 diseases represented in the EFO.
Expression Atlas includes differential studies on human diseases in humans and animal models as well as large baseline studies on human subjects or cell lines, including GTEx, CCLE, ENCODE, BLUEPRINT, and HipSci. Analyses of bulk or single-cell RNA-seq datasets are performed using our open source pipeline iRAP15. Expression Atlas can be searched by gene, gene set, or experimental condition (Fig. 4a). Gene, transcript, and protein expression across different conditions are displayed through heatmaps and boxplots (Fig. 4b). Annotation of datasets with EFO terms enables nested searching across related tissues, diseases, and other conditions modeled within EFO. For example, a search for “cancer” will produce results for all different types of cancer, including “leukemia.” PCAWG datasets can be viewed and queried within their study pages or they can be viewed alongside other studies within Expression Atlas, returned as matches to gene or condition queries from the home page.
Expression Atlas is best for users who are interested in viewing how PCAWG gene expression data compare with those from other sources, especially normal tissues in GTEx. It also provides the ability to see gene expression on an anatomical figure, making it easy to visualize patterns of expression across the body. An example use case in Fig. 4 shows a typical gene search, in this case for gene SFTPA2, to identify in which tissues it is expressed and under what conditions its expression changes. The results of the query show high expression in lung tissue across different baseline expression studies available through Expression Atlas. Focusing on the PCAWG datasets, we see that expression is low in lung cancers (adenocarcinoma and squamous cell carcinoma), whereas it is highly expressed in the corresponding adjacent normal tissues. It is also highly expressed in lung samples from GTEx. Finally, through the panel of available differential studies (bulk RNA-Seq or microarray), the user can confirm from additional studies in Expression Atlas that SFTPA2 is downregulated in lung cancers.
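The contrast between tumor and adjacent normal expression in this use case can be quantified as a log2 fold change. The sketch below is a simplified illustration and not the Expression Atlas differential pipeline. The TPM values are hypothetical, the pseudocount of 1 is an assumption to avoid division by zero, and the function name is illustrative.

```python
import math
from statistics import median
from typing import Sequence


def log2_fold_change(case: Sequence[float],
                     control: Sequence[float],
                     pseudocount: float = 1.0) -> float:
    """log2 ratio of median expression (case vs. control) with a pseudocount."""
    return math.log2((median(case) + pseudocount) / (median(control) + pseudocount))


# HYPOTHETICAL TPM values for one gene; not PCAWG or GTEx measurements.
tumor_tpm = [12.0, 8.5, 20.1]
normal_tpm = [850.0, 1200.0, 990.0]

lfc = log2_fold_change(tumor_tpm, normal_tpm)
direction = "lower" if lfc < 0 else "higher"
print(f"log2 fold change = {lfc:.2f} ({direction} in tumor)")
```

A negative value indicates lower expression in the tumor group, which is the pattern described for SFTPA2 in lung cancers.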
PCAWG-Scout and a use case
As opposed to offering only a limited and predefined list of analyses, PCAWG-Scout (http://pcawgscout.bsc.es/) offers a variety of on-demand analysis functionalities. The analyses enable researchers to explore and visualize the data, form a hypothesis, run the relevant analysis, and immediately explore and visualize the results, giving rise to an analysis loop that drives discovery. The analyses are performed on data from the PCAWG main data release (available in the ICGC data repository) and on results from the PCAWG working groups. Results from the working groups include driver calls for different cohorts and for individual samples, mutation clonality assignments, and mutational signatures, all of which are integrated into different sections of the PCAWG-Scout reports, tables, and interactive visualization graphics. PCAWG-Scout generates a set of visualizations and analyses, called a report, on any number of cohorts, samples, or genes. Reports include descriptions, statistics, plots, interactive three-dimensional (3D) protein representations, and network graphs (Fig. 5). The reports also offer additional, optional analyses, including enrichment analysis of gene lists, driver predictions over cohorts, survival analysis for lists of samples, and potential recommended therapies for individual donors (Supplementary Fig. 3). PCAWG-Scout uses a plugin approach that makes it easy for the user to customize reports or perform new types of analyses. Data and results are exported in interoperable formats to help integrate PCAWG-Scout with other software packages.
PCAWG-Scout is best for users who are looking for a web interface to run analyses on PCAWG data (e.g., differential gene expression or gene set enrichment). It also offers 3D mutation views for coding SNVs and INDELs. The potential to explore PCAWG data in PCAWG-Scout is illustrated in Fig. 5, which shows a network visualization tool that was configured from the web interface with parameters gathered through analyses run within the tool itself. The tool offers the user a bird’s eye view of a number of important facets of the biology, in the case of Fig. 5, of central nervous system tumors. For instance, IDH1, TP53, and DDX3X stand out as genes in which mutations are more damaging than expected. Plots such as these can help the user identify patterns such as mutual exclusivity and clinical prognosis, as well as highlight the ways in which gene function can be deregulated, for example, by mutation or alteration of gene expression.
Synergy of the different tools
Combining the strengths of the different tools can provide a deeper understanding of tumor biology. That synergy is illustrated by considering a common driver event in prostate cancer: fusion of the oncogene ERG16,17 (Fig. 1). Xena’s Visual Spreadsheet enables the user to look across all 18 PCAWG prostate samples with both WGS and RNA-seq data, showing that 8 of the samples harbor an ERG fusion. These samples also show ERG overexpression (Fig. 1b). A view of the PCAWG SV data shows that, across all samples, the fusion breakpoints are located at the ERG transcription start site, leaving the ERG-coding region intact and fusing it to the promoter region of TMPRSS2 or SLC45A3 (Fig. 1b). In addition, the figure shows that fusions detected by RNA-seq and WGS are not always consistent; one fusion detected by a consensus of RNA-based detectors is missed in the WGS calls, and the converse is also seen. This example shows that an integrated visualization across multiple data types and algorithms can provide a more accurate picture of a genomic event.
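The concordance check between RNA-based and WGS-based fusion calls amounts to a comparison of two sets of samples. A minimal Python sketch is given below. The sample labels are hypothetical placeholders, not PCAWG donor identifiers, and the function name is illustrative.

```python
from typing import Dict, Set


def compare_calls(wgs_calls: Set[str], rna_calls: Set[str]) -> Dict[str, Set[str]]:
    """Partition samples by which data type supports a fusion call."""
    return {
        "both": wgs_calls & rna_calls,
        "wgs_only": wgs_calls - rna_calls,
        "rna_only": rna_calls - wgs_calls,
    }


# HYPOTHETICAL sample labels, for illustration only.
wgs_fusion_samples = {"S01", "S02", "S03", "S04"}
rna_fusion_samples = {"S02", "S03", "S04", "S05"}

result = compare_calls(wgs_fusion_samples, rna_fusion_samples)
for category, samples in result.items():
    print(category, sorted(samples))
```

Samples that fall in the `wgs_only` or `rna_only` groups are the ones worth inspecting across data types, as in the ERG example.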
The Chromothripsis Explorer adds a more in-depth view of the CNV and SV alterations in the eight tumors with ERG fusions. It shows that alterations in those eight tumors vary widely (Fig. 1c, Supplementary Fig. 4). Whereas donors DO36372, DO36359, DO36265, and DO36335 have quiescent genomes with few SVs, DO36356 and DO36283 show more complex karyotypes. For example, in DO38283, chromosome 21 harbors multiple SVs that link it with chromosomes 2, 9, 13, and 21 (right). A closer look at the intrachromosomal SVs in chromosome 21 (left) reveals an oncogenic fusion generated by a deletion at chr21:39,988,805–40,578,907.
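When moving between tools, it is often convenient to convert a region string such as the one above into numeric coordinates. The following Python sketch parses a region written with thousands separators and either a hyphen or an en dash, and reports the distance between the two breakpoints. The regular expression and function names are illustrative assumptions and are not part of any of the tools.

```python
import re
from typing import Tuple

# Accepts an optional "chr" prefix, commas in numbers, and "-" or "–" as separator.
REGION_PATTERN = re.compile(
    r"^(?:chr)?(?P<chrom>[0-9XYM]+):(?P<start>[\d,]+)[–-](?P<end>[\d,]+)$"
)


def parse_region(region: str) -> Tuple[str, int, int]:
    """Parse 'chr21:1,000-2,000' into ('21', 1000, 2000)."""
    match = REGION_PATTERN.match(region.strip())
    if match is None:
        raise ValueError(f"cannot parse region: {region!r}")
    start = int(match.group("start").replace(",", ""))
    end = int(match.group("end").replace(",", ""))
    if end < start:
        raise ValueError("region end precedes region start")
    return match.group("chrom"), start, end


def span(region: str) -> int:
    """Distance in base pairs between the two breakpoints."""
    _, start, end = parse_region(region)
    return end - start


chrom, start, end = parse_region("chr21:39,988,805–40,578,907")
print(chrom, start, end, span("chr21:39,988,805–40,578,907"))
```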
The Expression Atlas adds the observation that expression levels of TMPRSS2 and SLC45A3 vary across tissue and tumor types but that both TMPRSS2 and SLC45A3 are highly expressed in normal prostate tissues and prostate tumors, as shown in the Expression Atlas Baseline Expression Widget (Fig. 1d). Combined analysis of the PCAWG and GTEx datasets leads to the hypothesis that a subset of prostate cancers, through genome rearrangement, hijack the promoters of androgen-responsive genes to increase ERG expression, resulting in an androgen-dependent overexpression of ERG.
PCAWG-Scout adds further information by illuminating genomic events in the prostate samples that do not show ERG fusions. Although ERG fusions are frequent, 46% (89 out of 195) of the PCAWG prostate tumors do not show them (Supplementary Fig. 5). In fact, we can see using PCAWG-Scout’s mutual exclusivity analysis that simple mutations in FOXA1, SPOP, and SYNE1 are significantly associated with non-fusion tumors (Fig. 1e). Furthermore, in PCAWG-Scout’s 3D protein structure view, the mutations in SPOP cluster tightly around the interaction interface for PTEN (Fig. 1e), suggesting that those mutations may lead to altered SPOP protein function.
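The idea behind a mutual exclusivity analysis can be illustrated with a one-sided Fisher's exact test on a 2 × 2 table of fusion status against mutation status. The sketch below is not the PCAWG-Scout implementation. It uses the cohort sizes stated above (195 tumors, 89 without an ERG fusion), whereas the mutation counts are hypothetical and the function name is illustrative.

```python
from math import comb


def fisher_exact_less(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact p-value for observing <= a co-occurrences.

    Table layout: a = fusion and mutated, b = fusion and not mutated,
                  c = no fusion and mutated, d = no fusion and not mutated.
    """
    fusion_total = a + b
    mutated_total = a + c
    n = a + b + c + d
    smallest_a = max(0, fusion_total + mutated_total - n)
    favorable = sum(
        comb(fusion_total, k) * comb(n - fusion_total, mutated_total - k)
        for k in range(smallest_a, a + 1)
    )
    return favorable / comb(n, mutated_total)


total_tumors = 195                   # from the text
non_fusion = 89                      # from the text
fusion = total_tumors - non_fusion   # 106 tumors with an ERG fusion
print(f"non-fusion fraction = {100 * non_fusion / total_tumors:.0f}%")

# HYPOTHETICAL mutation counts for a single gene, for illustration only.
mutated_in_fusion = 2
mutated_in_non_fusion = 18

p_value = fisher_exact_less(
    mutated_in_fusion,
    fusion - mutated_in_fusion,
    mutated_in_non_fusion,
    non_fusion - mutated_in_non_fusion,
)
print(f"one-sided p-value = {p_value:.3g}")
```

A small p-value here would indicate that mutations co-occur with fusions less often than expected by chance, which is the pattern a mutual exclusivity analysis looks for.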
The use case in this section highlights some of the strengths of each individual tool and also demonstrates how the tools can be used synergistically to gain a fuller understanding of a genomic event, in this case ERG fusions in prostate cancer. In this example, we started with UCSC Xena, but the user can start with any of the five tools and then use others to investigate further.
The data generated by the PCAWG consortium provide a valuable resource for understanding complex cancer biology. Here we have described five tools that aim to put that resource into the hands of all researchers and also incorporate outside genomic data resources. Those tools, the ICGC Data Portal, UCSC Xena, Chromothripsis Explorer, Expression Atlas, and PCAWG-Scout, are all available at The PCAWG Data Portals and Visualization Page (http://docs.icgc.org/pcawg). Visualization of patterns within the PCAWG data is challenging because of the relatively large number of whole genomes studied, the large size of each dataset at the sequence level, and the difficulty of viewing all intergenic and intronic regions explicitly at either the sequence or gene level. Those factors impose high-performance requirements for interactive tools, especially those on the web. Adding to the high-performance requirements is the challenge of visualizing the wide array of data types derived from the high-quality genomic information provided by whole-genome data, including point mutations, gene fusions, promoter usage, and SVs. Many visualization tools, especially those for users without extensive computational training, are currently limited to coding regions and more typical genomic datasets such as those on SNVs or CNVs; they are not able to take full advantage of the depth and complexity of information made available by the PCAWG consortium. Each of the tools presented here was either created or extended in the context of the PCAWG project to address those whole-genome visualization challenges.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Somatic and germline variant calls, mutational signatures, subclonal reconstructions, transcript abundance, splice calls, and other core data generated by the ICGC/TCGA Pan-cancer Analysis of Whole Genomes Consortium are described in the PCAWG marker paper1 and are available for download at https://dcc.icgc.org/releases/PCAWG. Additional information on accessing the data, including raw read files, can be found at https://docs.icgc.org/pcawg/data/. In accordance with the data access policies of the ICGC and TCGA projects, most molecular, clinical, and specimen data are in an open tier, which does not require access approval. To access genetically sensitive information, such as germline alleles and underlying sequencing data, researchers will need to apply to the TCGA Data Access Committee (DAC) via dbGaP (https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?page=login) for access to the TCGA portion of the dataset and to the ICGC Data Access Compliance Office (DACO; http://icgc.org/daco) for the ICGC portion. In addition, to access somatic single-nucleotide variants derived from TCGA donors, researchers will also need to obtain dbGaP authorization. Derived datasets within each tool can be found in Supplementary Table 3. The source data underlying Figs. 1–5, excepting the controlled data, are provided as a Source data file. Corresponding authors for respective tools: ICGC Data Portal, J. Zhang; UCSC Xena, J. Zhu; Chromothripsis Explorer, P.J.P.; Expression Atlas, I.P.; PCAWG-Scout, M.V.
The core computational pipelines used by the PCAWG Consortium for alignment, quality control, and variant calling are available to the public at https://dockstore.org/search?search=pcawg under the GNU General Public License v3.0, which allows for reuse and distribution. The code for all tools in this paper is open source and publicly available. Code for the ICGC Data Portal is available at https://github.com/icgc-dcc/dcc-portal. Code for the UCSC Xena Browser is available at https://github.com/ucscXena/ucsc-xena-client. Code for the Chromothripsis Explorer is available at https://github.com/parklab/ShatterSeek. Code for the Expression Atlas is available at https://github.com/gxa/atlas. Code for PCAWG-Scout is available at http://mikisvaz.github.io/rbbt/, https://github.com/Rbbt-Workflows, and https://github.com/Rbbt-Apps/PCAWGScout.
Campbell, P. J. et al. Pan-cancer analysis of whole genomes. Nature 578, 82–93 (2020).
Kent, W. J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002).
Zerbino, D. R. et al. Ensembl 2018. Nucleic Acids Res. 46, D754–D761 (2018).
Piñeiro-Yáñez, E. et al. PanDrugs: a novel method to prioritize anticancer drug treatments according to individual genomic data. Genome Med. 10, 41 (2018).
Shepherd, R. et al. Data mining using the Catalogue of Somatic Mutations in Cancer BioMart. Database (Oxford) 2011, bar018 (2011).
Carithers, L. J. et al. A novel approach to high-quality postmortem tissue procurement: the GTEx Project. Biopreserv. Biobank. 13, 311–317 (2015).
Zhang, J. et al. The International Cancer Genome Consortium Data Portal. Nat. Biotechnol. 37, 367–369 (2019).
Goldman, M. J. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat. Biotechnol. 38, 675–678 (2020).
Stephens, P. J. et al. Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144, 27–40 (2011).
Cortés-Ciriano, I. et al. Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing. Nat. Genet. https://doi.org/10.1038/s41588-019-0576-7 (2020).
Korbel, J. O. & Campbell, P. J. Criteria for inference of chromothripsis in cancer genomes. Cell 152, 1226–1236 (2013).
Notta, F. et al. A renewed model of pancreatic cancer evolution based on genomic rearrangement patterns. Nature 538, 378–382 (2016).
Yu, Y., Ouyang, Y. & Yao, W. shinyCircos: an R/Shiny application for interactive creation of Circos plot. Bioinformatics 34, 1229–1231 (2018).
Petryszak, R. et al. Expression Atlas update—an integrated database of gene and protein expression in humans, animals and plants. Nucleic Acids Res. 44, D746–D752 (2016).
Fonseca, N. A., Petryszak, R., Marioni, J. & Brazma, A. iRAP - an integrated RNA-seq Analysis Pipeline. Preprint at https://doi.org/10.1101/005991 (2014).
John, J., Powell, K., Katie Conley-LaComb, M. & Chinni, S. R. TMPRSS2-ERG fusion gene expression in prostate tumor cells and its clinical and biological significance in prostate cancer progression. J. Cancer Sci. Ther. 4, 94–101 (2012).
Adamo, P. & Ladomery, M. R. The oncogene ERG: a key factor in prostate cancer. Oncogene 35, 403–414 (2016).
Rheinbay, E. et al. Analyses of non-coding somatic drivers in 2,658 cancer whole genomes. Nature 578, 102–111 (2020).
The ICGC Data Portal development is supported by the Ontario Institute for Cancer Research (OICR) through funding provided by the government of Ontario. UCSC Xena development is supported by the National Cancer Institute of the National Institutes of Health under award numbers 5U24CA180951-04 and 5U24CA210974-02. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Chromothripsis Explorer development is supported by the European Union’s Framework Programme For Research and Innovation Horizon 2020 (2014-2020) under the Marie Curie Sklodowska-Curie Grant Agreement No. 703543 (I.C.-C.). Expression Atlas development is supported by the European Molecular Biology Laboratory (EMBL) member states, the Single Cell Gene Expression Atlas from the Wellcome Trust (grant numbers 108437/Z/15/Z), the National Science Foundation of USA grant to Gramene database [NSF IOS #1127112], Open Targets, and Chan Zuckerberg Initiative. PCAWG-Scout development is supported by joint BSC-IRB-CRG Program in Computational Biology and Severo Ochoa Award SEV 2015-0493. In addition, this work has been supported by the Spanish Government (SEV 2015-0493) and from the BSC-Lenovo Master Collaboration Agreement (2015). We acknowledge the contributions of the many clinical networks across ICGC and TCGA who provided samples and data to the PCAWG Consortium and the contributions of the Technical Working Group and the Germline Working Group of the PCAWG Consortium for collation, realignment, and harmonized variant calling of the cancer genomes used in this study. We thank the patients and their families for their participation in the individual ICGC and TCGA projects.
The authors declare no competing interests.
Peer review information Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Goldman, M.J., Zhang, J., Fonseca, N.A. et al. A user guide for the online exploration and visualization of PCAWG data. Nat Commun 11, 3400 (2020). https://doi.org/10.1038/s41467-020-16785-6