Deep whole-genome analysis of 494 hepatocellular carcinomas

Abstract

Over half of hepatocellular carcinoma (HCC) cases diagnosed worldwide are in China [1–3]. However, whole-genome analysis of hepatitis B virus (HBV)-associated HCC in Chinese individuals is limited [4–8], with current analyses of HCC mainly from non-HBV-enriched populations [9,10]. Here we initiated the Chinese Liver Cancer Atlas (CLCA) project and performed deep whole-genome sequencing (average depth, 120×) of 494 HCC tumours. We identified 6 coding and 28 non-coding previously undescribed driver candidates. Five previously undescribed mutational signatures were found, including aristolochic-acid-associated indel and doublet base signatures, and a single-base-substitution signature that we termed SBS_H8. Pentanucleotide context analysis and experimental validation confirmed that SBS_H8 was distinct from the aristolochic-acid-associated SBS22. Notably, HBV integrations could take the form of extrachromosomal circular DNA, resulting in elevated copy numbers and gene expression. Our high-depth data also enabled us to characterize subclonal clustered alterations, including chromothripsis, chromoplexy and kataegis, suggesting that these catastrophic events could also occur in late stages of hepatocarcinogenesis. Pathway analysis of all classes of alterations further linked non-coding mutations to dysregulation of liver metabolism. Finally, we performed in vitro and in vivo assays to show that fibrinogen alpha chain (FGA), determined as both a candidate coding and non-coding driver, regulates HCC progression and metastasis. Our CLCA study depicts a detailed genomic landscape and evolutionary history of HCC in Chinese individuals, providing important clinical implications.

Fig. 1: Candidate driver landscape.
Fig. 2: Previously undescribed mutational signatures.
Fig. 3: ecDNA analysis.
Fig. 4: Genomic rearrangement.
Fig. 5: FGA dysfunction facilitates HCC progression.

Data availability

The raw sequencing data reported in this paper have been deposited at the Genome Sequence Archive in BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, under the study accession number PRJCA002666. We also built an interactive website (http://lifeome.net/database/liver) for visualizing and analysing our CLCA data. The data deposited and made public are compliant with the regulations of the Ministry of Science and Technology of China. Other public data used in this study include the human reference genome hg19/GRCh37 (https://ftp.ensembl.org/pub/grch37/), PCAWG data (https://dcc.icgc.org/pcawg/#!), TCGA-HCC data (https://portal.gdc.cancer.gov/projects/TCGA-LIHC) and COSMIC signatures (https://cancer.sanger.ac.uk/signatures/). Source data are provided with this paper.

Code availability

The Linux working environment that we used is packed into a Singularity container file and is available at Zenodo (https://doi.org/10.5281/zenodo.7260221). The detailed code and instructions for all software have been deposited at GitHub (https://github.com/ChongJenniferZhang/CLCA_WGS).

References

  1. Sung, H. et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 71, 209–249 (2021).

  2. Llovet, J. M. et al. Hepatocellular carcinoma. Nat. Rev. Dis. Primers 7, 6 (2021).

  3. Villanueva, A. Hepatocellular carcinoma. N. Engl. J. Med. 380, 1450–1462 (2019).

  4. The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes. Nature 578, 82–93 (2020).

  5. Rheinbay, E. et al. Analyses of non-coding somatic drivers in 2,658 cancer whole genomes. Nature 578, 102–111 (2020).

  6. Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).

  7. Li, Y. et al. Patterns of somatic structural variation in human cancer genomes. Nature 578, 112–121 (2020).

  8. Gerstung, M. et al. The evolutionary history of 2,658 cancers. Nature 578, 122–128 (2020).

  9. Fujimoto, A. et al. Whole-genome mutational landscape and characterization of noncoding and structural mutations in liver cancer. Nat. Genet. 48, 500–509 (2016).

  10. Letouze, E. et al. Mutational signatures reveal the dynamic interplay of risk factors and cellular processes during liver tumorigenesis. Nat. Commun. 8, 1315 (2017).

  11. Gao, Q. et al. Integrated proteogenomic characterization of HBV-related hepatocellular carcinoma. Cell 179, 561–577 (2019).

  12. Sung, W. K. et al. Genome-wide survey of recurrent HBV integration in hepatocellular carcinoma. Nat. Genet. 44, 765–769 (2012).

  13. Kan, Z. et al. Whole-genome sequencing identifies recurrent mutations in hepatocellular carcinoma. Genome Res. 23, 1422–1433 (2013).

  14. Xue, R. et al. Variable intra-tumor genomic heterogeneity of multiple lesions in patients with hepatocellular carcinoma. Gastroenterology 150, 998–1008 (2016).

  15. Schulze, K. et al. Exome sequencing of hepatocellular carcinomas identifies new mutational signatures and potential therapeutic targets. Nat. Genet. 47, 505–511 (2015).

  16. Imielinski, M., Guo, G. & Meyerson, M. Insertions and deletions target lineage-defining genes in human cancers. Cell 168, 460–472 (2017).

  17. Dentro, S. C. et al. Characterizing genetic intra-tumor heterogeneity across 2,658 human cancer genomes. Cell 184, 2239–2254 (2021).

  18. Martincorena, I. et al. Tumor evolution. High burden and pervasive positive selection of somatic mutations in normal human skin. Science 348, 880–886 (2015).

  19. Tarabichi, M. et al. Neutral tumor evolution? Nat. Genet. 50, 1630–1633 (2018).

  20. Ng, S. W. K. et al. Convergent somatic mutations in metabolism genes in chronic liver disease. Nature 598, 473–478 (2021).

  21. Kim, H. et al. Extrachromosomal DNA is associated with oncogene amplification and poor outcome across multiple cancers. Nat. Genet. 52, 891–897 (2020).

  22. Deshpande, V. et al. Exploring the landscape of focal amplifications in cancer using AmpliconArchitect. Nat. Commun. 10, 392 (2019).

  23. Stephens, P. J. et al. Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144, 27–40 (2011).

  24. Baca, S. C. et al. Punctuated evolution of prostate cancer genomes. Cell 153, 666–677 (2013).

  25. Nik-Zainal, S. et al. The life history of 21 breast cancers. Cell 149, 994–1007 (2012).

  26. Cortes-Ciriano, I. et al. Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing. Nat. Genet. 52, 331–341 (2020).

  27. Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).

  28. Satriano, L., Lewinska, M., Rodrigues, P. M., Banales, J. M. & Andersen, J. B. Metabolic rearrangements in primary liver cancers: cause and consequences. Nat. Rev. Gastroenterol. Hepatol. 16, 748–766 (2019).

  29. Guo, L. et al. Single-cell DNA sequencing reveals punctuated and gradual clonal evolution in hepatocellular carcinoma. Gastroenterology 162, 238–252 (2022).

  30. Xue, R. et al. Genomic and transcriptomic profiling of combined hepatocellular and intrahepatic cholangiocarcinoma reveals distinct molecular subtypes. Cancer Cell 35, 932–947 (2019).

  31. Wu, S. et al. Circular ecDNA promotes accessible chromatin and high oncogene expression. Nature 575, 699–703 (2019).

  32. Xue, R. et al. Liver tumour immune microenvironment subtypes and neutrophil heterogeneity. Nature 612, 141–147 (2022).

  33. Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013).

  34. Kim, S. et al. Strelka2: fast and accurate calling of germline and somatic variants. Nat. Methods 15, 591–594 (2018).

  35. Lawrence, M. S. et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499, 214–218 (2013).

  36. Martincorena, I. et al. Universal patterns of selection in cancer and somatic tissues. Cell 171, 1029–1041 (2017).

  37. Mularoni, L., Sabarinathan, R., Deu-Pons, J., Gonzalez-Perez, A. & López-Bigas, N. OncodriveFML: a general framework to identify coding and non-coding regions with cancer driver mutations. Genome Biol. 17, 128 (2016).

  38. Lawrence, M. S. et al. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505, 495–501 (2014).

  39. Zhu, H. et al. Candidate cancer driver mutations in distal regulatory elements and long-range chromatin interaction networks. Mol. Cell 77, 1307–1321 (2020).

  40. Liu, M., Wu, Y., Jiang, N., Boot, A. & Rozen, S. G. mSigHdp: hierarchical Dirichlet process mixture modeling for mutational signature discovery. NAR Genom. Bioinform. 5, lqad005 (2023).

  41. Boot, A. et al. In-depth characterization of the cisplatin mutational signature in human cell lines and in esophageal and liver tumors. Genome Res. 28, 654–665 (2018).

  42. Favero, F. et al. Sequenza: allele-specific copy number and mutation profiles from tumor sequencing data. Ann. Oncol. 26, 64–70 (2015).

  43. Priestley, P. et al. Pan-cancer whole-genome analyses of metastatic solid tumours. Nature 575, 210–216 (2019).

  44. Mermel, C. H. et al. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12, R41 (2011).

  45. Layer, R. M., Chiang, C., Quinlan, A. R. & Hall, I. M. LUMPY: a probabilistic framework for structural variant discovery. Genome Biol. 15, R84 (2014).

  46. Mayakonda, A., Lin, D.-C., Assenov, Y., Plass, C. & Koeffler, H. P. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res. 28, 1747–1756 (2018).

  47. Haas, B. J. et al. Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods. Genome Biol. 20, 213 (2019).

  48. Turner, K. M. et al. Extrachromosomal oncogene amplification drives tumour evolution and genetic heterogeneity. Nature 543, 122–125 (2017).

  49. deCarvalho, A. C. et al. Discordant inheritance of chromosomal and extrachromosomal DNA elements contributes to dynamic disease evolution in glioblastoma. Nat. Genet. 50, 708–717 (2018).

  50. Tsai, S. Q. et al. CIRCLE-seq: a highly sensitive in vitro screen for genome-wide CRISPR–Cas9 nuclease off-targets. Nat. Methods 14, 607–614 (2017).

  51. Koche, R. P. et al. Extrachromosomal circular DNA drives oncogenic genome remodeling in neuroblastoma. Nat. Genet. 52, 29–34 (2020).

  52. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  53. Anzalone, A. V. et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149–157 (2019).

Acknowledgements

We thank D. Li, S. Yin and C. Zhang for their support in gene editing and the members of the Shanghai Key Laboratory of Hepato-biliary Tumour Biology and the Key Laboratory of Signaling Regulation and Targeting Therapy of Liver Cancer (SMMU) for their technical support. This work was supported by the National Natural Science Foundation of China (81988101, T2125002, 82322047, 82241230, U21A20376, 81830054, 82173035, 82141103 and 82341007), the Innovation Program of Shanghai Municipal Education Commission (21JC1406600 and 22140901000), Beijing Natural Science Foundation (Z220014), Beijing Nova Program (20230434854), Program of Shanghai Academic/Technology Research Leader (21XD1404600), the National Key Research and Development Program of China (2022YFC3400902 and 2022YFC2504602), and the New Cornerstone Science Foundation through the XPLORER PRIZE. Figure 1a and Extended Data Fig. 10s were created using BioRender with an academic license.

Author information

Contributions

L.C., C.Z., R.X., L.W., F.B., S.G.R. and H.W. conceived and designed the project. L.C., Z.L., B.Z., K.L., Y. Zhu, S.Y. and Q.G. collected the clinical samples. C.Z., R.X., M.L., J. Bai, Yin Wang, R.W., A.Y. and Yan Wang analysed the WGS and RNA-seq data. S.G.R., M.L., N.J., C.Z. and R.X. performed mutational signature analysis. L.C., J. Bao, W.W., J.H., S.S., Y. Zhang and M.B. performed functional validation of candidate drivers. R.X., C.Z., J. Bai, L.C. and J.G. designed and built the CLCA website. C.Z., R.W. and N.J. built the Zenodo and GitHub pages. R.X., L.C., C.Z. and J. Bao integrated the sequencing and experimental data, drew the display items and wrote the manuscript. F.B., L.W., D.G., X.W.W., N.Z., H.N., S.G.R. and H.W. provided edits to the manuscript. L.C., L.W., F.B. and H.W. oversaw the ethical guidelines and data regulation. L.C., L.W., F.B., S.G.R. and H.W. supervised the project. All of the authors contributed to the final version of the paper.

Corresponding authors

Correspondence to Lei Chen, Lin Wu, Steven G. Rozen, Fan Bai or Hongyang Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Lewis Roberts and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Comparison of CLCA with other HCC cohorts.

a, Comparison of clinical information between CLCA and PCAWG-HCC. DP, double positive of HBV and HCV; DN, double negative of HBV and HCV. b, Sequencing depth of 494 tumours and their matched normal controls in CLCA. c, Relationships among driver genes using the DISCOVER mutual exclusivity test. d-e, Venn plot showing the comparison of potential driver genes identified in the TCGA-HCC, PCAWG-HCC, and our CLCA cohort. *Potential non-true drivers curated by PCAWG-HCC. f-g, Comparison of frequency of potential drivers between CLCA and PCAWG-HCC (f) and TCGA-HCC (g), respectively. Two-sided Fisher’s exact test, multiple hypothesis test performed with the Benjamini–Hochberg method. A threshold of Q < 0.1 was used for significance and denoted in blue. h, Two-sided Spearman correlation between the ratio of clonal drivers and tumour purity across all CLCA samples. The grey shaded area represents the 95% confidence interval. i, The dN/dS ratios for clonal and subclonal SNVs in 23 cancer coding drivers across our CLCA cohort. n denotes the total number of mutations for each category collected from 494 individual tumours. Centre points denote dN/dS values for missense, nonsense, splice site, and all mutations. Error bars denote the 95% confidence intervals. Red dashed line denotes dN/dS value of 1. j, Workflow for mutational signature analysis in CLCA.
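The Benjamini–Hochberg adjustment and the Q < 0.1 threshold named in panels f-g can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `benjamini_hochberg` and the example P values are assumptions made purely for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted Q values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that Q values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical P values, e.g. one Fisher's exact test per candidate driver.
p = [0.001, 0.04, 0.03, 0.2]
q = benjamini_hochberg(p)
# Apply the significance threshold stated in the legend (Q < 0.1).
significant = [qi < 0.1 for qi in q]
```

In practice an established routine such as `multipletests` in statsmodels would typically be used; the hand-written version only makes the ranking step explicit.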

Extended Data Fig. 2 Profiles of all mutational signatures in CLCA.

Mutational profiles of all signatures. SBS (single base substitution), DBS (doublet base substitution), and ID (small insertion and deletion). Magnified versions of signatures SBS_H1, DBS_H1 and ID_H1 are shown to illustrate the classification of each mutation subtype in each plot. The cosine similarity between each signature and its matched COSMICv3.2 signature is indicated. Novel signatures are labelled in red.
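As a rough illustration of the cosine similarity reported between each extracted signature and its matched COSMIC signature, the following Python sketch compares two mutation-type probability vectors. The function name and the toy three-channel vectors are assumptions; real SBS signatures have 96 channels, and this is not the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length signature vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same number of channels")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("signatures must not be all zeros")
    return dot / (norm_a * norm_b)


# Toy three-channel vectors standing in for an extracted signature
# and a reference signature (values are made up).
extracted = [0.2, 0.3, 0.5]
reference = [0.25, 0.25, 0.5]
similarity = cosine_similarity(extracted, reference)
```

A value close to 1 indicates nearly identical profiles, whereas a low value for every reference signature is one indication that a signature may be previously undescribed.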

Extended Data Fig. 3 Analysis of mutational signatures.

a, Signature profiles of SBS_H8, DBS_H2, and ID_H3 extracted by both mSigHdp and SigProfiler. b, Comparison of the pentanucleotide context of SBS_H8, SBS_H2, and AA-exposed cell lines. The red square highlights the pentanucleotide context of T > A mutations enriched in SBS_H8 compared to SBS_H2. c, Correlation between the numbers of mutations associated with SBS_H2, DBS_H2, and ID_H3. d, Mutational profile of DBS_H1. e, Mutational profile of ID_H8 related to aflatoxin. f, Correlation between numbers of DBS_H1 mutations and age for involved patients. g, Correlation between numbers of ID_H8 mutations and SBS_H3 for involved patients. h, Unsupervised clustering based on the proportions of SBS, DBS, and ID mutations across tumours results in five subgroups. Selected clinical variables are also listed. The P values indicate significant nonrandom distributions for each attribute. Two-sided Fisher’s exact tests with Benjamini-Hochberg correction for multiple comparison. A threshold of Q < 0.1 was used for significance. i, Bar plots comparing selected variables that had significant differences between groups. Blue denotes mutation or yes. Grey denotes wildtype or no. Two-sided Chi-square test. j, Boxplots comparing the contributions of SBS_H8 across five subgroups. n denotes biologically independent samples. For boxplots, centre line shows median, box limits indicate upper and lower quartiles, and whiskers extend 1.5 times the interquartile range, while data beyond the end of the whiskers are outlying points that are plotted individually. Two-tailed Student’s t-test. k, OS and DFS of CLCA cases stratified into SBS_H8-high and SBS_H8-low groups by the median value. Log-rank test. For c, f, and g, ρ and P values are from a two-sided Spearman correlation test.
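The pentanucleotide context in panel b refers to the mutated base together with its two flanking bases on each side. A minimal Python sketch of how such a context could be extracted is given below. It assumes the common convention of reporting substitutions relative to the pyrimidine strand; the function names, the 0-based coordinate and the toy sequences are all hypothetical and are not taken from the authors' pipeline.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def pentanucleotide_context(reference, pos, alt):
    """Return (5-mer context, substitution) for a single-base substitution.

    reference: reference sequence as a string
    pos: 0-based index of the mutated base (an assumption of this sketch)
    alt: the alternative base
    """
    if pos < 2 or pos + 2 >= len(reference):
        raise ValueError("need two flanking bases on each side")
    context = reference[pos - 2:pos + 3].upper()
    ref = context[2]
    alt = alt.upper()
    # Report relative to the pyrimidine (C or T) strand.
    if ref in "AG":
        context = reverse_complement(context)
        ref = context[2]
        alt = alt.translate(COMPLEMENT)
    return context, f"{ref}>{alt}"


# The same T>A event seen on either strand yields the same context.
forward = pentanucleotide_context("GGCTGAA", 3, "A")
reverse = pentanucleotide_context("TTCAGCC", 3, "T")
```

Counting such contexts across all T>A mutations attributed to each signature would give profiles of the kind compared in panel b.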

Extended Data Fig. 4 Mutational signature attributions.

a, mSigHdp splits COSMIC SBS5 into three components: SBS_H1, SBS_H14 and SBS_H17. They together recapitulate the pattern of SBS5. SigProfiler splits COSMIC SBS5 into four components: SBS96C, SBS96E, SBS96I and SBS96M. They together recapitulate the pattern of SBS5. b, Correlation between patient age with mutation numbers of mSigHdp extracted SBS5-split signature (SBS_H1, SBS_H14 and SBS_H17) or of SigProfiler extracted SBS5-split signature (SBS96C, SBS96E, SBS96I and SBS96M). c, mSigHdp splits COSMIC SBS40 into three components: SBS_H4, SBS_H16 and SBS_H17. They together recapitulate the pattern of SBS40. SigProfiler splits COSMIC SBS40 into three components: SBS96E, SBS96M and SBS96I. They together recapitulate the pattern of SBS40. d, Correlation between patient age with mutation numbers of mSigHdp extracted SBS40-split signature (SBS_H4, SBS_H16 and SBS_H17) or of SigProfiler extracted SBS40-split signature (SBS96E, SBS96M and SBS96I). e-f, Stacked bar plots showing the contributions of SBS mutational processes, coloured as shown in Extended Data Fig. 3h, to coding driver mutations (e) and noncoding driver mutations (f). g, Stacked bar plot shows the contribution of mutational processes to hotspot mutations (chromosome: position: the total number of patients with mutations at this particular genomic hotspot). Gene names are given with amino acid alterations for protein-coding genes. h, Enrichment of mutational signatures with clonal status. Potential aetiology and related COSMIC signatures are annotated for each signature. Two-sided Chi-square test. For b and d, ρ and P values are from a two-sided Spearman correlation test.

Extended Data Fig. 5 Survival, CNAs, HBV integrations and ecDNA.

a, Multivariate analysis for OS and DFS. Multivariate Cox analysis was performed. Hazard ratios with a 95% confidence interval are shown for each predictor and are plotted on a natural log scale. b, Significant CNAs identified by GISTIC analysis. Red for amplification and blue for deletion. Green lines denote the threshold of Q value = 0.001. c, Hotspots of HBV integrations across CLCA. d, Top frequently amplified genes detected in ecDNA. e, Boxplots comparing the copy number of genes detected in ecDNA to others. f, Higher expression of oncogenes in ecDNA compared with those not in ecDNA. In e-f, n denotes biologically independent samples. Two-sided Wilcoxon rank-sum test. For boxplots, centre line shows median, box limits indicate upper and lower quartiles, and whiskers extend 1.5 times the interquartile range, while data beyond the end of the whiskers are outlying points that are plotted individually. g, Comparison of the frequency of cases with kataegis events (denoted in blue) between patients with or without APOBEC signatures. Two-sided Chi-square test.

Extended Data Fig. 6 Patterns of SVs and clustered mutational processes.

a, The number of SV events, focal CN segments, kataegis events, chromoplexy events, and chromothripsis events in the CLCA. b, Proportions of different categories for each type of alteration. c, The density of breakpoints genome-wide (top) and 2D density of juxtapositions (bottom) of SV, chromothripsis, and chromoplexy. d-e, Examples of chromothripsis (d) and chromoplexy (e) events involving oncogenes.

Extended Data Fig. 7 Kataegis and evolutionary history.

a, Rainfall plots of kataegis events. n denotes the total number of kataegis events detected in the tumour and marked with arrows below. b, Distribution of point mutations over different mutation periods. c, Distribution of mutations across early clonal, late clonal and subclonal stages, for drivers in CLCA. Barplots comparing the distribution of coding and noncoding mutations are shown. Two-sided Chi-square test. d, Relative ordering of CN events and driver mutations across all samples. e, Relative timing of signatures across all patients. n denotes the total number of individual tumours with the presence of the noted signature. For boxplots, centre line shows median, box limits indicate upper and lower quartiles, and whiskers extend 1.5 times the interquartile range, while data beyond the end of the whiskers are outlying points that are plotted individually. Boxplots are ordered by the median and no statistical test is used. f, Preferential ordering diagrams for patients stratified based on Cluster V, alcohol, and smoking. The relative ordering of candidate drivers was compared.

Extended Data Fig. 8 Dysregulated pathways.

Each gene box includes the frequency of patients influenced by different types of somatic alterations affecting the corresponding gene. A total of eight forms of somatic alterations are listed and colour-coded, including coding SNVs, noncoding SNVs (further divided into promoters, lncRNAs and UTRs), CNAs, ecDNA, SVs and HBV integrations. Solid rectangles enclose genes in eight major signalling pathways. Dashed rectangles enclose genes in specific signalling pathways. Interactions between genes are indicated. For each pathway, the frequencies of patients altered by coding mutations only, noncoding mutations only, and both coding and noncoding mutations are denoted, as shown in the Venn diagram.

Extended Data Fig. 9 Functional validation of PPP1R12B and KCNJ12.

a, Comparison of tumour migration, invasion, self-renewal and cell proliferation capacities of PPP1R12B disruption across cell lines. b, Edited sites in PPP1R12B by Prime Editing. c, RT-qPCR analysis of PPP1R12B mRNA expression across wild-type (WT) and point-mutated HepG2 cell lines. d-e, Comparison of the proliferation (d), migration, invasion, and self-renewal (e) capacities across cell lines of indicated genotypes. Representative images of each assay are shown for each cell line. f, Comparison of tumour migration, invasion, self-renewal and cell proliferation capacities of KCNJ12 disruption across cell lines. g, Edited sites in KCNJ12. h, RT-qPCR analysis of KCNJ12 mRNA expression across wild-type (WT) and point-mutated HepG2 cell lines. i-j, Comparison of the proliferation (i), migration, invasion, and self-renewal (j) capacities across cell lines of indicated genotypes. For all panels, each experimental condition was independently repeated three times. Representative images of each assay are shown. Data are presented as mean ± s.e.m. In a, e, f, j, P values for the comparison between each group and the control group are shown above the images. Two-way ANOVA test is used for proliferation analysis in (a, d, f, i). For other plots, P values were derived with two-tailed Student’s t-test.

Source Data

Extended Data Fig. 10 Functional validation of FGA.

a, Lollipop plot of FGA mutations in CLCA. b, Overall survival of TCGA-HCC patients (n = 364) classified by FGA expression levels, Log-rank test. c, Comparison of FGA mRNA expression between tumour and normal tissues in the TCGA-HCC cohort. For boxplots, centre line shows median, box limits indicate upper and lower quartiles, and whiskers extend 1.5 times the interquartile range, while data beyond the end of the whiskers are outlying points that are plotted individually. d, Representative FGA IHC images of paired tumour and normal tissues. Quantitative result is shown in Fig. 5d. e, Schematic of the edited site in the FGA noncoding region. f, Western blot analysis of FGA levels across wild-type and mutated HepG2 cell lines. Source gels in Supplementary Fig. 3. g-h, Comparison of the proliferation (g), migration, invasion, and self-renewal (h) capacities across cell lines of indicated genotypes. i-j, Comparison of tumour migration (i), invasion and self-renewal, and cell proliferation (j) capacities of FGA disruption across cell lines. k, Resected xenograft tumours by sh-Ctrl (n = 6) and sh-FGA cells (n = 7) in PLC/PRF/5. l, Specific phospho-antibody array analysis between PLC/PRF/5-sh-Ctrl and sh-FGA cell lines. Top significantly altered phosphorylation sites among 156 phosphoproteins are listed. m, TYK2 phosphorylation and its unphosphorylated counterpart between PLC/PRF/5- sh-Ctrl and sh-FGA cell lines determined with Cy3-labelled streptavidin via specific phospho-antibody array (n = 2 for each phosphorylated site or unphosphorylated protein). n, Western blot analysis of p-TYK2 (Y1054) and p-STAT3 (Y705) protein levels by FGA knockdown in PLC/PRF/5 and PVTT cell lines. p-, phosphorylated. Source gels in Supplementary Fig. 4. o, Representative images of cell migration assay following inhibitor treatment. p, IL6 mRNA expression of sh-Ctrl and sh-FGA cells. q, IL6 mRNA levels between PLC/PRF/5- sh-Ctrl and sh-FGA cell lines following FBS stimulation. 
Cells were incubated in DMEM supplemented with 10% FBS for the indicated time intervals after being treated with FBS-free medium overnight. r, Two-tailed Pearson correlation analysis of FGA protein and TYK2 phosphorylation (n = 75) in an independent HCC patient cohort. The relative intensities of FGA and p-TYK2 were normalized to β-actin. Source gels in Supplementary Fig. 5. s, A proposed model illustrating the role of the FGA/TYK2/STAT3 axis during HCC tumorigenesis. Wild-type and mutated forms of FGA are shown, respectively. The diagram was created using BioRender. For all panels, n denotes biologically independent samples. Each experimental condition was independently repeated three to five times. Data are presented as mean ± s.e.m. In h and i, P values for the comparison between each group and the control group are shown above the images. Two-tailed Student’s t-test is used in (c, h, i, p, and q). Two-way ANOVA test is used in g and j.

Source Data

Supplementary information

Supplementary Information

Supplementary Figs. 1–5, Supplementary Notes 1–8 and Supplementary Discussion.

Reporting Summary

Supplementary Table 1

Clinical information and mutational landscape.

Supplementary Table 2

TPM matrix of RNA-seq cases.

Supplementary Table 3

Mutational signatures.

Supplementary Table 4

Catalogue of genomic rearrangements including CNAs, SVs, HBV integrations and clustered alterations.

Supplementary Table 5

Mutational history.

Supplementary Table 6

Signalling pathways.

Supplementary Table 7

Functional validation of candidate drivers and reagents.

Source data

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, L., Zhang, C., Xue, R. et al. Deep whole-genome analysis of 494 hepatocellular carcinomas. Nature 627, 586–593 (2024). https://doi.org/10.1038/s41586-024-07054-3

  • DOI: https://doi.org/10.1038/s41586-024-07054-3
