Acknowledgements
The authors gratefully acknowledge L. Trani, J. Hodges, and A. Wollam for efforts in manual review. T. Ley, R. Bose, R. Govindan, and S. Devarakonda provided valuable curation input. D. Larson provided valuable analysis input. M.G. was supported by the National Human Genome Research Institute (NIH NHGRI K99HG007940). O.L.G. was supported by the National Cancer Institute (NIH NCI K22CA188163). This work was supported by a grant to R.K.W. from the National Human Genome Research Institute (NIH NHGRI U54HG003079).
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Overview of DoCM resource
(a) Outline of the criteria used to curate a variant. Variants are evaluated for inclusion and then curated elements are identified. (b) Summary of current DoCM contents. DoCM contains SNSs and indels across many cancer subtypes, with easy identification of the journal article that outlines each variant's relevance. (c) Screenshot of the DoCM web application, available at http://docm.info. (d) Illustration of the API. An HTTP GET request accepts a variety of parameters, including gene, chromosome and position, and returns a JSON response with the PubMed IDs, diseases and other useful information. The API is thoroughly documented at http://docm.genome.wustl.edu/api.
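The request and response pattern described in panel (d) might be sketched as follows. This is a minimal illustration, not the documented DoCM API: the endpoint path, the query parameter names and the JSON field names are all assumptions chosen for the example, and the sample response contains placeholders rather than real records. The documentation at http://docm.genome.wustl.edu/api is the authority on the actual interface.

```python
import json
from urllib.parse import urlencode

# Base URL taken from the caption; the path below is a hypothetical endpoint.
API_BASE = "http://docm.genome.wustl.edu/api"
VARIANTS_PATH = "/v1/variants.json"  # assumption, not confirmed by the source


def build_variant_query(**filters):
    """Build a GET URL from optional filters such as gene or chromosome.

    Parameter names are illustrative. Filters set to None are dropped, and
    the rest are sorted so the resulting URL is deterministic.
    """
    params = sorted((k, str(v)) for k, v in filters.items() if v is not None)
    url = API_BASE + VARIANTS_PATH
    return url + "?" + urlencode(params) if params else url


def summarize_response(response_text):
    """Parse a JSON array of variant records into (gene, PubMed IDs) pairs.

    The field names "gene" and "pubmed_ids" are hypothetical.
    """
    records = json.loads(response_text)
    return [(r.get("gene"), r.get("pubmed_ids", [])) for r in records]


if __name__ == "__main__":
    print(build_variant_query(gene="BRAF", chromosome="7", position=None))
    # Placeholder response body, shaped like the JSON the caption describes.
    sample = '[{"gene": "BRAF", "pubmed_ids": ["<pmid>"], "diseases": ["<disease>"]}]'
    print(summarize_response(sample))
```

The URL is built without making a network call, so the sketch can be adapted to whichever HTTP client is already in use once the real parameter names have been checked against the API documentation.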
Supplementary Figure 2 Screenshot of DoCM batch submission form.
In the batch submission form, users can enter all the parameters necessary for inclusion in DoCM: the name of the batch, a rationale statement outlining the reason for including the variants and the curation details, any relevant URLs, tags to be applied to the whole batch, the TSV file of variants, and submitter information. Following submission, the user is given a link to review the batch and any messages from moderators.
Supplementary Figure 3 Screenshot of the moderator's view of the submitted batches queue.
Once a batch has been submitted, it can be reviewed in the password-protected moderator queue. A listing of current DoCM moderators can be viewed at http://docm.genome.wustl.edu/about. Moderators can select a batch to review; the “Drug Gene Knowledge Database” batch is highlighted in purple because it is the subject of Supplementary Figure 4. Once multiple batches have been accepted, a moderator can create a new DoCM version using the blue button at the bottom right of the screen.
Supplementary Figure 4 Screenshot of moderator review page.
A moderator can review all information submitted with a batch and evaluate whether it fits the scope and quality requirements of DoCM. Individual variants can be accepted or rejected and the moderator can leave a message to the submitter.
Supplementary Figure 5 Number of papers in PubMed indexed by “Cancer” per year.
Searching PubMed with the term “Cancer” yields the number of papers relating to cancer per year. This serves as an upper bound on the number of papers that would need to be curated to accurately summarize important cancer variants. There is a need for public resources that reduce the duplication of curation effort.
Supplementary Figure 6 Overview of variant curation for entry into DoCM
An anecdotal example of the curation involved for the variant BRAF V600E is shown. Typically the literature lists only the gene and amino acid change (purple in the figure), so extensive curation is required to uniquely identify the variant. Correct genomic coordinates on a consistent genome build need to be identified, with accompanying nucleotide and strand information. Occasionally, several different nucleotide changes produce the same amino acid change. A representative transcript that correctly models the variant described in the literature also needs to be specified. Cancer subtypes are specified using the disease ontology nomenclature. Green boxes note the class of information that needs to be captured in DoCM, black boxes show the subtype of each class, and white boxes denote the value.
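The point that one amino acid change can correspond to more than one nucleotide change can be made concrete with the standard genetic code. The sketch below is illustrative only and is not part of the DoCM curation pipeline; the function name is hypothetical, and it assumes the reference codon for BRAF residue 600 is GTG (valine) as read on the coding strand of the transcript.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_changes(ref_codon, target_aa):
    """List the codons encoding target_aa, other than the reference codon,
    with the number of bases that differ from it, fewest changes first."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa == target_aa and codon != ref_codon:
            n_diff = sum(a != b for a, b in zip(ref_codon, codon))
            hits.append((codon, n_diff))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    # Valine codon GTG changed to glutamate (E).
    print(codon_changes("GTG", "E"))  # [('GAG', 1), ('GAA', 2)]
```

Both alternative codons yield the same protein-level description (V600E), which is why a curated entry has to record the exact nucleotide change, strand, genome build and transcript rather than the amino acid change alone.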
Supplementary Figure 7 Overview of analysis and validation sequencing of four TCGA projects
(a) Outline of the manual review strategy. DoCM sites with two or more reads of support are evaluated for obvious errors. (b) Summary of the variants that passed manual review and were not identified in the original TCGA analyses. (c) Summary of the variants that were validated in the 93 validation samples. (d) Comparison of DoCM-MSRV to ClinSek and the Bayesian classifier.
Supplementary Figure 8 Coverage of the custom capture validation sequencing
Heatmap illustrating the coverage obtained at all target sites in validation sequencing. Bar graphs on the x- and y-axes show the mean coverage for each case and each position.
Supplementary Figure 9 Overview of validation sequencing results.
Variant allele fraction plot illustrating the types of variants identified through manual review that validated. Variants called in the original TCGA study are highlighted in blue and those missed are in green. Note that TCGA was unable to call variants below ∼10% VAF while the MSRV approach was able to recover many such variants. Density plots on the x and y-axes show the distribution of tumor VAF and coverage depth for validated variants respectively.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–9, Supplementary Tables 1–4, Supplementary Methods and Supplementary Results (PDF 2630 kb)
Supplementary Data 1
DoCM read-count data and manual review calls for all TCGA samples. (ZIP 9091 kb)
Supplementary Data 2
DoCM read-count data and manual review calls for all Validation samples. (ZIP 1052 kb)
About this article
Cite this article
Ainscough, B., Griffith, M., Coffman, A. et al. DoCM: a database of curated mutations in cancer. Nat Methods 13, 806–807 (2016). https://doi.org/10.1038/nmeth.4000
DOI: https://doi.org/10.1038/nmeth.4000