Correspondence


Nature Genetics 39, 427 - 428 (2007)
doi:10.1038/ng0407-427

Analysis of published PKD1 gene sequence variants

Alexander M Gout1,2,3, the ADPKD Gene Variant Consortium4 & David Ravine3,5

  1. The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3050, Australia.
  2. Department of Medical Biology, The University of Melbourne, Parkville, Victoria 3010, Australia.
  3. School of Medicine and Pharmacology, The University of Western Australia, Nedlands 6009, Australia.
  4. Corresponding authors of publications describing variants within the PKD1 gene (a full list of authors is given at the end of the paper*).
  5. Western Australian Institute for Medical Research, Centre for Medical Research, The University of Western Australia, Nedlands 6009, Australia. e-mail: david.ravine@uwa.edu.au
  6. Mayo Clinic College of Medicine, Rochester, Minnesota, USA.
  7. Leiden University Medical Centre, Leiden, The Netherlands.
  8. Fox Chase Cancer Center, Philadelphia, Pennsylvania, USA.
  9. Department of Health and Environmental Sciences, Kyoto University, Japan.
  10. Kyorin University School of Health Sciences, Japan.
  11. Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok, Thailand.
  12. Department of Biological Sciences, University of Cyprus, Lefkosia, Cyprus.
  13. Department of Medical Genetics, Cambridge Institute of Medical Research, Addenbrooke's Hospital, Cambridge, UK.
  14. Nephrology Department, Fundació Puigvert, Barcelona, Spain.
  15. Department of Mother & Child - Genetics, University of Verona School of Medicine, Verona, Italy.
  16. Medical Genetics Unit, St. George's Hospital Medical School, London, UK.
  17. Génétique médicale et développement, INSERM, Marseille, France.
  18. Section of Nephrology, Yale University School of Medicine, New Haven, Connecticut, USA.
  19. Free University Medical Center Department of Internal Medicine, Amsterdam, The Netherlands.
  20. Laboratoire de génétique moléculaire, CHU Brest, France.
  21. Division of Nephrology, Department of Medicine, University of Toronto, Canada.
  22. Western Australian Institute for Medical Research, Centre for Medical Research, The University of Western Australia, Perth, Australia.
  23. Institute of Human Genetics, University of Muenster, Germany.
  24. Centre for Human Genetics, Edith Cowan University, Joondalup, Australia.
  25. Department of Internal Medicine, Eulji Medical College, Seoul National University, Seoul, Korea.
  26. Department of Clinical Genetics, Rotterdam, The Netherlands.
  27. Department of Nephrology & Department of Biology and Medical Genetics, Charles University, Prague, Czech Republic.
  28. Departments of Endocrinology and Molecular Genetics, Hospital Ramo, Madrid, Spain.
  29. Dipartimento di Biochimica e Biologia Molecolare, Universita degli Studi, Ferrara, Italy.

To the Editor:

We retrospectively reviewed published variants in the PKD1 gene and detected errors in 39 of 771 variants (5.06% (95% c.i., 3.62–6.85)). All arose from human processing mistakes. As peer-reviewed publication is no safeguard for those considering the clinical significance of an unknown variant, we suggest that reporting of new variants for the proposed Human Variome Project should employ both automated reporting and expert scrutiny.

Measurement of the biological activity of a mutant gene provides the best indication of the functional effect of a gene variant1. As this is seldom possible, less direct measures are used to assess the likely clinical significance of a variant. These include observing the degree of nucleotide sequence conservation in orthologous genes, the nature and position of resultant amino acid changes, the frequency of the variant in case and control populations and whether it exhibits familial segregation with the disease. With the exception of familial segregation, this information is often present within published mutation reports, providing grounds for the clinical interpretation of a gene variant.

Not surprisingly, guidance from professional organizations representing medical geneticists emphasizes the important role of locus-specific mutation databases (LSDBs)2 (http://www.hgvs.org/dblist/dblist.html) and peer-reviewed publications in the evaluation of unknown variants3 (see http://www.acmg.net/Pages/ACMG_Activities/stds-2002/stdsmenu-n.htm and http://www.cmgs.org/BPGs/Sequencing_new.htm). At this early stage in the development of LSDBs, professional bodies also caution against overreliance on databases to interpret the meaning of an observed variant3. Similar cautionary warnings are not issued about published mutation reports, presumably because the rigor of the peer review process instills a stronger sense of confidence in published clinical mutation reports. Studies of quality control in genetic diagnostic laboratories have so far focused on the quality of DNA sequencing4 and genotypic allele calling5. To our knowledge, there has been no systematic study of the accuracy of variants in peer-reviewed publications.

A recent upgrading of the autosomal dominant polycystic kidney disease (ADPKD) mutation database (PKDB; see http://pkdb.mayo.edu)6 presented an opportunity to evaluate the accuracy of published reports of variants in the ADPKD-associated gene PKD1 (16p13.3). A large number of disease-causing mutations, a large number of polymorphisms not associated with disease and a considerable number of gene variants with an unknown effect are now reported for this gene. This allowed us to assess the accuracy with which 771 variants were reported in 55 peer-reviewed publications (Supplementary Table 1 online). In order to evaluate the accuracy of these variant reports, the numbering of each was first standardized to the ATG start codon of the PKD1 NCBI RefSeq mRNA sequence (NM_000296.2). The nomenclature was also altered where necessary, to comply with current nomenclature standards7 (http://www.hgvs.org/mutnomen/). The accuracy of each reported gene variant and its associated amino acid effect were assessed via an in-house mutation checker tool6 in conjunction with the Artemis DNA sequence visualization software8. Inconsistencies between the reported nucleotide changes and associated reference sequence or reported amino acid change were subjected to closer investigation until resolved. Corresponding authors were invited to recheck their own publication as well as our suggested amendments. Twenty-two of thirty-four corresponding authors (65%) responded. Excluding the time contributed by corresponding authors, approximately 170 h of curator time was required to standardize and check all the gene variant reports.
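The two checks described above (renumbering relative to the ATG start codon, then testing a reported nucleotide change against the reference sequence and the reported amino acid change) can be illustrated with a minimal Python sketch. This is not the in-house mutation checker of ref. 6. The short transcript, the function names and the example variants are all hypothetical, chosen only to show one miscounting and one misassignment error.

```python
# Hypothetical toy transcript, NOT PKD1: a 5-nt 5' UTR followed by a short CDS.
UTR5 = "GCACC"
CDS = "ATGGCTCGATGCGATTAA"  # M A R C D *
ATG_POS = len(UTR5) + 1     # 1-based mRNA position of the A of the ATG

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def to_cdna_position(mrna_pos, atg_pos):
    """Convert a 1-based mRNA position to coding numbering (A of ATG = 1)."""
    return mrna_pos - atg_pos + 1


def check_substitution(cds, c_pos, ref, alt, aa_ref, aa_pos, aa_alt):
    """Return a list of inconsistencies in a reported single-base substitution."""
    problems = []
    if cds[c_pos - 1] != ref:
        # Typical signature of miscounting or of a different numbering convention.
        problems.append(
            f"reference base at c.{c_pos} is {cds[c_pos - 1]}, not {ref}")
        return problems
    codon_index = (c_pos - 1) // 3       # 0-based codon number
    offset = (c_pos - 1) % 3             # position within the codon
    ref_codon = cds[codon_index * 3: codon_index * 3 + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    if codon_index + 1 != aa_pos:
        problems.append(f"c.{c_pos} lies in codon {codon_index + 1}, not {aa_pos}")
    if CODON_TABLE[ref_codon] != aa_ref:
        problems.append(f"codon {ref_codon} encodes {CODON_TABLE[ref_codon]}, not {aa_ref}")
    if CODON_TABLE[alt_codon] != aa_alt:
        problems.append(f"codon {alt_codon} encodes {CODON_TABLE[alt_codon]}, not {aa_alt}")
    return problems


# A variant reported at mRNA position 12 is renumbered to c.7.
c_pos = to_cdna_position(12, ATG_POS)
print(c_pos, check_substitution(CDS, c_pos, "C", "T", "R", 3, "*"))  # consistent report
print(check_substitution(CDS, 8, "C", "T", "R", 3, "*"))             # miscounted position
print(check_substitution(CDS, 7, "C", "T", "R", 3, "W"))             # misassigned amino acid
```

A report that fails the first test cannot be checked further until the numbering convention is resolved, which mirrors why publications lacking a stated reference sequence required the most curator effort.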

As a number of reference sequences and numbering conventions had been used to describe the gene variant reports, the nucleotide numbering of 542 (70.3%) variants required updating. Twenty publications (36%) did not provide details of the reference sequence used or indicate the numbering convention employed, so these details had to be determined empirically. A total of 39 errors (Supplementary Table 2 online) were identified (5.06% (95% c.i., 3.62–6.85)), 36 of which were detected using the procedure described above. The remaining three errors were reported by a corresponding author after re-examination of the primary data. The errors, which have been categorized as either miscounting, misassignment or typographical (Fig. 1), were identified in both the nucleotide and amino acid descriptions of variants. Although it was evident that the majority of errors identified in this study arose from human copying, the discovery of errors in reporting primary sequence data indicates that this vital first step is also prone to error. As many primary data were not reviewed in this study, it is highly likely that the error rate reported here is an underestimate. The 10% administrative error rate detected among diagnostic laboratories reported in ref. 5 and the 13% rate of genotyping and nomenclature errors among 64 diagnostic laboratories participating in an international external quality assessment of sequence-based genetic testing4 could more closely approximate the error rate in peer-reviewed publications.
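The error rate and its confidence interval can be recomputed from the two counts given above. The letter does not state how the interval was obtained; the sketch below assumes an exact binomial (Clopper-Pearson) interval and uses only the Python standard library. The function names are illustrative.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log(1.0 - p))
        total += math.exp(log_term)
    return min(total, 1.0)


def _bisect_increasing(func, target, iters=100):
    """Find p in (0, 1) with func(p) == target, for func increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect_increasing(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2.0)
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect_increasing(
        lambda p: 1.0 - binom_cdf(k, n, p), 1.0 - alpha / 2.0)
    return lower, upper


errors, variants = 39, 771
lower, upper = clopper_pearson(errors, variants)
print(f"error rate {errors / variants:.2%}, 95% c.i. {lower:.2%} to {upper:.2%}")
```

The printed limits can be compared directly with the interval quoted in the text. A normal-approximation interval would be slightly narrower and symmetric about the point estimate, which is one reason an exact method is often preferred for small proportions.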

Figure 1: Errors were grouped into three categories: misassignment, miscounting and typographical.

The number of instances of each category at the nucleotide (light gray) and amino acid (dark gray) levels is shown.

Our experience of checking published variants, correcting their nomenclature and adjusting their correspondence to a common reference sequence provides some insight into the likely cost of completing the task for other LSDBs. At present, there is little scope for automating the initial steps required to standardize numbering and nomenclature. However, the later steps of checking for inconsistencies in the description of variants are being aided increasingly by mutation-checking and visualization software6, 8 (http://www.ebi.ac.uk/cgi-bin/mutations/check.cgi).

Much is known about the processes that contribute to human errors, as well as the proofreading and feedback processes required to identify mistakes9. Although the most effective way of reducing human error is to introduce automated processes that reduce or eliminate the need for human characterization and transcription of gene variants, software introduces new sources of error: for example, gene name errors introduced through the use of Microsoft Excel10.

In the interim, journal editors and LSDB curators should consider a range of strategies, including encouraging authors of genomic variant reports to recheck gene variant data before releasing it into the public domain. Sequence data may now be checked prospectively with mutation checking software.

We must point out that the scope of our analysis did not extend to scrutinizing the accuracy of clinical interpretation of the reported variants, which adds another dimension of risk to the clinical use of published gene variants. The consequences of being unable to access a report of a rare disease-causing variant may not, in practice, be a major issue because its rarity alone should prompt suspicions about its potential pathogenicity. However, failure to access a report of a rare polymorphic variant, perhaps found only in a specific population, would seem likely to increase the risk of wrongly concluding that the variant is associated with disease. This, in turn, could prompt inappropriate clinical decisions. By contrast, incorrect assignment of a novel pathogenic variant as non–disease associated may prompt unnecessary expenditure on additional molecular screening when the primary disease-causing lesion has already been characterized. Finally, erroneous reporting of a rare non–disease associated polymorphism as disease-causing could result in inappropriate clinical decisions. Potential ramifications of these scenarios are poignantly illustrated by a hypothetical case presented in ref. 11.

The error rate detected among these gene variants published in peer-reviewed journals shows that caution must be exercised—particularly by curators of LSDBs, genetic diagnostic laboratories, genetic counselors and other health care practitioners—when relying on published reports to evaluate the likely clinical significance of an unknown variant. The nature of the mistakes demonstrates that steps toward reducing human single-entry recording of sequence variants into peer-reviewed publications or LSDBs will enhance the accuracy and clinical utility of the Human Variome Project.

Note: Supplementary information is available on the Nature Genetics website.

Members of the ADPKD Gene Variant Consortium include the following: Peter C Harris6, Sandro Rossetti6, Dorien Peters7, Martijn Breuning7, Elizabeth Petri Henske8, Akio Koizumi9, Sumiko Inoue9, Yoshiko Shimizu10, Wanna Thongnoppakhun11, Pa-thai Yenchitsomanus11, Constantinos Deltas12, Richard Sandford13, Roser Torra14, Alberto E Turco15, Steve Jeffery16, Michel Fontes17, Stefan Somlo18, Laszlo M Furu18, Yvo M Smulders19, Bernard Mercier20, Claude Ferec20, Stéphane Burtey17, York Pei21, Luba Kalaydjieva22, Nadja Bogdanova23, Marie McCluskey24, Lee Jung Geon25, C H Wouters26, Jana Reiterova27, Jitka Stekrová27, Jose L San Millan28, Gianluca Aguiari29 & Laura Del Senno29

Acknowledgments

The members of the ADPKD Gene Variant Consortium have been critical in this collaborative effort of verifying the accuracy of the peer-reviewed PKD1 gene variants included in this retrospective audit. This work was supported by funding from the PKD Foundation. We are grateful to J. Crowhurst and H. Scott for their careful scrutiny of the manuscript. The encouragement of R. Cotton in the preparation of this report is appreciated.

Competing interests statement:

The authors declare no competing financial interests.

References

  1. Cotton, R.G. & Scriver, C.R. Hum. Mutat. 12, 1–3 (1998).
  2. Horaitis, O. & Cotton, R.G. Hum. Mutat. 23, 447–452 (2004).
  3. Maddalena, A., Bale, S., Das, S., Grody, W. & Richards, S. & the ACMG Laboratory Quality Assurance Committee. Genet. Med. 7, 571–583 (2005).
  4. Patton, S.J., Wallace, A.J. & Elles, R. Clin. Chem. 52, 728–736 (2006).
  5. Dequeker, E. & Cassiman, J.J. Nat. Genet. 25, 259–260 (2000).
  6. Gout, A.M., Martin, N.C., Brown, A.F. & Ravine, D. Hum. Mutat. (in the press).
  7. den Dunnen, J.T. & Antonarakis, S.E. Hum. Mutat. 15, 7–12 (2000).
  8. Rutherford, K. et al. Bioinformatics 16, 944–945 (2000).
  9. Pilotti, M., Chodorow, M. & Thornton, K.C. J. Gen. Psychol. 131, 242–266 (2004).
  10. Zeeberg, B.R. et al. BMC Bioinformatics 5, 80–85 (2004).
  11. den Dunnen, J.T. & Paalman, M.H. Hum. Mutat. 22, 181–182 (2003).
