Abstract
High-resolution genetic analysis of the human genome promises to provide insight into common disease susceptibility. Such analysis will require a collection of high-throughput, high-density analysis reagents. We have developed a polymorphism detection system that uses public-domain sequence data, called the single nucleotide polymorphism pipeline (SNPpipeline). The analytic core of the SNPpipeline is composed of three components: PHRED, PHRAP and DEMIGLACE. PHRED and PHRAP are components of a sequence analysis suite developed to perform the semi-automated analysis required for large-scale genomes (refs 1,2; provided courtesy of P. Green). Using these informatics tools, which examine redundant raw expressed sequence tag (EST) data, we have identified more than 3,000 candidate single-nucleotide polymorphisms (SNPs). Empirical validation studies of a set of 192 candidates indicate that 82% identify variation in a sample of ten Centre d'Etude du Polymorphisme Humain (CEPH) individuals. Our results suggest that existing sequence resources may serve as a valuable source for identifying genetic variation.
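The abstract does not describe how DEMIGLACE selects candidates, so the following is only an illustrative sketch of the general idea of mining redundant, quality-scored reads for variation. It assumes the reads have already been base-called with quality values and aligned. The function names, the quality threshold of 20 and the requirement of two supporting reads per allele are hypothetical choices made for this example, not parameters of the SNPpipeline. The one grounded fact is the Phred scale itself, where a quality Q corresponds to an error probability of 10^(-Q/10).

```python
from collections import Counter


def phred_to_error_prob(q):
    """Convert a Phred quality score to an error probability: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def candidate_snp_columns(reads, quals, min_qual=20, min_allele_count=2):
    """Flag alignment columns where two or more alleles are each supported
    by at least `min_allele_count` high-quality base calls.

    reads: equal-length aligned strings; '-' marks a gap.
    quals: per-base Phred scores, one list per read, parallel to `reads`.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    candidates = []
    for pos in range(len(reads[0])):
        counts = Counter()
        for read, qual in zip(reads, quals):
            base = read[pos]
            # Ignore gaps, ambiguous calls and low-quality bases.
            if base in "ACGT" and qual[pos] >= min_qual:
                counts[base] += 1
        supported = {b: n for b, n in counts.items() if n >= min_allele_count}
        if len(supported) >= 2:
            candidates.append((pos, supported))
    return candidates


if __name__ == "__main__":
    reads = ["ACGT", "ACGT", "ATGT", "ATGT"]
    quals = [[30, 30, 30, 30] for _ in reads]
    print(candidate_snp_columns(reads, quals))
```

Requiring each allele to appear in more than one high-quality read is one simple way to separate likely polymorphisms from isolated sequencing errors; a real pipeline would need additional filters, for example for paralogous sequences.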
References
Ewing, B., Hillier, L., Wendl, M.C. & Green, P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8, 175–185 (1998).
Ewing, B. & Green, P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8, 186–194 (1998).
Schuler, G.D. Pieces of the puzzle: expressed sequence tags and the catalog of human genes. J. Mol. Med. 75, 694–698 (1997).
Hillier, L. et al. Generation and analysis of 280,000 human expressed sequence tags. Genome Res. 6, 807–828 (1996).
Wang, D.G. et al. Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome. Science 280, 1077–1082 (1998).
Murray, J.C. et al. A comprehensive human linkage map with centimorgan density. Cooperative Human Linkage Center (CHLC). Science 265, 2049–2054 (1994).
Jin, L. & Nei, M. Limitations of the evolutionary parsimony method of phylogenetic analysis. Mol. Biol. Evol. 7, 82–102 (1990).
Sokal, R.R. & Sneath, P.H.A. Principles of Numerical Taxonomy (W.H. Freeman, San Francisco, 1963).
Cite this article
Buetow, K., Edmonson, M. & Cassidy, A. Reliable identification of large numbers of candidate SNPs from public EST data. Nat Genet 21, 323–325 (1999). https://doi.org/10.1038/6851
DOI: https://doi.org/10.1038/6851
This article is cited by
- Role and Present Status of Biotechnology in Augmenting Poultry Productivity in India. Proceedings of the National Academy of Sciences, India Section B: Biological Sciences (2014)
- Identification of candidate genes involved in the biosynthesis of carotenoids in Brassica rapa. Horticulture, Environment, and Biotechnology (2014)
- Mining of gene-based SNPs from publicly available ESTs and their conversion to cost-effective genotyping assay in sorghum [Sorghum bicolor (L.) Moench]. Journal of Crop Science and Biotechnology (2014)
- Genomic profile of the plants with pharmaceutical value. 3 Biotech (2014)
- Identification of single nucleotide polymorphisms from the transcriptome of an organism with a whole genome duplication. BMC Bioinformatics (2013)