Abstract
Information retrieval (IR) is the field of computer science that deals with the processing of documents containing free text, so that they can be rapidly retrieved based on keywords specified in a user's query. IR technology is the basis of Web-based search engines and plays a vital role in biomedical research, because it is the foundation of software that supports literature search. Documents can be indexed both by the words they contain and by the concepts that can be matched to domain-specific thesauri; concept matching, however, poses several practical difficulties that make it unsuitable for use by itself. This article provides an introduction to IR and summarizes various applications of IR and related technologies to genomics.
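As a rough illustration of word-based indexing and keyword retrieval, the following minimal sketch builds an inverted index (a map from each word to the documents that contain it) and answers a query by intersecting the document sets of its keywords. The function names, the simple tokenizer, and the three sample documents are assumptions made for this example and are not taken from the article; concept matching against a thesaurus is not shown.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every keyword in the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Start from the first keyword's documents, then intersect with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical toy collection, for illustration only.
docs = {
    1: "Gene expression profiling of tumor samples",
    2: "Literature search tools for gene annotation",
    3: "Search engines index free text documents",
}
index = build_index(docs)
matches = search(index, "gene search")
```

Because the index is built once and each lookup touches only the entries for the query words, retrieval avoids rescanning the full text of every document. A production system would typically add steps such as stemming and relevance ranking, which this sketch omits.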
Acknowledgements
The author thanks Cynthia Brandt, MD, and John Fisk, MD, of the Yale Center for Medical Informatics, and the anonymous reviewers for feedback on the article. The author is supported by grants U01 ES10867–02 from the National Institute of Environmental Health Sciences, R01 LM06843–02 from the National Library of Medicine and U01 CA78266–04 from the National Cancer Institute.
Cite this article
Nadkarni, P. An introduction to information retrieval: applications in genomics. Pharmacogenomics J 2, 96–102 (2002). https://doi.org/10.1038/sj.tpj.6500084
DOI: https://doi.org/10.1038/sj.tpj.6500084