No approach has yet been developed that allows instant searching of the World-Wide-Web by simply entering a string of sequence data. Though general search engines can be tuned to accept ‘processed’ queries, the burden of preparing such ‘search strings’ defeats the purpose of quickly locating highly relevant information. Unlike ‘sequence similarity’ searches, which employ dedicated algorithms (like BLAST) to compare an input sequence against defined databases, a direct ‘sequence based’ search simply locates relevant information about a raw piece of nucleotide or peptide sequence. This approach is particularly valuable to biomedical researchers, who would often like to enter a sequence and quickly locate any pertinent information before proceeding to carry out detailed sequence alignment.
Here, we describe the theory and implementation of a web-based front-end for a search engine such as Google, which accepts sequence fragments and interactively retrieves, in real time, a collection of highly relevant links and documents, e.g. flat files such as patent records, privately hosted sequence documents and regular databases.
The importance of this simple tool becomes more evident when one considers that, with a little tweaking, it can be engineered to carry out searches on all kinds of documents hosted on the World-Wide-Web.

*Availability:* Instaseq is a free web-based service that can be accessed at http://instaseq.georgetown.edu
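The idea of a front-end that turns a raw sequence fragment into a search-engine query can be illustrated with a minimal sketch. It is not the Instaseq implementation, whose internals are not described here. The function names, the 30-character `window`, the use of an exact-phrase (quoted) query and the Google search URL with a `q` parameter are all assumptions made for illustration.

```python
import re
from urllib.parse import urlencode

# Assumed alphabets: common IUPAC-style letters, chosen for illustration only.
NUCLEOTIDE_LETTERS = set("ACGTUN")
PEPTIDE_LETTERS = set("ACDEFGHIKLMNPQRSTVWYBXZ")


def normalize_sequence(raw: str) -> str:
    """Drop FASTA header lines, then remove digits, whitespace and
    punctuation, returning an uppercase string of letters only."""
    lines = [ln for ln in raw.splitlines() if not ln.startswith(">")]
    return re.sub(r"[^A-Za-z]", "", "".join(lines)).upper()


def classify(seq: str) -> str:
    """Make a rough guess at the sequence type from its letters."""
    if not seq:
        raise ValueError("empty sequence")
    letters = set(seq)
    if letters <= NUCLEOTIDE_LETTERS:
        return "nucleotide"
    if letters <= PEPTIDE_LETTERS:
        return "peptide"
    raise ValueError("input does not look like a nucleotide or peptide sequence")


def build_query_url(raw: str,
                    base_url: str = "https://www.google.com/search",
                    window: int = 30) -> str:
    """Build a search URL asking for an exact match of the first
    `window` characters of the cleaned sequence (hypothetical strategy)."""
    seq = normalize_sequence(raw)
    classify(seq)  # raises ValueError if the input is not sequence-like
    fragment = seq[:window]
    return base_url + "?" + urlencode({"q": f'"{fragment}"'})


if __name__ == "__main__":
    print(build_query_url(">example\natg gcc\n101 ttt"))
```

Quoting the fragment asks the search engine for an exact phrase match, which is the simplest way to treat a sequence as a search string. How well this works depends on how hosted documents break sequences into blocks and lines, so a real front-end would likely need to try several fragment lengths and formats.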
Ganesan, N., Bennett, N., Kalyanasundaram, B. et al. Searching the World-Wide-Web using nucleotide and peptide sequences. Nat Prec (2008). https://doi.org/10.1038/npre.2008.2492.1