Box 1. In-silico prediction methods available online
From the following article
Bioinformatics prediction of HIV coreceptor usage
Thomas Lengauer, Oliver Sander, Saleta Sierra, Alexander Thielen & Rolf Kaiser
Nature Biotechnology 25, 1407 - 1410 (2007)
doi:10.1038/nbt1371
Although several bioinformatics methods for predicting coreceptor usage have been proposed over the years, only three are available as online tools: WetCat, WebPSSM and, from our laboratory, geno2pheno[coreceptor] (see Table 1). At this point, all three systems accept only the V3-loop as viral sequence input. The servers differ not only in the methods on which the prediction is based, but also in the data sets on which they were trained and in the way the input has to be supplied.
Some differences reflect the release dates of the respective software: the more recent the system, the more steps are automated. WetCat, the oldest system, requires data in a restricted input format, including an alignment of the V3-loop(s) to a specific consensus sequence. In contrast, WebPSSM accepts unaligned sequences and builds the alignment itself; it also accepts sequence fragments that extend beyond the V3-loop. The third system, geno2pheno[coreceptor], automatically detects and aligns the V3-loop within a given sequence. WebPSSM and geno2pheno[coreceptor] have been trained on much larger data sets than WetCat.
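To make the idea of locating the V3-loop inside a longer input fragment concrete, here is a minimal sketch. It is not the procedure used by any of the three servers: it slides an ungapped window along the input and keeps the window with the most identities to a reference, whereas real tools would be expected to use a gapped alignment. The function name `find_v3_like_window` is hypothetical, and `REFERENCE_V3` is an illustrative 35-residue string assumed for this example; substitute a curated consensus before drawing any conclusions.

```python
# Illustrative reference only (assumption): a 35-residue V3-like sequence.
# Replace it with a curated consensus for any real analysis.
REFERENCE_V3 = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"


def find_v3_like_window(sequence, reference=REFERENCE_V3):
    """Return (start, segment, identity) for the best ungapped match.

    Slides a window of len(reference) along `sequence` and keeps the
    first window with the highest number of identical positions.
    Insertions and deletions are NOT handled by this sketch.
    """
    seq = sequence.upper()
    width = len(reference)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the reference")

    best_start, best_matches = 0, -1
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        matches = sum(a == b for a, b in zip(window, reference))
        if matches > best_matches:
            best_start, best_matches = start, matches

    segment = seq[best_start:best_start + width]
    return best_start, segment, best_matches / width
```

Called on a fragment with flanking residues, the function returns the offset of the best window, the extracted segment and its fraction of identical positions, which a caller could use to reject inputs that do not contain a recognizable V3-loop.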
The servers also differ in their output. WetCat classifies viral variants as either X4 or non-X4, whereas WebPSSM displays a quantitative value that estimates how likely it is that a virus uses CXCR4. Similarly, geno2pheno[coreceptor] lets the user select a level of specificity that defines how conservative a prediction should be.
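The link between a chosen specificity and a conservative prediction can be illustrated with a small sketch. It assumes a generic classifier that assigns each virus a numeric score (higher meaning more X4-like) and a set of scores from viruses known not to use CXCR4; the function names and the scores in the usage example are hypothetical and do not reproduce the internals of any of the servers.

```python
import math


def cutoff_for_specificity(non_x4_scores, specificity):
    """Smallest cutoff at which at least `specificity` of the known
    non-X4 viruses score at or below the cutoff (and are therefore
    not called X4)."""
    if not non_x4_scores:
        raise ValueError("need at least one non-X4 score")
    if not 0 < specificity <= 1:
        raise ValueError("specificity must be in (0, 1]")
    ranked = sorted(non_x4_scores)
    # Number of non-X4 viruses that must be classified correctly.
    needed = math.ceil(specificity * len(ranked))
    return ranked[needed - 1]


def predict_x4(score, cutoff):
    """Call a virus X4 only if its score exceeds the cutoff."""
    return score > cutoff


# Usage with made-up scores: a higher specificity raises the cutoff,
# so fewer viruses are called X4 (a more conservative prediction).
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
lenient = cutoff_for_specificity(scores, 0.75)
strict = cutoff_for_specificity(scores, 1.0)
```

Raising the specificity reduces the number of non-X4 viruses wrongly called X4, at the price of missing more true X4 viruses; which trade-off is appropriate depends on the clinical question.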
In addition to predictions based solely on the sequence of the V3-loop, the recently updated version of geno2pheno[coreceptor] enables the user to supply certain additional clinical markers, such as CD4+ T-cell counts. The prediction models incorporating this information have been trained on about 1,000 samples from therapy-naive patients (ref. 24).
