Ranadip Pal and his colleagues had just won a $100,000 prize in a 2012 Dialogue on Reverse Engineering Assessment and Methods (DREAM) challenge run in collaboration with the National Cancer Institute (NCI). By combining transcriptomic data (which captures changes in the abundance of RNA transcripts) with DNA methylation data, they designed a tool that predicted drug sensitivity more successfully than the tools of nearly all their competitors; only one team outperformed them. Despite his team's win, Pal, a computational biologist at Texas Tech University in Lubbock, Texas, knew they could take their analytic tool even further.
Pal did so by combining the two data sets from the challenge with three other data types: protein expression data, DNA variation data and RNA sequencing data. The addition reduced the uncertainty in his prediction model by more than 40% compared with his group's winning model, which relied on only two types of data (PLoS One, 10.1371/journal.pone.0101183, 2014). “We always think that combining the data will help reduce the margin of error,” Pal says. “It was really good to see that our thinking was correct.”