What are some of the biggest obstacles in bringing big data into drug discovery?

Certainly we'll benefit from integrating large data sets, but it is imperative that this is not uncoupled from biological investigation. One of the challenges in pharma, at a time of increased externalization and partnering for research, is how to retain deep biological insight and connect that to the interrogation of these large data sets. One without the other is misguided.

As we move from the lab into the clinic, it is useful to study large, longitudinal clinical data sets. But again, interrogating those data without clinical insight is not very meaningful. The big prizes will go to those who connect the large clinical data sets with an abundance of preclinical data, and bring all that together. I don't think anybody has got that right yet.

In the academic world, the two new drugs to treat high cholesterol that target the protein PCSK9 are a great example of making the connection. Helen Hobbs of the University of Texas Southwestern Medical Center and her partners connected formidable genetics, biological understanding and chemical insight, leading to these important new medicines.

In the pharma world, Genentech seems to have made a long-term investment in biology and linked that to clinical data to treat the right person with the right drug — witness Herceptin for patients with HER2-positive breast cancer.

How are drug companies figuring out where to place their bets?

Given the attention deficit disorder and externalization of research in pharma, and the ever-increasing demands of venture capital and other financial markets in drug discovery, sometimes we are seeing elements of risk aversion and a herd effect. There must be 20 companies looking for the next PD-1 [a cancer immunotherapy target] right now.

But we are also seeing examples of pharma companies making very big, bold decisions — going after certain bespoke immune therapies, for example, without any hugely compelling evidence that this approach will work.

I think there is an argument for companies externalizing and partnering on research, so that they are not missing something at the bleeding edge of discovery. At the same time, those decisions of when and where you jump in, with enough confidence, are truly challenging.

What is a good example of a tough decision?

Look at microbiome research. Everyone is really excited about the microbiome. It is tantalizing that all these billions of bacteria in our gut, skin and everywhere else influence disease and how we respond to drugs. We have seen associations between particular bacterial flora and disease states, but when will we be comfortable enough to invest substantially in making medicines to alter someone's microbiome? I don't know the answer to that.

To date, most of the data sets that aim to define the diverse repertoire of microbes come from small numbers of individuals. Conducting evaluations in larger cohorts of subjects is daunting, and long-term clinical data on these cohorts have been limited.

Finally, we lack the robust tools that would let us modify the microbiome in a durable way and have a meaningful impact on the disease process.

Why does Sanford Burnham Prebys combine basic research with drug discovery?

Pharma often has a disconnect between the applied research in making medicines and the deep biological insight and ongoing experimentation that inform it. We have 80 people here from pharma who have spent their whole careers making medicines. Our principal investigators can work hand-in-hand with drug discoverers, and that enables us to pursue research and go after targets that nobody else will work on because they are too uncertain.

Building up the big-data element creates a unique situation in this work. For example, when we start thinking about autoimmunity (what goes wrong, when it happens and which T cells and B cells are involved), how do we start disentangling those complexities?

This is where big-data bioinformatics can be hugely useful. That to me underscores more than ever the need both to analyse large data sets and to stay connected to researchers who understand all the moving pieces with a view from a little higher up.

Is it difficult to find research staff with skills in gathering and analysing big data?

Yes, we struggle with this. Part of the challenge is training and funding individuals who are sophisticated enough to pursue that. They also tend to be gobbled up by potentially more lucrative fields outside the life sciences.

We have been searching hard to attract and recruit the next wave of systems biologists and other people who can analyse large amounts of data. You want to have a critical mass of people doing that together, and it's really tough. There are people who generate data and people who analyse data, but there are few who do both really well. I believe the winners are the ones who can pull it all together.