Spotlight on Bioinformatics

Journal name: NatureJobs
Year published: 2016
DOI: 10.1038/nj0478
This article was originally published in the journal Nature.

Biology goes digital

A new species of biologist is beginning to thrive in the niche created by recent genomic and computational advances.

" It's important to enjoy your job and be motivated: the best ideas come whilst I'm in bed, or walking my dog, or having a coffee with colleagues. "

Federico Abascal

There are two paths to careers in bioinformatics, both of which require learning a new language. Computer scientists must become fluent in the life science terminology of genetics, genomics and cellular biology. Biologists must pick up skills in data analysis, including statistics, logic and programming. When the field was developing, fledgling bioinformaticians often taught themselves. Now, more institutions are offering formal training, and the field is maturing rapidly.

The skill set needed by a bioinformatician continues to evolve. In the early days of the human genome project, it was sufficient for scientists to find homologous genes of one organism in the genome of another. Now, bioinformaticians routinely compare multiple genomes, analyse regions that don't code for proteins, and incorporate a host of proteomic information in their analysis. Both the type and amount of information continue to expand as biological techniques improve.

As a result, the proficiency bar in bioinformatics continues to rise, along with the demand for talented bioinformaticians (see new mobility: a case study). A few decades ago the ability to scour databases to find a single gene provided at least a plank in the platform for a career in bioinformatics. Now, that skill is a basic part of a molecular biologist's toolkit, as essential as fundamental wet lab techniques. In response, bioinformaticians need to keep improving their skill set. And to really make a mark, they need to develop new tools that others in the field consider valuable.

“The learning curve is both bigger and steeper now,” says George Asimenos, director of strategic projects at DNAnexus in Mountain View, California. DNA sequencing was relatively slow and expensive when he started graduate school 13 years ago. As speed has gone up and costs have come down, demands on bioinformaticians have grown. They need to be comfortable with mining much larger data sets, and looking for relationships between them.

Asimenos had to first get comfortable with the language of biology before he could dive into the data. He remembers hearing terms like “3 Prime” and “downstream” and thinking “What do these things even mean? How do I find out?”

He gave himself a crash course by reading textbooks, going to conferences and hanging out with biologists. “I had to overcome the vocabulary barrier,” he says. During his undergrad, Asimenos had taken courses in statistics, engineering and computer science. Acquiring those skills earlier gave him time during grad school to bone up on his biology, literally, with an anatomy class that included human dissection; and figuratively, with stints in molecular biology wet labs.

Still, he had some fraught moments. His advisor tapped him to lecture a course on algorithms for biology, where Asimenos explained genetics, genomics and biology to a class of computer scientists. That experience in the biological deep end helped him, because DNAnexus makes software for biologists. He needs to know the language of the company's clients, so he can help create software to meet their needs. Bridging the gap between biology and computer science remains one of his biggest challenges. “That vocabulary is really deeply rooted in every single discussion,” he says. But even more skills are necessary as the technology improves. Knowledge in machine learning and artificial intelligence might be needed for the next generation of bioinformaticians, Asimenos says.

After graduate school at Lund University, Sweden, Jean-Baptiste Cazier translated his knowledge of applied mathematics into fluency in statistical genetics analysis at deCode Genetics in Iceland. He focused on how statistics can be used to find areas in the human genome that contribute to increased risk of disease. Now, as the director of the Centre for Computational Biology, University of Birmingham, he has been tasked with teaching bioinformatics to scientists and clinicians at the UK's National Health Service (NHS). His centre is part of the country's 100,000 Genomes Project, which aims to sequence and understand the genomes of 100,000 patients, and it was the first of eight to start this educational programme in October last year.

Bioinformatics can often equate to some form of programming. Although many people are initially intimidated by the prospect of learning programming languages, Cazier comforts them. His first lesson is to impart confidence. “Researchers — biologists, clinicians, whatever — can learn mathematics and programming,” Cazier says. To get researchers comfortable, he uses data from a few patients to show how mutations are identified, and then asks them if the mutations are new and if the information is statistically reliable. “I am talking to their research brains, so it works,” he says.

Box 1: New mobility: A case study

Bioinformatics offers a two-way career street. Once, people trained in maths, statistics and computer science could apply their skills to biological data, thus broadening their job prospects, whilst biologists were stuck firmly within their discipline. Now, scientists trained in bioinformatics are finding they can begin to transfer their skills into disciplines outside the life sciences.

Sibon Li studied bioinformatics at the University of Auckland in New Zealand, and did postdocs at the University of Southern California and the University of California, Los Angeles. That training prepared him for a job at Google in 2014.

What prompted you to get into bioinformatics?

As a teenager, I was always interested in computers and knew that I wanted to do something with them. I would buy PC magazines, and play with text-based video games. At the same time, at school, there were two things that were fascinating to me — I didn't care about my biology classes until they taught me about evolution, which was really exciting. And the other was probability theory in statistics, where I enjoyed the problem solving.

At the end of high school, I had no idea what I wanted to do at university. I was flicking through a university prospectus and stumbled across the bioinformatics program they were offering for the first time at the University of Auckland. I had no idea what it was or the job opportunities in the field but it seemed like the perfect union between all of the things that I was interested in.

What's your research background?

In academia my research focus was in developing algorithms for understanding the variation in rate of molecular evolution. As part of my research, I worked on the BEAST project, which my PhD and postdoc advisors Alexei Drummond and Marc Suchard had developed at Oxford.

Currently, I work at Google as a software engineer on the Knowledge Graph team, focusing on natural language understanding. Our technology is used in Google Search, as well as a range of other products.

How did you interact with more biologically-minded people to solve problems?

Frankly, there was very little interaction between the groups I worked with and traditional biologists. Computational biologists know enough about the biology to find some problems to solve, but often fail to address the biologically relevant and interesting problems.

On the other hand, I feel biologists believe that they have all the tools necessary to get good results. Of course, this is inefficient in many cases. This is a general problem in the bioinformatics community — there needs to be more communication and collaboration across the spectrum.

How do you bridge the communication gap between computer scientists and biologists?

Attend general conferences rather than going to those that are specialised to your area of research. You get to interact with others outside of your field of work and the feedback can be valuable. In addition, other researchers may see the significance of your research towards their own, and might want to collaborate.

Also, try to interact with colleagues across departments. When I first started my graduate work, my desk was in the biology department. I made a lot of friends there and attended biology seminars. Often, I would see how my research would benefit others and assisted some of my peers in their computational work.

What advice do you have for biologists or computer scientists considering bioinformatics?

I think an understanding of bioinformatics and proficiency in computational analysis is essential towards being a biologist in this day and age. The impression that I get from biologists is that learning these sorts of techniques is difficult and outside of their comfort zone. In reality, it's fairly easy to understand and just requires a change in mindset.

For computer scientists, I would say that there are a ton of interesting problems inside biology that are worth solving. The problems in biology are no different to the traditional problems that computer scientists generally focus their efforts on, in the sense that they are complex, intangible and challenging. If anything, the benefit to the world is potentially much greater than many of these other fields.

Finally, what does a degree in bioinformatics get you?

A degree in bioinformatics provides you with a diverse skill set that opens doors for a range of career options. I myself transitioned to working at Google on pure computer science problems such as natural language understanding and developing infrastructure. Peers from my program at university have found work in areas like biostatistics, pure statistics and biology. Outside of academia, there are plenty of careers for bioinformatics graduates in companies doing things like software and biotech.

That breaks the ice for basic programming — especially when he demonstrates how code can help them ask and answer scientific questions. He has been surprised at the response. “I was quite worried about the teaching course, but they embraced it and asked for more,” says Cazier.

Basic bioinformatics skills can empower biologists to make use of their own data: after all, they have the best understanding of biological processes. However, because the field is advancing so quickly, they need to keep in touch with the “hardcore” bioinformaticians to have any hope of keeping abreast of the latest developments.

Vicky Schneider, associate professor and deputy director of the EMBL Australia Bioinformatics Resource, at the University of Melbourne, Australia, says one way each side can learn the language of the other is by having more conversations. “You have to have a minimal common vocabulary,” Schneider says.

More dialogue between users and developers results in better tools, she says. For instance, a computer science-trained developer might create a powerful tool. But if someone with a biology background doesn't know how to use it, that tool is useless. “The two sides need to work together to develop user interfaces,” she says.

There are formal efforts in place to achieve this. For instance, the Global Organisation for Bioinformatics Learning, Education & Training (GOBLET) helps each side learn the other's science. But even that can only go so far. It is unrealistic for specialists from each side of the field to be completely fluent in the other's field. “Each side has to understand their own limitations,” Schneider says.

David Martin, a senior lecturer in bioinformatics at the University of Dundee, agrees with Cazier that biologists need more familiarity with bioinformatics. “The core skills for modern data-rich biology are not always there,” he says. If he could, he would teach every biology grad student enough skills so that they could do some basic programming, read data into a file, then be able to manipulate and process it: “not enough to be a computer scientist, but enough to have the tools to work with the data,” he says (see skills spectrum). But money and time are always a problem. “These skills take time to develop and craft, much like lab skills take time to develop and craft,” Martin says.

However, it can be done, if one is willing to put in the work, says Joseph Mullen, a PhD student at Newcastle University in the United Kingdom. After an undergraduate degree in biology and with little computational experience, he decided that the career opportunities bioinformatics would open up would be worth the effort. “It was an incredibly steep learning curve,” Mullen says. “I jumped into it with everything I had.” Indeed, between coursework, working three part-time jobs to fund his education and putting in the hours to learn multiple programming languages, he had time for little else. Mullen estimates he averaged about five hours of sleep during his MSc year.

The sacrifice paid off, though. The Engineering and Physical Sciences Research Council (EPSRC) and GlaxoSmithKline are funding his PhD research. In return, Mullen contributes to drug discovery work for the company. He has already been offered a government-funded joint postdoc position with Newcastle and Prozomix — a biotech company based in Northumberland, United Kingdom — even though he hasn't yet written up his dissertation.

Federico Abascal was similarly computationally illiterate when he completed his undergraduate degree in 1998. Now he works as a bioinformatician at the Wellcome Trust Sanger Institute in the United Kingdom — one of the world's most renowned bioinformatics hubs. When he finished his undergrad work, he remained interested in biology, but knew he didn't want to perform experiments.

He took a course in the programming language C, then went on to graduate school at the Spanish National Biotechnology Centre in Madrid. His drive to solve problems in evolutionary biology led him to learn more programming languages. “Once you know one language, it is easier and easier to learn others.”

He advises would-be bioinformaticians to get out of the lab as much as they can to avoid losing perspective. “In my case, the best ideas never come in front of a computer,” Abascal says. “It's important to enjoy your job and be motivated: the best ideas come whilst I'm in bed, or walking my dog, or having a coffee with colleagues.”

The next generation of bioinformaticians may well find the lab and the computer indistinguishable. Training in both fields, as early as undergrad education, could become the norm, says Atul Butte, director of the University of California, San Francisco's Institute for Computational Health Sciences.

Butte is a pioneer in training for bioinformatics. In high school, he was fascinated by National Geographic covers displaying MRI and CT scan images. He thought combining computers and medicine would prepare him for a career in radiology.

He pursued that career by enrolling in an eight-year programme at Brown University, studying medicine and computer science. Towards the end of his studies, gene expression microarray chips were invented, the human genome project took off, and the era of big data in biology was born. He emerged with a skill set that spanned both worlds.

He may have been the exception then, but Butte sees dual training as the new norm. “More and more people come up with both,” he says.

In fact, it is harder now to get into top bioinformatics graduate programs conversant in only biology, or only computer science. “You have to demonstrate you know more than a little of both,” Butte says.

But knowing enough to use the software may not be enough to excel, Butte says. “The point of being in this field is to develop new tools, new methods — you have to innovate. You have to write new code.” Focusing too much on one technique or one problem could be career limiting, he says.

He advises constant learning — the amount of data keeps growing and the nature of it keeps changing. “Treat the field with respect,” he says. “If you want to thrive in biomedical informatics it can't be a casual thing. You have to be here to stay.”
