PhD holders with quantitative skills are landing posts at technology companies.
Eli Bressert planned to spend his academic career in search of forming stars. He had completed a PhD in astronomy at the University of Exeter, UK, and had won a prestigious postdoctoral fellowship to study radio astronomy near Sydney, Australia. Citations of his papers and invitations for collaborations and conference talks were on the rise. He had no reason to want to work outside astronomy.
But a year into his studies in 2012, the grim reality of the academic job market began to make him nervous. “I sat down and calculated my odds,” he recalls. “What was the chance of getting in at a good research institution in a place where my family would be happy?” He had already moved himself, his wife and their year-old son some 16,000 kilometres to Australia for his postdoc, and more transglobal moves for low pay and little stability did not appeal. Still, his research was going well, and he decided to carry on.
That same year, he and a colleague published a handbook on scientific programming, and he was recruited as an academic adviser to a start-up company that was creating software to help collaborators to co-author papers. Bressert loved the energy of the start-up and when he heard of a fellowship that groomed scientists for technology jobs in Silicon Valley, he applied — and was accepted.
He and his family moved again, this time 12,000 kilometres to Palo Alto, California. Today, he is head of data labs at Stitch Fix, a company in San Francisco, California, that creates predictive algorithms that help clients to choose clothes. He says that he loves his work evaluating computational methods in part because it offers more intellectual freedom and creativity than he had experienced in academia.
Bressert is hardly an anomaly — his company employs 20 PhD holders from disciplines as varied as astronomy, neuroscience and electrical engineering. Their biggest asset is rigorous thinking, says Eric Colson, Bressert's manager. PhD training means learning to formulate questions, test hypotheses and assess whether a solution is reliable. When it comes to modelling data, these qualities make PhD holders more sceptical than most, says Colson. “If it was perfect on the first try, a PhD's first response will be that it is too good to be true. PhDs have this patience and way of framing problems that MBAs don't have.” Stitch Fix's PhD holders are just a few of the many young scientists, mainly in the United States, who have left the academic quagmire for jobs in industrial data science.
Make the leap
Mathematicians and computer scientists are well represented in the data-science field, but computing savvy and communication skills matter more than scientific speciality. Early-career researchers hoping to make the transition need to show that they can extract patterns from messy data and place those patterns in the context of commercial goals.
“It's important to remember that industry doesn't value insights. They value analyses that are actionable,” says Michael Li, who is co-founder of The Data Incubator, a training course based in New York and Washington DC that prepares graduate students for jobs in data science. And academics skewer their chances by not knowing the ins and outs of industry, says Jake Klamka, who founded a similar training programme, Insight Data Science in Palo Alto. Otherwise qualified candidates can be dismissed as clueless for using the wrong word, such as the academic term 'study' instead of the industry argot 'experiment' or 'A/B test'.
Klamka found it hard to break into industry. He quit his PhD programme in particle physics at the University of Toronto, Canada, in 2010 and began developing tech tools in his kitchen. But although he had the expertise, he lacked knowledge of the industry. “I was 99.5% there in terms of skills,” he says. “What I needed was guidance and mentorship.” After a year of frustration, he headed to Silicon Valley, where he met software engineers and entrepreneurs who put him on the right track. And thanks in part to backing from the start-up incubator Y Combinator, based in Mountain View, California, he was able to launch his own company, Noteleaf.
Klamka knew that many of his friends in the physics community were interested in moving into industrial data science but were struggling, as he had, to break into industry. At the same time, his tech-community friends were complaining that they had open positions but no one smart enough to fill them. So Klamka founded Insight Data Science to provide PhD holders with the training they need for a career in industrial data science. So far, everyone who has completed the 7-week programme has received job offers (see 'Learn the ropes').
Data-scientist jobs vary widely. Some require mainly tedious 'data munging', cleaning data and filling in gaps to make data sets suitable for relatively simple analysis. Some data scientists work as consultants on data applications; others craft new models and methodologies. Large firms such as LinkedIn, Google and Facebook, with their huge user bases and data sets, tend to support the most sophisticated data modelling.
Would-be data scientists should think broadly about their interests and where they can do what interests them, says Glenn Wong, who has a PhD in physics and is now vice-president at Recorded Future in Somerville, Massachusetts, which organizes web data to help clients deflect cyberattacks. “I don't mean 'how this snippet of DNA interacts with that snippet of DNA',” clarifies Wong, “but 'I like solving problems of a complex two-dimensional nature'. Or 'I like being surrounded by people who have wacky ideas and don't care about hierarchy.'”
Amy Heineike took a leave of absence from her PhD programme in computational social science to join a tech start-up based in San Francisco, California, that helps to advise and evaluate early-stage entrepreneurs. “The reason I was doing a PhD was to solve interesting problems, but we were already doing that,” she says of her work at the firm. Several years out of academia, and now with stints at other start-ups under her belt, Heineike thinks that she has better opportunities to build ideas and implement them in industry because companies actually connect with the people who use the products.
But PhD graduates have to be comfortable with abandoning quests for ever-greater accuracy in favour of commercial goals. Once a data model is working, academics might focus on sophisticated tweaks to improve accuracy and account for outliers. “But in industry, you'd be saying, 'How do I build this into the software; how do I make sure that it won't crash?'” says Heineike. “You have to go the distance for what users really want, and that's something you don't necessarily have time for in academia.”
Some hiring managers worry that a desire to craft increasingly accurate models can lead academicians into an unproductive morass. John Baker, who founded a consultancy for data-science services called Datakin in Boston, Massachusetts, recalls an astrophysicist nicknamed 'Dark Matter' by his colleagues because his zeal for perfecting data models meant that he never completed his projects.
David Freeman, head of security data science at the networking firm LinkedIn in Mountain View, says that it is possible to weed out those with such tendencies during interviews. When asked to describe their accomplishments, the most promising candidates focus more on code they have implemented than on papers they have published. Portfolios developed independently or at boot camps are another good sign of an industry fit, says Baker. “You can tell who is really academic and who really has potential by their projects.”
Will Cukierski got noticed this way. He earned his PhD at Rutgers University in New Brunswick, New Jersey, where he taught computers to recognize telltale pathologies in cancerous tissues. But at night, he worked on a challenge from streaming-media provider Netflix: a US$1-million prize to anyone who could best its own movie-recommendation algorithms. He didn't win, but he caught the bug and started to spend his free time on similar contests hosted by the data-science company Kaggle, based in San Francisco. In 2012, company executives contacted him — they had noticed his entries and thought that he could earn a spot on their team. He started there as a data scientist a week after he defended his PhD.
For many PhD holders, the key to success is to find a company whose product or service fascinates them, says Sebastian Gutierrez, author of Data Scientists at Work. “You need someone who is excited enough about the business that they actually care that they need to meet quarterly budgets and goals.”
Posts for data scientists are starting to emerge in academia (see 'Academic data drive'), but many find the industry environment more appealing. “In industry I can use 20% of the time to achieve 80% of the goal, instead of vice versa,” says Shani Offen, formerly a research professor in neuroscience at New York University and now a data scientist at the question-answering site About.com, based in New York. Tommy Guy, a data scientist at the tech giant Microsoft in Bellevue, Washington, likes being rewarded for getting the right answer, no matter what it is. For instance, he can use data analysis to conclude that a proposed new feature would be unpopular with users and argue to dump it, saving the company a considerable sum and earning accolades. By contrast, he says, academia rarely rewards negative results.
Freeman likes the pace at LinkedIn. He recalls doing cutting-edge research in his postdoctoral work at Stanford University in California. “But the thing I was working on would not be seen in actual use for 20 years, if ever. I was looking for something with more immediate impact.” And there's nothing like constant deadlines to focus the mind.