A good dataset is a great citation and a great way to build your profile as a researcher, Julie Gould finds.

As a biomedical science student, Jake Schofield felt frustrated at the length of time it took to repeat experiments, record results and manage protocols, with most of the work paper-based.

In 2016 he and Jan Domanski, a biochemist with programming skills, launched Labstep, an online platform to help scientists record and reproduce experiments.

Schofield, now Labstep's CEO, tells Julie Gould how launching a start-up and seeking investor funding has honed his business skills.

"Every step we've taken has been a huge learning experience," he says. "I wish there were more opportunities for scientists to try entrepreneurial pursuits. Scientific analytical problem-based thinking has so many parallels in the start-up world."

Brian MacNamee, a computer scientist at University College Dublin, outlines the high value of data and its potential to solve science's reproducibility crisis, citing large sky-scanning telescope projects as an example.

"These projects are generating colossal amounts of data scanning large portions of the sky and that data needs to be categorised," he says. "Astrophysicists want to go to large data collections and look for the bits they are interested in. It's impossible to do that by hand. You need to put machine learning systems into those pipelines to categorise and compare data.

"Other researchers are not reading a paper and trying to figure out where the gremlins are inside a data set. They can open the dataset up and find it themselves."