It is an exciting time to work in robotics. There are plenty of interesting challenges in designing machines that intelligently interact with both humans and their environment, and a range of techniques and insights from engineering, computer science, physics, biomechanics, psychology and other fields are available to help solve them. The International Conference on Robotics and Automation, an annual event organized by the IEEE, is a lively affair: over 4,000 participants gathered in Montreal last month for the 2019 instalment of the event to showcase their inventions, ideas and even artwork. The prominence given to several impressive art installations at the conference can perhaps be taken as a sign that the community feels confident as a multidisciplinary endeavour.

However, something is missing in robotics. Most experts are aware of it but aren’t sure what to do about it: reproducibility. Frequently, papers report a robot system that accomplishes a certain task with proof-of-concept demonstrations. Given the engineering challenges involved, a working demonstration is usually a noteworthy achievement. But how should such results be evaluated and compared with related work in a meaningful way?

One popular approach to developing evaluation methods in robotics research is to organize competitions, where robot performance can be directly compared against benchmarks in controlled environments. For decades, competitions have played an important role not just in providing useful benchmarks, but also in finding new directions, building a community, and providing an opportunity for scientists to acquire new skills and, especially, for young scientists to demonstrate their talents. To highlight the benefits of competitions for individuals as well as for whole areas in robotics and AI, Nature Machine Intelligence has started a series of articles called Challenge Accepted, written by organizers and participants. See, for instance, the article ‘Picking the right robotics challenge’, written by the winner of the 2017 Amazon Robotics Challenge [1].

However, competitions are — after all — competitions, and they cannot replace rigorous experimental research.

For robotics to emerge as a scientific discipline, standards in reporting need to be adopted that focus on reproducibility and replicability. A special issue of IEEE Robotics & Automation Magazine from 2015 on ‘replicable and measurable robotics research’ argued that a new type of robotics paper is required, namely one that includes a clear description of the methods and datasets, as well as code and hardware identifiers [2]. Two years later, to give authors an incentive to write and get involved in reproducibility-focussed papers, the editors announced three new article formats. First, authors can submit their work as a special ‘R-article’, where R stands for reproducibility [3]. An R-article provides all the information, methods, data and code that others need to replicate the work. This is also an important principle for Nature Research journal articles, but it presents a particularly interesting challenge for robotics. The second new format is an ‘r-article’, in which other groups report on their experiences with replicating the work. Finally, the authors of the original R-article can submit a reply.

This is an exciting initiative, but reproducibility is difficult. Two years on, the first R-article has yet to be published. However, this is expected to happen soon, and the hope is that the community will become accustomed to the practice, which will in the long run make reporting, review and replication processes smoother.

Another aspect of transforming robotics into a scientific field is to define what it is that roboticists actually study, a point made eloquently by Signe Redfield in a Comment in this issue. So far, the field has focused too much on physical robot systems, an attitude that dates from the start of ‘nouvelle AI’, a term coined by Rodney Brooks in the early 1990s. Until then, the field of artificial intelligence had been dominated by symbolic approaches focused on the goal of giving artificially intelligent systems an internal model of reality. However, Brooks and other experts pointed out that true intelligence involves functioning in the real world, and not necessarily at human level.

Signe Redfield discusses how this vision, in combination with pressure from funding sources to come up with real-world solutions, has led to a push for proof-of-concept physical implementations at the expense of theoretical research. However, it is now time for a new definition of robotics, one that makes it possible to develop scientific methods for evaluating results. Redfield proposes a definition of robotics that focuses on the engineering and evaluation of embodied artificial capabilities, rather than on the physical system itself. This would enable specialization in either the theory or the experimental realization of capabilities, which in turn would benefit objective and robust evaluation and comparison.

It is an exciting prospect that robotics can start growing as a scientific discipline, with clearly defined methods of evaluation and measurements in place.