Slow and steady

To the Editor — For decades, particle colliders have exposed the fundamental building blocks of nature, most recently the Higgs boson, discovered at the Large Hadron Collider (LHC). In 2014, the Compact Muon Solenoid (CMS) experiment at the LHC took the unprecedented step of making a meaningful fraction of their data public. The CMS Open Data project (http://opendata.cern.ch/), now exceeding a petabyte of real and simulated collisions, has spawned several exploratory studies1,2,3,4, including our recent search for new particles5.

Why ‘unprecedented’? Collider datasets are huge and inherently complex. LHC proton collisions occur every 25 nanoseconds, and reconstructing the collision debris requires synthesizing information from hundreds of millions of readout channels. A filter (the ‘trigger’) discards all but the most interesting collisions, and accounting for its effects and those of the heterogeneous LHC detectors is challenging. The resources required to make such a complex dataset public and usable are substantial, but in short supply.

However, data from the LHC — whose successor is decades away — are priceless for future scientists and must be carefully archived, along with all necessary associated knowledge. As it is archived, the data should be made public, though not immediately. A delay of several years, enough for the experimenters who collected the data to perform thorough analyses, is appropriate; only those who spent years building the experiments have earned quick access. Furthermore, making LHC data ready for public use, with documentation and example code, requires significant funding and time.

But steady publication of LHC data has multiple benefits. First, it encourages prompt archiving, before collective memory fades and knowledge is lost. Second, other scientists can analyse the data while the LHC is still running, testing unconventional strategies and potentially leading to unexpected discoveries, new approaches and fruitful discussions. And third, as a by-product, these scientists can stress test the archiving methods; any deficiencies found are easier to fix now than later. In this way, public collider data can complement the overall LHC research effort. We, therefore, favour a slow but steady approach to full publication of the LHC experiments’ data; it is in the best interest of particle physics.

References

  1. Larkoski, A., Marzani, S., Thaler, J., Tripathee, A. & Xue, W. Phys. Rev. Lett. 119, 132003 (2017).

  2. Madrazo, C. F., Cacha, I. H., Iglesias, L. L. & de Lucas, J. M. Preprint at https://arxiv.org/abs/1708.07034 (2017).

  3. Andrews, M., Paulini, M., Gleyzer, S. & Poczos, B. Preprint at https://arxiv.org/abs/1807.11916 (2018).

  4. Lester, C. G. & Schott, M. Preprint at https://arxiv.org/abs/1904.11195 (2019).

  5. Cesarotti, C., Soreq, Y., Strassler, M. J., Thaler, J. & Xue, W. Phys. Rev. D 100, 015021 (2019).

Author information

Correspondence to Matthew Strassler or Jesse Thaler.

Cite this article

Strassler, M., Thaler, J. Slow and steady. Nat. Phys. 15, 725 (2019). https://doi.org/10.1038/s41567-019-0628-z
