Abstract
The launch of the US BRAIN and European Human Brain Projects coincides with growing international efforts toward transparency and increased access to publicly funded research in the neurosciences. The need for data-sharing standards and neuroinformatics infrastructure is more pressing than ever. However, 'big science' efforts are not the only drivers of data-sharing needs: neuroscientists across the full spectrum of research grapple with the overwhelming volume of data being generated daily and with a scientific environment that is increasingly focused on collaboration. In this commentary, we consider the sharing of the richly diverse and heterogeneous small data sets produced by individual neuroscientists, so-called long-tail data. We consider the utility of these data, the diversity of repositories and options available for sharing them, and emerging best practices. We provide use cases in which aggregating and mining diverse long-tail data converts numerous small data sources into big data for improved knowledge about neuroscience-related disorders.

Acknowledgements
We thank the NIF staff, especially B. Ozyurt for his text mining expertise and tools that contributed substantially to Supplementary Table 1. The Neuroscience Information Framework is supported by a contract from the NIH Neuroscience Blueprint HHSN271200800035C via the National Institute on Drug Abuse. VISION-SCI is supported by NIH grants NS067092 (A.R.F.) and NS079030 (J.L.N.), the Craig H. Neilsen Foundation (A.R.F.) and the Wings for Life Foundation (A.R.F.). This material is based on work supported while M.H.C. was serving at the National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of the National Science Foundation.
Ethics declarations
Competing interests
M.E. Martone is the principal investigator of the Neuroscience Information Framework. A.E. Bandrowski is the NIF Project Leader. A.R. Ferguson, J.L. Nielson and M.H. Cragin are not affiliated with NIF.
Supplementary information
Supplementary Table
A sample of Neuroscience-centered data repositories available to the community. (PDF 327 kb)
About this article
Cite this article
Ferguson, A., Nielson, J., Cragin, M. et al. Big data from small data: data-sharing in the 'long tail' of neuroscience. Nat Neurosci 17, 1442–1447 (2014). https://doi.org/10.1038/nn.3838
Further reading
- Constructing the rodent stereotaxic brain atlas: a survey. Science China Life Sciences (2022)
- Is Neuroscience FAIR? A Call for Collaborative Standardisation of Neuroscience Data. Neuroinformatics (2022)
- Machine intelligence identifies soluble TNFa as a therapeutic target for spinal cord injury. Scientific Reports (2021)
- Excavating FAIR Data: the Case of the Multicenter Animal Spinal Cord Injury Study (MASCIS), Blood Pressure, and Neuro-Recovery. Neuroinformatics (2021)
- The “Narratives” fMRI dataset for evaluating models of naturalistic language comprehension. Scientific Data (2021)