Learnability can be undecidable

An Author Correction to this article was published on 23 January 2019

Abstract

The mathematical foundations of machine learning play a key role in the development of the field. They improve our understanding and provide tools for designing new learning paradigms. The advantages of mathematics, however, sometimes come with a cost. Gödel and Cohen showed, in a nutshell, that not everything is provable. Here we show that machine learning shares this fate. We describe simple scenarios where learnability can be neither proved nor refuted using the standard axioms of mathematics. Our proof is based on the fact that the continuum hypothesis can be neither proved nor refuted. We show that, in some cases, a solution to the ‘estimating the maximum’ problem is equivalent to the continuum hypothesis. The main idea is to prove an equivalence between learnability and compression.
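For concreteness, the ‘estimating the maximum’ (EMX) setting named in the abstract can be sketched as follows. This is a paraphrase of the framework, not a quotation from the article; the symbols G, S, m, ε and δ are illustrative notation:

```latex
% Sketch of EMX learnability (illustrative notation, not quoted from the article).
% X is a domain, \mathcal{F} a family of subsets of X, and P a probability
% distribution over X. A learner G maps a finite i.i.d. sample S drawn from
% P^m to a hypothesis G(S) \in \mathcal{F}.
\mathcal{F}\ \text{is EMX-learnable if}\quad
\forall\, \varepsilon, \delta > 0\ \ \exists\, m \in \mathbb{N}:\quad
\Pr_{S \sim P^m}\!\Big[\, P\big(G(S)\big) \;\ge\; \sup_{F \in \mathcal{F}} P(F) \,-\, \varepsilon \,\Big] \;\ge\; 1 - \delta .
```

Taking X to be the unit interval and 𝓕 the family of its finite subsets gives the scenario in which, as the abstract states, learnability turns out to be equivalent to a statement about the continuum hypothesis, and hence can be neither proved nor refuted from the standard axioms of set theory.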

Data availability

The data that support the findings of this study are available from the corresponding author upon request.

Change history

  • 23 January 2019

    In the version of this Article originally published, the following text was missing from the Acknowledgements: ‘Part of the research was done while S.M. was at the Institute for Advanced Study in Princeton and was supported by NSF grant CCF-1412958.’ This has now been corrected.

Acknowledgements

The authors thank D. Chodounský, S. Hanneke, R. Honzík and R. Livni for useful discussions. The authors also acknowledge the Simons Institute for the Theory of Computing for support. A.S.’s research has received funding from the Israel Science Foundation (ISF grant no. 552/16) and from the Len Blavatnik and the Blavatnik Family Foundation. A.Y.’s research is supported by ISF grant 1162/15. Part of the research was done while S.M. was at the Institute for Advanced Study in Princeton and was supported by NSF grant CCF-1412958.

Author information

Corresponding author

Correspondence to Amir Yehudayoff.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ben-David, S., Hrubeš, P., Moran, S. et al. Learnability can be undecidable. Nat Mach Intell 1, 44–48 (2019). https://doi.org/10.1038/s42256-018-0002-3
