Learnability can be undecidable

Abstract

The mathematical foundations of machine learning play a key role in the development of the field. They improve our understanding and provide tools for designing new learning paradigms. The advantages of mathematics, however, sometimes come with a cost. Gödel and Cohen showed, in a nutshell, that not everything is provable. Here we show that machine learning shares this fate. We describe simple scenarios where learnability can be neither proved nor refuted using the standard axioms of mathematics. Our proof is based on the fact that the continuum hypothesis can be neither proved nor refuted. We show that, in some cases, a solution to the ‘estimating the maximum’ problem is equivalent to the continuum hypothesis. The main idea is to prove an equivalence between learnability and compression.

Data availability

The data that support the findings of this study are available from the corresponding author upon request.

Change history

  • 23 January 2019

    In the version of this Article originally published, the following text was missing from the Acknowledgements: ‘Part of the research was done while S.M. was at the Institute for Advanced Study in Princeton and was supported by NSF grant CCF-1412958.’ This has now been corrected.

Acknowledgements

The authors thank D. Chodounský, S. Hanneke, R. Honzík and R. Livni for useful discussions. The authors also acknowledge the Simons Institute for the Theory of Computing for support. A.S.’s research has received funding from the Israel Science Foundation (ISF grant no. 552/16) and from the Len Blavatnik and the Blavatnik Family foundation. A.Y.’s research is supported by ISF grant 1162/15. Part of the research was done while S.M. was at the Institute for Advanced Study in Princeton and was supported by NSF grant CCF-1412958.

Author information

Corresponding author

Correspondence to Amir Yehudayoff.

Ethics declarations

Competing interests

The authors declare no competing interests.

About this article

Cite this article

Ben-David, S., Hrubeš, P., Moran, S. et al. Learnability can be undecidable. Nat Mach Intell 1, 44–48 (2019). https://doi.org/10.1038/s42256-018-0002-3
