Auction theory is of central importance in the study of markets. Unfortunately, equilibrium bidding strategies are unknown for most auction games. For realistic markets with multiple items and value interdependencies, the Bayes–Nash equilibria (BNEs) often turn out to be intractable systems of partial differential equations. Previous numerical techniques have relied either on calculating pointwise best responses in strategy space or on iteratively solving restricted subgames. We present a learning method that represents strategies as neural networks and applies policy iteration based on gradient dynamics in self-play to provably learn local equilibria. Our empirical results show that these approximated BNEs coincide with the global equilibria whenever the latter are known. The method follows the simultaneous gradient of the game and uses a smoothing technique to circumvent discontinuities in the ex post utility functions of auction games. Such discontinuities arise at the bid value where an infinitesimally small change makes the difference between winning and losing. Convergence to local BNEs can be explained by the fact that bidders in most auction models are symmetric, which leads to potential games for which gradient dynamics converge.
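To make the approach concrete, the following is a minimal, purely illustrative sketch of the core idea in a setting where the analytical BNE is known: a symmetric first-price sealed-bid auction with two bidders and independent uniform values. The paper's method (NPGA) estimates gradients with evolution strategies; as a labeled simplification, this sketch instead smooths the discontinuous allocation with a sigmoid so that plain backpropagation yields the simultaneous gradient. All network sizes, the temperature parameter and other hyperparameters are illustrative assumptions, not the paper's settings.

```python
import torch

torch.manual_seed(0)

# Symmetric first-price sealed-bid auction, two bidders, values ~ U[0, 1].
# Both bidders share one bidding strategy b(v), a small feed-forward network.
strategy = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.SELU(),
    torch.nn.Linear(16, 1), torch.nn.Softplus(),  # keeps bids nonnegative
)
optimizer = torch.optim.Adam(strategy.parameters(), lr=1e-3)
temperature = 0.02  # scale of the smoothed allocation rule (assumed value)

for _ in range(3000):
    values = torch.rand(2048, 2, 1)      # batch x bidders x 1
    bids = strategy(values)              # self-play: shared strategy network
    # Smoothed allocation: a sigmoid of the bid difference replaces the
    # discontinuous argmax, so the win probability is differentiable.
    # Detaching the opponent's bid gives the *simultaneous* gradient, i.e.
    # each bidder ascends only its own payoff.
    p_win = torch.sigmoid((bids[:, 0] - bids[:, 1].detach()) / temperature)
    utility = p_win * (values[:, 0] - bids[:, 0])   # first-price payoff
    loss = -utility.mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The analytical BNE in this setting is b(v) = v / 2, against which the
# learned strategy can be compared.
test_values = torch.linspace(0.05, 0.95, 19).unsqueeze(-1)
with torch.no_grad():
    learned_bids = strategy(test_values)
```

In our understanding of the gradient dynamics, the learned strategy exhibits the characteristic bid shading of the first-price BNE; how closely it tracks b(v) = v/2 depends on the smoothing temperature and training budget.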
All data analyses in this study are based exclusively on data generated by our custom simulation framework (see Code Availability). Raw simulation artefacts (all-iteration logs and trained models) will be made available by the corresponding author on request. Source data are provided with this paper.
We are grateful for funding by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation; grant no. BI 1057/1-9). We thank V. Bosshard, B. Lubin, P. Mertikopoulos, P. Milgrom, S. Seuken, T. Ui, F. Maldonado and participants of the NBER Market Design Workshop 2020 for valuable feedback on earlier versions.
The authors declare no competing interests.
Peer review information Nature Machine Intelligence thanks Pierre Baldi, Neil Newman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Description and discussion of additional experiments (including Supplementary Figs. 1–4 and Tables 1–3); proof of auxiliary lemmata used in the proof of Proposition 1 of the main manuscript; mathematical derivation of conditional distributions required for evaluation in the absence of equilibria.
Last-iteration loss measurements of approximate equilibria found by NPGA for 10 runs each in LLG settings.
Aggregate last-iteration efficiency measurements (mean and s.d. over 10 runs each) of learned NPGA policies in NVCG LLG for several risk parameters.
Last-iteration seller revenue measurements of learned NPGA policies in NVCG LLG for several risk and correlation parameters.
Cite this article
Bichler, M., Fichtl, M., Heidekrüger, S. et al. Learning equilibria in symmetric auction games using artificial neural networks. Nat Mach Intell 3, 687–695 (2021). https://doi.org/10.1038/s42256-021-00365-4