arxiv:2212.12567

Adapting to game trees in zero-sum imperfect information games

Published on Dec 23, 2022

Authors:

Michal Valko

Abstract

Imperfect information games (IIG) are games in which each player only partially observes the current game state. We study how to learn epsilon-optimal strategies in a zero-sum IIG through self-play with trajectory feedback. We give a problem-independent lower bound mathcal{O}(H(A_{X}+B_{Y})/epsilon^2) on the required number of realizations to learn these strategies with high probability, where H is the length of the game, A_{X} and B_{Y} are the total number of actions for the two players. We also propose two Follow the Regularized leader (FTRL) algorithms for this setting: Balanced FTRL which matches this lower bound, but requires the knowledge of the information set structure beforehand to define the regularization; and Adaptive FTRL which needs mathcal{O}(H^2(A_{X}+B_{Y})/epsilon^2) realizations without this requirement by progressively adapting the regularization to the observations.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2212.12567 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2212.12567 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2212.12567 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.