Reinforcement learning with restrictions on the action set

Bravo, Mario; Faure, Mathieu

Article

Publication date

2015

Abstract

Consider a two-player normal-form game repeated over time. We introduce an adaptive learning procedure, where the players only observe their own realized payoff at each stage. We assume that agents do not know their own payoff function and have no information on the other player. Furthermore, we assume that they have restrictions on their own actions such that, at each stage, their choice is limited to a subset of their action set. We prove that the empirical distributions of play converge to the set of Nash equilibria for zero-sum and potential games, and games where one player has two actions.

General note

Article in an ISI-indexed publication

Sponsor

Fondecyt grant 3130732; Nucleo Milenio Informacion y Coordinacion en Redes (ICM/FIC P10-024F); Complex Engineering Systems Institute (ICM: P-05-004-F, CONICYT: FBO16)

Identifier

URI: https://repositorio.uchile.cl/handle/2250/132415
DOI: 10.1137/130936488

Quote Item

SIAM J. CONTROL OPTIM. Vol. 53, No. 1, pp. 287–312
