Accelerating decentralized reinforcement learning of complex individual behaviors
Author
dc.contributor.author
Leottau, David L.
Author
dc.contributor.author
Lobos Tsunekawa, Kenzo
Author
dc.contributor.author
Jaramillo, Francisco
Author
dc.contributor.author
Ruiz del Solar, Javier
Accession date
dc.date.accessioned
2019-10-30T15:29:58Z
Available date
dc.date.available
2019-10-30T15:29:58Z
Publication date
dc.date.issued
2019
Item citation
dc.identifier.citation
Engineering Applications of Artificial Intelligence 85 (2019) 243–253
Identifier
dc.identifier.issn
0952-1976
Identifier
dc.identifier.other
10.1016/j.engappai.2019.06.019
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/172445
Abstract
dc.description.abstract
Many real-world Reinforcement Learning (RL) applications have multi-dimensional action spaces that suffer from a combinatorial explosion of complexity. As a result, it may become infeasible to implement Centralized RL (CRL) systems, due to the exponential growth of dimensionality in both the state space and the action space, and the large number of training trials required. To address these issues, this paper proposes using Decentralized Reinforcement Learning (DRL) to alleviate the effects of the curse of dimensionality on the action space, and transferring knowledge to reduce the number of training episodes so that asymptotic convergence can be achieved. Three DRL schemes are compared: DRL with independent learners and no prior coordination (DRL-Ind); DRL accelerated and coordinated by the Control Sharing knowledge-transfer approach (DRL+CoSh); and a proposed DRL scheme using Nearby Action Sharing, a CoSh-based variant that incorporates a measure of uncertainty into the CoSh procedure (DRL+NeASh). These three schemes are analyzed through an extensive experimental study and validated on two complex real-world problems, namely the in-walk kicking and ball-dribbling behaviors, both performed with biped humanoid robots. The obtained results show empirically: (i) the effectiveness of DRL systems, which are able to achieve asymptotic convergence through indirect coordination even without prior coordination; (ii) that by using the proposed knowledge-transfer methods, it is possible to reduce the number of training episodes and to coordinate the DRL process; and (iii) that the obtained learning times are 36%–62% faster than those of the DRL-Ind scheme in the case studies.
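The DRL-Ind setting described in the abstract can be illustrated with a minimal sketch: one independent tabular Q-learner per action dimension, each observing the full state but choosing only its own action component, so the joint action space is never enumerated. This is an assumption-laden illustration of the general idea, not the paper's implementation; all class and function names below are hypothetical.

```python
import random

class IndependentQLearner:
    """One tabular Q-learner per action dimension (DRL-Ind style).

    Each learner sees the shared state and learns only over its own
    action component, avoiding the combinatorial joint action space.
    """

    def __init__(self, n_actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = {}  # (state, action) -> estimated value

    def _qs(self, state):
        # Q-values for all of this learner's actions in `state`.
        return [self.q.get((state, a), 0.0) for a in range(self.n_actions)]

    def act(self, state):
        # Epsilon-greedy selection over this dimension only.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        qs = self._qs(state)
        return qs.index(max(qs))

    def update(self, s, a, r, s_next):
        # Standard one-step Q-learning update with a shared global reward.
        target = r + self.gamma * max(self._qs(s_next))
        old = self.q.get((s, a), 0.0)
        self.q[(s, a)] = old + self.alpha * (target - old)


def joint_action(learners, state):
    # The executed multi-dimensional action is the tuple of each
    # learner's independent choice; coordination emerges indirectly
    # through the shared state and reward signal.
    return tuple(lrn.act(state) for lrn in learners)
```

For example, two learners with three actions each cover a 3x3 joint action space while each maintains a table over only 3 actions per state; the CoSh and NeASh schemes would additionally bias each learner's selection using transferred knowledge.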