Show simple item record

Author (dc.contributor.author): Leottau, David L.
Author (dc.contributor.author): Lobos Tsunekawa, Kenzo
Author (dc.contributor.author): Jaramillo, Francisco
Author (dc.contributor.author): Ruiz del Solar, Javier
Admission date (dc.date.accessioned): 2019-10-30T15:29:58Z
Available date (dc.date.available): 2019-10-30T15:29:58Z
Publication date (dc.date.issued): 2019
Item citation (dc.identifier.citation): Engineering Applications of Artificial Intelligence 85 (2019) 243–253
Identifier (dc.identifier.issn): 0952-1976
Identifier (dc.identifier.other): 10.1016/j.engappai.2019.06.019
Identifier (dc.identifier.uri): https://repositorio.uchile.cl/handle/2250/172445
Abstract (dc.description.abstract): Many real-world Reinforcement Learning (RL) applications have multi-dimensional action spaces that suffer from a combinatorial explosion of complexity. Implementing Centralized RL (CRL) systems may then become infeasible, owing to the exponential growth of both the state space and the action space and to the large number of training trials required. To address these issues, this paper proposes using Decentralized Reinforcement Learning (DRL) to alleviate the effects of the curse of dimensionality on the action space, and transferring knowledge to reduce the number of training episodes needed to reach asymptotic convergence. Three DRL schemes are compared: DRL with independent learners and no prior coordination (DRL-Ind); DRL accelerated and coordinated by the Control Sharing (DRL+CoSh) knowledge-transfer approach; and a proposed DRL scheme using Nearby Action Sharing, a CoSh-based variant that incorporates a measure of uncertainty into the CoSh procedure (DRL+NeASh). These three schemes are analyzed in an extensive experimental study and validated on two complex real-world problems, the in-walk kicking and ball-dribbling behaviors, both performed with humanoid biped robots. The obtained results show empirically: (i) the effectiveness of DRL systems, which, even without prior coordination, achieve asymptotic convergence through indirect coordination; (ii) that the proposed knowledge-transfer methods make it possible to reduce the number of training episodes and to coordinate the DRL process; and (iii) that the obtained learning times are between 36% and 62% faster than the DRL-Ind scheme in the case studies.
Language (dc.language.iso): en
Publisher (dc.publisher): Elsevier
Type of license (dc.rights): Attribution-NonCommercial-NoDerivs 3.0 Chile
Link to license (dc.rights.uri): http://creativecommons.org/licenses/by-nc-nd/3.0/cl/
Source (dc.source): Engineering Applications of Artificial Intelligence
Keywords (dc.subject): Autonomous robots
Keywords (dc.subject): Decentralized reinforcement learning
Keywords (dc.subject): Distributed artificial intelligence
Keywords (dc.subject): Distributed control
Keywords (dc.subject): Knowledge transfer
Keywords (dc.subject): Multi-agent systems
Title (dc.title): Accelerating decentralized reinforcement learning of complex individual behaviors
Document type (dc.type): Journal article
Cataloguer (uchile.catalogador): laj
Indexation (uchile.index): SCOPUS-indexed article
Harvest (uchile.cosecha): SI
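The abstract contrasts centralized RL with DRL using independent learners, where each action dimension is handled by its own learner and coordination emerges only indirectly through a shared reward. As a rough illustration of that independent-learners idea, here is a minimal tabular Q-learning sketch; the class, the toy states, and all parameter values are illustrative assumptions, not code from the paper.

```python
import random
from collections import defaultdict

class IndependentQLearner:
    """One tabular Q-learner per action dimension (the DRL-Ind idea).

    Each learner sees the global state but chooses only its own
    sub-action; no learner observes the others' choices.
    """
    def __init__(self, n_actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        # Q-table: state -> list of action values for this dimension only.
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        # Epsilon-greedy selection over this dimension's sub-actions.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        qs = self.q[state]
        return qs.index(max(qs))

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update on this dimension's table.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])

# Two action dimensions, each with its own learner and 3 sub-actions.
learners = [IndependentQLearner(n_actions=3) for _ in range(2)]

state, next_state, reward = "s0", "s1", 1.0
# Decentralized action selection: the joint action is just each
# learner's independent choice.
joint_action = [lrn.act(state) for lrn in learners]
# The shared reward is what drives the indirect coordination the
# abstract refers to.
for lrn, a in zip(learners, joint_action):
    lrn.update(state, a, reward, next_state)
```

The design point is that each learner's table grows with the size of one action dimension rather than with the product of all dimensions, which is the action-space dimensionality relief the paper attributes to DRL.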


