
Author (dc.contributor.author): Modaresi, Sajad
Author (dc.contributor.author): Sauré Valenzuela, Denis
Author (dc.contributor.author): Vielma, Juan Pablo
Admission date (dc.date.accessioned): 2021-03-22T21:07:34Z
Available date (dc.date.available): 2021-03-22T21:07:34Z
Publication date (dc.date.issued): 2020
Item citation (dc.identifier.citation): Operations Research, Vol. 68, No. 5, pp. 1585-1604, Sep-Oct 2020
Identifier (dc.identifier.other): 10.1287/opre.2019.1926
Identifier (dc.identifier.uri): https://repositorio.uchile.cl/handle/2250/178743
Abstract (dc.description.abstract): We study dynamic decision making under uncertainty when, at each period, a decision maker implements a solution to a combinatorial optimization problem. The objective coefficient vectors of said problem, which are unobserved before implementation, vary from period to period. These vectors, however, are known to be random draws from an initially unknown distribution with known range. By implementing different solutions, the decision maker extracts information about the underlying distribution but at the same time experiences the cost associated with said solutions. We show that resolving the implied exploration versus exploitation tradeoff efficiently is related to solving a lower-bound problem (LBP), which simultaneously answers the questions of what to explore and how to do so. We establish a fundamental limit on the asymptotic performance of any admissible policy that is proportional to the optimal objective value of the LBP. We show that such a lower bound might be asymptotically attained by policies that adaptively reconstruct and solve the LBP at an exponentially decreasing frequency. Because the LBP is likely intractable in practice, we propose policies that instead reconstruct and solve a proxy for the LBP, which we call the optimality cover problem (OCP). We provide strong evidence of the practical tractability of the OCP, which implies that the proposed policies can be implemented in real time. We test the performance of the proposed policies through extensive numerical experiments, and we show that they significantly outperform relevant benchmarks in the long term and are competitive in the short term.
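The policy structure described in the abstract — explore only at exponentially spaced periods, exploit the empirically best combinatorial solution otherwise — can be illustrated with a toy sketch. This is not the authors' algorithm: the OCP-based choice of what to explore is replaced here by a naive random draw, and all names (`run_policy`, `greedy_solution`, the cost sampler) are hypothetical stand-ins for illustration only.

```python
import random


def greedy_solution(mean_costs, solutions):
    """Exploit: pick the solution minimizing estimated total cost."""
    return min(solutions, key=lambda s: sum(mean_costs[i] for i in s))


def run_policy(T, solutions, cost_sampler, seed=0):
    """Toy bandit over combinatorial solutions (subsets of ground-set items).

    Explores at periods 1, 2, 4, 8, ... (exponentially decreasing frequency,
    as in the abstract); exploits the empirical minimizer elsewhere.
    Returns the total cost incurred over T periods.
    """
    rng = random.Random(seed)
    n = max(i for s in solutions for i in s) + 1  # ground-set size
    sums, counts = [0.0] * n, [0] * n             # per-item cost statistics
    total, next_explore = 0.0, 1
    for t in range(1, T + 1):
        if t == next_explore:
            # Naive exploration stand-in for solving the OCP proxy.
            s = rng.choice(solutions)
            next_explore *= 2
        else:
            # Unseen items get estimate 0.0 (optimistic), so they are tried
            # at least once before the greedy choice settles.
            means = [sums[i] / counts[i] if counts[i] else 0.0
                     for i in range(n)]
            s = greedy_solution(means, solutions)
        costs = cost_sampler(rng)                 # realized coefficient vector
        for i in s:
            sums[i] += costs[i]
            counts[i] += 1
        total += sum(costs[i] for i in s)
    return total
```

Because exploration rounds thin out exponentially, almost all periods are spent on the empirically optimal solution, which is what keeps the long-run cost close to that of the best fixed solution in this toy setting.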
Sponsor (dc.description.sponsorship): National Science Foundation (NSF), Directorate for Engineering (ENG) 1233441; Complex Engineering Systems Institute; CONICYT PIA FB0816
Language (dc.language.iso): en
Publisher (dc.publisher): INFORMS
Type of license (dc.rights): Attribution-NonCommercial-NoDerivs 3.0 Chile
Link to license (dc.rights.uri): http://creativecommons.org/licenses/by-nc-nd/3.0/cl/
Source (dc.source): Operations Research
Keywords (dc.subject): Combinatorial optimization
Keywords (dc.subject): Multiarmed bandit
Keywords (dc.subject): Mixed-integer programming
Title (dc.title): Learning in Combinatorial Optimization: What and How to Explore
Document type (dc.type): Journal article
Access rights (dcterms.accessRights): Open Access
Cataloguer (uchile.catalogador): ctc
Indexation (uchile.index): ISI-indexed article
Indexation (uchile.index): Scopus-indexed article

