Reinforcement learning applied to dynamic clustering

Carvajal Cáceres, Ignacio Nicolás

Advisor	dc.contributor.advisor	Weber Haas, Richard
Author	dc.contributor.author	Carvajal Cáceres, Ignacio Nicolás
Associate professor	dc.contributor.other	Saltos Atiencia, Ramiro
Associate professor	dc.contributor.other	Sauré Valenzuela, Denis
Accession date	dc.date.accessioned	2024-10-29T18:48:36Z
Available date	dc.date.available	2024-10-29T18:48:36Z
Publication date	dc.date.issued	2024
Identifier	dc.identifier.other	10.58011/ypzt-da43
Identifier	dc.identifier.uri	https://repositorio.uchile.cl/handle/2250/201762
Abstract	dc.description.abstract	Clustering is an essential technique in pattern recognition, data mining, and knowledge discovery. A significant challenge in dynamic clustering is predicting changes in the underlying structure of the data, such as future customer segmentation. This problem is especially complex when dealing with multimodal data, as it is necessary to study changes in the data structure over time. This paper proposes using Multi-Agent Deep Deterministic Policy Gradients (MADDPG) and the Gaussian Mixture Model (GMM) to address the issue of dynamic clustering. GMM is employed to represent a mixture of probability distributions, considering the clusters (components) of GMM as agents in a partially observable Markov game. The agents are trained using MADDPG, an extension of the DDPG algorithm designed for multi-agent environments, which allows the agents to learn decentralized policies and coordinate with each other. The primary objective of this work is to predict the GMM parameters for the next period using information from the current period. During training, each agent observes the states and actions of all other agents and learns a centralized critic to estimate the value of the joint action. In the execution phase, each agent uses only its local observations to select actions, aiming to optimize the log-likelihood obtained with the predicted parameters when clustering the data in the next period. The paper demonstrates that the proposed approach can effectively predict GMM parameters for future periods under conditions of linear and stationary movements of the clusters, improving the ability to predict the underlying structure of the data in dynamic contexts compared to relying solely on the clustering of the current period.	es_ES
Language	dc.language.iso	en	es_ES
Publisher	dc.publisher	Universidad de Chile	es_ES
Type of license	dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
Link to License	dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
Title	dc.title	Reinforcement learning applied to dynamic clustering	es_ES
Document type	dc.type	Thesis	es_ES
dc.description.version	dc.description.version	Author's original version	es_ES
dcterms.accessRights	dcterms.accessRights	Open access	es_ES
Cataloguer	uchile.catalogador	chb	es_ES
Department	uchile.departamento	Departamento de Ingeniería Industrial	es_ES
Faculty	uchile.facultad	Facultad de Ciencias Físicas y Matemáticas	es_ES
uchile.titulacion	uchile.titulacion	Double degree	es_ES
uchile.carrera	uchile.carrera	Ingeniería Civil Industrial	es_ES
uchile.gradoacademico	uchile.gradoacademico	Master's	es_ES
uchile.notadetesis	uchile.notadetesis	Thesis submitted for the degree of Magíster en Gestión de Operaciones (Master in Operations Management)	es_ES
uchile.notadetesis	uchile.notadetesis	Thesis submitted for the professional title of Ingeniero Civil Industrial (Industrial Civil Engineer)
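
The evaluation criterion described in the abstract (scoring predicted GMM parameters by the log-likelihood of the next period's data, against simply reusing the current period's clustering) can be illustrated with a small sketch. This is not the thesis's MADDPG method: a naive linear extrapolation of the component means stands in for the learned policy, and the function names, the diagonal covariances, the drift values, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def gmm_mean_log_likelihood(X, weights, means, variances):
    """Mean per-sample log-likelihood of X under a diagonal-covariance GMM."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - means[None, :, :]                     # shape (n, k, d)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # shape (k,)
    log_pdf = log_norm[None, :] - 0.5 * np.sum(diff**2 / variances[None, :, :], axis=2)
    log_joint = np.log(weights)[None, :] + log_pdf               # shape (n, k)
    # Log-sum-exp over components, stabilised by subtracting the row maximum.
    row_max = log_joint.max(axis=1, keepdims=True)
    log_mix = row_max[:, 0] + np.log(np.exp(log_joint - row_max).sum(axis=1))
    return float(np.mean(log_mix))


def sample_period(rng, means, variances, n_per_component):
    """Draw n_per_component points from each Gaussian component (hypothetical helper)."""
    return np.vstack([
        rng.normal(mu, np.sqrt(var), size=(n_per_component, mu.size))
        for mu, var in zip(means, variances)
    ])


rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
variances = np.ones((2, 2))

# Assumed scenario: two clusters whose means move linearly between periods.
means_prev = np.array([[0.0, 0.0], [6.0, 6.0]])
drift = np.array([[2.0, 0.0], [0.0, -2.0]])
means_now = means_prev + drift
means_next_true = means_now + drift

# Data observed in the next period.
X_next = sample_period(rng, means_next_true, variances, 500)

# Baseline: reuse the current period's parameters unchanged.
ll_current = gmm_mean_log_likelihood(X_next, weights, means_now, variances)

# Stand-in predictor: extrapolate each mean by its last observed displacement.
means_predicted = means_now + (means_now - means_prev)
ll_predicted = gmm_mean_log_likelihood(X_next, weights, means_predicted, variances)

print(f"current-period parameters: {ll_current:.3f}")
print(f"extrapolated parameters:   {ll_predicted:.3f}")
```

Under this assumed linear drift, the extrapolated parameters score a higher log-likelihood on the next period's data than the current-period parameters do, which is the kind of comparison the abstract reports. In practice the current-period parameters would come from fitting a GMM to data rather than being known in advance.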

Files in this item

Name: Reinforcement-learning-applied ...
Size: 2.291Mb
Format: PDF

This item appears in the following Collection(s)

Tesis Postgrado

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States