Pattern spotting in historical documents using convolutional models

Úbeda Soto, Ignacio Andrés

Professor Advisor	dc.contributor.advisor	Saavedra Rondo, José
Professor Advisor	dc.contributor.advisor	Ríos Pérez, Sebastián
Author	dc.contributor.author	Úbeda Soto, Ignacio Andrés
Associate professor	dc.contributor.other	Heutte, Laurent
Associate professor	dc.contributor.other	Sauré Valenzuela, Denis
Accession date	dc.date.accessioned	2020-10-15T02:27:39Z
Available date	dc.date.available	2020-10-15T02:27:39Z
Publication date	dc.date.issued	2020
Identifier	dc.identifier.uri	https://repositorio.uchile.cl/handle/2250/177136
General note	dc.description	Thesis submitted for the degree of Master in Operations Management	es_ES
General note	dc.description	Report submitted for the professional title of Industrial Civil Engineer
Abstract	dc.description.abstract	Pattern spotting consists of locating different instances of a given object (i.e. an image query) in a collection of document images. Unlike object detection, no prior information is given about the patterns that may be searched for, so no training can be done for localization. The queried patterns may vary in shape, size, color, context and even style, which makes pattern spotting a difficult task. To tackle this problem, we propose a convolutional neural network approach that uses Feature Pyramid Networks (FPN) as the feature extractor of our system. Using FPN allows us to extract descriptors for local regions of both the documents to be indexed and the queries, at multiple scales, with a single forward pass. Our main hypothesis is that deep features are more discriminative than classical descriptors for pattern localization. Experiments conducted on DocExplore (a historical document dataset for pattern spotting evaluation) show that the proposed system improves mAP by 73% (from 0.157 to 0.272) in pattern localization compared with state-of-the-art results, even when the feature extractor is not trained with domain-specific data. Memory requirements and computation time also decrease, since the descriptor dimension used for distance computation is reduced by a factor of 16. We conclude that CNN-based local descriptors are better than VLAD (classical) descriptors at locating patterns, and we use them to propose a system for pattern localization. A limitation of our approach is that it struggles with non-square patterns. We also propose a solution to this issue that extracts multiple descriptors per query. Although it improves document retrieval results, it loses precision in locating patterns. Aggregating those descriptors is proposed as future work to improve the system.	es_ES
Language	dc.language.iso	en	es_ES
Publisher	dc.publisher	Universidad de Chile	es_ES
Type of license	dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Chile	*
Link to License	dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/cl/	*
Keywords	dc.subject	Neural networks (Computer science)	es_ES
Keywords	dc.subject	Image retrieval	es_ES
Keywords	dc.subject	Pattern spotting	es_ES
Title	dc.title	Pattern spotting in historical documents using convolutional models	es_ES
Document type	dc.type	Thesis
Cataloguer	uchile.catalogador	gmm	es_ES
Department	uchile.departamento	Departamento de Ingeniería Industrial	es_ES
Faculty	uchile.facultad	Facultad de Ciencias Físicas y Matemáticas	es_ES
Degree type	uchile.titulacion	Double degree	es_ES
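
The 73% figure in the abstract is a relative gain rather than an absolute difference in mAP. As a minimal sketch, the arithmetic can be checked as follows; the function and variable names are illustrative assumptions and do not come from the thesis, and only the two mAP values are taken from the abstract.

```python
def relative_improvement(baseline: float, proposed: float) -> float:
    """Return the gain of `proposed` over `baseline` as a fraction of `baseline`."""
    return (proposed - baseline) / baseline


# mAP values quoted in the abstract (pattern localization on DocExplore).
baseline_map = 0.157  # previous state-of-the-art result
proposed_map = 0.272  # proposed FPN-based system

gain = relative_improvement(baseline_map, proposed_map)
print(f"Relative mAP improvement: {gain:.0%}")  # prints "Relative mAP improvement: 73%"
```

The absolute difference is 0.115 mAP points; dividing it by the baseline of 0.157 gives the reported relative improvement.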

Files in this item

Name: Pattern-spotting-in-historical ...
Size: 26.40Mb
Format: PDF

Name: TablaConten.pdf
Size: 82.20Kb
Format: PDF

This item appears in the following Collection(s)

Tesis Postgrado


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Chile