A metric index for approximate string matching

Chávez, Edgar; Navarro, Gonzalo

Author	dc.contributor.author	Chávez, Edgar
Author	dc.contributor.author	Navarro, Gonzalo	es_CL
Admission date	dc.date.accessioned	2008-12-22T10:32:31Z
Available date	dc.date.available	2008-12-22T10:32:31Z
Publication date	dc.date.issued	2006-03-07
Cita de ítem	dc.identifier.citation	THEORETICAL COMPUTER SCIENCE Volume: 352 Issue: 1-3 Pages: 266-279 Published: MAR 7 2006	en
Identifier	dc.identifier.issn	0304-3975
Identifier	dc.identifier.uri	https://repositorio.uchile.cl/handle/2250/124793
Abstract	dc.description.abstract	We present a radically new indexing approach for approximate string matching. The scheme uses the metric properties of the edit distance and can be applied to any other metric between strings. We build a metric space where the sites are the nodes of the suffix tree of the text, and the approximate query is seen as a proximity query on that metric space. This permits us finding the occ occurrences of a pattern of length in, permitting up to r differences, in a text of length n over an alphabet of size a, in average time O(m(1+epsilon) + occ) for any epsilon > 0, if r = o(m/log, m) and m > ((1 + epsilon)/epsilon)log(sigma) n. The index works well up to r < (3 - root 2)m/log(sigma) m, where it achieves its maximum average search complexity O(m(1+root 2+epsilon) + occ). The construction time of the index is O(m(1+root 2+epsilon) n log n) and its space is O(m(1+root 2+epsilon) n). This is the first index achieving average search time polynomial in in and independent of n, for r = O(m /log(sigma) m). Previous methods achieve this complexity only for r = O(m/log(sigma) m). We also present a simpler scheme needing O(n) space.	en
Lenguage	dc.language.iso	en	en
Publisher	dc.publisher	ELSEVIER	en
Keywords	dc.subject	LINEAR-TIME CONSTRUCTION	en
Título	dc.title	A metric index for approximate string matching	en
Document type	dc.type	Artículo de revista

Files in this item

Name:: Chavez_Edgar.pdf
Size:: 780.9Kb
Format:: PDF

This item appears in the following Collection(s)

Artículos de revistas
Artículos de revistas

Show simple item record