Indexed dynamic programming to boost edit distance and LCSS computation
Author
dc.contributor.author
Barbay, Jérémy
Author
dc.contributor.author
Olivares, Andrés
Admission date
dc.date.accessioned
2019-05-31T15:21:08Z
Available date
dc.date.available
2019-05-31T15:21:08Z
Publication date
dc.date.issued
2018
Cita de ítem
dc.identifier.citation
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Volumen 11147 LNCS, 2018.
Identifier
dc.identifier.issn
16113349
Identifier
dc.identifier.issn
03029743
Identifier
dc.identifier.other
10.1007/978-3-030-00479-8_6
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/169515
Abstract
dc.description.abstract
There are efficient dynamic programming solutions to the computation of the Edit Distance from S ∈in [1..σ]n to T ∈in [1..σ]m, for many natural subsets of edit operations, typically in time within O(nm) in the worst-case over strings of respective lengths n and m (which is likely to be optimal), and in time within O(n+m) in some special cases (e.g., disjoint alphabets). We describe how indexing the strings (in linear time), and using such an index to refine the recurrence formulas underlying the dynamic programs, yield faster algorithms in a variety of models, on a continuum of classes of instances of intermediate difficulty between the worst and the best case, thus refining the analysis beyond the worst case analysis. As a side result, we describe similar properties for the computation of the Longest Common Sub Sequence LCSS(S,T) between S and T, since it is a particular case of Edit Distance, and we discuss the application of similar algorithmic and analysis techniques for other dynamic programming
solutions. More formally, we propose a parameterized analysis of the computational complexity of the
Edit Distance for various set of operators and of the Longest Common Sub Sequence in function of the
area of the dynamic program matrix relevant to the computation.