Abstract | dc.description.abstract | Bit-parallelism permits executing several operations simultaneously over a set of bits or numbers stored in a single computer word. This technique permits searching for the approximate occurrences of a pattern of length m in a text of length n in time O([m/w]n), where w is the number of bits in the computer word. Although this is asymptotically the optimal speedup over the basic O(mn) time algorithm, it wastes bit-parallelism's power in the common case where m is much smaller than w, since w - m bits in the computer words get unused.
In this paper we explore different ways to increase the bit-parallelism when the search pattern is short. First, we show how multiple patterns can be packed in a single computer word so as to search for multiple patterns simultaneously. Instead of paying O(rn) time to search for r patterns of length m < w, we obtain O([r/[w/m]]n) time. Second, we show how the mechanism permits boosting the search for a single pattern of length m < w, which can be searched for in time O(n/[w/m]) instead of O(n). Finally, we show how to extend these algorithms so that the time bounds essentially depend on k instead of m, where k is the maximum number of differences permitted.
Our experimental results show that that the algorithms work well in practice, and are the fastest alternatives for a wide range of search parameters. | en |