We describe an algorithm computing an optimal prefix free code for n unsorted positive weights in time within O(n(1+lg alpha))subset of O(nlgn), where the alternation alpha is an element of[1..n-1] approximates the minimal amount of sorting required by the computation. This asymptotical complexity is within a constant factor of the optimal in the algebraic decision tree computational model, in the worst case over all instances of size n and alternation alpha. Such results refine the state of the art complexity of Theta(nlgn) in the worst case over instances of size n in the same computational model, a landmark in compression and coding since 1952. Beside the new analysis technique, such improvement is obtained by combining a new algorithm, inspired by van Leeuwen's algorithm to compute optimal prefix free codes from sorted weights (known since 1976), with a relatively minor extension of Karp et al.'s deferred data structure to partially sort a multiset accordingly to the queries performed on it (known since 1988). Preliminary experimental results on text compression by words show alpha to be polynomially smaller than n, which suggests improvements by at most a constant multiplicative factor in the running time for such applications.
es_ES
Patrocinador
dc.description.sponsorship
Millennium Nucleus "Information and Coordination in Networks" RC130003