Now showing items 1-20 of 104

    • Nogueira Nunes, Daniel; Louza, Felipe; Gog, Simon; Ayala-Rincon, Mauricio; Navarro, Gonzalo (Institute of Electrical and Electronics Engineers Inc., 2018)
      We introduce GCIS, a grammar compression algorithm based on the induced suffix sorting algorithm SAIS, presented by Nong et al. in 2009. Our solution builds on the factorization performed by SAIS during suffix sorting. We ...
    • Navarro, Gonzalo (Springer, 2017)
      The Block Tree is a recently proposed data structure that reaches compression close to Lempel-Ziv while supporting efficient direct access to text substrings. In this paper we show how a self-index can be built on top of ...
    • Álvarez García, Sandra; Bernardo, Guillermo de; Brisaboa, Nieves R.; Navarro, Gonzalo (Elsevier, 2017)
      The representation of binary relations has been intensively studied and many different theoretical and practical representations have been proposed to answer the usual queries in multiple domains. However, ternary relations ...
    • Brisaboa, Nieves; De Bernardo, Guillermo; Konow Krause, Roberto; Navarro, Gonzalo (Pergamon-Elsevier Science, 2016)
      Efficient processing of aggregated range queries on two-dimensional grids is a common requirement in information retrieval and data mining systems, for example in Geographic Information Systems and OLAP cubes. We introduce ...
    • Ferragina, Paolo; Manzini, Giovanni; Mäkinen, Veli; Navarro, Gonzalo (SPRINGER-VERLAG BERLIN, 2004)
      We show that, by combining an existing compression boosting technique with the wavelet tree data structure, we are able to design a variant of the FM-index which scales well with the size of the input alphabet Sigma. The ...
    • Navarro, Gonzalo; Paredes, Rodrigo; Reyes, Nora; Bustos, Cristian (Elsevier, 2017)
      We study the practical behavior of different algorithms and methods that aim to estimate the intrinsic dimension (IDim) in metric spaces. Some of them were specifically developed to evaluate the complexity of searching in ...
    • Russo, Luís M. S.; Navarro, Gonzalo; Oliveira, Arlindo L.; Morales, Pedro (2009)
      A compressed full-text self-index for a text T is a data structure requiring reduced space and able to search for patterns P in T. It can also reproduce any substring of T, thus actually replacing T. Despite the recent ...
    • Russo, Luís M. S.; Navarro, Gonzalo; Oliveira, Arlindo L. (Springer, 2007)
      A compressed full-text self-index for a text T is a data structure requiring reduced space and able of searching for patterns P in T. Furthermore, the structure can reproduce any substring of T, thus it actually replaces ...
    • Grossi, Roberto; Iacono, John; Navarro, Gonzalo; Raman, Rajeev; Satti, S. Rao (Association for Computing Machinery, 2017)
      Given an array A[1, n] of elements with a total order, we consider the problem of building a data structure that solves two queries: (a) selection queries receive a range [i, j] and an integer k and return the position of ...
    • Navarro, Gonzalo; Fredriksson, Kimmo (ELSEVIER, 2004-08-16)
      We show that the average number of characters examined to search for r random patterns of length m in a text of length n over a uniformly distributed alphabet of size a cannot be less than Omega(n log(sigma)(rm)/m). When ...
    • Crochemore, Maxime; Iliopoulos, Costas; Navarro, Gonzalo; Pinzon, Yoan; Salinger, Alejandro (SPRINGER-VERLAG BERLIN, 2003)
      delta, gamma)-Matching is a string matching problem with applications to music retrieval. The goal is, given a pattern P-1...m and a text T-1...n on an alphabet of integers, find the occurrences P' of the pattern in the ...
    • Hyyro, Heikki; Navarro, Gonzalo (SPRINGER, 2005-01)
      We present a new bit-parallel technique for approximate string matching. We build on two previous techniques. The first one, BPM (Myers, 1999), searches for a pattern of length m in a text of length n permitting k differences ...
    • Fariña, Antonio; Navarro, Gonzalo; Paramá, José R. (OXFORD UNIV PRESS, 2012-01)
      Semistatic word-based byte-oriented compressors are known to be attractive alternatives to compress natural language texts. With compression ratios around 30-35%, they allow fast direct searching of compressed text. In ...
    • Gagie, Travis; Kärkkäinen, Juha; Navarro, Gonzalo; Puglisi, Simon J. (2013)
      Colored range queries are a well-studied topic in computational geometry and database research that, in the past decade, have found exciting applications in information retrieval. In this paper, we give improved time and ...
    • Brisaboa, NievesR; Ladra, Susana; Navarro, Gonzalo (Elsevier, 2013)
      Many relevant Web mining tasks translate into classical algorithms on the Web graph. Compact Web graph representations allow running these tasks on larger graphs within main memory. These representations at least provide ...
    • Brisaboa, Nieves; De Bernardo, Guillermo; Navarro, Gonzalo; Rodeiro, Tirso; Seco, Diego (Institute of Electrical and Electronics Engineers Inc., 2018)
      We introduce a new technique for the efficient management of large sequences of multi-dimensional data, which takes advantage of regularities that arise in real-world datasets and supports different types of aggregation ...
    • Chávez, Edgar; Navarro, Gonzalo (ELSEVIER SCIENCE BV, 2005-07-01)
      he metric space model abstracts many proximity search problems, from nearest-neighbor classifiers to textual and multimedia information retrieval. In this context, an index is a data structure that speeds up proximity ...
    • González Cornejo, Senén Andrés (Universidad de ChileCyberDocs, 2009)
      Las máquinas de búsqueda para la Web utilizan el índice invertido como estructura de datos que permite acelerar las búsquedas en grandes colecciones de texto. Para lograr tiempos de respuesta por consulta menores al medio ...
    • Gagie, Travis; He, Meng; Navarro, Gonzalo (Institute of Electrical and Electronics Engineers Inc., 2017)
      In the range α-majority query problem, we preprocess a given sequence S[1.n] for a fixed threshold α ϵ (0, 1], such that given a query range [i.j], the symbols that occur more than α (j-i+1) times in S[i.j] can be reported ...
    • Navarro, Gonzalo; Makinen, Veli (ASSOC COMPUTING MACHINERY, 2007-04)
      Full-text indexes provide fast substring search over large text collections. A serious problem of these indexes has traditionally been their space consumption. A recent trend is to develop indexes that exploit the ...