Near neighbor searching with K nearest references
MetadataShow full item record
Proximity searching is the problem of retrieving,from a given database, those objects closest to a query.To avoid exhaustive searching,data structures called indexes are builton the database prior to serving queries.The curse of dimensionality is a well-known problem for indexes:in spaces with sufficiently concentrated distance histograms,no index out performs an exhaustives can of the database. In recent years,a number of index es for approximate proximity searching have been proposed. These are able to cope with the curse of dimensionality inexchangefor returning an answer that might beslightly different from the correctone. In this paper we show that many of those recent indexes can be understood as variants of a simple general model basedon K-nearest reference signatures.A set o freferences is chosen from the database,and the signature of each object consists of the K references nearest to the object. At query time,the signature of the query is computed and the search examines only the objects whose signature is close enough to that of the query. Many known and novelindexes are obtained by considering different ways to determine how much detail the signature records(e.g.,justthesetofnearestreferences, or also their proximity order to the object, oral so their distances to the object,and soon), how the similarity between signatures is defined,and how the parameters are tuned.In addition,we introduce a space-efficient representation for those families of indexes, making it possible to search very large databases in main memory.Small indexes are cache friendly,inducing faster queries. We perform exhaustive experiments comparing several known and new indexes that derive from our framework,evaluatingtheir time performance,memory usage,and quality of approximation.The best indexes out perform the state of the art,offering an attractive balance between all these aspects,and turn out to be excellent choices in many scenarios.Our framework gives high flexibility to design new indexes.
Artículo de publicación ISI
DOI: DOI: 10.1016/j.is.2015.02.001
Quote ItemInformation Systems51(2015)43–61
The following license files are associated with this item: