Locality-sensitive hashing of curves

SILVESTRI, FRANCESCO
2017

Abstract

We study data structures for storing a set of polygonal curves in R^d such that, given a query curve, we can efficiently retrieve similar curves from the set, where similarity is measured using the discrete Fréchet distance or the dynamic time warping distance. To this end we devise the first locality-sensitive hashing schemes for these distance measures. A major challenge is posed by the fact that these distance measures internally optimize the alignment between the curves. We give solutions for different types of alignments including constrained and unconstrained versions. For unconstrained alignments, we improve over a result by Indyk [SoCG 2002] for short curves. Let n be the number of input curves and let m be the maximum complexity of a curve in the input. In the particular case where m <= (a/(4d)) log n, for some fixed a>0, our solutions imply an approximate near-neighbor data structure for the discrete Fréchet distance that uses space in O(n^(1+a) log n) and achieves query time in O(n^a log^2 n) and constant approximation factor. Furthermore, our solutions provide a trade-off between approximation quality and computational performance: for any parameter k in [m], we can give a data structure that uses space in O(2^(2k) m^(k-1) n log n + nm), answers queries in O( 2^(2k) m^(k) log n) time and achieves approximation factor in O(m/k).
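To make the two similarity measures concrete, here is a minimal Python sketch of the standard dynamic programs for the discrete Fréchet distance and for dynamic time warping. It is not the hashing scheme from the paper. The function names (`discrete_frechet`, `dtw`, `_align`) are illustrative, and the DTW variant shown sums plain Euclidean distances, which is an assumption (other variants sum squared distances). Both measures optimize over the same set of alignments and differ only in how the matched distances are combined: by maximum or by sum.

```python
from math import dist, inf


def _align(p, q, combine):
    """Optimal alignment cost of vertex sequences p and q.

    combine(d, prev) merges the distance d of the current matched pair
    with the best cost prev of the alignment leading up to it.
    """
    n, m = len(p), len(q)
    # table[i][j]: best cost of aligning p[:i+1] with q[:j+1]
    table = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                table[i][j] = d  # alignments must start at the first vertices
                continue
            prev = min(
                table[i - 1][j] if i > 0 else inf,      # advance on p only
                table[i][j - 1] if j > 0 else inf,      # advance on q only
                table[i - 1][j - 1] if i > 0 and j > 0 else inf,  # advance on both
            )
            table[i][j] = combine(d, prev)
    return table[-1][-1]


def discrete_frechet(p, q):
    """Bottleneck cost: the largest matched distance, minimized over alignments."""
    return _align(p, q, max)


def dtw(p, q):
    """Sum cost: the total of matched distances, minimized over alignments."""
    return _align(p, q, lambda d, prev: d + prev)


if __name__ == "__main__":
    a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    print(discrete_frechet(a, b), dtw(a, b))
```

Each call fills a table with one entry per pair of vertices, so comparing a query against every stored curve this way costs time proportional to n times the product of the curve complexities. Avoiding that linear scan is the motivation for the near-neighbor data structures described in the abstract.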
Published in: Proceedings of the 33rd International Symposium on Computational Geometry (2017)
ISBN: 978-3-95977-038-5
Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3228400