We describe a way to compute the edit distance of two strings without having to fill the whole dynamic programming (DP) matrix, through a sequence of increasing guesses on the edit distance. If the strings share a certain degree of similarity, the edit distance can be quite smaller than the value of non-optimal solutions, and a large fraction (up to 80–90%) of the DP matrix cells do not need to be computed. Including the method’s overhead, this translates into a speedup factor from 3× up to 30× in the time needed to find the optimal solution for strings of length about 20,000.

Speeding-up the dynamic programming procedure for the edit distance of two strings

Dalpasso M.
2019

Abstract

We describe a way to compute the edit distance of two strings without having to fill the whole dynamic programming (DP) matrix, through a sequence of increasing guesses on the edit distance. If the strings share a certain degree of similarity, the edit distance can be quite smaller than the value of non-optimal solutions, and a large fraction (up to 80–90%) of the DP matrix cells do not need to be computed. Including the method’s overhead, this translates into a speedup factor from 3× up to 30× in the time needed to find the optimal solution for strings of length about 20,000.
2019
Communications in Computer and Information Science
978-3-030-27683-6
File in questo prodotto:
Non ci sono file associati a questo prodotto.
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11577/3329625
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? 0
social impact