Periodicity and repetitions in parameterized strings.

APOSTOLICO, ALBERTO
2008

Abstract

One of the most beautiful and useful notions in the Mathematical Theory of Strings is that of a Period, i.e., an initial piece of a given string that can generate that string by repeating itself at regular intervals. Periods have an elegant mathematical structure and a wealth of applications [F. Mignosi and A. Restivo, Periodicity, in: M. Lothaire (Ed.), Algebraic Combinatorics on Words, Cambridge University Press, Cambridge, 2002, pp. 237-274]. At the heart of their theory, there are two Periodicity Lemmas: one due to Lyndon and Schützenberger [The equation a^M = b^N c^P in a free group, Michigan Math. J. 9 (1962) 289-298], referred to as the Weak Version, and the other due to Fine and Wilf [Uniqueness theorems for periodic functions, Proc. Amer. Math. Soc. 16 (1965) 109-114]. In this paper, we investigate the notion of periodicity and the closely related one of repetition in connection with parameterized strings as introduced by Baker [Parameterized pattern matching: algorithms and applications, J. Comput. System Sci. 52(1) (1996) 28-42; Parameterized duplication in strings: algorithms and an application to software maintenance, SIAM J. Comput. 26(5) (1997) 1343-1362]. In such strings, the notion of pairwise match or "equivalence" of symbols is more relaxed than the usual one, in that it rests on some mapping, rather than identity, of symbols. It seems natural to try to extend the notions of period and periodicity to encompass parameterized strings. However, we know of no previous attempt in this direction. Our preliminary investigation yields the following results. For periodicity, we get (a) a generalization of the Weak Version of the Periodicity Lemma to parameterized strings, showing that it is essential that the two mappings inducing the periodicity commute; (b) a proof that an analog of the Lemma by Fine and Wilf [Uniqueness theorems for periodic functions, Proc. Amer. Math. Soc. 16 (1965) 109-114] cannot hold for parameterized strings, even if the mappings inducing the periodicity "commute", in a sense to be specified below; (c) a proof that a parameterized string over an alphabet of at least three letters may have a set of periods that differs from that of any binary string of the same length, whereby the parameterized analog of a classic result by Guibas and Odlyzko [String overlaps, pattern matching, and nontransitive games, J. Combin. Theory Ser. A 30 (1981) 183-208] cannot hold. We also derive necessary and sufficient conditions characterizing parameterized repetitions, which are patterns of length at least twice that of the period; we show how the notion of root differs from the standard case, and we highlight some of the implications for extending algorithmic criteria previously adopted for string searching, detection of repetitions and the like. Finally, as a corollary of our main results, we show that binary parameterized strings behave much in the same way as non-parameterized ones with respect to periodicity and repetitions, while there is a substantial difference for strings over alphabets of at least three symbols.
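To make the relaxed notion of match concrete, here is a minimal Python sketch comparing classical periods with parameterized ones. It rests on assumptions that are not taken from the paper: two equal-length strings are taken to "p-match" when some one-to-one renaming of symbols turns the first into the second position by position, every symbol is treated as renamable (Baker's model also allows symbols that must match exactly), and p counts as a period of w when the prefix and the suffix of length |w| - p match. The function names `p_match` and `periods` are hypothetical, and the paper's formal definitions may differ in detail.

```python
def p_match(u: str, v: str) -> bool:
    """Assumed p-match: a one-to-one symbol renaming maps u onto v."""
    if len(u) != len(v):
        return False
    forward, backward = {}, {}  # renaming and its inverse
    for a, b in zip(u, v):
        # Each symbol must keep a single image and a single preimage.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


def periods(w: str, parameterized: bool = False) -> list[int]:
    """Return every p in 1..len(w) such that the prefix and suffix of
    length len(w) - p agree (exactly, or up to renaming)."""
    n = len(w)
    same = p_match if parameterized else (lambda u, v: u == v)
    return [p for p in range(1, n + 1) if same(w[:n - p], w[p:])]


if __name__ == "__main__":
    for w in ("abab", "abcb"):
        print(w, periods(w), periods(w, parameterized=True))
```

Under these assumptions the three-letter string "abcb" has no classical period shorter than its length, yet it acquires the periods 2 and 3 once renaming is allowed, which hints at why classical periodicity results need not carry over unchanged.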

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/2268854