Shaping the history of words
TUZZI, ARJUNA
2013
Abstract
In textual analysis, many corpora consist of texts arranged in chronological order, and in many cases this temporal dimension is crucial to understanding their inner structure. In a typical bag-of-words approach, data are organized in contingency tables in which each row reports the frequency of a word across time-points (shown in columns). These discrete data (temporal patterns of frequencies) may be viewed as continuous objects represented by functional relationships. This study aimed to identify a specific sequential pattern for each word as a functional object and to group these word patterns into clusters. A model-based clustering procedure is proposed, with specific reference to a corpus of end-of-year messages delivered by the ten Presidents of the Italian Republic, covering the period from 1949 to 2011.
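As an illustration of the data layout described above, the following minimal sketch builds a word-by-time-point contingency table from a small corpus. The function name `word_time_table`, the toy texts, and the whitespace tokenization are assumptions made for this example only; the abstract does not describe the study's actual preprocessing, and the functional representation and model-based clustering steps are not shown.

```python
from collections import Counter


def word_time_table(texts_by_time):
    """Build a word x time-point contingency table.

    texts_by_time maps a time-point (e.g. a year) to the text delivered then.
    Returns the sorted time-points and a dict mapping each word to its list
    of frequencies, one entry per time-point (rows = words, columns = time).
    """
    time_points = sorted(texts_by_time)
    # Naive tokenization (lowercase + whitespace split): an assumption for
    # illustration, not the preprocessing used in the study.
    counts = {t: Counter(texts_by_time[t].lower().split()) for t in time_points}
    vocabulary = sorted(set().union(*counts.values()))
    # A Counter returns 0 for words absent at a given time-point.
    table = {w: [counts[t][w] for t in time_points] for w in vocabulary}
    return time_points, table


# Hypothetical toy corpus, not taken from the presidential messages.
toy_corpus = {
    1949: "peace and work",
    1950: "peace peace and freedom",
    1951: "work and freedom",
}

time_points, table = word_time_table(toy_corpus)
for word, row in table.items():
    print(word, row)
```

Each row of `table` is the discrete temporal pattern of one word, which is the kind of object that could then be treated as a curve and clustered.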