Matrix factorization exploits the idea that, in complex high-dimensional data, the actual signal typically lies in lower-dimensional structures. These lower dimensional objects provide useful insights, with interpretation favored by sparse structures. Sparsity, in addition, is beneficial in terms of regularization and, thus, to avoid over-fitting. By exploiting Bayesian shrinkage priors, we devise a computationally convenient approach for high-dimensional matrix factorization. The dependence between row and column entities is modeled by inducing flexible sparse patterns within factors. The availability of external information is accounted for in such a way that structures are allowed while not imposed. Inspired by boosting algorithms, we pair the proposed approach with a numerical strategy relying on a sequential inclusion and estimation of low-rank contributions, with a data-driven stopping rule. Practical advantages of the proposed approach are demonstrated by means of a simulation study and the analysis of soccer heatmaps obtained from new generation tracking data. Supplementary materials for this article are available online.

Accelerated Structured Matrix Factorization

Schiavon, Lorenzo;Canale, Antonio
2024

Abstract

Matrix factorization exploits the idea that, in complex high-dimensional data, the actual signal typically lies in lower-dimensional structures. These lower dimensional objects provide useful insights, with interpretation favored by sparse structures. Sparsity, in addition, is beneficial in terms of regularization and, thus, to avoid over-fitting. By exploiting Bayesian shrinkage priors, we devise a computationally convenient approach for high-dimensional matrix factorization. The dependence between row and column entities is modeled by inducing flexible sparse patterns within factors. The availability of external information is accounted for in such a way that structures are allowed while not imposed. Inspired by boosting algorithms, we pair the proposed approach with a numerical strategy relying on a sequential inclusion and estimation of low-rank contributions, with a data-driven stopping rule. Practical advantages of the proposed approach are demonstrated by means of a simulation study and the analysis of soccer heatmaps obtained from new generation tracking data. Supplementary materials for this article are available online.
File in questo prodotto:
File Dimensione Formato  
2212.06504v1.pdf

accesso aperto

Tipologia: Preprint (AM - Author's Manuscript - submitted)
Licenza: Altro
Dimensione 870.72 kB
Formato Adobe PDF
870.72 kB Adobe PDF Visualizza/Apri
Accelerated Structured Matrix Factorization.pdf

Accesso riservato

Tipologia: Published (Publisher's Version of Record)
Licenza: Accesso privato - non pubblico
Dimensione 2.08 MB
Formato Adobe PDF
2.08 MB Adobe PDF Visualizza/Apri   Richiedi una copia
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11577/3540745
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 2
  • OpenAlex 1
social impact