
Understanding Catastrophic Forgetting of Gated Linear Networks in Continual Learning

Pasa, L.; Navarin, N.
2022

Abstract

In this paper, we consider Gated Linear Networks (GLNs), a recently proposed family of continual learning models, and study two crucial aspects that impact the amount of catastrophic forgetting GLNs exhibit: data standardization and the gating mechanism. Data standardization is particularly challenging in the online/continual learning setting because data from future tasks is not available beforehand. The results obtained with an online standardization method show a considerably higher amount of forgetting than an offline (static) standardization; interestingly, with the latter, GLNs show almost no forgetting on the considered benchmark datasets. Secondly, for GLNs to be effective, the hyperparameters of the gating mechanism must be tailored to the data distribution. We therefore propose a gating strategy based on a set of prototypes and the resulting Voronoi tessellation. The experimental assessment shows that the proposed approach is more robust to different data standardizations than the original halfspace gating mechanism and achieves improved predictive performance.
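To make the abstract's two ingredients concrete, below is a minimal, illustrative Python sketch (not the authors' code): an online standardizer based on running statistics, plus the two gating mechanisms discussed above, halfspace gating and prototype-based (Voronoi) gating. All names, dimensions, and the Welford-style update are our own assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    class OnlineStandardizer:
        """Running mean/variance (Welford's algorithm): each input is
        standardized with the statistics seen so far, since data from
        future tasks is unavailable in the online setting."""
        def __init__(self, d):
            self.n, self.mean, self.m2 = 0, np.zeros(d), np.zeros(d)

        def update(self, x):
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)

        def transform(self, x):
            std = np.sqrt(self.m2 / max(self.n - 1, 1)) + 1e-8
            return (x - self.mean) / std

    def halfspace_context(x, hyperplanes, biases):
        # Original GLN gating: the context id is the bit pattern recording
        # on which side of each random hyperplane the input falls.
        bits = (hyperplanes @ x > biases).astype(int)
        return int(bits @ (2 ** np.arange(len(bits))))

    def voronoi_context(x, prototypes):
        # Prototype-based gating: the context id is the index of the
        # nearest prototype, i.e. the Voronoi cell containing x.
        return int(np.argmin(np.linalg.norm(prototypes - x, axis=1)))

    # Toy usage with made-up dimensions (illustrative only).
    d, n_hyp, n_proto = 8, 4, 16
    H = rng.standard_normal((n_hyp, d))
    b = rng.standard_normal(n_hyp)
    P = rng.standard_normal((n_proto, d))
    std = OnlineStandardizer(d)
    x = rng.standard_normal(d)
    std.update(x)
    z = std.transform(x)
    print(halfspace_context(z, H, b), voronoi_context(z, P))

In a GLN, each neuron uses such a context id to select which of its weight vectors to apply to the input; the sketch only illustrates how the context is computed under each gating scheme.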
2022 International Joint Conference on Neural Networks (IJCNN)
978-1-7281-8671-9

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3472506
Citations
  • Scopus: 2
  • Web of Science: 1