EasyMKL: A scalable multiple kernel learning algorithm

AIOLLI, FABIO;DONINI, MICHELE
2015

Abstract

The goal of Multiple Kernel Learning (MKL) is to combine kernels derived from multiple sources in a data-driven way, with the aim of enhancing the accuracy of a target kernel machine. State-of-the-art MKL methods have the drawback that the time required to solve the associated optimization problem grows (typically more than linearly) with the number of kernels to combine. Moreover, it has been empirically observed that even sophisticated methods often do not significantly outperform the simple average of kernels. In this paper, we propose a time- and space-efficient MKL algorithm that can easily cope with hundreds of thousands of kernels and more. The proposed method has been compared with several baselines (random, average, etc.) and with three state-of-the-art MKL methods, showing that our approach is often superior. We show empirically that the advantage of the proposed method is even clearer when noise features are added. Finally, we analyze how the performance of our algorithm changes with the number of examples in the training set and the number of kernels combined.
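As an illustration of the "simple average of kernels" baseline mentioned in the abstract, the following is a minimal sketch of how several precomputed kernel matrices can be combined with non-negative weights. It is not the EasyMKL algorithm itself: the function names (`linear_kernel`, `rbf_kernel`, `combine_kernels`), the choice of base kernels, and the toy data are all assumptions made for the example. Uniform weights give the average baseline; an MKL method would instead learn the weights from data.

```python
import numpy as np


def linear_kernel(X, Z):
    """Linear kernel matrix: K[i, j] = <X[i], Z[j]>."""
    return X @ Z.T


def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||X[i] - Z[j]||^2)."""
    sq_dist = (
        (X ** 2).sum(axis=1)[:, None]
        + (Z ** 2).sum(axis=1)[None, :]
        - 2.0 * (X @ Z.T)
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def combine_kernels(kernels, weights=None):
    """Weighted sum of kernel matrices with weights normalized to sum to 1.

    With weights=None every kernel gets weight 1/R (the average baseline).
    Non-negative weights keep the combination positive semidefinite.
    """
    kernels = np.asarray(kernels, dtype=float)  # shape (R, n, n)
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    weights = weights / weights.sum()
    # Contract the weight vector with the first axis of the kernel stack.
    return np.tensordot(weights, kernels, axes=1)


# Toy data (hypothetical): 6 examples with 3 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))

base_kernels = [linear_kernel(X, X), rbf_kernel(X, X, gamma=0.5)]
K_avg = combine_kernels(base_kernels)                    # average baseline
K_weighted = combine_kernels(base_kernels, [0.2, 0.8])   # hand-picked weights
```

The combined matrix can then be passed to any kernel machine that accepts a precomputed kernel. Note that this sketch stores all kernel matrices in memory at once, which would not scale to the very large numbers of kernels the abstract refers to.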

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3183648