A unifying framework for the analysis of projection-free first-order methods under a sufficient slope condition
Francesco Rinaldi;Damiano Zeffiro
2020
Abstract
The analysis of projection-free first-order methods is often complicated by the presence of different kinds of "good" and "bad" steps. In this article, we propose a unifying framework for projection-free methods, aiming to simplify the convergence analysis by removing this distinction between steps. The main tool employed in our framework is the Short Step Chain (SSC) procedure, which skips gradient computations in consecutive short steps until proper stopping conditions are satisfied. This technique allows us to give a unified analysis and convergence rates in the general smooth nonconvex setting, as well as convergence rates under a Kurdyka-Łojasiewicz (KL) property, a setting that, to our knowledge, has not been analyzed before for the projection-free methods under study. In this context, we prove local convergence rates comparable to those of projected gradient methods under the same conditions. Our analysis relies on a sufficient slope condition, ensuring that the directions selected by the methods have, up to a constant, the steepest slope possible among feasible directions. This condition is satisfied, among others, by several Frank-Wolfe (FW) variants on polytopes, and by some projection-free methods on convex sets with smooth boundary.

| File | Size | Format |
|---|---|---|
| 2008.09781.pdf (open access). Description: main article. Type: Preprint (submitted version). License: Free access | 830.41 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.