First and zeroth order optimization methods for data science

ZEFFIRO, DAMIANO
2023

Abstract

Recent data science applications involving large datasets often need scalable optimization methods with low per-iteration cost and low memory requirements. This has led to renewed interest in gradient descent methods and in tailored variants for problems where gradient descent is impractical, e.g., due to non-smoothness or stochasticity of the optimization objective. Applications include deep neural network training, adversarial attacks in machine learning, sparse signal recovery, and cluster detection in networks. In this thesis, we focus on the theoretical analysis of some of these methods, as well as on the formulation and numerical testing of new methods with better complexity guarantees than existing ones under suitable conditions. The problems we consider have a continuous, sometimes constrained, and not necessarily differentiable objective. The main contributions concern both variants of the classic Frank-Wolfe (FW) method and direct search schemes. In particular, we prove new support identification properties for FW variants, with an application to a cluster detection problem in networks; we introduce a technique to provably speed up the convergence of FW variants; and we extend some direct search schemes to the stochastic non-smooth setting, as well as to problems defined on Riemannian manifolds.
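As background for the abstract above, the following is a minimal sketch (not taken from the thesis) of one classic Frank-Wolfe iteration on the probability simplex: each step calls a linear minimization oracle instead of a projection, which is what keeps the per-iteration cost low. The objective, function names, and step-size rule below are illustrative assumptions, not the variants studied in the thesis.

# Minimal sketch of the classic Frank-Wolfe method on the probability simplex.
# Illustrative only: the objective and step-size rule are assumptions.
import numpy as np

def frank_wolfe_simplex(grad_f, x0, max_iter=100):
    """Classic Frank-Wolfe with the standard open-loop step size 2/(k+2)."""
    x = x0.copy()
    for k in range(max_iter):
        g = grad_f(x)
        # Linear minimization oracle on the simplex: the minimizing vertex
        # of a linear function is a one-hot vector.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        gamma = 2.0 / (k + 2.0)
        x = x + gamma * (s - x)   # convex combination stays feasible
    return x

# Illustrative quadratic objective f(x) = 0.5 * ||A x - b||^2
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
grad = lambda x: A.T @ (A @ x - b)
x_approx = frank_wolfe_simplex(grad, np.ones(5) / 5)

Direct search methods, by contrast, use only function evaluations (zeroth-order information) at trial points around the current iterate, which is why they extend naturally to the stochastic non-smooth and manifold settings mentioned in the abstract.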
First and zeroth order optimization methods for data science / Zeffiro, Damiano. - (2023 Mar 14).
Files in this record:
thesis_final.pdf (open access)
Description: Thesis
Type: Doctoral thesis
Size: 5.41 MB
Format: Adobe PDF

Items in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3478828