First and zeroth order optimization methods for data science

ZEFFIRO, DAMIANO
2023

Abstract

Recent data science applications involving large datasets often need scalable optimization methods with low per-iteration cost and low memory requirements. This has led to renewed interest in gradient descent methods and in tailored variants for problems where gradient descent is impractical, e.g., due to non-smoothness or stochasticity of the optimization objective. Applications include deep neural network training, adversarial attacks in machine learning, sparse signal recovery, and cluster detection in networks. In this thesis, we focus on the theoretical analysis of some of these methods, as well as on the formulation and numerical testing of new methods with better complexity guarantees than existing ones under suitable conditions. The problems we consider have a continuous, sometimes constrained, and not necessarily differentiable objective. The main contributions concern both variants of the classic Frank-Wolfe (FW) method and direct search schemes. In particular, we prove new support identification properties for FW variants, with an application to a cluster detection problem in networks; we introduce a technique to provably speed up the convergence of FW variants; and we extend some direct search schemes to the stochastic non-smooth setting, as well as to problems defined on Riemannian manifolds.
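As background for the abstract above, the following is a minimal sketch (not taken from the thesis) of one classic Frank-Wolfe iteration on the probability simplex: each step calls a linear minimization oracle instead of a projection, which is what keeps the per-iteration cost low. The objective, function names, and step-size rule below are illustrative assumptions, not the variants studied in the thesis.

# Minimal sketch of the classic Frank-Wolfe method on the probability simplex.
# Illustrative only: the objective and step-size rule are assumptions.
import numpy as np

def frank_wolfe_simplex(grad_f, x0, max_iter=100):
    """Classic Frank-Wolfe with the standard open-loop step size 2/(k+2)."""
    x = x0.copy()
    for k in range(max_iter):
        g = grad_f(x)
        # Linear minimization oracle on the simplex: the minimizing vertex
        # of a linear function is a one-hot vector.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        gamma = 2.0 / (k + 2.0)
        x = x + gamma * (s - x)   # convex combination stays feasible
    return x

# Illustrative quadratic objective f(x) = 0.5 * ||A x - b||^2
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
grad = lambda x: A.T @ (A @ x - b)
x_approx = frank_wolfe_simplex(grad, np.ones(5) / 5)

Direct search methods, by contrast, use only function evaluations (zeroth-order information) at trial points around the current iterate, which is why they extend naturally to the stochastic non-smooth and manifold settings mentioned in the abstract.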
First and zeroth order optimization methods for data science / Zeffiro, Damiano. - (2023 Mar 14).
Files in this record:
thesis_final.pdf (open access)
Description: Thesis
Type: Doctoral thesis
Size: 5.41 MB
Format: Adobe PDF

Items in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3478828