TOWARDS LOWER BOUNDS ON THE DEPTH OF RELU NEURAL NETWORKS

We contribute to a better understanding of the class of functions that can be represented by a neural network with ReLU activations and a given architecture. Using techniques from mixed-integer optimization, polyhedral theory, and tropical geometry, we provide a mathematical counterbalance to the universal approximation theorems which suggest that a single hidden layer is sufficient for learning any function. In particular, we investigate whether the class of exactly representable functions strictly increases by adding more layers (with no restrictions on size). As a by-product of our investigations, we settle an old conjecture about piecewise linear functions by Wang and Sun [IEEE Trans. Inform. Theory, 51 (2005), pp. 4425--4431] in the affirmative. We also present upper bounds on the sizes of neural networks required to represent functions with logarithmic depth.
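To make "exactly representable" concrete, here is a minimal sketch that is not taken from this record. It relies on the standard identity max(x, y) = relu(x - y) + relu(y) - relu(-y), which computes the maximum of two numbers with one hidden ReLU layer, and on a pairwise tournament that computes the maximum of n numbers with ceil(log2(n)) hidden layers. The helper names `relu`, `max2`, and `max_n` are hypothetical and chosen only for illustration.

```python
def relu(t):
    """ReLU activation: max(t, 0)."""
    return max(t, 0.0)


def max2(x, y):
    """Maximum of two numbers as one hidden layer with three ReLU neurons.

    Uses max(x, y) = relu(x - y) + relu(y) - relu(-y); the output is an
    affine combination of the three hidden neurons.
    """
    return relu(x - y) + relu(y) - relu(-y)


def max_n(values):
    """Maximum of n numbers via a pairwise tournament of max2 gadgets.

    Returns (maximum, number of hidden layers used). Each round halves the
    number of candidates, so the depth is ceil(log2(n)).
    """
    layer = list(values)
    depth = 0
    while len(layer) > 1:
        nxt = [max2(layer[i], layer[i + 1]) for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2 == 1:
            # An unpaired value is carried forward; in a real network this is
            # the identity t = relu(t) - relu(-t).
            nxt.append(layer[-1])
        layer = nxt
        depth += 1
    return layer[0], depth


result, hidden_layers = max_n([3.0, -1.0, 4.0, 1.0, -5.0])
print(result, hidden_layers)
```

The sketch only shows that logarithmic depth suffices for this one function. Whether fewer layers can ever suffice is the kind of lower-bound question the abstract describes.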

Hertrich, C; Basu, A; Di Summa, M; Skutella, M

2023

Year: 2023
Journal: SIAM JOURNAL ON DISCRETE MATHEMATICS
DOI: https://dx.doi.org/10.1137/22M1489332
WOS code: WOS:001041790200016
Scopus code: 2-s2.0-85163477079
Publication type: 01.01 - Journal article

Files in this item:

File	Size	Format
2105.14835.pdf (open access; description: article; type: Accepted (AAM - Author's Accepted Manuscript); license: free access)	423.54 kB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3493686
