
Finite-Time Analysis of Over-the-Air Federated TD Learning

Dal Fabbro, Nicolò; Schenato, Luca
2025

Abstract

In recent years, federated learning has been widely studied to speed up various supervised learning tasks at the wireless network edge. However, there is a lack of theoretical understanding as to whether similar speedups in sample complexity can be achieved for cooperative reinforcement learning (RL) problems subject to communication constraints. To that end, we study a federated policy evaluation problem over wireless fading channels where, to update model parameters, a central server aggregates local temporal difference (TD) update directions from N agents via analog over-the-air computation (OAC). We refer to this scheme as OAC-FedTD and provide a rigorous finite-time convergence analysis of its performance. Our analysis reveals the impact of the noisy fading channels on the convergence rate and establishes a linear convergence speedup with respect to the number of agents. Notably, this is the first non-asymptotic analysis of a cooperative RL setting under wireless channels that jointly considers linear value function approximation, Markovian sampling, and the OAC channel-induced distortions and noise. Our work develops theoretical foundations that are key to advancing the analysis and design of federated reinforcement learning algorithms over wireless networks.
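The aggregation scheme the abstract describes can be sketched in a few lines. The snippet below is a minimal toy simulation, not the paper's implementation: each agent forms a local TD(0) update direction under linear value-function approximation, the agents' transmissions are summed by the channel with assumed Rayleigh fading and additive receiver noise (standing in for analog OAC), and the server normalizes by N and takes a step. All dimensions, step sizes, the fading model, and the synthetic i.i.d. transitions are illustrative assumptions; in particular, the paper's Markovian sampling is not modeled here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes (not from the paper): d features, N agents.
d, N = 4, 8
theta = np.zeros(d)        # shared value-function parameter
alpha, gamma = 0.05, 0.9   # step size and discount factor (assumed)
noise_std = 0.01           # receiver noise level (assumed)

def td_direction(theta, phi_s, phi_s_next, reward, gamma):
    """Local TD(0) update direction with linear approximation:
    g = (r + gamma * phi(s')^T theta - phi(s)^T theta) * phi(s)."""
    delta = reward + gamma * phi_s_next @ theta - phi_s @ theta
    return delta * phi_s

for t in range(200):
    # Each agent observes a (state, reward, next state) transition.
    # Synthetic i.i.d. features here; the paper handles Markovian sampling.
    directions = []
    for _ in range(N):
        phi_s, phi_s_next = rng.normal(size=d), rng.normal(size=d)
        reward = rng.normal()
        directions.append(td_direction(theta, phi_s, phi_s_next, reward, gamma))

    # Analog over-the-air computation: agents transmit simultaneously, so the
    # channel delivers the fading-scaled sum of their signals plus noise.
    fading = rng.rayleigh(scale=1.0, size=N)   # assumed fading model
    received = sum(h * g for h, g in zip(fading, directions))
    received = received + noise_std * rng.normal(size=d)

    # Server normalizes by the number of agents and takes a TD step.
    theta = theta + alpha * received / N
```

The point of the sketch is structural: the server never sees individual agents' directions, only their channel-distorted sum, which is exactly the distortion-plus-noise term the finite-time analysis must control.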
Files in this item:
No files are associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3589916
Citations
  • PMC: ND
  • Scopus: 2
  • ISI (Web of Science): ND
  • OpenAlex: ND