Edge Delayed Deep Deterministic Policy Gradient: Efficient Continuous Control for Edge Scenarios

Sinigaglia, Alberto; Turcato, Niccolò; Carli, Ruggero; Susto, Gian Antonio
2025

Abstract

Deep Reinforcement Learning (DRL) has emerged as a powerful paradigm for learning complex policies directly from high-dimensional inputs, enabling advances across a variety of domains. Modern DRL algorithms often rely on dual-network Q-learning architectures to counter overestimation bias when approximating optimal policies. Recent research has introduced approaches that use multiple Q-functions to further mitigate overestimation and improve policy reliability. At the same time, there is growing emphasis on deploying DRL in edge scenarios, where privacy concerns and stringent hardware constraints demand highly efficient algorithms, making the computational and memory efficiency of learning methods critical. In this context, we propose Edge Delayed Deep Deterministic Policy Gradient (EdgeD3), a novel reinforcement learning algorithm designed specifically for edge computing settings. EdgeD3 reduces GPU time by 25% and computational and memory usage by 30%, while consistently matching or surpassing the performance of state-of-the-art algorithms across multiple benchmarks and in real-world tasks.
English
22
Pages 20596-20610 (15 pages)
Institute of Electrical and Electronics Engineers Inc.
Continuous control; deep deterministic policy gradient; deep reinforcement learning; edge computing; Q-learning
01 Journal contribution::01.01 - Journal article
info:eu-repo/semantics/article

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3562487