
Robotic Object Sorting via Deep Reinforcement Learning: A generalized approach

Nicola G.; Tagliapietra L.; Tosello E.; Navarin N.; Ghidoni S.; Menegatti E.
2020

Abstract

This work proposes a general formulation of the Object Sorting problem, suitable for describing any non-deterministic environment subject to friendly and adversarial interference. Coupled with a Deep Reinforcement Learning algorithm, this formulation allows policies to be trained for different sorting tasks without adjusting the architecture or modifying the learning method. Briefly, the environment is subdivided into a clutter, where objects are freely located, and a set of clusters, where objects should be placed according to predefined ordering and classification rules. A 3D grid discretizes the environment: the state of each cell is given by the properties of the object it contains, including the object's category and order. The problem is formulated as a Markov Decision Process: at each time step, the states of the cells fully define the state of the environment. Users can define custom object classes, ordering priorities, and failure rules, the last by assigning a non-uniform risk probability to each cell. In the experiments, a Deep Reinforcement Learning model was successfully trained and validated on several sorting tasks while minimizing the number of moves and the failure probability. The results demonstrate that the system can handle non-deterministic events, such as failures, and unpredictable external disturbances, such as human interventions.
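The grid-based state described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `ObjectState`, `GridEnvironment`, `state`, and `move` are hypothetical, and the failure behavior (a failed move leaves the object in its source cell) is an assumption, since the abstract does not specify what happens on failure.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Cell = Tuple[int, int, int]  # (x, y, z) index of a cell in the 3D grid


@dataclass(frozen=True)
class ObjectState:
    """Attributes of an object, which define the state of the cell holding it."""
    category: str  # user-defined object class
    order: int     # user-defined ordering priority


@dataclass
class GridEnvironment:
    """Hypothetical discretized environment: a 3D grid of cells."""
    shape: Tuple[int, int, int]
    cells: Dict[Cell, ObjectState] = field(default_factory=dict)  # occupied cells
    risk: Dict[Cell, float] = field(default_factory=dict)  # per-cell failure probability

    def state(self) -> Tuple[Optional[ObjectState], ...]:
        """Environment state: the state of every cell, listed in a fixed order."""
        nx, ny, nz = self.shape
        return tuple(
            self.cells.get((x, y, z))
            for x in range(nx) for y in range(ny) for z in range(nz)
        )

    def move(self, src: Cell, dst: Cell, rng: random.Random) -> bool:
        """Try to move an object; fail with the risk probability of the target cell."""
        if src not in self.cells or dst in self.cells:
            raise ValueError("source must be occupied and destination empty")
        if rng.random() < self.risk.get(dst, 0.0):
            return False  # assumed failure model: the object stays where it was
        self.cells[dst] = self.cells.pop(src)
        return True
```

In this sketch the state is the tuple of all cell contents, so two environments with the same objects in the same cells are indistinguishable to a policy, which is what makes the process Markovian under the stated assumptions.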
29th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2020
ISBN 978-1-7281-6075-7

Use this identifier to cite or link to this document: http://hdl.handle.net/11577/3358161
Citations
  • PubMed Central: not available
  • Scopus: 3
  • ISI (Web of Science): 2