
Stereo and Monocular Vision Guidance for Autonomous Aerial and Ground Vehicles

Giubilato, Riccardo
2019

Abstract

Robotic agents vastly increase the return of planetary exploration missions thanks to their ability to perform in-situ measurements. To date, unmanned exploration has been performed by individual robots such as the MER rovers Spirit and Opportunity and, later, MSL Curiosity. A fundamental asset for robotic autonomy is the ability to perceive the surroundings through vision systems such as stereo cameras. Since global localization through GPS-like approaches is unavailable in extra-terrestrial environments, rovers need to measure their own motion in order to understand where they are heading, allowing high-level control loops to be closed and planned routes toward goals of scientific interest to be followed. Visual SLAM (Simultaneous Localization and Mapping) is an effective strategy to fulfill these needs: stereo cameras both reconstruct the environment structure through triangulation and use that structure to localize the cameras while moving. While performing Visual SLAM on constrained resources is still challenging, many state-of-the-art solutions exist for single exploration sessions. However, the future of planetary exploration strongly involves cooperation among teams of heterogeneous robotic agents, and while the SLAM problem is efficiently solved for single sessions and agents, robust solutions for collaborative map merging and re-localization are still topics of active research; they constitute the first major objective of this thesis. A robust re-localization pipeline is proposed and validated, targeted at planetary vehicles equipped with stereo vision systems and allowing them to localize within previously built maps. Unlike common Visual SLAM approaches based exclusively on visual features, the algorithm exploits the invariant nature of 3D point clouds by using compact 3D binary descriptors in conjunction with texture cues. Maps are discretized into submaps, which are represented in a lightweight form using the Bag of Binary Words paradigm (sketched below). The algorithm is tested and validated both in the laboratories of the DLR Robotics and Mechatronics Center and on Mount Etna, Sicily, an outdoor planetary analogue environment.
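The following is only a minimal sketch of that submap-retrieval idea, not the thesis's actual implementation: descriptors are assumed to be packed into plain integers, the vocabulary is a flat list of binary words rather than a vocabulary tree, and the scoring is a generic DBoW2-style L1 similarity.

```python
import numpy as np

def hamming(a, b):
    """Bit-level Hamming distance between two packed binary descriptors."""
    return bin(a ^ b).count("1")

def bow_vector(descriptors, vocabulary):
    """Quantize each binary descriptor to its nearest vocabulary word and
    return an L1-normalized bag-of-binary-words histogram for a submap."""
    hist = np.zeros(len(vocabulary))
    for d in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda i: hamming(d, vocabulary[i]))
        hist[nearest] += 1
    return hist / max(hist.sum(), 1.0)

def submap_score(v_query, v_map):
    """DBoW2-style L1 similarity in [0, 1]; higher means more similar."""
    return 1.0 - 0.5 * float(np.abs(v_query - v_map).sum())
```

In such a scheme, a query submap would be scored against all stored submaps and the best-ranked candidates verified geometrically before a re-localization is accepted.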
The second major research objective involves monocular vision for UAVs. Stereo depth perception is often infeasible for UAVs, as small-baseline systems degenerate to monocular ones once the vehicle takes off. 3D structure can be obtained using Structure-from-Motion approaches, which are, however, unable to recover a global metric scale. Scale is traditionally recovered by integrating accelerations from IMUs, but visual-inertial sensing is delicate, being very sensitive to errors in the extrinsic calibration; in addition, the initialization of a visual-inertial pipeline is challenging and can diverge. These issues hinder the implementation of unsupervised autonomous behaviors on UAVs. To address them, this thesis proposes a sensor fusion approach combining cameras with low-resolution range sensors, exploiting direct range measurements to enforce scale constraints in monocular Visual Odometry. This objective is accomplished in two stages. First, a monocular Visual Odometry pipeline is developed without enforcing strict performance constraints and used in conjunction with a low-resolution Time-of-Flight camera, a lightweight sensor measuring 64 ranges within a narrow field of view; the algorithm is evaluated against both a state-of-the-art stereo Visual SLAM system and a more accurate, though heavier, 2D LiDAR. Finally, a real-time monocular Visual Odometry is developed, exploiting a multi-threaded architecture that tracks the camera pose while concurrently optimizing the scale in the background. This algorithm is tested with a 1D LiDAR altimeter, a minimal range-sensing configuration of just one point per measurement, demonstrating the ability to recover and maintain a correct metric scale along the trajectory with very light and inexpensive off-the-shelf range sensors.
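The sketch below illustrates such a two-thread layout under simplifying assumptions (all names are hypothetical, and a closed-form single-scale fit stands in for the thesis's background optimization): a tracking thread posts pairs of up-to-scale VO altitudes and 1D LiDAR ranges, while a background thread refines the global scale by least squares.

```python
import threading
import queue

class BackgroundScaleEstimator:
    """Refines the metric scale of a monocular VO trajectory from sparse
    1D range (altimeter) measurements, concurrently with pose tracking."""

    def __init__(self):
        self._pairs = queue.Queue()   # (vo_altitude, lidar_range) pairs
        self._lock = threading.Lock()
        self._scale = 1.0
        self._vv = 0.0                # running sum of v * v
        self._vr = 0.0                # running sum of v * r
        threading.Thread(target=self._optimize, daemon=True).start()

    def add_measurement(self, vo_altitude, lidar_range):
        """Called by the tracking thread; never blocks on optimization."""
        self._pairs.put((vo_altitude, lidar_range))

    @property
    def scale(self):
        with self._lock:
            return self._scale

    def _optimize(self):
        while True:
            v, r = self._pairs.get()  # wait for new data from tracking
            self._vv += v * v
            self._vr += v * r
            # Closed-form least squares: the scale s minimizing
            # sum_i (s * v_i - r_i)^2 over all pairs seen so far.
            if self._vv > 0.0:
                with self._lock:
                    self._scale = self._vr / self._vv
```

The tracking thread would then multiply its up-to-scale translations by estimator.scale, so scale refinement never stalls real-time pose tracking.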
2 Dec 2019
Visual SLAM, Visual Systems, Space Robotics, Navigation, Mapping
Stereo and Monocular Vision Guidance for Autonomous Aerial and Ground Vehicles / Giubilato, Riccardo. - (2019 Dec 02).
Files in this record:
MAIN_FINAL.pdf — Open Access from 04/12/2022
Type: Doctoral thesis
License: Not specified
Size: 26.83 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3422709