Vision Transformer and Residual Network-Based Autoencoder for RGBD Data Processing in Robotic Grasping of Noodle-Like Objects

Bibliographic details
Published in: 2024 1st International Conference on Robotics, Engineering, Science, and Technology (RESTCON), pp. 85-89
Main authors: Koomklang, Nattapat; Gamolped, Prem; Hayashi, Eiji
Format: Conference proceeding
Language: English
Published: IEEE, 16 February 2024
Online access: Full text
Description
Summary: This study uses an autoencoder built from a Vision Transformer and a residual network to encode RGBD data efficiently, with the aim of improving robotic precision in grasping noodle-like objects. The autoencoder compresses 50x50-pixel RGBD images to a 1024-element representation, reducing the data that downstream robotic applications must process. By combining vision transformers with residual networks, it preserves critical data features through compression and decompression, which is essential for accurate robotic manipulation. The approach is evaluated with Relative Absolute Error (RAE), Relative Squared Error (RSE), Root Mean Square Error (RMSE), and accuracy thresholds. The reconstructed data show high fidelity: accuracy at the 1.25 threshold reaches 0.993 for RGB images and 0.989 for depth images. These results indicate that the autoencoder represents the full data accurately and point to its potential in robotic grasping applications, particularly for objects with complex shapes and textures.
DOI: 10.1109/RESTCON60981.2024.10463551
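
The summary names four evaluation metrics but this record does not give their formulas. The sketch below shows one common way to compute them, assuming the conventions used in depth-estimation work: errors taken relative to the reference value, and threshold accuracy as the fraction of values whose ratio to the reference stays below 1.25. These definitions, the function name `reconstruction_metrics`, and the four-channel reading of "RGBD" used in the size arithmetic are assumptions for illustration, not details taken from the paper.

```python
import math


def reconstruction_metrics(pred, target, threshold=1.25):
    """Compare reconstructed values with reference values.

    Assumed definitions (not confirmed by the catalog record):
      rae      = mean(|p - t| / t)
      rse      = mean((p - t)^2 / t)
      rmse     = sqrt(mean((p - t)^2))
      accuracy = share of values with max(p / t, t / p) < threshold
    Both sequences must have equal length and strictly positive entries.
    """
    if len(pred) != len(target) or len(target) == 0:
        raise ValueError("pred and target must be non-empty and equally long")
    if any(v <= 0 for v in pred) or any(v <= 0 for v in target):
        raise ValueError("ratio-based metrics need strictly positive values")

    n = len(target)
    pairs = list(zip(pred, target))
    rae = sum(abs(p - t) / t for p, t in pairs) / n
    rse = sum((p - t) ** 2 / t for p, t in pairs) / n
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in pairs) / n)
    within = sum(1 for p, t in pairs if max(p / t, t / p) < threshold)
    return {"rae": rae, "rse": rse, "rmse": rmse, "accuracy": within / n}


# Size arithmetic from the summary, assuming 4 channels (R, G, B, D) per pixel.
INPUT_VALUES = 50 * 50 * 4      # values in one 50x50 RGBD image
LATENT_VALUES = 1024            # size of the encoded representation
COMPRESSION_RATIO = INPUT_VALUES / LATENT_VALUES

if __name__ == "__main__":
    # Toy values, not data from the paper.
    reference = [1.0, 2.0, 4.0, 8.0]
    reconstructed = [1.0, 2.2, 4.0, 12.0]
    print(reconstruction_metrics(reconstructed, reference))
    print(f"compression ratio: {COMPRESSION_RATIO:.2f}")
```

Under these assumptions, an accuracy of 0.993 at the 1.25 threshold would mean that 99.3% of reconstructed values lie within a factor of 1.25 of the reference. Under the four-channel assumption, the encoder reduces 10,000 input values to 1024, roughly a tenfold compression.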