GelSight dual-modal tactile data compression for machines
| Published in: | Digital signal processing Vol. 168; p. 105696 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Inc, 01.01.2026 |
| Subjects: | |
| ISSN: | 1051-2004 |
| Online Access: | Full text |
| Summary: | Existing image coding for machines methods usually optimize jointly with downstream tasks and transmit task-relevant information in a lossy manner, but they cannot meet the differentiated needs of the dual-modal information in GelSight tactile images. To this end, we propose an end-to-end dual-modal tactile data compression framework that integrates lossy and lossless strategies for differentiated transmission. For the shape-texture modality, we address the feature mismatch problem that occurs when the task inference subnet changes during decoding by proposing a Feature-Semantics Preserving Multi-Branch Decoder (FSPMBD). This decoder reconstructs multi-level semantic features through combinations of different branches and aligns them with a pretrained tactile task model, thereby ensuring semantic consistency of the decoded features with downstream tasks. With this design, the framework can flexibly adapt to different task inference subnets without the need to retrain or store multiple models. For the force modality, to eliminate statistical redundancy more effectively and achieve lossless transmission at lower bitrates, we build a dual-branch entropy model that leverages the distributions of background and force marker pixels, achieving more accurate probability modeling. Experiments show that our method enables differentiated transmission of dual-modal information and, in material classification, reduces the bitrate to 8.3%-33.3% of existing methods while maintaining performance comparable to state-of-the-art approaches. |
| DOI: | 10.1016/j.dsp.2025.105696 |
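
The summary's claim that modeling background and force marker pixels separately gives more accurate probability modeling can be illustrated with a toy calculation. The sketch below is not the authors' learned dual-branch entropy model: it uses plain empirical histograms, synthetic pixel values, and a hypothetical marker mask that is assumed to be available to the decoder at no extra cost. Under those assumptions it compares the ideal lossless code length of one pooled distribution against two class-conditional distributions.

```python
import math
import random
from collections import Counter


def entropy_bits(values):
    """Empirical Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def ideal_bits_per_pixel(pixels, mask):
    """Return (pooled, class_conditional) ideal code lengths in bits per pixel.

    mask[i] is True for pixels assumed to belong to a force marker and
    False for background. The cost of signalling the mask is ignored here.
    """
    pooled = entropy_bits(pixels)
    marker = [p for p, m in zip(pixels, mask) if m]
    background = [p for p, m in zip(pixels, mask) if not m]
    n = len(pixels)
    # Weighted average of the per-class entropies (conditional entropy).
    conditional = sum(
        len(group) / n * entropy_bits(group)
        for group in (marker, background)
        if group
    )
    return pooled, conditional


# Synthetic data (an assumption): about 10% dark marker pixels on a bright background.
random.seed(0)
mask = [random.random() < 0.1 for _ in range(10_000)]
pixels = [random.randint(0, 40) if m else random.randint(180, 220) for m in mask]

pooled, conditional = ideal_bits_per_pixel(pixels, mask)
print(f"pooled model:      {pooled:.3f} bits/pixel")
print(f"two-branch model:  {conditional:.3f} bits/pixel")
```

Conditioning on the pixel class can never increase the empirical entropy, so the two-branch figure is at most the pooled one. How much is saved in practice depends on how different the two distributions are and on how the class of each pixel is made known to the decoder.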