GelSight dual-modal tactile data compression for machines

Detailed Bibliography
Published in: Digital Signal Processing, Volume 168, p. 105696
Main Authors: Zeng, Yaofeng; Lan, Chengdong; Xu, Yifeng
Format: Journal Article
Language: English
Published: Elsevier Inc., 01.01.2026
ISSN: 1051-2004
Description
Summary: Existing image coding for machines (ICM) methods usually optimize jointly with downstream tasks and transmit task-relevant information in a lossy manner, but they cannot meet the differentiated needs of the dual-modal information in GelSight tactile images. To this end, we propose an end-to-end dual-modal tactile data compression framework that integrates lossy and lossless strategies for differentiated transmission. For the shape-texture modality, we address the feature mismatch that arises when the task inference subnet changes during decoding by proposing a Feature-Semantics Preserving Multi-Branch Decoder (FSPMBD). This decoder reconstructs multi-level semantic features through combinations of different branches and aligns them with a pretrained tactile task model, ensuring that the decoded features remain semantically consistent with downstream tasks. With this design, the framework can flexibly adapt to different task inference subnets without retraining or storing multiple models. To eliminate the statistical redundancy in the force modality more effectively and achieve its lossless transmission at lower bitrates, we build a dual-branch entropy model that exploits the distinct distributions of background and force-marker pixels, yielding more accurate probability modeling. Experiments show that our method enables differentiated transmission of the dual-modal information and, in material classification, reduces the bitrate to 8.3%–33.3% of that of existing methods while maintaining performance comparable to state-of-the-art approaches.
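The intuition behind the dual-branch entropy model can be illustrated with a toy numerical sketch. This is not the paper's implementation: the synthetic image, the marker layout, the discretized-Gaussian model, and the 1-bit-per-pixel mask cost are all assumptions made for illustration. The point is that a single parametric distribution fit to a bimodal image (bright background plus dark force markers) assigns a wide variance and thus long code lengths, whereas modeling each pixel class with its own distribution yields a shorter total code even after paying for the branch mask.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

# Synthetic "force modality" image: bright background with a few dark
# marker dots (all values hypothetical, chosen only to make the data bimodal).
img = np.full((64, 64), 200.0)
img += rng.integers(-5, 6, img.shape)              # background noise
marker_mask = np.zeros(img.shape, dtype=bool)
marker_mask[20:24, 20:24] = True
marker_mask[40:44, 10:14] = True
img[marker_mask] = 30 + rng.integers(-5, 6, int(marker_mask.sum()))

def gaussian_bits(pixels, mu, sigma):
    """Code length in bits of `pixels` under a discretized Gaussian N(mu, sigma)."""
    def cdf(x):
        return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))
    bits = 0.0
    for x in pixels:
        p = max(cdf(x + 0.5) - cdf(x - 0.5), 1e-12)  # probability of the bin
        bits -= np.log2(p)
    return bits

pixels = img.ravel()

# Single-branch model: one Gaussian fit to all pixels.
# The bimodal data inflates sigma, so every pixel codes poorly.
single_bits = gaussian_bits(pixels, pixels.mean(), pixels.std())

# Dual-branch model: a separate Gaussian per class, plus 1 bit/pixel
# for the branch mask (a loose upper bound on the mask cost).
bg, mk = img[~marker_mask], img[marker_mask]
dual_bits = (gaussian_bits(bg, bg.mean(), bg.std())
             + gaussian_bits(mk, mk.mean(), mk.std())
             + img.size)

print(f"single-branch: {single_bits:.0f} bits, dual-branch: {dual_bits:.0f} bits")
```

In practice a learned entropy model would predict the branch probabilities and parameters from context rather than fitting them post hoc, but the sketch shows why splitting background and marker statistics tightens the probability model and lowers the lossless bitrate.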
DOI: 10.1016/j.dsp.2025.105696