Adventurer: Optimizing Vision Mamba Architecture Designs for Efficiency

In this work, we introduce the Adventurer series models where we treat images as sequences of patch tokens and employ uni-directional language models to learn visual representations. This modeling paradigm allows us to process images in a recurrent formulation with linear complexity relative to the...

Bibliographic Details
Published in:	Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) Vol. 2025; pp. 30157 - 30166
Main Authors:	Wang, Feng; Yang, Timing; Yu, Yaodong; Ren, Sucheng; Wei, Guoyizhe; Wang, Angtian; Shao, Wei; Zhou, Yuyin; Yuille, Alan; Xie, Cihang
Format:	Conference Proceeding; Journal Article
Language:	English
Published:	United States: IEEE, 01.06.2025
Subjects:	Complexity theory; Computational modeling; Computer vision; Pattern recognition; Predictive models; Throughput; Training; Transformers; Visualization
ISSN:	1063-6919