BrainCLIP: Brain Representation via CLIP for Generic Natural Visual Stimulus Decoding.
| Title: | BrainCLIP: Brain Representation via CLIP for Generic Natural Visual Stimulus Decoding. |
|---|---|
| Authors: | Ma Y, Liu Y, Chen L, Zhu G, Chen B, Zheng N |
| Source: | IEEE transactions on medical imaging [IEEE Trans Med Imaging] 2025 Oct; Vol. 44 (10), pp. 3962-3972. |
| Publication Type: | Journal Article |
| Language: | English |
| Journal Info: | Publisher: Institute of Electrical and Electronics Engineers; Country of Publication: United States; NLM ID: 8310780; Publication Model: Print; Cited Medium: Internet; ISSN: 1558-254X (Electronic); Linking ISSN: 0278-0062; NLM ISO Abbreviation: IEEE Trans Med Imaging; Subsets: MEDLINE |
| Imprint Name(s): | Original Publication: New York, NY : Institute of Electrical and Electronics Engineers, c1982- |
| MeSH Terms: | Magnetic Resonance Imaging*/methods ; Brain*/diagnostic imaging ; Brain*/physiology ; Image Processing, Computer-Assisted*/methods ; Brain Mapping*/methods ; Humans ; Photic Stimulation ; Algorithms ; Adult |
| Abstract: | Functional Magnetic Resonance Imaging (fMRI) presents challenges due to limited paired samples and low signal-to-noise ratios, particularly in tasks involving reconstructing natural images or decoding their semantic content. To address these challenges, we introduce BrainCLIP, an innovative fMRI-based brain decoding model. BrainCLIP leverages Contrastive Language-Image Pre-training's (CLIP) cross-modal generalization abilities to bridge brain activity, images, and text for the first time. Our experiments demonstrate CLIP's effectiveness in diverse brain decoding tasks, including zero-shot visual category decoding, fMRI-image/text alignment, and fMRI-to-image generation. The core objective of BrainCLIP is to train a mapping network that translates fMRI patterns into a unified CLIP embedding space, achieved through visual and textual supervision integration. Our experiments highlight that this approach significantly enhances performance in tasks such as fMRI-text alignment and fMRI-based image generation. Notably, BrainCLIP surpasses BraVL, a recent multi-modal method, in zero-shot visual category decoding. Moreover, BrainCLIP demonstrates strong capability in reconstructing visual stimuli with high semantic fidelity, competing favorably with state-of-the-art methods in capturing high-level semantic features during fMRI-based natural image reconstruction. |
| Entry Date(s): | Date Created: 20250303; Date Completed: 20251027; Latest Revision: 20251028 |
| Update Code: | 20251028 |
| DOI: | 10.1109/TMI.2025.3537287 |
| PMID: | 40031248 |
| Database: | MEDLINE |