Zero-Shot Semantic Communication With Multimodal Foundation Models
Most existing semantic communication (SemCom) systems use deep joint source-channel coding (DeepJSCC) to encode task-specific semantics in a goal-oriented manner. However, their reliance on predefined tasks and datasets significantly limits their flexibility and generalizability in practical deploym...
| Published in: | IEEE Transactions on Vehicular Technology, pp. 1-6 |
|---|---|
| Main authors: | Hu, Jiangjing; Wu, Haotian; Zhang, Wenjing; Wang, Fengyu; Xu, Wenjun; Gao, Hui; Gunduz, Deniz |
| Format: | Journal Article |
| Language: | English |
| Published: | IEEE, 2025 |
| Subjects: | |
| ISSN: | 0018-9545, 1939-9359 |
| Online access: | Get full text |
| Abstract | Most existing semantic communication (SemCom) systems use deep joint source-channel coding (DeepJSCC) to encode task-specific semantics in a goal-oriented manner. However, their reliance on predefined tasks and datasets significantly limits their flexibility and generalizability in practical deployments. Multi-modal foundation models provide a promising solution by generating universal semantic tokens. Inspired by this, in this paper, we propose SemCLIP, a zero-shot SemCom framework leveraging the contrastive language-image pre-training (CLIP) model. CLIP-generated image tokens are transmitted in SemCLIP under low bandwidth and challenging channel conditions, facilitating diverse zero-shot applications. Specifically, we propose a DeepJSCC scheme for efficient CLIP token encoding. To mitigate potential degradation caused by compression and channel noise, a multi-modal transmission-aware prompt learning (TAPL) mechanism is designed at the receiver, which adapts prompts based on transmission quality, enhancing system robustness and channel adaptability. Simulation results demonstrate that SemCLIP outperforms the baselines, achieving a 41% improvement in zero-shot performance at low signal-to-noise ratios. Meanwhile, SemCLIP reduces bandwidth usage by more than 50-fold compared to alternative image transmission methods, demonstrating the potential of foundation models towards a generalized, task-agnostic SemCom solution. |
|---|---|
| Author | Wang, Fengyu; Gao, Hui; Hu, Jiangjing; Zhang, Wenjing; Gunduz, Deniz; Wu, Haotian; Xu, Wenjun |
| Author_xml | – sequence: 1 givenname: Jiangjing surname: Hu fullname: Hu, Jiangjing organization: Beijing University of Posts and Telecommunications, Beijing, China – sequence: 2 givenname: Haotian surname: Wu fullname: Wu, Haotian organization: Department of Electrical and Electronic Engineering, Imperial College London, London, U.K – sequence: 3 givenname: Wenjing surname: Zhang fullname: Zhang, Wenjing organization: Beijing University of Posts and Telecommunications, Beijing, China – sequence: 4 givenname: Fengyu surname: Wang fullname: Wang, Fengyu email: fengyu.wang@bupt.edu.cn organization: Beijing University of Posts and Telecommunications, Beijing, China – sequence: 5 givenname: Wenjun surname: Xu fullname: Xu, Wenjun organization: Beijing University of Posts and Telecommunications, Beijing, China – sequence: 6 givenname: Hui surname: Gao fullname: Gao, Hui organization: Beijing University of Posts and Telecommunications, Beijing, China – sequence: 7 givenname: Deniz surname: Gunduz fullname: Gunduz, Deniz organization: Department of Electrical and Electronic Engineering, Imperial College London, London, U.K |
| CODEN | ITVTAB |
| ContentType | Journal Article |
| DBID | 97E RIA RIE AAYXX CITATION |
| DOI | 10.1109/TVT.2025.3632893 |
| DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005-present IEEE All-Society Periodicals Package (ASPP) 1998-Present IEEE Electronic Library (IEL) CrossRef |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Xplore url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering |
| EISSN | 1939-9359 |
| EndPage | 6 |
| ExternalDocumentID | 10_1109_TVT_2025_3632893 11249447 |
| Genre | orig-research |
| IEDL.DBID | RIE |
| ISSN | 0018-9545 |
| IngestDate | Sat Nov 29 06:53:56 EST 2025 Wed Nov 26 07:22:37 EST 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| LinkModel | DirectLink |
| PageCount | 6 |
| ParticipantIDs | crossref_primary_10_1109_TVT_2025_3632893 ieee_primary_11249447 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-00-00 |
| PublicationDateYYYYMMDD | 2025-01-01 |
| PublicationDate_xml | – year: 2025 text: 2025-00-00 |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE transactions on vehicular technology |
| PublicationTitleAbbrev | TVT |
| PublicationYear | 2025 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssj0014491 |
| Score | 2.4514177 |
| SourceID | crossref ieee |
| SourceType | Index Database Publisher |
| StartPage | 1 |
| SubjectTerms | Bandwidth Deep joint source-channel coding Encoding Feature extraction Foundation models Image reconstruction prompt optimization Receivers Robustness Semantic communication semantic communications Signal to noise ratio token communications Vectors |
| Title | Zero-Shot Semantic Communication With Multimodal Foundation Models |
| URI | https://ieeexplore.ieee.org/document/11249447 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVIEE databaseName: IEEE Xplore customDbUrl: eissn: 1939-9359 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0014491 issn: 0018-9545 databaseCode: RIE dateStart: 19670101 isFulltext: true titleUrlDefault: https://ieeexplore.ieee.org/ providerName: IEEE |
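
The setting summarised in the abstract (an embedding sent over a noisy channel, then used for zero-shot classification at the receiver) can be illustrated with a toy simulation. This is only a minimal sketch under loud assumptions and is not the SemCLIP implementation: there is no CLIP model, no DeepJSCC encoder, and no TAPL. Random unit vectors stand in for CLIP text and image embeddings, the channel is plain additive white Gaussian noise applied directly to the embedding, and every name below (`unit`, `awgn`, `zero_shot_label`, `classes`, `image`) is hypothetical.

```python
import math
import random


def unit(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def awgn(v, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power equals snr_db (in dB)."""
    power = sum(x * x for x in v) / len(v)
    sigma = math.sqrt(power / (10 ** (snr_db / 10)))
    return [x + rng.gauss(0.0, sigma) for x in v]


def zero_shot_label(embedding, class_embeddings):
    """Return the index of the unit-norm class embedding with the highest cosine similarity."""
    e = unit(embedding)
    scores = [sum(a * b for a, b in zip(e, c)) for c in class_embeddings]
    return max(range(len(scores)), key=scores.__getitem__)


rng = random.Random(0)
dim, n_classes = 512, 10

# Assumption: random unit vectors stand in for text embeddings of class prompts.
classes = [unit([rng.gauss(0.0, 1.0) for _ in range(dim)]) for _ in range(n_classes)]

# Assumption: the "image" embedding is class 3 plus a small perturbation.
image = unit([c + 0.02 * rng.gauss(0.0, 1.0) for c in classes[3]])

if __name__ == "__main__":
    # Classify the received embedding at a few channel qualities.
    for snr_db in (20.0, 0.0, -20.0):
        received = awgn(image, snr_db, random.Random(1))
        print(snr_db, zero_shot_label(received, classes))
```

The sketch only shows where channel noise enters a zero-shot pipeline; it says nothing about the 41% or 50-fold figures reported in the abstract, which depend on the learned encoder and the prompt adaptation that are omitted here.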