PreCurious: How Innocent Pre-Trained Language Models Turn into Privacy Traps
| Published in: | Proceedings of the ... ACM Conference on Computer and Communications Security Vol. 2024; p. 3511 |
|---|---|
| Main Authors: | Liu, Ruixuan; Wang, Tianhao; Cao, Yang; Xiong, Li |
| Format: | Journal Article |
| Language: | English |
| Published: | United States, 01.10.2024 |
| Subjects: | |
| ISSN: | 1543-7221 |
| Online Access: | https://doi.org/10.1145/3658644.3690279 |
| Abstract | The pre-training and fine-tuning paradigm has demonstrated its effectiveness and has become the standard approach for tailoring language models to various tasks. Currently, community-based platforms offer easy access to various pre-trained models, as anyone can publish without strict validation processes. However, a released pre-trained model can be a privacy trap for fine-tuning datasets if it is carefully designed. In this work, we propose the PreCurious framework to reveal a new attack surface where the attacker releases the pre-trained model and gets black-box access to the final fine-tuned model. PreCurious aims to escalate the general privacy risk of both membership inference and data extraction on the fine-tuning dataset. The key intuition behind PreCurious is to manipulate the memorization stage of the pre-trained model and guide fine-tuning with a seemingly legitimate configuration. While empirical and theoretical evidence suggests that parameter-efficient and differentially private fine-tuning techniques can defend against privacy attacks on a fine-tuned model, PreCurious demonstrates the possibility of breaking this invulnerability in a stealthy manner compared to fine-tuning on a benign pre-trained model. While DP provides some mitigation against membership inference attacks, by further leveraging a sanitized dataset, PreCurious demonstrates potential vulnerabilities for targeted data extraction even under differentially private tuning with a strict privacy budget, e.g., ϵ = 0.05. Thus, PreCurious raises warnings for users on the potential risks of downloading pre-trained models from unknown sources, relying solely on tutorials or common-sense defenses, and releasing sanitized datasets even after perfect scrubbing. |
|---|---|
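The attack surface described in the abstract, a releaser-crafted pre-trained model that later calibrates a black-box privacy attack on the victim's fine-tuned model, can be illustrated with a minimal reference-calibrated membership inference sketch. This is not the paper's implementation; the model names, the loss-difference score, and the decision threshold below are illustrative assumptions only.

```python
# Hypothetical sketch of reference-calibrated membership inference in the
# spirit of the PreCurious threat model (not the authors' code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def sequence_loss(model, tokenizer, text, device="cpu"):
    """Average per-token cross-entropy of `text` under `model`."""
    enc = tokenizer(text, return_tensors="pt").to(device)
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return out.loss.item()


def membership_score(finetuned, pretrained, tokenizer, text, device="cpu"):
    """Calibrated score: loss under the fine-tuned model minus loss under the
    released pre-trained model. More negative values suggest the text was
    memorized during fine-tuning (i.e. is likely a member)."""
    return (sequence_loss(finetuned, tokenizer, text, device)
            - sequence_loss(pretrained, tokenizer, text, device))


if __name__ == "__main__":
    # "gpt2" is a stand-in for both checkpoints; in the actual threat model the
    # pre-trained model comes from the untrusted releaser and the fine-tuned
    # model is the victim's black-box endpoint.
    tok = AutoTokenizer.from_pretrained("gpt2")
    pretrained = AutoModelForCausalLM.from_pretrained("gpt2").eval()
    finetuned = AutoModelForCausalLM.from_pretrained("gpt2").eval()  # placeholder
    candidate = "A hypothetical candidate training sentence."
    print("calibrated score:", membership_score(finetuned, pretrained, tok, candidate))
```

The pre-trained model serves as the calibration reference here; the paper's point is that when this reference is itself adversarially prepared, such calibration amplifies leakage even under defenses like parameter-efficient or differentially private fine-tuning.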
| Author | Liu, Ruixuan; Wang, Tianhao; Cao, Yang; Xiong, Li |
| Author_xml | Liu, Ruixuan (Emory University, Atlanta, USA); Wang, Tianhao (University of Virginia, Charlottesville, USA); Cao, Yang (Tokyo Institute of Technology, Tokyo, Japan); Xiong, Li (Emory University, Atlanta, USA) |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/40401199 (MEDLINE/PubMed) |
| ContentType | Journal Article |
| DOI | 10.1145/3658644.3690279 |
| Discipline | Computer Science |
| EISSN | 1543-7221 |
| ExternalDocumentID | 40401199 |
| Genre | Journal Article |
| GrantInformation_xml | NIEHS NIH HHS (grant R01 ES033241); NLM NIH HHS (grant R01 LM013712) |
| ISICitedReferencesCount | 3 |
| ISSN | 1543-7221 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Keywords | Pre-Training; Language Model; Privacy Attack |
| Language | English |
| OpenAccessLink | https://pubmed.ncbi.nlm.nih.gov/PMC12094715 |
| PMID | 40401199 |
| PublicationDate | 2024-10-01 |
| PublicationPlace | United States |
| PublicationTitle | Proceedings of the ... ACM Conference on Computer and Communications Security |
| PublicationTitleAlternate | Conf Comput Commun Secur |
| PublicationYear | 2024 |
| SourceID | proquest pubmed |
| SourceType | Aggregation Database Index Database |
| StartPage | 3511 |
| Title | PreCurious: How Innocent Pre-Trained Language Models Turn into Privacy Traps |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/40401199 https://www.proquest.com/docview/3206596599 |
| Volume | 2024 |