Research on multi-granularity password analysis based on LLM
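TP309; Password-based authentication is a common identity authentication mechanism. However, large-scale password leaks occur from time to time, showing that passwords still face risks such as being guessed or stolen. Because passwords can be regarded as a special kind of natural language, research applying natural language processing techniques to password analysis has gradually emerged in recent years. Few studies have so far examined, on large language models (LLMs), how the tokenization granularity of password text affects password analysis. To this end, an LLM-based multi-granularity password analysis framework is proposed; it broadly follows the pre-training paradigm and learns prior knowledge of password distributions on its own from large unlabeled datasets. The framework consists of three modules: a synchronization network, a backbone network, and a tail network. The synchronization network module implements char-level, template-level, and chunk-level...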
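The three segmentation granularities named in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the tokenizers the authors use, so everything below is an assumption for illustration: `template_level` uses a PCFG-style letter/digit/symbol run encoding, and `chunk_level` uses greedy longest-match against a hypothetical chunk vocabulary (`vocab`). The function names are also hypothetical.

```python
from __future__ import annotations

from itertools import groupby


def char_class(ch: str) -> str:
    """Map a character to a coarse class: L (letter), D (digit), or S (symbol)."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"


def char_level(password: str) -> list[str]:
    """Char-level: every character is its own token."""
    return list(password)


def template_level(password: str) -> list[str]:
    """Template-level (assumed PCFG-style): one token per run of same-class characters."""
    return [f"{cls}{len(list(run))}" for cls, run in groupby(password, key=char_class)]


def chunk_level(password: str, vocab: set[str]) -> list[str]:
    """Chunk-level (assumed): greedy longest match against a chunk vocabulary.

    Characters not covered by any vocabulary entry fall back to single-character tokens.
    """
    tokens: list[str] = []
    max_len = max((len(v) for v in vocab), default=1)
    i = 0
    while i < len(password):
        # Try the longest candidate first and shrink until a match or a single character.
        for size in range(min(max_len, len(password) - i), 0, -1):
            piece = password[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens
```

For a password such as `love2024!`, the char-level view yields nine single-character tokens, the template-level view yields `L4`, `D4`, `S1`, and the chunk-level view depends entirely on the vocabulary supplied. In practice, a chunk vocabulary would more plausibly be learned from leaked-password corpora than written by hand.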
| Published in: | 网络与信息安全学报 (Chinese Journal of Network and Information Security) Vol. 10; No. 1; pp. 112-122 |
|---|---|
| Main authors: | HONG Meng, QIU Weidong, WANG Yangde |
| Format: | Journal Article |
| Language: | Chinese |
| Published: | 上海交通大学网络空间安全学院 (School of Cyberspace Security, Shanghai Jiao Tong University), Shanghai 200240; 2024-02-25 |
| Subjects: | |
| ISSN: | 2096-109X |
| Online access: | Full text |
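| Abstract | TP309; Password-based authentication is a common identity authentication mechanism. However, large-scale password leaks occur from time to time, showing that passwords still face risks such as being guessed or stolen. Because passwords can be regarded as a special kind of natural language, research applying natural language processing techniques to password analysis has gradually emerged in recent years. Few studies have so far examined, on large language models (LLMs), how the tokenization granularity of password text affects password analysis. To this end, an LLM-based multi-granularity password analysis framework is proposed; it broadly follows the pre-training paradigm and learns prior knowledge of password distributions on its own from large unlabeled datasets. The framework consists of three modules: a synchronization network, a backbone network, and a tail network. The synchronization network module implements password segmentation at three granularities (char-level, template-level, and chunk-level) and extracts feature knowledge such as character distribution, structure, and chunk composition; the backbone network module builds a general password model to learn the regularities of password composition; the tail network module generates candidate passwords to perform guessing analysis on target datasets. Extensive experiments on eight password datasets, including Tianya and Twitter, analyze and summarize the framework's password analysis performance in different language environments under multi-granularity segmentation. The results show that, in Chinese user scenarios, the frameworks based on char-level and chunk-level segmentation perform nearly identically and significantly outperform the framework based on template-level segmentation; in English user scenarios, the framework based on chunk-level segmentation performs best. |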
|---|---|
| Abstract_FL | Password-based authentication has been widely used as the primary authentication mechanism. However, occasional large-scale password leaks have highlighted the vulnerability of passwords to risks such as guessing or theft. In recent years, research on password analysis using natural language processing techniques has progressed, treating passwords as a special form of natural language. Nevertheless, limited studies have investigated the impact of password text segmentation granularity on the effectiveness of password analysis with large language models. A multi-granularity password-analyzing framework was proposed based on a large language model, which follows the pre-training paradigm and autonomously learns prior knowledge of password distribution from large unlabelled datasets. The framework comprised three modules: the synchronization network, backbone network, and tail network. The synchronization network module implemented char-level, template-level, and chunk-level password segmentation, extracting knowledge on character distribution, structure, word chunk composition, and other password features. The backbone network module constructed a generic password model to learn the rules governing password composition. The tail network module generated candidate passwords for guessing and analyzing target databases. Experimental evaluations were conducted on eight password databases including Tianya and Twitter, analyzing and summarizing the effectiveness of the proposed framework under different language environments and word segmentation granularities. The results indicate that in Chinese user scenarios, the performance of the password-analyzing framework based on char-level and chunk-level segmentation is comparable, and significantly superior to the framework based on template-level segmentation. In English user scenarios, the framework based on chunk-level segmentation demonstrates the best password-analyzing performance. |
| Author | 王杨德 邱卫东 洪萌 |
| AuthorAffiliation | 上海交通大学网络空间安全学院,上海 200240 |
| AuthorAffiliation_xml | – name: 上海交通大学网络空间安全学院,上海 200240 |
| Author_FL | HONG Meng QIU Weidong WANG Yangde |
| Author_FL_xml | – sequence: 1 fullname: HONG Meng – sequence: 2 fullname: QIU Weidong – sequence: 3 fullname: WANG Yangde |
| Author_xml | – sequence: 1 fullname: 洪萌 – sequence: 2 fullname: 邱卫东 – sequence: 3 fullname: 王杨德 |
| ClassificationCodes | TP309 |
| ContentType | Journal Article |
| Copyright | Copyright © Wanfang Data Co. Ltd. All Rights Reserved. |
| Copyright_xml | – notice: Copyright © Wanfang Data Co. Ltd. All Rights Reserved. |
| DBID | 2B. 4A8 92I 93N PSX TCJ |
| DOI | 10.11959/j.issn.2096-109x.2024008 |
| DatabaseName | Wanfang Data Journals - Hong Kong; WANFANG Data Centre; Wanfang Data Journals; Wanfang Data Journals - Hong Kong Edition (万方数据期刊 - 香港版); China Online Journals (COJ); China Online Journals (COJ) |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| DocumentTitle_FL | Research on multi-granularity password analysis based on LLM |
| EndPage | 122 |
| ExternalDocumentID | wlyxxaqxb202401009 |
| GroupedDBID | 2B. 4A8 92I 93N ALMA_UNASSIGNED_HOLDINGS M~E PSX TCJ |
| ID | FETCH-LOGICAL-s1059-e6688cfc9c49aa4e6593c24c5501e70b3dd8376a21fdc7f25798d4df77dedb2e3 |
| ISSN | 2096-109X |
| IngestDate | Thu May 29 03:56:41 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Issue | 1 |
| Keywords | word segmentation (分词); natural language processing (自然语言处理); large language model (大语言模型); password analysis (口令分析) |
| Language | Chinese |
| LinkModel | OpenURL |
| MergedId | FETCHMERGED-LOGICAL-s1059-e6688cfc9c49aa4e6593c24c5501e70b3dd8376a21fdc7f25798d4df77dedb2e3 |
| PageCount | 11 |
| ParticipantIDs | wanfang_journals_wlyxxaqxb202401009 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-02-25 |
| PublicationDateYYYYMMDD | 2024-02-25 |
| PublicationDate_xml | – month: 02 year: 2024 text: 2024-02-25 day: 25 |
| PublicationDecade | 2020 |
| PublicationTitle | 网络与信息安全学报 |
| PublicationTitle_FL | Chinese Journal of Network and Information Security |
| PublicationYear | 2024 |
| Publisher | 上海交通大学网络空间安全学院,上海 200240 |
| Publisher_xml | – name: 上海交通大学网络空间安全学院,上海 200240 |
| SSID | ssib058759024 |
| Score | 2.3697007 |
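| Snippet | TP309; Password-based authentication is a common identity authentication mechanism. However, large-scale password leaks occur from time to time, showing that passwords still face risks such as being guessed or stolen. Because passwords can be regarded as a special kind of natural language, in recent years the use of... |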
| SourceID | wanfang |
| SourceType | Aggregation Database |
| StartPage | 112 |
| Title | Research on multi-granularity password analysis based on LLM (基于LLM的多粒度口令分析研究) |
| URI | https://d.wanfangdata.com.cn/periodical/wlyxxaqxb202401009 |
| Volume | 10 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVHPJ databaseName: ROAD: Directory of Open Access Scholarly Resources issn: 2096-109X databaseCode: M~E dateStart: 20150101 customDbUrl: isFulltext: true dateEnd: 99991231 titleUrlDefault: https://road.issn.org omitProxy: false ssIdentifier: ssib058759024 providerName: ISSN International Centre |
| link | http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwtR3LahRBsIlRxIsoKr6JaONp40xPz3TXsWczi4ckeIiQW5inBsJEszGOHjyJeFIRTzmoYBC8evCw_k7W_Q2ra2Z2FxYlHrw0RXV1VXVV91TX0A_GbiVFmsWBEh1ca3sd6WPCGufasz_yVay8XBdJTI9NqNVVvb4O9-aObbZnYfa2VFnqqoJH_9XViENn26Oz_-DuMVNEIIxOxxLdjuWRHM8jn0OPh4ZH0pY6Wl5e4ZHigLC0tUZaGDGh4CAsBslMYAHd48ajhqElsxjNNVYFHCIOjm1lHA6SAOBhML26JZ5LHFwCQg5dYoUcIgKQuWtZaZebHmkScQ0kBWFNmCXSBGlQJb8dDhYRSm5Qa2210N1JDXAteOgSF2wcNiJhikRZBXRIvVhqBIURD9X0Pw8h6Qy5Px6lre6GpGML1VjUGgZoi4hpzGnUlO6zNqhNZWwrULYjf-s60gA3Asf-rAL2AkhRXzxVf70F5oYY4-id4EmocWamVB033GYveb0Eceuj2rPRDXyg8GYlLLYSqkVrIsfRk5A-3mj5dOtZVcWPq4RIXDrqelwoH-zux5UXUfvp9TFtBYeegB4rfpLdaIXe-ZNIOupWFnH5YGpVtnaGnW7SqQVTT4OzbO75w3Ps9vDT4HDwBof9aP_l8GB_9P39cPB1-PbL4c-D4etXvz6-G33-MPr24zy734vWunc7zYMgnb5NAzp5EGidFimkEuJY5oEPXipkilm2mysn8bJMY8CMhVtkqSowGoHOZFYoleVZInLvApsvt8v8IluQuigwmfDi3JMSa7XwRRLTdXy5L934ErvZdGyjmdv9jVljXj4S1RV2ajKIr7L53Z0n-TV2It3b3ezvXCc__AbZ5HYO |
| linkProvider | ISSN International Centre |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=%E5%9F%BA%E4%BA%8ELLM%E7%9A%84%E5%A4%9A%E7%B2%92%E5%BA%A6%E5%8F%A3%E4%BB%A4%E5%88%86%E6%9E%90%E7%A0%94%E7%A9%B6&rft.jtitle=%E7%BD%91%E7%BB%9C%E4%B8%8E%E4%BF%A1%E6%81%AF%E5%AE%89%E5%85%A8%E5%AD%A6%E6%8A%A5&rft.au=%E6%B4%AA%E8%90%8C&rft.au=%E9%82%B1%E5%8D%AB%E4%B8%9C&rft.au=%E7%8E%8B%E6%9D%A8%E5%BE%B7&rft.date=2024-02-25&rft.pub=%E4%B8%8A%E6%B5%B7%E4%BA%A4%E9%80%9A%E5%A4%A7%E5%AD%A6%E7%BD%91%E7%BB%9C%E7%A9%BA%E9%97%B4%E5%AE%89%E5%85%A8%E5%AD%A6%E9%99%A2%2C%E4%B8%8A%E6%B5%B7+200240&rft.issn=2096-109X&rft.volume=10&rft.issue=1&rft.spage=112&rft.epage=122&rft_id=info:doi/10.11959%2Fj.issn.2096-109x.2024008&rft.externalDocID=wlyxxaqxb202401009 |
| thumbnail_s | http://cvtisr.summon.serialssolutions.com/2.0.0/image/custom?url=http%3A%2F%2Fwww.wanfangdata.com.cn%2Fimages%2FPeriodicalImages%2Fwlyxxaqxb%2Fwlyxxaqxb.jpg |