Detailed bibliography
| Title: |
A Trustworthy Dataset for APT Intelligence with an Auto-Annotation Framework. |
| Authors: |
Qi, Rui; Xiang, Ga; Zhang, Yangsen; Yang, Qunsheng; Cheng, Mingyue; Zhang, Haoyang; Ma, Mingming; Sun, Lu; Ma, Zhixing |
| Source: |
Electronics (2079-9292); Aug2025, Vol. 14 Issue 16, p3251, 22p |
| Subjects: |
KNOWLEDGE graphs, DATA quality, ANNOTATIONS, CYBERTERRORISM, DATA mining, DATA curation, LANGUAGE models, INTERNET security |
| Abstract: |
Advanced Persistent Threats (APTs) pose significant cybersecurity challenges due to their multi-stage complexity. Knowledge graphs (KGs) effectively model APT attack processes through node-link architectures; however, the scarcity of high-quality, annotated datasets limits research progress. The primary challenge lies in balancing annotation cost and quality, particularly due to the lack of quality assessment methods for graph annotation data. This study addresses these issues by extending existing APT ontology definitions and developing a dynamic, trustworthy annotation framework for APT knowledge graphs. The framework introduces a self-verification mechanism utilizing large language model (LLM) annotation consistency and establishes a comprehensive graph data metric system for problem localization in annotated data. This metric system, based on structural properties, logical consistency, and APT attack chain characteristics, comprehensively evaluates annotation quality across representation, syntax semantics, and topological structure. Experimental results show that this framework significantly reduces annotation costs while maintaining quality. Using this framework, we constructed LAPTKG, a reliable dataset containing over 10,000 entities and relations. Baseline evaluations show substantial improvements in entity and relation extraction performance after metric correction, validating the framework's effectiveness in reliable APT knowledge graph dataset construction. [ABSTRACT FROM AUTHOR] |
| Database: |
Complementary Index |