Feature mining and classifier selection for API calls-based malware detection.
| Title: | Feature mining and classifier selection for API calls-based malware detection. |
|---|---|
| Authors: | Balan, Gheorghe; Simion, Ciprian-Alin; Gavriluţ, Dragoş Teodor; Luchian, Henri |
| Source: | Applied Intelligence; Dec 2023, Vol. 53, Issue 23, pp. 29094-29108 (15 pages) |
| Subject terms: | MACHINE learning; MALWARE; DATABASES; FEATURE selection; APPLICATION program interfaces; MACHINE performance; DECISION trees |
| Abstract: | This paper deals with a major challenge in cyber-security: the need to respond to the ever-renewed techniques that attackers use to avoid detection based on analysing static features of malware. These techniques consist of various changes in file geometry, entropy, and so on. As a consequence, static feature sets describe malicious files less and less accurately; hence, the performance of machine learning models in detecting new variants of the same malware family may be severely impaired. The paper focuses on a promising approach to this detection challenge: defining file features based on sequences of OS (operating system) API (Application Program Interface) calls. We explore in detail the detection potential of such features, since malware must make these calls in order to act maliciously and they are therefore highly unlikely to be hidden. We studied several tens of thousands of such features, a modest-sized subset of which was subsequently fed to several machine learning models. The database used for training and testing consists of 1.5 million files, including malicious files from the polymorphic families Emotet and Trickbot. Using this database, nearly 4,000 pairings (classifier, feature selection algorithm) were trained and tested. Our experimental results show that the API calls-oriented feature mining process is well suited for detecting polymorphic malware. A comparative discussion of the detection results of the various models is presented; depending on the target optimisation criterion (detection rate / false positive rate / saving resources), three of the 4,000 classification models turn out to be best suited for real-world applications: Random Forest, Legacy Neural Networks and Decision Tree. [ABSTRACT FROM AUTHOR] |
| Database: | Complementary Index |
| DOI: | 10.1007/s10489-023-05086-2 |
| ISSN: | 0924-669X (print) |
| Language: | English |
| Publication type: | Academic Journal |
| Accession number: | 173923740 |
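The approach summarised in the abstract (mining features from API call sequences, selecting a subset, and pairing feature selection algorithms with classifiers) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say how the features are defined or which selection algorithms were used, so the n-gram features, the frequency-difference score, the toy traces, and the selector names are hypothetical and are not the authors' method.

```python
from collections import Counter
from itertools import product


def api_ngrams(calls, n=2):
    """Return the set of length-n API call subsequences found in one trace."""
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}


def select_features(traces, labels, n=2, k=2):
    """Keep the k n-grams whose malicious and benign frequencies differ most.

    This scoring rule is an assumed stand-in for a real feature selection
    algorithm; the abstract does not name the ones the paper evaluates.
    """
    mal, ben = Counter(), Counter()
    n_mal = sum(labels)
    n_ben = len(labels) - n_mal
    for calls, label in zip(traces, labels):
        (mal if label else ben).update(api_ngrams(calls, n))

    def score(gram):
        return abs(mal[gram] / max(n_mal, 1) - ben[gram] / max(n_ben, 1))

    candidates = set(mal) | set(ben)
    # Sort by descending score, then by the n-gram itself for determinism.
    return sorted(candidates, key=lambda g: (-score(g), g))[:k]


def vectorize(calls, features, n=2):
    """Encode one trace as a binary vector over the selected n-grams."""
    present = api_ngrams(calls, n)
    return [1 if f in present else 0 for f in features]


# Hypothetical toy traces (1 = malicious, 0 = benign), not data from the paper.
traces = [
    ["VirtualAlloc", "WriteProcessMemory", "CreateRemoteThread"],
    ["OpenProcess", "VirtualAlloc", "WriteProcessMemory", "CreateRemoteThread"],
    ["CreateFileW", "ReadFile", "CloseHandle"],
    ["CreateFileW", "WriteFile", "CloseHandle"],
]
labels = [1, 1, 0, 0]

features = select_features(traces, labels, n=2, k=2)
vectors = [vectorize(t, features) for t in traces]

# Enumerate (classifier, feature selection algorithm) pairings by name only.
# Classifier names follow the abstract; selector names are hypothetical.
classifiers = ["random_forest", "neural_network", "decision_tree"]
selectors = ["frequency_difference", "mutual_information"]
pairings = list(product(classifiers, selectors))
```

In a real experiment each pairing would be trained and evaluated on held-out files and compared on detection rate, false positive rate, and resource cost, the three criteria the abstract names.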