Fast Content-Based File Type Identification

Bibliographic details
Title: Fast Content-Based File Type Identification
Authors: Ahmed, Irfan; Lhee, Kyung-Suk; Shin, Hyun-Jung; Hong, Man-Pyo
Contributors: Information Security Institute, Queensland University of Technology Brisbane (QUT), Ajou University, Gilbert Peterson, Sujeet Shenoi, TC 11, WG 11.9
Source: IFIP Advances in Information and Communication Technology ; 7th Digital Forensics (DF) ; https://inria.hal.science/hal-01569553 ; 7th Digital Forensics (DF), Jan 2011, Orlando, FL, United States. pp.65-75, ⟨10.1007/978-3-642-24212-0_5⟩
Publisher information: CCSD; Springer
Publication year: 2011
Subjects: File type identification, file content classification, byte frequency, [INFO]Computer Science [cs]
Geographic subject: Orlando, FL, United States
Description: Part 2: FORENSIC TECHNIQUES ; International audience ; Digital forensic examiners often need to identify the type of a file or file fragment based on the content of the file. Content-based file type identification schemes typically use a byte frequency distribution with statistical machine learning to classify file types. Most algorithms analyze the entire file content to obtain the byte frequency distribution, a technique that is inefficient and time-consuming. This paper proposes two techniques for reducing the classification time. The first technique selects a subset of features based on the frequency of occurrence. The second speeds up classification by randomly sampling file blocks. Experimental results demonstrate that up to a fifteen-fold reduction in computational time can be achieved with limited impact on accuracy.
Document type: conference object
Language: English
DOI: 10.1007/978-3-642-24212-0_5
Availability: https://inria.hal.science/hal-01569553
https://inria.hal.science/hal-01569553v1/document
https://inria.hal.science/hal-01569553v1/file/978-3-642-24212-0_5_Chapter.pdf
https://doi.org/10.1007/978-3-642-24212-0_5
Rights: http://creativecommons.org/licenses/by/ ; info:eu-repo/semantics/OpenAccess
Accession number: edsbas.54718D05
Database: BASE
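
Illustrative sketch: the description above names three ingredients: a byte frequency distribution, a feature subset chosen by frequency of occurrence, and random sampling of file blocks. The Python below is a minimal sketch of those ideas only, not the authors' implementation. The function names (`sample_blocks`, `byte_frequency`, `top_k_features`, `project`), the block size, the number of blocks, and the choice of "highest total frequency" as the selection rule are all assumptions made for illustration; the record does not specify them.

```python
import random
from collections import Counter


def sample_blocks(data: bytes, block_size: int, num_blocks: int, seed: int = 0) -> bytes:
    """Concatenate num_blocks randomly chosen fixed-size blocks of data.

    If the input has no more than num_blocks blocks, it is returned unchanged.
    """
    total = (len(data) + block_size - 1) // block_size  # ceiling division
    if total <= num_blocks:
        return data
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    picks = sorted(rng.sample(range(total), num_blocks))
    return b"".join(data[i * block_size:(i + 1) * block_size] for i in picks)


def byte_frequency(data: bytes) -> list:
    """Return a 256-entry list of relative byte frequencies."""
    if not data:
        return [0.0] * 256
    counts = Counter(data)
    return [counts.get(b, 0) / len(data) for b in range(256)]


def top_k_features(distributions: list, k: int) -> list:
    """Pick the k byte values with the highest summed frequency (assumed rule)."""
    totals = [sum(d[b] for d in distributions) for b in range(256)]
    ranked = sorted(range(256), key=lambda b: totals[b], reverse=True)
    return sorted(ranked[:k])


def project(distribution: list, features: list) -> list:
    """Reduce a 256-entry distribution to the selected byte values."""
    return [distribution[b] for b in features]
```

A possible use: call `sample_blocks` on the file content, pass the result to `byte_frequency`, and feed `project(distribution, features)` to whatever classifier is in use. Only the sampled bytes are counted and only the selected byte values reach the classifier, which is where the time saving described above would come from.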