Programming Language Prediction using Machine Learning

Detailed bibliography
Title: Programming Language Prediction using Machine Learning
Authors: Nidhun M, Sona Maria Sebastian, orcid:0000-0001-7784-
Publisher information: Department of Computer Applications, Amal Jyothi College of Engineering Kanjirappally, Kottayam
Publication year: 2023
Collection: Zenodo
Subjects: Classification, Machine learning, Random Forest, NLP, Source code Detection
Description: Programming languages are the primary tool of the software development industry. Hundreds of them have been developed since the 1940s, and every day a sizable number of new lines of code are written in a variety of languages and pushed to active repositories. We consider a source code classifier that can identify the programming language of a given piece of code to be a highly valuable tool for automatic syntax highlighting and label suggestion in systems such as code editors. This motivated us to use cutting-edge AI methods for text classification to build a model that categorizes code snippets by language. For our empirical investigation, we built a new dataset from the GitHub Repos Dataset; it contains 131450 code snippets spread over 34 programming languages.
Document type: conference object
Language: unknown
Relation: https://zenodo.org/communities/amaljyothi/; https://zenodo.org/records/7961995; oai:zenodo.org:7961995; https://doi.org/10.5281/zenodo.7961995
DOI: 10.5281/zenodo.7961995
Availability: https://doi.org/10.5281/zenodo.7961995
https://zenodo.org/records/7961995
Rights: Creative Commons Attribution 4.0 International ; cc-by-4.0 ; https://creativecommons.org/licenses/by/4.0/legalcode
Accession number: edsbas.B9989DF
Database: BASE