Big Coding Data: Analysis, Insights, and Applications

Bibliographic details
Published in: IEEE Access, Vol. 12, pp. 196010-196026
Main authors: Rahman, Md. Mostafizer; Shirafuji, Atsushi; Watanobe, Yutaka
Format: Journal Article
Language: English
Publication details: Piscataway: IEEE, 2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2169-3536
Description
Summary: In recent years, there has been a notable surge in the generation of coding data on various platforms, including programming competitions and educational institutions. These platforms serve as repositories for substantial volumes of real-world code, problem descriptions, test cases, and activity logs. Despite this wealth of coding data, its potential for advancing software engineering, programming, and research remains largely unexplored. To the best of our knowledge, coding data has been partially explored and utilized in previous research projects such as CodeNet and AlphaCode, but has not been fully considered. There exists a compelling need to explore coding data in more depth to explore its potential for programming and research endeavors. Recognizing this gap, our study undertakes a comprehensive analysis of extensive coding data obtained from a programming learning platform. The Aizu Online Judge (AOJ) serves as our chosen programming platform, providing access to coding data and its associated features. We collected approximately 9 million code evaluation logs, code files, as well as a substantial number of problem descriptions and input/output test cases for thorough analysis and experimentation. The goal of this study is to explore the full potential of the coding data for latent knowledge extraction, programming, and research. We conducted experiments with code evaluation logs, code files, problem descriptions, and test cases to demonstrate the suitability of coding data for various research and applications. Additionally, this study introduces a comprehensive array of features and application programming interfaces (APIs) associated with the AOJ platform. These resources facilitate seamless access and use of coding data, making them a valuable tool for professional and educational initiatives as well as research endeavors.
DOI: 10.1109/ACCESS.2024.3521383