Bidirectional Paper-Repository Tracing in Software Engineering

Bibliographic Details
Published in: Proceedings (IEEE/ACM International Conference on Mining Software Repositories. Online), pp. 642-646
Main Authors: Garijo, Daniel; Arroyo, Miguel; Gonzalez, Esteban; Treude, Christoph; Tarocco, Nicola
Format: Conference Proceeding
Language: English
Published: ACM 15.04.2024
ISSN: 2574-3864
Description
Summary: While computer science papers frequently include their associated code repositories, establishing a clear link between papers and their corresponding implementations may be challenging due to the number of code repositories used in research publications. In this paper, we describe a lightweight method for effectively identifying bidirectional links between papers and repositories from both LaTeX and PDF sources. We have used our approach to analyze more than 14,000 PDF and LaTeX files in the Software Engineering category of arXiv, generating a dataset of more than 1,400 paper-code implementations and assessing current citation practices on it.
CCS Concepts: Applied computing → Document analysis; Software and its engineering → Software libraries and repositories.
DOI: 10.1145/3643991.3644876