The Application of Code Plagiarism Detection Function in a C Programming University Course

Bibliographic details
Published in: International Symposium on Computer, Consumer and Control (Online), pp. 1-4
Main authors: Lu, Xiqin; Wai, Khaing Hsu
Format: Conference proceeding
Language: English
Published: IEEE, 27 June 2025
ISSN:2770-0496
Online access: Full text
Description
Summary: C programming is taught as the first programming course at many universities worldwide. In most cases, students are required to submit their assignment answer code through the university's system or by mail, which makes plagiarism among classmates easy. When plagiarism occurs, the instructor cannot gauge the students' true level or provide appropriate guidance, which limits the improvement of their programming skills. In large classes, checking code manually is impractical, making automated plagiarism detection tools essential. In this paper, we implement a code plagiarism detection function for a university C programming course. The function detects plagiarism in three steps: 1) it removes all whitespace and comments using regular expressions; 2) it calculates the Levenshtein distance and a similarity score for each pair of source codes from different students, and regards a pair as plagiarism if its score exceeds a specified threshold; 3) it reports the scores in a CSV file. For evaluation, the proposal was applied to a total of 1,688 answer source codes from 99 second-year undergraduate students. The results indicate that 1) the plagiarism rate increases with the difficulty of the assignment, 2) the optimal similarity threshold differs across assignments, and 3) some students often copied answer code from certain students.
DOI:10.1109/IS3C65361.2025.11130958
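
The three steps named in the summary (normalize, score pairs, write a CSV report) can be sketched in Python as follows. This is an illustration only, not the authors' implementation: the record does not give the regular expressions, the similarity formula, or the threshold. The sketch assumes similarity = 1 - distance / (length of the longer normalized source), and the function names, the CSV column names, and the 0.9 threshold in the demo are all hypothetical.

```python
import csv
import io
import re
from itertools import combinations

# Matches C string/char literals (kept as-is) or comments (dropped).
# Matching literals first keeps "//" inside a string from being
# mistaken for a comment. The pattern itself is an assumption.
_TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'    # string literal
    r"|'(?:\\.|[^'\\])*'"   # character literal
    r"|/\*.*?\*/"           # block comment
    r"|//[^\n]*",           # line comment
    re.DOTALL,
)


def normalize(source: str) -> str:
    """Step 1: remove comments, then remove all whitespace."""
    def keep_literals(match):
        text = match.group(0)
        return text if text[0] in "\"'" else " "
    without_comments = _TOKEN.sub(keep_literals, source)
    return re.sub(r"\s+", "", without_comments)


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with two rolling rows."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Step 2 (assumed formula): 1 - distance / longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def report(submissions: dict[str, str], threshold: float, out) -> None:
    """Step 3: write one CSV row per pair of different students."""
    cleaned = {name: normalize(code) for name, code in submissions.items()}
    writer = csv.writer(out)
    writer.writerow(["student_a", "student_b", "similarity", "flagged"])
    for a, b in combinations(sorted(cleaned), 2):
        score = similarity(cleaned[a], cleaned[b])
        # A pair is flagged only when the score exceeds the threshold.
        writer.writerow([a, b, f"{score:.3f}", score > threshold])


if __name__ == "__main__":
    demo = {  # hypothetical submissions
        "s1": "int main(void) { return 0; }",
        "s2": "int main(void)\n{\n    return 0; /* done */\n}",
        "s3": "#include <stdio.h>\nint main(void) { puts(\"hi\"); return 0; }",
    }
    buffer = io.StringIO()
    report(demo, threshold=0.9, out=buffer)
    print(buffer.getvalue())
```

Because the summary reports that the optimal threshold differs across assignments, the sketch takes the threshold as a parameter instead of fixing it. Pairwise comparison is quadratic in the number of submissions, and each Levenshtein computation is quadratic in source length, so a batch on the scale described would likely need per-assignment grouping.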