Towards the Validation of Plagiarism Detection Tools by Means of Grammar Evolution

Bibliographic Details
Published in:	IEEE Transactions on Evolutionary Computation, Vol. 13, no. 3, pp. 477-485
Main Authors:	Cebrian, M., Alfonseca, M., Ortega, A.
Format:	Journal Article
Language:	English
Published:	New York, NY: IEEE (Institute of Electrical and Electronics Engineers), 01.06.2009
Subjects:	AC generators | Applied sciences | Artificial intelligence | Automatic programming | Benchmark testing | Benchmarks | Cheating | Computer languages | Computer programs | Computer science | Computer science education | Computer science; control theory; systems | Copyright | Counterfeiting | Education | Educational institutions | Educational technology | Empirical method | Evolution | Evolutionary algorithm | Exact sciences and technology | Experimental study | General aspects | Genetic algorithm | Genetic algorithms | Grammar | Grammars | Intellectual property | Learning and adaptive systems | Marker | Markers | Mathematical models | Occupational training. Personnel. Work management | Plagiarism | Programming | Programming language | Software | Software development | Software engineering | Software tool | Software tools | Students | Tasks | Teaching | University | Validation | Whales
ISSN:	1089-778X, 1941-0026

Description
Summary:	Student plagiarism is a major problem in universities worldwide. In this paper, we focus on plagiarism in answers to computer programming assignments, where students mix and/or modify one or more original solutions to obtain counterfeits. Although several software tools have been developed to help with the tedious and time-consuming task of detecting plagiarism, little has been done to assess their quality, because determining the real authorship of the whole submission corpus is practically impossible for markers. We present a grammar evolution technique that generates benchmarks for testing plagiarism detection tools. Given a programming language, our technique generates a set of original solutions to an assignment, together with a set of plagiarized versions of those solutions that mimic the basic plagiarism techniques used by students. The authorship of the submission corpus is predefined by the user, providing a basis for the assessment and further comparison of copy-catching tools. We give empirical evidence of the suitability of our approach by studying the behavior of one advanced plagiarism detection tool (AC) on four benchmarks coded in APL2 and generated with our technique.
DOI:	10.1109/TEVC.2008.2008797