Towards the Validation of Plagiarism Detection Tools by Means of Grammar Evolution

Student plagiarism is a major problem in universities worldwide. In this paper, we focus on plagiarism in answers to computer programming assignments, where students mix and/or modify one or more original solutions to obtain counterfeits. Although several software tools have been developed to help t...

Bibliographic details
Published in: IEEE Transactions on Evolutionary Computation, Vol. 13, No. 3, pp. 477-485
Main authors: Cebrian, M., Alfonseca, M., Ortega, A.
Format: Journal Article
Language: English
Published: New York, NY, IEEE, 1 June 2009
Institute of Electrical and Electronics Engineers
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1089-778X, 1941-0026
Online access: Full text
Abstract Student plagiarism is a major problem in universities worldwide. In this paper, we focus on plagiarism in answers to computer programming assignments, where students mix and/or modify one or more original solutions to obtain counterfeits. Although several software tools have been developed to help the tedious and time consuming task of detecting plagiarism, little has been done to assess their quality, because determining the real authorship of the whole submission corpus is practically impossible for markers. In this paper, we present a grammar evolution technique which generates benchmarks for testing plagiarism detection tools. Given a programming language, our technique generates a set of original solutions to an assignment, together with a set of plagiarisms of the former set which mimic the basic plagiarism techniques performed by students. The authorship of the submission corpus is predefined by the user, providing a base for the assessment and further comparison of copy-catching tools. We give empirical evidence of the suitability of our approach by studying the behavior of one advanced plagiarism detection tool (AC) on four benchmarks coded in APL2, generated with our technique.
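The abstract's idea of a benchmark with predefined authorship can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a plain grammatical-evolution style mapping (each integer codon, taken modulo the number of productions, picks a rule) over a hypothetical toy expression grammar rather than APL2, and it assumes that "plagiarisms" can be approximated by crossing and mutating the genotypes of the originals. All names (`GRAMMAR`, `decode`, `plagiarize`, `build_benchmark`) and parameters are invented for illustration.

```python
import random

# Hypothetical toy grammar; the paper itself generates APL2 programs.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"], ["<num>"]],
    "<op>": [["+"], ["-"], ["*"]],
    "<var>": [["x"], ["y"]],
    "<num>": [["1"], ["2"], ["3"]],
}


def decode(genotype, start="<expr>", max_wraps=2):
    """Map integer codons to a program string via leftmost derivation.

    Returns None if the codons (reused up to max_wraps extra times)
    run out before every nonterminal has been expanded.
    """
    symbols, out, used = [start], [], 0
    limit = len(genotype) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:          # terminal symbol: emit it
            out.append(sym)
            continue
        if used >= limit:               # mapping failed
            return None
        rules = GRAMMAR[sym]
        codon = genotype[used % len(genotype)]
        used += 1
        symbols = list(rules[codon % len(rules)]) + symbols
    return " ".join(out)


def plagiarize(originals, rng, mutation_rate=0.1):
    """Mix one or two original genotypes, then mutate a few codons.

    Returns the counterfeit genotype and the indices of its sources.
    """
    i, j = rng.randrange(len(originals)), rng.randrange(len(originals))
    a, b = originals[i], originals[j]
    cut = rng.randrange(1, min(len(a), len(b)))   # one-point crossover
    child = a[:cut] + b[cut:]
    child = [rng.randrange(256) if rng.random() < mutation_rate else c
             for c in child]
    return child, sorted({i, j})


def random_valid_genotype(rng, length=20):
    """Draw random genotypes until one decodes to a complete program."""
    while True:
        genotype = [rng.randrange(256) for _ in range(length)]
        if decode(genotype) is not None:
            return genotype


def build_benchmark(n_originals=3, n_copies=4, seed=0):
    """Return (program, source_indices) pairs with known authorship."""
    rng = random.Random(seed)
    originals = [random_valid_genotype(rng) for _ in range(n_originals)]
    corpus = [(decode(g), [k]) for k, g in enumerate(originals)]
    while len(corpus) < n_originals + n_copies:
        child, sources = plagiarize(originals, rng)
        program = decode(child)
        if program is not None:         # skip counterfeits that fail to map
            corpus.append((program, sources))
    return corpus
```

Because every corpus entry carries the indices of the originals it was derived from, the output of a detection tool could in principle be scored against that ground truth, which is the property the abstract highlights.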
Author Ortega, A.
Alfonseca, M.
Cebrian, M.
Author_xml – sequence: 1
  givenname: M.
  surname: Cebrian
  fullname: Cebrian, M.
  organization: Dept. of Comput. Sci., Brown Univ., Providence, RI
– sequence: 2
  givenname: M.
  surname: Alfonseca
  fullname: Alfonseca, M.
– sequence: 3
  givenname: A.
  surname: Ortega
  fullname: Ortega, A.
BackLink http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=21860418$$DView record in Pascal Francis
CODEN ITEVF5
CitedBy_id crossref_primary_10_4103_jehp_jehp_86_24
crossref_primary_10_1109_TE_2010_2098442
crossref_primary_10_1162_EVCO_a_00066
crossref_primary_10_1016_j_asoc_2023_110427
crossref_primary_10_1145_3313290
crossref_primary_10_1016_j_scico_2010_11_005
crossref_primary_10_1088_1757_899X_1130_1_012071
crossref_primary_10_1007_s10664_021_09990_4
ContentType Journal Article
Copyright 2015 INIST-CNRS
Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2009
Copyright_xml – notice: 2015 INIST-CNRS
– notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2009
DOI 10.1109/TEVC.2008.2008797
DatabaseName IEEE Xplore (IEEE)
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Pascal-Francis
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
ANTE: Abstracts in New Technology & Engineering
Engineering Research Database
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
Engineering Research Database
ANTE: Abstracts in New Technology & Engineering
DatabaseTitleList Technology Research Database

Technology Research Database
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Computer Science
Applied Sciences
EISSN 1941-0026
EndPage 485
ExternalDocumentID 2294849221
21860418
10_1109_TEVC_2008_2008797
4781609
Genre orig-research
ISICitedReferencesCount 7
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000267435800001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1089-778X
IsPeerReviewed true
IsScholarly true
Issue 3
Keywords Validation
computer science education
Automatic programming
Software development
Evolutionary algorithm
Empirical method
Grammar
Marker
genetic algorithms
Experimental study
educational technology
Teaching
Copyright
Software tool
Cheating
Programming language
University
Genetic algorithm
Education
Intellectual property
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
CC BY 4.0
LinkModel DirectLink
PQID 857428412
PQPubID 85418
PageCount 9
ParticipantIDs proquest_miscellaneous_875021524
crossref_primary_10_1109_TEVC_2008_2008797
ieee_primary_4781609
proquest_journals_857428412
crossref_citationtrail_10_1109_TEVC_2008_2008797
pascalfrancis_primary_21860418
PublicationCentury 2000
PublicationDate 2009-06-01
PublicationDateYYYYMMDD 2009-06-01
PublicationDate_xml – month: 06
  year: 2009
  text: 2009-06-01
  day: 01
PublicationDecade 2000
PublicationPlace New York, NY
PublicationPlace_xml – name: New York, NY
– name: New York
PublicationTitle IEEE transactions on evolutionary computation
PublicationTitleAbbrev TEVC
PublicationYear 2009
Publisher IEEE
Institute of Electrical and Electronics Engineers
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: Institute of Electrical and Electronics Engineers
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SourceID proquest
pascalfrancis
crossref
ieee
SourceType Aggregation Database
Index Database
Enrichment Source
Publisher
StartPage 477
SubjectTerms AC generators
Applied sciences
Artificial intelligence
Automatic programming
Benchmark testing
Benchmarks
Computer languages
Computer programs
Computer science
computer science education
Computer science; control theory; systems
Counterfeiting
Educational institutions
educational technology
Evolution
Exact sciences and technology
General aspects
genetic algorithms
Grammars
Learning and adaptive systems
Markers
Mathematical models
Occupational training. Personnel. Work management
Plagiarism
Programming
Software
Software engineering
Software tools
Students
Tasks
Whales
Title Towards the Validation of Plagiarism Detection Tools by Means of Grammar Evolution
URI https://ieeexplore.ieee.org/document/4781609
https://www.proquest.com/docview/857428412
https://www.proquest.com/docview/875021524
Volume 13
WOSCitedRecordID wos000267435800001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE Electronic Library (IEL)
  customDbUrl:
  eissn: 1941-0026
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0014519
  issn: 1089-778X
  databaseCode: RIE
  dateStart: 19970101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE