Towards the Validation of Plagiarism Detection Tools by Means of Grammar Evolution
| Published in: | IEEE Transactions on Evolutionary Computation, Vol. 13, No. 3, pp. 477-485 |
|---|---|
| Main authors: | Cebrian, M.; Alfonseca, M.; Ortega, A. |
| Format: | Journal Article |
| Language: | English |
| Published: | New York, NY: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 1 June 2009 |
| Subjects: | |
| ISSN: | 1089-778X, 1941-0026 |
| Online access: | Full text |
| Abstract | Student plagiarism is a major problem in universities worldwide. In this paper, we focus on plagiarism in answers to computer programming assignments, where students mix and/or modify one or more original solutions to obtain counterfeits. Although several software tools have been developed to help with the tedious and time-consuming task of detecting plagiarism, little has been done to assess their quality, because determining the real authorship of the whole submission corpus is practically impossible for markers. In this paper, we present a grammar evolution technique that generates benchmarks for testing plagiarism detection tools. Given a programming language, our technique generates a set of original solutions to an assignment, together with a set of plagiarisms of the former set that mimic the basic plagiarism techniques performed by students. The authorship of the submission corpus is predefined by the user, providing a base for the assessment and further comparison of copy-catching tools. We give empirical evidence of the suitability of our approach by studying the behavior of one advanced plagiarism detection tool (AC) on four benchmarks coded in APL2, generated with our technique. |
|---|---|
| Author | Cebrian, M. (Dept. of Comput. Sci., Brown Univ., Providence, RI); Alfonseca, M.; Ortega, A. |
| CODEN | ITEVF5 |
| ContentType | Journal Article |
| Copyright | 2015 INIST-CNRS Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2009 |
| DOI | 10.1109/TEVC.2008.2008797 |
| Discipline | Engineering Computer Science Applied Sciences |
| EISSN | 1941-0026 |
| EndPage | 485 |
| Genre | orig-research |
| ISICitedReferencesCount | 7 |
| ISSN | 1089-778X |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Keywords | Validation, computer science education, Automatic programming, Software development, Evolutionary algorithm, Empirical method, Grammar, Marker, genetic algorithms, Experimental study, educational technology, Teaching, Copyright, Software tool, Cheating, Programming language, University, Genetic algorithm, Education, Intellectual property |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html CC BY 4.0 |
| PageCount | 9 |
| PublicationDate | 2009-06-01 |
| PublicationPlace | New York, NY |
| PublicationTitle | IEEE transactions on evolutionary computation |
| PublicationTitleAbbrev | TEVC |
| PublicationYear | 2009 |
| Publisher | IEEE Institute of Electrical and Electronics Engineers The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 477 |
| SubjectTerms | AC generators, Applied sciences, Artificial intelligence, Automatic programming, Benchmark testing, Benchmarks, Computer languages, Computer programs, Computer science, computer science education, Computer science; control theory; systems, Counterfeiting, Educational institutions, educational technology, Evolution, Exact sciences and technology, General aspects, genetic algorithms, Grammars, Learning and adaptive systems, Markers, Mathematical models, Occupational training. Personnel. Work management, Plagiarism, Programming, Software, Software engineering, Software tools, Students, Tasks, Whales |
| URI | https://ieeexplore.ieee.org/document/4781609 https://www.proquest.com/docview/857428412 https://www.proquest.com/docview/875021524 |
| Volume | 13 |
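
The benchmark idea in the abstract (generate original solutions from a grammar, derive plagiarisms by mixing and modifying them, and record who copied from whom) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy arithmetic grammar, the names `GRAMMAR`, `derive`, and `make_benchmark`, and the use of the standard grammatical-evolution codon mapping with one-point mixing and a single-codon change. The record does not describe the authors' actual operators, and their benchmarks were coded in APL2, not in this toy language.

```python
import random

# Hypothetical toy grammar: nonterminal -> list of productions.
# The last production of every nonterminal is non-recursive, which the
# step budget in derive() relies on to guarantee termination.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"], ["<num>"]],
    "<op>": [["+"], ["-"], ["*"]],
    "<var>": [["x"], ["y"]],
    "<num>": [["1"], ["2"], ["3"]],
}


def derive(genome, start="<expr>", max_steps=200):
    """Map integer codons to a program string by leftmost derivation."""
    symbols, out = [start], []
    i = 0       # index of the next codon (wraps around the genome)
    steps = 0   # number of nonterminal expansions performed so far
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:
            out.append(sym)          # terminal symbol: emit it
            continue
        rules = GRAMMAR[sym]
        steps += 1
        if steps > max_steps:
            choice = len(rules) - 1  # budget spent: force a non-recursive rule
        else:
            choice = genome[i % len(genome)] % len(rules)
            i += 1
        symbols = list(rules[choice]) + symbols
    return " ".join(out)


def make_benchmark(n_originals, n_plagiarisms, genome_len=20, seed=0):
    """Build a corpus with known authorship (requires n_originals >= 2)."""
    rng = random.Random(seed)
    originals = [
        [rng.randrange(256) for _ in range(genome_len)]
        for _ in range(n_originals)
    ]
    corpus = [
        {"id": f"orig{k}", "sources": [], "code": derive(g)}
        for k, g in enumerate(originals)
    ]
    for k in range(n_plagiarisms):
        a, b = rng.sample(range(n_originals), 2)
        cut = rng.randrange(1, genome_len)
        child = originals[a][:cut] + originals[b][cut:]    # mix two originals
        child[rng.randrange(genome_len)] = rng.randrange(256)  # modify one codon
        corpus.append(
            {"id": f"plag{k}", "sources": [f"orig{a}", f"orig{b}"],
             "code": derive(child)}
        )
    return corpus


if __name__ == "__main__":
    for entry in make_benchmark(3, 2):
        print(entry)
```

Because every entry carries its `sources` list, the output of a detection tool on such a corpus could be scored against ground truth, which is the property the abstract highlights. In this sketch, mixing happens at the genome level, so a plagiarism does not necessarily resemble its sources textually; a realistic generator would need operators that preserve more of the copied code.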