Source code similarity detection by using data mining methods

Bibliographic Details
Published in:	2013 35th International Conference on Information Technology Interfaces (ITI), pp. 257-262
Main Authors:	Stankov, Emil; Jovanov, Mile; Bogdanova, Ana Madevska
Format:	Conference Proceeding
Language:	English
Published:	SRCE University Computing Centre, University of Zagreb 01.06.2013
Subjects:	Algorithm design and analysis; clustering analysis; code similarity; Data mining; Educational institutions; evaluation of source code; Informatics; Programming code; Programming profession; Vectors
ISBN:	9789537138301, 9537138305
ISSN:	1334-2762

Description
Summary:	Programming courses at university and high school level, and competitions in informatics (programming), often require fast assessment of submitted solutions to programming tasks. This is usually handled by automated systems that check the output each solution produces for a set of test cases. In our paper we present a novel approach that represents program code as vectors and uses these vectors in data mining analysis, which could produce better assessment of the solutions. We present cluster analysis results that reach up to 88% correctly clustered items on average.
DOI:	10.2498/iti.2013.0576
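
Illustrative note: the summary mentions representing program code as vectors and clustering those vectors, but this record does not say which features or which clustering algorithm the paper uses. The sketch below is therefore only an assumption-laden illustration of the general idea, not the authors' method: the feature list `FEATURES`, the token-count vectorizer `to_vector`, and the small cosine-based two-cluster routine `two_means` are all hypothetical names and choices made for this example.

```python
import math
import re
from collections import Counter

# Hypothetical feature set: counts of a few keywords and symbols.
# The paper's actual vector representation is not given in this record.
FEATURES = ["for", "while", "if", "return", "+", "*", "[", "("]

def to_vector(source):
    """Map source text to token counts over FEATURES (illustrative only)."""
    tokens = Counter(re.findall(r"[A-Za-z_]\w*|[^\s\w]", source))
    return [float(tokens[f]) for f in FEATURES]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def two_means(vectors, iterations=10):
    """Small deterministic k-means-style clustering with k=2.

    Seeds: the first vector and the vector least similar to it.
    Assignment uses cosine similarity rather than Euclidean distance.
    """
    far = min(range(len(vectors)), key=lambda i: cosine(vectors[0], vectors[i]))
    centroids = [vectors[0][:], vectors[far][:]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [max((0, 1), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        for c in (0, 1):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Two loop-based snippets that differ only in names, and one recursive snippet.
loop_a = "for (i = 0; i < n; i++) s = s + a[i]; return s;"
loop_b = "for (j = 0; j < m; j++) t = t + b[j]; return t;"
rec = "if (n == 0) return 1; return n * f(n - 1);"

labels = two_means([to_vector(s) for s in (loop_a, loop_b, rec)])
print(labels)  # [0, 0, 1]
```

Because the vectors count tokens rather than compare text, the two loop snippets map to the same vector despite different variable names and land in the same cluster, while the recursive snippet is separated. A real system would need a much richer feature set and a proper clustering library.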