Content-based Recommender System for Textual Documents Written in Croatian

Bibliographic details
Title: Content-based Recommender System for Textual Documents Written in Croatian
Authors: Kavran, Natalija; Kavran, Zvonko; Cvitić, Ivan; Šemanjski, Ivana
Publisher information: 2013.
Publication year: 2013
Subjects: recommender system, k-nearest neighbour, text mining, content-based classification, document-term matrix
Description: The paper describes a content-based recommender system that classifies textual documents written in Croatian. We describe how documents are pre-processed, including dimensionality reduction, selection of stop-words and creation of the document-term matrix. For text classification, a combination of v-fold cross-validation and the k-nearest neighbours (kNN) method is used. The ‘optimal’ value of k is first analyzed, and the results of v-fold cross-validation are then used to select k. Results are given in the form of a classification error analysis.
Document type: Conference object
Accession number: edsair.dris...01492..28a2c381f360dd797051257b7a1d2532
Database: OpenAIRE
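The pipeline named in the abstract (stop-word removal, document-term matrix, kNN, and v-fold cross-validation to choose k) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy corpus, the stop-word list, the use of raw term counts with cosine similarity, the interleaved fold assignment, and all function names are assumptions made purely for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus (NOT from the paper): (text, label) pairs.
DOCS = [
    ("nogomet utakmica gol pobjeda", "sport"),
    ("utakmica gol igrač klub", "sport"),
    ("klub nogomet igrač trener", "sport"),
    ("pobjeda trener utakmica nogomet", "sport"),
    ("vlada zakon sabor izbori", "politika"),
    ("izbori stranka sabor glasovanje", "politika"),
    ("zakon vlada stranka ministar", "politika"),
    ("ministar sabor glasovanje vlada", "politika"),
]
STOP_WORDS = {"i", "u", "na", "je", "se"}  # illustrative list only (assumption)


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop-words."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def document_term_matrix(texts):
    """Return (vocabulary, rows); rows[i][j] counts term j in document i."""
    vocab = sorted({t for text in texts for t in tokenize(text)})
    rows = []
    for text in texts:
        counts = Counter(tokenize(text))
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def cosine(a, b):
    """Cosine similarity of two count vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train_rows, train_labels, row, k):
    """Majority vote among the k training rows most similar to `row`."""
    ranked = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: cosine(pair[0], row),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cv_error(rows, labels, k, v):
    """Misclassification rate of kNN under v-fold cross-validation.

    Document i is assigned to fold i % v (an assumption of this sketch).
    """
    errors = 0
    for fold in range(v):
        train = [i for i in range(len(rows)) if i % v != fold]
        test = [i for i in range(len(rows)) if i % v == fold]
        train_rows = [rows[i] for i in train]
        train_labels = [labels[i] for i in train]
        for i in test:
            if knn_predict(train_rows, train_labels, rows[i], k) != labels[i]:
                errors += 1
    return errors / len(rows)


def select_k(rows, labels, candidates, v):
    """Pick the candidate k with the lowest CV error (smallest k on ties)."""
    errors = {k: cv_error(rows, labels, k, v) for k in candidates}
    best_k = min(candidates, key=lambda k: (errors[k], k))
    return best_k, errors


vocab, rows = document_term_matrix([text for text, _ in DOCS])
labels = [label for _, label in DOCS]
best_k, errors = select_k(rows, labels, candidates=[1, 3, 5], v=4)
print("selected k:", best_k)
print("cross-validation error per k:", errors)
```

On a real corpus, the error-per-k table returned by `select_k` would play the role of the classification error analysis mentioned in the abstract; a dimensionality-reduction step (for example, dropping very rare terms) would sit between tokenization and the matrix construction.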