Content-based recommender system for textual documents written in Croatian
| Title: | Content-based recommender system for textual documents written in Croatian |
|---|---|
| Authors: | Semanjski, Ivana, Kavran, Zvonko, Jolic, Natalija, Andelovic, Neven, Cvitic, Ivan, Gović, Marko |
| Other contributors: | Laux, Friedrich |
| Source: | The Second International Conference on Data Analytics, Proceedings ; ISSN: 2308-4464 ; ISBN: 9781612082950 |
| Publisher: | IARIA |
| Publication year: | 2013 |
| Collection: | Ghent University Academic Bibliography |
| Keywords: | Technology and Engineering, k-nearest neighbour, recommender system, text mining, document-term matrix, content-based classification |
| Description: | The paper describes a content-based recommender system that classifies textual documents written in Croatian. We describe how documents are pre-processed, including procedures of dimensionality reduction, selection of stop-words and creation of a document-term matrix. For the text classification, a combination of v-fold cross validation and k-nearest neighbours (kNN) methods is used. This way, the ‘optimal’ value of k is first analyzed, and the results of v-fold cross validation are applied for the selection of the value of k. Results are given in the form of classification error analysis. |
| Publication type: | conference object |
| File description: | application/pdf |
| Language: | English |
| ISBN: | 978-1-61208-295-0; 1-61208-295-5 |
| Relation: | https://biblio.ugent.be/publication/4233486; https://biblio.ugent.be/publication/4233486/file/01HSGH06BYC4TP3SHA25GDJ43D |
| Availability: | https://biblio.ugent.be/publication/4233486; https://hdl.handle.net/1854/LU-4233486; https://biblio.ugent.be/publication/4233486/file/01HSGH06BYC4TP3SHA25GDJ43D |
| Rights: | info:eu-repo/semantics/restrictedAccess |
| Document code: | edsbas.9A4787E7 |
| Database: | BASE |
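
The pipeline named in the description (stop-word removal, a document-term matrix, kNN classification, and v-fold cross validation to choose k) can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the toy documents and labels, the cosine similarity measure, and all function names are assumptions made only for illustration, and the paper's dimensionality reduction step is not reproduced.

```python
import math
from collections import Counter

# Hypothetical stop-word list; the list actually used in the paper is not given here.
STOP_WORDS = {"i", "u", "na", "je", "se", "za"}

def tokenize(text):
    """Lowercase, split on whitespace, and drop stop-words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def build_dtm(docs):
    """Return (vocabulary, rows); each row holds term counts for one document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: j for j, w in enumerate(vocab)}
    rows = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for word, count in Counter(toks).items():
            row[index[word]] = count
        rows.append(row)
    return vocab, rows

def cosine(a, b):
    """Cosine similarity (an assumed choice; the paper's measure is not stated here)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k training documents most similar to x."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: cosine(pair[0], x), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def cv_error(xs, ys, k, v):
    """Classification error of kNN under v-fold cross validation."""
    errors = 0
    for fold in range(v):
        train = [i for i in range(len(xs)) if i % v != fold]
        test = [i for i in range(len(xs)) if i % v == fold]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        errors += sum(knn_predict(train_x, train_y, xs[i], k) != ys[i]
                      for i in test)
    return errors / len(xs)

def select_k(xs, ys, candidates, v):
    """Pick the candidate k with the lowest cross-validated error."""
    return min(candidates, key=lambda k: cv_error(xs, ys, k, v))

# Toy corpus, invented for this sketch only.
docs = [
    "nogomet i utakmica gol", "utakmica gol pobjeda",
    "nogomet pobjeda liga", "liga gol nogomet",
    "vlada zakon sabor", "sabor zakon izbori",
    "vlada izbori ministar", "ministar zakon vlada",
]
labels = ["sport"] * 4 + ["politika"] * 4

vocab, dtm = build_dtm(docs)
candidates = [1, 3, 5]
best_k = select_k(dtm, labels, candidates, v=4)
print("selected k:", best_k, "error:", cv_error(dtm, labels, best_k, v=4))
```

On real data the candidate values of k and the number of folds v would be chosen to suit the corpus size; the values above are placeholders.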