Annif and Finto AI : Developing and Implementing Automated Subject Indexing

Bibliographic details
Published in: JLIS.it : Italian journal of library and information science, Vol. 13; no. 1; pp. 265-282
Main authors: Suominen, Osma; Lehtinen, Mona; Inkinen, Juho
Format: Journal Article
Language: English
Published: Macerata EUM-Edizioni Università di Macerata 01.01.2022
University of Florence
ISSN: 2038-1026
Description
Summary: Manually indexing documents for subject-based access is a labour-intensive process that can be automated using AI technology. Algorithms for text classification must be trained and tested with examples of indexed documents, which can be obtained from existing bibliographic databases and digital collections. The National Library of Finland has created Annif, an open source toolkit for automated subject indexing and classification. Annif is multilingual, independent of the indexing vocabulary, and modular. It integrates many text classification algorithms, including Maui, fastText, Omikuji, and a neural network model based on TensorFlow. Best results can often be obtained by combining several algorithms. Many document corpora have been used for training and evaluating Annif. Finding the algorithms and configurations that give the best quality is an ongoing effort. In May 2020, we launched Finto AI, a service for automated subject indexing based on Annif. It provides a simple Web form for obtaining subject suggestions for text. The functionality is also available as a REST API. Many document repositories and the cataloguing system for electronic publications at the National Library of Finland are using it to integrate semi-automated subject indexing into their metadata workflows. In the future, we are going to extend Annif with more algorithms and new functionality, and to integrate Finto AI with other metadata management workflows. [Publisher's text].
DOI:10.4403/jlis.it-12740