Multimodal page classification in administrative document image streams

Bibliographic details
Published in: International journal on document analysis and recognition, Vol. 17, No. 4, pp. 331-341
Main authors: Rusiñol, Marçal; Frinken, Volkmar; Karatzas, Dimosthenis; Bagdanov, Andrew D.; Lladós, Josep
Format: Journal Article
Language: English, Japanese
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 1 December 2014
ISSN: 1433-2833, 1433-2825
Description
Summary: In this paper, we present a page classification application in a banking workflow. The proposed architecture represents administrative document images by merging visual and textual descriptions. The visual description is based on a hierarchical representation of the pixel intensity distribution. The textual description uses latent semantic analysis to represent document content as a mixture of topics. Several off-the-shelf classifiers and different strategies for combining visual and textual cues have been evaluated. A final step uses an n-gram model of the page stream, allowing a finer-grained classification of pages. The proposed method has been tested in a real large-scale environment, and we report results on a dataset of 70,000 pages.
DOI: 10.1007/s10032-014-0225-8