TELECOM ParisTech at ImageCLEFphoto 2008: Bi-modal text and image retrieval with diversity enhancement

Bibliographic Details
Title: TELECOM ParisTech at ImageCLEFphoto 2008: Bi-modal text and image retrieval with diversity enhancement
Authors: Marin Ferecatu, Hichem Sahbi
Contributors: The Pennsylvania State University CiteSeerX Archives
Source: http://clef.isti.cnr.it/2008/working_notes/ferecatu-paperCLEF2008.pdf
Publication year: 2008
Collection: CiteSeerX
Subjects: Categories and Subject Descriptors: H.3 [Information Storage and Retrieval]: H.3.1 Content Analysis and Indexing; H.3.3 Information Search and Retrieval; H.3.4 Systems and Software; H.3.7 Digital Libraries; H.2.3 [Database Management]: Languages, Query Languages. General Terms: Measurement, Performance, Experimentation. Keywords: Image retrieval, Reranking, Support Vector Machines, Hybrid Text and Image Search
Description: This paper describes the participation of TELECOM ParisTech in the ImageCLEFphoto 2008 challenge. This edition focuses on promoting diversity in the results produced by retrieval systems. Given the high-level semantic content of the topics, search engines based solely on text or on visual descriptors are unlikely to give satisfactory results. Our system uses several text and visual descriptors, together with several combination algorithms, to improve overall retrieval performance. The text part includes a collection of manually built boolean queries and a set of textual descriptors extracted automatically using dictionary filtering and dimensionality reduction. Text and visual descriptors are combined using two strategies: ad-hoc concatenation and re-ranking. Diversity, which reduces redundancy in the final results, is obtained using two techniques: threshold clustering and maxmin exploration. Several runs were submitted to the challenge: individual (text or visual) runs, combined runs, and runs with different diversity settings. The results show that the combined runs outperform the individual runs by a significant margin. These results corroborate (i) the complementarity of text and visual descriptors and (ii) the effectiveness of boolean queries, suggesting promising directions for future research.
Document type: text
File description: application/pdf
Language: English
Relation: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.505.1655
Availability: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.505.1655
http://clef.isti.cnr.it/2008/working_notes/ferecatu-paperCLEF2008.pdf
Rights: Metadata may be used without restrictions as long as the oai identifier remains attached to it.
Accession number: edsbas.6456C2C
Database: BASE
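
Illustration: the abstract names maxmin exploration as one of the two diversity techniques but does not describe it. The sketch below shows one common greedy max-min selection scheme and is offered only as an aid to reading the record; it is not the authors' implementation. The function name `maxmin_rerank`, the Euclidean distance, and the toy feature vectors are all assumptions made for this example.

```python
import math


def maxmin_rerank(ranked_ids, features, k):
    """Greedy max-min selection (illustrative, not the paper's code).

    ranked_ids: item ids ordered by relevance, best first.
    features:   dict mapping id -> tuple of floats (hypothetical descriptors).
    k:          number of items to return.
    """
    if not ranked_ids or k <= 0:
        return []
    # Keep the most relevant item, then diversify around it.
    selected = [ranked_ids[0]]
    remaining = list(ranked_ids[1:])
    # min_dist[c] is the distance from candidate c to its nearest selected item.
    min_dist = {c: math.dist(features[c], features[selected[0]]) for c in remaining}
    while remaining and len(selected) < k:
        # Pick the candidate farthest from everything selected so far;
        # ties go to the candidate that ranked higher originally.
        best = max(remaining, key=lambda c: min_dist[c])
        selected.append(best)
        remaining.remove(best)
        for c in remaining:
            min_dist[c] = min(min_dist[c], math.dist(features[c], features[best]))
    return selected


# Toy data: two tight pairs of near-duplicates plus one item in between.
features = {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (5.0, 0.0),
            "d": (5.1, 0.0), "e": (2.5, 0.0)}
ranked = ["a", "b", "c", "d", "e"]
top3 = maxmin_rerank(ranked, features, 3)
```

On this toy input the near-duplicates "b" and "c" are skipped in favour of items that lie far from those already chosen, which is the redundancy reduction the abstract attributes to diversity enhancement.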