CoPhIR Image Collection under the Microscope

The Content-based Photo Image Retrieval (CoPhIR) dataset is the largest available database of digital images with corresponding visual descriptors. It contains five MPEG-7 global descriptors extracted from more than 106 million images from the Flickr photo-sharing system. In this paper, we analyze this...

Detailed description

Bibliographic details
Published in: Proceedings of the 2009 Second International Workshop on Similarity Search and Applications, pp. 47-54
Main authors: Batko, Michal; Kohoutkova, Petra; Novak, David
Format: Conference paper
Language: English
Published: Washington, DC, USA, IEEE Computer Society, 29 August 2009
IEEE
Series: ACM Conferences
ISBN: 9780769537658, 0769537650
Online access: Full text
Abstract: The Content-based Photo Image Retrieval (CoPhIR) dataset is the largest available database of digital images with corresponding visual descriptors. It contains five MPEG-7 global descriptors extracted from more than 106 million images from the Flickr photo-sharing system. In this paper, we analyze this dataset, focusing on 1) the efficiency of similarity-based indexing and searching and 2) the expressiveness of a combination of the descriptors with respect to the subjective perception of visual similarity. We treat the descriptors as metric spaces and then combine them into a multi-metric space. We analyze the distance distributions of the individual descriptors, measure the intrinsic dimensionality of these datasets, and statistically evaluate the correlation between the descriptors. Further, we use two methods to assess the subjective accuracy and satisfaction of similarity retrieval based on the combination of descriptors that is recommended for CoPhIR, and we compare these results on databases of 10 and 100 million CoPhIR images. Finally, we suggest, explore, and evaluate two approaches to improving the accuracy: 1) applying logarithms to weaken the influence of a single descriptor's contribution if it deviates from the rest, and 2) categorizing the dataset and identifying the visual characteristics important for individual categories.
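The multi-metric combination and the logarithmic damping mentioned in the abstract can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration: the descriptor names ("color", "texture"), the weights, the per-descriptor metrics, and the choice of log(1 + d) as the damping function are hypothetical and are not taken from the paper. The intrinsic-dimensionality estimate uses the commonly cited mu^2 / (2 * sigma^2) definition over a sample of distances; the abstract does not confirm that this is the definition the authors applied.

```python
import math
import random
import statistics
from typing import Callable, Dict, Sequence

Vector = Sequence[float]
Metric = Callable[[Vector, Vector], float]


def l1(a: Vector, b: Vector) -> float:
    """Manhattan distance between two equally long vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def combined_distance(
    obj_a: Dict[str, Vector],
    obj_b: Dict[str, Vector],
    metrics: Dict[str, Metric],
    weights: Dict[str, float],
    damp: bool = False,
) -> float:
    """Weighted sum of per-descriptor distances (a multi-metric space).

    With damp=True each partial distance d is replaced by log(1 + d),
    an assumed damping form that limits how much one strongly
    deviating descriptor can dominate the sum.
    """
    total = 0.0
    for name, metric in metrics.items():
        d = metric(obj_a[name], obj_b[name])
        if damp:
            d = math.log1p(d)
        total += weights[name] * d
    return total


def intrinsic_dimensionality(distances: Sequence[float]) -> float:
    """Estimate mu^2 / (2 * sigma^2) from a sample of distances."""
    mu = statistics.fmean(distances)
    var = statistics.pvariance(distances)
    return mu * mu / (2.0 * var)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical descriptors with made-up dimensionalities.
    metrics = {"color": l1, "texture": l2}
    weights = {"color": 2.0, "texture": 1.0}
    objects = [
        {"color": [rng.random() for _ in range(8)],
         "texture": [rng.random() for _ in range(4)]}
        for _ in range(50)
    ]
    sample = [
        combined_distance(objects[i], objects[j], metrics, weights, damp=True)
        for i in range(len(objects))
        for j in range(i + 1, len(objects))
    ]
    print("estimated intrinsic dimensionality:", intrinsic_dimensionality(sample))
```

Because log(1 + d) is non-decreasing, concave, and zero at d = 0, applying it to a metric yields another metric, so the damped weighted sum can still be used with metric indexing techniques. Other damping functions would need the same check.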
DOI: 10.1109/SISAP.2009.25
Discipline: Library & Information Science
Editors: Skopal, Tomas; Zezula, Pavel
Keywords: metric space; MPEG-7; dataset analysis; CoPhIR dataset; visual descriptors
LCCN: 2009904942
Subject terms:
Computing methodologies -- Artificial intelligence -- Computer vision -- Computer vision problems
Computing methodologies -- Computer graphics -- Image manipulation
Computing methodologies -- Machine learning -- Learning paradigms -- Unsupervised learning -- Cluster analysis
Content based retrieval
CoPhIR dataset
Data analysis
dataset analysis
Digital images
Extraterrestrial measurements
Image databases
Image retrieval
Information retrieval
Information systems -- Information retrieval
Information systems -- Information retrieval -- Evaluation of retrieval results
Information systems -- Information systems applications -- Multimedia information systems -- Multimedia databases
metric space
Microscopy
MPEG 7 Standard
MPEG-7
Visual databases
visual descriptors
URL: https://ieeexplore.ieee.org/document/5271953