Autoencoder-based self-supervised hashing for cross-modal retrieval

Bibliographic details
Published in: Multimedia Tools and Applications, Vol. 80, No. 11, pp. 17257-17274
Main authors: Li, Yifan; Wang, Xuan; Cui, Lei; Zhang, Jiajia; Huang, Chengkai; Luo, Xuan; Qi, Shuhan
Format: Journal Article
Language: English
Published: New York, Springer US, 1 May 2021
Springer Nature B.V
Subject headings:
ISSN: 1380-7501, 1573-7721
Online access: Full text
Abstract Cross-modal retrieval has attracted considerable attention in the era of the multimedia data explosion. Because they offer low storage cost and fast retrieval speed, hash learning-based methods are becoming increasingly popular in this field. Cross-modal retrieval faces two crucial bottlenecks: the heterogeneous gap between different modalities and the semantic gap among similar data in different modalities. To address these issues, we adopt a self-supervised approach that bridges the heterogeneous gap by generating cohesive features for different instances. To mitigate the semantic gap, we use triplet sampling to optimize the inter-modal and intra-modal semantic losses, which increases the discriminability of our approach. Experiments on two benchmark datasets show the efficiency and robustness of our method, and extended experiments show its scalability.
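The abstract names triplet sampling and hash codes without giving formulas, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a standard hinge-style triplet loss on squared Euclidean distances between relaxed (real-valued) codes and sign-based binarization; the function names, the margin value, and the toy feature vectors are all hypothetical.

```python
import numpy as np


def binarize(features):
    """Map real-valued features to {-1, +1} hash codes by sign (assumed scheme)."""
    return np.where(features >= 0, 1, -1)


def hamming_distance(code_a, code_b):
    """Count the positions where two equal-length hash codes differ."""
    return int(np.sum(code_a != code_b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss: pull the positive closer than the negative by `margin`."""
    d_pos = np.sum((anchor - positive) ** 2)  # anchor-positive squared distance
    d_neg = np.sum((anchor - negative) ** 2)  # anchor-negative squared distance
    return float(max(0.0, margin + d_pos - d_neg))


# Hypothetical inter-modal triplet: an image anchor, a text sharing its label
# (positive), and a text with a different label (negative).
image_feat = np.array([0.9, -0.8, 0.7, -0.6])
text_pos = np.array([0.8, -0.9, 0.6, -0.5])
text_neg = np.array([-0.7, 0.9, 0.8, -0.4])

loss_good = triplet_loss(image_feat, text_pos, text_neg)  # already well separated
loss_bad = triplet_loss(image_feat, text_neg, text_pos)   # roles swapped, penalized
dist_pos = hamming_distance(binarize(image_feat), binarize(text_pos))
dist_neg = hamming_distance(binarize(image_feat), binarize(text_neg))
```

An intra-modal triplet would use the same loss with all three samples drawn from one modality. At retrieval time only the binarized codes and their Hamming distances would be needed, which is where the low storage cost and fast lookup mentioned in the abstract come from.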
Author_xml – sequence: 1
  givenname: Yifan
  surname: Li
  fullname: Li, Yifan
  organization: Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
– sequence: 2
  givenname: Xuan
  surname: Wang
  fullname: Wang, Xuan
  organization: Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
– sequence: 3
  givenname: Lei
  surname: Cui
  fullname: Cui, Lei
  organization: Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
– sequence: 4
  givenname: Jiajia
  surname: Zhang
  fullname: Zhang, Jiajia
  organization: Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
– sequence: 5
  givenname: Chengkai
  surname: Huang
  fullname: Huang, Chengkai
  organization: Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
– sequence: 6
  givenname: Xuan
  surname: Luo
  fullname: Luo, Xuan
  organization: Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
– sequence: 7
  givenname: Shuhan
  surname: Qi
  fullname: Qi, Shuhan
  email: shuhanqi@cs.hitsz.edu.cn
  organization: Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
CitedBy_id crossref_primary_10_1007_s11042_023_16572_7
crossref_primary_10_1007_s11042_023_15400_2
ContentType Journal Article
Copyright Springer Science+Business Media, LLC, part of Springer Nature 2020
DOI 10.1007/s11042-020-09599-7
Discipline Engineering
Computer Science
EISSN 1573-7721
EndPage 17274
ExternalDocumentID 10_1007_s11042_020_09599_7
ISICitedReferencesCount 2
ISSN 1380-7501
IsPeerReviewed true
IsScholarly true
Issue 11
Keywords Self-supervised
Cross-modal retrieval
Autoencoder
Hash learning
Language English
PageCount 18
PublicationDateYYYYMMDD 2021-05-01
PublicationPlace New York
PublicationPlace_xml – name: New York
– name: Dordrecht
PublicationSubtitle An International Journal
PublicationTitle Multimedia tools and applications
PublicationTitleAbbrev Multimed Tools Appl
PublicationYear 2021
Publisher Springer US
Springer Nature B.V
SourceID proquest
crossref
springer
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 17257
SubjectTerms Computer Communication Networks
Computer Science
Data Structures and Information Theory
Multimedia
Multimedia Information Systems
Retrieval
Semantics
Special Purpose and Application-Based Systems
Title Autoencoder-based self-supervised hashing for cross-modal retrieval
URI https://link.springer.com/article/10.1007/s11042-020-09599-7
https://www.proquest.com/docview/2529008717
Volume 80