Representation learning using step-based deep multi-modal autoencoders

Bibliographic details
Published in: Pattern Recognition, vol. 95, pp. 12-23
Main authors: Bhatt, Gaurav; Jha, Piyush; Raman, Balasubramanian
Format: Journal Article
Language: English
Published: Elsevier Ltd, 1 November 2019
ISSN: 0031-3203, 1873-5142
Online access: Full text
Abstract: Deep learning techniques have been used successfully to learn a common representation for multi-view data, in which different modalities are projected onto a common subspace. Broadly, the techniques for common representation learning fall into two categories: ‘canonical correlation-based’ approaches and ‘autoencoder-based’ approaches. In this paper, we investigate the performance of deep autoencoder-based methods on multi-view data. We propose a novel step-based correlation multi-modal deep convolution neural network (CorrMCNN) that reconstructs one view of the data given the other while increasing the interaction between the representations at each hidden layer, that is, at every intermediate step. Step reconstruction relaxes the constraint of reconstructing the original data; instead, the objective function is optimized for the reconstruction of representative features. This helps the proposed model generalize efficiently to representation and transfer learning tasks on high-dimensional data. Finally, we evaluate the proposed model on three multi-view and cross-modal problems: audio articulation, cross-modal image retrieval, and multilingual (cross-language) document classification. Through extensive experiments, we find that the proposed model substantially outperforms the current state-of-the-art deep learning techniques on all three tasks.
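The abstract describes the objective only in words: reconstruct features of one view from the other, and increase the interaction between the two views' representations at each hidden layer. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes that "interaction" is measured as a per-dimension Pearson correlation and that the loss is a reconstruction error minus a weighted correlation term; the function names, the weight `lam`, and the data layout are all hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equal-length sequences of floats."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    # Return 0 for a constant dimension instead of dividing by zero.
    return cov / denom if denom > 0 else 0.0


def layer_correlation(h1, h2):
    """Sum of per-dimension correlations between two batches of hidden vectors.

    h1 and h2 are lists of samples; each sample is a list of d floats.
    """
    dims = len(h1[0])
    return sum(
        correlation([row[j] for row in h1], [row[j] for row in h2])
        for j in range(dims)
    )


def squared_error(target, output):
    """Total squared error between two batches of vectors."""
    return sum(
        (t - o) ** 2
        for t_row, o_row in zip(target, output)
        for t, o in zip(t_row, o_row)
    )


def step_loss(targets, reconstructions, hidden_pairs, lam=1.0):
    """Hypothetical step-based loss (an assumption, not the paper's formula).

    targets / reconstructions: one batch per step, holding the features to be
    reconstructed and the model's reconstruction of them.
    hidden_pairs: one (view-1 batch, view-2 batch) pair per hidden layer.
    lam: assumed weight on the correlation term.
    """
    recon = sum(squared_error(t, r) for t, r in zip(targets, reconstructions))
    corr = sum(layer_correlation(h1, h2) for h1, h2 in hidden_pairs)
    # Minimizing this value lowers reconstruction error and raises correlation.
    return recon - lam * corr
```

In a real model the hidden batches would come from the two encoders and the loss would be minimized by gradient descent; here the functions only show how the two terms could be combined.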
Authors and affiliations:
1. Bhatt, Gaurav (gauravbhatt.deeplearn@gmail.com), Indian Institute of Technology Roorkee (IITR), Roorkee, 247667, India
2. Jha, Piyush (piyushnit15@gmail.com), Malaviya National Institute of Technology (MNIT), Jaipur, 302017, India
3. Raman, Balasubramanian (balarfma@iitr.ac.in), Indian Institute of Technology Roorkee (IITR), Roorkee, 247667, India
Copyright: 2019 Elsevier Ltd
DOI: 10.1016/j.patcog.2019.05.032
Discipline: Computer Science
Web of Science cited-references count: 20
Peer reviewed: yes
Scholarly: yes
Keywords: Representation learning; Convolution autoencoders; Transfer learning; Multilingual document classification
Page count: 12
References Rasheed, Shah (bib0001) 2003; 2
Thompson (bib0012) 2005
Yu, Hu, Xu (bib0015) 2014
Lu, Foster (bib0023) 2014
Wang, Arora, Livescu, Bilmes (bib0003) 2015
Yu, Lin, Seah, Li, Lin (bib0029) 2012; 33
Xu, Tao, Xu (bib0032) 2015; 37
Gao, He, Yih, Deng (bib0040) 2014; 1
(2006).
A. Klementiev, I. Titov, B. Bhattarai, Inducing crosslingual distributed representations of words (2012).
P. Mineiro, N. Karampatziakis, A randomized algorithm for cca, arXiv
Vinyals, Toshev, Bengio, Erhan (bib0004) 2017; 39
AP, Lauly, Larochelle, Khapra, Ravindran, Raykar, Saha (bib0042) 2014
Vinod (bib0013) 1976; 4
Arora, Livescu (bib0002) 2012
Ng (bib0021) 2011; 72
(2013).
Liu, Yang, Tao, Cheng, Tang (bib0026) 2018; 41
Zhang, Zhang, Ma, Guan, Gong (bib0030) 2015; 48
Gao, Mu, Goulermas, Wang (bib0038) 2018; 75
Xu, Tao, Xu (bib0031) 2015; 24
A. Benton, H. Khayrallah, B. Gujral, D. Reisinger, S. Zhang, R. Arora, Deep generalized canonical correlation analysis, arXiv
Afridi, Ross, Shapiro (bib0035) 2018; 73
(2017).
Han, Jing, Wu (bib0025) 2018; 275
Lei, Han, Zhou, Yu, Qin, Elazab, Lei (bib0034) 2018; 79
Masci, Meier, Cireşan, Schmidhuber (bib0046) 2011
Li, Xu, Yang, Sun, Tao (bib0033) 2017; 26
H. Soyer, P. Stenetorp, A. Aizawa, Leveraging monolingual data for crosslingual compositional word representations, arXiv
Michaeli, Wang, Livescu (bib0017) 2016
Bird, Klein, Loper (bib0050) 2009
Westbury, Milenkovic, Weismer, Kent (bib0044) 1990; 88
Zhuang, Yan, Chen, Wang, Shen (bib0037) 2018; 80
Jesorsky, Kirchberg, Frischholz (bib0045) 2001
S. Akaho, A kernel method for canonical correlation analysis, arXiv
A. Saha, M.M. Khapra, S. Chandar, J. Rajendran, K. Cho, A correlational encoder decoder architecture for pivot based sequence generation, arXiv
Chen, Zhou (bib0039) 2018; 75
Jiao, Gao, Wang, Li (bib0036) 2018; 75
Logan (bib0048) 2000
T. Mikolov, Q.V. Le, I. Sutskever, Exploiting similarities among languages for machine translation, arXiv
Tao, Li, Wu, Maybank (bib0028) 2009; 31
Koehn (bib0049) 2005; 5
Pedregosa, Varoquaux, Gramfort, Michel, Thirion, Grisel, Blondel, Prettenhofer, Weiss, Vanderplas (bib0052) 2011; 12
Andrew, Arora, Bilmes, Livescu (bib0018) 2013
Bergstra, Bengio (bib0047) 2012; 13
Gouws, Bengio, Corrado (bib0051) 2015
Yang, Liu, Tao, Cheng (bib0024) 2017; 385
Tao, Tang, Li, Wu (bib0027) 2006; 28
Bach, Jordan (bib0016) 2002; 3
Härdle, Hlávka (bib0011) 2007
Chandar, Khapra, Larochelle, Ravindran (bib0008) 2016; 28
You, Jin, Wang, Fang, Luo (bib0005) 2016
(2014).
K.M. Hermann, P. Blunsom, Multilingual models for compositional distributed semantics, arXiv
(2016).
A. Eisenschtat, L. Wolf, Linking image and text with 2-way nets, arXiv
Ngiam, Khosla, Kim, Nam, Lee, Ng (bib0022) 2011
Bird (10.1016/j.patcog.2019.05.032_bib0050) 2009
Chen (10.1016/j.patcog.2019.05.032_bib0039) 2018; 75
10.1016/j.patcog.2019.05.032_bib0020
Westbury (10.1016/j.patcog.2019.05.032_sbref0035) 1990; 88
Pedregosa (10.1016/j.patcog.2019.05.032_bib0052) 2011; 12
AP (10.1016/j.patcog.2019.05.032_bib0042) 2014
Yang (10.1016/j.patcog.2019.05.032_bib0024) 2017; 385
Afridi (10.1016/j.patcog.2019.05.032_bib0035) 2018; 73
Jiao (10.1016/j.patcog.2019.05.032_bib0036) 2018; 75
Ng (10.1016/j.patcog.2019.05.032_bib0021) 2011; 72
Koehn (10.1016/j.patcog.2019.05.032_bib0049) 2005; 5
Härdle (10.1016/j.patcog.2019.05.032_bib0011) 2007
Tao (10.1016/j.patcog.2019.05.032_bib0027) 2006; 28
Logan (10.1016/j.patcog.2019.05.032_bib0048) 2000
Yu (10.1016/j.patcog.2019.05.032_bib0015) 2014
Andrew (10.1016/j.patcog.2019.05.032_bib0018) 2013
Thompson (10.1016/j.patcog.2019.05.032_bib0012) 2005
Chandar (10.1016/j.patcog.2019.05.032_bib0008) 2016; 28
Vinyals (10.1016/j.patcog.2019.05.032_bib0004) 2017; 39
Lei (10.1016/j.patcog.2019.05.032_bib0034) 2018; 79
Li (10.1016/j.patcog.2019.05.032_bib0033) 2017; 26
Wang (10.1016/j.patcog.2019.05.032_bib0003) 2015
Zhuang (10.1016/j.patcog.2019.05.032_bib0037) 2018; 80
Liu (10.1016/j.patcog.2019.05.032_bib0026) 2018; 41
Xu (10.1016/j.patcog.2019.05.032_bib0031) 2015; 24
Tao (10.1016/j.patcog.2019.05.032_bib0028) 2009; 31
Jesorsky (10.1016/j.patcog.2019.05.032_bib0045) 2001
10.1016/j.patcog.2019.05.032_bib0043
10.1016/j.patcog.2019.05.032_bib0041
10.1016/j.patcog.2019.05.032_bib0009
10.1016/j.patcog.2019.05.032_bib0007
10.1016/j.patcog.2019.05.032_bib0006
Bach (10.1016/j.patcog.2019.05.032_bib0016) 2002; 3
Michaeli (10.1016/j.patcog.2019.05.032_bib0017) 2016
Yu (10.1016/j.patcog.2019.05.032_bib0029) 2012; 33
10.1016/j.patcog.2019.05.032_bib0014
Bergstra (10.1016/j.patcog.2019.05.032_bib0047) 2012; 13
Gouws (10.1016/j.patcog.2019.05.032_bib0051) 2015
10.1016/j.patcog.2019.05.032_bib0010
Vinod (10.1016/j.patcog.2019.05.032_bib0013) 1976; 4
Ngiam (10.1016/j.patcog.2019.05.032_bib0022) 2011
Zhang (10.1016/j.patcog.2019.05.032_bib0030) 2015; 48
Xu (10.1016/j.patcog.2019.05.032_bib0032) 2015; 37
Gao (10.1016/j.patcog.2019.05.032_bib0040) 2014; 1
Lu (10.1016/j.patcog.2019.05.032_bib0023) 2014
10.1016/j.patcog.2019.05.032_bib0019
Han (10.1016/j.patcog.2019.05.032_bib0025) 2018; 275
Masci (10.1016/j.patcog.2019.05.032_bib0046) 2011
You (10.1016/j.patcog.2019.05.032_bib0005) 2016
Gao (10.1016/j.patcog.2019.05.032_bib0038) 2018; 75
Rasheed (10.1016/j.patcog.2019.05.032_bib0001) 2003; 2
Arora (10.1016/j.patcog.2019.05.032_bib0002) 2012
References_xml – volume: 39
  start-page: 652
  year: 2017
  end-page: 663
  ident: bib0004
  article-title: Show and tell: lessons learned from the 2015 MSCOCO image captioning challenge
  publication-title: IEEE Transactions on Pattern Analysis and MachineIntelligence.
– reference: H. Soyer, P. Stenetorp, A. Aizawa, Leveraging monolingual data for crosslingual compositional word representations, arXiv:
– reference: K.M. Hermann, P. Blunsom, Multilingual models for compositional distributed semantics, arXiv:
– reference: S. Akaho, A kernel method for canonical correlation analysis, arXiv:
– reference: A. Saha, M.M. Khapra, S. Chandar, J. Rajendran, K. Cho, A correlational encoder decoder architecture for pivot based sequence generation, arXiv:
– volume: 24
  start-page: 5812
  year: 2015
  end-page: 5825
  ident: bib0031
  article-title: Multi-view learning with incomplete views
  publication-title: IEEE Trans. Image Process.
– volume: 79
  start-page: 290
  year: 2018
  end-page: 302
  ident: bib0034
  article-title: A deeply supervised residual network for hep-2 cell classification via cross-modal transfer learning
  publication-title: Pattern Recognit.
– volume: 275
  start-page: 1087
  year: 2018
  end-page: 1098
  ident: bib0025
  article-title: Multi-view local discrimination and canonical correlation analysis for image classification
  publication-title: Neurocomputing
– volume: 28
  start-page: 257-285
  year: 2016
  ident: bib0008
  article-title: Correlational neural networks
  publication-title: Neural computation.
– volume: 12
  start-page: 2825
  year: 2011
  end-page: 2830
  ident: bib0052
  article-title: Scikit-learn: machine learning in python
  publication-title: J. Mach. Learn. Res.
– volume: 31
  start-page: 260
  year: 2009
  end-page: 274
  ident: bib0028
  article-title: Geometric mean for subspace selection
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– volume: 5
  start-page: 79
  year: 2005
  end-page: 86
  ident: bib0049
  article-title: Europarl: A parallel corpus for statistical machine translation
  publication-title: MT Summit
– reference: P. Mineiro, N. Karampatziakis, A randomized algorithm for cca, arXiv:
– reference: A. Benton, H. Khayrallah, B. Gujral, D. Reisinger, S. Zhang, R. Arora, Deep generalized canonical correlation analysis, arXiv:
– start-page: 34
  year: 2012
  end-page: 37
  ident: bib0002
  article-title: Kernel CCA for multi-view learning of acoustic features using articulatory measurements.
  publication-title: MLSLP
– start-page: 90
  year: 2001
  end-page: 95
  ident: bib0045
  article-title: Robust face detection using the hausdorff distance
  publication-title: International Conference on Audio-and Video-Based Biometric Person Authentication
– volume: 75
  start-page: 149
  year: 2018
  end-page: 160
  ident: bib0039
  article-title: Collaborative multiview hashing
  publication-title: Pattern Recognit.
– reference: (2006).
– start-page: 4651
  year: 2016
  end-page: 4659
  ident: bib0005
  article-title: Image captioning with semantic attention
  publication-title: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
– volume: 48
  start-page: 3191
  year: 2015
  end-page: 3202
  ident: bib0030
  article-title: Multimodal learning for facial expression recognition
  publication-title: Pattern Recognit.
– volume: 385
  start-page: 338
  year: 2017
  end-page: 352
  ident: bib0024
  article-title: Canonical correlation analysis networks for two-view image recognition
  publication-title: Inf. Sci.
– volume: 1
  start-page: 699
  year: 2014
  end-page: 709
  ident: bib0040
  article-title: Learning continuous phrase representations for translation modeling
  publication-title: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
– volume: 13
  start-page: 281
  year: 2012
  end-page: 305
  ident: bib0047
  article-title: Random search for hyper-parameter optimization
  publication-title: J. Mach. Learn. Res.
– volume: 33
  start-page: 1196
  year: 2012
  end-page: 1204
  ident: bib0029
  article-title: Image classification by multimodal subspace learning
  publication-title: Pattern Recognit. Lett.
– volume: 88
  year: 1990
  ident: bib0044
  article-title: X-Ray microbeam speech production database
  publication-title: J. Acoust. Soc. Am.
– year: 2009
  ident: bib0050
  article-title: Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit
– start-page: 689
  year: 2011
  end-page: 696
  ident: bib0022
  article-title: Multimodal deep learning
  publication-title: Proceedings of the 28th International Conference on Machine Learning (ICML-11)
– year: 2005
  ident: bib0012
  article-title: Canonical Correlation Analysis
  publication-title: Encyclopedia of Statistics in Behavioral Science
– volume: 72
  start-page: 1
  year: 2011
  end-page: 19
  ident: bib0021
  article-title: Sparse autoencoder
  publication-title: CS294A Lecture notes
– start-page: 748
  year: 2015
  end-page: 756
  ident: bib0051
  article-title: Bilbowa: Fast bilingual distributed representations without word alignments
  publication-title: International Conference on Machine Learning
– reference: (2017).
– reference: T. Mikolov, Q.V. Le, I. Sutskever, Exploiting similarities among languages for machine translation, arXiv:
– start-page: 52
  year: 2011
  end-page: 59
  ident: bib0046
  article-title: Stacked convolutional auto-encoders for hierarchical feature extraction
  publication-title: Artif. Neural Netw. Mach. Learn.–ICANN 2011
– start-page: 1853
  year: 2014
  end-page: 1861
  ident: bib0042
  article-title: An autoencoder approach to learning bilingual word representations
  publication-title: Advances in Neural Information Processing Systems
– start-page: 91
  year: 2014
  end-page: 99
  ident: bib0023
  article-title: Large scale canonical correlation analysis with iterative least squares
  publication-title: Advances in Neural Information Processing Systems
– volume: 3
  start-page: 1
  year: 2002
  end-page: 48
  ident: bib0016
  article-title: Kernel independent component analysis
  publication-title: J. Mach. Learn. Res.
– volume: 41
  start-page: 119
  year: 2018
  end-page: 128
  ident: bib0026
  article-title: Multiview dimension reduction via hessian multiset canonical correlations
  publication-title: Inf. Fusion
– reference: (2016).
– volume: 2
  start-page: II
  year: 2003
  end-page: 343
  ident: bib0001
  article-title: Scene detection in hollywood movies and tv shows
  publication-title: Computer Vision and Pattern Recognition, 2003. Proceedings, 2003 IEEE Computer Society Conference.
– reference: A. Eisenschtat, L. Wolf, Linking image and text with 2-way nets, arXiv:
– start-page: 1967
  year: 2016
  end-page: 1976
  ident: bib0017
  article-title: Nonparametric canonical correlation analysis
  publication-title: International Conference on Machine Learning
– start-page: 263
  year: 2007
  end-page: 269
  ident: bib0011
  article-title: Canonical Correlation Analysis
  publication-title: Multivariate Statistics: Exercises and Solutions
– volume: 37
  start-page: 2531
  year: 2015
  end-page: 2544
  ident: bib0032
  article-title: Multi-view intact space learning
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– volume: 4
  start-page: 147
  year: 1976
  end-page: 166
  ident: bib0013
  article-title: Canonical ridge and econometrics of joint production
  publication-title: J. Econom.
– reference: A. Klementiev, I. Titov, B. Bhattarai, Inducing crosslingual distributed representations of words (2012).
– volume: 26
  start-page: 3113
  year: 2017
  end-page: 3127
  ident: bib0033
  article-title: Discriminative multi-view interactive image re-ranking
  publication-title: IEEE Trans. Image Process.
– start-page: 1247
  year: 2013
  end-page: 1255
  ident: bib0018
  article-title: Deep canonical correlation analysis
  publication-title: International Conference on Machine Learning
– reference: (2014).
– volume: 80
  start-page: 225
  year: 2018
  end-page: 240
  ident: bib0037
  article-title: Multi-label learning based deep transfer neural network for facial attribute classification
  publication-title: Pattern Recognit.
– volume: 73
  start-page: 65
  year: 2018
  end-page: 75
  ident: bib0035
  article-title: On automated source selection for transfer learning in convolutional neural networks
  publication-title: Pattern Recognit.
– volume: 75
  start-page: 223
  year: 2018
  end-page: 234
  ident: bib0038
  article-title: Topic driven multimodal similarity learning with multi-view voted convolutional features
  publication-title: Pattern Recognit.
– start-page: 4590
  year: 2015
  end-page: 4594
  ident: bib0003
  article-title: Unsupervised learning of acoustic features via deep canonical correlation analysis
  publication-title: Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
– volume: 75
  start-page: 292
  year: 2018
  end-page: 301
  ident: bib0036
  article-title: A parasitic metric learning net for breast mass classification based on mammography
  publication-title: Pattern Recognit.
– reference: (2013).
– year: 2000
  ident: bib0048
  article-title: Mel frequency cepstral coefficients for music modeling
  publication-title: ISMIR
– volume: 28
  start-page: 1088
  year: 2006
  end-page: 1099
  ident: bib0027
  article-title: Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– start-page: 145
  year: 2014
  end-page: 152
  ident: bib0015
  article-title: Kernel Independent Component Analysis
  publication-title: Blind Source Separation: Theory and Applications
– year: 2005
  ident: 10.1016/j.patcog.2019.05.032_bib0012
  article-title: Canonical Correlation Analysis
  doi: 10.1002/0470013192.bsa068
– ident: 10.1016/j.patcog.2019.05.032_bib0010
  doi: 10.1109/CVPR.2017.201
– start-page: 263
  year: 2007
  ident: 10.1016/j.patcog.2019.05.032_bib0011
  article-title: Canonical Correlation Analysis
– volume: 2
  start-page: II
  year: 2003
  ident: 10.1016/j.patcog.2019.05.032_bib0001
  article-title: Scene detection in hollywood movies and tv shows
– start-page: 1853
  year: 2014
  ident: 10.1016/j.patcog.2019.05.032_bib0042
  article-title: An autoencoder approach to learning bilingual word representations
– start-page: 145
  year: 2014
  ident: 10.1016/j.patcog.2019.05.032_bib0015
  article-title: Kernel Independent Component Analysis
– start-page: 689
  year: 2011
  ident: 10.1016/j.patcog.2019.05.032_bib0022
  article-title: Multimodal deep learning
– volume: 12
  start-page: 2825
  issue: Oct
  year: 2011
  ident: 10.1016/j.patcog.2019.05.032_bib0052
  article-title: Scikit-learn: machine learning in python
  publication-title: J. Mach. Learn. Res.
– ident: 10.1016/j.patcog.2019.05.032_bib0019
– volume: 41
  start-page: 119
  year: 2018
  ident: 10.1016/j.patcog.2019.05.032_bib0026
  article-title: Multiview dimension reduction via hessian multiset canonical correlations
  publication-title: Inf. Fusion
  doi: 10.1016/j.inffus.2017.09.001
– volume: 26
  start-page: 3113
  issue: 7
  year: 2017
  ident: 10.1016/j.patcog.2019.05.032_bib0033
  article-title: Discriminative multi-view interactive image re-ranking
  publication-title: IEEE Trans. Image Process.
  doi: 10.1109/TIP.2017.2651379
– volume: 28
  start-page: 1088
  issue: 7
  year: 2006
  ident: 10.1016/j.patcog.2019.05.032_bib0027
  article-title: Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2006.134
– volume: 24
  start-page: 5812
  issue: 12
  year: 2015
  ident: 10.1016/j.patcog.2019.05.032_bib0031
  article-title: Multi-view learning with incomplete views
  publication-title: IEEE Trans. Image Process.
  doi: 10.1109/TIP.2015.2490539
– volume: 39
  start-page: 652
  issue: 4
  year: 2017
  ident: 10.1016/j.patcog.2019.05.032_bib0004
  article-title: Show and tell: lessons learned from the 2015 MSCOCO image captioning challenge
  publication-title: IEEE Transactions on Pattern Analysis and MachineIntelligence.
  doi: 10.1109/TPAMI.2016.2587640
– volume: 28
  start-page: 257-285
  issue: 2
  year: 2016
  ident: 10.1016/j.patcog.2019.05.032_bib0008
  article-title: Correlational neural networks
  publication-title: Neural computation.
  doi: 10.1162/NECO_a_00801
– volume: 5
  start-page: 79
  year: 2005
  ident: 10.1016/j.patcog.2019.05.032_bib0049
  article-title: Europarl: A parallel corpus for statistical machine translation
– ident: 10.1016/j.patcog.2019.05.032_bib0009
– volume: 385
  start-page: 338
  year: 2017
  ident: 10.1016/j.patcog.2019.05.032_bib0024
  article-title: Canonical correlation analysis networks for two-view image recognition
  publication-title: Inf. Sci.
  doi: 10.1016/j.ins.2017.01.011
– ident: 10.1016/j.patcog.2019.05.032_bib0043
– start-page: 4651
  year: 2016
  ident: 10.1016/j.patcog.2019.05.032_bib0005
  article-title: Image captioning with semantic attention
– volume: 80
  start-page: 225
  year: 2018
  ident: 10.1016/j.patcog.2019.05.032_bib0037
  article-title: Multi-label learning based deep transfer neural network for facial attribute classification
  publication-title: Pattern Recognit.
  doi: 10.1016/j.patcog.2018.03.018
– volume: 79
  start-page: 290
  year: 2018
  ident: 10.1016/j.patcog.2019.05.032_bib0034
  article-title: A deeply supervised residual network for hep-2 cell classification via cross-modal transfer learning
  publication-title: Pattern Recognit.
  doi: 10.1016/j.patcog.2018.02.006
– volume: 13
  start-page: 281
  year: 2012
  ident: 10.1016/j.patcog.2019.05.032_bib0047
  article-title: Random search for hyper-parameter optimization
  publication-title: J. Mach. Learn. Res.
– volume: 33
  start-page: 1196
  issue: 9
  year: 2012
  ident: 10.1016/j.patcog.2019.05.032_bib0029
  article-title: Image classification by multimodal subspace learning
  publication-title: Pattern Recognit. Lett.
  doi: 10.1016/j.patrec.2012.02.002
– volume: 31
  start-page: 260
  issue: 2
  year: 2009
  ident: 10.1016/j.patcog.2019.05.032_bib0028
  article-title: Geometric mean for subspace selection
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2008.70
– volume: 3
  start-page: 1
  issue: Jul
  year: 2002
  ident: 10.1016/j.patcog.2019.05.032_bib0016
  article-title: Kernel independent component analysis
  publication-title: J. Mach. Learn. Res.
– volume: 1
  start-page: 699
  year: 2014
  ident: 10.1016/j.patcog.2019.05.032_bib0040
  article-title: Learning continuous phrase representations for translation modeling
– volume: 48
  start-page: 3191
  issue: 10
  year: 2015
  ident: 10.1016/j.patcog.2019.05.032_bib0030
  article-title: Multimodal learning for facial expression recognition
  publication-title: Pattern Recognit.
  doi: 10.1016/j.patcog.2015.04.012
– start-page: 52
  year: 2011
  ident: 10.1016/j.patcog.2019.05.032_bib0046
  article-title: Stacked convolutional auto-encoders for hierarchical feature extraction
  publication-title: Artif. Neural Netw. Mach. Learn.–ICANN 2011
  doi: 10.1007/978-3-642-21735-7_7
– volume: 275
  start-page: 1087
  year: 2018
  ident: 10.1016/j.patcog.2019.05.032_bib0025
  article-title: Multi-view local discrimination and canonical correlation analysis for image classification
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2017.09.045
– volume: 88
  issue: S1
  year: 1990
  ident: 10.1016/j.patcog.2019.05.032_sbref0035
  article-title: X-Ray microbeam speech production database
  publication-title: J. Acoust. Soc. Am.
  doi: 10.1121/1.2029064
– volume: 73
  start-page: 65
  year: 2018
  ident: 10.1016/j.patcog.2019.05.032_bib0035
  article-title: On automated source selection for transfer learning in convolutional neural networks
  publication-title: Pattern Recognit.
  doi: 10.1016/j.patcog.2017.07.019
– start-page: 34
  year: 2012
  ident: 10.1016/j.patcog.2019.05.032_bib0002
  article-title: Kernel CCA for multi-view learning of acoustic features using articulatory measurements.
– start-page: 90
  year: 2001
  ident: 10.1016/j.patcog.2019.05.032_bib0045
  article-title: Robust face detection using the hausdorff distance
– volume: 75
  start-page: 292
  year: 2018
  ident: 10.1016/j.patcog.2019.05.032_bib0036
  article-title: A parasitic metric learning net for breast mass classification based on mammography
  publication-title: Pattern Recognit.
  doi: 10.1016/j.patcog.2017.07.008
– start-page: 91
  year: 2014
  ident: 10.1016/j.patcog.2019.05.032_bib0023
  article-title: Large scale canonical correlation analysis with iterative least squares
– start-page: 1967
  year: 2016
  ident: 10.1016/j.patcog.2019.05.032_bib0017
  article-title: Nonparametric canonical correlation analysis
– start-page: 1247
  year: 2013
  ident: 10.1016/j.patcog.2019.05.032_bib0018
  article-title: Deep canonical correlation analysis
– start-page: 4590
  year: 2015
  ident: 10.1016/j.patcog.2019.05.032_bib0003
  article-title: Unsupervised learning of acoustic features via deep canonical correlation analysis
– volume: 75
  start-page: 149
  year: 2018
  ident: 10.1016/j.patcog.2019.05.032_bib0039
  article-title: Collaborative multiview hashing
  publication-title: Pattern Recognit.
  doi: 10.1016/j.patcog.2017.02.026
– volume: 72
  start-page: 1
  issue: 2011
  year: 2011
  ident: 10.1016/j.patcog.2019.05.032_bib0021
  article-title: Sparse autoencoder
  publication-title: CS294A Lecture notes
– ident: 10.1016/j.patcog.2019.05.032_bib0007
– ident: 10.1016/j.patcog.2019.05.032_bib0020
– year: 2009
  ident: 10.1016/j.patcog.2019.05.032_bib0050
– ident: 10.1016/j.patcog.2019.05.032_bib0041
– volume: 75
  start-page: 223
  year: 2018
  ident: 10.1016/j.patcog.2019.05.032_bib0038
  article-title: Topic driven multimodal similarity learning with multi-view voted convolutional features
  publication-title: Pattern Recognit.
  doi: 10.1016/j.patcog.2017.02.035
– ident: 10.1016/j.patcog.2019.05.032_bib0014
– volume: 4
  start-page: 147
  issue: 2
  year: 1976
  ident: 10.1016/j.patcog.2019.05.032_bib0013
  article-title: Canonical ridge and econometrics of joint production
  publication-title: J. Econom.
  doi: 10.1016/0304-4076(76)90010-5
– ident: 10.1016/j.patcog.2019.05.032_bib0006
  doi: 10.3115/v1/P14-1006
– year: 2000
  ident: 10.1016/j.patcog.2019.05.032_bib0048
  article-title: Mel frequency cepstral coefficients for music modeling
– volume: 37
  start-page: 2531
  issue: 12
  year: 2015
  ident: 10.1016/j.patcog.2019.05.032_bib0032
  article-title: Multi-view intact space learning
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2015.2417578
– start-page: 748
  year: 2015
  ident: 10.1016/j.patcog.2019.05.032_bib0051
  article-title: BilBOWA: Fast bilingual distributed representations without word alignments
StartPage 12
SubjectTerms Convolution autoencoders
Multilingual document classification
Representation learning
Transfer learning
Title Representation learning using step-based deep multi-modal autoencoders
URI https://dx.doi.org/10.1016/j.patcog.2019.05.032
Volume 95