Representation learning using step-based deep multi-modal autoencoders
| Published in: | Pattern recognition, vol. 95, pp. 12-23 |
|---|---|
| Main authors: | Bhatt, Gaurav; Jha, Piyush; Raman, Balasubramanian |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 1 November 2019 |
| Subjects: | Convolution autoencoders; Multilingual document classification; Representation learning; Transfer learning |
| ISSN: | 0031-3203, 1873-5142 |
| Online access: | Full text |

| Abstract | Deep learning techniques have been successfully used in learning a common representation for multi-view data, wherein different modalities are projected onto a common subspace. From a broader perspective, the techniques used to investigate common representation learning fall under the categories of ‘canonical correlation-based’ approaches and ‘autoencoder-based’ approaches. In this paper, we investigate the performance of deep autoencoder-based methods on multi-view data. We propose a novel step-based correlation multi-modal deep convolution neural network (CorrMCNN), which reconstructs one view of the data given the other while increasing the interaction between the representations at each hidden layer, that is, at every intermediate step. The idea of step reconstruction relaxes the constraint of reconstructing the original data; instead, the objective function is optimized for reconstruction of representative features. This helps the proposed model generalize efficiently to representation and transfer learning tasks on high-dimensional data. Finally, we evaluate the performance of the proposed model on three multi-view and cross-modal problems, viz., audio articulation, cross-modal image retrieval and multilingual (cross-language) document classification. Through extensive experiments, we find that the proposed model performs much better than the current state-of-the-art deep learning techniques on all three multi-view and cross-modal tasks. |
|---|---|
| Author | Bhatt, Gaurav; Jha, Piyush; Raman, Balasubramanian |
| Author_xml | 1. Bhatt, Gaurav (gauravbhatt.deeplearn@gmail.com), Indian Institute of Technology Roorkee (IITR), Roorkee, 247667, India; 2. Jha, Piyush (piyushnit15@gmail.com), Malaviya National Institute of Technology (MNIT), Jaipur, 302017, India; 3. Raman, Balasubramanian (balarfma@iitr.ac.in), Indian Institute of Technology Roorkee (IITR), Roorkee, 247667, India |
| ContentType | Journal Article |
| Copyright | 2019 Elsevier Ltd |
| DOI | 10.1016/j.patcog.2019.05.032 |
| Discipline | Computer Science |
| EISSN | 1873-5142 |
| EndPage | 23 |
| ExternalDocumentID | 10_1016_j_patcog_2019_05_032 S0031320319302146 |
| ISICitedReferencesCount | 20 |
| ISSN | 0031-3203 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Representation learning; Convolution autoencoders; Transfer learning; Multilingual document classification |
| Language | English |
| PageCount | 12 |
| PublicationCentury | 2000 |
| PublicationDate | November 2019 2019-11-00 |
| PublicationDateYYYYMMDD | 2019-11-01 |
| PublicationDecade | 2010 |
| PublicationTitle | Pattern recognition |
| PublicationYear | 2019 |
| Publisher | Elsevier Ltd |
| StartPage | 12 |
| SubjectTerms | Convolution autoencoders; Multilingual document classification; Representation learning; Transfer learning |
| Title | Representation learning using step-based deep multi-modal autoencoders |
| URI | https://dx.doi.org/10.1016/j.patcog.2019.05.032 |
| Volume | 95 |
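
The abstract describes an objective that reconstructs one view from the other while increasing the interaction between the two views' representations at every intermediate step. The NumPy sketch below illustrates that general idea only. It is not the authors' CorrMCNN implementation: the function names `correlation` and `step_loss`, the weight `lam`, and the choice of mean squared error for reconstruction are all assumptions made for illustration.

```python
import numpy as np


def correlation(h_x, h_y, eps=1e-8):
    """Sum, over hidden units, of the Pearson correlation between two views.

    h_x, h_y: arrays of shape (n_samples, n_units) holding the hidden
    representations of the same samples in each view (hypothetical inputs).
    """
    h_x_c = h_x - h_x.mean(axis=0)  # centre each hidden unit
    h_y_c = h_y - h_y.mean(axis=0)
    num = (h_x_c * h_y_c).sum(axis=0)
    den = np.sqrt((h_x_c ** 2).sum(axis=0) * (h_y_c ** 2).sum(axis=0)) + eps
    return float((num / den).sum())


def step_loss(x, x_from_y, y, y_from_x, hidden_pairs, lam=1.0):
    """Cross-view reconstruction error minus a per-step correlation reward.

    x_from_y / y_from_x: reconstructions of one view from the other.
    hidden_pairs: list of (h_x, h_y) pairs, one per intermediate step.
    lam: assumed trade-off weight (not taken from the paper).
    """
    reconstruction = np.mean((x_from_y - x) ** 2) + np.mean((y_from_x - y) ** 2)
    corr = sum(correlation(h_x, h_y) for h_x, h_y in hidden_pairs)
    return reconstruction - lam * corr
```

Minimizing `step_loss` lowers the cross-view reconstruction error and raises the correlation at each listed step. In a real model both terms would be computed inside an automatic-differentiation framework rather than with plain NumPy.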