Audio based depression detection using Convolutional Autoencoder
| Published in: | Expert systems with applications Vol. 189; p. 116076 |
|---|---|
| Main Authors: | Sardari, Sara; Nakisa, Bahareh; Rastgoo, Mohammed Naim; Eklund, Peter |
| Format: | Journal Article |
| Language: | English |
| Published: | New York; Elsevier Ltd; 1 March 2022; Elsevier BV |
| Subjects: | |
| ISSN: | 0957-4174, 1873-6793 |
| Online Access: | Full text |
| Abstract | • A novel audio-based depression detection system using Convolutional Autoencoder. • Convolutional Autoencoder for extracting highly correlated and compact feature set. • Thorough experimental study based on a real-world depression detection dataset. • Complete comparison of proposed feature extraction method with other techniques.
Depression is a serious and common psychological disorder that requires early diagnosis and treatment. In severe episodes, the condition may result in suicidal thoughts. Recently, the need for an effective audio-based Automatic Depression Detection (ADD) system has sparked the interest of the research community. To date, most reported approaches to recognizing depression rely on hand-crafted feature extraction for audio data representation. They combine a wide variety of audio-related features to improve classification performance. However, combining many hand-crafted features, both relevant and less relevant, enlarges the feature space and can lead to high-dimensionality issues, because not all features carry significant information about depression. A high number of features can make pattern recognition more difficult and increase the risk of overfitting. To overcome these limitations, an audio-based depression detection framework that adapts a deep learning (DL) technique is proposed to automatically extract a highly relevant and compact feature set. The proposed framework uses an end-to-end Convolutional Neural Network-based Autoencoder (CNN AE) technique to learn highly relevant and discriminative features from raw sequential audio data, and hence to detect depressed people more accurately. In addition, to address the sample imbalance problem, we use a cluster-based sampling technique that greatly reduces the risk of bias towards the majority class (non-depressed). To evaluate the performance and effectiveness of the proposed pipeline, we perform experiments on the Distress Analysis Interview Corpus-Wizard of Oz (DAIC-WOZ) dataset and compare the results with hand-crafted feature extraction methods and other prominent studies in this domain.
The results show that the proposed method outperforms other well-known audio-based ADD models, with at least a 7% improvement in F-measure for classifying depression. |
|---|---|
| ArticleNumber | 116076 |
| Author | Rastgoo, Mohammed Naim; Sardari, Sara; Eklund, Peter; Nakisa, Bahareh |
| Author_xml | – sequence: 1 givenname: Sara surname: Sardari fullname: Sardari, Sara email: sara.sardari@shirazu.ac.ir organization: Computer Science, Engineering and IT Department, Shiraz University, Shiraz, Iran – sequence: 2 givenname: Bahareh surname: Nakisa fullname: Nakisa, Bahareh email: Bahar.nakisa@deakin.edu.au organization: School of Information Technology, Faculty of Science Engineering and Built Environment, Deakin University, Vic, Australia – sequence: 3 givenname: Mohammed Naim surname: Rastgoo fullname: Rastgoo, Mohammed Naim email: mohammadnaim.rastgoo@qut.edu.au organization: School of Electrical Engineering and Computer Science, Queensland University of Technology, Brisbane, QLD, Australia – sequence: 4 givenname: Peter surname: Eklund fullname: Eklund, Peter email: peter.eklund@deakin.edu.au organization: School of Information Technology, Faculty of Science Engineering and Built Environment, Deakin University, Vic, Australia |
| ContentType | Journal Article |
| Copyright | 2021 Elsevier Ltd; Copyright Elsevier BV Mar 1, 2022 |
| Copyright_xml | – notice: 2021 Elsevier Ltd – notice: Copyright Elsevier BV Mar 1, 2022 |
| DBID | AAYXX CITATION 7SC 8FD JQ2 L7M L~C L~D |
| DOI | 10.1016/j.eswa.2021.116076 |
| DatabaseName | CrossRef; Computer and Information Systems Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Academic; Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef; Computer and Information Systems Abstracts; Technology Research Database; Computer and Information Systems Abstracts – Academic; Advanced Technologies Database with Aerospace; ProQuest Computer Science Collection; Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Computer and Information Systems Abstracts |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISSN | 1873-6793 |
| ExternalDocumentID | 10_1016_j_eswa_2021_116076 S0957417421014147 |
| ISICitedReferencesCount | 76 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000717676900001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 0957-4174 |
| IngestDate | Sun Nov 09 07:53:26 EST 2025 Tue Nov 18 22:04:52 EST 2025 Sat Nov 29 07:07:43 EST 2025 Sun Apr 06 06:53:03 EDT 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Audio depression detection; Early depression detection; Semi-supervised learning; Convolutional Autoencoder |
| Language | English |
| LinkModel | OpenURL |
| PublicationCentury | 2000 |
| PublicationDate | 2022-03-01 2022-03-00 20220301 |
| PublicationDateYYYYMMDD | 2022-03-01 |
| PublicationDate_xml | – month: 03 year: 2022 text: 2022-03-01 day: 01 |
| PublicationDecade | 2020 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York |
| PublicationTitle | Expert systems with applications |
| PublicationYear | 2022 |
| Publisher | Elsevier Ltd; Elsevier BV |
| Publisher_xml | – name: Elsevier Ltd – name: Elsevier BV |
10.1186/s13636-020-00182-4. – volume: 2 start-page: 1 year: 2015 end-page: 18 ident: b0010 article-title: Variational autoencoder based anomaly detection using reconstruction probability publication-title: Special Lecture on IE – volume: 22 start-page: 688 year: 2020 ident: b0265 article-title: Automatic Detection of Depression in Speech Using Ensemble Convolutional Neural Networks publication-title: Entropy – volume: 23 start-page: 2265 year: 2019 end-page: 2275 ident: b0310 article-title: Multimodal depression detection: Fusion of electroencephalography and paralinguistic behaviors using a novel strategy for classifier ensemble publication-title: IEEE Journal of Biomedical and Health Informatics – reference: Valstar, M., Schuller, B., Smith, K., Almaev, T., Eyben, F., Krajewski, J., Cowie, R. and Pantic, M., 2014, November. Avec 2014: 3d dimensional affect and depression recognition challenge. Proceedings of the 4th international workshop on audio/visual emotion challenge, pp. 3-10. – volume: 6 start-page: 49325 year: 2018 end-page: 49338 ident: b0160 article-title: Long short term memory hyperparameter optimization for a neural network based emotion recognition framework publication-title: IEEE Access – volume: 27 start-page: 2041 year: 2019 end-page: 2053 ident: b0055 article-title: Unsupervised speech representation learning using wavenet autoencoders publication-title: IEEE/ACM transactions on audio, speech, and language processing – volume: 71 start-page: 10 year: 2015 end-page: 49 ident: b0060 article-title: A review of depression and suicide risk assessment using speech analysis publication-title: Speech Communication – reference: Zlotnik, A., Montero, J.M., San-Segundo, R. and Gallardo-Antolín, A., 2015. Random forest-based prediction of Parkinson's disease progression using acoustic, ASR and intelligibility features. INTERSPEECH-2015, 503- 507. 
– start-page: 35 year: 2016 end-page: 42 ident: b0125 article-title: Depaudionet: An efficient deep model for audio based depression classification publication-title: Proceedings of the 6th international workshop on audio/visual emotion challenge – reference: Beck, A. T., Steer, R. A., Brown, G. K., 1996. Beck depression inventory. – reference: arXiv preprint arXiv:2006.10417. – reference: Ringeval, F., Schuller, B., Valstar, M., Gratch, J., Cowie, R., Scherer, S., Mozgai, S., Cummins, N., Schmitt, M., Pantic, M., 2017, October. Avec 2017: Real-life depression, and affect recognition workshop and challenge. In – reference: . – volume: 114 start-page: 163 year: 2009 end-page: 173 ident: b0100 article-title: The PHQ-8 as a measure of current depression in the general population publication-title: Journal of affective disorders – volume: 89 year: 2020 ident: b0015 article-title: Deep learning-based appearance features extraction for automated carp species identification publication-title: Aquacultural Engineering – reference: Chollet, F. 2015. Keras. Available online at: – start-page: 52 year: 2011 end-page: 59 ident: b0130 article-title: Stacked convolutional auto-encoders for hierarchical feature extraction publication-title: International conference on artificial neural networks – volume: 409 start-page: 17 year: 2017 end-page: 26 ident: b0115 article-title: Clustering-based undersampling in class-imbalanced data publication-title: Information Sciences – reference: McFee, B., Raffel, C., Liang, D., Ellis, D. P., McVicar, M., Battenberg, E., Nieto, O., 2015, July. librosa: Audio and music signal analysis in python. Proceedings of the 14th python in science conference, Vol. 8, pp. 18- 25. – volume: 42 start-page: 200 year: 2017 end-page: 211 ident: b0315 article-title: Constructing fine-granularity functional brain network atlases via deep convolutional autoencoder publication-title: Medical Image Analysis – reference: arXiv preprint arXiv:1806.02146. 
– volume: 138 year: 2019 ident: b0205 article-title: Automatic driver stress level classification using multimodal deep learning publication-title: Expert Systems with Applications – volume: 11 start-page: 1 issue: 1 year: 2021 ident: 10.1016/j.eswa.2021.116076_b0270 article-title: Multimodal deep learning models for early detection of Alzheimer’s disease stage publication-title: Scientific Reports doi: 10.1038/s41598-020-74399-w – volume: 7 start-page: 164650 year: 2019 ident: 10.1016/j.eswa.2021.116076_b0235 article-title: A survey of deep learning techniques: Application in wind and solar energy resources publication-title: IEEE Access doi: 10.1109/ACCESS.2019.2951750 – volume: 23 start-page: 1618 issue: 4 year: 2018 ident: 10.1016/j.eswa.2021.116076_b0260 article-title: Multimodal assessment of Parkinson's disease: A deep learning approach publication-title: IEEE journal of biomedical and health informatics doi: 10.1109/JBHI.2018.2866873 – volume: 173 year: 2021 ident: 10.1016/j.eswa.2021.116076_b0150 article-title: Driver stress detection via multimodal fusion using attention-based CNN-LSTM publication-title: Expert Systems with Applications doi: 10.1016/j.eswa.2021.114693 – start-page: 89 year: 2016 ident: 10.1016/j.eswa.2021.116076_b0295 article-title: Decision tree based depression classification from audio video and language information – ident: 10.1016/j.eswa.2021.116076_b0230 doi: 10.21437/Interspeech.2017-1421 – start-page: 8 year: 2018 ident: 10.1016/j.eswa.2021.116076_b0190 article-title: Telehealth for the Assessment and Treatment of Depression, Post-Traumatic Stress Disorder, and Anxiety: Clinical Evidence – volume: 71 start-page: 10 year: 2015 ident: 10.1016/j.eswa.2021.116076_b0060 article-title: A review of depression and suicide risk assessment using speech analysis publication-title: Speech Communication doi: 10.1016/j.specom.2015.03.004 – volume: 51 start-page: 1 issue: 5 year: 2018 ident: 10.1016/j.eswa.2021.116076_b0210 article-title: A 
critical review of proactive detection of driver stress levels based on multimodal measurements publication-title: ACM Computing Surveys (CSUR) doi: 10.1145/3186585 – volume: 15 start-page: 139 issue: 2 year: 2018 ident: 10.1016/j.eswa.2021.116076_b0120 article-title: Advances on automatic speech analysis for early detection of Alzheimer disease: A non-linear multi-task approach publication-title: Current Alzheimer Research doi: 10.2174/1567205014666171120143800 – volume: 138 year: 2019 ident: 10.1016/j.eswa.2021.116076_b0205 article-title: Automatic driver stress level classification using multimodal deep learning publication-title: Expert Systems with Applications doi: 10.1016/j.eswa.2019.07.010 – ident: 10.1016/j.eswa.2021.116076_b0040 – volume: 22 start-page: 688 issue: 6 year: 2020 ident: 10.1016/j.eswa.2021.116076_b0265 article-title: Automatic Detection of Depression in Speech Using Ensemble Convolutional Neural Networks publication-title: Entropy doi: 10.3390/e22060688 – volume: 77 start-page: 148 year: 2019 ident: 10.1016/j.eswa.2021.116076_b0030 article-title: Automatic detection of Parkinson’s disease based on acoustic analysis of speech publication-title: Engineering Applications of Artificial Intelligence doi: 10.1016/j.engappai.2018.09.018 – volume: 114 start-page: 163 issue: 1–3 year: 2009 ident: 10.1016/j.eswa.2021.116076_b0100 article-title: The PHQ-8 as a measure of current depression in the general population publication-title: Journal of affective disorders doi: 10.1016/j.jad.2008.06.026 – ident: 10.1016/j.eswa.2021.116076_b0200 – ident: 10.1016/j.eswa.2021.116076_b0050 doi: 10.1145/3107990.3108004 – ident: 10.1016/j.eswa.2021.116076_b0245 doi: 10.1145/2661806.2661807 – volume: 53 start-page: 181 year: 2019 ident: 10.1016/j.eswa.2021.116076_b0085 article-title: Identifying mild cognitive impairment and mild Alzheimer’s disease based on spontaneous speech using ASR and linguistic features publication-title: Computer Speech & Language doi: 
10.1016/j.csl.2018.07.007 – volume: 374 start-page: 20150202 issue: 2065 year: 2016 ident: 10.1016/j.eswa.2021.116076_b0095 article-title: Principal component analysis: A review and recent developments publication-title: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences doi: 10.1098/rsta.2015.0202 – volume: 8 start-page: 225463 year: 2020 ident: 10.1016/j.eswa.2021.116076_b0155 article-title: Automatic Emotion Recognition Using Temporal Multimodal Deep Learning publication-title: IEEE Access doi: 10.1109/ACCESS.2020.3027026 – volume: 4 start-page: 50 issue: 1 year: 2020 ident: 10.1016/j.eswa.2021.116076_b0175 article-title: Deepfall: Non-invasive fall detection with deep spatio-temporal convolutional autoencoders publication-title: Journal of Healthcare Informatics Research doi: 10.1007/s41666-019-00061-4 – ident: 10.1016/j.eswa.2021.116076_b0045 – start-page: 1 year: 2009 ident: 10.1016/j.eswa.2021.116076_b0145 article-title: An approach for automatically measuring facial activity in depressed subjects – ident: 10.1016/j.eswa.2021.116076_b0215 – ident: 10.1016/j.eswa.2021.116076_b0140 doi: 10.25080/Majora-7b98e3ed-003 – volume: 89 year: 2020 ident: 10.1016/j.eswa.2021.116076_b0015 article-title: Deep learning-based appearance features extraction for automated carp species identification publication-title: Aquacultural Engineering doi: 10.1016/j.aquaeng.2020.102053 – start-page: 35 year: 2016 ident: 10.1016/j.eswa.2021.116076_b0125 article-title: Depaudionet: An efficient deep model for audio based depression classification – volume: 18 start-page: 1 issue: 1998 year: 1998 ident: 10.1016/j.eswa.2021.116076_b0020 article-title: Linear discriminant analysis-a brief tutorial publication-title: Institute for Signal and information Processing – volume: 27 start-page: 2041 issue: 12 year: 2019 ident: 10.1016/j.eswa.2021.116076_b0055 article-title: Unsupervised speech representation learning using wavenet autoencoders 
publication-title: IEEE/ACM transactions on audio, speech, and language processing doi: 10.1109/TASLP.2019.2938863 – year: 2017 ident: 10.1016/j.eswa.2021.116076_b0280 – start-page: 45 year: 2017 ident: 10.1016/j.eswa.2021.116076_b0300 article-title: Hybrid depression classification and estimation from audio video and text information – ident: 10.1016/j.eswa.2021.116076_b0105 doi: 10.1109/SCIS-ISIS.2018.00023 – volume: 10 start-page: 13 issue: 66–71 year: 2009 ident: 10.1016/j.eswa.2021.116076_b0255 article-title: Dimensionality reduction: A comparative publication-title: J Mach Learn Res – ident: 10.1016/j.eswa.2021.116076_b0220 doi: 10.1145/3266302.3266316 – ident: 10.1016/j.eswa.2021.116076_b0225 doi: 10.1145/3133944.3133953 – start-page: 1716 year: 2018 ident: 10.1016/j.eswa.2021.116076_b0005 article-title: September. Detecting Depression with Audio/Text Sequence Modeling of Interviews publication-title: Interspeech – volume: 57 year: 2020 ident: 10.1016/j.eswa.2021.116076_b0170 article-title: Data augmentation approaches for improving animal audio classification publication-title: Ecological Informatics doi: 10.1016/j.ecoinf.2020.101084 – volume: 409 start-page: 17 year: 2017 ident: 10.1016/j.eswa.2021.116076_b0115 article-title: Clustering-based undersampling in class-imbalanced data publication-title: Information Sciences doi: 10.1016/j.ins.2017.05.008 – ident: 10.1016/j.eswa.2021.116076_b0180 – ident: 10.1016/j.eswa.2021.116076_b0250 doi: 10.1145/2512530.2512533 – ident: 10.1016/j.eswa.2021.116076_b0025 doi: 10.1037/t00742-000 – start-page: 52 year: 2011 ident: 10.1016/j.eswa.2021.116076_b0130 article-title: Stacked convolutional auto-encoders for hierarchical feature extraction – volume: 42 start-page: 200 year: 2017 ident: 10.1016/j.eswa.2021.116076_b0315 article-title: Constructing fine-granularity functional brain network atlases via deep convolutional autoencoder publication-title: Medical Image Analysis doi: 10.1016/j.media.2017.08.005 – ident: 
10.1016/j.eswa.2021.116076_b0090 – volume: 10 issue: 12 year: 2015 ident: 10.1016/j.eswa.2021.116076_b0075 article-title: Pyaudioanalysis: An open-source python library for audio signal analysis publication-title: PloS one doi: 10.1371/journal.pone.0144610 – ident: 10.1016/j.eswa.2021.116076_b0240 doi: 10.1145/2988257.2988258 – ident: 10.1016/j.eswa.2021.116076_b0065 doi: 10.1186/s13636-020-00182-4 – volume: 23 start-page: 2265 issue: 6 year: 2019 ident: 10.1016/j.eswa.2021.116076_b0310 article-title: Multimodal depression detection: Fusion of electroencephalography and paralinguistic behaviors using a novel strategy for classifier ensemble publication-title: IEEE Journal of Biomedical and Health Informatics doi: 10.1109/JBHI.2019.2938247 – start-page: 7124 year: 2020 ident: 10.1016/j.eswa.2021.116076_b0035 article-title: Pyannote. audio: neural building blocks for speaker diarization – volume: 93 start-page: 143 year: 2018 ident: 10.1016/j.eswa.2021.116076_b0165 article-title: Evolutionary computation algorithms for feature selection of EEG-based emotion recognition using mobile sensors publication-title: Expert Systems with Applications doi: 10.1016/j.eswa.2017.09.062 – start-page: 1 year: 2017 ident: 10.1016/j.eswa.2021.116076_b0080 article-title: Image Classification Using Deep Autoencoders – start-page: 27 year: 2016 ident: 10.1016/j.eswa.2021.116076_b0195 article-title: Depression assessment by fusing high and low level features from audio, video, and text – start-page: 2886 year: 2013 ident: 10.1016/j.eswa.2021.116076_b0290 article-title: Using denoising autoencoder for emotion recognition publication-title: Interspeech – volume: 2 start-page: 1 issue: 1 year: 2015 ident: 10.1016/j.eswa.2021.116076_b0010 article-title: Variational autoencoder based anomaly detection using reconstruction probability publication-title: Special Lecture on IE – volume: 6 start-page: 25399 year: 2018 ident: 10.1016/j.eswa.2021.116076_b0275 article-title: Deep convolution neural 
network and autoencoders-based unsupervised feature learning of EEG signals publication-title: IEEE Access doi: 10.1109/ACCESS.2018.2833746 – volume: 2018 start-page: 3398 year: 2018 ident: 10.1016/j.eswa.2021.116076_b0185 article-title: Multi-lingual depression-level assessment from conversational speech using acoustic and text features publication-title: Proceedings of Interspeech – volume: 18 start-page: 559 issue: 1 year: 2017 ident: 10.1016/j.eswa.2021.116076_b0110 article-title: Imbalanced-learn: A python toolbox to tackle the curse of imbalanced datasets in machine learning publication-title: The Journal of Machine Learning Research – volume: 6 start-page: 49325 year: 2018 ident: 10.1016/j.eswa.2021.116076_b0160 article-title: Long short term memory hyperparameter optimization for a neural network based emotion recognition framework publication-title: IEEE Access doi: 10.1109/ACCESS.2018.2868361 – volume: 3 issue: 11 year: 2006 ident: 10.1016/j.eswa.2021.116076_b0135 article-title: Projections of global mortality and burden of disease from 2002 to 2030 publication-title: PLoS medicine doi: 10.1371/journal.pmed.0030442 – volume: 123 start-page: 176 year: 2017 ident: 10.1016/j.eswa.2021.116076_b0305 article-title: The effects of higher temperature setpoints during summer on office workers' cognitive load and thermal comfort publication-title: Building and Environment doi: 10.1016/j.buildenv.2017.06.048 – ident: 10.1016/j.eswa.2021.116076_b0320 doi: 10.21437/Interspeech.2015-184 – start-page: 1 year: 2018 ident: 10.1016/j.eswa.2021.116076_b0285 article-title: Parkinson’s disease diagnosis using machine learning and voice – volume: 8 start-page: 25111 year: 2020 ident: 10.1016/j.eswa.2021.116076_b0070 article-title: Spatiotemporal modeling for nonlinear distributed thermal processes based on KL decomposition, MLP and LSTM network publication-title: IEEE Access doi: 10.1109/ACCESS.2020.2970836 |
| SourceID | proquest crossref elsevier |
| SourceType | Aggregation Database Enrichment Source Index Database Publisher |
| StartPage | 116076 |
| SubjectTerms | Artificial neural networks; Audio data; Audio depression detection; Classification; Convolutional Autoencoder; Early depression detection; Feature extraction; Machine learning; Mental disorders; Pattern recognition; Performance evaluation; Sampling methods; Semi-supervised learning |
| Title | Audio based depression detection using Convolutional Autoencoder |
| URI | https://dx.doi.org/10.1016/j.eswa.2021.116076 https://www.proquest.com/docview/2617690912 |
| Volume | 189 |