An Adversarial Training Based Speech Emotion Classifier With Isolated Gaussian Regularization.

Detailed bibliography
Title: An Adversarial Training Based Speech Emotion Classifier With Isolated Gaussian Regularization.
Authors: Fu, Changzeng; Liu, Chaoran; Ishi, Carlos Toshinori; Ishiguro, Hiroshi
Source: IEEE Transactions on Affective Computing; Jul-Sep 2023, Vol. 14, Issue 3, pp. 2361-2374 (14 pages)
Abstract: Speaker-specific bias may cause emotion-related features to form clusters with irregular borders (non-Gaussian distributions). This makes the model sensitive to local irregularities in the pattern distributions, so that it overfits the in-domain dataset and its validation scores may drop in cross-domain (i.e., speaker-independent, channel-variant) use. To mitigate this problem, we propose an adversarial training-based classifier that regularizes the distribution of the latent representations to further smooth the boundaries between categories. In the regularization phase, the representations are mapped onto Gaussian distributions in an unsupervised manner to improve their discriminative ability. Our previous study used a single Gaussian distribution for this mapping; in the present work, we adopt a mixture of isolated Gaussian distributions. Moreover, multi-instance learning is adopted by dividing each utterance into a bag of segments to capture the part that expresses the emotion most saliently. The model was evaluated on the IEMOCAP and MELD datasets in in-corpus, speaker-independent settings. In addition, we investigated accuracy in cross-corpus settings, which simulate speaker-independent and channel-variant conditions. In the experiments, the proposed model was compared not only with baseline models but also with different configurations of our own model. The results show that the proposed model is competitive with the baselines in both in-corpus and cross-corpus validation. [ABSTRACT FROM AUTHOR]
Copyright of IEEE Transactions on Affective Computing is the property of IEEE and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index
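
Illustrative sketch (not taken from the article): the abstract names two ingredients, a prior made of isolated Gaussian components for the latent space and a multi-instance view of each utterance as a bag of segments. The Python below is a minimal sketch of what those two ideas could look like. Every function name, the circular layout of the component means, the values of radius and std, and the max-pooling rule are assumptions made for illustration only; the record does not state how the authors place the components or aggregate segment scores, and the adversarial training loop itself is not shown.

```python
import numpy as np


def sample_isolated_gaussian_mixture(n, num_components, dim=2,
                                     radius=4.0, std=0.5, seed=0):
    """Draw n points from a mixture of well-separated Gaussians.

    Assumption: component means are placed evenly on a circle in the
    first two latent dimensions (dim must be >= 2); the article may use
    a different layout.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, num_components, size=n)
    angles = 2.0 * np.pi * np.arange(num_components) / num_components
    means = np.zeros((num_components, dim))
    means[:, 0] = radius * np.cos(angles)
    means[:, 1] = radius * np.sin(angles)
    # Each sample is its component mean plus isotropic Gaussian noise.
    samples = means[labels] + std * rng.standard_normal((n, dim))
    return samples, labels, means


def split_into_bag(signal, segment_len, hop):
    """Divide a 1-D signal into (possibly overlapping) segments."""
    last_start = max(len(signal) - segment_len, 0)
    starts = range(0, last_start + 1, hop)
    return np.stack([signal[s:s + segment_len] for s in starts])


def bag_prediction(instance_scores):
    """Hypothetical aggregation: per-class maximum over the segments,
    so the most salient segment decides the bag-level score."""
    return instance_scores.max(axis=0)
```

In an adversarial regularization setup of this general kind, samples such as those returned by sample_isolated_gaussian_mixture would typically serve as the "real" examples shown to a discriminator, while the encoder's latent codes play the role of the "fake" ones.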
FullText Text:
  Availability: 0
CustomLinks:
  – Url: https://resolver.ebscohost.com/openurl?sid=EBSCO:edb&genre=article&issn=19493045&ISBN=&volume=14&issue=3&date=20230701&spage=2361&pages=2361-2374&title=IEEE Transactions on Affective Computing&atitle=An%20Adversarial%20Training%20Based%20Speech%20Emotion%20Classifier%20With%20Isolated%20Gaussian%20Regularization.&aulast=Fu%2C%20Changzeng&id=DOI:10.1109/TAFFC.2022.3169091
    Name: Full Text Finder
    Category: fullText
    Text: Full Text Finder
    Icon: https://imageserver.ebscohost.com/branding/images/FTF.gif
    MouseOverText: Full Text Finder
  – Url: https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=EBSCO&SrcAuth=EBSCO&DestApp=WOS&ServiceName=TransferToWoS&DestLinkType=GeneralSearchSummary&Func=Links&author=Fu%20C
    Name: ISI
    Category: fullText
    Text: Find this article in Web of Science
    Icon: https://imagesrvr.epnet.com/ls/20docs.gif
    MouseOverText: Find this article in Web of Science
Header DbId: edb
DbLabel: Complementary Index
An: 172274291
RelevancyScore: 950
AccessLevel: 6
PubType: Academic Journal
PubTypeId: academicJournal
PreciseRelevancyScore: 949.677551269531
IllustrationInfo
Items – Name: Title
  Label: Title
  Group: Ti
  Data: An Adversarial Training Based Speech Emotion Classifier With Isolated Gaussian Regularization.
– Name: Author
  Label: Authors
  Group: Au
  Data: Fu, Changzeng; Liu, Chaoran; Ishi, Carlos Toshinori; Ishiguro, Hiroshi
– Name: TitleSource
  Label: Source
  Group: Src
  Data: IEEE Transactions on Affective Computing; Jul-Sep 2023, Vol. 14, Issue 3, pp. 2361-2374 (14 pages)
PLink https://erproxy.cvtisr.sk/sfx/access?url=https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edb&AN=172274291
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1109/TAFFC.2022.3169091
    Languages:
      – Code: eng
        Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 14
        StartPage: 2361
    Titles:
      – TitleFull: An Adversarial Training Based Speech Emotion Classifier With Isolated Gaussian Regularization.
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Fu, Changzeng
      – PersonEntity:
          Name:
            NameFull: Liu, Chaoran
      – PersonEntity:
          Name:
            NameFull: Ishi, Carlos Toshinori
      – PersonEntity:
          Name:
            NameFull: Ishiguro, Hiroshi
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 07
              Text: Jul-Sep2023
              Type: published
              Y: 2023
          Identifiers:
            – Type: issn-print
              Value: 19493045
          Numbering:
            – Type: volume
              Value: 14
            – Type: issue
              Value: 3
          Titles:
            – TitleFull: IEEE Transactions on Affective Computing
              Type: main
ResultId 1