Voice adaptation by color-encoded frame matching as a multi-objective optimization problem for future games

Voice adaptation is an interactive speech processing technique that allows the speaker to transmit with a chosen target voice. We propose a novel method that is intended for dynamic scenarios, such as online video games, where the source speaker’s and target speaker’s data are nonaligned. This would...

Bibliographic details
Published in: Complex & intelligent systems, Vol. 8, No. 2, pp. 1539–1550
Main authors: Midtlyng, Mads; Sato, Yuji; Hosobe, Hiroshi
Format: Journal Article
Language: English
Published: Cham: Springer International Publishing, 1 April 2022
Springer Nature B.V
ISSN: 2199-4536, 2198-6053
Online access: Get full text
Abstract Voice adaptation is an interactive speech processing technique that allows a speaker to transmit speech in a chosen target voice. We propose a novel method intended for dynamic scenarios, such as online video games, where the source speaker’s and target speaker’s data are not aligned. This would substantially improve immersion and experience by letting players fully become a character, and it would address privacy concerns by disguising the voice to protect against harassment. With unaligned data, traditional methods such as probabilistic models become inaccurate, while recent methods such as deep neural networks (DNNs) require too much preparation work. Common methods also require multiple subjects to be trained in parallel, which constrains their practicality in production environments. Our proposal trains a single subject, without parallel data, into a voice profile that can be used against any unknown source speaker. Prosodic data such as pitch, power and temporal structure are encoded into RGBA-colored frames, which are used in a multi-objective optimization problem to adjust interrelated features based on color likeness. Finally, frames are smoothed and adjusted before output. The method was evaluated using Mean Opinion Score, ABX, MUSHRA and Single Ease Question tests, together with performance benchmarks, using two voice profiles of different sizes; a discussion of game implementation follows. Results show improved adaptation quality, especially with the larger voice profile, and the audience was positive about using such technology in future games.
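The abstract does not spell out how prosodic features become colors or how "color likeness" is scored, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. The channel assignment (pitch to red, power to green, duration to blue, alpha left opaque), the value ranges, and every function name here are assumptions made for the example. Each channel difference is treated as a separate objective, and the non-dominated (Pareto-optimal) profile frames are returned as candidate matches.

```python
from typing import List, Tuple

RGBA = Tuple[int, int, int, int]


def scale_to_byte(value: float, lo: float, hi: float) -> int:
    """Clamp value to [lo, hi] and map it linearly onto 0..255."""
    clamped = min(max(value, lo), hi)
    return round(255 * (clamped - lo) / (hi - lo))


def encode_frame(pitch_hz: float, power_db: float, duration_ms: float) -> RGBA:
    """Encode one frame's prosody as a color (hypothetical mapping and ranges)."""
    return (
        scale_to_byte(pitch_hz, 50.0, 500.0),    # red: pitch, assumed 50..500 Hz
        scale_to_byte(power_db, -60.0, 0.0),     # green: power, assumed -60..0 dB
        scale_to_byte(duration_ms, 0.0, 100.0),  # blue: duration, assumed 0..100 ms
        255,                                     # alpha: unused here, kept opaque
    )


def objectives(source: RGBA, candidate: RGBA) -> Tuple[int, ...]:
    """One objective per prosodic channel: absolute color difference (lower is better)."""
    return tuple(abs(s - c) for s, c in zip(source[:3], candidate[:3]))


def dominates(a: Tuple[int, ...], b: Tuple[int, ...]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_matches(source: RGBA, profile: List[RGBA]) -> List[RGBA]:
    """Return the profile frames whose objective vectors no other frame dominates."""
    scored = [(frame, objectives(source, frame)) for frame in profile]
    return [
        frame
        for frame, score in scored
        if not any(dominates(other, score) for _, other in scored)
    ]
```

A caller would encode each incoming source frame, pass it to `pareto_matches` together with the stored voice profile, and then pick one frame from the returned set, for instance by a weighted sum of the three differences. The smoothing and adjustment stage mentioned in the abstract is not covered by this sketch.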
Author Sato, Yuji
Midtlyng, Mads
Hosobe, Hiroshi
Author_xml – sequence: 1
  givenname: Mads
  orcidid: 0000-0002-4303-1697
  surname: Midtlyng
  fullname: Midtlyng, Mads
  email: midtlyng.madsalexander.9c@stu.hosei.ac.jp
  organization: Department of Computer Science, Hosei University
– sequence: 2
  givenname: Yuji
  surname: Sato
  fullname: Sato, Yuji
  organization: Faculty of Computer and Information Sciences, Hosei University
– sequence: 3
  givenname: Hiroshi
  surname: Hosobe
  fullname: Hosobe, Hiroshi
  organization: Faculty of Computer and Information Sciences, Hosei University
Cites_doi 10.21437/Interspeech.2010-596
10.1109/ICASSP.2018.8462342
10.1109/TASL.2012.2227735
10.1109/SMC.2018.00351
10.1109/APSIPA.2016.7820786
10.21437/Odyssey.2018-28
10.1109/ICASSP.2001.941046
10.1109/APSIPA.2017.8282216
10.1109/ICASSP.1988.196671
10.21437/Interspeech.2018-1190
10.1109/TASLP.2021.3049336
10.1109/SLT.2012.6424242
10.1016/j.chb.2013.07.014
10.23919/APSIPA.2018.8659628
10.1109/TASLP.2014.2333242
10.1109/CoG47356.2020.9231643
10.1109/TSA.2005.860839
10.1109/ICASSP.1998.674423
10.1109/ICASSP.2011.5947510
10.1109/ICASSP39728.2021.9413391
10.1109/TASLP.2021.3066047
10.1109/ICASSP.2009.4960401
10.1109/ICASSP.2019.8682897
10.1109/TEVC.2007.892759
10.1109/SMC.2016.7844220
10.21437/Interspeech.2007-550
10.5220/0006193301640174
10.1109/TASLP.2014.2353991
10.1016/0167-6393(95)90054-3
10.21437/Eurospeech.2003-664
10.1016/j.asoc.2004.06.005
ContentType Journal Article
Copyright The Author(s) 2021
The Author(s) 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1007/s40747-021-00604-6
Discipline Engineering
Mathematics
EISSN 2198-6053
EndPage 1550
ExternalDocumentID 10_1007_s40747_021_00604_6
GrantInformation_xml – fundername: Japan Society for the Promotion of Science
  grantid: JP19K12162
  funderid: http://dx.doi.org/10.13039/501100001691
ISSN 2199-4536
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 2
Keywords Video games
Multi-objective optimization problems
Voice adaptation
Color-encoding
Speech processing
Language English
ORCID 0000-0002-4303-1697
OpenAccessLink https://link.springer.com/10.1007/s40747-021-00604-6
PageCount 12
PublicationCentury 2000
PublicationDate 2022-04-01
PublicationDateYYYYMMDD 2022-04-01
PublicationDate_xml – month: 04
  year: 2022
  text: 2022-04-01
  day: 01
PublicationDecade 2020
PublicationPlace Cham
PublicationPlace_xml – name: Cham
– name: Heidelberg
PublicationTitle Complex & intelligent systems
PublicationTitleAbbrev Complex Intell. Syst
PublicationYear 2022
Publisher Springer International Publishing
Springer Nature B.V
Publisher_xml – name: Springer International Publishing
– name: Springer Nature B.V
References Takashima R, Takiguchi T, Ariki Y (2012) Exemplar-based Voice conversion in noisy environment. In: IEEE Spoken Language Technology Workshop (SLT), Miami, pp 313–317
Li Y, Lee KA, Yuan Y, Li H, Yang Z (2018) Many-to-many voice conversion based on bottleneck features with variational autoencoder for non-parallel training data. In: Asia–Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), Hawaii, pp 829–833
Ye H, Young S (2006) Quality-enhanced voice morphing using maximum likelihood transformations. IEEE Trans Audio Speech Lang Process 14(4):1301–1312. https://doi.org/10.1109/TSA.2005.860839
Kotani G, Saito D, Minematsu N (2017) Voice conversion based on deep neural networks for time-variant linear transformations. In: Asia–Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Kuala Lumpur, pp 1259–1262
Microsoft WebView2 web rendering (2020) [Online]. Available: https://docs.microsoft.com/en-us/microsoft-edge/webview2
Villavicencio F, Bonada J (2014) Voice conversion using deep neural networks with layer-wise generative training. IEEE/ACM Trans Audio Speech Lang Process 22:1859–1872. https://doi.org/10.1109/TASLP.2014.2353991
Fox J, Tang WY (2014) Sexism in online video games: the role of conformity to masculine norms and social dominance orientation. Comput Hum Behav 33:314–320. https://doi.org/10.1016/j.chb.2013.07.014
Villavicencio F, Bonada J (2010) Applying voice conversion to concatenative singing-voice synthesis. In: 11th Annual Conference of the International Speech Communication Association (INTERSPEECH), Chiba, pp 2162–2165
Liu L, Ling Z, Jiang Y, Zhou M, Dai L (2018) WaveNet Vocoder with limited training data for voice conversion. In: Proc. Interspeech 2018, pp 1983–1987
Midtlyng M, Sato Y (2016) Real-time voice adaptation with abstract normalization and sound-indexed based search. In: IEEE International Conference on Systems, Man, and Cybernetics (SMC), Budapest, pp 60–65
Sekii Y, Orihara R, Kojima K, Sei Y, Tahara Y, Ohsuga A (2017) Fast many-to-one voice conversion using autoencoders. In: International Conference on Agents and Artificial Intelligence (ICAART), Porto, pp 164–174
Kaneko T, Kameoka T, Tanaka K, Hojo N (2019) Cyclegan-VC2: improved cyclegan-based non-parallel voice conversion. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, pp 6820–6824
Erro D, Moreno A (2007) Weighted frequency warping for voice conversion. In: 8th Annual Conference of the International Speech Communication Association INTERSPEECH, Antwerp, pp 1965–1968
Stylianou Y, Cappé O, Moulines E (1998) Continuous probabilistic transform for voice conversion. IEEE Trans Speech Audio Process 1:285–288
Erro D, Navas E, Hernáez I (2013) Parametric voice conversion based on bilinear frequency warping plus amplitude scaling. IEEE Trans Audio Speech Lang Process 21(3):556–566. https://doi.org/10.1109/TASL.2012.2227735
Zhang Q, Li H (2007) MOEA/D: a multiobjective evolutionary algorithm based on decomposition. IEEE Trans Evol Comput 11(6):712–731. https://doi.org/10.1109/TEVC.2007.892759
Rideout V (2015) The common sense census: media use by tweens and teens. Analysis & Policy Observatory, Common Sense Media
Zhou K, Sisman B, Liu R, Li H (2021) Seen and unseen emotional style transfer for voice conversion with a new emotional speech dataset. In: ICASSP 2021—2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp 920–924
Midtlyng M, Sato Y (2018) Voice adaptation from mean dataset voice profile with dynamic power. In: IEEE International Conference on Systems, Man, and Cybernetics (SMC), Shizuoka, pp 2037–2042
Unity3D, Unity Technologies. Accessed on: January 1, 2019 [Online]. Available: https://unity.com
Chen Y, Chu M, Chang E, Liu J, Liu R (2003) Voice conversion with smoothed GMM and map adaptation. In: 8th European Conference on Speech Communication and Technology (Eurospeech 2003—Interspeech 2003), Geneva, pp 2413–2416
Tamura M, Morita M, Kagoshima T, Akamine M (2011) One sentence voice adaptation using GMM-based frequency warping and shift with a sub-band basis spectrum model. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, pp 5124–5127
Sato Y (2004) Voice quality conversion using interactive evolution of prosodic control. Appl Soft Comput J 5:181–192. https://doi.org/10.1016/j.asoc.2004.06.005
Moulines E, Sagisaka Y (1995) Voice conversion: state of the art and perspectives. Speech Commun 16(2):125–126 (Special Issue). https://doi.org/10.1016/0167-6393(95)90054-3
Eason Y, Stylianou (2009) Voice transformation: a survey. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, pp 3585–3588
Fang F, Yamagishi J, Echizen I, Lorenzo-Trueba J (2018) High-quality nonparallel voice conversion based on cycle-consistent adversarial network. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, pp 5279–5283
Abe M, Nakamura S, Shikano K, Kuwabara H (1988) Voice conversion through vector quantization. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 1, New York, pp 655–658
Wu Z, Virtanen T, Chng ES, Li H (2014) Exemplar-based sparse representation with residual compensation for voice conversion. IEEE Trans Audio Speech Lang Process 22(10):1506–1521. https://doi.org/10.1109/TASLP.2014.2333242
Hsu C-C, Hwang H-T, Wu Y-C, Tsao Y, Wang H-M (2016) Voice conversion from non-parallel corpora using variational auto-encoder. In: Asia–Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), Jeju, pp 1–6
Huang W-C, Hayashi T, Wu Y-C, Kameoka H, Toda T (2021) Pretraining techniques for sequence-to-sequence voice conversion. IEEE/ACM Trans Audio Speech Lang Process 29:745–755. https://doi.org/10.1109/TASLP.2021.3049336
Lorenzo-Trueba J, Yamagishi J, Toda T, Saito D, Villavicencio F, Kinnunen T et al. (2018) The voice conversion challenge 2018: promoting development of parallel and nonparallel methods, Odyssey 2018
Midtlyng M, Sato Y (2020) Lightweight multi-objective voice adaptation for real-time speech interaction applied in games. In: IEEE Conference on Games (CoG), Osaka, pp 237–243
Microsoft .NET5 SDK (2020) [Online]. Available: https://dotnet.microsoft.com/download/dotnet/current
Zhang M, Zhou Y, Zhao L, Li H (2021) Transfer learning from speech synthesis to voice conversion with non-parallel training data. IEEE/ACM Trans Audio Speech Lang Process 29:1290–1302. https://doi.org/10.1109/TASLP.2021.3066047
Toda T, Saruwatari H, Shikano K (2001) Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum. In: Proc. ICASSP, pp 841–844
Kain A, Macon MW (1998) Spectral voice conversion for text-to-speech synthesis. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seattle, pp 285–299
StartPage 1539
SubjectTerms Adaptation
Artificial neural networks
Color matching
Complexity
Computational Intelligence
Computer & video games
Data Structures and Information Theory
Engineering
Frames (data processing)
Intelligent systems
Linguistics
Methods
Multiple objective analysis
Neural networks
Optimization
Original Article
Privacy
Probabilistic models
Speech
Speech processing
Training
Voice
Title Voice adaptation by color-encoded frame matching as a multi-objective optimization problem for future games
URI https://link.springer.com/article/10.1007/s40747-021-00604-6
https://www.proquest.com/docview/2656974759
Volume 8