Search Results - Electrical Engineering and Systems Science - Audio and Speech Processing

1

Procurement Of DSP Enabled Evaluation Kit For Speech And Audio Signal Processing At The Department Of Electronics Electrical Communication Engineering Department, IIT

ISSN: 2219-0112

Published: Camden Disco Digital Media, Inc 21.06.2025

Published in MENA Report (21.06.2025)

Newsletter

2

PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model by Hono, Yukiya, Hashimoto, Kei, Nankaku, Yoshihiko, Tokuda, Keiichi

ISSN: 2379-190X

Published: IEEE 14.04.2024

Published in Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) (14.04.2024)
“… In practical applications, such as singing voice synthesis, there is a demand for neural vocoders to generate high-fidelity speech waveforms with flexible pitch control…”

Conference Proceeding

3

MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction by Zhou, Wangjin, Yang, Zhengdong, Chu, Chenhui, Li, Sheng, Dabre, Raj, Zhao, Yi, Kawahara, Tatsuya

ISSN: 2379-190X

Published: IEEE 14.04.2024

Published in Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) (14.04.2024)
“… This study extends the application of predicted MOS to the task of Fake Audio Detection (FAD) as we expect that MOS can be used to assess how close synthesized speech is to the natural human voice…”

Conference Proceeding

4

Bass Accompaniment Generation Via Latent Diffusion by Pasini, Marco, Grachten, Maarten, Lattner, Stefan

ISSN: 2379-190X

Published: IEEE 14.04.2024

Published in Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) (14.04.2024)
“… We present a novel controllable system for generating single stems to accompany musical mixes of arbitrary length…”

Conference Proceeding

5

Data Driven Grapheme-to-Phoneme Representations for a Lexicon-Free Text-to-Speech by Garg, Abhinav, Kim, Jiyeon, Khyalia, Sushil, Kim, Chanwoo, Gowda, Dhananjaya

ISSN: 2379-190X

Published: IEEE 14.04.2024

Published in Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) (14.04.2024)
“…Grapheme-to-Phoneme (G2P) is an essential first step in any modern, high-quality Text-to-Speech (TTS) system…”

Conference Proceeding

6

Localizing Acoustic Energy in Sound Field Synthesis by Directionally Weighted Exterior Radiation Suppression by Tomita, Yoshihide, Koyama, Shoichi, Saruwatari, Hiroshi

ISSN: 2379-190X

Published: IEEE 14.04.2024

Published in Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) (14.04.2024)
“… The exterior radiation from the loudspeakers in sound field synthesis systems can be problematic in practical situations…”

Conference Proceeding

7

Restoring speech intelligibility for hearing aid users with deep learning by Peter Udo Diehl, Yosef Singer, Hannes Zilly, Uwe Schönfeld, Paul Meyer-Rachner, Mark Berry, Henning Sprekeler, Elias Sprengel, Annett Pudszuhn, Veit M. Hofmann

ISSN: 2045-2322

Published: Springer Science and Business Media LLC 15.02.2023

Published in Scientific Reports (15.02.2023)

Journal Article

8

Versatile Time-Frequency Representations Realized by Convex Penalty on Magnitude Spectrogram by Keidai Arai, Koki Yamada, Kohei Yatabe

ISSN: 1070-9908, 1558-2361

Published: Institute of Electrical and Electronics Engineers (IEEE) 01.01.2023

Published in IEEE Signal Processing Letters (01.01.2023)

Journal Article

9

Expression-Preserving Face Frontalization Improves Visually Assisted Speech Processing by Zhiqi Kang, Mostafa Sadeghi, Radu Horaud, Xavier Alameda-Pineda

ISSN: 0920-5691, 1573-1405

Published: Springer Science and Business Media LLC 12.01.2023

Published in International Journal of Computer Vision (12.01.2023)

Journal Article

10

Generalized Domain Adaptation Framework for Parametric Back-End in Speaker Recognition by Qiongqiong Wang, Koji Okabe, Kong Aik Lee, Takafumi Koshinaka

ISSN: 1556-6013, 1556-6021

Published: Institute of Electrical and Electronics Engineers (IEEE) 01.01.2023

Published in IEEE Transactions on Information Forensics and Security (01.01.2023)

Journal Article

11

Investigation of Japanese PnG BERT Language Model in Text-to-Speech Synthesis for Pitch Accent Language by Yusuke Yasuda, Tomoki Toda

ISSN: 1932-4553, 1941-0484

Published: Institute of Electrical and Electronics Engineers (IEEE) 01.10.2022

Published in IEEE Journal of Selected Topics in Signal Processing (01.10.2022)

Journal Article

12

Learning and controlling the source-filter representation of speech with a variational autoencoder by Sadok, Samir, Leglaive, Simon, Girin, Laurent, Alameda-Pineda, Xavier, Seguier, Renaud

ISSN: 0167-6393

Published: Elsevier BV 01.03.2023

Published in Speech Communication (01.03.2023)

Journal Article

13

CASA-based speaker identification using cascaded GMM-CNN classifier in noisy and emotional talking conditions by Nawel Nemmour, Keikichi Hirose, Shibani Hamsa, Ismail Shahin, Ali Bou Nassif

ISSN: 1568-4946

Published: Elsevier BV 01.05.2021

Published in Applied Soft Computing (01.05.2021)

Journal Article

14

GEDI: Gammachirp envelope distortion index for predicting intelligibility of enhanced speech by Shoko Araki, Keisuke Kinoshita, Katsuhiko Yamamoto, Toshio Irino, Tomohiro Nakatani

ISSN: 0167-6393

Published: Elsevier BV 01.10.2020

Published in Speech Communication (01.10.2020)

Journal Article

15

A Large-Scale Evaluation of Speech Foundation Models by Shu-wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee

ISSN: 2329-9290, 2329-9304

Published: Institute of Electrical and Electronics Engineers (IEEE) 01.01.2024

Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (01.01.2024)

Journal Article

16

ELP-Adapters: Parameter Efficient Adapter Tuning for Various Speech Processing Tasks by Nakamasa Inoue, Shinta Otake, Takumi Hirose, Masanari Ohi, Rei Kawakami

ISSN: 2329-9290, 2329-9304

Published: Institute of Electrical and Electronics Engineers (IEEE) 01.01.2024

Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (01.01.2024)

Journal Article

17

RISC: A Corpus for Shout Type Classification and Shout Intensity Prediction by Takahiro Fukumori, Taito Ishida, Yoichi Yamashita

ISSN: 2329-9290, 2329-9304

Published: Institute of Electrical and Electronics Engineers (IEEE) 01.01.2024

Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (01.01.2024)

Journal Article

18

The VoicePrivacy 2022 Challenge: Progress and Perspectives in Voice Anonymisation by Michele Panariello, Natalia Tomashenko, Xin Wang, Xiaoxiao Miao, Pierre Champion, Hubert Nourtel, Massimiliano Todisco, Nicholas Evans, Emmanuel Vincent, Junichi Yamagishi

ISSN: 2329-9290, 2329-9304

Published: Institute of Electrical and Electronics Engineers (IEEE) 01.01.2024

Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (01.01.2024)

Journal Article

19

FastMVAE2: On Improving and Accelerating the Fast Variational Autoencoder-Based Source Separation Algorithm for Determined Mixtures by Li Li, Hirokazu Kameoka, Shoji Makino

ISSN: 2329-9290, 2329-9304

Published: Institute of Electrical and Electronics Engineers (IEEE) 01.01.2023

Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (01.01.2023)

Journal Article

20

Decoupling Speaker-Independent Emotions for Voice Conversion via Source-Filter Networks by Zhaojie Luo, Shoufeng Lin, Rui Liu, Jun Baba, Yuichiro Yoshikawa, Hiroshi Ishiguro

ISSN: 2329-9290, 2329-9304

Published: Institute of Electrical and Electronics Engineers (IEEE) 01.01.2023

Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (01.01.2023)

Journal Article
