Beam-search SIEVE for low-memory speech recognition
| Title: | Beam-search SIEVE for low-memory speech recognition |
|---|---|
| Authors: | Ciaperoni, Martino; Katsamanis, Athanasios; Gionis, Aristides; Karras, Panagiotis |
| Source: | Ciaperoni, M, Katsamanis, A, Gionis, A & Karras, P 2024, Beam-search SIEVE for low-memory speech recognition. In Interspeech 2024 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. International Speech Communication Association (ISCA), pp. 272-276, 25th Interspeech Conference 2024, Kos Island, Greece, 01/09/2024. https://doi.org/10.21437/Interspeech.2024-2457 |
| Publisher Information: | International Speech Communication Association (ISCA) |
| Publication Year: | 2024 |
| Collection: | University of Copenhagen: Research / Forskning ved Københavns Universitet |
| Subject Terms: | memory efficient algorithms, speech recognition |
| Description: | A capacity to recognize speech offline eliminates privacy concerns and the need for an internet connection. Despite efforts to reduce the memory demands of speech recognition systems, these demands remain formidable, and thus popular tools such as Kaldi run best via cloud computing. The key bottleneck arises from the fact that a bedrock of such tools, the Viterbi algorithm, requires memory that grows linearly with utterance length even when contained via beam search. A recent recasting of the Viterbi algorithm, SIEVE, eliminates the path length factor from space complexity, but with a significant practical runtime overhead. In this paper, we develop a variant of SIEVE that lessens this runtime overhead via beam search, retains the decoding quality of standard beam search, and waives its linearly growing memory bottleneck. This space-complexity reduction is orthogonal to decoding quality and complementary to memory savings in model representation and training. |
| Document Type: | article in journal/newspaper |
| File Description: | application/pdf |
| Language: | English |
| DOI: | 10.21437/Interspeech.2024-2457 |
| Availability: | https://researchprofiles.ku.dk/da/publications/156cb9a9-01b9-4e94-83b9-de10f68ddef7 https://doi.org/10.21437/Interspeech.2024-2457 https://curis.ku.dk/ws/files/448640926/Beam-search_SIEVE_for_low-memory_speech_recognition.pdf |
| Rights: | info:eu-repo/semantics/openAccess |
| Accession Number: | edsbas.77EDFCE |
| Database: | BASE |
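
The memory bottleneck named in the description can be illustrated with a minimal sketch of standard beam-search Viterbi decoding. This is not SIEVE and not the authors' implementation; the toy two-state model and every name in it (`beam_viterbi`, `log_init`, `log_trans`, `log_emit`) are assumptions made for illustration only. The point is that the list of backpointers needed for the final traceback gains one entry per frame, so it grows linearly with utterance length even though the beam bounds the number of hypotheses kept per frame.

```python
import math


def beam_viterbi(obs, states, log_init, log_trans, log_emit, beam_width):
    """Standard Viterbi decoding with beam pruning (illustrative, not SIEVE).

    Returns the decoded state path and the number of backpointers stored.
    """
    # Scores of the hypotheses that survive pruning at the current frame.
    scores = {s: log_init[s] + log_emit[s][obs[0]] for s in states}
    kept = sorted(scores, key=scores.get, reverse=True)[:beam_width]
    frontier = {s: scores[s] for s in kept}
    backpointers = []  # one dict per frame: this grows with len(obs)

    for o in obs[1:]:
        scores, pointers = {}, {}
        for prev, prev_score in frontier.items():
            for s in states:
                cand = prev_score + log_trans[prev][s] + log_emit[s][o]
                if s not in scores or cand > scores[s]:
                    scores[s], pointers[s] = cand, prev
        # Beam pruning: keep only the best `beam_width` states.
        kept = sorted(scores, key=scores.get, reverse=True)[:beam_width]
        frontier = {s: scores[s] for s in kept}
        backpointers.append({s: pointers[s] for s in kept})

    # Trace back from the best final state through the stored backpointers.
    state = max(frontier, key=frontier.get)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, sum(len(p) for p in backpointers)


# Hypothetical toy model: state "x" mostly emits 0, state "y" mostly emits 1.
states = ["x", "y"]
log_init = {s: math.log(0.5) for s in states}
log_trans = {p: {s: math.log(0.5) for s in states} for p in states}
log_emit = {
    "x": {0: math.log(0.9), 1: math.log(0.1)},
    "y": {0: math.log(0.1), 1: math.log(0.9)},
}

obs = [0, 1, 1, 0]
path, stored = beam_viterbi(obs, states, log_init, log_trans, log_emit, beam_width=2)
long_path, long_stored = beam_viterbi(obs * 2, states, log_init, log_trans, log_emit, beam_width=2)
print(path, stored, long_stored)
```

In this sketch the stored backpointer count is `beam_width * (len(obs) - 1)`, so doubling the utterance length roughly doubles the traceback memory. According to the description, SIEVE removes this path-length factor from the space complexity, and the paper's variant combines that with beam search.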