Multimodal information fusion using the iterative decoding algorithm and its application to audio-visual speech recognition

The fusion of information from heterogeneous sensors is crucial to the effectiveness of a multimodal system. Noise affects the sensors of different modalities independently. A good fusion scheme should be able to use local estimates of the reliability of each modality to weight the decisions. This pap...

Bibliographic details
Published in: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2241-2244
Main authors: Shivappa, S.T., Rao, B.D., Trivedi, M.M.
Format: Conference paper
Language: English
Published: IEEE, 1 March 2008
ISBN:9781424414833, 1424414830
ISSN:1520-6149
Abstract The fusion of information from heterogeneous sensors is crucial to the effectiveness of a multimodal system. Noise affects the sensors of different modalities independently. A good fusion scheme should be able to use local estimates of the reliability of each modality to weight the decisions. This paper presents an information fusion scheme based on iterative decoding and motivated by the theory of turbo codes. The fusion framework is developed in the context of hidden Markov models. We present the mathematical framework of the fusion scheme, then apply the algorithm to an audio-visual speech recognition task on the GRID audio-visual speech corpus and present the results.
Author Trivedi, M.M.
Rao, B.D.
Shivappa, S.T.
Author_xml – sequence: 1
  givenname: S.T.
  surname: Shivappa
  fullname: Shivappa, S.T.
  organization: Dept. of Electr. & Comput. Eng., Univ. of California, La Jolla, CA
– sequence: 2
  givenname: B.D.
  surname: Rao
  fullname: Rao, B.D.
  organization: Dept. of Electr. & Comput. Eng., Univ. of California, La Jolla, CA
– sequence: 3
  givenname: M.M.
  surname: Trivedi
  fullname: Trivedi, M.M.
  organization: Dept. of Electr. & Comput. Eng., Univ. of California, La Jolla, CA
ContentType Conference Proceeding
DOI 10.1109/ICASSP.2008.4518091
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISBN 1424414849
9781424414840
EndPage 2244
ExternalDocumentID 4518091
Genre orig-research
ISBN 9781424414833
1424414830
ISICitedReferencesCount 13
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000257456701222&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1520-6149
IngestDate Wed Aug 27 02:03:22 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i175t-c4f4346c8f4aada9cdc735a587a89409bdd91afb3ea4ec129ea3330cd3f3fece3
PageCount 4
ParticipantIDs ieee_primary_4518091
PublicationCentury 2000
PublicationDate 2008-March
PublicationDateYYYYMMDD 2008-03-01
PublicationDate_xml – month: 03
  year: 2008
  text: 2008-March
PublicationDecade 2000
PublicationTitle 2008 IEEE International Conference on Acoustics, Speech and Signal Processing
PublicationTitleAbbrev ICASSP
PublicationYear 2008
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0000453595
ssj0008748
Score 1.7884533
Snippet The fusion of information from heterogeneous sensors is crucial to the effectiveness of a multimodal system. Noise affects the sensors of different modalities...
SourceID ieee
SourceType Publisher
StartPage 2241
SubjectTerms Acoustic noise
Application software
Drives
Feature extraction
Hidden Markov models
Iterative algorithms
Iterative decoding
Multimedia systems
Multimodal sensors
Robustness
Sensor fusion
Speech recognition
Title Multimodal information fusion using the iterative decoding algorithm and its application to audio-visual speech recognition
URI https://ieeexplore.ieee.org/document/4518091