Time-Lag Aware Multi-Modal Variational Autoencoder Using Baseball Videos And Tweets For Prediction Of Important Scenes

Bibliographic details
Published in: Proceedings - International Conference on Image Processing, pp. 2678-2682
Main authors: Hirasawa, Kaito; Maeda, Keisuke; Ogawa, Takahiro; Haseyama, Miki
Format: Conference paper
Language: English
Publication details: IEEE, 2021-01-01
ISSN: 2381-8549
Abstract This paper presents a novel method for predicting important scenes from baseball videos and tweets posted on Twitter, based on a time-lag aware multi-modal variational autoencoder (TI-MVAE-PIS). The paper makes two technical contributions. First, to use heterogeneous data effectively for important scene prediction, textual, visual and audio features obtained from tweets and videos are transformed into latent features, so that TI-MVAE-PIS can flexibly express the relationships among them in the constructed latent space. Second, since there are time-lags between tweets and the corresponding multiple previous events, TI-MVAE-PIS considers these time-lags when estimating their relationships, so that their latent features are derived successfully. Together, these two contributions enable accurate important scene prediction. Experiments using actual baseball videos and their corresponding tweets show the effectiveness of TI-MVAE-PIS.
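The abstract does not say how the time-lags are modelled. As a rough illustration only, the idea that a tweet relates to multiple previous events can be sketched as a weighting over the events that precede it. The exponential-decay kernel, the function names, and the `max_lag` and `decay` parameters below are all assumptions made for this sketch and are not taken from the paper.

```python
import math


def lag_weights(tweet_time, event_times, max_lag=120.0, decay=30.0):
    """Weight previous events by how recently they preceded a tweet.

    Assumption: an exponential-decay kernel over the lag in seconds.
    The record does not describe how TI-MVAE-PIS models time-lags.
    """
    raw = []
    for t in event_times:
        lag = tweet_time - t
        # Keep only events that happened before the tweet, within max_lag.
        if 0.0 <= lag <= max_lag:
            raw.append(math.exp(-lag / decay))
        else:
            raw.append(0.0)
    total = sum(raw)
    if total == 0.0:
        return raw  # no eligible previous event: all weights stay zero
    return [w / total for w in raw]


def lag_aware_feature(tweet_time, event_times, event_features, **kwargs):
    """Weighted average of per-event feature vectors for one tweet."""
    weights = lag_weights(tweet_time, event_times, **kwargs)
    dim = len(event_features[0])
    return [
        sum(w * f[d] for w, f in zip(weights, event_features))
        for d in range(dim)
    ]
```

In this sketch a tweet posted at second 100 would draw most of its context from an event at second 90, less from one at second 70, and nothing from an event that occurs after the tweet. A learned model could replace the fixed kernel with trainable weights.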
Author_xml – sequence: 1
  givenname: Kaito
  surname: Hirasawa
  fullname: Hirasawa, Kaito
  email: hirasawa@lmd.ist.hokudai.ac.jp
  organization: Hokkaido University, Graduate School of Information Science and Technology
– sequence: 2
  givenname: Keisuke
  surname: Maeda
  fullname: Maeda, Keisuke
  email: maeda@lmd.ist.hokudai.ac.jp
  organization: Hokkaido University, Office of Institutional Research
– sequence: 3
  givenname: Takahiro
  surname: Ogawa
  fullname: Ogawa, Takahiro
  email: ogawa@lmd.ist.hokudai.ac.jp
  organization: Hokkaido University, Faculty of Information Science and Technology
– sequence: 4
  givenname: Miki
  surname: Haseyama
  fullname: Haseyama, Miki
  email: miki@ist.hokudai.ac.jp
  organization: Hokkaido University, Faculty of Information Science and Technology
ContentType Conference Proceeding
DOI 10.1109/ICIP42928.2021.9506496
Discipline Applied Sciences
EISBN 1665441151
9781665441155
EISSN 2381-8549
EndPage 2682
ExternalDocumentID 9506496
Genre orig-research
ISICitedReferencesCount 0
IsPeerReviewed false
IsScholarly true
Language English
PageCount 5
PublicationCentury 2000
PublicationDate 2021-01-01
PublicationDateYYYYMMDD 2021-01-01
PublicationDate_xml – month: 01
  year: 2021
  text: 2021-01-01
  day: 01
PublicationDecade 2020
PublicationTitle Proceedings - International Conference on Image Processing
PublicationTitleAbbrev ICIP
PublicationYear 2021
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 2678
SubjectTerms Blogs
Conferences
Estimation
Image processing
important scene prediction
Multimodal variational autoencoder
Social networking (online)
sports video
time-lag
Transforms
Twitter
Visualization
Title Time-Lag Aware Multi-Modal Variational Autoencoder Using Baseball Videos And Tweets For Prediction Of Important Scenes
URI https://ieeexplore.ieee.org/document/9506496