FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-Pose, and Facial Expression Features

Bibliographic details
Published in: Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 7716-7726
Main authors: Rochow, Andre; Schwarz, Max; Behnke, Sven
Format: Conference paper
Language: English
Publication details: IEEE, 16 June 2024
ISSN: 1063-6919
Online access: Get full text
Abstract The task of face reenactment is to transfer the head motion and facial expressions from a driving video to the appearance of a source image, which may be of a different person (cross-reenactment). Most existing methods are CNN-based and estimate optical flow from the source image to the current driving frame, which is then inpainted and refined to produce the output animation. We propose a transformer-based encoder for computing a set-latent representation of the source image(s). We then predict the output color of a query pixel using a transformer-based decoder, which is conditioned with keypoints and a facial expression vector extracted from the driving frame. Latent representations of the source person are learned in a self-supervised manner that factorize their appearance, head pose, and facial expressions. Thus, they are perfectly suited for cross-reenactment. In contrast to most related work, our method naturally extends to multiple source images and can thus adapt to person-specific facial dynamics. We also propose data augmentation and regularization schemes that are necessary to prevent overfitting and support generalizability of the learned representations. We evaluated our approach in a randomized user study. The results indicate superior performance compared to the state-of-the-art in terms of motion transfer quality and temporal consistency. Code & Video: https://andrerochow.github.io/fsrt
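The decoding step described in the abstract (a query pixel attends to a set-latent representation of the source, conditioned on driving keypoints and an expression vector) can be illustrated with a minimal sketch. This is not the authors' implementation: the single attention head, the random weights, all dimensions, and every name below (`decode_pixel`, `w_query`, `w_rgb`) are assumptions chosen only to show the data flow.

```python
import math
import random

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(matrix, vec):
    return [dot(row, vec) for row in matrix]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def decode_pixel(query_xy, keypoints, expression, latents, params):
    """Predict an RGB color for one query pixel (illustrative only)."""
    # Build the query from the pixel position plus driving-frame conditioning.
    flat_kp = [c for kp in keypoints for c in kp]
    cond = list(query_xy) + flat_kp + list(expression)
    q = matvec(params["w_query"], cond)
    # Single-head cross-attention over the source's set-latent tokens.
    scale = math.sqrt(len(q))
    weights = softmax([dot(q, tok) / scale for tok in latents])
    attended = [sum(w * tok[i] for w, tok in zip(weights, latents))
                for i in range(len(q))]
    # Linear head with a sigmoid so each channel lies in (0, 1).
    return [1.0 / (1.0 + math.exp(-v)) for v in matvec(params["w_rgb"], attended)]

# Hypothetical sizes; the random latents stand in for a real encoder's output.
rng = random.Random(0)
latent_dim, num_tokens, num_kp, expr_dim = 8, 5, 3, 4
latents = random_matrix(num_tokens, latent_dim, rng)
cond_dim = 2 + 2 * num_kp + expr_dim
params = {
    "w_query": random_matrix(latent_dim, cond_dim, rng),
    "w_rgb": random_matrix(3, latent_dim, rng),
}
keypoints = [(0.1, 0.2), (0.5, 0.5), (0.8, 0.3)]
expression = [0.0, 0.3, -0.2, 0.1]
rgb = decode_pixel((0.25, 0.75), keypoints, expression, latents, params)
```

Because attention treats the latents as an unordered set, tokens from additional source images could simply be appended to `latents`, which is one way to read the abstract's remark that the method extends to multiple source images.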
Author Rochow, Andre (University of Bonn, rochow@ais.uni-bonn.de)
Schwarz, Max (University of Bonn, schwarz@ais.uni-bonn.de)
Behnke, Sven (University of Bonn, behnke@cs.uni-bonn.de)
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/CVPR52733.2024.00737
Discipline Applied Sciences
EISBN 9798350353006
EISSN 1063-6919
EndPage 7726
ExternalDocumentID 10655357
Genre orig-research
ISICitedReferencesCount 12
IsPeerReviewed false
IsScholarly true
Language English
PageCount 11
PublicationCentury 2000
PublicationDate 2024-June-16
PublicationDateYYYYMMDD 2024-06-16
PublicationDecade 2020
PublicationTitle Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)
PublicationTitleAbbrev CVPR
PublicationYear 2024
Publisher IEEE
StartPage 7716
SubjectTerms Animation
Face recognition
Face Reenactment
Facial Animation
Rendering (computer graphics)
Shape
Training
Transformers
Vectors
Title FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-Pose, and Facial Expression Features
URI https://ieeexplore.ieee.org/document/10655357