FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-Pose, and Facial Expression Features
The task of face reenactment is to transfer the head motion and facial expressions from a driving video to the appearance of a source image, which may be of a different person (cross-reenactment). Most existing methods are CNN-based and estimate optical flow from the source image to the current driv...
| Published in: | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 7716-7726 |
|---|---|
| Main authors: | Rochow, Andre; Schwarz, Max; Behnke, Sven |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 16.06.2024 |
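| Subject: | Animation; Face recognition; Face Reenactment; Facial Animation; Rendering (computer graphics); Shape; Training; Transformers; Vectors |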
| ISSN: | 1063-6919 |
| Online access: | Get full text |
| Abstract | The task of face reenactment is to transfer the head motion and facial expressions from a driving video to the appearance of a source image, which may be of a different person (cross-reenactment). Most existing methods are CNN-based and estimate optical flow from the source image to the current driving frame, which is then inpainted and refined to produce the output animation. We propose a transformer-based encoder for computing a set-latent representation of the source image(s). We then predict the output color of a query pixel using a transformer-based decoder, which is conditioned with keypoints and a facial expression vector extracted from the driving frame. Latent representations of the source person are learned in a self-supervised manner that factorize their appearance, head pose, and facial expressions. Thus, they are perfectly suited for cross-reenactment. In contrast to most related work, our method naturally extends to multiple source images and can thus adapt to person-specific facial dynamics. We also propose data augmentation and regularization schemes that are necessary to prevent overfitting and support generalizability of the learned representations. We evaluated our approach in a randomized user study. The results indicate superior performance compared to the state-of-the-art in terms of motion transfer quality and temporal consistency. Code & Video: https://andrerochow.github.io/fsrt |
|---|---|
| Author | Schwarz, Max; Behnke, Sven; Rochow, Andre |
| Author_xml | 1. Andre Rochow (rochow@ais.uni-bonn.de), University of Bonn; 2. Max Schwarz (schwarz@ais.uni-bonn.de), University of Bonn; 3. Sven Behnke (behnke@cs.uni-bonn.de), University of Bonn |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IH CBEJK RIE RIO |
| DOI | 10.1109/CVPR52733.2024.00737 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Applied Sciences |
| EISBN | 9798350353006 |
| EISSN | 1063-6919 |
| EndPage | 7726 |
| ExternalDocumentID | 10655357 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IH 6IL 6IN AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IJVOP OCL RIE RIL RIO |
| ID | FETCH-LOGICAL-i176t-1b4823e723e873a94e489a9cc42cda5922bc670cbeb2df1dd254d843d5dcb51a3 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 12 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001322555908012&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| IngestDate | Wed Aug 27 02:00:51 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-i176t-1b4823e723e873a94e489a9cc42cda5922bc670cbeb2df1dd254d843d5dcb51a3 |
| PageCount | 11 |
| ParticipantIDs | ieee_primary_10655357 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-June-16 |
| PublicationDateYYYYMMDD | 2024-06-16 |
| PublicationDate_xml | – month: 06 year: 2024 text: 2024-June-16 day: 16 |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) |
| PublicationTitleAbbrev | CVPR |
| PublicationYear | 2024 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssj0003211698 |
| Score | 2.3341343 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 7716 |
| SubjectTerms | Animation; Face recognition; Face Reenactment; Facial Animation; Rendering (computer graphics); Shape; Training; Transformers; Vectors |
| Title | FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-Pose, and Facial Expression Features |
| URI | https://ieeexplore.ieee.org/document/10655357 |
| WOSCitedRecordID | wos001322555908012&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| linkProvider | IEEE |
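
The abstract describes decoding the color of a query pixel from a set-latent representation of the source image(s), conditioned on keypoints and a facial expression vector from the driving frame. The following is a minimal illustrative sketch of that idea, not the authors' implementation: it assumes a single attention head, uses the latent tokens as both keys and values, and replaces all learned weights with random matrices. Every name here (`decode_pixel`, `w_query`, `w_rgb`, the dimensions) is hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Stand-in for learned weights (assumption: random, untrained)."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def decode_pixel(query_xy, keypoints, expression, set_latent, params):
    """Predict an RGB value for one query pixel.

    The query is built from the pixel position plus the driving-frame
    keypoints and expression vector, then cross-attends over the source
    set-latent tokens (simplification: tokens serve as keys and values).
    """
    query_in = list(query_xy) + [c for kp in keypoints for c in kp] + list(expression)
    q = matvec(params["w_query"], query_in)
    dim = len(q)
    scores = [sum(a * b for a, b in zip(q, tok)) / math.sqrt(dim) for tok in set_latent]
    weights = softmax(scores)
    attended = [sum(w * tok[i] for w, tok in zip(weights, set_latent)) for i in range(dim)]
    # Sigmoid keeps each color channel in (0, 1).
    rgb = [1.0 / (1.0 + math.exp(-v)) for v in matvec(params["w_rgb"], attended)]
    return rgb, weights


rng = random.Random(0)
LATENT_DIM, NUM_TOKENS, NUM_KEYPOINTS, EXPR_DIM = 8, 5, 4, 3  # hypothetical sizes

# Hypothetical inputs: source tokens from an encoder, driving keypoints and expression.
set_latent = [[rng.uniform(-1, 1) for _ in range(LATENT_DIM)] for _ in range(NUM_TOKENS)]
keypoints = [(rng.random(), rng.random()) for _ in range(NUM_KEYPOINTS)]
expression = [rng.uniform(-1, 1) for _ in range(EXPR_DIM)]

params = {
    "w_query": random_matrix(LATENT_DIM, 2 + 2 * NUM_KEYPOINTS + EXPR_DIM, rng),
    "w_rgb": random_matrix(3, LATENT_DIM, rng),
}

rgb, weights = decode_pixel((0.25, 0.75), keypoints, expression, set_latent, params)
```

Because the decoder attends over a set of tokens, tokens from several source images could simply be concatenated into `set_latent`, which is one way to read the abstract's remark that the method extends naturally to multiple source images.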