MFDAN: Multi-Level Flow-Driven Attention Network for Micro-Expression Recognition

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 34, no. 12, pp. 12823-12836
Main Authors: Cai, Wenhao; Zhao, Junli; Yi, Ran; Yu, Minjing; Duan, Fuqing; Pan, Zhenkuan; Liu, Yong-Jin
Format: Journal Article
Language: English
Published: New York IEEE 01.12.2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1051-8215, 1558-2205
Description
Summary: Facial expressions are an essential part of human emotional communication, and micro-expressions (MEs), as transient and barely perceptible non-verbal signals, can potentially reveal genuine human emotions. However, subtle motion variations and limited, imbalanced samples make micro-expression recognition (MER) challenging. In this paper, we design a novel dual-branch learning framework, the multi-level flow-driven attention network (MFDAN), which integrates an optical flow prior to guide attention learning in the image encoding branch, enabling the model to focus on the facial regions that are most discriminative for subtle motion patterns. First, we extract optical flow information with an optical flow encoding module. Then, in the image encoding module, we construct a Transformer structure containing an optical flow-driven attention mechanism, which locates the regions of interest for micro-expressions in the image according to the position information of the optical flow and thereby captures more sensitive and fine-grained micro-expression features. By combining prior knowledge with data-driven learning, and by introducing the DropKey operation and Focal Loss, our method can handle subtle micro-expression features on small, imbalanced datasets. In extensive experiments on three independent datasets (SMIC-HS, SAMM, and CASME II) and a composite database, robust leave-one-subject-out (LOSO) evaluation results show that our method outperforms state-of-the-art methods, especially on the composite database.
DOI: 10.1109/TCSVT.2024.3437481
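
Illustrative note: the summary names Focal Loss and leave-one-subject-out (LOSO) evaluation but gives no formulas or settings. The sketch below shows the standard form of both ideas in plain Python. It is not the authors' implementation: the function names, the `gamma=2.0` default, and the optional per-class `alpha` weights are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Standard focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are assumed values, not settings taken from the paper.
    With gamma=0 and alpha=None this reduces to ordinary cross-entropy.
    """
    # Clamp to avoid log(0) if the target probability underflows.
    p_t = max(softmax(logits)[target], 1e-12)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject.

    Every sample of the held-out subject goes to the test set, so no subject
    appears in both the training and the test data of a fold.
    """
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)
    for subject in sorted(by_subject):
        test_idx = by_subject[subject]
        train_idx = sorted(
            i for s, idxs in by_subject.items() if s != subject for i in idxs
        )
        yield subject, train_idx, test_idx


if __name__ == "__main__":
    # A confidently correct sample is down-weighted relative to cross-entropy.
    print(focal_loss([5.0, 0.0], target=0, gamma=2.0))
    print(focal_loss([5.0, 0.0], target=0, gamma=0.0))
    # Hypothetical subject labels, one per video sample.
    for fold in loso_splits(["s1", "s1", "s2", "s3", "s2"]):
        print(fold)
```

The `(1 - p_t)**gamma` factor shrinks the loss of well-classified samples so that training concentrates on hard ones, which is the usual reason for choosing focal loss on imbalanced data. Splitting by subject rather than by sample keeps one person's recordings from appearing in both the training and test sets of a fold.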