P2SGrad: Refined Gradients for Optimizing Deep Face Models

Bibliographic Details
Published in: Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 9898-9906
Main authors: Zhang, Xiao; Zhao, Rui; Yan, Junjie; Gao, Mengya; Qiao, Yu; Wang, Xiaogang; Li, Hongsheng
Format: Conference paper
Language: English
Published: IEEE, 01.06.2019
ISSN: 1063-6919
Description
Summary: Cosine-based softmax losses significantly improve the performance of deep face recognition networks. However, these losses always include sensitive hyper-parameters that can make the training process unstable, and it is very tricky to set suitable hyper-parameters for a specific dataset. This paper addresses this challenge by directly designing the gradients for training in an adaptive manner. We first investigate and unify previous cosine softmax losses from the perspective of gradients. This unified view inspires us to propose a novel gradient called P2SGrad (Probability-to-Similarity Gradient), which leverages cosine similarity instead of classification probability to control the gradients for updating neural network parameters. P2SGrad is adaptive and hyper-parameter-free, which makes the training process more efficient and faster. We evaluate our P2SGrad on three face recognition benchmarks: LFW, MegaFace, and IJB-C. The results show that P2SGrad is stable in training, robust to noise, and achieves state-of-the-art performance on all three benchmarks.
DOI: 10.1109/CVPR.2019.01014
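
The summary describes P2SGrad as replacing the classification probability with the cosine similarity in the gradient that flows through each class logit. Below is a minimal PyTorch sketch of that idea, not the authors' implementation; the names P2SGradHead and p2sgrad_loss are illustrative. It relies on the fact that the derivative of 0.5 * (cos(theta_j) - 1[j = y])^2 with respect to cos(theta_j) is exactly cos(theta_j) - 1[j = y], so a mean-squared-error surrogate between the cosine logits and the one-hot target produces the hyper-parameter-free gradient the summary describes.

import torch
import torch.nn as nn
import torch.nn.functional as F

class P2SGradHead(nn.Module):
    """Cosine-similarity head: each logit is cos(theta) between the feature
    and a class weight, with no scale or margin hyper-parameter."""
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(num_classes, feat_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # L2-normalize features and class weights so each logit lies in [-1, 1].
        return F.linear(F.normalize(x, dim=1), F.normalize(self.weight, dim=1))

def p2sgrad_loss(cos_theta: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # d/d cos_theta_j of 0.5 * (cos_theta_j - onehot_j)^2 = cos_theta_j - onehot_j,
    # i.e. the probability term of the usual softmax gradient replaced by a cosine.
    onehot = F.one_hot(labels, num_classes=cos_theta.size(1)).float()
    return 0.5 * ((cos_theta - onehot) ** 2).sum(dim=1).mean()

# Usage sketch: cos = P2SGradHead(512, 10000)(features)
#               loss = p2sgrad_loss(cos, labels); loss.backward()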