Directly training temporal Spiking Neural Network with sparse surrogate gradient



Bibliographic details
Published in:	Neural Networks, Vol. 179, p. 106499
Main authors:	Li, Yang; Zhao, Feifei; Zhao, Dongcheng; Zeng, Yi
Format:	Journal Article
Language:	English
Published:	United States, Elsevier Ltd, 1 November 2024
Subjects:	Action Potentials - physiology; Algorithms; Brain - physiology; Direct Training; Humans; Models, Neurological; Neural Networks, Computer; Neurons - physiology; Sparse Surrogate Gradient; Spiking Neural Network; Temporally Weighted Output
ISSN:	0893-6080, 1879-2782

Description
Summary:	Brain-inspired Spiking Neural Networks (SNNs) have attracted much attention due to their event-based computing and energy-efficient features. However, the all-or-none nature of spiking has prevented direct training of SNNs for various applications. The surrogate gradient (SG) algorithm has recently enabled spiking neural networks to shine in neuromorphic hardware. However, introducing surrogate gradients has caused SNNs to lose their original sparsity, thus leading to potential performance loss. In this paper, we first analyze the current problem of direct training using SGs and then propose Masked Surrogate Gradients (MSGs) to balance the effectiveness of training and the sparseness of the gradient, thereby improving the generalization ability of SNNs. Moreover, we introduce a temporally weighted output (TWO) method to decode the network output, reinforcing the importance of correct timesteps. Extensive experiments on diverse network structures and datasets show that training with MSG and TWO surpasses the state-of-the-art (SOTA) technique.
DOI:	10.1016/j.neunet.2024.106499
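
Illustrative sketch:	The summary names three ideas: a surrogate gradient for the non-differentiable spike, a mask that makes that gradient sparse, and a weighted decoding of the output over timesteps. The NumPy sketch below shows one possible reading of each. It is not the authors' implementation. This record does not give the paper's definitions, so the rectangular surrogate shape, the random mask with a `keep_prob` parameter, the fixed timestep weights, and all function names are assumptions made for illustration only.

```python
import numpy as np


def spike_forward(v, threshold=1.0):
    """All-or-none spike: 1.0 where the membrane potential reaches threshold."""
    return (v >= threshold).astype(np.float64)


def surrogate_grad(v, threshold=1.0, width=1.0):
    """Rectangular stand-in for d(spike)/dv (assumed shape, one common choice).

    Nonzero only within width/2 of the threshold.
    """
    return (np.abs(v - threshold) < width / 2).astype(np.float64) / width


def masked_surrogate_grad(v, keep_prob, rng, threshold=1.0, width=1.0):
    """Hypothetical masking rule: keep each gradient entry with prob. keep_prob.

    The paper's actual masking scheme may differ; this only shows the idea
    of trading gradient density for sparsity.
    """
    mask = rng.random(v.shape) < keep_prob
    return surrogate_grad(v, threshold, width) * mask


def temporally_weighted_output(outputs, weights):
    """Decode a (T, num_classes) output with per-timestep weights of shape (T,).

    The weights are normalized to sum to 1. How the paper chooses them is
    not stated in this record, so they are plain inputs here.
    """
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()
    return np.tensordot(w, np.asarray(outputs, dtype=np.float64), axes=(0, 0))
```

With `keep_prob=1.0` the masked gradient equals the dense surrogate gradient, and with `keep_prob=0.0` it is zero everywhere, so the parameter spans the range between dense training signals and fully sparse ones. Equal weights in `temporally_weighted_output` reduce it to the plain average over timesteps.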