Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models

Bibliographic details
Published in: 2024 IEEE/ACM International Workshop on Natural Language-Based Software Engineering (NLBSE), pp. 73–80
Main author: Imran, Mia Mohammad
Format: Conference paper
Language: English
Publication details: ACM, 20 April 2024
Description
Summary: Emotion recognition in software engineering texts is critical for understanding developer expressions and improving collaboration. This paper presents a comparative analysis of state-of-the-art Pre-trained Language Models (PTMs) for fine-grained emotion classification on two benchmark datasets from GitHub and Stack Overflow. We evaluate six transformer models (BERT, RoBERTa, ALBERT, DeBERTa, CodeBERT, and GraphCodeBERT) against the current best-performing tool, SEntiMoji. Our analysis reveals consistent improvements ranging from 1.17% to 16.79% in macro-averaged and micro-averaged F1 scores, with general-domain models outperforming specialized ones. To further enhance the PTMs, we incorporate polarity features in the attention layer during training, demonstrating additional average gains of 1.0% to 10.23% over the baseline PTM approaches. Our work provides strong evidence for the advancements afforded by PTMs in recognizing nuanced emotions such as Anger, Love, Fear, Joy, Sadness, and Surprise in software engineering contexts. Through comprehensive benchmarking and error analysis, we also outline the scope for improvement in addressing contextual gaps.
DOI: 10.1145/3643787.3648034
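
As a rough sketch of the kind of setup the summary describes, the snippet below loads a pre-trained transformer and scores a developer comment against the six emotion labels, using the Hugging Face transformers API. The checkpoint (roberta-base), label order, and example text are illustrative assumptions; the paper's fine-tuning procedure, including the polarity features added to the attention layer, is not reproduced here.

    # Minimal sketch: emotion classification of a software engineering
    # comment with a pre-trained transformer. Assumes the Hugging Face
    # "transformers" and "torch" packages are installed.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    # Label set taken from the summary above; the order is an assumption.
    EMOTIONS = ["Anger", "Love", "Fear", "Joy", "Sadness", "Surprise"]

    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    # The classification head is randomly initialized here; meaningful
    # predictions require fine-tuning on labeled data, e.g. the GitHub
    # and Stack Overflow benchmarks the paper evaluates on.
    model = AutoModelForSequenceClassification.from_pretrained(
        "roberta-base", num_labels=len(EMOTIONS))

    # Score a single (hypothetical) developer comment.
    text = "This bug has been open for months and nobody seems to care."
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    print(EMOTIONS[logits.argmax(dim=-1).item()])

Swapping "roberta-base" for any of the other checkpoints compared in the paper (e.g. bert-base-uncased, microsoft/codebert-base) requires no other changes, which is what makes this kind of head-to-head PTM comparison straightforward.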