Naturalness of Attention: Revisiting Attention in Code Language Models


Bibliographic Details
Published in: IEEE/ACM International Conference on Software Engineering: New Ideas and Emerging Results (Online), pp. 107-111
Main Authors: Saad, Mootez; Sharma, Tushar
Format: Conference paper
Language: English
Published: ACM, 14 April 2024
ISSN: 2832-7632
Description
Summary: Language models for code, such as CodeBERT, can learn rich representations of source code, but their opacity hinders understanding of the properties they capture. Recent attention-analysis studies provide initial interpretability insights but focus solely on attention weights rather than the wider context modeling of Transformers. This study sheds light on previously ignored factors of the attention mechanism beyond the attention weights. We conduct an initial empirical study analyzing both attention distributions and transformed representations in CodeBERT. Across two programming languages, Java and Python, we find that the scaled transformation norms of the input capture syntactic structure better than attention weights alone. Our analysis characterizes how CodeBERT embeds syntactic code properties. The findings demonstrate the importance of incorporating factors beyond attention weights for rigorously understanding neural code models, and lay the groundwork for more interpretable models and more effective uses of attention mechanisms in program analysis.
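As a rough illustration of the kind of analysis the abstract describes (not the authors' released code), the sketch below loads the HuggingFace "microsoft/codebert-base" checkpoint, reads the attention weights of one layer, and scales them by the norm of each token's value vector as a simplified proxy for a norm-scaled attention score. The checkpoint name, layer index, and input snippet are illustrative assumptions, and the full formulation would also fold in the per-head output projection.

```python
# Illustrative sketch only (not the authors' code): compare raw attention
# weights with value-norm-scaled scores for one CodeBERT layer.
# Assumes the HuggingFace checkpoint "microsoft/codebert-base" (RoBERTa-based);
# the layer index and input snippet are arbitrary choices.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")
model.eval()

code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_attentions=True, output_hidden_states=True)

layer_idx = 5                                 # arbitrary example layer
alpha = outputs.attentions[layer_idx]         # (1, heads, seq, seq) attention weights
x = outputs.hidden_states[layer_idx]          # (1, seq, hidden) input to that layer

layer = model.encoder.layer[layer_idx]
num_heads = model.config.num_attention_heads
head_dim = model.config.hidden_size // num_heads

# Per-head value vectors v_j = x_j W_V + b_V, reshaped to (1, heads, seq, head_dim).
v = layer.attention.self.value(x)
v = v.view(1, -1, num_heads, head_dim).permute(0, 2, 1, 3)

# Scale each attention weight by the norm of the value vector it mixes in:
# score_ij = alpha_ij * ||v_j||, a simplified stand-in for the transformation norm.
value_norms = v.norm(dim=-1)                  # (1, heads, seq)
scaled = alpha * value_norms.unsqueeze(2)     # broadcast over the query dimension

print("raw attention (head 0, query 0):", alpha[0, 0, 0, :6])
print("norm-scaled   (head 0, query 0):", scaled[0, 0, 0, :6])
```

Comparing the two printed rows against a syntactic reference (for example, AST relations between tokens) is one way to probe whether the norm-scaled scores align better with code structure than the raw weights, which is the comparison the paper's abstract reports.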
DOI: 10.1145/3639476.3639774