Effective Hierarchical Text Classification with Large Language Models
| Published in: | SN Computer Science, Volume 6, Issue 7, p. 873 |
|---|---|
| Main authors: | , , , , |
| Medium: | Journal Article |
| Language: | English |
| Publisher details: | Singapore: Springer Nature Singapore, 06.10.2025 (Springer Nature B.V) |
| ISSN: | 2662-995X, 2661-8907 |
| Summary: | Hierarchical Text Classification presents significant challenges, especially when dealing with intricate taxonomies with multi-level labels. The scarcity of annotated datasets exacerbates these challenges, limiting traditional approaches. Large Language Models (LLMs) alone struggle with the inherent complexity of hierarchical structures and require significant computational resources. This work presents HTC-GEN, an innovative framework leveraging synthetic data generation with LLMs, specifically LLaMa3, to create realistic, context-aware text samples across hierarchical levels. HTC-GEN reduces the reliance on manual annotation and addresses class imbalance by producing high-quality data for underrepresented labels. We evaluate our framework on the Web of Science dataset in a zero-shot setting, benchmarking it against the state-of-the-art HTC model (Z-STC) and LLaMa3. The results highlight the effectiveness of HTC-GEN, which achieves state-of-the-art performance in hierarchical text classification. Our evaluation also demonstrates that LLaMa3 alone is insufficient for this task. Furthermore, we perform a comprehensive analysis of model performance, examining individual components and assessing the impact of different hyperparameter configurations, with a particular focus on temperature and dataset size. The study underscores the potential of LLM-generated data for enabling robust, scalable classification systems without extensive human intervention. |
|---|---|
| DOI: | 10.1007/s42979-025-04435-x |
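The core idea the abstract describes — prompting an LLM to synthesize balanced training text for each label path in a taxonomy — can be sketched as below. This is an illustrative reconstruction, not the authors' code: the prompt template, function names, and the stand-in `fake_llm` generator are assumptions; in practice `generate_fn` would wrap a real LLaMa3 call with a chosen sampling temperature.

```python
# Hypothetical sketch of the HTC-GEN idea: generate synthetic training text
# for each hierarchical label path, then use the pairs to train a classifier.
# Prompt wording and helper names are assumptions, not the paper's code.

def build_prompt(label_path):
    """Compose a generation prompt for one label path,
    e.g. ('Computer Science', 'Machine Learning')."""
    topic = " > ".join(label_path)
    return f"Write a short scientific abstract about the topic '{topic}'."

def generate_synthetic_dataset(label_paths, generate_fn, samples_per_label=3):
    """Call an LLM-like generate_fn(prompt) repeatedly to build
    (text, leaf_label) pairs, giving every label the same sample count
    (this is what addresses class imbalance for rare labels)."""
    dataset = []
    for path in label_paths:
        prompt = build_prompt(path)
        for _ in range(samples_per_label):
            text = generate_fn(prompt)
            dataset.append((text, path[-1]))  # train on the leaf label
    return dataset

# Stand-in for a real LLM endpoint so the sketch runs self-contained.
def fake_llm(prompt):
    return f"[synthetic text for: {prompt[:60]}...]"

paths = [("Computer Science", "Machine Learning"),
         ("Medicine", "Cardiology")]
data = generate_synthetic_dataset(paths, fake_llm, samples_per_label=2)
print(len(data))  # 2 labels x 2 samples = 4 pairs
```

The resulting `(text, label)` pairs would then feed any standard supervised text classifier, replacing manually annotated data in the zero-shot setting the abstract evaluates.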