Bibliographic Details
| Title: |
Paraphrase-Augmented Evaluation for Dialogue Models: A Unified Framework with AMR and GPT-Based Rewriting. |
| Authors: |
Liu, Xiao (AUTHOR), Xie, Zhenping (AUTHOR) xiezp@jiangnan.edu.cn, Zhou, Juncheng (AUTHOR), Jiang, Senlin (AUTHOR) |
| Source: |
International Journal of Software Engineering & Knowledge Engineering. Oct. 2025, Vol. 35, Issue 10, pp. 1383-1398 (16 pages). |
| Subject Terms: |
*NATURAL language processing, *ARTIFICIAL intelligence, PARAPHRASE, EVALUATION methodology, GENERATIVE pre-trained transformers, LAMDA (Language model), LANGUAGE models |
| Abstract: |
In recent years, large language models (LLMs) have achieved remarkable progress in dialogue generation. However, existing automatic evaluation methods still face challenges in diverse scenarios, particularly limited generalization and low alignment with human judgment. To address these issues, we propose a paraphrase-based evaluation framework that integrates Abstract Meaning Representation (AMR) with GPT-based rewriting. This approach generates diverse paraphrases across lexical, syntactic and stylistic dimensions to enhance the coverage of traditional evaluation metrics. Experimental results show that incorporating paraphrase augmentation significantly improves the correlation between automatic metrics and human evaluation on multiple datasets. Additionally, extensive experiments on six mainstream LLMs demonstrate the effectiveness and generalizability of the proposed method. This study offers new insights into improving the human alignment of automatic evaluation and lays a foundation for the application and optimization of LLMs in open-domain dialogue systems. [ABSTRACT FROM AUTHOR] |
|
Copyright of International Journal of Software Engineering & Knowledge Engineering is the property of World Scientific Publishing Company. |
| Database: |
Business Source Index |