Retrieval Augmented Encoder-Decoder with Diffusion for Sequential Hashtag Recommendation in Disaster Events
| Title: | Retrieval Augmented Encoder-Decoder with Diffusion for Sequential Hashtag Recommendation in Disaster Events |
|---|---|
| Authors: | Shubhi Bansal, Seerla Parimala, Nagendra Kumar |
| Source: | Proceedings of the International AAAI Conference on Web and Social Media. 19:206-225 |
| Publisher information: | Association for the Advancement of Artificial Intelligence (AAAI), 2025. |
| Publication year: | 2025 |
| Description: | During disasters, access to timely and accurate information is crucial for effective response and recovery efforts. Hashtags have emerged as a lifeline in disaster response, organizing and disseminating critical information on social media, facilitating effective communication and real-time situational awareness. However, existing methods for hashtag recommendation fall short during disasters. Retrieval-based methods rely on fixed predefined hashtag lists, failing to capture dynamic information flow, while generation-based methods lack guidance for generating relevant hashtags. In view of the above, we propose a novel three-stage framework. First, the retriever identifies potential candidate hashtags from a vast collection of tweets annotated with hashtags. Next, a selector narrows down these candidates by analyzing the input tweet and ensuring only the most relevant hashtags are retained. Finally, a diffusion-based seq2seq encoder-decoder generates informative hashtags by leveraging the refined set of candidate hashtags and the original input tweet. The framework infused with diffusion overcomes the limitations of extant encoder-decoder models that produce generic hashtags due to reliance on maximizing training data likelihood. Our diffusion-based approach excels at capturing the dynamic and informal language of disaster situations by reversing a gradual noising process, allowing it to explore wider possibilities and generate more diverse hashtags. We enhance the generator with self-conditioning for better utilization of predicted sequence information. Furthermore, we devise an adaptive nonlinear noise schedule for balanced denoising across time steps for each token in the generated hashtag sequence. Empirical evaluations reveal that our proposed method exhibits superior performance compared to state-of-the-art hashtag recommendation methods in both the quality of generated hashtags and training time. |
| Publication type: | Article |
| ISSN: | 2334-0770 2162-3449 |
| DOI: | 10.1609/icwsm.v19i1.35812 |
| Document code: | edsair.doi...........eb1c30467e1c433d83f15a00d25f3924 |
| Database: | OpenAIRE |
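
The first two stages described in the abstract (retrieve candidate hashtags from hashtag-annotated tweets, then keep only those relevant to the input tweet) can be illustrated with a toy sketch. This is not the authors' implementation: the record does not specify how the retriever or selector is built, so token-overlap (Jaccard) retrieval and a substring-based selector are swapped in here as simple stand-ins. The function names, the example corpus, and the parameters `k` and `min_len` are all hypothetical, and the third stage (the diffusion-based generator) is not sketched.

```python
import re
from collections import Counter

def tokens(text):
    """Lowercase alphanumeric tokens of a tweet or hashtag."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def retrieve(tweet, corpus, k=2):
    """Stage 1 stand-in: gather hashtags from the k most similar corpus tweets."""
    query = tokens(tweet)
    ranked = sorted(corpus, key=lambda item: jaccard(query, tokens(item[0])),
                    reverse=True)
    counts = Counter(tag for _, tags in ranked[:k] for tag in tags)
    # Most frequently co-occurring hashtags come first.
    return [tag for tag, _ in counts.most_common()]

def select(tweet, candidates, min_len=4):
    """Stage 2 stand-in: keep candidates that contain a content word of the tweet."""
    words = [w for w in tokens(tweet) if len(w) >= min_len]
    return [tag for tag in candidates
            if any(w in tag.lower().lstrip("#") for w in words)]

# Hypothetical hashtag-annotated corpus: (tweet text, hashtags) pairs.
corpus = [
    ("Floods in Kerala have blocked major roads", ["#KeralaFloods", "#Rescue"]),
    ("Rescue teams reach flooded villages in Kerala", ["#KeralaFloods"]),
    ("Earthquake shakes Nepal, buildings damaged", ["#NepalEarthquake"]),
]
tweet = "Heavy floods in Kerala, roads blocked near the river"

candidates = retrieve(tweet, corpus)
selected = select(tweet, candidates)
print(candidates, selected)
```

In the framework the abstract describes, the selected hashtags would then be passed, together with the input tweet, to the diffusion-based seq2seq generator; a learned retriever and selector would take the place of the lexical heuristics used above.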