Graph-based hostile content detection in Hindi language

Bibliographic Details
Published in: Discover Computing, Vol. 28, no. 1, pp. 1-23
Main Authors: Angana Chakraborty, Subhankar Joardar, Dilip K. Prasad, Arif Ahmed Sekh
Format: Journal Article
Language: English
Published: Springer 13.11.2025
ISSN: 2948-2992
Abstract Organizations and governments are struggling to handle hostile content on social media sites (Facebook™, Twitter™, etc.). While extensive research exists for English-language content, regional languages like Hindi lack robust tools and datasets for effective moderation. This study proposes a scalable AI-based framework for detecting hostile posts in Hindi, the most widely spoken language in the Indian subcontinent and the third most spoken globally. We employ both binary (coarse-grained) and multi-class, multi-label (fine-grained) classification using contextual and semantic features. Our approach integrates various BERT-based embeddings with Relational Graph Convolutional Networks (R-GCN), forming a hybrid BRGCN architecture trained on the Constraint 2021 Hindi dataset. To enhance performance, we implement a hard voting-based ensemble classifier. The proposed model achieves higher F1-scores than existing baselines: 0.98 for coarse-grained classification and 0.84, 0.61, 0.49, and 0.64 for the fine-grained categories of Fake, Hate, Defamation, and Offensive, respectively. Code and data will be made publicly available at https://github.com/mani-design/B-RGCN .
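The hard voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `hard_vote`, the label order in `LABELS`, and the tie-breaking rule (a tie counts as "label absent") are assumptions made for illustration only. Each base classifier is assumed to emit one 0/1 decision per fine-grained label, and the ensemble keeps a label only when a strict majority of classifiers predict it.

```python
from typing import List, Sequence

# Hypothetical label order for the four fine-grained categories.
LABELS = ["fake", "hate", "defamation", "offensive"]


def hard_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine multi-label 0/1 predictions by per-label majority vote.

    predictions[i][j] is the 0/1 decision of classifier i for label j.
    A label is set to 1 only if strictly more than half of the
    classifiers voted 1; ties resolve to 0 (an assumption of this sketch).
    """
    if not predictions:
        raise ValueError("at least one classifier prediction is required")
    n_models = len(predictions)
    n_labels = len(predictions[0])
    if any(len(p) != n_labels for p in predictions):
        raise ValueError("all classifiers must predict the same number of labels")

    voted = []
    for j in range(n_labels):
        votes = sum(p[j] for p in predictions)  # classifiers voting 1 for label j
        voted.append(1 if 2 * votes > n_models else 0)
    return voted


if __name__ == "__main__":
    # Three hypothetical base classifiers scoring one post.
    per_model = [
        [1, 0, 0, 1],
        [1, 1, 0, 0],
        [0, 1, 0, 1],
    ]
    combined = hard_vote(per_model)
    print([name for name, flag in zip(LABELS, combined) if flag])
```

Voting on discrete decisions rather than averaging probabilities means the base classifiers do not need calibrated scores, which is one common reason to prefer hard voting when the ensemble members differ in architecture.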
Author Arif Ahmed Sekh
Angana Chakraborty
Dilip K. Prasad
Subhankar Joardar
Author_xml – sequence: 1
  fullname: Angana Chakraborty
  organization: Department of Computer Science and Engineering, Haldia Institute of Technology
– sequence: 2
  fullname: Subhankar Joardar
  organization: School of Computer Science, Electronics and Informatics, Haldia Institute of Technology
– sequence: 3
  fullname: Dilip K. Prasad
  organization: UiT: The Arctic University of Norway
– sequence: 4
  fullname: Arif Ahmed Sekh
  organization: UiT: The Arctic University of Norway
ContentType Journal Article
DBID DOA
DOI 10.1007/s10791-025-09790-0
DatabaseName DOAJ Directory of Open Access Journals
DatabaseTitleList
Database_xml – sequence: 1
  dbid: DOA
  name: Acceso a contenido Full Text - Doaj
  url: https://www.doaj.org/
  sourceTypes: Open Website
DeliveryMethod fulltext_linktorsrc
EISSN 2948-2992
EndPage 23
ExternalDocumentID oai_doaj_org_article_4580b4f1e2a34c928193e79ab215df48
GroupedDBID AAJSJ
AASML
ABDBE
AEFQL
ALMA_UNASSIGNED_HOLDINGS
EBLON
GROUPED_DOAJ
JZLTJ
SOJ
ID FETCH-LOGICAL-c382t-346d3aa4328c20dfd11cae1d37e4bb925ded7c6bd4a5843aec27e7f5f5ea27953
IEDL.DBID DOA
ISICitedReferencesCount 0
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001613811200002&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Mon Nov 17 19:34:54 EST 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c382t-346d3aa4328c20dfd11cae1d37e4bb925ded7c6bd4a5843aec27e7f5f5ea27953
OpenAccessLink https://doaj.org/article/4580b4f1e2a34c928193e79ab215df48
PageCount 23
ParticipantIDs doaj_primary_oai_doaj_org_article_4580b4f1e2a34c928193e79ab215df48
PublicationCentury 2000
PublicationDate 2025-11-13
PublicationDateYYYYMMDD 2025-11-13
PublicationDate_xml – month: 11
  year: 2025
  text: 2025-11-13
  day: 13
PublicationDecade 2020
PublicationTitle Discover Computing
PublicationYear 2025
Publisher Springer
Publisher_xml – name: Springer
SSID ssib053732600
Score 2.4028275
SourceID doaj
SourceType Open Website
StartPage 1
SubjectTerms BERT
Hindi
Hostility detection
Natural language processing
R-GCN
Social networking
Title Graph-based hostile content detection in Hindi language
URI https://doaj.org/article/4580b4f1e2a34c928193e79ab215df48
Volume 28
WOSCitedRecordID wos001613811200002&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider Directory of Open Access Journals