IndicBART for Translating Code-Mixed Kannada-English Sentences into Kannada: An Encoder-Decoder Transformer Approach
Saved in:
| Published in: | 2025 5th International Conference on Intelligent Technologies (CONIT), pp. 1 - 6 |
|---|---|
| Main authors: | N, Shruthi; Sooda, Kavitha |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 20.06.2025 |
| Subjects: | |
| ISBN: | 9798331522322 |
| Online access: | Full text |
| Abstract | Translating Kannada-English code-mixed text continues to pose a major challenge in NLP owing to limited resource availability for Kannada, a low-resource Dravidian language, and the lack of parallel datasets. Existing models struggle with the structural complexity of code-mixed data, leading to suboptimal performance. To address this, we experimented with a transformer-based encoder-decoder model, leveraging two variants of IndicBART, a pre-trained multilingual model. We explored IndicBART's potential for transfer and few-shot learning by fine-tuning it on two Kannada-English code-mixed datasets: one in Roman script and the other in Kannada script, both paired with Kannada translations. Through self-attention and cross-attention mechanisms, IndicBART effectively captured the semantic essence of code-mixed sentences. Our experiments showed that both variants achieved significant BLEU scores of approximately 0.807, with each outperforming the other under different scenarios. This demonstrates their potential for code-mixed translation with minimal data. These findings highlight the effectiveness of our methodologies in tackling code-mixed translation challenges, establishing a basis for continued research in low-resource language settings. |
|---|---|
| Author | N, Shruthi; Sooda, Kavitha |
| Author_xml | – sequence: 1 givenname: Shruthi surname: N fullname: N, Shruthi email: imshruthin29@gmail.com organization: Dept. of CSE, B.M.S. College of Engineering, Bangalore, India – sequence: 2 givenname: Kavitha surname: Sooda fullname: Sooda, Kavitha email: kavithas.cse@bmsce.ac.in organization: Dept. of CSE, B.M.S. College of Engineering, Bangalore, India |
| ContentType | Conference Proceeding |
| DOI | 10.1109/CONIT65521.2025.11167161 |
| EISBN | 9798331522308 9798331522339 |
| EndPage | 6 |
| ExternalDocumentID | 11167161 |
| Genre | orig-research |
| ISBN | 9798331522322 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| PageCount | 6 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-June-20 |
| PublicationDateYYYYMMDD | 2025-06-20 |
| PublicationDate_xml | – month: 06 year: 2025 text: 2025-June-20 day: 20 |
| PublicationDecade | 2020 |
| PublicationTitle | 2025 5th International Conference on Intelligent Technologies (CONIT) |
| PublicationTitleAbbrev | CONIT |
| PublicationYear | 2025 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 1 |
| SubjectTerms | Code-mixed texts Complexity theory Data models Encoder-Decoder Transformer Model Few shot learning IndicBart Kannada-English Code-mixed Multilingual Neural machine translation NLP Semantics Transformers Translation |
| Title | IndicBART for Translating Code-Mixed Kannada-English Sentences into Kannada: An Encoder-Decoder Transformer Approach |
| URI | https://ieeexplore.ieee.org/document/11167161 |
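The abstract above describes fine-tuning two IndicBART variants on Kannada-English code-mixed data paired with Kannada translations, but this record contains no code. The sketch below is a minimal, hypothetical illustration of how one such fine-tuning step and a translation call could be wired up with the Hugging Face `transformers` library. The `ai4bharat/IndicBART` checkpoint, the `<2en>`/`<2kn>` language-tag convention, and the toy sentence pair are assumptions made here for illustration; they are not the authors' released code, preprocessing (e.g. script handling or transliteration), or datasets.

```python
# Minimal illustrative sketch (not the authors' released code): one supervised
# fine-tuning step and one generation call with IndicBART on a single
# Kannada-English code-mixed example. The checkpoint name, the <2en>/<2kn>
# language tags, and the toy sentence pair are assumptions for illustration.
import torch
from transformers import AlbertTokenizer, AutoModelForSeq2SeqLM

tokenizer = AlbertTokenizer.from_pretrained(
    "ai4bharat/IndicBART", do_lower_case=False, keep_accents=True
)
model = AutoModelForSeq2SeqLM.from_pretrained("ai4bharat/IndicBART")

pad_id = tokenizer.convert_tokens_to_ids("<pad>")
bos_id = tokenizer.convert_tokens_to_ids("<s>")
eos_id = tokenizer.convert_tokens_to_ids("</s>")
kn_tag = tokenizer.convert_tokens_to_ids("<2kn>")  # target-language tag for Kannada

# Hypothetical Roman-script code-mixed source and its Kannada translation.
# Source ends with </s> plus a source-language tag; target begins with <2kn>.
src = "naanu office ge hogta idini </s> <2en>"
tgt = "<2kn> ನಾನು ಕಚೇರಿಗೆ ಹೋಗುತ್ತಿದ್ದೇನೆ </s>"

enc = tokenizer(src, add_special_tokens=False, return_tensors="pt").input_ids
dec = tokenizer(tgt, add_special_tokens=False, return_tensors="pt").input_ids

# One fine-tuning step: teacher forcing, decoder inputs shifted against labels.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
loss = model(input_ids=enc, decoder_input_ids=dec[:, :-1], labels=dec[:, 1:]).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()

# Translation after fine-tuning: beam search forced to start with the Kannada tag.
model.eval()
with torch.no_grad():
    pred = model.generate(
        enc,
        num_beams=4,
        max_length=64,
        early_stopping=True,
        pad_token_id=pad_id,
        bos_token_id=bos_id,
        eos_token_id=eos_id,
        decoder_start_token_id=kn_tag,
    )
print(tokenizer.decode(pred[0], skip_special_tokens=True))
```

The record reports BLEU scores of approximately 0.807 for both variants; evaluation of such a model could be done with a standard BLEU implementation (e.g. sacreBLEU) over held-out code-mixed/Kannada pairs, though the exact evaluation setup is not given in this record.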

