IndicBART for Translating Code-Mixed Kannada-English Sentences into Kannada: An Encoder-Decoder Transformer Approach
| Published in: | 2025 5th International Conference on Intelligent Technologies (CONIT), pp. 1-6 |
|---|---|
| Main Authors: | N, Shruthi; Sooda, Kavitha |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 20.06.2025 |
| Subjects: | Code-mixed texts; Complexity theory; Data models; Encoder-Decoder Transformer Model; Few shot learning; IndicBart; Kannada-English Code-mixed; Multilingual; Neural machine translation; NLP; Semantics; Transformers; Translation |
| ISBN: | 9798331522322 |
| Online Access: | Full text |
| Abstract | Translating Kannada-English code-mixed text continues to pose a major challenge in NLP owing to limited resource availability for Kannada, a low-resource Dravidian language, and the lack of parallel datasets. Existing models struggle with the structural complexity of code-mixed data, leading to suboptimal performance. To address this, we experimented with a transformer-based encoder-decoder model, leveraging two variants of IndicBART, a pre-trained multilingual model. We explored IndicBART's potential for transfer and few-shot learning by fine-tuning it on two Kannada-English code-mixed datasets: one in Roman script and the other in Kannada script, both paired with Kannada translations. Through self-attention and cross-attention mechanisms, IndicBART effectively captured the semantic essence of code-mixed sentences. Our experiments showed that both variants achieved significant BLEU scores of approximately 0.807, with each outperforming the other under different scenarios. This demonstrates their potential for code-mixed translation with minimal data. These findings highlight the effectiveness of our methodologies in tackling code-mixed translation challenges, establishing a basis for continued research in low-resource language settings. |
|---|---|
| Author | N, Shruthi; Sooda, Kavitha |
| Author Details | Shruthi N (imshruthin29@gmail.com), Dept. of CSE, B.M.S. College of Engineering, Bangalore, India; Kavitha Sooda (kavithas.cse@bmsce.ac.in), Dept. of CSE, B.M.S. College of Engineering, Bangalore, India |
| ContentType | Conference Proceeding |
| DOI | 10.1109/CONIT65521.2025.11167161 |
| EISBN | 9798331522308; 9798331522339 |
| EndPage | 6 |
| ExternalDocumentID | 11167161 |
| Genre | orig-research |
| ISBN | 9798331522322 |
| Language | English |
| PageCount | 6 |
| PublicationDate | 2025-June-20 |
| PublicationTitle | 2025 5th International Conference on Intelligent Technologies (CONIT) |
| PublicationTitleAbbrev | CONIT |
| PublicationYear | 2025 |
| Publisher | IEEE |
| StartPage | 1 |
| SubjectTerms | Code-mixed texts; Complexity theory; Data models; Encoder-Decoder Transformer Model; Few shot learning; IndicBart; Kannada-English Code-mixed; Multilingual; Neural machine translation; NLP; Semantics; Transformers; Translation |
| Title | IndicBART for Translating Code-Mixed Kannada-English Sentences into Kannada: An Encoder-Decoder Transformer Approach |
| URI | https://ieeexplore.ieee.org/document/11167161 |
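
The abstract describes fine-tuning IndicBART, a pre-trained multilingual encoder-decoder, so that cross-attention maps a code-mixed Kannada-English source onto a Kannada target. The sketch below is a minimal, hypothetical illustration of that plumbing using the public `ai4bharat/IndicBART` checkpoint on Hugging Face; the example sentence and generation settings are invented, the language-tag conventions follow the checkpoint's model card rather than the paper's (unreleased) setup, and the base model would still need fine-tuning on the parallel code-mixed data before producing faithful Kannada output.

```python
# Hypothetical sketch (not the authors' released code): translating a Roman-script
# Kannada-English code-mixed sentence into Kannada with the public IndicBART checkpoint.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "ai4bharat/IndicBART", do_lower_case=False, use_fast=False, keep_accents=True
)
model = AutoModelForSeq2SeqLM.from_pretrained("ai4bharat/IndicBART")

# IndicBART expects the source as "sentence </s> <2xx>" and starts decoding from the
# target-language tag (e.g. <2kn> for Kannada); see the model card for the full tag list.
src = "naanu office ge hogbeku today </s> <2en>"   # invented code-mixed example
inp = tokenizer(src, add_special_tokens=False, return_tensors="pt").input_ids

model.eval()
generated = model.generate(
    inp,
    num_beams=4,
    max_length=48,
    early_stopping=True,
    pad_token_id=tokenizer._convert_token_to_id_with_added_voc("<pad>"),
    bos_token_id=tokenizer._convert_token_to_id_with_added_voc("<s>"),
    eos_token_id=tokenizer._convert_token_to_id_with_added_voc("</s>"),
    # Starting the decoder with <2kn> requests Kannada output; cross-attention
    # conditions each decoding step on the encoded code-mixed source sentence.
    decoder_start_token_id=tokenizer._convert_token_to_id_with_added_voc("<2kn>"),
)
print(tokenizer.decode(generated[0], skip_special_tokens=True,
                       clean_up_tokenization_spaces=False))
# Caveat: the public checkpoint represents Indic-language text in Devanagari internally,
# so Kannada-script data is usually transliterated in and out (e.g. with indic-nlp-library);
# without task-specific fine-tuning the base model will not yield faithful translations.
```

Fine-tuning on the paper's paired code-mixed/Kannada sentences would wrap this same model in a standard seq2seq training loop (for instance `Seq2SeqTrainer` from the same library), with targets formatted as `"<2kn> sentence </s>"`.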


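
The reported BLEU of roughly 0.807 is on a 0-1 scale. A common way to score such outputs is sacrebleu, which reports 0-100 by default; the sketch below is a hypothetical illustration only, with invented sentences, since the paper does not state its exact scoring or tokenization settings.

```python
# Hypothetical BLEU scoring sketch; hypothesis/reference sentences are invented.
import sacrebleu

hypotheses = ["ನಾನು ಇಂದು ಕಚೇರಿಗೆ ಹೋಗಬೇಕು"]     # system outputs in Kannada
references = [["ನಾನು ಇಂದು ಕಚೇರಿಗೆ ಹೋಗಬೇಕು"]]   # one reference set, aligned with hypotheses

bleu = sacrebleu.corpus_bleu(hypotheses, references)  # default "13a" tokenization
print(bleu.score / 100)  # sacrebleu reports 0-100; dividing puts it on the paper's 0-1 scale
```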