Diversity and language technology: how language modeling bias causes epistemic injustice
| Published in: | Ethics and Information Technology, Vol. 26, No. 1, p. 8 |
|---|---|
| Main authors: | Helm, Paula; Bella, Gábor; Koch, Gertraud; Giunchiglia, Fausto |
| Format: | Journal Article |
| Language: | English |
| Published: | Dordrecht: Springer Netherlands; Springer Nature B.V; Springer Verlag, 2024-03-01 |
| Subjects: | |
| ISSN: | 1388-1957, 1572-8439 |
| Abstract | It is well known that AI-based language technology (large language models, machine translation systems, multilingual dictionaries, and corpora) is currently limited to three percent of the world’s most widely spoken, financially and politically backed languages. In response, recent efforts have sought to address the “digital language divide” by extending the reach of large language models to “underserved languages.” We show how some of these efforts tend to produce flawed solutions that adhere to a hard-wired representational preference for certain languages, which we call language modeling bias. Language modeling bias is a specific and under-studied form of linguistic bias where language technology by design favors certain languages, dialects, or sociolects with respect to others. We show that language modeling bias can result in systems that, while being precise regarding languages and cultures of dominant powers, are limited in the expression of socio-culturally relevant notions of other communities. We further argue that at the root of this problem lies a systematic tendency of technology developer communities to apply a simplistic understanding of diversity which does not do justice to the more profound differences that languages, and ultimately the communities that speak them, embody. Drawing on the concept of epistemic injustice, we point to the broader ethico-political implications and show how language modeling bias can lead not only to a disregard for valuable aspects of diversity but also to an under-representation of the needs of marginalized language communities. Finally, we present an alternative socio-technical approach that is designed to tackle some of the analyzed problems. |
|---|---|
| ArticleNumber | 8 |
| Author | Bella, Gábor; Giunchiglia, Fausto; Helm, Paula; Koch, Gertraud |
| Author_xml | 1. Helm, Paula (ORCID: 0000-0002-2719-9721; email: p.m.helm@uva.nl), University of Amsterdam; 2. Bella, Gábor, Lab-STICC CNRS UMR 628, IMT Atlantique; 3. Koch, Gertraud, University of Hamburg; 4. Giunchiglia, Fausto, University of Trento |
| BackLink | https://hal.science/hal-04421595 (record in HAL) |
| ContentType | Journal Article |
| Copyright | The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.1007/s10676-023-09742-6 |
| Discipline | Library & Information Science Philosophy Computer Science |
| EISSN | 1572-8439 |
| ExternalDocumentID | oai:HAL:hal-04421595v1 10_1007_s10676_023_09742_6 |
| GrantInformation_xml | – fundername: EU grantid: JIDEP |
| ISICitedReferencesCount | 12 |
| ISSN | 1388-1957 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Keywords | Language technology; Large language models; Epistemic injustice; Diversity; Language modeling bias; Linguistic bias; Digital divide; Lexical gaps; Bias; Linguistic diversity |
| Language | English |
| License | Attribution: http://creativecommons.org/licenses/by |
| ORCID | 0000-0002-2719-9721 0000-0002-3868-1740 |
| OpenAccessLink | https://link.springer.com/10.1007/s10676-023-09742-6 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-03-01 |
| PublicationPlace | Dordrecht |
| PublicationTitle | Ethics and information technology |
| PublicationTitleAbbrev | Ethics Inf Technol |
| PublicationYear | 2024 |
| Publisher | Springer Netherlands; Springer Nature B.V; Springer Verlag |
| SourceID | hal proquest crossref springer |
| SourceType | Open Access Repository; Aggregation Database; Enrichment Source; Index Database; Publisher |
| StartPage | 8 |
| SubjectTerms | Artificial Intelligence; Bias; Bilingual dictionaries; Community; Computation and Language; Computer Science; Computers and Society; Dialects; Dictionaries; Ethics; History, Philosophy and Sociology of Sciences; Humanities and Social Sciences; Innovation/Technology Management; Justice; Language; Language modeling; Languages; Large language models; Library Science; Machine translation; Management of Computing and Information Systems; Marginality; Modelling; Original Paper; Sociolects; Speech communities; Technology; Underserved populations; User Interfaces and Human Computer Interaction |
| Title | Diversity and language technology: how language modeling bias causes epistemic injustice |
| URI | https://link.springer.com/article/10.1007/s10676-023-09742-6 https://www.proquest.com/docview/2918809984 https://hal.science/hal-04421595 |
| Volume | 26 |
| linkProvider | Springer Nature |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Diversity+and+language+technology%3A+how%C2%A0language%C2%A0modeling%C2%A0bias+causes%C2%A0epistemic%C2%A0injustice&rft.jtitle=Ethics+and+information+technology&rft.au=Helm%2C+Paula&rft.au=Bella%2C+G%C3%A1bor&rft.au=Koch%2C+Gertraud&rft.au=Giunchiglia%2C+Fausto&rft.date=2024-03-01&rft.pub=Springer+Netherlands&rft.issn=1388-1957&rft.eissn=1572-8439&rft.volume=26&rft.issue=1&rft_id=info:doi/10.1007%2Fs10676-023-09742-6&rft.externalDocID=10_1007_s10676_023_09742_6 |