Diversity and language technology: how language modeling bias causes epistemic injustice

Bibliographic details
Published in: Ethics and information technology, Vol. 26, No. 1, p. 8
Main authors: Helm, Paula; Bella, Gábor; Koch, Gertraud; Giunchiglia, Fausto
Format: Journal Article
Language: English
Published: Dordrecht, Springer Netherlands, 1 March 2024
Springer Nature B.V
Springer Verlag
ISSN: 1388-1957, 1572-8439
Abstract: It is well known that AI-based language technology (large language models, machine translation systems, multilingual dictionaries, and corpora) is currently limited to three percent of the world’s most widely spoken, financially and politically backed languages. In response, recent efforts have sought to address the “digital language divide” by extending the reach of large language models to “underserved languages.” We show how some of these efforts tend to produce flawed solutions that adhere to a hard-wired representational preference for certain languages, which we call language modeling bias. Language modeling bias is a specific and under-studied form of linguistic bias where language technology by design favors certain languages, dialects, or sociolects over others. We show that language modeling bias can result in systems that, while precise regarding the languages and cultures of dominant powers, are limited in expressing socio-culturally relevant notions of other communities. We further argue that at the root of this problem lies a systematic tendency of technology developer communities to apply a simplistic understanding of diversity, one that does not do justice to the more profound differences that languages, and ultimately the communities that speak them, embody. Drawing on the concept of epistemic injustice, we point to the broader ethico-political implications and show how language modeling bias can lead not only to a disregard for valuable aspects of diversity but also to an under-representation of the needs of marginalized language communities. Finally, we present an alternative socio-technical approach that is designed to tackle some of the analyzed problems.
ArticleNumber 8
Author Bella, Gábor
Giunchiglia, Fausto
Helm, Paula
Koch, Gertraud
Author_xml – sequence: 1
  givenname: Paula
  orcidid: 0000-0002-2719-9721
  surname: Helm
  fullname: Helm, Paula
  email: p.m.helm@uva.nl
  organization: University of Amsterdam
– sequence: 2
  givenname: Gábor
  surname: Bella
  fullname: Bella, Gábor
  organization: Lab-STICC CNRS UMR 628, IMT Atlantique
– sequence: 3
  givenname: Gertraud
  surname: Koch
  fullname: Koch, Gertraud
  organization: University of Hamburg
– sequence: 4
  givenname: Fausto
  surname: Giunchiglia
  fullname: Giunchiglia, Fausto
  organization: University of Trento
BackLink https://hal.science/hal-04421595$$DView record in HAL
ContentType Journal Article
Copyright The Author(s) 2024
The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Attribution
DOI 10.1007/s10676-023-09742-6
Discipline Library & Information Science
Philosophy
Computer Science
EISSN 1572-8439
ExternalDocumentID oai:HAL:hal-04421595v1
10_1007_s10676_023_09742_6
GrantInformation_xml – fundername: EU
  grantid: JIDEP
ISICitedReferencesCount 12
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords Language technology
Large language models
Epistemic injustice
Diversity
Language modeling bias
Linguistic bias
Digital divide
Lexical gaps
bias
linguistic diversity
Language English
License Attribution: http://creativecommons.org/licenses/by
ORCID 0000-0002-2719-9721
0000-0002-3868-1740
OpenAccessLink https://link.springer.com/10.1007/s10676-023-09742-6
PublicationCentury 2000
PublicationDate 2024-03-01
PublicationDateYYYYMMDD 2024-03-01
PublicationDecade 2020
PublicationPlace Dordrecht
PublicationTitle Ethics and information technology
PublicationTitleAbbrev Ethics Inf Technol
PublicationYear 2024
Publisher Springer Netherlands
Springer Nature B.V
Springer Verlag
References Bird, S. (2022, May). Local languages, third spaces, and other high-resource scenarios. Proceedings of the 60th annual meeting of the association for computational linguistics (volume 1: Long papers) (pp. 7817–7829). Dublin, Ireland: Association for Computational Linguistics. Retrieved from https://aclanthology.org/2022.acl-long.539 10.18653/v1/2022.acl-long.539
WinnerLThe whale and the reactor: A search for limits in an age of high technology (Reprint1988EditionUniversity of Chicago Press10.7208/chicago/9780226902098.001.0001
AroraPBottom of the data pyramid: Big data and the global southInternational Journal of Communication2016101119
HelmPde GötzenACernuzziLHumeADiwakarSRuiz CorreaSGatica-PerezDDiversity and neocolonialism in big data research: Avoiding extractivism while struggling with paternalismBig Data & Society202310.1177/20539517231206802
RanciereJDisagreement: Politics and philosophy1998University of Minnesota Press
TsingALOn nonscalability: The living world is not amenable to precision-nested scalesCommon Knowledge201218350552410.1215/0961754X-1630424
BeerDThe social power of algorithmsInformation, Communication & Society201720111343646210.1080/1369118X.2016.1216147
SpivakGCGrossbergLNelsonCCan the subaltern speakMarxism and the interpretation of culture1988University of Illinois Press66111
Bird, S. (2020, December). Decolonising speech and language technology. Proceedings of the 28th international conference on computational linguistics (pp. 3504–3519). Barcelona, Spain (Online): International Committee on Computational Linguistics. Retrieved from https://aclanthology.org/2020.colingmain.313 10.18653/v1/2020.coling-main.313
Joshi, P., Santy, S., Budhiraja, A., Bali, K., Choudhury, M. (2020, July). The state and fate of linguistic diversity and inclusion in the NLP world. D. Jurafsky, J. Chai, N. Schluter, & J. Tetreault (Eds.), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 6282–6293). Online: Association for Computational Linguistics. Retrieved from https://aclanthology.org/2020.acl-main.560 10.18653/v1/2020.acl-main.560
GitelmanLRaw data is an oxymoron2013MIT Press10.7551/mitpress/9302.001.0001
Hovy, D., & Yang, D. (2021, June). The importance of modeling social factors of language: Theory and practice. K. Toutanova et al. (Eds.), Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human language technologies (pp. 588–602). Online: Association for Computational Linguistics. Retrieved from https://aclanthology.org/2021.naacl-main.49 10.18653/v1/2021.naacl-main.49
De-Arteaga, M., Romanov, A., Wallach, H., Chayes, J., Borgs, C., Chouldechova, A., . . . Kalai, A.T. (2019a). Bias in bios: A case study of semantic representation bias in a high-stakes setting. , 120–128. Retrieved from https://doi.org/10.1145/3287560.3287572
Bella, G., Byambadorj, E., Chandrashekar, Y., Batsuren, K., Cheema, D., Giunchiglia, F. (2022). Language diversity: Visible to humans, exploitableby machines. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations (pp. 156–165).
Dibitso, M. A., Owolawi, P. A., Ojo, S. O. (2019). Context-driven corpus-based model for automatic text segmentation and part of speech tagging in setswana using opennlp tool. Modeling and using context: 11th International and Interdisciplinary Conference, Context 2019, November 20–22, 2019, proceedings 11 (pp. 62–73).
MazruiAMMazruiAAThe political culture of language: Swahili, society and the state1999Global Academic Publishing
SchwemmerCKnightCBello-PardoEDOklobdzijaSSchoonveldeMLockhartJWDiagnosing gender bias in image recognition systemsSocius202010.1177/2378023120967171
Bender, E. M., Gebru, T., McMillan-Major, A., Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? Proceedings of the 2021 acm conference on fairness, accountability, and transparency (p. 610–623). New York, NY, USA: Association for Computing Machinery. Retrieved from https://dl.acm.org/doi/10.1145/3442188.3445922 10.1145/3442188.3445922
FriedmanBNissenbaumHBias in computer systemsACM Transactions on Information Systems199614333034710.1145/230538.230561
Zaugg, I.A., Hossain, A., Molloy, B. (2022, Apr). Digitally-disadvantaged languages. Internet Policy Review, 11(2). Retrieved from https://policyreview.info/glossary/digitally-disadvantaged-languages 10.14763/2022.2.1654
PotthastTThe values of biodiversity: philosophical considerations connecting theory and practice. Concepts and values in biodiversity2014Routledge
YoungIMJustice and the politics of difference1990Princeton University Press
HarawayDSituated knowledges: The science question in feminism and the privilege of partial perspectiveFeminist Studies198814357510.2307/3178066
Blodgett, S.L., Barocas, S., Daumé III, H., Wallach, H. (2020). Language (technology) is power: A critical survey of “bias” in nlp. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 5454–5476).
Giunchiglia, F., Batsuren, K., Bella, G. (2017). Understanding and exploiting language diversity. Ijcai (pp. 4009–4017).
De-Arteaga, M., Romanov, A., Wallach, H., Chayes, J., Borgs, C., Chouldechova, A., . . . Kalai, A.T. (2019b). Bias in bios: A case study of semantic representation bias in a high-stakes setting. Proceedings of the Conference on Fairness, Accountability, and Transparency (p. 120–128). Association for Computing Machinery. Retrieved from https://doi.org/10.1145/3287560.328757210.1145/3287560.3287572
RijkhoffJBakkerDHengeveldKKahrelPA method of language sampling. Studies in LanguageInternational Journal sponsored by the Foundation1993171169203
AradauCBlankeTAlgorithmic reason: The new government of self and other2022Oxford University Press10.1093/oso/9780192859624.001.0001
Smith, R.C., Winschiers-Theophilus, H., Loi, D., de Paula, R.A., Kambunga, A.P., Samuel, M.M., Zaman, T. (2021). Decolonizing design practices: Towards pluriversality. Extended Abstracts of the 2021 Chi Conference on Human Factors in Computing Systems. Association for Computing Machinery. Retrieved from https://doi.org/10.1145/3411763.3441334
White, J.C., & Cotterell, R. (2021, August). Examining the inductive bias of neural language models with artificial languages. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (vol. 1: Long papers) (pp. 454–463). Online: Association for Computational Linguistics Retrieved from https://aclanthology.org/2021.acl-long.38 10.18653/v1/2021.acl-long.38
Irani, L., Vertesi, J., Dourish, P., Philip, K., Grinter, R.E. (2010, Apr). Postcolonial computing: a lens on design and development. Proceedings of the Sigchi Conference on Human Factors in Computing Systems (p. 1311–1320). Association for Computing Machinery. Retrieved from https://doi.org/10.1145/1753326.175352210.1145/1753326.1753522
Bella, G., McNeill, F., Gorman, R., Donnaíle, C.Ó., MacDonald, K., Chandrashekar, Y., Giunchiglia, F. (2020). A major wordnet for a minority language: Scottish gaelic. In: Proceedings of the 12th Language Resources and Evaluation Conference (pp. 2812–2818).
Bhuiyan, J. (2023, September). Lost in ai translation: growing reliance on language apps jeopardizes some asylum applications. The Guardian. Retrieved from https://www.theguardian.com/us-news/2023/sep/07/asylumseekers-ai-translation-apps
GiunchigliaFBellaGNairNCChiYXuHRepresenting interlingual meaning in lexical databasesArtificial Intelligence Review202310.1007/s10462-023-10427-1
AroraPThe next billion users: Digital life beyond the west2019Harvard University Press10.4159/9780674238879
GoldmanAI51the unity of the epistemic virtues. Pathways to knowledge: Private and Ublic2002In Pathways to knowledgeOxford University Press10.1093/0195138791.001.0001
Helm, P., Michael, L., Schelenz, L. (2022, Jul). Diversity by design? balancing the inclusion and protection of users in an online social platform. Proceedings of the 2022 aaai/acm Conference on ai, Ethics, and Society (p. 324–334). Association for Computing Machinery. Retrieved from https://doi.org/10.1145/3514094.353414910.1145/3514094.3534149
CoadyDTwo concepts of epistemic injusticeEpisteme20107210111310.3366/epi.2010.0001
KornaiADigital language deathPloS one201381010.1371/journal.pone.0077056
Thiong’o, N. w. (1986). Decolonising the mind: The politics of language in african literature. N.H: Heinemann, Oxford.
HovyDPrabhumoyeSFive sources of bias in natural language processingLanguage and Linguistics Compass202115810.1111/lnc3.12432
Giunchiglia, F., Batsuren, K., Freihat, A. A. (2018). One world–seven thousand languages. Proceedings 19th International Conference on Computational Linguistics and Intelligent Text Processing, Cicling2018, (pp. 18-24) March 2018.
EngelJSGlobal clusters of innovation: Entrepreneurial engines of economic growth around the world (Reprint2016editionEdward Elgar Pub
Khishigsuren, T., Bella, G., Batsuren, K., Freihat, A.A., Nair, N.C., Ganbold, A., Giunchiglia, F. (2022). Using linguistic typology to enrich multilingual lexicons: the case of lexical gaps in kinship. arXiv preprint arXiv:2204.05049.
Zouhar, V., Chang, K., Cui, C., Carlson, N., Robinson, N., Sachan, M., Mortensen, D. (2023). Pwesuite: Phonetic word embeddings and tasks they facilitate. arXiv preprint arXiv:2304.02541.
Ochigame, R. (2019, Dec). How big tech manipulates academia to avoid regulation. Retrieved from https://theintercept.com/2019/12/20/mit-ethical-aiartificial-intelligence
Saad-SulonenJErikssonEHalskovKKarastiHVinesJUnfolding participation over time: Temporal lenses in participatory designCoDesign201814141610.1080/15710882.2018.1426773
HardingSStrong objectivity: A response to the new objectivity questionSynthese1995104333134910.1007/BF01064504
Bella, G., Batsuren, K., Khishigsuren, T., Giunchiglia, F. (2022). Linguistic diversity and bias in online dict
– reference: De-Arteaga, M., Romanov, A., Wallach, H., Chayes, J., Borgs, C., Chouldechova, A., . . . Kalai, A.T. (2019b). Bias in bios: A case study of semantic representation bias in a high-stakes setting. Proceedings of the Conference on Fairness, Accountability, and Transparency (p. 120–128). Association for Computing Machinery. Retrieved from https://doi.org/10.1145/3287560.328757210.1145/3287560.3287572
– reference: CoadyDTwo concepts of epistemic injusticeEpisteme20107210111310.3366/epi.2010.0001
– reference: Sennrich, R., Haddow, B., Birch, A. (2015). Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909.
– reference: Chandran Nair, N., Velayuthan, R.S., Chandrashekar, Y., Bella, G., Giunchiglia, F. (2022, June). IndoUKC: A concept-centered Indian multilingual lexicalresource. Proceedings of the Thirteenth Language Resources and Evaluation Conference (pp. 2833–2840). Marseille, France: European Language Resources Association. Retrieved from https://aclanthology.org/2022.lrec-1.303
– reference: FriedmanBNissenbaumHBias in computer systemsACM Transactions on Information Systems199614333034710.1145/230538.230561
– reference: MillerGAWordnet: An electronic lexical database1998MIT press
– reference: FrickerMEpistemic injustice: Power and the ethics of knowing2009Oxford University Press
– reference: TsingALOn nonscalability: The living world is not amenable to precision-nested scalesCommon Knowledge201218350552410.1215/0961754X-1630424
– reference: Khishigsuren, T., Bella, G., Batsuren, K., Freihat, A.A., Nair, N.C., Ganbold, A., Giunchiglia, F. (2022). Using linguistic typology to enrich multilingual lexicons: the case of lexical gaps in kinship. arXiv preprint arXiv:2204.05049.
– reference: Hovy, D., & Yang, D. (2021, June). The importance of modeling social factors of language: Theory and practice. K. Toutanova et al. (Eds.), Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human language technologies (pp. 588–602). Online: Association for Computational Linguistics. Retrieved from https://aclanthology.org/2021.naacl-main.49 10.18653/v1/2021.naacl-main.49
– reference: Bella, G., Batsuren, K., Khishigsuren, T., Giunchiglia, F. (2022). Linguistic diversity and bias in online dictionaries. University of Bayreuth African Studies Online,173.
– reference: Bella, G., Byambadorj, E., Chandrashekar, Y., Batsuren, K., Cheema, D., Giunchiglia, F. (2022). Language diversity: Visible to humans, exploitableby machines. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations (pp. 156–165).
– reference: MazruiAMMazruiAAThe political culture of language: Swahili, society and the state1999Global Academic Publishing
– reference: AroraPThe next billion users: Digital life beyond the west2019Harvard University Press10.4159/9780674238879
– reference: Schwartz, L. (2022, May). Primum Non Nocere: Before working with Indigenous data, the ACL must confront ongoing colonialism. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (vol. 2: Short papers) (pp. 724–731). Dublin, Ireland: Association for Computational Linguistics. Retrieved from https://aclanthology.org/2022.acl-short.82 10.18653/v1/2022.acl-short.82
– reference: White, J.C., & Cotterell, R. (2021, August). Examining the inductive bias of neural language models with artificial languages. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (vol. 1: Long papers) (pp. 454–463). Online: Association for Computational Linguistics Retrieved from https://aclanthology.org/2021.acl-long.38 10.18653/v1/2021.acl-long.38
– reference: RanciereJDisagreement: Politics and philosophy1998University of Minnesota Press
– reference: Zouhar, V., Chang, K., Cui, C., Carlson, N., Robinson, N., Sachan, M., Mortensen, D. (2023). Pwesuite: Phonetic word embeddings and tasks they facilitate. arXiv preprint arXiv:2304.02541.
– reference: RijkhoffJBakkerDHengeveldKKahrelPA method of language sampling. Studies in LanguageInternational Journal sponsored by the Foundation1993171169203
– reference: Irani, L., Vertesi, J., Dourish, P., Philip, K., Grinter, R.E. (2010, Apr). Postcolonial computing: a lens on design and development. Proceedings of the Sigchi Conference on Human Factors in Computing Systems (p. 1311–1320). Association for Computing Machinery. Retrieved from https://doi.org/10.1145/1753326.175352210.1145/1753326.1753522
– reference: Joshi, P., Santy, S., Budhiraja, A., Bali, K., Choudhury, M. (2020, July). The state and fate of linguistic diversity and inclusion in the NLP world. D. Jurafsky, J. Chai, N. Schluter, & J. Tetreault (Eds.), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 6282–6293). Online: Association for Computational Linguistics. Retrieved from https://aclanthology.org/2020.acl-main.560 10.18653/v1/2020.acl-main.560
– reference: BroussardMMore than a glitch: Confronting race, gender, and ability bias in tech2023The MIT Press10.7551/mitpress/14234.001.0001
– reference: BatsurenKBellaGGiunchigliaFA large and evolving cognate databaseLanguage Resources and Evaluation202256116518910.1007/s10579-021-09544-6
– reference: Giunchiglia, F., Batsuren, K., Bella, G. (2017). Understanding and exploiting language diversity. Ijcai (pp. 4009–4017).
– reference: Blodgett, S.L., Barocas, S., Daumé III, H., Wallach, H. (2020). Language (technology) is power: A critical survey of “bias” in nlp. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 5454–5476).
– reference: EngelJSGlobal clusters of innovation: Entrepreneurial engines of economic growth around the world (Reprint2016editionEdward Elgar Pub
– reference: TaylorLBroedersDAugust). In the name of Development: Power, profit and the datafication of the global SouthGeoforum20156422923710.1016/j.geoforum.2015.07.002
– reference: HarawayDSituated knowledges: The science question in feminism and the privilege of partial perspectiveFeminist Studies198814357510.2307/3178066
– reference: Saad-SulonenJErikssonEHalskovKKarastiHVinesJUnfolding participation over time: Temporal lenses in participatory designCoDesign201814141610.1080/15710882.2018.1426773
– reference: SpivakGCGrossbergLNelsonCCan the subaltern speakMarxism and the interpretation of culture1988University of Illinois Press66111
– reference: AgirreEEdmondsPWord sense disambiguation: Algorithms and applications2007Springer
– reference: Smith, R.C., Winschiers-Theophilus, H., Loi, D., de Paula, R.A., Kambunga, A.P., Samuel, M.M., Zaman, T. (2021). Decolonizing design practices: Towards pluriversality. Extended Abstracts of the 2021 Chi Conference on Human Factors in Computing Systems. Association for Computing Machinery. Retrieved from https://doi.org/10.1145/3411763.3441334
– reference: Young, H. (2015). The digital language divide. Retrieved from https://labs.theguardian.com/digital-language-divide/
– reference: BarocasSSelbstADBig data’s disparate impactCalifornia Law Review20161043671732
– reference: Ochigame, R. (2019, Dec). How big tech manipulates academia to avoid regulation. Retrieved from https://theintercept.com/2019/12/20/mit-ethical-aiartificial-intelligence/
– reference: Bird, S. (2022, May). Local languages, third spaces, and other high-resource scenarios. Proceedings of the 60th annual meeting of the association for computational linguistics (volume 1: Long papers) (pp. 7817–7829). Dublin, Ireland: Association for Computational Linguistics. Retrieved from https://aclanthology.org/2022.acl-long.539 10.18653/v1/2022.acl-long.539
– reference: Benjamin, R. (2019). Race After Technology: Abolitionist Tools for the New Jim Code (1. edition ed.). Polity.
– reference: GreenbergJHThe measurement of linguistic diversityLanguage1956321109115109016710.2307/410659
– reference: Zaugg, I.A., Hossain, A., Molloy, B. (2022, Apr). Digitally-disadvantaged languages. Internet Policy Review, 11(2). Retrieved from https://policyreview.info/glossary/digitally-disadvantaged-languages 10.14763/2022.2.1654
– reference: HovyDPrabhumoyeSFive sources of bias in natural language processingLanguage and Linguistics Compass202115810.1111/lnc3.12432
– reference: Batsuren, K., Ganbold, A., Chagnaa, A., Giunchiglia, F. (2019). Building the mongolian wordnet. In: Proceedings of the 10th Global Wordnet Conference (pp.238–244).
– reference: De-Arteaga, M., Romanov, A., Wallach, H., Chayes, J., Borgs, C., Chouldechova, A., . . . Kalai, A.T. (2019a). Bias in bios: A case study of semantic representation bias in a high-stakes setting. , 120–128. Retrieved from https://doi.org/10.1145/3287560.3287572
– reference: HardingSStrong objectivity: A response to the new objectivity questionSynthese1995104333134910.1007/BF01064504
Start page: 8
Subject terms: Artificial Intelligence; Bias; Bilingual dictionaries; Community; Computation and Language; Computer Science; Computers and Society; Dialects; Dictionaries; Ethics; History, Philosophy and Sociology of Sciences; Humanities and Social Sciences; Innovation/Technology Management; Justice; Language; Language modeling; Languages; Large language models; Library Science; Machine translation; Management of Computing and Information Systems; Marginality; Modelling; Original Paper; Sociolects; Speech communities; Technology; Underserved populations; User Interfaces and Human Computer Interaction
Title: Diversity and language technology: how language modeling bias causes epistemic injustice
URI: https://link.springer.com/article/10.1007/s10676-023-09742-6
https://www.proquest.com/docview/2918809984
https://hal.science/hal-04421595
Volume: 26