The ontological politics of synthetic data: Normalities, outliers, and intersectional hallucinations
| Title: | The ontological politics of synthetic data: Normalities, outliers, and intersectional hallucinations |
|---|---|
| Authors: | Lee, Francis, 1974; Hajisharif, Saghi; Johnson, Ericka |
| Source: | Big Data and Society. 12(2) |
| Subject Terms: | intersectionality, data ethics, classification, data bias, Synthetic structured data, ontological politics |
| Description: | Synthetic data is increasingly used as a substitute for real data for ethical, legal, and logistical reasons. However, the rise of synthetic data also raises critical questions about its entanglement with the politics of classification and the reproduction of social norms and categories. This paper aims to problematize the use of synthetic data by examining how its production is intertwined with the maintenance of certain worldviews and classifications. We argue that synthetic data, like real data, is embedded with societal biases and power structures, leading to the reproduction of existing social inequalities. Through empirical examples, we demonstrate how synthetic data tends to highlight majority elements as the “normal” and minimize minority elements, and that the slight changes to the data structures that create synthetic data will also inevitably result in what we term “intersectional hallucinations.” These hallucinations are inherent to synthetic data and cannot be entirely eliminated without compromising the purpose of creating synthetic datasets. We contend that decisions about synthetic data involve determining which intersections are essential and which can be disregarded, a practice that imbues these decisions with norms and values. Our study underscores the need for critical engagement with the mathematical and statistical choices in synthetic data production and advocates careful consideration of the ontological and political implications of these choices during curatorial-style production of synthetic structured data. |
| File Description: | electronic |
| Access URL: | https://research.chalmers.se/publication/546035 https://research.chalmers.se/publication/546035/file/546035_Fulltext.pdf |
| Database: | SwePub |
| Publication Type: | Academic Journal |
| DOI: | 10.1177/20539517251318289 |
| Language: | English |
| Published: | 2025-01-01 |
| ISSN: | 2053-9517 |