Identification of Twitter Bots Based on an Explainable Machine Learning Framework: The US 2020 Elections Case Study

Detailed bibliography
Published in: Proceedings of the International AAAI Conference on Web and Social Media, Volume 16, pp. 956-967
Main authors: Shevtsov, Alexander; Tzagkarakis, Christos; Antonakaki, Despoina; Ioannidis, Sotiris
Format: Journal Article
Language: English
Publication date: 31.05.2022
ISSN: 2162-3449, 2334-0770
Abstract: Twitter is one of the most popular social networks, attracting millions of users and capturing a considerable proportion of online discourse. It provides a simple usage framework with short messages and an efficient application programming interface (API), enabling the research community to study and analyze several aspects of this social network. However, the simplicity of Twitter usage can lead to malicious handling by various bots. This phenomenon spreads in online discourse, especially during electoral periods, when, apart from the legitimate bots used for dissemination and communication purposes, other bots aim to manipulate public opinion and the electorate towards a certain direction, specific ideology, or political party. This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data. To this end, a supervised machine learning (ML) framework is adopted using an Extreme Gradient Boosting (XGBoost) algorithm, where the hyper-parameters are tuned via cross-validation. The study also deploys Shapley Additive Explanations (SHAP) to explain the ML model predictions by calculating feature importance using game-theoretic Shapley values. Experimental evaluation on distinct Twitter datasets demonstrates the superiority of the approach, in terms of bot detection accuracy, when compared against a recent state-of-the-art Twitter bot detection method.
DOI: 10.1609/icwsm.v16i1.19349
Open access link: https://ojs.aaai.org/index.php/ICWSM/article/download/19349/19121