Automated Hashtag Hierarchy Generation Using Community Detection and the Shannon Diversity Index

Published in: 2022 IEEE 16th International Conference on Semantic Computing (ICSC), pp. 59-66
Main authors: Torene, Spencer; Howald, Blake
Format: Conference paper
Language: English
Published: IEEE, 01.01.2022
Abstract Developing semantic hierarchies from user-created hashtags in social media can provide useful organizational structure to large volumes of data. However, construction of these hierarchies is difficult using established ontologies (e.g. WordNet [1]) due to the differences in the semantic and pragmatic use of words vs. hashtags in social media. While alternative construction methods based on hashtag frequency are relatively straightforward, these methods can be susceptible to the dynamic nature of social media, such as hashtags associated with surges in popularity. We drew inspiration from the ecologically-based Shannon Diversity Index (SDI) [2] to create a more representative and resilient method of semantic hierarchy construction that relies upon graph-based community detection and a novel, entropy-based ensemble diversity index (EDI) score. The EDI quantifies the contextual diversity of each hashtag, resulting in thousands of semantically-related groups of hashtags organized along a general-to-specific spectrum. Through an application of EDI to Twitter data and a comparison of our results to prior approaches, we demonstrate our method's ability to create semantically consistent hierarchies that can be flexibly applied and adapted to a range of use cases.
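The abstract names the Shannon Diversity Index (SDI) as the inspiration for the EDI score but does not define the EDI itself. As a rough illustration only, the sketch below computes the standard SDI, H = -sum(p_i * ln p_i), over the hashtags that co-occur with each hashtag, so that tags appearing in many different contexts score higher (more general) than tags appearing in few (more specific). The function names, the toy posts, and the choice of co-occurring hashtags as "context" are assumptions made for this example. This is not the authors' EDI, and the community detection step is omitted.

```python
import math
from collections import Counter


def shannon_diversity(counts):
    """Standard Shannon Diversity Index: H = -sum(p_i * ln(p_i))."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def context_diversity(posts):
    """Map each hashtag to the SDI of the hashtags it co-occurs with.

    Assumption: "context" means the other hashtags in the same post.
    This is an illustrative stand-in, not the paper's EDI definition.
    """
    contexts = {}
    for tags in posts:
        unique = set(tags)
        for tag in unique:
            contexts.setdefault(tag, Counter()).update(unique - {tag})
    return {tag: shannon_diversity(list(c.values())) for tag, c in contexts.items()}


# Hypothetical toy data: each inner list is the hashtags of one post.
posts = [
    ["#sports", "#nba"],
    ["#sports", "#nfl"],
    ["#sports", "#soccer"],
    ["#nba", "#lakers"],
    ["#nba", "#lakers"],
]

scores = context_diversity(posts)
# Sort from most diverse context (general) to least diverse (specific).
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy data, `#sports` co-occurs evenly with three different hashtags and therefore ranks as the most general tag, while `#lakers` only ever appears with `#nba` and scores zero.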
Author Howald, Blake
Torene, Spencer
Author_xml – sequence: 1
  givenname: Spencer
  surname: Torene
  fullname: Torene, Spencer
  email: spencer.torene@trssllc.com
  organization: Thomson Reuters Special Services, LLC, McLean, VA, USA
– sequence: 2
  givenname: Blake
  surname: Howald
  fullname: Howald, Blake
  email: blake.howald@trssllc.com
  organization: Thomson Reuters Special Services, LLC, McLean, VA, USA
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/ICSC52841.2022.00016
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE/IET Electronic Library (IEL) (UW System Shared)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE/IET Electronic Library (IEL) (UW System Shared)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 9781665434188
166543418X
EndPage 66
ExternalDocumentID 9736290
Genre orig-research
GroupedDBID 6IE
6IL
CBEJK
RIE
RIL
ID FETCH-LOGICAL-i133t-415e09e35588bec1184c816e262c7ffa61898e81e1a847f4fba9c0149c3fb8ec3
IEDL.DBID RIE
ISICitedReferencesCount 3
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000835706300008&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Thu Jun 29 18:36:55 EDT 2023
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i133t-415e09e35588bec1184c816e262c7ffa61898e81e1a847f4fba9c0149c3fb8ec3
PageCount 8
ParticipantIDs ieee_primary_9736290
PublicationCentury 2000
PublicationDate 2022-Jan.
PublicationDateYYYYMMDD 2022-01-01
PublicationDate_xml – month: 01
  year: 2022
  text: 2022-Jan.
PublicationDecade 2020
PublicationTitle 2022 IEEE 16th International Conference on Semantic Computing (ICSC)
PublicationTitleAbbrev ICSC
PublicationYear 2022
Publisher IEEE
Publisher_xml – name: IEEE
Score 1.8024256
Snippet Developing semantic hierarchies from user-created hashtags in social media can provide useful organizational structure to large volumes of data. However,...
SourceID ieee
SourceType Publisher
StartPage 59
SubjectTerms Indexes
information entropy
recommender systems
Refining
Semantics
social computing
Social networking (online)
tagging
text analysis
Time series analysis
Training data
twitter
Weight measurement
Title Automated Hashtag Hierarchy Generation Using Community Detection and the Shannon Diversity Index
URI https://ieeexplore.ieee.org/document/9736290
WOSCitedRecordID wos000835706300008&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2022+IEEE+16th+International+Conference+on+Semantic+Computing+%28ICSC%29&rft.atitle=Automated+Hashtag+Hierarchy+Generation+Using+Community+Detection+and+the+Shannon+Diversity+Index&rft.au=Torene%2C+Spencer&rft.au=Howald%2C+Blake&rft.date=2022-01-01&rft.pub=IEEE&rft.spage=59&rft.epage=66&rft_id=info:doi/10.1109%2FICSC52841.2022.00016&rft.externalDocID=9736290