Automated Hashtag Hierarchy Generation Using Community Detection and the Shannon Diversity Index

Developing semantic hierarchies from user-created hashtags in social media can provide useful organizational structure to large volumes of data. However, construction of these hierarchies is difficult using established ontologies (e.g. WordNet [1]) due to the differences in the semantic and pragmati...

Bibliographic details
Published in: 2022 IEEE 16th International Conference on Semantic Computing (ICSC), pp. 59-66
Main authors: Torene, Spencer; Howald, Blake
Format: Conference proceeding
Language: English
Published: IEEE, 01 Jan 2022
Abstract: Developing semantic hierarchies from user-created hashtags in social media can provide useful organizational structure to large volumes of data. However, construction of these hierarchies is difficult using established ontologies (e.g. WordNet [1]) due to the differences in the semantic and pragmatic use of words vs. hashtags in social media. While alternative construction methods based on hashtag frequency are relatively straightforward, these methods can be susceptible to the dynamic nature of social media, such as hashtags associated with surges in popularity. We drew inspiration from the ecologically based Shannon Diversity Index (SDI) [2] to create a more representative and resilient method of semantic hierarchy construction that relies upon graph-based community detection and a novel, entropy-based ensemble diversity index (EDI) score. The EDI quantifies the contextual diversity of each hashtag, resulting in thousands of semantically related groups of hashtags organized along a general-to-specific spectrum. Through an application of EDI to Twitter data and a comparison of our results to prior approaches, we demonstrate our method's ability to create semantically consistent hierarchies that can be flexibly applied and adapted to a range of use cases.
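The abstract names the Shannon Diversity Index as the inspiration for the EDI score but does not give the EDI formula, so the following is only a minimal sketch of the classical index, H = -sum(p_i * ln p_i), applied to one plausible reading of "contextual diversity": how evenly a hashtag's co-occurring hashtags are spread across detected communities. The function names, the `cooccurrences` and `community_of` structures, and the toy data are all hypothetical and are not taken from the paper; community labels are assumed to come from some prior graph-based community detection step.

```python
import math
from collections import Counter


def shannon_diversity(counts):
    """Classical Shannon index H = -sum(p_i * ln p_i) over positive counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(-(c / total) * math.log(c / total) for c in counts if c > 0)


def hashtag_diversity(hashtag, cooccurrences, community_of):
    """Diversity of the communities that a hashtag's co-occurring tags fall in.

    Assumption: cooccurrences maps a hashtag to {neighbor: co-occurrence count},
    and community_of maps each neighbor to a precomputed community label.
    """
    tally = Counter()
    for neighbor, weight in cooccurrences.get(hashtag, {}).items():
        tally[community_of[neighbor]] += weight
    return shannon_diversity(tally.values())


# Hypothetical toy data: "#news" co-occurs evenly with three communities,
# while "#nba" co-occurs only with tags from a single community.
cooccurrences = {
    "#news": {"#election": 4, "#nba": 4, "#oscars": 4},
    "#nba": {"#playoffs": 9, "#lakers": 3},
}
community_of = {
    "#election": 0,
    "#nba": 1,
    "#oscars": 2,
    "#playoffs": 1,
    "#lakers": 1,
}

scores = {
    tag: hashtag_diversity(tag, cooccurrences, community_of)
    for tag in cooccurrences
}
# Higher diversity is read here as "more general"; lower as "more specific".
ranked = sorted(scores, key=scores.get, reverse=True)
```

Under this reading, a hashtag whose context is spread over many communities scores high and would sit near the general end of the spectrum, while one confined to a single community scores zero. Unlike a raw frequency count, the score depends on the shape of the distribution rather than its volume, which is one way such a measure could be less sensitive to popularity surges.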
Author Howald, Blake
Torene, Spencer
Author_xml – sequence: 1
  givenname: Spencer
  surname: Torene
  fullname: Torene, Spencer
  email: spencer.torene@trssllc.com
  organization: Thomson Reuters Special Services, LLC, McLean, VA, USA
– sequence: 2
  givenname: Blake
  surname: Howald
  fullname: Howald, Blake
  email: blake.howald@trssllc.com
  organization: Thomson Reuters Special Services, LLC, McLean, VA, USA
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/ICSC52841.2022.00016
EISBN 9781665434188
166543418X
EndPage 66
ExternalDocumentID 9736290
Genre orig-research
ISICitedReferencesCount 3
IsPeerReviewed false
IsScholarly false
Language English
PageCount 8
PublicationCentury 2000
PublicationDate 2022-Jan.
PublicationDateYYYYMMDD 2022-01-01
PublicationDate_xml – month: 01
  year: 2022
  text: 2022-Jan.
PublicationDecade 2020
PublicationTitle 2022 IEEE 16th International Conference on Semantic Computing (ICSC)
PublicationTitleAbbrev ICSC
PublicationYear 2022
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 59
SubjectTerms Indexes
information entropy
recommender systems
Refining
Semantics
social computing
Social networking (online)
tagging
text analysis
Time series analysis
Training data
Twitter
Weight measurement
Title Automated Hashtag Hierarchy Generation Using Community Detection and the Shannon Diversity Index
URI https://ieeexplore.ieee.org/document/9736290