Real-Time Big Data Architecture for Processing Cryptocurrency and Social Media Data: A Clustering Approach Based on k-Means

Bibliographic details
Published in: Algorithms, vol. 15, no. 5, p. 140
Main authors: Barradas, Adrian; Tejeda-Gil, Acela; Cantón-Croda, Rosa-María
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 1 May 2022
ISSN: 1999-4893
Online access: Full text
Description
Abstract: Cryptocurrencies have recently emerged as financial assets that allow their users to execute transactions in a decentralized manner. Their popularity has led to the generation of huge amounts of data, particularly on social media networks such as Twitter. In this study, we propose an iterative kappa architecture that collects, processes, and temporarily stores data on the transactions and tweets of two of the major cryptocurrencies by market capitalization: Bitcoin (BTC) and Ethereum (ETH). We applied a k-means clustering approach to group the data according to their principal characteristics. The data fall into three groups: typical BTC data, typical ETH data, and atypical BTC and ETH data. The findings show that activity on Twitter correlates with cryptocurrency transaction activity. Around 14% of the data relate to extraordinary cryptocurrency behavior. These data contain higher transaction volumes for both cryptocurrencies and about 9.5% more social media publications than the rest of the data. The main advantages of the proposed architecture are its flexibility and its ability to relate data from various datasets.
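As a rough illustration of the clustering step mentioned in the abstract, the following Python sketch groups records into three clusters with plain k-means (Lloyd's algorithm). It is not the authors' implementation. The two features (normalized transaction volume and tweet count), the sample values, and the farthest-point initialization are all assumptions chosen only to make the example small and deterministic.

```python
import math


def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the point
    farthest from all centroids chosen so far (an assumption, not
    necessarily the initialization used in the paper)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm. Returns (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    labels = [-1] * len(points)
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        converged = new_labels == labels and new_centroids == centroids
        labels, centroids = new_labels, new_centroids
        if converged:
            break
    return centroids, labels


# Hypothetical records: (normalized transaction volume, normalized tweet count).
samples = [
    (0.10, 0.10), (0.12, 0.09), (0.09, 0.11), (0.11, 0.12),  # one "typical" group
    (0.50, 0.50), (0.52, 0.49), (0.49, 0.51), (0.51, 0.52),  # another "typical" group
    (0.90, 0.95), (0.92, 0.90),                              # high-activity outliers
]

centroids, labels = kmeans(samples, k=3)
for label in sorted(set(labels)):
    share = labels.count(label) / len(labels)
    print(f"cluster {label}: share={share:.0%}, centroid={centroids[label]}")
```

In this toy data the smallest cluster plays the role of the atypical group. The shares it prints describe only the made-up sample, not the percentages reported in the abstract.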
DOI: 10.3390/a15050140