Detecting malicious tweets in trending topics using clustering and classification

Bibliographic details
Published in: 2014 International Conference on Recent Trends in Information Technology, pp. 1-6
Main authors: Soman, Saini Jacob; Murugappan, S.
Format: Conference paper
Language: English
Published: IEEE, 01.04.2014
Description
Summary: Detecting spam on the Twitter social network is a significant research area aimed at discovering unauthorized user accounts. A number of studies have addressed this problem, but most existing techniques do not consider a wide range of features and do not group similar user trending topics, which is their major limitation. Trending topics capture the current Internet trends and topics of discussion among users. To overcome the feature-extraction problem, this work first extracts several kinds of features: user profile features, user activity features, location-based features, and text and content features. The extracted text features are then used with the Jensen-Shannon Divergence (JSD) measure to characterize each labeled tweet using natural language models. Once the features have been extracted from the collected Twitter trending-topics data, clusters are formed to group user profiles with similar trending topics. The Fuzzy K-means (FKM) algorithm clusters user profiles that share the same trending topics, and the cluster centers are determined from the fuzzy membership function. An Extreme Learning Machine (ELM) algorithm is then applied to the clustering result to analyze the growing characteristics of spam with similar topics on Twitter and to acquire the knowledge needed for spam detection. The results are evaluated using F-measure, True Positive Rate (TPR), False Positive Rate (FPR), and classification accuracy, and show improved detection results.
DOI: 10.1109/ICRTIT.2014.6996188
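
The summary names the Jensen-Shannon Divergence as the measure applied to text features but does not say how the language models are built. The following is a minimal sketch only, assuming add-one smoothed unigram models over whitespace tokens, which is an illustrative choice and not the setup stated in the paper. The function names (tokenize, unigram_model, kl_divergence, js_divergence) and the two example strings are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def unigram_model(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p if p[w] > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint m."""
    m = {w: 0.5 * (p[w] + q[w]) for w in p}
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical example texts: one tweet and one reference text for a topic.
tweet_tokens = tokenize("win a free phone click this link now")
topic_tokens = tokenize("the election debate starts tonight on live television")

# Both models must share one vocabulary so the distributions are comparable.
vocab = sorted(set(tweet_tokens) | set(topic_tokens))
p = unigram_model(tweet_tokens, vocab)
q = unigram_model(topic_tokens, vocab)

score = js_divergence(p, q)
print(f"JSD(tweet, topic) = {score:.4f}")
```

With base-2 logarithms the score lies between 0 (identical distributions) and 1, and it is symmetric in its two arguments, which makes it convenient as a per-tweet feature. How the paper turns such scores into features for the clustering and classification stages is not described in the summary.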