Parallelized Jaccard-Based Learning Method and MapReduce Implementation for Mobile Devices Recognition from Massive Network Data


Bibliographic Details
Published in: China Communications, Vol. 10, No. 7, pp. 71-84
Main authors: Liu Jun, Li Yinzhou, Cuadrado, F., Uhlig, S., Lei Zhenming
Format: Journal Article
Language: English
Published: China Communications Magazine Co. Ltd, 1 July 2013
Beijing Key Laboratory of Network System Architecture and Convergence, Beijing University of Posts and Telecommunications, Beijing 100876, China; Department of Electronic Engineering and Computer Science, Queen Mary, University of London, London E1 4NS, UK
ISSN:1673-5447
Description
Summary: The ability to recognise mobile devices accurately and at scale is critically important for mobile network operators and ISPs that want to understand their customers' behaviour and enhance the user experience. In this paper, we propose a novel method for mobile device model recognition that uses statistical information derived from large amounts of mobile network traffic data. Specifically, we create a Jaccard-based coefficient measure to identify a proper keyword representing each mobile device model from massive unstructured textual HTTP access logs. To handle the large amount of traffic data generated by large mobile networks, the method is designed as a set of parallel algorithms and implemented on the MapReduce framework, a distributed parallel programming model with proven low-cost and high-efficiency features. Evaluations using real data sets show that our method can accurately recognise mobile client models while meeting the scalability and producer-independence requirements of large mobile network operators. Results show a 91.5% accuracy rate for recognising mobile client models from 2 billion records, which is dramatically higher than that of existing solutions.
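The summary does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the standard Jaccard coefficient, J(A, B) = |A ∩ B| / |A ∪ B|, applied to the set of records containing a token and the set of records belonging to a device model, and it imitates the map and reduce phases in a single process. The record layout (record id, model label, User-Agent string), the function names, and the sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_record(record):
    """Map phase: emit (key, record_id) pairs for one log record.

    `record` is a hypothetical (record_id, model, user_agent) tuple;
    the model label is assumed to be known for illustration only.
    """
    record_id, model, user_agent = record
    yield ("model", model), record_id
    # Each distinct whitespace-separated token is a candidate keyword.
    for token in set(user_agent.lower().split()):
        yield ("token", token), record_id


def reduce_by_key(pairs):
    """Shuffle and reduce phase: collect the record ids seen for each key."""
    groups = defaultdict(set)
    for key, record_id in pairs:
        groups[key].add(record_id)
    return groups


def jaccard(a, b):
    """Standard Jaccard coefficient of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_keywords(records):
    """Pick, for each model, the token whose record set is most similar."""
    groups = reduce_by_key(chain.from_iterable(map(map_record, records)))
    models = {k[1]: ids for k, ids in groups.items() if k[0] == "model"}
    tokens = {k[1]: ids for k, ids in groups.items() if k[0] == "token"}
    return {
        model: max(
            tokens,
            # Ties on the coefficient are broken by token name.
            key=lambda t: (jaccard(tokens[t], model_ids), t),
        )
        for model, model_ids in models.items()
    }


# Made-up sample records, not data from the paper.
SAMPLE_RECORDS = [
    (1, "ModelA", "Mozilla/5.0 Linux ModelA-X1 Build"),
    (2, "ModelA", "Mozilla/5.0 Linux ModelA-X1 Safari"),
    (3, "ModelB", "Mozilla/5.0 Linux ModelB-Z9 Build"),
    (4, "ModelB", "Opera Linux ModelB-Z9"),
]

if __name__ == "__main__":
    print(best_keywords(SAMPLE_RECORDS))
```

In this toy data, common tokens such as "linux" appear in every record and therefore score low for any single model, while a token that appears only in one model's records scores 1.0. In a real MapReduce job, the map and reduce functions would run on separate workers and the grouping would be done by the framework's shuffle step.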
Bibliography: mobile device recognition; data mining; Jaccard coefficient measurement; distributed computing; MapReduce
11-5439/TN
DOI:10.1109/CC.2013.6571290