A Sample Decreasing Threshold Greedy-Based Algorithm for Big Data Summarisation

Published in: Journal of Big Data
Main authors: Li, Teng; Shin, Hyo-Sang; Tsourdos, Antonios
Medium: Web Resource
Language: English
Publication details: Durham Research Square, 18.11.2020
Subjects:
Online access: Get full text
Description
Abstract: As the scales of datasets expand rapidly in big data applications, increasing efforts have been made to develop fast algorithms. This paper addresses big data summarisation problems using the submodular maximisation approach and proposes an efficient algorithm for maximising general non-negative submodular objective functions subject to k-extendible system constraints. Leveraging a sampling process and a decreasing threshold strategy, we develop an algorithm named Sample Decreasing Threshold Greedy (SDTG). The proposed algorithm obtains an expected approximation guarantee of 1/(1+k) - ε for maximising monotone submodular functions and of k/(1+k)² - ε in non-monotone cases, with an expected computational complexity of O(n/((1+k)ε) · ln(r/ε)). Here, r is the largest size of a feasible solution, and ε ∈ (0, 1/(1+k)) is an adjustable design parameter that trades off the approximation ratio against the computational complexity. The performance of the proposed algorithm is verified through experiments with a movie recommendation system and compared with that of benchmark algorithms.
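The abstract names the two ingredients of SDTG: subsampling the ground set and greedily accepting elements whose marginal gain clears a geometrically decreasing threshold. The Python sketch below only illustrates that general pattern under stated assumptions; the function and parameter names (f, is_feasible, eps, p), the threshold schedule, and the stopping rule are illustrative choices, not the authors' exact pseudocode or a reproduction of their guarantees.

```python
import random

def sdtg_sketch(ground_set, f, is_feasible, eps, p):
    """Illustrative sample decreasing-threshold greedy loop.

    ground_set  : list of candidate elements
    f           : non-negative submodular set function, f(set) -> float
    is_feasible : callable testing membership in the constraint family
                  (e.g. a k-extendible system or a cardinality bound)
    eps         : trade-off parameter in (0, 1)
    p           : sampling probability for each element
    """
    # Subsample the ground set once to cut down function evaluations.
    sample = [e for e in ground_set if random.random() < p]

    solution = set()
    # The largest single-element value sets the initial threshold.
    d = max((f({e}) for e in sample), default=0.0)
    if d == 0.0:
        return solution

    threshold = d
    lower = (eps / len(ground_set)) * d  # stop once thresholds are negligible
    while threshold >= lower:
        for e in sample:
            if e in solution:
                continue
            gain = f(solution | {e}) - f(solution)
            # Accept any feasible element whose marginal gain clears the threshold.
            if gain >= threshold and is_feasible(solution | {e}):
                solution.add(e)
        threshold *= (1.0 - eps)  # geometrically decrease the threshold
    return solution
```

For a quick trial, f could be a toy coverage function over movie genres and is_feasible a simple cardinality check, which is a special case of a k-extendible system; the intuition behind the design is that sampling shrinks the number of evaluations per pass, while the decreasing threshold avoids re-scanning for the exact best element at every step as a classical greedy would.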
DOI: 10.21203/rs.3.rs-107397/v1