A Sample Decreasing Threshold Greedy-Based Algorithm for Big Data Summarisation

Detailed bibliography
Published in: Journal of Big Data
Main authors: Li, Teng; Shin, Hyo-Sang; Tsourdos, Antonios
Format: Web Resource
Language: English
Published: Durham Research Square, 18.11.2020
Subjects:
Online access: Get full text
Description
Summary: As the scales of datasets expand rapidly in the applications of big data, increasing efforts have been made to develop fast algorithms. This paper addresses big data summarisation problems using the submodular maximisation approach and proposes an efficient algorithm for maximising general non-negative submodular objective functions subject to k-extendible system constraints. Leveraging a sampling process and a decreasing-threshold strategy, we develop an algorithm named Sample Decreasing Threshold Greedy (SDTG). The proposed algorithm obtains an expected approximation guarantee of 1/(1+k) − ε for maximising monotone submodular functions, and of k/(1+k)² − ε in non-monotone cases, with expected computational complexity of O(n/((1+k)ε) · ln(r/ε)). Here, r is the largest size of a feasible solution, and ε ∈ (0, 1/(1+k)) is an adjustable design parameter for the trade-off between the approximation ratio and the computational complexity. The performance of the proposed algorithm is verified through experiments with a movie recommendation system and compared with that of benchmark algorithms.
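The two ingredients named in the abstract, randomly subsampling the ground set and then making a greedy sweep with a geometrically decreasing acceptance threshold, can be illustrated with a short Python sketch. This is only a minimal illustration under assumed interfaces (ground_set, f, is_feasible, sample_prob and eps are hypothetical names), not the paper's exact SDTG pseudocode or its analysed parameter settings.

import random


def sample_decreasing_threshold_greedy(ground_set, f, is_feasible,
                                       sample_prob, eps):
    """Sketch of a sample + decreasing-threshold greedy pass.

    All arguments are hypothetical interfaces assumed for illustration:
      ground_set  -- iterable of candidate elements
      f           -- non-negative submodular set function, f(frozenset) -> float
      is_feasible -- membership oracle of the constraint system, frozenset -> bool
      sample_prob -- probability of keeping each element in the random sample
      eps         -- threshold decay parameter in (0, 1)
    """
    # Step 1: subsample the ground set, keeping each element independently.
    sample = [e for e in ground_set if random.random() < sample_prob]

    solution = frozenset()
    # The best single-element value bounds the initial threshold.
    d = max((f(frozenset([e])) for e in sample), default=0.0)
    if d <= 0.0:
        return solution

    r = max(len(sample), 1)      # crude bound on the feasible-solution size
    threshold = d
    floor = (eps / r) * d        # stop once thresholds become negligible

    # Step 2: sweep a geometrically decreasing threshold; accept any sampled
    # element whose marginal gain still clears the current threshold and
    # whose addition keeps the solution feasible.
    while threshold >= floor:
        for e in sample:
            if e in solution:
                continue
            candidate = solution | {e}
            if not is_feasible(candidate):
                continue
            if f(candidate) - f(solution) >= threshold:
                solution = candidate
        threshold *= (1.0 - eps)

    return solution

For a plain cardinality constraint (a simple special case of a k-extendible system), is_feasible would just check that the candidate set stays within a budget, for instance with a coverage-style objective over movies in the spirit of the recommendation experiments mentioned in the abstract.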
DOI: 10.21203/rs.3.rs-107397/v1