A Sample Decreasing Threshold Greedy-Based Algorithm for Big Data Summarisation
| Published in: | Journal of Big Data |
|---|---|
| Main authors: | , , |
| Format: | Web Resource |
| Language: | English |
| Publication details: | Durham, Research Square, 18.11.2020 |
| Subjects: | |
| Summary: | As datasets grow rapidly in big data applications, increasing effort has been devoted to developing fast algorithms. This paper addresses big data summarisation problems using the submodular maximisation approach and proposes an efficient algorithm for maximising general non-negative submodular objective functions subject to k-extendible system constraints. Combining a sampling process with a decreasing threshold strategy, we develop an algorithm named Sample Decreasing Threshold Greedy (SDTG). The proposed algorithm achieves an expected approximation guarantee of $\frac{1}{1+k} - \epsilon$ for maximising monotone submodular functions and of $\frac{k}{(1+k)^2} - \epsilon$ in non-monotone cases, with an expected computational complexity of $O\left(\frac{n}{(1+k)\epsilon} \ln \frac{r}{\epsilon}\right)$. Here, $r$ is the largest size of a feasible solution, and $\epsilon \in \left(0, \frac{1}{1+k}\right)$ is an adjustable design parameter that trades off the approximation ratio against the computational complexity. The performance of the proposed algorithm is verified through experiments with a movie recommendation system and compared with that of benchmark algorithms. |
|---|---|
| DOI: | 10.21203/rs.3.rs-107397/v1 |
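
The trade-off described in the summary can be made concrete by evaluating the quoted expressions for chosen parameters. The following is a minimal sketch, not code from the paper: the function name `sdtg_bounds` is hypothetical, `n` is assumed here to be the size of the ground set (the summary does not define it), and the complexity value is only the term inside the O-notation, with constant factors ignored.

```python
import math


def sdtg_bounds(k: int, eps: float, n: int, r: int) -> dict:
    """Evaluate the expressions quoted in the summary for given parameters.

    k   : parameter of the k-extendible system constraint
    eps : design parameter, must lie in the open interval (0, 1/(1+k))
    n   : assumed here to be the ground-set size (not defined in the summary)
    r   : largest size of a feasible solution
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if n < 1 or r < 1:
        raise ValueError("n and r must be positive")
    upper = 1.0 / (1 + k)
    if not 0.0 < eps < upper:
        raise ValueError(f"eps must lie in (0, {upper})")
    return {
        # Expected guarantee for monotone objectives: 1/(1+k) - eps
        "monotone_ratio": upper - eps,
        # Expected guarantee for non-monotone objectives: k/(1+k)^2 - eps
        "non_monotone_ratio": k / (1 + k) ** 2 - eps,
        # Term inside O(.): n / ((1+k) * eps) * ln(r / eps)
        "complexity_term": n / ((1 + k) * eps) * math.log(r / eps),
    }


if __name__ == "__main__":
    # Illustrative values only: smaller eps improves the ratios
    # but increases the complexity term.
    for eps in (0.05, 0.1, 0.2):
        print(eps, sdtg_bounds(k=1, eps=eps, n=1000, r=10))
```

For k = 1 and є = 0.1, the two ratios evaluate to about 0.4 and 0.15. The non-monotone bound is only informative when є is smaller than k/(1+k)², since above that value the expression becomes negative.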