A Sample Decreasing Threshold Greedy-Based Algorithm for Big Data Summarisation
| Published in: | Journal of Big Data |
|---|---|
| Main authors: | , , |
| Format: | Web Resource |
| Language: | English |
| Published: | Durham: Research Square, 18 November 2020 |
| Subjects: | |
| Online access: | Get full text |
| Summary: | As the scale of datasets expands rapidly in big data applications, increasing efforts have been made to develop fast algorithms. This paper addresses big data summarisation problems using the submodular maximisation approach and proposes an efficient algorithm for maximising general non-negative submodular objective functions subject to k-extendible system constraints. Leveraging a sampling process and a decreasing threshold strategy, we develop an algorithm named Sample Decreasing Threshold Greedy (SDTG). The proposed algorithm obtains an expected approximation guarantee of $\frac{1}{1+k}-\epsilon$ for maximising monotone submodular functions and of $\frac{k}{(1+k)^2}-\epsilon$ in non-monotone cases, with expected computational complexity of $O\left(\frac{n}{(1+k)\epsilon}\ln\frac{r}{\epsilon}\right)$. Here, $r$ is the largest size of feasible solutions, and $\epsilon \in \left(0, \frac{1}{1+k}\right)$ is an adjustable design parameter for the trade-off between the approximation ratio and the computational complexity. The performance of the proposed algorithm is verified through experiments with a movie recommendation system and compared with that of benchmark algorithms. |
|---|---|
| DOI: | 10.21203/rs.3.rs-107397/v1 |
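
To make the trade-off described in the summary concrete, the short sketch below evaluates the quoted bounds for chosen parameters. It reads the bounds as 1/(1+k) - ε (monotone), k/(1+k)² - ε (non-monotone), and (n / ((1+k)ε)) · ln(r/ε) for the complexity term, with ε in (0, 1/(1+k)). The function name `sdtg_bounds` and its return keys are hypothetical, introduced only for illustration. This is not an implementation of SDTG itself, and the complexity value ignores the constant hidden by the O(·) notation, so it is useful only for comparing parameter choices.

```python
import math


def sdtg_bounds(k: int, eps: float, n: int, r: int) -> dict:
    """Evaluate the bounds quoted in the summary (hypothetical helper).

    k   : parameter of the k-extendible system constraint
    eps : design parameter, assumed to lie in (0, 1/(1+k))
    n   : size of the ground set
    r   : largest size of a feasible solution
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if not 0 < eps < 1 / (1 + k):
        raise ValueError("eps must lie in (0, 1/(1+k))")
    if n < 1 or r < 1:
        raise ValueError("n and r must be positive")

    return {
        # Expected approximation ratio for monotone objectives.
        "monotone_ratio": 1 / (1 + k) - eps,
        # Expected ratio for non-monotone objectives; a value <= 0
        # means the guarantee is vacuous for this choice of eps.
        "non_monotone_ratio": k / (1 + k) ** 2 - eps,
        # Growth term inside O(.), without the hidden constant.
        "complexity_term": n / ((1 + k) * eps) * math.log(r / eps),
    }


if __name__ == "__main__":
    # Smaller eps tightens the ratios but increases the complexity term.
    for eps in (0.2, 0.1, 0.05):
        print(eps, sdtg_bounds(k=1, eps=eps, n=1000, r=10))
```

Running the loop shows the direction of the trade-off: as ε shrinks, both approximation ratios move closer to their ε-free limits while the complexity term grows.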