Achieving Better Benefits via Flexible Feature Matching in Post-Deduplication Delta Compression
| Published in: | Proceedings - IEEE International Parallel and Distributed Processing Symposium, pp. 998-1010 |
|---|---|
| Main authors: | , , , , , , , |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 3 June 2025 |
| ISSN: | 1530-2075 |
| Summary: | Cloud or distributed storage systems characterized by high data redundancy need effective data reduction techniques to reduce storage costs. Post-deduplication delta compression has proven effective by eliminating both duplicate chunks and similar but non-duplicate chunks. However, existing approaches often rely on fixed-feature matching for resemblance detection, which, while fast, may lead to lower reduction ratios and to benefits that are not robust across datasets. In this paper, we introduce BePro, a novel system that integrates Flexible Feature Matching (§IV-A) to achieve better benefits in post-deduplication delta compression. BePro employs Gain Filtering (§IV-B) to identify high-gain chunks while discarding low-gain similar chunks, ensuring robust benefits across different datasets. Additionally, BePro implements a new indexing structure, LSH-Delta (§IV-C), to search for similar chunks, and uses an Index Load Balancer (§IV-D) for efficient resemblance detection by exploiting the distribution characteristics of similar chunks. Furthermore, the Index Manager (§IV-E) manages memory space overhead to keep the index memory-efficient. We implemented a pipeline prototyping framework to facilitate the evaluation of BePro and other leading techniques. Extensive experiments demonstrate that BePro improves data-reduction ratios by up to 1.15×-2.35× while achieving comparable speed. |
| DOI: | 10.1109/IPDPS64566.2025.00093 |
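
The general pipeline the summary refers to (exact deduplication first, then delta compression of similar but non-duplicate chunks) can be sketched as below. This is a minimal illustration under stated assumptions and not BePro's design: the min-hash feature scheme, the "any one feature matches" rule, the difflib-based delta estimate, and every name (`features`, `delta_size`, `stored_bytes`) are hypothetical choices made only for this example.

```python
import hashlib
from difflib import SequenceMatcher


def features(chunk: bytes, n: int = 4, window: int = 8) -> tuple:
    """Return n min-hash features over the chunk's sliding windows.

    Assumption: each feature is the smallest salted hash of any window;
    real systems typically use cheaper rolling hashes.
    """
    last_start = max(1, len(chunk) - window + 1)
    feats = []
    for i in range(n):
        salt = bytes([i])  # a different salt yields an independent feature
        feats.append(min(
            hashlib.blake2b(chunk[j:j + window], digest_size=8, salt=salt).digest()
            for j in range(last_start)
        ))
    return tuple(feats)


def delta_size(base: bytes, target: bytes) -> int:
    """Estimate delta size as the target bytes that are not copied from base.

    Assumption: copy-instruction overhead is ignored.
    """
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    return sum(j2 - j1
               for tag, _i1, _i2, j1, j2 in matcher.get_opcodes()
               if tag != "equal")


def stored_bytes(chunks) -> int:
    """Bytes kept after deduplication followed by delta compression."""
    seen = set()   # fingerprints of chunks already stored
    index = {}     # (feature slot, feature value) -> base chunk
    total = 0
    for chunk in chunks:
        fingerprint = hashlib.sha256(chunk).digest()
        if fingerprint in seen:
            continue  # exact duplicate: nothing new is stored
        seen.add(fingerprint)
        keys = list(enumerate(features(chunk)))
        # Resemblance detection: the first indexed base sharing any feature.
        base = next((index[k] for k in keys if k in index), None)
        if base is not None:
            total += delta_size(base, chunk)  # store only the delta
        else:
            total += len(chunk)               # store in full as a new base
            for k in keys:
                index[k] = chunk
    return total
```

In this sketch a chunk that differs from an earlier one by a single byte is stored as a one-byte delta rather than in full, which is the gain that resemblance detection is meant to capture; how candidates are matched and filtered is exactly where the approaches named in the summary differ.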