Prefetching on Storage Servers through Mining Access Patterns on Blocks

Bibliographic details
Published in:	IEEE Transactions on Parallel and Distributed Systems, Vol. 27, No. 9, pp. 2698-2710
Main authors:	Liao, Jianwei; Trahay, Francois; Gerofi, Balazs; Ishikawa, Yutaka
Format:	Journal Article
Language:	English
Published:	New York: Institute of Electrical and Electronics Engineers (IEEE), 1 September 2016
Subjects:	[INFO.INFO-DC] Computer Science [cs] / Distributed, Parallel, and Cluster Computing [cs.DC]; Algorithms; Block Access Patterns; Computer Science; Data Prefetching; Disks; Distributed databases; Distributed File Systems; Distributed, Parallel, and Cluster Computing; Graphs; History; Horizontal Visibility Graph; Network storage; Optimization; Pattern analysis; Pattern matching; Prefetching; Product development; Servers; Storage Servers; Time series; Time series analysis; Transforms
ISSN:	1045-9219, 1558-2183

Description
Summary:	Distributed file systems have been widely deployed as back-end storage systems to offer I/O services for parallel/distributed applications that process large amounts of data. Data prefetching in distributed file systems is a well-known optimization technique that can mask both network and disk latency and consequently boost I/O performance. Traditionally, data prefetching is initiated by the client file systems; however, conventional prefetching schemes are not well suited to client machines that have limited memory and computing capacity. To offer an efficient prefetching approach for resource-limited client machines, this paper proposes a novel server-side prefetching mechanism. Specifically, we propose to piggyback client identification onto I/O requests so that the server-side block access history can be put into context. On the server side, we use the horizontal visibility graph technique to transform per-client time series of block access sequences into a connected graph, and we then employ Tarjan's algorithm to disclose the cut points in that graph. We express the access patterns as feature tuples, and we propose the X-step pattern matching algorithm to find a matching access pattern (i.e., a feature tuple) for a given block access history. Experimental results indicate that the proposed prefetching mechanism can relieve client machines and their applications of the work of data prefetching, boosting client performance accordingly, and that it also yields an attractive increase in data throughput.
DOI:	10.1109/TPDS.2015.2496595
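
Illustrative sketch:	The summary mentions two graph steps: turning a per-client block access series into a horizontal visibility graph, and finding cut points (articulation points) with Tarjan's algorithm. The Python below is a minimal sketch of those two steps only, not the authors' implementation. The function names, the sample series, and the quadratic construction are assumptions made for illustration; the feature tuples and the X-step matching algorithm are not covered because the record does not define them. Two accesses i < j are linked when every access strictly between them has a smaller block number than both.

```python
def horizontal_visibility_graph(series):
    """Build an adjacency map: i and j are linked when every value strictly
    between them is smaller than min(series[i], series[j])."""
    n = len(series)
    adj = {i: set() for i in range(n)}
    for i in range(n - 1):
        highest_between = float("-inf")  # max of series[i+1 .. j-1]
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                adj[i].add(j)
                adj[j].add(i)
            highest_between = max(highest_between, series[j])
            if highest_between >= series[i]:
                break  # nothing further right can be seen from i
    return adj


def articulation_points(adj):
    """Cut points via depth-first search with low-link values (Tarjan).
    Recursive, so intended for short illustrative series only."""
    disc, low, cut = {}, {}, set()
    counter = 0

    def dfs(u, parent):
        nonlocal counter
        disc[u] = low[u] = counter
        counter += 1
        children = 0
        for v in adj[u]:
            if v not in disc:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # Non-root u is a cut point if v's subtree cannot reach above u.
                if parent is not None and low[v] >= disc[u]:
                    cut.add(u)
            elif v != parent:
                low[u] = min(low[u], disc[v])
        # The DFS root is a cut point only if it has more than one child.
        if parent is None and children > 1:
            cut.add(u)

    for node in adj:
        if node not in disc:
            dfs(node, None)
    return sorted(cut)


# Hypothetical block numbers requested by one client, in arrival order.
block_series = [3, 1, 2, 5, 2, 1, 4]
graph = horizontal_visibility_graph(block_series)
print(articulation_points(graph))  # [3]
```

In this made-up series the access at index 3 (block 5) is the only cut point: it is the tallest value, so no access before it can see any access after it, and removing it splits the graph in two.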