Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments

In this era of large-scale data, distributed systems built on top of clusters of commodity hardware provide cheap and reliable storage and scalable processing of massive data. With cheap storage, instead of storing only currently relevant data, it is common to store as much data as possible, hoping...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:Proceedings of the IEEE Ročník 104; číslo 1; s. 58 - 92
Hlavní autoři: Yang, Jiyan, Meng, Xiangrui, Mahoney, Michael W.
Médium: Journal Article
Jazyk:angličtina
Vydáno: New York IEEE 01.01.2016
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Témata:
ISSN:0018-9219, 1558-2256
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:In this era of large-scale data, distributed systems built on top of clusters of commodity hardware provide cheap and reliable storage and scalable processing of massive data. With cheap storage, instead of storing only currently relevant data, it is common to store as much data as possible, hoping that its value can be extracted later. In this way, exabytes (1018 bytes) of data are being created on a daily basis. Extracting value from these data, however, requires scalable implementations of advanced analytical algorithms beyond simple data processing, e.g., statistical regression methods, linear algebra, and optimization algorithms. Most such traditional methods are designed to minimize floating-point operations, which is the dominant cost of in-memory computation on a single machine. In parallel and distributed environments, however, load balancing and communication, including disk and network input/output (I/O), can easily dominate computation. These factors greatly increase the complexity of algorithm design and challenge traditional ways of thinking about the design of parallel and distributed algorithms. Here, we review recent work on developing and implementing randomized matrix algorithms in large-scale parallel and distributed environments. Randomized algorithms for matrix problems have received a great deal of attention in recent years, thus far typically either in theory or in machine learning applications or with implementations on a single machine.
Bibliografie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:0018-9219
1558-2256
DOI:10.1109/JPROC.2015.2494219