A pipelined data-parallel algorithm for ILP

Bibliographic details
Published in: 2005 IEEE International Conference on Cluster Computing, pp. 1-10
Main authors: Fonseca, N.A., Silva, F., Costa, V.S., Camacho, R.
Format: Conference paper
Language: English
Published: IEEE, 1 September 2005
ISBN: 9780780394858, 0780394852
ISSN: 1552-5244
Description
Summary: The amount of data collected and stored in databases is growing considerably in almost all areas of human activity. Processing this amount of data is very expensive, both in human effort and computationally. This justifies the increased interest both in the automatic discovery of useful knowledge from databases and in using parallel processing for this task. Multi-relational data mining (MRDM) techniques, such as inductive logic programming (ILP), can learn rules from relational databases consisting of multiple tables. However, ILP systems are designed to run in main memory and can have long running times. We propose a pipelined data-parallel algorithm for ILP. The algorithm was implemented and evaluated on a commodity PC cluster with 8 processors. The results show that our algorithm yields excellent speedups while preserving the quality of learning.
DOI: 10.1109/CLUSTR.2005.347059