Improving scalability of inductive logic programming via pruning and best-effort optimisation

Detailed Bibliography
Published in: Expert Systems with Applications, Volume 87, pp. 291-303
Main authors: Kazmi, Mishal; Schüller, Peter; Saygın, Yücel
Format: Journal Article
Language: English
Published: New York: Elsevier Ltd (Elsevier BV), 30.11.2017
Subjects:
ISSN: 0957-4174, 1873-6793
Description
Summary:
•Pruning in the hypothesis generalisation algorithm enables learning from larger datasets.
•Using the latest optimisation methods for better usage of modern solver technology.
•Adding a time budget that allows the usage of suboptimal results in XHAIL.
•Obtaining competitive results and explainable hypotheses in sentence chunking.

Inductive Logic Programming (ILP) combines rule-based and statistical artificial intelligence methods by learning a hypothesis comprising a set of rules, given background knowledge and constraints on the search space. We focus on extending the XHAIL algorithm for ILP, which is based on Answer Set Programming, and we evaluate our extensions using the Natural Language Processing application of sentence chunking. With respect to processing natural language, ILP can cater for the constant change in how we use language on a daily basis. At the same time, ILP does not require huge amounts of training examples, unlike other statistical methods, and it produces interpretable results, that is, a set of rules which can be analysed and tweaked if necessary. As contributions we extend XHAIL with (i) a pruning mechanism within the hypothesis generalisation algorithm that enables learning from larger datasets, (ii) better usage of modern solver technology using recently developed optimisation methods, and (iii) a time budget that permits the usage of suboptimal results. We evaluate these improvements on the task of sentence chunking using three datasets from a recent SemEval competition. Results show that our improvements allow learning on bigger datasets, with results of similar quality to state-of-the-art systems on the same task. Moreover, we compare the hypotheses obtained on the datasets to gain insights into the structure of each dataset.
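To make the best-effort idea in the summary concrete, the following is a minimal sketch, not code from the paper or the XHAIL system: it searches over small sets of candidate rules, keeps the best hypothesis found so far, and returns it (even if suboptimal) once a time budget runs out. All names here (best_effort_search, candidate_rules, score, time_budget_s) are hypothetical and introduced only for illustration.

    import time
    from itertools import combinations

    def best_effort_search(candidate_rules, score, time_budget_s=5.0, max_size=3):
        # Illustrative only: enumerate rule subsets up to max_size, track the
        # best-scoring hypothesis seen so far, and stop at the deadline so a
        # suboptimal but usable hypothesis can still be returned.
        deadline = time.monotonic() + time_budget_s
        best, best_score = (), float("-inf")
        for k in range(1, max_size + 1):
            for hypothesis in combinations(candidate_rules, k):
                if time.monotonic() > deadline:
                    return list(best), best_score  # best-effort result
                s = score(hypothesis)
                if s > best_score:
                    best, best_score = hypothesis, s
        return list(best), best_score

    # Toy usage: prefer smaller hypotheses (fewer rules), purely for illustration.
    rules = ["chunk(X) :- noun(X)", "chunk(X) :- det(X), noun(X)", "chunk(X) :- verb(X)"]
    print(best_effort_search(rules, score=lambda h: -len(h), time_budget_s=1.0))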
DOI: 10.1016/j.eswa.2017.06.013