A propositionalization method of multi-relational data based on Grammar-Guided Genetic Programming

Bibliographic details
Published in: Expert Systems with Applications, vol. 168, p. 114263
Main authors: Quintero-Domínguez, Luis A.; Morell, Carlos; Ventura, Sebastián
Format: Journal Article
Language: English
Published: New York, Elsevier Ltd, 15 April 2021
Elsevier BV
ISSN: 0957-4174, 1873-6793
Online access: Full text
Description
Summary: The propositionalization process tries to find distinctive features of the examples in a database in order to transform relational data into a simpler representation. More informative features have a positive impact on the classification capabilities of the learning algorithms. In this work, we propose a new propositionalization method that generates complex Boolean attributes using Grammar-Guided Genetic Programming (G3P). The generated attributes are compound formulas that combine word items from a Bag-of-Words (BoW) representation using Boolean operators. The proposal was assessed against three state-of-the-art simple-instance and multiple-instance propositionalization methods. The experimental results show that the proposed method improves classification accuracy and considerably reduces the dimensionality of the resulting datasets.

• Grammar-Guided Genetic Programming is used to generate complex attributes.
• Words coming from a Bag-of-Words representation are combined using Boolean operators.
• Simple-instance and multiple-instance propositionalization were analyzed.
• A considerable reduction in the dimensionality of the resulting datasets is achieved.
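
To make the idea of a compound Boolean attribute over a Bag-of-Words more concrete, the following is a minimal illustrative sketch, not the authors' implementation. The toy grammar, the tuple encoding of formulas, and the names `random_formula`, `evaluate`, and `propositionalize` are all assumptions introduced here for illustration. The sketch only shows how a grammar-derived formula can be turned into a Boolean feature; it omits the evolutionary search (fitness, selection, crossover, mutation) that G3P would perform.

```python
import random

# Hypothetical toy grammar (an assumption, not the grammar used in the paper):
#   <expr> ::= <word> | NOT <expr> | <expr> AND <expr> | <expr> OR <expr>
# Formulas are encoded as nested tuples, e.g. ("AND", ("WORD", "a"), ("NOT", ("WORD", "b"))).

def random_formula(vocabulary, rng, max_depth=3):
    """Derive a random formula from the toy grammar, bounded by max_depth."""
    if max_depth == 0 or rng.random() < 0.3:
        return ("WORD", rng.choice(vocabulary))
    op = rng.choice(["NOT", "AND", "OR"])
    if op == "NOT":
        return ("NOT", random_formula(vocabulary, rng, max_depth - 1))
    return (
        op,
        random_formula(vocabulary, rng, max_depth - 1),
        random_formula(vocabulary, rng, max_depth - 1),
    )

def evaluate(formula, bag):
    """Evaluate a formula against one example, given as a set of word items."""
    tag = formula[0]
    if tag == "WORD":
        return formula[1] in bag
    if tag == "NOT":
        return not evaluate(formula[1], bag)
    left = evaluate(formula[1], bag)
    right = evaluate(formula[2], bag)
    return (left and right) if tag == "AND" else (left or right)

def propositionalize(bags, formulas):
    """Build one Boolean row per example and one column per formula."""
    return [[evaluate(f, bag) for f in formulas] for bag in bags]

# Hand-written attribute: word "a" is present and word "b" is absent.
attribute = ("AND", ("WORD", "a"), ("NOT", ("WORD", "b")))
examples = [{"a"}, {"a", "b"}, {"c"}]
table = propositionalize(examples, [attribute])

# A randomly derived candidate, as an initial population member might be.
candidate = random_formula(["a", "b", "c"], random.Random(0))
```

In this sketch a handful of compound formulas can stand in for many single-word columns, which is one plausible reading of how such attributes could reduce dimensionality; the actual mechanism and results are those reported in the article.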
DOI: 10.1016/j.eswa.2020.114263