Comparing four methods for decision-tree induction: A case study on the invasive Iberian gudgeon (Gobio lozanoi; Doadrio and Madeira, 2004)

Bibliographic details
Published in: Ecological Informatics, Volume 34, pp. 22-34
Main authors: Muñoz-Mas, Rafael; Fukuda, Shinji; Vezza, Paolo; Martínez-Capel, Francisco
Format: Journal Article
Language: English
Published: Elsevier B.V., 1 July 2016
ISSN:1574-9541
Description
Summary: The invasion of freshwater ecosystems is a particularly alarming phenomenon in the Iberian Peninsula. Habitat suitability modelling is an effective approach to extract knowledge about species ecology and to guide adequate management actions. Decision-trees are an interpretable modelling technique widely used in ecology, able to handle strongly nonlinear relationships with high-order interactions and diverse variable types. Decision-trees recursively split the input space into two parts, maximising child node homogeneity. This recursive partitioning is typically performed with axis-parallel splits in a top-down fashion. However, two recently developed R packages, oblique.tree, which builds decision-trees based on oblique splits, and evtree, which searches for globally optimal trees with evolutionary algorithms, seem to outperform the standard axis-parallel top-down algorithms, CART and C5.0. To evaluate their possible use in ecology, the two new partitioning algorithms were compared with the two well-known, standard axis-parallel algorithms. The entire process was performed in R by simultaneously tuning the decision-tree parameters and the variable subset with a genetic algorithm and modelling the presence–absence of the Iberian gudgeon (Gobio lozanoi; Doadrio and Madeira, 2004), an invasive fish species that has spread across the Iberian Peninsula. The accuracy and complexity of the trees, the modelled patterns of mesohabitat selection and the variable importance were compared. Neither of the new R packages, namely oblique.tree and evtree, outperformed the C5.0 algorithm. They rendered almost the same decision-trees as the CART algorithm, although they were completely interpretable (they performed from four to eight partitions) in comparison with C5.0, which resulted in a more complex structure with 17 partitions.
Oblique.tree proved to be affected by prevalence and does not allow observations to be weighted, which may discourage its practical use. Although evtree did not provide a major improvement over the remaining packages, it allowed the development of regression trees, which may be informative for additional modelling tasks such as abundance estimation. According to the resulting decision-trees, the optimal habitats for the Iberian gudgeon were large pools in lowland river segments with depositional areas and aquatic vegetation present, which typically appeared in the form of scattered macrophyte clumps. Furthermore, the Iberian gudgeon seems to avoid habitats characterised by scouring phenomena and limited availability of vegetated cover. Accordingly, we can assume that river regulation and artificial impoundment would have favoured the spread of the Iberian gudgeon across the entire peninsula.
• C5.0 outperformed the other algorithms: CART, oblique tree and evolutionary tree.
• Oblique.tree proved to be affected by prevalence.
• Iberian gudgeon selected wide pools with aquatic vegetation and depositional areas.
• River regulation and impoundment favoured Iberian gudgeon's spread.
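The summary describes decision-trees as recursively splitting the input space into two parts so that the child nodes are as homogeneous as possible, typically with axis-parallel splits. A minimal sketch of a single such split search follows. It is an illustration only, not the implementation used by CART, C5.0, oblique.tree or evtree, and not the authors' R workflow. It assumes Gini impurity as the homogeneity measure; the function names, the variable names (`depth_cm`, `cover_pct`) and the six observations are hypothetical and are not data from the study.

```python
from typing import List, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of 0/1 labels (0.0 means a perfectly homogeneous node)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_axis_parallel_split(
    X: List[List[float]], y: List[int]
) -> Tuple[int, float, float]:
    """Search every variable and threshold for the split `x[j] <= t`
    that minimises the size-weighted Gini impurity of the two children.

    Returns (variable index, threshold, weighted child impurity).
    """
    n = len(y)
    best = (-1, 0.0, gini(y))  # start from the impurity of the unsplit node
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best


# Hypothetical toy observations: [depth_cm, cover_pct]; 1 = presence, 0 = absence.
X = [[20, 0], [30, 10], [40, 0], [110, 30], [140, 10], [160, 40]]
y = [0, 0, 0, 1, 1, 1]

print(best_axis_parallel_split(X, y))  # (0, 75.0, 0.0)
```

A top-down algorithm would apply this search again to each child node until a stopping rule is met. An oblique split would instead threshold a linear combination of several variables, and an evolutionary approach would search over whole trees rather than one split at a time.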
Bibliography: http://dx.doi.org/10.1016/j.ecoinf.2016.04.011
DOI:10.1016/j.ecoinf.2016.04.011