Cocktail clustering - a new hierarchical agglomerative algorithm for extracting species groups in vegetation databases

Bibliographic Details
Published in:Journal of vegetation science Vol. 27; no. 6; pp. 1297 - 1307
Main Author: Bruelheide, Helge
Format: Journal Article
Language:English
Published: Blackwell Publishing Ltd 01.11.2016
John Wiley & Sons Ltd
ISSN:1100-9233, 1654-1103
Description
Summary:Aims: In one approach to formalized vegetation classification, species groups define a vegetation unit as a set of relevés, each of which possesses a minimum number of species from that group. Thus, species groups provide unequivocal rules for the assignment of individual vegetation records to vegetation units and can be applied beyond the set of records used for defining them. Here, I present a new method that subjects all species in a vegetation database to a clustering that produces such clear rules for the membership of vegetation records in clusters. More specifically, the algorithm obtains from a species × relevé matrix species groups that consist of the species showing the highest probability of co-occurring with each other, and it delivers unequivocal rules to assign relevés to these groups. Methods: A hierarchical agglomerative clustering algorithm for species is presented that starts with a species × species matrix of the Φ coefficient of association. After fusing the two species with the highest Φ coefficient, the Φ association matrix is recalculated for the new group of species. To calculate the Φ association of a group with other species or with the nodes formed by groups of species, the observed frequency distribution of co-occurrences of the species in that group is compared to the expected frequency distribution of co-occurrences, derived from the observed number of species occurrences. As a result, for each species group a minimum number of species is obtained that is required to assign a relevé to this species group. The resulting Cocktail species groups are partially nested and, with increasing node hierarchy, tend to show a decreasing Φ correlation with the species that joined the group last. Results and Conclusion: As the clustering algorithm assigns all of the n species in a data set to groups, the result is n − 1 partly nested species groups. These groups correspond to species groups that have been extracted from the same data sets using preconceived start groups. Subsequently, the species groups can be used separately or in logical combinations to classify vegetation relevés either by expert systems, Twinspan-like classification algorithms or by redefining existing vegetation units with automated brute-force match algorithms. Used in this way, Cocktail clustering is able to form the backbone of a consistent large-scale vegetation classification system.
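Illustration: the two building blocks named in the summary, the Φ coefficient between presence/absence vectors and the "minimum number of species" membership rule, can be sketched in a few lines of Python. This is not the R code of Appendix S2. The function names, the toy species × relevé matrix and the threshold m = 2 are assumptions made for illustration only; in particular, the sketch takes m as given and does not reproduce how the algorithm derives it from the observed and expected co-occurrence distributions.

```python
from itertools import combinations
from math import sqrt


def phi(a, b):
    """Phi coefficient of association between two 0/1 vectors over the same relevés."""
    n = len(a)
    n11 = sum(1 for x, y in zip(a, b) if x and y)  # relevés where both are present
    n1 = sum(1 for x in a if x)                    # occurrences of the first item
    n2 = sum(1 for y in b if y)                    # occurrences of the second item
    denom = sqrt(n1 * (n - n1) * n2 * (n - n2))
    if denom == 0:
        return 0.0  # assumption: treat the undefined case (all 0 or all 1) as no association
    return (n * n11 - n1 * n2) / denom


def group_membership(matrix, group, m):
    """Return a 0/1 vector: 1 where a relevé holds at least m species of the group."""
    n_releves = len(next(iter(matrix.values())))
    counts = [sum(matrix[s][j] for s in group) for j in range(n_releves)]
    return [1 if c >= m else 0 for c in counts]


# Hypothetical species x relevé matrix (rows: species, columns: 6 relevés).
matrix = {
    "A": [1, 1, 1, 0, 0, 0],
    "B": [1, 1, 0, 0, 0, 0],
    "C": [0, 0, 0, 1, 1, 1],
}

# First agglomeration step: find the species pair with the highest phi.
best_pair = max(combinations(matrix, 2),
                key=lambda p: phi(matrix[p[0]], matrix[p[1]]))

# Treat the fused pair as a group with an assumed threshold m = 2, then
# measure the association of the group's membership vector with species C.
members = group_membership(matrix, best_pair, m=2)
phi_group_c = phi(members, matrix["C"])
```

With this toy matrix, species A and B are fused first, only the first two relevés satisfy the m = 2 rule, and the resulting group is negatively associated with species C.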
Bibliography:ArticleID:JVS12454
Appendix S1. Studies that used Cocktail groups in different vegetation types.
Appendix S2. R code of the Cocktail clustering algorithm.
Appendix S3. Comparison of group size, required minimum number and species composition with the 23 species groups reported by Bruelheide () and groups defined by Cocktail clustering.
Appendix S4. Cluster dendrogram of the Harz data set (Bruelheide ), using Cocktail clustering, but plotting the clusters not at the Φ value of the species that joined the cluster last (Fig. ), but at the mean Φ value of all species in the cluster to that cluster.
istex:A6C0E52B01FD0A3479BCF6EB1B50AFDFD172C965
ark:/67375/WNG-QWC36L0P-K
DOI:10.1111/jvs.12454