Adaptive partitioning by local density‐peaks: An efficient density‐based clustering algorithm for analyzing molecular dynamics trajectories

Bibliographic Details
Published in:Journal of computational chemistry Vol. 38; no. 3; pp. 152 - 160
Main Authors: Liu, Song; Zhu, Lizhe; Sheong, Fu Kit; Wang, Wei; Huang, Xuhui
Format: Journal Article
Language:English
Published: United States Wiley Subscription Services, Inc 30.01.2017
ISSN:0192-8651, 1096-987X
Description
Summary:We present an efficient density‐based adaptive‐resolution clustering method, APLoD, for analyzing large‐scale molecular dynamics (MD) trajectories. APLoD performs a k‐nearest‐neighbors search to estimate the density of MD conformations locally, so that MD conformations in the same high‐density region are grouped into a cluster. APLoD greatly improves on the popular density peaks algorithm, reducing running time and memory usage by 2–3 orders of magnitude for systems ranging from alanine dipeptide to a 370‐residue maltose‐binding protein. In addition, we demonstrate that APLoD can produce clusters of various sizes that adapt to the underlying density (i.e., larger clusters in low‐density regions and smaller clusters in high‐density regions), which is a clear advantage over other popular clustering algorithms, including k‐centers and k‐medoids. We anticipate that APLoD can be widely applied to split ultra‐large MD datasets containing millions of conformations for subsequent construction of Markov State Models. © 2016 Wiley Periodicals, Inc. Incorporating a k‐nearest‐neighbors search into the density peaks clustering algorithm results in a novel clustering method, Adaptive Partitioning by Local Density‐peaks (APLoD), for analyzing molecular dynamics (MD) trajectories. APLoD is highly efficient and applicable to large MD datasets containing millions of frames. The density‐based feature and adaptive resolution of APLoD make it particularly useful in constructing Markov State Models for complex processes, especially those with heterogeneous metastable regions.
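The two steps named in the summary (a k‐nearest‐neighbors density estimate, then grouping conformations around local density peaks) can be illustrated with a toy sketch. This is not the authors' implementation, and the record does not give the algorithm's details. The function name `knn_density_peaks`, the inverse k-th-neighbor-distance density estimate, the rule linking each point to its nearest higher-density point among its k neighbors, the index-based tie-breaking, and the brute-force Euclidean search are all assumptions made for illustration.

```python
import math


def knn_density_peaks(points, k):
    """Toy kNN density-peaks clustering (illustrative, not APLoD itself).

    points: sequence of equal-length coordinate tuples.
    k: number of nearest neighbours used for the local density estimate.
    Returns one integer cluster label per point.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # Brute-force kNN search, O(n^2): fine for a demo, not for millions of frames.
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append(dists[:k])

    # Assumed density estimate: inverse of the distance to the k-th neighbour.
    density = [1.0 / (nb[-1][0] + 1e-12) for nb in neighbours]

    # Link each point to its nearest higher-density point among its k neighbours.
    # Ties in density are broken by index so that the links can never form a cycle.
    parent = list(range(n))
    for i in range(n):
        for _, j in neighbours[i]:
            if (density[j], -j) > (density[i], -i):
                parent[i] = j
                break

    # Points left pointing at themselves are local density peaks (cluster centres);
    # every other point inherits the label of the peak its chain of links reaches.
    def peak_of(i):
        while parent[i] != i:
            i = parent[i]
        return i

    label_of_peak = {}
    labels = []
    for i in range(n):
        peak = peak_of(i)
        if peak not in label_of_peak:
            label_of_peak[peak] = len(label_of_peak)
        labels.append(label_of_peak[peak])
    return labels
```

Because links are only searched within each point's k neighbors, the number of clusters is not fixed in advance: it follows from how many local peaks the data contain at the chosen `k`. For real MD data the Euclidean distance would be replaced by a structural metric such as RMSD, which is likewise an assumption here.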
DOI:10.1002/jcc.24664