Stream-learn — open-source Python library for difficult data stream batch analysis

Stream-learn is a Python package compatible with scikit-learn and developed for the drifting and imbalanced data stream analysis. Its main component is a stream generator, which allows producing a synthetic data stream that may incorporate each of the three main concept drift types (i.e., sudden, gr...

Full description

Saved in:
Bibliographic Details
Published in:Neurocomputing (Amsterdam) Vol. 478; pp. 11 - 21
Main Authors: Ksieniewicz, P., Zyblewski, P.
Format: Journal Article
Language:English
Published: Elsevier B.V 14.03.2022
Subjects:
ISSN:0925-2312, 1872-8286
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Stream-learn is a Python package compatible with scikit-learn and developed for the drifting and imbalanced data stream analysis. Its main component is a stream generator, which allows producing a synthetic data stream that may incorporate each of the three main concept drift types (i.e., sudden, gradual and incremental drift) in their recurring or non-recurring version, as well as static and dynamic class imbalance. The package allows conducting experiments following established evaluation methodologies (i.e., Test-Then-Train and Prequential). Besides, estimators adapted for data stream classification have been implemented, including both simple classifiers and state-of-the-art chunk-based and online classifier ensembles. The package utilises its own implementations of prediction metrics for imbalanced binary classification tasks to improve computational efficiency.
ISSN:0925-2312
1872-8286
DOI:10.1016/j.neucom.2021.10.120