Data augmentation for time series regression: Applying transformations, autoencoders and adversarial networks to electricity price forecasting

Bibliographic Details
Published in:Applied energy Vol. 304; p. 117695
Main Authors: Demir, Sumeyra, Mincev, Krystof, Kok, Koen, Paterakis, Nikolaos G.
Format: Journal Article
Language:English
Published: Elsevier Ltd 15.12.2021
ISSN:0306-2619, 1872-9118
Description
Summary:A model’s expected generalisation error is inversely proportional to its training set size. This relationship can pose a problem when modelling multivariate time series, because structural breaks, low sampling rates, and high data gathering costs can severely restrict training set sizes, increasing a model’s expected generalisation error by spurring regression model overfitting. Artificially expanding the training set size using data augmentation methods can, however, counteract the restrictions imposed by small sample sizes, increasing a model’s robustness to overfitting and boosting out-of-sample prediction accuracies. While existing time series augmentation methods have predominantly utilised feature space transformations to artificially expand training set sizes and boost prediction accuracies, we propose using autoencoders (AEs), variational autoencoders (VAEs) and Wasserstein generative adversarial networks with a gradient penalty (WGAN-GPs) for time series augmentation. To evaluate our proposed augmentors, as a case study we forecast Belgian and Dutch day-ahead electricity market prices using both autoregressive models and artificial neural networks. Overall, our results demonstrate that AEs, VAEs, and WGAN-GPs can significantly boost regression accuracies, on average decreasing benchmark model mean absolute errors by 2.23%, 2.73% and 2.97% respectively. Moreover, our results demonstrate that combining AE, VAE, and WGAN-GP generated time series can further boost regression accuracies, on average decreasing benchmark errors by 3.44%. As our proposed augmentors outperform existing augmentation methods, we strongly believe that both practitioners and researchers aiming to generate time series or reduce time series regression errors will find utility in our study.
•Novel time series augmentation methods, using generative models, are developed.
•The viability of augmenting multivariate time series with exogenous inputs is shown.
•Electricity price forecast accuracies are statistically significantly improved.
•Generative augmentors are found to outperform feature space augmentors.
•Combining data from multiple augmentors is found to yield further improvements.
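To make the idea of artificially expanding a training set concrete, the following is a minimal sketch and not the authors' implementation. It uses jittering (adding small Gaussian noise to the inputs), a commonly used transformation-style augmentation; this record does not state whether the paper uses it. The function names `jitter_augment` and `relative_mae_decrease`, the parameters `n_copies` and `sigma`, and the synthetic data are all assumptions made for illustration. The second function shows how a percentage decrease in benchmark mean absolute error, such as the figures quoted in the summary, could be computed.

```python
import numpy as np

def jitter_augment(X, y, n_copies=2, sigma=0.05, seed=0):
    """Return the original samples plus n_copies noisy copies of each input row.

    Hypothetical helper: noise is scaled per feature by sigma times that
    feature's standard deviation; targets are repeated unchanged.
    """
    rng = np.random.default_rng(seed)
    scale = sigma * X.std(axis=0, keepdims=True)  # shape (1, n_features)
    noisy = [X + rng.normal(0.0, 1.0, size=X.shape) * scale
             for _ in range(n_copies)]
    X_aug = np.concatenate([X, *noisy], axis=0)
    y_aug = np.concatenate([y] * (n_copies + 1), axis=0)
    return X_aug, y_aug

def relative_mae_decrease(mae_benchmark, mae_augmented):
    """Percentage by which the augmented model lowers the benchmark MAE."""
    return 100.0 * (mae_benchmark - mae_augmented) / mae_benchmark

# Synthetic stand-in data (assumption): 100 samples of 24 hourly features.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 24))
y = X.sum(axis=1)

X_aug, y_aug = jitter_augment(X, y)
print(X_aug.shape, y_aug.shape)  # (300, 24) (300,)
```

A regression model would then be fitted on `X_aug, y_aug` instead of `X, y`, and its out-of-sample MAE compared against the benchmark with `relative_mae_decrease`. The generative augmentors named in the summary (AEs, VAEs, WGAN-GPs) would replace the noise step with samples drawn from a trained model.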
DOI:10.1016/j.apenergy.2021.117695