Data augmentation for time series regression: Applying transformations, autoencoders and adversarial networks to electricity price forecasting


Bibliographic details
Published in: Applied Energy, Volume 304, p. 117695
Main authors: Demir, Sumeyra; Mincev, Krystof; Kok, Koen; Paterakis, Nikolaos G.
Format: Journal Article
Language: English
Published: Elsevier Ltd, 15 December 2021
ISSN: 0306-2619, 1872-9118
Description
Summary: A model’s expected generalisation error is inversely proportional to its training set size. This relationship can pose a problem when modelling multivariate time series, because structural breaks, low sampling rates, and high data gathering costs can severely restrict training set sizes, increasing a model’s expected generalisation error by spurring regression model overfitting. Artificially expanding the training set size, using data augmentation methods, can, however, counteract the restrictions imposed by small sample sizes: increasing a model’s robustness to overfitting and boosting out-of-sample prediction accuracies. While existing time series augmentation methods have predominantly utilised feature space transformations to artificially expand training set sizes and boost prediction accuracies, we propose using autoencoders (AEs), variational autoencoders (VAEs) and Wasserstein generative adversarial networks with a gradient penalty (WGAN-GPs) for time series augmentation. To evaluate our proposed augmentors, as a case study we forecast Belgian and Dutch day-ahead electricity market prices using both autoregressive models and artificial neural networks. Overall, our results demonstrate that AEs, VAEs, and WGAN-GPs can significantly boost regression accuracies; on average decreasing benchmark model mean absolute errors by 2.23%, 2.73% and 2.97% respectively. Moreover, our results demonstrate that combining AE, VAE, and WGAN-GP generated time series can further boost regression accuracies; on average decreasing benchmark errors by 3.44%. As our proposed augmentors outperform existing augmentation methods, we strongly believe that both practitioners and researchers aiming to generate time series or reduce time series regression errors will find utility in our study.
• Novel time series augmentation methods, using generative models, are developed.
• The viability of augmenting multivariate time series with exogenous inputs is shown.
• Electricity price forecast accuracies are statistically significantly improved.
• Generative augmentors are found to outperform feature space augmentors.
• Combining data from multiple augmentors is found to yield further improvements.
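
As a rough illustration of the general idea in the summary (expanding a small training set with synthetic samples, then reporting the change in mean absolute error against a benchmark), the following minimal sketch may help. It is not the authors' method: it uses simple Gaussian jittering, one common feature space transformation, in place of the AE, VAE and WGAN-GP augmentors. The function names `jitter_augment`, `mae` and `relative_mae_decrease`, and the parameters `n_copies` and `noise_scale`, are hypothetical and chosen only for this example.

```python
import numpy as np


def jitter_augment(X, y, n_copies=2, noise_scale=0.05, seed=0):
    """Return the original samples followed by n_copies noisy copies.

    X: array of shape (n_samples, n_features); y: array of shape (n_samples,).
    Assumption for illustration: Gaussian noise scaled by each feature's
    standard deviation, with targets left unchanged.
    """
    rng = np.random.default_rng(seed)
    feature_std = X.std(axis=0, keepdims=True)
    X_parts, y_parts = [X], [y]
    for _ in range(n_copies):
        noise = rng.normal(0.0, 1.0, size=X.shape) * noise_scale * feature_std
        X_parts.append(X + noise)
        y_parts.append(y)
    return np.concatenate(X_parts), np.concatenate(y_parts)


def mae(y_true, y_pred):
    """Mean absolute error between two equally sized sequences."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


def relative_mae_decrease(mae_benchmark, mae_augmented):
    """Percentage decrease in MAE relative to the benchmark model."""
    return 100.0 * (mae_benchmark - mae_augmented) / mae_benchmark


# Toy usage with made-up data (not electricity prices from the study).
X_toy = np.arange(30, dtype=float).reshape(10, 3)
y_toy = np.arange(10, dtype=float)
X_aug, y_aug = jitter_augment(X_toy, y_toy, n_copies=2)
```

With `n_copies=2`, the augmented set is three times the size of the original. A percentage such as those quoted in the summary would correspond to `relative_mae_decrease` applied to the benchmark and augmented-model errors, assuming the reported figures are relative decreases of this kind.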
DOI: 10.1016/j.apenergy.2021.117695