K-AGRUED: A Container Autoscaling Technique for Cloud-based Web Applications in Kubernetes Using Attention-based GRU Encoder-Decoder

Bibliographic Details
Published in:	Journal of Grid Computing, Vol. 20, no. 4, p. 40
Main Authors:	Dogani, Javad, Khunjush, Farshad, Seydali, Mehdi
Format:	Journal Article
Language:	English
Published:	Dordrecht: Springer Netherlands, 01.12.2022 (Springer Nature B.V.)
Subjects:	Applications programs; Cloud computing; Coders; Computer engineering; Computer Science; Containers; Cost control; Encoders-Decoders; Error reduction; Feature extraction; Management of Computing and Information Systems; Neural networks; Processor Architectures; Provisioning; Resource allocation; Resource utilization; Scaling; Software; Time series; User Interfaces and Human Computer Interaction; Virtual environments; Virtual memory systems; Workloads; Elasticity; Attention-based GRU encoder-decoder; Time series prediction; Horizontal autoscaling; Kubernetes
ISSN:	1570-7873, 1572-9184

Description
Summary:	Virtualization technology lets cloud service providers operate several execution instances on a single physical server, which improves resource utilization. In recent years, container-based virtualization has emerged as a remarkably lightweight alternative to virtual machines. Containers consume less memory than virtual machines, which enables faster setup and better portability. Cloud-based applications require dynamic resource allocation in response to fluctuations in the number of incoming requests. Most articles on proactive autoscaling in cloud computing have three shortcomings. 1) During feature extraction, the temporal patterns of the data are ignored, and all historical sequences are assigned equal weight. 2) Cool-down time (CDT) is omitted from the planning phase. 3) Scaling operations can be performed at any time depending only on the current input workload, resulting in a large number of contradictory scaling actions. To address these shortcomings, this paper presents a proactive autoscaling method for web applications in Kubernetes using an attention-based gated recurrent unit (GRU) encoder-decoder (K-AGRUED), which predicts resource usage several steps ahead based on the CDT. The results show that the proposed method reduces prediction error by 2–25% compared to state-of-the-art methods. The approach significantly reduces the number of scaling operations and the amount of under-provisioning compared to the standard Kubernetes horizontal pod autoscaler (HPA) and two previous studies. K-AGRUED achieves a scaling speedup of up to a factor of five in a real environment.
DOI:	10.1007/s10723-022-09634-x
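
The summary describes planning scaling actions from a multi-step forecast while taking the cool-down time into account. The following is a minimal sketch of that general idea only, not the authors' algorithm: it assumes a forecast of total resource demand is already available (the attention-based GRU model is not reproduced here), and the names `ScalerState`, `plan_replicas`, `per_pod_capacity`, and `cooldown_steps` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalerState:
    replicas: int          # pods currently running
    last_scale_step: int   # time step of the most recent scaling action


def plan_replicas(forecast, per_pod_capacity, state, now, cooldown_steps,
                  min_replicas=1, max_replicas=20):
    """Return the replica count to use at time step `now`.

    forecast: predicted total demand for the upcoming steps, in the same
              (assumed) unit as per_pod_capacity, e.g. CPU millicores.
    """
    if not forecast:
        raise ValueError("forecast must contain at least one step")

    # During the cool-down period no new scaling action is allowed.
    if now - state.last_scale_step < cooldown_steps:
        return state.replicas

    # The next action cannot happen until the cool-down has passed, so size
    # the deployment for the peak predicted demand inside that window.
    window = forecast[:max(1, cooldown_steps)]
    desired = math.ceil(max(window) / per_pod_capacity)
    desired = max(min_replicas, min(max_replicas, desired))

    # Record a scaling action only when the replica count actually changes.
    if desired != state.replicas:
        state.replicas = desired
        state.last_scale_step = now
    return state.replicas


# Usage with made-up numbers: 500 units per pod, cool-down of 3 steps.
state = ScalerState(replicas=2, last_scale_step=-10)
first = plan_replicas([900, 1500, 1100, 400], 500, state, now=0, cooldown_steps=3)
held = plan_replicas([200, 200, 200], 500, state, now=1, cooldown_steps=3)
later = plan_replicas([200, 200, 200], 500, state, now=3, cooldown_steps=3)
```

In this sketch, sizing for the peak over the cool-down window rather than for the current step alone is what avoids a scale-up followed immediately by a scale-down; whether the paper uses the peak or another aggregate is not stated in the summary.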