
Autoscaling Microservices in Kubernetes with Horizontal Pod Autoscaler

Damian Igbe, Phd
March 24, 2022, 5:46 p.m.


This tutorial discusses how to automatically scale your microservices applications running on Kubernetes using the Horizontal Pod Autoscaler (HPA) resource. It also discusses how to automatically scale your Kubernetes cluster nodes using the Cluster Autoscaler (CA).

Horizontal Pod Autoscaler (HPA)

One of the big advantages of the cloud is the ability to create intelligent applications that respond to user demand. Applications designed with autoscaling capabilities can scale in or out to meet an expected baseline performance. In other words, you can configure your application to use more resources when demand increases and fewer resources when demand drops. Properly implemented, autoscaling can save a lot of money, since cloud infrastructure is pay-as-you-go, i.e. you pay only for the resources that you use.

Kubernetes is generally deployed either on-premise or in the cloud. Running Kubernetes in the cloud means either using a managed service (e.g. AWS EKS) or deploying it yourself with tools like KOPS or kubeadm.

In this tutorial, the cluster will be deployed using KOPS. Apart from being a production-grade tool for deploying Kubernetes, KOPS integrates with your cloud provider and can therefore speak the cloud API. This matters for cluster autoscaling, because the Cluster Autoscaler needs to talk to the cloud provider to add resources such as virtual machines (EC2 instances in AWS).

Steps in Configuring HPA for a microservice application:

  • Write your code
  • Package your code in a container image
  • Run and manage your containerized microservice with a Kubernetes Deployment resource
  • Monitor your application using important metrics such as percentage CPU utilization, or using custom metrics. Here, we will use Metrics Server for metrics collection.
  • Configure HPA for your application and set a threshold value that triggers scaling (see the note on CPU requests right after this list)
  • Your application is now configured for autoscaling and will adjust the number of pods based on user demand.
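One detail worth keeping in mind: the CPU utilization percentage that HPA evaluates is measured relative to each container's CPU request, so the Deployment being scaled should declare one. A minimal sketch of such a container spec (the name, image, and request value below match the demo deployment used later in this tutorial):

 containers:
      - name: php-apache
        image: gcr.io/google_containers/hpa-example
        resources:
          requests:
            cpu: 200m        # the HPA's CPU % target is computed against this request
        ports:
        - containerPort: 80

With a request of 200m, a 20% CPU target means the HPA starts adding pods once average usage exceeds roughly 40m of CPU per pod.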

How Cluster Autoscaler (CA) works

Cluster Autoscaler monitors the status of the Kubernetes pods on your cluster. A pod's status can be Pending, ContainerCreating, Running, and so on. Pending generally means that the Kubernetes scheduler has not yet assigned the pod to a node. There can be several reasons for this, but one of the most common is that no cluster node has enough free resources to schedule the pod. Cluster Autoscaler uses this condition to trigger a scale-out: it notices the unschedulable Pending pods and corrects the situation by creating additional nodes that join the cluster. It creates as many nodes as are needed to place all the Pending pods, without exceeding the maximum node count in your Cluster Autoscaler configuration. Once the additional nodes have joined the cluster, the scheduler places the Pending pods on them.
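A quick way to spot pods stuck in this state (an illustrative check, not required for the rest of the tutorial):

$kubectl get pods --all-namespaces --field-selector=status.phase=Pending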

Now the hands-on:

In this tutorial we will do the following:

  • Create a Kubernetes cluster using KOPS
  • Configure Cluster Autoscaler
  • Configure Horizontal Pod Autoscaler
  • Test HPA and CA with a simple container image

Part 1: Deploy the Cluster

Step 1: Create Resources Needed for the Deployment

Create the Kubernetes cluster on AWS with KOPS. This requires familiarity with the AWS cloud; note that the instructions for other cloud providers like Google Cloud and MS Azure will differ slightly. The architecture of the deployment is shown below. You need to create a deployment VM from which to run KOPS. Once KOPS is installed, it is instructed to create the cluster: KOPS creates the VMs required for Kubernetes and deploys Kubernetes on them. In the topology below, KOPS deploys Kubernetes on 3 nodes: 1 master and 2 worker nodes. The only VM that we create by hand is the deployment VM. KOPS also creates an Auto Scaling Group in AWS and adds the worker nodes to it. In this way, the Cluster Autoscaler can add or remove nodes in the Auto Scaling Group when needed.

 

To prepare a cluster with KOPS you need to have the following ready:

  1. A registered domain. Here cloudtechexperts.com is used.
  2. A hosted zone for the subdomain in the AWS Route 53 DNS service. Here cte.cloudtechexperts.com is used.
  3. An AWS S3 bucket to store the KOPS state and configuration. Here the bucket name is cte.cloudtechexperts.com.
  4. An EC2 instance in AWS to be used as the deploy node. Take note of its IP address and the key pair used, and ensure that the Security Group allows you to SSH into the instance. This tutorial uses a t2.micro instance with the Amazon Linux AMI.
  5. Log in to the EC2 instance (the deploy node) to perform the following activities (ssh -i keypair.pem ec2-user@3.88.132.232).

Once these are ready, you can export them as follows:

$export DOMAIN_NAME="cte.cloudtechexperts.com"
$export CLUSTER_AWS_REGION="us-east-1"
$export CLUSTER_AWS_AZ="us-east-1a"
$export KOPS_STATE_STORE="s3://cte.cloudtechexperts.com"

Step 2: Configure your AWS credentials for talking to the AWS Cloud

Configure the CLI with your AWS access key ID and secret access key, then run a test command to ensure that the configuration works. This is important because everything going forward depends on being able to reach the AWS API from the CLI.

$aws configure
$aws s3 ls

Step 3: Generate SSH keys as required by KOPS

Accept all the default values by just pressing the return key.

$ssh-keygen

Step 4: Install KOPs and Kubectl

It’s very important that the versions of KOPS and kubectl are compatible. Consult the compatibility reference pages to decide on the best versions to use. Even when you upgrade at a later time, make sure the versions remain compatible.

This tutorial uses version 1.13.0 of kubectl, Kubernetes cluster, and KOPS.

$sudo wget https://github.com/kubernetes/kops/releases/download/1.13.0/kops-linux-amd64
$sudo chmod +x kops-linux-amd64
$sudo mv kops-linux-amd64 /usr/local/bin/kops
$sudo chmod +x /usr/local/bin/kops
$wget https://storage.googleapis.com/kubernetes-release/release/v1.13.0/bin/linux/amd64/kubectl
$sudo chmod +x kubectl
$sudo mv kubectl /usr/local/bin/
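Once the binaries are installed, you can confirm the versions (and hence the compatibility) at any time with:

$kops version
$kubectl version --client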

Step 5: Cluster deployment

Let’s deploy a 3-node Kubernetes cluster with KOPS. Take note of the Kubernetes version used here, and of the instance size for the master node: the minimum tends to be t2.medium. After you run the command, it can take up to 7 minutes for the cluster to be fully ready.

#----
$kops create cluster --name=${DOMAIN_NAME} --zones=${CLUSTER_AWS_AZ} --master-size="t2.medium" --node-size="t2.micro" --node-count="2" --dns-zone=${DOMAIN_NAME} --ssh-public-key="~/.ssh/id_rsa.pub" --kubernetes-version="1.13.0" --yes
#---

When KOPS finishes running, it will display the following:

Cluster is starting. It should be ready in a few minutes.

Suggestions:
* validate cluster: kops validate cluster
* list nodes: kubectl get nodes --show-labels
* ssh to the master: ssh -i ~/.ssh/id_rsa admin@api.cte.cloudtechexperts.com
* the admin user is specific to Debian. If not using Debian please use the appropriate user based on your OS.
* read about installing addons at: https://github.com/kubernetes/kops/blob/master/docs/addons.md.

Wait for a few minutes and check that the cluster is ready.

$kubectl get nodes
NAME                            STATUS   ROLES    AGE   VERSION
ip-172-20-38-139.ec2.internal   Ready    node     67m   v1.13.0
ip-172-20-55-157.ec2.internal   Ready    node     61m   v1.13.0
ip-172-20-56-26.ec2.internal    Ready    master   73m   v1.13.0

Hooray, our 3-node cluster is ready for use!

Part 2: Configure Cluster Autoscaler

Step 1: Tune the Cluster deployed by KOPS.

Configure the maximum size for the AWS Auto Scaling Group that controls the number of worker nodes. The original maxSize was 2; change it to 4 (or any desired size) so that the instance group spec looks as shown below:

$kops edit ig nodes
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2019-09-19T15:06:23Z
  generation: 1
  labels:
    kops.k8s.io/cluster: cte.cloudtechexperts.com
  name: nodes
spec:
  image: kope.io/k8s-1.13-debian-stretch-amd64-hvm-ebs-2019-08-16
  machineType: t2.micro
  maxSize: 4
  minSize: 2
  nodeLabels:
    kops.k8s.io/instancegroup: nodes
  role: Node
  subnets:
  - us-east-1a

Step 2: Update KOPS to pick up the changes

$kops update cluster ${DOMAIN_NAME} --yes

After this, check that your AWS Auto Scaling Group reflects the new maxSize. You can check from the AWS Management Console.
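Alternatively, you can check from the CLI. A quick sketch, assuming the Auto Scaling Group follows the naming KOPS uses for this cluster (nodes.${DOMAIN_NAME}); adjust the name if yours differs:

$aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names nodes.${DOMAIN_NAME} --query "AutoScalingGroups[].[AutoScalingGroupName,MinSize,MaxSize]" --output table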

 

 

Step 3: Create a policy for the AWS Autoscaling groups

Copy the policy from here and save it as policy-cluster-autoscaler.json.

$aws iam put-role-policy --role-name nodes.${DOMAIN_NAME} --policy-name asg-nodes.${DOMAIN_NAME} --policy-document file://policy-cluster-autoscaler.json
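If the linked policy is not available, the permissions the Cluster Autoscaler typically needs look roughly like the sketch below; treat it as an assumption and verify it against the Cluster Autoscaler AWS documentation before use.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*"
    }
  ]
}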

Step 4: Install and Configure Monitoring software.

Autoscaling requires that you monitor metrics to help determine when to scale. Popular choices for Kubernetes are Prometheus and Metrics Server. Here we use Metrics Server.

$sudo yum install git -y
$git clone https://github.com/kubernetes-incubator/metrics-server
$cd /home/ec2-user/metrics-server/
$kubectl create -f deploy/1.8+/

clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created

At this point the metrics server is up and running. However, we get an error because it is not able to talk to the kubelet on the worker nodes to collect metrics, as you can see below.

$kubectl top no
error: metrics not available yet

$kubectl get pods -n kube-system
metrics-server-6456bffb67-d5vrq                         1/1     Running   0          20s

The metrics server is running, but we are not able to retrieve metrics from the cluster. To solve this for the version of Kubernetes deployed with KOPS, we need to make 2 changes:

Change 1:

We need to configure the metrics server to allow an insecure TLS connection to the kubelet. This is good enough for a demo, but you may want to investigate how to make the metrics server verify the kubelet certificates properly.

$cd /home/ec2-user/metrics-server/deploy/1.8+/

Edit metrics-server-deployment.yaml to have the containers section look like this:

 containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.4
        imagePullPolicy: Always
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP

Then delete and recreate the metrics-server

$kubectl delete -f metrics-server-deployment.yaml 
$kubectl create -f metrics-server-deployment.yaml

Change 2:

Make changes to the Kubernetes cluster itself. Since we deployed with the KOPS utility, we will make the changes with KOPS.

$kops edit cluster --name cte.cloudtechexperts.com

Modify the Kubelet section from:

  kubelet:
    anonymousAuth: false   

To

  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook

Update the KOPS cluster with the changes and wait for the cluster to be ready. The rolling update can take up to 20 minutes, so be patient!

$kops update cluster --yes
$kops rolling-update cluster --yes

When the Cluster is ready:

$kubectl get nodes
NAME                            STATUS   ROLES    AGE   VERSION
ip-172-20-38-139.ec2.internal   Ready    node     67m   v1.13.0
ip-172-20-55-157.ec2.internal   Ready    node     61m   v1.13.0
ip-172-20-56-26.ec2.internal    Ready    master   73m   v1.13.0

Test that the metrics server is working. You will see the pod running in the kube-system namespace.

$kubectl get pods -n kube-system

metrics-server-6456bffb67-9bb6m 1/1 Running 0 7m14
$kubectl top no

NAME                            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
ip-172-20-38-139.ec2.internal   29m          2%     381Mi           42%       
ip-172-20-55-157.ec2.internal   25m          2%     410Mi           45%       
ip-172-20-56-26.ec2.internal    829m         41%    1354Mi          35%    

Step 5: Download the Cluster Autoscaler Manifest file and make changes

Download the Cluster Autoscaler YAML file from here, or copy and paste the YAML using your favorite editor such as vim. Name the file cluster-autoscaler.yaml.

Make 3 changes to this file:

First Change:

Ensure that the name of the AWS Auto Scaling Group is correct. Also, ensure that the minimum and maximum number of nodes for the autoscaler correspond to what you configured in the AWS Auto Scaling Group in Step 1 above. The original file downloaded contains the entry

--nodes=1:10:k8s-worker-asg-1

But we want it to look like the line below, matching our AWS Auto Scaling Group:

--nodes=2:4:nodes.cte.cloudtechexperts.com

We will use the Linux sed tool to make the edit. Alternatively, you may just open the file and edit the line manually.

$export MIN_NODES="2"
$export MAX_NODES="4"
$sed -i -e "s|--nodes=.*|--nodes=${MIN_NODES}:${MAX_NODES}:nodes.${DOMAIN_NAME}|g" ./cluster-autoscaler.yaml

Second Change:

Ensure that the certificate path is correct. I changed line 1 below to line 2:
/etc/ssl/certs/ca-certificates.crt
/etc/pki/tls/certs/ca-certificates.crt

After the changes, the command section of your autoscaler file should look like the snippet below. TAKE NOTE of the certificate path /etc/pki/tls/certs/ca-certificates.crt and of the line starting with --nodes.

          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --nodes=2:4:nodes.cte.cloudtechexperts.com
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/pki/tls/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/pki/tls/certs/ca-bundle.crt"

Third Change: Change the image to match the version of the Kubernetes API

Note that the Cluster Autoscaler image version must be compatible with the version of Kubernetes you are using. You can check here which image versions are compatible with your Kubernetes API version. For Kubernetes v1.13.0 we use a Cluster Autoscaler image from the matching 1.13 series (v1.13.1), as you can see below:

      containers:
        - image: k8s.gcr.io/cluster-autoscaler:v1.13.1
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
 

Step 6: Create the Cluster Autoscaler

$kubectl apply -f cluster-autoscaler.yaml

serviceaccount/cluster-autoscaler created
clusterrole.rbac.authorization.k8s.io/cluster-autoscaler created
role.rbac.authorization.k8s.io/cluster-autoscaler created
clusterrolebinding.rbac.authorization.k8s.io/cluster-autoscaler created
rolebinding.rbac.authorization.k8s.io/cluster-autoscaler created
deployment.apps/cluster-autoscaler created

$kubectl get pods -n kube-system
NAME                                                   READY   STATUS    RESTARTS   AGE
cluster-autoscaler-7cfd87ccc-dtqgd                     1/1     Running   0          11h
dns-controller-79cbf4d696-4cct9                        1/1     Running   0          11h
etcd-manager-events-ip-172-20-32-68.ec2.internal       1/1     Running   0          11h
etcd-manager-main-ip-172-20-32-68.ec2.internal         1/1     Running   0          11h
kube-apiserver-ip-172-20-32-68.ec2.internal            1/1     Running   3          11h
kube-controller-manager-ip-172-20-32-68.ec2.internal   1/1     Running   0          11h
kube-dns-685fbb458-52252                               3/3     Running   0          11h
kube-dns-685fbb458-dm5gm                               3/3     Running   0          11h
kube-dns-autoscaler-74887878cc-fqxmp                   1/1     Running   0          11h
kube-proxy-ip-172-20-32-68.ec2.internal                1/1     Running   0          11h
kube-proxy-ip-172-20-39-18.ec2.internal                1/1     Running   0          11h
kube-proxy-ip-172-20-39-189.ec2.internal               1/1     Running   0          11h
kube-scheduler-ip-172-20-32-68.ec2.internal            1/1     Running   0          11h
metrics-server-65d64d6865-2jdql                        1/1     Running   0          11h

Above you can see that the cluster-autoscaler pod is running.

Part 3: Configure Horizontal Pod Autoscaler

Step 1: Create a Kubernetes deployment to be autoscaled

We will use the hpa-example image. The kubectl run command creates a deployment, a replica set, and the pods required to run this container.

$kubectl run php-apache --image=gcr.io/google_containers/hpa-example --requests=cpu=200m --expose --port=80

Note that this command also creates a Service object because of the --expose flag.
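On newer Kubernetes versions, kubectl run no longer creates a Deployment. If you are following along on a recent cluster, a roughly equivalent sequence (a sketch, not part of the original steps) is:

$kubectl create deployment php-apache --image=gcr.io/google_containers/hpa-example
$kubectl set resources deployment php-apache --requests=cpu=200m
$kubectl expose deployment php-apache --port=80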

$ kubectl get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   100.64.0.1      <none>        443/TCP   44m
php-apache   ClusterIP   100.70.86.121   <none>        80/TCP    30s

$kubectl get pods 
NAME                          READY   STATUS    RESTARTS   AGE
php-apache-864967cc4f-tjz87   1/1     Running   0          26s

Step 2: Create a Horizontal Pod Autoscaler

The HPA is created with the ‘autoscale’ command in the line below. This is what performs the autoscaling of the application. It uses CPU utilization as the scaling metric with a threshold of 20%: each time the average CPU utilization across the pods exceeds 20% of their CPU request, more pods are added. With a request of 200m, that threshold corresponds to about 40m of CPU per pod. The minimum number of pods to be maintained is 2 and the maximum is 20. In production, 70% is a more typical threshold, but we use 20% here purely for demo purposes.

$ kubectl autoscale deployment php-apache --cpu-percent=20 --min=2 --max=20
horizontalpodautoscaler.autoscaling/php-apache autoscaled

$kubectl get hpa php-apache
NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   <unknown>/20%   2         20        0          12s

$ kubectl get hpa php-apache
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/20%    2         20        2          55s
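The same autoscaler can also be defined declaratively. A sketch of an equivalent manifest using the autoscaling/v1 API (apply it with kubectl apply -f instead of running kubectl autoscale):

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    kind: Deployment
    name: php-apache
  minReplicas: 2
  maxReplicas: 20
  targetCPUUtilizationPercentage: 20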

Part 4: Test HPA and CA with a simple web-server

Step 1: Let’s start pumping traffic to the service object of the deployment.

In Kubernetes, the Service object acts as the frontend load balancer for the application, and you access the application through it. We will create a pod from the busybox image, open a shell in it, and run an infinite loop that sends traffic to the Service. This will saturate the backend pods behind the Service and drive up their CPU utilization.

$kubectl run -i --tty load-generator --image=busybox /bin/sh

# while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

Step 2: Watch the Scaling of the Pods and the Cluster

From another terminal, log in to the deploy node again, so that the load generator keeps running in the first terminal.

$ssh -i k8skeyclass.pem ec2-user@3.88.132.232
$watch kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/20%    2         20        2          11h

NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   236%/20%   2         20        2          3m51s

NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   236%/20%   2         20        4          4m10s

NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   236%/20%   2         20        8          4m23s

NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   236%/20%   2         20        16         4m31s
NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   123%/20%   2         20        20         4m46s

More pods go into the ‘Pending’ state, as 2 cluster nodes are not sufficient to host all the pods.

[ec2-user@ip-172-31-92-153 ~]$ kubectl get pods
NAME                              READY   STATUS    RESTARTS   AGE
load-generator-557649ddcd-r4ktb   1/1     Running   0          3m19s
php-apache-864967cc4f-28bqg       1/1     Running   0          92s
php-apache-864967cc4f-2xl22       1/1     Running   0          5m1s
php-apache-864967cc4f-2zj6d       0/1     Pending   0          77s
php-apache-864967cc4f-6mncl       0/1     Pending   0          62s
php-apache-864967cc4f-bgs2h       0/1     Pending   0          62s
php-apache-864967cc4f-chhh9       0/1     Pending   0          62s
php-apache-864967cc4f-fmzdn       0/1     Pending   0          62s
php-apache-864967cc4f-jgh55       0/1     Pending   0          47s
php-apache-864967cc4f-kcn9k       1/1     Running   0          6m51s
php-apache-864967cc4f-kvm8l       0/1     Pending   0          47s
php-apache-864967cc4f-kzqgb       0/1     Pending   0          62s
php-apache-864967cc4f-l7dtb       0/1     Pending   0          62s
php-apache-864967cc4f-lqw8n       1/1     Running   0          92s
php-apache-864967cc4f-ndjdp       0/1     Pending   0          62s
php-apache-864967cc4f-p7kv4       0/1     Pending   0          77s
php-apache-864967cc4f-pwh9q       0/1     Pending   0          77s
php-apache-864967cc4f-r49gk       0/1     Pending   0          47s
php-apache-864967cc4f-tftpl       0/1     Pending   0          47s
php-apache-864967cc4f-wfqnc       0/1     Pending   0          77s
php-apache-864967cc4f-zhjkq       0/1     Pending   0          62s

Cluster Autoscaler then scales the number of worker nodes from 2 to 4 as you can see below.

[ec2-user@ip-172-31-92-153 ~]$ kubectl get nodes
NAME                            STATUS   ROLES    AGE   VERSION
ip-172-20-35-155.ec2.internal   Ready    node     21m   v1.13.0
ip-172-20-37-80.ec2.internal    Ready    node     35s   v1.13.0
ip-172-20-43-242.ec2.internal   Ready    node     15m   v1.13.0
ip-172-20-44-99.ec2.internal    Ready    node     38s   v1.13.0
ip-172-20-57-10.ec2.internal    Ready    master   28m   v1.13.0

With more capacity available, the Pending pods are scheduled to the new nodes and their containers start creating.

[ec2-user@ip-172-31-92-153 ~]$ kubectl get pods
NAME                              READY   STATUS              RESTARTS   AGE
load-generator-557649ddcd-r4ktb   1/1     Running             0          3m44s
php-apache-864967cc4f-28bqg       1/1     Running             0          117s
php-apache-864967cc4f-2xl22       1/1     Running             0          5m26s
php-apache-864967cc4f-2zj6d       0/1     Pending             0          102s
php-apache-864967cc4f-6mncl       0/1     ContainerCreating   0          87s
php-apache-864967cc4f-bgs2h       0/1     Pending             0          87s
php-apache-864967cc4f-chhh9       0/1     ContainerCreating   0          87s
php-apache-864967cc4f-fmzdn       0/1     ContainerCreating   0          87s
php-apache-864967cc4f-fpgjv       0/1     Pending             0          9s
php-apache-864967cc4f-jgh55       0/1     ContainerCreating   0          72s
php-apache-864967cc4f-kcn9k       1/1     Running             0          7m16s
php-apache-864967cc4f-kvm8l       0/1     Pending             0          72s
php-apache-864967cc4f-kzqgb       0/1     OutOfcpu            0          87s
php-apache-864967cc4f-l7dtb       0/1     ContainerCreating   0          87s
php-apache-864967cc4f-lqw8n       1/1     Running             0          117s
php-apache-864967cc4f-ndjdp       0/1     OutOfcpu            0          87s
php-apache-864967cc4f-p7kv4       0/1     ContainerCreating   0          102s
php-apache-864967cc4f-pwh9q       0/1     ContainerCreating   0          102s
php-apache-864967cc4f-r49gk       0/1     Pending             0          72s
php-apache-864967cc4f-tftpl       0/1     Pending             0          72s
php-apache-864967cc4f-wfqnc       0/1     Pending             0          102s
php-apache-864967cc4f-zb88l       0/1     Pending             0          6s
php-apache-864967cc4f-zhjkq       0/1     ContainerCreating   0          87s

Now the number of cluster nodes stabilizes at 4, our maxSize for the worker nodes.

[ec2-user@ip-172-31-92-153 ~]$ kubectl get nodes
NAME                            STATUS   ROLES    AGE   VERSION
ip-172-20-35-155.ec2.internal   Ready    node     21m   v1.13.0
ip-172-20-37-80.ec2.internal    Ready    node     35s   v1.13.0
ip-172-20-43-242.ec2.internal   Ready    node     15m   v1.13.0
ip-172-20-44-99.ec2.internal    Ready    node     38s   v1.13.0
ip-172-20-57-10.ec2.internal    Ready    master   28m   v1.13.0

The number of pods also stabilizes. Some pods remain in the Pending state, but since we set the maximum number of cluster nodes to 4, no more nodes will be added.

[ec2-user@ip-172-31-92-153 ~]$ kubectl get pods
NAME                              READY   STATUS     RESTARTS   AGE
load-generator-557649ddcd-r4ktb   1/1     Running    0          4m37s
php-apache-864967cc4f-28bqg       1/1     Running    0          2m50s
php-apache-864967cc4f-2xl22       1/1     Running    0          6m19s
php-apache-864967cc4f-2zj6d       0/1     Pending    0          2m35s
php-apache-864967cc4f-6mncl       1/1     Running    0          2m20s
php-apache-864967cc4f-bgs2h       0/1     Pending    0          2m20s
php-apache-864967cc4f-chhh9       1/1     Running    0          2m20s
php-apache-864967cc4f-fmzdn       1/1     Running    0          2m20s
php-apache-864967cc4f-fpgjv       0/1     Pending    0          62s
php-apache-864967cc4f-jgh55       1/1     Running    0          2m5s
php-apache-864967cc4f-kcn9k       1/1     Running    0          8m9s
php-apache-864967cc4f-kvm8l       0/1     Pending    0          2m5s
php-apache-864967cc4f-kzqgb       0/1     OutOfcpu   0          2m20s
php-apache-864967cc4f-l7dtb       1/1     Running    0          2m20s
php-apache-864967cc4f-lqw8n       1/1     Running    0          2m50s
php-apache-864967cc4f-ndjdp       0/1     OutOfcpu   0          2m20s
php-apache-864967cc4f-p7kv4       1/1     Running    0          2m35s
php-apache-864967cc4f-pwh9q       1/1     Running    0          2m35s
php-apache-864967cc4f-r49gk       0/1     Pending    0          2m5s
php-apache-864967cc4f-tftpl       0/1     Pending    0          2m5s
php-apache-864967cc4f-wfqnc       0/1     Pending    0          2m35s
php-apache-864967cc4f-zb88l       0/1     Pending    0          59s
php-apache-864967cc4f-zhjkq       1/1     Running    0          2m20s

Scaling Down:

Let’s stop the load generator:

OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK! [output truncated] ^C
/ #

Now that the load has stopped, the HPA will scale the deployment back down and the cluster should eventually return to 2 nodes. Note that scale-down is not immediate: the HPA waits out its downscale stabilization window (5 minutes by default), and the Cluster Autoscaler removes a node only after it has been underutilized for a while (about 10 minutes by default).
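You can watch both the pods and the nodes scale down from the deploy node (a simple observation loop; nothing here changes the cluster):

$watch kubectl get hpa,nodes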

Conclusions

In this tutorial, I have shown you how to autoscale a microservices application running on Kubernetes, as well as how to autoscale the Kubernetes cluster itself. To follow this tutorial successfully, you need good knowledge of the AWS cloud, the Kubernetes platform, and its ecosystem. That being said, if you follow it carefully, you will be able to get this up and running. I hope you enjoyed it; feel free to share it with colleagues.
