Managing individual Pods demands a lot of effort. Kubernetes (K8s) provides built-in APIs for workload objects (the applications running on K8s) that the control plane manages automatically. These built-in APIs include Deployment, ReplicaSet and DaemonSet, and the corresponding objects monitor Pod health, the number of replicas, the resources available, and more.
Scalability
Application scalability is the potential of an application to grow over time, efficiently handling more and more requests per minute (RPM) [1]. This means that as an application is required to handle more traffic, more resources must be allocated to it, and as the RPM decreases, the amount of resources allocated to it must also decrease. This ability to increase and/or decrease resources is called Application Scalability.
In K8s the application executes in a Container and the Container executes in a Pod. Pods are the K8s objects that are scaled, and Deployments are the K8s objects that manage the scaling activity.
Fig. 1 - Deployment Architecture
Horizontal and Vertical Scaling
Horizontal Scaling is about adding more Pod instances, while Vertical Scaling is about adding CPU and memory to the existing instances.
Fig. 2 - Horizontal and Vertical Scaling
The Pod is the K8s object to which scaling is applied. In K8s, it is also possible to use Auto Scaling for both types. You can check [2], [3] and [4] for a more detailed reading.
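As an illustration of horizontal autoscaling, the minimal sketch below defines a HorizontalPodAutoscaler (autoscaling/v2 API) that keeps a hypothetical Deployment named my-frontend between 2 and 10 replicas based on average CPU utilization; the names and thresholds are illustrative assumptions, not objects used elsewhere in this article.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-frontend-hpa # hypothetical name
spec:
  scaleTargetRef: # the workload this autoscaler scales
    apiVersion: apps/v1
    kind: Deployment
    name: my-frontend # hypothetical Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70 # target average CPU utilization across Pods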
The type of scaling to use depends on the type of application:
Stateless Set - for services that don't have state, like frontend. They only process requests and do not store any data. For this type, both Vertical and Horizontal Scaling can be applied.
Stateful Set - for services that have state, like databases. They process requests, but they also store data. In most cases it is not advisable to apply Horizontal Scaling, because it would replicate an entire database, so for this kind of application Vertical Scaling is the most advisable (a sketch of what vertical scaling changes follows this list).
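In practice, Vertical Scaling means raising the CPU and memory requests/limits of the containers in the Pod template; the fragment below is a minimal sketch with a hypothetical database container and illustrative values.
spec:
  containers:
  - name: my-database # hypothetical stateful container
    image: postgres:16
    resources:
      requests: # vertical scaling raises these values
        cpu: "500m"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "2Gi"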
ReplicationController
A ReplicationController ensures that a specified number of pod replicas are running at any one time. In other words, a ReplicationController makes sure that a pod or a homogeneous set of pods is always up and available [5].
Code Block 1 shows an example of a manifest of a ReplicationController that will create 3 pods (spec.replicas).
apiVersion: v1
kind: ReplicationController
metadata:
  name: my-alpine-replicationcontroller # name of the controller
spec:
  replicas: 3 # number of pods as defined in the template section
  selector: # this is a Pod selector
    # identifies the pods the ReplicationController monitors for status
    app: my-alpine-rc
  template: # the Pod's definition for creation
    metadata: # metadata to apply to each pod
      name: alpine
      labels:
        app: my-alpine-rc
    spec:
      containers:
      - name: my-alpine
        image: alpine
        command: ["sleep", "3600"]
Code Block 1 - ReplicationController manifest example
The most important parts of a ReplicationController are:
.spec.replicas - the number of Pods to start and keep running;
.spec.selector - the ReplicationController manages all the Pods with labels that match the selector;
.spec.template - the Pod's definition for creation;
.spec.template.metadata.labels - the labels to give to each Pod. In order for their status to be tracked by the ReplicationController, the labels must match the .spec.selector;
.spec.template.spec.containers - the containers a Pod will have. Note that one Pod can have many containers, but all containers share the resources available to the Pod.
Lab - Create a ReplicationController and scale the number of pods
To create a ReplicationController just run kubectl apply -f <file_name.yaml>.
To check the status of the ReplicationController in Code Block 1:
> kubectl get replicationcontroller/my-alpine-replicationcontroller
NAME DESIRED CURRENT READY AGE
my-alpine-replicationcontroller 3 3 3 38s
Code Block 2 - Status of my-alpine-replicationcontroller with 3 Pods
The output shows 4 counters that describe the ReplicationController's status:
DESIRED - the number of Pods declared in .spec.replicas;
CURRENT - the number of Pods running in the ReplicationController;
READY - the number of Pods in the Ready state in the ReplicationController;
AGE - the time since this ReplicationController was created.
Scale to 5 Pods by changing .spec.replicas in the manifest to 5 and running kubectl apply -f <file_name.yaml> again. Checking the status once more, DESIRED and CURRENT already show 5, but the 2 new Pods have not yet been marked Ready by the kubelet.
> kubectl get replicationcontroller/my-alpine-replicationcontroller
NAME DESIRED CURRENT READY AGE
my-alpine-replicationcontroller 5 5 3 9m11s
Code Block 3 - Status of my-alpine-replicationcontroller with 5 Pods
As time passes and the kubelet runs its probe mechanisms on the 2 new Pods, the ReplicationController will also report READY with 5 Pods.
> kubectl get replicationcontroller/my-alpine-replicationcontroller
NAME DESIRED CURRENT READY AGE
my-alpine-replicationcontroller 5 5 5 24m
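Instead of editing the manifest, the same scaling operation can be performed imperatively with kubectl scale, for example against the ReplicationController from Code Block 1:
> kubectl scale replicationcontroller/my-alpine-replicationcontroller --replicas=5
replicationcontroller/my-alpine-replicationcontroller scaled
Note that a later kubectl apply of the original manifest would set the replica count back to the value declared in the file.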
ReplicaSet
ReplicaSet is an enhanced version of the ReplicationController. While in the ReplicationController the .spec.selector must match .spec.template.metadata.labels exactly, the ReplicaSet uses an enhanced .spec.selector that supports match expressions and conditional operators - In, NotIn, Exists [7].
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: frontend
  labels: # these labels belong to the ReplicaSet itself
    app: guestbook
    tier: frontend
spec:
  # modify replicas according to your case
  replicas: 3
  selector:
    matchExpressions: # an enhanced selector matching the Pods' labels
    - {key: tier, operator: In, values: [frontend]}
  template:
    metadata:
      labels: # labels defined under the template section
              # are the labels configured on the Pods
        tier: frontend
    spec:
      containers:
      - name: php-redis
        image: gcr.io/google_samples/gb-frontend:v3
Code Block 4 - Example of a ReplicaSet manifest file [6]
When setting label selectors, be careful: a selector that is too broad may select unwanted Pods, leading to issues that are difficult to resolve.
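Before applying a ReplicaSet, it can help to check which existing Pods the selector would already match; a minimal check, assuming the labels used in Code Block 4:
> kubectl get pods -l 'tier in (frontend)' --show-labels
If this lists Pods that the new ReplicaSet should not own, adjust the selector or the Pods' labels before applying the manifest.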
Deployment Object
The Deployment object is at a higher level than the ReplicaSet. Its aim is to manage both ReplicaSet and Pod states.
One of the advantages of a Deployment is that, when creating a new deployment or updating an existing one, a new revision is created, making it possible to perform a rollback if needed. Another advantage is that it performs rolling updates in a controlled manner: it creates the Pods with the new application version one by one, and as soon as a new Pod reaches the Ready state, it kills one existing Pod, replacing it with the new one [8]. The Kubernetes official documentation offers an excellent lesson on this subject [9].
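As a minimal sketch of such a Deployment (the name, labels and image are illustrative, reusing the frontend example from Code Block 4):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
      - name: php-redis
        image: gcr.io/google_samples/gb-frontend:v3
After an update, kubectl rollout history deployment/frontend lists the recorded revisions, and kubectl rollout undo deployment/frontend rolls back to the previous one.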
Deployment Strategies
When creating a new version of an application, there is the need to deploy the application into several environments until it reaches the Production environment. Depending on the environment or the application's purpose, the deployment strategy to apply is something to reflect on.
Recreate Deployment
This strategy is natively supported by K8s. When deploying a new application version, K8s terminates all existing Pods and, only after all Pods have been terminated, initiates all Pods with the new application version.
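Configuring it only requires setting the strategy type on the Deployment; a minimal fragment:
spec:
  strategy:
    type: Recreate
This implies downtime between the termination of the old Pods and the readiness of the new ones, so it is usually reserved for non-production environments or for applications that cannot run two versions at the same time.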
Rolling Update Deployment
Rolling update is the default deployment strategy on K8s. It allows Deployment updates to take place with zero downtime by incrementally replacing Pod instances with new ones. The new Pods are scheduled on Nodes with available resources.
Fig. 3 - How Rolling Update works
The creation of a new ReplicaSet version is triggered by a new image version. As soon as the new ReplicaSet is created, its new Pod is initiated; when the new Pod reaches the Ready state it starts to receive new requests, while the Pod from the older ReplicaSet stops receiving new connections and is terminated.
In K8s the Rolling Update deployment can be set up with 2 optional parameters [9]:
MaxSurge - an optional field that specifies the maximum number of Pods that can be created over the desired number of Pods. The value can be an absolute number (for example, 5) or a percentage of desired Pods (for example, 10%). The absolute number is calculated from the percentage by rounding up. The value cannot be 0 if .spec.strategy.rollingUpdate.maxUnavailable is 0. The default value is 25%. For example, when this value is set to 30%, the new ReplicaSet can be scaled up immediately when the rolling update starts, such that the total number of old and new Pods does not exceed 130% of the desired Pods.
MaxUnavailable - an optional field that specifies the maximum number of Pods that can be unavailable during the update process. The value can be an absolute number (for example, 5) or a percentage of desired Pods (for example, 10%). The absolute number is calculated from the percentage by rounding down. The value cannot be 0 if .spec.strategy.rollingUpdate.maxSurge is 0. The default value is 25%. For example, when this value is set to 30%, the old ReplicaSet can be scaled down to 70% of desired Pods immediately when the rolling update starts. Once new Pods are ready, the old ReplicaSet can be scaled down further, followed by scaling up the new ReplicaSet, ensuring that the total number of Pods available at all times during the update is at least 70% of the desired Pods.
At least one of these parameters must be larger than zero.
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
Code Block 5 - Setting Rolling Deployment
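With replicas: 5, maxSurge: 1 and maxUnavailable: 0, at most 6 Pods exist during the update and at least 5 are always available: K8s starts one new Pod, waits for it to become Ready, terminates one old Pod, and repeats. A rolling update can then be triggered and watched, for example (the Deployment and container names reuse the illustrative frontend example above, and the new image tag is an assumption):
> kubectl set image deployment/frontend php-redis=gcr.io/google_samples/gb-frontend:v4
> kubectl rollout status deployment/frontend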
Other Deployment Strategies
It is possible to use other deployment strategies like Canary or Blue/Green on K8s with the help of 3rd party tools like Argo Rollouts or GlooEdge.
References
[1] “Application Scalability — How To Do Efficient Scaling - DZone,” dzone.com. https://dzone.com/articles/application-scalability-how-to-do-efficient-scalin (accessed Dec. 17, 2023).
[2] Kubernetes, “Horizontal Pod Autoscaler,” Kubernetes. https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/ (accessed Dec. 17, 2023).
[3] Kubecost, “The Guide To Kubernetes VPA by Example,” kubecost.com. https://www.kubecost.com/kubernetes-autoscaling/kubernetes-vpa/ (accessed Dec. 17, 2023).
[4] P. Abbassi, “Vertical autoscaling in Kubernetes,” Giant Swarm, giantswarm.io, May 04, 2021. https://www.giantswarm.io/blog/vertical-autoscaling-in-kubernetes (accessed Dec. 17, 2023).
[5] kubernetes, “ReplicationController,” Kubernetes, May 18, 2023. https://kubernetes.io/docs/concepts/workloads/controllers/replicationcontroller/ (accessed Dec. 17, 2023).
[6] “ReplicaSet,” Kubernetes, Aug. 24, 2023. https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/ (accessed Dec. 27, 2023).
[7] “Labels and Selectors,” Kubernetes, Sep. 01, 2023. https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ (accessed Dec. 27, 2023).
[8] “Kubernetes Rolling Deployment: A Practical Guide,” Codefresh. https://codefresh.io/learn/software-deployment/kubernetes-rolloing-deployment-a-practical-guide/ (accessed Dec. 27, 2023).
[9] “Deployments,” Kubernetes, Sep. 06, 2023. https://kubernetes.io/docs/concepts/workloads/controllers/deployment/ (accessed Dec. 28, 2023).