Kubernetes

Pronounced: koo-ber-net-eez

Kubernetes is an orchestrator for microservice apps using Docker containers.

Kubernetes was created by Google, is written in Go, and is one of the biggest open source infrastructure projects. Google was running on containers long before Docker. To manage them, Google created proprietary in-house container management systems called Borg and Omega. In Greek, kubernetes means helmsman, the person who steers the ship. Kubernetes is also known as k8s. K8s has been on GitHub since 2014.

 

Architecture

Kubernetes coordinates a highly available cluster of computers that are connected to work as a single unit. The abstractions in Kubernetes allow you to deploy containerized applications to a cluster without tying them specifically to individual machines. A Kubernetes cluster consists of two types of resources:

  • The Master coordinates the cluster
  • Nodes are the workers that run applications

A cluster has masters (one or many) and nodes (one or many). The masters are in charge and determine which nodes do the work; the nodes do the actual work. Applications are deployed to the cluster as Kubernetes Deployments, defined in YAML files. When an application is deployed as a Kubernetes Deployment, the Master determines which Nodes to use for it.

Once you have a running Kubernetes cluster, you can deploy your containerized applications on top of it. To do so, you create a Kubernetes Deployment configuration. The Deployment instructs Kubernetes how to create and update instances of your application. Once you’ve created a Deployment, the Kubernetes master schedules the application instances onto individual Nodes in the cluster.

  • Kubelet = main Kubernetes agent
  • Container engine = Docker
  • kube-proxy = Kubernetes networking

K8s works declaratively. The Kubernetes deployment file is a high-level manifest: it simply declares the desired state, not how the implementation is to be done. Kubernetes figures out the details of how to meet that desired state. For example, if the desired state is 3 pods running on 3 nodes and one of those nodes were to go down, K8s would automatically spin up a new pod on one of the remaining nodes.
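As a sketch of this declarative model, a manifest simply states the desired replica count and lets Kubernetes maintain it (names and image here are assumed for illustration):

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: hello-rc            # assumed name for illustration
spec:
  replicas: 3               # desired state: always keep 3 pods running
  selector:
    app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: hello
        image: sampleimages/helloimage:latest   # assumed image
```

If a node hosting one of these pods goes down, Kubernetes notices the actual state (2 pods) no longer matches the desired state (3) and schedules a replacement on a remaining node.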

Master

In a simple example, a master would run on a single server. It usually runs on top of Linux. A master is composed of the following parts:

  • Kube-apiserver – is the front-end REST API to the master controls. By default this is exposed on port 443. It handles commands (taken as JSON) and also authentication.
    • Manages pods
    • Scales pods – replication controller
    • Does deployments
    • Manages services
  • Cluster store – a persistent storage that keeps track of state and config. It uses etcd.
  • Controller manager – a controller that manages other controllers.
  • Scheduler – watches apiserver for new pods and assigns work to the nodes.

Nodes

A Pod always runs on a Node. A Node is a worker machine in Kubernetes and may be either a virtual or a physical machine, depending on the cluster. Each Node is managed by the Master. A Node can have multiple pods, and the Kubernetes master automatically handles scheduling the pods across the Nodes in the cluster. The Master’s automatic scheduling takes into account the available resources on each Node.

Every Kubernetes Node runs at least:

  • Kubelet, a process responsible for communication between the Kubernetes Master and the Node; it manages the Pods and the containers running on a machine.
  • A container runtime (like Docker, rkt) responsible for pulling the container image from a registry, unpacking the container, and running the application.

Containers should only be scheduled together in a single Pod if they are tightly coupled and need to share resources such as disk.

 

Pods

When you create a Deployment, Kubernetes creates a Pod to host your application instance. A Pod is a Kubernetes abstraction that represents a group of one or more application containers (such as Docker or rkt), and some shared resources for those containers. Those resources include:

  • Shared storage, as Volumes
  • Networking, as a unique cluster IP address
  • Information about how to run each container, such as the container image version or specific ports to use

A Pod models an application-specific “logical host” and can contain different application containers which are relatively tightly coupled.

A Pod is a ring-fenced environment that allows the running of containers. It shares a network stack and storage: containers running in the same pod have localhost access to the other containers as well as shared storage space. Pods that are running inside Kubernetes are running on a private, isolated network. By default they are visible from other pods and services within the same Kubernetes cluster, but not outside that network.

From a layered view from top to bottom, it looks like:

  • container
  • pod
  • vm
  • server

Containers always run inside of pods, and a pod can have multiple containers. Pods are intended for tightly coupled containers. Also, pods are what get scaled, so the architecture design needs to consider the whole pod being replicated under scaling requirements. Under scaling, the replicated pods are direct clones, though the master would assign new IPs to them.
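A sketch of a two-container pod (names and images assumed): because both containers share the pod's network stack, the proxy can reach the app over localhost.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-proxy      # assumed name for illustration
spec:
  containers:
  - name: app               # main application container
    image: sampleimages/helloimage:latest
    ports:
    - containerPort: 8080
  - name: proxy             # sidecar reaches the app via localhost:8080
    image: nginx
    ports:
    - containerPort: 80
```

Scaling replicates this whole pod as a unit, app and proxy together.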

Services

Services help with the management of Pods. Since pods have their own IP addresses, those addresses can change as pods are scaled or replaced. To manage this we have services, which provide constant IP addresses.

Services use labels to determine these links. A service's label selector is tied to specific pods, so when other pods communicate through the service, it can determine the route to the appropriate pods. Labels not only help distinguish between different pods but also support versioning of a single pod.

nginx

Nginx is a web server used as a reverse proxy, load balancer, and HTTP cache. It is often run within a Pod as a proxy in front of the other containers in the Pod.

NGINX is a free, open-source, high-performance HTTP server and reverse proxy, as well as an IMAP/POP3 proxy server. NGINX is known for its high performance, stability, rich feature set, simple configuration, and low resource consumption.

NGINX is one of a handful of servers written to address the C10K problem. Unlike traditional servers, NGINX doesn’t rely on threads to handle requests. Instead it uses a much more scalable event-driven (asynchronous) architecture. This architecture uses small, but more importantly, predictable amounts of memory under load. Even if you don’t expect to handle thousands of simultaneous requests, you can still benefit from NGINX’s high-performance and small memory footprint. NGINX scales in all directions: from the smallest VPS all the way up to large clusters of servers.

 

Installation

There are many ways to install and set up Kubernetes. From low to high complexity, these tools can be used:

  • Minikube
  • Google Container Engine
  • kops (AWS)
  • kubeadm (manual)

For a local dev environment we usually set up minikube and kubectl. The installation process can be found here:

https://kubernetes.io/docs/tasks/tools/install-kubectl/

Note that installing a Docker engine may already include kubectl (at least Docker CE on Windows bundles it).

Once Kubernetes is installed we can interact with it using the command: kubectl

kubectl create -f pods/mysample.yaml

 

Pods

Pods contain one or more containers. These containers are tightly coupled, so they should serve the same logical function in the application. They share volumes, namespaces, and an IP. A single pod has a single IP address exposing it. Within the pod there can be multiple ports pointing to the containers. Pods are created using pod configuration files, which describe which containers to include and the network settings.

To view active pods, we can run the following commands:

kubectl get pods
kubectl get pods/hellopod
kubectl get pods --all-namespaces

A sample pod manifest yaml file:

apiVersion: v1
kind: Pod
metadata:
  name: hellopod
  labels:
    zone: prod
    version: v1
spec:
  containers:
  - name: hellocontainer
    image: sampleimages/helloimage:latest
    ports:
    - containerPort: 8080

We can create the pod using the following command.

kubectl create -f pods/mysample.yaml

Pods cannot be reached from outside the cluster. In order to reach a pod we must set up a port-forward on the cluster. This can be done with:

kubectl port-forward mysample 10080:80

To interact within the pod, we can run an exec command to launch an interactive shell.

kubectl exec mysample --stdin --tty -c mysample -- /bin/sh

To delete a pod:

kubectl delete pods hellopod

 

Replication Controller

Replication controllers are like wrappers around Pods. A ReplicationController ensures that a specified number of pod replicas are running at any one time. In other words, a ReplicationController makes sure that a pod or a homogeneous set of pods is always up and available. They can manage the creation/replication of Pods so that we do not have to directly interact with them. ReplicationController is often abbreviated to “rc” or “rcs” in discussion, and as a shortcut in kubectl commands.

This example ReplicationController config runs three copies of the nginx web server.

apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80

Run the example by saving the config to replication.yaml and then running this command:

$ kubectl create -f replication.yaml
replicationcontroller "nginx" created

 

Services

Services are used for managing multiple pods. A service uses labels to identify its pods, and its address never changes. There are three types of service network addressing:

  • cluster ip = stable internal cluster ip, accessible internal to cluster only
  • node port = a cluster-wide port that goes over cluster ip for external access
  • load balancing = integrates NodePort with cloud-based load balancers

A sample service configuration file:

kind: Service
apiVersion: v1
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  type: NodePort
  ports:
  - protocol: TCP
    port: 80
    nodePort: 30001

A service can be created from a configuration file. Within it there is a selector that defines the pods it manages, and the node ports. It can be created with:

kubectl create -f services/myservice.yaml

To delete a service we execute:

kubectl delete svc hello-svc

Labels

Labels are key-value pairs that are put on pods and services. These are essential for identifying pods. They are also powerful when handling versioning. For example, we can migrate to a newer version by simply updating the version label on the service. The older versions are still there in those pods, so if there is a problem migrating, we can easily roll back to the original versions.
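A sketch of this versioning approach (names assumed): the service selects pods by both app and version label, so migrating is just updating the version in the selector, and rolling back is changing it back.

```yaml
kind: Service
apiVersion: v1
metadata:
  name: hello-svc           # assumed name for illustration
spec:
  selector:
    app: hello
    version: v2             # change back to v1 to roll back to the older pods
  ports:
  - protocol: TCP
    port: 80
```

The v1 pods keep running untouched; only the service's routing changes.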

Service Discovery

Every Service defined in the cluster (including the DNS server itself) is assigned a DNS name. By default, a client Pod’s DNS search list will include the Pod’s own namespace and the cluster’s default domain. This is best illustrated by example:

Assume a Service named foo in the Kubernetes namespace bar. A Pod running in namespace bar can look up this service by simply doing a DNS query for foo. A Pod running in namespace quux can look up this service by doing a DNS query for foo.bar.

 

Deployment

Applications are deployed to Kubernetes clusters using Kubernetes Deployments. Deployments are configured using a manifest YAML file. During the deployment process the Master's apiserver creates the replicas: a new deployment creates a new replica set with pods within it.

You can create and manage a Deployment by using the Kubernetes command line interface, kubectl. Kubectl uses the Kubernetes API to interact with the cluster. Like nodes and services, deployments are configured using configuration YAML files.
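A minimal Deployment manifest for mydeploy.yaml might look like the following sketch (name and image are assumed; the apiVersion can vary with the cluster version):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mydeploy            # assumed name for illustration
spec:
  replicas: 3               # desired number of pod replicas
  selector:
    matchLabels:
      app: hello
  template:                 # pod template the replicas are cloned from
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: hello
        image: sampleimages/helloimage:latest   # assumed image
        ports:
        - containerPort: 8080
```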

kubectl create -f deployments/mydeploy.yaml

To apply a change to an existing deployment:

kubectl apply --record -f deployments/mydeploy.yaml

As with any other Kubernetes manifest files, these should be source controlled.

Secrets

Any credentials or sensitive/secret data can be stored at the pod level instead of within the containers. This avoids issues with secrets getting stored with container source code configuration files. Secrets can be created like:

kubectl create secret generic tls-certs --from-file=tls
kubectl describe secrets tls-certs
kubectl create configmap nginx-proxy-conf --from-file=nginx/proxy.conf
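Once created, a secret can be consumed from a pod, for example mounted as a volume (a sketch; pod and container names assumed):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod          # assumed name for illustration
spec:
  containers:
  - name: app
    image: sampleimages/helloimage:latest   # assumed image
    volumeMounts:
    - name: certs
      mountPath: /etc/tls   # secret files appear here, not baked into the image
      readOnly: true
  volumes:
  - name: certs
    secret:
      secretName: tls-certs # the secret created above
```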

Rolling Updates

When pods require updates we can use rolling updates through a service; the service handles the traffic during the updates. This is configured in the deployment configuration file, via the rolling update strategy (shown as RollingUpdateStrategy by kubectl describe).
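In the Deployment spec this looks roughly like the following (values assumed):

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1     # at most 1 pod down during the update
      maxSurge: 1           # at most 1 extra pod above the desired count
```

With these values, Kubernetes replaces pods one at a time, keeping the service serving traffic throughout the update.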

 

Kubernetes vs Docker Swarm

In the container and microservices model, you are constantly starting containers. The typical way of using containers does not restart a sleeping container, because the container is disposable. Orchestrators (like Docker Swarm, Kubernetes, DCOS or Azure Service Fabric) simply create new instances of images. What this means is that you would need to optimize by precompiling the application when it is built so the instantiation process will be faster. When the container is started, it should be ready to run. You should not restore and compile at run time using the dotnet restore and dotnet build commands from the dotnet CLI, as you see in many blog posts about .NET Core and Docker.

Kubernetes:

  • Created by Google, maintained by CNCF
  • Larger user base and contributors
  • Ideal when handling large sets of containers
  • Containers run as a set in what is called a POD
  • As of this post date – there is a GUI
  • Easy to scale and deploy
  • Auto scaling
  • Built in logging and monitoring
  • Manual load balancing
  • Rolling updates with automatic rollback
  • Storage is shared among containers in same POD

Swarm:

  • Created by Docker
  • Part of Docker CLI
  • Ideal when handling smaller sets of containers
  • As of this post date – there is no GUI
  • Must use third party tool for logging and monitoring
  • Manual scaling
  • Auto load balancing
  • Rolling updates with manual rollback
  • Storage can be shared among containers in the same node (server)

 

 

References

Kubernetes
https://kubernetes.io

Getting Started with Kubernetes
https://app.pluralsight.com/library/courses/getting-started-kubernetes/

Scalable Microservices with Kubernetes
https://classroom.udacity.com/courses/ud615/

Kubernetes in 8 minutes
https://www.youtube.com/watch?v=TlHvYWVUZyc