The Need for Container Orchestration

The k8s-for-beginners container we built in Exercise 1.01, Creating a Docker Image and Uploading it to Docker Hub, is nothing but a simple demonstration. For a serious workload deployed in a production environment, where hundreds of thousands of containers may run in a cluster, there are many more things to consider. We need a system to manage the following problems:

Container Interactions

As an example, suppose that we are going to build a web app with a frontend container that displays information and accepts user requests, and a backend container that serves as a datastore and interacts with the frontend container. The first challenge is to figure out how to specify the address of the backend container to the frontend container. Hardcoding the IP is not a good idea, as the container IP is not static. In a distributed system, it is not uncommon for containers or machines to fail due to unexpected issues, so the link between any two containers must be discoverable and reliable across all the machines. The second challenge is that we may want to limit which containers (for example, the backend container) can be accessed by which other containers (for example, its corresponding frontend ones).
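As a preview of how Kubernetes typically tackles both challenges, the following minimal sketch pairs a Service, which gives the backend a stable DNS name so that the frontend never hardcodes an IP, with a NetworkPolicy that only admits traffic from the frontend. The names, labels, and port (backend, frontend, app: backend, 5432) are illustrative assumptions, not part of the exercise:

# A Service gives the backend containers a stable DNS name ("backend"),
# so the frontend never needs a hardcoded IP.
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  selector:
    app: backend          # route traffic to any healthy pod with this label
  ports:
  - port: 5432
    targetPort: 5432
---
# A NetworkPolicy restricts which containers may reach the backend:
# here, only pods labeled app: frontend are allowed in.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend-only
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend

With this in place, the frontend simply connects to backend:5432, and the cluster keeps that name resolving to healthy replicas on whichever machines they land.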

Network and Storage

All the examples that we gave in the previous sections used containers running on the same machine. This is pretty straightforward, as the underlying Linux namespace and cgroup technologies were designed to work within a single OS entity. If we want to run thousands of containers in a production environment, which is pretty common, we have to resolve the network connectivity issue to ensure that containers on different machines are able to connect with each other. On the other hand, local or temporary on-disk storage doesn't work for all workloads. Applications may need their data to be stored remotely and available to be mounted at will on whichever machine in the cluster the container runs, whether the container is starting up for the first time or restarting after a failure.
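To illustrate the storage half of this problem, here is a minimal sketch of how Kubernetes decouples data from the machine: a PersistentVolumeClaim requests storage from the cluster, and a pod mounts that claim wherever it happens to be scheduled. The claim name, image, size, and mount path below are hypothetical:

# A PersistentVolumeClaim requests storage from the cluster; the claim
# (and the data behind it) outlives any single container and can be
# mounted on whichever machine the pod is scheduled to.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: datastore
spec:
  containers:
  - name: datastore
    image: redis            # hypothetical datastore image
    volumeMounts:
    - name: data
      mountPath: /data      # where redis keeps its data on disk
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-claim

If the pod is later rescheduled to another machine, the same claim, and the data behind it, follows it there.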

Resource Management and Scheduling

We have seen that a container leverages Linux cgroups to manage its resource usage. A modern resource manager therefore needs to build an easy-to-use resource model that abstracts resources such as CPU, RAM, disk, and GPU. We need to manage a large number of containers efficiently, and to provision and free up resources in time so as to achieve high cluster utilization.
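As a sketch of such a resource model, Kubernetes lets each container declare requests, which the scheduler reserves, and limits, which are enforced through cgroups at runtime. The pod name, image, and the specific numbers below are illustrative assumptions:

apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
  - name: web
    image: nginx             # hypothetical image
    resources:
      requests:              # what the scheduler reserves for this container
        cpu: 250m            # a quarter of a CPU core
        memory: 256Mi
      limits:                # the ceiling enforced via cgroups at runtime
        cpu: 500m
        memory: 512Mi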

Scheduling involves assigning an appropriate machine in the cluster for each of our workloads to run on. We will take a closer look at scheduling as we proceed further in this book. To ensure that each container runs on the most suitable machine, the scheduler (the Kubernetes component that takes care of scheduling) needs a global view of the distribution of all containers across the different machines in the cluster. Additionally, in large data centers, containers need to be distributed based on the physical locations of the machines or the availability zones of the cloud provider. For example, if all the containers supporting a service are allocated to the same physical machine, and that machine happens to fail, the service will experience an outage regardless of how many replicas of the containers you had deployed.
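The sketch below hints at how such a placement constraint can be expressed declaratively in Kubernetes: a hypothetical Deployment named web asks the scheduler to spread its replicas across availability zones so that a single zone or machine failure cannot take out every replica:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      # Spread the replicas across availability zones so that the loss
      # of a single zone (or machine) cannot take out every replica.
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: web
      containers:
      - name: web
        image: nginx        # hypothetical image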

Failover and Recovery

Application or machine errors are quite common in a distributed system, so we must account for container and machine failures. When a container encounters a fatal error and exits, it should be restarted on the same machine, or on another suitable machine that is available. We should be able to detect machine faults or network partitions so that containers can be rescheduled from problematic machines to healthy ones. Moreover, the reconciliation process should be autonomous, to ensure that the application is always running in its desired state.
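As a sketch of failure detection at the container level, the following hypothetical pod declares a liveness probe, so Kubernetes restarts the container when the probe fails, not only when the process exits. The /healthz endpoint is an assumption about the application:

apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  restartPolicy: Always       # restart the container whenever it exits
  containers:
  - name: web
    image: nginx              # hypothetical image
    livenessProbe:            # detect a hung process, not just a crashed one
      httpGet:
        path: /healthz        # hypothetical health-check endpoint
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10

Machine-level failover works at a higher layer: wrapping pods in a controller such as a Deployment lets the cluster recreate lost replicas on healthy machines automatically.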

Scalability

As demand increases, you may want to scale up an application. Take a web frontend application as an example. We may need to run several replicas of it and use a load balancer to distribute the incoming traffic evenly among the many replicas of containers supporting the service. To take this a step further, depending on the volume of incoming requests, you may want the application to scale dynamically, either horizontally (by running more or fewer replicas) or vertically (by allocating more or fewer resources). This takes the difficulty of system design to another level.
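Here is a sketch of dynamic horizontal scaling in Kubernetes: a HorizontalPodAutoscaler watches the average CPU utilization of a hypothetical Deployment named web and adjusts the replica count between the given bounds:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                    # the hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # add replicas when average CPU passes 70%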

Service Exposure

Suppose we've tackled all the challenges mentioned previously; that's to say, all things are working great within the cluster. Well, here comes another challenge: how can the applications be accessed externally? On one hand, the external endpoint needs to be associated with the underlying on-premises or cloud environment so that it can leverage the infrastructure's API to remain always accessible. On the other hand, to keep traffic flowing, the external endpoint needs to be associated with the internal backing replicas dynamically – any unhealthy replicas need to be taken out and backfilled automatically to ensure that the application remains online. Moreover, L4 (TCP/UDP) and L7 (HTTP, HTTPS) traffic have different characteristics and, therefore, need to be treated in slightly different ways to ensure efficiency. For example, HTTP header information (such as the Host header) can be used to reuse the same public IP to serve multiple backend applications.
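As a sketch of the L7 case, the following hypothetical Ingress uses the HTTP Host header to route two hostnames, sharing one public IP, to two different backing Services (app1, app2, and the hostnames are assumptions):

# Two hostnames share one public IP; the Host header decides which
# backend Service receives each request.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: public-entry
spec:
  rules:
  - host: app1.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: app1
            port:
              number: 80
  - host: app2.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: app2
            port:
              number: 80

For plain L4 traffic, a Service of type LoadBalancer plays the equivalent role by asking the underlying infrastructure for an externally reachable endpoint.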

Delivery Pipeline

From a system administrator's point of view, a healthy cluster must be monitorable, operable, and autonomous in responding to failures. This requires the applications deployed onto the cluster to follow a standardized, configurable delivery pipeline so that they can be managed well at different phases, as well as in different environments.
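As a sketch of how one delivery step can be declared rather than scripted, the hypothetical Deployment below encodes its rollout policy: Kubernetes replaces replicas gradually, keeping the application available while a new image version is delivered:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1     # at most one replica down during a rollout
      maxSurge: 1           # at most one extra replica above the desired count
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25   # bump this tag and re-apply to roll out a new version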

An individual container is typically used only for completing a single function, which is not enough. We need to provide several building blocks to connect the containers together to accomplish more complicated tasks.

Orchestrator: Putting All the Things Together

We don't mean to overwhelm you, but the aforementioned problems are very serious, and they arise as a result of the large number of containers that need to be automatically managed. Compared to the VM era, containers do open another door for application management in a large, distributed cluster. However, this also takes container and cluster management challenges to another level. In order to connect the containers to each other to accomplish the desired functionality in a scalable, high-performance, and self-recovering manner, we need a well-designed container orchestrator. Otherwise, we would not be able to migrate our applications from VMs to containers. This is the third reason why containerization technologies have been adopted on a large scale in recent years, particularly since the emergence of Kubernetes, which is the de facto container orchestrator nowadays.