What is Kubernetes and how does it work?

Kubernetes Explained for Beginners

Updated June 2026

Kubernetes is an open source container orchestration platform that automates the deployment, scaling, and management of containerized applications across clusters of servers. It groups containers into pods, manages their lifecycle through controllers like Deployments and StatefulSets, provides networking through Services and Ingress, and self-heals by restarting failed containers and rescheduling workloads from unhealthy nodes. Originally created by Google and donated to the Cloud Native Computing Foundation in 2015, Kubernetes has become the industry standard for running containers in production.

The Problem Kubernetes Solves

Containers package applications with their dependencies into portable, consistent units. Running one container on one server is simple. The challenge begins when you need to run many containers across many servers. Questions arise immediately: Which server should each container run on? What happens when a server goes down? How do containers find and communicate with each other? How do you deploy a new version without downtime? How do you scale up when traffic increases and scale down when it decreases?

Before Kubernetes, teams answered these questions with custom scripts, configuration management tools, and manual processes. A deployment might involve SSHing into servers, pulling new container images, stopping old containers, starting new ones, and updating load balancer configurations. This approach is fragile, slow, and error-prone, especially when managing dozens or hundreds of containers.

Kubernetes provides a declarative system for answering all of these questions. You describe what you want (three copies of this container, accessible on this port, with this much CPU and memory), and Kubernetes figures out how to make it happen. If reality drifts from your description, Kubernetes corrects it automatically. This declarative approach is the fundamental insight that makes Kubernetes powerful.
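The "describe what you want, then correct drift" idea can be illustrated with a toy reconciliation loop. This is only a sketch of the concept, not Kubernetes code: the `reconcile` function, the `web-N` pod names, and the in-memory pod list are all invented for illustration.

```python
def reconcile(desired_replicas, running, next_id):
    """Return (pods, next_id) after one pass that matches actual to desired."""
    pods = list(running)
    # Too few pods: create replacements until the count matches.
    while len(pods) < desired_replicas:
        pods.append(f"web-{next_id}")
        next_id += 1
    # Too many pods: remove the surplus.
    while len(pods) > desired_replicas:
        pods.pop()
    return pods, next_id


pods, next_id = reconcile(3, [], 0)          # initial rollout: three pods
pods.remove("web-1")                         # simulate a crashed pod
pods, next_id = reconcile(3, pods, next_id)  # drift is corrected
print(pods)  # ['web-0', 'web-2', 'web-3']
```

The caller never says "start one more pod"; it only restates the desired count, and the loop works out the difference.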

What is a Kubernetes cluster?

A Kubernetes cluster is a set of machines (physical servers or virtual machines) that work together to run containerized applications. The cluster has two types of nodes: control plane nodes that run the Kubernetes management components (API server, scheduler, controller manager, etcd), and worker nodes that run the actual application containers. The control plane makes scheduling decisions and monitors the cluster's state, while worker nodes execute the workloads assigned to them. A production cluster typically has at least three control plane nodes for redundancy and as many worker nodes as the workload requires.
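The reason three control plane nodes is the usual minimum comes down to majority arithmetic. The sketch below assumes each control plane node also hosts an etcd member and that etcd needs a majority of members to stay writable; the function names are invented for illustration.

```python
def quorum(members):
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


for n in (1, 2, 3, 5):
    print(f"{n} members: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Under these assumptions, two members tolerate no more failures than one, which is why odd sizes such as three or five are the common choices.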

What is a pod?

A pod is the smallest deployable unit in Kubernetes. It consists of one or more containers that share the same network namespace (meaning they can communicate over localhost), can mount the same storage volumes, and are scheduled, started, and terminated together. Most pods contain a single application container, but sidecar patterns use multiple containers for tasks like log collection, proxy handling, or configuration updates. Pods are ephemeral: they are created, run, and eventually terminated. You rarely create individual pods directly. Instead, you use higher-level controllers like Deployments that create and manage pods for you.

What is a Deployment?

A Deployment is a Kubernetes resource that manages a group of identical pods. You specify the container image, the number of replicas (copies), resource limits, and environment variables, and the Deployment creates and maintains that many pods. If a pod crashes, the Deployment creates a replacement. When you update the container image, the Deployment performs a rolling update, gradually replacing old pods with new ones so the application stays available throughout the transition. Deployments also support rollbacks, letting you revert to a previous version if a new release has problems.
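As a rough illustration, a Deployment with these fields can be written as a plain Python dictionary and serialized to JSON, which kubectl accepts alongside YAML. The name `web`, the image `registry.example.com/web:1.0`, the `LOG_LEVEL` variable, and the resource figures are placeholders, not values from any real system.

```python
import json

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 3,  # number of identical pods to maintain
        # The selector must match the labels on the pod template below.
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {
                "containers": [
                    {
                        "name": "web",
                        "image": "registry.example.com/web:1.0",  # placeholder image
                        "ports": [{"containerPort": 80}],
                        "env": [{"name": "LOG_LEVEL", "value": "info"}],
                        "resources": {
                            "limits": {"cpu": "500m", "memory": "256Mi"},
                        },
                    }
                ]
            },
        },
    },
}

print(json.dumps(deployment, indent=2))
```

Saving the printed output to a file such as deployment.json and running `kubectl apply -f deployment.json` would submit it to a cluster. Changing the `image` value and applying again is what starts a rolling update.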

What is a Service?

A Service provides a stable network endpoint for accessing a set of pods. Since pods are ephemeral and a replacement pod gets a new IP address, other applications cannot rely on pod IPs for communication. A Service gets a fixed cluster-internal IP address and DNS name that does not change, and it routes traffic to whichever pods currently match its label selector and are ready to serve. ClusterIP Services are accessible only within the cluster, NodePort Services expose a port on every node, and LoadBalancer Services integrate with cloud provider load balancers for external access.
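A Service finds its backend pods through a label selector rather than a fixed list of addresses. The sketch below mimics that lookup in plain Python; the pod records, IP addresses, and the `select_endpoints` helper are hypothetical and only show the matching idea.

```python
def select_endpoints(selector, pods):
    """Return IPs of ready pods whose labels contain every selector pair."""
    return [
        pod["ip"]
        for pod in pods
        if pod["ready"]
        and all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


pods = [
    {"ip": "10.0.0.4", "ready": True, "labels": {"app": "web", "tier": "frontend"}},
    {"ip": "10.0.0.5", "ready": False, "labels": {"app": "web", "tier": "frontend"}},
    {"ip": "10.0.0.6", "ready": True, "labels": {"app": "db"}},
]

print(select_endpoints({"app": "web"}, pods))  # ['10.0.0.4']
```

When a pod is replaced and comes back with a new IP, the same selector picks it up again, so clients of the Service do not need to change anything.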

Core Architecture

The Kubernetes control plane consists of four main components. The API Server is the central hub that processes all cluster operations. Every kubectl command, every controller action, and every node status report goes through the API Server. It validates requests, updates the cluster state in etcd, and notifies watchers of changes.

etcd is a distributed key-value store that holds all cluster data: resource definitions, configuration, secrets, and current state. It is the single source of truth for the cluster, and its consistency guarantees ensure that all control plane components agree on the cluster's state. Backing up etcd is the most critical backup operation for a Kubernetes cluster.

The Scheduler watches for newly created pods that have not been assigned to a node and selects the best node based on resource availability, affinity rules, taints and tolerations, and other constraints. Once the Scheduler assigns a pod to a node, the kubelet on that node takes over and starts the containers.
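The scheduling decision can be pictured as two steps: filter out nodes that cannot run the pod, then score the rest and pick the best. The following is a heavily simplified sketch; the node records, field names, and the "most free CPU wins" scoring rule are assumptions made for illustration and do not reflect the real scheduler's plugins.

```python
def schedule(pod, nodes):
    """Return the name of the chosen node, or None if no node fits."""
    # Filter: drop nodes that lack resources or carry a taint the pod does not tolerate.
    feasible = [
        node
        for node in nodes
        if node["free_cpu_m"] >= pod["cpu_m"]
        and node["free_mem_mi"] >= pod["mem_mi"]
        and set(node["taints"]) <= set(pod["tolerations"])
    ]
    if not feasible:
        return None  # in a real cluster the pod would stay Pending
    # Score: prefer the node with the most free CPU (a made-up scoring rule).
    return max(feasible, key=lambda node: node["free_cpu_m"])["name"]


nodes = [
    {"name": "node-a", "free_cpu_m": 200, "free_mem_mi": 4096, "taints": []},
    {"name": "node-b", "free_cpu_m": 2000, "free_mem_mi": 4096, "taints": ["gpu-only"]},
    {"name": "node-c", "free_cpu_m": 1000, "free_mem_mi": 2048, "taints": []},
    {"name": "node-d", "free_cpu_m": 1500, "free_mem_mi": 2048, "taints": []},
]
pod = {"cpu_m": 500, "mem_mi": 256, "tolerations": []}

print(schedule(pod, nodes))  # node-d
```

Here node-a is filtered out for lack of CPU and node-b for its taint, leaving node-c and node-d to be scored.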

The Controller Manager runs a set of controllers that watch the cluster state through the API Server and make changes to move the current state toward the desired state. The Deployment controller manages ReplicaSets, and the ReplicaSet controller ensures the right number of pods are running. The Node controller monitors node health and evicts pods from unhealthy nodes. The Job controller manages batch workloads that run to completion. Each controller operates independently, watching for specific resource types and reconciling discrepancies.

On each worker node, the kubelet is an agent that receives pod specifications from the API Server and ensures the specified containers are running and healthy. It communicates with the container runtime (containerd or CRI-O) to start, stop, and monitor containers. The kube-proxy component on each node maintains network rules that route traffic from Services to the correct pods.

Key Concepts Beyond Pods and Deployments

Namespaces provide logical isolation within a cluster. Different teams or environments (development, staging, production) can share a cluster while keeping their resources separate. Resource quotas and network policies can be applied per-namespace to control resource consumption and network access.

ConfigMaps and Secrets store configuration data and sensitive information separately from container images. ConfigMaps hold non-sensitive key-value pairs (database hostnames, feature flags, application settings) that are injected into pods as environment variables or mounted as files. Secrets hold sensitive data like passwords, API keys, and TLS certificates. By default, Secret values are only base64-encoded, not encrypted, when stored in etcd; encryption at rest is optional and must be enabled explicitly.
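The encoding used for Secret values in a manifest's `data` field is base64, which is reversible by anyone who can read the value. A short sketch with a made-up password shows why encoding alone should not be mistaken for encryption.

```python
import base64

password = "s3cr3t-pa55"  # made-up example value

# This is the form a value takes in the `data` field of a Secret manifest.
encoded = base64.b64encode(password.encode("utf-8")).decode("ascii")

# Anyone with read access can reverse it without a key.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded == password)  # True
```

This is why access to Secrets is usually restricted and why encryption at rest is worth enabling.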

StatefulSets manage stateful applications that need stable identities and persistent storage, such as databases and message queues. Unlike Deployments, which treat all pods as interchangeable, StatefulSets give each pod a persistent identifier that survives rescheduling and a dedicated PersistentVolume that is reattached when the pod restarts.
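The stable identity is visible in the names Kubernetes assigns. By default, a StatefulSet names its pods `<statefulset-name>-<ordinal>` starting from 0, and each claim created from a volume claim template is named `<template-name>-<pod-name>`. The sketch below reproduces that naming; the `db` and `data` names are placeholders and the helper function is invented for illustration.

```python
def statefulset_names(name, replicas, claim_template):
    """Return (pod name, volume claim name) pairs using the default naming scheme."""
    return [
        (f"{name}-{ordinal}", f"{claim_template}-{name}-{ordinal}")
        for ordinal in range(replicas)
    ]


for pod_name, claim_name in statefulset_names("db", 3, "data"):
    print(pod_name, "->", claim_name)
# db-0 -> data-db-0
# db-1 -> data-db-1
# db-2 -> data-db-2
```

Because the names are derived from the ordinal rather than generated randomly, a rescheduled `db-1` comes back as `db-1` and finds the same claim again.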

Ingress resources manage external HTTP and HTTPS access to services within the cluster. An Ingress defines routing rules that map hostnames and URL paths to backend Services. An Ingress controller (such as Nginx Ingress Controller, Traefik, or HAProxy) reads these rules and configures its reverse proxy accordingly, handling TLS termination, path-based routing, and load balancing.
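The routing rules amount to a lookup from hostname and path to a backend Service. Below is a simplified sketch of prefix-style matching, where prefixes are compared one path segment at a time and the longest matching prefix wins. The hostnames, service names, and helper functions are hypothetical, and real Ingress controllers differ in their details.

```python
def path_matches(prefix, path):
    """True if every segment of prefix matches the leading segments of path."""
    prefix_parts = [part for part in prefix.split("/") if part]
    path_parts = [part for part in path.split("/") if part]
    return path_parts[: len(prefix_parts)] == prefix_parts


def route(rules, host, path):
    """Return the backend service for a request, or None if no rule matches."""
    candidates = [
        rule for rule in rules
        if rule["host"] == host and path_matches(rule["path"], path)
    ]
    if not candidates:
        return None
    # The most specific (longest) prefix takes precedence.
    return max(candidates, key=lambda rule: len(rule["path"]))["service"]


rules = [
    {"host": "shop.example.com", "path": "/", "service": "frontend"},
    {"host": "shop.example.com", "path": "/api", "service": "api"},
    {"host": "docs.example.com", "path": "/", "service": "docs"},
]

print(route(rules, "shop.example.com", "/api/orders"))  # api
print(route(rules, "shop.example.com", "/apiary"))      # frontend
print(route(rules, "other.example.com", "/"))           # None
```

Matching by segment is what keeps `/apiary` from being treated as part of `/api`.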

Horizontal Pod Autoscaler (HPA) automatically adjusts the number of pod replicas based on observed metrics like CPU usage, memory usage, or custom application metrics. When CPU usage exceeds a target threshold (for example, 70%), HPA increases the replica count. When usage drops, it decreases replicas. This automatic scaling ensures applications handle traffic spikes without wasting resources during quiet periods.
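The core of the scaling decision is a ratio: desired replicas = ceil(current replicas × current metric / target metric). The sketch below applies that formula and clamps the result to a minimum and maximum. It leaves out refinements such as the tolerance band and stabilization windows, and the function name and default bounds are chosen for illustration.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Apply the HPA ratio formula, then clamp to the configured bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Four replicas averaging 90% CPU against a 70% target: scale up.
print(desired_replicas(4, 90, 70))  # 6

# Six replicas averaging 35% CPU against a 70% target: scale down.
print(desired_replicas(6, 35, 70))  # 3
```

Rounding up means the autoscaler errs on the side of slightly more capacity rather than slightly less.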

Why Kubernetes Became the Standard

Several factors explain Kubernetes' dominance. Google's pedigree gave it immediate credibility, as Google had been running containers at massive scale with its internal Borg system for over a decade before releasing Kubernetes. The donation to the Cloud Native Computing Foundation ensured vendor-neutral governance, preventing any single company from controlling the project's direction.

The extensibility model is another factor. Kubernetes' Custom Resource Definitions (CRDs) allow anyone to extend the API with new resource types, and operators (controllers that manage the custom resources those CRDs define) automate the lifecycle of complex applications like databases, message brokers, and monitoring systems. This extensibility has created an enormous ecosystem of tools and integrations built on top of Kubernetes.

Every major cloud provider offers managed Kubernetes (Amazon EKS, Google GKE, Azure AKS, DigitalOcean Kubernetes), making it accessible without the operational burden of managing the control plane. On-premises distributions like k3s, k0s, and OpenShift bring Kubernetes to data centers and edge locations. This universal availability has made Kubernetes the common API for deploying applications regardless of where they run.

The trade-off is complexity. Kubernetes has a steep learning curve with dozens of resource types, YAML manifests, networking concepts, and operational concerns. For simple applications running on a few servers, Kubernetes may add more complexity than the problem warrants. Docker Compose on a single server, or a simple process manager like systemd, can be more appropriate for small-scale deployments. Kubernetes shines when you run multiple services, need automatic scaling, require zero-downtime deployments, or want a consistent deployment interface across environments.

Getting Started

The most practical way to learn Kubernetes is to run a cluster locally and deploy applications to it. k3s provides a lightweight cluster that runs on a single machine with minimal resources. minikube creates a single-node cluster in a virtual machine or container on your laptop. kind (Kubernetes in Docker) runs cluster nodes as Docker containers, which is particularly useful for testing and CI/CD environments.

Start by deploying a simple web application as a Deployment, exposing it with a Service, and accessing it through an Ingress. Then practice scaling, rolling updates, and rollbacks. Once comfortable with the basics, explore ConfigMaps, Secrets, PersistentVolumes, and Namespaces. The Kubernetes documentation at kubernetes.io provides comprehensive tutorials and reference material.

Understanding containers is a prerequisite. If you are new to containers, read about Docker and Podman and the broader container tool ecosystem before diving into Kubernetes. Kubernetes orchestrates containers, so familiarity with building and running them is essential context.

Key Takeaway

Kubernetes automates container deployment, scaling, and management across server clusters using a declarative model. You describe the desired state, and Kubernetes continuously reconciles reality to match. It is powerful and flexible but adds significant complexity, so evaluate whether your workload scale justifies the investment.
