linuxlab.io
Copyright © 2026 LinuxLab. All rights reserved.

kb/containers ── Containers (bonus) ── intermediate

Kubernetes pod lifecycle: from Pending to Terminated

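A pod moves through the phases Pending and Running and ends in Succeeded or Failed; Unknown means the node stopped reporting. Init containers run sequentially before the main ones. The startup probe runs first, then readiness and liveness. On delete the container gets SIGTERM, then SIGKILL once the grace period expires.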

aka: pod-lifecycle, k8s-pod, init-container, readiness-probe, liveness-probe, pod-phases

Why understand the lifecycle

In k8s a pod is the smallest deployable unit: one or more containers that share network and IPC namespaces and volumes. Between kubectl apply and the final Running state, a pod passes through a chain of states, and any of them can fail. Once you know the lifecycle, you can diagnose "why it won't start" / "why it keeps restarting" / "why it can't shut down gracefully" in a minute.

Phases, the .status.phase field

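At any moment a pod is in exactly one phase:

| Phase | What it means |
|---|---|
| Pending | Accepted by the API but not running yet. The pod may be waiting for scheduling, pulling images, or running init containers. |
| Running | Bound to a node with all containers created; at least one is running, starting, or restarting. Does not mean every container passes readiness. |
| Succeeded | All containers exited 0 and will not be restarted (typical for Job/CronJob pods). |
| Failed | All containers have terminated, and at least one exited non-zero or was killed by the system and will not be restarted. |
| Unknown | The pod's state could not be obtained, usually because the node crashed or is cut off from the control plane. |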

The phase is too coarse. The real state lives in .status.conditions.

Conditions, the detailed picture

yaml
status:
  phase: Running
  conditions:
  - type: PodScheduled
    status: "True"
    lastTransitionTime: "..."
  - type: Initialized
    status: "True"           # all init containers OK
  - type: ContainersReady
    status: "True"           # all main containers ready
  - type: Ready
    status: "True"           # = ContainersReady && all readiness gates (if any) pass

  • PodScheduled means the scheduler found a node
  • Initialized means the init containers finished with exit 0
  • ContainersReady means all main containers are ready (each passes its readiness probe, if it defines one)
  • Ready means the pod can receive traffic: ContainersReady is True and any readiness gates pass, so the pod is added to Service endpoints

A pod can be Running but not Ready. This is normal during startup, while a readiness probe is failing, or while a container restarts after a liveness probe failure.
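To see where a Running-but-not-Ready pod is stuck, one option is to walk the four conditions in startup order and report the first one that is not True. The sketch below is a hypothetical helper: the name first_blocking_condition and the Type=Status input format are assumptions, not part of kubectl. It reads plain text, so it can be tried without a cluster.

```shell
# Hypothetical helper: reads "Type=Status" lines on stdin and prints the
# first condition, in startup order, that is not True (or "none").
first_blocking_condition() {
  conditions=$(cat)
  for cond in PodScheduled Initialized ContainersReady Ready; do
    if ! printf '%s\n' "$conditions" | grep -qx "${cond}=True"; then
      printf '%s\n' "$cond"
      return 0
    fi
  done
  printf 'none\n'
}

# Canned input: the init containers have not finished yet.
printf 'PodScheduled=True\nInitialized=False\nContainersReady=False\nReady=False\n' \
  | first_blocking_condition   # prints: Initialized
```

Against a live pod, the input could come from kubectl get pod mypod -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'.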

Init containers

Containers in spec.initContainers run sequentially BEFORE the main ones, and each must exit 0:

yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp        # a name is required for the manifest to apply
spec:
  initContainers:
  - name: wait-for-db
    image: busybox
    command: ['sh', '-c', 'until nc -z db 5432; do sleep 1; done']
  - name: migrate
    image: myapp:v1
    command: ['./migrate']
  containers:
  - name: app
    image: myapp:v1

Semantics:

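  • If an init container fails, the kubelet restarts that init container (with back-off) until it exits 0; init containers that already completed do not run again unless the whole pod is restarted
  • With pod restartPolicy: Always, init containers are retried as if the policy were OnFailure; with restartPolicy: Never, a failed init container marks the whole pod Failed
  • Init containers can have their own resources and securityContext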
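The ordering rule can be modelled in plain shell: each step is retried until it exits 0, and only then does the next step run. This is an analogy under stated assumptions, not how the kubelet is implemented; retry_until_ok and flaky_step are hypothetical names, and the real kubelet adds a back-off delay between attempts.

```shell
# retry_until_ok MAX CMD...: rerun CMD until it exits 0, at most MAX times.
# Returns 0 on success, 1 if every attempt failed.
retry_until_ok() {
  max_attempts=$1
  shift
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
  done
  return 0
}

# A stand-in for an init container that fails twice, then succeeds.
tries=0
flaky_step() {
  tries=$((tries + 1))
  [ "$tries" -ge 3 ]
}

# The "init" steps run in order; the "main" step starts only after both pass.
retry_until_ok 5 flaky_step && retry_until_ok 5 true \
  && echo "main container starts after $tries attempts"
```

The first step is retried on its own; the second step is never attempted until the first has exited 0.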

Use cases:

  • Wait for dependencies (DB, configmap)
  • DB migrations
  • Permissions fix on a mount (chown /data before the main process)
  • Generate config from templates with envsubst

Probes: startup, readiness, liveness

There are three types of health check, run by the kubelet on the node:

startup probe

yaml
startupProbe:
  httpGet: { path: /healthz, port: 8080 }
  failureThreshold: 30
  periodSeconds: 10

  • Checked first; the liveness and readiness probes are disabled until it succeeds.
  • Protects slow-starting applications from a false liveness failure.
  • Once it succeeds it never runs again, and readiness/liveness take over.
  • If it fails failureThreshold times in a row, the kubelet kills the container and restartPolicy applies.

Without a startup probe, the liveness probe can kill a slow-starting container before it finishes booting. Every restart then hits the same deadline, producing an endless restart loop.
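How long a container has to start follows from the probe settings: roughly failureThreshold × periodSeconds, plus initialDelaySeconds if set. A minimal sketch of that arithmetic, assuming each probe attempt returns quickly (timeoutSeconds is ignored); startup_budget_seconds is a hypothetical helper name.

```shell
# startup_budget_seconds FAILURE_THRESHOLD PERIOD_SECONDS [INITIAL_DELAY_SECONDS]
# Approximate upper bound on the time a container gets to pass its startup probe.
startup_budget_seconds() {
  failure_threshold=$1
  period_seconds=$2
  initial_delay=${3:-0}
  echo $((initial_delay + failure_threshold * period_seconds))
}

startup_budget_seconds 30 10   # values from the startupProbe above: prints 300
```

If the application can take longer than this to boot, raise failureThreshold rather than loosening the liveness probe.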

readiness probe

yaml
readinessProbe:
  httpGet: { path: /ready, port: 8080 }
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3

  • Never restarts the container; on failure the pod is only removed from the Service endpoints.
  • The pod is added back to the Service once the probe passes again.
  • Ideal for warmup (cache not yet warm, DB connection not yet established).

liveness probe

yaml
livenessProbe:
  httpGet: { path: /healthz, port: 8080 }
  periodSeconds: 10
  failureThreshold: 3

  • After failureThreshold consecutive failures the kubelet kills the container, and restartPolicy decides whether it restarts.
  • Use it ONLY to self-heal a "deadlocked" application. Not for checking upstream dependencies.
  • Anti-pattern: checking the DB in liveness. If the DB lags, k8s restarts every pod at once, causing a cascading failure.

Probe types

| Type | Success condition |
|---|---|
| httpGet | HTTP status 200-399, on any path and port |
| tcpSocket | TCP connection can be opened |
| exec | command exits 0 |
| grpc (GA in 1.27) | gRPC Health Checking Protocol reports SERVING |

Restart Policy

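  • Always (the pod default, and the only value allowed in Deployment/ReplicaSet/StatefulSet pod templates), restarts on any exit
  • OnFailure, restarts only on a non-zero exit (Job/CronJob pod templates must set OnFailure or Never explicitly)
  • Never, does not restart at all

restartPolicy applies at the Pod level only: the kubelet restarts containers inside the same pod, on the same node. Controllers such as a Deployment work differently: they replace whole pods to keep the desired replica count.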
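The three policies reduce to a small decision table on the container's exit code. A sketch of it in shell, for illustration only; will_restart is a hypothetical helper and ignores the back-off delay the kubelet applies between restarts.

```shell
# will_restart POLICY EXIT_CODE: prints "yes" if a container that exited with
# EXIT_CODE would be restarted under the given restartPolicy, else "no".
will_restart() {
  case "$1" in
    Always)    echo yes ;;
    OnFailure) if [ "$2" -ne 0 ]; then echo yes; else echo no; fi ;;
    Never)     echo no ;;
    *)         echo "unknown policy: $1" >&2; return 2 ;;
  esac
}

will_restart OnFailure 0     # clean exit: prints no
will_restart OnFailure 137   # killed by a signal: prints yes
```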

Termination, graceful shutdown

On kubectl delete pod or a scale-down:

1. API: pod.metadata.deletionTimestamp is set   -> grace period countdown starts
2. In parallel: pod removed from Service endpoints (asynchronously)
3. preStop hook (if any)                 -> synchronous, counts against the grace period
4. SIGTERM to pid 1 of each container
5. ... wait for the rest of terminationGracePeriodSeconds (default 30s)
6. SIGKILL if not finished
7. Pod removed from the API

yaml
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: app
    lifecycle:
      preStop:
        exec:
          command: ['sh', '-c', 'sleep 5']   # the kubelet sends SIGTERM itself once the hook returns

preStop:

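  • Runs before SIGTERM, and its run time counts against terminationGracePeriodSeconds
  • Useful for drain mode (tell the LB "stop sending me traffic")
  • sleep 5 is a typical pause that gives kube-proxy and load balancers time to notice that the pod has left the Service endpoints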
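Because the preStop hook runs inside the grace period, the application has less than terminationGracePeriodSeconds to react to SIGTERM. A minimal sketch of that budget, using the values from the manifest above; sigterm_budget_seconds is a hypothetical helper, and clamping at zero is a simplification that ignores any short extension the kubelet may grant when the hook overruns.

```shell
# sigterm_budget_seconds GRACE_PERIOD PRESTOP_SECONDS
# Time left between SIGTERM and SIGKILL once the preStop hook has finished.
sigterm_budget_seconds() {
  remaining=$(( $1 - $2 ))
  if [ "$remaining" -lt 0 ]; then
    remaining=0
  fi
  echo "$remaining"
}

sigterm_budget_seconds 60 5   # prints 55
```

If the application needs the full 60 seconds to drain, the grace period has to be raised by the length of the preStop pause.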

Common mistakes:

  • PID 1 does not handle SIGTERM. Some language runtimes ignore it by default when running as PID 1, and shell-script entrypoints do not forward it to the child process. Use tini or dumb-init as PID 1.
  • Graceful shutdown takes longer than terminationGracePeriodSeconds. k8s sends SIGKILL. Increase the grace period, or make the work non-blocking.
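When the entrypoint is a shell script, another option (not covered above, so treat it as a suggestion) is to end the script with exec. The application then replaces the shell, inherits its PID, and receives SIGTERM directly. The sketch below only demonstrates the PID behaviour; exec_keeps_pid is a hypothetical name.

```shell
# Prints two PIDs: the wrapper shell's own, then the PID seen by the command
# it exec's. They match because exec replaces the shell instead of forking.
exec_keeps_pid() {
  sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}

exec_keeps_pid
```

In a real entrypoint script the last line would typically be exec "$@" or exec followed by the application binary.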

OOM in a pod

k8s turns resources.limits.memory into a cgroup memory limit. When the container exceeds it:

  • The container is OOMKilled: the kernel OOM killer kills a process in the container's cgroup
  • In status: lastState.terminated.reason: OOMKilled
  • The container restarts if restartPolicy is Always or OnFailure

Check:

bash
kubectl describe pod mypod | grep -A2 'Last State'
# Last State:     Terminated
# Reason:        OOMKilled
# Exit Code:     137                    ← 128 + 9 (SIGKILL)

Fix: increase resources.limits.memory or fix the leak in the application.
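Exit codes above 128 follow the 128 + N convention, where N is the number of the signal that ended the process. A small sketch for decoding them; signal_from_exit_code is a hypothetical helper name.

```shell
# signal_from_exit_code CODE: prints the signal number if CODE follows the
# 128 + N convention, or "none" for an ordinary exit status.
signal_from_exit_code() {
  if [ "$1" -gt 128 ]; then
    echo $(( $1 - 128 ))
  else
    echo none
  fi
}

signal_from_exit_code 137   # prints 9  (SIGKILL, as after an OOM kill)
signal_from_exit_code 143   # prints 15 (SIGTERM)
signal_from_exit_code 1     # prints none
```

Exit code 137 alone does not prove an OOM kill, since any SIGKILL produces it; the Reason field is what distinguishes OOMKilled.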

ImagePullBackOff and CrashLoopBackOff

  • ImagePullBackOff means the registry is unreachable / the image tag is wrong / there is no imagePullSecret. kubectl describe pod shows events.
  • CrashLoopBackOff means the container crashes quickly after start, and the kubelet increases the delay exponentially between restarts (10s, 20s, 40s, up to 5 min). The delay resets once the container has run for 10 minutes without crashing.
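The back-off schedule is easy to tabulate. The sketch below assumes the behaviour described above, a delay that starts at 10 s, doubles after each crash, and is capped at 300 s; crashloop_delays is a hypothetical helper name.

```shell
# crashloop_delays N: prints the first N restart delays in seconds,
# doubling from 10 and capped at 300 (5 minutes).
crashloop_delays() {
  count=$1
  delay=10
  delays=""
  i=0
  while [ "$i" -lt "$count" ]; do
    delays="${delays}${delays:+ }${delay}"
    delay=$((delay * 2))
    if [ "$delay" -gt 300 ]; then
      delay=300
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$delays"
}

crashloop_delays 7   # prints: 10 20 40 80 160 300 300
```

After the fifth crash the pod is retried only every five minutes, which is why a fixed image can still look stuck for a while.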

Debug:

bash
kubectl logs mypod -c container1                # current
kubectl logs mypod -c container1 --previous     # previous instance
kubectl describe pod mypod                      # events at the bottom
kubectl get events --sort-by='.lastTimestamp'   # global events

When things go wrong

  • Pod stuck in Pending: no node satisfies the requested resources, node selectors, or tolerations. kubectl describe pod shows FailedScheduling events. Check kubectl describe node (Allocatable, taints).
  • Pod stuck in ContainerCreating: the image pull is slow or a volume fails to mount. Check the events in kubectl describe pod.
  • OOMKilled with no obvious cause: the limits are too low, or the JVM is not aware of the cgroup limit. -XX:+UseContainerSupport is on by default since Java 10 and 8u191; Java 8u131 to 8u190 needs -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap.
  • The liveness probe restarts the container right after a deploy: initialDelaySeconds is too small and the application is still starting. Use a startup probe.
  • kubectl delete pod hangs: finalizers are pending, the node is unreachable, or terminationGracePeriodSeconds is very large. As a last resort use kubectl delete pod --grace-period=0 --force; it removes the API object without waiting for the kubelet to confirm that the containers stopped, and it does not clear finalizers.
  • The container is alive but gets no traffic: the readiness probe is failing. kubectl describe pod shows conditions: ContainersReady=False.
  • An init container finishes but the pod does not move on: it exited non-zero. Check kubectl logs mypod -c <init-name>.
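The list above can be condensed into a lookup from symptom to first command. A sketch only: triage_hint is a hypothetical helper, it prints the command rather than running it, and the mapping covers just the cases named in this article.

```shell
# triage_hint SYMPTOM [POD]: prints the first command worth running.
triage_hint() {
  symptom=$1
  pod=${2:-mypod}
  case "$symptom" in
    Pending|ContainerCreating|ImagePullBackOff)
      echo "kubectl describe pod $pod" ;;
    CrashLoopBackOff)
      echo "kubectl logs $pod --previous" ;;
    OOMKilled)
      echo "kubectl describe pod $pod | grep -A2 'Last State'" ;;
    *)
      echo "kubectl get events --sort-by='.lastTimestamp'" ;;
  esac
}

triage_hint CrashLoopBackOff mypod   # prints: kubectl logs mypod --previous
```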

Useful kubectl commands

bash
kubectl get pod -o wide                       # find the node + IP
kubectl get pod mypod -o jsonpath='{.status.containerStatuses[*].state}'   # pod name required: without it the result is a list
kubectl exec -it mypod -- sh
kubectl debug -it mypod --image=busybox       # ephemeral container 1.25+
kubectl port-forward pod/mypod 8080:8080      # local test without a Service
kubectl rollout status deployment/myapp
kubectl rollout restart deployment/myapp      # rolling restart without editing the spec

§ commands

bash
kubectl get pod mypod -o jsonpath='{.status.phase}'

The current pod phase, the first thing to check when debugging

bash
kubectl describe pod mypod | grep -A3 'Last State'

The reason for a restart (OOMKilled, Error, Completed) with its exit code

bash
kubectl logs mypod --previous -c container1

Logs from the previous container instance, what happened before the restart

bash
kubectl get events --sort-by='.lastTimestamp' -A | tail -30

The latest cluster events, the source of 'why the pod won't start'

bash
kubectl exec -it mypod -- sh

Enter the container to debug from the inside (if sh is present)

bash
kubectl debug -it mypod --image=busybox --target=container1

An ephemeral debug container in the pod, no sh needed in the original image

bash
kubectl delete pod mypod --grace-period=0 --force

Force deletion, use it when terminating hangs

§ see also

  • namespaces: Linux namespaces. Namespaces are a kernel mechanism that gives a process its own isolated view of a resource (network, mount points, PID, UID, IPC, hostname, time). Every container is built on them.
  • cgroups: cgroups (v2). cgroups v2 is a hierarchical virtual FS under `/sys/fs/cgroup` that the kernel uses to limit CPU, memory, and I/O for processes. Docker, k8s, and systemd write here.
  • signals: Signals (SIGTERM, SIGKILL, SIGHUP). A signal is an asynchronous notification to a process from the kernel or another process. TERM asks it to quit, KILL kills it now, HUP reloads config.
  • runc-and-runsc: runc, runsc, kata: container runtimes. runc is the standard OCI runtime: namespaces+cgroups+seccomp. runsc/gVisor is a userspace kernel for extra isolation. kata is a lightweight VM per container. Performance and isolation trade off against each other.
  • oci-spec: OCI spec: the container standard. OCI is three specs: Image (layers + manifest), Runtime (config.json + rootfs for runc), Distribution (registry API). The standard that followed Docker; runc, podman, containerd, CRI-O are all OCI-compatible.
  • kubelet-internals: kubelet: the Kubernetes node agent architecture. kubelet is a daemon on every node. It receives the PodSpec through the API, starts containers through CRI, mounts volumes through CSI, and watches health. Under pressure it does eviction. Image GC and the cgroup tree are also its job.