Why understand the lifecycle
In Kubernetes (k8s) a pod is the smallest deployable unit: one or more
containers that share a network/IPC namespace and volumes. Between
kubectl apply and a pod that is Running and Ready, the pod passes through
a chain of states, and any of them can fail. Once you know the lifecycle,
you can quickly diagnose "why it won't start", "why it keeps restarting",
and "why it can't shut down gracefully".
Phases, the .status.phase field
A pod has exactly one phase:
| Phase | What it means |
|---|---|
| Pending | accepted by the API, not Running yet. Could be scheduling, image pull, or init containers. |
| Running | scheduled on a node, at least one container is started. Does not mean all of them pass readiness. |
| Succeeded | all containers exited with code 0 and will not be restarted (typical for Job/CronJob pods) |
| Failed | all containers have terminated, and at least one exited with a non-zero code or was killed by the system |
| Unknown | kubelet could not report the status (node crashed, network split) |
The phase is too coarse. The real state lives in .status.conditions.
Conditions, the detailed picture
status:
  phase: Running
  conditions:
  - type: PodScheduled
    status: "True"
    lastTransitionTime: "..."
  - type: Initialized
    status: "True"   # all init containers OK
  - type: ContainersReady
    status: "True"   # all main containers ready
  - type: Ready
    status: "True"   # = ContainersReady && all readinessGates (if any) pass
- PodScheduled means the scheduler found a node
- Initialized means the init containers finished with exit 0
- ContainersReady means all main containers passed the readiness probe
- Ready means the pod can receive traffic (it is added to Service endpoints)
A pod can be Running but not Ready. This is normal during startup, and again after a liveness failure restarts a container, until the readiness probe passes.
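When a pod is Running but not Ready, the first condition that is not "True" usually points at the stage to investigate. The following is a minimal Python sketch of that reading order, not an official tool. The function name first_blocking_condition and the sample status are hypothetical; the dictionary layout is assumed to mirror the .status block of kubectl get pod -o json.

```python
# Order in which the conditions normally turn "True" during startup.
CONDITION_ORDER = ["PodScheduled", "Initialized", "ContainersReady", "Ready"]


def first_blocking_condition(status):
    """Return the first condition type that is not "True", or None if all are."""
    by_type = {c["type"]: c["status"] for c in status.get("conditions", [])}
    for cond_type in CONDITION_ORDER:
        # A missing condition is treated the same as a False one.
        if by_type.get(cond_type) != "True":
            return cond_type
    return None


# Hypothetical status of a pod that is Running but still warming up.
example_status = {
    "phase": "Running",
    "conditions": [
        {"type": "PodScheduled", "status": "True"},
        {"type": "Initialized", "status": "True"},
        {"type": "ContainersReady", "status": "False"},
        {"type": "Ready", "status": "False"},
    ],
}

print(first_blocking_condition(example_status))  # ContainersReady
```

Here the answer is ContainersReady, so the readiness probe is the place to look, not the scheduler or the init containers.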
Init containers
Containers in spec.initContainers run sequentially BEFORE the main
ones, and each must exit 0:
apiVersion: v1
kind: Pod
metadata:
  name: myapp   # placeholder name so the manifest can be applied
spec:
  initContainers:
  - name: wait-for-db
    image: busybox
    command: ['sh', '-c', 'until nc -z db 5432; do sleep 1; done']
  - name: migrate
    image: myapp:v1
    command: ['./migrate']
  containers:
  - name: app
    image: myapp:v1
Semantics:
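- If an init container fails, the kubelet restarts that init container until it succeeds, and the main containers do not start in the meantime. If the pod itself is restarted, all init containers run again from the first.
- restartPolicy: Always for the pod is treated as OnFailure for init containers. With restartPolicy: Never, a failed init container fails the whole pod.
- Init containers can have their own resources and securityContext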
Use cases:
- Wait for dependencies (DB, configmap)
- DB migrations
- Permissions fix on a mount (chown /data before the main process)
- Generate config from templates with envsubst
Probes: startup, readiness, liveness
There are three types of health check, run by the kubelet on the node:
startup probe
startupProbe:
  httpGet: { path: /healthz, port: 8080 }
  failureThreshold: 30
  periodSeconds: 10
- Checked first, and disables the other probes while it runs.
- Protects slow-starting applications from a false liveness failure.
- On success it no longer runs, and readiness/liveness start working.
- On the failure threshold the kubelet kills the container.
Without a startup probe, a slow start means the liveness probe kills the container, producing an endless restart loop.
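The time an application gets to start follows directly from the probe settings. As a rough sketch (the function name startup_budget_seconds is hypothetical, and the result is an approximation that ignores probe timeouts), the budget is failureThreshold times periodSeconds, plus any initial delay:

```python
def startup_budget_seconds(failure_threshold, period_seconds, initial_delay_seconds=0):
    """Approximate time a container has to pass its startup probe before it is killed."""
    return initial_delay_seconds + failure_threshold * period_seconds


# Values from the startupProbe example: failureThreshold 30, periodSeconds 10.
print(startup_budget_seconds(30, 10))  # 300
```

With the values from the example the application has about 300 seconds (5 minutes) to come up.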
readiness probe
readinessProbe:
  httpGet: { path: /ready, port: 8080 }
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3
- Does not kill the pod, it just removes the endpoint from the Service on failure.
- Returns to the Service once it passes again.
- Ideal for warmup (cache not yet warm, DB connection not yet established).
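The effect of failureThreshold is easier to see on a timeline. The sketch below is a simplified model, not kubelet code: it assumes the container starts as not ready, becomes ready after success_threshold consecutive passes (assumed to be 1, the Kubernetes default, which the example above does not set), and is removed after failure_threshold consecutive failures. The function name is hypothetical.

```python
def in_endpoints_after_each_probe(results, failure_threshold=3, success_threshold=1):
    """Return, for each probe result, whether the pod is in the Service endpoints afterwards."""
    ready = False          # a container is not ready until a probe passes
    ok_run = fail_run = 0  # lengths of the current success / failure streaks
    timeline = []
    for ok in results:
        if ok:
            ok_run, fail_run = ok_run + 1, 0
            if ok_run >= success_threshold:
                ready = True
        else:
            ok_run, fail_run = 0, fail_run + 1
            if fail_run >= failure_threshold:
                ready = False
        timeline.append(ready)
    return timeline


probes = [True, False, False, True, False, False, False, True]
print(in_endpoints_after_each_probe(probes))
# [True, True, True, True, True, True, False, True]
```

Two isolated failures do not remove the pod; only the third consecutive failure does, and a single pass brings it back.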
liveness probe
livenessProbe:
  httpGet: { path: /healthz, port: 8080 }
  periodSeconds: 10
  failureThreshold: 3
- Kills the container on failure (restarts it via restartPolicy).
- Use it ONLY to self-heal a "deadlocked" application. Not for checking upstream dependencies.
- Anti-pattern: checking the DB in liveness. If the DB lags, every pod fails liveness at once and has its container restarted, causing a cascading failure.
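One way to keep the two probes apart in application code is to give each endpoint its own check. The class below is an illustrative sketch only: HealthChecks, worker_alive and db_connected are hypothetical names, and the methods return the HTTP status code that a /healthz or /ready handler would send.

```python
class HealthChecks:
    """Keeps liveness (own process) separate from readiness (own process + dependencies)."""

    def __init__(self):
        self.worker_alive = True    # set to False if the main loop is detected as stuck
        self.db_connected = False   # set to True once the DB connection is established

    def healthz(self):
        # Liveness: only the process itself. A slow or missing DB must not fail this.
        return 200 if self.worker_alive else 500

    def ready(self):
        # Readiness: the process is alive AND its dependencies are usable.
        return 200 if self.worker_alive and self.db_connected else 503


checks = HealthChecks()
print(checks.healthz(), checks.ready())  # 200 503
```

While the DB is down the pod is taken out of the Service (503 on /ready) but is not restarted (200 on /healthz).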
Probe types
| Type | What |
|---|---|
| httpGet | HTTP status 200-399 = OK; any path/port |
| tcpSocket | TCP connection can be opened = OK |
| exec | command exits with 0 = OK |
| grpc (1.27+) | gRPC Health Checking Protocol reports SERVING = OK |
Restart Policy
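- Always (the default, and the only value allowed in Deployment/ReplicaSet/StatefulSet pod templates), restarts on any exit
- OnFailure (common for Job/CronJob, which accept only OnFailure or Never), restarts only on a non-zero exit
- Never, does not restart at all
restartPolicy applies at the Pod level only: it controls whether the kubelet restarts containers in place on the same node. A controller such as a Deployment works differently: it replaces whole pods to keep the desired number of replicas.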
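The three policies reduce to a small decision on the container's exit code. A minimal sketch, with a hypothetical function name, of what the list above describes:

```python
def kubelet_restarts_container(restart_policy, exit_code):
    """Return True if a container that exited with exit_code is restarted in place."""
    if restart_policy == "Always":
        return True
    if restart_policy == "OnFailure":
        return exit_code != 0
    if restart_policy == "Never":
        return False
    raise ValueError("unknown restartPolicy: " + repr(restart_policy))


print(kubelet_restarts_container("Always", 0))     # True
print(kubelet_restarts_container("OnFailure", 0))  # False
print(kubelet_restarts_container("OnFailure", 1))  # True
```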
Termination, graceful shutdown
On kubectl delete pod or a scale-down:
1. API: pod.metadata.deletionTimestamp = now; the terminationGracePeriodSeconds countdown starts (default 30s)
2. Pod removed from Service endpoints (in parallel with the steps below, not strictly before them)
3. preStop hook (if any) -> synchronous, counts against the grace period
4. SIGTERM to pid 1 of each container
5. ... wait for the rest of the grace period
6. SIGKILL if not finished
7. Pod removed from the API
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: app
    lifecycle:
      preStop:
        exec:
          command: ['sh', '-c', 'sleep 5 && kill -TERM 1']
preStop:
- Runs before SIGTERM
- Useful for drain mode (tell the LB "stop sending me traffic")
- sleep 5 is a typical pause that gives the endpoint removal time to propagate across the Service
Common mistakes:
- PID 1 does not handle SIGTERM. Some language runtimes and shell scripts do not forward the signal to child processes. Use tini or dumb-init as PID 1.
- A graceful operation runs longer than terminationGracePeriodSeconds, so k8s sends SIGKILL. Increase the grace period, or make the work non-blocking.
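The first mistake is avoided when the process that runs as PID 1 handles SIGTERM itself. The following is a minimal Python sketch of that pattern, assuming the application's main loop can poll a flag; serve, handle_sigterm and the "drained" return value are hypothetical names, and the loop body is a placeholder for real work.

```python
import os
import signal
import threading

# Set by the signal handler, polled by the main loop.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    # Keep the handler tiny: only record that shutdown was requested.
    shutdown_requested.set()


# Must be called from the main thread of the process.
signal.signal(signal.SIGTERM, handle_sigterm)


def serve(poll_seconds=0.1):
    """Run until SIGTERM arrives, then finish cleanly."""
    while not shutdown_requested.is_set():
        # Placeholder for one unit of work (handle a request, process a message).
        shutdown_requested.wait(poll_seconds)
    # Placeholder for cleanup: finish in-flight work, close connections, flush buffers.
    return "drained"
```

Call serve() from the container's entrypoint. The cleanup has to finish within what is left of terminationGracePeriodSeconds, otherwise the process is stopped by SIGKILL.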
OOM in a pod
k8s sets a cgroup limit on memory. When a container exceeds it:
- The container is OOMKilled: the kernel OOM killer kills the container process
- In status: lastState.terminated.reason: OOMKilled
- The container restarts if restartPolicy is Always or OnFailure
Check:
kubectl describe pod mypod | grep -A2 'Last State'
# Last State: Terminated
# Reason: OOMKilled
# Exit Code: 137 ← 128 + 9 (SIGKILL)
Fix: increase resources.limits.memory or fix the leak in the application.
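The 128 + N convention for exit codes can be decoded mechanically. A small sketch using only the Python standard library (the function name describe_exit_code is hypothetical, and signal names are resolved for the platform the script runs on, assumed here to be Linux):

```python
import signal


def describe_exit_code(exit_code):
    """Translate a container exit code into a short human-readable description."""
    if exit_code > 128:
        try:
            # Exit codes above 128 conventionally mean "killed by signal (code - 128)".
            return "killed by " + signal.Signals(exit_code - 128).name
        except ValueError:
            pass  # not a known signal number, fall through
    return "exited with code " + str(exit_code)


print(describe_exit_code(137))  # killed by SIGKILL
print(describe_exit_code(143))  # killed by SIGTERM
print(describe_exit_code(1))    # exited with code 1
```

Exit code 137 only says the process received SIGKILL. Whether the cause was the OOM killer is confirmed by Reason: OOMKilled in the pod status.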
ImagePullBackOff and CrashLoopBackOff
- ImagePullBackOff means the registry is unreachable / the image tag is wrong / there is no imagePullSecret. kubectl describe pod shows events.
- CrashLoopBackOff means the container crashes quickly after start, and the kubelet increases the delay exponentially between restarts (10s, 20s, 40s, up to 5 min).
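The CrashLoopBackOff delays follow a doubling schedule with a cap. A short sketch that reproduces the sequence quoted above (the function name is hypothetical; the base of 10 s and the cap of 300 s are taken from the text):

```python
def crashloop_delays(restarts, base_seconds=10, cap_seconds=300):
    """Return the delay in seconds before each of the first `restarts` restarts."""
    return [min(base_seconds * 2 ** i, cap_seconds) for i in range(restarts)]


print(crashloop_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```

After five quick crashes the container is restarted only once every 5 minutes, which is why a pod in CrashLoopBackOff can look idle for long stretches.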
Debug:
kubectl logs mypod -c container1 # current
kubectl logs mypod -c container1 --previous # previous instance
kubectl describe pod mypod # events at the bottom
kubectl get events --sort-by='.lastTimestamp' # global events
When things go wrong
- Pod stuck in Pending means there is no node with the required resources/labels. kubectl describe pod shows events: FailedScheduling. Check kubectl describe node (Allocatable, taints).
- Pod stuck in ContainerCreating means image pulls take too long or the volume does not mount. Check kubectl describe pod events.
- OOMKilled with no obvious cause means the limits are too low, or the JVM is not aware of the cgroup limit (-XX:+UseContainerSupport is available, and enabled by default, from Java 8u191 and Java 10; older JVMs ignore the limit).
- The liveness probe kills the container after a deploy means initialDelaySeconds is too small and the application is still starting. Use a startup probe.
- kubectl delete pod hangs means finalizers or a very large terminationGracePeriodSeconds. As a last resort, use kubectl delete pod --grace-period=0 --force. This skips the graceful shutdown, and a pod held by finalizers stays until they are removed.
- The container is alive but gets no traffic means the readiness probe failed. kubectl describe pod shows conditions: ContainersReady=False.
- An init container finishes but the pod does not move means a non-zero exit. Check kubectl logs mypod -c <init-name>.
Useful kubectl commands
kubectl get pod -o wide # find the node + IP
kubectl get pod mypod -o jsonpath='{.status.containerStatuses[*].state}'   # container states of one pod
kubectl exec -it mypod -- sh
kubectl debug -it mypod --image=busybox # ephemeral container 1.25+
kubectl port-forward pod/mypod 8080:8080 # local test without a Service
kubectl rollout status deployment/myapp
kubectl rollout restart deployment/myapp # rolling restart without editing the spec