When it fires
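Linux overcommits memory by default: it grants allocations without reserving physical pages for them. If every process touched all the memory it had been promised at the same time, there would not be enough physical pages. As long as most of that memory stays untouched, everything works. Once physical memory runs out: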
- on the host: the kernel runs the global OOM killer
- in a container or a cgroup with memory.max set: a cgroup OOM runs and kills only within that cgroup
The alternative is a full kernel panic, but the kernel takes that path only as a last resort: when vm.panic_on_oom is set, or when no killable process is left.
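How far a host is overcommitted can be read straight from the kernel's counters. The following is a minimal sketch, assuming a Linux /proc; overcommit_mode_name is a helper made up for this example, not a standard tool. It relies on vm.overcommit_memory taking the values 0 (heuristic), 1 (always overcommit) and 2 (strict accounting).

```shell
# Translate the vm.overcommit_memory value into a readable name.
# overcommit_mode_name is a hypothetical helper, not a standard tool.
overcommit_mode_name() {
    case "$1" in
        0) echo "heuristic" ;;
        1) echo "always" ;;
        2) echo "strict" ;;
        *) echo "unknown" ;;
    esac
}

# Only touch /proc when it is available (Linux).
if [ -r /proc/sys/vm/overcommit_memory ]; then
    mode=$(cat /proc/sys/vm/overcommit_memory)
    echo "overcommit mode: $mode ($(overcommit_mode_name "$mode"))"
    # Committed_AS is what processes have been promised so far;
    # CommitLimit is enforced only in strict mode (2).
    grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo || true
fi
```

When Committed_AS is well above the physical RAM plus swap, the host is relying on processes not touching everything they asked for.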
How the victim is chosen
Each process gets an oom_score (0..1000) based on:
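- how much memory it uses: resident pages plus its swap and page tables (the main factor)
- oom_score_adj, a manual adjustment (-1000 = immune, +1000 = killed first)
- root processes: older kernels gave them a small discount (about 3% off the score); newer kernels dropped it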
Files:
cat /proc/<pid>/oom_score # current final score
cat /proc/<pid>/oom_score_adj # manual adjustment
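To see which processes the kernel would currently pick first, the two files can be read for every PID and sorted. This is a rough sketch rather than a polished tool: list_oom_scores is a hypothetical helper, and its optional argument (the proc root, /proc by default) exists only to make it easy to try against a fake directory.

```shell
# Print "score adj pid name" for every process, highest oom_score first.
# list_oom_scores is a hypothetical helper; the optional argument is the
# proc root and defaults to /proc.
list_oom_scores() {
    proc_root="${1:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        # Processes can exit while we iterate, so skip unreadable entries.
        [ -r "$dir/oom_score" ] || continue
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        adj=$(cat "$dir/oom_score_adj" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || name="?"
        echo "$score $adj ${dir##*/} $name"
    done | sort -k1,1nr
}

# The ten most likely victims right now.
list_oom_scores | head -n 10
```

Processes with an oom_score_adj of -1000 show a score of 0 and end up at the bottom of the list.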
How to protect a critical process
# pidof may return several PIDs, so set the value on each one
for pid in $(pidof sshd); do echo -1000 | sudo tee "/proc/$pid/oom_score_adj"; done
# now sshd will never be an OOM victim
In a systemd unit, OOMScoreAdjust=-1000 in the [Service] section does the same thing and survives restarts.
OOM in cgroups v2
In v2, writing 1 to memory.oom.group changes the behavior: on OOM the kernel kills
the whole cgroup rather than only the single "worst" process. This is critical for
applications where losing one worker breaks the rest (the Kubernetes use case).
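The relevant settings of a cgroup can be inspected from its directory in the cgroup v2 filesystem. A small sketch, assuming a non-root cgroup with the memory controller enabled; cgroup_oom_report is a hypothetical helper, and the directory you pass is whatever path your service or container actually uses.

```shell
# Report the settings that decide how a cgroup v2 reacts to OOM.
# cgroup_oom_report is a hypothetical helper; pass it a cgroup directory,
# typically somewhere under /sys/fs/cgroup.
cgroup_oom_report() {
    cg="$1"
    echo "memory.max=$(cat "$cg/memory.max")"
    echo "memory.oom.group=$(cat "$cg/memory.oom.group")"
    # memory.events holds "key value" lines; oom_kill counts killed processes.
    echo "oom_kill=$(awk '$1 == "oom_kill" { print $2 }' "$cg/memory.events")"
}
```

A non-zero oom_kill means the limit has already been hit at least once; enabling group kill is then a matter of writing 1 into the memory.oom.group file of that same directory.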
Signs that an OOM happened
dmesg | grep -i 'killed process'
journalctl -k | grep -i oom
# Mar 14 12:34:56 host kernel: Out of memory: Killed process 1234 (chrome) ...
In a container, docker inspect shows:
docker inspect <container> --format '{{.State.OOMKilled}}'
# true → the container was killed by OOM, it did not exit on its own
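When the log contains many such lines, it helps to reduce them to just the victims. The filter below is a sketch that assumes the "Killed process <pid> (<name>)" wording from the sample log line; oom_victims is a made-up name, and the exact kernel message can differ between versions.

```shell
# Extract "pid name" from kernel OOM lines read on stdin.
# oom_victims is a hypothetical helper; it assumes the
# "Killed process <pid> (<name>)" wording of the kernel message.
oom_victims() {
    sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Example: journalctl -k | oom_victims
```

Lines that do not match are dropped, so the function can be fed the whole kernel log.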
The main causes of OOM in production:
- Memory leak in the application (Java/Node with no heap limit, growing right up to the server's RAM)
- Too tight a limit in k8s. The application works but does not fit the limit at peaks; the fix is raising request/limit or vertical autoscaling
- A large allocation in a single operation (for example, reading a huge file into memory whole)
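For the last cause, the usual remedy is to process the data as a stream instead of loading it all at once. As an illustration only (sum_first_column is a hypothetical helper, and the task of summing a column is an assumed example), awk reads one line at a time, so memory use does not grow with the file:

```shell
# Sum the first column of a file one line at a time.
# awk keeps only the current line and the running total in memory,
# so memory use stays flat regardless of file size.
sum_first_column() {
    awk '{ total += $1 } END { print total + 0 }' "$1"
}
```

The same idea applies in any language: iterate over lines or fixed-size chunks rather than reading the whole file into one buffer.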