linuxlab.io
Footer
linuxlab-TutorialsPricingAboutPrivacy & cookies
Copyright © 2026 LinuxLab. All rights reserved.

kb/processes ── Processes & resources ── advanced

cgroups (v2)

cgroups v2 is a hierarchical virtual FS under `/sys/fs/cgroup` that the kernel uses to limit CPU, memory, and I/O for processes. Docker, k8s, and systemd write here.

aka: cgroup, cgroups-v2, control-groups

What a cgroup is

A control group is a set of processes that share resource limits (CPU, RAM, I/O, PIDs) and whose consumption is accounted together.

In cgroups v2 (the standard on modern Ubuntu/Debian/Fedora) there is one directory hierarchy under /sys/fs/cgroup/. Each directory is a cgroup. Subdirectories are child cgroups, and a child can never exceed its parent's limits: a parent's limit applies to its whole subtree.
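
Because a parent's limit constrains its whole subtree, the limit that actually binds a cgroup is the tightest one on the path up to the mount point. The sketch below illustrates this for memory.max (the memory controller's hard limit, described under Controllers). The function name `effective_memory_max` is hypothetical, and the walk is a simplification for illustration, not how the kernel itself enforces limits.

```bash
# Print the tightest memory.max on the path from a cgroup directory up to
# the cgroup mount point ("max" means no limit anywhere on the path).
# $1: cgroup directory, $2: mount point (assumed default: /sys/fs/cgroup)
effective_memory_max() {
    local dir=$1 root=${2:-/sys/fs/cgroup} limit=max value
    while :; do
        # The real root cgroup has no memory.max, so a missing file is skipped.
        if [ -r "$dir/memory.max" ]; then
            value=$(cat "$dir/memory.max")
            if [ "$value" != max ] && { [ "$limit" = max ] || [ "$value" -lt "$limit" ]; }; then
                limit=$value
            fi
        fi
        # Stop at the mount point; "/" and "." guard against paths outside it.
        case $dir in
            "$root" | / | .) break ;;
        esac
        dir=$(dirname "$dir")
    done
    printf '%s\n' "$limit"
}
```

For example, `effective_memory_max /sys/fs/cgroup/system.slice/docker-abc123.scope` would print the smallest byte value found on that path, or `max` if no ancestor sets a limit.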

Each process belongs to exactly one cgroup. To find yours:

bash
cat /proc/self/cgroup
# 0::/system.slice/docker-abc123.scope

The full path is /sys/fs/cgroup plus the path that follows `0::` in that file. In this example: /sys/fs/cgroup/system.slice/docker-abc123.scope.
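
The path lookup can be wrapped in a small helper. This is a minimal sketch: the function name `cgroup_dir` is hypothetical, and it assumes the v2 hierarchy is mounted at /sys/fs/cgroup unless another mount point is passed.

```bash
# Print the cgroup v2 directory recorded in a /proc/<pid>/cgroup file.
# $1: cgroup file (default: /proc/self/cgroup)
# $2: cgroup mount point (assumed default: /sys/fs/cgroup)
cgroup_dir() {
    local file=${1:-/proc/self/cgroup} root=${2:-/sys/fs/cgroup} rel
    # The v2 entry has hierarchy ID 0 and an empty controller list: "0::<path>".
    # Strip the prefix instead of splitting on ":" so a path that itself
    # contains a colon stays intact.
    rel=$(sed -n 's/^0:://p' "$file" | head -n 1)
    if [ -z "$rel" ]; then
        return 1            # no v2 entry found
    fi
    # The root cgroup is reported as "/"; avoid a trailing slash.
    if [ "$rel" = "/" ]; then
        rel=""
    fi
    printf '%s%s\n' "$root" "$rel"
}
```

Calling `ls "$(cgroup_dir)"` then lists the interface files of the current shell's cgroup.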

Controllers

Within a single hierarchy you enable "controllers", the modules that do accounting and apply limits:

  • cpu: weight and quota; cpu.weight = relative share, cpu.max = "<quota> <period>" in µs (e.g. 50000 100000 = half a CPU)
  • memory: memory.max = hard limit, memory.high = soft limit (above it the cgroup is throttled and its memory reclaimed)
  • io: bandwidth and IOPS limits per device (io.max)
  • pids: pids.max (how many processes and threads are allowed)
  • cpuset: pinning to specific cores and NUMA nodes

You enable them for child cgroups by writing to the parent's cgroup.subtree_control. These two files show the current state:

bash
cat /sys/fs/cgroup/cgroup.controllers       # available
cat /sys/fs/cgroup/cgroup.subtree_control   # enabled on children
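
To see which controllers still need enabling, the two files can be compared. A minimal sketch, with a hypothetical function name `disabled_controllers`; it only reads the files, and the write that would actually enable a controller is shown as a comment because it requires root.

```bash
# Print controllers that are available in a cgroup (cgroup.controllers)
# but not yet enabled for its children (cgroup.subtree_control).
# $1: cgroup directory (assumed default: /sys/fs/cgroup)
disabled_controllers() {
    local dir=${1:-/sys/fs/cgroup} ctrl enabled
    # Both files hold one space-separated line; pad with spaces so that
    # whole-word matching works ("cpu" must not match "cpuset").
    enabled=" $(cat "$dir/cgroup.subtree_control") "
    for ctrl in $(cat "$dir/cgroup.controllers"); do
        case $enabled in
            *" $ctrl "*) ;;                      # already enabled for children
            *) printf '%s\n' "$ctrl" ;;
        esac
    done
}

# Enabling is a write of "+name" tokens (as root), for example:
#   echo "+cpu +memory" > /sys/fs/cgroup/cgroup.subtree_control
```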

What Docker / k8s / systemd write here

When you run docker run --cpus=0.5 --memory=256m:

  • Docker creates /sys/fs/cgroup/system.slice/docker-<id>.scope/ (this path assumes the systemd cgroup driver)
  • Writes cpu.max = "50000 100000" (50ms out of 100ms)
  • Writes memory.max = 268435456
  • Places the container init PID into cgroup.procs

A k8s pod with resources: limits: { cpu: 500m, memory: 256Mi } does the same through kubelet → cri-o/containerd → kernel.

A systemd unit with MemoryMax=512M does the same thing, only through slice/scope units.
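
The arithmetic behind those values is simple enough to sketch. The helper names below (`cpus_to_cpu_max`, `mem_to_bytes`, `cpu_max_to_cpus`) are hypothetical and only illustrate the conversion; they assume the 100000 µs period from the example above and binary (1024-based) memory suffixes, and they are not Docker's own code.

```bash
# CPU count -> cpu.max value "<quota> <period>" (period in µs).
cpus_to_cpu_max() {
    local cpus=$1 period=${2:-100000}
    LC_ALL=C awk -v c="$cpus" -v p="$period" \
        'BEGIN { printf "%d %d\n", c * p + 0.5, p }'   # +0.5 rounds to nearest
}

# Size with an optional b/k/m/g suffix -> bytes for memory.max.
mem_to_bytes() {
    local n=${1%[bkmgBKMG]} unit=${1##*[0-9]}
    case $unit in
        "" | b | B) echo "$n" ;;
        k | K) echo $((n * 1024)) ;;
        m | M) echo $((n * 1024 * 1024)) ;;
        g | G) echo $((n * 1024 * 1024 * 1024)) ;;
        *) return 1 ;;                                  # unknown suffix
    esac
}

# cpu.max value -> number of CPUs it allows ("max" quota means unlimited).
cpu_max_to_cpus() {
    local quota period
    read -r quota period <<<"$1"
    if [ "$quota" = max ]; then
        echo unlimited
        return
    fi
    LC_ALL=C awk -v q="$quota" -v p="$period" 'BEGIN { printf "%.2f\n", q / p }'
}
```

With these, `cpus_to_cpu_max 0.5` prints `50000 100000` and `mem_to_bytes 256m` prints `268435456`, matching the values listed above.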

PSI: Pressure Stall Information

The most useful addition in v2 is the pressure files: cpu.pressure, memory.pressure, and io.pressure. They show the share of wall-clock time during which tasks in the cgroup were stalled waiting for a resource. PSI is more informative than load average because it is a percentage of time rather than a count of queued tasks, and because it is available per cgroup, which matters inside containers.

some avg10=12.34 avg60=8.90 avg300=4.50 total=...
full avg10=2.10  avg60=1.80 avg300=0.90 total=...

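some = at least one task was stalled on the resource; full = all non-idle tasks were stalled at the same time. avg10, avg60, and avg300 are percentages averaged over the last 10, 60, and 300 seconds; total is the cumulative stall time in microseconds.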
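
Pulling a single number out of a pressure file is handy for scripts and alerts. A minimal sketch, assuming the line format shown above; the function name `psi_avg` is hypothetical.

```bash
# Print one averaged value from a PSI file.
# $1: pressure file (e.g. /sys/fs/cgroup/cpu.pressure)
# $2: line to read: some | full
# $3: field to read: avg10 | avg60 | avg300
psi_avg() {
    awk -v kind="$2" -v key="$3" '
        $1 == kind {
            # Fields look like "avg10=12.34"; split each on "=".
            for (i = 2; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == key) print kv[2]
            }
        }' "$1"
}
```

For example, `psi_avg /sys/fs/cgroup/memory.pressure some avg10` prints the 10-second "some" memory pressure for the cgroup mounted there.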

OOM in a cgroup

When memory usage in a cgroup reaches memory.max and the kernel cannot reclaim enough, the OOM killer fires, but it only picks victims inside that cgroup. Processes elsewhere on the system are not killed.
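
Whether this has already happened can be checked from the cgroup itself. The sketch below reads memory.events, a memory-controller file not covered above that keeps per-cgroup event counters, including an oom_kill line. The function name `oom_kills` is hypothetical.

```bash
# Print how many processes the OOM killer has killed in a cgroup.
# $1: cgroup directory (e.g. /sys/fs/cgroup/system.slice/docker-<id>.scope)
oom_kills() {
    # memory.events holds "name count" pairs, one per line; match the
    # name exactly so similarly named counters are not picked up.
    awk '$1 == "oom_kill" { print $2 }' "$1/memory.events"
}
```

A non-zero result for a container's cgroup is a strong hint that its memory limit is too low for the workload.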

§ commands

bash
cat /proc/self/cgroup

Which cgroup the current process is in

bash
MY=$(awk -F: '$1 == "0" {print $3}' /proc/self/cgroup); ls "/sys/fs/cgroup$MY"

Interface files of our cgroup; the cpu.*, memory.*, io.*, and pids.* files present show which controllers are enabled for it

bash
cat /sys/fs/cgroup/cpu.max

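CPU limit of the cgroup mounted at `/sys/fs/cgroup`: `<quota> <period>` in µs, or `max` for no limit. Inside a container this is the container's own cgroup; the host's real root cgroup has no cpu.max file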

bash
cat /sys/fs/cgroup/memory.current

How much RAM the cgroup uses right now (bytes). The file exists only in non-root cgroups, for example when read inside a container

bash
cat /sys/fs/cgroup/cpu.pressure

PSI for CPU: the share of time tasks in this cgroup were stalled waiting for the CPU

§ see also

  • process-and-pid ── Process and PID ── A process is a running program with its own PID, memory, open descriptors, and UID. All processes form a tree rooted at init (PID 1).
  • namespaces ── Linux namespaces ── Namespaces are a kernel mechanism that gives a process its own isolated view of a resource (network, mount points, PID, UID, IPC, hostname, time). Every container is built on them.
  • cgroups-v2-deep ── cgroups v2: unified hierarchy, PSI, eBPF control ── cgroups v2 uses one tree instead of separate per-controller hierarchies. Clean semantics, new fields (memory.high, io.cost). PSI shows resource pressure. eBPF can manage resources. Default in RHEL 9, Ubuntu 22+.
  • oom-killer ── OOM killer ── The OOM killer is the kernel mechanism that picks and terminates a process when the system hits its memory limit. In containers it works per-cgroup.
  • kubelet-internals ── kubelet: the Kubernetes node agent architecture ── kubelet is a daemon on every node. It receives the PodSpec through the API, starts containers through CRI, mounts volumes through CSI, and watches health. Under resource pressure it evicts pods. Image GC and the cgroup tree are also its job.

§ mentioned in lessons

  • ›advanced-02-cgroups-v2
  • ›advanced-07-perf-and-flame
  • ›beginner-02-directory-tree