Copyright © 2026 LinuxLab. All rights reserved.

kb/containers ── Containers (bonus) ── advanced

Docker storage drivers: overlay2, btrfs, zfs

A storage driver is how Docker keeps image layers and container changes on disk. overlay2 is the default (overlayfs over ext4/xfs), btrfs and zfs work through subvolumes and snapshots, fuse-overlayfs is for rootless.

aka: storage-driver, overlay2, graph-driver, docker-graphdriver, container-storage

Why there are different drivers

Image layers are read-only. A container writes, so it needs a writable layer on top. Older filesystems could not do this efficiently, so Docker supported several storage drivers, each with its own approach.

Today overlay2 covers 95% of cases, but the rest are worth knowing. You will meet them in legacy systems and in production scenarios with special requirements.

Where all of this lives:

/var/lib/docker/
├── overlay2/                  ← if driver = overlay2
│   ├── <layer-hash>/
│   │   ├── diff/              ← files of a specific layer
│   │   ├── lower              ← list of lower layers
│   │   └── work/              ← overlayfs working directory
├── containers/
└── image/

Which driver is active now:

bash
docker info | grep -i storage
# Storage Driver: overlay2
# Backing Filesystem: extfs
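
For scripts, the two lines above can be condensed into one. This is a minimal sketch, assuming the `docker info` text keeps the `Storage Driver:` and `Backing Filesystem:` labels shown here; `storage_summary` is a hypothetical helper name, not a Docker command.

```bash
# Reads `docker info` text on stdin and prints "<driver> on <backing fs>".
# Returns non-zero if no "Storage Driver:" line was seen.
storage_summary() {
    awk -F': *' '
        /^ *Storage Driver:/     { driver = $2 }
        /^ *Backing Filesystem:/ { backing = $2 }
        END {
            if (driver == "") exit 1
            if (backing == "") backing = "unknown"
            print driver " on " backing
        }'
}
```

Typical use is `docker info 2>/dev/null | storage_summary`. If only the driver name is needed, `docker info --format '{{.Driver}}'` avoids text parsing altogether.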

overlay2, the default

Uses the kernel's [[tmpfs-overlayfs|overlayfs]]. Image layers are the read-only lower, container changes are the upper.
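
To make the lower/upper split concrete, here is a sketch that prints (and does not run) the kind of overlay mount such a layer stack boils down to. The helper name `overlay_mount_cmd` and the example paths are illustrative assumptions; Docker builds its own mount internally and this is not its code.

```bash
# Prints the overlay mount command for a stack of layers.
# Arguments: upper dir, work dir, merged mount point, then lower dirs (topmost first).
overlay_mount_cmd() {
    upper=$1; work=$2; merged=$3
    shift 3
    lowers=""
    for dir in "$@"; do
        # overlayfs expects lower dirs joined with ":", leftmost on top
        lowers="${lowers:+$lowers:}$dir"
    done
    printf 'mount -t overlay overlay -o lowerdir=%s,upperdir=%s,workdir=%s %s\n' \
        "$lowers" "$upper" "$work" "$merged"
}
```

For example, `overlay_mount_cmd /c/diff /c/work /c/merged /img/app /img/base` prints a command whose `lowerdir` is `/img/app:/img/base`: the image layers stay read-only and every write lands in `/c/diff`.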

Pros:

  • In the kernel, no userspace overhead
  • Page cache is shared between containers with the same base image
  • Fast startup, the overlay mount is cheap

Cons:

  • Copy-up on write copies whole files: a large file with small edits takes double the space, and inode usage grows fast
  • It does not like many layers: Docker's overlay2 driver supports at most 128 lower layers
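
To see how close an image is to the 128-layer ceiling, the `lower` file from the layout above can be counted. A minimal sketch, assuming the file holds a colon-separated list of lower-layer links on one line; `count_lower_layers` is a hypothetical helper name.

```bash
# Prints how many lower layers a layer directory's `lower` file lists.
# $1 = path to the `lower` file.
count_lower_layers() {
    # One entry per line, then count the non-empty lines.
    tr ':' '\n' < "$1" | grep -c .
}
```

A loop such as `for f in /var/lib/docker/overlay2/*/lower; do echo "$(count_lower_layers "$f") $f"; done | sort -n | tail -1` (run as root) shows the deepest stack. Base layers have no `lower` file, so they are skipped by the glob.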

Backing filesystem: ext4, xfs (with ftype=1), btrfs, [[tmpfs-overlayfs|tmpfs]] (for rootless through a user namespace).

For XFS this is mandatory at mkfs time:

bash
mkfs.xfs -f -n ftype=1 /dev/sdb1

Otherwise overlay2 refuses to work (d_type is missing).
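
An existing XFS can be checked before pointing Docker at it. The sketch below assumes `xfs_info` reports the setting as `ftype=1` or `ftype=0` in its `naming` line; `check_xfs_ftype` is a hypothetical helper name.

```bash
# Reads `xfs_info <mountpoint>` output on stdin and reports whether d_type is available.
check_xfs_ftype() {
    if grep -Eq '(^|[ ,])ftype=1($|[ ,])'; then
        echo "ftype=1: overlay2 can use this filesystem"
    else
        echo "ftype=0 or missing: recreate the filesystem with -n ftype=1"
        return 1
    fi
}
```

Usage: `xfs_info /var/lib/docker | check_xfs_ftype`.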

btrfs, native CoW

Uses [[btrfs|btrfs]] subvolumes and snapshots:

  • Each layer is a subvolume
  • The container layer is a snapshot of the last image subvolume
  • Writes inside the container are copy-on-write at the filesystem level
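
The subvolume-per-layer scheme above can be imitated by hand. This dry-run sketch only prints the `btrfs` commands that would mirror a layer chain; the function name `btrfs_layer_plan` and the paths are illustrative assumptions, and Docker's real driver uses its own directory names.

```bash
# Prints (does not run) btrfs commands that mirror a layer chain.
# $1 = directory on a btrfs mount, remaining args = layer names, base layer first.
btrfs_layer_plan() {
    root=$1
    shift
    parent=""
    for layer in "$@"; do
        if [ -z "$parent" ]; then
            # The base layer is a fresh subvolume.
            echo "btrfs subvolume create $root/$layer"
        else
            # Every later layer, including the container's, is a snapshot of its parent.
            echo "btrfs subvolume snapshot $root/$parent $root/$layer"
        fi
        parent=$layer
    done
}
```

`btrfs_layer_plan /mnt/demo base app container` prints one `create` followed by two `snapshot` commands. The last snapshot plays the role of the container's writable layer.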

Pros:

  • Snapshot in O(1)
  • Deduplication at the level of FS blocks
  • btrfs scrub finds bit rot

Cons:

  • All of /var/lib/docker must be on btrfs
  • Btrfs does not like a near-full filesystem
  • On a high-write workload, fragmentation hurts performance
  • In 2026 it is not the default anywhere, a niche choice

To select it, set the driver in /etc/docker/daemon.json:

json
{ "storage-driver": "btrfs" }

zfs, enterprise CoW

Similar to btrfs, but through ZFS:

  • Each layer is a zfs filesystem
  • A container is a clone of the last layer
  • zfs send/receive can move the underlying datasets between hosts

Pros:

  • ZFS-grade reliability (end-to-end checksums, scrubs, raid-z)
  • Deduplication (if enabled)
  • Snapshots are cheap

Cons:

  • ZFS is not in the mainline kernel: you need OpenZFS (formerly ZFS on Linux, ZoL), with the CDDL+GPL licensing friction
  • RAM-hungry (ARC eats free memory)
  • The production stack is rare, mostly TrueNAS and Solaris descendants

For containers this is usually overkill, except in special cases (a storage server).

devicemapper, legacy

Creates a thin pool on block devices, each layer is a separate thin LV.

It used to be the default on RHEL 7. Deprecated since Docker 18.09, disabled by default in 23.0 (2023), and removed in 25.0 (2024). Do not use it.

vfs, fallback

It simply copies files recursively between layers. No CoW.

  • Huge disk usage
  • Very slow pull
  • Works on any filesystem

The fallback for rootless Docker when neither overlay2 nor fuse-overlayfs is usable. Never use it in production.

fuse-overlayfs, rootless

In rootless Docker/Podman on older kernels, plain overlayfs cannot be mounted (it needs CAP_SYS_ADMIN). fuse-overlayfs does the same thing in userspace:

bash
# Podman rootless by default
podman info | grep -i graphdriver
# graphDriverName: overlay
# ... mountopt: nodev,fsync=0

Kernels 5.11+ allow unprivileged overlayfs mounts inside a user namespace, so overlay works without FUSE and is faster. fuse-overlayfs is still needed on older kernels.
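
A script can use the 5.11 threshold to choose between the two. This is only a heuristic sketch: it compares the version string and ignores distribution backports, and `kernel_has_rootless_overlay` is a hypothetical helper name.

```bash
# Succeeds if the kernel release string is 5.11 or newer.
# $1 = release string (defaults to the running kernel's `uname -r`).
kernel_has_rootless_overlay() {
    release=${1:-$(uname -r)}
    major=${release%%.*}          # "5.10.0-23-amd64" -> "5"
    rest=${release#*.}            # -> "10.0-23-amd64"
    minor=${rest%%[!0-9]*}        # -> "10"
    [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "${minor:-0}" -ge 11 ]; }
}
```

For example: `kernel_has_rootless_overlay && echo "overlay" || echo "fuse-overlayfs"`.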

Comparison

Driver          | Backing        | CoW level                      | Recommendation
overlay2        | ext4/xfs/btrfs | overlayfs (file)               | default, 95% of cases
btrfs           | btrfs          | btrfs subvol/snapshot (block)  | if you already run btrfs
zfs             | zfs            | zfs clone (block)              | enterprise NAS
devicemapper    | thin pool      | LVM thin (block)               | deprecated
vfs             | any            | none (full copy)               | testing/fallback
fuse-overlayfs  | any            | userspace overlay              | rootless legacy

Performance: the large-write problem of overlay2

Overlay2 does a copy-up on write: on the first write to a file from a read-only layer, the whole file is copied into the writable layer. If a file is 10 GB and one byte changes, the copy is 10 GB.
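
The effect is easy to reproduce without mounting anything. The sketch below is a userspace imitation of copy-up using two plain directories. It is not how the kernel implements it, and `simulated_write` is a hypothetical helper name.

```bash
# Userspace imitation of overlayfs copy-up, for illustration only.
# $1 = lower (read-only) dir, $2 = upper (writable) dir,
# $3 = relative file path, $4 = text to append.
simulated_write() {
    lower=$1; upper=$2; path=$3; data=$4
    if [ ! -e "$upper/$path" ] && [ -e "$lower/$path" ]; then
        mkdir -p "$(dirname "$upper/$path")"
        # First write: the WHOLE file is copied up, however small the change.
        cp "$lower/$path" "$upper/$path"
    fi
    # The write itself only ever touches the upper copy.
    printf '%s' "$data" >> "$upper/$path"
}
```

Appending one byte to a 1 MiB file in `lower/` leaves a 1048577-byte copy in `upper/` while the original stays untouched. A second write to the same file does not copy again.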

For databases and VM images this is fatal. The fixes:

  • A volume or bind mount for the data directory:
    bash
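    # POSTGRES_PASSWORD is a placeholder; the official image will not start without one
    docker run -e POSTGRES_PASSWORD=change-me -v /var/lib/postgres-data:/var/lib/postgresql/data postgres
    The mount sits outside the overlay, so there is no copy-up.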
  • chattr +C on the host directory if the backing is btrfs ([[btrfs|see this]])
  • tmpfs for ephemeral directories:
    bash
    docker run --tmpfs /tmp:size=100M ...

Storage size limits

By default a container can take up the whole disk (until the FS is full). The limit:

bash
docker run --storage-opt size=10G ...

Works only on storage drivers with a native quota:

  • overlay2 + xfs+pquota, yes
  • btrfs, yes through subvolume quota
  • devicemapper, yes (thin pool)
  • overlay2 + ext4, no
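
The list above can be turned into a pre-flight check. This sketch encodes only the combinations named here and answers "unknown" for any other driver. `size_limit_supported` is a hypothetical helper name, and `prjquota` is included because XFS reports the project-quota option under that name in its mount options.

```bash
# Prints yes / no / unknown for: <driver> <backing fs> <comma-separated mount options>.
size_limit_supported() {
    driver=$1; backing=$2; mount_opts=$3
    case "$driver" in
        btrfs|devicemapper)
            echo yes ;;
        overlay2)
            # Wrap the options in commas so whole option names are matched.
            case "$backing:,$mount_opts," in
                xfs:*,pquota,*|xfs:*,prjquota,*) echo yes ;;
                *)                               echo no ;;
            esac ;;
        *)
            echo unknown ;;   # not covered by the list above
    esac
}
```

The inputs can come from `findmnt -no FSTYPE --target /var/lib/docker` and `findmnt -no OPTIONS --target /var/lib/docker`.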

The same option is available in docker-compose as storage_opt:. In k8s, use ephemeral-storage requests/limits instead.

Cleaning up disk usage

bash
docker system df                            # how much is used
docker image prune                          # unused images
docker container prune                      # stopped ones
docker volume prune                         # unused anonymous volumes (-a for named ones too, Docker 23+)
docker builder prune                        # build cache
docker system prune -a --volumes            # EVERYTHING (dangerous, removes images)
# Targeted: which layers and their size
du -sh /var/lib/docker/overlay2/* | sort -h | tail

A large /var/lib/docker/overlay2/ is normal for CI runners. A daily prune is a must there.
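
On a runner the prune is often tied to a usage threshold rather than run blindly. A minimal sketch, assuming POSIX `df -P` output with the capacity in the fifth column. `disk_above` is a hypothetical helper name and the 80% threshold is an arbitrary example.

```bash
# Reads `df -P <path>` output on stdin; succeeds when usage is at or above the threshold.
# $1 = threshold in percent.
disk_above() {
    awk -v limit="$1" '
        NR == 2 { sub(/%/, "", $5); over = ($5 + 0 >= limit + 0); seen = 1 }
        END     { exit !(seen && over) }'
}
```

A daily cron job could then run `df -P /var/lib/docker | disk_above 80 && docker system prune -f --filter until=24h`, which removes only objects older than a day.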

When things go wrong

  • failed to register layer: ApplyLayer exit status 1. Usually the backing FS does not support overlay (vfat), or space ran out.
  • device or resource busy on docker rm. The overlay mount is still held: run docker stop first, or find the container's processes with docker top and kill them.
  • d_type is not supported. The XFS was made without ftype=1; the only fix is to recreate the FS with the correct mkfs options.
  • No space left on device while free GB remain. Inodes ran out on ext4 (check with df -i), because overlay creates many small files. Recreate the FS with a denser inode ratio, for example mkfs.ext4 -i 4096.
  • A very slow pull. Too many small layers (docker history will show 50+); combine RUN commands when you build.
  • /var/lib/docker ballooned out of proportion. Old dangling images, exited containers that were never removed, and build cache pile up. Run docker system df, then prune.
  • A container's write to a shared volume "disappears". Another container on the same volume wrote over it: a volume is not an overlay, so the last writer wins.
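
For log triage, the first few symptoms above can be matched mechanically. This is a sketch of a lookup table and nothing more: the hint texts paraphrase this list, and `hint_for` is a hypothetical helper name.

```bash
# Prints a first-step hint for a Docker error message passed as $1.
hint_for() {
    case "$1" in
        *"failed to register layer"*) echo "check the backing filesystem type and free space" ;;
        *"device or resource busy"*)  echo "stop the container or its leftover processes first" ;;
        *"d_type"*)                   echo "XFS lacks ftype=1: recreate the filesystem" ;;
        *"No space left on device"*)  echo "compare df -h with df -i: inodes may be exhausted" ;;
        *)                            echo "no hint: start with docker system df" ;;
    esac
}
```

Usage: `hint_for "$(docker rm mycontainer 2>&1)"`, where `mycontainer` stands for whichever container failed.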

Alternative runtimes and their storage

  • podman, similar drivers (overlay, vfs); rootless = overlay through a user namespace
  • containerd has its own "snapshotter" subsystem: native, overlayfs, btrfs, devmapper. The standard for k8s.
  • CRI-O, overlay (default) or vfs

§ commands

bash
docker info | grep -A5 'Storage Driver'

The storage driver and backing FS are the first thing to check

bash
docker system df

A summary of Docker's disk usage: images/containers/volumes/cache

bash
du -sh /var/lib/docker/overlay2/* | sort -h | tail

The top 10 fattest layer directories, where space is leaking

bash
docker image prune -a

Remove ALL unused images, not just dangling ones, careful in production

bash
docker run --storage-opt size=5G --runtime=runc ubuntu

A limit on the writable layer, works on xfs+pquota / btrfs / devmapper

bash
xfs_quota -x -c 'report' /var/lib/docker

Quotas for overlay2 on xfs, who takes how much

bash
stat -c '%T' -f /var/lib/docker

The FS type in /var/lib/docker, compatibility with the storage driver

§ see also

  • btrfs
    btrfs: copy-on-write, subvolumes, and snapshots
    btrfs is a copy-on-write filesystem with subvolumes, O(1) snapshots, native RAID 0/1/10, and data checksums. RAID 5/6 is problematic. COW fragmentation hurts databases and VM images, so turn it off for them.
  • xfs
    XFS: extents and parallel I/O
    XFS is the RHEL 7+ default: allocation groups (parallel I/O), extent-based allocation, online grow. It cannot shrink, grow only. Ideal for big files, databases, and parallel workloads.
  • cgroups-v2-deep
    cgroups v2: unified hierarchy, PSI, eBPF control
    cgroups v2 uses one tree instead of separate per-controller hierarchies. Clean semantics, new fields (memory.high, io.cost). PSI shows resource pressure. eBPF can manage resources. Default in RHEL 9, Ubuntu 22+.
  • oci-spec
    OCI spec: the container standard
    OCI is three specs: Image (layers + manifest), Runtime (config.json + rootfs for runc), Distribution (registry API). The standard that followed Docker; runc, podman, containerd, CRI-O are all OCI-compatible.