linuxlab.io
Copyright © 2026 LinuxLab. All rights reserved.

kb/containers ── Containers (bonus) ── intermediate

OCI spec: the container standard

OCI is three specs: Image (layers + manifest), Runtime (config.json + rootfs for runc), Distribution (registry API). The standard that followed Docker; runc, podman, containerd, CRI-O are all OCI-compatible.

aka: oci, oci-image, oci-runtime, oci-distribution, container-spec

Why OCI

In 2015 Docker agreed to move the container specs out of its own project so the ecosystem would not depend on a single vendor. The result was the Open Container Initiative (under the Linux Foundation), which maintains three separate specifications:

| Spec | What it describes |
| --- | --- |
| OCI Image | the on-disk image format: layers + manifest + config |
| OCI Runtime | how the runtime starts a container from rootfs + config.json |
| OCI Distribution | the registry HTTP API for push/pull |

Today "Docker container" is almost a synonym for "OCI container": the image a Dockerfile builds, the registry it is pushed to, and the runtime that starts it all follow OCI specs. Alternative runtimes ([[runc-and-runsc|runc/runsc]]), registries (Harbor, GHCR, Quay), and build tools (buildah, kaniko) all work with the same formats.

OCI Image: what it is on disk

In the OCI image layout an image is a directory of content-addressed files, not a single tarball. The structure:

myimage/
├── oci-layout                        ← {"imageLayoutVersion": "1.0.0"}
├── index.json                        ← root, points to the manifests
└── blobs/
    └── sha256/
        ├── <hash-config>             ← config (JSON)
        ├── <hash-layer1>             ← layer (tar or tar.gz)
        ├── <hash-layer2>
        └── <hash-manifest>           ← manifest (links config + layers)

index.json

json
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:abc...",
      "size": 1234,
      "platform": { "architecture": "amd64", "os": "linux" }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:def...",
      "size": 1234,
      "platform": { "architecture": "arm64", "os": "linux" }
    }
  ]
}

This is a multi-arch index. One manifest per platform.
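How a client picks one entry from such an index can be sketched in Python. This is a simplified illustration, not the logic of any particular runtime: `select_manifest` is a hypothetical helper, it matches only `os` and `architecture`, and real clients also consider fields such as `variant`.

```python
from typing import Optional


def select_manifest(index: dict, os_name: str, arch: str) -> Optional[dict]:
    """Return the first descriptor whose platform matches, or None."""
    for descriptor in index.get("manifests", []):
        platform = descriptor.get("platform", {})
        if platform.get("os") == os_name and platform.get("architecture") == arch:
            return descriptor
    return None


# Abbreviated copy of the index shown above (digests are placeholders).
index = {
    "schemaVersion": 2,
    "manifests": [
        {"digest": "sha256:abc...", "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:def...", "platform": {"architecture": "arm64", "os": "linux"}},
    ],
}

chosen = select_manifest(index, "linux", "arm64")
print(chosen["digest"] if chosen else "no matching platform")  # sha256:def...
```

When the function returns `None`, a client has nothing to pull for that platform, which is the situation behind "no matching manifest" errors.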

Manifest

json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:abc...",
    "size": 7000
  },
  "layers": [
    { "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:111...", "size": 5000000 },
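    { "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:222...", "size": 1000000 },
    { "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:333...", "size":   50000 }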
  ]
}

A manifest contains:

  • config: a descriptor pointing to the config blob (entrypoint, ENV, USER, WORKDIR)
  • layers: an ordered list of layer descriptors (tarballs), base layer first
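Because every descriptor carries a digest and a size, following a reference inside a layout comes down to opening `blobs/<algorithm>/<hex>` and checking both. The sketch below builds a throwaway layout in a temporary directory and resolves manifest to config; `write_blob` and `read_verified` are hypothetical helper names, and only sha256 digests are assumed.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def blob_path(layout: Path, digest: str) -> Path:
    """Map 'sha256:<hex>' to <layout>/blobs/sha256/<hex>."""
    algorithm, hex_value = digest.split(":", 1)
    return layout / "blobs" / algorithm / hex_value


def write_blob(layout: Path, data: bytes) -> dict:
    """Store data under its own sha256 and return a minimal descriptor."""
    digest = "sha256:" + hashlib.sha256(data).hexdigest()
    path = blob_path(layout, digest)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return {"digest": digest, "size": len(data)}


def read_verified(layout: Path, descriptor: dict) -> bytes:
    """Read a blob and reject it if size or digest do not match."""
    data = blob_path(layout, descriptor["digest"]).read_bytes()
    if len(data) != descriptor["size"]:
        raise ValueError("size mismatch")
    if "sha256:" + hashlib.sha256(data).hexdigest() != descriptor["digest"]:
        raise ValueError("digest mismatch")
    return data


with tempfile.TemporaryDirectory() as tmp:
    layout = Path(tmp)
    config_desc = write_blob(layout, json.dumps({"os": "linux"}).encode())
    manifest_desc = write_blob(
        layout,
        json.dumps({"schemaVersion": 2, "config": config_desc, "layers": []}).encode(),
    )
    manifest = json.loads(read_verified(layout, manifest_desc))
    config = json.loads(read_verified(layout, manifest["config"]))
    print(config["os"])  # linux
```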

Config

json
{
  "architecture": "amd64",
  "os": "linux",
  "config": {
    "User": "1000:1000",
    "Env": ["PATH=/usr/bin:/bin"],
    "Entrypoint": ["/app/server"],
    "Cmd": ["--port=8080"],
    "WorkingDir": "/app",
    "ExposedPorts": { "8080/tcp": {} }
  },
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:layer1-uncompressed-hash",
      "sha256:layer2-uncompressed-hash"
    ]
  },
  "history": [ ... ]
}
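The `diff_ids` in the config are hashes of the uncompressed layer tar, while the layer digests in the manifest are hashes of the blob as stored (usually gzip-compressed). A small Python sketch of the difference, using an in-memory one-file layer invented for the example:

```python
import gzip
import hashlib
import io
import tarfile


def make_layer_tar() -> bytes:
    """Build a tiny uncompressed layer tar with a single file."""
    payload = b"hello\n"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        info = tarfile.TarInfo(name="app/hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def sha256_digest(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


tar_bytes = make_layer_tar()
blob_fast = gzip.compress(tar_bytes, compresslevel=1, mtime=0)
blob_best = gzip.compress(tar_bytes, compresslevel=9, mtime=0)

diff_id = sha256_digest(tar_bytes)  # what goes into rootfs.diff_ids
print("diff_id:               ", diff_id)
print("layer digest (level 1):", sha256_digest(blob_fast))
print("layer digest (level 9):", sha256_digest(blob_best))
print("diff_id survives repack:", sha256_digest(gzip.decompress(blob_best)) == diff_id)
```

The two compressed blobs will typically hash differently even though they decompress to the same tar, so recompressing a layer changes its manifest digest but not its diff_id.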

Layers: the basis of image deduplication

Each layer is a diff against the previous one: added or changed files as a tar archive. Deleted files are .wh.<filename> whiteout markers.

On pull the client downloads only the layers it does not already have. If 100 images are based on the same ubuntu:22.04, the Ubuntu layer is stored once rather than 100 times, on the client and in the registry alike.

Layers are applied through [[tmpfs-overlayfs|overlayfs]]: a lower stack of read-only layers plus an upper layer for container writes.
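The diff-plus-whiteout semantics can be modelled without overlayfs at all. In this Python sketch a filesystem is just a dict of path to content and `apply_layer` is a hypothetical helper; opaque whiteouts (`.wh..wh..opq`) and file metadata are deliberately left out.

```python
WHITEOUT_PREFIX = ".wh."


def apply_layer(rootfs: dict, layer: dict) -> dict:
    """Apply one layer diff on top of rootfs and return the merged tree."""
    merged = dict(rootfs)

    # Whiteouts first: they hide entries that come from lower layers.
    for path in layer:
        directory, _, name = path.rpartition("/")
        if name.startswith(WHITEOUT_PREFIX):
            hidden = name[len(WHITEOUT_PREFIX):]
            target = f"{directory}/{hidden}" if directory else hidden
            for existing in list(merged):
                if existing == target or existing.startswith(target + "/"):
                    del merged[existing]

    # Then the files this layer adds or changes.
    for path, content in layer.items():
        if not path.rpartition("/")[2].startswith(WHITEOUT_PREFIX):
            merged[path] = content
    return merged


layers = [
    {"etc/motd": "hello", "tmp/build.log": "compile output"},
    {"etc/motd": "hi", "tmp/.wh.build.log": ""},  # change one file, delete another
]

rootfs = {}
for layer in layers:
    rootfs = apply_layer(rootfs, layer)

print(rootfs)  # {'etc/motd': 'hi'}
```

Note that the deleted file still exists in the first layer's tarball; a whiteout hides it in the merged view but does not shrink the image.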

Build vs pull

bash
# Build from a Dockerfile
docker build -t myimage:v1 .
# Push to a registry
docker push registry.example.com/myimage:v1
# Pull
docker pull registry.example.com/myimage:v1
# Without Docker, buildah / podman
buildah bud -t myimage:v1 .
buildah push myimage:v1 docker://registry.example.com/myimage:v1

buildah and podman need no daemon, run like ordinary CLI tools, and write OCI-compatible images.

OCI Runtime: config.json + rootfs

The runtime takes a bundle: a directory with

bundle/
├── config.json           ← everything about the container (mounts, namespaces, args)
└── rootfs/               ← extracted layers, the ready FS tree
    ├── bin/
    ├── etc/
    └── usr/

and starts the container. A bundle is not an image: it is the unpacked image plus the runtime config. A higher-level runtime (containerd, CRI-O) unpacks the image into a bundle before the low-level runtime can work with it.

config.json: what is inside

json
{
  "ociVersion": "1.2.0",
  "process": {
    "args": ["/app/server", "--port=8080"],
    "cwd": "/app",
    "env": ["PATH=/usr/bin:/bin"],
    "user": { "uid": 1000, "gid": 1000 },
    "capabilities": {
      "bounding": ["CAP_NET_BIND_SERVICE"],
      "effective": ["CAP_NET_BIND_SERVICE"],
      "permitted": ["CAP_NET_BIND_SERVICE"]
    },
    "noNewPrivileges": true,
    "rlimits": [ { "type": "RLIMIT_NOFILE", "hard": 65535, "soft": 65535 } ]
  },
  "root": { "path": "rootfs", "readonly": false },
  "mounts": [
    { "destination": "/proc", "type": "proc", "source": "proc" },
    { "destination": "/dev", "type": "tmpfs", "source": "tmpfs", "options": ["mode=755", "size=65536k"] },
    { "destination": "/data", "type": "bind", "source": "/var/lib/myapp/data", "options": ["bind", "ro"] }
  ],
  "linux": {
    "namespaces": [
      { "type": "pid" },
      { "type": "network" },
      { "type": "mount" },
      { "type": "uts" },
      { "type": "ipc" },
      { "type": "user" }
    ],
    "cgroupsPath": "system.slice:myapp:abc123",
    "resources": {
      "memory": { "limit": 268435456 },
      "cpu":    { "shares": 1024, "quota": 50000, "period": 100000 }
    },
    "seccomp": { "defaultAction": "SCMP_ACT_ALLOW", ... }
  }
}

This is the full description of the container: what to run, which namespaces, which cgroup limits, which capabilities, which seccomp profile.
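The `process` section is largely derived from the image config shown earlier: Entrypoint and Cmd are concatenated into `args`, and WorkingDir, Env and User carry over. A simplified Python sketch of that mapping follows. `process_from_image_config` is a hypothetical helper, it assumes a numeric `uid:gid` in `User`, and real runtimes additionally resolve user names from the image's /etc/passwd and apply their own defaults.

```python
def process_from_image_config(image_config: dict) -> dict:
    """Derive a minimal runtime 'process' object from an OCI image config."""
    cfg = image_config.get("config", {})

    # Entrypoint and Cmd are concatenated into the final argv.
    args = list(cfg.get("Entrypoint") or []) + list(cfg.get("Cmd") or [])

    # Assumption: User is numeric "uid" or "uid:gid"; a missing gid becomes 0.
    uid_text, _, gid_text = (cfg.get("User") or "0:0").partition(":")

    return {
        "args": args,
        "cwd": cfg.get("WorkingDir") or "/",
        "env": list(cfg.get("Env") or []),
        "user": {"uid": int(uid_text), "gid": int(gid_text or 0)},
    }


image_config = {
    "config": {
        "User": "1000:1000",
        "Env": ["PATH=/usr/bin:/bin"],
        "Entrypoint": ["/app/server"],
        "Cmd": ["--port=8080"],
        "WorkingDir": "/app",
    }
}

process = process_from_image_config(image_config)
print(process["args"])  # ['/app/server', '--port=8080']
print(process["user"])  # {'uid': 1000, 'gid': 1000}
```

Namespaces, mounts, cgroups and seccomp do not come from the image at all; whatever launches the container (Docker, containerd, Kubernetes) adds them.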

OCI Distribution: the registry API

The HTTP API of registries. The main endpoints:

GET  /v2/                                   ← ping
GET  /v2/<name>/tags/list                   ← list of tags
GET  /v2/<name>/manifests/<reference>        ← manifest by tag/digest
GET  /v2/<name>/blobs/<digest>               ← download a layer/config
POST /v2/<name>/blobs/uploads/               ← start an upload
PUT  /v2/<name>/manifests/<reference>        ← upload a manifest
  • <name> is the image name (library/ubuntu, myorg/myapp)
  • <reference> is a tag (v1.0) or a digest (sha256:...)

Any OCI registry (Docker Hub, GHCR, Harbor, Quay, ECR, GCR, ACR, GitLab Registry) implements this API. Pull and push are cross-compatible.
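Since the paths are uniform, building request URLs is plain string formatting plus a sanity check on `<reference>`. A minimal Python sketch, in which the helper names are hypothetical and only sha256 digests are assumed (the spec allows other algorithms):

```python
import re

# Tag grammar as given in the distribution spec; digests limited to sha256 here.
TAG_RE = re.compile(r"^[a-zA-Z0-9_][a-zA-Z0-9._-]{0,127}$")
DIGEST_RE = re.compile(r"^sha256:[a-f0-9]{64}$")


def reference_kind(reference: str) -> str:
    """Classify a reference as 'tag' or 'digest', or raise ValueError."""
    if DIGEST_RE.match(reference):
        return "digest"
    if TAG_RE.match(reference):
        return "tag"
    raise ValueError(f"invalid reference: {reference!r}")


def manifest_url(registry: str, name: str, reference: str) -> str:
    reference_kind(reference)  # validate before building the URL
    return f"https://{registry}/v2/{name}/manifests/{reference}"


def blob_url(registry: str, name: str, digest: str) -> str:
    if reference_kind(digest) != "digest":
        raise ValueError("blobs are addressed by digest only")
    return f"https://{registry}/v2/{name}/blobs/{digest}"


print(manifest_url("registry-1.docker.io", "library/alpine", "latest"))
# https://registry-1.docker.io/v2/library/alpine/manifests/latest
```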

Authentication is a Bearer token, usually through an OAuth2 token server.
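In practice the flow starts with a 401 whose `WWW-Authenticate` header tells the client where to get a token. The Python sketch below parses such a challenge and builds the token request URL. The host names are placeholders, the regex handles only simple quoted parameters, and fetching the token itself is left out.

```python
import re
from urllib.parse import urlencode


def parse_bearer_challenge(header: str) -> dict:
    """Split 'Bearer realm="...",service="...",scope="..."' into a dict."""
    scheme, _, params = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError("not a Bearer challenge")
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def token_url(challenge: dict) -> str:
    """Build the URL the client would GET to obtain a token."""
    query = {key: challenge[key] for key in ("service", "scope") if key in challenge}
    return challenge["realm"] + "?" + urlencode(query)


# Hypothetical header, as it might appear on a 401 response.
header = (
    'Bearer realm="https://auth.example.com/token",'
    'service="registry.example.com",'
    'scope="repository:myorg/myapp:pull"'
)

challenge = parse_bearer_challenge(header)
print(challenge["realm"])  # https://auth.example.com/token
print(token_url(challenge))
```

The token returned by that URL is then sent as `Authorization: Bearer <token>` on the retried registry request.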

bash
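# Raw request to a registry; Docker Hub wants a token even for public images (jq assumed installed)
TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/alpine:pull" | jq -r .token)
curl -H "Authorization: Bearer $TOKEN" \
     -H "Accept: application/vnd.oci.image.index.v1+json, application/vnd.oci.image.manifest.v1+json" \
     https://registry-1.docker.io/v2/library/alpine/manifests/latest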

skopeo: low-level work with OCI

bash
# Copy an image between registries without local unpacking
skopeo copy docker://registry.example.com/app:v1 \
            docker://other-registry.com/app:v1
# Inspect a manifest without a pull
skopeo inspect docker://nginx:latest
# Save as an OCI layout
skopeo copy docker://nginx:latest oci:/tmp/nginx-oci:latest
ls /tmp/nginx-oci/                          # the classic OCI structure

Tags vs digest: immutability

  • A tag (nginx:1.25) is a mutable pointer; latest especially so
  • A digest (nginx@sha256:abc...) is immutable, the hash of the manifest

In a production deploy, always pin by digest, not by tag. A tag can be rewritten in the registry; a digest cannot (change the content and the hash changes too).
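Why a digest cannot be rewritten while a tag can is easiest to see in a toy model. `TinyRegistry` below is a hypothetical in-memory stand-in, not a real registry client: manifests are stored under the hash of their bytes and tags are just a mutable name-to-digest map.

```python
import hashlib
import json


class TinyRegistry:
    """Toy model: content-addressed manifests plus mutable tags."""

    def __init__(self) -> None:
        self.manifests = {}  # digest -> manifest bytes
        self.tags = {}       # tag -> digest

    def push(self, tag: str, manifest: dict) -> str:
        data = json.dumps(manifest, sort_keys=True).encode()
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        self.manifests[digest] = data
        self.tags[tag] = digest  # silently overwrites an existing tag
        return digest

    def resolve(self, reference: str) -> bytes:
        # A tag is looked up first; anything else is treated as a digest.
        digest = self.tags.get(reference, reference)
        return self.manifests[digest]


registry = TinyRegistry()
first = registry.push("1.25", {"layers": ["sha256:111..."]})
second = registry.push("1.25", {"layers": ["sha256:111...", "sha256:999..."]})

print(first != second)                                          # True
print(registry.resolve("1.25") == registry.manifests[second])   # True: the tag moved
print(registry.resolve(first) == registry.manifests[first])     # True: the digest did not
```

A deployment pinned to `first` keeps getting the original manifest for as long as the registry retains it, regardless of what the tag points to later.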

bash
# Get the digest of the current tag (the image must already be pulled locally)
docker inspect --format='{{index .RepoDigests 0}}' nginx:1.25

▸nginx@sha256:abcdef...

# Pin it in a Dockerfile / k8s manifest
FROM nginx@sha256:abcdef...

When things go wrong

  • manifest unknown on pull: the tag does not exist or was removed from the registry. List what is there with skopeo list-tags docker://registry/repo.
  • A multi-arch image was not pulled: the container runtime found no matching platform manifest. Request a platform explicitly with docker pull --platform=linux/arm64.
  • unauthorized: no token, or it expired. Run docker login, and check the credentials in ~/.docker/config.json / ~/.config/containers/auth.json.
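  • The image build is slow every time: the layer cache is not being reused. A changed instruction invalidates the cache for every instruction after it, so put rarely changing steps (RUN apt-get install) first, then COPY package*.json and run the dependency install, and COPY the rest of the source last.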
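  • OCI vs Docker manifest schema: old registries and images may use the Docker manifest formats (schema 1, now deprecated, or schema 2), modern ones use the OCI manifest; what is returned also depends on the client's Accept header. Most clients handle both, but some server-side validators may fail.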
  • Digest mismatch on air-gapped transfer: after a gzip repack of a layer its SHA256 changes and the manifest becomes invalid. Use skopeo or save to an OCI layout.

Alternative formats (for the curious)

  • AppImage / Snap / Flatpak: for the desktop, not containers in the OCI sense
  • Singularity / Apptainer (.sif): scientific clusters, a single-file image
  • WASM components: not yet containers in OCI terms, but moving that way (some runtimes run WASM through an OCI config)

§ commands

bash
skopeo inspect docker://nginx:latest

The image manifest and config without a pull, a quick view

bash
skopeo copy docker://src/app:v1 oci:/tmp/app-oci:v1

Save an image to an OCI layout on disk, a portable format

bash
docker manifest inspect --verbose nginx:latest

The multi-arch index and each platform manifest

bash
docker inspect --format='{{index .RepoDigests 0}}' nginx:1.25

The digest of the current tag, for immutable pinning in prod

bash
umoci unpack --image image:tag bundle/

Unpack an OCI image into an OCI runtime bundle (config.json + rootfs)

bash
buildah bud -t myapp:v1 .

Build an OCI image without the Docker daemon, rootless-friendly

bash
podman pull --platform=linux/arm64 nginx:latest

Force the architecture on pull, for cross-arch dev

§ see also

  • [[runc-and-runsc|runc, runsc, kata: container runtimes]]
    runc is the standard OCI runtime: namespaces+cgroups+seccomp. runsc/gVisor is a userspace kernel for extra isolation. kata is a lightweight VM per container. Performance and isolation trade off against each other.
  • [[docker-storage-drivers|Docker storage drivers: overlay2, btrfs, zfs]]
    A storage driver is how Docker keeps image layers and container changes on disk. overlay2 is the default (overlayfs over ext4/xfs), btrfs and zfs work through subvolumes and snapshots, fuse-overlayfs is for rootless.
  • [[namespaces|Linux namespaces]]
    Namespaces are a kernel mechanism that gives a process its own isolated view of a resource (network, mount points, PID, UID, IPC, hostname, time). Every container is built on them.
  • [[kubernetes-pod-lifecycle|Kubernetes pod lifecycle: from Pending to Terminated]]
    A pod moves through phases Pending, Running, Succeeded/Failed/Unknown. Init containers run sequentially before the main ones. Probes: startup, then readiness/liveness. SIGTERM plus a grace period on delete.