linuxlab.io
Copyright © 2026 LinuxLab. All rights reserved.

kb/objects ── Object model ── advanced

Packfile

A compressed file in `.git/objects/pack/` where Git packs many loose objects to save space and speed up network operations. Uses delta compression between similar objects.

aka: pack, git-packfile

When an object is first created (by git add or git commit), it lives as a separate file, a loose object: .git/objects/8d/0e41.... That works well for writes, but storing millions of individual files is inefficient. Git periodically consolidates them into packfiles.
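The move from loose objects to a pack can be watched with `git count-objects -v`. This is a minimal sketch in a throwaway repository; the temporary path, the demo identity, the file name, and the `stat_field` helper are placeholders, not anything Git requires.

```bash
set -e

# Throwaway repository; the path, identity and file name are placeholders.
repo=$(mktemp -d)
git init -q "$repo"
echo "hello" > "$repo/a.txt"
git -C "$repo" add a.txt
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "first"

# Hypothetical helper: pull one numeric field out of `git count-objects -v`.
stat_field() {
    git -C "$repo" count-objects -v | awk -v key="$1:" '$1 == key {print $2}'
}

loose_before=$(stat_field count)     # loose objects before packing
git -C "$repo" gc -q                 # consolidate them into a packfile
loose_after=$(stat_field count)      # loose objects left over
packed_after=$(stat_field in-pack)   # objects now stored in packs

echo "loose before gc: $loose_before"
echo "after gc: $loose_after loose, $packed_after packed"
rm -rf "$repo"
```

The blob, the tree, and the commit start out as three separate files and end up in one pack.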

Inside a packfile

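Each pack is stored as a pair of files in .git/objects/pack/ (recent Git versions may add auxiliary files such as .rev or .bitmap beside them):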

pack-abc123def456.pack    # compressed objects
pack-abc123def456.idx     # SHA index for fast lookup

Inside the .pack file:

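  • the file starts with a 12-byte header: the signature PACK, a version number, and the object count;
  • objects follow one after another, each with a small type-and-size header and a zlib-compressed body;
  • for similar objects, Git computes a delta and stores only the difference relative to a base object;
  • the file ends with a checksum of everything before it. There is no second compression pass over the whole file.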
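The first 12 bytes of a .pack file are a fixed header: the signature PACK, then the format version and the object count as 4-byte big-endian integers. The sketch below decodes them with `head` and `od`; the throwaway repository and the `be32` helper are illustrative assumptions.

```bash
set -e

# Throwaway repository with one commit, packed immediately.
repo=$(mktemp -d)
git init -q "$repo"
echo "hello" > "$repo/a.txt"
git -C "$repo" add a.txt
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "first"
git -C "$repo" gc -q

# A fresh gc leaves a single pack here; take the first match.
pack=$(ls "$repo"/.git/objects/pack/pack-*.pack | head -n 1)

# Hypothetical helper: decode a 4-byte big-endian integer at byte offset $1.
be32() {
    od -An -tu1 -j"$1" -N4 "$pack" |
        awk '{print $1 * 16777216 + $2 * 65536 + $3 * 256 + $4}'
}

signature=$(head -c 4 "$pack")   # bytes 0-3: the literal string PACK
version=$(be32 4)                # bytes 4-7: pack format version
objects=$(be32 8)                # bytes 8-11: number of objects in the pack

echo "signature=$signature version=$version objects=$objects"
rm -rf "$repo"
```

The object count should match the `in-pack` value reported by `git count-objects -v` for a repository with a single pack.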

Delta compression

This is where most of Git's space savings come from. Suppose you have two versions of a large file that differ by five lines. In loose format, those are two blobs, each nearly the full size of the file. In a packfile, one version is stored in full and the other as "take that one and apply these edits."

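The encoding resembles xdelta: a delta is a sequence of "copy this byte range from the base" and "insert these new bytes" instructions. Git does not pair objects strictly by path. When packing, it sorts candidates by type, by a hash derived from the object's path, and by size, then tries each object against its neighbors in a sliding window (pack.window, 10 objects by default). Deltas can therefore exist between files at different paths, or between otherwise unrelated files, if their contents happen to be similar.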
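The saving is easy to measure on a small scale. The sketch below commits two versions of a large file that differ by five lines, then compares the bytes under .git/objects before and after packing. The file name, the `seq`-generated content, and the `commit_as` and `objects_bytes` helpers are assumptions made for the demo.

```bash
set -e

repo=$(mktemp -d)
git init -q "$repo"

# Hypothetical helper: stage big.txt and commit it under a demo identity.
commit_as() {
    git -C "$repo" add big.txt
    git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "$1"
}

# Version 1: a large file. Version 2: the same file plus five lines.
seq 1 20000 > "$repo/big.txt"
commit_as "v1"
printf 'extra line %s\n' 1 2 3 4 5 >> "$repo/big.txt"
commit_as "v2"

# Hypothetical helper: total bytes of all files under .git/objects.
objects_bytes() {
    find "$repo/.git/objects" -type f -exec cat {} + | wc -c | tr -d ' '
}

loose_bytes=$(objects_bytes)    # two nearly identical blobs, each stored whole
git -C "$repo" gc -q
packed_bytes=$(objects_bytes)   # one blob in full, the other as a delta

echo "loose: $loose_bytes bytes, packed: $packed_bytes bytes"
rm -rf "$repo"
```

The packed figure includes the index and any auxiliary files Git writes during gc, so it slightly overstates the size of the pack itself.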

When packfiles are created

  • On git gc (manual or automatic).
  • On git push or git fetch: the two sides exchange objects in packfile format, not one object at a time.
  • On git clone: the server sends the entire repository as a packfile.

Auto-gc triggers when certain thresholds are exceeded:

  • more than 6700 loose objects;
  • more than 50 packfiles.

Both numbers are defaults and can be changed with gc.auto and gc.autoPackLimit. Setting gc.auto to 0 disables automatic gc.
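A sketch of adjusting the thresholds for a single repository; the values 10000 and 20 are arbitrary examples, not recommendations.

```bash
set -e

repo=$(mktemp -d)
git init -q "$repo"

# Repository-local overrides; the numbers are illustrative only.
git -C "$repo" config gc.auto 10000          # loose-object threshold
git -C "$repo" config gc.autoPackLimit 20    # packfile threshold

loose_limit=$(git -C "$repo" config --get gc.auto)
pack_limit=$(git -C "$repo" config --get gc.autoPackLimit)
echo "gc.auto=$loose_limit gc.autoPackLimit=$pack_limit"

# Ask Git to check the thresholds now; nothing happens unless one is exceeded.
git -C "$repo" gc --auto -q
rm -rf "$repo"
```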

Reading objects from a packfile

Git commands work the same way regardless of whether an object is loose or packed. git cat-file -p <sha> finds the object in either form.
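A quick way to confirm this is to read the same blob before and after packing. The sketch assumes a throwaway repository and builds the loose-object path by hand to check where the object lives.

```bash
set -e

repo=$(mktemp -d)
git init -q "$repo"
echo "hello" > "$repo/a.txt"
git -C "$repo" add a.txt
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "first"

blob=$(git -C "$repo" rev-parse HEAD:a.txt)

# Loose objects live at .git/objects/<first 2 hex digits>/<remaining digits>.
dir=$(printf '%s' "$blob" | cut -c1-2)
file=$(printf '%s' "$blob" | cut -c3-)
loose_path="$repo/.git/objects/$dir/$file"

read_loose=$(git -C "$repo" cat-file -p "$blob")
if [ -f "$loose_path" ]; then was_loose=yes; else was_loose=no; fi

git -C "$repo" gc -q    # the blob moves into a packfile

read_packed=$(git -C "$repo" cat-file -p "$blob")
if [ -f "$loose_path" ]; then still_loose=yes; else still_loose=no; fi

echo "loose file before gc: $was_loose, after gc: $still_loose"
echo "content before: $read_loose, after: $read_packed"
rm -rf "$repo"
```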

To inspect a pack file directly:

bash
git verify-pack -v .git/objects/pack/pack-abc.idx
# full object: SHA  type  size  size-in-packfile  offset-in-packfile
# delta:       SHA  type  size  size-in-packfile  offset-in-packfile  depth  base-SHA

Entries that carry a depth and a base SHA are deltas; all others are stored in full. The listing ends with a histogram of delta chain lengths.
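The listing is easy to post-process. The sketch below separates full objects from deltas by counting fields, assuming the five-column and seven-column layout that `git verify-pack -v` prints; the repository contents are placeholders.

```bash
set -e

repo=$(mktemp -d)
git init -q "$repo"
seq 1 20000 > "$repo/big.txt"
for v in 1 2 3; do
    echo "change $v" >> "$repo/big.txt"
    git -C "$repo" add big.txt
    git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "v$v"
done
git -C "$repo" gc -q

idx=$(ls "$repo"/.git/objects/pack/pack-*.idx | head -n 1)
report=$(git -C "$repo" verify-pack -v "$idx")

# Object lines start with a hex object ID. Five fields: stored in full.
# Seven fields: a delta, with depth in column 6 and base ID in column 7.
full=$(printf '%s\n' "$report" |
    awk '$1 ~ /^[0-9a-f]+$/ && NF == 5' | wc -l | tr -d ' ')
deltas=$(printf '%s\n' "$report" |
    awk '$1 ~ /^[0-9a-f]+$/ && NF == 7' | wc -l | tr -d ' ')
echo "stored in full: $full, stored as deltas: $deltas"

# List each delta with its depth and an abbreviated base ID.
printf '%s\n' "$report" | awk '$1 ~ /^[0-9a-f]+$/ && NF == 7 {
    print substr($1, 1, 7), $2, "depth", $6, "base", substr($7, 1, 7)
}'
rm -rf "$repo"
```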

Pitfalls

  • Delta compression does not mean Git "stores diffs." The logical data model is still snapshot-based (commit holds tree, not a diff). Deltas are a physical storage detail.
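  • A single read can be slow when an object sits at the end of a long delta chain: Git must reconstruct every base along the way. Chain length is capped at pack time by pack.depth (or the --depth option of git repack), 50 by default. pack.deltaCacheSize is unrelated to chain length: it bounds the memory used to cache computed deltas while a pack is being written.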
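  • git gc --aggressive discards existing deltas and recomputes them with a larger search window (gc.aggressiveWindow, 250 by default) to find better bases. It takes much longer and may produce a smaller pack. The effect is mostly persistent, so it rarely needs repeating; Git's own documentation suggests benchmarking before making it routine.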
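Delta chain length can be bounded explicitly when repacking. The sketch below rebuilds a pack with `git repack --depth=2` and reads the deepest chain back from `git verify-pack -v`. The depth of 2 is chosen only to make the effect visible in a six-commit repository.

```bash
set -e

repo=$(mktemp -d)
git init -q "$repo"
seq 1 20000 > "$repo/big.txt"
for v in 1 2 3 4 5 6; do
    echo "change $v" >> "$repo/big.txt"
    git -C "$repo" add big.txt
    git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "v$v"
done

# Pack everything (-a), drop redundant packs and loose copies (-d),
# recompute deltas from scratch (-f), and cap chains at two links.
git -C "$repo" repack -a -d -f -q --depth=2

idx=$(ls "$repo"/.git/objects/pack/pack-*.idx | head -n 1)

# Column 6 of a seven-field line is the delta depth; keep the maximum.
max_depth=$(git -C "$repo" verify-pack -v "$idx" |
    awk '$1 ~ /^[0-9a-f]+$/ && NF == 7 && $6 > max {max = $6}
         END {print max + 0}')

echo "deepest delta chain: $max_depth"
rm -rf "$repo"
```

A lower depth trades some pack size for faster reads of the objects that would otherwise sit deep in a chain.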

§ commands

bash
git gc

Pack loose objects into a packfile and prune unreachable loose objects older than the grace period (gc.pruneExpire, two weeks by default)

bash
git verify-pack -v <pack.idx>

Print packfile contents with types and sizes

bash
git count-objects -v

Show the number of loose objects and packfiles in the repository

§ see also

  • blob: A Git object that stores the content of a single file. Just bytes, no name, no permissions, no date. The filename lives in the `tree`, not in the blob.
  • tree: A Git object that holds the listing of one directory: entries of the form `(mode, type, SHA, name)`. It references other tree objects recursively for subdirectories.
  • commit: A Git object: a snapshot of the entire project (via a tree) plus metadata including author, committer, date, parents, and message. The SHA of a commit includes the parent's SHA, which makes history cryptographically linked.