In systems like SVN, CVS, and Perforce, each version of a file is stored as a delta from the previous one. To retrieve a file from a week ago, the system starts from one of the stored versions and applies deltas in the required direction.
SVN / CVS / Perforce - delta model:
v1: full file
v2: delta (3 lines changed)
v3: delta (1 line added)
v4: delta (5 lines deleted)
↑
to reconstruct v4,
start from v1 and apply deltas v2, v3, v4
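The replay step in the diagram can be sketched in a few lines of Python. This is only an illustration: the `(op, index, text)` delta format and the function names are made up for readability and do not match the on-disk format of SVN, CVS, or Perforce.

```python
# Toy delta store: version 1 is kept in full, later versions as edit lists.
# The (op, index, text) format is hypothetical, chosen only for illustration.

def apply_delta(lines, delta):
    """Return a new list of lines with each edit applied in order."""
    result = list(lines)
    for op, index, text in delta:
        if op == "replace":
            result[index] = text
        elif op == "insert":
            result.insert(index, text)
        elif op == "delete":
            del result[index]
        else:
            raise ValueError(f"unknown op: {op}")
    return result

def reconstruct(base, deltas, version):
    """Rebuild `version` (1-based) by replaying deltas on top of the full base."""
    lines = base
    for delta in deltas[: version - 1]:
        lines = apply_delta(lines, delta)
    return lines

v1 = ["alpha", "beta", "gamma"]
deltas = [
    [("replace", 1, "BETA")],   # v2
    [("insert", 3, "delta")],   # v3
    [("delete", 0, None)],      # v4
]

v4 = reconstruct(v1, deltas, 4)
print(v4)
```

The cost of reading a version grows with the number of deltas between it and the nearest full copy, which is the property the snapshot model avoids.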
Git works differently. Each commit points to a tree object that records a complete snapshot of the entire project. Not a diff: a photograph.
Git - snapshot model:
commit 1: snapshot 1
commit 2: snapshot 2
commit 3: snapshot 3
↑
to view commit 3,
just take snapshot 3
Where the storage savings come from
Storing the whole project with every commit sounds wasteful. Done naively it would be, so Git uses two techniques to avoid the cost.
Content-addressed deduplication. Each file's contents are stored as a blob addressed by a hash of those contents (SHA-1 by default). If the file did not change between commits, it hashes to the same ID, and Git reuses the same blob. What actually lives on disk is "a snapshot of the directory tree with pointers to files," not "a full copy of the project."
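A minimal Python sketch of this idea follows. The `blob_id` function matches how Git derives a blob ID in a SHA-1 repository (hash of a `blob <size>\0` header followed by the content). Everything else is a simplification: `ObjectStore` and `snapshot` are hypothetical names, and a real tree is itself a hashed object rather than a plain dictionary.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """ID Git gives a blob in a SHA-1 repository: hash of header + content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

class ObjectStore:
    """Hypothetical in-memory stand-in for the object database."""

    def __init__(self):
        self.blobs = {}

    def put(self, content: bytes) -> str:
        oid = blob_id(content)
        self.blobs.setdefault(oid, content)  # identical content is stored once
        return oid

store = ObjectStore()

def snapshot(files):
    """Simplified 'tree': a mapping of path -> blob ID."""
    return {path: store.put(data) for path, data in files.items()}

main_c = b"int main(void) { return 0; }\n"
tree1 = snapshot({"README": b"hello\n", "main.c": main_c})
tree2 = snapshot({"README": b"hello, world\n", "main.c": main_c})

print(tree1["main.c"] == tree2["main.c"])  # True: unchanged file, same blob
print(len(store.blobs))                    # 3 blobs back 4 file entries
```

Two snapshots reference four file entries, but only three blobs exist, because the unchanged `main.c` is shared.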
Deltas inside a packfile. During git gc or git push, accumulated objects are packed into a single file where similar objects are encoded as deltas. This is a storage optimization, not the data model.
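The idea of encoding one object as a delta against a similar one can be illustrated with a toy copy/insert scheme. This is a sketch under loose assumptions: it uses Python's `difflib` to find shared regions and is not Git's actual packfile encoding or its algorithm for choosing a base object.

```python
from difflib import SequenceMatcher

def make_delta(base: bytes, target: bytes):
    """Encode target as copy-from-base and insert-literal instructions."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))       # offset and length in base
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))   # bytes not found in base
        # "delete": bytes of base that are simply never copied
    return ops

def apply_delta(base: bytes, ops) -> bytes:
    """Rebuild the target from the base object and its delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)

base = b"line one\nline two\nline three\n" * 5
target = base + b"line four\n"

ops = make_delta(base, target)
literal_bytes = sum(len(op[1]) for op in ops if op[0] == "insert")

print(apply_delta(base, ops) == target)
print(literal_bytes, "literal bytes stored for a", len(target), "byte object")
```

The object model is unchanged by this: a reader still asks for a whole object by ID, and the delta is resolved behind that interface.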
Why this matters
The difference between the two models is the foundation of how Git works.
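- A branch is a pointer to a commit, and through it to a snapshot, not a record of "forked at this moment." Creating a branch writes one small ref: a 40-character hex ID in a SHA-1 repository (41 bytes with the trailing newline). Switching is fast, because Git only has to update the files that differ between the two snapshots.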
- git log walks the chain of parent commits, reading ready-made snapshots. No history reconstruction is needed.
- git diff shows the difference between two snapshots, computed on demand. That is a view, not the storage model. This is a common source of confusion.
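The three points above can be seen together in a small in-memory model. It is a sketch, not Git's implementation: the `Commit` class, the short IDs like `c1`, and the plain-dictionary snapshots are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Commit:
    message: str
    snapshot: dict           # path -> content; stands in for a tree object
    parent: Optional[str]    # ID of the parent commit, None for the root

# Hypothetical short IDs; real commit IDs are hashes.
commits = {
    "c1": Commit("initial", {"a.txt": "1"}, None),
    "c2": Commit("edit a", {"a.txt": "2"}, "c1"),
    "c3": Commit("add b", {"a.txt": "2", "b.txt": "x"}, "c2"),
}

branches = {"main": "c3"}
branches["feature"] = branches["main"]   # creating a branch copies one ID

def log(commit_id):
    """Follow parent links from commit_id back to the root."""
    messages = []
    while commit_id is not None:
        commit = commits[commit_id]
        messages.append(commit.message)
        commit_id = commit.parent
    return messages

def diff(old, new):
    """Compare two snapshots on demand; no stored deltas are consulted."""
    changes = {}
    for path in sorted(old.keys() | new.keys()):
        if old.get(path) != new.get(path):
            changes[path] = (old.get(path), new.get(path))
    return changes

print(log(branches["main"]))
print(diff(commits["c1"].snapshot, commits["c3"].snapshot))
```

Note that `diff` accepts any two snapshots, not only adjacent ones, which is why comparing distant commits needs no replay of the history between them.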