A blob (binary large object) is the simplest of the four Git object types. It holds only a file's content as a sequence of bytes and knows nothing about the file's name or location.
The filename, mode (permissions), and position in the directory hierarchy all live in tree objects, which reference the blob by its SHA.
Creating a blob
With high-level (porcelain) commands, blobs are created automatically by
git add. With plumbing commands, you create one explicitly:
echo "content" | git hash-object --stdin -w
# d95f3ad14dee633a758d2e331151e950dd13e4ed
The -w flag writes the blob to .git/objects/. Without it, Git only
prints the SHA and writes nothing.
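The effect of -w can be sketched in Python. This is a simplified illustration, not Git's implementation. It assumes the header format described under "SHA-1 hash" and the loose-object layout (zlib-compressed data stored under objects/<first two hex digits>/<remaining 38 digits>). The function name hash_object and its write parameter are hypothetical, and a temporary directory stands in for .git/objects/.

```python
import hashlib
import os
import tempfile
import zlib


def hash_object(data: bytes, objects_dir: str, write: bool = False) -> str:
    """Return the blob SHA-1 for data; store a loose object only if write is True."""
    # Git hashes a header ("blob <size>\0") followed by the content.
    store = b"blob " + str(len(data)).encode("ascii") + b"\x00" + data
    sha = hashlib.sha1(store).hexdigest()
    if write:
        # Loose objects live at objects/<first 2 hex chars>/<remaining 38>.
        path = os.path.join(objects_dir, sha[:2], sha[2:])
        if not os.path.exists(path):
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as f:
                f.write(zlib.compress(store))
    return sha


# A throwaway directory stands in for .git/objects/ (assumption for this sketch).
objects_dir = tempfile.mkdtemp()
sha_only = hash_object(b"content\n", objects_dir)                 # like omitting -w
sha_written = hash_object(b"content\n", objects_dir, write=True)  # like passing -w
print(sha_only == sha_written)  # the SHA does not depend on whether the blob is stored
```

Both calls return the same SHA. Only the second one leaves a file behind in the object directory.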
SHA-1 hash
The hash is computed not from the raw content but from this byte sequence, where the length is written as a decimal number and \0 is a single NUL byte:
blob <length-in-bytes>\0<content>
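The blob prefix and the length are part of the data being hashed. That
is why the SHA Git reports for a file differs from what sha1sum prints
for the same file: Git hashes the header together with the content. The
length counts bytes, so an empty file is hashed as blob 0\0 and a file
containing one blank line as blob 1\0 followed by a newline.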
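The header rule can be checked with a few lines of Python. This is a minimal sketch using only the standard hashlib module. The helper name blob_sha1 is hypothetical, not something Git provides.

```python
import hashlib


def blob_sha1(content: bytes) -> str:
    """SHA-1 of a blob: hash of 'blob <length>\\0' followed by the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


empty = b""
print(blob_sha1(empty))                 # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(hashlib.sha1(empty).hexdigest())  # da39a3ee5e6b4b0d3255bfef95601890afd80709
print(blob_sha1(b"\n") == blob_sha1(empty))  # False: content and length both differ
```

The first value is the SHA Git assigns to every empty file. The second is the plain SHA-1 of zero bytes, which shows that the header changes the result even when there is no content at all.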
Deduplication
If two files in the repository are byte-for-byte identical, they share the
same SHA, and Git stores one blob. At the tree level they appear as two
separate entries (with different names), but both point to the same object
in .git/objects/.
This gives you content-based deduplication for free: identical files do not consume space twice, even across different commits.
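The deduplication described above falls out of any content-addressed store. The sketch below is a toy model, not Git's storage code: a dictionary named objects stands in for .git/objects/, a dictionary named tree stands in for a single tree object, and the file names and the helper blob_sha1 are made up for illustration.

```python
import hashlib


def blob_sha1(content: bytes) -> str:
    """SHA-1 of a blob: hash of 'blob <length>\\0' followed by the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


objects = {}  # stands in for .git/objects/: SHA -> content
tree = {}     # stands in for a tree object: file name -> blob SHA

files = {
    "a.txt": b"same bytes\n",
    "copy-of-a.txt": b"same bytes\n",
    "b.txt": b"different bytes\n",
}
for name, content in files.items():
    sha = blob_sha1(content)
    objects[sha] = content  # identical content maps to the same key, so it is stored once
    tree[name] = sha

print(len(tree), "tree entries,", len(objects), "stored blobs")
# 3 tree entries, 2 stored blobs
```

Because the key is derived from the content alone, the store never needs to compare files against each other. Two names simply end up pointing at the same key.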