Sparse files: holes and apparent size

Why sparse

When you need a file that appears to be 100 GB but will fill up gradually, you do not have to allocate all 100 GB on disk up front. You can create a file whose logical size is 100 GB while it occupies 0 bytes physically; the filesystem allocates blocks only as writes come in.

Where this is used:

qcow2/vmdk for VMs, a "thin" virtual disk
loop image for a filesystem, a 10 GB file with ext4 inside that really uses 1 GB
databases with pre-allocated tablespace (Oracle, MS SQL)
disk backups with empty regions, ddrescue
sparse logfile, a rewindable ring buffer

How holes are made

Three ways:

1. Seek + write across a boundary

bash

dd if=/dev/zero of=big.img bs=1 count=0 seek=10G

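Creates a file with a logical size of 10 GiB that occupies 0 blocks. With count=0, dd copies nothing and only extends the file to the seek offset. The filesystem does not write zeros; it just records that everything up to that offset is a hole, and reads from a hole return zeros.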

2. truncate / ftruncate

bash

truncate -s 10G big.img

The same result in one command: it extends the file length recorded in the inode without allocating any blocks.
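Both methods above can also be reproduced from code. The following is a minimal Python sketch rather than part of the original commands; the helper names make_sparse_truncate, make_sparse_seek, and allocated_bytes are made up for this illustration.

```python
import os


def make_sparse_truncate(path, size):
    """Create a file of `size` bytes by extending its length; no data is written."""
    with open(path, "wb") as f:
        f.truncate(size)


def make_sparse_seek(path, size):
    """Create a file of `size` bytes (size >= 1) by seeking past the end
    and writing a single byte; only the block holding that byte needs allocating."""
    with open(path, "wb") as f:
        f.seek(size - 1)
        f.write(b"\0")


def allocated_bytes(path):
    """Bytes actually allocated on disk (st_blocks counts 512-byte units)."""
    return os.stat(path).st_blocks * 512
```

On a filesystem that supports holes, allocated_bytes() for either file should stay far below the apparent size reported by os.path.getsize().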

3. Removing blocks from an existing file (FALLOC_FL_PUNCH_HOLE)

bash

fallocate -p -o 1G -l 1G existing.dat

Deallocates the byte range from 1 GiB to 2 GiB in the middle of the file, punching a hole. The logical size stays the same, physical usage drops, and reads from that range return zeros.
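Python's standard library has no direct wrapper for hole punching, so the sketch below calls the C library's fallocate() through ctypes. It is an illustration under assumptions: Linux, a 64-bit off_t, and a filesystem that supports FALLOC_FL_PUNCH_HOLE. The helper name punch_hole is made up; the two flag values are the ones defined in the Linux headers.

```python
import ctypes
import os

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02

# Assumption: the running process is linked against a libc that exports
# fallocate() with a 64-bit off_t (true on 64-bit Linux).
_libc = ctypes.CDLL(None, use_errno=True)
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                            ctypes.c_int64, ctypes.c_int64]
_libc.fallocate.restype = ctypes.c_int


def punch_hole(path, offset, length):
    """Deallocate `length` bytes at `offset`; the file size does not change."""
    fd = os.open(path, os.O_RDWR)
    try:
        mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
        if _libc.fallocate(fd, mode, offset, length) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err), path)
    finally:
        os.close(fd)
```

A call such as punch_hole("existing.dat", 1 << 30, 1 << 30) would correspond to the fallocate -p command above. On a filesystem without support the call raises OSError with EOPNOTSUPP.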

ls / stat / du: who shows what

bash

$ truncate -s 10G big.img

$ ls -lh big.img

-rw-r--r-- 1 user user 10G May  2 15:00 big.img       ← apparent (logical)

$ du -h big.img

0       big.img                                       ← actual (allocated)

$ stat big.img

  Size: 10737418240   Blocks: 0      IO Block: 4096  regular file

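ls -l shows the apparent size, the value that lseek(fd, 0, SEEK_END) returns
du shows allocated disk usage, in 1 KiB units by default (GNU du)
du --apparent-size (or du -k --apparent-size for KiB, du -b for bytes) gives the apparent size
stat shows both: Size: (apparent, in bytes) and Blocks: (allocated, in 512-byte units, so ×512 = bytes)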

If the disk has only 5 GB free but holds sparse files whose apparent sizes total 20 GB, that is normal for sparse files, but dangerous: as the holes fill in, write() can fail with ENOSPC.
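To see how far a directory is overcommitted, you can compare the bytes still missing from its sparse files with the free space on the filesystem. The sketch below is one possible way to do that; the function name sparse_overcommit is made up, and it only looks at regular files directly inside the given directory.

```python
import os


def sparse_overcommit(directory):
    """Return (unallocated_bytes, free_bytes) for the files in `directory`.

    unallocated_bytes is the sum of (apparent size - allocated size) over
    regular files; free_bytes is the space available to unprivileged users.
    """
    unallocated = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                st = entry.stat(follow_symlinks=False)
                # st_blocks counts 512-byte units; dense files may round up
                # past st_size, so clamp the difference at zero.
                unallocated += max(0, st.st_size - st.st_blocks * 512)
    vfs = os.statvfs(directory)
    free = vfs.f_bavail * vfs.f_frsize
    return unallocated, free
```

If unallocated exceeds free, filling every hole would run the filesystem out of space before the files reach their apparent size.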

fallocate vs sparse

Sparse means unallocated blocks.

fallocate (without -p) does the opposite. It reserves blocks without writing zeros:

bash

fallocate -l 10G allocated.dat

The file occupies 10 GB on disk, but no data is written: the filesystem marks the extents as unwritten, so reads return zeros rather than stale disk contents. This helps when you know you will write 10 GB sequentially:

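protection against fragmentation: the filesystem can lay the blocks out contiguously
a guarantee that write() into the preallocated range will not hit ENOSPC

If the filesystem supports it, the allocation is near-instant because no zeros are written. ext4 and xfs support it. On a filesystem without support, such as fat, fallocate reports "Operation not supported", and glibc's posix_fallocate() falls back to writing into every block, which is slow.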
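From Python, the same reservation is available through os.posix_fallocate(). The sketch below is a minimal illustration; the helper name preallocate is made up.

```python
import os


def preallocate(path, size):
    """Reserve `size` bytes of disk space for `path`, creating it if needed."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Raises OSError (for example ENOSPC) if the space cannot be reserved.
        os.posix_fallocate(fd, 0, size)
    finally:
        os.close(fd)
```

After preallocate("allocated.dat", 10 * 1024**3) the file has its full size, and the reserved range reads back as zeros until you write to it.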

fallocate options:

Option	What it does
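`-l SIZE`	length of the range, in bytes
`-o OFFSET`	offset of the start of the range, in bytes
`-p`	`FALLOC_FL_PUNCH_HOLE`, punch a hole (deallocate the range, keep the file size)
`-z`	`FALLOC_FL_ZERO_RANGE`, zero a range, preferably by marking it unwritten instead of writing zeros
`-d`	dig holes: find blocks of zeros and punch them out with `FALLOC_FL_PUNCH_HOLE` (a feature of the fallocate utility, not a separate kernel flag)
`-c`	`FALLOC_FL_COLLAPSE_RANGE`, remove the range and shift the following data down (the file shrinks)
`-i`	`FALLOC_FL_INSERT_RANGE`, insert a hole and shift the existing data up (the file grows)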

fallocate -d compacts an existing file, turning zero regions into holes:

bash

fallocate -d disk.img

SEEK_HOLE / SEEK_DATA

Modern filesystems (ext4, xfs, btrfs, tmpfs) support these seeks in lseek():

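SEEK_HOLE returns the offset of the next hole at or after the given offset (the end of the file counts as a hole)
SEEK_DATA returns the offset of the next region containing data at or after the given offset, and fails with ENXIO if there is none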
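Alternating the two seeks walks the data regions of a file. The sketch below shows one way to do it with os.lseek(); the function name data_extents is made up for this example. On a filesystem without hole reporting, the whole file comes back as a single data extent.

```python
import errno
import os


def data_extents(path):
    """Return a list of (start, end) byte ranges that contain data."""
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:  # only a hole remains up to EOF
                    break
                raise
            # The end of the file counts as a hole, so this always succeeds.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents
```

Everything between two consecutive extents, and between the last extent and the end of the file, is a hole.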

With cp --sparse=auto (the default), copying preserves holes:

bash

cp --sparse=auto big.img copy.img         # carries sparse over, if the FS can

cp --sparse=always big.img copy.img       # scans for zero regions and makes holes

cp --sparse=never big.img copy.img        # copies "dense", fills holes with zeros

The same applies to tar, rsync, dd:

bash

rsync --sparse big.img copy.img           # zero runs become holes on the destination

tar --sparse -cf backup.tar big.img

dd conv=sparse if=src of=dst              # skip zero blocks

Without the right flags, rsync, tar, and dd expand a sparse 100 GB file into a fully allocated 100 GB during the copy.
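The idea behind these flags, skipping runs of zeros instead of writing them, can be sketched in a few lines of Python. This is an illustration of the general technique, not the actual implementation of cp, rsync, or dd; the function name sparse_copy and the 64 KiB block size are arbitrary choices.

```python
import os


def sparse_copy(src, dst, block_size=64 * 1024):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(block_size)
            if not chunk:
                break
            if chunk.count(0) == len(chunk):
                # All zeros: move the write position forward and leave a hole.
                fout.seek(len(chunk), os.SEEK_CUR)
            else:
                fout.write(chunk)
        # If the file ends in a hole, nothing was written there, so set the
        # final length explicitly to the current position.
        fout.truncate()
```

Like cp --sparse=always, this turns any block of zeros into a hole, whether or not it was a hole in the source.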

Real uses in production

qcow2 for KVM

bash

qemu-img create -f qcow2 disk.qcow2 100G

qcow2 is an image format with built-in thin allocation, copy-on-write, and snapshot chains: the image starts small and grows as the guest writes. On an ext4 host the qcow2 file can itself contain holes, so the savings can stack.

Loop device with a filesystem inside

bash

truncate -s 10G ext4.img

mkfs.ext4 ext4.img

sudo mount -o loop ext4.img /mnt/loop

The file starts with 0 bytes allocated, mkfs writes the metadata (roughly 1% of the size, depending on mkfs options), and the rest is allocated as you write.

Backup with holes

bash

# Direct disk copy that skips zero blocks

dd if=/dev/sda of=backup.img conv=sparse status=progress

# Or with ddrescue (-S makes it write the output sparsely)

ddrescue -S /dev/sda backup.img backup.log

To restore, run dd if=backup.img of=/dev/sda without conv=sparse: then the holes reach the disk as real zeros.

When something goes wrong

ENOSPC on write into an "empty" hole: there was no physical space to materialize the block. Sparse saves space only while the holes are empty.
du shows huge numbers after a backup restore: the copy was made with a tool or flag that does not preserve holes (for example cp --sparse=never, or rsync without --sparse), so the holes were written out as zeros and became real blocks.
tar extracted the sparse file "fat": pass --sparse (-S) when creating the archive. GNU tar records the holes in the archive and restores them on extraction; the option has no effect at extraction time.
a VM disk grows on its own: the guest frees blocks or overwrites them with zeros, but qcow2 and the host filesystem do not know those blocks are unused. To fix it, run fstrim periodically inside the VM (this requires discard support on the virtual disk), or zero the free space with zerofree in the guest and then run fallocate -d on the image on the host.
fallocate fails with EOPNOTSUPP ("Operation not supported") on NFS: not every NFS version supports punching holes. NFSv4.2 does.
rsync expands holes into zeros: pass --sparse (-S is the short form of the same option).

Checking that a file is sparse

bash

# Ratio of allocated to apparent

python3 -c "

import os

s = os.stat('big.img')

print(f'apparent: {s.st_size}, allocated: {s.st_blocks * 512}, ratio: {s.st_blocks * 512 / s.st_size if s.st_size else 0:.2%}')

"

# Map of allocated regions

filefrag -v big.img

xfs_io -c 'fiemap -v' big.img         # works on any filesystem that implements the FIEMAP ioctl
