When something breaks
A filesystem can end up in a "dirty" state after:
- Power loss / kernel panic during a write
- Bad sectors on an HDD/SSD, where the storage stack returned an error
- A disk controller that lied about fsync (write-back cache without a BBU)
- Memory corruption (ECC, or the lack of it)
- A bug in the kernel or the filesystem
- An accidental write to the raw device, bypassing the filesystem
Symptoms:
- "Read-only file system" after mount (errors=remount-ro)
- "Structure needs cleaning" at the next mount
- "Input/output error" when reading specific files
- Odd ls behavior (garbage names, files disappearing)
- The machine stuck at boot on "checking filesystems..."
Strategy:
- Stop. Do not write to the damaged filesystem.
- Back up the raw device (dd or ddrescue) before any edits.
- If there is a journal, try a replay at mount.
- If that does not help, do an offline fsck/repair.
- If even that fails, extract the data with recovery utilities.
Journal replay, the automatic magic
At mount, ext4, XFS, and btrfs notice that the previous unmount was not clean and replay the journal: they apply changes that were committed to the journal but not yet written to their final location. This takes seconds and resolves most crash scenarios.
What you will see in dmesg:
EXT4-fs (sda1): recovery required on readonly filesystem
EXT4-fs (sda1): write access will be enabled during recovery
EXT4-fs (sda1): mounted filesystem with ordered data mode
XFS (sdb1): Mounting V5 Filesystem
XFS (sdb1): Starting recovery (logdev: internal)
XFS (sdb1): Ending recovery (logdev: internal)
If journal recovery fails, then reach for fsck.
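To confirm whether a replay actually happened, you can filter the kernel log for the recovery messages shown above. This is a minimal sketch: the function name recovery_lines is hypothetical, and the pattern only covers the ext4 and XFS message shapes quoted in this section.

```shell
#!/bin/sh
# Print only the kernel log lines that mention journal/log recovery.
# Reads log text on stdin, so it works with dmesg, journalctl -k, or a saved file.
recovery_lines() {
    grep -E '(EXT4-fs|XFS) \([^)]*\): .*[Rr]ecovery'
}

# Sample input taken from the messages quoted in this section.
printf '%s\n' \
    'EXT4-fs (sda1): recovery required on readonly filesystem' \
    'EXT4-fs (sda1): mounted filesystem with ordered data mode' \
    'XFS (sdb1): Starting recovery (logdev: internal)' |
    recovery_lines
# Prints the first and third sample lines.
```

On a live system the same filter would be fed from the kernel log, for example dmesg | recovery_lines.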
fsck, the frontend
fsck is a universal wrapper that looks at the filesystem type and
calls the right tool:
| FS | Real binary |
|---|---|
| ext2/3/4 | e2fsck |
| xfs | a no-op in fsck, use xfs_repair directly |
| btrfs | btrfs check directly |
| vfat | dosfsck |
| jfs | jfs_fsck |
# Run automatically from fstab at boot
fsck -A # everything in fstab (passno > 0)
fsck / # a specific mountpoint
fsck /dev/sda1 # a specific device
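The table above can be read as a simple lookup from filesystem type to repair tool. The sketch below encodes exactly those rows; repair_tool_for is a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# Map a filesystem type to the repair tool listed in the table.
repair_tool_for() {
    case "$1" in
        ext2|ext3|ext4) echo "e2fsck" ;;
        xfs)            echo "xfs_repair" ;;
        btrfs)          echo "btrfs check" ;;
        vfat)           echo "dosfsck" ;;
        jfs)            echo "jfs_fsck" ;;
        *)              echo "unknown filesystem type: $1" >&2; return 1 ;;
    esac
}

repair_tool_for ext4   # prints: e2fsck
repair_tool_for xfs    # prints: xfs_repair

# On a real machine the type could come from lsblk, for example:
#   repair_tool_for "$(lsblk -no FSTYPE /dev/sda1)"
```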
Options:
| Option | What it does |
|---|---|
| -A | everything in fstab |
| -y | "yes to all", not safe if you do not understand the damage |
| -n | dry-run, show only |
| -f | force even on a clean filesystem |
| -V | verbose |
| -C | progress bar |
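The main rule: the filesystem must be unmounted. Running a repairing
fsck on a mounted filesystem that is being written to corrupts it. A
read-only check with -n on a mounted filesystem is acceptable for
diagnostics, but it can report errors that are not really there
because the on-disk state keeps changing underneath it.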
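A script can enforce the unmounted rule before it calls any repair tool. The sketch below checks the mount table; is_mounted is a hypothetical name, and it assumes the device appears in /proc/mounts under the same path you pass in (a filesystem mounted through a /dev/mapper or by-uuid path would be missed).

```shell
#!/bin/sh
# Succeed if the device appears as a mount source in the mount table.
# $1 = device path, $2 = mounts file (defaults to /proc/mounts)
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example guard (the e2fsck part needs root):
#   if is_mounted /dev/sda1; then
#       echo "unmount /dev/sda1 first" >&2
#   else
#       e2fsck -f /dev/sda1
#   fi
```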
e2fsck, the ext family
umount /dev/sda1
e2fsck -f /dev/sda1 # force, always
e2fsck -fy /dev/sda1 # auto-yes (for scripts and desperate cases)
e2fsck -fp /dev/sda1 # preen, fix what is safe, complain about the rest
Special cases:
# Use a backup superblock if the primary one is broken
e2fsck -b 32768 /dev/sda1 # on ext4 with 4K blocks
mke2fs -n /dev/sda1 # shows where the backups are (no write!)
dumpe2fs /dev/sda1 | grep -i backup # on a healthy filesystem (without -h, which hides the per-group lines)
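The value 32768 is not arbitrary. With the sparse_super feature, backup superblocks sit at the start of block group 1 and of the groups whose numbers are powers of 3, 5, and 7. The sketch below works through that arithmetic; the function names are hypothetical, and it assumes you already know the blocks-per-group value (32768 for 4K blocks). When in doubt, mke2fs -n remains the authoritative source.

```shell
#!/bin/sh
# is_power_of N BASE: succeed if N equals BASE^k for some k >= 0.
is_power_of() {
    n=$1
    while [ "$n" -gt 1 ] && [ $((n % $2)) -eq 0 ]; do
        n=$((n / $2))
    done
    [ "$n" -eq 1 ]
}

# List candidate backup superblock locations for a sparse_super layout.
# $1 = blocks per group, $2 = number of block groups, $3 = first data block (default 0)
backup_superblocks() {
    g=1
    while [ "$g" -lt "$2" ]; do
        if is_power_of "$g" 3 || is_power_of "$g" 5 || is_power_of "$g" 7; then
            echo $((g * $1 + ${3:-0}))
        fi
        g=$((g + 1))
    done
}

backup_superblocks 32768 10
# Prints 32768, 98304, 163840, 229376, 294912 (groups 1, 3, 5, 7, 9).
```

For a filesystem with 1K blocks the first data block is 1 and a group holds 8192 blocks, so backup_superblocks 8192 4 1 prints 8193 and 24577.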
What it prints on corruption:
Pass 1: Checking inodes, blocks, and sizes
Inode 12345 has illegal block(s). Clear<y>?
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Files that lost their name but still have an inode land in the
lost+found/ directory at the root of that filesystem, named by inode
number (for example #12345). The content is readable; you recover the
name from the content.
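Recovering names from content usually starts with a quick inventory of what is in lost+found. The following is a minimal sketch; describe_orphans is a hypothetical helper, and it falls back to "unknown" when file(1) is not installed. Reading lost+found normally requires root.

```shell
#!/bin/sh
# For each regular file in a lost+found directory, print its name,
# its size in bytes, and a content-type guess from file(1) if available.
describe_orphans() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        size=$(wc -c < "$f" | tr -d ' ')
        if command -v file >/dev/null 2>&1; then
            kind=$(file -b "$f")
        else
            kind="unknown (file(1) not installed)"
        fi
        printf '%s\t%s bytes\t%s\n' "${f##*/}" "$size" "$kind"
    done
}

# Example:
#   describe_orphans /mnt/lost+found
```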
xfs_repair, XFS
umount /dev/sdb1
xfs_repair /dev/sdb1 # normal
xfs_repair -n /dev/sdb1 # dry-run
xfs_repair -L /dev/sdb1 # ZERO LOG, last resort!
-L zeroes the journal, losing whatever was not committed. Do it
only when a normal xfs_repair fails with "log is corrupt or
cannot be read".
XFS does not keep a permanent lost+found directory. xfs_repair
creates one when it needs it and reconnects orphaned inodes there,
named by inode number.
btrfs check / btrfs rescue
umount /dev/sdc1
btrfs check /dev/sdc1 # diagnostics only
btrfs check --repair /dev/sdc1 # DANGEROUS, can make it worse!
btrfs check --repair has been marked experimental for years. Use
it only when you have nothing to lose. Try btrfs rescue first:
btrfs rescue super-recover /dev/sdc1 # recover the super blocks
btrfs rescue chunk-recover /dev/sdc1 # recover the chunk tree
btrfs rescue zero-log /dev/sdc1 # like xfs_repair -L
If the filesystem will not mount, try RO:
mount -o ro,usebackuproot,nologreplay /dev/sdc1 /mnt # usebackuproot replaced the older "recovery" option in kernel 4.6
and rescue the data with cp -a or btrfs send.
fstab, passno
The sixth field in fstab says when to fsck at boot:
UUID=... / ext4 defaults 0 1
UUID=... /var ext4 defaults 0 2
UUID=... /tmp tmpfs defaults 0 0
- 0: do not check
- 1: root, checked first
- 2: checked after root, in parallel with other 2 entries
XFS and btrfs set passno=0, because fsck is a no-op for them (they use their own tools when needed).
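To see which entries a boot-time check would visit and in what order, you can read the sixth field straight out of fstab. This is a sketch only; fsck_order is a hypothetical name, and the sample file reuses the three entries shown above.

```shell
#!/bin/sh
# Print "passno mountpoint fstype" for every fstab entry with passno > 0,
# sorted by pass number. $1 = fstab path (defaults to /etc/fstab).
fsck_order() {
    awk '!/^[[:space:]]*#/ && NF >= 6 && $6 > 0 { print $6, $2, $3 }' "${1:-/etc/fstab}" |
        sort -n
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
UUID=... /    ext4  defaults 0 1
UUID=... /var ext4  defaults 0 2
UUID=... /tmp tmpfs defaults 0 0
EOF
fsck_order "$sample"
# Prints:
#   1 / ext4
#   2 /var ext4
rm -f "$sample"
```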
errors=remount-ro
A mount option, set in fstab:
UUID=... / ext4 defaults,errors=remount-ro 0 1
When the kernel detects a filesystem error, it remounts the
filesystem read-only instead of continuing to write or panicking.
This protects the data: nothing more can be written, and you can
still read dmesg, unmount (or boot from rescue), and run fsck.
Alternatives: errors=continue (do nothing, dangerous),
errors=panic (kernel panic, for embedded).
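A quick audit can list the ext entries in fstab that do not carry this option. The sketch below is an illustration with a hypothetical function name, missing_remount_ro. An entry it reports is not necessarily unprotected, because the default error behavior can also be stored in the superblock with tune2fs -e.

```shell
#!/bin/sh
# List mount points of ext2/3/4 entries whose options lack errors=remount-ro.
# $1 = fstab path (defaults to /etc/fstab).
missing_remount_ro() {
    awk '!/^[[:space:]]*#/ && $3 ~ /^ext[234]$/ &&
         $4 !~ /(^|,)errors=remount-ro(,|$)/ { print $2 }' "${1:-/etc/fstab}"
}

# Example:
#   missing_remount_ro            # audit /etc/fstab
#   missing_remount_ro ./fstab    # audit a copy
```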
ddrescue for a dying disk
When an HDD is failing and a plain dd chokes on bad sectors:
ddrescue /dev/sda backup.img backup.log
It builds an image, skipping the bad sectors (without hanging on them),
and keeps a log. A second run reads only the problem areas it missed.
Before any xfs_repair -L or btrfs check --repair, make a
ddrescue copy first.
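The log file also tells you how much of the disk could not be read, which helps you judge whether a repair on the image is worth attempting. The sketch below sums the areas marked as bad. It assumes the GNU ddrescue mapfile layout, where data lines have the form "position size status" in hexadecimal and a status of "-" marks bad sectors; bad_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Sum the sizes of all blocks marked '-' (bad sector) in a ddrescue mapfile.
bad_bytes() {
    total=0
    while read -r pos size status rest; do
        case "$pos" in '#'*|'') continue ;; esac       # comments and blank lines
        case "$size" in 0x*) ;; *) continue ;; esac    # skip the current-position line
        if [ "$status" = "-" ]; then
            total=$((total + $size))
        fi
    done < "$1"
    echo "$total"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
# current_pos  current_status  current_pass
0x00120000     ?               1
#      pos        size  status
0x00000000  0x00117000  +
0x00117000  0x00000200  -
0x00117200  0x00001000  -
EOF
bad_bytes "$sample"   # prints: 4608
rm -f "$sample"
```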
Recovering a file
If a file was deleted but its inode was not overwritten:
- ext4:
extundelete /dev/sda1 --restore-file path/to/file or e2undel. Unmount the filesystem before you run it.
- xfs: xfs_undelete (a community tool, does not always work).
- btrfs: you can pull it from a snapshot if one existed. btrfs restore pulls files from a broken FS.
- Any FS: photorec from the testdisk package, a signature-based search by file type.
The longer the filesystem runs after the deletion, the smaller your chances. Unmount right away.
Best practices
- Backups make fsck unnecessary
- errors=remount-ro in fstab, always
- smartctl -a shows the S.M.A.R.T. attributes, which often give early warning of a failing disk
- scrub on btrfs/zfs once a week
- fsck -fp at boot for the passno=1 entry catches small issues
- ddrescue before a repair when you suspect the hardware
- Document the UUIDs of your partitions and your backup superblocks
When something goes wrong
- fsck.xfs does nothing: this is normal, fsck.xfs is a stub. Use xfs_repair.
- fsck hangs: xfs_repair on a huge filesystem can take hours. Leave it alone.
- Bad superblock: use an ext4 backup superblock through e2fsck -b $BACKUP /dev/sda1.
- UUID conflict: after dd the copy has the same UUID. Run tune2fs -U random /dev/sdb1.
- lost+found empty after fsck: there was nothing to recover as an orphan, or the filesystem was healthy.
- the system will not boot because fsck failed: boot from rescue (init=/bin/sh, a USB live), and run e2fsck -fy /dev/sdaN by hand.
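The UUID conflict is easy to miss until a mount by UUID picks the wrong device. The sketch below spots duplicates in a list of "name UUID" pairs; duplicate_uuids is a hypothetical helper, and the sample values are made up for illustration.

```shell
#!/bin/sh
# Read "NAME UUID" pairs on stdin and print every UUID that occurs more
# than once, followed by the devices that carry it.
duplicate_uuids() {
    awk 'NF >= 2 { count[$2]++; devs[$2] = devs[$2] " " $1 }
         END { for (u in count) if (count[u] > 1) print u ":" devs[u] }'
}

printf '%s\n' 'sda1 1111-aaaa' 'sdb1 1111-aaaa' 'sdc1 2222-bbbb' | duplicate_uuids
# Prints: 1111-aaaa: sda1 sdb1

# On a real machine the pairs could come from lsblk:
#   lsblk -rno NAME,UUID | duplicate_uuids
```

Any device it reports on the copy side is a candidate for tune2fs -U random.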