Copyright © 2026 LinuxLab. All rights reserved.

kb/filesystem ── File system ── advanced

fsck and recovery: checking and repairing a filesystem

fsck checks an unmounted filesystem. The per-filesystem tools are e2fsck (ext2/3/4), xfs_repair (XFS), and btrfs check (btrfs). Journal replay at mount resolves most problems after a crash.

aka: fsck, filesystem-check, e2fsck, xfs-repair, fs-recovery, journal-replay

When something breaks

A filesystem can end up in a "dirty" state after:

  • Power loss / kernel panic during a write
  • Bad sectors on an HDD/SSD, where the storage stack returned an error
  • A disk controller that lied about fsync (write-back cache without a BBU)
  • Memory corruption (bit flips in RAM, more likely without ECC)
  • A bug in the kernel or the filesystem
  • An accidental write to the raw device, bypassing the filesystem

Symptoms:

  • Read-only file system after mount (errors=remount-ro)
  • Structure needs cleaning at the next mount
  • Input/output error when reading specific files
  • Odd ls behavior (garbage names, files disappearing)
  • The machine stuck at boot on "checking filesystems..."

Strategy:

  1. Stop. Do not write to the damaged filesystem.
  2. Back up the raw device (dd or ddrescue) before any edits.
  3. If there is a journal, try a replay at mount.
  4. If that does not help, do an offline fsck/repair.
  5. If even that fails, extract the data with recovery utilities.

Journal replay, the automatic magic

At mount time, ext4, XFS, and btrfs detect that the filesystem was not cleanly unmounted and replay the journal: they apply changes that were committed to the journal but not yet written to their final location. This takes seconds and resolves most crash scenarios.

What you will see in dmesg:

EXT4-fs (sda1): recovery required on readonly filesystem
EXT4-fs (sda1): write access will be enabled during recovery
EXT4-fs (sda1): mounted filesystem with ordered data mode
XFS (sdb1): Mounting V5 Filesystem
XFS (sdb1): Starting recovery (logdev: internal)
XFS (sdb1): Ending recovery (logdev: internal)

If journal recovery fails, then reach for fsck.
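
To check whether a replay actually happened, you can filter the kernel log for the messages shown above. This is a minimal sketch: journal_recovery_lines is a hypothetical helper, and its patterns are modeled only on the sample lines in this entry, so other kernel versions may word the messages differently.

```bash
# Filter kernel log text on stdin down to journal-recovery messages.
# Assumption: the message wording matches the dmesg sample above.
journal_recovery_lines() {
    grep -E 'EXT4-fs \([^)]*\): .*recovery|XFS \([^)]*\): (Starting|Ending) recovery'
}

# Typical use on a live system (may need root):
#   dmesg | journal_recovery_lines
```

No output means the log contains no recovery messages in this form; it does not prove the filesystem is healthy.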

fsck, the frontend

fsck is a universal wrapper that looks at the filesystem type and calls the right tool:

| FS | Real binary |
|---|---|
| ext2/3/4 | e2fsck |
| xfs | a no-op in fsck, use xfs_repair directly |
| btrfs | use btrfs check directly |
| vfat | dosfsck |
| jfs | jfs_fsck |
bash
# Run automatically from fstab at boot
fsck -A           # everything in fstab (passno > 0)
fsck /            # a specific mountpoint
fsck /dev/sda1    # a specific device
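
The mapping in the table can be sketched as a small lookup. repair_tool_for is a hypothetical helper written for this entry, not part of util-linux; it only echoes the tool name and runs nothing.

```bash
# Map a filesystem type to the repair tool named in the table above.
# Hypothetical helper for illustration; it prints a name, nothing more.
repair_tool_for() {
    case "$1" in
        ext2|ext3|ext4) echo "e2fsck" ;;
        xfs)            echo "xfs_repair" ;;
        btrfs)          echo "btrfs check" ;;
        vfat)           echo "dosfsck" ;;
        jfs)            echo "jfs_fsck" ;;
        *)              echo "unknown filesystem type: $1" >&2; return 1 ;;
    esac
}

# On a real system the type could come from blkid, for example:
#   repair_tool_for "$(blkid -o value -s TYPE /dev/sda1)"
```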

Options:

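| Option | What it does |
|---|---|
| -A | check everything in fstab |
| -y | "yes to all"; not safe if you do not understand the damage (passed through to the checker) |
| -n | dry run: report only, change nothing (passed through to the checker) |
| -f | force a check even on a clean filesystem (passed through to e2fsck) |
| -V | verbose |
| -C | progress bar |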

The main rule: the filesystem must be unmounted. Running a repairing fsck on a mounted filesystem that is being written to causes corruption. Running with -n (read-only) on a mounted filesystem is safe, but it can report spurious errors, so treat the result as diagnostics only.

e2fsck, the ext family

bash
umount /dev/sda1
e2fsck -f /dev/sda1                # force, always
e2fsck -fy /dev/sda1               # auto-yes (for scripts and desperate cases)
e2fsck -fp /dev/sda1               # preen, fix what is safe, complain about the rest
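
In scripts, the exit status of e2fsck matters as much as its output: it is a bitmask, so several conditions can be set at once. The sketch below decodes it using the values documented in the e2fsck man page; describe_fsck_status is a hypothetical helper name.

```bash
# Decode an e2fsck/fsck exit status (a bitmask) into one line per set bit.
describe_fsck_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    if [ $((code & 1)) -ne 0 ];   then echo "errors corrected"; fi
    if [ $((code & 2)) -ne 0 ];   then echo "errors corrected, reboot recommended"; fi
    if [ $((code & 4)) -ne 0 ];   then echo "errors left uncorrected"; fi
    if [ $((code & 8)) -ne 0 ];   then echo "operational error"; fi
    if [ $((code & 16)) -ne 0 ];  then echo "usage or syntax error"; fi
    if [ $((code & 32)) -ne 0 ];  then echo "cancelled by user request"; fi
    if [ $((code & 128)) -ne 0 ]; then echo "shared library error"; fi
    return 0
}

# Typical use after a check:
#   e2fsck -fp /dev/sda1; describe_fsck_status $?
```

A status with bit 4 set means -p could not fix everything safely, which is the cue to rerun interactively or with -y.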

Special cases:

bash
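# Find the backup superblocks first
mke2fs -n /dev/sda1                                # dry run, writes nothing; pass the same options as the original mkfs
dumpe2fs /dev/sda1 | grep -i 'backup superblock'   # alternative while the filesystem is still readable (no -h: it hides the group list)
# Then check with a backup superblock if the primary one is broken
e2fsck -b 32768 /dev/sda1                          # first backup on ext4 with 4K blocks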
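
When the primary superblock is gone, it helps to turn the mke2fs -n report into a plain list of candidate block numbers. This is a rough sketch: list_backup_superblocks is a hypothetical helper, and it assumes the report contains a "Superblock backups stored on blocks:" header followed by comma-separated numbers, which may differ between e2fsprogs versions.

```bash
# Read mke2fs -n output on stdin and print one backup superblock per line.
# Assumption: numbers follow a "Superblock backups stored on blocks:" line.
list_backup_superblocks() {
    awk '
        /Superblock backups stored on blocks/ { grab = 1; next }
        grab && /^[ \t]*[0-9]/ {
            gsub(/[^0-9]+/, " ")            # keep digits only, re-split fields
            for (i = 1; i <= NF; i++) print $i
            next
        }
        grab { grab = 0 }                   # first non-numeric line ends the list
    '
}

# Typical use (writes nothing to the device):
#   mke2fs -n /dev/sda1 | list_backup_superblocks
```

Each printed number is a candidate for e2fsck -b; the list is only valid if mke2fs -n was run with the same block size and options as the original mkfs.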

What it prints on corruption:

Pass 1: Checking inodes, blocks, and sizes
Inode 12345 has illegal block(s).  Clear<y>?
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

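Files that lost their name but still have an inode land in the lost+found directory at the root of that filesystem, named by inode number (for example #12345). The content is readable, but the original name is gone; you have to identify each file from its content.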

xfs_repair, XFS

bash
umount /dev/sdb1
xfs_repair /dev/sdb1                # normal
xfs_repair -n /dev/sdb1             # dry-run
xfs_repair -L /dev/sdb1             # ZERO LOG, last resort!

-L zeroes the log, discarding any metadata changes that were logged but not yet replayed. Use it only when xfs_repair refuses to run because the log is dirty and mounting the filesystem to replay the log also fails.

XFS does not pre-create a lost+found directory the way ext does. xfs_repair creates one when it needs to and reconnects orphaned files and directories there, named by inode number.

btrfs check / btrfs rescue

bash
umount /dev/sdc1
btrfs check /dev/sdc1                  # diagnostics only
btrfs check --repair /dev/sdc1         # DANGEROUS, can make it worse!

btrfs check --repair has been marked experimental for years. Use it only when you have nothing to lose. Try btrfs rescue first:

bash
btrfs rescue super-recover /dev/sdc1   # recover the super blocks
btrfs rescue chunk-recover /dev/sdc1   # recover the chunk tree
btrfs rescue zero-log /dev/sdc1        # like xfs_repair -L

If the filesystem will not mount, try RO:

bash
mount -o ro,usebackuproot,nologreplay /dev/sdc1 /mnt   # usebackuproot replaces the deprecated "recovery" option

and rescue the data with cp -a or btrfs send.

fstab, passno

The sixth field in fstab says when to fsck at boot:

UUID=...  /     ext4  defaults  0 1
UUID=...  /var  ext4  defaults  0 2
UUID=...  /tmp  tmpfs defaults  0 0
  • 0: do not check
  • 1: root, checked first
  • 2: checked after root; entries with 2 that sit on different disks may be checked in parallel

For XFS and btrfs, set passno to 0, because fsck is a no-op for them (they use their own tools when needed).
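
To see what fsck -A would pick up on a given machine, you can list the fstab entries with a non-zero sixth field in check order. A minimal sketch; fstab_fsck_order is a hypothetical helper, and it does not handle every fstab corner case (for example entries that omit the last two fields).

```bash
# Print "passno mountpoint fstype" for every fstab entry with passno > 0,
# ordered by passno. Reads /etc/fstab unless another file is given.
fstab_fsck_order() {
    awk '$1 !~ /^#/ && NF >= 6 && $6 > 0 { print $6, $2, $3 }' "${1:-/etc/fstab}" | sort -n
}

# Typical use:
#   fstab_fsck_order
#   fstab_fsck_order /mnt/rescue/etc/fstab
```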

errors=remount-ro

A mount option, set in fstab:

UUID=...  /  ext4  defaults,errors=remount-ro  0 1

When the kernel detects a filesystem error, it remounts the filesystem read-only instead of carrying on or panicking. This limits the damage: nothing more is written, and you can still read dmesg, unmount (or boot from rescue), and run fsck.

Alternatives: errors=continue (do nothing, dangerous), errors=panic (kernel panic, for embedded).
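
A filesystem that has silently flipped to read-only is easy to miss until writes start failing. The sketch below scans a mounts table in /proc/mounts format for ext, XFS, and btrfs entries whose options include ro. ro_mounts is a hypothetical helper, and it cannot tell an error remount from a deliberate read-only mount.

```bash
# List "device mountpoint fstype" for disk filesystems mounted read-only.
# Reads /proc/mounts unless another file in the same format is given.
ro_mounts() {
    awk '$4 ~ /(^|,)ro(,|$)/ && $3 ~ /^(ext[234]|xfs|btrfs)$/ { print $1, $2, $3 }' \
        "${1:-/proc/mounts}"
}

# Typical use, for example from a monitoring check:
#   ro_mounts
```

The option match is anchored on commas, so errors=remount-ro on a read-write mount is not reported as read-only.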

ddrescue for a dying disk

When an HDD is failing and a plain dd chokes on bad sectors:

bash
ddrescue /dev/sda backup.img backup.log

It builds an image, skipping the bad sectors (without hanging on them), and records its progress in a mapfile (backup.log here). A second run with the same mapfile retries only the areas it could not read. Before any xfs_repair -L or btrfs check --repair, make a ddrescue copy first.
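
A common way to use the two runs is a fast first pass followed by a retry pass. The sketch below prints the commands by default and only executes them when DRY_RUN=0. The helper names run and rescue_image, the retry count of 3, and the device path in the usage line are illustrative assumptions; -n (no scrape) and -r (retry passes) are GNU ddrescue options.

```bash
# Print the command when DRY_RUN=1 (the default here), run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Two-pass imaging: source device, image file, mapfile.
rescue_image() {
    src=$1; img=$2; map=$3
    # Pass 1: copy everything that reads cleanly, skip the slow scraping phase.
    run ddrescue -n "$src" "$img" "$map" || return 1
    # Pass 2: return to the bad areas recorded in the mapfile, up to 3 retries.
    run ddrescue -r3 "$src" "$img" "$map"
}

# Typical use (prints only; set DRY_RUN=0 to really run it):
#   rescue_image /dev/sda /backup/sda.img /backup/sda.log
```

Keeping the same mapfile across both passes is what lets the second pass skip the data that was already recovered.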

Recovering a file

If a file was deleted but its inode was not overwritten:

  • ext4: extundelete /dev/sda1 --restore-file path/to/file (e2undel is an older tool aimed at ext2). Unmount the filesystem before you run it.
  • xfs: xfs_undelete (a community tool, does not always work).
  • btrfs: pull the file from a snapshot if one existed. On a broken filesystem that will not mount, btrfs restore can copy files out.
  • Any FS: photorec from the testdisk package, a signature-based search by file type.

The longer the filesystem runs after the deletion, the smaller your chances. Unmount right away.

Best practices

  • Keep backups: with a good backup, you do not have to depend on fsck to get data back
  • errors=remount-ro in fstab, always
  • smartctl -a shows the S.M.A.R.T. attributes, which can warn you that a disk is about to fail
  • scrub on btrfs/zfs once a week
  • fsck -fp at boot for the passno=1 entry catches small issues
  • ddrescue before a repair when you suspect the hardware
  • Document the UUIDs of your partitions and your backup superblocks

When something goes wrong

  • fsck.xfs does nothing: this is normal, fsck.xfs is a stub. Use xfs_repair.
  • fsck hangs: xfs_repair on a huge filesystem can take hours. Leave it alone.
  • Bad superblock: use an ext4 backup superblock through e2fsck -b $BACKUP /dev/sda1.
  • UUID conflict: after dd the copy has the same UUID. Run tune2fs -U random /dev/sdb1.
  • lost+found empty after fsck: there was nothing to recover as an orphan, or the filesystem was healthy.
  • the system will not boot because fsck failed: boot from rescue (init=/bin/sh, a USB live), and run e2fsck -fy /dev/sdaN by hand.
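
For the UUID conflict case above, duplicates can be spotted before they cause a wrong mount. A minimal sketch, assuming input in the two-column form that lsblk -rno NAME,UUID prints; duplicate_uuids is a hypothetical helper.

```bash
# Read "name uuid" pairs on stdin and report every UUID used more than once.
# Lines without a UUID (a single field) are ignored.
duplicate_uuids() {
    awk 'NF == 2 { count[$2]++; names[$2] = names[$2] " " $1 }
         END { for (u in count) if (count[u] > 1) print u ":" names[u] }'
}

# Typical use:
#   lsblk -rno NAME,UUID | duplicate_uuids
```

An empty result means no two block devices share a filesystem UUID.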

§ commands

bash
sudo umount /dev/sdb1 && sudo e2fsck -fy /dev/sdb1

Unmount and check an ext filesystem with force and auto-yes

bash
sudo xfs_repair -n /dev/sdb1

Dry-run check of XFS; shows problems without fixing them

bash
sudo dumpe2fs -h /dev/sda1 | grep -i 'last mount\|last check\|state'

Filesystem state, last mount time, and last check time

bash
sudo mount -o ro,noload /dev/sda1 /mnt/rescue

Read-only mount without journal replay; rescue data from a broken FS

bash
sudo ddrescue /dev/sda /backup/sda.img /backup/sda.log

Image a dying disk before any repair attempt

bash
sudo btrfs rescue zero-log /dev/sdc1

Btrfs: zero the log tree if the log is broken; last resort before --repair

bash
sudo mke2fs -n /dev/sdb1

Show backup superblocks without overwriting; useful during corruption

§ see also

  • ext4: the Linux filesystem workhorse. ext4 is the default filesystem on most distributions: journaling, extents, a fixed inode count set at mkfs time. The main tunes are the data mode, noatime, and lazy init. Stable for 15+ years. Does not scale like XFS.
  • XFS: extents and parallel I/O. XFS is the RHEL 7+ default: allocation groups (parallel I/O), extent-based allocation, online grow. **It cannot shrink**, grow only. Ideal for big files, databases, and parallel workloads.
  • btrfs: copy-on-write, subvolumes, and snapshots. btrfs is a copy-on-write filesystem with subvolumes, O(1) snapshots, native RAID 0/1/10, and data checksums. RAID 5/6 is problematic. COW fragmentation hurts databases and VM images, so turn it off for them.
  • mount and /etc/fstab: attaching filesystems. `mount` attaches a block device or filesystem to a mount point in the tree. `/etc/fstab` is the list of what to mount at boot.
  • Block devices: disks in Linux. A block device is read and written in fixed-size blocks (usually 512B or 4K). Disks, SSDs, and NVMe drives are all block devices in `/dev/`.