Table of Content

Error-handling mechanisms in XFS

This section describes how XFS handles various kinds of errors in the file system.

Unclean unmounts

Journalling maintains a transactional record of metadata changes that happen on the file system.

In the event of a system crash, power failure, or other unclean unmount, XFS uses the journal (also called log) to recover the file system. The kernel performs journal recovery when mounting the XFS file system.

Corruption

In this context, corruption means errors on the file system caused by, for example:

  • Hardware faults
  • Bugs in storage firmware, device drivers, the software stack, or the file system itself
  • Problems that cause parts of the file system to be overwritten by something outside of the file system

When XFS detects corruption in the file system or the file-system metadata, it may shut down the file system and report the incident in the system log. Note that if the corruption occurred on the file system hosting the /var directory, these logs will not be available after a reboot.

System log entry reporting an XFS corruption

$ dmesg --notime | tail -15

XFS (loop0): Mounting V5 Filesystem
XFS (loop0): Metadata CRC error detected at xfs_agi_read_verify+0xcb/0xf0 [xfs], xfs_agi block 0x2
XFS (loop0): Unmount and run xfs_repair
XFS (loop0): First 128 bytes of corrupted metadata buffer:
00000000027b3b56: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000005f9abc7a: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000005b0aef35: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000000da9d2ded: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000001e265b07: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000006a40df69: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000000b272907: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000000e484aac5: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
XFS (loop0): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x2 len 1 error 74
XFS (loop0): xfs_imap_lookup: xfs_ialloc_read_agi() returned error -117, agno 0
XFS (loop0): Failed to read root inode 0x80, error 11

User-space utilities usually report the Input/output error message when trying to access a corrupted XFS file system. Mounting an XFS file system with a corrupted log results in a failed mount and the following error message: mount: /mount-point: mount(2) system call failed: Structure needs cleaning.

You must manually use the xfs_repari utility to repai the corruption.

xfs_repair

Checking an XFS file system with xfs_repair

This procedure performs a read-only check of an XFS file system using the xfs_repair utility. You must manually use the xfs_repair utility to repair any corruption. Unlike other file system repair utilities, xfs_repair does not run at boot time, even when an XFS file system was not cleanly unmounted. In the event of an unclean unmount, XFS simply replays the log at mount time, ensuring a consistent file system; xfs_repair cannot repair an XFS file system with a dirty log without remounting it first.

Although an fsck.xfs binary is present in the xfsprogs package, this is present only to satisfy initscripts that look for an fsck.file system binary at boot time. fsck.xfs immediately exits with an exit code of 0.

Procedure

Replay the log by mounting and unmounting the file system

$ mount /target
$ umout /target

If the mount fails with a structure needs cleaning error, the log is corrupted and cannot be replayed. The dry run should discover and report more on-disk corruption as a result.

Use the xfs_repair utility to perform a dry run to check the file system. Any errors are printed and an indication of the actions that would be taken, without modifying the file system.

$ xfs_repari -n block-device

Mount the file system

$ mount /target

Repairing an XFS file system with xfs_repair

This procedure repairs a corrupted XFS file system using the xfs_repair utility.

Procedure

Create a metadata image prior to repair for diagnostic or testing purposes using the xfs_metadump utility. A pre-repair file system metadata image can be useful for support investigations if the corruption is due to a software bug. Patterns of corruption present in the pre-repair image can aid in root-cause analysis.

Use the xfs_metadump debugging tool to copy the metadata from an XFS file system to a file. The resulting metadump file can be compressed using standard compression utilities to reduce the file size if large metadump files need to be sent to support.

$ xfs_metadump block-device metadump-file

Replay the log by remounting the file system

$ mount /target
$ umount /target

Use the xfs_repair utility to repair the unmounted file system: If the mount succeeded, no additional options are required:

$ xfs_repair block-device

If the mount failed with the Structure needs cleaning error, the log is corrupted and cannot be replayed. Use the -L option (force log zeroing) to clear the log:

Warning This command causes all metadata updates in progress at the time of the crash to be lost, which might cause significant file system damage and data loss. This should be used only as a last resort if the log cannot be replayed.

$ xfs_repair -L block-device

Mount the file system

$ mount /target