Identifying file associated with a bad sector on ext2/ext3/ext4
Monday, April 16, 2018
I got some SMART warnings about a bad sector on my hard drive, and I wanted to know which specific file had the bad sector.
First, I looked at the SMART logs to see where the problem was:
# smartctl -x /dev/sdd
...
After command completion occurred, registers were:
ER -- ST COUNT  LBA_48  LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 08 00 00 04 4b 5b c0 40 00  Error: UNC at LBA = 0x044b5bc0 = 72047552
...
fdisk -l is useful for looking at the partition info and sector size:
# fdisk -l /dev/sdd
Disk /dev/sdd: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x54afc7e9

Device     Boot Start        End    Sectors  Size Id Type
/dev/sdd1        2048 3907029167 3907027120  1.8T 83 Linux
Then I used badblocks to scan the sectors around that LBA for more bad sectors. My sector size is 512 bytes, as shown above, so I passed -b 512. Note that badblocks takes the last sector number first, followed by the first sector:
# badblocks -b 512 /dev/sdd 72047570 72047540
72047552
72047553
72047554
72047555
72047556
72047557
72047558
72047559
72047560
Finally, debugfs is useful for finding which files are on those blocks.
Explanation:
- First, find the filesystem block number by computing (sector number on disk - partition start sector) * (sector size / filesystem block size), rounded down. In my case, this is (72047552 - 2048) * (512 / 4096) = 9005688. A 4096-byte block holds only 8 sectors of 512 bytes, and 9 contiguous sectors are affected, so the bad area stretches into block 9005689 as well.
- Use testb to see whether there is actually anything there. If not, then no data is lost.
- Use icheck to find the inode corresponding to those blocks. Luckily (?), both bad blocks are associated with the same inode here.
- Finally, use ncheck to find the pathname(s) associated with the inode.
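The block arithmetic from the first step can be sketched in shell as follows. This is only an illustration: sector_to_block is a hypothetical helper name, the numbers are the ones from this post, and it assumes the filesystem block size is a whole multiple of the sector size. The 4096-byte block size is taken from the calculation above; on another filesystem it should be checked first, for example with tune2fs -l.

```shell
#!/bin/sh
# Hypothetical helper: map a disk sector (LBA) to a filesystem block number.
# Arguments: sector, partition start sector, sector size, fs block size.
# Assumes the block size is a whole multiple of the sector size.
sector_to_block() {
    sectors_per_block=$(( $4 / $3 ))
    # Shell integer division rounds down, so every sector inside the
    # same filesystem block maps to the same block number.
    echo $(( ($1 - $2) / sectors_per_block ))
}

# Values from this post; substitute your own.
part_start=2048     # "Start" column from fdisk -l
sector_size=512     # sector size from fdisk -l
block_size=4096     # assumed; verify with: tune2fs -l /dev/sdd1 | grep 'Block size'

# First and last bad sectors reported by badblocks.
first=$(sector_to_block 72047552 "$part_start" "$sector_size" "$block_size")
last=$(sector_to_block 72047560 "$part_start" "$sector_size" "$block_size")

echo "$first $last"
```

The two numbers it prints are the first and last filesystem blocks touched by the bad sectors, which are the values to pass to testb and icheck.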
# debugfs /dev/sdd1
debugfs 1.43.5 (04-Aug-2017)
debugfs:  testb 9005688
Block 9005688 marked in use
debugfs:  testb 9005689
Block 9005689 marked in use
debugfs:  icheck 9005688
Block	Inode number
9005688	105518423
debugfs:  icheck 9005689
Block	Inode number
9005689	105518423
debugfs:  ncheck 105518423
Inode	Pathname
105518423	/drz/rdiff-backup/artanis/var/lib/pgsql/data/base/21595/26720
Here, it was just a backup file, so once I swap out the hard drive or reallocate the sector, the next backup cycle will fix the lost data.