Post by Dag-Erling Smørgrav
Post by Miroslav Lachman
So... can somebody with enough knowledge write some docs / a script on how
to find the affected file based on an LBA read error from messages /
the SMART log?
ZFS will tell you straight away, but I guess if you used ZFS, you
wouldn't be asking :)
Yes, but we have ZFS on only two servers; the others use UFS2 (some
with gmirror, some with gjournal).
Post by Dag-Erling Smørgrav
For FFS, you can unmount the file system (boot from a CD or memory stick
or whatever if that file system is / or /usr), run fsdb on the failing
disk, use findblk to look up the inode number for the file that contains
the bad sector. Note that you have to convert the LBA to an offset
relative to the start of the partition.
As I wrote in my first post to this thread, I already tried fsdb +
findblk, but without success. Findblk did not return any inode. Maybe
"block" means a different unit size here, or there is something else I
don't understand.
So can you please show me a real-world example?
I have one from the past:
__________________
/var/log/messages:
Sep 23 23:58:00 edith kernel: ad4: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=79725056
Sep 23 23:58:00 edith kernel: GEOM_MIRROR: Request failed (error=5). ad4[READ(offset=40819228672, length=131072)]
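The byte offset and length in the GEOM_MIRROR line can be cross-checked against the LBA in the ad4 line. This is only a minimal sketch, assuming 512-byte sectors; the function and variable names are illustrative and not part of any tool.

```python
SECTOR_SIZE = 512  # assumption: the drive reports 512-byte sectors


def bytes_to_sectors(nbytes, sector_size=SECTOR_SIZE):
    """Convert a sector-aligned byte offset or length to a sector count."""
    sectors, remainder = divmod(nbytes, sector_size)
    if remainder:
        raise ValueError("value is not a multiple of the sector size")
    return sectors


# Values copied from the GEOM_MIRROR message.
first_sector = bytes_to_sectors(40819228672)  # offset=
sector_count = bytes_to_sectors(131072)       # length=
last_sector = first_sector + sector_count - 1

print("first sector of the failed read:", first_sector)
print("sectors in the failed read:", sector_count)
print("last sector of the failed read:", last_sector)
```

Under that assumption the first sector equals the LBA printed by the kernel, so that LBA appears to be the start of the failed 256-sector request rather than the exact bad sector.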
__________________
SMART log:
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 6f 82 c0 44 Error: UNC at LBA = 0x04c0826f = 79725167
The LBA of the bad sector is *79725167*
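The LBA in the SMART entry can be rebuilt by hand from the register dump. A small sketch, assuming 28-bit LBA addressing, where the low four bits of DH hold LBA bits 24-27; the function name is made up for illustration.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Assemble a 28-bit LBA from the SN, CL, CH and DH register values."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register values copied from the SMART log entry.
smart_lba = lba28_from_registers(sn=0x6F, cl=0x82, ch=0xC0, dh=0x44)

# LBA from the kernel message in /var/log/messages.
kernel_lba = 79725056

print("SMART LBA:", smart_lba)
print("sectors past the kernel LBA:", smart_lba - kernel_lba)
```

The two numbers differ by 111 sectors, which is still inside the 256-sector read reported by the kernel, so both messages seem to describe the same failure.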
__________________
Information about disk slices:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
start 63, size 209712447 (102398 Meg), flag 80 (active)
beg: cyl 0/ head 1/ sector 1;
end: cyl 1023/ head 254/ sector 63
The data for partition 2 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
start 209712510, size 1743807555 (851468 Meg), flag 0
beg: cyl 1023/ head 255/ sector 63;
end: cyl 1023/ head 254/ sector 63
__________________
Based on the LBA and the size of s1, I think the error is in s1:
# /dev/mirror/gm0s1:
8 partitions:
#          size     offset    fstype   [fsize bsize bps/cpg]
  a:    2097152          0    /
  b:   25165824    2097152    swap
  c:  209712447          0
  d:   12582912   27262976    /var
  e:  146800640   39845888    /var/db
  f:   16777216  186646528    /usr
  g:    6288703  203423744    /tmp
LBA 79725056 minus the 63-sector start of s1 is 79724993, which falls in
*/var/db* (between offsets 39845888 and 186646528)
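The lookup above can be written down as a short sketch. It assumes the partition offsets in the label are relative to the start of s1 (partition c has offset 0, which suggests so) and that all values are 512-byte sectors; the table and function names are only illustrative.

```python
S1_START = 63  # start of slice 1, from the slice table

# (name, size, offset) in sectors, copied from the label of gm0s1.
# Partition c is left out because it covers the whole slice.
PARTITIONS = [
    ("a", 2097152, 0),
    ("b", 25165824, 2097152),
    ("d", 12582912, 27262976),
    ("e", 146800640, 39845888),
    ("f", 16777216, 186646528),
    ("g", 6288703, 203423744),
]


def partition_for_lba(lba, slice_start=S1_START, partitions=PARTITIONS):
    """Return the name of the partition holding the given absolute LBA."""
    slice_sector = lba - slice_start
    for name, size, offset in partitions:
        if offset <= slice_sector < offset + size:
            return name
    return None  # not inside any listed partition


print(partition_for_lba(79725056))  # LBA from the kernel message
print(partition_for_lba(79725167))  # LBA from the SMART log
```

Both LBAs land in partition e, which is /var/db.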
__________________
s1 starts 63 sectors from the beginning of the drive and /var/db has
offset 39845888 within s1. So am I right that I need to look up block
number *39879105* with the findblk command?
LBA err - s1 start - /var/db offset = findblk block inside /dev/mirror/gm0s1e
79725056 - 63 - 39845888 = 39879105
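The same subtraction, applied to the whole failed read and to the SMART LBA, looks roughly like this. It is a sketch under the assumption that everything is counted in 512-byte sectors; whether findblk expects exactly this unit is the open question of this post. All names are illustrative.

```python
S1_START = 63              # start of slice 1 on the disk
VAR_DB_OFFSET = 39845888   # offset of partition e (/var/db) inside s1
READ_SECTORS = 256         # length=131072 bytes / 512


def to_partition_sector(lba, slice_start=S1_START, part_offset=VAR_DB_OFFSET):
    """Convert an absolute LBA to a sector number relative to the partition."""
    return lba - slice_start - part_offset


first = to_partition_sector(79725056)                     # kernel LBA
last = to_partition_sector(79725056 + READ_SECTORS - 1)   # end of the read
smart = to_partition_sector(79725167)                     # SMART LBA

print("failed read covers partition sectors", first, "to", last)
print("SMART bad sector is partition sector", smart)
```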
__________________
/# fsdb -r /dev/mirror/gm0s1e
** /dev/mirror/gm0s1e (NO WRITE)
Examining file system '/dev/mirror/gm0s1e'
Last Mounted on /var/db
current inode: directory
I=2 MODE=40755 SIZE=512
BTIME=May 1 08:07:23 2009 [0 nsec]
MTIME=Sep 24 15:52:01 2009 [0 nsec]
CTIME=Sep 24 15:52:01 2009 [0 nsec]
ATIME=Sep 24 16:24:34 2009 [0 nsec]
OWNER=root GRP=wheel LINKCNT=11 FLAGS=0 BLKCNT=4 GEN=4ebc65fc
fsdb (inum: 2)> findblk 39879105
fsdb (inum: 2)> findblk 39879106
fsdb (inum: 2)> findblk 39879107
fsdb (inum: 2)> findblk 39879108
.
.
I tried more than 256 consecutive block numbers, but findblk didn't
find any inode! (length=131072 in the error message means 256 sectors, right?)
So there must be some misunderstanding on my part, and that's why I am
asking for some step-by-step documentation or a script on "how to find a
file from an LBA read error message".
I also tried fsdb + findblk on known data, but again without success.
I created the file /tmp/test.txt, which has inum 3, then I ran fsdb on
gm0s1f (gm0s1f is mounted as /tmp). The command "inode 3" at the fsdb
prompt returned information about this file and the command "blocks"
returned 3001 as the block number, but the command "findblk 3001"
returned nothing instead of inum 3!
Where is the error? What am I doing wrong?
__________________
~/# echo test > /tmp/test.txt
~/# ls -i /tmp/test.txt
3 /tmp/test.txt
~/# fsdb -r /dev/mirror/gm0s1f
** /dev/mirror/gm0s1f (NO WRITE)
Examining file system '/dev/mirror/gm0s1f'
Last Mounted on /tmp
current inode: directory
I=2 MODE=41777 SIZE=512
BTIME=Feb 7 18:32:22 2008 [0 nsec]
MTIME=Mar 14 10:33:22 2010 [0 nsec]
CTIME=Mar 14 10:33:22 2010 [0 nsec]
ATIME=Mar 14 10:33:35 2010 [0 nsec]
OWNER=root GRP=wheel LINKCNT=7 FLAGS=0 BLKCNT=4 GEN=3f7c9384
fsdb (inum: 2)> inode 3
current inode: regular file
I=3 MODE=100644 SIZE=5
BTIME=Mar 14 10:33:22 2010 [0 nsec]
MTIME=Mar 14 10:33:22 2010 [0 nsec]
CTIME=Mar 14 10:33:22 2010 [0 nsec]
ATIME=Mar 14 10:33:22 2010 [0 nsec]
OWNER=root GRP=wheel LINKCNT=1 FLAGS=0 BLKCNT=4 GEN=45c26de1
fsdb (inum: 3)> blocks
Blocks for inode 3:
Direct blocks:
3001 (1 frag)
fsdb (inum: 3)> findblk 3001
fsdb (inum: 3)>
^^^^^^^^ findblk did not return inode 3!
Post by Dag-Erling Smørgrav
Unfortunately, you can't easily go from inode to file name; you have to
mount the file system and use something like find -inum.
Yes, I know this.
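For completeness, the "find -inum" step could also be scripted. This is only a rough sketch of the same idea, not a replacement for find(1); the function name is made up, and it assumes the file system is mounted and readable.

```python
import os


def find_by_inode(mount_point, inum):
    """Return the paths under mount_point whose inode number equals inum."""
    root_dev = os.lstat(mount_point).st_dev
    matches = []
    for dirpath, dirnames, filenames in os.walk(mount_point):
        # Stay on one file system: skip directories that live on another device.
        dirnames[:] = [
            d for d in dirnames
            if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev
        ]
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if st.st_dev == root_dev and st.st_ino == inum:
                matches.append(path)
    return matches
```

With the test file from above, calling find_by_inode("/tmp", 3) should list /tmp/test.txt, plus any hard links to it on the same file system.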
Thanks in advance for helping me understand and use the fsdb + findblk commands.
Miroslav Lachman
PS: everything above was tested on gmirror gm0, but I ran the same tests on
the single drive ad4 with the same "empty" result (mentioned just in case
fsdb can't be used on gmirror, though I don't think that is the problem).