Friday, April 18, 2008

USB dead device fix

I have a slug (NSLU2) at home, serving out a big USB hard disk. I like my slug. It's small, and it runs Debian, and so long as I remember that it's really small, it runs really well.

It looks like USB drives suffer from lots of issues with dodgy firmware and drivers. Certainly, I haven't escaped them. Every few months I lose the USB drive:
kern.log:
Apr 12 20:32:54 slug kernel: scsi 1:0:0:0: rejecting I/O to dead device

...

This can cause filesystem corruption as well, and is generally "A Bad Thing". Fixing this issue is (for me) a case of remounting the filesystem. I recall that once I had to cycle the power as well. Fortunately, I found this page: http://www.mail-archive.com/linux-usb-users@lists.sourceforge.net/msg16510.html which has a very nifty technique for coaxing the USB stack back to life.
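
A minimal sketch of that kind of rebind, assuming the technique is the usual sysfs unbind/bind of the usb-storage driver. The script name, the function name, the DRIVER_DIR override and the two-second delay are my own placeholders, not something taken from the linked page:

```bash
#!/bin/bash
# rebind_usb.sh (hypothetical name)
# e.g.: ./rebind_usb.sh 1-1:1.0
# Needs root. DRIVER_DIR and REBIND_DELAY are overridable placeholders.
DRIVER_DIR="${DRIVER_DIR:-/sys/bus/usb/drivers/usb-storage}"

rebind_usb() {
    local addr="$1"
    # Detach the usb-storage driver from the interface...
    echo -n "${addr}" > "${DRIVER_DIR}/unbind" || return 1
    # ...give the device a moment to settle (the delay is a guess)...
    sleep "${REBIND_DELAY:-2}"
    # ...and attach it again, which makes the driver probe the disk afresh.
    echo -n "${addr}" > "${DRIVER_DIR}/bind"
}

# Only act when an address was given on the command line.
if [ -n "${1:-}" ]
then
    rebind_usb "$1"
fi
```

The filesystem should be unmounted before the rebind, and the disk may well come back under a different /dev name afterwards, so don't assume it is still the same device node.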

I've written a script to pull out the sysfs name of the USB device for a given mount point. At some point I'll add this to my slug so that it "just works". That just leaves a log scraper to pick up each mount and get the sysfs name, wait for a failure, unmount the dead filesystem, rebind the USB device, check the filesystem for damage, and remount the filesystem. Easy!
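#!/bin/bash
# get_usb_name.sh
# Print the sysfs name (e.g. 1-1:1.0) of the USB device behind a mount point.
# e.g.: ./get_usb_name.sh /media/usb2

if [ -z "$1" ]
then
    echo "usage: $0 <mount point>" >&2
    exit 1
fi

# Match the mount point exactly (third field of mount's output), then
# strip the /dev/ prefix and the partition number: /dev/sda1 -> sda
USB_DEV=$(mount | awk -v mp="$1" '$3 == mp { print $1; exit }' | sed -e 's|^/dev/||' -e 's/[0-9]*$//')
# The device link resolves to .../<usb name>/host<N>/...; keep the part just before /host
USB_ADDR=$(readlink -f "/sys/block/${USB_DEV}/device" | sed -e 's|/host.*$||' -e 's|^.*/||')
if [ -n "${USB_ADDR}" ] && [ -e "/sys/bus/usb/drivers/usb-storage/${USB_ADDR}" ]
then
    echo "${USB_ADDR}"
else
    echo "could not verify usb address of ${USB_ADDR} for device ${USB_DEV}" >&2
    exit 1
fi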
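
For the "wait for a failure" part of the log scraper, here is a rough sketch of how the dead device lines could be picked out of the kernel log. It assumes the lines look exactly like the kern.log entry above; the script and function names are made up for illustration:

```bash
#!/bin/bash
# dead_devices.sh (hypothetical name)
# Reads kernel log lines on stdin and prints the SCSI address (e.g. 1:0:0:0)
# from every "rejecting I/O to dead device" line. Other lines are ignored.
dead_devices() {
    sed -n 's|^.*kernel: scsi \([0-9:]*\): rejecting I/O to dead device.*$|\1|p'
}
```

Feeding it from something like tail -F /var/log/kern.log would give one address per failure (GNU sed's -u option avoids buffering if the output is piped on to another command). The address still has to be matched back to a mount; as far as I can tell, the last component of the resolved /sys/block/<dev>/device link is that same SCSI address, so it could be recorded alongside the USB name while the disk is still healthy.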
