====== Bad sectors ======
====== SMART test ======
There are three types of self-test that a device can execute (all of them are safe for user data):
* **Short**: runs tests that have a high probability of detecting device problems,
* **Extended or Long**: the same checks as the short test, but with no time limit and with a complete examination of the disk surface,
* **Conveyance**: checks for damage incurred during transportation of the device.
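As a rough illustration of how the three types map to the command line, the sketch below builds (but does not run) the matching smartctl command. The function name smart_test_cmd is made up for this page. The arguments -t short, -t long and -t conveyance are standard smartctl options, and "extended" is accepted here only as a convenience alias for "long".

```shell
#!/bin/bash
# Hypothetical helper: print the smartctl command that starts a self-test.
# It only prints the command, so it is safe to try without touching a drive.
smart_test_cmd() {
    local kind=$1 dev=$2
    case "$kind" in
        extended) kind=long ;;      # smartctl calls the extended test "long"
    esac
    case "$kind" in
        short|long|conveyance)
            printf 'smartctl -t %s %s\n' "$kind" "$dev"
            ;;
        *)
            echo "unknown self-test type: $kind" >&2
            return 1
            ;;
    esac
}
```

For example, smart_test_cmd extended /dev/sdd prints "smartctl -t long /dev/sdd", and an unknown type returns a non-zero exit status.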
Run a short test in foreground (captive, -C) mode; the drive may be unresponsive until the test finishes:
smartctl -t short -C /dev/sdd
View test results:
sudo smartctl -l selftest /dev/sdd
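To pull the failing address out of that log without reading it by eye, a small filter along these lines may help. It is only a sketch: first_error_lba is a name made up here, and it assumes the log layout that smartctl prints for ATA drives, where the newest entry starts with "# 1" and the last column is LBA_of_first_error ("-" when no error was recorded).

```shell
#!/bin/bash
# Hypothetical filter: read `smartctl -l selftest` output on stdin and print
# the last column (LBA_of_first_error) of the newest log entry, "# 1".
first_error_lba() {
    awk '/^# *1 / { print $NF; exit }'
}
```

Possible usage: sudo smartctl -l selftest /dev/sdd | first_error_lba. A result of "-" only means that the newest entry lists no error; it does not tell whether that test has already finished.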
====== Repair sector ======
Find the LBA of the first error in the self-test log:
smartctl -a /dev/sdb
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     60539         974041815
# 2  Short offline       Completed without error       00%     60516         -
# Try to read the bad sector:
hdparm --read-sector 974041815 /dev/sdb
/dev/sdb:
reading sector 974041815: SG_IO: bad/missing sense data, sb[]: 70 00 03 00 00 00 00 0a 40 51 e0 01 11 04 00 00 a0 d7 00 00 00 00 00 00 00 00 00 00 00 00 00 00
succeeded
# Overwrite the bad sector with zeros so the drive can remap it (the data in that sector is lost):
hdparm --yes-i-know-what-i-am-doing --repair-sector 974041815 /dev/sdb
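# Test the rest of the disk, starting at the repaired sector:
smartctl -t select,974041815-max /dev/sdb
# When the test has finished, check the log for the next bad sector:
smartctl -l selftest /dev/sdb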
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Selective offline   Completed: read failure       90%     60550         974041843
# 2  Short offline       Completed: read failure       90%     60539         974041815
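# Repair the next bad sector in the same way:
hdparm --yes-i-know-what-i-am-doing --repair-sector 974041843 /dev/sdb
# Test the rest of the disk again, and repeat until the test completes without error:
smartctl -t select,974041815-max /dev/sdb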
A ready-made script that automates the steps above:
[[https://serverfault.com/questions/461203/how-to-use-hdparm-to-fix-a-pending-sector|How to use hdparm to fix a pending sector?]]
Fixed version of that script:
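#!/bin/bash -u
# Run selective self-tests from the last bad LBA onwards and zero out every
# bad sector reported, until a test completes without error.
baddrive=/dev/sdb
badsect=974041815
while true; do
    echo "Testing from LBA $badsect"
    # Discard both stdout and stderr (the order of the redirections matters).
    smartctl -t "select,${badsect}-max" "${baddrive}" > /dev/null 2>&1
    echo "Waiting for test to stop (each dot is 5 sec)"
    # Status 249 means "self-test in progress, 90% remaining".
    while [ "$(smartctl -a "${baddrive}" | awk '/Self-test execution status:/ {print $5}')" = "249)" ]; do
        echo -n .
        sleep 5
    done
    echo
    echo "Waiting for the result to appear in the self-test log (each dot is 5 sec)"
    while [ "$(smartctl -l selftest "${baddrive}" | awk '/^# 1/{print substr($5,1,9)}')" != "Completed" ]; do
        echo -n .
        sleep 5
    done
    echo
    # Column 10 of the newest entry is the LBA of the first error, or "-" if there is none.
    badsect=$(smartctl -l selftest "${baddrive}" | awk '/^# *1 +Selective offline +Completed/ {print $10}')
    [ "$badsect" = "-" ] && exit 0
    # Never call hdparm with an empty sector number.
    [ -n "$badsect" ] || { echo "Could not parse the self-test log" >&2; exit 1; }
    echo "Attempting to fix sector $badsect on $baddrive"
    hdparm --repair-sector "${badsect}" --yes-i-know-what-i-am-doing "$baddrive"
    echo "Continuing test"
done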
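The polling in the script depends on what the newest log entry says. As a sketch of that decision in isolation, the function below classifies the "# 1" entry of a self-test log read from stdin. The name selftest_state and its output words are invented for this page, and the matched phrases ("in progress", "Completed without error", "Completed: read failure") are assumed to be the wording smartctl uses in the log for ATA drives.

```shell
#!/bin/bash
# Hypothetical helper: classify the newest self-test log entry ("# 1").
# Reads `smartctl -l selftest` output on stdin and prints one of:
#   running | passed | failed <LBA> | unknown
selftest_state() {
    awk '
        /^# *1 / {
            found = 1
            if ($0 ~ /in progress/)                  print "running"
            else if ($0 ~ /Completed without error/) print "passed"
            else if ($0 ~ /Completed: read failure/) print "failed " $NF
            else                                     print "unknown"
            exit
        }
        END { if (!found) print "unknown" }
    '
}
```

Possible usage: smartctl -l selftest /dev/sdb | selftest_state. Any other status, such as an interrupted test, ends up as "unknown"; the wait loop in the script above would keep waiting in that case, because it only stops on an entry that starts with "Completed".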
For a ZFS pool with RAIDZ, a scrub is needed afterwards to repair the damaged data from redundancy:
zpool scrub poolname
zpool status -v poolname
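The scrub runs in the background, so the status has to be checked again once it has finished. A minimal sketch for reading that status from a script follows. zpool_scrub_state and its three output words are made up here, and it assumes the usual zpool status lines "scan: scrub in progress ..." and "errors: No known data errors".

```shell
#!/bin/bash
# Hypothetical helper: summarise `zpool status` output read from stdin.
# Prints "scrubbing" while a scrub is running, "clean" when the errors line
# reports no known data errors, and "check" in every other case.
zpool_scrub_state() {
    awk '
        /scan: scrub in progress/       { scrubbing = 1 }
        /^errors: No known data errors/ { clean = 1 }
        END {
            if (scrubbing)  print "scrubbing"
            else if (clean) print "clean"
            else            print "check"
        }
    '
}
```

Possible usage: zpool status -v poolname | zpool_scrub_state. A result of "check" means the output should be read by hand, for example to see the list of files with permanent errors.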
[[https://github.com/hradec/fix_smart_last_bad_sector]]
[[https://raw.githubusercontent.com/unxed/fixhdd/master/fixhdd.py]]