====== bad sectors ======

====== SMART test ======

A device can execute three types of self-tests (all are safe for user data):

  * **Short**: runs tests that have a high probability of detecting device problems,
  * **Extended or Long**: the same as the short test, but with no time limit and with a complete examination of the disk surface,
  * **Conveyance**: identifies damage incurred during transportation of the device.

Run a test in foreground (captive) mode:

  smartctl -t short -C /dev/sdd

View the test results:

  sudo smartctl -l selftest /dev/sdd

====== Repair sector ======

Find the LBA of the first error in the self-test log:

  smartctl -a /dev/sdb
  
  SMART Self-test log structure revision number 1
  Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
  # 1  Short offline       Completed: read failure       90%     60539         974041815
  # 2  Short offline       Completed without error       00%     60516         -

Try to read the bad sector, then overwrite it:

  # Try to read bad sector:
  hdparm --read-sector 974041815 /dev/sdb
  
  /dev/sdb:
  reading sector 974041815: SG_IO: bad/missing sense data, sb[]:  70 00 03 00 00 00 00 0a 40 51 e0 01 11 04 00 00 a0 d7 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  succeeded
  
  hdparm --yes-i-know-what-i-am-doing --repair-sector 974041815 /dev/sdb

Test the rest of the disk, starting from the repaired sector:

  # Test rest of disk:
  smartctl -t select,974041815-max /dev/sdb
  smartctl -l selftest /dev/sdb
  
  === START OF READ SMART DATA SECTION ===
  SMART Self-test log structure revision number 1
  Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
  # 1  Selective offline   Completed: read failure       90%     60550         974041843
  # 2  Short offline       Completed: read failure       90%     60539         974041815

Repair the next bad sector that was found and test again:

  hdparm --yes-i-know-what-i-am-doing --repair-sector 974041843 /dev/sdb
  
  # Test rest of disk:
  smartctl -t select,974041815-max /dev/sdb

A ready-made script that automates the steps above: [[https://serverfault.com/questions/461203/how-to-use-hdparm-to-fix-a-pending-sector|How to use hdparm to fix a pending sector?]]

Fixed version:

  #!/bin/bash -u
  # Repeatedly run a selective self-test from the last known bad LBA to the end
  # of the disk and overwrite every failing sector until a test passes.
  baddrive=/dev/sdb
  badsect=974041815
  
  while true; do
      echo "Testing from LBA $badsect"
      # Discard both stdout and stderr (stdout is redirected first, then stderr).
      smartctl -t select,${badsect}-max ${baddrive} >/dev/null 2>&1
  
      # Status 249 means "self-test in progress, 90% remaining".
      echo "Waiting for test to stop (each dot is 5 sec)"
      while [ "$(smartctl -a ${baddrive} | awk '/Self-test execution status:/ {print $5}')" = "249)" ]; do
          echo -n .
          sleep 5
      done
      echo
  
      # Wait until the newest log entry (# 1) reports a completed test.
      echo "Waiting for the result to appear in the log (each dot is 5 sec)"
      while [ "$(smartctl -l selftest ${baddrive} | awk '/^# 1/{print substr($5,1,9)}')" != "Completed" ]; do
          echo -n .
          sleep 5
      done
      echo
  
      # Column 10 is LBA_of_first_error; "-" means no read error was found.
      badsect=$(smartctl -l selftest ${baddrive} | awk '/^# 1 +Selective offline +Completed/ {print $10}')
      [ "$badsect" = "-" ] && exit 0
  
      echo "Attempting to fix sector $badsect on $baddrive"
      hdparm --repair-sector ${badsect} --yes-i-know-what-i-am-doing $baddrive
      echo "Continuing test"
  done

''hdparm --repair-sector'' overwrites the sector with zeros, so the data stored in it is lost. On ZFS with a RAIDZ filesystem, a scrub is needed afterwards to replace the bad data from redundancy:

  zpool scrub poolname
  zpool status -v poolname

  * [[https://github.com/hradec/fix_smart_last_bad_sector]]
  * [[https://raw.githubusercontent.com/unxed/fixhdd/master/fixhdd.py]]
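The manual procedure above depends on reading ''LBA_of_first_error'' from the newest entry of the self-test log. A minimal sketch of that extraction follows, assuming the log layout shown in the examples on this page. The names ''first_bad_lba'' and ''sample_log'' are hypothetical and exist only for this illustration. The sketch takes the last field of the ''# 1'' row instead of a fixed column number, so it does not depend on how many words the status text contains.

```shell
#!/bin/bash
# Hypothetical helper: print the LBA_of_first_error of the newest self-test
# log entry ("# 1"), or "-" when that entry reports no read error.
# It expects `smartctl -l selftest` output on stdin.
first_bad_lba() {
    awk '$1 == "#" && $2 == "1" { print $NF; exit }'
}

# Sample log rows copied from the output on this page (no drive is touched).
sample_log='# 1  Short offline       Completed: read failure       90%     60539         974041815
# 2  Short offline       Completed without error       00%     60516         -'

printf '%s\n' "$sample_log" | first_bad_lba   # prints 974041815
```

On a real system the same function could be fed with ''smartctl -l selftest /dev/sdb | first_bad_lba''; a result of ''-'' would mean that the last test found no unreadable sector.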