bad sectors

SMART test

There are three types of self-tests that a device can execute (all are safe for user data):

  • Short: runs tests that have a high probability of detecting device problems,
  • Extended or Long: the same as the short test, but with no time limit and with a complete examination of the disk surface,
  • Conveyance: checks whether the device was damaged during transportation.

Run a short test in foreground (captive) mode:

smartctl -t short -C /dev/sdd

View test results:

sudo smartctl -l selftest /dev/sdd
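
When scripting around the self-test log, the LBA of the most recent failure can be pulled out with awk. This is a minimal sketch, assuming the ATA log layout printed by smartctl (the newest entry starts with "# 1" and the LBA is the last column). The function name first_error_lba is illustrative and not part of smartctl.

```shell
# Print the LBA_of_first_error column of the newest self-test log entry.
# Reads `smartctl -l selftest` output on stdin.
# Prints "-" when the newest entry records no error LBA,
# and prints nothing when no "# 1" entry is found.
first_error_lba() {
  awk '/^# *1 / { print $NF; exit }'
}
```

Possible usage: sudo smartctl -l selftest /dev/sdd | first_error_lba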

Repair sector

smartctl -a /dev/sdb
 
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     60539         974041815
# 2  Short offline       Completed without error       00%     60516         -
# Try to read the bad sector:
hdparm --read-sector 974041815 /dev/sdb
 
/dev/sdb:
reading sector 974041815: SG_IO: bad/missing sense data, sb[]:  70 00 03 00 00 00 00 0a 40 51 e0 01 11 04 00 00 a0 d7 00 00 00 00 00 00 00 00 00 00 00 00 00 00
succeeded
 
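# Overwrite the bad sector with zeros so the drive can repair or remap it (the data in that sector is lost):
hdparm --yes-i-know-what-i-am-doing --repair-sector 974041815 /dev/sdb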
 
# Test rest of disk:
smartctl -t select,974041815-max /dev/sdb
 
smartctl -l selftest /dev/sdb
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Selective offline   Completed: read failure       90%     60550         974041843
# 2  Short offline       Completed: read failure       90%     60539         974041815
 
hdparm --yes-i-know-what-i-am-doing --repair-sector 974041843  /dev/sdb
 
# Test rest of disk, starting from the sector that was just repaired:
smartctl -t select,974041843-max /dev/sdb

A ready-made script that automates the steps above, taken from "How to use hdparm to fix a pending sector?" (fixed version):

#!/bin/bash -u
baddrive=/dev/sdb
badsect=974041815
while true; do
  echo Testing from LBA $badsect
  # Discard both stdout and stderr (stdout must be redirected first).
  smartctl -t select,${badsect}-max ${baddrive} > /dev/null 2>&1
 
  echo "Waiting for test to stop (each dot is 5 sec)"
  while [ "$(smartctl -a ${baddrive} | awk '/Self-test execution status:/ {print $5}')" = "249)" ]; do
    echo -n .
    sleep 5
  done
  echo
 
  echo "Waiting for the result to appear in the self-test log (each dot is 5 sec)"
  while [ "$(smartctl -l selftest ${baddrive} | awk '/^# 1/{print substr($5,1,9)}')" != "Completed" ]; do
    echo -n .
    sleep 5
  done
  echo
 
  badsect=$(smartctl -l selftest ${baddrive} | awk '/# 1  Selective offline   Completed/ {print $10}')
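  # Stop when the test passed ("-") or when no LBA could be parsed from the log.
  if [ -z "$badsect" ] || [ "$badsect" = "-" ]; then exit 0; fi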
 
  echo Attempting to fix sector $badsect on $baddrive
  hdparm --repair-sector ${badsect} --yes-i-know-what-i-am-doing $baddrive
  echo Continuing test
done
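
The first wait loop in the script compares the execution status against "249)" only, which corresponds to a test that is running with 90% remaining. The sketch below checks for any "in progress" value instead. It rests on an assumption that should be verified against your drive's output: the status byte has its high nibble set to 0xF (values 240 and above) while a self-test is running, with the low digit giving the tenths still remaining. The function name selftest_running is hypothetical.

```shell
# Succeed (exit 0) while the "Self-test execution status" value reported by
# `smartctl -a` indicates a running test. Reads smartctl output on stdin.
# Assumption: status values >= 240 (high nibble 0xF) mean "in progress".
selftest_running() {
  awk '/Self-test execution status:/ {
         gsub(/[^0-9]/, "", $5)   # "249)" -> "249"
         s = $5 + 0
         found = 1
         exit                      # jump to END with the values set
       }
       END { if (found && s >= 240) exit 0; exit 1 }'
}
```

Possible usage in place of the first loop: while smartctl -a "$baddrive" | selftest_running; do echo -n .; sleep 5; done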

For ZFS with a RAIDZ filesystem, a scrub is needed to replace the bad data:

zpool scrub poolname
zpool status -v poolname
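
Once the scrub has finished, the result can be checked from a script. This is a minimal sketch, assuming that zpool status ends its report with the line "errors: No known data errors" when the pool is clean. The function name pool_is_clean is illustrative.

```shell
# Succeed (exit 0) only if the zpool status output on stdin contains the
# "errors: No known data errors" line. Any other "errors:" line, or a
# missing one, is treated as not clean.
pool_is_clean() {
  grep -q '^errors: No known data errors$'
}
```

Possible usage: zpool status -v poolname | pool_is_clean || echo "pool reports data errors"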

https://github.com/hradec/fix_smart_last_bad_sector
https://raw.githubusercontent.com/unxed/fixhdd/master/fixhdd.py