
Is SMART Really Useful?

Having been in technology for a long time, I have seen my fair share of disk failures. However, I have never seen a single instance where SMART issued a warning early enough to back up the data on a failing disk. The following is an example of this in action.

Toshiba MQ01ABD050

Here is a 2.5″ Toshiba MQ01ABD050 500GB disk drive. This unit was made in 2014, but has a very low power-on time of ~8 months, with the heads loaded onto the platters for only ~5 months of that, since it has been used to store offline files. The disk was working perfectly when it was last plugged in a few weeks ago, but today, within seconds of starting to transfer data, it began slowing down, then stopped entirely. A quick look at the SMART stats showed over 4000 reallocated sectors, so a full scan was initiated.

SMART Test Failure

After the couple of hours an extended test takes, the firmware had found a total of 16,376 bad sectors, of which 10K+ were still pending reallocation. Just after the test finished, the disk began making the usual clicking sound of the head actuator losing lock on the servo tracks. Yet SMART was still insisting that the disk was OK! In total, about 3 hours passed between first power-up & the disk failing entirely. This is possibly the most sudden disk failure I’ve seen so far, but SMART didn’t even twig from the huge number of sector reallocations that something was amiss. I don’t believe the platters are at fault here; a head fault or preamp failure is most likely, as I don’t think platters can catastrophically fail this quickly. I expected SMART to at least flag that the drive was in a bad state once its self-test completed, but nope.
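Since the overall health verdict can stay at OK while the raw counters climb, one option is to watch the raw counts directly. The following is a minimal Python sketch, assuming the attribute table layout printed by `smartctl -A` (ten columns, raw value last). The attribute names are the ones smartctl commonly reports, the sample table is made up for illustration rather than captured from this disk, and the function names and the zero-tolerance limit are my own assumptions.

```python
import re

# Attributes whose raw counts are worth watching directly
# (names as commonly printed by smartctl; an assumption for this sketch).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

# Hypothetical sample in `smartctl -A` layout; the values are illustrative only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       4096
  9 Power_On_Hours          0x0032   086   086   000    Old_age   Always       -       5800
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       10240
"""


def parse_raw_counts(table_text):
    """Return {attribute_name: raw_value} for the watched attributes."""
    counts = {}
    for line in table_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            # Keep only the leading digits of the raw value column.
            match = re.match(r"\d+", fields[9])
            if match:
                counts[fields[1]] = int(match.group())
    return counts


def raw_count_warnings(counts, limit=0):
    """List every watched attribute whose raw count exceeds `limit`."""
    return [
        f"{name} raw count is {value}"
        for name, value in sorted(counts.items())
        if value > limit
    ]


if __name__ == "__main__":
    for warning in raw_count_warnings(parse_raw_counts(SAMPLE)):
        print("WARNING:", warning)
```

Treating any non-zero count as a warning is deliberately pessimistic; a looser limit would cut noise at the cost of later notice.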

Internals

After pulling the lid on this disk to see if there’s any evidence of a head crashing into a platter, there’s nothing – at least on a macroscopic scale, the single platter is pristine. I’ve seen disks crash to the point where the coating has been scrubbed from the platters so thoroughly that they’ve been returned to the glass discs they started off as, with the enclosure packed full of fine black powder that used to be the data layer, but there’s no indication of mechanical failure here. Electronic failure is looking very likely.

Clearly, relying on SMART to alert when a disk is about to take a dive is unwise. Replacing drives after a set period is much better insurance if they are used for critical applications. Of course, keeping current backups is always a good idea, no matter the age of the drive.
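A set-period replacement policy is simple enough to sketch in Python. This is only an illustration: the function name and both limits (3 years in service, or the equivalent in power-on hours) are arbitrary assumptions of mine, not recommendations from any vendor.

```python
from datetime import date


def due_for_replacement(in_service, today, power_on_hours,
                        max_years=3.0, max_hours=26280):
    """Return True once a drive passes either age limit.

    max_years and max_hours are arbitrary example limits (26280 h is
    3 years of continuous running), chosen only for illustration.
    """
    age_years = (today - in_service).days / 365.25
    return age_years >= max_years or power_on_hours >= max_hours


if __name__ == "__main__":
    # A drive put into service in 2014 with few hours on it is still
    # flagged four years later, on calendar age alone.
    print(due_for_replacement(date(2014, 1, 1), date(2018, 1, 1), 5840))
```

Checking calendar age as well as power-on hours matters for drives like the one above, which spend most of their life unpowered.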


Western Digital 160GB 2.5″ HDD

Top Of Drive With Label
Top

This is a Western Digital drive recently removed from my laptop when it died of a severe head crash.
Top of drive can be seen here.

Top Removed From Drive
Top Removed

Here the cover has been removed from the drive, showing the platter, head arm & magnet. Yellow piece top left is head parking ramp.

Head Arm of Drive
Head Arm

The head assembly of the drive is shown here. The head itself is on the left hand end of the arm in the plastic parking ramp. The other end of the arm holds the voice coil part of the head motor, surrounded by the magnet.

Bottom Of Drive with PCB

Bottom of drive, with controller PCB. SATA interface socket at bottom.

PCB removed from bottom of drive. Spindle motor connections & connections to the head unit can be seen on the bottom of the drive unit.

Controller PCB. Supports the cache, interface & motor controller ICs.

Closeup of the motor driver IC. This controls the speed of the spindle motor precisely to 5,400RPM, and also drives the voice coil motor that positions the head arm over the platters.

Interface IC closeup. This IC receives signals from the head assembly & processes them for transmission to the SATA bus. It also holds the drive firmware, and controls the motor driver IC & all other functions of the drive.

Cache Memory IC.