The final casualties
The next victim totally had it coming, but it still deserves our respect. Bow your head in a moment of silence for the second HyperX 3K.
SandForce-based SSDs like the HyperX (and the Intel 335 Series) use write compression to shrink the flash footprint of incoming data. To prevent this feature from tainting the results of the experiment, we tested the drives with incompressible data. We also hammered a second, identical HyperX with compressible data that would cooperate with SandForce's special sauce. This twin was fed a diet of 46% incompressible data from Anvil's Storage Utilities, the application used to accumulate writes and test performance.
From the very beginning, the second HyperX's compressible payload measurably reduced the volume of writes committed to the NAND. The following plot shows the host and compressed writes accumulated by both HyperX drives. Host writes denote data written by the system, while compressed writes represent the corresponding impact on the flash.
The incompressible HyperX wrote slightly more data to the flash than it received from the host, an expected result given the low write amplification of our sequential workload. Meanwhile, its compressible twin wrote 28% less to the NAND.
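For anyone who wants to run the same arithmetic on their own drive's counters, here's a minimal Python sketch of the host-versus-flash comparison. It assumes the 28% figure is measured against host writes. The function names and the sample numbers are hypothetical, chosen only to illustrate the math; they are not readings from the experiment.

```python
def nand_write_ratio(host_writes_tb: float, nand_writes_tb: float) -> float:
    """Return flash writes per unit of host writes.

    A value below 1.0 means compression outweighed write amplification.
    """
    if host_writes_tb <= 0:
        raise ValueError("host_writes_tb must be positive")
    return nand_writes_tb / host_writes_tb


def host_writes_for_nand_budget(nand_budget_tb: float, ratio: float) -> float:
    """Host writes needed to consume a given volume of flash writes,
    assuming the ratio stays constant over the drive's life."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return nand_budget_tb / ratio


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the experiment.
    ratio = nand_write_ratio(host_writes_tb=100.0, nand_writes_tb=72.0)
    savings_pct = (1.0 - ratio) * 100.0
    print(f"NAND/host ratio: {ratio:.2f} ({savings_pct:.0f}% less written to flash)")

    budget_tb = 720.0  # hypothetical volume of flash writes
    needed = host_writes_for_nand_budget(budget_tb, ratio)
    print(f"Host writes to consume {budget_tb:.0f} TB of flash: {needed:.0f} TB")
```

The second function shows why a compressible workload stretches a drive's life: at a fixed ratio below 1.0, it takes proportionally more host writes to burn through the same amount of flash.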
As the graph illustrates, the compressible HyperX didn't hit the same volume of flash writes that killed its sibling until around 1.1PB. The drive evidently wasn't ready to go quietly into the night, either. It went on to write another freaking petabyte before failing. To get a sense of how far the drive exceeded its life expectancy, check the next plot of the life-remaining attribute:
The life attribute takes compression into account, so it's clear this HyperX survived on more than just SandForce mojo. The low number of reallocated sectors suggests that the NAND deserves much of the credit. Like all semiconductors, flash memory chips produced by the same process—and even cut from the same wafer—can have slightly different characteristics. Just like some CPUs are particularly comfortable at higher clock speeds and voltages, some NAND is especially resistant to write-induced wear.
The second HyperX got lucky, in other words.
It also didn't lead a perfect life. On the leg between 900TB and 1PB, the HyperX logged a couple of uncorrectable errors along with its first reallocated sectors. Even two uncorrectable errors are too many, so the HyperX carried on with the same asterisk that was attached to the 840 Series after it suffered the same problem. Not counting correctable program and erase failures, the drive was error-free after that.
The HyperX is designed to keep writing until its flash reserves run out, which seems to be what happened with the first drive. The circumstances surrounding the second's death are obscured by a power outage that struck after 2.1PB of writes. This interruption occurred over the Christmas holidays, while I was away from the lab. The machine booted without issue when I returned, but it hard-locked as soon as I tried to access the HyperX, and the drive wasn't detected after a subsequent reboot. Attempts to recover data and SMART stats also failed.
With the data available, it's impossible to tell whether the outage precipitated the failure or occurred after it. To the HyperX's credit, messages warning of impending failure started appearing after the life attribute flattened out, long before the drive's eventual demise.
And so the Samsung 840 Pro soldiered on as the last SSD standing.
The 840 Pro was among the most well-behaved drives in the experiment. It remained free of uncorrectable errors until the very end, and it accumulated reallocated sectors at a surprisingly consistent rate.
Reallocated sectors started appearing in volume after 600TB of writes. Through 2.4PB, the Pro racked up over 7000 reallocated sectors totaling 10.7GB of flash. Samsung's Magician utility gave a clean bill of health, though, and the used-block counter showed ample reserves to push past 2.5PB:
As I prepared to leave the drive unattended during a week-long vacation at the end of February, I thought, "what could possibly go wrong?" Famous last words.
When I logged into the endurance test rig upon returning last week, Anvil's Storage Utilities were unresponsive, as was HD Sentinel, the program used to pull SMART data from the drives. The interfaces for both applications were blank, and Windows Explorer crashed when I tried to access the 840 Pro. Then a message from Intel's SATA drivers appeared to say that the drive was no longer connected to the system. The 840 Pro took its last gasp in my arms—or, rather, at my fingertips—and it's been completely unresponsive.
As with the demise of Samsung's TLC-based 840 Series, death struck without warning or mercy. A sudden burst of flash failures may have been responsible.
Before moving on to the performance analysis on the next page, I should note that the 840 Pro exhibited a curious inflation of writes associated with the power outage after 2.1PB. The SMART attributes indicate an extra 38TB of host writes during that period, yet Anvil's logs contain no evidence of the additional writes. Weird. Maybe the SMART counter tripped up when the power cut out unexpectedly.