The SSD Endurance Experiment: Two freaking petabytes

Geoff Gasior, Former Managing Editor

More than a year ago, we drafted six SSDs for a suicide mission. We were curious about how many writes they could survive before burning out. We also wanted to track how each one’s performance characteristics and health statistics changed as the writes accumulated. And, somewhat morbidly, we wanted to watch what happened when the drives finally expired.

Our SSD Endurance Experiment has left four casualties in its wake so far. Representatives from the Corsair Neutron Series GTX, Intel 335 Series, Kingston HyperX 3K, and Samsung 840 Series all perished to satisfy our curiosity. Each one absorbed far more damage than its official endurance specification promised—and far more than the vast majority of users are likely to inflict.

The last victim fell at 1.2PB, which is barely a speck in the rear-view mirror for our remaining subjects. The 840 Pro and a second HyperX 3K have now reached two freaking petabytes of writes. To put that figure into perspective, the SSDs in my main desktop have logged less than two terabytes of writes over the past couple of years. At this rate, it’ll take me well over a thousand years to reach that total.

So, yeah. Pretty insane. It’s time for another check-up.

The story so far
If this is your first encounter with our endurance experiment, I recommend reading this introductory article. It has more details about our subjects, methods, and test rigs than we’ll rehash here. Here’s the TL;DR version:

The experiment explores a weakness inherent to the very core of flash memory. NAND stores data by trapping electrons inside billions of individual memory cells. The cells are walled off by an insulating layer that normally prevents electrons from getting in or out. Applying voltage to a cell induces electron flow through that barrier via a process called tunneling. Electrons are drawn in when data is written and expelled when data is erased.

Tunneling is a pretty slick feat of nanoscale engineering, but it comes at a cost. The accumulated traffic slowly breaks down the physical integrity of the insulator, degrading its ability to trap electrons in the cell. Some electrons also get caught in the insulator, imparting a negative charge that narrows the cell’s usable voltage range. The more that window shrinks, the more difficult it is to read and write data reliably—and quickly.

When cells become more trouble than they’re worth, fresh blood is called up from the SSD’s overprovisioned “spare” area. These replacement cells ensure the drive maintains the same user-accessible capacity regardless of any underlying flash failures.

Although all SSDs are living on borrowed time, they can take different paths to the end of the road. Intel’s 335 Series is designed to go out on its own terms, after a pre-determined volume of writes. Ours took its own life after 750TB—but not before its wear indicator bottomed out and multiple SMART warnings were issued.

Our first HyperX 3K only made it to 728TB. Unlike the 335 Series, which was almost entirely free of failed flash, the HyperX reallocated nearly a thousand sectors before it ultimately expired. Again, though, the wear indicator and SMART warnings provided plenty of notice that the end was nigh.

All but a few of the HyperX’s reallocated sectors hit after 600TB of writes. The Samsung 840 Series started reporting reallocated sectors after just 100TB, likely because its TLC NAND is more sensitive to voltage-window shrinkage than the MLC flash in the other SSDs. The 840 Series went on to log thousands of reallocated sectors before veering into a ditch on the last stretch before the petabyte threshold. There was no warning before it died, and the SMART attributes said ample spare flash lay in reserve. The SMART stats also showed two batches of uncorrectable errors, one of which hit after only 300TB of writes. Even though the 840 Series technically made it past 900TB, its reliability was compromised long before that.

Corsair’s Neutron GTX was our most recent casualty. Despite being the picture of health up to 1.1PB, it suffered a rash of flash failures over the next 100TB. SMART errors also began to appear, foretelling the drive’s imminent doom. The Neutron ultimately reached 1.2PB, and it completed the usual round of tests at that milestone. However, it failed to power up properly after a subsequent reboot.

After the Neutron GTX failed to answer the bell, the 840 Pro and second HyperX 3K pressed on to 2PB without issue. They also completed their fifth unpowered retention test. This time, the SSDs were left unplugged for 10 days. Both maintained the integrity of our 200GB test file.
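The article doesn't say which tool or hash verifies the test file after each unpowered stint, so treat the following as a minimal sketch of one common approach rather than our actual procedure: record a checksum before pulling the plug, then recompute it afterward. The function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    Chunked reads keep memory use flat even for a very large test file.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def retained(path, expected_digest):
    """True if the file still matches the digest recorded before unplugging."""
    return file_digest(path) == expected_digest
```

In use, one would store the result of `file_digest()` somewhere off the drive under test, leave the SSD unpowered for the retention period, and then call `retained()` with the stored digest.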

To be fair, the official JEDEC specs require that drives accurately retain data for much longer unpowered periods. We had to make a few concessions to accelerate the timeline for this experiment.

Our two remaining subjects have passed the same retention tests and absorbed the same volume of writes, but their individual stories are very different. On the next page, we’ll take a closer look at how each one is coping with the continuous barrage of incoming data.

 

The war of attrition continues
We’ll start with the Samsung 840 Pro, which has an unblemished record despite mounting reallocated sectors.

Flash failures started piling up after 600TB of writes. Apart from a few undulations, the retirement rate has been fairly consistent since.

The 840 Pro is now up to 5591 reallocated sectors, which translates to over 8GB of flash. That may sound like a lot, but it’s only 3% of the drive’s 256GB total. The SMART data indicates that we’re only 61% into the used block reserve.
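For readers who want to check the math, here is a rough sketch of how the sector count maps to retired flash. The 1.5MB-per-sector figure is an assumption: it isn't stated in this update, but it is consistent with 5591 sectors adding up to "over 8GB." The function names are hypothetical.

```python
# Figures quoted in the article for the 840 Pro at 2PB.
REALLOCATED_SECTORS = 5591
DRIVE_CAPACITY_GB = 256

# Assumption: each reallocated sector retires 1.5MB of flash.
ASSUMED_SECTOR_MB = 1.5


def retired_flash_gb(sectors, sector_mb=ASSUMED_SECTOR_MB):
    """Convert a reallocated-sector count into gigabytes of retired flash."""
    return sectors * sector_mb / 1024


def retired_share(sectors, capacity_gb=DRIVE_CAPACITY_GB,
                  sector_mb=ASSUMED_SECTOR_MB):
    """Fraction of the drive's total capacity that has been retired."""
    return retired_flash_gb(sectors, sector_mb) / capacity_gb


gb = retired_flash_gb(REALLOCATED_SECTORS)
share = retired_share(REALLOCATED_SECTORS)
print(f"{gb:.1f}GB retired, {share:.1%} of capacity")
```

Under that assumption, the result lands a little above 8GB and right around 3% of the drive, matching the figures above.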

That reserve counter seems to be the best gauge for the 840 Pro’s remaining life. The wear-leveling count is supposed to be related to drive health, but it expired 1.5PB ago.

Given what’s supposedly still in the tank, 3PB doesn’t seem impossible. The “good” health rating reported by Samsung’s Magician software is encouraging, too, though it’s hard to put a lot of faith in that assessment. Our failed 840 Series had the same health rating before its sudden demise. Unlike its fallen sibling, the 840 Pro has at least remained completely free of uncorrectable errors.

Before digging into the Kingston HyperX’s vital signs, we should point out that this drive is running a different race than the 840 Pro and the other SSDs. The HyperX is based on a SandForce controller that compresses incoming writes to reduce flash wear (and to accelerate performance). Thanks to this DuraWrite mojo, the HyperX has squeezed the experiment’s 2PB of host writes into just 1.4PB of flash writes.

Now, that compression ratio is only applicable to our particular write workload. The surviving HyperX has been getting a stream of sequential data using the 46% incompressible “applications” setting in Anvil’s Storage Utilities. To ensure an even playing field, completely incompressible data has been used for the other drives, including the first HyperX. The graph below illustrates the impact that difference has on the HyperX’s compressed writes attribute, which tracks the true flash footprint of inbound data.

Write compression clearly isn’t the only factor responsible for the remaining HyperX’s survival. If that were the case, the drive would have quit around 1.1PB, when its flash writes matched those of its deceased twin. Digging deeper into the SMART data provides some additional insight on this candidate’s exceptional endurance.
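The crossover point can be roughed out from the numbers already quoted. This back-of-the-envelope sketch assumes the compression ratio held constant for the whole run and that the first HyperX's flash writes roughly equaled its 728TB of incompressible host writes. Both are simplifications, and the function names are made up for illustration.

```python
# Figures quoted in the article.
HOST_WRITES_TB = 2000         # 2PB of host writes to the surviving HyperX
FLASH_WRITES_TB = 1400        # 1.4PB of compressed flash writes
FIRST_HYPERX_DEATH_TB = 728   # first HyperX, fed incompressible data


def compression_ratio(flash_tb, host_tb):
    """Flash writes per unit of host writes (lower means more compression)."""
    return flash_tb / host_tb


def host_writes_for_flash(flash_tb, ratio):
    """Host writes needed to generate a given volume of flash writes."""
    return flash_tb / ratio


ratio = compression_ratio(FLASH_WRITES_TB, HOST_WRITES_TB)
crossover_tb = host_writes_for_flash(FIRST_HYPERX_DEATH_TB, ratio)
print(f"ratio {ratio:.2f}, crossover near {crossover_tb / 1000:.2f}PB of host writes")
```

With these rounded inputs, the estimate comes out a little over 1PB, in the same neighborhood as the crossover cited above. The exact point depends on the drive's actual compressed writes attribute rather than on rounded totals.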

The lifetime attribute leveled out long ago, triggering a warning that the drive is in a “pre-failure” state. Despite that ominous message, flash failures have been few and far between. Only 31 reallocated sectors have been reported through 2PB of writes, which translates to a mere 124 megabytes of failed flash. The death toll has risen slightly since we last checked in, but the total still represents no more than a blip.

Less than half of the reallocated sectors have been prompted by program or erase failures. The HyperX recovered gracefully from those hiccups, but it also logged two uncorrectable errors just before reaching 1PB. Uncorrectable errors can compromise data integrity, so we recommend taking SSDs out of service if any appear. While the HyperX remains in the experiment, a black mark taints its permanent record—and an asterisk denotes its compressible payload.

Caveats aside, there’s no denying that the flash in this particular unit is incredibly robust. Since the HyperX is designed to keep writing data until its sector reserves are exhausted, this one may have a lot of life ahead. Then again, failure could be just around the corner. The other Kingston SSD went from 10 reallocated sectors to nearly 1000 over its last 128TB of writes.

With our health check-up complete, it’s time to see if the aging survivors can keep up with their former, fresher selves. On to the benchmarks!

 

Performance
We benchmarked all the SSDs before we began our endurance experiment, and we’ve gathered more performance data after every 100TB of writes since. It’s important to note that these tests are far from exhaustive. Our in-depth SSD reviews are a much better resource for comparative performance data. What we’re looking for here is how each SSD’s benchmark scores change as the writes add up.

As the experiment progresses, the 840 Pro is becoming somewhat more prone to slower performance in Anvil’s sequential write speed test. Those slowdowns have been relatively minor so far, and they’re still inconsistent. For example, the 840 Pro didn’t skip a beat during its last two benchmarking sessions.

The minor variance in the HyperX’s random read scores doesn’t seem to be related to wear. That drive has otherwise performed consistently, a common trend throughout the experiment.

Unlike our first batch of results, which was obtained on the same system after secure-erasing each drive, the next set comes from the endurance test itself. Anvil’s utility lets us calculate the write speed of each loop that loads the drives with random data. This test runs simultaneously on six drives split between two separate systems (and between 3Gbps SATA ports for the HyperX drives and 6Gbps ones for the others), so the data isn’t useful for apples-to-apples comparisons. However, it does provide a long-term look at how each drive handles this particular write workload.

Again, there’s some evidence that the 840 Pro’s write performance is slowing slightly. While the average write speed per run has oscillated wildly since the beginning, the peaks have been a little lower over the past 300TB.

With the exception of regularly spaced spikes associated with secure-erasing the SSDs before each round of benchmarks, the HyperX has maintained steady write speeds since the beginning. Credit compressible data for the drive’s performance advantage over its incompressible counterpart.

 

Until the last SSD standing
I knew it was possible for some of the SSDs in our endurance experiment to survive 2PB of writes, but I didn’t really expect any of them to make it this far. Two petabytes is a staggering amount of data for consumer-grade drives.

To be fair, our sample size is too limited to draw definitive conclusions about the drives we tested. Flash wear is tied to the physical integrity of individual cells, so it can be influenced by normal semiconductor manufacturing variances. One needs to look no further than the experiment’s twin HyperX units to see that, even within the same family, some SSDs simply have more durable NAND than others.

The results of our experiment do, however, point to some more general conclusions about SSDs as a whole. Although only two drives made it to 2PB, all six wrote hundreds of terabytes without issue, vastly exceeding their official endurance specifications. More importantly, the drives all survived far more writes than most users are likely to generate. Typical consumers shouldn’t worry about exceeding the endurance of modern SSDs.

With 2PB in the bag, our survivors are already on the lonely road to the next milestone. Their ongoing battle reminds me a little of the Iron War, an infamous showdown between Dave Scott and Mark Allen during the 1989 Ironman triathlon world championship. After matching each other through the race’s 2.4-mile swim and 112-mile bike, the two legends ran side-by-side for much of the marathon that followed. Allen ended up pulling ahead in the final miles to win the eight-hour race by less than a minute.

Yeah, I’ve written enough of these endurance updates that I’m now tapping the well of obscure sports references to keep things fresh. But the Ironman is all about endurance, just like this experiment.

Right now, it’s hard to say which of our remaining subjects will be the last SSD standing. As the lone survivor to remain free of serious errors, the 840 Pro is already a victor of sorts. The question is whether it can outlast the last HyperX, which refuses to give up despite stumbling through a couple of uncorrectable errors. The HyperX has write compression on its side and plenty of spare flash in reserve, so the final duel could go on for a while. We’ll be watching.

Geoff Gasior Former Managing Editor

Geoff Gasior, a seasoned tech marketing expert with over 20 years of experience, specializes in crafting engaging narratives that connect people with technology. At Tech Report, he excelled in editorial management, covering all aspects of computer hardware and software and much more.

Gasior's deep expertise in this field allows him to effectively communicate complex concepts to a wide range of audiences, making technology accessible and engaging for everyone.