The SSD Endurance Experiment: 22TB update

Flash memory has limited write endurance. So do the SSDs based on it. How many writes can modern drives take before they expire, and what happens to them as the flash wears out? We’re trying to find out by testing a selection of SSDs to failure. You can read all the nitty-gritty details about the experiment in this introductory article. Today, we’re checking in on our subjects after 22TB of writes.

22TB might seem like an odd place to pause our testing, but it’s a close match for the endurance specification attached to the Intel 335 Series. That drive is rated for 20GB of writes per day for three years. Do the math.

The Intel 335 Series is one of only two drives in our lineup with a published endurance rating. The other, the Kingston HyperX 3K, is supposed to withstand 192TB of writes. Since it'll take a while to push into triple-digit terabyte territory, we decided to stop at 22TB for a quick check-up before pressing on.
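
For anyone who wants to check that math, here's a quick sketch of the rating arithmetic. It assumes decimal units (1TB = 1000GB), and applying the 20GB/day figure to the HyperX 3K is purely our illustration, not part of Kingston's spec.

```python
# A quick sketch of the endurance-rating arithmetic quoted above.
# Assumes decimal units (1TB = 1000GB); applying the 20GB/day figure to the
# HyperX 3K is only for illustration, not part of Kingston's spec.
GB_PER_DAY = 20                      # Intel 335 Series rating
DAYS = 365 * 3                       # three-year warranty period

intel_rating_tb = GB_PER_DAY * DAYS / 1000
print(f"Intel 335 Series: ~{intel_rating_tb:.1f}TB over three years")        # ~21.9TB

hyperx_years = 192 * 1000 / GB_PER_DAY / 365
print(f"Kingston HyperX 3K: 192TB at 20GB/day is ~{hyperx_years:.0f} years")  # ~26 years
```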

Unfortunately—or fortunately, depending on your perspective—we have little to report at this juncture. Anvil’s endurance test wrote 22TB to each SSD, and all the drives passed the benchmark’s built-in data integrity tests without issues. The SSDs didn’t reach the 22TB mark at the same time, though. Getting there required a little less than three days of non-stop testing on the slowest drive. Here are the average write speeds for the first batch of tests.

Now for the requisite sprinkling of salt: to expedite our experiment, we’re running the endurance benchmark on all the drives simultaneously. The SSDs are split between two different systems and also between 6Gbps and 3Gbps SATA ports, so the average speeds above aren’t entirely comparable. That said, the results provide us with a baseline that may be helpful in assessing the write speeds for subsequent endurance runs. Plus, the numbers give us a sense of how long the next round of testing will take.

(For reference, the Neutron GTX, Intel 335 Series, and the HyperX 3K are all housed in the same system. Another, identical rig contains the Samsung SSDs and a second HyperX 3K that’s being tested with compressible data rather than the incompressible payload used for the others. The HyperX 3K SSDs are both connected to 3Gbps SATA ports, while the rest are plugged into 6Gbps ones.)

We can draw more meaningful conclusions from our targeted performance tests. For these benchmarks, the SSDs are tested individually using the same SATA port on the same system. This method should produce data more appropriate for head-to-head comparisons, but we’re not particularly concerned with how the drives perform relative to each other in this limited collection of synthetic tests. Instead, we’re interested in how each SSD’s basic performance characteristics change as the flash wears out.

So far, we haven’t observed too much of note. We benched the SSDs before endurance testing began and again after we reached the 22TB mark. The performance differences are summarized below.

The vast majority of our most recent results are within 1-2% of the factory-fresh readings. Given the run-to-run variance associated with these tests, I wouldn’t worry about such small differences—not unless they become part of a long-term trend.

Surprisingly, the HyperX SSDs got a lot faster in the random read speed test. You shouldn’t need to break in an SSD before it starts delivering peak performance, but the Kingston drives apparently didn’t kick into high gear right away. The Intel 335 Series uses the same SandForce controller as the HyperX drives; it also had higher random read performance after 22TB, although only by 8%. Perhaps we’re looking at a quirk of the controller or its associated firmware.

We expect flash wear to decrease SSD performance over time, but these drives still have a lot of life left in them. Each SSD has SMART attributes that tally bad blocks, bytes written, and other variables. We’re tracking those attributes, and the SSDs are so far free from bad blocks, which means all of their NAND remains intact.
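
For readers who want to watch the same attributes on their own drives, here's a minimal monitoring sketch built on smartmontools' smartctl. The device path and the attribute names are assumptions; vendors label the relevant attributes differently, so check your drive's own output first.

```python
# Minimal SMART-polling sketch using smartmontools (assumes smartctl is
# installed and the drive is /dev/sda; attribute names vary by vendor).
import subprocess

ATTRS = ("Reallocated_Sector_Ct", "Total_LBAs_Written", "Wear_Leveling_Count")

def read_wear_attributes(device="/dev/sda"):
    """Return the raw values of a few wear-related SMART attributes."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True, check=True).stdout
    values = {}
    for line in out.splitlines():
        fields = line.split()
        # Attribute rows: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] in ATTRS:
            values[fields[1]] = fields[9]
    return values

if __name__ == "__main__":
    print(read_wear_attributes())
```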

The data we’ve collected also provide some insight into SandForce’s write compression mojo. Again, we’re testing two SandForce-based HyperX drives: one with incompressible data like the other SSDs, and another with the endurance benchmark’s 46% “applications” compression setting. According to the “lifetime compressed writes” SMART attribute on the HyperX drives, the incompressible data produced 22.8TB of flash writes. The compressible data wrote only 15.5TB to the flash, a savings of 32%.

We can compare those totals to the host write tallies in order to get a sense of write amplification. Both HyperX drives report 21.6TB of host writes, resulting in write amplification factors of 1.05 for the incompressible data and 0.72 for compressible data. The Intel 335 Series registers 22.9TB of total NAND writes on 21.6TB of host writes, closely matching the write amplification of its HyperX counterpart. Unfortunately, the other SSDs don’t track NAND writes in addition to host writes, preventing us from calculating their write amplification factors.
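
As a quick illustration, here's how those factors fall out of the SMART totals quoted above; small differences from the figures in the text come from the one-decimal rounding of the reported terabyte counts.

```python
# Write amplification = NAND (flash) writes / host writes, using the SMART
# totals quoted above. The terabyte counts are rounded to one decimal, so the
# computed factors can differ from the quoted ones in the last digit.
host_tb = 21.6

drives = {
    "HyperX 3K (incompressible)": 22.8,
    "HyperX 3K (compressible)":   15.5,
    "Intel 335 Series":           22.9,
}

for name, nand_tb in drives.items():
    print(f"{name}: write amplification = {nand_tb / host_tb:.2f}")

# Flash-write savings from SandForce compression: 1 - 15.5/22.8, roughly 32%
savings = 1 - drives["HyperX 3K (compressible)"] / drives["HyperX 3K (incompressible)"]
print(f"Compression savings: {savings:.0%}")
```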

Our endurance experiment is still in its infancy, so that’s all the analysis we’ll indulge for now. Next on the agenda: another 78TB of writes to bring the drives up to 100TB. We’ll evaluate the SSDs again at that point, but we may not report back with additional results until something exciting happens. It could take hundreds of terabytes for the first cracks to appear.

Comments closed
    • glugglug
    • 6 years ago

    It’s been 33 days since the last update. No 100TB update yet?

    That’s 792 hours for the drives to write another 78TB. Which would be a write rate of only 28.6MB/s.

    Did one of the SSDs really get that slow?

    • MarkG509
    • 6 years ago

    Any updates? Is the test still in progress? Should I just be patient? Has anything died yet? Does SMART indicate anything about having eaten into the over-provisioning?

    I need another SSD (but admittedly only about as much as the SO needs new shoes :), and now that the 25nm flash Crucial M4s (~15k rewrite cycles) are no longer available, and with the M500s somewhere (unspecified) in the low four figures of rewrite cycles, any hard real-world 'data' would help.

    • glugglug
    • 6 years ago

    If I understand the test correctly, there is a serious flaw in the experiment.

    Starting with an empty drive and bringing it to 100% full, there is no write amplification, because every flash page was free to start with, even with the most insanely bad wear-leveling algorithm.

    Trimming it entirely brings it back to initial state, so on the next round, there is again no write amplification (meaning an effective ratio of 1.0 still).

    It would take much longer, but it would be more meaningful to test endurance at various disk-fullness ranges: run one set of drives from 25% full to 50%, then delete files to get them back to 25% and loop; run another set from 50% to 75%, deleting to loop back to 50%; and, for a near-worst-case scenario, fill another set of drives completely and only delete enough files to bring them back to 75% each loop. A 0-25% test could be skipped; the result would be identical to what you are doing now, I think, and it would take the longest because it is a best-case scenario.

    Ignoring this, can we take the lack of update to mean that your typical consumer drive can sustain the petabytes of writes that SLC drives are rated for? 🙂

    • south side sammy
    • 6 years ago

    were these drives supplied or purchased commercially?

    • StorageOlogist
    • 6 years ago

    Great set of tests. Of course, in real-world implementations folks do a lot to mitigate writes to SSDs. I wrote a blog post discussing some of those techniques and how to protect SSDs in a storage system.

    http://blog.starboardstorage.com/blog/bid/316419/How-to-Protect-Solid-State-SSD-and-Flash-Storage

    • anotherengineer
    • 6 years ago

    “The SSDs didn’t reach the 22TB mark at the same time, though. Getting there required a little less than three days of non-stop testing on the slowest drive.”

    Posted on September the 6th, so about 7.33TB/day, so being the 17th today, all the drives should be around the 100TB mark.

    Any more news from your sweatshop?

      • house
      • 6 years ago

      Geoff said they may not report back at 100TB unless something exciting happens. Barring any major failures, I don't expect a follow-up until several hundred TBs are written.

    • psyclone
    • 6 years ago

    Thanks for doing the test; SSD long-term reliability is something every computer geek likes to know about.

    • Cloef
    • 6 years ago

    I follow this experiment closely. I even started my own SSD endurance test (http://ssdendurancetest.com) just to find out how these SSDs perform in servers/RAID. The main difference is that I don't use TRIM, and I use random, non-repeatable, non-compressible data: 50/50 random and sequential writes. But my random writes are completely random. It took some time to figure out that random writes come in different shapes and forms. I'm not interested in neatly organized random writes that evenly fill up a file with data - that makes life too easy for the NAND controllers. I want chaotic random writes that will overwrite parts of themselves, just like in a database or multi-user environment. I noticed that some of you want to monitor SMART data more closely. I had the same idea, so I started plotting the data in graphs. It gives a great visual health status and makes it easier to spot correlations and signs of imminent failure.

    • iq100
    • 6 years ago

    You wrote: "… Anvil's endurance test – Average write speed" was 100MB/s for the Samsung 840 Pro 256GB …

    BUT reviews state 500MB/s for this new breed of SSDs?
    So, are we saying that steady state is really only 100MB/s, which is achievable by many HDDs?
    I am more interested in sustained continuous write rates than shorter-term burst speeds.
    Is it true that SSDs' long-term sustainable write speeds are not much faster, if at all, than hard drives'?
    Remember, a chain is only as strong as its weakest link. For SSDs/flash, this is the time to erase. Erase times can be hidden during a pause in SSD writes, but only if there is a pause! If there is no such pause, then the entire write chain will run at SSD page erase times!

      • Waco
      • 6 years ago

      The sustained write rates rarely come close to the “maximum write speeds” shown by benchmarks and spec sheets.

      Also recall that this isn’t doing ideal-sized writes AFAIK – most drives only come close to their spec’d speeds with high queue depths and large block sizes.

      • Dissonance
      • 6 years ago

      Anvil’s endurance test writes files of different sizes using a single thread. That’s a very different workload from the synthetic benchmarks that produce sequential transfer rates around 500MB/s. It’s also very different from real-world I/O patterns, at least for typical desktop systems. We noted that in our introductory article.

      We don't recommend drawing conclusions based on a single performance test. The 840 Pro has shown some signs of weakness in the workload associated with Anvil's endurance benchmark, but we've also tested the drive in a wide range of synthetic and real-world tests, and its performance has been excellent overall. I recommend reading our review for more context: https://techreport.com/review/23990/samsung-840-pro-series-ssd-reviewed

      While I agree that a chain is only as strong as its weakest link, a little more nuance is required for this situation. The weakness here isn't necessarily relevant to real-world workloads; it just constrains how quickly we can reach various milestones in our endurance experiment. I'd be far more concerned if the 840 Pro had stumbled in, say, one of our DriveBench 2.0 metrics, which measure disk response time over a trace that comprises nearly two weeks of real-world desktop I/O. Or if it had faltered in FileBench, which measures copy speeds with real-world files instead of randomly generated ones.

    • NovusBogus
    • 6 years ago

    Interesting stuff. You guys should keep the test going in a closet or something because it would be really fun to see what happens after even more writes. There’s not a lot of quantitative data about what exactly happens when SSDs go bad, just a mixture of Panglossian optimism and paranoiac FUD.

    What about logical disk size, have you noticed a significant decrease yet?

      • Dissonance
      • 6 years ago

      We intend to keep these drives writing as long as we can. And yes, they’re even in a closet 😉 No changes in logical disk size.

      • bmntr
      • 6 years ago

      Why would there be changes in disk size?

        • quetzal
        • 6 years ago

        I think the notion here was that as sectors wear out, become bad sectors, and are found and omitted, the usable drive size would decrease.

          • golpebaixo
          • 6 years ago

          http://en.wikipedia.org/wiki/Wear_leveling

          • meerkt
          • 6 years ago

          The spare area isn't exposed as part of the normal drive size. I think once that's exhausted, a drive would sound all SMART alarms, with the expectation that you'd back up and replace it immediately.

    • MarkG509
    • 6 years ago

    Impressively good results so far. But I'd like to second the requests for detailed SMART info to see if any drives are eating into their over-provisioning yet. Some SSD-aware SMART tools attempt to predict the lifetime remaining on a drive.

    I sure hope the outcome of this experiment isn’t to make me lazy about backing up my SSD.

    • dreamer77dd
    • 6 years ago

    I wonder how PCI Express SSDs hold up, too. Very interesting. I also wonder if it's a bug you found in the firmware that speeds up the drive when it should already be fast out of the gate.

      • Farting Bob
      • 6 years ago

      PCIe SSDs use the same chips, just usually more of them, with an extra controller chip sandwiched in between the controller and the PCIe interface. It shouldn't make any difference to endurance provided the flash chips are the same.

    • Saber Cherry
    • 6 years ago

    I'm interested in knowing how well their data retention time is holding up. Perhaps between phases of testing the drives could sit a week turned off to see if they can still be read?

      • albundy
      • 6 years ago

      I don't see why not. How else would they end up sitting on store shelves (or in laptops) for weeks/months before being sold?

        • Saber Cherry
        • 6 years ago

        Writing to flash is a destructive process and gradually reduces retention time.

        • bmntr
        • 6 years ago

        Retention time is from the moment data is written. Even if the drive sits with data for a while, as I understand it, if you read and rewrite the whole drive you effectively refresh it.

      • flip-mode
      • 6 years ago

      I second this.

      • bmntr
      • 6 years ago

      And particularly after exhausting the P/E cycles. The JEDEC spec calls for one year of retention for exhausted cells on consumer drives. I am actually more interested in retention for new and for halfway-used cells, but that's more difficult to test for if it really takes years for errors to start showing up.

      • kilkennycat
      • 6 years ago

      Hey, Geoff, PLEASE reply to this request !!!

      After a couple of weeks vacation away from my well-used SSD-equipped laptop, I sure would like to know whether my OS and data are still fully intact after power-up.

        • indeego
        • 6 years ago

        The results of this study have near-zero applicability to your particular setup. Back up your data.

    • NAND
    • 6 years ago

    there is only 1GB left alive on my 4GB usb thumb after a year of use 🙁

      • Firestarter
      • 6 years ago

      that's probably bargain-bin TLC NAND though

        • travbrad
        • 6 years ago

        Even with bargain-bin NAND I’m surprised that a USB thumb drive degraded so quickly. I have a bargain bin 8GB thumb drive that I’ve probably written 1-2TB of data to, and it’s exactly the same size as when I got it.

          • faramir
          • 6 years ago

          I own a no-name gadget that plays MP3s and doubles as a flash drive. This thing will not store data reliably from the get-go (one test run will finish with 0 errors, next run will return errors in multiple locations and next run will end up with 0 errors again; using identical data pattern on all tries).

    • DPete27
    • 6 years ago

    Question: The 3Gbps bandwidth limit isn’t hindering any SSD’s write speeds, correct?

    Is there any chance the drives hooked up to 3Gbps ports would have an endurance advantage since they're not receiving data as fast? (blocks aren't being overwritten quite as frequently)

      • MadManOriginal
      • 6 years ago

      He’s testing them after an equal amount of writes, not after an equal time, so speed of data written shouldn’t matter.

      • Tjalve
      • 6 years ago

      I ask the same question. Even if the amount of written data is the same, the fact remains that the drives connected to the SATA 2 ports have more "idle time" than the others. Depending on how GC is processed on the drive, this lower speed could give the drive lower WA and thereby increase its lifespan, making this test invalid.
      The problem is that there is no way to be sure whether lower write speed affects WA. The SF drives could calculate the WA, but none of them are capable of 300MB/s or more of compressible data.

      Regarding the low scores of the 840 Pro: since consumer drives are not designed for continuous writes (only servers and high-end workstations can produce those over longer periods of time), the drive is designed to deliver high burst speeds. Personally, I would go for a drive that has lower peak performance but better "worst case" performance, like the Neutron GTX, Seagate 600, or SanDisk Extreme 2.

      I'm currently sitting on 4K steady-state numbers for almost all consumer SSDs on the market. If anyone is interested, I could post a few numbers.

        • Cloef
        • 6 years ago

        It would be really interesting to see some of your 4k steady state numbers!

    • Sargent Duck
    • 6 years ago

    Keep up the good work.

    I know I'll never reach this point with my SSDs, but it's still really good to know.

      • ClickClick5
      • 6 years ago

      +1

      Indeed.

      • indeego
      • 6 years ago

      On my Main workstation I have like 4 TB written over 16K hours powered on sitting at 99% in crystal. I think I’m going to be OK.

        • Forge
        • 6 years ago

        I have just under 7TB written in 3308 power hours on my laptop. Individual usage may vary.

    • ca197
    • 6 years ago

    “and the SSDs are so far free from bad blocks” Quick question: would an SSD actually report bad blocks in SMART when it has switched a block into the spare area (or, more likely, just marked a portion of flash as unusable)? In other words, would you actually have to wear down enough flash for the drive to use all the “spare” area before SMART would report a bad block?

      • BIF
      • 6 years ago

      This is my question too.

      • Dissonance
      • 6 years ago

      The bad blocks tracked by SMART attributes should pertain to blocks that have been pillaged from spare area. In fact, the attribute is called the “reallocated sector count” in some instances.

    • indeego
    • 6 years ago

    The average write speeds show such a variance! That is what I find the most interesting. We say there isn’t much difference between the SSD latest generations, but man that is quite the difference…

      • Peldor
      • 6 years ago

      The fact that the Samsung 840 Pro leads all the benchmarks and is 2nd slowest in the actual test requires some additional explanation IMO.

        • albundy
        • 6 years ago

        probably marketing hype conjured with review sites giving it glorious reviews when performance and reliability were not there yet (aka OCZ), and endurance wasn't even considered beyond TRIM.

          • Firestarter
          • 6 years ago

          maybe you should read TR's reviews of these SSDs again, and tell us why those reviews are not up to your standards

        • sparkman
        • 6 years ago

        Yes, why is one of the TR System Guide's top SSDs, the Samsung 840 Pro, looking so weak in these graphs?

    • d3m0n5
    • 6 years ago

    Any chance you can dump the SMART of all drives in text for us to analyze at this point?
    Also, FW versions on all drives pretty please? :)) Great job with the tests and writeup so far!

      • Dissonance
      • 6 years ago

      I’ll see if I can come up with a good way to release the SMART data. So far, lotta CSV files. Four for each of the SSDs at different points during the setup/testing process. But it’s easy to give you firmware revs for now…

      Neutron GTX: M306
      335 Series: 335U
      840 Series: DXT08B0Q
      840 Pro: DXM05B0Q
      HyperX: 501ABBF0

        • Geonerd
        • 6 years ago

        How about a simple screen grab?

        https://dl.dropboxusercontent.com/u/60092457/misc/CDI.jpg

        • continuum
        • 6 years ago

        Yes, SMART data showing media wearout at least would be very interesting. Thanks!
