The SSD Endurance Experiment: Data retention after 600TB

Six weeks have passed since our last SSD endurance update. When we last visited our heroes, they had just crossed the half-petabyte threshold—no small feat for a collection of consumer-grade drives that includes the Corsair Neutron GTX, Intel 335 Series, Kingston HyperX 3K, and Samsung 840 Series and 840 Pro. Those drives have now left the 600TB mark in the rear-view mirror, so it’s time for another update.

If you think it’s taken longer than usual to add 100TB to the total, you’re right. The truth is, the SSDs have been on hiatus, and so have I. The drives hit the 600TB mark about a week before I was scheduled to escape to Thailand for a two-week vacation. Since our subjects have been working pretty much non-stop since August, I figured they could use a break, too. Their vacation would be a working one, though. Instead of spending their time on sun-soaked beaches with cold beers in hand, the drives participated in another unpowered data retention test.

The Samsung 840 Series kicked out a spate of unrecoverable errors when we conducted our first retention test after 300TB of writes, so we were curious to see how it and the others would fare in a longer unpowered test. We have more bad blocks to report for several of our candidates, including a couple of MLC-based drives, plus another set of performance results. Let’s get started.

Those unfamiliar with our endurance experiment would do well to start with this introductory article, which explains our methods in greater detail than we’ll indulge here. The concept is pretty simple. Flash-based storage has a limited tolerance for writes, so we’re writing loads of data to a bunch of SSDs to see how much they can take. We’re also monitoring drive health and performance to observe what happens as the flash degrades.

Our test subjects include six SSDs designed for consumer desktops and notebooks: the Corsair Neutron GTX 240GB, Intel 335 Series 240GB, Kingston HyperX 3K 240GB, and Samsung 840 Pro 256GB, which are all based on two-bit MLC flash, and the Samsung 840 Series 250GB, which uses three-bit TLC NAND. The drives are being tested with incompressible data, and we have a second HyperX unit that’s being tested with compressible data to explore the impact of SandForce’s write compression tech.

Flash cells fail when the effects of accumulated write cycling prevent data from being stored reliably. In addition to degrading the physical structure of the NAND cells, write cycling causes a negative charge to build up within them. This charge reduces the range of voltages that can be used to define the contents of individual cells, making it more difficult to write and verify data. Eventually, cells become unreliable and have to be retired. They’re replaced with reserve blocks culled from the SSD’s overprovisioned spare area.
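
Since we’ll be comparing MLC and TLC drives throughout, here’s a small illustration of how bits per cell translate into charge levels; the narrower the margin between levels, the less wear a cell can tolerate. This is just a sketch of the relationship, not data from our drives:

    # Each extra bit per cell doubles the number of charge levels the controller
    # must distinguish, narrowing the voltage margin between adjacent levels and
    # making worn cells harder to program and verify.
    for name, bits in [("SLC", 1), ("MLC", 2), ("TLC", 3)]:
        print(f"{name}: {bits} bit(s) per cell -> {2 ** bits} charge levels")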

TLC NAND squeezes an extra bit into each flash cell, so it’s more prone to wear than the MLC alternative. That puts the Samsung 840 Series at a theoretical disadvantage versus its MLC competition, and this relative weakness has been reflected in the experiment’s results already. The 840 Series has registered far more reallocated sectors—bad blocks of flash that have been replaced—than any of the MLC-based drives. It was also the only drive to stumble in our first retention test after 300TB of writes. (That’s not to say that the 840 Series flunked this class; 300TB is well beyond what the average consumer is likely to write to any SSD during its lifetime.)

In that initial retention test, we copied a 200GB file to each drive and performed a hash check to verify its integrity. We then unplugged the drives for a week before performing the same hash check again.
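
For anyone curious about reproducing the basic idea at home, here’s a minimal sketch of that sort of check in Python. It’s purely illustrative; it isn’t the tool we used, and the file path in the usage comment is a placeholder:

    # Minimal retention spot check: hash the test file before the unpowered
    # period, save the digest somewhere safe, then hash the file again
    # afterward and compare the two values.
    import hashlib
    import sys

    def sha256_of(path, chunk_size=1 << 20):
        # Stream in 1MB chunks so a 200GB test file never has to fit in RAM.
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                digest.update(chunk)
        return digest.hexdigest()

    if __name__ == "__main__":
        # Usage: python retention_check.py D:\testfile.bin
        print(sha256_of(sys.argv[1]))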

The 840 Series failed repeated hash checks after we first copied our test file to the drive. Its SMART attributes logged a considerable number of unrecoverable errors around the same time, indicating serious failures that could have led to corrupt or lost data in a real-world situation. The hash errors disappeared when we copied the file over a second time, and the 840 Series ultimately passed the unpowered component of the retention test without issue.

We followed the same test procedure after 600TB of writes, except this time, the drives were left in an unpowered state for three weeks. Each drive passed the hash checks we ran before and after the unpowered period. Even the 840 Series had no issues—and no more unrecoverable errors. Whatever caused the problem in our first retention test didn’t affect this latest one.

Interestingly, the 840 Series logged one reallocated sector during the retention test. Copying our 200GB file to the drive apparently pushed one of its flash blocks beyond the brink. That brings the number of bad blocks to date up to 2192, far more than for any of the MLC-based SSDs. Allow me to illustrate:

So, yeah, no contest here. As expected, the 840 Series’ TLC flash is eroding at a much higher rate than the two-bit alternatives. The decay rate has been pretty linear since failures started piling up after 100TB of writes.

According to my calculations, those reallocated sectors add up to over 3GB lost to flash failures. Thanks to overprovisioned spare area, though, Windows reports the same total capacity as when the 840 Series was in its factory-fresh state. Samsung allocated additional spare area to make up for TLC NAND’s more limited lifespan; the 840 Series 250GB devotes 23GB of flash to overprovisioned area, while the MLC-based 840 Pro 256GB sets aside less than 18GB. Even with nearly 2200 block failures, the 840 Series still has a long way to go before it runs out of reserves.
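
For the curious, the capacity math works out roughly as follows, assuming each reallocated sector maps to a 1.5MB flash block. Treat that block size as an illustrative assumption on our part, not a published figure:

    # Back-of-the-envelope check of the "over 3GB lost" figure. The 1.5MB
    # per reallocated block is an assumption for illustration only.
    reallocated_blocks = 2192
    block_size_mb = 1.5
    lost_gb = reallocated_blocks * block_size_mb / 1024
    spare_area_gb = 23  # overprovisioned area on the 840 Series 250GB
    print(f"~{lost_gb:.1f}GB retired, out of {spare_area_gb}GB of spare area")
    # -> ~3.2GB retired, out of 23GB of spare area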

In a moment, we’ll see if the 840 Series’ flash failures have resulted in any performance slowdowns. First, we have to address the part of the graph that’s squished to keep up with the mounting flash failures. Although it’s hard to see, two other SSDs suffered additional flash failures over the last 100TB of writes. The number of reallocated sectors reported by the Kingston HyperX 3K jumped from four to 10, and the 840 Pro’s total climbed from two to 28. Those numbers are still very low, suggesting that the drives have loads of life left in them.

The other MLC drives are faring even better. The Intel 335 Series continues to report only one bad block, and the Corsair Neutron GTX remains fully intact. So does the HyperX drive we’re testing with compressible data. Thanks to write compression, that drive has written less to the flash than its twin being tested with incompressible data.

Now, let’s look at the state of the drives from another angle, with a quick battery of performance tests.

Performance

We benchmarked all the SSDs before we began our endurance experiment, and we’ve gathered more performance data at every milestone since. It’s important to note that these tests are far from exhaustive. Our in-depth SSD reviews are a much better resource for comparative performance data. What we’re looking for here is how each SSD’s benchmark scores change as the writes add up.

Nothing to see here, folks. The SSDs have produced largely consistent scores throughout our endurance experiment. There have been a few unexplained blips here and there, but for the most part, the drives continue to hold the line. Even the 840 Series’ increasing flash death toll has had no ill effects on its performance.

Unlike our first batch of performance results, which were obtained on the same system after secure-erasing each drive, the next set comes from the endurance test itself. Anvil’s utility lets us calculate the write speed of each loop that loads the drives with random data. This test runs simultaneously on six drives split between two separate systems (and between 3Gbps SATA ports for the HyperX drives and 6Gbps ones for the others), so the data isn’t useful for apples-to-apples comparisons. However, it does provide a long-term look at how each drive handles this particular write workload.

Again, each drive’s behavior has remained largely consistent as writes have accumulated. The Samsung 840 Series slowed down a little initially, but since then, its average write speed has barely budged.

We didn’t expect any of the SSDs to get faster, so the Neutron GTX has been a bit of a surprise. Its average write speed continues to increase, albeit at an extremely slow pace. That’s the opposite of what’s happening with the other SSDs, whose write speeds are holding steady or falling ever so slightly.

By far the most telling takeaway thus far is the fact that all the drives have endured 600TB of writes without dying. That’s an awful lot of data—well over 300GB per day for five years—and far more than typical PC users are ever likely to write to their drives. Even the most demanding power users would have a hard time pushing the endurance limits of these SSDs.
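
For those checking the math, the per-day figure works out like so (a simple illustration using decimal units and a five-year span):

    # How "well over 300GB per day for five years" shakes out.
    total_written_gb = 600 * 1000  # 600TB in decimal gigabytes
    days = 5 * 365
    print(f"{total_written_gb / days:.0f}GB per day")  # -> 329GB per day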

So far, the MLC drives have been exemplary. Their flash has suffered relatively few failures, and two drives remain completely unscathed. Our lone TLC offering continues to retire bad blocks at a steady pace, but the Samsung 840 Series still has plenty of flash in reserve. Apart from the unrecoverable errors we encountered after 300TB of writes, its performance has been solid overall.

At this rate, even the 840 Series may reach a petabyte of writes before burning out. The others are on track to cross that threshold easily, and I expect we’ll be waiting a long time for their eventual demise. With that in mind, it might be a little while until our next update. We’ve already established that modern SSDs have more than enough endurance for typical consumer workloads, and at this point, we’re just reporting more of the same. Besides, I’m fresh out of ideas for how to pose these things for pictures.

Maybe I’ll be able to come up with something interesting for the 1PB milestone.

Comments closed
    • youssef_che
    • 5 years ago

    Thank you very much for all your hard work. The Endurance issue has been really worrying me and I was seriously thinking about replacing my Vertex 3 Max IOPS. But now I see that there’s no need for that as I’ll never reach 300GB/day of writes.

    Again, thanks a million for all your hard work.

    • guruMarkB
    • 6 years ago

    Next time you have the next report at the Petabyte data level you should try to hold the 6 drives overlapping like they were 6 playing cards (all aces if none have failed by then). If 1 or more SSD has failed then hold the working drives and show the 1 or more failed drives as if they were discarded playing cards. 🙂
    Of course someone else would need to take the picture.

    • quantumrider
    • 6 years ago

    Hi Everyone,

    Seems this will be a perfect spot to ask these questions…

    Looking at your tests I guess it is clear that in real world tests the Neutron GTX is a clear winner?

    I was going to get 2 1TB EVO mSATA drives and 1 1TB EVO SSD for my MSI laptop but with these findings I would prefer to go with the Neutron GTX. Alas, they don’t make an msata Neutron GTX drive?

    I use my laptop to copy huge video files (around 30GB each), usually 60GB to 400GB of files per day and sometimes more, to and from SD cards and other portable drives. I work every day, so speed and reliability are my main concerns. Would I benefit from setting up two mSATA drives in RAID to prevent complete data loss in the event one of the drives fails?

    I think it would be a good idea for me to get that small drobo raid box for SSD drives?

    • sicofante
    • 6 years ago

    This is a bit off-topic.

    Sorry for my ignorance but how’s the average write speed of the Samsung 840 Pro so low when it measures so high in every other write test?

    Any why is it so irregular as well? It’s the only drive that shows such behavior, isn’t it?

    • meerkt
    • 6 years ago

    Thanks for looking a bit on retention. Still hoping for more exhaustive checks of this. 🙂

    • sschaem
    • 6 years ago

    Pointing to the same issue.

    This test is a perfect test of each cell: write, erase the whole drive, repeat.

    In most cases people don’t reformat their Windows partition when they write a new file.
    So failures will happen exponentially faster (in most cases).

    The better test would be to not do the test this way.

    a) allocate 90% of the drive.
    b) never reformat the data
    c) rewrite the 10% left over.

    This would potentially cause a 10x faster failure rate. (on the weaker drive design)

      • Melvar
      • 6 years ago

      I don’t think you’re correct. SSDs aren’t like hard disks; you can’t write data to a particular flash cell over and over even if you try. The wear leveling routines won’t allow it. That first 90% of data can be moved around as needed by the SSD’s controller without the filesystem ever needing to know about it.

        • UberGerbil
        • 6 years ago

        Yep. Attempting to write 1000 times to one block will result in one write to each of a thousand blocks (grossly simplified), even if there is no free space left on the drive. Of course some wear levelling routines are going to be better than others, and I believe most use a kind of bucket approach, so you might end up with something like a few blocks with 200 writes, a few with one hundred, and many more with less than that.

        Sschaem does qualify this with “on a weaker drive design” and some early SSDs might have fallen victim to this, but I’d be surprised if any modern controllers/firmware had this kind of issue.

    • deruberhanyok
    • 6 years ago

    “Besides, I’m fresh out of ideas for how to pose these things for pictures.”

    House of cards? Has that been tried yet? I don’t recall seeing it.

    Doubles as a nice short-fall impact test. 🙂

    • mcnabney
    • 6 years ago

    After reading the comments it appears that ANY suggestion that the 840 was inferior is being answered by down-votes due to confirmation bias. Sure, the total writes are titanic, but also realize that these tests are not factoring in time (lead-free solder whiskering), power cycles, varying temperatures, and extended down time. All of those are additional stresses on these drives which will decrease their potential lifespan. The well-documented increased cell failure rate will be a multiplicative factor over time. So don’t write off a weak foundation intrinsic to TLC.

      • Melvar
      • 6 years ago

      [quote<]it appears that ANY suggestion that the 840 was inferior is being answered by down-votes due to confirmation bias.[/quote<]

      Actually, I think the down votes are a response to people saying things along the line of "see? I told you the 840 would never hold up!" when that's actually the opposite of what the test has shown.

      As far as confirmation bias, you can put the belief that the single uncorrectable error that any of these drives has experienced is a result of TLC and not some other random glitch in that category as well. This is far too small a sample size to make any judgement about such things. The error rate could be abnormally low or high; many more drives would have to be tested to find out.

      This experiment was intended to test one thing: write endurance. All of the drives have exceeded expectations so far, regardless of the fact that there are other factors that haven't been tested for.

        • indeego
        • 6 years ago

        > This is far too small a sample size to make any judgement about such things.

        point, set, match. That is this experiment in a nutshell. The experiment shows [i<]the potential[/i<] for success for SSD devices, not a pattern of any sort. We should all celebrate how well these last. 99% of us will never reach 10% of these drives’ total use before they are replaced.

      • stoatwblr
      • 5 years ago

      WRT whiskers, you can do accelerated testing by raising the temperature, but most manufacturers mitigate the problem these days by using conformal coatings (which stop the whiskers growing).

      It’s worth noting that the rated “maximum write cycles” for SSD drives is the point where they can be powered down for an extended period (months, not weeks) and still retain data – which is a whole lot different to the “writes fail” absolute end-of-life point.

    • Zaxx
    • 6 years ago

    This test is something I link people to when I see posts from folks worrying about endurance or trying to offload writes (caches and pagefiles) to a slow HDD. Slowly but surely everyone should get the message that SSD endurance isn’t an issue for even the heaviest power users. Heavy server duty would be about the only concern, but that’s what they make enterprise NAND (eMLC) for. I’ve seen a Sammy 830 hit the 6PB (that’s 6,000 TBs) mark in Xtreme System’s endurance test here: [url<]http://www.xtremesystems.org/forums/showthread.php?271063-SSD-Write-Endurance-25nm-Vs-34nm/page137[/url<]

    • jihadjoe
    • 6 years ago

    [quote<]As expected, the 840 Series' TLC flash is eroding at a much higher rate than the two-bit alternatives.[/quote<] I chuckled a bit after reading this.

    • odizzido
    • 6 years ago

    Originally I felt I might want to keep my page file on a mechanical drive to keep the SSD from burning out, but I don’t see that as an issue anymore.

      • stdRaichu
      • 6 years ago

      In most of the use case scenarios I’ve witnessed, utilisation of the page file is a) very low and b) 80% or more reads. As long as your commit charge is routinely lower than your physical memory you’ll see very little get written to the page file – and much of that is stuff that is still in physical memory (windows pre-emptively pages some stuff out in case it needs to dump it from physical memory onto the page file in a hurry. Linux does a similar thing depending on the “swappiness” parameter).

      As such, page files/swap partitions are pretty much the perfect use case for SSDs. Not written to much, usually written only in 4k blocks, but benefit hugely from having little-to-no random access penalty.

      NT6.x is thankfully now very frugal at writing to the page file, whilst NT5.x would do it at the drop of a hat – paging stuff out that was already running in order to make room for a file you’re copying over the network was one of the classic examples. I’ve always given my workstations enough physical memory for any working set they’re likely to encounter, and with the advent of SSDs I started either drastically reducing or eliminating page files altogether. I think the biggest page file I have at present is 512MB.

        • indeego
        • 6 years ago

        Small page file = risk that the OS has to increase it regardless (and it will if it needs it.) This incurs a performance penalty. (probably slight on SSDs, but it was a very large performance penalty on mechanicals)

        Large/default page file: Very static, almost never a need to increase size.

        Unless you are starving for space, you really should not mess with the pagefile. It also needs to be a specific size or larger for crash dumps, so on anything critical it should not be decreased below the system’s recommendation. If you are starving for space, a far better use of your time is to image and double your storage media, wipe the old media, and sell it. Move on.

        Also, you can spread out your pagefile across drives, and Windows knows how to intelligently allocate between them. This is the best of all worlds.

          • stdRaichu
          • 6 years ago

          Windows lets you impose limits on the initial and maximum sizes your page file can be if you so wish – my workstation is set to 64MB/64MB so it never grows. If I ever had a process that caused me to go over my 16GB+64MB vmem, the windows OoM killer would stop it – IME this is almost always preferable to an out-of-control process a) swallowing all physical memory and b) thrashing the disc drive(s) with an infinitely increasing page file at the same time.

          But yes, page file fragmentation does give you problems on platter-based drives, although a bigger problem is the inherently random nature of access to the page file. On SSDs this isn’t a problem and doesn’t give any measurable performance penalty, so if you fancy experimenting with small page files then try setting it to 64MB/4GB and see if windows grows it at all over the course of a normal week or month – chances are if you’ve got 8GB or more of memory it never will.

          I don’t think anything past NT6.1 even allows full memory dumps any more, you have to apply a reg hack in order to enable it as an option – and in any case, physically writing, say 64 to 512GB of data to a local drive in the event of a BSOD will reliably cause a bigger outage than the BSOD itself – and I’m not aware of any issues we’ve had where the vendor has requested a full memory dump for years. The new default dump, a dump of kernel memory only, will typically be much less than 2GB (and in all but the most extreme of circumstances won’t be written into the page file anyway), but all of our servers at work are set to a small memory dump (~128kB) which, as long as you have the relevant symbols included in your debugger, is more than adequate for pinpointing drivers or processes that led to a crash.

          Even with SSDs, the performance and capacity gulf between memory and storage capacity/performance has grown so wide that I see relying on the page file as a curse rather than a way of saving a few quid, except on older non-upgradeable hardware of course. All of the above of course comes with the caveat of running enough physical memory for your workload.

          I’ve posted this link before but well worth a read: [url<]http://blogs.citrix.com/2011/12/23/the-pagefile-done-right/[/url<]

          • joselillo_25
          • 6 years ago

          On an SSD there is also a performance penalty, because Windows (at least on my rig) tends to freeze the system until the page file finishes growing. It takes less time than on a mechanical drive, but the freezes are there, at least on my computer.

          I am using a 6GB minimum page file, so the freezes are not happening anymore. But I only have 2.7GB of RAM to use, so in other cases there is no need for a file of this size.

          I don’t know whether creating this file has an impact on the SSD’s life, or whether creating a huge file doesn’t actually write data to all those cells until you use the space.

            • UberGerbil
            • 6 years ago

            Creating an over-large page file will actually improve the overall endurance of the SSD (though of course it’s all pretty academic) because the part of the page file that is allocated but never written to will act as additional free blocks for wear-levelling.

            That said, if you are regularly overcommitting your available memory, so that the page file gets thrashed a lot, you are putting more wear on your SSD — and the solution to that of course is more RAM. But again, as this whole exercise has shown, you’re almost certainly going to be replacing the SSD or even the entire computer before that becomes a practical concern.

    • PixelArmy
    • 6 years ago

    Woot! Neutron GTX FTW!

    • Ochadd
    • 6 years ago

    I look forward to each update. Thanks for the test and the article. All my concerns for PC users wearing them out are gone. Anyone writing enough data to put a scratch in the nand supply will have enough knowledge themselves or in their IT department to keep on top of things.

    • mesyn191
    • 6 years ago

    [quote<]Now, let's look at the state of the drives from another angle[/quote<] So many hidden SSD art pile puns in this article. +1 gentlemen. [b<]+1[/b<]

    • ronch
    • 6 years ago

    I think it’s safe to say that today’s SSDs are quite durable and one shouldn’t worry too much about writing too much on them. Now the next question is: how about future SSDs? As manufacturers shrink their chips, how will it affect durability and long-term performance consistency? This question will only be answered by TR in a future article. Stay tuned.

    • Meadows
    • 6 years ago

    So the Samsung 840 is just as reliable after all. …Even so, I already bought the 840 PRO last year and don’t regret it despite the higher price.

      • oldDummy
      • 6 years ago

      [quote<]The SSD Endurance Experiment[/quote<] Personally, I believe the main questions have been answered. Awesomely I might add. Testing to complete failure of all drives might satisfy our ghoulish tendencies with little purpose. SSD's are to HDD's as papyrus is to clay tablets. Great job.

        • Ninjitsu
        • 6 years ago

        Would be interesting to know how many TBs a mechanical drive can read/write before suffering mechanical failure.

          • derFunkenstein
          • 6 years ago

          With the whirling platters and whatnot, small samples are going to look totally random I think. Plus with the much-lower write speeds and higher capacities, it’ll take way longer to do.

          The better bet is looking at platter-based storage studies published by Backblaze and anyone else who posts huge samples.

          • davidbowser
          • 6 years ago

          [url<]http://research.google.com/archive/disk_failures.pdf[/url<]

          Google's paper on the subject of mechanical drive failure broke them out into low/medium/high utilization based on I/O (weekly avg R/W bandwidth). Read pg 5 for the explanation.

          I think they made a logical oversight in not examining (or maybe not showing) total bytes R/W with correlation to failure. One could look at the usage charts and start to see that type of pattern based on some (9-10%) high I/O drives failing within 3 months, but low and medium I/O drives failing after a few years. I imagine the chart of failure rate as a function of total I/O would have been pretty telling.

        • UberGerbil
        • 6 years ago

        But it will be interesting to see the failure mechanisms as the drives finally die. This small sample isn’t going to characterize that in a way that is predictive, but if some drives degrade relatively gracefully while others fail abruptly, that will be useful to know. I wouldn’t make a purchasing decision based on it, and most of us will probably never see an example in real life, but it might be useful diagnostic data someday (somewhere in the future when running across “ancient” SSDs in old computers).

    • tomwalker
    • 6 years ago

    Tech Report – This is a great test and great data for those of us who fear SSD endurance. I’d imagine that the higher capacity drives (500GB and 1TB) would have less endurance given the higher densities. I hope there are plans to do a similar test on those higher capacity drives.

    All — thumbs up this if you agree you want to see the same on 500GB/1TB drives.

    Or is my assumption wrong that higher capacity translates to higher density which may translate to less reliability/durability?

      • Melvar
      • 6 years ago

      Higher capacity SSDs usually use the same flash memory chips, just more of them. 2x or 4x as many of the same flash cells means 2x or 4x as many writes to wear out the drive.

      Also, asking for thumbs up is basically asking for thumbs down.

      • Ninjitsu
      • 6 years ago

      Lol what is this, YouTube?

      • Dissonance
      • 6 years ago

      Within the same drive family, higher-capacity models typically use the exact same flash chips as lower-capacity variants; they just have more of them. Density shouldn’t be an issue there.

      That said, higher-capacity drives do have more overprovisioned spare area than lower-capacity ones (the percentage is the same, the total is just higher when more flash is involved.) This should give them a larger pool of flash to draw from when replacing bad blocks.

      Density can become an issue when dealing with flash built on a finer fabrication process. Smaller process geometries put the individual flash cells closer together and shrink the thickness of the cell walls, both of which can accelerate flash wear.

      • UberGerbil
      • 6 years ago

      [quote<]Or is my assumption wrong that higher capacity translates to higher density which may translate to less reliability/durability?[/quote<]Yes.

    • Wirko
    • 6 years ago

    The slow and steady rise of reallocated sectors in the 840 is quite an unexpected result.
    Assuming that wear leveling works well, all the cells should be worn out to a similar degree. We should see very few signs of bad cells for a very long time, say, until 250 TB (1000 writes). After that, I’d expect most cells to be near the end of their life, and see the drive fail in a short time, certainly before 2000 writes. Obviously, the 840 doesn’t behave like that, and the curve on the graph looks so predictable that it could be extrapolated to a petabyte and beyond.

      • Chrispy_
      • 6 years ago

      Once all the spare area on the 840 is used up (it’s managed to burn out just 3 out of 23GB so far) I’d expect the drive to either fail completely or just shrink capacity until it has no free space.

      My real curiosity (though it may not happen until around 3PB have been written) is whether the drive is still readable at the very end or whether all the data is lost.

      In theory, write endurance shouldn’t affect read endurance, but these controllers do all sorts of mojo that I can’t even pretend to understand.

        • Wirko
        • 6 years ago

        Yeah, if I were a SSD controller, I’d try to reuse bad blocks in MLC (4-level) mode. That could work for some more time. Car analogy: take two wrecked old cars, make one new old car from them.

          • Chrispy_
          • 6 years ago

          Yeah, that could work, but not for long; re-writing a flash page that’s failed requires exponentially higher voltage. Their usability drops off a cliff very quickly, because each attempt to re-write is more damaging than the last, and hastens the death of that block.

          It was probably Anand’s article on flash endurance but I remember coming away with the conclusion that trying to patch up a failing flash block is very much like flogging a dead horse. Abandon it and move on!

      • ryan345
      • 6 years ago

      There is evidently more variation in the lifetime of the sectors than you are picturing. If so, then the graph is just showing the weakest sectors (outliers on the short lifetime side) failing first. If the sector lifetimes have a bell curve distribution (which I would have assumed) then the failure rate should increase as it approaches the mean sector lifetime (if you compensate for the fact that the wear leveling is spreading across fewer and fewer sectors).

    • Inverter
    • 6 years ago

    This is definitely one of the best SSD articles on the internet; after all, the raw speed of an SSD is irrelevant if it can’t keep its data. Which is why the Samsung 840’s failure at 300TB should already mark the drive as a COMPLETE FAILURE compared to the others, even though it’s a good idea to keep it in the race “just for LoLs” and to see when it fails again.

    So thanks for this really interesting test, and all the work you are putting into it, greatly appreciated!

      • Chrispy_
      • 6 years ago

      I wouldn’t call it a complete failure.

      In over 600TB of data, the Samsung has reported one glitch in one file after a prolonged power-down.
      The drive passed its manufacturer’s best-case ratings at 250TB written, so by the time it had this “COMPLETE FAILURE” hiccup, it was already way beyond the expected life and outside the warranty for a consumer device in terms of “reasonable use”, and Samsung would likely honour the warranty anyway.

      In short, at even 300TB you’d have to be intentionally trying to kill the drive just to reproduce that one-off hiccup, and the fact that it’s unrepeatable means it may not even be the drive’s fault.

        • derFunkenstein
        • 6 years ago

        Yeah, it only happened once, but even if it’s only happened once on a mission-critical drive, it’s time to retire it. This is the sort of thing RAID0 was meant to protect against, though, so maybe a second one would have helped keep it in check.

        Doesn’t affect my opinion of Samsung – the drive has written far more than I’d ever expect any of them to.

          • stdRaichu
          • 6 years ago

          The only thing RAID0 is there to protect against are bosses who don’t want to spend money on backups 🙂

          Seriously, if mission critical data isn't backed up, then it isn't mission critical. And if it's not backed up [i<]and[/i<] on RAID0, it isn't even data in my opinion.

          I liken this effect to Schrödinger's Blocks - the data might be there, but ever since it was written it was merely waiting to die and the only way you'll know is when you try and prod it.

          Edited for emphasis.

            • ChronoReverse
            • 6 years ago

            I hope you two meant RAID1 because RAID0 protects against nothing and makes it even more likely to have catastrophic data loss…

            • stdRaichu
            • 6 years ago

            Isn’t that exactly what I said about RAID0, albeit floridly?

            • ChronoReverse
            • 6 years ago

            I think you’re talking about RAID1 as well since “Raid Is Not a Backup” is abused the most with RAID1.

            RAID0 isn’t actually RAID since it only has the “array” part but not the “redundant” part of RAID.

            • Chrispy_
            • 6 years ago

            “mission critical” and “low-budget consumer drive” don’t go together save in the company of fools.

            • derFunkenstein
            • 6 years ago

            I totally meant RAID1. *Facepalm*

            This is the kind of thing that RAID1 (or RAID 5 or a whole bunch of others) is supposed to fix. The drive didn’t die independently or anything, it just crapped its drawers once.

            • stdRaichu
            • 6 years ago

            Meant to fix yes, but RAID also brings in a whole bunch of other “you’ve just hosed your data” problems too. In the event of unspecified corruption on a RAID1 pair, without block checksums how do you guarantee which of the pairs is the valid one? And I’m sure I’m not the only one who now gives merely a resigned sigh when they see yet another Unrecoverable Read Error during a RAID5 rebuild because someone thought that having more storage was better than having reliable storage.

            But yes, any RAID level is better than RAID0 but they all come with an inherent set of problems. Running a SLED with a good backup + airgap is far superior to running a RAID with no decent backup. If you need and can afford running on a RAID with a good backup, even better.

            • derFunkenstein
            • 6 years ago

            It doesn’t have to be either/or, though. RAID + hot spare + backup with offsite storage should minimize downtime the most, right? Sure it costs more, but if it’s mission-critical, the cost should be offset by the elimination (partial or otherwise) of lost productivity.

            • stdRaichu
            • 6 years ago

            Indeed, it doesn’t have to be either/or; the belt-and-braces approach is usually the best. But there are different types of “downtime” just as there are different types of RAID – it all has to be tailored to a) the application and b) the budget. You’re right in your observations, but a great deal of companies don’t see it that way until it’s too late, and there’s a whole host of gotchas to be had along the way.

            For an example your suggestion doesn’t fully address: my first proper IT job (in a company young enough and small enough to have never done a DR) utilised a third-party offsite backup from a well-known company, something that would be called a cloud backup provider these days. One day I got a call on the batphone at 6am asking me why email wasn’t working – a user had got in early and couldn’t even log on to her machine. Hmm. Can you go into the server room and tell me what the lights on the box labelled XYZ look like? To cut a long story short, someone had broken into the office and stolen all the hot-swap drives out of the servers. Domain controllers, mail server, BES, file servers – all gone.

            Me and my colleague then worked a non-stop 36-hour shift sourcing spares (no support contract in place) and re-imaging/re-building the machines… only to find out that it would take about a week for our backups to come down the pipe (and we had what was an insanely fast internet connection at the time for such a small company, 10Mb/s symmetric) or three days to spool off the tapes. We ended up hiring extra tape drives to get data off them faster and then syncing the remaining deltas from the internet, and we had the company back up and running in 48hrs.

            To avoid these problems, we a) beefed up security, b) backed up everything to a local backup server rather than going straight to the internet, and c) synced that backup server to an identical one in a branch office – so that in a similar emergency someone from the other office can just take a very expensive taxi ride. Further refinements allowed us to do async replication to the backup server over the WAN, and we could make the backup server the primary until we were outside of business hours if need be (thanks DRBD!). Since online backup had and still does have a crap turnaround time, this has been my personal gold standard ever since.

            Incidentally, I’d been brought in to replace a third-party IT company whom my employer had grown unhappy with and we were still in transition when the incident happened – their SLA on a full company restore was ten working days and a quote with a whole extra page of zeroes on it. As well as giving us an extra 5 days holiday, my boss based part of our yearly bonus on the money we saved the company by getting that time down to two days, plus the cost of the third-party solution, minus the cost of our new backup implementation, divided by the number of people in the company. Said figure was more than half my salary and I actually needed to sit down when I saw my payslip, which also came packaged with bottles of very fine spirits from each of the five directors.

            IME, all a hot-spare does is decrease the time between array failure and a rebuild starting; there’s still just as much potential for a rebuild failure when the rebuild is underway. Rebuild failures notwithstanding, performance of RAID5 and 6 is still pretty awful during a rebuild, and the rebuild will still take a long time even with bitmaps, and for many applications the disc being “up” but not performant enough to cope with the load would also be considered downtime. A previous employer who cheaped out on storage for their exchange setup against our recommendations learnt this the hard way during FYE despite the fact we’d spelt out for them exactly what would happen in the event of a drive failure. Hence the oft-heard refrain of “RAID10 or GTFO” 🙂

            My long-winded, misty-eyed point? As soon as your storage setup gets to be any more complicated than “my home computer with a USB drive for backups”, you can get into some very, [i<]very[/i<] complicated risk and cost/benefit analysis. People and companies that are prepared to do that analysis typically end up saving money. People and companies that don't tend to end up out of business. Reduce the amount of hyperbole to fit a home user scenario 🙂

          • sjl
          • 6 years ago

          “Mission critical” and TLC are not terms that I would expect to see in the same sentence. If the drive is used for such an important task, I’d bite the bullet and spend the money on something that’s more resistant to wear.

          And, as others have said, if the data isn’t backed up, it’s obviously not important (and hence can’t be said to be mission critical.)

          All that said, the sheer volume of data written gives credence to the viability of SSDs for primary storage. I would like more data on how long these drives can retain the data written – it’s entirely possible, for example, that after 200 TB written, the drive can only hold data effectively for a couple of months (all figures pulled out of nowhere for illustrative purposes only), but testing that is going to take far longer than any technical website can realistically allow. Not to mention that there’s also the question: was the data lost after one day, one week, one fortnight, one month, or later? There’s no way of knowing, short of checking every day, and that could introduce data refreshes that distort the test results.

            • derFunkenstein
            • 6 years ago

            Yeah, probably not. SLC drives have the best durability. Still, this is pretty astounding.

          • the
          • 6 years ago

          I believe you mean RAID1 (aka mirroring).

          One thing about errors is that you need to know their source. If bad data gets sent to the RAID controller and then to a RAID1 array then all the drives will contain that bad data. If it was a bit flip, then it could be detected and then potentially recovered.

        • Inverter
        • 6 years ago

        Oh I hadn’t noticed that the 300TB point was already beyond the specified write endurance, so it’s ok.

      • albundy
      • 6 years ago

      The 840 is still running at nominal performance after 600TB of written data. That’s a failure?

      • UnfriendlyFire
      • 6 years ago

      If you plan on using a SSD for regular usage for 10 years, I hope you bought a 512GB SSD.

      My 2001 desktop computer has a 32 GB HDD.

      We now have flash drives that have at least 1 TB of storage space, and SD cards with 512GB of storage.

    • dragosmp
    • 6 years ago

    Looking at this test, it’s crazy to think there were ever worries that SSDs might burn out. I hope you’ll keep pushing these drives to 1PB; afterwards it may just be wasted electricity, as you’d have already proven that the flash on MLC is immortal (and that TLC is hard to kill).

    • Melvar
    • 6 years ago

    Towards the end of the first page:[quote<] So does the HyperX drive we're testing with [u<]incompressible[/u<] data. Thanks to write compression, that drive has written less to the flash than its twin being tested with incompressible data.[/quote<] That should say compressible, not incompressible.

    • quasi_accurate
    • 6 years ago

    Probably looks worse than it is. I just looked at my SSD boot drive. Nearly 4 years of work time, and only 13 TB of written data.

      • Voldenuit
      • 6 years ago

      [quote<]Probably looks worse than it is. I just looked at my SSD boot drive. Nearly 4 years of work time, and only 13 TB of written data.[/quote<]

      But are those 13 TB spread evenly across the drive, like in TR's tests?

      Also, if you do anything disk intensive (one of the reasons people buy SSDs) like non-linear video editing, I'd wager that you'd burn through a lot more than 13 TB over 4 years. Scratch disks for image editing can also get pretty large.

      Am I being overly cautious here? Maybe. But I'd rather be [i<]too[/i<] cautious with my data than not enough, and it's not like Samsung's TLC drives are half the price of MLC alternatives (the difference going from four charge levels per cell to eight).

        • UberGerbil
        • 6 years ago

        [quote<]But are those 13 TB spread evenly across the drive, like in TR's tests?[/quote<]Wear-leveling should ensure they mostly are.

          • Voldenuit
          • 6 years ago

          If your drive is full (or near-full), wear-leveling has less free NAND to work with, which I alluded to in an earlier post.

            • Melvar
            • 6 years ago

            That will lead to write amplification which will cause more wear, but it will still be spread out across the drive.

            • UberGerbil
            • 6 years ago

            If the drive is full, wear-levelling will be using the blocks that have just one write (a *lot* of data is write-once / read-many). No modern SSD firmware I know of restricts itself to just unallocated blocks.

        • Inverter
        • 6 years ago

        The good thing about large files, like videos, is that they are usually written linearly (even in non-linear video editing) so they should be written with a write-amplification of around 1, so they don’t wear out an SSD too much.

        (On the other hand when you write many small files or poke around randomly, which produce a higher write-amplification, the total amount of data is usually not so much, so again the wear is not so great even at higher write amplifications.)

        • stoatwblr
        • 5 years ago

        If you’re using a SSD for scratch work, it’s no big deal if it goes bad. Just change it. You still have the original and output data saved somewhere else.

        If you’re using the same drive for scratch _and_ long term storage then you’re already indulging in false economies.

      • DarkMikaru
      • 6 years ago

      Great article… I love following up on this! I’ve had my Samsung 830 Pro 256 for at least a year and a half now and I’m only at 1.96TB (according to Samsung Magician). Long story short, do I need to worry about wear leveling and my SSD hitting the proverbial wall? Nope. Especially if 100TB is the point where it should even cross my mind a little. Wow.

        • Melvar
        • 6 years ago

        Yeah, every time one of these updates is posted I check Magician, and every time I’m surprised by how little data I’ve written to my 840. Currently at 1.18TB after about 10 months of use.

        Voldenuit’s going on about how (s)he wouldn’t trust one of these drives, and meanwhile I’m thinking it’s going to have a long life as a hand-me-down after I’m done with it.

          • DarkMikaru
          • 6 years ago

          Right.. I have 2 120GB 840’s in my work & home laptops and I gotta say that I love the performance. But more than anything else, this testing has absolutely put what little fears I had to rest about cell degradation killing data. It will, sure… but to know that it won’t be in the lifetime of either of those laptops is a good feeling.

          I am just amazed at these consumer-level SSDs! Wow!! TR should have thrown a WD Black 1TB HDD in for good measure! I could be wrong, I just don’t see a mechanical drive being able to hang. I just don’t see it. Though, I’ve been wrong before so…

        • Voldenuit
        • 6 years ago

        The Samsung 830 uses 2-bit MLC flash, so you wouldn’t have the same (some would say theoretical) problems as the 840.

        What’s interesting though is that Samsung seems to be the only people pushing TLC NAND. Toshiba does make TLC NAND for smartphones and tablets, but those aren’t typically subject to as heavy disk loads as full-blown PCs.

          • Ninjitsu
          • 6 years ago

          Well, i have a 256GB 840 (non-Pro, non-EVO), around 232 was usable at the start (decimal to binary conversion), i was paranoid so i kept 10% unpartitioned.

          I use it for my games, for a while was recording FRAPS videos on it. Been a year almost. 1.21TB written so far.

          I’ll probably die before the drive does.

            • Wirko
            • 6 years ago

            We’ll probably have Kryptonite-based consumer SSDs before the current consumer SSDs are worn out.
            As for next year’s consumer SSDs … well, if Samsung tries to squeeze 4-5 bits into one 14nm cell …

      • Ninjitsu
      • 6 years ago

      My boot SSD has written 3.12TB, read 5.61 TB, Since June 2012.

      • Chrispy_
      • 6 years ago

      Even running multiple VM’s, including a database server, my 830 has only just passed 60TB in a year.

      At 5 years, it’d hit 300TB but it’s still so far removed from consumer use that I’d be amazed if a consumer could ever hit 60TB before SATAIII becomes obsolete.

      My laptop is pushing 3TB I think, which means I have another 200 years before I even [i<]start[/i<] to burn through the flash at the same rate.

    • UnfriendlyFire
    • 6 years ago

    I’m interested to see how the SSDs would respond to a situation where they run out of sectors. Do they become read-only? Or do they completely fail, with the data becoming unrecoverable?

    To one petabyte!

    EDIT: At some point, you should leave the SSDs unpowered for a month or two to see how well they retain their data.

    My Nintendo Gamecube’s two 64MB memory cards have lost all of their data. Though I hadn’t turned on the Gamecube for the past 4 years. Goodbye to the saved data of “Paper Mario and the 1000 Years Door”, and “Super Smash Bros”…

      • crystall
      • 6 years ago

        [quote<]I'm interested to see how the SSDs would respond to a situation where they run out of sectors. Do they become read-only? Or do they completely fail, with the data becoming unrecoverable?[/quote<]

        If they behave like hard drives, they'll just show up to the OS as smaller drives (i.e. the disk geometry will turn up with fewer LBAs than when the drive was in its pristine state). I ran into this condition with a 2.5" hard drive that had a bunch of bad blocks (320); after forcing all the bad blocks to be reallocated, the drive had exhausted its reserve (256) and thus showed up a bit smaller than it used to be.

        If you've got a partition that spans all the way to the end of the drive, this condition can be troublesome, as the filesystem will be confused by the mismatch between the partition size and the actual physical size of the disk. The only way to avoid this problem is to leave some spare space at the end of the drive when partitioning, but after these tests I very much doubt that this condition will arise frequently with SSDs, if at all.

        • Ninjitsu
        • 6 years ago

        My HDD developed bad blocks and they seemed to keep on increasing, so i just took a back up, wrote zeroes and sent it for RMA.

          • tomwalker
          • 6 years ago

          Ninjitsu – how do you see how many bad blocks you have?

            • cheddarlump
            • 6 years ago

            SMART data. Reallocated sector count and recovered read failures.

            • Ninjitsu
            • 6 years ago

            Yup, SMART data, as cheddarlump says.

            The “RAW” or “Data” values usually tell you what the current value of the SMART variable is.

            SMART attribute 05 is the reallocated sector count, and current pending sector (C5) tells you how many bad sectors you have.

            I use HD Tune to monitor hard drives, along with WD Diagnostics and previously Speed Fan.

            HD Tune is nice and simple, highlights problems in yellow.

    • Sargent Duck
    • 6 years ago

    [quote<]Besides, I'm fresh out of ideas for how to pose these things for pictures[/quote<] Some pictures of cats with SSDs would go a long way... I still remember when TR re-did their whole front page with lolcatz. Best. Ever.

      • BIF
      • 6 years ago

      Make sure they are anti-static cats!

      But don’t shave ’em; they hate that!

        • Milo Burke
        • 6 years ago

        Please shave them. =D

          • Meadows
          • 6 years ago
    • Voldenuit
    • 6 years ago

    Any corrupted files in that 3GB of bad cells? How does the drive handle data in bad cells – is it smart enough to reallocate the data before it’s irreversibly gone? Do bits flip in data?

      • Dissonance
      • 6 years ago

      [i<]How does the drive handle data in bad cells - is it smart enough to reallocate the data before it's irreversibly gone?[/i<] Yes. Or that's how it's supposed to work, anyway. The hash check failures and unrecoverable errors logged by the 840 Series in our first data retention test suggest that corruption can occur in some scenarios. That's the only example of it we've witnessed thus far.

        • Voldenuit
        • 6 years ago

        Thanks [s<]Scott[/s<] Geoff. 840 sounds too risky to me as a consumer. Even if the possibility of actual data loss is remote, I'd wager that most users will fill up their drives and leave ~20% free space for performance reasons. This means that even with wear leveling, there is going to be a disproportionate amount of writing to the available (and spare) flash above and beyond what we'd see in a full drive wipe and write cycle, unless the drive wastes even more write cycles to cycle out old and static data. Unless I'm missing something? EDIT: Corrected nick/name.

          • Sargent Duck
          • 6 years ago

          Damage is Scott, Dissonance is Geoff.

            • Voldenuit
            • 6 years ago

            Oops, you’re right!

            All those Ds…

            Fixed my post.

          • continuum
          • 6 years ago

          Fortunately or unfortunately the original 840 is rarely seen these days– that said I would love to see an endurance test with the 840 EVO.

          Maybe I should drop by the XS.org forums to see what they have testing in there…

            • Voldenuit
            • 6 years ago

            Yeah, not too optimistic about the 840 EVO either. Smaller process (19 nm vs 21 nm on the original) and still TLC means there’s probably even less electron well capacity before each individual cell dies out.

            I’ve been warning ppl off TLC for a while, but my arguments against TLC have generally been ignored. TR’s endurance test does little to change my opinion of the technology.

            • indeego
            • 6 years ago

            Can you point to anything resembling science in your warning or is this something like “a feeling” or “a theory” that you have a hunch about? Otherwise it’s sky-is-falling and will keep being ignored.

            • Voldenuit
            • 6 years ago

            I’d say that bad blocks appearing well below even the minimum 1000-write threshold* that Samsung claimed for their drive should be a good indication that their TLC flash isn’t lasting as long as claimed.

            * 100 TB is 400 writes average
            * 200 TB is 800 writes average

            Now I’m neglecting write amplification, but with full drive wipes/writes, I’d imagine TR’s test drives should be experiencing something close to 1.0. Also neglecting overprovisioned space, but I’d imagine that overprovisioning (typically in the 8% range) should balance out any write amplification above 1.

            I’ve never said the sky was falling. But given the choice between two similarly priced and similarly performing drives, one with [i<]testably[/i<] worse endurance than the other, I'd pick the one that lasts longer, even if (for most users) the advantage is theoretical.

          • Dissonance
          • 6 years ago

          Wear leveling algorithms are designed to take static data into account. They should move that data around in the flash to spread write cycles evenly across the drive as a whole, even if it means burning extra write cycles in the process. Modern flash can clearly tolerate an awful lot of writes, so drives have plenty of cycles to spend on wear leveling.

            • Voldenuit
            • 6 years ago

            Well, Samsung rates its TLC at 1000-3000 write cycles, so at the 300 TB mark, the 840 would have had an average of 1,200 write cycles (neglecting wear leveling or spare NAND), and it’s already spitting out errors.

            Lest people think I’m simply carrying a crusade against TLC, MLC NAND typically has a write endurance rating of 3000 cycles, so I’d be interested to see what happens to the other drives in the endurance experiment once we hit the 900 TB mark. Keep up the good work!

            • Ninjitsu
            • 6 years ago

            Well, even if you look at 100TB as being the limit of a ~256GB TLC drive at 25(?)nm, it’ll still take an average consumer 70 years at the least to get to that number.

            From whatever anecdotal evidence i’ve seen so far, including my own drive data, during normal use our SSDs write about 2-3 TB per year for a boot drive. 100TB is a long way off.

            That said, anyone using SSDs in an enterprise environment, or video editing and similar work would be sticking to SLC/MLC anyway.

            TLC drives tend to be cheaper, though.

            • indeego
            • 6 years ago

            We see between .5 TB and 1 TB for common users, and “power” users at 2-3 TB/year. If you never reinstall your OS or work with frequently changing data the rate of writing will actually be lower over time.

      • BIF
      • 6 years ago

      I think you’re overworrying the situation, and self-exacerbating it in your own mind by using “FAILURE” in all caps (I saw it in some other post), with creative adjectives also in all caps. It’s just not going to be that big of a problem.

      Even with my paging/swap files on my C partition, my main SSD will probably be replaced for other reasons long before it “wears out” (like…it’s too damned small).

        • Ninjitsu
        • 6 years ago

        I actually got a 20GB SSD for the page file. :/

        Hasn’t had a ton of writes to it (214 GB), and it’s been almost a year. However, for a few months something had caused the page file to revert back to C:, and I didn’t realise.

        • Voldenuit
        • 6 years ago

        Nope, never used ‘FAILURE’* in conjunction with the 840. You might be confusing me with Inverter.

        I’ve just been staying away from them to be cautious.

        * As far as I recall. Maybe I have some bad blocks? :p

      • UberGerbil
      • 6 years ago

      Blocks have to be erased before they can be written. A lot of incipient failures in flash can be detected at that point, when the firmware detects ghost data with a read-before-write. And a read-after-write should detect failures when the data is actually written; either case should cause the block to be retired and the data re-written elsewhere (possibly to a fresh block if one is available; otherwise to whatever is at the head of the wear-levelling list).

      A more worrisome cause of data loss is “spontaneous” corruption that can happen when NAND is just sitting around; NAND includes extra bits for error detection/correction, but if an error overwhelms those you have a corrupted block, which may have been what happened in the test.
