OCZ’s latest Z-Drive has 16 flash channels laced with NVMe goodness

There's a new NVM Express SSD in town. OCZ's enterprise-oriented Z-Drive 6000 Series supports version 1.1b of the low-overhead protocol designed to replace AHCI. Rather than using an in-house controller, the drive taps a PMC-Sierra chip with 16 NAND channels. This controller is tied to four PCIe Gen3 lanes via a cabled SFF-8639 interface.

Source: OCZ

OCZ splits the family between two camps. The Z-Drive 6000 Series is meant for read-intensive applications, while the 6300 Series is optimized for mixed workloads. Both employ Toshiba's A19 MLC NAND, but the 6300 Series uses a higher-endurance variant. It's rated for three drive writes per day over the course of the five-year warranty, while the 6000 Series is specced for only one full write per day.

Initially, the drives will be available in 800GB, 1.6TB, and 3.2TB capacities. There are also plans for a 6.4TB version—and for a 6300 derivative based on a half-height, half-length expansion card. Right now, the 6000 Series is limited to the thicker 2.5" form factor familiar from Intel's 750 Series SSD.

As one might expect given the target market, the new Z-Drive has power-loss protection, end-to-end data protection, and hot-swap support. 256-bit AES encryption is built in, and the drive can throttle performance to prevent overheating. Interestingly, it also has a configurable thermal envelope that can be set between 15 and 25W.

Lowering the TDP will undoubtedly affect performance, which purportedly peaks at 2900MB/s for sequential reads and 1900MB/s for writes. Random I/O rates top out at 700k IOps for reads and 160k IOps for writes, according to OCZ. That random read rate is particularly high compared to the specs for other PCIe SSDs.
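As a back-of-the-envelope check (our own arithmetic, assuming the 4KB transfer size that random I/O specs are typically quoted at), that random read rate implies nearly as much bandwidth as the sequential peak:

    # Rough sanity check on OCZ's claimed figures. Assumes random I/O
    # specs are quoted at 4KB transfers, as is typical for NVMe SSDs.
    READ_IOPS = 700_000
    WRITE_IOPS = 160_000
    BLOCK_BYTES = 4 * 1024

    read_mb_s = READ_IOPS * BLOCK_BYTES / 1e6    # ~2867 MB/s
    write_mb_s = WRITE_IOPS * BLOCK_BYTES / 1e6  # ~655 MB/s
    print(f"implied random read:  {read_mb_s:.0f} MB/s")
    print(f"implied random write: {write_mb_s:.0f} MB/s")

At 4KB per transfer, 700k IOps works out to roughly 2.9GB/s of random reads, which is essentially the drive's sequential ceiling.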

The Z-Drive 6000 Series should be compatible with the NVMe drivers baked into some modern operating systems. That's not the only option, though. OCZ also promises custom drivers for Windows, Linux, and VMware.

Comments closed
    • UnfriendlyFire
    • 4 years ago

    At some point, the BIOS/firmware/UEFI has to be optimized in order to reduce OS boot-times.

    The exception would be for bloated programs that bog down even SSDs. Such as Adobe’s photo shop stuff. Or TF2 (Windows 7 boots faster than that game).

    • LostCat
    • 4 years ago

    NVMe still sounds pretty awesome, wonder when it’ll hit consumer gear.

      • mesyn191
      • 4 years ago

By late 2015 to early 2016, you should see them get more affordable and common.

You can already get them now if you don’t mind paying a premium. Intel’s 750 SSDs have been out for a month or two, though Intel only guarantees they’ll boot reliably on 9-series or newer chipset mobos.

      The 400GB version is about $400 right now:
      [url<]http://www.newegg.com/Product/Product.aspx?Item=N82E16820167300[/url<]

    • Deanjo
    • 4 years ago

[quote<] OCZ also promises custom drivers for Windows, Linux, and VMware.[/quote<] The only way I would use one of these is if the custom drivers are committed to the Linux kernel. Proprietary storage drivers on Linux are just a world of pain to deal with.

      • sweatshopking
      • 4 years ago

[quote<] Linux is just a world of pain to deal with [/quote<] Ftfy

        • Deanjo
        • 4 years ago

        Still 100x better than Windows.

          • anotherengineer
          • 4 years ago

If you know something about Linux and the command line, that is. Even though it might actually be better for certain things from a technical perspective, if people can’t use it then it’s kind of useless.

            • Deanjo
            • 4 years ago

Same thing can be said about Windows. If you can figure out the “where is this setting” game in its schizophrenic UI, it might have something useful, but there are even times when a person has to drop to the CLI to fix things in Windows, such as when you pooch the UEFI boot loader by doing a BIOS update. The truth is, pretty much anything can be done from the comfort of a GUI in Linux if you wish, but you also have the option of going completely GUI-less if you want.

What exactly do you need to drop to the CLI for in Linux again?

            [url<]http://s9.postimg.org/51av2v6fj/GUITools.png[/url<]

            • anotherengineer
            • 4 years ago

The only UI issue is Win8/8.1, and that can be easily fixed with Classic Shell. And my comment applied to users in general.

If you went into an office of 1,000 people running Win7, installed Mint/Ubuntu/whatever over the weekend, and everyone came back to work, I wonder how that would help productivity?

Don’t get me wrong, I really hope that Ubuntu or some Linux version that is very similar to Windows ends up getting at least 30% market share, because Windows could use some real competition. I think Valve is helping to push that, but it’s a long road ahead.

Also, can I use AutoCAD on Linux yet? I see OS X support finally, but nothing for Linux.

            • Deanjo
            • 4 years ago

[quote<]The only UI issue is Win8/8.1, and that can be easily fixed with Classic Shell.[/quote<]

Classic Shell only fixes the menu. It doesn't fix the "holy crap, what's this sparse fullscreen settings page" stuff, or wanting to join a VPN and not being able to click off the ribbon to copy and paste a username and password.

[quote<]If you went into an office of 1,000 people running Win7, installed Mint/Ubuntu/whatever over the weekend, and everyone came back to work, I wonder how that would help productivity?[/quote<]

If they didn't use Windows exclusively beforehand, more than likely they'd be perfectly fine. It's not that the learning curve for Linux is great; it's more that people get programmed to doing something one way, and any deviation from that set path requires them to realize that it operates differently. Even MS found that out the hard way with their Start button. There is a reason why they are bringing it back, and why people like you are saying to install Classic Shell.

As far as AutoCAD goes, it works fine under Wine.

    • llisandro
    • 4 years ago

Sweet, this is perfect timing for me: at work I’m speccing a workstation designed to handle a 1GB/s video stream coming from an sCMOS camera. We basically need to take a peek at the data stream in the shortest time interval possible, process it, and use that analysis to inform feedback control of the instrument the camera is hooked up to.

I’m looking at an Intel DC P3700 800GB NVMe drive instead of a RAID stripe of SSDs, to make sure we’re definitely over 1.0GB/s.

Anyone have any suggestions/experience for bonkers writes? (Looking to acquire minutes of video at a time, so a RAM drive probably isn’t cost-effective.)

      • brucethemoose
      • 4 years ago

You could cache disk writes to a sizable RAM buffer, which then writes to a cheaper RAID array as fast as it can. PrimoCache should do the job. With a fast CPU, NTFS compression could effectively lower the required write speeds to the disk.
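A minimal sketch of that buffering scheme in Python, assuming a hypothetical output path and the 16-bit 2048x2048 frames discussed elsewhere in the thread (PrimoCache does the equivalent transparently at the block level):

    # Sketch: frames land in a big RAM queue while a background thread
    # drains them to slower storage at whatever rate the array sustains.
    # Illustrative only -- path, frame size, and buffer depth are made up.
    import queue
    import threading

    FRAME_BYTES = 2048 * 2048 * 2           # one 16-bit full frame, ~8MB
    buf = queue.Queue(maxsize=4096)         # ~32GB worth of RAM buffer

    def drain(path="D:/raid/capture.raw"):
        with open(path, "wb") as out:
            while True:
                frame = buf.get()
                if frame is None:           # sentinel: capture finished
                    return
                out.write(frame)

    writer = threading.Thread(target=drain, daemon=True)
    writer.start()

    # Stand-in for the framegrabber callback: put() blocks if the RAM
    # buffer fills, which is the backpressure you'd want to monitor.
    for _ in range(100):
        buf.put(b"\x00" * FRAME_BYTES)
    buf.put(None)
    writer.join()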

        • llisandro
        • 4 years ago

        interesting, thanks for the suggestion!

At this point, I’m actually unsure how much CPU overhead I’ll have; I’m waiting on a coworker to generate some dummy videos to test this.

Most of the workstations I build in our lab are a lot more pedestrian; this is gonna be fun 😉

      • Waco
      • 4 years ago

      What platform is this going to be?

        • llisandro
        • 4 years ago

        Haswell-E/EP for the PCI-E lanes.

At this point I am undecided between a Xeon system with ECC RAM and an overclocked Haswell-E. For what we are trying to pull off, the extra core speed might be important. 32GB is probably plenty for what we’re trying to do, but I’ll probably end up needing the PCI-E lanes of the -E series.

Recommended builds for cameras like this are a little confusing: our camera’s maker recommends an E5-1630 v3 (3.8GHz turbo) and specifically states that clock speed matters most, but another, slightly slower camera suggests an E5-2640 v3 (8C/16T, with only a 3.4GHz turbo), and a third source is running a dual-socket E5-2643 v3 setup, though for a slightly different application. Since I’m planning on something a little more computationally heavy than my camera manufacturer’s default acquisition mode, I’m thinking an overclocked 5930K might be the better idea, and I’m not so concerned with ECC RAM.

I’m mulling over some workstations from Dell and Boxx but might build it myself, given that I need a PCI slot; the only -E series board I can find with one is the ASRock X99 WS, and we might end up going with the 10GbE option on that board.

Gotta wait and see how CPU-intensive our live processing of the video stream is; hopefully I’ll know this week.

        Open to suggestions, thanks!

      • llisandro
      • 4 years ago

Some more info for anyone interested: here’s an example of how these sCMOS cameras are driving this. It comes from the blog of a guy at a microscopy facility at UCSF, who writes really nicely about imaging and microscopy in general.

[url=http://nic.ucsf.edu/blog/?p=464/<]Here[/url<] is an example from his blog of what you can do with one of these cameras. They're capable of 100fps full-frame (2048x2048, 16-bit), but with binning you can approach 25,000 FPS, so at 16-bit we're looking at about 1GB/s. In his case, they're just recording video. In our case, we'll be analyzing what we see in the video as quickly as we can to enable feedback control of the microscope, so our computational needs will probably be higher (benchmarking this soon). Our feedback will be limited by how rapidly we can analyze the smallest chunks of frames we can, hence I'm considering an overclockable Haswell-E system. I haven't seen the code yet, but I'm assuming clock speed will matter more; it's not a big analysis, we just need to do a lot of analyses as fast as possible.

[url=http://nic.ucsf.edu/blog/?p=357/<]Here[/url<] is a post he wrote about using a 4-disk RAID 0 array of 840 Pros that surpasses 2GB/s. But on a Xeon system with an Intel RAID card, there's apparently no TRIM support for SSD arrays, [url=http://nic.ucsf.edu/blog/?p=566/<]so write speeds slow down.[/url<] This is a problem for us, as we'll fill up 1TB in ~15 minutes, so we'd constantly need to reformat to keep the array at acceptable speeds. Anyone know if there are hardware RAID cards that support TRIM?

[url=http://nic.ucsf.edu/blog/?paged=9/<]They've also used a Z97 system[/url<] with just the integrated RAID controller. That gets them about 1.1GB/s, and newer drives can probably hit about 1.3GB/s, which might be cutting it close if we're processing the data stream at the same time as more data comes in: essentially saving one long movie while also splitting a copy into bunches of 100-1000 frames and analyzing those.

So, with all these uncertainties, I'm tempted to just tell my boss we can't get it done without a P3700, which is 4x as expensive but guarantees me 1.8GB/s at QD=1. And then I can use the integrated Intel RAID for a RAID 5 disk array for temporary storage in the workstation. Fun times 🙂
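For reference, the arithmetic behind the rates quoted above (values from the comment, not measurements):

    # Data rates implied by the camera specs quoted in the comment.
    BYTES_PER_PX = 2                          # 16-bit pixels

    full_frame = 2048 * 2048 * BYTES_PER_PX   # ~8.4MB per frame
    print(full_frame * 100 / 1e6)             # ~839 MB/s at 100 fps

    strip = 10 * 2048 * BYTES_PER_PX          # ~41KB binned strip
    print(strip * 25_000 / 1e6)               # ~1024 MB/s at 25,000 fps

    # At ~1GB/s, a 1TB array fills in about a quarter of an hour:
    print(1e12 / 1e9 / 60)                    # ~16.7 minutes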

        • mesyn191
        • 4 years ago

Sounds like he is one of the few who reeeeeeeaaaaallllllyyyyy needs an LSI-based hardware RAID card with a large, multi-GB cache.

        I like the higher end Areca cards myself. Something like this:

[url<]http://www.newegg.com/Product/Product.aspx?Item=N82E16816151117[/url<]

Modern SSDs pretty much all have some form of internal garbage collection, so the lack of TRIM support isn't as big of a deal as it used to be. The catch is that you have to leave some 'dead space' on the drives to give the garbage collection room to work with, or it won't be effective. 10% unformatted space is generally the recommended minimum; 20% is better. Anand did an article on this a couple of years ago, I think.

Also, thanks for the blog link. Looks interesting.

        • freebird
        • 4 years ago

Just curious, do you use a 3rd-party app to analyze the video, or do you write your own? I’d think nothing would be better for processing a video feed than a GPU, especially a CUDA or GCN/OpenCL one…

Another question is whether you store the video long-term or just “store it until processed.”

I would think a 1GB/s video feed could be processed by a high-end GPU, if the programming lends itself to being efficiently parallelized on a GPU. If that doesn’t work, maybe an increase to 64GB of system memory to cache video, and/or multiple GPUs?

          • llisandro
          • 4 years ago

Analysis (and instrument control) is in Matlab, so we do have CUDA as an option. Matlab is not great at parallelism, but that’s what the software was written in, for a lot of other reasons. So yeah, I’ll definitely be trying this out. I’m worried a bit about the latency of moving data off to the GPU; the calculations are pretty simple, so it might be better to keep them on the CPU if we can manage it. We will see.

The movies are the raw data, so we do want to be able to store them long-term. We process them in a variety of ways, but you always want to keep the originals 😉

          yeah, a 128GB Xeon system is looking like it might be a good idea 🙂

      • stdRaichu
      • 4 years ago

‘ello again llisandro – I’m also the one building an X99 WS with an E5-1650 v3, as I think we mentioned earlier, although I was doing so more for scaling encodes across multiple CPUs than for balls-to-the-wall clocks and IO requirements.

Make no mistake, your write requirements really are bonkers 🙂 Avoiding RAID sounds like the best approach to me, purely because of KISS, so if your budget allows it I would go down the “bloody fast PCIe enterprise drives” route… although given that your required speeds already seem close to the limit of what workstation-level tech can provide, I’d also think about investigating a “small” flash DAS or SAN so you can scale up or out if need be. Are the writes going to be strictly sequential, or will other processes be generating IO on the same devices at the same time?

      What interface does your video stream come in on? A custom PCIe card or summat?

        • llisandro
        • 4 years ago

Yeah, the video stream comes in on a custom “framegrabber” card (PCI-E 2.0 x8). So, lane-wise, we could possibly get away with a 4790K; I’m just thinking I might as well go -E for more lanes for more PCI-E storage down the road.

The trick is we’re kind of at the bleeding edge of this tech. The usual use case is just recording video at 1GB/s, but we’re performing image analysis on sets of frames as they get saved, which becomes a feedback signal to adjust the instrument. So I have a latency requirement as well; we just don’t understand what it is yet, but it might be onerous if we really want to shoot for single-frame analysis at 25,000 FPS. The old instrument that did this had a RAID 6 array of spinners, recorded at ~1,000FPS, and could perform the image analysis every second, but I’m assuming this was mostly limited by the write speeds, not the speed of image analysis. I’ll be benchmarking this aspect later this week.
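To put rough numbers on that feedback-latency question (illustrative arithmetic only, using the frame rates mentioned in this thread):

    # Time just to accumulate an analysis chunk at a given frame rate --
    # the analysis itself must also finish inside this window to keep up.
    def chunk_latency_ms(frames_per_chunk: int, fps: float) -> float:
        return frames_per_chunk / fps * 1000

    for n in (1, 100, 1000):
        print(f"{n:>5} frames @ 10,000 fps -> {chunk_latency_ms(n, 10_000):7.2f} ms")
    # 1 -> 0.10 ms, 100 -> 10.00 ms, 1000 -> 100.00 ms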

At the theoretical limit, you’d want to do the image analysis on a single frame, or as close to it as possible, but now these frames are coming in at ~10,000FPS. So, honestly, I’m wondering if a ~100GB RAM drive for the initial dump, which then gets sprayed onto an 8-ish-disk array of 0.5-1TB SSDs, might get me closest to that limit. The camera’s native software supports doing this as a “rolling” type of recording, but we need to figure out how to make it work with our custom software.

        Flash DAS is something I’m considering for sure.

The problem is that we don’t have the camera yet, so I’m just speculating 🙂 My initial plan is to try to run it on an OCed 4770K system I have and see what we can do with just a “cheap” 3-disk SSD RAID 0. The sensitivity of the instrument is tied to how fast we can “feed back,” so we wanna push the envelope here, but there might also be some practical limit at which the framerates are “good enough.”

        I have a decent budget to play with, as commercial workstations sold by the camera companies are like $9k, and I can get way more for half that amount, so even my bonkers workstation is “saving money” compared to commercial stuff.
        Maybe I’ll do a forum post on the build!

          • stdRaichu
          • 4 years ago

          Fascinating stuff – how bursty is the average recording/analysis session? Obv. if you’ve got data coming in off your capture card at 1GB/s that 800GB SSD won’t last long. Maxing out a board like the X99 WS with 128GB of RAM might be an idea but I would hope the OS and software would handle the data intelligently, namely caching it in RAM without having to resort to an explicit RAM drive. Dealing with tiered storage is an absolute PITA if your software and hardware doesn’t have inbuilt support for it, and writing to a RAM drive and then splurging to SSDs and/or big fat RAID array afterwards presents a big risk of additional latency/complexity.

Yeah, figured that there’d be an application reading from the data pretty much as soon as it was committed – there should be enough headroom in drives like the P3700 for this to not introduce extra latency, but obv. it’s something you should test once you’ve got the camera.

          Worst-case scenario, if your storage isn’t fast enough to cope with the incoming data, what happens? Does it just drop frames or would that sort of thing invalidate whatever it is you’re analysing?

          Still fairly surprised though that software exists that can process that much information so quickly, but from your infodump it sounds like it doesn’t analyse all the frames, just some of them…?

            • llisandro
            • 4 years ago

Yeah, we are writing the software to do this; it doesn't exist 🙂 We are looking at a Hamamatsu camera: 2048x2048, 16-bit at 100 FPS ([url=https://hcimage.com/assets/pdfs/HCImageLiveGuide.pdf/<]manual here[/url<]). They do have software for this, and there is an option for streaming to RAM with a "circular buffer," so the camera can do it. The issue is that we'll be doing this in our own software, so we need to get it to work.

I agree wholeheartedly: ideally I can have this go straight to the RAM cache, so we've got max speed for the image analysis, and then we save it to an array. If I can get the RAM cache to work right, then I'll just throw it to a case full of 1TB SSDs in RAID 0. With that, I can get at least one hour of continuous data, which should be plenty.

And one awesome trap brought up in [url=http://www.anandtech.com/show/8147/the-intel-ssd-dc-p3700-review-part-2-nvme-on-client-workloads/4/<]Anandtech's review[/url<] of the P3700 is that if we bin too small, we speed up the framerate, but the image size might drop below 32KB. This is actually a huge problem: below 32KB block sizes, r/w speeds drop to zero. It raises questions about exactly how the capture card is writing to my disk. It looks like it'll be better to automatically chunk the video into 4GB sections as it comes in rather than keep appending super-tiny individual images (some cameras basically record in a TIFF stack if you want). But as I understand it, the sCMOS bins in "strips" the full field width because of how the data is read out of the sensor, so at 16-bit, 10x2048 puts us at 41KB, just above the threshold where SSDs get awesome. It's just something I've never had to worry about before; I don't have any experience in this area. (I do also have some HPC experts at my disposal.)

As far as the worst case goes: basically what we'd do is combine some number of N frames, which is the smallest group of frames we can analyze in real time. So if I've got a 10,000FPS hose, the question is how close I can get to analyzing every single frame as it comes in, or whether I have to look at a video of 1,000 frames, 100 frames, etc. The faster the feedback, the better the sensitivity of the instrument.

Really we won't know until we get the camera, and I can't get the camera until a guy builds the instrument (you need a really specialized instrument to even get the data we need to push the image analysis), so I've got two months to brainstorm. But yeah, this will be bleeding edge, unfortunately at a scale way below what HPC people are doing, so there's not a lot of context for doing this in a desktop system, which is what we need, as this has to be a workstation we use to control all aspects of the instrument.

The good news is that this is a pretty cheap camera, about $15-20k. That is cheap for us, as most of our cameras are emCCDs, not these cheaper sCMOS 🙂 But even this insane desktop will probably only be ~5% of the instrument cost. Hell, this isn't even for my research; I'm just known as the computer guy in my lab. Understanding workloads isn't something all of our grad students are great at. I've got a professor who bought a $5k workstation for a different kind of image analysis, but he fell into the "moar cores" trap and bought a 32-thread 2.6GHz Xeon that I can slaughter with an i7, 'cause his code doesn't scale well enough 😉
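A minimal sketch of that chunking idea, assuming a hypothetical record() helper fed by the capture loop (the strip and chunk sizes are the ones quoted above):

    # Coalesce small binned strips in RAM and issue one large sequential
    # write per ~4GB chunk, so the SSD never sees tiny individual writes.
    # Hypothetical helper -- the real capture path depends on the camera SDK.
    import io

    STRIP_BYTES = 10 * 2048 * 2            # ~41KB binned strip
    CHUNK_BYTES = 4 * 1024**3              # flush in 4GB sections

    def record(frames, path="capture_chunk.raw"):
        pending = io.BytesIO()
        with open(path, "wb") as out:
            for frame in frames:
                pending.write(frame)
                if pending.tell() >= CHUNK_BYTES:
                    out.write(pending.getbuffer())   # one big write
                    pending = io.BytesIO()
            if pending.tell():
                out.write(pending.getbuffer())       # flush the remainder

    # e.g.: record(b"\x00" * STRIP_BYTES for _ in range(100_000))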

            • stdRaichu
            • 4 years ago

            Things have obviously moved on a bit since my days of hooking up my OM40 to the petrographic microscopes 🙂

It’s highly dependent on the software, of course, but if it’s got half the smarts it should have, even if it’s writing 32KB images, it and/or the OS should be caching in RAM and batching writes into a more-or-less sequential load (which I imagine is what those 4GB chunks you mention are), so performance might not be as bad as it seems. And the Anandtech graphs seem to indicate it’s only a real problem for sub-4K block sizes; performance at 32K is considerably better, and if it’s really batching into 4GB writes, then you should be able to get peak performance out of it.

            Like you say though, too many variables flimbling around in the air to make a judgement call but I await your build thread 🙂

Reminds me of the first build I did for work: a huge industrial scanner that pumped out 600dpi TIFFs almost as fast as the U320 SCSI cable could manage. Me and a mate built a superpowered Linux box with cutting-edge components: dual Socket 940, an mdadm RAID 10 of eight 36GB Raptors as an acquisition drive, and an eye-wateringly huge 12GB of RAM. It cost about a third of what was available for purchase at the time and outperformed them all.

    • chuckula
    • 4 years ago

[quote<]Lowering the TDP will undoubtedly affect performance, which [i<]purportedly peaks[/i<] at 2900MB/s for sequential reads and 1900MB/s for writes.[/quote<] Oh good, an alliterative purportedly. Nice to see the master at his craft.

      • Generic
      • 4 years ago

      I saw [i<]the word[/i<], and jumped straight to the comments. I was not disappointed.

      • Dissonance
      • 4 years ago

      Gotta save ’em for when they’ll really count 😉

    • Neutronbeam
    • 4 years ago

    Let’s focus on what’s really important here with this technology. When’s the review and when are you giving one away in a contest? 🙂
