Radeon Pro Solid State Graphics keeps big data close to the GPU

AMD threw the latest in its series of Capsaicin events at the SIGGRAPH conference this evening. At the show, the company announced the Radeon Pro SSG, for "solid state graphics"—a new kind of graphics card that's meant to keep large amounts of data close to the GPU. This card has what AMD calls a "one-terabyte extended frame buffer" that relies on non-volatile memory to store all those bits—presumably NAND. In turn, the GPU is connected to that memory using a dedicated PCIe bus. For perspective, consider that AMD's previous capacity champ, the FirePro W9100, only has 32GB of GDDR5 RAM on board.

To demonstrate the benefits of keeping large data sets close to the GPU, AMD showed off a demo in which an 8K video file was scrubbed through the timeline of a pro video app. Without the Pro SSG card, that demo only ran at 17 FPS. Throw in that huge chunk of flash storage, however, and the app could churn through the same video while updating at more than 90 FPS. AMD also envisions the card changing the way pros work with big data sets and GPU computing in the medical, scientific, and petrochemical industries. Interested developers can apply for a beta version of the Radeon Pro SSG hardware now, assuming they're willing to pony up $10k for a card if they're approved.

We're sadly not at SIGGRAPH this year, but I'm working on setting up a briefing with AMD this week so we can learn more about how the Radeon Pro SSG works. Stay tuned.

Comments closed
    • evilpaul
    • 3 years ago

    I like to think they use SRAM instead of NAND and there’s 15 other cards that go with it they aren’t showing.

    • Duct Tape Dude
    • 3 years ago

    They should make one of these with an APU and just dub it a PCIe blade server.

    • Klimax
    • 3 years ago

    Sounds interesting. Would be nice to have it for tests… (and maybe some useful work too. :D)

    • Parallax
    • 3 years ago

    How long before the IO wears the flash out?

      • tipoo
      • 3 years ago

      I’m guessing well outside the useful life of this card for professional, time-sensitive use, given TR’s own SSD torture test. And there’s a pro-grade warranty like the FirePros at any rate.

      • smilingcrow
      • 3 years ago

      I presume they will sell them bare-bones and you choose NAND to meet your needs.
      You can buy enterprise-grade M.2 drives, so buy a drive rated for 3 to 5 DWPD or more.

    • ronch
    • 3 years ago

    Doesn’t this remind you of the time when you could actually stick more memory chips on your video card?

    The past is the future.

      • f0d
      • 3 years ago

      i don't think you were able to do that with any 3d accelerator
      that was way back when people had 2d cards iirc

        • ronch
        • 3 years ago

        No. It was even before 3D came out.

        • Klimax
        • 3 years ago

        Matrox G200

          • Concupiscence
          • 3 years ago

          The prior Matrox Mystiques and Millenniums (Millennia?) could do that too. Nudging my Mystique from 2 to 4 MB made it a lot more generally useful (especially in the days before I snagged a Voodoo1). There was a 6 MB module that would take the card up to a whopping 8 MB, which was pretty spiffy for working in resolutions qualifying as high in the mid- to late ’90s.

          There were people who paid to bump their G200s to 16 MB as I recall, though it’d mostly be good for enabling vsync + triple buffering on that class of hardware. There were so many gotchas: 32-bit color was supported, but there was no multitexturing and fillrate might have been 80 Mpix on a good day; memory bandwidth was scarce; OpenGL support was initially implemented through a wrapper that translated calls to Direct3D, and even years after the GL driver was ostensibly “mature” there were glitches in basic rendering functionality; the open source Linux driver ran circles around Matrox’s Win32 drivers in both quality and speed… That said, paired with a Voodoo2 it was a pretty decent solution that covered most of your bases, and shaky as the drivers were the hardware was certainly more capable than something like a Permedia2.

          edited: both for my stream of consciousness about an ancient 3D accelerator, and to provide [url=http://www.512bit.net/matrox.html<]a handy link[/url<] to old Matrox card specifications. It looks like the G200 managed ~84 Mpix, though overdraw and driver issues didn't put it too far ahead of a Voodoo1 in most real-world benchmarks. It looked good trying, though.

        • derFunkenstein
        • 3 years ago

        I had an S3 ViRGE that had a slot for more RAM.

    • ronch
    • 3 years ago

    Keep your friends close, and your… um, large amounts of data closer.

    • tipoo
    • 3 years ago

    “But “the art of the impossible” is also about, well, the art. Bollywood film “Baahubali,” the result of a collaboration between AMD and director S.S. Rajamouli, is a great example of what can be done.”

    1) It’s funny/cute how much Raja brings this movie up (he has a poster in his office I think) and
    2) Man, that movie was awful. Not like, cultural divide “I didn’t get it”, plenty of good Indian movies, but just, no plot, cheesy borderline assault-ey romance (“imma crush some berries to slap some makeup on your face”), awful awful everything, except for bringing Indian cinema forward technologically.

    • chuckula
    • 3 years ago

    Interesting to see how one of these $10K cards would stack up against a $5K [base model] Xeon Phi Ninja workstation with a PCIe SSD thrown in.

    [url<]http://dap.xeonphi.com/ninja-dev-platform-pedestal.aspx[/url<]

      • tipoo
      • 3 years ago

      The differentiator from regular PCI-E storage seems to be using this as an extended framebuffer. I wonder how different that is performance wise from having the SSD just appear as base storage.

        • chuckula
        • 3 years ago

        Using raw NAND flash without any higher-level logic (like a filesystem) would let the drive operate pretty much like a framebuffer if you need one. I’m thinking something like using the MTD infrastructure in Linux for raw access. [url<]http://www.linux-mtd.infradead.org/[/url<]
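
        For illustration, a minimal sketch of what raw MTD access looks like from Linux user space (assuming a flash device exposed as /dev/mtd0; AMD hasn't said the SSG exposes its NAND this way, so treat it purely as an example of filesystem-free flash access):

          /* Raw MTD access sketch: no filesystem, no block layer, just erase
           * blocks and pages addressed by offset. Illustrative only. */
          #include <stdio.h>
          #include <stdlib.h>
          #include <fcntl.h>
          #include <unistd.h>
          #include <sys/ioctl.h>
          #include <mtd/mtd-user.h>

          int main(void)
          {
              int fd = open("/dev/mtd0", O_RDWR);
              if (fd < 0) { perror("open"); return 1; }

              struct mtd_info_user info;
              if (ioctl(fd, MEMGETINFO, &info) < 0) { perror("MEMGETINFO"); return 1; }
              printf("size=%u eraseblock=%u pagesize=%u\n",
                     info.size, info.erasesize, info.writesize);

              /* Erase the first block, then write and read a page directly. */
              struct erase_info_user erase = { .start = 0, .length = info.erasesize };
              if (ioctl(fd, MEMERASE, &erase) < 0) { perror("MEMERASE"); return 1; }

              unsigned char *buf = calloc(1, info.writesize);
              pwrite(fd, buf, info.writesize, 0);  /* one page at offset 0 */
              pread(fd, buf, info.writesize, 0);   /* read it back */

              free(buf);
              close(fd);
              return 0;
          }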

    • Legend
    • 3 years ago

    That’s the most dangerous thing I’ve seen since big data with a sneaker : )

    Edit – to include Pokemon Go

    • kamikaziechameleon
    • 3 years ago

    This is interesting… but an SDCC primer would be sweet. This is the best community on the internet to gossip with. 🙂

    Seriously crazy big trailers and announcements out of that.

    • smilingcrow
    • 3 years ago

    “Actual memory management/usage/tiering is handled by a combination of the drivers and developer software, so developers will need to code specifically for it as things stand.”

    That’s the downside but memory management has to be easier than low level graphic APIs!

    • DPete27
    • 3 years ago

    $10,000!?!?!? The [url=http://www.newegg.com/Product/Product.aspx?Item=9SIA2F846N8264<]W9100 ONLY costs $4000-$5000.[/url<] Add in 1TB NAND at a generous [url=http://www.newegg.com/Product/Product.aspx?Item=N82E16820228164<]$800[/url<] and you're still under $6000. And that's with the "elitist tax" already included in the W9100 price. Good grief.

      • tipoo
      • 3 years ago

      Seems like it would have to be more than just NAND on a PCI-E board though, as those already exist. And there’s certainly a large margin built in for the pro class, but that already existed in the pro GPUs you mention. Remember, it’s 10K for perhaps a 5X speedup over the 4K card…For some places with large data sets, a drop in the bucket, and worth it to work faster.

      • maxxcool
      • 3 years ago

      You clearly do not work in the mining/oil, CAD, data-mining, or medical imaging fields. They will pay $25k per card for early delivery and not even blink.

        • tipoo
        • 3 years ago

        Our data is pharma transactions, and if these would speed them up by even 10% of the 5x claimed above, it could well be worth it, a drop in the bucket. Hardware cost is dwarfed many times over by software licensing costs anyways. May as well go top shelf for most hardware then.

        i.e., if you pay $300,000 a year for software licensing, saving money by going with $10,000 Xeons instead of 15%-faster $15,000 Xeons is kind of silly. People look at these top-shelf hardware systems from an individual enthusiast's point of view, which doesn't fit.

        • Concupiscence
        • 3 years ago

        Too right. I work in petroleum, and GPGPU seismic data processing is a [b<]huge[/b<] computational problem. You don't blink at spending $25k on something that could help you find hundreds of millions of dollars buried in the ground.

      • derFunkenstein
      • 3 years ago

      I think you’ve missed the point of this market. It’s more than the sum of its parts. Literally.

        • DPete27
        • 3 years ago

        Ok ok, I get it. [5x] More performance = higher price. I don't know if the underlying GPU has changed or if this is just a W9100 with NAND. If the latter is true, good for AMD for figuring out they can add a minimal amount of additional hardware and charge almost twice as much!!

          • derFunkenstein
          • 3 years ago

          There are always cases where time is (literally) money, and the money put into these cards can pay for itself. I don't think AMD owes anybody anything by doing anything less than maximizing what they can get for it. If it can render a video 4x or 5x as fast and that saves somebody hours of waiting, and if that lack of waiting can be turned into more work (and therefore more money), then so be it. I think it's great that AMD finally has a high-margin part.

          • anotherengineer
          • 3 years ago

          A W9100 is more of a CAD/3D modeling card. They do help, but I have seen CAD being done on integrated Intel GPUs before. Having a good CAD card can be faster, but it's not as big a money saver/maker as having someone very proficient and skilled with 3D modeling software vs. a noob.

          These cards are made for production, and high production makes more money, and it costs money to make money. If something is designed to make money, the company can charge more money for it, since it’s like an investment that can pay for itself many times over if someone has the right task to utilize it.

          I’ve worked in mining, where the $5 million/month electrical bill was nothing IFF production was full bore 24/7. Need more ore? Get a bigger truck. A 300-ton truck isn't cheap, but I have seen places with 50+ of them running. If margins are low, one needs more tons to make money.

          As for the card itself I have no idea, but if these things had no dump box they would be a $5 mil paperweight. [url<]http://www.cat.com/en_US/products/new/equipment/off-highway-trucks/mining-trucks.html[/url<] Sometimes you get lucky and a small change can make something far more valuable to the right job.

      • shank15217
      • 3 years ago

      You're not gonna buy these cards at Newegg. These cards are developer cards getting thrown out there so specific problem sets can be addressed.

    • tipoo
    • 3 years ago

    Whatever else it is, I think the packaging is A+. That sort of minimalism is what I want on desktop class hardware too. Great colour scheme too.

      • chuckula
      • 3 years ago

      [quote<]Great colour scheme too.[/quote<] Raj: I don't want to start ANY unsubstantiated rumors about Intel buying out the RTG but.... LOOK AT THIS LOVELY SHADE OF BLUE WE'VE DISCOVERED!

        • tipoo
        • 3 years ago

        It does look like Intel Baby Blue with two drops of red mixed in the paint 😛

        [url<]http://www.intel.com/content/dam/www/public/us/en/images/product/RWD/xeon-phi-family-rwd.png.rendition.intel.web.416.234.png[/url<]

        • the
        • 3 years ago

        It looks like a [url=http://www.intel.com/content/dam/www/public/us/en/images/product/RWD/xeon-phi-family-rwd.png.rendition.intel.web.416.234.png<]Xeon Phi card[/url<]...

        • dodozoid
        • 3 years ago

        I would actually appreciate Intel buying RTG as it would probably give them some more money for R&D… As long as they would continue to do dGPUs

          • NTMBK
          • 3 years ago

          If it goes anything like their acquisitions of Infineon and Altera, they will take several years too long to port designs to an in-house process, leaving them lagging far behind the competitor in manufacturing capability and unable to compete.

      • ImSpartacus
      • 3 years ago

      Yeah, I find these cards surprisingly tasteful. It’s a nice subtle blue.

        • shank15217
        • 3 years ago

        There’s nothing subtle about that blue!

          • tipoo
          • 3 years ago

          More subtle in reality than in render:
          [url<]http://images.anandtech.com/doci/10518/RajaSSG_575px.jpg[/url<]

      • Jeff Kampman
      • 3 years ago

      Fun fact: [url<]https://twitter.com/RadeonPro/status/757759851590217729[/url<]

        • tipoo
        • 3 years ago

        Huh, that is fun. Now, can our monitors even properly represent this newly discovered shade?

          • w76
          • 3 years ago

          Not a new color, just a new pigment, a way of making a color, according to the article. The real question would then be where it lies in the spectrum, and how crappy of a monitor we're talking about. 🙂

            • tipoo
            • 3 years ago

            I should have known this was another round of “every headline misrepresents what a scientist did” 😛

    • chuckula
    • 3 years ago

    It’s interesting how they are able to get the performance gains considering even two high-end M.2 drives are going to have much lower total bandwidth than a full width 16GB/sec PCIe bus.

    I’m guessing the advantage lies on direct access to the drives instead of having to load the data into the regular VRAM and then pass the data through to the GPU.

      • bjm
      • 3 years ago

      Most M.2 drive slots connect to the CPU via the system chipset and have to pass through DMI's shared bandwidth. Since the M.2 SSD in this case is sitting on the board itself, it's bypassing the round-trip latency that would normally be added when going through GPU <-> CPU <-> DMI <-> chipset.

      Since an M.2 slot is limited to PCIe x4, though, AMD could achieve even greater performance by developing some custom x16 M.2 slot, but they wouldn't be able to use off-the-shelf SSDs. But then again, I suppose at that point, they might as well just develop some NV-Link competitor on a Zen chipset.
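
      For rough numbers, a minimal back-of-the-envelope sketch (theoretical link rates only; real SSDs fall well short of what their slot can carry):

        /* Theoretical PCIe 3.0 / DMI 3.0 bandwidths, before any protocol
         * overhead beyond the 128b/130b line encoding. Illustrative only. */
        #include <stdio.h>

        int main(void)
        {
            const double gt_per_s  = 8.0;            /* PCIe 3.0: 8 GT/s per lane */
            const double encoding  = 128.0 / 130.0;  /* 128b/130b line encoding   */
            const double lane_GBps = gt_per_s * encoding / 8.0;  /* ~0.985 GB/s   */

            printf("PCIe 3.0 x4 (one M.2 slot):  %5.2f GB/s\n",  4 * lane_GBps);
            printf("PCIe 3.0 x16 (GPU slot):     %5.2f GB/s\n", 16 * lane_GBps);
            printf("DMI 3.0 (~x4, shared):       %5.2f GB/s\n",  4 * lane_GBps);
            printf("Two x4 M.2 drives combined:  %5.2f GB/s\n",  8 * lane_GBps);
            return 0;
        }

      So even two x4 drives top out below the GPU's own x16 link; any win would have to come from latency and from not sharing the chipset's DMI link with everything else.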

        • tipoo
        • 3 years ago

        > Since the M.2 SSD in this case is sitting on the board itself

        To be clear, I believe it’s sitting on *a* board, but the GPU is a separate entity on a separate PCI-E port. I don’t think this card includes a GPU, unless I’m wrong. I read the press release twice and I’m still only 80% sure on it.

          • bjm
          • 3 years ago

          The card includes the GPU and the SSDs on a single board. Otherwise, why would they mount a huge cooling fan for M.2 drives? 🙂 Also, from Anandtech: “Architecturally the prototype card is essentially a PCIe SSD adapter and a video card on a single board, with no special connectivity in use beyond what the PCIe bridge chip provides.”

            • tipoo
            • 3 years ago

            Ah, cool. So is the GPU a W9100-ish chip then? Or I wonder if there will be a whole SSG line analogous to the whole Firepro line.

        • Andrew Lauritzen
        • 3 years ago

        I don’t buy the DMI argument – it’s still *way* faster than the NAND here. It may add a bit of latency, but if it were a big deal everyone would buy X99… I don’t think you see X99 separate from Z170 until >3-4 RAID0 M.2 SSD type levels of performance.

        This seems like a weird stop-gap that didn’t cost them much R&D money by using off-the shelf parts to be honest. I’m guessing most of the issues with the regular path through the CPU are more software than hardware, but it was probably just easier to work around them in hardware for the intended use cases (not uncommon). I can’t see this kind of design being very interesting in the long term though.

          • bjm
          • 3 years ago

          I agree, I think any longer term effort would be best applied to developing an NV-Link competitor. Actually, I wouldn’t be surprised if that’s what they have coming down the line and that the intention of this stop-gap solution is to expose developers to a large memory environment that would otherwise be too expensive to obtain. After all, AMD did announce these cards as dev-kits.

      • Mr Bill
      • 3 years ago

      They are linked to the GPU via the PEX8747 bridge chip according to Anandtech.

      • the
      • 3 years ago

      I’m not even sure that is true.

      The competition here for large data sets is system memory hanging off the processor. Modern dual socket boxes can go all the way up to 1.5 TB using the largest DDR4 DIMMs. The downside here is latency going across the PCIe bus and DMA transaction.

      Even having directly attached storage hanging off of PCIe has to deal with the latency of an NVMe controller and a thin software storage stack. This is the latest and greatest on the storage front but radically higher in terms of latency compared to DDR4 memory. That is a very hard barrier to overcome.

      Bandwidth wise, accessing DDR4 memory over a PCIe 16x link still comes out ahead as two M.2 slots have at best half the bandwidth.

        • Andrew Lauritzen
        • 3 years ago

        Yep, right on all points. I can only surmise that there’s some sort of latency and serialization issue with the way the GPU is interfacing with the OS/regular storage system that is being bypassed here. Probably as much software as hardware, but ultimately delivered performance is what matters and this was possibly just an easier way to do it quickly.

        Indeed there’s no way this competes with an equivalent amount of DRAM though… it’s only really interesting for more cost-sensitive (or RAM-slot-limited) workstations I imagine.

          • tipoo
          • 3 years ago

            1TB DDR4 Nvidia systems can cost over six figures, so a $10,000 add-in card could certainly have appeal in a range of niches. And the comparable DDR4 storage one is a 3U rack-mount solution.

            • Andrew Lauritzen
            • 3 years ago

            Agreed – perhaps it’s a “middle of the road” sort of cost situation, but as noted in other threads below, the market for this sort of hardware is generally not very cost sensitive where performance is concerned. More options are good though for sure.

            • freebird
            • 3 years ago

            That was going to be my point also… There are dramatic savings here if you compare this to a system with 1.5 TB of DDR4 memory in it… most systems that can access that much memory:

            1) are VERY EXPENSIVE
            2) need multiple CPUs and memory buses, which introduce additional latencies and, once again, COST
            3) heat/energy efficiency: all those CPUs & memory modules use up MUCH more power and hence cooling
            4) DEFINITELY won't fit in a DUAL-SLOT PCIe card.

            • the
            • 3 years ago

            Just going out and [url=http://www.superbiiz.com/detail.php?name=D42464G4S<]buying the DIMMs[/url<] for 1 TB of memory for a dual socket system comes in under $13,000. In the server market, that figure isn't that expensive. The thing is that OEMs love charging a premium for their own OEM memory, skewing price comparisons.

    • arunphilip
    • 3 years ago

    Interesting idea. It's also intriguing to see how more “computer” functions are migrating onto the GPU. There was a time when GPUs moved away from dedicated graphics processors toward more general-purpose compute units. Now we've got voluminous amounts of memory, and this product adds storage to the mix!

    • brucethemoose
    • 3 years ago

    You know what this seems like a perfect application for, somewhere down the line?

    XPoint.

    It’s closer to GDDR5 in the latency department, it doesn’t have the write durability issue NAND flash does, it can use the same interface…

    It’s not totally out of the question, right? XPoint is a joint venture between Intel and Micron, after all.

      • dragosmp
      • 3 years ago

      If Intel provides a non-proprietary bus connection and access protocol (NVMe over PCIe?) I imagine AMD should support Xpoint as well as any MLC SSD.
      However I see some stumbling blocks if Intel decides NVMe is too slow or adds too much latency, which it might be, and makes up a different protocol.

      good catch

      • chuckula
      • 3 years ago

      Came looking for Xpoint comment.
      Went away satisfied.

      • maxxcool
      • 3 years ago

      mmm yummy.. 1GB of on-board XPoint with its own 4-way memory controller to allow the CPU/GPU/L2 and L3 to function on the same silicon… in a few generations I could get down with that.

    • Unknown-Error
    • 3 years ago

    After the 4×0/Polaris fiasco, this is refreshing news from AMD. It actually sounds innovative. Just hope they don't ….-up this opportunity. So, I'll say, I am very cautiously optimistic.

      • ImSpartacus
      • 3 years ago

      Yeah, it’s good to see AMD innovating.

    • odizzido
    • 3 years ago

    Pretty cool. I wonder if there would be any benefit for games. Put 32 gigs of flash on the card to hold all of a game's GPU-related data and just load it into RAM from there as needed.

      • brucethemoose
      • 3 years ago

      We can finally run modded Skyrim with 8k textures!

    • Mr Bill
    • 3 years ago

    The next gaming card arms race… Loading all the frames into the graphics card frame buffer. That must be some seriously fast SSD.

      • ImSpartacus
      • 3 years ago

      Apparently the pci-e interface can become a bottleneck for system memory, which allows an on-board ssd to become competitive. I doubt the ssd, itself, is particularly special.

        • Mr Bill
        • 3 years ago

        They are linked to the GPU via the PEX8747 bridge chip according to Anandtech. This is the same chip that links the two GPU’s in the 295X2 and the Pro Duo. So the bandwidth is considerably higher.

          • Andrew Lauritzen
          • 3 years ago

          It doesn’t really matter – it’s still far lower bandwidth than PCIe gives you. There’s clearly something more going on in the protocol or interface to the OS that is preventing the GPU from getting to regular SSDs fast enough for some reason. I’m guessing a lot of it is fixable in software to be honest but this was ironically probably an easier solution.

          Beyond some latency, there’s no real advantage to this design from a hardware point of view.

            • Mr Bill
            • 3 years ago

            [quote<]The PEX8747 has 48-Lane, 5-Port PCI Express Gen 3 (8 GT/s)... supports packet cut-through with a maximum latency of 100ns (X16 to X16)...[/quote<] according to the specs pdf at this [url=http://www.avagotech.com/products/pcie-switches-bridges/pcie-switches/pex8747<]Avago Technologies PEX8747 link[/url<]. I don't really understand the specs but it looks faster than PCIe via the slot (X16 being 17GB/s). Going from 17 FPS via PCIe... to 90 FPS when using a PEX8747 to facilitate the transfers. If I am reading [url=https://en.wikipedia.org/wiki/Transfer_(computing)<]Wiki Link about Transfers[/url<] correctly; 8GT/s is 64GB/s.

            • Andrew Lauritzen
            • 3 years ago

            Right but that’s completely irrelevant as the SSDs attached to it can only do ~4GB/s (and that’s read, best case). GPU’s PCIe bus is easily fast enough to saturate that with lots of headroom; if that bus were the bottleneck this would outperform CPU DRAM as well, which is clearly not the case.

            • Mr Bill
            • 3 years ago

            Oh, I see your point, even with perfect RAID 0, it's only 8GB/sec; 1GT/s.

            Edit: An 8K frame is 33.2 megapixels. So if my math is right, 90 fps at 8K resolution is 3 GB/s.

            • Mr Bill
            • 3 years ago

            I suppose there can be speedups if you don’t need to rewrite the entire frame.

            • Andrew Lauritzen
            • 3 years ago

            The 4GB/s figure was already factoring in “perfect” RAID0 🙂 Those drives each peak at around 2GB/s read best case (and slower write!).
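
            To put rough numbers on the frame-rate math in this thread, a quick sketch (assumes uncompressed frames and the ~2 GB/s-per-drive figure above; real footage is compressed, so treat it as bounding arithmetic only):

              /* 8K frame data rates at a few assumed bytes-per-pixel values,
               * versus ~4 GB/s from two SSDs in ideal RAID 0. Illustrative only. */
              #include <stdio.h>

              int main(void)
              {
                  const double pixels   = 7680.0 * 4320.0;  /* ~33.2 Mpix per 8K frame */
                  const double fps      = 90.0;
                  const double ssd_GBps = 2.0 * 2.0;        /* ~2 GB/s per drive, x2   */

                  for (int bpp = 1; bpp <= 4; bpp++) {      /* bytes per pixel */
                      double need = pixels * bpp * fps / 1e9;
                      printf("%d B/pixel: %5.1f GB/s needed (SSDs supply ~%.0f GB/s)\n",
                             bpp, need, ssd_GBps);
                  }
                  return 0;
              }

            At 4 bytes per pixel that works out to roughly 12 GB/s, so the demo's 90 FPS presumably involves compressed source data rather than raw frames.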

            • Mr Bill
            • 3 years ago

            DOH! Reading fast still works but reading with entire comprehension… fail. Thanks for the correction.

        • Mr Bill
        • 3 years ago

        Ah, I did not understand which PCIe interface you meant until I read what the Anandtech article said more carefully. [quote<]After putting some thought into it, I think AMD has hit upon the fact that most M.2 slots on motherboards are routed through the system chipset rather than being directly attached to the CPU. This not only adds another hop of latency, but it means crossing the relatively narrow DMI 3.0 (~PCIe 3.0 x4) link that is shared with everything else attached to the chipset.[/quote<] [url<]http://www.anandtech.com/show/10518/amd-announces-radeon-pro-ssg-fiji-with-m2-ssds-onboard[/url<]

          • Andrew Lauritzen
          • 3 years ago

          Yeah that line doesn’t really make any sense either – sigh AnandTech 🙂 DMI bandwidth is also plenty sufficient and if DMI latency were a real issue, anyone who would consider a $10k GPU can easily go X99 (or Xeon, let's be realistic).

          It has nothing to do with DMI, and probably very little to do with how the hardware is all connected to be honest.

        • Mr Bill
        • 3 years ago

        I wonder if this setup can be further improved by using a PEX8747 enabled motherboard?…
        [url<]http://www.anandtech.com/show/6170/four-multigpu-z77-boards-from-280350-plx-pex-8747-featuring-gigabyte-asrock-ecs-and-evga[/url<]

      • tipoo
      • 3 years ago

      Presenting 4000-frame buffering! Buffer 66 seconds in advance! You just, uh, don’t get to make any time sensitive inputs…Or timely inputs at all.

    • albundy
    • 3 years ago

    can you run an entire OS system on it?

      • Meadows
      • 3 years ago

      Operating system system?

        • albundy
        • 3 years ago

        i was thinking along the lines of a video card and PSU-only system.

      • derFunkenstein
      • 3 years ago

      Why would you want to? If you push other garbage through it you’ll slow down its intended purpose.

      • tipoo
      • 3 years ago

      You can set the drive to become visible to an OS, so in theory a PCI-E SSD boot capable OS could work…But that seems to defeat the entire point of this.

    • Sargent Duck
    • 3 years ago

    Now you can install Crysis directly on the video card!

      • chµck
      • 3 years ago

      you can render 8k textures for the next crysis while playing crysis, yo

      • ronch
      • 3 years ago

      And yet it’s not your video card that will actually be pulling the game’s code in and running it.

      Nice try.

        • Wonders
        • 3 years ago

        [quote<]Nice try.[/quote<] And he would've gotten away with it, if it weren't for you meddling kids!

        • derFunkenstein
        • 3 years ago

        so glad you were born with a sense of humor

      • tipoo
      • 3 years ago

      CRYSIS: APPLY DIRECTLY TO THE VIDEO CARD

      • Bomber
      • 3 years ago

      Yes, but will it PLAY Crysis? That is the true question

      • Wirko
      • 3 years ago

      One step closer to your brain!

    • Meadows
    • 3 years ago

    That is actually a pretty impressive idea.

    I’m genuinely surprised NVidia didn’t come up with it first, seeing as how they’re the ones usually with all the pro-GPU stuff.

      • dragosmp
      • 3 years ago

      …Nvidia doesn’t sell SSDs

        • smilingcrow
        • 3 years ago

        Perhaps Nvidia could partner with Amazon then as they sell SSDs?

        • derFunkenstein
        • 3 years ago

        Yeah, I don’t think that has anything to do with anything. It’s not like AMD does anything other than put a sticker on the SSDs it does sell.

        • tipoo
        • 3 years ago

        I don’t see how that’s a bar here. AMD’s SSDs aren’t custom controllers or anything, they’re OCZ SSDs with a sticker. Nvidia could have just as well bought some NAND supply and done something like this. AMD rebranding SSDs wasn’t a head start.

      • bjm
      • 3 years ago

      Well, nVidia's solution is NV-Link, which allows for far greater bandwidth and lower latency by accessing DDR4 (vs. NAND) directly via the CPU over a link faster than even PCIe x16. Now granted, the solution is going to cost more, but we're talking about the pros here.

        • tipoo
        • 3 years ago

        You also can't put 1TB of DDR4 on a single board yet, if ever. Though it's still orders of magnitude faster. Different pros and cons.

          • bjm
          • 3 years ago

          Sure, you can: [url=http://www.qct.io/information/pressrelease/pressrelease?pressrelease_id=66<]QuantaPlex T21W-3U[/url<]. Check out that bad boy: NV-Link, 1TB DDR4 LRDIMM and a Xeon E5-2600 v4 CPU. Edit: The RAM isn't on a single board, but over NV-Link, that 1TB DDR4 will be accessed faster than an SSD over PCIe x4 even if on the same board. Theoretically at least.

            • tipoo
            • 3 years ago

            Ooh, I stand corrected then.
            Though then again, the DGX-1 this is compared to costs $129,000, with the QuantaPlex you mention having a TBD beside it, which I take to mean it will make this card look like chump change. It's also a 3U rack-mount (!) form factor, compared to a single add-in card as in this article.

            [url<]http://www.nextplatform.com/2016/04/21/nvidias-tesla-p100-steals-machine-learning-cpu/[/url<]

          • Waco
          • 3 years ago

          Servers have been able to equip far more than that for years. I have some 3-year-old boxes that can handle 6 TB of RAM…

    • f0d
    • 3 years ago

    and you thought the titan and quadro were expensive!
    $10k, wow

      • nexxcat
      • 3 years ago

      At my old job, we estimated the GPU-accelerated components earned about $250k/day. We packed 4 high-end cards per box, and had 32 boxes, running custom models of financial instruments. So $40k for these cards, plus another, say, $10k for the rest of the rackmount server, for $50k/server. 32 of these would be $1.6 million, or ROI in less than 2 weeks. Cheap! 🙂

        • f0d
        • 3 years ago

        i can understand the need for most gpu-accelerated cards, but wouldn't it be faster and cheaper to just get a dual-socket system and a relatively cheaper workstation gpu or gpus (like the also-announced radeon pro) and jam it full of as much ram as possible for a ramdrive, than it would be for a polaris with a built-in slow ssd (compared to ram)?
        pci-e 3.0 x16 is around 16gb/s of bandwidth – much faster than any ssd

        i'm not an expert or anything on workstation gear, but wouldn't it have been better to have some dram slots (like quad ddr4 channels) instead of a slow ssd?

          • terranup16
          • 3 years ago

          I'm guessing the reason it's supposed to be better than DRAM is that it's nonvolatile storage. 3D XPoint DIMM modules would then be the equivalent answer.

          That said, this sounds like the GPU can directly access the storage card, so there may be some latency benefits even though raw bandwidth from the memory channel would be higher.

          • nexxcat
          • 3 years ago

          We were constrained by the number of PCIe slots and the number of machines we could fit in the datacenter. We were further constrained by PCIe bandwidth for transferring between RAM and the cards. Bear in mind these SSDs will likely be SLC, have the fastest controllers, and have dedicated PCIe lanes between them and the GPU.

          I’d imagine density is why they chose flash; I cannot yet get terabytes in DRAM in a single machine.

            • Andrew Lauritzen
            • 3 years ago

            Let’s be clear though – these are still *really* slow compared to even DRAM over PCIe so it’s really only in cases where you are cost-constrained (still cheaper than equivalent amount of RAM!) or your working set is too big for RAM but not too big for these SSDs *and* latency sensitive that it makes sense.

            It’s a very specific use case, and there’s still aspects of it that don’t make a lot of sense to me. i.e. are they just bypassing software/OS limitations here? There’s nothing in hardware that should prevent you from being able to fully saturate a similar SSD setup across the regular GPU PCIe bus.

            That aside, they’re not crazy priced for the target market, but it’s sort of unclear if there’s really a long term future in this kind of design or if it just fills a few niche use cases in the short term.

      • Krogoth
      • 3 years ago

      The high-end Quadros and Teslas also go for around $10K per card. The hardware cost pales in comparison to software licensing costs.

        • f0d
        • 3 years ago

        i thought the quadro (big pascal) was about $5k? and much faster than a polaris
        [url<]http://www.newegg.com/Product/Product.aspx?Item=N82E16814133586[/url<]
        i guess i just think they could have done better than just having a relatively slow ssd (compared to dram) on a relatively slow gpu (compared to the competition) that is probably just connected via a pcie switch onboard. imo this would have been fantastic if they had like 512gb of dram onboard.

          • ImSpartacus
          • 3 years ago

          Anandtech says that it supports a full TB of storage, though that might've been limited by SSD tech, not the card itself.

            • terranup16
            • 3 years ago

            For the price they are asking, I wouldn’t be surprised to find SLC VNAND powering it, which would explain capacity constraints and speed/durability.

            • BurntMyBacon
            • 3 years ago

            Afraid not:
            [quote=”anandtech”<]The SSDs themselves are a pair of 512GB Samsung 950 Pros, which are about the fastest thing available on the market today. [/quote<] [url<]http://www.anandtech.com/show/10518/amd-announces-radeon-pro-ssg-polaris-with-m2-ssds-onboard[/url<]

            • Mr Bill
            • 3 years ago

            So, this is just one Polaris 10 core? But using that same PEX bridge chip they use on the 295X2 and the Pro Duo as a link, eh? WOW.

            [quote<]In terms of hardware, the Polaris based card is outfit with a PCIe bridge chip – the same PEX8747 bridge chip used on the Radeon Pro Duo, I'm told – with the bridge connecting the two PCIe x4 M.2 slots to the GPU, and allowing both cards to share the PCIe system connection.[/quote<] Edit: Oh, dual 512GB 950 Pros in RAID 0.

      • ImSpartacus
      • 3 years ago

      It’s a “beta” developer device, so it can be expensive.
