Nvidia unveils the mighty DGX-2 Deep Learning System

At its GPU Technology Conference today (GTC), Nvidia CEO Jensen Huang made a fair number of announcements. However, none were likely as exciting to the machine-learning crowd as the announcement of the Nvidia DGX-2. As the successor to the DGX-1, the DGX-2 is likewise a pre-built computing cluster that combines a bundle of Nvidia GPUs and supporting hardware into a turn-key HPC system. This year's model doubles the number of GPUs to 16, and Nvidia says it offers a staggering 2 petaflops of compute throughput.
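For the record, that headline number is a tensor-core figure: each Tesla V100 is rated at 125 teraflops for tensor operations, and 16 × 125 TFLOPS works out to 2,000 TFLOPS, or 2 petaflops. General-purpose FP32 throughput is considerably lower.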

Naturally, the DGX-2 upgrades the GPUs in question to Volta-based Tesla V100s. These cards are the latest revision of the V100 (also introduced at GTC), which doubles the local memory to 32 GB of HBM2 on each card. As usual for the Tesla V100, the GPUs and their RAM ride on SXM-form-factor modules (SXM3, in this case) that use mezzanine connectors for their NVLink 2 interfaces instead of the usual gold-fingered expansion slots.

The 16 GPUs are connected by an interconnect fabric called NVSwitch that allows any GPU to communicate with any of its brethren at 300 GB/s. Nvidia says NVSwitch is custom silicon created specifically to enable the fabric within the DGX-2. That kind of link speed allows software running on the DGX-2 to treat the box's total 512 GB of HBM2 memory as a single pool, rather than addressing each chip's local memory separately. Nvidia says that thanks to the combination of the new hardware and software optimizations, the DGX-2 is 10 times faster than a DGX-1 system equipped with Tesla V100 cards. Despite its enormous compute prowess, the DGX-2 draws only 10 kW of power.
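To give a sense of what that single pool looks like to software, here is a minimal CUDA sketch, not taken from Nvidia's DGX stack, of the peer-to-peer setup that lets code on one GPU touch another GPU's HBM2 directly. The buffer size is illustrative, and NVSwitch itself is invisible at this level of the API:

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int n = 0;
        cudaGetDeviceCount(&n);              // reports 16 on a DGX-2

        // Enable peer-to-peer access between every pair of GPUs. The
        // NVSwitch fabric is transparent here; each pair simply sees a
        // full-bandwidth NVLink path to its peer.
        for (int dev = 0; dev < n; ++dev) {
            cudaSetDevice(dev);
            for (int peer = 0; peer < n; ++peer) {
                int ok = 0;
                if (peer != dev &&
                    cudaDeviceCanAccessPeer(&ok, dev, peer) == cudaSuccess && ok)
                    cudaDeviceEnablePeerAccess(peer, 0);
            }
        }

        // With peer access enabled, allocations on different GPUs share
        // one virtual address space, so data moves GPU-to-GPU directly
        // instead of taking a round trip through host RAM.
        const size_t bytes = 1 << 20;        // 1 MiB demo buffer
        float *a = nullptr, *b = nullptr;
        cudaSetDevice(0); cudaMalloc(&a, bytes);
        cudaSetDevice(1); cudaMalloc(&b, bytes);
        cudaMemcpyPeer(b, 1, a, 0, bytes);   // GPU 0's HBM2 -> GPU 1's HBM2
        cudaDeviceSynchronize();

        printf("peer access active across %d GPUs\n", n);
        return 0;
    }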

While the GPUs are obviously the DGX-2's raison d'être, the rest of the machine is nothing to sneer at. A pair of Xeon Platinum CPUs are served by 1.5 TB of memory, and a "PCIe switch complex" links those chips to 30 TB of NVMe storage and eight 100-Gigabit InfiniBand cards. The whole package is barely the size of a mini-fridge, and Nvidia says it weighs 350 lbs (159 kg). If you're after what Nvidia calls "the world's largest GPU," you'll apparently be able to get one in Q3 of this year for $400,000.

Comments closed
    • ronch
    • 2 years ago

    I think I have enough intelligence in my digital watch. Works fine as a timer, a stopwatch, and an alarm.

    • ronch
    • 2 years ago

    One day AI will rule the world and those who worked so hard to create it will all be its slaves. 0_o

    • techguy
    • 2 years ago

    Pffffffffft. Micro Center sells this for $379,999.

      • Ninjitsu
      • 2 years ago

      Oh god, living in Europe is messing my brain up. I read that as $379.99 :/

        • DancinJack
        • 2 years ago

        I think you meant living in America for however long messed your brain up. We do things soooooooooooooo stupidly here.

        Metric system? NO SIR I DON’T THINK SO THAT WOULD MAKE TOO MUCH SENSE.

        Celsius like the rest of the world? YOU’RE INSANE DUDE FAHRENHEIT FOR THE WIN

        smh

          • K-L-Waster
          • 2 years ago

          Don’t give us none o’ that base 10 math, there, boy… what, you thinkin’ yer better’n us?

          </sarc>

            • DancinJack
            • 2 years ago

            I honestly can’t fathom how the early deciders said, omg, we HAVE to do things differently than everyone else. It won’t create any problems or confusion, ever. And if so? who cares, ‘MURICCCCAAAA!

            It is utterly disgusting and flabbergasting to me that we (the US) use Fahrenheit, Imperial etc etc.

            https://www.zmescience.com/other/map-of-countries-officially-not-using-the-metric-system/

            • JustAnEngineer
            • 2 years ago

            The U.S. was well on its way to converting to metric in the mid-1970s, but General Motors lobbied to put a stop to it.

      • the
      • 2 years ago

      Too bad you have to have one locally to get it at that price.

      • derFunkenstein
      • 2 years ago

      Dammit. In-store pickup only.

        • K-L-Waster
        • 2 years ago

        Apparently fork-lift rental is extra too.

    • Vigil80
    • 2 years ago

    So what’s it for?

    And if I had way more money than sense, how hard would it be to use this as my actual gaming GPU? Guessing GeForce Experience won’t recognize it.

      • chuckula
      • 2 years ago

      As for what it’s used for: If you have to ask you aren’t a customer, but think things like HPC, complex simulations, large-scale machine learning with the Tensor cores, etc.

      If you really really want to drop a large amount of money on a card that at least is verified to be capable of playing games, then get the Titan V. This system includes 16 GPUs (the GV100) that are actually similar to the Titan V, except each has 32GB of HBM2 instead of the Titan V’s “only” 12GB, plus NVLink; both parts have 80 of the full GV100 die’s 84 SMs enabled.

      And the Titan V is a dirt-cheap [relatively] $3,000 while each GV100 is $9,000.

        • Vigil80
        • 2 years ago

        Of course it’s an academic interest. I’m not on the market for an Apache helicopter either, but I still wonder what it’s like to pilot one.

      • Neutronbeam
      • 2 years ago

      Really? It plays Crysis, man, with ALL the eye candy turned up to ultra!

        • K-L-Waster
        • 2 years ago

        At what res?

        • nico1982
        • 2 years ago

        It actually plays Crysis rather well on recruit, not so much on hard – let alone delta – but it will become better 😛

          • TheRazorsEdge
          • 2 years ago

          This isn’t built for an AI to play Crysis. It’s built to run Crysis.

          I heard it gets 60 FPS at 1080p. Not with maxed settings though—that will come with the DGX-3.

    • odizzido
    • 2 years ago

    Haven’t read the article…just want to say how customer unfriendly nvidia is.

      • Leader952
      • 2 years ago

        • odizzido
        • 2 years ago

        what? What does not supporting variable sync and having a bunch of proprietary crap that does nothing but hurt gaming have to do with Trump and myself? I don’t even live in the US.

      • K-L-Waster
      • 2 years ago

      Pro tip: comments are usually better received if they are either insightful or funny.

        • odizzido
        • 2 years ago

        eh, I am not too worried about that. Downvote away. I think nvidia’s practices are pretty well known?

      • DancinJack
      • 2 years ago

      I’m VERY friendly with my GTX 1080 and 1440p Gsync monitor. Very.

        • derFunkenstein
        • 2 years ago

        yikes

        • chuckula
        • 2 years ago

        TMI MAN!
        T.M.I.

        • Redocbew
        • 2 years ago

        Dude, pick one and stick with it. Don’t be a jackass.

        • Mr Bill
        • 2 years ago

        Is there an adapter for that?

          • derFunkenstein
          • 2 years ago

          you should see his dongle.

            • Mr Bill
            • 2 years ago

            Wooo! Socket to em!

            Does it include copyright protection?

            • derFunkenstein
            • 2 years ago

            Yes, it has Denuvo, so that way you KNOW you’re getting screwed.

        • Srsly_Bro
        • 2 years ago

        1080 Ti and 1440p Gsync. #getrekt

    • chuckula
    • 2 years ago

    > A pair of Xeon Platinum CPUs are served by 1.5 TB of memory, and a "PCIe switch complex" links those chips to 30 TB of NVMe storage and eight 100-Gigabit InfiniBand cards. The whole package is barely the size of a mini-fridge, and Nvidia says it weighs 350 lbs (159 kg). If you're after what Nvidia calls "the world's largest GPU," you'll apparently be able to get one in Q3 of this year for $400,000.

    Meh, if they had just used Epyc the system would only go for $19.95 + S&H.

      • tay
      • 2 years ago

      But nGreedia so not surprising… (am I doing this right? Teach me master!)

        • chuckula
        • 2 years ago

        Obi Wan has trained you well.

        • jihadjoe
        • 2 years ago

        If only those damned AI researchers and ray tracers would learn to optimize their code this could run on a GTS450.

      • anubis44
      • 2 years ago

      But most importantly, an Epyc system would spit out the correct answers. 😉
