TSUBAME3.0 gears up for AI supercomputing with 2160 Tesla P100s

Nvidia just announced a new partnership with the Tokyo Institute of Technology to create what it calls Japan's fastest AI supercomputer. The machine will be known as TSUBAME3.0, and predictably, it will be the third major iteration of the TSUBAME cluster design. The 3.0 version will pair Broadwell-EP Xeons with Nvidia Tesla P100 accelerators to achieve an expected 12.2 PFLOPS of double-precision throughput. Nvidia says the new cluster will operate alongside the existing TSUBAME2.5 machine (which uses over four thousand Tesla K20X cards) to crunch up to 64.3 PFLOPS for AI work. Note that the 64.3-PFLOPS figure counts half-precision "AI computation," not the double-precision math behind the 12.2-PFLOPS number.
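
As a sanity check on those figures, here's a minimal back-of-the-envelope sketch using Nvidia's published peak numbers for the SXM2 Tesla P100 (5.3 TFLOPS at FP64, 21.2 TFLOPS at FP16). Nvidia's 47-PFLOPS TSUBAME3.0-only AI figure comes up in the comments below; the TSUBAME2.5 share at the end is inferred by subtraction, not an official spec:

    # Back-of-the-envelope check on the claimed throughput figures.
    P100_FP64_TFLOPS = 5.3   # SXM2 Tesla P100 peak, double precision
    P100_FP16_TFLOPS = 21.2  # SXM2 Tesla P100 peak, half precision ("AI computation")

    gpus = 540 * 4           # 540 blades x 4 P100s = 2,160 GPUs

    print(f"FP64 from GPUs alone: {gpus * P100_FP64_TFLOPS / 1000:.1f} PFLOPS")  # ~11.4; the Xeons presumably cover the gap to 12.2
    print(f"FP16 from GPUs alone: {gpus * P100_FP16_TFLOPS / 1000:.1f} PFLOPS")  # ~45.8, near Nvidia's 47-PFLOPS claim
    print(f"Implied TSUBAME2.5 AI share: {64.3 - 47.0:.1f} PFLOPS")              # 17.3, by subtraction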

TSUBAME is actually an acronym. According to the project's website, it stands for "Tokyo-tech Supercomputer and UBiquitously Accessible Mass-storage Environment." It's also the Japanese word for the swallow. The Next Platform reports that TSUBAME3.0 will use 540 blades designed by Hewlett Packard Enterprise, each equipped with four Tesla P100 accelerators and two Xeon E5-2680 v4 chips. Each node will reportedly have 1.08 PB of storage and 256GB of main memory, plus the 64GB of HBM2 spread across its four Tesla cards.
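
To put those per-node numbers in cluster-wide terms, here's a quick tally. The 14-core count for the Xeon E5-2680 v4 is from Intel's spec sheet; the rest are The Next Platform's reported figures, including the per-node storage number that commenters below question:

    # Cluster-wide totals from the reported per-node specs.
    NODES = 540
    GPUS_PER_NODE = 4
    XEONS_PER_NODE = 2
    CORES_PER_XEON = 14  # Xeon E5-2680 v4 (Broadwell-EP)

    print("Tesla P100s:", NODES * GPUS_PER_NODE)                  # 2,160
    print("CPU cores:", NODES * XEONS_PER_NODE * CORES_PER_XEON)  # 15,120
    print("Main memory:", NODES * 256 / 1024, "TB")               # 135.0 TB
    print("HBM2:", NODES * 64 / 1024, "TB")                       # 33.75 TB
    print("Storage:", NODES * 1.08, "PB")                         # 583.2 PB, if the reported figure holds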

If it reaches its performance targets, TSUBAME3.0 will end up in the top 10 of the Top500 list. The existing TSUBAME2.5 machine sits in 40th place in the current ranking. Way back in 2008, the original TSUBAME machine was one of the first to combine x86 CPUs with Nvidia Tesla compute accelerators for massive number-crunching throughput. At the time, it ranked 29th on the Top500 list with floating-point throughput of 77.48 TFLOPS. Times sure have changed.

Comments closed
    • POLAR
    • 3 years ago

    Tokyo Institute of Technology, you idiot, Ryzen is almost here! You should have waited.

    • SuperSpy
    • 3 years ago

    Here I am trying to figure out how someone is going to make a supercomputer out of 2160 electric cars.

    Man, either I need more or less caffeine, I’m not sure which.

      • Neutronbeam
      • 3 years ago

      Do you know the mpg in petabytes? Well, do ya, punk?

      • Terra_Nocuus
      • 3 years ago

      It’ll have station wagons of bandwidth!

    • Rza79
    • 3 years ago

    A Japanese supercomputer designed in the USA … How times have changed.

      • Gasaraki
      • 3 years ago

      Not really surprising. Most of the supercomputers in the world use Intel, AMD, or IBM processors. nVidia is a Taiwanese company.

        • Rza79
        • 3 years ago

        Not surprising? How old are you, if I may ask? 🙂

        https://en.wikipedia.org/wiki/Earth_Simulator
        https://en.wikipedia.org/wiki/Nvidia (nVidia Taiwanese?)

        Japanese semiconductor companies were on top till the ’90s: http://www.icinsights.com/files/images/bulletin20111212fig.gif

    • Legend
    • 3 years ago

    There should be an official, semantically bound gradient rating system for compute power. Like, at some threshold computing becomes super-computing, then follows hyper-computing to quantum-computing, universal-computing to meta-computing, etc.

    You know, just in case we arrive at the finding that we need it : )

    • Anomymous Gerbil
    • 3 years ago

    OK, what am I missing here? Be gentle!

    “The 3.0 version will … achieve an expected 12.2 PFLOPS of double-precision… the new cluster will operate alongside the existing TSUBAME2.5 machine to crunch up to 64.3 PFLOPS”.

    Are those two numbers mixing different levels of precision? On the surface, it implies that T2.5 can perform 52.1 PFLOPS, more than four times T3’s figure, which doesn’t make sense given that T3 will sit higher than T2.5 on the Top500 list.

      • Anomymous Gerbil
      • 3 years ago

      Ah, when confused by a TR post, check another site:

      “The supercomputer will be able to deliver up to 47 petaflops of what Nvidia calls ‘AI computation.’ What it really means is half precision computation, which is one of the precision sweet points for AI computation (Nvidia has made 8-bit precision GPUs as well, targeted mainly at systems that need only inference computation). Nvidia also said that when working together with the Tsubame2.5, the two supercomputers can deliver up to 64.3 petaflops, making the combined system Japan’s highest performing AI supercomputer.”

      This is at http://www.tomshardware.com/news/nvidia-japan-fastest-ai-supercomputer,33683.html
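
      To spell out the reconciliation in that quote, a minimal sketch (the TSUBAME2.5 share is inferred by subtraction; it isn't an official number):

          # Why 64.3 - 12.2 = 52.1 compares apples to oranges: different precisions.
          t3_fp64 = 12.2   # TSUBAME3.0, double precision (PFLOPS)
          t3_fp16 = 47.0   # TSUBAME3.0, half precision / "AI computation" (PFLOPS)
          combined = 64.3  # TSUBAME3.0 + TSUBAME2.5, half precision (PFLOPS)

          print(combined - t3_fp16)  # 17.3 -- TSUBAME2.5's implied half-precision share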

    • ronch
    • 3 years ago

    Ryze, Autobot TSUBAME!! Release the power of the [s]Matrix[/s] Matrox!!

      • I.S.T.
      • 3 years ago

      Of all the things I did not expect today, a reference to The Transformers: The Movie is the second highest on the list.

    • wizardz
    • 3 years ago

    1.08PB of nonvolatile storage per node? according to the NextPlatform page, each node is less than 1U. i sincerely wonder how they did that..

      • chuckula
      • 3 years ago

      That could easily be in an external SAN. It doesn’t necessarily mean the storage is placed directly in each server.

        • wizardz
        • 3 years ago

        i don’t know why, but i never thought of that. i was focused on the PB per node and i guess i assumed a PB *in* each node.

        still, 540 nodes with a peta each. that’s a cr**ton of storage.

          • chuckula
          • 3 years ago

          Yeah, over half an Exabyte of storage is big even by HPC standards.

            • Waco
            • 3 years ago

            Gargantuan. The biggest PFS built to date (publicly) is under 100 PiB.

            • chuckula
            • 3 years ago

            Or they got the units wrong.

            • Waco
            • 3 years ago

            Which is why I doubt the numbers. 1 TB per node is far more reasonable unless we want to believe they spent $2-4 billion on the flash alone…

            • chuckula
            • 3 years ago

            I think you are on to something.

      • Waco
      • 3 years ago

      I have to wonder if that number is correct. Over half an exabyte of NVRAM is ludicrously expensive…

        • chuckula
        • 3 years ago

        If the number is right, then I’m pretty sure most of that is conventional hard drives with some solid-state caching.

          • Waco
          • 3 years ago

          Not in-node, and they already call out the parallel file system separately at 16 PiB.

          I’m thinking it’s 1 TB per node.
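
    For anyone following along, the arithmetic behind this sub-thread; the 1 TB-per-node alternative is Waco's guess above, not a published spec:

        # Total storage under the two readings debated above.
        NODES = 540
        print(NODES * 1.08, "PB")  # 583.2 PB as reported -- over half an exabyte
        print(NODES * 1, "TB")     # 540 TB if it's really 1 TB per node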

    • 223 Fan
    • 3 years ago

    Tokyo Institute of Technology should have called the computer the 40DD.

      • ronch
      • 3 years ago

      Tokyo Institute of Technology – or TIT for short.

        • 223 Fan
        • 3 years ago

        I can’t wait for the Sam Houston Institute of Technology supercomputer announcement.

          • krazyredboy
          • 3 years ago

          I hear their Bovine Scatology program is a blast!

        • 223 Fan
        • 3 years ago

        The follow on supercomputer has the code name Project Morgana.

      • Neutronbeam
      • 3 years ago

      Eh, that’s a stretch.

    • Redocbew
    • 3 years ago

    Someone needs to ask it: what is the airspeed velocity of an unladen swallow?

      • morphine
      • 3 years ago

      Is it a European swallow or an African swallow?

        • CScottG
        • 3 years ago

        ..bye, bye, Redocbew.

        (..the Bridge of Death is a heartless b!tch.)

        • Erebos
        • 3 years ago

        *Supercomputer crashes*

          • K-L-Waster
          • 3 years ago

          You have to know these things when you’re a King.

        • Gasaraki
        • 3 years ago

        The African swallow is bigger so it might be slower…

      • caconym
      • 3 years ago

      12.2 petaflaps

        • caconym
        • 3 years ago

        alternately: an unladen swallow is full-float, a laden one is half-float.

        (I’m sorry)

    • evilpaul
    • 3 years ago

    Harambe!!!

      • Prestige Worldwide
      • 3 years ago

      Never forget.

    • lilbuddhaman
    • 3 years ago

    inb4 crysis, bitches.

      • albundy
      • 3 years ago

      but will it notepad?

        • krazyredboy
        • 3 years ago

          YES! With THAT kind of power, I can finally escape the clutches of the Abominable Snowman in SkiFree!

          • ozzuneoj
          • 3 years ago

          Yeah, I’ve heard that the game is programmed in such a way that you can out-run him when you hit 88kfps.

          • NTMBK
          • 3 years ago

          You know, you can press “F” to go faster than the monster and escape.

            • krazyredboy
            • 3 years ago

            Yeah, but you can only “F” for so long…
