Nvidia Tesla T4 brings Turing smarts to AI inferencing

At his GTC Japan keynote, Nvidia CEO Jensen Huang noted that AI inferencing—or the use of trained neural network models—is set to become a $20-billion market over the next five years. More and more applications are going to demand services like natural language processing, translation, image and video search, and AI-driven recommendations, according to Nvidia. To power that future, the company is putting the Turing architecture in data centers using the Tesla T4 inferencing card and letting models run on those cards with the TensorRT Hyperscale Platform.

The Tesla T4 accelerator

The Tesla T4 packs 320 Turing tensor cores and 2560 CUDA cores that are good for 8.1 TFLOPS of single-precision FP32 math, 65 TFLOPS of half-precision FP16, 130 TOPS of INT8, and 260 TOPS of INT4 throughput. The Tesla P40 inference accelerator, by comparison, offers no accelerated half-precision FP16 operations and tops out at 47 TOPS of INT8.
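
For a sense of how those numbers relate, here's a quick back-of-the-envelope check (plain Python, using only the figures quoted above) showing that each step down in precision roughly doubles the T4's peak tensor-core throughput:

```python
# Peak throughput figures Nvidia quotes for the Tesla T4, by precision.
fp32_tflops = 8.1   # single-precision, CUDA cores
fp16_tflops = 65    # half-precision, tensor cores
int8_tops   = 130   # 8-bit integer, tensor cores
int4_tops   = 260   # 4-bit integer, tensor cores

# Each drop in precision roughly doubles peak throughput on the tensor cores.
print(f"FP16 vs. FP32: {fp16_tflops / fp32_tflops:.1f}x")  # ~8.0x
print(f"INT8 vs. FP16: {int8_tops / fp16_tflops:.1f}x")    # 2.0x
print(f"INT4 vs. INT8: {int4_tops / int8_tops:.1f}x")      # 2.0x
```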

Nvidia also claims the T4 has "twice the decoding performance" of past products for video streams, and that it can handle up to 38 1920×1080 streams for applications like video search. The T4 delivers that impressive performance from a passively-cooled card that draws just 75 W, compared to 250 W for the Tesla P40.

To let developers take advantage of servers and data centers stuffed with T4 accelerators, Nvidia also announced the TensorRT 5 inference optimizer and runtime. The company says TensorRT can optimize trained neural networks from practically every major deep-learning framework to run on Tesla accelerators, and that it helps developers tune their models for reduced-precision data types like INT8 and FP16.
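
As a rough illustration of that workflow, here's a minimal sketch using TensorRT's Python API to build a reduced-precision engine from a trained model. The ONNX file name is a placeholder, and the builder flags reflect the TensorRT 5-era API, so treat the details as assumptions rather than Nvidia's official recipe:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Parse a trained network (exported to ONNX here; "model.onnx" is a placeholder)
# and let TensorRT build an engine that uses FP16 tensor-core math.
with trt.Builder(TRT_LOGGER) as builder, \
     builder.create_network() as network, \
     trt.OnnxParser(network, TRT_LOGGER) as parser:

    with open("model.onnx", "rb") as model_file:
        parser.parse(model_file.read())

    builder.max_batch_size = 8            # size batches for the serving workload
    builder.max_workspace_size = 1 << 30  # scratch space TensorRT may use for kernel tactics
    builder.fp16_mode = True              # allow reduced-precision FP16 kernels

    engine = builder.build_cuda_engine(network)  # optimized engine for the target GPU
```

From there, the resulting engine can be serialized and deployed alongside the runtime for low-latency serving.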

Nvidia will also provide a TensorRT inference server application as a containerized microservice to deploy deep-learning models to production systems. The company claims the TensorRT inference server keeps Tesla accelerators fed and ensures peak throughput from Nvidia hardware. The inference server application is ready for use with Kubernetes and Docker, and Nvidia will make it available through its GPU Cloud resource.
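
In practice, a deployment would stand up the containerized server and point clients at its HTTP (or gRPC) endpoints. The sketch below assumes a server already running on localhost:8000; the port and endpoint paths follow the inference server's v1 HTTP API rather than anything stated here, so treat them as assumptions:

```python
import requests

# Hypothetical deployment: the TensorRT inference server container is already
# running and its HTTP port is mapped to localhost:8000.
BASE_URL = "http://localhost:8000"

# Readiness check -- the sort of probe a Kubernetes deployment would use.
ready = requests.get(f"{BASE_URL}/api/health/ready")
print("Server ready:", ready.status_code == 200)

# The status endpoint reports which models are loaded and available for inference.
status = requests.get(f"{BASE_URL}/api/status")
print(status.text[:500])
```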

Comments closed
    • ronch
    • 1 year ago

    Plot twist: afraid that Intel will be tough competition in the discrete graphics industry, Nvidia secretly approached Raja Koduri and hired him to work for Intel to disrupt their product rollout.

      • K-L-Waster
      • 1 year ago

      … and secretly performed plastic surgery on Rory Read so he could body double for Brian Krzanich, then kidnapped Krzanich and hid him in a tower in an iron mask!

    • ronch
    • 1 year ago

    Radeon sure is falling farther and farther behind. Any chance they’ll catch up?

      • K-L-Waster
      • 1 year ago

      Is there a chance? Anything’s possible.

      Will they? Hmm, let me consult the Magic 8 Ball…

    • techguy
    • 1 year ago

    hmm, wonder what the pricing will be on these… I was going to put a low-end Pascal Quadro in my media server to handle transcoding duties but something like this could be a “forever” upgrade.

    • DeadOfKnight
    • 1 year ago

    I want to see how Titan V compares to Turing. Obviously the value isn’t there for a $3000 card, but on the second hand market it could end up being a better value for legacy applications.

    Edit: Yes, I know, this is about Tesla, but it’s not COMPLETELY off topic.

      • LocalCitizen
      • 1 year ago

      turings are supposed to be volta cores with rt cores and some improvements.
      titan v has 5120 cuda cores + 640 tensors with 12gb of 652 gb/s mem for $3000
      2080ti has 4352 volta cuda cores + 544 tensors (15% less) with 11gb of mem at 616 gb/s for $1200

      turings are bargains

      so, bad news to gamers, instead of crypto miners, now you have ai researchers buying all your gpus

    • Krogoth
    • 1 year ago

    Here is the real Turing test!

    How many Vega chips does it take to do the job of a single Turing chip?

    #PoorNavi
    #PoorVega
    #PoorPolaris

      • JosiahBradley
      • 1 year ago

      Hey Chucky hacked Krogoth’s account!

        • chuckula
        • 1 year ago

        Hacked?

        How are you so sure that I’m not actually Krogoth’s alter ego?

          • JosiahBradley
          • 1 year ago

          Mind = Blown. The universe makes sense now.

          • Voldenuit
          • 1 year ago

          Chuckuloth is bipolar: obviously Chuckula is the manic one and Krogoth is the depressive one.

            • oldog
            • 1 year ago

            Bah! You are describing a “dissociative identity disorder” or “split personality” (very rare indeed if it exists at all). This is NOT schizophrenia or bipolar disorder!

            This matters; these psychiatric terms are misused constantly. It would be like confusing a CPU and an SSD.

            (Climbs off high horse.)

          • ronch
          • 1 year ago

          If you are Krogoth’s alter ego, would you as Chucky know of it?

            • BurntMyBacon
            • 1 year ago

            Only in hindsight or by proxy.

      • K-L-Waster
      • 1 year ago

      It’s a trick question — AMD can’t ship that many Vega chips at a time….
