Nvidia Tesla T4 brings Turing smarts to AI inferencing

At his GTC Japan keynote, Nvidia CEO Jensen Huang noted that AI inferencing, or putting trained neural-network models to work, is set to become a $20-billion market over the next five years. More and more applications are going to demand services like natural language processing, translation, image and video search, and AI-driven recommendations, according to Nvidia. To power that future, the company is putting the Turing architecture in data centers by way of the Tesla T4 inferencing card and letting models run on those cards with the TensorRT Hyperscale Platform.

The Tesla T4 accelerator

The Tesla T4 has 320 Turing tensor cores and 2560 CUDA cores that are good for 8.1 TFLOPS of single-precision FP32 math, 65 TFLOPS of half-precision FP16, 130 TOPS of INT8, and 260 TOPS of INT4 throughput. The Tesla P40 inference accelerator, by comparison, can't perform accelerated half-precision FP16 operations and tops out at 47 TOPS for INT8.
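
For a rough sense of how those figures fit together, here's a back-of-envelope sketch in Python. The boost clock used below is an assumption (Nvidia didn't quote one in this announcement), and the tidy doubling from FP16 to INT8 to INT4 simply reflects the tensor cores packing twice as many operations per cycle each time the data width is halved.

```python
# Back-of-envelope peak-throughput estimate for the Tesla T4.
# Assumption: a ~1.59-GHz boost clock, which is not stated in the
# announcement; sustained clocks will depend on the 75 W power cap.
cuda_cores = 2560
boost_clock_ghz = 1.59

# FP32: each CUDA core can retire one fused multiply-add (2 FLOPs) per clock.
fp32_tflops = cuda_cores * 2 * boost_clock_ghz / 1000
print(f"FP32: ~{fp32_tflops:.1f} TFLOPS")        # ~8.1 TFLOPS

# Tensor-core throughput doubles each time precision is halved.
fp16_tflops = 65                                 # Nvidia's quoted FP16 figure
print(f"INT8: ~{fp16_tflops * 2} TOPS")          # 130 TOPS
print(f"INT4: ~{fp16_tflops * 4} TOPS")          # 260 TOPS
```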

Nvidia also claims the T4 offers "twice the decoding performance" of past products for video streams, and that it can handle up to 38 simultaneous 1920×1080 video streams for applications like video search, among many others. The T4 delivers that impressive performance in a passively-cooled form factor with a mere 75 W power envelope, compared to 250 W for the Tesla P40.

To allow developers to take advantage of servers and data centers stuffed with T4 accelerators, Nvidia also announced the TensorRT 5 inference optimizer and runtime. The company claims TensorRT 5 can optimize trained neural networks from practically every deep-learning framework for Tesla accelerators, and that it helps developers tune their models for reduced-precision calculations with the INT8 and FP16 data types.
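
As a concrete illustration, here's a minimal sketch of what enabling reduced precision looks like with the TensorRT Python API of the TensorRT 5 era. The ONNX model path is hypothetical, INT8 mode additionally needs a calibrator that's omitted here, and API details have shifted in later releases, so treat this as the shape of the workflow rather than a drop-in recipe.

```python
# Hedged sketch: building a reduced-precision TensorRT engine from a
# trained network. "model.onnx" is a hypothetical exported model.
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network()
parser = trt.OnnxParser(network, TRT_LOGGER)

with open("model.onnx", "rb") as f:
    parser.parse(f.read())                # import the trained model

builder.max_batch_size = 32               # tune for the serving workload
builder.max_workspace_size = 1 << 30      # 1 GiB of scratch space
builder.fp16_mode = True                  # run on Turing tensor cores at FP16
# builder.int8_mode = True                # INT8 also requires calibration data

engine = builder.build_cuda_engine(network)  # optimized engine for the T4
```

The resulting engine can then be serialized and handed off to a serving layer such as the TensorRT inference server described below.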

Nvidia will also provide a TensorRT inference server application as a containerized microservice to deploy deep-learning models to production systems. The company claims the TensorRT inference server keeps Tesla accelerators fed and ensures peak throughput from Nvidia hardware. The inference server application is ready for use with Kubernetes and Docker, and Nvidia will make it available through its GPU Cloud resource.

Comments closed
    • BurntMyBacon
    • 4 years ago

    Only in hindsight or by proxy.

    • K-L-Waster
    • 4 years ago

    Is there a chance? Anything’s possible.

    Will they? Hmm, let me consult the Magic 8 Ball…

    • K-L-Waster
    • 4 years ago

    … and secretly performed plastic surgery on Rory Read so he could body double for Brian Krzanich, then kidnapped Krzanich and hid him in a tower in an iron mask!

    • ronch
    • 4 years ago

    Plot twist: afraid that Intel will be tough competition in the discrete graphics industry, Nvidia secretly approached Raja Koduri and hired him to work for Intel to disrupt their product rollout.

    • ronch
    • 4 years ago

    Radeon sure is falling farther and farther behind. Any chance they’ll catch up?

    • ronch
    • 4 years ago

    If you are Krogoth’s alter ego, would you as Chucky know of it?

    • oldog
    • 4 years ago

    Bah! You are describing a “dissociative identity disorder” or “split personality” (very rare indeed if it exists at all). This is NOT schizophrenia or bipolar disorder!

    This matters; these psychiatric terms are misused constantly. It would be like confusing a CPU and an SSD.

    (Climbs off high horse.)

    • techguy
    • 4 years ago

    hmm, wonder what the pricing will be on these… I was going to put a low-end Pascal Quadro in my media server to handle transcoding duties but something like this could be a “forever” upgrade.

    • Voldenuit
    • 4 years ago

    Chuckuloth is bipolar: obviously Chuckula is the manic one and Krogoth is the depressive one.

    • K-L-Waster
    • 4 years ago

    It’s a trick question — AMD can’t ship that many Vega chips at a time….

    • LocalCitizen
    • 4 years ago

    turings are supposed to be volta cores with rt cores and some improvements.
    titan v has 5120 cuda cores + 640 tensors with 12gb of 652 gb/s mem for $3000
    2080ti has 4352 volta cuda cores + 544 tensors (15% less) with 11gb of mem at 616 gb/s for $1200

    turings are bargains

    so, bad news to gamers, instead of crypto miners, now you have ai researchers buying all your gpus

    • JosiahBradley
    • 4 years ago

    Mind = Blown. The universe makes sense now.

    • chuckula
    • 4 years ago

    Hacked?

    How are you so sure that I’m not actually Krogoth’s alter ego?

    • DeadOfKnight
    • 4 years ago

    I want to see how Titan V compares to Turing. Obviously the value isn’t there for a $3000 card, but on the second hand market it could end up being a better value for legacy applications.

    Edit: Yes, I know, this is about Tesla, but it’s not COMPLETELY off topic.

    • JosiahBradley
    • 4 years ago

    Hey Chucky hacked Krogoth’s account!

    • Krogoth
    • 4 years ago

    Here is the real Turing test!

    How many Vega chips does it take to do the job of a single Turing chip?

    #PoorNavi
    #PoorVega
    #PoorPolaris
