Fujitsu joins the deep-learning stampede with specialized silicon

Nvidia's revenues, profits, and share price have all benefited from surging demand in both the PC gaming hardware and graphics compute markets. The company could face growing competition in deep learning from rival AMD's Vega graphics chips, Radeon Instinct accelerators, and ROCm software platform, but the AI jockeying doesn't stop there. Intel is working on its own Lake Crest chips, and Google has its Tensor Processing Units.

Fujitsu is now throwing its hat into the deep-learning ring, as well. At the International Supercomputing Conference, the Japanese server and supercomputer manufacturer announced its intention to deliver a Deep Learning Unit (DLU) AI processor before the end of its fiscal 2018 (April 2018 through March 2019). Fujitsu's Takumi Maruyama claimed the DLU chip will offer a ten-fold improvement in performance-per-watt compared to competitors' silicon, though it's not clear whether a training or inferencing workload was used to make that claim.

The DLU chip will be composed of an array of Deep Learning Processing Units (DPUs) connected using a high-performance fabric. A dedicated master core manages the collection of DPUs and the interaction between the DPUs and the on-chip memory controller. The chip has native support for FP16, FP32, INT16, and INT8 datatypes. Fujitsu says the low-precision integer datatypes can be used effectively with some deep-learning applications to reduce power consumption while maintaining acceptable accuracy. The company says the chips will have a simple pipeline to reduce hardware complexity, along with an on-chip network for DPU-to-DPU communication.
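To illustrate the idea behind those low-precision integer datatypes, here's a minimal sketch of symmetric INT8 quantization in NumPy. It's purely illustrative: Fujitsu hasn't disclosed how the DLU handles quantization, and the function names and single-scale scheme below are our own assumptions.

import numpy as np

# Hypothetical sketch of symmetric INT8 quantization; Fujitsu has not
# disclosed the DLU's actual low-precision scheme.
def quantize_int8(x):
    # Map FP32 values onto the INT8 range [-127, 127] with one shared scale.
    scale = np.max(np.abs(x)) / 127.0
    return np.round(x / scale).astype(np.int8), scale

def dequantize_int8(codes, scale):
    # Recover approximate FP32 values from the INT8 codes.
    return codes.astype(np.float32) * scale

weights = np.random.randn(4, 4).astype(np.float32)
codes, scale = quantize_int8(weights)
error = np.max(np.abs(weights - dequantize_int8(codes, scale)))
print(f"worst-case reconstruction error: {error:.4f}")

Storing and moving INT8 codes requires a quarter of the memory bandwidth of FP32 values, which is one reason lower precision can cut power draw when the accuracy hit is tolerable.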

Fujitsu says the DLU will run an all-new instruction set architecture (ISA) designed specifically for deep learning. Each DPU has 16 deep-learning processing elements (DPEs), each of which is made up of eight single-instruction, multiple-data (SIMD) execution units and a "very large" register file under full software control. The DLU will use on-package HBM2 memory, as well. The company promises that the design will be scalable using its proprietary Tofu interconnect technology.
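The disclosed figures (16 DPEs per DPU, eight SIMD execution units per DPE) permit a back-of-the-envelope peak-throughput estimate, though the remaining parameters in this sketch (DPU count, clock speed, and operations per SIMD unit per cycle) are hypothetical placeholders rather than Fujitsu numbers:

# Back-of-the-envelope peak-throughput sketch. Only DPES_PER_DPU and
# SIMD_PER_DPE come from Fujitsu's disclosure; the rest are assumptions.
DPUS_PER_CHIP = 64    # assumption: Fujitsu hasn't given a DPU count
DPES_PER_DPU = 16     # disclosed
SIMD_PER_DPE = 8      # disclosed
OPS_PER_CYCLE = 2     # assumption: one fused multiply-add per SIMD unit
CLOCK_GHZ = 1.0       # assumption

peak_gops = (DPUS_PER_CHIP * DPES_PER_DPU * SIMD_PER_DPE
             * OPS_PER_CYCLE * CLOCK_GHZ)
print(f"hypothetical peak: {peak_gops / 1000:.1f} TOPS")

Whatever the real numbers turn out to be, the arithmetic scales linearly with the size of the DPU array, which is why the master core and the chip's internal fabric carry so much of the design's weight.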

The first-generation DLU silicon will be sold as a coprocessor, similar to the way Nvidia offers its Tesla GPU compute products. Fujitsu plans to embed the DLU into a CPU starting with the second-generation products. Given the Japanese electronics manufacturer's links to the SPARC architecture, integration of DLUs into future Fujitsu SPARC chips seems most likely. The company didn't provide any estimates of a release date for those integrated chips.

Comments closed
    • ptsant
    • 2 years ago

    Competition is good.

    For what it’s worth, I recently tested the caffe framework for my biomedical research. At work we used a Titan Xp (very fast, excellent ecosystem). At home I played with my RX480. In a matter of months, even before the Vega launch, the ROCm 1.6 platform and the launch of MIOpen, AMD's convolution library, increased performance by >5x (12x in a specific smaller problem I use as a test case).

    AMD is not yet a significant player in deep learning, but if they can keep this up and sell Vega at lower prices than the Titan Xp, they might gain some market share. As for the newcomers (Intel, Fujitsu, etc.), I really can't say. Let's just say that all my friends who do deep-learning stuff use Titans. All the other contenders can only go … up from their 0% share.

    • smilingcrow
    • 2 years ago

    “The company promises that the design will be scalable using its proprietary Tofu interconnect technology.”

    Should be cheap to manufacture, then, although Nvidia have a Quorn-based interconnect due later this decade.
    Knowing Intel, they'll use mince as an interconnect; cheap bastards.

    • ronch
    • 2 years ago

    Guys, do you think this will deep six the competition?

      • derFunkenstein
      • 2 years ago

      YEAH THEY’LL BE EX-86’D

    • ronch
    • 2 years ago

    If I was the marketing guy at Fujitsu I’d name this chip [b<]DeepCheep[/b<]™ (pronounced with an Italian accent).

    • chuckula
    • 2 years ago

    [quote<]The chip has native support for FP16, FP32, INT16, and INT8 datatypes. Fujitsu says the low-precision integer datatypes can be used effectively with some deep-learning applications to reduce power consumption while maintaining acceptable accuracy. [/quote<] The NES controllers never had acceptable accuracy for me.

    • Leader952
    • 2 years ago

    [quote<]Fujitsu's Takumi Maruyama claimed the DLU chip will offer a ten-fold improvement in performance-per-watt compared to competitors' silicon[/quote<] No actual reference to the competitors' silicon, which we can infer is Nvidia's. So are they picking old silicon to compare against, like Google did? That would make the ten-fold improvement in 2019 a comparison against Nvidia's 2016 product. Nvidia will have better-performing products by 2019 too, so that 10x will be nowhere near 10x by then.

    • Voldenuit
    • 2 years ago

    Missing a picture of said chip:
    [url<]https://vignette1.wikia.nocookie.net/terminator/images/b/bf/T-800CPU.jpg/revision/latest?cb=20110717154640[/url<]
