Intel unveiled its beefiest and most specialized deep-learning hardware yet this afternoon. Speaking at the Wall Street Journal's D.Live event, CEO Brian Krzanich revealed the first fruit of Intel's Nervana acquisition: the Neural Network Processor, or NNP. The company claims this application-specific integrated circuit (ASIC) is purpose-built to handle AI workloads such as matrix multiplications and convolutions, unlike general-purpose compute hardware like CPUs and GPUs.
Although Intel didn't perform a deep dive into the chip's architecture today, Nervana founder Naveen Rao says in a separate blog post that because the operations and data movements involved in these calculations are known from the get-go, the chip can do without a standard cache hierarchy like that of today's CPUs or graphics processors. Instead, memory management for the NNP apparently occurs entirely in software, and Rao claims that approach lets the chip achieve better utilization and higher performance than its presumptive general-purpose competition.
Since the chip only has to include hardware related to computation, Nervana claims in an older blog post that it can do away with circuitry related to cache controllers and coherency logic and pack in more compute resources. At the time, Nervana also planned to deck out the chip with 32GB of HBM RAM across four stacks, and the PR-friendly image Intel shared suggests that provision has carried through to its realization of the NNP in some form.
Another potential innovation of the NNP lies in its handling of numeric data types. Intel says the NNP implements a new numeric format called Flexpoint, which "allows scalar computations to be implemented as fixed-point multiplications and additions while allowing for large dynamic range using a shared exponent." Intel claims this specialization lets the NNP pack more functional units into a given die area while reducing the power required per unit of work.
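Intel hasn't published Flexpoint's exact encoding, but the shared-exponent idea itself is straightforward to sketch: store a tensor as integer mantissas that all share a single exponent, so multiply-accumulate hardware can run on cheap fixed-point units. The snippet below is a minimal illustration of that concept in Python, not Intel's actual format; the 16-bit mantissa width and the exponent-selection rule are assumptions made for the example.

```python
import numpy as np

def to_flexpoint(values, mantissa_bits=16):
    """Encode an array as integer mantissas plus one shared exponent.

    Illustrative shared-exponent fixed-point only; Intel has not
    disclosed Flexpoint's real bit layout. The shared exponent is
    chosen so the largest magnitude fills the mantissa range.
    """
    max_mag = np.max(np.abs(values))
    if max_mag == 0:
        return np.zeros_like(values, dtype=np.int32), 0
    exponent = int(np.ceil(np.log2(max_mag))) - (mantissa_bits - 1)
    mantissas = np.round(values / 2.0**exponent).astype(np.int32)
    return mantissas, exponent

def from_flexpoint(mantissas, exponent):
    """Decode integer mantissas back to floats using the shared exponent."""
    return mantissas.astype(np.float64) * 2.0**exponent

x = np.array([0.5, -1.25, 3.0, 0.001])
m, e = to_flexpoint(x)        # all four values share one exponent
x_hat = from_flexpoint(m, e)  # close to x, within one quantization step
```

The trade-off the quote alludes to is visible here: values much smaller than the tensor's maximum (like 0.001 above) lose precision, but every arithmetic operation inside the tensor reduces to integer math, which is far cheaper in silicon than full floating point.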
Each NNP could also include dedicated on-chip interconnects for communication with other NNPs. In past descriptions of its Nervana Engine ASIC, the company planned to put "six bi-directional high-bandwidth links" on each chip to handle inter- and intra-node communication, letting clusters of NNPs elastically scale either the compute resources dedicated to a task or the size of the model being processed.
Krzanich says that Intel has "multiple generations" of Nervana NNP products in the works, and the company's press release states that these future products will help it achieve its (vague) goal of boosting deep-learning performance by 100x by 2020. In a separate podcast, Rao says that the company has working silicon in hand today, and it'll be working with Facebook to put NNPs through their paces as it matures the product. Stay tuned for more details as we get them.