Alongside a preview of its first 7-nm Epyc CPUs built with the Zen 2 microarchitecture, AMD debuted its first 7-nm data-center graphics-processing units today. The Radeon Instinct MI50 and Radeon Instinct MI60 take advantage of a new 7-nm GPU built with the Vega architecture to crunch through tomorrow's high-performance computing, deep learning, cloud computing, and virtualized desktop applications.
As we noted with AMD's next-generation Epyc CPUs, TSMC's 7-nm process provides the red team's chip designers with a 2x density improvement versus GlobalFoundries' 14-nm FinFET process. The resulting silicon can be tuned for 50% lower power at the same performance or for 1.25x the performance in the same power envelope. In the case of the Vega chip that powers the MI50 and MI60, that process change allowed AMD to cram a marketing-approved figure of 13.2 billion transistors into a 331-mm² die, up from 12.5 billion transistors in 471 mm² on the 14-nm Vega 10.
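For a sense of how much of that 2x process headline shows up in the finished chip, here is a quick back-of-the-envelope sketch using only the transistor counts and die areas quoted above. The dictionary and function names are just illustrative.

```python
# Transistor counts and die areas as quoted in the article.
VEGA_7NM = {"transistors": 13.2e9, "area_mm2": 331.0}
VEGA_10 = {"transistors": 12.5e9, "area_mm2": 471.0}


def density_mtr_per_mm2(chip):
    """Return transistor density in millions of transistors per mm^2."""
    return chip["transistors"] / chip["area_mm2"] / 1e6


new_density = density_mtr_per_mm2(VEGA_7NM)
old_density = density_mtr_per_mm2(VEGA_10)
density_gain = new_density / old_density

print(f"7-nm Vega: {new_density:.1f} MTr/mm^2")
print(f"Vega 10:   {old_density:.1f} MTr/mm^2")
print(f"Realized density gain: {density_gain:.2f}x")
```

By these figures, the realized gain works out to roughly 1.5x. The 2x number describes what the process can offer, not necessarily what any given die achieves.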
AMD didn't call this chip by an internal codename, but it's clearly a refined and tuned version of the Vega architecture we know from the gaming space. Vega DC (as I'll call it for convenience) unlocks a variety of data-processing capabilities to suit a wide range of compute demands. For those who need the highest possible precision, Vega DC can perform double-precision floating-point math at half the rate of single-precision data types, for as much as 7.4 TFLOPS. Single-precision math proceeds at a rate of 14.7 TFLOPS. The fully fledged version of this chip inside the Radeon Instinct MI60 crunches through half-precision floating-point math at 29.5 TFLOPS, INT8 math at 59 TOPS, and INT4 math at 118 TOPS.
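Those throughput figures all follow from one peak rate scaled by data-type width. The sketch below reproduces them under two assumptions that the article does not state: that the MI60 has 4096 stream processors and that its peak engine clock is 1.8 GHz. The rate multipliers are the ones implied by the figures quoted above.

```python
# ASSUMPTIONS (not stated in the article): 4096 stream processors and a
# 1.8 GHz peak engine clock for the Radeon Instinct MI60.
STREAM_PROCESSORS = 4096
PEAK_CLOCK_GHZ = 1.8
OPS_PER_FMA = 2  # a fused multiply-add counts as two operations

# Throughput relative to FP32, as implied by the quoted figures.
RATE_VS_FP32 = {"FP64": 0.5, "FP32": 1, "FP16": 2, "INT8": 4, "INT4": 8}


def peak_tera_ops(data_type):
    """Peak rate in TFLOPS (or TOPS for the integer types)."""
    fp32 = STREAM_PROCESSORS * OPS_PER_FMA * PEAK_CLOCK_GHZ / 1000
    return fp32 * RATE_VS_FP32[data_type]


for name in RATE_VS_FP32:
    print(f"{name}: {peak_tera_ops(name):.1f}")
```

Under those assumptions, the rounded results line up with the 7.4, 14.7, 29.5, 59, and 118 figures in the text.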
Compared to the PCIe version of Nvidia's Tesla V100 accelerator, the Radeon Instinct MI60 seems to stack up favorably. The green team specs the V100 for 7 TFLOPS of FP64, 14 TFLOPS of FP32, 28 TFLOPS of FP16, 56 TOPS of INT8, and 112 TFLOPS on FP16 input data with FP32 accumulation by way of the Volta architecture's tensor cores. While the two architectures are not entirely cross-comparable in their capabilities, the relatively small die and high throughput of the Radeon Instinct MI60 still impress by this measure.
To support that blistering number-crunching capability, AMD hooks Vega DC up to 32 GB of HBM2 RAM spread over four stacks of memory. With a 1024-bit-wide interface per stack, Vega DC can claim as much as 1 TB/s of memory bandwidth. While Tesla V100 boasts a similarly wide bus, its HBM2 memory runs at a slightly slower speed, resulting in bandwidth of 900 GB/s. AMD also claims end-to-end ECC support with Vega DC for data integrity.
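To see what "slightly slower" means in practice, the sketch below back-solves the per-pin data rate each card would need to reach its quoted bandwidth. It assumes that 1 TB/s means 1000 GB/s and that the V100's "similarly wide" bus is the same 4096 bits. The function names are illustrative.

```python
BITS_PER_BYTE = 8


def bus_width_bits(stacks, bits_per_stack=1024):
    """Total memory bus width across all HBM2 stacks."""
    return stacks * bits_per_stack


def per_pin_gbps(bandwidth_gb_s, width_bits):
    """Per-pin data rate (Gb/s) needed to reach a bandwidth given in GB/s."""
    return bandwidth_gb_s * BITS_PER_BYTE / width_bits


width = bus_width_bits(stacks=4)
# ASSUMPTION: "1 TB/s" is treated as 1000 GB/s.
vega_rate = per_pin_gbps(1000, width)
# ASSUMPTION: the V100's bus is taken to be the same 4096 bits wide.
v100_rate = per_pin_gbps(900, width)

print(f"Bus width: {width} bits")
print(f"Vega DC:   {vega_rate:.2f} Gb/s per pin")
print(f"V100:      {v100_rate:.2f} Gb/s per pin")
```

On those assumptions, the gap comes down to roughly 1.95 Gb/s versus 1.76 Gb/s per pin over the same bus width.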
The bleeding-edge technology doesn't stop there, either. AMD has implemented PCI Express 4.0 links on Vega DC for a 31.5 GB/s path to the CPU and main memory in each direction, or up to 64 GB/s of bidirectional transfer. On top of that, AMD builds Infinity Fabric edge connectors onto every Radeon Instinct MI50 and MI60 card that allow for 200 GB/s of total bidirectional bandwidth for coherent GPU-to-GPU communication. These Infinity Fabric links form a ring topology across as many as four Radeon Instinct accelerators.
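The PCI Express figures can be sanity-checked from the link parameters. PCIe 4.0 signals at 16 GT/s per lane with 128b/130b encoding. The x16 link width below is an assumption, since the article doesn't state it, and the function name is illustrative.

```python
GT_PER_S_PER_LANE = 16.0  # PCIe 4.0 signaling rate per lane
LANES = 16                # ASSUMPTION: x16 link, not stated in the article
ENCODING = 128 / 130      # 128b/130b line-encoding efficiency
BITS_PER_BYTE = 8


def pcie_gb_s(lanes, gt_per_s, efficiency=1.0):
    """One-direction link bandwidth in GB/s."""
    return lanes * gt_per_s * efficiency / BITS_PER_BYTE


usable_per_direction = pcie_gb_s(LANES, GT_PER_S_PER_LANE, ENCODING)
raw_bidirectional = 2 * pcie_gb_s(LANES, GT_PER_S_PER_LANE)

print(f"Usable, per direction: {usable_per_direction:.1f} GB/s")
print(f"Raw, both directions:  {raw_bidirectional:.0f} GB/s")
```

Read this way, the 31.5 GB/s figure is one direction after encoding overhead, while 64 GB/s matches the raw rate in both directions before that overhead is subtracted.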
Like past Radeon data-center cards, the MI50 and MI60 will allow virtual desktop deployments using hardware-managed partitioning. Each Radeon Instinct card can support up to 16 guest VMs, or one VM can harness as many as eight accelerators. This feature will come free of charge for those who wish to use it.
Four Radeon Instinct MI50 cards in an Infinity Fabric ring
AMD expects the Radeon Instinct MI60 to ship to data-center customers before the end of 2018, while the Radeon Instinct MI50 will begin reaching customers by the end of the first quarter of 2019. AMD also announced its ROCm 2.0 software compute stack alongside this duo of 7-nm cards, and it expects that software to become available by the end of this year.