AMD has been wandering a data-center desert for many years. The Abu Dhabi Opterons marked the company’s last serious foray into the market in 2012, and since then, Intel’s share of the server-chip market has remained at 99.2% or higher. Long-running murmurs of ARM chips disrupting Intel’s lock on the data center haven’t translated into major threats to Xeons yet. AMD’s own effort to produce ARM cores that were socket-compatible with its x86 CPUs, called Project Skybridge, never produced a shipping product, and the company abandoned it in 2015. The last update to AMD’s data-center roadmap promised the high-performance K12 ARM core for this year, too, but we haven’t heard a peep about it since. Despite those various pretenders to the throne, x86 remains the dominant instruction set in the data center, and Xeons are its avatar.
This year might be different. AMD now has a unique arrow in its quiver that other companies don’t: a competitive high-performance x86 architecture. Zen has proven a capable and energy-efficient performer in our tests of AMD’s Ryzen desktop CPUs, and now the company is taking the fight to Intel in the data center with its Epyc CPU lineup. We already covered some details of this potential resurgence in our preview of the Naples platform. Today, AMD is revealing its chip lineups for two-socket and single-socket servers, as well as some projections of those chips’ performance relative to Intel’s Broadwell Xeons.
The starting lineup
To build each Epyc package, AMD uses four eight-core modules connected using its Infinity Fabric interconnect. That gives Epyc CPUs a maximum of 32 cores and 64 threads per socket, eight memory channels to DDR4-2666 ECC RAM per socket, and 128 lanes of PCIe 3.0 in total.
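The per-socket totals follow directly from that four-die layout. A trivial sketch of the arithmetic:

```python
# Per-socket Epyc resource totals, derived from the four-die package AMD describes.
DIES_PER_SOCKET = 4
CORES_PER_DIE = 8
THREADS_PER_CORE = 2  # simultaneous multithreading, as on desktop Ryzen

cores = DIES_PER_SOCKET * CORES_PER_DIE   # 32 cores per socket
threads = cores * THREADS_PER_CORE        # 64 threads per socket

print(cores, threads)  # 32 64
```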
Not every server needs that much horsepower, of course, so AMD has sliced and diced that basic package into a variety of products at prices ranging from roughly $400 at the low end to over $4,000 at the high end. All Epyc CPUs will offer the same memory and PCIe provisions, and some will offer configurable TDPs for more flexibility in system design.
AMD will also offer a subset of Epyc CPUs designed for one-socket servers only. Just like their two-socket relatives, Epyc single-socket CPUs will offer 128 lanes of PCIe 3.0 and eight memory channels. Single-socket operation is the only restriction for these chips, and they’ll be available at prices ranging from roughly $700 to over $2,000.
Zen dresses up for business
Although the fundamental Zen core in Epyc is basically the same as that found in desktop Ryzen chips, some of its features are more important for server-class workloads than they are for clients.
The first of these is a virtualized APIC, or Advanced Programmable Interrupt Controller. AMD says this feature helps servers running VMs reduce world switch latency (the state change involved when a system switches from executing guest applications to hypervisor operations) by 50% compared to the Bulldozer architecture and its derivatives.
Epyc CPUs will also make extensive use of the AMD Secure Processor, a dedicated microcontroller embedded on the chip. This chip creates a secure environment that can be used to perform useful features like hardware-validated boot, cryptographic key generation, and key management.
The Secure Processor’s key-generation and key-management capabilities will be useful in implementing a feature AMD calls Secure Memory Encryption. Operating systems and hypervisors can request a key from the SP to encrypt sensitive pages, protecting data in flight from being intercepted through attacks on physical memory.
Epyc CPUs will also offer a feature called Secure Encrypted Virtualization that will isolate data owned by hypervisors, virtual machines, and containerized applications from access by other guest environments on a system. With such an arrangement, an attacker in one guest environment wouldn’t be able to read data in memory owned and encrypted by another guest, for example. Epyc CPUs will also offer hardware acceleration for the SHA-1 and SHA-256 hashing algorithms.
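As a concrete illustration of how software discovers these features, AMD’s CPUID documentation assigns leaf 0x8000001F to memory-encryption capabilities, with SME reported in EAX bit 0 and SEV in EAX bit 1. The helper below is a hypothetical sketch (not vendor code) that decodes those bits from a raw EAX value:

```python
# Decode AMD memory-encryption feature bits from CPUID leaf 0x8000001F.
# Bit positions follow AMD's published CPUID documentation; the helper
# name and the sample value below are illustrative, not vendor code.

def decode_memory_encryption(eax: int) -> dict:
    """Return which encryption features a CPUID 0x8000001F EAX value advertises."""
    return {
        "sme": bool(eax & (1 << 0)),  # Secure Memory Encryption
        "sev": bool(eax & (1 << 1)),  # Secure Encrypted Virtualization
    }

# Example: an EAX value with both bits set, as an Epyc part would report.
print(decode_memory_encryption(0b11))  # {'sme': True, 'sev': True}
```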
Potentially Epyc performance
AMD is making some claims about the performance of various Epyc products today, and the initial outlook is good. However, we do have to take issue with a couple of the choices AMD made on the way to its numbers. After compiling SPECint_rate_base2006 with the -O2 flag in GCC, AMD says it observed a 43% delta between its internal numbers for the Xeon E5-2699A v4 and public SPEC numbers for similar systems produced using binaries generated by the Intel C++ compiler. In turn, AMD applied that 43% reduction (a 0.575x multiplier) across the board to some publicly-available SPECint_rate_base scores for several two-socket Xeon systems.
It’s certainly fair to say that SPEC results produced with Intel’s compiler might deserve adjustment, but my conversations with other analysts present at the Epyc event suggest that a 43% reduction is optimistic. The -O2 flag for GCC isn’t the most aggressive set of optimizations available from that compiler, and SPEC binaries generated accordingly may not be fully representative of binaries compiled in the real world.
Still, here are AMD’s own numbers for two-socket systems. If we take the company’s assumptions at face value, Epyc would appear to offer large (or even massive) boosts over Broadwell Xeons from top to bottom. The Epyc 7601 dual-socket system handily beats out the pair of 22-core, 44-thread Xeon E5-2699A v4 chips in these benchmarking conditions, and every Epyc chip duo enjoys wide margins of victory over the competition.
We wanted to see how the results might shake out with a different set of assumptions around the Intel SPEC results that AMD started with, though. Friend-of-TR David Kanter suggests that a 20% reduction to public SPEC numbers for Intel CPUs is fairer considering the impacts of Intel’s compiler on the “libquantum” portion of the benchmark and the use of optimizing compilers in general. Accordingly, I rejiggered AMD’s numbers with that in mind. (I’ve applied a 1/0.575x increase to AMD’s own E5-2699A v4 results with a GCC binary to keep things consistent before applying David’s suggested handicap.)
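The two sets of assumptions boil down to simple scaling. A quick sketch (using made-up baseline scores, since the exact public figures vary by system) shows how far apart the handicaps land:

```python
# Compare AMD's and David Kanter's proposed handicaps for ICC-compiled
# SPECint_rate_base2006 scores. All score values here are hypothetical.

icc_score = 1000.0  # published score from an ICC-compiled binary (example value)

amd_adjusted = icc_score * 0.575     # AMD's across-the-board -43% adjustment
kanter_adjusted = icc_score * 0.80   # Kanter's suggested -20% adjustment

# AMD's own GCC-compiled E5-2699A v4 result, rescaled back into "ICC terms"
# (multiplied by 1/0.575) before applying Kanter's handicap, as described above.
gcc_score = 575.0  # hypothetical GCC-compiled result
rescaled = gcc_score / 0.575 * 0.80

print(amd_adjusted)     # 575.0
print(kanter_adjusted)  # 800.0
print(rescaled)         # 800.0
```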
With Intel’s compiler results adjusted this way, two-socket Broadwell systems generally trail Epyc parts by slight-to-moderate margins in SPECint_rate_base2006, especially at the lower end of the market. AMD’s systems still win out, but the differences in performance seem closer to our real-world examinations of Zen and Broadwell performance.
Since code compiled with GCC and ICC coexists in the wild, the truth of Epyc performance likely lies somewhere between these poles. SPECint_rate_base2006 is just one synthetic benchmark, as well, and we’re eager to see real-world testing results with applications like databases, web servers, and so on.
It’s also worth noting that Skylake server parts are coming (although we’re not sure exactly when yet), and those parts will almost certainly compare more favorably to their Epyc counterparts when more details become available. Still, the fact that Epyc merits direct comparisons to Xeon performance is great news. We’ll have more about AMD’s server parts to share soon.