Although Intel’s Xeon processors currently have an overwhelmingly dominant share of the market in servers and data-center applications, they stand to face competition from a host of new players in the coming months and years. Everyone from Applied Micro to Qualcomm has been making noise about providing ARM-based SoCs for servers, and most of them look to win business away from Intel by being more cost-effective and power-efficient—or by tailoring a product for a specific niche.
Intel has become very aware of this threat, and it has countered by releasing products like its Atom-based Avoton processors for low-cost and low-power applications. That’s all well and good, but the truly effective counter-punch hits today in the form of a new Xeon processor lineup dubbed Xeon D.
The Xeon D incorporates the very latest technology from Intel into a single package intended for blade servers and other dense configurations with a single processor per computing node. The product itself isn’t really one chip; it’s two pieces of silicon sharing a common package. Almost everything important is on the main die, with the exception of legacy I/O, and the package itself should contain nearly everything needed for a complete server node in a compact footprint. In keeping with this mission, Xeon D processors will sport TDP ratings from “under 20W” to 45W, well below where the larger Xeon EP chips top out.
The simplified block diagram above shows the Xeon D’s basic setup. The primary chip is fabricated using Intel’s world-beating 14-nm process with tri-gate transistors. Its eight CPU cores are based on the Broadwell microarchitecture, and each core can track and execute two threads thanks to Intel’s SMT implementation, known as Hyper-Threading. Compared to the prior-gen Haswell core, Broadwell generally achieves about 5.5% higher instruction throughput per clock cycle thanks to a combination of microarchitectural tweaks and new instructions.
Each core is associated with 1.5MB of last-level cache. This cache is shared across all cores, so the Xeon D effectively has a 12MB unified L3 cache. That cache sits in front of a dual-channel memory controller capable of supporting either DDR4-2133 or DDR3L-1600 memory. The memory subsystem supports four DIMMs and up to 128GB of capacity via registered modules (or 64GB via unbuffered DIMMs or SO-DIMMs).
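The capacity figures above all follow from the per-core and per-module numbers. A quick sanity check, assuming 32GB registered and 16GB unbuffered modules (my assumption, not an Intel spec):

```python
# Sketch, not an Intel tool: derive the aggregate figures quoted above
# from the per-core cache slice and per-DIMM capacities.

CORES = 8
L3_SLICE_MB = 1.5     # last-level cache slice per core
DIMM_SLOTS = 4
RDIMM_GB = 32         # assumed capacity per registered module
UDIMM_GB = 16         # assumed capacity per unbuffered module

l3_total_mb = CORES * L3_SLICE_MB
rdimm_total_gb = DIMM_SLOTS * RDIMM_GB
udimm_total_gb = DIMM_SLOTS * UDIMM_GB

print(l3_total_mb)      # 12.0 MB of shared L3
print(rdimm_total_gb)   # 128 GB with registered DIMMs
print(udimm_total_gb)   # 64 GB with unbuffered DIMMs or SO-DIMMs
```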
The Xeon D has a bunch of high-speed I/O bandwidth on tap, most of it courtesy of 24 lanes of Gen3 PCI Express connectivity. These lanes can be aggregated into two big links, one x16 and one x8, or broken down into various smaller configs, with the finest being six separate PCIe x4 connections. Also integrated on chip are two Intel 10Gbps Ethernet MACs.
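Every bifurcation option has to account for the same 24 lanes. A small check of the splits mentioned above (the mixed split is my own hypothetical example of an intermediate config, not one Intel has confirmed):

```python
# Illustrative sketch: confirm that each PCIe bifurcation option
# described above adds up to the chip's 24 Gen3 lanes.

TOTAL_LANES = 24

configs = {
    "x16 + x8":      [16, 8],       # two big links
    "6 x x4":        [4] * 6,       # finest-grained split
    "x8 + 4 x x4":   [8, 4, 4, 4, 4],  # hypothetical mixed config
}

for name, links in configs.items():
    assert sum(links) == TOTAL_LANES, f"{name} does not total 24 lanes"
    print(f"{name}: {links} -> {sum(links)} lanes")
```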
The main Broadwell D die shares a package with a separate south bridge chip that provides various forms of legacy I/O, including six SATA3 ports, eight lanes of PCIe Gen2, and USB. These slower external interfaces were likely easier to implement in a larger fabrication process.
In short, this thing is an 8C/16T processor with 12MB of L3 cache, 24 lanes of PCIe Gen3, and dual 10GigE links.
This chip isn’t just a re-purposed version of a mobile Broadwell part. The Xeon D includes a host of server-class features like ECC memory protection, and it incorporates innovations from Haswell-EP, like per-core power states, that don’t always make sense in client workloads. Since this is new silicon, the TSX erratum has been corrected in its CPU cores, and the product fully supports the TSX extensions for production use. The Broadwell architecture has a few other benefits over Haswell in this context. Among them are further reduced latencies for switching between virtual machines.
On the power management front, the Xeon D carries over big-ticket items from Haswell-EP like integrated voltage regulation and energy-efficient turbo. (In this latter feature, the CPU monitors the effectiveness of increased clock speeds using stall counters in the core. When added frequency doesn’t help reduce stalls, the CPU will reduce core clocks and shift the power budget elsewhere to improve overall performance.)
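The energy-efficient turbo logic lives in the CPU's power-control unit, but the decision rule is easy to model. Here's a conceptual sketch in Python with made-up stall numbers; the function name, the threshold, and the data are all illustrative, not Intel's implementation:

```python
# Conceptual model of energy-efficient turbo as described above: step up
# the clock only while the extra frequency keeps reducing observed stalls.
# All numbers here are invented for illustration.

def pick_frequency(freqs_ghz, stalls_at_freq, min_gain=5):
    """Walk up the frequency ladder, stopping once a step no longer
    reduces the stall count by at least min_gain."""
    best = freqs_ghz[0]
    for prev, cur in zip(freqs_ghz, freqs_ghz[1:]):
        # If stalls barely improve, the core is likely waiting on memory;
        # the power budget is better spent elsewhere.
        if stalls_at_freq[prev] - stalls_at_freq[cur] < min_gain:
            break
        best = cur
    return best

freqs = [2.0, 2.2, 2.4, 2.6]
stalls = {2.0: 100, 2.2: 80, 2.4: 78, 2.6: 77}  # gains flatten past 2.2GHz
print(pick_frequency(freqs, stalls))  # 2.2
```

In this toy run the core settles at 2.2GHz because the 2.4GHz step barely moves the stall counter, which is the behavior the feature is meant to capture.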
This chip adds one more power-related feature to the mix, something blandly named “hardware power management.” This optional feature allows the hardware to make decisions about which P-states and C-states the system should be using rather than taking its cues from the operating system. I don’t have many details about this mechanism yet, but I suspect it borrows from the Power Optimizer work Intel has done for its mobile Core products. If so, the implementation is surely tuned differently for server-class deployments.
I have to say, the Xeon D looks like the future of the Xeon lineup to me, the product poised to ship in the largest numbers overall once the market comes to understand it. Big cores like Broadwell are usually most power-efficient when operating at the lower end of their possible voltage and frequency ranges, and Intel’s latest process tech innovations have offered their biggest benefits at lower voltages. Dual-socket servers can run higher clock speeds, but those speeds come at less efficient operating points. 2P systems also tend to burn lots of power on their socket-to-socket interconnects.
The Xeon D should be excellent for providers of cloud and web services. I’d expect firms like Google and Facebook to snatch them up quickly. Intel also points to applications like web caching, storage, and networking as key for this product. The chipmaker expects the big data, HPC, and enterprise markets to stick with the Xeon EP. I suppose that makes sense, but I expect the Xeon D to be powerfully appealing in any case where a single application doesn’t require more memory or compute power than a Xeon D can supply in a single node. That kind of makes 2P the new 4P, if you follow my meaning.
Of course, Intel wins something else by introducing this product. The market climate just grew quite a bit more hostile for ARM-based server SoCs, which will have to justify themselves against a much more formidable x86-based incumbent.
We don’t yet have full pricing and specs on the various Xeon D models Intel will offer. We do know that the Xeon D-1450 will have eight cores with a base clock of 2.0GHz, an all-core Turbo peak of 2.5GHz, and a single-core Turbo peak of 2.6GHz. Meanwhile, the Xeon D-1520 will feature four cores with a 2.2GHz base frequency, 2.5GHz all-core Turbo, and a 2.6GHz single-core peak. Both chips should be available this month.
Intel has provided us with a few preliminary benchmark results for the Xeon D compared to Avoton. They show the Xeon D to be as much as 3.4x faster with up to 1.7x higher performance per watt. However, those numbers are based on pre-production hardware and look kind of shaky. I suspect we’ll see better numbers published in the coming weeks.
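Taken together, those two figures imply something about power draw. If performance is up 3.4x but performance per watt only 1.7x, the Xeon D must be pulling roughly twice Avoton's power in that comparison:

```python
# Back-of-the-envelope reading of Intel's preliminary numbers above.

speedup = 3.4              # claimed peak performance vs. Avoton
perf_per_watt_gain = 1.7   # claimed peak efficiency gain vs. Avoton

# perf/watt = perf / power, so power ratio = perf ratio / (perf/watt ratio)
implied_power_ratio = speedup / perf_per_watt_gain
print(round(implied_power_ratio, 2))  # 2.0
```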