
Intel's Core i7-3960X processor


Sandy Bridge goes Extreme, with BMX bikes and energy drinks
— 2:01 AM on November 14, 2011

Truly high-end desktop PCs have been in a curious position for nearly a year now. Intel's X58 platform and the corresponding Core i7-900-series processors have reigned supreme in their segment for three successive years, with little real competition from AMD. Yet in the 11 months since the debut of Intel's Sandy Bridge processors, this high-end platform hasn't been unequivocally faster than the more affordable mid-range options. The source of the trouble is pretty simple: newer technology. The Core i7-900 series is based on the older Nehalem CPU microarchitecture, and the newer Sandy Bridge architecture delivers substantially more performance in every clock cycle. Even the thousand-dollar Core i7-990X, with six cores and three memory channels, can't distance itself too much from its quad-core Sandy Bridge cousin, the Core i7-2600K.

Thus, the X58's raison d'etre has largely been reduced to its utility as a vehicle for multi-GPU solutions. Since quad-core Sandy Bridge chips can only support dual eight-lane PCIe connections, the X58's more ample PCIe connectivity has been prized in certain circles. We can't say we've been part of those circles, though, since we've never been able to measure any real performance differences between SLI and CrossFire on an X58 and a Sandy Bridge rig. Heck, we even quit recommending the X58 as the basis for the beastly Double-Stuff Workstation in our famous system guides, subbing in a Core i7-2600K instead.

In short, Intel's high-end platform has been in desperate need of a refresh. Fortunately, the time for that update has finally come, and Intel is looking to remove all doubt about who's the top dog, unleashing a massive new chip—known as Sandy Bridge-E, or Sandy Bridge Extreme—that doubles up on nearly every resource included in the quad-core versions of Sandy Bridge. If that prospect isn't enough to get you salivating, you probably don't associate a hunger-type response with computer performance, which is entirely normal. Still, some watering of the mouth in this case would help me build drama for the technical specs I'm about to catalog, so oblige me, if you would.


The chip
Sandy Bridge-E is one formidable hunk of silicon. Natively, it includes a total of eight CPU cores capable of tracking 16 threads via Hyper-Threading, 20MB of L3 cache, quad channels of DDR3 memory, and 40 lanes of next-gen PCI Express I/O connectivity. If you're like me, you're probably thinking those specifications sound like they belong to a multi-socket system, but Sandy-E crams all of that stuff into a single socket. While that fact sinks in, we'll consider an annotated picture of the chip's die, highlighting what's where.


Sandy Bridge-E die layout. Source: Intel.

There's much to consider, and we should start with the chip's eight... er, six CPU cores. (I didn't tell you wrong; Sandy Bridge-E natively has eight cores, but Intel has elected to disable a couple of those cores in the first products based on this silicon. We'll discuss that shortly.) Each of those cores is based on the Sandy Bridge microarchitecture, which is essentially a total overhaul of Intel's fundamental CPU building block. We've detailed the more notable changes in the Sandy Bridge architecture elsewhere, but the highlights are worth calling out, including the improved branch predictor, a cache for decoded micro-ops, and the ability to execute two 128-bit loads from memory per clock cycle. All of these changes contribute to higher per-clock instruction throughput. Also, graphics and media workloads should benefit from the inclusion of the AVX extensions to the traditional x86 and SSE instruction sets. AVX doubles the width of floating-point vectors to 256 bits, and Sandy Bridge can execute both a 256-bit add and a 256-bit multiply in one tick of the clock, if the stars (and the data) align correctly.
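
To make that last bit more concrete, here's a minimal sketch in C of the sort of 256-bit AVX work we're talking about, using the standard intrinsics from immintrin.h. The array contents and names are purely illustrative, and you'd need an AVX-capable compiler and CPU (gcc -mavx, for instance) to run it.

/* Minimal sketch of the 256-bit AVX operations described above, using the
 * standard intrinsics from immintrin.h. Array contents are illustrative. */
#include <immintrin.h>
#include <stdio.h>

int main(void)
{
    float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[8] = {8, 7, 6, 5, 4, 3, 2, 1};
    float c[8] = {0.5f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f};
    float r[8];

    __m256 va = _mm256_loadu_ps(a);   /* one 256-bit load: eight floats */
    __m256 vb = _mm256_loadu_ps(b);
    __m256 vc = _mm256_loadu_ps(c);

    /* Sandy Bridge can issue a 256-bit multiply and a 256-bit add in the
     * same cycle, so a loop built around r = a * b + c can, in principle,
     * retire 16 single-precision results per core per clock. */
    __m256 vr = _mm256_add_ps(_mm256_mul_ps(va, vb), vc);
    _mm256_storeu_ps(r, vr);

    for (int i = 0; i < 8; i++)
        printf("%.1f ", r[i]);
    printf("\n");
    return 0;
}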

In order to enable such considerable throughput, Sandy Bridge chips incorporate a ring-style interconnect between the cores. Intel has said each ring stop can transfer up to 96 GB/s at a clock frequency of 3GHz. If that holds true for Sandy-E, with eight stops, the total internal capacity of the ring should be a staggering 768 GB/s.
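
For the skeptical, the arithmetic behind those figures is simple enough to check. This little C sketch just runs the numbers from the paragraph above, taking Intel's quoted 96 GB/s per stop at 3GHz and the eight stops as givens rather than anything we've measured.

/* Back-of-the-envelope math for the ring interconnect figures quoted above. */
#include <stdio.h>

int main(void)
{
    double per_stop_gbs = 96.0;   /* GB/s per ring stop (Intel's figure)   */
    double clock_ghz    = 3.0;    /* ring clock assumed in Intel's figure  */
    int    stops        = 8;      /* eight stops on Sandy Bridge-E         */

    /* 96 GB/s at 3 GHz works out to 32 bytes moved per stop per clock. */
    printf("Bytes per stop per clock: %.0f\n", per_stop_gbs / clock_ghz);

    /* Aggregate capacity across all eight stops: 768 GB/s. */
    printf("Total ring capacity: %.0f GB/s\n", per_stop_gbs * stops);
    return 0;
}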

Not only is the on-chip communication capacity tremendous, but Sandy Bridge-E talks to the rest of the system at unprecedented rates, as well, starting with the quad channels of DDR3 memory provided by the on-chip memory controller. The Core i7-900 series has "only" three memory channels and has never officially supported memory speeds above 1066 MT/s. With Sandy-E, DDR3 transfer rates up to 1600 MT/s are officially supported (though only with a single DIMM per channel). That works out to 51.2 GB/s of memory bandwidth across all four channels—again, a staggering rate, more than double the throughput of lower-end Sandy Bridge desktop chips. What's more, the first chips and motherboards expose the correct multipliers for a range of higher memory speeds, including 1866, 2133, and 2400 MT/s.
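
That 51.2 GB/s figure is easy enough to derive yourself: each DDR3 channel is 64 bits (8 bytes) wide, so peak bandwidth is just the transfer rate times eight bytes times the channel count. This quick C sketch runs that math for Sandy Bridge-E and, for comparison, a dual-channel DDR3-1333 setup like the lower-end Sandy Bridge parts officially support.

/* DDR3 peak bandwidth: transfer rate (MT/s) x 8 bytes per channel x channels. */
#include <stdio.h>

static double ddr3_bandwidth_gbs(double mt_per_s, int channels)
{
    return mt_per_s * 8.0 * channels / 1000.0;   /* MT/s x bytes -> MB/s, /1000 -> GB/s */
}

int main(void)
{
    /* Sandy Bridge-E: four channels at the officially supported 1600 MT/s. */
    printf("Quad-channel DDR3-1600: %.1f GB/s\n", ddr3_bandwidth_gbs(1600, 4));

    /* Lower-end Sandy Bridge desktop parts: two channels at 1333 MT/s. */
    printf("Dual-channel DDR3-1333: %.1f GB/s\n", ddr3_bandwidth_gbs(1333, 2));
    return 0;
}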

Most Sandy-E mobos will be outfitted with eight DIMM slots, so the memory capacities are considerable, too. The manual for Asus' board cites a max capacity of 64GB, while MSI's claims 128GB is possible. These outsized numbers point to Sandy Bridge-E's heritage as a single-socket variant of Intel's upcoming 2P server product, Sandy Bridge-EP. Just as the X58 platform was a sawed-in-half version of the Nehalem-EP platform, so Sandy Bridge-E is one side of a future Xeon configuration. Those Xeons aren't slated to be available for a while yet, though, and Intel had to make a few concessions in order to bring Sandy Bridge-E to the desktop on schedule.

One of those concessions involves the 40 lanes of PCI Express connectivity built into the processor. The incorporation of PCIe directly onto the CPU is a topology change; the X58 chipset played host to the PCIe lanes in the prior-gen platform. Moving this I/O onto the processor die ought to reduce latency and has the potential, at last, to allow superior multi-GPU performance in Intel's high-end platform. However, you'll notice that Intel claims Sandy Bridge-E is capable of supporting "PCI Express 2.0 graphics," in spite of the fact that this chip was expected to support the next-generation PCI Express 3.0 standard. The reality is that Sandy Bridge-E and its supporting motherboards should technically meet the requirements of the PCIe 3.0 specification. However, the rest of the world doesn't yet have enough PCIe 3.0-capable devices ready to roll. Without access to GPUs and other chips with PCIe 3.0 connectivity to test against, Intel isn't ready to make robust claims about this platform's PCIe 3.0 support right now.

We should see PCIe 3.0-capable graphics processors hitting the market late this year or early next, based on the latest scuttlebutt. Once that happens, we think the likelihood is that most or all PCIe 3.0-capable GPUs will be able to achieve PCIe 3.0 transfer rates when plugged into a Sandy Bridge-E system. Of course, it's also possible there could be an interoperability snag that would prevent that from happening, which is the reason for Intel's caution in making claims about this product's feature set.

If it works out well, the transition to PCIe 3.0 transfer rates should be a very positive outcome, at least on paper, because PCIe 3.0 has essentially twice the bandwidth of second-gen PCIe. The third-gen standard supports higher transaction rates, 8 GT/s versus 5 GT/s, and employs more efficient encoding to take it to double the bandwidth. PCIe 2.0 uses 8-bit/10-bit encoding, with 20% overhead, and thus transfers data at 4 Gbps. PCIe 3.0, meanwhile, uses a 128-bit/130-bit encoding scheme, with much lower overhead, resulting in a transfer rate that rounds out to 8 Gbps. Like much of the rest of Sandy Bridge-E, the math on this PCIe 3.0 connectivity adds up fast. Each third-gen PCIe lane can transmit 1 GB/s of data in each direction, so a PCIe x16 link can transfer a total of 32 GB/s, and the 40 lanes of Sandy-E have a total capacity potential of 80 GB/s.
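
If you'd like to check our math, this little C sketch walks through the per-lane arithmetic, signaling rate times encoding efficiency, then scales it up to an x16 link and the full 40 lanes. The 32 GB/s and 80 GB/s figures above come from rounding the per-lane rate to an even 1 GB/s.

/* PCIe per-lane bandwidth: signaling rate x encoding efficiency.
 * PCIe 2.0: 5 GT/s with 8b/10b encoding; PCIe 3.0: 8 GT/s with 128b/130b. */
#include <stdio.h>

static double lane_gbps(double gt_per_s, double encoding_efficiency)
{
    return gt_per_s * encoding_efficiency;   /* usable Gbps per lane, per direction */
}

int main(void)
{
    double gen2 = lane_gbps(5.0, 8.0 / 10.0);     /* 4.0 Gbps, or 0.5 GB/s   */
    double gen3 = lane_gbps(8.0, 128.0 / 130.0);  /* ~7.9 Gbps, or ~1 GB/s   */

    printf("PCIe 2.0 lane: %.2f Gbps per direction\n", gen2);
    printf("PCIe 3.0 lane: %.2f Gbps per direction\n", gen3);

    /* An x16 gen-3 slot moves ~16 GB/s each way, or ~32 GB/s total;
     * the 40 lanes on Sandy Bridge-E add up to roughly 80 GB/s total. */
    printf("Gen3 x16 slot, both directions: %.1f GB/s\n", 2 * 16 * gen3 / 8.0);
    printf("40 gen3 lanes, both directions: %.1f GB/s\n", 2 * 40 * gen3 / 8.0);
    return 0;
}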

Code name      | Key products      | Cores | Threads | Last-level cache size | Process node (nm) | Estimated transistors (millions) | Die area (mm²)
Bloomfield     | Core i7           | 4     | 8       | 8 MB                  | 45                | 731                              | 263
Lynnfield      | Core i5, i7       | 4     | 8       | 8 MB                  | 45                | 774                              | 296
Gulftown       | Core i7-970, 990X | 6     | 12      | 12 MB                 | 32                | 1168                             | 248
Sandy Bridge   | Core i5, i7       | 4     | 8       | 8 MB                  | 32                | 995                              | 216
Sandy Bridge-E | Core i7-39xx      | 8     | 16      | 20 MB                 | 32                | 2270                             | 435
Deneb          | Phenom II         | 4     | 4       | 6 MB                  | 45                | 758                              | 258
Thuban         | Phenom II X6      | 6     | 6       | 6 MB                  | 45                | 904                              | 346
Orochi/Zambezi | FX                | 8     | 8       | 8 MB                  | 32                | 1200                             | 315

As you might expect, cramming all of this processing power, cache, and I/O into a single chip isn't without consequences. Sandy Bridge-E is fairly enormous, with 2.27 billion transistors in a die area that makes even AMD's Bulldozer look dainty. Since it will be sold exclusively as a high-end desktop, workstation, and server processor, though, the sheer size probably won't cut into Intel's bottom line too much.