Intel's Core 2 Extreme QX9650 processor

45nm arrives on the desktop in cool fashion
— 11:00 PM on October 28, 2007

We've been hearing a lot of doom and gloom about the prospects for microprocessors in the past few years. Pundits have told us that Moore's Law is destined to hit physical limitations that will bring the incredible every-two-years doubling of CPU power to a screeching halt, and probably sooner rather than later, since we've already seen CPUs run into heat and clock speed barriers. They have a point, up to a point. But the constant drumbeat of negativity begins to sound dated as time marches further and further away from Intel's fiasco with the 90nm Pentium 4.

In fact, such thoughts seem like ancient history today, as we get our first look at the desktop version of Intel's quad-core processors manufactured on its 45nm fabrication process. This new process not only packs in twice as many transistors as 65nm, but also employs new materials to deliver reductions in electrical current leakage. These changes add up to the sort of generational improvement that transports old codgers like me back to the roaring 1990s, when the horizon for CPU progress seemed limitless.

Of course, these days, Intel has hedged its bets by multiplying the number of cores per processor and ramping up the cadence of design innovations to those cores. The result? The new Core 2 Extreme QX9650 quad-core processor promises big reductions in power consumption and heat production, along with performance increases of up to 20%—at the same 3GHz clock speed as the chip that preceded it. Not that there's anything wrong with that. In fact, this processor could make the prophets of doom and gloom look like downright fuddy-duddies, if you know what I mean. Keep reading to see whether the QX9650 puts a clown suit on the doubters.

A wafer full of 45nm "Penryn" chips. Source: Intel.

The Penryn lands in the Yorkfield
Those of you sick, sick people who follow CPUs closely are probably already familiar with the bevy of code-names involved here, but I'll recount the major points for the healthier among us. True to Moore's Law, Intel's code names double every 18 to 24 months, so there's much to track. The most relevant names for our present discussion are Penryn and Yorkfield. Penryn is the name of the basic building block of Intel's entire 45nm lineup; it is the dual-core 45nm processor design on which most of Intel's mobile, desktop, and server products will be based. Yorkfield is the first desktop implementation of Penryn, and it's a two-fer special, situating two dual-core chips together nice and cozy-like in a single LGA775-style package, just as Intel's Kentsfield quad cores like the QX6850 did before it. The Core 2 Extreme QX9650 will be the first version of Yorkfield to hit the streets.

While we're dropping names, we should probably enter a couple of others into the discussion. Yorkfield is arriving right on time for a generational battle with its somewhat tardy opponent, AMD's Phenom processor. The Phenom is based on AMD's K10 design, and unlike Yorkfield, it incorporates four cores natively onto a single chip (or at least it will when it arrives later this month). We've already given you an early look at this microarchitectural battle in the heavyweight division with our previews of AMD's K10-based quad-core "Barcelona" Opterons and Intel's 45nm "Harpertown" Xeons. Now we have a chance to reprise this contest on the desktop, starting with the QX9650.

The QX9650 uses the same LGA775 infrastructure as previous Core 2 processors.

As I've mentioned, the key to the QX9650's advances is Intel's new 45nm fab process, which represents a fundamental change in the structure of the transistors on a chip. Intel says it's the biggest advancement in transistor technology since the late 1960s, although this is clearly an evolutionary step. The transistor combines a high-capacitance (high-k) gate oxide, based on hafnium, with a metal gate, and it delivers some eye-popping purported advantages in addition to the customary doubling of transistor density. Among them, Intel claims, are a 30% reduction in switching power, an improvement of over 20% in switching speed, and a more-than-10X reduction in gate oxide leakage. In layman's terms, that means 45nm chips should be smaller, run faster, and consume less power than Intel's 65nm parts, which were already quite good.

Each dual-core Penryn chip crams roughly 410 million transistors into a space of 107 mm². By contrast, the dual-core 65nm Conroe chips fit fewer transistors, 291 million, into a larger 143 mm² die area. Intel has to produce two good chips in order to make one Yorkfield processor, but the small die area involved should make things relatively easy, in terms of avoiding defects and keeping yields high. AMD, on the other hand, has chosen tighter integration and a higher degree of difficulty via a single-chip approach to quad-core processors; each of its upcoming Phenom chips packs 463 million transistors into a 283 mm² die via AMD's 65nm fab process.
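To put some rough numbers behind the die-size argument, here's a minimal sketch using the textbook Poisson yield model, in which the share of defect-free dies falls off exponentially with die area. The defect density is a made-up value purely for illustration (neither Intel nor AMD publishes such figures), the function names are my own, and applying the same defect rate to two different fab processes is a simplification. Only the 107 mm² and 283 mm² die areas come from the figures above.

```python
import math

# Hypothetical defect density, chosen only for illustration.
ASSUMED_DEFECTS_PER_CM2 = 0.5


def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of dies expected to be defect-free under a Poisson model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-defects_per_cm2 * area_cm2)


def silicon_per_good_cpu(die_area_mm2, dies_per_cpu, defects_per_cm2):
    """Wafer area (mm^2) spent per working processor.

    Assumes any two good dies can be paired in a package and ignores
    wafer-edge losses, test escapes, and speed binning.
    """
    good_fraction = poisson_yield(die_area_mm2, defects_per_cm2)
    return dies_per_cpu * die_area_mm2 / good_fraction


# Yorkfield: two 107 mm^2 dies per package. Phenom: one 283 mm^2 die.
yorkfield_mm2 = silicon_per_good_cpu(107, 2, ASSUMED_DEFECTS_PER_CM2)
phenom_mm2 = silicon_per_good_cpu(283, 1, ASSUMED_DEFECTS_PER_CM2)

print(f"Yorkfield: {yorkfield_mm2:.0f} mm^2 of wafer per good quad-core")
print(f"Phenom:    {phenom_mm2:.0f} mm^2 of wafer per good quad-core")
```

Under this toy model, the two-die approach spends less silicon per working quad-core at any positive defect density, because a defect scraps only 107 mm² of wafer rather than 283 mm². Real yields depend on plenty of factors the model leaves out.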

The two cores on a Penryn die mirror each other. Source: Intel.

Penryn isn't quite so revolutionary on the CPU design front, since it's based on the same basic microarchitecture as previous Core 2 chips. It ain't exactly chopped liver, either, since the Core 2 chips are the fastest desktop processors around. What's more, Intel's chip architects have endowed Penryn with more than its fair share of new tricks and tweaks. The most visible of those tweaks is a larger (6MB) and smarter (24-way set associative) L2 cache on each chip, shared between the two cores. (That works out to 12MB of total L2 cache in a Yorkfield processor, for my fellow liberal arts degree holders.)

With the QX9650, Yorkfield begins life riding a 1333MHz front-side bus like older Core 2 CPUs, but that's not likely to be the limit forever. Penryn-based Xeons will start out on a 1600MHz FSB, and Intel has already demoed a Core 2 Extreme QX9770 with a 1.6GHz bus.

Both the larger cache and faster bus are traditional vehicles for performance gains, but Penryn has some internal execution tweaks, as well. The chip features a new divider, capable of handling both integer and floating-point math. The divider's radix-16-based design lets it process four bits per cycle (up from two bits in previous chips) and includes an optimized square root function. The divider has an early-out mechanism that can reduce instruction latencies in some cases, too.
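For the curious, the bits-per-cycle claim follows from the radix: a radix-4 divider produces one base-4 quotient digit (two bits) per step, while a radix-16 divider produces one base-16 digit (four bits). Here's a small sketch of digit-by-digit long division that counts the steps. It is a software illustration only, not Intel's circuit: real dividers pick each digit with lookup tables and comparators, and the early-out mechanism isn't modeled. The function name and the 64-bit default width are my own assumptions.

```python
def digit_recurrence_divide(dividend, divisor, bits_per_step, width=64):
    """Unsigned long division that retires `bits_per_step` quotient bits
    per iteration (2 bits = radix 4, 4 bits = radix 16).

    Returns (quotient, remainder, steps). `dividend` must fit in `width` bits.
    """
    if divisor <= 0:
        raise ZeroDivisionError("divisor must be a positive integer")
    if width % bits_per_step != 0:
        raise ValueError("width must be a multiple of bits_per_step")
    if dividend < 0 or dividend >= (1 << width):
        raise ValueError("dividend must fit in `width` bits")

    radix = 1 << bits_per_step
    quotient = 0
    remainder = 0
    steps = 0
    # Walk the dividend from its most significant digit to its least.
    for shift in range(width - bits_per_step, -1, -bits_per_step):
        next_digit = (dividend >> shift) & (radix - 1)
        remainder = (remainder << bits_per_step) | next_digit
        # Digit selection; hardware would use a table lookup here
        # rather than a full division.
        q_digit = remainder // divisor
        remainder -= q_digit * divisor
        quotient = (quotient << bits_per_step) | q_digit
        steps += 1
    return quotient, remainder, steps


q4, r4, steps4 = digit_recurrence_divide(1000003, 97, bits_per_step=2)
q16, r16, steps16 = digit_recurrence_divide(1000003, 97, bits_per_step=4)
print(f"radix-4:  {steps4} steps, radix-16: {steps16} steps")
```

Both calls return the same quotient and remainder, but the radix-16 version needs half as many iterations for the same operand width. Iterations here are only a stand-in for hardware cycles.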

Penryn also extends the Core microarchitecture's 128-bit single-cycle SSE capabilities to shuffle operations, potentially doubling execution throughput for certain tasks, including the formatting of data for other SSE-based vector operations.

Another common vehicle for performance advances is the addition of tailored instructions for specific uses. Penryn has some of those, too, in the form of SSE4. SSE4 comprises 47 instructions aimed at HD video acceleration, basic graphics operations (including dot products), and the integration and control of coprocessors over PCI Express links. Developers will have to update their applications and compilers in order to take advantage of these instructions, of course. Fortunately, we've been able to include an SSE4-enabled video compression codec in our test suite, as you'll see.
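As an example of what one of these instructions does, here's a rough Python emulation of the packed single-precision dot product instruction, DPPS, as I understand its documented behavior: the upper four bits of an 8-bit immediate choose which element pairs get multiplied and summed, and the lower four bits choose which result lanes receive the sum, with the other lanes zeroed. This is a sketch rather than a bit-exact model. Python floats are double precision, and the function name is my own.

```python
def dpps(a, b, imm8):
    """Emulate a DPPS-style dot product on two 4-element vectors.

    Bits 4-7 of imm8: include a[i] * b[i] in the sum if bit (4 + i) is set.
    Bits 0-3 of imm8: write the sum to result lane i if bit i is set,
    otherwise write 0.0.
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("DPPS operates on four packed elements")
    total = 0.0
    for i in range(4):
        if imm8 & (1 << (4 + i)):
            total += a[i] * b[i]
    return [total if imm8 & (1 << i) else 0.0 for i in range(4)]


v1 = [1.0, 2.0, 3.0, 4.0]
v2 = [5.0, 6.0, 7.0, 8.0]

# Full four-element dot product, result placed in lane 0 only.
print(dpps(v1, v2, 0xF1))

# Three-element (x, y, z) dot product, broadcast to the low three lanes.
print(dpps(v1, v2, 0x77))
```

Without an instruction like this, the same job takes a multiply followed by several shuffles and adds, which is why graphics and physics code stands to benefit.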

As the first desktop-oriented derivative of Penryn, the Core 2 Extreme QX9650 is very much a premium product. Like Intel's other Extreme Editions, the QX9650 has an unlocked upper multiplier and will probably sport a price tag around a grand. Since it drops into LGA775-style sockets, the QX9650 is compatible with many newer Intel-oriented motherboards, especially those based on Intel's P35 and X38 chipsets, usually with the help of a BIOS update. You'll want to check with the mobo maker to see whether a particular board supports the QX9650.

As for cooling, Intel officially lists the QX9650's TDP at 130W, like past Core 2 Extreme processors. I think that's crazy conservative, like the love-child of Ann Coulter and Pat Buchanan, for reasons that will become clear once you see how it looks on the power meter.

And, as I've said, the QX9650 runs at 3GHz on a 1333MHz bus, just like the 65nm Core 2 Extreme QX6850 did before it. The comparison between these two CPUs should give us a nice look at how Penryn/Yorkfield's architectural tweaks boost clock-for-clock performance.

Before we move on to our results, I should mention that this is an early preview of the QX9650. This product is officially slated to debut, and become available for purchase, on November 12. Intel plans to introduce several 45nm Xeons at the same time, but that will be it for a while. Additional Penryn-based desktop processors, both dual- and quad-core, aren't expected until early next year.