When we asked AMD about the possibility of a six-core desktop processor last summer, shortly after the six-core Istanbul Opteron launch, the company inquired whether we would personally buy such a product. Perhaps not, we replied, but we know some folks would enjoy the option. AMD's head of server and workstation marketing fired back, "If you have a million friends that need it..."
Barely a month later, AMD officially confirmed plans for a six-core desktop product code-named Thuban, which would be based on AMD's second-generation six-core Opteron design, complete with DDR3 memory support, HyperTransport 3.0, and likely higher performance. Thuban would come out at some point in 2010. Between then and now, Intel has managed to beat AMD to the punch with the Core i7-980X Extreme. When it hit stores last month, the 980X simultaneously became the world's first six-core desktop processor and the world's first 32-nm six-core chip.
Today, AMD finally lifts the veil on the Phenom II X6, to undoubtedly high expectations. Is this the design that will let AMD re-enter battlefields long conceded to Intel's Core i5 and i7 CPUs, or does it only serve to cement Intel's leadership in high-end desktop processors?
What makes Thuban special
Hearing the product name, Phenom II X6, one might think Thuban is little more than a Phenom II X4 with a couple of extra cores glued on. After all, it is based on the same architecture, and it works in the same Socket AM2+ and Socket AM3 motherboards. Making that assumption would be unwise, however, because aside from having six cores on a single die instead of four, Thuban departs from today's Phenom II X4s in two important ways.
The first of those is Turbo Core, whose basic premise should sound familiar to anyone acquainted with Intel's Core i5 and i7 CPUs. From day one, those Intel chips have featured a technology called Turbo Boost, which dynamically raises the clock speeds of active cores depending on the workload and the available thermal headroom. Turbo Boost cleverly balances clock frequencies and thermals, increasing clock speeds with lightly multithreaded workloads and reducing them when more cores are fully occupied. So long as the headroom is available, Turbo Boost will even keep clock frequencies above the CPU's rated speed when all cores are quite busy.
AMD's Turbo Core is simply a different implementation of the same concept. Turbo Core boosts the clocks on up to three of Thuban's cores when the others aren't fully loaded, and it raises them substantially: by as much as 500MHz in the case of the Phenom II X6 1055T. AMD's take on Turbo differs from Intel's in many details, though.
For instance, Intel's Turbo Boost employs a network of on-chip thermal sensors and a fairly sophisticated built-in microcontroller dedicated to power management, ensuring the best use of the available thermal headroom. Turbo Boost behavior may vary from chip to chip and system to system, depending on the thermal properties of the individual CPU and on the effectiveness of the cooling solution. By contrast, AMD processors with Turbo Core are screened to meet certain thermal conditions at the factory, and each processor with a given model number should behave the same as any other.
In fact, after a couple of probing conversations with various PR types at AMD, we're fairly certain Thuban silicon doesn't contain any substantial new logic dedicated to making Turbo Core work. Turbo Core is essentially an extension of the existing mechanism for management of power states, a la Cool'n'Quiet and SpeedStep. Not that there's anything wrong with that. Although Intel's way of doing things ought to allow it to squeeze more headroom out of each chip, Turbo Core should provide the advertised improvements in clock speed and performance with minimal drama. Never one to miss a trick, AMD marketing even touts Turbo Core's consistent, deterministic behavior as a positive trait.
One question we raised when we first reported on Turbo Core was how this mechanism could possibly coexist with AMD's decision to lock the P-states of all cores together in the Phenom II. We originally explained the rationale for that choice like so:
The firm found that the varying power states (or P-states) on the Phenom could prove to be confusing to the Windows Scheduler, which wouldn't necessarily choose wisely when deciding whether to schedule a thread on a core with a low P-state or a high one. As a result, enabling the Cool'n'Quiet dynamic power saving feature could lead to unintended performance degradation. To work around this problem, AMD has decided to link together the P-states of the Phenom II's cores, via some BIOS-level changes.
Locked P-states would mean all cores run at the same clock frequency, which doesn't mix well with the dynamic symphony that Turbo Core ought to be. Having seen Turbo Core in action and pinged AMD on the matter, we can confirm that the Phenom II X6s don't have linked P-states. The cores move up and down in frequency independently of one another, with up to three of them (any three of them, depending on the load at the moment) ranging north of the CPU's rated clock speed. This behavior is easily observable using the monitoring tool in AMD's Overdrive utility, and it's quite the contrast to the cores-in-lock-step operation of a Phenom II X4.
One remaining question is how Thuban is able to avoid the effects of the scheduling problems that led AMD to link the P-states together on earlier Phenom II processors. When we posed this question to the folks in AMD PR at the eleventh hour before publication of this review, they didn't have a definite answer. However, our casual observations suggest AMD may be using the CPU's rated clock speed as a common baseline to ensure decent performance when threads jump between cores. For example, the Phenom II X6 1090T has a base clock of 3.2GHz and a Turbo Core peak of 3.6GHz. When a single-threaded application is running on it, no core drops below 3.2GHz. When that application stops and the CPU is essentially idle, all six cores drop down to 800MHz, the minimum speed allowed by Cool'n'Quiet.
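To make those casual observations concrete, here is a minimal Python sketch of the behavior we saw on the 1090T. It is a toy model, not AMD's algorithm: the binary busy/idle flag, the `core_clocks` function, and the rule that non-boosted cores sit at exactly the base clock are all our own assumptions for illustration. Only the 800MHz, 3.2GHz, and 3.6GHz figures and the three-core limit come from what we observed.

```python
# Toy model of the Turbo Core behavior observed on a Phenom II X6 1090T.
# Assumption: each core is simply "busy" or "idle". Real hardware uses
# P-states and load thresholds that AMD has not detailed.

IDLE_MHZ = 800        # Cool'n'Quiet minimum
BASE_MHZ = 3200       # rated clock of the 1090T
TURBO_MHZ = 3600      # Turbo Core peak of the 1090T
MAX_TURBO_CORES = 3   # at most three cores run above the rated clock


def core_clocks(busy):
    """Return a list of per-core clocks in MHz for a list of busy flags."""
    active = sum(busy)
    if active == 0:
        # Essentially idle: every core falls to the Cool'n'Quiet floor.
        return [IDLE_MHZ] * len(busy)
    if active <= MAX_TURBO_CORES:
        # Lightly threaded: busy cores boost, the rest hold the base clock
        # (the "common baseline" guessed at above).
        return [TURBO_MHZ if b else BASE_MHZ for b in busy]
    # Too many busy cores for Turbo Core: everyone runs at the rated speed.
    return [BASE_MHZ] * len(busy)


if __name__ == "__main__":
    print(core_clocks([False] * 6))              # idle system
    print(core_clocks([True] + [False] * 5))     # single-threaded app
    print(core_clocks([True] * 4 + [False] * 2)) # four busy cores
```

With a single busy core, the model puts that core at 3600MHz and holds the other five at 3200MHz, which matches the single-threaded case described above. Whether the real chip pins the remaining cores at the base clock in every lightly threaded scenario is something we haven't verified.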
Turbo Core should give AMD a much better fighting chance against Intel's latest wave of CPUs. The lack of such a technology partly explains why Phenom IIs have historically fared worse than Core i5s and i7s in benchmarks; they might be competitive when all four of their cores are busy, but in more lightly multithreaded apps, they're stuck at their base clock speeds.
AMD had another ace up its sleeve when designing Thuban. The folks at GlobalFoundries have made tweaks to their 45-nm silicon-on-insulator process, adding a low-k dielectric layer to reduce leakage power. The result? Within a given thermal envelope, AMD can achieve nearly the same clock speeds with six cores as a Phenom II X4 did with four cores.
That's a pretty big deal. The fastest quad-core Phenom II X4 AMD has managed to produce, the 965 Black Edition, runs at 3.4GHz. Meanwhile, the fastest Istanbul six-core Opteron based on the same process technology only does 2.8GHz. AMD might have been able to break 3GHz by taking the same design to the desktop, but a hypothetical Istanbul-derived Phenom II would still be at a disadvantage compared to higher-clocked quad-core products. Many desktop apps don't take advantage of more than a couple of cores, so in those cases, clock speed becomes the determining factor, and a pair of 3GHz Phenom II cores just ain't that fast.
Intel stepped down to a whole new process technology to give us a six-core processor, the Core i7-980X Extreme, with the same clock speeds and TDP as its previous flagship, the Core i7-975 Extreme. AMD has pulled off a similar feat of clock scaling and power efficiency while staying at the same 45-nm node. We've seen this kind of mid-stream refinement in process tech from AMD in the past, and it has continued through the spin-off of GlobalFoundries as a separate entity. Of course, Intel still has a considerable advantage from a manufacturing perspective, since Gulftown has a 28% smaller die area than Thuban despite having about 29% more transistors, as noted in the table below. Strangely, AMD PR resisted giving us an estimated transistor count for Thuban, but they did point us to the Istanbul Opteron as a point of reference. We suspect the two are essentially identical in this regard, with a few rather minor changes between steppings.
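| Code name | Key products | Cores | Threads | Last-level cache size | Process node (nm) | Estimated transistors (millions) | Die area (mm²) |
|---|---|---|---|---|---|---|---|
| Penryn | Core 2 Duo | 2 | 2 | 6 MB | 45 | 410 | 107 |
| Bloomfield | Core i7 | 4 | 8 | 8 MB | 45 | 731 | 263 |
| Lynnfield | Core i5, i7 | 4 | 8 | 8 MB | 45 | 774 | 296 |
| Westmere | Core i3, i5 | 2 | 4 | 4 MB | 32 | 383 | 81 |
| Gulftown | Core i7-980X | 6 | 12 | 12 MB | 32 | 1170 | 248 |
| Deneb | Phenom II | 4 | 4 | 6 MB | 45 | 758 | 258 |
| Propus/Rana | Athlon II X4/X3 | 4 | 4 | 512 KB x 4 | 45 | 300 | 169 |
| Regor | Athlon II X2 | 2 | 2 | 1 MB x 2 | 45 | 234 | 118 |
| Thuban | Phenom II X6 | 6 | 6 | 6 MB x 1 | 45 | ~904 | 346 |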
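For readers who want to check the Gulftown-versus-Thuban comparison themselves, the arithmetic is easy to reproduce. The short Python sketch below assumes the last two columns of the table are estimated transistors (in millions) and die area (in mm²), and it treats Thuban's approximate ~904 figure as exact. The helper names `pct_diff` and `density` are ours, purely for illustration.

```python
# Figures copied from the table: transistors in millions, die area in mm².
# Thuban's transistor count is an estimate (~904M), so treat results as rough.
chips = {
    "Gulftown": {"transistors_m": 1170, "area_mm2": 248},
    "Thuban": {"transistors_m": 904, "area_mm2": 346},
    "Deneb": {"transistors_m": 758, "area_mm2": 258},
}


def pct_diff(a, b):
    """Percentage by which a differs from b (positive means a is larger)."""
    return (a - b) / b * 100


def density(chip):
    """Millions of transistors per square millimeter."""
    return chip["transistors_m"] / chip["area_mm2"]


gulftown, thuban = chips["Gulftown"], chips["Thuban"]
area_delta = pct_diff(gulftown["area_mm2"], thuban["area_mm2"])
transistor_delta = pct_diff(gulftown["transistors_m"], thuban["transistors_m"])

print(f"Gulftown die area vs. Thuban: {area_delta:+.0f}%")
print(f"Gulftown transistors vs. Thuban: {transistor_delta:+.0f}%")
for name, chip in chips.items():
    print(f"{name}: {density(chip):.2f}M transistors per mm²")
```

On these numbers, Gulftown comes out roughly 28% smaller than Thuban with roughly 29% more transistors, and it packs about 4.7 million transistors into each square millimeter versus about 2.6 million for Thuban.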
Gulftown is even smaller than the Deneb silicon inside quad-core Phenom IIs, a clear testament to Intel's manufacturing superiority. AMD may be able to get away with larger chips and tighter margins than before, though, since it no longer needs to pay the tremendous research and development costs associated with silicon manufacturing. That responsibility now falls upon GlobalFoundries.
As it is, Thuban looks well equipped to put AMD back into contention in the higher echelons of the desktop processor market. The big questions, of course, are how quick those Phenom II X6 CPUs actually are, and how much they cost.