
AMD's Phenom processors


At last, AMD's quad-core CPUs hit the desktop
— 10:48 PM on November 19, 2007

If you're reading this article, chances are you already know a thing or two about Phenom processors. After all, they've been in development for years, and AMD has been talking about them publicly for quite some time. In fact, we've even reviewed the exact same silicon in different outerwear, the quad-core Opterons, earlier this year. We've heard all about how Phenom will be the world's first "native" quad-core desktop processor, how such integration has tangible benefits for performance and power consumption, and how folks will be absolutely stunned by the synergistic convergence of Phenom processors, the 790FX chipset, and Radeon HD 3800-series GPUs.

What we didn't have, however, were answers about some key Phenom basics: How fast will it be, both in terms of clock speeds and performance per clock? When will it be available? And will it have been worth the, erm, considerable wait?

Today we have some answers, after a long weekend spent in hands-on testing with a Phenom processor. Read on for our extensive first look at AMD's brand-new CPU.

The Phenom steps up
Yes, it's called Phenom, which I've heard pronounced "fee-nom," like a promising young pitching savant in some baseball club's farm system, and "fen-om," which rhymes with "venom" and sounds positively toxic to my ears. Either way, what the name mainly evokes for me is this: Phenom-ena. I suppose it's fine as far as CPU names go, though.

The chip itself is the same basic "K10" design found in AMD's quad-core Opterons. Although those CPU cores are derived from the ones found in current Athlon 64 X2 processors, AMD has made substantial revisions to them in order to improve per-clock performance and efficiency. The cores now have a wider, 32-byte instruction fetch, and the floating-point units can execute 128-bit SSE operations in a single clock cycle. Phenom can execute the Supplemental SSE3 instructions Intel included in its Core 2 processors, but not the newer SSE4 extensions in Intel's just-introduced 45nm chips. The K10 core has more bandwidth throughout in order to accommodate higher throughput—internally between units on the chip, between the L1 and L2 caches, and between the L2 cache and the north bridge/memory controller.


The quad-core Phenom die. Source: AMD.

These improved cores are, of course, now grouped four to a chip, and AMD has added a third level to the cache hierarchy in order to assist with integration of the cores. As a result, each Phenom core has 64K of L1 data cache, 512K of dedicated L2 cache, and access to the 2MB L3 cache shared between all cores. An interesting quirk of the Phenom design is that the L3 cache runs at the clock speed of the memory controller/north bridge section of the chip, which is typically slower than the CPU core clocks. Since the L3 cache is an integral part of the memory hierarchy, north bridge clock speeds will be a key factor in overall Phenom performance.
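For a rough sense of scale, here is a minimal Python sketch that tallies the cache sizes quoted above and compares core and north bridge clocks. The cache figures come from this article; the function names and the clock speeds used in the usage note are placeholders chosen for illustration, not AMD specifications.

```python
# Cache sizes as quoted in the article, in KB.
L1_DATA_KB_PER_CORE = 64
L2_KB_PER_CORE = 512
L3_SHARED_KB = 2 * 1024
CORES = 4


def total_cache_kb(cores=CORES):
    """Sum L1 data, L2, and shared L3 across the chip.

    Only the caches named in the article are counted here.
    """
    private_kb = cores * (L1_DATA_KB_PER_CORE + L2_KB_PER_CORE)
    return private_kb + L3_SHARED_KB


def core_cycles_per_nb_cycle(core_mhz, north_bridge_mhz):
    """How many core clock cycles elapse during one north bridge (L3) cycle."""
    return core_mhz / north_bridge_mhz


print(total_cache_kb())  # 4352
```

Calling `core_cycles_per_nb_cycle(2400, 1800)` with those made-up clocks shows the core ticking faster than the L3 behind it, which is the reason the north bridge clock matters so much to overall performance.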

The chip's integrated memory controller can talk to dual channels of DDR2 memory at speeds up to 1066MHz. This memory controller has been improved in various ways, among them larger buffers and an improved mechanism for speculative data prefetch. The memory controller can also be configured to access its two 64-bit memory channels independently, instead of treating them as a single 128-bit entity.
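The peak numbers implied by that paragraph can be worked out with a quick back-of-the-envelope calculation. The sketch below assumes the 1066MHz figure is the effective transfer rate (millions of transfers per second) and computes theoretical peak bandwidth only; the function name is our own, and real-world throughput will be lower.

```python
def peak_bandwidth_gb_s(mega_transfers_per_sec, channel_bits=64, channels=2):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = channel_bits // 8
    bytes_per_sec = mega_transfers_per_sec * 1e6 * bytes_per_transfer * channels
    return bytes_per_sec / 1e9


# Two independent 64-bit channels at an assumed 1066 MT/s.
independent = peak_bandwidth_gb_s(1066, channel_bits=64, channels=2)

# The same pins treated as a single 128-bit channel.
single_wide = peak_bandwidth_gb_s(1066, channel_bits=128, channels=1)

print(round(independent, 3), round(single_wide, 3))
```

Both configurations land on the same theoretical peak, roughly 17 GB/s. The appeal of independent channels is not a higher ceiling but the ability to service two separate requests at once, which should suit a chip with four cores competing for memory.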

As with any new chip design these days, the Phenom has been tuned for power efficiency as well as performance. Most prominently, in this case, the Phenom's four cores are clocked independently and can dynamically raise or lower their clock speeds in response to demand. The Phenom's core voltage is still determined by the power state of the core with the highest utilization, but AMD has separated the power plane for the chip's CPU cores from the power plane for its memory controller. Only motherboards conforming to the new Socket AM2+ standard will be able to reap the benefits of Phenom's split power planes, but Phenom ought to be compatible with, and able to act as a drop-in upgrade for, existing Socket AM2 motherboards. (Though, as always, you'll want to check with your motherboard maker about compatibility, and your mobo may need a BIOS update. Your mileage may vary. One never knows. All rights reserved. Etcetera.)
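To make the clock-versus-voltage distinction concrete, here is a toy model in Python. It is a simplification under stated assumptions: the P-state table, its clock speeds, and its voltages are invented for illustration and are not AMD's actual values, and the "shared voltage equals the highest requirement among the cores" rule is our reading of the behavior described above.

```python
# Hypothetical P-state table: core clock (MHz) -> minimum voltage (V).
# These numbers are illustrative only, not AMD specifications.
P_STATES = {
    1200: 1.05,
    1800: 1.15,
    2400: 1.25,
}


def shared_core_voltage(core_clocks_mhz):
    """Voltage of the shared core power plane.

    Each core picks its own clock, but all cores sit on one plane,
    so the plane must satisfy the most demanding core.
    """
    return max(P_STATES[clock] for clock in core_clocks_mhz)


# Three cores idling, one core busy: clocks differ, voltage does not.
print(shared_core_voltage([1200, 1200, 1200, 2400]))
```

In this toy model, a single busy core keeps the whole core plane at the top voltage even while the other three have clocked down, so the savings from independent clocks come from frequency alone until every core can drop together.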

Socket AM2+ also brings support for another Phenom feature: HyperTransport 3.0. This interconnect links the Phenom to the rest of the system for I/O and the like, although it's not a traditional front-side bus, since the Phenom has its own memory controller. Revision 3.0 of HyperTransport doubles the effective clock speed and data rate of the interconnect, giving Phenoms twice the external bandwidth of the Athlon 64 X2 in the same 940-pin socket.

And yes, it is the same socket. Not only should Phenoms be able to fit into Socket AM2 mobos, but Athlon 64 processors should drop comfortably and functionally into Socket AM2+ motherboards. At present, the only Socket AM2+ chipset on the market is AMD's 790FX, which we've reviewed today, as well. Notice how smoothly I worked in that plug there, Geoff. No one will notice.

AMD says it has plans to introduce yet another socket, dubbed AM3, to go along with its 45nm CPUs. That new dynamic duo will enable support for DDR3 memory types, when it arrives in 2008. Or, you know, whenever's convenient. I wouldn't put any money on 2008.

The Phenom's great, big bundle of integrated goodness is manufactured as a single chip on AMD's 65nm silicon-on-insulator fabrication process at its Fab 36 facility in Dresden, Germany. All told, the Phenom has roughly 463 million transistors, and the chip's area is 285 mm². That's fairly large as far as CPUs go. Intel's brand-new 45nm Penryn chips come two to a package in its quad-core processors, but each chip fits 410 million transistors into a 107 mm² die. Since larger chips are exponentially more prone to manufacturing defects, the Phenom's relatively large size may cause AMD some headaches over time. The big upside here is the K10's tighter integration of four cores and faster communication between them, a benefit that may pay bigger dividends in the multi-socket server arena than in desktop processors like Phenom.
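The die-size comparison can be put into rough numbers. The sketch below computes transistor density from the figures quoted above and applies the textbook Poisson yield model, in which the fraction of defect-free dies falls off exponentially with area. The defect density is a hypothetical placeholder, and using the same value for both chips is a simplifying assumption, since the two are built on different processes by different companies.

```python
import math


def density_millions_per_mm2(transistors_millions, area_mm2):
    """Transistor density in millions of transistors per square millimeter."""
    return transistors_millions / area_mm2


def poisson_yield(area_mm2, defects_per_cm2):
    """Fraction of defect-free dies under the simple Poisson yield model."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)


# Figures from the article.
phenom_density = density_millions_per_mm2(463, 285)
penryn_density = density_millions_per_mm2(410, 107)

# Hypothetical defect density, chosen only to illustrate the trend.
ASSUMED_DEFECTS_PER_CM2 = 0.5
phenom_yield = poisson_yield(285, ASSUMED_DEFECTS_PER_CM2)
penryn_yield = poisson_yield(107, ASSUMED_DEFECTS_PER_CM2)

print(f"Density: {phenom_density:.2f} vs {penryn_density:.2f} M/mm^2")
print(f"Modeled yield: {phenom_yield:.0%} vs {penryn_yield:.0%}")
```

Whatever defect density you plug in, the larger die always fares worse in this model, and the gap widens as the defect rate climbs. Treat the percentages as an illustration of the trend rather than an estimate of either company's actual yields.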

I'm hopped up on several gallons of straight espresso, this review is over 12 hours late, and my kids will be eating Pop-Tarts for dinner for the next month, so that's all the time I have to discuss microarchitectural specifics in this context. However, you can learn more about the K10 design by reading my review of the quad-core "Barcelona" Opterons, or you can skip over to David Kanter's incisive Barcelona architecture overview for more detail on the CPU-geek stuff.