During a press briefing today, Intel Digital Enterprise Group Senior VP and General Manager Pat Gelsinger revealed new details about several of Intel’s upcoming products. Among them are Dunnington, the company’s six-core server part; Nehalem, the next-generation architecture that will supplant Core 2 processors later this year; and Larrabee, Intel’s forthcoming discrete graphics processor.
Starting with Dunnington, Gelsinger confirmed the information leaked by Sun last month by saying the upcoming processor will feature six cores, 16MB of L3 cache, and a staggering 1.9 billion transistors. Dunnington’s six cores will all be Core 2-based, and the processor will slip into the same Caneland platform as today’s Socket 604, Xeon 7300-series CPUs.
Interestingly, Dunnington appears to be a single-die product, unlike Intel’s four-core offerings, which are just two dual-core dies tacked together on the same package. When asked in the Q&A session why Intel had chosen six cores instead of eight, Gelsinger said Intel wanted to balance the number of cores with the amount of cache and the chip’s cost envelope. A “detailed set of workload characterizations” led the company to conclude that six cores with 16MB of L3 cache is the “sweet spot.” Dunnington shipments are due sometime in the second half of the year.
After talking a little about Dunnington, Gelsinger moved on to the pièce de résistance of his presentation: Nehalem. The Nehalem architecture will succeed the Core microarchitecture that’s at the heart of most Intel CPUs today, and we’ve known for a while that it will bring several major enhancements, like the addition of an integrated memory controller and the QuickPath point-to-point interconnect (the answer to AMD’s HyperTransport). Gelsinger rehashed many of the details we’d heard before, but he also shared some new ones.
For one, Gelsinger confirmed Nehalem will feature a three-channel DDR3 memory controller with support for DDR3 speeds as high as 1333MT/s. The triple-channel controller will appear on both desktop and server/workstation offerings, and it will support three memory modules per channel. Using current 2GB DDR3-1333 modules, that means you’d be able to cram 18GB of RAM into a single desktop PC and get a theoretical maximum of 31.99GB/s of bandwidth, which is impressive, to say the least. Interestingly, Nehalem chips will only feature 256KB of L2 cache per core and 8MB of L3 cache per chip. That’s a little on the light side compared to Intel’s existing 45nm quad-core parts, which have 12MB of L2 cache (one shared 6MB L2 cache per die). AMD’s upcoming 45nm quad-core offerings, for reference, will have 512KB of L2 cache per core and 6MB of L3 cache per chip.
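For the curious, the capacity and bandwidth figures above can be reproduced with some quick arithmetic. The sketch below is only a back-of-the-envelope check: it assumes each DDR3 channel is 64 bits (8 bytes) wide and uses the nominal 1333 million transfers per second, and the function names are ours, not Intel’s.

```python
# Back-of-the-envelope check of the memory figures quoted in the article.
# Assumptions (not from Intel's briefing): each DDR3 channel is 64 bits
# (8 bytes) wide, and DDR3-1333 moves a nominal 1333 million transfers/s.

CHANNELS = 3                 # triple-channel controller
MODULES_PER_CHANNEL = 3      # three memory modules per channel
MODULE_SIZE_GB = 2           # current 2GB DDR3-1333 modules
TRANSFERS_PER_SEC_M = 1333   # nominal mega-transfers per second
BYTES_PER_TRANSFER = 8       # assumed 64-bit channel width


def max_capacity_gb(channels, modules_per_channel, module_size_gb):
    """Total RAM with every slot on every channel populated."""
    return channels * modules_per_channel * module_size_gb


def peak_bandwidth_gbs(channels, transfers_per_sec_m, bytes_per_transfer):
    """Theoretical peak bandwidth in GB/s (1 GB = 1000 MB here)."""
    megabytes_per_sec = channels * transfers_per_sec_m * bytes_per_transfer
    return megabytes_per_sec / 1000


capacity = max_capacity_gb(CHANNELS, MODULES_PER_CHANNEL, MODULE_SIZE_GB)
bandwidth = peak_bandwidth_gbs(CHANNELS, TRANSFERS_PER_SEC_M, BYTES_PER_TRANSFER)

print(f"Max capacity: {capacity}GB")
print(f"Peak bandwidth: {bandwidth:.2f}GB/s")
```

These are theoretical peaks; real-world throughput will land lower.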
We also learned a little more about the architectural improvements Intel has built into Nehalem. The chip will feature increased instruction-level parallelism compared to Core 2 (with 33% more in-flight μops possible, Intel claims), better cache access and synchronization algorithms, and enhanced branch prediction thanks to a second-level branch predictor. Gelsinger described Nehalem’s simultaneous multi-threading implementation as “similar in many ways” to Hyper-Threading, but he said performance will be “much improved” compared to old NetBurst chips, partly thanks to the upcoming CPU’s increased bandwidth. Also, Intel prides itself on Nehalem’s “modular” design, which will allow the company to make chips with anywhere from two to eight cores, with or without integrated graphics and with however many QuickPath links are necessary.
As the rumor mill has been whispering for some time now, Intel expects Nehalem to hit production in the fourth quarter of this year. If the Nehalem launch is anything like last year’s Penryn launch, we may see server and high-end desktop parts come out first, followed by mainstream desktop parts next year, but Intel didn’t give any specifics. After 45nm Nehalem CPUs roll out, Intel will shrink the chips to 32nm and then release Nehalem’s successor, code-named Sandy Bridge, on the same 32nm process as part of its “tick-tock” product cadence. We know little about Sandy Bridge just yet, but Intel says it will have a new Advanced Vector Extensions (AVX) instruction set that will build upon SSE with wider, 256-bit vectors and other enhancements aimed at floating-point-intensive applications.
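To put the wider vectors in perspective, here is a quick sketch of how many floating-point values fit in one register at each width. It assumes SSE’s 128-bit registers and standard 32-bit and 64-bit floats; the helper name is ours, and nothing here reflects actual AVX instructions.

```python
# How many floating-point values fit in one vector register?
# Assumptions: SSE registers are 128 bits wide; AVX vectors are 256 bits
# wide, per Intel's briefing. The helper below is purely illustrative.

SSE_BITS = 128
AVX_BITS = 256


def lanes(vector_bits, element_bits):
    """Number of elements of a given size that fit in one vector."""
    if vector_bits % element_bits != 0:
        raise ValueError("element size must divide the vector width evenly")
    return vector_bits // element_bits


for label, element_bits in (("single-precision", 32), ("double-precision", 64)):
    print(
        f"{label}: {lanes(SSE_BITS, element_bits)} per SSE vector, "
        f"{lanes(AVX_BITS, element_bits)} per AVX vector"
    )
```

Doubling the vector width doubles the number of values a single instruction can operate on, at least on paper.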
Of course, what many were really waiting for Gelsinger to discuss during the pre-briefing was Larrabee, and Intel saved the best for last there. Larrabee, Gelsinger said, will combine a large array of Intel Architecture (i.e., x86) cores with a brand-new cache architecture, a new vector instruction set (Intel wouldn’t comment on Larrabee’s relationship with AVX), and a new vector processing unit. Gelsinger specified that Larrabee’s programmable architecture will allow it to accelerate anything from high-definition video and audio processing to physics, artificial intelligence, and global illumination.
Not only that, but Gelsinger also confirmed Larrabee will be compatible with the DirectX and OpenGL application programming interfaces. In other words, while Intel will be pushing for different rendering paradigms like ray tracing, the company won’t have to wait on developers to make its silicon useful to gamers, since Larrabee should be able to run existing games. Larrabee’s design could also make it well-suited to the hybrid rasterization/ray tracing approaches advocated by folks like John Carmack and Nvidia Chief Scientist David Kirk.
Intel plans to show its first demos of Larrabee in action later this year, with a product launch to follow in either 2009 or 2010. So far, Intel says response to Larrabee from independent software vendors has been “tremendous.”