A little over a week ago, I flew down to AMD's Austin campus for a short visit. Waiting for me there was the hot Texas sun, which provided a welcome reprieve from the cold and rain of Vancouver, as well as a few precious hours of hands-on time with a machine running AMD's new Zacate accelerated processing unit. On top of that, AMD disclosed some fresh details about the Brazos platform that will catapult Zacate into ultraportables, nettops, and netbooks early next year, allowing me to fly back home with a pretty complete picture of AMD's next big thing.
Now for the bad news: while I have a spreadsheet bursting at the seams with benchmark data I collected from the development system, AMD won't lift the press embargo on Zacate benchmarks for another little while. Luckily, pretty much anything besides cold, hard numbers is fair game, and there's plenty of that to go around. Over the next couple of pages, we're going to take a look at what makes Brazos tick, what the first Zacate APUs will look and perform like, and how much you can expect to pay for products based on them.
Now, I should point out that this won't be an exhaustive look at the new Bobcat microarchitecture that powers Zacate's microprocessor component. Scott already covered Bobcat in detail this summer; if you don't already know what makes Bobcat different from past AMD architectures, you should read his article.
In short, though, Bobcat was fashioned from the ground up for low-power systems. You can think of it as AMD's answer to the Intel Atom, except Bobcat emphasizes power-efficient performance more than extreme power efficiency. Fully out-of-order instruction execution, 64-bit extensions, and hardware virtualization are all on the menu. AMD claims the architecture can deliver performance almost equivalent to that of today's entry-level desktop offerings, yet dual Bobcat cores can still huddle together with an integrated GPU inside a 9W thermal envelope.
Bobcat, meet Brazos
The first two Bobcat-based designs are accelerated processing units, or APUs for short: essentially, microprocessors sharing die space with graphics processing components. AMD code-names those two APUs Zacate and Ontario, having tailored the former for an 18W thermal envelope and the latter for a 9W TDP. Despite the different code names, both parts are actually based on the exact same silicon. They occupy 75 mm² of die area and fit onto 19 x 19-mm, 413-ball BGA packages just like the one pictured above. Both are manufactured using TSMC's 40-nm fab process.
That 75 mm² die includes not just two Bobcat cores, but also a GPU component with video decoding logic, a single-channel DDR3 memory controller, and a "platform interface" block with PCI Express lanes and display outputs. Together with Hudson, an auxiliary chip that provides additional I/O connectivity, Zacate and Ontario make up the platform code-named Brazos.
Before we delve deeper into Brazos' I/O capabilities, let's first talk about its graphics component. The GPU built inside Zacate and Ontario shares the same foundation as AMD's DirectX 11 Radeons. It includes two SIMD arrays with 40 ALUs each for a grand total of 80 ALUs, or stream processors, per chip. GPU clock speeds range from 280MHz on Ontario to 500MHz on Zacate. AMD complements those resources with a UVD3 block (the same one found in discrete, 6000-series Radeons), which will assist the Bobcat cores with the decoding of H.264, VC1, DivX, and XviD video. Fittingly, the machine I tested detected Zacate's GPU as a Radeon HD 6310 in the Windows 7 Device Manager.
As one would expect, this GPU component shares memory bandwidth with the CPU cores. There won't be a whole lot of bandwidth to share, mind you, because the chip's memory controller only supports up to two DDR3 DIMMs running at 800-1066MHz along a single, 64-bit channel. You're looking at maximum theoretical memory bandwidth of about 8.3GB/s, and that's shared across the entire APU.
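For the curious, here's the back-of-the-envelope math behind that kind of figure, as a rough sketch. It assumes the "800-1066MHz" speeds are effective transfer rates (that is, DDR3-800 through DDR3-1066) and uses decimal gigabytes; the function name is my own invention. The faster setting works out to roughly 8.5GB/s this way, which lands in the same neighborhood as the number quoted above. The exact result depends on the precise transfer rate and the unit convention used.

```python
def peak_bandwidth_gb_s(mega_transfers_per_sec, bus_width_bits=64):
    """Theoretical peak bandwidth of one memory channel, in decimal GB/s.

    Assumption: the rate is an effective transfer rate in MT/s
    (e.g. 1066 for DDR3-1066), not the underlying I/O clock.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer / 1e9


# Single 64-bit channel at the two ends of the supported range.
for rate in (800, 1066):
    print(f"DDR3-{rate}: {peak_bandwidth_gb_s(rate):.2f} GB/s")
```

Keep in mind that this is a ceiling, and the CPU cores and the GPU both draw from it.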
Zacate and Ontario will also have built-in PCI Express connectivity. There will be four PCIe Gen2 lanes to connect the chip directly to third-party network controllers or a discrete graphics processor, with an additional four Gen1 lanes linking the APU to the Hudson chipset (or "Fusion Controller Hub"). AMD calls the link between the Hudson FCH and Zacate/Ontario the Unified Media Interface (UMI), but from what I've been able to gather, that's fancy-talk for a plain-jane, four-lane PCIe connection.
What does Hudson look like? I don't have a sexy chip shot with a quarter for reference, but AMD's spec sheet paints a pretty good picture. The Hudson FCH is built on a 65-nm fab process and has a 23 x 23-mm, 605-ball BGA package, slightly larger than the APU it accompanies. Power consumption ranges from 2.7W to 4.7W for "typical configurations." Inside Hudson lurk the four PCIe Gen1 lanes required for the UMI link, an extra four PCIe Gen2 lanes, six 6Gbps Serial ATA connections, 14 USB 2.0 connections, and built-in fan control logic.
The presence of 6Gbps SATA might seem a tad over-the-top, since few netbooks or ultraportables are likely to sport ultra-fast solid-state drives capable of pushing the boundaries of the 3Gbps standard. Still, it's good to see AMD isn't cutting too many corners. Folks hoping to build, say, cloud computing clusters out of Brazos systems may find some use for the fast storage ports, too.
Because Brazos supports discrete GPUs, it can arrange all of that I/O connectivity in one of two ways. In the first configuration depicted below, Brazos happily relies on its integrated graphics and hooks up Gigabit Ethernet and 802.11n Wi-Fi straight to the APU, with Hudson handling more menial duties.
The second diagram shows Brazos outfitted with a discrete graphics processor. In this case, the discrete GPU is connected directly to the APU via four Gen2 PCI Express lanes, and Hudson plays host to the networking controllers.
One thing to note is that both Hudson and Zacate/Ontario can run their non-UMI PCI Express lanes at either Gen1 or Gen2 speeds. AMD says it recommends using Gen1 for "the power benefit," but in the context of a nettop or an ultraportable with discrete graphics, I wouldn't be surprised to see Gen2 speeds used. (If you've lost your computer standards reference manual, PCIe Gen1 lanes can push 250MB/s in each direction, while Gen2 lanes have twice as much bandwidth.) Four PCIe Gen2 lanes may not be optimal for high-end discrete graphics, of course, but they should suffice for the low-end notebook GPUs likely to accompany this platform.
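To put those per-lane figures in context, here's a quick sketch that tallies the per-direction bandwidth of the two four-lane links discussed above. The table and function names are made up for illustration, and the numbers are the theoretical per-lane rates quoted in the previous paragraph rather than anything measured.

```python
# Theoretical per-lane, per-direction PCIe bandwidth in MB/s,
# using the figures quoted in the article text.
PCIE_MB_S_PER_LANE = {"gen1": 250, "gen2": 500}


def link_bandwidth_mb_s(gen, lanes):
    """Per-direction bandwidth of a PCIe link with the given lane count."""
    return PCIE_MB_S_PER_LANE[gen] * lanes


# The UMI link to Hudson: four Gen1 lanes.
print("UMI (x4 Gen1):", link_bandwidth_mb_s("gen1", 4), "MB/s each way")

# The APU's own four lanes, at the recommended Gen1 speed and at Gen2.
print("APU x4 Gen1:  ", link_bandwidth_mb_s("gen1", 4), "MB/s each way")
print("APU x4 Gen2:  ", link_bandwidth_mb_s("gen2", 4), "MB/s each way")
```

In other words, a discrete GPU hanging off the APU at Gen2 speeds would get about 2GB/s in each direction, twice what the same four lanes would offer at Gen1.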
That's Brazos in a nutshell: a low-power, two-chip solution with enough computing, graphics, and I/O resources to keep a lot of folks happy. Intel's Pine Trail platform is starting to look a tad anemic in comparison. As you're about to see, though, Brazos and Pine Trail might both be tailor-made for low-power systems, but they're not quite fighting over the same turf.