Of power gating and ceramic impellers
Although the block diagrams for Llano are a mosaic of known quantities, AMD told us the major focus of its work on this chip was power savings. AMD has long trailed Intel on this front, and buying an AMD-based laptop has generally meant getting a bit of a discount on the overall system at the expense of shorter run times on battery. With Llano, the firm believes it has reached parity in this crucial arena with its much larger competitor.
One key to making it happen is the addition of a new type of logic: a power gate, which shuts off all power to a portion of the chip when tripped, eliminating not just active power but leakage power, as well. Intel has gated power for the individual cores of its processors since Nehalem, but to date, AMD has lacked that capability. No more. All four of Llano's cores share the same voltage supply, but each core has a power switch associated with it. Whenever one of the cores becomes idle and enters the C6 power state, all power to that core is shut off. Even on what may feel like a busy system to the end user, there could be billions of cycles of unused time on multiple cores, a huge target for power savings.
Additionally, Llano is capable of entering a package-level C6 sleep state when all four cores are idle. In this state, voltage is lowered across the entire CPU rail, saving even more power.
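To make the per-core gating argument concrete, here is a toy model of what a power gate buys on a shared voltage rail. It is only a sketch: the wattage constants and the function names `cpu_rail_power_mw` and `in_package_c6` are invented for illustration and do not come from AMD.

```python
# Illustrative figures only; none of these numbers come from AMD.
ACTIVE_MW = 4000   # hypothetical dynamic power of one busy core
LEAKAGE_MW = 800   # hypothetical leakage of one powered core


def cpu_rail_power_mw(core_busy, power_gating=True):
    """Sum the power of all cores given a list of per-core busy flags."""
    total = 0
    for busy in core_busy:
        if busy:
            # A busy core pays for both switching and leakage.
            total += ACTIVE_MW + LEAKAGE_MW
        elif not power_gating:
            # Without a power gate, an idle core still leaks.
            total += LEAKAGE_MW
        # With a power gate, an idle core in C6 contributes nothing.
    return total


def in_package_c6(core_busy):
    """Package-level C6 is only possible when every core is idle."""
    return not any(core_busy)


one_busy = [True, False, False, False]
print(cpu_rail_power_mw(one_busy, power_gating=False))  # no gating
print(cpu_rail_power_mw(one_busy, power_gating=True))   # per-core gating
print(in_package_c6([False, False, False, False]))
```

In this toy model, the three idle cores account for all of the difference between the two cases, which is the leakage a power gate is meant to eliminate.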
Llano has a second power plane for its entire "uncore": the IGP, UVD block, the graphics memory controller, and the north bridge. The uncore can operate at varying voltages and multiple, varying frequencies. According to Goddard, the uncore voltage is dynamically determined by a number of different inputs, including the north bridge's power state, the GPU power state, PCI Express speeds, and the UVD workload. Several uncore elements have power gating, as well. The GPU and its memory controller are separately gated and will be powered down dynamically at idle, while the UVD block can simply be turned off by software when it's not in use.
Besides saving power when idle, Llano is tuned to make the most of its available power envelope when active, thanks to AMD's dynamic power scaling tech, known as Turbo Core. As you may know, Turbo Core is an answer to Intel's Turbo Boost technology. The two are designed around the same basic principle, opportunistically grabbing more clock speed when the thermal headroom is available, yet they operate rather differently.
Intel's Turbo Boost relies on a network of thermal sensors on the chip to help determine how far it can range up in clock frequency, while AMD's Turbo Core uses only activity sensors on the die. Given this limited input, AMD must supply the extra intelligence offline: it characterizes the power draw of its chips based on activity, which Goddard called a "big pre- and post-silicon exercise." The firm then sets a Turbo Core policy for each model of CPU based on that research. By its nature, this estimate must be relatively conservative, because it must cover the whole range of chips selected to represent that model.
Turbo Core adds only one more P-state to the CPU's repertoire, a single higher clock speed step; it then dithers between the two top clock frequencies as the activity-based power estimate will allow. In our Llano test chip, an A8-3500M processor, the difference between the two is rather large. The base clock speed is 1.5GHz, and the Turbo speed is 2.4GHz. There are no intermediate states like, say, 1.8GHz for four lightly-loaded cores. Typically, the Turbo Core policy only lets the chip range above its base clock speed when a portion of the total cores are actively at work. In the Phenom II X6, for instance, Turbo allows up to three active cores to range to higher frequencies. We're unsure what the policy is for our Llano APU, since we lack a utility that will properly report its clock speeds.
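The two-state dithering described above can be sketched in a few lines. This is a rough illustration under stated assumptions, not AMD's actual policy: the per-block weights, the power budget, and the names `estimate_power_mw` and `pick_clock_mhz` are all hypothetical. Only the 1.5GHz and 2.4GHz clocks come from our test chip.

```python
BASE_MHZ = 1500    # A8-3500M base clock, per the article
TURBO_MHZ = 2400   # A8-3500M Turbo clock, per the article

# Hypothetical per-block power in mW at 100% activity. In a real chip these
# weights would come from the offline characterization work.
WEIGHTS_MW = {"cpu0": 6000, "cpu1": 6000, "cpu2": 6000, "cpu3": 6000,
              "gpu": 10000}


def estimate_power_mw(activity_pct):
    """Activity-based estimate: weighted sum of per-block activity (0-100)."""
    return sum(WEIGHTS_MW[block] * pct
               for block, pct in activity_pct.items()) // 100


def pick_clock_mhz(activity_pct, budget_mw):
    """Choose between the only two top states; there is nothing in between."""
    if estimate_power_mw(activity_pct) <= budget_mw:
        return TURBO_MHZ
    return BASE_MHZ


BUDGET_MW = 20000  # hypothetical budget
samples = [
    {"cpu0": 100},                                       # one busy core
    {"cpu0": 100, "cpu1": 100, "cpu2": 100, "cpu3": 100},  # all cores busy
    {"cpu0": 100, "cpu1": 100, "gpu": 100},              # GPU eats headroom
]
# The clock dithers sample by sample as the estimate crosses the budget.
print([pick_clock_mhz(s, BUDGET_MW) for s in samples])
```

Because the decision depends only on counted activity and fixed weights, every chip running the same workload makes the same choice, which is the deterministic behavior AMD describes.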
Our sense is that processors equipped with Turbo Core spend substantially less time resident at higher clock frequencies than those with Turbo Boost. Still, Goddard touted several of Turbo Core's attributes as advantages over Intel's approach. Among them is the fact that activity is measured digitally and therefore more precisely. Also, Turbo Core behavior is consistent and deterministic across all copies of a certain model of CPU, and performance doesn't vary with the quality of the thermal solution in use. All of those things sound great on paper, but we suspect AMD will abandon those principles just as soon as it can produce a chip with a thermal sensor network comparable to Intel's.
Naturally, AMD's activity-based power estimates for Llano include both the CPU and GPU cores on the chip. As we've already noted, the GPU doesn't participate in Turbo Core's clock frequency scaling. GPU activity may, however, eat up thermal headroom that would otherwise be available to the CPU cores; the GPU gets priority in such a case.
Another possibility is that programs causing particularly high power consumption could be run on one or both of Llano's two major processor types, pushing the chip to exceed its total thermal envelope. In that case, a legacy CPU thermal throttling mechanism will kick in on the CPU side of the fence, reducing the CPU cores to a lower P-state and limiting the chip's overall power draw and heat production. The IGP will continue to chug along as ever, true to its Redwood roots. AMD's graphics division did introduce a power-based throttling feature called PowerTune in its Cayman GPU, but that mechanism hasn't trickled down to its smaller GPUs yet, nor to the Sumo IGP.
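The asymmetry here, where the CPU steps down while the IGP carries on untouched, can be sketched as follows. Treat it as a simplified illustration: the P-state table, its power figures, the limit, and the function name `throttle_cpu` are hypothetical, and the real mechanism reacts to temperature rather than to a tidy table of estimates.

```python
# Hypothetical CPU P-state table: (clock in MHz, CPU power in mW).
# Only the 2400 and 1500 MHz clocks come from the article; the lower states
# and all power figures are invented for illustration.
CPU_PSTATES = [(2400, 30000), (1500, 20000), (1200, 14000), (800, 9000)]


def throttle_cpu(gpu_mw, limit_mw):
    """Return the fastest CPU clock that fits under the limit.

    The GPU's draw is taken as given and never reduced, mirroring the
    article's point that only the CPU side is throttled.
    """
    for mhz, cpu_mw in CPU_PSTATES:
        if cpu_mw + gpu_mw <= limit_mw:
            return mhz
    # Nothing fits: fall back to the lowest state; the GPU still runs as is.
    return CPU_PSTATES[-1][0]


LIMIT_MW = 35000  # illustrative limit
for gpu_mw in (10000, 20000, 30000):
    print(gpu_mw, throttle_cpu(gpu_mw, LIMIT_MW))
```

As the GPU's share of the envelope grows, the CPU is pushed to ever lower states, and in the worst case the chip stays over the limit because nothing on the graphics side gives way.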