AMD aims for low-power computing with Bobcat

Brand-new x86-compatible CPU architectures don’t come along very often, but we’ve recently seen the first extensive details about two essentially new, clean-sheet designs from AMD, the high-end Bulldozer architecture and its smaller, low-power sibling, named Bobcat.  We’ve already covered Bulldozer at some length, and now we’re turning our attention to its pint-sized relative.

Brad Burgess, an AMD Fellow and the Chief Architect of the Bobcat core, offered the public a first glance at his team’s creation last week at the Hot Chips conference, and AMD has since released the slides from that talk to the media.  We didn’t attend Burgess’ presentation, but we did speak with Dina McKinney, Corporate Vice-President of Design Engineering at AMD, about both of the new architectures.  We’ll have a look at some of AMD’s slides below and attempt to point out some of what you’ll want to know about this promising new microprocessor.

Let’s start by situating Bobcat in the context of existing x86 processors.  For most intents, this CPU will be AMD’s answer to the Intel Atom, a PC-compatible processor tailored toward keeping two key things in check: power consumption and manufacturing costs.  As PCs become more mobile and range further into commodity territory, the need for capable, low-cost, power-efficient processors has become more evident, and the surprising success of netbooks as consumer products has strongly validated Intel’s approach with the Atom—even if that success hasn’t come in exactly the form Intel originally anticipated.

Like the Atom, Bobcat has been designed largely with synthesized logic, with what AMD calls a “small number of custom arrays.”  This choice involves a fairly straightforward engineering trade-off.  Larger CPUs like the Phenom II (and Bulldozer) use lots of custom-designed logic because it can be more efficient, yielding smaller die areas and superior power efficiency.  GPUs and other application-specific chips tend to rely more heavily on synthesized logic because it can shorten design cycles and allow the chip to be ported to a different manufacturing process with relative ease. The extensive use of computer-generated logic should allow the Bobcat core to be refreshed with regularity and remixed into a range of different products for various markets, much as Intel has done with its various Atom platforms.

Bobcat’s portability is also crucial for how it will initially be deployed: as a part of the Ontario “APU” or accelerated processing unit, the first of AMD’s “fusion” processors that combine a CPU with a GPU on a single piece of silicon.  Ontario will be manufactured by TSMC on the same 40-nm fabrication process used to produce current Radeon graphics chips.  Thereafter, we’d expect Bobcat-based APUs to make the transition to new fabrication processes at a cadence similar to the traditional refresh rate for low-end graphics chips.

Don’t let the “fusion” label or talk of the GPU as a “SIMD engine array” confuse you: Ontario will not be a true hybrid processor that fluidly combines traditional serial-style CPU processing with data-parallel-style GPU processing into a novel programming model that achieves previously unseen performance heights.  Much like current Pine Trail Atoms, Ontario will simply combine two low-power CPU cores and a modest GPU on the same chip in order to save on power, die area, and costs—not that there’s anything wrong with that.

In fact, Ontario has the potential to be substantially more interesting to computing enthusiasts than all of this Atom talk might seem to suggest.  Architecturally, Bobcat employs a more aggressive out-of-order approach to instruction execution, which could allow it to retire quite a few more instructions per clock than Atom, on average.  In other words, Bobcat could be a much faster processor than Atom.

AMD claims Bobcat will achieve an “estimated 90% of today’s mainstream performance in less than half the silicon area.”  That’s a big hint, and we should unpack it a little bit to get a sense of what it tells us.  The comparison being made here is between a dual-core Bobcat and the current Athlon II/Turion dual-core CPUs.

For a sense of the Athlon II’s performance, you might want to check out one of our recent CPU roundups, but the bottom line is that it’s pretty decent overall—an Athlon II X2 255 is similar to a Penryn-based Pentium E6500 and well over twice the speed of a Pentium 4 670.  The X2 255 is more than up to the task of running modern games, too.  If Bobcat reaches 90% of that performance—and that’s still a big “if” since we don’t know exactly how AMD is estimating performance or what clock speeds it will reach—then it should be plenty adequate for the vast majority of everyday computing tasks.  We’re talking about performance similar to, or better than, Intel’s consumer ultra low-voltage processors, which are our current favorites for ultraportable laptops.

As for silicon area, today’s Athlon IIs are based on the 45-nm “Regor” chip, which has a die size of 118 square millimeters.  A dual-core Bobcat implementation should weigh in at under half that, which is pretty small indeed.  However, AMD is careful to point out that the “90% performance/under 50% size” estimate is not a statement about the whole of the Ontario chip, since that chip will include a GPU, too.  (For reference, Intel lists the dual-core Pine Trail Atom D510, made on its 45-nm process, at 87 mm².  That also includes a GPU.)
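To make the "90% performance/under 50% size" claim concrete, here is a rough sketch of the arithmetic. The 90%, 50%, and 118 mm² figures come from AMD's claim and the Regor die size cited above; the function names are ours, and the result applies only to the CPU cores, not to the whole Ontario chip. Regor is a 45-nm part and Ontario will be built at 40 nm, so treat this as a ballpark rather than an apples-to-apples comparison.

```python
# Back-of-the-envelope reading of AMD's "90% performance, under 50% area" claim.
# Only the 90%, 50%, and 118 mm^2 figures come from the article; the helper
# names below are illustrative.

REGOR_DIE_MM2 = 118.0          # Athlon II "Regor" die size
RELATIVE_PERFORMANCE = 0.90    # AMD's estimate vs. today's mainstream CPUs
RELATIVE_AREA_CEILING = 0.50   # "less than half the silicon area"


def area_ceiling_mm2(reference_mm2: float, fraction: float) -> float:
    """Upper bound on area implied by 'under `fraction` of the reference die'."""
    return reference_mm2 * fraction


def perf_per_area_floor(rel_perf: float, rel_area_ceiling: float) -> float:
    """Lower bound on performance per unit area, relative to the reference."""
    return rel_perf / rel_area_ceiling


if __name__ == "__main__":
    ceiling = area_ceiling_mm2(REGOR_DIE_MM2, RELATIVE_AREA_CEILING)
    floor = perf_per_area_floor(RELATIVE_PERFORMANCE, RELATIVE_AREA_CEILING)
    print(f"Dual-core Bobcat area: under {ceiling:.0f} mm^2")
    print(f"Performance per area: at least {floor:.1f}x the Athlon II")
```

If AMD's estimate holds, the CPU portion would deliver at least 1.8 times the Athlon II's performance per square millimeter.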

We don’t know yet exactly how Ontario’s GPU will look or what portion of the total die area it will comprise.  We do expect it to be a true, DirectX 11-class Radeon with robust hardware acceleration of video playback for contemporary compression formats.  Our guess is that on both the graphics and video playback fronts, we can probably expect Ontario to be markedly better than Pine Trail Atoms and potentially superior to Intel’s CULV offerings, as well.

The part of this picture that’s not yet complete is power consumption.  AMD has only said that Bobcat will use “a fraction of the power” consumed by today’s mainstream CPUs and that the Bobcat core will be “sub one-watt capable.”  After consulting with AMD, we take that statement to mean that a single Bobcat core can draw less than one watt while actively doing work, in the right configuration.  That should be a nice starting point, considering that Intel’s first Silverthorne Atom products, which were 45-nm parts with a single CPU core by itself on a chip, ranged from 0.65W to 2.4W max, depending on the model.

For the fastest Bobcat variants, the upper limit on power could be much higher than that, depending on how the curves for clock frequencies and voltage work out.  Of course, Ontario will have two cores, a GPU, and a video decode block onboard, too. The dual-core Atom D510 has a 13W max TDP, and some variants of Ontario might land in that neighborhood.  Even a relatively poor outcome, with much higher power draw than Pine Trail, would still be a fraction of the 65W TDP of the Athlon II X2 255.

The picture that emerges from these estimates is pretty darned attractive, we have to admit.  Ontario may well be a watershed commodity PC component—fast enough not to annoy most power users in casual use, small enough to be breathtakingly cheap, and capable of enabling generous battery run times in mobile systems.

The microarchitecture

Burgess’ Hot Chips presentation lays out Bobcat’s internals in some detail.  Unlike with Bulldozer, we don’t believe AMD is holding back any major bits of information about the architecture at this point, because the products are coming to market soon.

Overall, Bobcat looks to be a very modern processor architecture with a 13-stage main pipeline.  The L1 instruction and data caches are 32KB, and the L2 cache is 512KB.  As far as we know, the L2 cache isn’t shared between two cores on Ontario.  Dual-issue execution cores seem to be in vogue at AMD right now; Bobcat takes the same path as Bulldozer there. Instructions can be executed out of order, as we’ve mentioned, which should bring higher performance per clock than the Atom.  The load/store engine is also out-of-order capable, with the ability to move stores ahead of loads.

Of course, the thing that most distinguishes Bobcat from current Phenoms or Bulldozer is its focus on keeping power consumption low.  Ticking off check boxes on a feature list won’t always convey the impact of the thousands of little choices chip architects and designers make in fashioning a product like this one.  For what it’s worth, though, Bobcat does include fine-grained clock gating, power gating, and a low-power C6 idle state. AMD has also used physical register files for local storage, to avoid the power overhead associated with dynamic register mapping.

One problem for low-power x86 processors is how to handle another sort of overhead, the sort created by decoding typically more complex x86 instructions into simpler “micro-ops” or internal instructions actually executed by the processor.  Intel took a nearly CISC-like approach to the Atom in which 96% of x86 instructions are translated into single or “fused” dual micro-ops.  AMD’s approach with Bobcat is similar.  Burgess says 89% of x86 instructions decode to a single micro-op and 10% into a pair of micro-ops, while the remaining <1% of more complex instructions are handled in microcode.
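Burgess' decode figures translate into an average micro-op count per x86 instruction, which the following sketch estimates. Two things here are our assumptions, not AMD's numbers: the microcoded share is rounded up to a full 1% (the talk says "<1%"), and the number of micro-ops a microcoded instruction expands into is a hypothetical parameter, since AMD didn't give one. We also don't know whether the percentages describe the instruction set or a dynamic instruction mix.

```python
# Rough average micro-ops per x86 instruction from Burgess' decode breakdown.
# ASSUMPTIONS (not from AMD): microcoded share rounded up to 1%, and the
# micro-op count per microcoded instruction is a made-up parameter.

SINGLE_UOP_SHARE = 0.89   # decode to one micro-op (from the talk)
DOUBLE_UOP_SHARE = 0.10   # decode to a pair of micro-ops (from the talk)
MICROCODED_SHARE = 0.01   # "<1%" in the talk, rounded up here


def average_uops(microcode_len: float,
                 single: float = SINGLE_UOP_SHARE,
                 double: float = DOUBLE_UOP_SHARE,
                 microcoded: float = MICROCODED_SHARE) -> float:
    """Weighted mean of micro-ops per instruction across the three classes."""
    return single * 1 + double * 2 + microcoded * microcode_len


if __name__ == "__main__":
    # Sweep the hypothetical microcode length to see how little it matters.
    for length in (4, 10, 20):
        print(f"microcode length {length:>2}: "
              f"{average_uops(length):.2f} micro-ops per instruction")
```

Even with a pessimistic guess for the microcoded instructions, the average stays close to one micro-op per instruction, which is the point of the nearly one-to-one decoding approach.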

Bobcat’s x86 ISA support is quite extensive, too, with support for AMD’s 64-bit extensions and all SSE versions up to SSE4A, including SSSE3.  The newest extensions for floating-point math like AVX and FMA aren’t supported, but they don’t really square with Bobcat’s mission in life.  Notably, Bobcat does support AMD’s secure virtualization instructions, which suggests this core might be employed as part of a cloud server platform at some point in the future.

Indeed, we’re fascinated by the variety of prospects for Bobcat to expand its mission beyond notebooks and low-cost desktops when AMD so chooses.  Intel has quite explicitly stated that Atom will push into ever-smaller form factors as progress allows, rather than gaining additional MIPS and FLOPS.  Over the next few years, manufacturing process advances and additional integration could eventually make it possible for Bobcat to fit into pocketable devices like smart phones, too, but AMD is keeping mum on that subject for now.

AMD could also choose to license the Bobcat core to third parties, much like ARM does with its processor architectures.  Intel has tiptoed around the possibility with the Atom, but we’ve not seen much actual progress.  Since AMD is now a pure-play design house—that is, it has spun off its manufacturing capacity into GlobalFoundries and focuses solely on chip design—licensing might make more sense for it than for Intel.  In fact, AMD’s graphics unit, the former ATI, has some experience on this front, having designed the Xbox 360 GPU for Microsoft.  The fact that Ontario will be manufactured at TSMC opens up some possibilities for collaboration with other TSMC customers, although we’d be surprised to see AMD go the licensing route in this first generation of Bobcat-based devices.

For those who’d like to see them, we’ve assembled Burgess’ detailed slides on the Bobcat microarchitecture in the image gallery attached to this story. 

0 responses to “AMD aims for low-power computing with Bobcat”

  1. That’s pretty funny. Humans are so hilariously optimized for pattern recognition (to some extent that /[

  2. On that last picture on the left, part of the colored area looks awfully close to a map of North America imo 😛

    X86 decode is Alaska
    ROB is the west coast

    seems like Greenland is also there in red lol

    anyways…just thought id point it out 😛

  3. I agree, that would make me buy one.
    I’ll be in the market for a new portable computer between my smartphone and my current notebook.
    But it will have to be smaller than my 3.6kg 15″ 6 year old Athlon XP-M notebook and run for 6h.
    My 1.8GHz Athlon XP-M (512KB version) has no trouble decoding non HD material, only “in browser flash video” will get it to clock up.
    I sense more of a software issue, playing the same video in VLC would likely work at 400MHz.
    I think AMD made the right choice having a special “slow” design. We don’t need faster, we only need cheaper and lower power.

  4. Sorry, I guess that should be “performance points” — on the X-Y graph of price/performance, you put your product at the same performance point but a lower price point (or, as you say, at the same price point but higher performance point)

  5. How do you “significantly undercut at a price point”? Wouldn’t that make it a different price point?
    In any case, you’re right, AMD isn’t stupid. They will choose to moderately outperform at similar price point, or moderately undercut at a similar performance point. It’s pretty much the only way to actually make money when trying to get part of a market that’s new to you.

  6. If it’s 2.5x the performance of Atom, they don’t need to price it cheaper. The only ones they need to price below atom are the ones that bin at around Atom performance, or have significantly worse power usage. Cheaper and better than Atom would be a slam-dunk, but it doesn’t make a lot of sense from a profitability standpoint. AMD may want to grab marketshare, but they would also like to make some money. If they can produce one line of chips with performance that spans from Atom to CULV performance /[

  7. The GPU is on the same die and shares the same memory controller.

    A 74mm^2 die – AMD can fit over 900 dies on a single wafer. A 40nm bulk wafer at TSMC costs around $5000, and small dies have a high yield (80%+). A working Ontario/Zacate die could cost as little as $7. Packaging and testing must be taken into account, but I wouldn’t be surprised to see $30 APU costs for OEMs – significantly cheaper than the Atom infrastructure, whilst being 2.5x faster on average. If anything, it will force Intel to reduce the pricing of Atom, resulting in cheaper netbooks all around.

    It will also compete with Intel’s CULV platform, which could result in price drops there as well.

    All in all, it should be a win-win for the consumer.

  8. Yes it will be very interesting to see what happens with Bobcat. This is the first time that I can recall AMD offering something to the market that Intel has no direct answer for. Intel even had a better answer for Opteron than for Bobcat. Will Intel just cut prices on CULV chips to spoil the party? What will become of Atom outside of the very smallest thermal envelopes? Very interesting.

  9. Also, there is a benefit if the CPU can’t do it at all. Bobcat sure should be a step up from Atom!

  10. Seeing as the biggest bottleneck in performance-per-battery-life in Atom systems seems to be the clunky old intel motherboard/graphics chipsets, I can see this design doing well. If you can get a laptop running 12 hours on a charge that’s half as thick/heavy as a typical Asus Eee, with comparable/better performance, they’ll have hit the right mark.

  11. There’s only a benefit if the GPU can do it more power-efficiently than the CPU can (including any overhead due to PCIe, round-tripping to the CPU/memory, etc)

  12. If the AMD chip can properly accelerate flash or other tasks that might be CPU intensive, might there be power savings to be gained?

  13. Dammit, this cold is making me stupid. Sorry.

    I think by the time I read this the article was correct and then I clearly failed to read the post properly 🙂

  14. Intel designed and engineered the Pentium M line in that way. I seem to recall you anticipating good (or at least better) things from Sandy Bridge in this regard because some of the same team is involved.

    And Intel did make some concessions to something other than speed in the Westmere parts: they added a separate power plane for the northbridge (and “uncore”), something that wouldn’t be worthwhile in a server chip but allows for lower-power idle states in notebooks (and desktops).

    I haven’t seen enough apples-to-apples comparisons of the latest “Nile” platform to AMD’s previous offering (or the equivalent Intel performers) to draw any conclusions about their recent progress in this regard. Not saying they haven’t made progress, just that I haven’t been keeping up if they have (and no one has been shouting about it from the rooftops). Certainly the previous platform wasn’t especially parsimonious on power, at least in comparison to the Intel equivalents (which were mostly Core2-based, of course); I would probably already own a Lenovo x100e if it had better battery life. And the “Nile” Toshiba that TR reviewed isn’t anything special in that regard, though of course that’s just one data point.

    But that brings up a related point: we’re looking at total platform power for machines that, for the most part, aren’t going to be used for gaming. So while the GPU in Bobcat can undoubtedly run rings around the one in Arrandale, and probably around the one in Sandy Bridge, that mostly doesn’t matter. If it can run Farmville or the Sims, it’ll be enough for most users. All that latent horsepower just sits there leaking power in the majority of cases. (That cuts both ways, of course: you can say the same thing about the CPU cores in Arrandale). Folks here at TR will care about the gaming performance, and likely will give Bobcat props in that regard, but the larger consumer market looks at price and battery life and (maybe) heat and whiny little fans.

    Nevertheless, AMD has been on a roll with OEM design wins, and a cheaper chip (vs CULV) with better performance (vs even the latest Atom) makes for a pretty interesting mass-market niche that straddles the top of the netbook category and the bottom-middle of the CULV one, which is where all the action is. As long as AMD can deliver, and keeps the OEMs happy, Bobcat should be a success (and a bigger one than any previous AMD mobile product).

  15. It roughly matches an Athlon II at the exact same clock speed, and the GPU is well beyond what exists now. We already know that. There have been benchmarks.

    This is not an Atom fill in or even close. It’s an ULV platform crammed into one chip with the fat trimmed down rather than just underclocking all of its parts and hoping for the best.

    The “process advantage” Intel has amounted to absolutely nothing in increasing battery life. I don’t know where you’re getting that. Intel has been going for high speed at 32nm, ULV parts included.

    The proper manufacturing process for the application matters a lot more than what exact node it is. Like Atom, Bobcat CPUs will not be built as a 3+ GHz CPU that just happens to stay in idle mode. They should leak a lot less than a traditional CPU, which has always been an issue for toned down standard parts, completely regardless of how “power efficient” they may be.

    New AMD laptops are doing quite well with saving power. Their quad-core laptop platform idles below 10w and their middle of the road Neo IIs get very close to the Core 2 CULV platform, but with significantly better graphics and the memory bandwidth to support it, which isn’t something even Intel managed. That’s nothing to sneeze at.

    Intel has never bothered designing and manufacturing a CPU from the ground up this way. We always have to wait and see, but I don’t think you’re giving AMD enough credit and you’re giving Intel credit for things they didn’t even do.

  16. You’re correct, of course; realistically, 16 or 32 cores is more like it. Still, with people talking about Atom-based servers and even ARM-based servers, I don’t understand why this potential for the architecture wasn’t at least mentioned.

  17. That’s what he said. What do they call comprehension errors introduced when reading starts half-way through the message?

  18. I highly doubt it. the only way AMD would license an x86 core to nvidia would be if nvidia was developing a product that didn’t compete with something AMD was making which doesn’t seem likely at this point.

  19. I don’t think people are excited about “fusion” so much as something small and cheap without the bad taste of Atom. It will be interesting to see if Intel has to make two Atom designs to compete, or if they just use low-end i3’s to do the damage at the top of Bobcat’s range, and Atom at the bottom. Asymmetric competition!

  20. Count the clock cycles. 0-12 is 13 stages inclusive.

    Welcome to the fun world of off-by-one errors introduced when everything in computing starts at 0 instead of 1.

  21. They might have the FTC/DoJ on their side, but arrayed against them would be the legal agreements with Intel they themselves signed. Given that the agreements are (presumably) legal and Intel isn’t being punished for something else, it’s hard to make the case for what amounts to the forcible redistribution of Intel’s (intellectual) property. It has been done in the past when the government needed a second source for something and “persuaded” a company to license its IP on benign terms (this is how war production got ramped up in WWII, for example) but AMD is already a second source and it’s hard to make the case this is in the national interest. Especially with Intel having bottomless pockets to pay lobbyists to loudly say it isn’t.

    And I don’t know that AMD want another player in the party anyway, much less nVidia. They may be the junior partner in this duopoly, but it is a nice little duopoly, and they’ve got their own IP in the pot to protect too.

  22. Memory bandwidth will be a problem. More realistically they’d pack smaller numbers onto a single chip and bundle it with RAM in a module, and then pack several of those onto a blade to get an ultra-dense server.

  23. I think there’s some overestimation of Bobcat’s prowess going on. This part will be slower than mobile K10, which means it will be slower than C2Ds, Arrandales and Sandy Bridges by a wide margin.

    Llano (K10 core) will be the mainstream product from AMD, and you can still expect it to be slower than the intel alternatives in everything except GPU and GPGPU. Of all the vendors, perhaps Apple is the only one that can pull off any meaningful use out of Llano in the first generation, since it makes its own apps that are widely used and can benefit from Fusion – iTunes, iMovie, Final Cut, Aperture.

    For everything else, AMD has to get developer support, and that is the one thing they are monumentally bad at – how many Stream-enabled apps are there compared to CUDA? Thought so.

  24. Doubtful in the MBP, but the MacBook is a possibility. It could let them cut the price on the MB, too.

  25. Some shredded cheese, a little salsa… Seriously though, die size issues aside, theoretically you could pack 128 Bobcat cores into a product and still keep to a 130W TDP. With each pair of cores performing at 90% of the performance of an Athlon X2, that’s serious multitasking power. The server world should go nuts for something like that.

  26. It’s sticky. AMD won the battle over whether GloFo could build x86 chips even though they are a separate entity, and not technically covered by the Intel/AMD cross license agreements.

    I’ll bet that if AMD wanted to pursue this, they’d have the FTC on their side in any battle with Intel about licensing x86 to a 3rd party.

  27. “Interestingly enough, the corporate “Fusion” branding program will be coming to an end, as well. The Fusion name apparently won’t carry over into APUs, believe it or not.”

    That’s straight from the TR article covering the ATI name retiring.

  28. I don’t believe so, Nvidia would be able to get something like x86-64 but that doesn’t really help without x86 itself 🙁

  29. No way will Bobcat (an architecture targeted at Atom) be more powerful than a CULV Arrandale. And you can bet that there will be CULV Sandy Bridges down the road as well. At this point, we don’t even know if Bobcat will match a C2D CULV (although I’m willing to bet it will have a more powerful GPU for sure). Considering that Bobcat is one tier down from Llano (which will have a K10 core), I’m not expecting any real surprises in performance.

    Yes, intel laptops will probably be more expensive, but if they stay at the current CULV pricerange, would still be a very attractive upgrade from a netbook.

    As for your jibe about battery life, AMD still has to do its homework on that, judging by the products out on the market right now. Let’s hope they deliver, but intel still has the upper hand, and with superior manufacturing process, will continue to have an advantage from that factor alone.

  30. Does Intel’s license agreement with AMD permit AMD to subsequently license x86 to other parties? After all, AMD doesn’t own x86 per se.

  31. I’d have to disagree. While money is not the end all and be all of product design, having more R&D money means AMD can either:

    a. Run more projects concurrently, with each team designing succeeding product generations in lockstep, or
    b. Commit more engineers to a given project. It’s like having more stream processors in your video card because each engineer may work on a given part of the chip without waiting for other engineers to complete theirs as long as the plan for the architecture has been clearly laid out.

  32. Eh…I think you may be confused about what an ULV CPU is. They will not get any more powerful. The CPU cores will stay at exactly the same performance level they have been at for the past several years. Enough is enough, and they hit that point long ago.

    Much like phones, the rest of the chip is what matters. The difference is that Bobcat CPUs will include decent graphics that should follow the same principles in saving power. Graphics are AMD’s thing, not Intel’s.

    So long as Intel keeps taking their existing higher power CPUs, turning down the base multiplier, but still leaving them a crazy turbo boost, they’re not going to be making any headway.

    Yeah, that’s “faster,” but it doesn’t accomplish anything but run your battery down when you’re just surfing the internets and flash ads make your 1 GHz low power CPU constantly blast up to 2+ GHz.

  33. I’d also say that intel’s aspirations for Atom are quite different from where AMD seems to be positioning Bobcat. Atom’s eventual goal is going after ARM (smartphones, MIDs, tablets). Netbooks were just a niche that Atom happened to fill in its transitional (ie, while its power consumption is too ridiculously prodigious) state.

    Once (if ever) Atom actually hits its planned market segment, Bobcat will be competing with intel’s CULVs in its market space, and those babies are going to be a whole lot more powerful.

  34. Bulldozer has no APUs in it. Only Llano and Bobcat will have them (in their first generation, at least).

    Of course, assuming developer support and uptake, Bulldozer users would be able to use their discrete GPUs for compute tasks. It’s a shame that software support for GPGPU (especially for AMD/ATI) is still pretty embryonic/nonexistent/niche.

  35. Fusion as a brand perhaps, but fusion as a technology isn’t going anywhere. Bulldozer is just the first step to integrating the GPU execution units with the rest of the CPU.

  36. I remember reading somewhere (probably a slide) that
    Bobcat is supposed to be 1-10 W,
    and Bulldozer 10-100 W.

  37. Hmmm, I’d rather get a real bobcat. Sadly, a stuffed bobcat toy will have to suffice.


    Anyway, I’m quite excited for Bobcat, more so than Bulldozer. Hmm, I wonder if Apple is going to adopt Bobcat into their Macbook and Macbook Pro lineups anytime in the future after Bobcat’s release?

  38. I thought the big hurdle to licensing designs to 3rd parties was that the x86 cross license with intel was non-transferable?

  39. Fusion is going away, too; you’ve just got to dig up AMD’s slides from the ATI-killing announcement.

  40. First of all, minor correction: Bobcat’s main pipeline comprises 13 stages, not 12 (they’re numbered 0-12).

    Anyway, if Ontario ends up around the 15W ballpark or less, it’ll be pretty sweet. And it’s a fairly reasonable target to expect: if Nile falls into the 16-23W range, you’d expect Ontario to improve on that, otherwise what’s the point right?

  41. Agreed. Money also = additional resources which = bureaucracy to manage them /= atmosphere for innovation. A major reason that small firms routinely run circles around bigger firms in this.

  42. I think you’re mixing up that they got rid of the ATI brand because it won’t fit terribly well once they have Fusion this and that in its place.

  43. I’m simply excited to just see how this all turns out. AMD does not have NEAR the R&D budget that Intel has, thus AMD’s slow multi-year design of the chip(s). And with Intel throwing out blue shells to AMD and…VIA…ahem, anyway, AMD wanted to make sure their next line up was not just another speed boost or clock cycle efficiency upgrade. Something COMPLETELY different.

    If AMD had Intel’s budget, Bulldozer and Bobcat would have been here in 2006-2007 easily. Just not to the degree that it will be now, due to 90/65nm vs 45nm.

  44. It seems we are about to see a very interesting time in the Processor world. Though one thing I find interesting about all pictures supplied is that they still say Fusion in the bottom. If I read correctly, when AMD just recently got rid of the ATI name, they also did away with Fusion.
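
The die-cost estimate in comment 7 is easy to reproduce. The sketch below uses that commenter's figures (74 mm² die, 900 dies per wafer, a $5000 wafer, 80% yield), none of which AMD or TSMC has confirmed. The 300-mm wafer diameter is our own assumption, and the gross-die figure is a simple area ratio that ignores edge loss and scribe lines, so it is an upper bound at best.

```python
import math

# Reproduces the reader estimate from comment 7. All inputs are that
# commenter's unconfirmed figures, except the 300 mm wafer diameter,
# which is our assumption.

DIE_AREA_MM2 = 74.0
DIES_PER_WAFER = 900        # commenter's "over 900", taken as 900
WAFER_COST_USD = 5000.0
YIELD_FRACTION = 0.80
WAFER_DIAMETER_MM = 300.0   # ASSUMPTION: standard 300 mm wafer


def gross_dies_upper_bound(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Wafer area divided by die area; ignores edge loss and scribe lines."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    return int(wafer_area // die_area_mm2)


def cost_per_good_die(wafer_cost: float, dies: int, yield_fraction: float) -> float:
    """Wafer cost spread over the dies that actually work."""
    return wafer_cost / (dies * yield_fraction)


if __name__ == "__main__":
    bound = gross_dies_upper_bound(WAFER_DIAMETER_MM, DIE_AREA_MM2)
    cost = cost_per_good_die(WAFER_COST_USD, DIES_PER_WAFER, YIELD_FRACTION)
    print(f"Area-ratio upper bound: {bound} dies per wafer")
    print(f"Cost per working die: ${cost:.2f}")
```

With those inputs, 720 of the 900 dies work, and each costs a little under $7 before packaging and testing, which matches the commenter's figure.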