ARM reveals client CPU ambitions with roadmap through 2020

ARM produces the basic CPU designs that power practically every smartphone and non-x86 tablet in the world. Now that the CPU IP licensing firm has tasted higher-power-envelope blood thanks to always-connected PCs from partnerships between Qualcomm, Microsoft, Asus, and HP, it wants to expand its ambitions in mobile computing to the 15-W performance class occupied by Intel and AMD U-series processors.

ARM's first step on the road to competing in these devices is the Cortex-A76 core, announced earlier this year. The Cortex-A76 promises a 35% generation-on-generation performance improvement relative to the Cortex-A75 before it, as well as a 40% power-efficiency improvement relative to that design. ARM isn't stopping with the A76, however. The company has released a CPU technology roadmap through 2020 that outlines its ambitions for client PCs.

The next high-performance ARM core for client PCs, codenamed “Deimos,” will be made available to ARM's licensees in 2018. While the company didn't share much detail about this core, it's designed for foundries' 7-nm-class process technologies, it will be compatible with ARM's DynamIQ clustering technology and interconnect fabric, and it promises a 15% increase in “compute performance” over today's Cortex-A76.
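
For a rough sense of how those generational claims stack up, the Python sketch below compounds ARM's two stated gains. It assumes that the 35% figure for the Cortex-A76 and the 15% figure for Deimos measure the same thing and multiply cleanly, which ARM has not confirmed. The function name `compound_gain` is ours, for illustration only.

```python
def compound_gain(*gains):
    """Multiply successive fractional gains (0.35 means +35%) into one factor."""
    factor = 1.0
    for gain in gains:
        factor *= 1.0 + gain
    return factor


# ARM's claims: Cortex-A76 is +35% over Cortex-A75; Deimos is +15% over Cortex-A76.
# Assumption: both percentages describe the same kind of performance.
deimos_vs_a75 = compound_gain(0.35, 0.15)
print(f"Deimos vs. Cortex-A75: {deimos_vs_a75:.2f}x")
```

Under that assumption, Deimos would land at roughly 1.55 times the Cortex-A75. ARM did not put a number on Hercules' performance gain, so the sketch stops there.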

The follow-on to Deimos is called Hercules, and ARM says its licensees will have access to that core IP in 2020. This core will be designed for fabrication on both foundry 7-nm and 5-nm process nodes. ARM claims the Hercules design will improve compute performance by an unspecified amount, on top of projected power and area reductions of 10% beyond what the move to 5-nm-class processes alone would deliver.

To emphasize its readiness to jump into the client-computing market, ARM also released a tantalizing chart that suggests its upcoming Cortex-A76 core running at 3 GHz might deliver per-core SPECint 2006 performance similar to Intel's Core i5-7300U while consuming much less power. We weren't privy to the briefing where these slides were presented, but Anandtech's Andrei Frumusanu dug into some of the finer points of the presentation, and his information suggests it's worth taking some of these numbers with a grain of salt or two.

Frumusanu says ARM's less-than-5-W figure represents actual single-core power consumption under that single-threaded SPECint 2006 Speed workload, while it seems ARM simply took the bottom-line TDP from Intel's specifications for the Core i5-7300U rather than providing actual power-consumption figures—even internal ones—for the Intel system running the same workload. Intel defines TDP as the worst-case power consumption of the chip under a worst-case workload, not a single-threaded power-consumption figure as ARM seems to be comparing here. That alone should probably give us pause.
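
To see why that mismatch matters, here is a minimal Python sketch of the arithmetic. It takes ARM's less-than-5-W figure as a per-core upper bound and assumes a 15-W whole-chip TDP for the Core i5-7300U. The four-core count is purely hypothetical, and the function names are ours.

```python
ARM_SINGLE_CORE_W = 5.0   # ARM's "less than 5 W" figure, treated as an upper bound per core
INTEL_CHIP_TDP_W = 15.0   # whole-package TDP (assumed 15-W U-series class)
HYPOTHETICAL_CORES = 4    # assumption for illustration; not an ARM-stated configuration


def naive_ratio(chip_tdp_w, core_w):
    """The comparison the chart implies: whole-chip TDP against one core's power."""
    return chip_tdp_w / core_w


def scaled_core_power(core_w, cores):
    """Upper bound on CPU-core power if every core drew the single-core figure.

    Ignores graphics, memory controller, and other uncore power.
    """
    return core_w * cores


print(naive_ratio(INTEL_CHIP_TDP_W, ARM_SINGLE_CORE_W))          # 3.0
print(scaled_core_power(ARM_SINGLE_CORE_W, HYPOTHETICAL_CORES))  # 20.0
```

The apparent threefold advantage comes from comparing one core to a whole chip. Scale the single-core figure to a hypothetical four-core part under an all-core load and the upper bound is 20 W before graphics or anything else is counted. A real chip would likely draw less than that bound, but we can't know without measured numbers for both systems.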

Asus' NovaGo always-connected PC

It's also worth noting that despite ARM's chest-thumping about double-digit performance gains from generation to generation, actual performance of the first PC-class products from its partners suggests there's plenty of room for improvement yet. Always-connected PCs from HP and Asus with Qualcomm Snapdragon 835 SoCs inside have been panned for leisurely performance by reviewers who have tried them in the real world. The Snapdragon 835 uses older ARM A73-based Kryo 280 custom CPU cores in its high-performance arsenal, to be fair, and it's entirely possible that new SoCs with cores based on the Cortex-A76 could offer better performance in those form factors.

Even so, the fact remains that Intel is a large and slow-moving target for CPU IP developers looking to butt in on its dominance in markets from servers to notebooks. That's because the blue team is still facing immense pressure to get its 10-nm process up to speed and to release next-generation architectures of its own on that process. Intel might be able to stave off some of this competition with continued improvement of the 14-nm process technology that underpins every one of its leading-edge products, but that doesn't change the fact that the Skylake core being implemented on refinements of 14-nm is a 2015-vintage product.

If Intel's 14-nm Whiskey Lake product family delivers the major boost in peak clock speeds that early leaks suggest, even ARM's projected 3.3-GHz peak speeds for A76 cores might not be enough to catch a Core i5 in the bursty, single-threaded workloads that characterize the vast majority of mobile PC usage. Still, ARM's roadmap, ambitious performance targets, and broad partner ecosystem suggest the clock is ticking if Intel wants to maintain performance leadership in the always-connected 5G PC platform of the future.

Comments closed
    • DavidC1
    • 1 year ago

    A76 is not faster than a 3.5GHz 7300U. Total BS marketing slide.

    They claim A76 is 35% faster than the A75 in the SD 845. SD 845 is nowhere near 7300U’s ST performance.

    • LostCat
    • 1 year ago

    I’m interested in seeing what the 1000 can do before doing any further speculation. I don’t expect too much of ARM though.

    IIRC they have a much more efficient arch for multimedia, and shouldn’t have the random DPC latency issues x86 occasionally goes nuts with.

    Beyond that…shrug. All competition is welcome.

    • sweatshopking
    • 1 year ago

    It’s important to remember that criticism of Windows on arm performance was regarding x86 emulation. UWP arm compiled apps run great.

      • DreadCthulhu
      • 1 year ago

      But does anyone actually use UWP apps? I don’t use a single one of them on any of my family’s computers, and from what I have seen from friends & family, they are barely a thing. They use Steam for games, and download other software, like Chrome or Firefox, from the software maker’s website. That and I can’t think of any UWP apps that are a clear improvement over their Win32 counterparts.

        • LostCat
        • 1 year ago

        I use them whenever I can. Not wasting CPU and RAM and other resources on outdated software is a good deal to me.

        • sweatshopking
        • 1 year ago

        Plenty of decent ones. Plex is used here daily. Netflix uwp is used like crazy globally.

    • ronch
    • 1 year ago

    Honestly though, who honestly believes ARM will crack the client, workstation and server market in the next 5 years? Not if Intel and AMD can help it. And with a risen AMD, ARM’s chances just got epically ripped apart.

      • tipoo
      • 1 year ago

      By some definitions of crack. I don’t think anybody’s going to be ousted from the market, but they’ll create their cozy little alcove in it.

      Where a multitude of light threads are needed, they will excel.

      https://twitter.com/eastdakota/status/976560820611031040?lang=en

        • Waco
        • 1 year ago

        The supposition that ARM == light cores isn’t necessarily true.

    • Unknown-Error
    • 1 year ago

    I am not very convinced. We’ll have to wait and see. Especially when running x86 Windows how much performance can you get via emulating? What is the penalty? I suppose price and power usage will be the selling points.

    • fyo
    • 1 year ago

    “the bursty, single-threaded workloads that characterize the vast majority of mobile PC usage.”

    Is this really true?

    And by “true” I mean in the real world, where I would actually notice the difference without profiling or a stop watch.

    Any time my laptop is sluggish while browsing, there will be 3-4 “Chrome-related” threads going full-bore. Not 1 thread at 100% of a single core, but multiple threads each grabbing as much as possible from both cores (yes, the laptop is old).

    I’ve recently used a web-app that is quite sluggish on my laptop compared to my (much faster) desktop. However, the limiting factor seems to be graphics. The Intel graphics in my laptop aren’t exactly great. During certain actions, I was able to get CPU usage of the relevant Chrome process up to over 80%. Even a few to 120% (suggesting that whatever it’s doing is multi-threaded to at least some degree).

    Another thing I use my laptop for where it seems sluggish is photo-editing, but the bottlenecks are fairly multithreaded here as well, at least with the apps I use.

    Then there are some issues with playing 4K video, multiple simultaneous h264 streams (thanks ads!), and certain h264 and h265 profiles. All of that is nicely multi-threaded and/or hardware accelerated.

    • Chrispy_
    • 1 year ago

    Don’t get me wrong, I’m all for more competition in the x86 space, but my cross-platform real-world experience says that the browsing experience on the latest ARM chip on a lightweight, browser-first OS is as bad as (if not worse than) Intel’s *lowest-power, 5-year-old* Bay-trail architecture on a multi-purpose, far heavier OS.

    It's not an apples-to-apples comparison, because what we expect to do in a browser today is quite different to what we expected to do in a browser back in Bay-trail's day, but at the same time, it feels like ARM is way off in terms of a general ballpark experience.

    If Intel can get four x86 threads running at 7.5W in an x86-biased world *today*, why should ARM be optimistic about just one non-x86 thread at 5W two years from now? ARM's future brag is yesterday's reality already. Embarrassingly so.

    Edit: also, whilst I hate to be *that guy* (who I am kidding, I love being *that guy*), this is yet another company happily planning a transition to 7nm whilst Intel trip over their own shoelaces throughout 2019.

      • chuckula
      • 1 year ago

      “Edit: also, whilst I hate to be that guy (who I am kidding, I love being that guy), this is yet another company happily planning a transition to 7nm whilst Intel trip over their own shoelaces throughout 2019.”

      Yeah you pushed it too far there. Guess who -- according to its own roadmaps -- isn't shipping mobile 7nm parts in 2019. That would be AMD, who is focusing on enterprise first and foremost with desktop coming later in 2019. Insult Intel all you want, but they have shipped 10nm in mobile even this year, and there's no reason to think they won't have mobile 10nm parts on the market ahead of AMD.

        • Shobai
        • 1 year ago

        Apart from that, where are we up to with feature size comparisons for the comparable techs? My last recollection suggested near parity between Intel’s 10 nm and, e.g., TSMC’s 7nm.

        • HERETIC
        • 1 year ago

        “Insult Intel all you want, but they have shipped 10nm in mobile even this year, and there’s no reason to think they won’t have mobile 10nm parts on the market ahead of AMD.”

        When they get non-BROKEN parts out, perhaps that’s the time to brag, which might NEVER happen. Word is their 10nm is being so gutted to get it working, it’s now closer to 12nm. And unless they drag on with that for 4 years, it’s more likely they’ll put their efforts into 7nm, and have that out first…

        • kuttan
        • 1 year ago

        “Insult Intel all you want, but they have shipped 10nm in mobile even this year, and there's no reason to think they won't have mobile 10nm parts on the market ahead of AMD”

        Paid Intel shill Triggered :DD

      • End User
      • 1 year ago

      OS and software may have something to do with it.

      My oldest iOS tablet is running on a 3 year old SoC. Overall general performance is excellent and compares very well to devices running 8th gen Intel mobile CPUs under both macOS and Windows 10.

      • tipoo
      • 1 year ago

      I dunno how many conclusions you can draw from that one, it’s so strongly down to the browser and OS stack optimization. Safari on iOS seems to punch even more above its impressive silicon’s weight for instance, and that’s also on ARM. It’s all about the clout backing it for tuning.

      If you were looking at Windows on ARM, that seems particularly bad at browser optimization.

      Just like building the core around it, less is down to the ISA than people seem to credit it for.

    • Neutronbeam
    • 1 year ago

    Still, Hercules could provide a shot in the ARM and have the company feeling quite…chipper!

    • derFunkenstein
    • 1 year ago

    “Frumusanu says ARM's 5-W figure represents actual single-core power consumption under that single-threaded SPECint 2006 Speed workload, while it seems ARM simply took the bottom-line TDP from Intel's specifications for the Core i5-7300U rather than providing actual power-consumption figures—even internal ones—for the Intel system running the same workload. Intel defines TDP as the worst-case power consumption of the chip under a worst-case workload, not a single-threaded power-consumption figure as ARM seems to be comparing here. That alone should probably give us pause.”

    So a quad-core SoC running a multi-threaded workload could consume closer to 20W, for example?

      • chuckula
      • 1 year ago

      Oh yeah… sh**t.

      They are gloating about *a single core* using 5 watts of power in a freakin' notebook computer like they've somehow performed some miracle? Damn. Underneath all the marketing BS Arm literally just admitted that its *next generation* cores are probably inferior to Intel's "broken and failed" 14nm process, much less 10nm.

        • derFunkenstein
        • 1 year ago

        That’s how I read it, but I’ve been wrong about this stuff before.

        • DancinJack
        • 1 year ago

        I’m not sure we should be praising the state of Intel’s 10nm process (yet).

      • Jeff Kampman
      • 1 year ago

      Sorry, I forgot a critical word there. It’s “less than 5 W,” not 5 W on the nose.

      • willmore
      • 1 year ago

      Intel hasn’t been using that definition of TDP for a very long time–ever since they refused to admit how much power Prescott was using. They’ve been using very creative definitions since then to get around being honest.

      I would prefer to see real measured values, but that’s really hard to do unless you want to create a stripped down board with nothing more on it than is required to get a chip to function–and to isolate the power use of those other parts. Should ARM do that for an Intel chip for a press release? Seems a bit over the top when their own power value is just an estimation based on knowing or guessing how many transistors will be switching how fast and will have this and such gate capacitance and that interconnect capacitance and voltage swing and plugging it into P=CV^2f and getting a rough idea.

        • derFunkenstein
        • 1 year ago

        Here’s an idea: just try to compare apples to apples.

    • derFunkenstein
    • 1 year ago

    That’s great that they think they’re catching up in int performance, but isn’t floating point (and particularly, vector extensions like Intel AVX) where CPUs really burn through the juice? They’re selective in that they’re not comparing themselves to Intel on Geekbench float scores.

    I don’t see this buying ARM anything more than they already have: a handful of Windows machines and a bunch of Chromebooks. Light duty stuff.

      • Jeff Kampman
      • 1 year ago

      The thing is, the vast majority of the market is not running vast amounts of vectorized code. It’s running bursty, single-threaded integer stuff, and when there is a need for the things that are amenable to vectorization outside of high-power, high-performance silicon, there is generally an application-specific IP block there to run it, especially in phones and such. If you’re going to prove your power-focused design anywhere it should be on integer code.

        • derFunkenstein
        • 1 year ago

        That’s fair.

        I’m trying to get at two things.

        The first is something you pointed out in the piece. The 15W number for Intel is a worst-case for the whole chip, where “less than 5W” (which could be 4.9W for all we know) is for a single core and does not account for graphics or anything else. Who knows exactly what the int power consumption for an i5-7300U is? Intel and probably not many others. It’s super shady to talk about power consumption in this manner and you rightly called them out on it.

        And since they’re not comfortable taking like-for-like float performance on Intel in their slides, I have to figure the A76 can’t keep up. That should put enough of a damper on making this a top-to-bottom (of the desktop/notebook performance spectrum) solution. AAA games presumably use some form of floating point math, and that’s a part of the market that ARM will have to tackle if they want ARM-based CPUs to make it everywhere.

        It’s not that gaming is “OMG such a huge market” (although I think TR readers would say it can be), it’s that ARM can’t really “do everything” if it can’t, in fact, do *everything*.

      • tipoo
      • 1 year ago

      True on power use, though I’ll point out in theory that ARM can already scale to AVX-512 and beyond (it cites 128-bit, 512-bit or 2048-bit) with the advanced scalable SIMD feature:

      https://www.anandtech.com/show/10586/arm-announces-arm-v8a-with-scalable-vector-extensions-aiming-for-hpc-and-data-center

      This again makes things interesting, as someone (like Apple) could decide they want something AVX-512 in lower end systems than Intel offers them for, and add that in. Interesting times ahead, again.

        • derFunkenstein
        • 1 year ago

        Maybe it can, but at what power consumption and cooling cost? They’re saying that they can beat Intel at its own performance/power ratio game, but they’re not backing it up.

    • End User
    • 1 year ago

    Projected A76 3GHz performance underperforms last years A11.

    https://www.anandtech.com/show/12785/arm-cortex-a76-cpu-unveiled-7nm-powerhouse/4

      • tipoo
      • 1 year ago

      And clocked 25% higher at that. Apple’s dominance of this field is impressive.

      • adisor19
      • 1 year ago

      And A12 is right around the corner. I can’t wait to see it benchmarked !

      Adi

        • tipoo
        • 1 year ago

        Semiaccurate, for whatever their reputation is worth, is betting that the A12 rises by 50% per core. Would certainly shatter the notion that everyone will have single core gain stallouts around here when they’re near-Intel, and would make that older “A12” geekbench result possibly an enhanced or die shrunk A11 instead.

        https://www.semiaccurate.com/2018/08/16/deimos-and-hercules-appear-on-the-arm-roadmap/

          • adisor19
          • 1 year ago

          50% is very optimistic me thinks but I wouldn’t be too surprised if this comes out true seeing how the A7 completely caught everyone by surprise.

          Just a few more weeks until we find out.

          Adi

            • tipoo
            • 1 year ago

            It would seem unlikely, but I’ve thought their single core performance train would slow down for years, only to be proven wrong almost every year, so I’ll just wait and see this time, lol

    • tipoo
    • 1 year ago

    Their performance crossover projection does coincide with rumors of a certain fruit company making a switch post 2020. Not that Apple would be using stock cores, but ARM’s analysis of the world appears to line up with the rumors brewing about ARM Macs, and Apple’s cores have been well ahead of ARM’s own to start with. Interesting times ahead.

      • chuckula
      • 1 year ago

      Except that Apple only licenses the ARM instruction set but doesn’t use anything else.

      You wouldn’t think that Intel had some new major product ready to launch because AMD put out powerpoints, which is effectively what this is.

        • tipoo
        • 1 year ago

        I wasn’t suggesting they’d use these cores, just that their timelines for the crossover appear to be similar, so their analysis appears to line up. Apple if anything is already out ahead of Intel perf/watt on the low end.

      • Hattig
      • 1 year ago

      Their cores are already way ahead of ARM’s cores in terms of performance. It may be that the A76 makes up some of the lost ground, but I suspect the custom Apple ARM (instruction set) core in the A12 will still be way ahead.

        • DancinJack
        • 1 year ago

        A10 is already faster than projected A76. Don’t expect anything but a complete drubbing from A12.

      • the
      • 1 year ago

      Building a high performance, multicore SoC requires plenty in terms of IO capabilities. Memory bandwidth and both off and on-die interconnect matter. ARM does offer IP targeted toward those very aspects. Leveraging ARM’s cores to take advantage of that additional IP would be a straightforward solution vs. developing all of this in-house. I’m just not sure how attractive this would be for Apple who already has a CPU team, the money and additional talent to keep everything in-house.
