ARM lays the foundation for a data center invasion

The announcement this week that AMD is working on an all-new, high-performance CPU architecture compatible with the ARM instruction set is huge news in its own right, but it's also an important step in a progression that's been unfolding in recent years. Bolstered by success in the mobile computing market, ARM and its partners have been gearing up to challenge Intel's dominance in other parts of the computing world, including the data center.

ARM’s licensing model means any capable chip company can use ARM’s technologies, from its CPU instruction set to interconnect standards to specific blocks of logic, in order to build a product. ARM offers a broad suite of IP (or intellectual property) for its customers to employ as they wish, and some of what ARM offers is truly impressive, high-bandwidth technology aimed squarely at server-class applications.

Most ARM licensees aren't likely to challenge Intel by attempting to take on its potent Xeon processors head to head, as AMD may well do with its K12 core. They can, however, potentially gain a foothold in the lucrative enterprise market by tailoring ARM-compatible SoCs for specific classes of workloads. An old axiom of computing says that custom-designed hardware will inevitably be more efficient and cost-effective at a given job than a brute-force, general-purpose solution. Not every job demands the fastest processing core. If ARM's partners can build SoCs that efficiently handle workloads that are, for instance, more I/O-bound than compute-bound, then they can win business away from Intel without matching Xeon stride for stride in every respect.

In doing so, ARM and its customers could very well lower the cost of computing at a rate faster than Intel’s famed Moore’s Law, and we could see an expansion of the number of viable players in the business of building server-class silicon.

ARM is expending quite a bit of effort to make such a future possible, and it invited us to a press and analyst confab in Austin, Texas, last week in order to highlight some of that work.

There’s really too much happening in the ARM ecosystem for us to offer anything like a comprehensive look at how the various companies involved are targeting the server space, but we should note that there is a widely distributed but concerted push for 64-bit ARM-based servers happening behind the scenes right now. The players include everyone from ARM itself to chipmakers like AMD and Applied Micro, from OEMs like HP to software vendors like Canonical and Red Hat—and to customers like Facebook and other cloud providers. The industry seems to want an alternative to Intel and its x86 ISA in the server space, and an awful lot of key players are putting in the effort to make that happen.

Rather than touring the whole scene, we’ll take a look at a couple of examples of the sort of technology ARM is designing and its partners are implementing. The first one demonstrates that ARM is more than just a CPU company, and the second illustrates the current state of ARM-powered solutions for the data center.

AMBA 5 CHI and the uncore

Much of the innovation in microprocessors over the past decade has come not in the CPU microarchitectures themselves, but outside of the cores, in the plumbing that feeds these compute engines. We’ve spent an awful lot of keystrokes talking about the “uncore” complexes and chip-to-chip interconnects that surround Xeons and Opterons, and we probably still haven’t entirely given them their due.

ARM offers two 64-bit CPU cores that can play a role in the server space, the smaller Cortex-A53 and the still-smallish-but-larger Cortex-A57. (The A57 is derived from the Cortex-A15 that’s made its way into high-end smartphones.) To support these cores, ARM has defined an interconnect architecture called AMBA 5 CHI, and it has created a family of “uncore” products that implement this architecture. Mike Filippo, Lead Architect for ARM’s Enterprise Systems Solutions, walked us through this spec and implementation in detail last week.

The AMBA 5 CHI spec describes a high-bandwidth interconnect for transporting data across a chip. AMBA 5 CHI is coherent, which means multiple connected clients (like CPU cores and I/O devices) can access a shared pool of memory safely. The interconnect hardware manages any hazards created by different clients trying to modify the same data simultaneously. In this respect and many others, AMBA 5 CHI is similar to standards like AMD’s HyperTransport and Intel’s QPI.
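
To see what "managing hazards" means in practice, here's a software analogy (ours, not part of the AMBA spec): when two clients update the same shared data without serialization, updates get lost. The lock below plays the role that exclusive cache-line ownership plays in a coherency protocol.

```python
import threading

# Software analogy for a coherency hazard: two "clients" (threads)
# increment the same shared value. The lock serializes access the way
# a coherent interconnect grants exclusive ownership of a cache line;
# without it, overlapping read-modify-write cycles would lose updates.
counter = 0
lock = threading.Lock()

def client(iterations):
    global counter
    for _ in range(iterations):
        with lock:  # analogous to acquiring exclusive ownership
            counter += 1

threads = [threading.Thread(target=client, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 -- no increments lost
```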

AMBA 5 CHI is a layered architecture. It defines proper behavior at multiple layers, from the top-level protocol to routing to the link layer to low-level physical signaling. Oddly enough, Filippo says the spec is agnostic about topology; it can be deployed as a point-to-point link, a crossbar, a ring, a mesh, or what have you. The spec includes provisions for multiple virtual channels—essentially wire sharing—and the protocol layer allows for different flow-control policies.

At present, AMBA 5 CHI is only being used as an internal interconnect between different on-chip devices, but Filippo tells us the spec was defined with an eye toward chip-to-chip communication, as well. That raises the prospect that AMBA 5 CHI, or something very much like it, could be used to enable coherent multiprocessing across multiple silicon dies at some point in the not-too-distant future. In fact, Filippo says ARM is “working on it.”

That said, what ARM has already done with AMBA 5 CHI looks to be plenty impressive in its own right. The firm has created a lineup of logic offerings, dubbed the CCN-500 family for “cache coherent network,” that can act as the glue for a high-bandwidth ARM-based SoC. Right now, the CCN-500 family has two members: the CCN-504, which supports up to 16 CPU cores, and the CCN-508, which supports as many as 32 cores. Filippo tells us there are smaller- and larger-scale versions in the works. All of them implement AMBA 5 CHI.

The block diagram above offers a simplified view of the CCN-508 uncore. One can see how it links together the CPU cores, memory controllers, and other I/O logic needed to make an SoC work. What’s striking about the 508, especially since it’s coming from ARM, is its sheer scale. The CCN-508 can support up to eight quad-core clusters of Cortex-A57 CPUs, for a grand total of 32 cores. (It can also scale down to as few as two clusters and eight cores, if needed.) The uncore can connect to four ARM DMC-520 memory controllers capable of supporting both DDR3- and DDR4-type memories. The L3 cache can be as large as 32MB, and since that L3 cache has distributed ownership, there’s a snoop filter to prevent excess traffic from coherency enforcement.
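
The quoted figures are easy to sanity-check; the per-core L3 share is our derivation, not a number ARM quotes.

```python
# Quick arithmetic on the CCN-508 figures quoted above.
CORES_PER_CLUSTER = 4              # quad-core Cortex-A57 clusters
MAX_CLUSTERS, MIN_CLUSTERS = 8, 2
MAX_L3_MB = 32

max_cores = CORES_PER_CLUSTER * MAX_CLUSTERS
min_cores = CORES_PER_CLUSTER * MIN_CLUSTERS
print(max_cores, min_cores)      # 32 8
print(MAX_L3_MB / max_cores)     # 1.0 -- MB of L3 per core, fully scaled out
```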

All of the above will sound fairly familiar to those who know today’s Xeon and Opteron architectures. The CCN-508, though, has been built to provide copious bandwidth and coherent caching even in the context of relatively fewer, smaller, and less expensive CPU cores. In fact, dig a level deeper than the diagram above, and one will find that this uncore is based on a distributed design that makes its I/O interfaces into first-class citizens.

The CCN is organized as a series of crosspoints, each of which has two device ports and two interconnect ports. These crosspoints can have various sorts of clients, including L3 cache partitions, CPU cores, and I/O interfaces. “Just plop down crosspoints,” Filippo says, “and the system builds itself.” Breaking the design down into relatively intelligent crosspoints simplifies development, ARM claims, and allows performance to ramp up smoothly as designs grow in scale.

The L3 cache partitions can range in size from 128KB to 4MB, and some of them are paired up in crosspoints with memory controllers and I/O bridges rather than CPU clusters. That distribution underscores how the L3 doesn’t just serve the cores, but also acts as a very high-bandwidth I/O cache. The L3 has an “adaptive” policy regarding inclusion; it doesn’t always replicate the contents of the CPU cores’ L2 caches. In fact, Filippo claims that calling this cache “L3” is iffy, since it’s not just for compute.

This cache is no doubt needed to take advantage of all of the bandwidth on tap. Filippo estimates the CCN-508’s peak bandwidth at 360GB/s, and he says the interconnects can sustain 230GB/s pretty much constantly. Each of the eight I/O accelerator ports is capable of 40GB/s of throughput, so there’s 320GB/s of peak I/O bandwidth possible across the uncore. Although that number outstrips the 230GB/s of interconnect bandwidth, caching can help. Each crosspoint has a bypass port into the L3 cache, so there’s more bandwidth available at each local stop than what’s out on the ring. Filippo says delivered bandwidth is “significantly higher” than one would expect from an analysis of the ring without this mechanism.
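
The arithmetic behind that reasoning is straightforward; the oversubscription ratio below is our derivation from the quoted numbers.

```python
# Bandwidth figures quoted for the CCN-508.
SUSTAINED_FABRIC_GBPS = 230    # sustained interconnect bandwidth
IO_PORTS, PORT_GBPS = 8, 40    # accelerator ports and per-port throughput

peak_io_gbps = IO_PORTS * PORT_GBPS
oversubscription = peak_io_gbps / SUSTAINED_FABRIC_GBPS
print(peak_io_gbps)                # 320
print(round(oversubscription, 2))  # 1.39 -- I/O can demand ~39% more than the
                                   # fabric sustains, hence the L3 bypass ports
```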

Should bandwidth still become a constraint, each stop on the ring supports the quality-of-service provisions built into AMBA 5 CHI, “from ingress to egress and throughout the interconnect,” in Filippo’s words. QoS policies can thus provide guarantees of bandwidth, latency, and packet prioritization to specific applications or types of traffic. Since not all I/O devices honor QoS requests, the CCN has regulators to enforce policies internally when needed.

ARM’s uncore also includes the sort of power-management provisions one would expect from a product of this class. The L3 cache partitions can step down through multiple lower-power states, depending on demand. They can disable half of their capacity if it’s not needed, disable all tag and data SRAM and simply act as a conduit to DRAM, or enter an active retention state.
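
A sketch of how such a demand-driven policy might look if modeled in software; the state names and thresholds here are illustrative assumptions on our part, not ARM's.

```python
from enum import Enum

class L3PartitionState(Enum):
    FULL_ON = "all tag and data SRAM active"
    HALF_CAPACITY = "half of the capacity disabled"
    PASS_THROUGH = "SRAM off; partition acts as a conduit to DRAM"
    RETENTION = "contents held in a low-power retention state"

def pick_state(utilization):
    """Choose a power state from recent cache utilization (0.0 to 1.0).
    Thresholds are made up for illustration."""
    if utilization > 0.5:
        return L3PartitionState.FULL_ON
    if utilization > 0.1:
        return L3PartitionState.HALF_CAPACITY
    if utilization > 0.0:
        return L3PartitionState.PASS_THROUGH
    return L3PartitionState.RETENTION

print(pick_state(0.8).name)  # FULL_ON
print(pick_state(0.0).name)  # RETENTION
```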

In short, the CCN-508 looks to be everything one would need from an uncore in order to build a useful chip for the data center. That chip might drive a series of blades or modules in a “microserver” config, or it might be a more specialized storage or network processing ASIC.

Filippo tells us ARM has a number of design wins for the CCN-508—either in the high single digits or the low double digits. ARM expects to see 32-core, enterprise-class systems based on this uncore in 2015, although predicting exactly what ARM’s partners will do with its IP is kind of like cat-herding—an inexact science, at best. That’s just the beginning for the CCN family, too. We should see larger and smaller versions of this uncore become available for licensing later this year.

Interestingly enough, my sense is that the most visible names in the ARM SoC business aren’t likely to adopt the CCN-500 series at all. For instance, AMD has its own internal SoC-style fabric for interconnecting IP blocks, and I believe Nvidia uses its own in the Tegra, too. Companies of that sort seek to differentiate their products by building their own glue logic. The thing is, you don’t have to be a big player in order to build something interesting when a high-bandwidth uncore like the CCN-508 is available for licensing. That’s kind of the point, really.

One of the first: Applied Micro’s X-Gene

The first server-class SoC compatible with the 64-bit ARMv8 ISA is the X-Gene from Applied Micro, and it’s one example of the sort of thing we can expect from ARM partners going into the server space in the coming years. The X-Gene is intended for cloud-style deployments, where lots of small server instances will service workloads that have modest computational requirements or are more I/O-constrained.

The diagram above makes the X-Gene look relatively simple, but don’t be fooled—there’s lots of parallelism represented. Applied Micro says it has tailored this SoC for specific workloads, and in doing so, the company has created an awful lot of its own IP. The CPU cores, for example, are the product of an ARM ISA license. Applied Micro built its own custom CPU core, compatible with ARMv8. To address the enterprise market, the firm built in ECC support and a number of RAS and reliability features. The first X-Gene chip features eight of these cores clocked at 2.4GHz, a relatively high frequency in the ARM world. The interconnect fabric in the X-Gene is Applied Micro’s own design, as well, not anything licensed from ARM. That fabric links the X-Gene’s CPU cores and I/O blocks to a total of four memory controllers, twice as many as in Intel’s Avoton.

What’s interesting is the rationale behind this design choice: Applied Micro says applications increasingly reside completely in memory, so the X-Gene needed to have access to “lots and lots of cheap memory.” The primary driver here wasn’t bandwidth, but sheer RAM capacity. The result is a fairly low-power SoC that can support a ton of memory—up to 512GB, according to Applied Micro, in an eight-ranks-per-channel configuration. I doubt most X-Gene microserver modules will have half a terabyte of RAM onboard, but this possibility is still worthy of note. Intel has limited Avoton to a maximum 64GB of physical memory, perhaps in part to protect its high-margin Xeon business. The X-Gene permits configurations that might be a better fit for cloud workloads, which is exactly how ARM partners could take business away from Intel.
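
Here's one way the 512GB total could break down; the per-rank density is an assumption for illustration, since Applied Micro quotes only the totals.

```python
# Capacity math for the X-Gene's memory subsystem.
CHANNELS = 4               # memory controllers on the X-Gene
RANKS_PER_CHANNEL = 8      # eight-ranks-per-channel configuration
GB_PER_RANK = 16           # assumed rank density (not an Applied Micro figure)

xgene_capacity_gb = CHANNELS * RANKS_PER_CHANNEL * GB_PER_RANK
avoton_capacity_gb = 64    # Intel's stated Avoton ceiling

print(xgene_capacity_gb)                        # 512
print(xgene_capacity_gb // avoton_capacity_gb)  # 8 -- times Avoton's ceiling
```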

In a similar vein, Applied Micro has built a form of TCP acceleration into the quad 10-GigE network controllers onboard the X-Gene. This hardware can purportedly reduce the latency for TCP communication from 20-30 milliseconds to roughly five microseconds. Applied Micro says cloud providers like Facebook provision their servers on the basis of request latency, and it believes the X-Gene’s TCP acceleration could allow the chip to deliver a substantially higher number of requests per second.
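
To see why request latency drives provisioning, consider a rough model in the spirit of Little's law (throughput = concurrency / time per request). The concurrency and per-request service time below are assumed figures for illustration, not Applied Micro's.

```python
CONCURRENCY = 100   # assumed in-flight requests per server
SERVICE_S = 1e-3    # assumed per-request compute time (1 ms)

def requests_per_second(net_latency_s):
    # Little's law: throughput = concurrency / (network latency + service time)
    return CONCURRENCY / (net_latency_s + SERVICE_S)

baseline = requests_per_second(25e-3)    # midpoint of the 20-30 ms TCP path
accelerated = requests_per_second(5e-6)  # ~5 us with TCP acceleration
print(int(baseline), int(accelerated))   # 3846 99502 -- roughly a 26x gain
```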

These things sound good in theory, but we don’t yet know how they’ll work in practice. Applied Micro didn’t have any performance numbers of consequence to share with us yet, just a vague claim of being able to support twice as many instances per unit of power as an Intel CPU. (We don’t know which one.)

Also, in a reminder that the X-Gene comes at things from a very different angle, this first chip is built on an antiquated 40-nm fabrication process. Production of the first X-Gene chips began in March, and Applied Micro is currently shipping pre-production silicon to system builders. Happily, a 28-nm X-Gene follow-up is in the works, and the first samples are scheduled for this quarter. The second X-Gene shouldn’t be dramatically different from the first one, but tweaks to the CPU core are expected to bring a 15% gain in the number of instructions retired per clock cycle.

If none of this sounds good enough to persuade customers to leave the existing x86 hardware and software infrastructure behind, well, just know that Applied Micro has been working with some very influential partners. HP hasn’t officially announced any products, but it has repeatedly demonstrated an X-Gene based cartridge for its modular Moonshot servers.

Also, last week, some folks from Canonical showed up at ARM’s event to demo a 64-bit ARMv8 version of the Ubuntu Linux distro running on X-Gene. Canonical intends for Ubuntu 14.04 for ARMv8 to be a first-class server operating system, complete with a five-year support lifetime.

The demonstration platform consisted of a stack of 14 X-Gene servers with very little active cooling running in a centrally controlled OpenStack Icehouse environment. Christian Reis from Canonical kicked off instances of several different server applications, including MediaWiki and Hadoop, with all of the necessary components natively compiled for ARMv8. Although the process of deploying a cloud application environment didn’t make for the most breathtaking real-time theater, the apps did seem to work as advertised once they were up and running.

Reis reported good progress in getting Ubuntu ported to ARMv8. He said that “99% of the main universe” is already up and going. Some of the important remaining “gaps” he identified have to do with proprietary components, like Oracle’s Java virtual machine. There’s also apparently work yet to be done in order to ensure ARM-based systems support low-level firmware standards for broad interoperability. UEFI support is now ready to go, but ACPI is still a work in progress, for instance.

A couple of other major software vendors, Citrix and Red Hat, were also on hand last week to signal their support for 64-bit ARMv8 servers. Both Xen and Red Hat Enterprise Linux are in development now for the 64-bit ARM ISA, and both firms appear to be committed to producing ARM versions of these core products for the long term.

We are at the beginning of something, obviously, and there’s much to be done before ARM-based SoCs can truly challenge Intel for the highest-profile roles in the data center. But the foundation is being laid, brick by brick, by software and hardware engineers from a range of companies whose names are familiar and not so familiar. This week’s revelation that AMD is joining the fray opens up new possibilities for ARM-based servers to challenge Xeons toe to toe, assuming the K12 core turns out reasonably well. It’s hard to say exactly what happens next, but it’s possible the data center will look very different five years from now, thanks to a swarm of invaders, big and small, that share almost nothing in common but an ARM license.

Comments
    • IntelMole
    • 6 years ago

    [quote]I doubt most X-Gene microserver modules will have half a terabyte of RAM onboard, but this possibility is still worthy of note.[/quote]

    Really? I can think of a huge use for this in creating "memcache accelerators". That's exactly the sort of compute-light, highly parallel, I/O-heavy application that this seems built for. I'm sure the Amazons and Googles of this world, not to mention SQL-heavy workloads that rely on memcache to avoid hitting the database, would buy a ton of those.

    • balanarahul
    • 6 years ago

    Call me crazy, but I really want AMD to just buy ARM, make ARMv8 a reasonable success, and then sell off their x86 license to Intel. They could even ask Intel a pretty penny for the x86-64 IP.

    But sadly, that’s never gonna happen.

      • ronch
      • 6 years ago

      1. AMD probably isn’t allowed to sell their x86 license or transfer it.

      2. If AMD does buy ARM they need to continue licensing it to others without jacking prices up or making changes to the licensing structure because they’re what make ARM spread out. We can’t expect AMD to pull it off alone.

      3. Another thing that makes ARM attractive is the industry is confident that current ARM management won’t just change their minds about licensing. They may not feel the same way when someone else just swoops in and buys ARM.

      • UnfriendlyFire
      • 6 years ago

      ARM actually has more cash and assets than AMD. Especially cash.

      Swallowing ATI was hard enough. Swallowing a company bigger than yourself would be absurdly bold.

    • the
    • 6 years ago

    I'm really curious what RAS features Applied Micro has incorporated into its X-Gene line. Basic RAS features like ECC are already available, so the first wave of server-targeted ARM SoCs will easily capture the LAMP market. Moving upward requires a bit more RAS to ensure that middleware applications and backend functionality remain up in the face of hardware failure. With the low end, it isn't that big of a deal if a front-end web server goes down - just kick it out of the load balancer and fix it at your convenience. A faulty app server causes problems for everyone, and if a back-end system goes down it could be the end of the world.

    The other issue with ARM scaling up in the server world is the software side. Open-source software is popular in the low-end segment, which is the main reason I see for ARM's rapid deployment there: the software side will be in place by the time the hardware ships. One step up in the server market, and the software support mostly vanishes. Oracle needs to support a JVM on the platform, and then a few pieces like Tomcat will be in place. Similarly, databases like MySQL will need to become supported on the platform (though they already have unsupported ports). Other things like low-level backup software will also have to be in place and officially supported.

    Virtualization is expected at this level, and ARM has the ISA parts in place. The problem is getting a good hypervisor for ARM. VMware has surprisingly come out against ARM in the server space. Microsoft hasn't announced any products outside of Windows RT and other consumer apps for ARM, so no Hyper-V. Xen works on ARM but isn't production quality yet.

    Going into the middle ground is possible, but it will take some time for the software side to be put in place. The hardware side of things will need a bit beefier RAS, faster cores (AMD K12?), and more memory support (multiple sockets + NUMA?). Looking at roadmaps from various players, these hardware aspects will likely be solved in a few years' time, though it looks like the software side will lag behind.

    The high end for ARM is going to be a long way into the future. Even x86 isn't entirely there. RAS features like memory and processor hot-swap are features in systems that can never, ever go down. The software side is even more dire, with Oracle and IBM not having made a hint of porting their large DB software. It will be a long time before ARM can compete at this level, though I can see the likes of IBM and Oracle fighting to keep their flagship software off of ARM as long as those companies have a hardware business. SAP so far has remained quiet about ARM, as they're likely taking a wait-and-see approach.

    • raghu78
    • 6 years ago

    Scott
    Nice article. I strongly believe that commoditization of computing power is bound to happen in the server market. Intel will face heavy competition, and over time its market share will fall, and so will its gross margins. There was an interesting and very insightful statement:

    “Intel has limited Avoton to a maximum 64GB of physical memory, perhaps in part to protect its high-margin Xeon business.”

    This is the exact problem which Intel will face in all segments of the market. ARMv8 custom cores from Apple (Cyclone), AMD (K12), Nvidia (Denver) are all comparable to Intel’s big cores.

    [url]http://www.anandtech.com/show/7910/apples-cyclone-microarchitecture-detailed[/url]

    "With six decoders and nine ports to execution units, Cyclone is big. As I mentioned before, it's bigger than anything else that goes in a phone. Apple didn't build a Krait/Silvermont competitor, it built something much closer to Intel's big cores. At the launch of the iPhone 5s, Apple referred to the A7 as being 'desktop class' - it turns out that wasn't an exaggeration."

    [url]https://techreport.com/review/26418/amd-reveals-k12-new-arm-and-x86-cores-are-coming[/url]

    "Keller was very complimentary about the ARMv8 ISA in his talk, saying it has more registers and 'a proper three-operand instruction set.' He noted that ARMv8 doesn't require the same instruction decoding hardware as an x86 processor, leaving more room to concentrate on performance. Keller even outright said that 'the way we built ARM is a little different from x86' because it 'has a bigger engine.' I take that to mean AMD's ARM-compatible microarchitecture is somewhat wider than its sister, x86-compatible core. We'll have to see how that difference translates into performance in the long run."

    [url]https://techreport.com/news/25852/nvidia-tegra-k1-soc-has-denver-cpu-cores-kepler-graphics[/url]

    All these companies do not have to artificially restrict performance like Intel has to on its Atom cores to protect its high-margin big-core products, which dominate in desktops/notebooks and servers. ARMv8 will commoditize computing power in every market Intel competes in, including ones it dominates today, like servers. The long-term casualty is Intel's gross margins and profits. Also, Intel's closeness with Google on Chromebooks will motivate Microsoft to give Intel hell. Microsoft will eventually provide their entire software stack - Windows, Office, and their entire server software lineup - for two ISAs: x86-64 and ARMv8. By the end of the decade, Intel will be a pale comparison of its current self.

      • chuckula
      • 6 years ago

      [quote]This is the exact problem which Intel will face in all segments of the market. ARMv8 custom cores from Apple (Cyclone), AMD (K12), Nvidia (Denver) are all comparable to Intel's big cores.[/quote]

      Of course they are. So is an 8088 from the 1970s. You can always [i]compare[/i] cores, but that doesn't mean they are [b]similar[/b]. The next-generation cores coming out for 64-bit ARM are finally on par with last year's Atom update in performance, although there's a very real chance that the purportedly magical power efficiency of ARM might not look so great as ARM scales to higher performance levels.

        • raghu78
        • 6 years ago

        “The next generation cores coming out for 64 bit ARM are finally on-par with last year’s Atom update in performance, although there’s a very real chance that the purportedly magical power efficiency of ARM might not look so great as ARM scales to higher performance levels”

        Cyclone crushes Silvermont in single-thread performance and is in the same class as Ivy Bridge. You didn't read the AnandTech article: Cyclone is a massive core with execution resources on par with Ivy Bridge. If you want confirmation:

        [url]http://browser.primatelabs.com/processor-benchmarks[/url]
        [url]http://browser.primatelabs.com/ios-benchmarks[/url]

        Single-thread performance:
        Core i3-3217U (1800MHz) - 1608
        Apple A7 (1400MHz) - 1384
        Z3770 (1486MHz, turbo 2400MHz) - 937

        The Core i3-3217U (Ivy Bridge) has 16% higher single-thread performance than the A7's Cyclone while running at 28% higher clock speeds, so stop thinking that Cyclone is Silvermont-class. Same for K12 and Denver. Btw, this is on a Samsung 28nm process. At 16/14nm FinFET it gets worse for Intel. Apple's successors to Cyclone, AMD's K12, and Nvidia's Denver will compete with Broadwell/Skylake. Intel will not want to make Atom too powerful, as it marginalizes their high-margin Core processors. The ARMv8 licensees are going to tear into Intel's market share like a pack of hungry wolves.

          • Klimax
          • 6 years ago

          One of many things wrong with that comparison: Cyclone cannot scale in frequency as high as Sandy Bridge and its successors, meaning it can't reach the performance of desktop chips at all. (It wasn't even designed that way.)

          No chip can so far challenge Intel's desktop chips, and for Xeon you have to look at POWER cores to find a challenger.

          I suggest returning to reality from ARM fantasy land, because so far you aren't anywhere close to it.

            • raghu78
            • 6 years ago

            “One of many things wrong with comparison: Cyclone cannot scale in frequency as high as SB+, meaning it can’t reach performance of desktop chips at all.”

            Apple is clearly looking to replace Intel across their product stack. That's the reason to design such a massive core. Apple's design team is capable of scaling the design when they choose to. It's also a matter of timing and process node tech. The A9 is the chip that's the perfect candidate for replacing Intel across their stack. Built on a Samsung or TSMC FinFET process, the successors of Cyclone can scale to 3+ GHz.

            Btw, AMD's K12 and the new x86 core are both high-performance designs and will compete against Intel's Xeon in servers. So don't fool yourself that ARMv8 cores cannot scale to 3+ GHz.

            Nvidia will go after the server market; that's the reason for Denver, which is another high-performance design. Don't think that only Intel can design high-performance cores.

            • Ninjitsu
            • 6 years ago

            [quote]Apple is clearly looking to replace Intel across their product stack.[/quote]

            Are they now? Who's going to replace all their software? Which consumer is going to repurchase all their software? What are they going to replace Intel with, anyway? Cyclone?

            [quote]Its also a matter of timing and process node tech.[/quote]

            Intel has the lead there, by a margin, so what's your point? TSMC's FinFET isn't hitting before 2016, and apparently area scaling isn't much at all. What on earth is the A9? That's like "Skymont" at this point.

            [quote]btw AMD K12 and the new x86 core are both high performance designs and will compete against Intel's Xeon in servers.[/quote]

            They are claimed to be high-performance designs but are unproven, and are expected in 2016. They'll compete with Skylake and Goldmont, which are equally uncharacterized as of now, but Intel's x86 designs have [i]proven[/i] to work well for almost a decade now. Also note, ARM competes in microservers with Avoton, not with Xeon. In 2016, don't expect K12 to compete with Xeon.

            [quote]that the reason for Denver which is another high performance design[/quote]

            On paper it may be, but it hasn't materialized yet, has it? And it was a dual-core design the last we heard. Same for Cyclone. Nvidia anyway has more of an interest in HPC using GPGPU than throwing CPU at the problem. They'll probably want Denver for control nodes, if at all there's any interest in putting Denver inside servers in the first place.

            [quote]Don't think that only Intel can design high performance cores.[/quote]

            No one thinks that; that's just what we've consistently seen. Both Apple and Qualcomm have designed and implemented very good architectures, but they're still playing in their own market. Intel's not had that much experience with the ARM market, but they've still done a commendable job so far. Nothing from the ARM camp has attempted a full Core-sized chip, and that's not exactly a trivial matter. Cyclone is still smaller than Core, and [i]we don't know how it scales up yet[/i]. I think Kaveri/Steamroller should be proof enough that something that looks good at 45W may not look great once it's pushed to the limit. Factors are many, but the point is that the end product can always fail to deliver in some cases. Cyclone, Krait, and A5x are high-performance mobile cores, and they do that job well. Expecting them to magically work well in a workstation or HPC environment isn't wise.

          • chuckula
          • 6 years ago

          Cyclone comes nowhere near “crushing” silvermont at single-core performance and the Anandtech article merely notes that Cyclone is a very very large core. Basically, Cyclone = large x86 core size, for substantially less than x86 core performance (and that’s not even getting into power efficiency where Haswell Macbook Airs are getting 15 hour battery life ratings).

          Geekbench scores running different software on different operating systems are a joke and show you are more interested in pushing an agenda than what’s happening in the real world.

            • raghu78
            • 6 years ago

            Since Geekbench scores show something you don't like, you are quick to dismiss them. Btw, 64-bit Windows is a much older and more mature OS than iOS 7, so if there is any disadvantage, it's to Apple's brand-new chip/OS combo. Anyway, keep up the cheering for Intel.

            • accord1999
            • 6 years ago

            People dismiss Geekbench because it’s a simplistic suite of synthetic benchmarks that are heavily affected by compilers and settings. Intel’s mainstream x86 processors have proven themselves quantitatively over the entire range of consumer and server software, while the vast majority of benchmarks for ARM processors are basically Javascript tests.

            • Ninjitsu
            • 6 years ago

            Well, in that case, 64-bit Windows will [i]always[/i] be older and more mature than iOS 7, the same way x86 is far older and more mature than ARMv8, the same way Intel will [i]always[/i] be older and more mature than Apple's CPU design team. Not a great argument.

            EDIT: And following this train of thought, what's the point in everyone migrating to ARM? You'll always get more mature software on the x86 side. Of course, this argument is slightly flawed, so all we can really do is wait and watch instead of predicting doomsday.

      • just brew it!
      • 6 years ago

      I think Linux — with its Open Source ethos and army of “we’ll port this sucker to anything” coders — has had a lot to do with the commoditization of computing power. It’s already well underway. Heck, even the move to x86 for servers was a big step in that direction; it stole market share away from the traditional server CPU vendors, and even cannibalized Intel’s own Itanium line once AMD-64 caught on. With enterprise-class features and decent support for virtualization, you can do an awful lot with commodity hardware + Linux which would’ve previously cost 10x as much.

      • Ninjitsu
      • 6 years ago

      Well, you have to remember:

      1. Intel isn’t sitting on its arse doing nothing. They’ve still got some of the best (of everything) in the industry.

2. Except for Cyclone (which is a custom ARMv8-compatible design, so don't expect the same performance from everyone), ARMv8 from other players is so far [i<]unproven[/i<]. Cyclone isn't heading into servers (or Macs) anytime soon.

3. There are indications that 2016/2017 may see Core-Atom unification on 10nm.

4. Power consumption. Intel still has the lead on efficiency once you get into high-performance situations.

5. Knight's Corner.

6. Crystalwell's children.

7. AMD and ARM have pushed Intel [i<]again[/i<]. Lisa Graff's already countering on the desktop side, Haswell-E hits this year, and we don't know what Broadwell's characteristics are.

8. Your last line may well apply to AMD and ARM if TSMC and GloFo can't keep up with Intel's fabs.

9. More "Intel is dead" rhetoric, seriously? I'd love to see more competition as well, but I really don't think Intel's going away within this decade. They'll probably take us to the next era in computing.

    • chuckula
    • 6 years ago

    As I have been saying for a very long time, there’s no reason you couldn’t build a high-end server platform based on ARM… it’s just that it would end up looking suspiciously like what you can already get from Intel or AMD… and these slides just seem to back that assertion up.

      • ronch
      • 6 years ago

      Unless upcoming big ARM cores from the likes of Qualcomm, AMD and Broadcom seriously kick x86’s butt it’s not gonna be easy for ARM to dislodge x86. I’ve said it so many times — few companies have the willingness, much less the resources, to port all their stuff over to ARM and validate them. Of course one could argue that it wouldn’t matter because the few big companies able to make the transition can make all the difference, but nonetheless, it’ll be a long uphill battle. Intel won’t take this lying down. Ironically, it’s AMD that’s playing the most interesting part. Who would’ve thought?

        • the
        • 6 years ago

        Agreed. The low end will move quickly though as open source software (LAMP stack) can be quickly ported. In many cases, it is already running on ARM in some capacity. It isn’t going to be the hardware tying ARM’s arm behind their back but rather getting the midrange software pieces ported, running and supported.

          • ronch
          • 6 years ago

          Thing is, lots of big, rich companies don’t use open source and instead use Oracle, SAP, etc., and the cost of those packages make the price of staying with x86 trivial. As for open source being used by large businesses, that’s a bit of an illusion. Even finding a decent open source ERP package for a small business with proper support and rock-solid reliability is like looking for a needle in a haystack. No, make that a needle in a rice field.

            • the
            • 6 years ago

Open source is used in a very large percentage of big businesses. The thing to look at is where they're using it. Linux, Apache, and the rest of the LAMP stack are all open source and very, very common in business. This is where ARM will take off initially.

To go deeper, they will need the support of commercial applications and more RAS in the platform. I see the hardware side evolving rapidly as ARM gains in popularity. The software side is an open question. It will also be an opportunity for smaller software players to attack the incumbents by supporting ARM. It'll be interesting to watch this all unfold.

            • Flatland_Spider
            • 6 years ago

Those rich companies were running Big Iron well before x86 was anything more than an embedded ISA from a small company in California, and many of those rich companies use all sorts of ISAs: SPARC, Itanium, Power, whatever gets the job done. They can also hire platoons of consultants and support staff for the odd hardware they buy.

Oracle, SAP, etc. were on Big Iron first, and many of those packages were also ported to Itanium when it was a thing. They aren't concerned about porting their software to another ISA.

x86 needs those big rich companies and big software packages more than they need x86. At their scale, $100,000 is a rounding error. They find that in the couch on the weekend.

        • UnfriendlyFire
        • 6 years ago

AMD gave up directly competing against Intel on x86 performance when the desktop FX line got axed.

Though I thought they were dead set on the APU+HSA strategy, with the ARM stuff as a side project.

Well, apparently AMD had some cards up its sleeve. The question is, are those cards twos to fours, or queens to aces?

          • ronch
          • 6 years ago

          AMD’s upcoming x86 core is codenamed “Ace”.

          Just kidding.

    • dpaus
    • 6 years ago

    Note to Neely:

[quote<]The industry seems to want an alternative to Intel[/quote<]

Was dpaus right again?!? Yes he was! 🙂

I'm so stupid-busy these days I barely have time to read anything, let alone participate. But I've been trying to tell you that ^ for years, and you never believed me (or, more correctly, you insisted on denying the Truth that was right in front of your [closed] eyes!), so when I saw it from another source, it was well worth the 30 seconds required to rub it in!

(Kinda like wing sauce, now that I think about it....)

      • chuckula
      • 6 years ago

[quote<]The industry seems to want an alternative to Intel[/quote<]

AMD sez: WHAT ABOUT US!!?!?!!?

        • MustangSally
        • 6 years ago

        Chopped liver, potted plant, etc.

      • NeelyCam
      • 6 years ago

      The industry definitely wants an alternative to Intel – I don’t deny that. The problem is that this alternative has to be as good as what Intel offers, and that just doesn’t seem to happen.

        • the
        • 6 years ago

        Server buyers also want them to cost less than Intel alternatives. The few examples that can best Intel (see IBM’s POWER8) are typically the exact opposite of cheap.

          • Kurotetsu
          • 6 years ago

          Perhaps the new OpenPower initiative can do something about that?

            • the
            • 6 years ago

Maybe, but pricing is a big factor here, and IBM hasn't announced how much POWER8 is going to cost. IBM also hasn't indicated which specific configurations and clock speeds of POWER8 will be released for OpenPOWER systems.

      • Krogoth
      • 6 years ago

      Nvidia is also on the trail.

      A fierce battle is brewing between GPGPUs and CPUs in the HPC world.

        • Ninjitsu
        • 6 years ago

        Intel’s also heading towards GPGPU with Knight’s Corner.

      • nanoflower
      • 6 years ago

I don't agree with you. What the industry wants is something cheap to throw into cheap devices. They don't care whether it's from Intel or ARM (through the licensees), so long as what they get is reliable and well supported. ARM is just making it possible to beat Intel on cheap chips that work well enough (and can easily be customized, which makes them even cheaper, since you don't have to include features a given product doesn't need in your ARM design). If Intel sold its Atoms at similar prices and was willing to do the same custom designs (or let companies take the Atom and do their own customization), I believe most companies would happily take Intel up on the offer.

The problem is that Intel doesn't like cutting its margins, and even if it did so to win the business, it wouldn't keep the margins low once ARM was no longer a competitor. ARM is happy making less money by just licensing the technology. Intel isn't willing to do that.

    • UnfriendlyFire
    • 6 years ago

If I were running Intel, and ARM-based chips were starting to threaten the very lucrative server market (Intel makes most of its money from servers, but uses the desktop/laptop market to justify running so many fabs and to improve the return on investment from process improvements)…

    I would probably react by convincing the FTC that ARM is a competitor of x86. Which shouldn’t be that hard given how pliable the FTC is with the ISP industry.

    Once the FTC is dealt with, I would sell Iris Pros at bargain prices and release new ones with additional GPU cores bolted on. And ramp up GPU driver development to stop being the butt of crap driver jokes. And slightly reduce the price of Xeon server CPUs.

    Which would directly challenge AMD’s APUs and server chips.

    After AMD folds… ARM is next.

EDIT: Nvidia would also be weakened, since an APU war would result in both Intel and AMD ramping up their GPU performance until one of them surrenders.

And that would threaten Nvidia's mid-range GPUs, forcing them to either improve performance or cut prices.

      • NeelyCam
      • 6 years ago

[quote<]After AMD folds... ARM is next.[/quote<]

Intel likes to destroy companies in alphabetical order. I really hope they leave Bethesda alone, though.

        • chuckula
        • 6 years ago

[quote<]Intel likes to destroy companies in alphabetical order.[/quote<]

Qualcomm is safe for quite a long time in that case.

          • UnfriendlyFire
          • 6 years ago

Qualcomm needs ARM's license to produce its chips.

Unless they bought it while ARM's and AMD's corpses were being looted by patent trolls, Apple, Facebook, and Google.

            • nanoflower
            • 6 years ago

If ARM is being sold off for the patents, why wouldn't Intel buy them up? The ideal situation might be Intel buying ARM so that it gets money from all the future ARM licensees.

          • jihadjoe
          • 6 years ago

          Zenimax?

        • ronch
        • 6 years ago

        Hey, what about Cyrix?

          • UnfriendlyFire
          • 6 years ago

          Well AMD also sorta played a role in Cyrix’s downfall by simply being an x86 competitor.

            • ronch
            • 6 years ago

            I think Cyrix shot itself in the foot when they couldn’t get their Jalapeño and Mojave cores up to speed, plus Natsemi’s misdirection of the company and belief that they needed to concentrate on their MediaGX lineup at that time. Oh well, my memory’s kinda hazy after 15 years. Nonetheless, I’m still a Cyrix fan, FWIW.

          • jihadjoe
          • 6 years ago

          I blame AMD for that. The two companies were basically competing over who got to be the “Intel alternative.” AMD won, and Cyrix died.

        • the
        • 6 years ago

        They’ll have to kill Apple before Bethesda though.

          • NeelyCam
          • 6 years ago

I think they skipped Apple when going for ARM. Had something to do with some deal between Otellini and Jobs.

      • ronch
      • 6 years ago

      Oh yeah, that’d be cool. AMD dead, ARM dead. Intel will charge you both kidneys for a Pentium Dual Core. And no, IBM won’t be there to save you because they’re practically greedier than Intel, and you’d know what I mean if you’ve been using and maintaining IBM servers for the past few years.

      • HisDivineOrder
      • 6 years ago

      Intel doesn’t ask to be allowed to do something.

      They’re very much a, “Better to ask forgiveness than ask for permission” kinda company.
