Qualcomm readies up 48-core Centriq 2400 ARM server chip

Maybe 2017 will be the year that ARM servers finally become a thing. After demoing a 24-core server chip a little more than a year ago, Qualcomm's Datacenter Technologies subsidiary has announced the Centriq 2400 CPU. This new chip is a 48-core ARMv8 processor based on a new in-house CPU core design called Falkor, and it's compliant with ARM's Server Base System Architecture specification. Earlier in the week, Qualcomm showed off the new hardware running "a typical datacenter application" comprising Linux with Java and Apache Spark.

Qualcomm makes some pretty bold claims about its new chip. The company says the Centriq 2400 is the first-ever server processor fabricated on a 10-nm process, and that it's power-efficient, scalable, and will offer "high performance." In the context of the datacenter, "high performance" means going head-to-head with Intel's Xeons, and that's usually regarded as a tall order, even for atypically powerful ARM processors.

There's been a lot of talk about ARM CPUs heading into servers in the past few years, but that idea has yet to really pan out. Previous attempts in this space failed to be competitive when dealing with typical server workloads and were generally no more power-efficient than Intel's offerings. Qualcomm is staying tight-lipped about the technical details of the Falkor core and the Centriq 2400's platform, so perhaps the CPUs have the secret sauce needed to compete with Xeons.

Qualcomm says it's already sampling Centriq 2400s to industry partners, and that it expects commercial availability of the new chips in the second half of 2017.

Comments closed
    • lycium
    • 3 years ago

    “and were generally are”, ” it expects expects”

    Edit: lol, downvotes for posting some (rather egregious) errors? Awesome, that’s enough of that, won’t do it again.

      • morphine
      • 3 years ago

      Thanks for the heads-up. Those are commonly known as “editing orphans.” Fixed 🙂

    • ronch
    • 3 years ago

    Considering how ARM server chips seem to be sinking in quicksand and how ARM server chipmakers would love to revive this dream of unseating Intel as the de facto server chip option, I think said ARM core should’ve been called Artax.

    • Tirk
    • 3 years ago

    Maybe Qualcomm harnesses the imagination of children to make these Falkor chips and takes us through some Neverending Story?

    Save us from the Nothing, Qualcomm!

    • VincentHanna
    • 3 years ago

    And the truth comes out.

    THIS is why MSFT is launching an ARM-compatible Windows.

      • just brew it!
      • 3 years ago

      I don’t think the enterprise server market is going to care much about running 32-bit applications under emulation.

    • TheJack
    • 3 years ago

    I would say the most important factor for the success of these chips is how determined Qualcomm is to push them forward.

    • Firestarter
    • 3 years ago

    [quote<]the first-ever server processor fabricated on a 10-nm process[/quote<]

    *for a certain definition of 10 nm that may or may not be based on an actual 10-nm feature size in any part of the CPU

    I imagine any professional in the semiconductor industry involuntarily rolled their eyes while reading that.

    • ronch
    • 3 years ago

    Try and try until you die, I say.

    • Unknown-Error
    • 3 years ago

    This is not going to change things. ARM server SoCs have a very long way to go. Even AMD's K12 seems to be dead.

      • just brew it!
      • 3 years ago

      Qualcomm is a much bigger company than AMD, has more resources at its disposal, and has more experience building ARM chips, so I wouldn’t necessarily say that K12’s (apparent) demise implies that Centriq will also fail.

      I do think it is a bit of a long shot though.

    • brucethemoose
    • 3 years ago

    Any mention of the die size?

      • chuckula
      • 3 years ago

      YUGE!
      — Qualcomm’s new marketing director with strange hair.

      • RAGEPRO
      • 3 years ago

      [quote<]Qualcomm is staying tight-lipped about the technical details of the Falkor core and the Centriq 2400's platform, so perhaps the CPUs have the secret sauce needed to compete with Xeons.[/quote<]

      • ronch
      • 3 years ago

      Clue: Texas

    • AMDisDEC
    • 3 years ago

    I love American marketing!
    They'd have us believe a cellphone chip will one day be competitive with Intel or AMD desktop/server offerings.
    Never going to happen.
    MIPS had a 64-bit chip years ago that would be much more competitive because it was designed from the ground up for performance, i.e. the 64-bit Alpha.
    Too bad AMD execs failed to possess the insight to outright buy the MIPS IP and allowed Imagination to grab it.
    If they had, they would now possess an ultra-scalable CPU that would force ARM to abandon all hope of scaling a cellphone core up to the datacenter.

      • just brew it!
      • 3 years ago

      Server workloads typically parallelize well, and aggregate performance matters more than single-thread performance. If 48 lightweight ARM cores can do the work of 16 x86 cores, while using less power, then it is a potentially viable solution for the datacenter.
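
      To put rough numbers on that trade-off (a sketch with made-up figures, not Centriq or Xeon specs), aggregate throughput per watt is the metric that matters:

      # Hypothetical sketch: aggregate throughput per watt of a many-core ARM
      # part versus a fewer-core x86 part. All numbers are invented for
      # illustration only.
      def perf_per_watt(cores, per_core_throughput, package_watts):
          return cores * per_core_throughput / package_watts

      arm = perf_per_watt(cores=48, per_core_throughput=1.0, package_watts=100)
      x86 = perf_per_watt(cores=16, per_core_throughput=3.0, package_watts=145)

      # Same aggregate throughput (48 * 1.0 == 16 * 3.0), but the lower package
      # power gives the many-core part the better performance/watt.
      print(arm, x86)   # 0.48 vs. ~0.33 work units per watt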

      The bigger issue I see is building up an ARM ecosystem for enterprise-class hardware. RAID controllers, NICs, etc. will all need ARM firmware to be usable with these chips.

        • AnotherReader
        • 3 years ago

        They parallelize well, but server CPUs also need a beefy uncore to bring all of those cores together. That uncore also consumes power; on top of it, servers have a lot of memory which consumes more power. It remains to be seen if Qualcomm can compete with Broadwell-EP. I think we should all be skeptical until they prove otherwise.

          • Srsly_Bro
          • 3 years ago

          The Purley platform will be out soon and is already sampling to select partners. Broadwell is old news.

        • xeridea
        • 3 years ago

        Depends on the load. Anything latency-sensitive (such as a webserver) would be a no-go. Many other tasks would be fine, though.

      • AnotherReader
      • 3 years ago

      Alpha wasn’t made by MIPS; sadly, the correct answer is part of your username.

        • AMDisDEC
        • 3 years ago

        I never stated that Alpha was made by MIPS, and it isn’t sad that DEC made Alpha.
        What’s really sad is the dense mentality that is pervasive in the non-diverse US semiconductor industry that led to DEC’s demise and soon, AMD’s as well.

          • AnotherReader
          • 3 years ago

          Would you be able to expand on the nature of this “dense mentality”?

      • the
      • 3 years ago

      I strongly suggest you take a look at Apple’s CPU designs in the mobile space. The big advantage Intel has over them in desktop/server chips is a very robust SIMD unit. Apple is very competitive in integer and scalar floating point. A large-scale deployment of Apple’s CPU cores would be a disruptive force in the data center if they can get core count to scale nearly linearly with power consumption.

      As for the rest of your comment, MIPS died out in the server space eons ago. SGI was their last supporter and preemptively jumped ship to the Intel/HP Itanium. Somewhat ironically, SGI was recently purchased by HPE.

      Alpha, on the other hand, was a different entity from MIPS and lasted longer than that platform in the datacenter. Alpha was ahead of its time but was ultimately killed off by office politics in the HP/Compaq merger.

        • AMDisDEC
        • 3 years ago

        My point is, MIPS had scaled the MIPS core from embedded to 64-bit server class with no compromise in power savings for embedded, or in performance for MIPS64.
        The architecture was/is superior to ARM, and AMD had a MIPS license during their Opteron peak years (i.e. Alchemy), but because they were trying so hard to compete with Intel on performance, they missed the opportunity to leverage their MIPS IP to compete with ARM in embedded and to pit MIPS64 against ARM’s server push.
        Plus, AMD hyped HyperTransport, which was actually a DEC/Samsung Alpha interface.
        Still, since ARM is not as scalable as MIPS, they have little to no chance of actually competing with x86 in the server space. Their only real advantage is lower power, at the price of performance.
        Didn’t China reach the top 5 of supercomputer performance a few years ago using a MIPS (Alpha) derivative while also having the lowest power consumption?

          • the
          • 3 years ago

          Alpha is not MIPS. They are two radically different architectures.

          In your comparison there is little to no difference between how ARM and MIPS moved in the market. At the time, MIPS had more mobile-focused cores while simultaneously offering higher performance chips for servers. This is very analogous to what ARM is doing today.

          AMD’s embedded strategy during the Athlon XP/early Athlon 64 days pretty much made sense due to AMD recycling existing designs for that space. You have to have some massive amount of sales lined up to make money in the embedded space due to the lack of margin per chip. It was wiser for AMD to keep their focus on x86, as that is where the profit was (and still is).

          HyperTransport is AMD’s own creation. It was the previous bus, used by the Athlon classic/Athlon XP, that was licensed from DEC.

            • AMDisDEC
            • 3 years ago

            Alpha was a MIPS64 derivative that incorporated many VAX features. In the end it was different from both MIPS and VAX.

            Yes, the HyperTransport interface was initially developed for the Alpha processor by a small Boston start-up funded by Samsung with the intent of Samsung buying the firm. The deal fell through and the execs of the company convinced AMD to purchase them and AMD adopted HT for the Opteron.

            During the Athlon XP/Athlon 64 period AMD had no embedded focus.
            Their focus was strictly desktop and server.
            During that period embedded revenue generated by Intel was $1-2B annually, while embedded revenue for AMD was less than $1M.
            If you called AMD then and told them you were working on an embedded design and wanted to use their Athlon processor, their response was, "we don't do that." They were too busy focused on competing with Intel on the desktop.
            It wasn't until AMD released the Opteron that they began to think seriously about embedded. Prior to that, their embedded products consisted of Geode and Alchemy, neither of which was successful.
            Today AMD still doesn't comprehend the embedded market, which is why they abandoned their MIPS license and are now attempting to do ARM.

            • the
            • 3 years ago

            Alpha development was independent of MIPS. There are similarities, as both are RISC, but there are clear differences in instruction formatting and opcodes. The similarities are where you'd expect, with common traits such as 32-bit instructions, three-register operand support with a 32-entry register file, and 16-bit immediate support. Alpha was always its own beast.

            AMD's embedded efforts were mainly focused on the K6 lineup while the Athlon XP was dominant on the desktop. The Geode lineup didn't arrive until the Athlon 64 era, and even then only through AMD purchasing that business from National Semiconductor. It sufficed until AMD was ready to bring out embedded products based upon Athlon 64 cores.

            Much like how the Geode products came via an acquisition, AMD got the Alchemy lineup by purchasing the original company. Ironically Alchemy and Geode were both absorbed into AMD during 2002.

            I also wouldn’t compare revenue of Intel’s embedded division to AMD since Intel lumps so many IO chips and microcontrollers into that division. In other words, Intel sells more than just processors in that space.

      • raddude9
      • 3 years ago

      [quote<]MIPS had a 64 bit chip years ago[/quote<]

      And that was actually a big problem for MIPS; they went 64-bit way back in 1992, which was too early. They were so hell-bent on being first to 64-bit that they squandered their transistor budget on the number of bits rather than on improving performance. It also meant that at a stroke they pretty much abandoned the low- and mid-range workstation market, as the cost of 64-bit machines didn't start to come down until many years later.

      [quote<]They make believe an cellphone chip will in the future be competitive with Intel or AMD Desktop/Server offerings.[/quote<]

      You've got a short memory; it wasn't so long ago that people sneered at the idea that a lowly desktop x86 chip would be competitive with the many powerful RISC server offerings.

      [quote<]Too bad AMD execs failed to possess the insight to outright buy the MIPS IP[/quote<]

      You're kidding, right? MIPS is dead as an instruction set. Like it or not, Intel managed to pretty much kill off all of the high-end RISC architectures by undercutting them. ARM only survived because they managed to find a niche in the cheap & low-power market, but they've been so successful that they now have a shot at undercutting x86.

      With the very little we know about this chip, it does actually look like it could be the best attempt so far at getting ARM chips into the server market. Which, with the current state of that market, can only be good news for consumers.

        • AMDisDEC
        • 3 years ago

        Yep, MIPS is pretty much dead at this point except in deeply embedded designs, because Imagination purchased the IP.
        I wouldn't hold my breath waiting for ARM to undercut Intel x86. It will never happen.
        Just like AMD, in the server market, ARM will be stuck following Intel's lead. The only advantage ARM has over x86 is lower power, and Intel can scale their products to match that, which means ARM has no advantage.
        The only reason ARM survived Intel is because they addressed a market with margins so small that Intel ignored it. Intel won't ignore the server market, and even a 64-core ARM64 won't be competitive with a 24-core Xeon, especially the next-gen parts incorporating FPGAs and GPUs.

          • raddude9
          • 3 years ago

          [quote<]I wouldn't hold my breath while waiting for ARM to undercut Intel x86.[/quote<]

          ARM have already undercut Intel in Android tablets and phones; Intel tried really hard to compete but failed. And I don't think anyone is holding their breath for ARM chips to compete in the server market. I'm just hopeful for more competition, because the price of Intel's server chips is getting ridiculous.

          [quote<]The only advantage ARM has over x86 is lower power[/quote<]

          And price, and the ability of companies to customise the core to serve particular markets.

          [quote<]Intel won't ignore the server market and even a 64 core ARM64 won't be competitive with a 24 core Xeon[/quote<]

          There's no reason why an ARM core can't match the IPC of Intel's Core chips, and if you look at the trends, ARM chips have been closing the IPC gap over the last few years.

          [quote<]especially the next gen parts incorporating FPGAs and GPUs.[/quote<]

          ARM chips have been incorporated into FPGAs and GPUs for years. Whether it makes sense to bundle very different functionality into the same chip is another matter. For GPUs we can see that they tend to be better when they are on their own.

          Anyway, Intel is facing an interesting year, what with x86 Win 10 running on ARM chips, a serious attempt by Qualcomm to get into the server space, and renewed x86 competition from AMD. It's going to be a few years before the implications of these events become clear.

            • chuckula
            • 3 years ago

            You’re mixing & matching supposed advantages of ARM chips to create a hypothetical device that doesn’t exist (for good reason).

            For example: you claim that SOME ARM chips operate with low power... yeah, there's a reason for that.

            You claim that SOME ARM chips are cheap… once again there’s a reason for that.

            You claim that SOME ARM chips can have the same effective performance as a high-end Xeon: unlike the previous two categories, these chips do not exist, and Qualcomm probably hasn't made one out of thin air.

            You then assume that because SOME ARM chips are low power, and SOME ARM chips are cheap, and a hypothetical server-grade ARM chip will be on par with Xeons in performance, that the hypothetical ARM chip will be: 1. Cheap; 2. Very low power; and 3. As fast or faster than a Xeon.

            It doesn't work that way, and I could play the exact same game with x86 parts, since Atoms certainly are cheap, low power, and frankly just as fast as comparable ARM parts with equivalent core sizes.

            • raddude9
            • 3 years ago

            [quote<] You claim that SOME ARM chips operate with low power[/quote<]

            I'm not claiming that, that's a fact.

            [quote<]You claim that SOME ARM chips are cheap[/quote<]

            Again, that's a fact, not a claim.

            [quote<]You claim that SOME ARM chips can have the same effective performance of a high-end Xeon[/quote<]

            No. I didn't claim that at all. I stated, very reasonably, that the IPC of ARM chips has been catching up with Intel's x86 chips. Case in point, Apple's latest ARM chips have very good IPC, not far off Intel's Core line. But IPC is not the same as performance; do I need to explain why?

            [quote<]these chips do not exist and Qualcom probably hasn't made one out of thin air.[/quote<]

            I don't know what resources Qualcomm has used to make these new chips, and I don't think you do either. But I do know that Qualcomm has a history of designing chips and they have more resources than almost every other chip maker out there. So I'm not going to assume that they haven't managed to create a chip with good IPC and a decent clock speed at reasonably low power. The exact balance of the chip remains to be seen.

            [quote<] a hypothetical server-grade ARM chip[/quote<]

            Hypothetical? Granted, we don't know its performance characteristics yet, but isn't this chip sampling already?

            [quote<] will be on part with Xeons in performance that the hypothetical ARM chip will be: 1. Cheap; 2. Very low power; and 3. As fast or faster than a Xeon.[/quote<]

            Are you replying to my post at all? Where did I say that it would be as fast as a Xeon? And which Xeon are you talking about, and which benchmark? And where did I say that it would be cheap? I reckoned that it would be cheaper than the equivalent Intel chip, but that does NOT mean that it would be cheap.

            Dude, I think you're losing it; you're so keen to defend your beloved Intel that you're reading way more into my words than is actually there. And why don't you look forward to more competition in the server market? The threat of the previous generation of thus-far-unsuccessful ARM server chips has forced Intel to innovate with their Xeon-D line. Without competition we don't get innovation.

            • chuckula
            • 3 years ago

            Long wall of text that basically 100% supports this supposition: Intel could easily take over the discrete GPU market.

            Why?

            Well let’s look at their IGPs!

            After all, a CHEAP Atom chip with a Skylake-grade IGP is incredibly inexpensive and delivers great performance for its power envelope.

            There are 45 Watt Skylake chips with EXCELLENT Graphics performance for their POWER envelope!

            Therefore, all Intel has to do is press a magic button and CHEAP plus LOW POWER will magically scale into a discrete GPU that will outperform AMD and NVidia while being CHEAP and using little or no POWER!

            Are you actually naive enough to believe anything I just posted? Because it’s literally the same thing you just posted.

            • raddude9
            • 3 years ago

            Read more carefully.

            [quote<] Intel could easily take over the discrete GPU market.[/quote<]

            I didn't claim that anyone will take over any market, easily or otherwise. I do, though, think it's stupid to write off this chip without knowing any details of how it performs, which is what you're doing.

            • the
            • 3 years ago

            *raises hand*

            I think you've picked a really bad example for your point, as Intel really could beat AMD/nVidia on price, performance, and power consumption in the discrete GPU market. It would be silly to say that Intel will produce a 450 mm^2 chip, sell it for $180, consume less power than a mobile GP106, and outperform the Pascal Titan. However, it is well within Intel's ability to release a chip comparably sized to the GTX 1060, sell it for less, consume less power, and outperform it.

            The main limiting factor for Intel isn’t their engineering talent but rather management who want to focus on x86 everywhere even though they could capture the GPU market.

            ARM on the other hand isn't in the same scenario as Intel storming the discrete GPU market. Certainly ARM has some high performance core designs (hello Apple), low power designs, and licensing arrangements that only net ARM royalties measured in cents, but combining them all into a competitive server chip is a different story. That also ignores the critical RAS features necessary for a true datacenter product, which are rather lacking in the ARM ecosystem vs. Xeons.

            That hasn't stopped various players from trying. Cavium's ThunderX is another 48-core custom ARM design that [url=http://www.anandtech.com/show/10353/investigating-cavium-thunderx-48-arm-cores<]performed in the middle of the pack[/url<] against lower core count Xeons. The results were a good mix, with the Cavium system able to beat the Xeon D and Xeon E5 in performance on select tests, and a couple of times it was able to best them in performance/watt too. Ultimately I wouldn't call it a victory for Cavium, but it did provide enough data to indicate that Intel should be worried about ARM in the data center if designers can continue to improve ARM-based cores. I would argue that Qualcomm does have the expertise to do just that, and I'm eager to see independent benchmarks of this new chip.

            • synthtel2
            • 3 years ago

            Intel could probably win on performance/watt, sure, but price? Intel GPU logic takes an awful lot of die area for a given amount of performance.

            Measuring on some die shots, I come up with 43 mm[super<]2[/super<] for Skylake GT2 (not including memory and I/O logic a dGPU would need). Direct comparisons between it and more serious GPUs are tough to find, but some [url=http://www.anandtech.com/show/9483/intel-skylake-review-6700k-6600k-ddr4-ddr3-ipc-6th-generation/14<]Anandtech stuff[/url<] shows it a bit behind a GCN 1.0 card with similar memory bandwidth and 450-500 GFLOPS. This matches that iGPU's theoretical number-crunching ability of ~450 GFLOPS.

            If they scale about as well as AMD and need to throw 11-12x the hardware at it to get a 1060's performance, that results in a die size around twice what it takes their competition. Presumably, to get the performance/$$$ into a more reasonable range, they'd boost clocks well above those of their iGPUs, but that loses them their performance/watt advantage. Whether or not that leaves them with a competitive product, public information may be insufficient to know, but if anyone happens to know what voltage Intel iGPUs run at, I'd take a crack at it.

            (For the record: chuckula, you're getting facts like this right, but don't take this as support of your position. Raddude9 isn't saying the things you're thinking he's saying.)
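
            Spelling out that scaling estimate in code (a sketch using only the rough figures above; the 43 mm^2 measurement and the 11-12x factor are this comment's estimates, and the GP106 die size is approximate):

            # Rough die-area scaling sketch based on the estimates quoted above.
            gt2_area_mm2 = 43.0       # estimated Skylake GT2 GPU logic area (die-shot measurement)
            scale_factor = 11.5       # midpoint of the 11-12x hardware estimate for GTX 1060-class performance

            est_intel_gpu_logic = gt2_area_mm2 * scale_factor   # ~495 mm^2 of GPU logic alone
            gp106_area_mm2 = 200.0    # the GTX 1060's GP106 die is roughly 200 mm^2

            print(est_intel_gpu_logic)                   # ~494 mm^2
            print(est_intel_gpu_logic / gp106_area_mm2)  # ~2.5x: in the ballpark of "around twice" the competition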

            • synthtel2
            • 3 years ago

            W.r.t. up-clocking to balance this stuff out: apparently people do overclock Intel graphics sometimes (I had no idea). All the stuff I could find was on Ivy/Haswell, but it looks like they have about 50% headroom in them, tops. It's the infamous Charlie, but [url=https://semiaccurate.com/2012/04/23/overclocking-intels-hd-4000/<]here's[/url<] the most detailed source I found.

            The consensus seems to be that Ivy/Haswell GPUs run something under a volt stock, with some of the >40% overclocks pushing >1.3V. If the power use versus voltage curves look about the same there as for the CPU part (which isn't a given), that represents a huge hit to efficiency (bigger than I was expecting). If Skylake is similar (again, not a given), that means they could never reach performance/$$$ parity with AMD/Nvidia, but they could very easily wreck their performance/watt trying.

            Maybe a better question (at least from Intel's perspective) is who exactly doesn't need the best absolute performance but would pay top dollar for more performance/watt.
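
            To put a rough number on that efficiency hit (a sketch assuming dynamic power scales roughly as frequency times voltage squared, with the ballpark stock/overclock figures from above):

            # Dynamic power roughly follows P ~ f * V^2.
            stock_v, oc_v = 0.95, 1.30   # ~stock vs. heavily overclocked GPU voltage (ballpark)
            clock_gain = 1.45            # ~45% overclock

            power_factor = clock_gain * (oc_v / stock_v) ** 2
            perf_per_watt_factor = clock_gain / power_factor   # simplifies to (stock_v / oc_v) ** 2

            print(power_factor)          # ~2.7x the power draw...
            print(perf_per_watt_factor)  # ...for ~0.53x the performance/watt: a big efficiency hit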

            • the
            • 3 years ago

            I've heard of Intel GPU overclocks hitting 2 GHz, but that requires a bit of work and the results are less than impressive due to diminishing returns. The thing about Intel's GPUs is that they're severely constrained by memory bandwidth; this is why the i7 5775C was such a big deal, as it had the extra 128 MB of eDRAM cache on its own dedicated high-bandwidth link. The other factor with overclocking is that the CPU and GPU share a power budget in this situation. Increasing both the CPU and GPU clock speeds can easily create a throttling situation. The high GPU clock speeds I've read about on other forums were actually achieved by reducing CPU clocks in conjunction with increasing overall socket power limits.

            The Iris Pro 6200 is roughly equivalent to the vanilla GeForce GTX 750. The die size of the i7 5775C has been rather elusive, but estimates I've seen place it around 181 mm^2. This is larger than the 148 mm^2 of the GM107 die used in the GTX 750. This isn't a fair comparison, as Intel is also putting four CPU cores and lots of cache on that die, whereas nVidia has disabled some of the shaders on the vanilla GTX 750, meaning some of that 148 mm^2 die isn't being used. The other factor is that Intel is using a 14 nm FinFET process here whereas nVidia was on bulk 28 nm. nVidia does have an advantage when it comes to putting more compute into a given die area, but it does appear to be lower than 2x. A better calculation would be to put Skylake GT4e up against the new low-end Pascal or Polaris 11 chips to see how Intel ranks on a more similar process.

            If Intel could reach clock speed parity and ALU numbers with nVidia's Pascal lineup, they would certainly be performance-competitive, but it does appear that Intel's die would be slightly larger for the same performance. The real variable would be how much of that power efficiency Intel would burn to reach those clock speeds.

            • synthtel2
            • 3 years ago

            2 GHz?!? I guess extreme overclockers will be extreme overclockers. On memory bandwidth, that Anandtech article is actually pretty useful in that they tested with both DDR3 and DDR4. Some games show a sizable difference, others hardly any. From a more theoretical perspective, Skylake GT2’s compute : memory bandwidth ratio is about in line with bigger dGPUs, and that’s after taking a cut for CPU use.

            I mainly used GT2 because its numbers are widely available, but I suppose GT3e isn't so obscure as all that. I come up with ~71 mm[super<]2[/super<] for it (there's a lot of space above that part of the die that appears to be wasted for no reason, which I didn't include, but I did include some stuff in the midst of the GPU section that looks low-density and isn't needed for GT2). [url=http://www.anandtech.com/show/9320/intel-broadwell-review-i7-5775c-i5-5675c<]Here[/url<] are die size numbers that make more sense (181 was Haswell, this is 133), and better shots are [url=https://en.wikichip.org/wiki/intel/microarchitectures/broadwell<]here[/url<]. Despite the well-aligned performance, it clearly isn't the same design that's in Skylake, but as Broadwell's takes less area, it's probably better for this thought experiment anyway.

            I can't find very many solid comparisons between GT3e and proper dGPUs (if you have any, I'd be interested in links), but it looks like it's turning in numbers right around twice those of the SKL GT2 I was just looking at, putting it significantly behind the GTX 750 (10-25%?). Scaling up to a 970 (3.25x shaders, 1.1x clock, 1.1x because GT3e is a bit behind the 750) and accounting for other stuff that Intel would need to put on a die, an equivalent Intel dGPU would be something over 300 mm[super<]2[/super<]. Comparing that to a 1060's 200 mm[super<]2[/super<] or a 480's 232 mm[super<]2[/super<] doesn't look good, but running 1.5-1.6 GHz could fix that. Then it's back to the power consumption numbers we don't know.
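
            Working that scale-up through in code (the area and scaling factors are the estimates above; the uncore allowance is a hypothetical placeholder):

            # Scale the ~71 mm^2 Broadwell GT3e estimate up to GTX 970-class performance.
            gt3e_area_mm2 = 71.0
            shader_factor = 3.25    # ~3.25x the shader hardware
            clock_factor = 1.1      # ~1.1x the clock
            deficit_factor = 1.1    # GT3e already trails the GTX 750 by ~10%

            # Delivering that extra throughput with more units at iGPU-like clocks
            # scales the GPU logic area by roughly the product of the factors.
            gpu_logic = gt3e_area_mm2 * shader_factor * clock_factor * deficit_factor   # ~279 mm^2
            other_logic = 40.0      # hypothetical allowance for memory controllers, display, I/O

            print(gpu_logic + other_logic)   # ~319 mm^2, i.e. "something over 300"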

      • sophisticles
      • 3 years ago

      LOL! "Good one." Here's the sad thing: these "cellphone" chips are extremely efficient. I have a Samsung Galaxy S6 Active; you can see the specs here:

      [url<]http://www.androidcentral.com/samsung-galaxy-s6-active-specs[/url<]

      These two little CPUs and a little GPU, powered by a small battery, are capable of playing back 4K video seamlessly, and the phone can multi-task nicely and even do some simple photo editing. Imagine what they could do if they were scaled up to be 100 W CPU parts.

        • AMDisDEC
        • 3 years ago

        Waiting for Zilog to complete scaling up the Z80. It should give Intel headaches.

    • DragonDaddyBear
    • 3 years ago

    Benchmarks I seem to recall seeing to this point make it pretty clear that Intel is the better choice for data centers. Are there any benchmarks showing ARM with an advantage in a server workload?

      • JosiahBradley
      • 3 years ago

      If your only concern is the workload-to-cost metric, then tasks per watt is extremely important, as the electric bill can cost as much as the devices themselves. ARM on 10 nm should be pretty incredible at performance per watt for software that isn't super dependent on CISC ISAs like x86/64.
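
      A toy example of why tasks per watt carries so much weight in that calculation (every figure below is hypothetical):

      # Toy datacenter TCO sketch: hardware price plus electricity over the
      # service life. All numbers are hypothetical, for illustration only.
      def lifetime_cost(hardware_usd, avg_watts, years=5, usd_per_kwh=0.12, pue=1.7):
          kwh = avg_watts / 1000.0 * 24 * 365 * years * pue   # PUE accounts for cooling overhead
          return hardware_usd + kwh * usd_per_kwh

      # A $3,000 server averaging 400 W over five years:
      print(lifetime_cost(3000, 400))   # ~$6,574: the electricity (~$3,574) costs more than the
                                        # hardware, which is why performance per watt drives the comparison.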

        • xeridea
        • 3 years ago

        It also depends on your tasks; some things, like serving webpages, you wouldn't want on ARM, because their single-threaded performance is terrible compared to Intel, so the tradeoff isn't worth it. Background tasks are a better candidate.

          • DragonDaddyBear
          • 3 years ago

          I'm interested in benchmarks, though. There are a lot of people saying stuff that's probably right, but by how much? As you rightly point out, there's the TCO thing. I love metrics, but I tend to over-analyze things.

            • the
            • 3 years ago

            One factor in TCO is software licensing. If it takes 48 ARM cores to be competitive with a 22-core Xeon E5, then the TCO will favor Xeons due to the massive reduction in per-core software licensing fees (and that's assuming ARM binaries even exist for the commercial application).
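
            A quick sketch of that per-core licensing math (the per-core fee is hypothetical; the core counts are from the comparison above):

            # Per-core commercial licensing sketch. The fee is hypothetical;
            # the core counts mirror the comparison above.
            usd_per_core_per_year = 2000

            xeon_cores, arm_cores = 22, 48
            print(xeon_cores * usd_per_core_per_year)   # $44,000/year
            print(arm_cores * usd_per_core_per_year)    # $96,000/year
            # Needing ~2.2x the cores more than doubles the licensing bill, which
            # can easily swamp any savings on hardware or power.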

          • the
          • 3 years ago

          I was under the impression that the opposite was true: front-end webservers have a notoriously low-IPC load. These workloads scale rather well with clock speed and core count, and show a good boost with SMT as well.

          It would be the middle application servers, which are responsible for generating the more dynamic content, that need beefier hardware.

            • xeridea
            • 3 years ago

            Webservers tend to have low CPU utilization so that you get quick response times and they won't blow up if you get surges in traffic. You can easily utilize lots of cores (one per request), but you still want a quick response.
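
            One way to see the cores-versus-latency trade is Little's Law (concurrency = arrival rate x time in system); the request rate and service times below are hypothetical:

            # Little's Law sketch: how many requests are in flight at once.
            # Hypothetical load and per-request service times.
            requests_per_sec = 2000
            fast_core_ms, slow_core_ms = 10, 20

            print(requests_per_sec * fast_core_ms / 1000)   # ~20 requests in flight
            print(requests_per_sec * slow_core_ms / 1000)   # ~40 in flight: slower cores need roughly
                                                            # twice as many cores busy for the same load,
                                                            # and each response still takes twice as long.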

            • DragonDaddyBear
            • 3 years ago

            Great information! But this is just one server workload.

            This discussion has been great and informative, but it has served to emphasise that we need metrics on specific workloads before anyone makes silly statements like "it's awesome" or "it's useless." It may have a place, but there are a lot of factors beyond just the raw throughput.

            • Beelzebubba9
            • 3 years ago

            "the" is right about most front-end web server loads being well suited for more, smaller cores overall. You are correct that the overall balance is latency vs. throughput, but when it comes to web server workloads your wire latency is typically so high that increasing the response time for a given simple query by a few ms server-side doesn't negatively impact the user's experience. But going from 32 cores to 96 per physical server – especially if the latter has SSL offload in hardware – may give you the ability to handle many more requests concurrently per watt, so that's why these chips are even in consideration.

            I generally see them used in bare-metal container environments, as well as (possibly) being the backend for services like AWS Lambda or Azure Functions. Any workload where more 'cycles' are spent negotiating the connection, authenticating the session against a backend service, encrypting/decrypting SSL traffic, etc., and not running complex queries should be suited for these types of SoCs.

            • Anonymous Coward
            • 3 years ago

            Yeah there is certainly no need for all cloud services to run on Intel. They can potentially choose a particular platform for even a small performance per watt advantage, and then choose a different platform next month. A customer can achieve a huge amount without installing any software themselves.

            • xeridea
            • 3 years ago

            If your webapp is highly optimized, you may be able to use these. Most websites are terribly inefficient (including 100% of WordPress sites, due to WP itself being horrendous). But if you are in the market for these, you probably already know about the importance of efficient use of software, so it may be a good match.

            Optimized sites with a lightweight MVC framework such as CodeIgniter can generally process requests in ~10-30 ms, especially with PHP 7, so high-performance ARM cores may be plausible. Probably wouldn't want to use them on a DB server, though. WordPress sites take more like 500 ms to process (clean install, no plugins), so definitely not.
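
            Rough throughput math behind those figures (the service times are the rough numbers quoted above; the core count is just an example):

            # Requests per second per core = 1000 ms / per-request service time.
            codeigniter_ms = 20    # midpoint of the ~10-30 ms estimate
            wordpress_ms = 500     # clean WordPress install, no plugins

            cores = 48             # example core count
            print(cores * 1000 / codeigniter_ms)   # 2,400 req/s across the chip
            print(cores * 1000 / wordpress_ms)     # 96 req/s: framework overhead, not the CPU, dominates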

            • the
            • 3 years ago

            As long as you’re able to hit a minimum response threshold, scaling in this fashion is pretty linear as core count and even thread count via SMT increases. This is a prime niche for ARM in the datacenter as long as the core designs are able to meet the minimum response threshold. I’m eager to see how 48 core chips like this perform in real testing as they fall into the ideal niche for that workload.

          • Hattig
          • 3 years ago

          Err, something massively parallelizable and mostly I/O bound, like serving webpages, is an ideal candidate for a massively multi-threaded server processor. That’s why we already had 128-thread SPARC processors and the like with CMT to mask I/O latencies.

          Note that ARMv8 now has an optional HPC ISA add-on – up to 2048-bit SIMD. If Qualcomm are targeting HPC then their Falkor core will incorporate an implementation of this.

          It's likely that Falkor implements some form of SMT, either 2-way or 4-way; hopefully we will learn more at some point. With the Windows-on-ARM revival we may also see designs suitable for full Windows, in the 5 W – 35 W range rather than the current phone SoC designs; maybe Falkor will be a good match there.

      • the
      • 3 years ago

      I suspect that there are two factors in play: cost and power consumption (performance/watt). Intel recently has been stagnating on performance while mainstream Xeon E5 chip prices have been rising. These ARM competitors are meant to be much more cost-effective while offering similar aggregate performance (note that per-core/per-thread performance still favors Xeons, and thus commercially licensed software may still be cheaper on Xeons).

      Power consumption is also a variable, and if these ARM chips can come in under Xeon E5 thresholds, they can get some wins. However, they also have to perform better than the current performance/watt hero, the Xeon D. If the design can hit it out of the park with Xeon E5 performance and lower-than-Xeon D power consumption, they'll be able to move these chips in some serious volume.

      Google and Amazon are reportedly interested in these chips to run their own internal software and several open source projects. The volume these two customers represent for data center hardware sales cannot be overstated.

      • blastdoor
      • 3 years ago

      “ARM” is too generic. If you compared AMD’s Bulldozer to Power8, you’d conclude that x86 sucks compared to POWER, but that would be the wrong conclusion. You need to compare specific implementations.

      I have no idea how Qualcomm’s new core performs. But I’m thinking it needs to perform a lot better than Qualcomm’s smartphone cores if it’s going to have any chance of competing with Xeons.

        • AMDisDEC
        • 3 years ago

        Comparing POWER to x86 is almost like comparing the IBM CELL to AMD Bulldozer.

          • Waco
          • 3 years ago

          I’d have to disagree there. POWER and x86 are fairly close (in modern designs) whereas the Cell couldn’t be further from a modern x86 core like Bulldozer…

            • AMDisDEC
            • 3 years ago

            CELL was actually a POWER architecture, PLUS. It’s the Plus that made it non-competitive for programming outside of IBM. Probably the same reason Intel’s i860 died out.

            • Waco
            • 3 years ago

            I should have specified “normal” POWER. Cell wasn’t a normal chip by any measure of the word.

            • tipoo
            • 3 years ago

            The PPE used the PowerPC ISA; the SPEs actually had their own specialty ISA based on it but not completely compatible with PPC, heavily stripped down and centered around SIMD.

      • Hattig
      • 3 years ago

      It all depends on the implementation.

      We don’t know anything about Qualcomm Falkor, but we have to assume that it’s a server-oriented core design. 48 cores means we have to hope that the interconnects and all that are well implemented too.

      There is nothing stopping someone creating a high performance ARMv8 implementation, much like POWER8 is a high performance PowerPC implementation.

      Or to put it oppositely, there’s nothing stopping someone creating a low performance x86 implementation. Oh wait, Atom.

    • willmore
    • 3 years ago

    I, for one, welcome our new ARM server overlords.

      • morphine
      • 3 years ago

      I was about to cringe, but I have to admit that if there ever was a good time to resurrect an old même, this is it.

        • Wirko
        • 3 years ago

          Interesting interference by a French (or Portuguese?) spellchecker.

          • morphine
          • 3 years ago

          French? FRENCH?!

          Ooooooh, you little……

            • Srsly_Bro
            • 3 years ago

            I could tell by your outrageous accent.
