NEC shoehorns 368 Avoton cores and 1472GB of RAM into one box

Our visit to the Kingston booth at IDF was particularly interesting for more than one reason. With the launch of Intel’s Avoton SoC for servers, in the form of the Atom C2000 series, super-dense microservers are likely to be getting a lot more attention. To show off Avoton’s potential, Kingston had a nifty little demo system built inside of a standard mid-tower ATX case.

That little mainboard there houses four separate computers. Each of those black heatsinks cools a separate eight-core Avoton SoC, and each compute node has two SO-DIMMs worth of memory attached. All 32 cores are running a live demo without any fans mounted atop the heatsinks.

The Kingston rep wasn’t sure which Avoton model was in use, so I stuck my finger on the heatsink and promptly concluded it was probably a 6W variant. The thing was barely warm to the touch. The Kingston dude flinched a bit when I accosted his demo system’s CPUs but generally took it well. I managed not to unleash a killer static shock.

Kingston is excited about the potential of Avoton, and of microservers generally, because of the huge amounts of RAM they can consume. To better illustrate, the Kingston rep dragged me over to the NEC booth and pointed out this monster:

This is an Avoton-based microserver in a 2U rack-mount chassis. Each of the cards you see mounted in several of the slots carries an eight-core Avoton compute node, and each one can host up to 32GB of RAM. So you’re looking at a two-unit-high enclosure with as many as 368 Silvermont cores and up to 1472GB of DDR3 memory. Across the back of the system (to the right in the picture above) is an array of slots for storage expansion cards with 2.5″ drives, as well.
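For anyone who wants to check the math, here is a minimal Python sketch of how those totals fall out of the per-card figures. The eight cores and 32GB per node come from the specs quoted above; the node count of 46 is only inferred by dividing 368 by 8, not something NEC stated, and the function names are made up for illustration.

```python
# Per-node figures quoted for the NEC system.
CORES_PER_NODE = 8
RAM_PER_NODE_GB = 32


def nodes_implied(total_cores, cores_per_node=CORES_PER_NODE):
    """Return how many compute nodes a total core count implies."""
    if total_cores % cores_per_node != 0:
        raise ValueError("core count is not a whole number of nodes")
    return total_cores // cores_per_node


def chassis_totals(nodes, cores_per_node=CORES_PER_NODE,
                   ram_per_node_gb=RAM_PER_NODE_GB):
    """Return (total cores, total RAM in GB) for a given node count."""
    return nodes * cores_per_node, nodes * ram_per_node_gb


if __name__ == "__main__":
    nodes = nodes_implied(368)          # inferred node count, an assumption
    cores, ram_gb = chassis_totals(nodes)
    print(f"{nodes} nodes -> {cores} cores, {ram_gb}GB of RAM")
```

The same helper covers the Kingston demo board: four nodes at eight cores each gives the 32 cores mentioned earlier.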

I think we could host TR on that thing. Just maybe.

The crazy thing is that the main limitation in terms of per-node computing resources may well be memory capacity. Depends on the workload, of course, but for many uses, Avoton’s memory capacity limit could be the main obstacle to Avoton-based microservers supplanting Xeons. Of course, ARM-based SoC providers are likely to build support for even higher memory capacities into their 64-bit ARMv8 SoCs based on the Cortex-A57.

These are very interesting times in the server market.

Comments closed
    • Wirko
    • 6 years ago

    Where’s the space for 368 / 8 = 46 cards? Can the front row and the back row be populated with compute cards too?

      • Anonymous Hamster
      • 6 years ago

      I’m a bit confused too. I see 12 slots on the left, 18 in the middle, and 18 on the right. To get to 46, that requires you to fill all slots except for 2 (perhaps needed for I/O?).

    • faramir
    • 6 years ago

    “I think we could host TR on that thing. Just maybe.”

    So what kind of hardware does it actually take to host TR ? Honest question.

      • ronch
      • 6 years ago

      You need cores that are faster than the ones found in Gondor.

      • derFunkenstein
      • 6 years ago

      It flies on the wings of eagles, delivered to your intertubes in vessels lovingly hand-crafted by monks who live way up in the Himalayan mountains.

      • internetsandman
      • 6 years ago

      I know this isn’t an honest answer, but I find it amusing that this honest question received nothing but joke answers

        • rookiebeotch
        • 6 years ago

        I honestly was more interested in the joke answers than an honest answer.

        Hobbits.

          • faramir
          • 6 years ago

          Meh, trolls :-/

    • sjl
    • 6 years ago

    Just trying to understand – is this a monolithic server (whereby it would be possible, at least in principle, to access all those cores and all that RAM in a single operating system environment), or is it a multiple physically distinct but sharing power system arrangement? Or to put it another way: is this an arrangement that could realistically come to compete with, say, IBM’s p795 (up to 256 4.0GHz POWER7 cores, or 128 4.25 GHz POWER7 cores, with up to 16 TB of RAM)? (Ignore the relative CPU capabilities for the sake of this discussion; that’s not what I’m driving at.)

    The use of the phrase “compute node” makes me think it isn’t, which makes me sad. Still an impressive achievement, mind, but there are good reasons for companies to want a monolithic arrangement (even if most of these sorts of setups use a virtualisation layer to split the large chunk of power into smaller VMs, it’s nice to be able to shuffle resources around on the fly according to need, for example.)

      • esterhasz
      • 6 years ago

      They could use a virtualization layer for aggregation into a single virtual machine, but this would have little advantage over simply going Xeon or Power. Maybe some HPC algorithms really benefit from more over faster cores, but this would be the exception, I guess.

      These systems are mainly used for web or service hosting, where a client can either rent a full board or a hypervisor distributes virtual machines over the managed pool of HW according to needs and loads. Especially the former looks very interesting for both client and hosting company with this kind of hardware.

      • Flatland_Spider
      • 6 years ago

      I don’t think the Avoton Atoms have the glue necessary to run in a multi-socket configuration. This is one of those features Intel saved for the big cores. Unless there is another chip to facilitate this, they should be individual servers.

    • ronch
    • 6 years ago

    So many-cores is really the way of the future. The Bulldozer isn’t a bad architecture per se, but perhaps it was just a little ahead of its time. And yeah, blame GF for the power consumption.

      • OneArmedScissor
      • 6 years ago

      Blame AMD marketing. The Opterons can be very low power and are configured to be reasonably balanced. The mobile APUs are actually some of the most frugal “high power” x86 chips, ever.

      However, the desktop version of both cranks the core clock through the roof, and the voltage along with it.

      In the case of the FX series, it’s much worse, as they leave the L3 and memory controller at about half the speed, way out of sync.

      AMD haven’t raised the “uncore” speed since 2008. They’d already started playing the ever-increasing core clock + voltage game with the Phenom II. The IIs actually ran higher voltage than the original 65nm Phenoms.

      Remember how there were 140w Phenom II quad-cores, even though Intel kept mainstream quad-cores at 95w? And now there are 125w FXs while Intel has dropped quad-cores to 77w, and 100w APUs while Intel’s dual-cores are 55w.

    • albundy
    • 6 years ago

    so… how many Silvermont cores does it take to outperform a measly 12 core single xeon?

      • OneArmedScissor
      • 6 years ago

      Just one if you are trying to lower the utility bill. :p

      Plenty of websites and databases don’t need faster hardware, but they’ll jump all over something to reduce their costs.

      And it’s not necessarily just electricity costs. There are surely maintenance advantages to this card format.

        • TO11MTM
        • 6 years ago

        Yes.

        Also, to what OneArmedScissor said, most databases (unless you’re running something special like geospatial or similar) don’t care much about floating-point performance. While Xeons are still pretty damn good at that, Avoton gives you good integer performance without spending as much die space on FP.

        This was actually a conscious decision AMD made with Bulldozer: server chips make more money, so it made sense to gear the product towards being the best ‘server’ chip they could build, i.e. max integer cores even if it meant a split FPU.

    • DeadOfKnight
    • 6 years ago

    RAMdisk that big = holy cow

    • Farting Bob
    • 6 years ago

    Damnit Kingston, have some pride in how your demo system looks when you’re at a trade show! I’m sure a company of your size can find a modular PSU somewhere in the office and a case that doesn’t look like it cost $15. And screw the SSD in place so it doesn’t look like you were surprised 2 minutes beforehand and told to build a system from scratch!

      • chuckula
      • 6 years ago

      In fairness to Kingston, IDF is more of a technical conference + Intel advertising than a full-blown tradeshow. Intel takes pains to give slick presentations in its own right, but a lot of the prototypes you see are very raw (and there’s nothing wrong with that).

      Obviously Kingston is not trying to push out an Atom server in a white-box ATX form factor as a finished product.

      • MadManOriginal
      • 6 years ago

      The look of the case and the PSU are fine, it’s a server product after all…but the SSD just hanging out resting atop the mobo power wires is just too ghetto – they should have at least velcro’d it somewhere if the drive bays were all full.

      • Metonymy
      • 6 years ago

      Yeah, that dangling SSD does sort of catch your eye right away.

    • IYagami
    • 6 years ago

    You can have:
    - 256 threads (32 cores) in a 2U server right now with the SPARC T5-2 server
    - Or 512 threads (64 cores) in a 5U server with the SPARC T5-4 server
    - Or 1024 threads (128 cores) in an 8U server with the SPARC T5-8 server

    Of course, these are servers focused on databases and web servers. However, if you really need this number of cores / threads, you can have a look at the Oracle alternatives.

      • maxxcool
      • 6 years ago

      While this is 100% accurate… I would sooner sever my limbs with a wooden spoon than get locked into Oracle’s $$$ hardware service schemes…

    • bjm
    • 6 years ago

    The question everyone wants to know:

    Can it run WinZip?

      • Farting Bob
      • 6 years ago

      Thank you for trying the evaluation version of WinZip. Unfortunately, the demo version only supports up to 300 cores and 1420GB of RAM. Buy the full version now!

    • pcgeek86
    • 6 years ago

    Not sure I understand why ARM cores are necessary in servers, given the raw power and efficiency of x86 CPUs.

      • My Johnson
      • 6 years ago

      I haven’t seen any good arguments lately that Instruction Set matters. When Apple was on the PowerPC platform advocates made the argument but at the end of day it mattered little. Also, all common instruction sets are decades old, so I suspect hardware and software will be optimized to alleviate shortcomings if there are any.

      • Flatland_Spider
      • 6 years ago

      Until recently, you couldn’t do this with x86 chips. 368 Xeon cores would melt a case.

      Then there are custom designs. Companies can build the chip they need. It’s the difference between a Swiss Army knife and a specialty tool.

      • sschaem
      • 6 years ago

      Price and even higher density at the same process?
      But Intel’s process advantage is making this a non-issue.

      So only AMD suffers the x86 stigma of the funky instruction decoder overhead.

        • TO11MTM
        • 6 years ago

        Most any non-embedded chip that I am aware of runs through a decoder, and ARM has had its share of long ones. ARM11 was only single stage (BTW, going from ARM v6, typically called ‘ARM11’, to ARM v7 with the initial product being Cortex ‘A8’ makes Intel’s naming schemes look downright reasonable…), but A8 had a 4-cycle decoder (A9 had one, admittedly), and A15 has a 7-stage Decode/Rename/Dispatch, which is pretty in line with Saltwell (looks to be 6 for the same set of operations or so).

        I’m not arguing they’re suffering, but if it’s because of the decoder, it’s their own damn fault for not designing theirs properly.

    • chuckula
    • 6 years ago

    Well… I guess 32 cores in a mid-tower case is OK… IF YOU’RE FROM THE ’80s!!!

    Until all of that goes into my smart-watch, I’m going to go full-Krogoth on this demo.

      • dpaus
      • 6 years ago

      Yeah, dude, we did that about 4 years ago with Interlagos Opterons.

        • ronch
        • 6 years ago

        And remember, those Opterons had REAL, “BIG” cores! Who wants a ton of “small” cores, anyway? 😛

      • eitje
      • 6 years ago

      never go full-krogoth.

        • Deanjo
        • 6 years ago

        He just isn’t ready yet to go full-krogoth.

        • demani
        • 6 years ago

        You always have to hold a little something back. Otherwise you go home empty handed.

      • ronch
      • 6 years ago

      I’m not impressed by your full-Krogoth attitude.

    • Wildchild
    • 6 years ago

    Can it run Crysis?

      • pcgeek86
      • 6 years ago

      I think you mean: But will it run Crysis?

        • Wildchild
        • 6 years ago

        There’s also simply “Will it run Crysis?”

        Summary: it doesn’t matter.

        • rookiebeotch
        • 6 years ago

        I think you both mean, ‘Will it blend?’

        http://www.willitblend.com/

      • internetsandman
      • 6 years ago

      If you get enough of them together it might just have enough parallelism to render the graphics

      • lycium
      • 6 years ago

      Probably not. If it were possible, the single threaded performance would be terrible.

      I know it’s a joke, but I thought I’d point out that not everything with zillions of cores and gigs of ram is “fast” in the same way.

        • Srsly_Bro
        • 6 years ago

        You old geezers! Let the young one have his moment. He’s not aware it won’t run Crysis, and he probably never will be. Let him have just this one!

      • My Johnson
      • 6 years ago

      It looks possible. I bet you could get low quality via software rendering at 1080p.

      • derFunkenstein
      • 6 years ago

      Yes, but can it fold?

        • BIF
        • 6 years ago

        This is the most important question of all.

      • confusedpenguin
      • 6 years ago

      Throw in a decent GPU, and yes, yes it will run Crysis. 🙂
