Updated: Cascade Lake-AP Xeon CPUs embrace the multi-chip module

After taking a little over a year to think on it, Intel appears to have decided that glue can be pretty Epyc after all. The company teased plans for a new Xeon platform called Cascade Lake Advanced Performance, or Cascade Lake-AP, this morning ahead of the Supercomputing 2018 conference. This next-gen platform doubles the number of cores per socket available from an Intel system by joining multiple Cascade Lake Xeon dies on a single package with the blue team's Ultra Path Interconnect, or UPI. Intel will allow Cascade Lake-AP servers to employ up to two-socket (2S) topologies, for as many as 96 cores per server.

Intel chose to share two competitive performance numbers alongside the disclosure of Cascade Lake-AP. One of these is that a top-end Cascade Lake-AP system can put up 3.4x the Linpack throughput of a dual-socket AMD Epyc 7601 platform. This benchmark hits AMD where it hurts. The AVX-512 instruction set gives Intel CPUs a major leg up on the competition in high-performance computing applications where floating-point throughput is paramount. Intel used its own compilers to create binaries for this comparison, and that decision could further tilt the Linpack results in its favor versus AMD CPUs.

AMD has touted superior floating-point throughput from its Epyc platforms in the past for two-socket systems, but those comparisons were made against Broadwell CPUs with two AVX2 execution units per core rather than the twin AVX-512 engines of Skylake Server and the derivative Cascade Lake cores. AMD also chose to use the GCC compiler for those comparisons rather than Intel's compiler suite. Intel has clearly had enough of that kind of claim from AMD, and it seems keen to reassert its chips' superiority for floating-point performance with this benchmark info.

Other decisions about configuring the systems under test will likely raise louder objections. Intel didn't note whether Hyper-Threading would be available from Cascade Lake-AP chips, and indeed, its comparative numbers against that dual-socket Epyc 7601 system were obtained with SMT off on the AMD platform. 64 active cores is nothing to sniff at, to be sure, but when a platform is capable of throwing 128 threads at a problem and one artificially slices that number in half, eyebrows are going to go up.

Update 11/5/2018 at 18:11: According to an Intel spokesperson who contacted me this evening, "it's common industry practice for Intel to disable simultaneous multithreading on processors when running STREAM and LINPACK to achieve the highest processor performance, which is why we disabled it on all processors we benchmarked." Our independent research on this point corroborates Intel's statement, as Linpack fully occupies the floating-point units of the CPU and would likely experience performance regressions from resource contention with SMT on. Point taken.

Intel also asserted that on the Stream Triad benchmark, a Cascade Lake-AP system will be able to offer 1.3x the memory bandwidth of that same 2S Epyc 7601 system with eight channels of DDR4-2666 RAM. That figure comes courtesy of 12 channels of DDR4 memory per socket, a simple doubling-up of the six memory channels available per socket from a typical Xeon Scalable processor today. Dual-socket Cascade Lake-AP systems will be able to offer an incredible 24 channels of DDR4 memory per server. Intel didn't disclose the memory speed it used to arrive at this figure, however.
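
For a rough sense of where that 1.3x figure could land, here is a back-of-the-envelope sketch of theoretical peak bandwidth. The Cascade Lake-AP memory speed in the second scenario is our assumption, since Intel didn't disclose it.

```python
def peak_ddr4_gbs(channels, mt_per_s, bus_bytes=8):
    """Theoretical peak bandwidth: channels x transfer rate x 8-byte bus width."""
    return channels * mt_per_s * bus_bytes / 1000  # GB/s per socket

epyc_socket = peak_ddr4_gbs(8, 2666)    # Epyc 7601: eight channels of DDR4-2666
clap_2666   = peak_ddr4_gbs(12, 2666)   # Cascade Lake-AP at the same speed
clap_2400   = peak_ddr4_gbs(12, 2400)   # or at an assumed DDR4-2400

print(clap_2666 / epyc_socket)  # 1.5x on raw channel count alone
print(clap_2400 / epyc_socket)  # ~1.35x, closer to the claimed 1.3x Stream result
```

Either lower-clocked DIMMs or real-world Stream efficiency (or both) could account for the gap between the raw 1.5x channel-count ratio and Intel's 1.3x claim.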

Intel also teased some deep-learning performance numbers against its own products. Compared to a 2S system with Xeon Platinum 8180 CPUs, Intel projects that a 2S Cascade Lake-AP server will offer as many as 17 times the deep-learning image-inference throughput of today's systems. That figure could be related to Cascade Lake's support for the Vector Neural Network Instructions (VNNI) subset of the AVX-512 instruction set. VNNI allows Cascade Lake processors to perform the INT8 and INT16 operations that are important to AI inference.
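
For a rough illustration of why those instructions matter, here is a scalar model of the INT8 dot-product-and-accumulate that a VNNI instruction such as VPDPBUSD performs for each 32-bit lane; the real hardware does this across an entire 512-bit register in a single instruction, where older cores need separate multiply, widen, and add steps.

```python
def vnni_lane_int8(acc, a4, b4):
    """Model of one 32-bit lane: four INT8 products summed into a 32-bit accumulator."""
    assert len(a4) == len(b4) == 4
    return acc + sum(a * b for a, b in zip(a4, b4))

# Toy example: accumulate one group of four INT8 pairs.
print(vnni_lane_int8(0, [1, 2, 3, 4], [10, 20, 30, 40]))  # 300
```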

Beyond this high-level teaser, Intel didn't specify nitty-gritty details like the inter-socket interconnect topology or the number of PCIe lanes available per socket from each Cascade Lake-AP CPU. We expect to learn more upon the official release of the Cascade Lake family of processors later this year.

Comments closed
    • chuckula
    • 1 year ago

    For all the insults hurled at Intel, AMD just confirmed that Epyc 2 isn’t even fully 7nm. They couldn’t get the IO working on 7nm so GloFo is still making separate I/O chips on 14 nm.

    AVX-512… CONFIRMED to not be present in these 7nm miracle chips. I can't understand why when AMD is just so superior to Intel in every way with their 7nm miracle transistors.

    Oh and when Papermaster had the chance to confirm that Epyc 2 has 64 cores… He moved on to Vega.

      • Goty
      • 1 year ago

      chuckula is worried… CONFIRMED.

      [quote]They couldn't get the IO working on 7nm so GloFo is still making separate I/O chips on 14 nm.[/quote] Even if that were true, at least AMD was able to get [i]something[/i] working on a sub-14nm node. You know, unlike Intel.

        • chuckula
        • 1 year ago

        AMD went to TSMC for chiplets that appear to be less sophisticated than a smartphone SoC that packs in CPU plus GPU plus I/O. And AMD is still relying on GloFo for Epyc 2 in the deal.

        Whine about some extra cores that finally adopted Haswell's AVX units all day, but Intel isn't going bankrupt over this.

          • Goty
          • 1 year ago

          … And?

          * I/O doesn’t scale like logic or cache, hence it is an excellent candidate to be shipped off to its own IC fabbed on a cheaper process. There are no indications that AMD “couldn’t make it work on 7nm.”

          * 64 cores/128 threads were announced around 2:00 ET on the stream

          * Who cares if the chiplets are “less sophisticated” because they don’t contain parts that *shock* don’t matter in the datacenter? What would including an integrated GPU gain for AMD in this space?

          * Who cares where AMD is sourcing its 14nm parts? They still have a wafer supply agreement to satisfy.

          Got any more FUD or outright falsehoods for us, or are you just going to sit there and be mad at your keyboard until 2020 when Intel finally gets its 10nm mess straightened out? It seems like you're pretty upset at the moment at least, since you couldn't even wait for TR to get a story up about the event…

    • the
    • 1 year ago

    I suspect that the PCIe lane count is going to be 132. One of the lesser-known things about the medium and extreme core count Skylake-SP dies is that they actually have 68 lanes of PCIe connectivity: 4 go toward the DMI bus, 16 lanes are reserved for the on-package Omni-Path options, and the remaining 48 lanes are external. Putting two of these dies in a socket would permit all the lanes to be exposed, but at the cost of removing the on-die Omni-Path options. Then again, Intel may simply opt to leverage UPI for future Omni-Path connectivity.
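
    Taking the per-die lane breakdown above as given (none of it confirmed by Intel), the arithmetic behind that 132-lane guess works out as follows:

    ```python
    # Per-die figures from the comment above (assumptions, not Intel disclosures):
    external_lanes  = 48   # lanes exposed off-package today
    omni_path_lanes = 16   # lanes normally reserved for on-package Omni-Path
    dmi_lanes       = 4    # chipset link; only one would be needed per package

    # Two dies per package, with the Omni-Path lanes repurposed as general PCIe:
    total_lanes = 2 * (external_lanes + omni_path_lanes) + dmi_lanes
    print(total_lanes)  # 132
    ```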

    • Gadoran
    • 1 year ago

    Dear Jeff, I don't understand your comment about the absence of SMT on the AMD system.
    AMD chose this; AMD disabled SMT, and now Intel is doing the same. No SMT.

    Very likely having SMT enabled hurts the test, lowering the available bandwidth per thread on the memory channels and lowering the result.

    So in short, SMT off gives the best chance of a high score on Triad.

      • chuckula
      • 1 year ago

      I’m sorry, this article is only for posting emotionalistic hatred of Intel.
      Your facts about how AMD intentionally chooses to benchmark its own hardware directly under its control aren’t welcome.

      #GlueIsGood

    • DPete27
    • 1 year ago

    Why would Intel choose to release this today and not Wednesday? Surely it’s more effective to overshadow a product launch retrospectively than preemptively.

      • Krogoth
      • 1 year ago

      Intel is trying to hold the 2P fort against Zen2 launch.

      • Anonymous Coward
      • 1 year ago

      Let me guess: because AMD has the better announcement.

    • Leader952
    • 1 year ago

    It’s hypocritical that AMD’s EPYC with four glued together Ryzen dies gets kudos but when Intel glues two Cascade Lake dies together it gets ripped.

      • Srsly_Bro
      • 1 year ago

      Thread ripped*

      • jarder
      • 1 year ago

      I disagree, the hypocritical part is where one company rips another company for using glued CPU dies and then goes on to do that very same thing…

      Yes, I do know that both companies have done this.

        • DPete27
        • 1 year ago

        If we're talking about the "COMPANIES," then yes, Intel did say that a single die is more efficient for performance. But even they can't get past the raw size needed for a single die with this many cores and what that would do to yields. In no way has Intel retracted their original statement in this launch.

      • Anonymous Coward
      • 1 year ago

      When AMD did it, they arrived somewhere they had never been (and they did so impressively competently); when Intel does it, it's basically identical to what they already have, just packed more densely. Yawn.

      • derFunkenstein
      • 1 year ago

      Intel ripped AMD (indirectly) for the “glue” so they earned it.

        • Leader952
        • 1 year ago

        And AMD ripped Intel for the Core 2 Quad Q6600 having two dies glued together.

        So AMD earned it, also.

          • Anonymous Coward
          • 1 year ago

          The NUMA-style glue is much more elegant than a shared FSB.

            • psuedonymous
            • 1 year ago

            It's the same damn heterogeneous glue. Call it an FSB, an Infinity Fabric, a QPI, or whatever.

            If your memory access is non-uniform, you’re going to run into the same problems in practice, which means writing your workload for the specific processor. Plausible for HPC workloads, progressively less so for workstation and consumer workloads.

            Remember, both Intel (Itanium) and AMD (HSA) have failed in the past in attempts to persuade desktop software to change for their hardware rather than designing hardware to run desktop software.

            • Anonymous Coward
            • 1 year ago

            I'm not swayed by saying "everything is complex, therefore nothing matters." Two dies sharing an FSB is less elegant than two dies which each have their own RAM/PCI/etc. and also talk efficiently to each other.

          • Spunjji
          • 1 year ago

          How far back do you want to go here? Back to the Pentium D, triumph of marketing over engineering? Or can we just settle on the fact that, having used these designs in consumer products themselves, Intel were displaying hypocrisy and petulance when they made the "glue" comment about Epyc? It's even more absurd when you consider that the design makes more sense for a server architecture (where AMD are using it) than it ever did on the desktop (where Intel used it).

          Making timely jokes about someone else’s hypocrisy is not the same as being a hypocrite yourself.

            • Anonymous Coward
            • 1 year ago

            I'd kind of like to have a Pentium D. I have a regular Prescott (and damn, it's slow).

    • DavidC1
    • 1 year ago

    “its comparative numbers against that dual-socket Epyc 7601 system were obtained with SMT off on the AMD platform. 64 active cores is nothing to sniff at, to be sure, but when a platform is capable of throwing 128 threads at a problem and one artificially slices that number in half, eyebrows are going to go up.”

    Because Linpack doesn’t benefit from SMT since Linpack fully stresses the vector units.

      • Mr Bill
      • 1 year ago

      Is that architecture dependent?

      [quote]Because Linpack doesn't benefit from SMT since Linpack fully stresses the vector units.[/quote]

        • DavidC1
        • 1 year ago

        Linpack gets very close to the theoretical maximum capability of the vector units; 85-90% of maximum is pretty common. You actually lose a few percent by enabling SMT because there are no pipeline bubbles left for the extra threads to fill.

        A 28-core Xeon Platinum running at 2.5GHz has a maximum throughput of:
        28 cores x 2.5GHz x 2 AVX-512 units x 2 flops per FMA x 8 DP lanes per unit = 2.24 TFlops

        In Linpack, that CPU should reach somewhere in the 2TFlop range. Real HPC applications are usually bound by real-world constraints such as memory bandwidth, internal routing, and OoOE, and thus do much less. Expressed in terms of flops, maybe 300GFlops for some of them, 100GFlops for others. There are probably applications that stress the vector units as hard as Linpack does, but many don't.
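
        A quick sketch of that arithmetic; the ~90% Linpack efficiency below is the rough figure mentioned above, not a measured result.

        ```python
        cores         = 28
        ghz           = 2.5   # assumed sustained AVX-512 clock for the example
        avx512_units  = 2     # FMA-capable 512-bit units per Skylake-SP core
        flops_per_fma = 2     # a fused multiply-add counts as two FLOPs
        dp_per_vector = 8     # eight 64-bit doubles per 512-bit register

        peak_tflops = cores * ghz * avx512_units * flops_per_fma * dp_per_vector / 1000
        print(peak_tflops)                  # 2.24 TFLOPS theoretical peak
        print(round(peak_tflops * 0.9, 2))  # ~2.02 TFLOPS at ~90% Linpack efficiency
        ```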

        It’s sometimes used as a great way of inflating your numbers. Kinda like using FP16 numbers to say your GPU is 2x as fast.

        “Intel didn’t disclose the memory speed it used to arrive at this figure, however.”

        DDR4-2400 fits it pretty well. 12x DDR4-2400 is approximately 30% over the 8x DDR4-2666 used in Epyc, after assuming a few percent of losses on the Intel side due to the MCM package.

          • Waco
          • 1 year ago

          Many real HPC applications are network, memory bandwidth, and latency constrained. 2-5% efficiency is common.

    • Goty
    • 1 year ago

    Anyone want to bet the over/under for the base frequency? Intel’s current 24c parts come in with base frequencies as high as 2.70 GHz, but that’s at a TDP of 205W. Assuming this thing comes in somewhere under a 350W TDP and naively scaling down from 2*205W, we could guess at a base frequency of 2.30 GHz. I honestly wouldn’t be too surprised to see it at 2.00 or 2.10 GHz though with the constraints of cooling both chips on a single package.
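
    A sketch of that naive scaling, using only the guesses from the comment above:

    ```python
    known_base_ghz = 2.70      # current 24-core part's base clock
    known_tdp_w    = 2 * 205   # two such dies on one package
    assumed_tdp_w  = 350       # guessed package TDP, not an Intel figure

    # Naive linear scaling of base clock with the available TDP budget:
    print(round(known_base_ghz * assumed_tdp_w / known_tdp_w, 2))  # ~2.3 GHz
    ```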

      • UberGerbil
      • 1 year ago

      Power scales at the square of the frequency, so you have to raise your target TDP a lot to get much of a bump in clockspeed. And more cores doesn’t make that easier….

        • Goty
        • 1 year ago

        Yeah, and I also don’t have a feel for the power consumption of UPI or if that number is included in the TDP of Intel’s parts (I’m guessing not.) Lots of factors at play.

        *EDIT* Got UPI and Omni-Path mixed up.

        • DavidC1
        • 1 year ago

        UberGerbil:

        Incorrect. Power scales linearly with frequency, assuming leakage isn't taken into account. Power scales with the square of voltage.

        Combined, power scales with the cube of frequency, if voltage has to change accordingly.

        Reality is of course much more complicated. There's the fixed power from units that can't be turned off, static leakage power, and the interaction between dynamic power and leakage.
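
        A minimal sketch of the relation being described, P ≈ C·V²·f, with leakage and fixed power ignored:

        ```python
        def dynamic_power(c_eff, volts, freq_ghz):
            """Dynamic power only: P = C_eff * V^2 * f (leakage and uncore ignored)."""
            return c_eff * volts**2 * freq_ghz

        base              = dynamic_power(1.0, 1.00, 2.0)
        faster            = dynamic_power(1.0, 1.00, 3.0)  # +50% clock, same voltage
        faster_and_hotter = dynamic_power(1.0, 1.50, 3.0)  # +50% clock needing +50% voltage

        print(faster / base)             # 1.5   -> linear with frequency alone
        print(faster_and_hotter / base)  # 3.375 -> roughly cubic once voltage scales too
        ```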

      • DavidC1
      • 1 year ago

      There’s a 2.1GHz base 24C part that has a TDP of 150W. The OmniPath enabled version is at 160W.

      Binning and reducing redundancy might allow a 2.3GHz part at 300W.

      The unknown is the VNNI instruction. The addition of VNNI added quite a bit of power to Knights Mill.

    • JosiahBradley
    • 1 year ago

    EPYC facepalm… I bet you my bottom dollar this isn't even close to the price/performance of even the worst Epyc chip on the curve. I also bet you that power efficiency will at least be similar, too.

      • Goty
      • 1 year ago

      To be fair, price/performance doesn’t mean much when you get to this market.

        • Waco
        • 1 year ago

        Yes, yes it absolutely does.

          • Goty
          • 1 year ago

          How so? Not only is this a halo product (meaning it will already command a premium), but high-performance, high-core-count server CPUs don't follow any sane price/performance curve because of things like per-core vs. per-socket pricing and a focus on TCO over unit costs.

            • Waco
            • 1 year ago

            Is the claim that this will compete against Epyc? If so, how is a higher priced product going to compete with something that is already far better in a cost comparison?

            • Goty
            • 1 year ago

            Because the cost of the chip is a small part of the overall cost of operating an HPC cluster or datacenter, even for these $10K+ behemoths. If there are savings to be had elsewhere, it can be very easy to make up the cost difference of the CPUs.

            • Waco
            • 1 year ago

            I'd be shocked if they're $10k; I expect they'll be far more, plus the additional cost of extra DIMMs and a specialty motherboard.

            Sure, interconnects, power, and cooling parts all add up pretty quickly. Doubling or tripling the cost of a part is a pretty huge factor when building out a large HPC cluster.

            For example, LANL’s large HPC cluster is around 20k nodes. Adding $10k to each node (to cover the cost of these special chips) would have added around $200 million to the cost of the machine. The machine was only $120 million…

            • Anonymous Coward
            • 1 year ago

            Whatever they charge, they’ll need to set prices that are reasonably competitive in TCO.

            • Waco
            • 1 year ago

            Intel doesn’t have a history of that. 😛

            • Anonymous Coward
            • 1 year ago

            Hah, I even got a downvote for saying it! :-0

            • Spunjji
            • 1 year ago

            A chip this large with this many memory traces and this high a TDP is going to require a hell of a board and supporting components. You’re right that the cost of the chip isn’t always the primary factor in a system, but in this case it’s going to drag everything else up with it, too.

            • Anonymous Coward
            • 1 year ago

            Yeah but compared to [i]two sockets[/i] is it a problem? Because this seems to be really similar to two sockets in one. Basically a better interconnect between two hunks of silicon.

            • Waco
            • 1 year ago

            Two sockets are usually less expensive because the routing/tracing is simpler. It was one of the many complaints about the Epyc socket design when it launched.

          • Beahmont
          • 1 year ago

          Such a shill.

          More channels and more PCI-E lanes are all you've been touting on Threadripper and Epyc systems for months now.

          Intel beats AMD in both again in the same socket, and now suddenly none of that matters.

            • Waco
            • 1 year ago

            Shill? I guess that’s how we avoid intelligent conversation these days.

            AMD chips are affordable [i]and[/i] offer a ton of connectivity and memory bandwidth. The Intel chips will not compare in terms of price/performance/IO/memory bandwidth given that they [i]already[/i] don't compare on that metric and this is going to be priced higher.

            • Beahmont
            • 1 year ago

            Nah. Intelligent conversation stopped around here the instant AMD didn’t reasonably unarguably suck.

            You don't get to move the goalposts. You moaned and groaned about how amazing Threadripper and Epyc were solely because of their bandwidth and PCI-E lanes and how much Intel sucked. When people pointed out to you that you could get more bandwidth and PCI-E lanes for cheaper with more systems, you switched to talking about max performance in a single platform. And that's fair enough.

            But now that Intel offers something with unquestionably more bandwidth and PCI-E lanes in a single platform, suddenly there's a new metric involved that, according to you and your crystal ball, wouldn't you know it, justifies those AMD systems you brag regularly about building earlier in the year and happens to still favor AMD, even though previously it was a raw performance metric.

            If you want the most performance in a single platform, this system wins period. And you can’t stand that after all that trash you’ve been talking about Intel for months. And while you’re probably right about this being a halo product with a halo price, there will probably be lower cost options that just glue lower tier chips together for a competitive price.

            Gluing chips together is fairly easy. The moment AMD decided to go that route without having a better architecture first, they lost as Intel can just glue its better chips together and win that way, but everyone else loses because “The Coar Wars” is a stupid situation for consumers because it’s cheaper to just glue more chips together and talk about that than actually give real single threaded performance. Sure things are more competitive now, but the actual usable performance in a single system isn’t actually advancing, it’s just moving sideways.

            • Waco
            • 1 year ago

            Intel has always had the crown with their 4P systems. Twist it all you want, I honestly don’t care.

            • MOSFET
            • 1 year ago

            Better judgment tells me to leave it alone, but it’s hard to not let slip, at a minimum, Good Grief!

            I never expected Waco to be the subject of a shill subthread!

            • Waco
            • 1 year ago

            I appreciate the shock. 🙂

            • cygnus1
            • 1 year ago

            How is he a shill? He's just stating facts. This unreleased product doesn't win on core count (Epyc 2 wins that) and only catches up on PCIe lane count, while still falling behind because Epyc 2 will have PCIe 4.0 capability vs. 3.0 for Intel.

            Also, this Intel product, whenever it's released, will be very low volume. Which, combined with the added complexity of the motherboard, is going to make for absolutely eye-popping prices on these unless Intel plans to sell them at a loss.

        • blastdoor
        • 1 year ago

        Price/performance always matters… it’s just that the price and performance that people focus on isn’t just that of the CPU in isolation of everything else. Which means that dividing a CPU benchmark score by a CPU price is rarely the right metric.

        Another interesting definitional issue is "this market." Because I cannot afford CLAP, some might say that I'm not the target market. And I guess that's true, insofar as Intel is only targeting people willing to pay. But if we focus on my needs rather than Intel's profits, then CLAP definitely is a product for me. I've got a Threadripper 2990WX running almost 24/7. I'd love to have the extra cores, AVX-512 capabilities, and memory bandwidth of CLAP. Heck, I'd be happy to step up to Epyc. I could make good use of the performance. But I definitely can't afford it.

    • Sahrin
    • 1 year ago

    Xeon Emergency Edition.

    Even comes with a huge new socket like Presshot.

      • Goty
      • 1 year ago

      Probably the relatively insane power consumption, too. What’s the biggest consumer water chiller you can buy?

      *EDIT* Ooh, chuck’s getting fast with those downvotes!

        • Srsly_Bro
        • 1 year ago

        Not to go unnoticed, that -1 was from me.

          • Goty
          • 1 year ago

          Glad I could give you an avenue to participate!

          • blastdoor
          • 1 year ago

          On the one hand, I can see the rationale for a downvote.

          On the other hand, your whole “bro” schtick is wearing thin.

          So, three downvotes for you

      • Krogoth
      • 1 year ago

      Socket 775 packaging was only marginally larger than Socket 478 packaging though.

      • Klimax
      • 1 year ago

      Not that much bigger than AMD's own…

        • Spunjji
        • 1 year ago

        We can’t be sure given that apparently it doesn’t yet exist, but that pin count and the projected power requirements suggest this will be a whole new level of vast.

    • Thresher
    • 1 year ago

    How are they going to cool something like that?

      • DavidC1
      • 1 year ago

      It isn’t completely new to the industry but new to Intel.

      The IBM Power chips have been in that area for a long time now. Companies keep pushing the envelope to get that extra bit of performance.

      Large packages, liquid cooling and oil immersion techniques are used in servers. The -AP will likely not be a high volume part like the regular Cascade Lake Xeons.

      • DavidC1
      • 1 year ago

      Actually I was wrong. It isn’t even new to Intel.

      The Knights Mill Xeon Phi tops out at 320W. These parts if anything, will not be much greater than that.

      The -AP chips are successors to Xeon Phi anyways.

        • Waco
        • 1 year ago

        The -AP chips can’t really compete with the Phi line without embedded high bandwidth memory of some sort, though. They’re already bandwidth constrained for many workloads.

          • DavidC1
          • 1 year ago

          That’s true, but it may be because Cascade Lakes as a whole were unplanned due to 10nm delays.

          If you look at leaked roadmaps, -AP replaces Xeon Phi.

          You’ll see HBM2 on Icelake Xeons, so there’s your answer right there.

          psuedonymous (a poster on this article below) also hypothesizes that it uses BGA for ultra-high-density compute farms.

          It's not really a hypothesis. Cascade Lake-AP uses BGA5903. So it'll be a little more broadly used than Phi, in that it'll go into dense compute rather than just HPC.

            • Waco
            • 1 year ago

            I know the roadmap, I was just pointing out that the CL-AP line doesn’t really compare with the Phi line (which had a pretty specific purpose).

            Intel decided to kill off the Phi line a while back, which makes me a sad panda.

      • chuckula
      • 1 year ago

      I think a chip like this with a roughly 300 watt TDP would be cooled with a hunk of metal in a heatsink and a fan.

      You know, the same way that a 300mm^2 GPU from AMD on TSMC's "miracle" 7nm process that guzzles down 300 watts of power gets cooled.

      You know this one: [url]https://techreport.com/news/34243/amd-radeon-instinct-mi50-and-mi60-bring-7-nm-gpus-to-the-data-center[/url] Except that the Cascade Lake parts are easier to cool because they have a larger surface area. Notice how AMD advertises four-packs of those GPUs with zero air gap between them. Something tells me cooling two Cascade Lake-AP parts will be easy.

        • Krogoth
        • 1 year ago

        It is difficult to say for sure which platform would be more difficult to cool. It depends on the thermal densities of the dies and how well the IHS is interfaced to them.

      • Krogoth
      • 1 year ago

      A large HSF, and the chip packaging is going to be at least as large as the current LGA 3647 if not larger.

    • psuedonymous
    • 1 year ago

    My bet is on this being a BGA package rather than some new superjumbo LGA. Reasoning:
    This is pointless except for Density Uber Alles applications, i.e. "My datacentre is full and packed with Xeons, I can't physically expand, but I need to cram more CPUs in at any cost." Those customers have already almost entirely migrated to Open Compute and use custom (or at most semi-custom) boards, so having the CPU soldered on at build is likely not an issue, and it allows for even higher rack densities.

    • Goty
    • 1 year ago

    *waits for people to start complaining about access times to non-local memory*

    *dies*

      • chuckula
      • 1 year ago

      Considering the worst-case access time is probably better than on a two-die, desktop-grade 1950X or 2950X, the cards are still stacked in Intel's favor.

      And each die has direct access to a full 6 channels of RAM for all non-worst case scenarios instead of at best 2 channels of RAM.

        • Goty
        • 1 year ago

        Who else wasn’t surprised that chuckula was the first one to come to Intel’s defense?

          • chuckula
          • 1 year ago

          So you admit that I’m right and don’t have a fact-based counterargument.

          This is why Intel shills always win.

            • Goty
            • 1 year ago

            You’d be right if the fanboy argument were about the magnitude of the performance impact rather than its existence. In light of that argument, complaining about the issue in Epyc but not for Xeon is purely hypocritical. (And before you go on the defensive and take this personally, I’m not calling you out in particular here.)

            My stance has always been that I don’t care about idiosyncrasies as long as the platform performance is there, hence why you won’t see me on here bashing this product for idiotic reasons.

            • Srsly_Bro
            • 1 year ago

            We’re fortunate your paradox machine is holding this world together!

          • Action.de.Parsnip
          • 1 year ago

          To be honest it threw me a bit

        • jarder
        • 1 year ago

        I think it's way too early to make any predictions. At the moment we have no idea how the cores in a potential 96-core Xeon system will be connected, as Intel has not released that information. It's quite possible that data will have to jump from the die where it is stored through another die on the way to the core where it's needed. I'll be reserving judgement until the relevant information and benchmarks are released.

    • Waco
    • 1 year ago

    So they crunched 4 sockets into 2. Hurray? These are going to be eye-wateringly expensive I bet. Intel has some egg on their face given their responses to Epyc MCMs.

    Still looking forward to tomorrow.

      • Anonymous Coward
      • 1 year ago

      Why should they be abnormally expensive? Huge sockets like this seem like the direction to go for the foreseeable future. IMO, they would be wise to make the socket available also in the single-socket market.

        • Waco
        • 1 year ago

        Two top-end Intel dies in a new special socket with 12 channels of DDR4 per socket? Seems like a pricey combination to me.

        AMD has a huge socket, but at least it’s consistent across their entire portfolio.

          • Anonymous Coward
          • 1 year ago

          Yeah, it's true that they (appear to) have somewhat restricted options for low cost compared to AMD and their small dies. But single-socket must be the future for all price points it can service; no sense in two sockets if it can be avoided.

      • Beahmont
      • 1 year ago

      I'm sorry, are Epyc MCMs not just 4 Zen chips jammed into a single socket?

      Why the cheers for AMD and the boos for Intel for doing the same thing?

        • Waco
        • 1 year ago

        Because of Intel’s stance on the matter for the past few years, plus the premium that Intel places on such halo products.

        AMD chips are still a massive bargain in comparison considering [i]they already compete favorably against Intel platforms[/i].

          • Beahmont
          • 1 year ago

          Intel’s stance on the matter is that single chips are better than a glued chip of the same core count, and they aren’t wrong.

          If you can't get to the same core counts in a monolithic chip, then glue is better than no glue, and that's been Intel's stance since it made its first quad-core chips by gluing together two dual-core dies.

            • Spunjji
            • 1 year ago

            Intel didn’t *have* single-die chips that competed with AMD’s core counts when they made the glue jokes.

            But then you’re already in the territory of baseless speculation on what rationale a multinational corporation has behind their marketing disses, so, you do you.

    • rika13
    • 1 year ago

    Cascade Lake Advanced Performance…

    I wonder how long before some server admin yells “I GOT THE CLAP!!”.

      • Jeff Kampman
      • 1 year ago

      Way ahead of you. [url]https://twitter.com/jkampman_tr/status/1059368948968419328[/url]

      • maxxcool
      • 1 year ago

      *groan*

      • UberGerbil
      • 1 year ago

      Intel certainly was smart/fortunate they decided to codename their chips after various Lakes rather than Rivers.

        • Srsly_Bro
        • 1 year ago

        Nehalem river in Oregon would like to say hello.

          • Mr Bill
          • 1 year ago

          Hello from the Willamette.

            • Srsly_Bro
            • 1 year ago

            Yes, forgot about that one. I used to go running by it when I lived in Portland. In Seattle now.

        • Mr Bill
        • 1 year ago

        Or Trails?

          • Wirko
          • 1 year ago

          One Well, er, Swell was unforgettable.

        • Mr Bill
        • 1 year ago

        Missing sarcasm tag?

          • UberGerbil
          • 1 year ago

          Yep.

      • fredsnotdead
      • 1 year ago

      If it doesn’t meet expectations, it’ll be the slow CLAP.

    • Krogoth
    • 1 year ago

    SEE BROTHERS WHAT DID I TELL YOU?

    INTEL WILL COME BACK WITH A VENGEANCE!

    #PoorTSMC7nm
    #PoorZen2
    #PoorVega20
    #PoorNavi

      • cygnus1
      • 1 year ago

      I seriously doubt this will be anywhere near the price ballpark of any of the Epyc systems they compared to. That’s fine for folks with effectively unlimited budgets, but I don’t think that applies to most.

      Edit: and I bet they’ll double the PCIe lanes off the socket as well, so add that in too. 5000+ pins in the new socket maybe?

        • jihadjoe
        • 1 year ago

        Current rumors say 5903. Begun, the pin count wars have.

          • NTMBK
          • 1 year ago

          Coincidentally, 5903 is also the expected launch date of Intel’s 10nm server chip!

            • Wirko
            • 1 year ago

            That’s plausible, it’s already 5779 in Haifa.

    • chuckula
    • 1 year ago

    This gives some support as to why some HPC customers were fully briefed on Zen 2 and then went with Cascade Lake anyway. I like that they just said screw it and went with the full 12 channels of RAM.

      • Krogoth
      • 1 year ago

      Actually, AMD isn't Intel's biggest threat in the HPC space. It is Nvidia. They are more scared of Volta and its successors than Zen 2.

        • derFunkenstein
        • 1 year ago

        But but but but but you told me that Epyc was really taking off, and I should just wait, because all those Epyc installs were right around the corner, it just took people a while to get their post-Spectre/Meltdown builds in order.

          • Krogoth
          • 1 year ago

          HPC market != Enterprise/SMB market

          They are two completely different markets. These chips aren’t aimed at Enterprise/SMB space.

            • derFunkenstein
            • 1 year ago

            So what are 2S Epyc systems with 64 cores aimed at? What about the (supposedly coming) 1S 64-core Epyc?

            • Srsly_Bro
            • 1 year ago

            We’ll find out tomorrow! If you’re gonna glue, glue it right.

            • Krogoth
            • 1 year ago

            Enterprise/SMB markets. The same market that the current Skylake-SP HCC and XCC dies occupy (Xeon Gold and Platinum). Intel is doing a single-die Cascade Lake refresh for Enterprise/SMB markets.

            These dual-die Cascade Lake chips are going after 4P/HPC customers. The platform itself is far beyond the needs and budget of Enterprise/SMB customers. They are halo/prestige pieces designed to keep up appearances in the “core count race”.

            • derFunkenstein
            • 1 year ago

            Oh I get it, when Intel does lots of cores it’s for the “Enterprise/SMB” market but those same cores are for “HPC” when it’s AMD. Got it.

            • Krogoth
            • 1 year ago

            The platforms are completely different. This bad boy is paired with 12 channels of DDR4 memory and likely has at least 88 PCIe lanes at its disposal (assuming each die has a PCIe controller with 44 lanes).

            It is intended to be the successor to their current 28-core/LGA 3647 platform that is geared towards HPC customers. The platform will also provide future support for Intel's real answer to Zen 2: their own EMIB project, which is most certainly a WIP.

            • blastdoor
            • 1 year ago

            Perhaps this point is off topic relative to whatever it is that you guys are arguing about, but I would just note that being beyond the budget is not necessarily the same thing as being beyond the needs.

            • derFunkenstein
            • 1 year ago

            Always.

    • Hattig
    • 1 year ago

    12 channel memory? I presume that’s a new socket then.

    And yeah, using the same glue as they criticised AMD for using. I wonder what the TDP is?

    And a pre-announcement of a future release to boot – yet the headlines I’ve been seeing suggest it was an actual release.

      • cygnus1
      • 1 year ago

      Yeah, the 12 channels thing threw up that signal for me too. Has to be a new socket, and the pin count of that thing is going to be ridiculous. I wonder how many layers the motherboards will have to be to route those 24 memory channels around two sockets, too.

        • JustAnEngineer
        • 1 year ago

        Two-layer PCBs are trivial. Four-layer PCBs are quite easy (and therefore inexpensive), with just a single two-layer inner core and some fiberglass and foil on each side. Once you get to six layers and up, aligning the inner layers and connecting them together electrically with plated high-aspect-ratio through-hole vias becomes challenging (and therefore expensive).

          • cygnus1
          • 1 year ago

          Yep, fully understand all that. That's why I was saying there's just so much real estate needed to route that many traces. 24 total DIMM channels times 184 pins is around 4,000 traces for just the RAM surrounding 2 sockets. Throw in PCIe and other connectivity to the sockets and it just gets ridiculous. I wouldn't be shocked if these boards are over 12-layer PCBs.
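
          A back-of-the-envelope tally using the per-channel figure from the comment above, with the platform's PCIe lane count treated as a placeholder (a lane needs roughly four signal traces, one differential pair in each direction):

          ```python
          channels            = 24        # 12 per socket, two sockets
          signals_per_channel = 184       # the per-channel estimate above
          pcie_lanes          = 2 * 132   # placeholder: the 132-lane-per-socket guess upthread
          traces_per_lane     = 4         # TX pair + RX pair

          memory_traces = channels * signals_per_channel
          pcie_traces   = pcie_lanes * traces_per_lane
          print(memory_traces, pcie_traces, memory_traces + pcie_traces)  # 4416 1056 5472
          ```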

            • Waco
            • 1 year ago

            Yep. These sockets will make Knights Landing look tiny by comparison.

            • JustAnEngineer
            • 1 year ago

            When I worked at the circuit board factory in the late 80s, they were building a few 15 to 20 layer boards… that cost several thousand dollars each to build and sold for over $10k each.

            • cygnus1
            • 1 year ago

            Yeah, that's what I'm wondering might be required for these beasts. I've seen guesses at the pin count being near 6,000. It just boggles my mind how complex that would be to route. We have consumer boards that are 8 and 10 layers, so I'm expecting the number of layers in this thing has got to be up into the teens.

    • BabelHuber
    • 1 year ago

    Nice – Intel is gluing together 2 dice? I thought that was a bad design 🙂

    It will be interesting to see what AMD shows us tomorrow – I expect a 64 core / 128 thread server chip…

    Probably only coincidence that Intel revealed this today.

      • arunphilip
      • 1 year ago

      [quote]Probably only coincidence that Intel revealed this today[/quote] Yeah, pretty naughty of Intel to grab a few headlines a day prior to AMD. Good to see they're feeling the heat.

      • Krogoth
      • 1 year ago

      Nope, Intel is trying to steal AMD's thunder ahead of its official Zen 2 launch. They did this with the Pentium 4 EE (Gallatin) back in the day.

        • Intel999
        • 1 year ago

        They also did this more recently with their HEDT answer to Threadripper 2. Do you remember the one that hid an aquarium chiller under the demo to keep the CPU from melting?

        That was a desperation move just as this is. The pricing for this new glued together Xeon is going to be astronomical only to be overshadowed by the incredible power draw.

      • jihadjoe
      • 1 year ago

      Hating on glue has historically proven to be a bad move! AMD called out Kentsfield and look what happened to them in that time.

        • BabelHuber
        • 1 year ago

        [quote]Hating on glue has historically proven to be a bad move![/quote] Exactly! Alexander the Great could have easily conquered India - if he hadn't developed an irrational hatred of glue!

      • chuckula
      • 1 year ago

      Well, Intel decided to go retro, back to its success in 2008 with the Core 2 Quad.
      After all, AMD thought the idea was brilliant, and being copied by your competitors is the best form of flattery.

      • Kretschmer
      • 1 year ago

      Both Intel and AMD have glued together two dies at various points (Smithfield, Threadripper, Cascade Lake). The fanboy response shifts based on which company is currently using the technique.

      *Shrug* All I care about at the end of the day is the performance. If that’s there, how the chip is put together is secondary.

        • freebird
        • 1 year ago

        There has been a LOT of CPU MCM “glue sniffing” going on by both … I guess 2019 will show us which one will get us “higher”…

        Not sure if that is TDP or Performance though…

      • Mr Bill
      • 1 year ago

      Depends on the glue…

      Hypertransport
      ThunderTransport
      QuickPath
      UltraPath
      InfinityMesh
      InfinityFabric
      HBM
