The TR Podcast 186: Talking Skylake architecture with David Kanter

We recorded the latest episode of The TR Podcast last night. Special guest David Kanter joined Scott to go deep on Intel's Skylake CPU and graphics architecture, and the two also addressed some questions about the recent controversy over Nvidia's and AMD's differing approaches to DirectX 12 asynchronous compute shaders. Listen to their discussion and become enlightened:

Don't miss our live recording sessions for the TR Podcast. Subscribe to our Twitch channel or follow Scott on Twitter to be notified when we go on air. As always, we'll post a fully-edited audio version of the podcast when it's ready.

Comments closed
    • Silus
    • 4 years ago

    I’ve said it before and I’ll say it again: these (along with the regular podcast) make TR stand out over all other publications where tech is concerned. There’s always that little tidbit of information that rarely makes it into a text-only article, and we get to learn a lot.

    Thanks again Scott, TR and David. Hope you can continue this as you’ll surely have at least one listener in me.

    • tipoo
    • 4 years ago

    Are time stamps going to happen at some point?

    • USAFTW
    • 4 years ago

    Very informative podcast.
    However, may I proffer a suggestion:
    In the past, you guys used to invite guests every now and again. I remember listening to podcast #5, and you had Roy Taylor from AMD on back then. Is it possible to invite folks like Richard Huddy or Tom Petersen or Andrew Lauritzen and ask them some hard-hitting, investigative-journalism-type questions?

    • Unknown-Error
    • 4 years ago

    So if anyone wants to throw away their Titans and 980 Tis because they lack DX12 support, David will happily take your cards. Actually, not just David; most sane humans would take them.

      • Ninjitsu
      • 4 years ago

      Yeah me too! I’ll even take a 980 or 970. And I’ll pay for shipping!

        • Firestarter
        • 4 years ago

        I can offer up one HD7950 with glorious async shaders for anyone wanting to ditch their GTX980 Ti

    • WhatMeWorry
    • 4 years ago

    And now I know how to pronounce Kaby Lake: “…rhymes with maybe”

    • Vergil
    • 4 years ago

    So to summarize:

    Maxwell is not worth it, it’s old irrelevant tech.
    Maxwell is not DX12 ready, it’s not great for VR, and it’s not a fit for GPGPU compute.
    They conserve a bit more power by ignoring all these features, but that’s about it.

    If you’re looking for a future-proof GPU purchase, your only option is GCN Fury Cards!

      • Silus
      • 4 years ago

      Ah, “future-proofness”: the AMD fanboy way of trying to compensate for poor performance, high (or at least higher-than-the-competition) power consumption, and overall crappy drivers.

      Of course when that “future” comes that card will be obsolete anyway…

    • jihadjoe
    • 4 years ago

    Podcasts with David are the best!

    When you guys were talking about processors with eDRAM, that bit about memory-bound workloads showing up as CPU-bound was really interesting. And it’s obviously a latency issue (as opposed to bandwidth), because X99 with its quad-channel memory doesn’t show any improvement over Devil’s Canyon in those kinds of workloads, but the 5775C does.

    • Freon
    • 4 years ago

    Another great podcast with the Wasson/Kanter duo!

    • CScottG
    • 4 years ago

    Watched it live – kind of wished I hadn’t..

    Chock-full of minutia.

    ..or perhaps that’s chuck(ula)-full of minutia? (..good questions for the programming chuckula.)

    An older processor the US market can’t seem to get in volume is potentially better?

    ..crap.

    -everything else after that until near the very end was like listening to an episode of Peanuts where the dialog was nothing but their “on-the-phone” conversations.

    BUT,

    -did like hearing about the upcoming Pascal architecture from Nvidia – though I seriously doubt Nvidia is going to spring it this year.

    ..(sigh).

      • jts888
      • 4 years ago

      In my admittedly very jaded opinion, it felt to me like Scott was fawning over Intel press release material a bit too much, given how lackluster Skylake’s effective IPC improvement is over Haswell’s.

      He and David spent a great deal of time talking about a set of 10-20% bumps in ROB sizes, load/store reservation sizes, etc., but the bottom line is that most workloads will only see like a 5% bump in clock-for-clock performance and that sustained clock rates aren’t rising at a decent clip anymore. Maybe we need to wait for the Skylake EP/EX release, but I would have enjoyed more focus on the fixed TSX implementation (or even more discussion on the ring bus tweaks) instead of fine-grained voltage/turbo modulation.

      The argument can be made that Skylake is a decent improvement for mobile platforms, but the workstation (i.e., non-Xeon) implementations are extremely underwhelming to me, and I felt like we got little besides cheerleading from Scott (save the eDRAM issue) and not a lot of overt criticism from David.

        • Ninjitsu
        • 4 years ago

        It looked more like there was little to really criticise from an architectural point of view. Intel’s doing what they can to meet whatever power targets they had set and still increase IPC, but it seems obvious to me that software doesn’t fully exploit Skylake yet.

        TSX isn’t really meant for the consumer space either, though yes it would be nice to see it get attention (though wasn’t it talked about widely when Haswell came out?).

          • jts888
          • 4 years ago

          The issue IMO is that the desktop Skylake (i7-6700K reviewed) is a virtual non-release compared to the Haswells from over a year ago (i7-4790K): it runs the same clock and came out much later for essentially the same price, with maybe a 5% performance improvement even when using higher clocked DDR4 instead of DDR3. GPU vendors would (and do) get crucified for such lackluster releases.

          While it’s possible that MPX bounds checking could eventually reduce the security exploit profile of applications, it’s going to be a long time before most code is running it by default, so the only things that are really interesting to me now in Skylake are the finally (?) fixed TSX implementation and DX12, neither of which got as much exposure as I would have preferred.

    • Ninjitsu
    • 4 years ago

    I watched the on-demand video on Twitch, being too impatient for this post to come up. So I had a follow up question which I’ve now forgotten!

    But excellent discussion as always with Scott and David. 🙂

    • WhatMeWorry
    • 4 years ago

    “Tick Tock is for sissies, real men go Tock, Tock, Tock” 🙂

      • jts888
      • 4 years ago

      Skylake is a bit of a dog by Intel’s “tock” standards, even counting Haswell’s flub on TSX.
      The best-case scenario is that Kaby Lake is the real “tock” IPC-improving generation and that Skylake was just a tweaked Broadwell to push out the door mainly for the benefit of mobile platforms.

      Really, though, I’d like to see a modern comparison of Nehalem/SB/Haswell/Skylake and see how well an i7-6700 stacks up to an i7-920 from seven years ago. (3.6 GHz vs. 2.66 GHz, but same core count (4), cache sizes (64kiB/256kiB/8MiB), and price bracket (~$300 launch))

      If there was more than a 30% clock-for-clock IPC increase in that time or more than an 80% overall speed gain in non-accelerated (e.g., AES) processing, I’d be a little surprised.
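
      A quick sanity check on my own numbers (rough math on my part, nothing from the podcast): the clock bump alone is about 35%, so a 30% clock-for-clock gain on top of it lands just under +80% overall, which is why I picked those two thresholds.

      [code<]
      // Rough check: clock ratio times an assumed IPC ratio gives the overall gain.
      #include <cstdio>

      int main() {
          double clock_ratio = 3.6 / 2.66;               // i7-6700 vs. i7-920, per the clocks above
          double ipc_ratio   = 1.30;                     // assumed +30% clock-for-clock
          double overall     = clock_ratio * ipc_ratio;  // ~1.76x, i.e. just under +80%
          printf("clock: %.2fx, overall: %.2fx\n", clock_ratio, overall);
          return 0;
      }
      [/code<]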

        • jihadjoe
        • 4 years ago

        They weren’t talking about the CPU side when they mentioned that though, but the GPU.

        Intel graphics have come a pretty long way in a short amount of time. From barely good enough to display the desktop, they now have the most DX12-compliant architecture. If Intel were to make a full-fat GPU right now, they probably wouldn’t be too far behind AMD or Nvidia.

          • ronch
          • 4 years ago

          I love how Intel worked their way into GPUs little by little, step by step. I wouldn’t be surprised if they suddenly put out a full-bore discrete GPU.

          Intel’s work environment fosters learning about what works and what doesn’t. They inch their way forward, and even with ‘failed’ microarchitectures like the Pentium 4, they see it not as a failure but as a learning experience.

          Contrast that with how AMD fired its engineers when commercial flops like Bulldozer or Barcelona came out. I mean, come on guys, you’re building LEADING-EDGE MICROPROCESSORS, not paper planes! And you’re up against a huge competitor with an R&D budget big enough to fill the Pacific! Unless you can find some superhumans to work for you, how do you expect a bunch of engineers to pull a rabbit out of a hat every single time? I think AMD has to adopt Intel’s corporate paradigm.

          • tipoo
          • 4 years ago

          Man, I would actually love for that to happen. Their fabrication advantage would also leapfrog the current 28nm stalemate. You’re right, their architecture is very good, and we’ve also seen that it’s very scalable provided it gets enough bandwidth (though to scale up to more than 3-4 slices, they may need more than one “common slice” with the front-end hardware, since that’s already limiting upward DX12 scaling).

        • Ninjitsu
        • 4 years ago

        920 vs 6700K [url<]http://www.anandtech.com/bench/product/47?vs=1543[/url<]

          • jts888
          • 4 years ago

          Thanks, that’s a nice benchmark database.

          It looks like the transcoding, AES, and Cinebench scores went up a few fold, as one would probably guess, but the non-accelerated stuff like the 3D particle sim, 7-Zip, etc., only improved by about 50-70%, and that’s for the ~$350 4 GHz 6700K instead of the more directly comparable ~$300 6700.

          That boils down to about 7% year-over-year improvement for most software over the better part of a decade, which is paltry by the standards of the ’90s through early ’00s, even if the underlying engineering difficulties are easy to understand (i.e., no NetBurst++ 10 GHz, 50-stage pipelines solving all our problems).
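
          (If anyone wants to check that ~7% figure: it’s just the compound rate over the roughly seven years between those two launches; a quick sketch of my own arithmetic below, not AnandTech’s numbers.)

          [code<]
          // Compound a yearly improvement rate over n years.
          #include <cstdio>
          #include <cmath>

          int main() {
              double yearly = 0.07;                           // ~7% per year
              int    years  = 7;                              // i7-920 (2008) to i7-6700K (2015)
              double total  = std::pow(1.0 + yearly, years) - 1.0;
              printf("total gain: %.0f%%\n", total * 100.0);  // ~61%, inside the 50-70% range
              return 0;
          }
          [/code<]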

            • Ninjitsu
            • 4 years ago

            [i<]But should you even be comparing with the 90's, today?[/i<] EDIT: I mean in terms of how much progress you'd expect to make per generation.

            • jts888
            • 4 years ago

            That’s entirely up to the individual and his/her opinions of the industry, I suppose.

            From about ’95 to ’03, single-threaded integer performance improved consistently year-to-year by about 50%, and now it’s maybe 10% if we’re being generous, and core counts aren’t increasing except on server CPUs.
            (Scott’s own final numbers for the i7-6700K show ~8% annual improvement since the i7-2600K for non-games and ~5% for games.)

            One can aesthetically appreciate some of the craft going into wringing out these last refinements to general purpose CPU cores, but it’s hard to sincerely be happy about the practical implications of the rapidly dwindling returns, which is why I’ve been a bit critical of Scott’s coverage of Skylake.

    • ludi
    • 4 years ago

    Good run, as usual. And thanks to DK for taking the time.

    • terminalrecluse
    • 4 years ago

    Kanter on AMD’s speedup in the DX12 Ashes of the Singularity game: “Well duh, if you understood the architecture”…

    score one for the red team

      • Ninjitsu
      • 4 years ago

      Did you miss the rest of the discussion, though? ¬_¬

        • nanoflower
        • 4 years ago

        Reading through the discussion on Beyond3D, it seems clear that no one is quite sure about the async compute calls with regard to AMD and Nvidia. Lots of interesting discoveries that will take a while to work through to see whether there are real gaps in either implementation or just things that are misunderstood/misused. In any case, no one is likely to be using the feature in a real game on the PC this year.

          • nanoflower
          • 4 years ago

          And now it seems that Oxide just hit on something that was only partially implemented in the driver, according to one of Oxide’s developers. So it will be fixed in a future driver update.

            • MFergus
            • 4 years ago

            How recent was this? I saw how they had to turn off async compute for Nvidia, but from what I read it seemed more like a hardware limitation than something that could be fixed with a driver.

            • Ninjitsu
            • 4 years ago

            This is very recent. Nvidia implements a method of async compute, but you can’t use it alongside graphics. The driver was exposing the feature anyway, which is what a future driver should fix.
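
            To be clear about what “exposing the feature” means: in D3D12, async compute is just the application submitting work on a separate compute-type command queue alongside the normal direct (graphics) queue. A bare-bones sketch of setting that up (my own illustration, no error handling):

            [code<]
            // Create a direct (graphics) queue and a separate compute queue on the
            // same device. Whether work submitted to the two queues actually overlaps
            // on the GPU is entirely up to the hardware and driver, which is what
            // this whole debate is about.
            #include <windows.h>
            #include <d3d12.h>
            #include <wrl/client.h>
            #pragma comment(lib, "d3d12.lib")
            using Microsoft::WRL::ComPtr;

            int main() {
                ComPtr<ID3D12Device> device;
                D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device));

                D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
                gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;      // graphics + compute + copy
                ComPtr<ID3D12CommandQueue> gfxQueue;
                device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&gfxQueue));

                D3D12_COMMAND_QUEUE_DESC computeDesc = {};
                computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE; // compute + copy only
                ComPtr<ID3D12CommandQueue> computeQueue;
                device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));
                return 0;
            }
            [/code<]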

            • jts888
            • 4 years ago

            Could you clarify a bit? Is the fix to actually allow fine-grained asynchronous/simultaneous processing of both graphics and compute queues, or is it to not have the DX12 driver (incorrectly?) report the capacity to do so?

            • nanoflower
            • 4 years ago

            Currently the driver says that Maxwell 2 can support the feature, but the driver doesn’t fully implement it. That’s probably due to Nvidia spending time on other features that were more likely to be used, but thanks to the attention now being focused on this feature, I’m sure they will be forthcoming with a new driver that fully supports it soon.

            What pharma (an Oxide dev) said was “We actually just chatted with Nvidia about Asynch Compute, indeed the driver hasn’t fully implemented it yet, but it appeared like it was. We are working closely with them as they implement Asynch Compute.”

            • DoomGuy64
            • 4 years ago

            From what I’ve read elsewhere, it’s not that the feature isn’t fully implemented in the driver. It’s been speculated that Async is not fully implemented in hardware, so Nvidia is emulating the feature through software, which is why it incurs a performance penalty instead of not working at all.

        • terminalrecluse
        • 4 years ago

        What’s to miss? Kanter went on to say that Nvidia would get it right in Pascal, but that AMD had been doing asynchronous designs for years given all their console exposure, since consoles have had to use low-level access to graphics for years. And he did say that AMD’s drivers weren’t that great, but I doubt that’s the whole issue; it’s AMD’s ability to do async that gave them the edge.

          • nanoflower
          • 4 years ago

          It looks like they do support it on Maxwell 2 but the driver hasn’t fully implemented support for the feature at this time. So there’s no need to wait for Pascal to get support for it, just a new driver. However I’m sure Pascal will do a better job of supporting the feature.

            • terminalrecluse
            • 4 years ago

            Well then I stand corrected.

            • Freon
            • 4 years ago

            Except for all the caveats: the 32 queued items that probably must all be the same type, preemption that’s possibly catastrophic, etc.?

          • Ninjitsu
          • 4 years ago

          He said that only Intel implements all DX12 features, and he’d gladly take any high-end Maxwell cards that people apparently don’t want anymore.

        • Freon
        • 4 years ago

        I don’t think I missed anything. I have a hard time not hearing that AMD has an at least pretty good solution for DX12 today (or rather, as of several years ago), and that NV has to race to get its next-gen part out to fill the async crater in its current parts.

          • nanoflower
          • 4 years ago

          Except for the fact that it’s a lie that I’m sure will continue to be spread for some time to come. Yes, async compute doesn’t work correctly today with Maxwell 2, but it’s a driver issue. That means it’s something that Nvidia can and will fix quickly now that everyone is talking about it. At that point AMD may have a performance benefit over Nvidia in async compute thanks to a different design, but Nvidia will still perform better in DX11 games than AMD, and likely will perform as well as or better than AMD in upcoming DX12 games (because few will depend upon async compute).

          Even if Nvidia were to decide not to do anything with async compute on Maxwell, it would not matter a great deal, because with their overwhelming share of the discrete graphics market, developers will take that into account in their designs so that their games perform well on Nvidia cards. Plus, by the time async compute becomes a common factor in games (if it ever does), we will likely have moved past Pascal and on to new generations of Nvidia cards.

            • NoOne ButMe
            • 4 years ago

            The issue isn’t how you schedule here.

            It’s that, as far as anyone can tell, Maxwell 1 and 2 BOTH CANNOT run graphics and compute at the same time without suffering a slowdown. Their HyperQ and similar features were made for professional compute purposes and non-graphics work.

            Maxwell is stripped down to exactly what you need for DX11. If you only have X hardware and you’re already utilizing 95% of it, you can only ever get about a 5.25% performance increase. You don’t have more hardware to utilize.
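
            (For clarity, that 5.25% ceiling is just the reciprocal of 95% utilization; a quick check of my own arithmetic:)

            [code<]
            // Max speedup from better scheduling when the hardware is already ~95% busy.
            #include <cstdio>

            int main() {
                double utilization = 0.95;
                double max_gain    = 1.0 / utilization - 1.0;   // ~0.053, i.e. about 5%
                printf("max gain: %.2f%%\n", max_gain * 100.0);
                return 0;
            }
            [/code<]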

      • TheSeekingOne
      • 4 years ago

      Nvidia on their DX12 support in their own words:

      [url<]https://www.youtube.com/watch?v=Dnn0rgDaSro&feature=youtu.be[/url<]

        • GhostBOT
        • 4 years ago

        This is Hilarious XD

        • shank15217
        • 4 years ago

        HAHAAHAHHAHAHAHA I love it

          • Firestarter
          • 4 years ago

          VR

      • Pancake
      • 4 years ago

      Kanter’s interpretation is wrong. With DX12, GPU microarchitecture makes a big difference to how you might want to organise and arrange your code and data structures. So each graphics card manufacturer’s market share might be the decider on where optimisation effort is spent. Looking at the current stats, that would favour Nvidia by a large amount.

      • f0d
      • 4 years ago

      It’s funny how some people (*cough* *cough* AMD fanboys) are making such a big racket over one DX12 feature that may or may not be supported by Nvidia, yet there are two other DX12 features, rasterizer ordered views and conservative rasterization, that AMD themselves have said they don’t support.

        • Firestarter
        • 4 years ago

        nvidia fanboys can have their day when those features prove to be significant in games

          • f0d
          • 4 years ago

            None of the features have been proven “significant” yet, since there are no actual DX12 games RELEASED yet; there is one alpha test of an unreleased game.
            Also, this supposedly important feature is getting fixed in a driver update:
            [url<]http://www.game-debate.com/news/?news=18007&game=Ashes%20of%20the%20Singularity&title=Nvidia%20Working%20On%20Asynchronous%20Compute%20Support%20In%20DirectX%2012%20With%20Future%20Driver%20Updates[/url<]

            Either way, I have no horse in this race; I just find it funny how people are reacting to this one feature in an unreleased game and how much it supposedly means.

            IMO the smart thing to do is to wait for a few games to get released before making any decisions on what features and cards are best. Being a “fan” of a brand hurts you more than it helps you. Just get what has been proven to be the best after a thorough test (which you won’t be able to do until 3 or 4 DX12 games come out, at least).

            • Ninjitsu
            • 4 years ago

            Yeah, this is why I’m not jumping on the “AMD WINS EVERYTHING NOW” bandwagon. It’s one game in beta. And people talk about sample sizes.

            • Firestarter
            • 4 years ago

            [quote<]IMO the smart thing to do is to wait for a few games to get released before making any decisions on what features and cards are best[/quote<]
            That is very true.

    • Nevermind
    • 4 years ago

    The more we hear about the new “back-end” features of the CPU (on-die voice and photo analysis, always-on listening capabilities, an OS-independent hypervisor that can use the network to send data back to Intel or anyone else; some of that is old news, like the Management Engine)…

    The less excited I am about Skylake or any new CPU that follows this trend.

    And before you say “well obviously you have something to hide” I really don’t, not at all.
    It’s the principle. And this isn’t pro-AMD either, although they are presumably doing less.

    Consumers should be in control of the things they decide to buy.

    If they want to include “features” like this, outraged consumers NEED TO DEMAND that they have an option for turning them off and verifying that. Or it will get worse as time goes on.

    Rant over.

      • Deanjo
      • 4 years ago

      [quote<]And before you say "well obviously you have something to hide" I really don't, not at all. [/quote<] That's what all the terrorists say. 😀

        • Nevermind
        • 4 years ago

        I guess we’re all being treated as potential terrorists, no?

        Hell, if they want to hack into the boxes of suspects, we can’t really stop that here in the USA.
        The law has found we have no standing to sue, and no warrants are really required.

        But pre-hacking ALL new products sold to ALL consumers is going a bit far, don’t you think?

        • chuckula
        • 4 years ago

        I DO have something to hide!

        Not a terrorist!

        • the
        • 4 years ago

        Of course! Why would I want my top secret plans for world domination to leak to the rest of the world?

      • TheFinalNode
      • 4 years ago

      Here’s a short but great talk I frequently refer to when people say that only “bad” people want privacy: [url<]https://www.youtube.com/watch?v=pcSlowAhvUk[/url<]. It's by Glenn Greenwald, the journalist who initially reported on Edward Snowden's leaked documents, so it's very articulate and well thought out.

      • Ninjitsu
      • 4 years ago

      Well frankly if there’s no software that uses these features, then the hardware by itself isn’t of much use. After all, software has been doing this without dedicated DSPs and stuff so far.

      Blame MS for using this stuff in Windows 10. Intel wouldn’t bother if no partner demanded it.

        • Nevermind
        • 4 years ago

        True, but I don’t think M$ is the culmination of their secret data partnerships.

      • TheJack
      • 4 years ago

      It somehow sounds like the rise of the Terminators. Companies are now boldly and openly saying that they are watching your every move, and if you don’t like it, well, who gives a damn? Question is, where is it going to go from here?
