DirectX 12 Multiadapter shares work between discrete, integrated GPUs

During its Build conference yesterday, Microsoft revealed that its next-gen graphics API offers enhanced support for accelerating performance with multiple GPUs. Dubbed DirectX 12 Multiadapter, this capability gives developers explicit control over GPU resources. It works not only with traditional CrossFire and SLI setups, which team multiple discrete graphics cards, but also with hybrid configurations that combine discrete and integrated GPUs.

According to Direct3D development lead Max McMullen, who presented a Build session on graphics performance, DX12's Multiadapter mojo lets developers generate and execute commands in parallel on multiple GPUs, complete with independent memory management for each one. Those GPUs can collaborate on rendering the same frame or do different kinds of work in parallel.
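
To make that "explicit control" concrete, here is a minimal C++ sketch of what the setup side of explicit multiadapter can look like. This is not code from McMullen's session; the GpuNode struct and EnumerateGpus function are illustrative names of our own, and error handling is pared to the bone. The idea is simply that every adapter in the system gets its own D3D12 device and its own command queue, so command lists can be recorded and submitted to each GPU independently.

    // Enumerate every GPU in the system and create an independent D3D12
    // device and direct command queue on each one.
    #include <windows.h>
    #include <d3d12.h>
    #include <dxgi1_4.h>
    #include <wrl/client.h>
    #include <vector>

    using Microsoft::WRL::ComPtr;

    struct GpuNode {
        ComPtr<IDXGIAdapter1>      adapter;
        ComPtr<ID3D12Device>       device;  // each device manages its own memory
        ComPtr<ID3D12CommandQueue> queue;   // commands execute independently per GPU
    };

    std::vector<GpuNode> EnumerateGpus()
    {
        std::vector<GpuNode> gpus;
        ComPtr<IDXGIFactory4> factory;
        if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory))))
            return gpus;

        ComPtr<IDXGIAdapter1> adapter;
        for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i) {
            GpuNode node;
            node.adapter = adapter;

            // A discrete card and an integrated GPU each get their own device...
            if (FAILED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
                                         IID_PPV_ARGS(&node.device))))
                continue;

            // ...and their own direct queue, so work can be generated and
            // executed on multiple GPUs in parallel.
            D3D12_COMMAND_QUEUE_DESC desc = {};
            desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
            if (FAILED(node.device->CreateCommandQueue(&desc, IID_PPV_ARGS(&node.queue))))
                continue;

            gpus.push_back(std::move(node));
        }
        return gpus;
    }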

McMullen showcased the benefits for a hybrid configuration using the Unreal Engine 4 Elemental demo. Splitting the workload between unnamed Nvidia discrete and Intel integrated GPUs raised the frame rate from 35.9 FPS to 39.7 FPS versus only targeting the Nvidia chip. In that example, the integrated GPU was relegated to handling some of the post-processing effects.
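
A plausible shape for that kind of split, sketched on top of the GpuNode devices and queues from the listing above rather than anything shown in the demo itself: the discrete GPU renders the main pass and signals a fence created with cross-adapter sharing enabled, and the integrated GPU's queue waits on that fence before running its post-processing command list. The command lists and the shared intermediate render target are omitted; only the synchronization handshake is shown.

    // Hand a finished main pass from the discrete GPU (dgpu) to the
    // integrated GPU (igpu) for post-processing, using a shared fence.
    void SubmitFrame(GpuNode& dgpu, GpuNode& igpu,
                     ID3D12CommandList* mainPass, ID3D12CommandList* postPass,
                     UINT64 frameIndex)
    {
        // Fence created on the discrete device, shareable across adapters.
        ComPtr<ID3D12Fence> dFence, iFence;
        dgpu.device->CreateFence(0,
            D3D12_FENCE_FLAG_SHARED | D3D12_FENCE_FLAG_SHARED_CROSS_ADAPTER,
            IID_PPV_ARGS(&dFence));

        HANDLE sharedHandle = nullptr;
        dgpu.device->CreateSharedHandle(dFence.Get(), nullptr, GENERIC_ALL,
                                        nullptr, &sharedHandle);
        // The integrated device opens the same fence through the shared handle.
        igpu.device->OpenSharedHandle(sharedHandle, IID_PPV_ARGS(&iFence));
        CloseHandle(sharedHandle);

        // Discrete GPU: render the frame, then mark it as complete.
        ID3D12CommandList* dLists[] = { mainPass };
        dgpu.queue->ExecuteCommandLists(1, dLists);
        dgpu.queue->Signal(dFence.Get(), frameIndex);

        // Integrated GPU: wait on the GPU timeline (no CPU stall), then run
        // the post-processing pass over the shared intermediate buffer.
        igpu.queue->Wait(iFence.Get(), frameIndex);
        ID3D12CommandList* iLists[] = { postPass };
        igpu.queue->ExecuteCommandLists(1, iLists);
    }

In real code the fence and shared handle would be created once at startup rather than per frame, and the intermediate render target both GPUs touch would typically live in a cross-adapter shared heap (created with D3D12_HEAP_FLAG_SHARED_CROSS_ADAPTER) so that both devices can access it.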

A more impressive Multiadapter demo aired during yesterday's Build keynote. This one used a custom Square Enix Witch Chapter 0 [cry] scene designed to push DX12 to its limits, and the footage has already made its way to YouTube. Behold:

Um, wow.

Four GeForce GTX Titan X graphics cards were required to run the demo in real time, which means we probably won't see graphics that good in games anytime soon. DirectX 12's ability to share workloads across different multi-GPU configurations looks very promising, though, especially since it doesn't require a high-end setup with multiple cards.

Comments closed
    • Cyco-Dude
    • 4 years ago

    *shrugs* nothing we haven’t seen before. i think the performance benefits will be better than any incremental graphics improvements.

    • Bensam123
    • 4 years ago

    Hope this is the start of the second coming of Lucid multi-GPU tech. This isn’t just about combining an Intel GPU with Nvidia cards, it’s about being vendor agnostic (AMD and Nvidia) and usable across more than one type of card: say, a low-end and a high-end one, or older-gen cards in the future.

      • Meadows
      • 4 years ago

      Lucid Hydra was a great idea but the implementation didn’t get enough support. As far as I could find out through an internet search, performance was barely better in many titles, the chip needed constant driver updates for new games that it didn’t get, and after a while they just stopped bothering with it altogether because literally no new motherboard wanted to use Hydra anyway. The useful life of the product was fantastically short.

      While it *was* a weird workaround requiring extra hardware, it would’ve allowed some of the advantages of DirectX 12 in older games that use Dx 9, 10, 10.1, or 11.

      Dx 12 has the more elegant solution of course, but this way we only get the advantages in future titles that have yet to come.

    • d0g_p00p
    • 4 years ago

    I can’t wait to play “Hair Stylist Wars 2”, the graphics upgrade is amazing!

    • Meadows
    • 4 years ago

    Kind of appalling that they needed 4 Titans to do that in real time.

    I’m not entirely blown away either. [b<]We've all seen perfect hair in games before[/b<], the only difference here is that each strand of hair is treated uniquely, and good luck telling that without the presenter saying so. What would actually be nice is [i<]imperfect hair[/i<]. Simulation of scruffy, wet, muddy, or unkempt hair, dynamically. Like in Tomb Raider, except done properly. That'll be the real step forward, not this overdesigned prop from a Parisian catwalk.

      • jihadjoe
      • 4 years ago

      I’m sure there’ll be some optimization to come, along with some dialing back of detail — perhaps even to settings unlike those you describe. Personally I don’t see anything wrong with targeting very high graphics fidelity during development, even if such settings end up unplayable on most hardware at release date. Crysis did this, and nobody hated them for it.

      There will eventually come a time when the midrange card will outperform quad Titan Xs.

        • VincentHanna
        • 4 years ago

        I don’t think that is really his point. His point is that “very high graphics fidelity” and “pointless, wasteful and redundant” aren’t synonyms. If you are going to render 200,000 individual hairs, and you are going to go to the trouble of giving every hair 50 some texture engines, and individual physics and whatnot, you should have a reason to do so.

        Barring that, a single hair mesh can be just as effective, and not cost the resources of a 5000 computing block… even if technology advances and we have the resources of 50 titan Zs, can you not think of something better to do with that 8%?

      • chuckula
      • 4 years ago

      [quote<]What would actually be nice is imperfect hair. [/quote<] Imperfections, which are everywhere in the real world, are the last great hurdle for these graphics engines to overcome (at least if they want to be ultra-realistic and not intentionally unrealistic for artistic reasons).

        • jihadjoe
        • 4 years ago

        But the way to get those imperfections in is basically through a very high level of detail. Instead of a nice smooth texture for a car’s paint, for example, you’ll have to model in scratches and orange peel (as the Forza team did). To get realistic imperfections in hair, one of the best ways is to model each individual strand, instead of having them move in blocks or chunks.

          • Meadows
          • 4 years ago

          Scratches, bumpmaps and subsurface scattering for skin (as in the case of this video) should not and do not take 4 Titans.

            • Ninjitsu
            • 4 years ago

            But if they’re dynamic?

            • Meadows
            • 4 years ago

            Dynamism does not increase the load quite like that. While it means you vary the exact numbers to be crunched in a function, that does not increase [i<]the amount of numbers[/i<] to crunch in any meaningful way. Even if they did do some particularly funny magic, it's obviously pointless because the scene looks like something a single Titan X should have been able to display. After all, we have no information on target framerates and I'm not even sure about the resolution here. Let's face it: this is just a backdrop of medium detail behind one single lone character in some simple fantasy clothing with overdesigned hair. Now that I'm thinking about it, I'm not even sure it needed four cards to begin with. That might've been just a safety precaution "just in case".

    • alrey
    • 4 years ago

    The end of steam OS?

      • sweatshopking
      • 4 years ago

      that suggests it had a start.

      • kuttan
      • 4 years ago

      Maybe not; the Vulkan API is there….

      • deruberhanyok
      • 4 years ago

      I’m sure this is a feature that will have people running to buy video cards from different vendors for their gaming rigs.

      What does this have to do with SteamOS, anyways?

        • auxy
        • 4 years ago

        No DX12 in SteamOS. (*‘∀‘)

    • Welch
    • 4 years ago

    This is where I knew animation and graphics had to go in order to get truly better. Individual polygons for each item, no special bitmap algorithms, no tessellation-style tricks to give the appearance of physical texture. Just true-to-life individual building blocks to make up entire models of every single surface.

    This, gentlemen, is the start of real 3D computing.

      • Klimax
      • 4 years ago

      I don’t think I am following you. What do you mean by “polygon for each item”? As for tessellation, you got that one wrong. It is not a “trick” for the appearance of physical texture.

      Tessellation is not what you stated. It generates triangles to allow for finer detail, and it can be input to a geometry shader to finish the final mesh. (It severely decreases the required amount of memory, especially at high tessellation factors above 16/32.)

      This presentation should show you why tessellation is important and what its use is:
      [url<]http://developer.download.nvidia.com/presentations/2009/GDC/GDC09_D3D11Tessellation.pdf[/url<]

      Your post looks like a mishmash of ideas.

        • Klimax
        • 4 years ago

        Found some detail on Ars. An extreme number of triangles per scene is behind it (the very thing tessellation was introduced to avoid, since the memory and processing cost is unreasonable). Sounds like just an extreme version of the standard tech we already have, until more details get written up. And we don’t need such a large number. Tessellation + geometry shaders push the cost of such a detailed mesh to a far later stage, which IMO is a bit saner.

        If we want to see something new: Tessellation + original Displacement mapping. (No approximations)

          • the
          • 4 years ago

          Well, if you want an insane number of polygons, you can go with ray tracing. Dealing with an order of magnitude more polygons isn’t an issue. However, ray tracing is an entirely different can of worms when it comes to programming complexity and processing requirements.

            • Klimax
            • 4 years ago

            I think you are mixing up things a bit. This demo uses the classical rendering pipeline, which approximates light and other physical effects, and has 63 million polygons. My point is that this is unnecessary, because tessellation can reduce the number of vertices in the vertex stage and regenerate and adapt them in the geometry stage without loss of detail.

            Too many polygons = far too much VRAM and bandwidth required, and often wasted processing in the vertex stages. That’s what this demo uses: ~63 million triangles (if any other polygon were used initially, an even higher number of triangles would be required; the triangle is called a primitive in rendering for a reason).

            • the
            • 4 years ago

            I know. Essentially, if the bottleneck is in polygon throughput for the classical pipeline, switching to ray tracing would remove it. However, it has an entirely different bottleneck with the light-ray computation. At some insane polygon count, it does become beneficial to switch over even with the ray-tracing overhead. (And I don’t think we’re close to that point considering the quality of the Final Fantasy demo.)

            The likely reason for so many polygons is to get the hair animation right. Tessellation there would indeed reduce the amount of memory and bandwidth needed, but it also has a chance of looking unnatural in motion. It is a trade-off. Chances are there are several other tricks at play that we don’t notice to reduce the impact of dealing with so many polygons. I’ll throw out a guess that the hair’s shadow isn’t done on a per-polygon basis to match it, but rather uses a simplified model that is a good-enough approximation.

            • Klimax
            • 4 years ago

            Ok, now I see what you are saying.

            As for hair in motion and tessellation, I am sure it should be fixable in geometry shaders. (It likely depends on how well the curvature of the large triangles forming the hair can be described by equations.)

            • Mat3
            • 4 years ago

            [quote<]Essentially if the bottleneck is in polygon throughput for the classical pipeline, switching to ray tracing would remove it. [/quote<] With ray tracing, all those polygons would have to be stored in memory, and not just the ones directly visible by the camera. The memory requirements would skyrocket.

            • the
            • 4 years ago

            But not bandwidth as they would only be accessed if used in the current scene. Caching here does wonders. Considering that chips today are very

            It also helps that if you’re not bound by polygon throughput, you can omit things like normal and bump maps that take up memory for some models. Considering that normal and bump mapping are effectively textures in terms of memory usage, this could permit a reduction in memory usage.

            • Klimax
            • 4 years ago

            Not effectively. At least one if not two extra textures per surface. Usually one texture holds height data and one a vector. Fun stuff, and often very simple math with nice results.

    • deruberhanyok
    • 4 years ago

    Sweet googa mooga… it’s kind of mind-numbing that that’s being rendered in real time.

    I have a tendency to skip over videos most of the time when I see articles like this. “oh, it’ll be impressive, whatever, don’t need to watch it to see that there’s more polygons and prettier lighting”.

    But I’m really glad I clicked on this one – if you thought to just skip the video, it’s definitely worth watching.

    • tootercomputer
    • 4 years ago

    The pron industry is going to love this technology.

    • jessterman21
    • 4 years ago

    Someone at SE should dig up the Advent Children files and release a benchmark like the UE4 or Monster Hunter Ultimate ones. We could probably run the movie real-time now – minus whatever downsampling they used.

    • Deanjo
    • 4 years ago

    Two or more graphics drivers from multiple vendors……. like that isn’t a disaster waiting to happen.

      • Meadows
      • 4 years ago

      Game developers will have to check what works, so if a disaster does happen, it’ll be their fault this time.

        • w76
        • 4 years ago

        So MS is turning the tables: don’t blame us, blame developers!

      • GrimDanfango
      • 4 years ago

      Well, pretty much every single nVidia-based laptop has been doing that for years now, with Optimus letting the discrete card coexist peacefully with the onboard Intel GPU.

      nVidia mixed with AMD, that’s a different matter. I know for a fact that it’s pretty much a guaranteed system-crippler under Linux, as I tried it just recently to attempt to use an AMD card from OpenCL alongside a primary nVidia card.

      I know they can at least coexist under Windows 7, but I certainly wouldn’t put it past either of them, especially nVidia, to ‘accidentally’ sabotage the other now and again.

      (disclaimer – I’m actually pretty much a full-on nVidia follower, but I can certainly acknowledge that out of the two, they’re usually the one that, given the slightest opportunity, will always seek to actively screw over the competition)

        • MadManOriginal
        • 4 years ago

        True, although switching, or having entirely separate workloads, is still different than sharing a single real-time workload.

        • Andrew Lauritzen
        • 4 years ago

        Honestly this setup is way *less* finicky than Optimus. Optimus causes all sorts of problems with its shimming and layering nonsense, while Microsoft’s solution just runs the separate drivers in isolation, just like if you had multiple GPUs running different applications in your system (which you can already do in Win8).

        I’m not overly concerned about the robustness of this solution – the bigger question is how much performance game developers will actually be able to extract and how much effort they are willing to put into it. Let’s be clear – even with two identical GPUs the gains aren’t going to be nearly 2x… there’s a reason why the GPU vendors do AFR and it has everything to do with benchmark numbers 🙂

        Look at Civ BE Mantle multigpu for a good indicator of the sorts of gains you might see here if game developers put some effort into it. It’s still worthwhile gains, but it absolutely is not going to shift the balance to it being more cost-effective to buy two lesser powered GPUs or anything. It’s still only going to be for the extreme high end where you already have a Titan or something and want to add a second one.

        • Deanjo
        • 4 years ago

        Switching is a very different scenario than utilizing two different graphics vendors working on a single frame. This isn’t the first time it was attempted. Lucidlogix’s Hydra attempted this a few years back and learned the hard way that it wasn’t a good idea. It constantly had issues finding working combinations of drivers for mixed setups, and when there was a working combo of drivers, it would run some games but not others unless you switched to another combo.

    • lycium
    • 4 years ago

    Hah, how cool, just earlier this morning while brushing my teeth I was thinking about how different-vendor GPUs should do this. They can in OpenCL, so why not? 🙂

    • kvndoom
    • 4 years ago

    Yay better rendered cutscenes.

    • VincentHanna
    • 4 years ago

    I don’t care about my iGPU… As far as I’m concerned, its only purpose is to run my pc in safemode.

    However, if it can split work between discrete and iGPUs, does that mean it can also split work between non-SLI/non-xfire GPUs? Because that would be pretty sweet… being able to “add” a 900 series card to augment my current setup, instead of replacing it entirely…
    But of course, like everything else, if they didn’t talk about it, it probably just means that I’m dreaming and that it will never happen.

    • hasseb64
    • 4 years ago

    Small FPS benefit, BIG risk of bad coding, forget it!

      • Mat3
      • 4 years ago

      Well, the benefit would be a lot better if they used the integrated graphics on an AMD APU.

    • Bananaman
    • 4 years ago

    This reminds me of the software company LucidLogix that shipped their VirtuMVP software with a number of motherboards a few years back. If I remember, the software promised to offer higher framerates, more efficient v-sync, and enhanced video encoding (via Intel quick sync) by leveraging both the dedicated GPU and an IGP. A quick Google search shows the technology is basically dead, having been plagued with bugs, compatibility issues and stability problems.

    Technical concerns aside, I think VirtuMVP had a few good ideas that could be given new life with DX12. This is a great step towards switchable graphics when dedicated cards aren’t in use. This could bring the benefits of Intel Quick Sync to multi-display setups not connected to the IGP. An IGP could contribute its chunk of reserved CPU memory to the GPU pool, possibly allowing a means to “expand” video memory, or at least reallocate RAM that won’t be used by most games anyways. I’m sure the reality of what’s actually possible will greatly depend on how video drivers propagate themselves to the rest of the system, but it’s exciting to think about, no?

      • tootercomputer
      • 4 years ago

      Yes, you are spot on about VirtuMVP. It came with my AsRock mobo in 2012 and it was the biggest piece of hype I’d ever seen. I searched and could find no actual reviews that it did anything. All the “reviews” were simply PR stuff from the company.

    • Anomymous Gerbil
    • 4 years ago

    Does this mean DX12 games can be written to utilise multiple GPUs, even if they’re not set up in SLI mode (or maybe even preferably not set up in SLI mode)?

      • xeridea
      • 4 years ago

      Yes, with DX12 you can use multiple GPUs as you please: no SLI, no CF, no AFR (unless you want it). You could run a 390X and a 780 Ti together. It is a much lower-level API, so you have far more control.

        • chuckula
        • 4 years ago

        While all of what you said is certainly possible, it isn’t a plug-n-play free solution either. Programmers still have to deal with some complex resource management to handle multiple GPUs in an efficient manner.

          • xeridea
          • 4 years ago

          However it is done, it is certainly easier than working with SLI/CF. Most games have issues getting them to work right. With lower-level control, it will be a lot easier to split workloads. Yes, there is some work involved, but it is way better than the current situation, where multiple GPUs are a waste or have huge drawbacks.

            • the
            • 4 years ago

            No, it is actually more difficult for programmers. The resource management is entirely in their control. While this is a good thing for potential performance, gamers often get titles like Assassin’s Creed Unity which have the chance of decreasing performance when multiple GPUs are put to use.

            SLI/Crossfire puts the load distribution algorithms to work at the driver level, abstracted from the programmer. While it does take some profiling to get optimal performance, this methodology typically sees a moderate performance gain even without profile or direct developer intervention.

            • Deanjo
            • 4 years ago

            Not to mention that with SLI/Crossfire, the programmers are working with two identical hardware products. Different rendering strategies would be required for different combinations; I doubt, for example, that the strategy is going to be the same if you used, say, a Titan and a 7770 versus a 270X and a GTX 770. It’s just a world of hurt begging for developers to completely ignore multi-card/multi-vendor setups altogether.

            • the
            • 4 years ago

            It just comes down to how well a developer can partition the rendering pipeline into discrete tasks. The more tasks, the greater the ability to evenly divide up the workload regardless of which cards are going to be used. This is effectively a demonstration of Amdahl’s Law.

            • BobbinThreadbare
            • 4 years ago

            90% of game devs don’t write their own engines. Only like 4 companies need to figure it out (epic, valve, bethesda, DICE) and that covers just about everyone.

            • Ninjitsu
            • 4 years ago

            Unity – don’t forget one of the most popular ones out there!

            • the
            • 4 years ago

            True for the main engines.

            Third-party middleware will also need to adapt, but it also offers one of the easiest avenues to divide up a task. For example, if a game is using SpeedTree, why not have a meager Intel GPU handle that task? If middleware developers work with the engine designers, then task division can be simplified. This would lessen the burden on the actual game developer who is putting all of these pieces together.

            Case in point, the Square Enix demo looks like it used nVidia’s FaceWorks middleware technology that was shown off at GTC 2013.

    • ozzuneoj
    • 4 years ago

    Is it possible that this could be bad for overclockers? If a certain task is heavily loading your IGP, that will create more heat and require more power for the CPU in general, correct?

    I’m sure it would depend on the situation, but it seems like it could make an overclock less stable if the CPU suddenly starts cranking away at post processing effects in certain games after the system voltage and clock speed were carefully tuned for stability without any load on the IGP.

      • Flying Fox
      • 4 years ago

      Overclocking is never a guaranteed thing?

      • Firestarter
      • 4 years ago

      if it’s making an overclock unstable because it’s being too efficient, then I still don’t see how that would be a bad thing for overclockers

      • cobalt
      • 4 years ago

      Yes, it may make an overclock less stable, but that wouldn’t be BAD, that would be GOOD. Tests for stable overclocks evolve over the years, so if what you suggest is true, then we just found a new benchmark to use to more effectively test stability.

    • DPete27
    • 4 years ago

    Yeah, whatever happened to [url=https://techreport.com/review/21682/lucid-smarter-vsync-could-revolutionize-game-performance<]Lucid Virtu MVP?[/url<] Such a promising technology with an unfortunate death as a result of poor/no updates to support new games.

    • TheSeekingOne
    • 4 years ago

    I wonder how much more performance can be gained in an HSA execution environment where the GPU can directly access all virtual memory and communicate with processes without any intervention from the CPU.

    • tipoo
    • 4 years ago

    Hope it takes off. I always felt it wasteful for IGPs that are getting pretty darn decent to just sit there idle if you have a discreet card.

    Even with a multi Tflop dedicated card, adding 400-800Gflops from the integrated is no small deal.

      • derFunkenstein
      • 4 years ago

      And since so many cards shut down their fans when they’re not in use anymore, they really are discreet.

      • tviceman
      • 4 years ago

      While I agree, I also believe including an IGP on high-end desktop CPUs is a waste of die space. I’d rather have more cores, or better use of the transistors occupying that space to drive higher CPU performance per watt.

        • tipoo
        • 4 years ago

        I agree with that, but it looks like that ship has sailed. Unless you want to get a Xeon (which is a real option), Intel is adamant about bolting GPUs onto their CPUs now.

          • ikjadoon
          • 4 years ago

          Or the -E series…from Sandy Bridge-E to the upcoming Broadwell-E: none have integrated graphics.

    • HisDivineOrder
    • 4 years ago

    Well, at least AMD will have some remote reason to push APUs to gamers, right? And at least those Intel iGPUs will have some possible use to PC gamers.

    That said, I’d prefer more CPU cores and let me buy another discrete GPU to slap in instead.

      • xeridea
      • 4 years ago

      The CPU will be far less important in games now, though I would also rather have a dedicated CPU. Zen will have an 8-core CPU with no graphics. Having a GPU on the chip may be useful if HSA gets more software support in the future, though.

    • Ninjitsu
    • 4 years ago

    I wonder if it’ll be a better idea to have an iGPU-accelerated physics library now; almost everyone has an iGPU, and with the main discrete GPU handling the rendering, the integrated one could assist the CPU with physics. It also makes a lot of sense considering that data doesn’t need to go across PCIe, just RAM to CPU/iGPU. Those eDRAM Broadwells might shine here.

      • Firestarter
      • 4 years ago

      now you’re talking APU

        • nanoflower
        • 4 years ago

        Maybe AMD will start making APUs (Accelerated Physics Units).

        • ronch
        • 4 years ago

        I’m sure at this point someone should do it first, then AMD would follow.

      • the
      • 4 years ago

      There is PhysX and it can finally be ported to non-nVidia GPUs now that it has been open sourced as part of UE4.

      Intel has Havok, but I haven’t heard much about it for several years.

        • Ninjitsu
        • 4 years ago

        Oh yeah, that’s true! BIS is switching to the open-source version of PhysX with the next patch for Arma 3 as well.

        Havok was much better than PhysX from my experience with Sleeping Dogs. I think other Square Enix games use it too.

      • derFunkenstein
      • 4 years ago

      Holy crap that’s a good idea. It would also make upgrading your system look more palatable to people still using Sandy Bridge, but you’ll also have to start convincing Intel to include GT3 graphics on high-end CPUs.

      • puppetworx
      • 4 years ago

      Now yer talkin’.

      • ish718
      • 4 years ago

      Sounds like a good idea on paper but I can’t imagine all the driver issues that would arise.

      • UberGerbil
      • 4 years ago

      Seems like the way to do that is to put an OpenCL wrapper around the IGP. Though since we’re talking about Microsoft, it would be DirectCompute instead. I’m still surprised Microsoft never offered at least a baseline DirectPhysics library, but this would be a good place to do it.

      • Andrew Lauritzen
      • 4 years ago

      Possibly but it’s kind of like the audio situation… CPUs are actually pretty decent at physics and audio already (crappy software notwithstanding) – there’s not really much fundamental efficiency that GPUs are going to add to that.

      • ImSpartacus
      • 4 years ago

      That seems like a really good idea. I was hesitant about how this kind of asymmetrical dual gpu stuff would work, but throwing physics on the igp sounds like a winner.

      • kuttan
      • 4 years ago

      Not just physics computing; the possibilities are endless. It’s all up to the game developers what part of the game should go to the IGP.

    • TwoEars
    • 4 years ago

    I honestly don’t care that much about the graphics, just give me a solid RPG story and good gameplay mechanics and I’m happy. I’m a little worried about developers spending too much time on graphics and less on story and gameplay.

    What people tend to forget is that the limiting factor in most games isn’t the game engine or your textures, it’s your budget and how many concept artists, world artists and modelers you can afford to throw at a project. And if this is “one more thing” for those guys to optimize and keep track of I’m not sure it’s the right way to go.

      • Milo Burke
      • 4 years ago

      I agree to a point. Story and gameplay mechanics are absolutely important. But a good story and good graphics are not mutually exclusive.

        • K-L-Waster
        • 4 years ago

        To take that further, the writers, concept artists, world artists and modelers are never the same people who code the game engine, and vice versa. Developing an advanced game engine and developing a compelling story and game world are completely uncorrelated activities. There is no inherent reason a game cannot have both.

        Sadly, it is rarely the case that they do go together, but that is hardly the fault of the underlying technology.

          • TwoEars
          • 4 years ago

            They are not uncorrelated – big publishers (run by suits) ship millions of games using fancy screenshots and ads that have little to do with gameplay and story. They are in it for the money, and they know they can get away with it 90% of the time. Meanwhile, small developers (run by enthusiasts) realize that they can’t compete on graphics (and that they don’t have enough money for ads), so they know they have to nail the story and gameplay and sell by word of mouth. This is how the industry has been operating for some time now.

            • K-L-Waster
            • 4 years ago

            My point is there is nothing that inherently forces it to be that way. The fact that certain companies choose to work that way is their business (I don’t like it either, but there’s not a whole lot either of us can do about it other than vote with our wallets…).

            To put it another way, dispensing with DX12, or Mantle, or the next flavour of OpenGL, will not suddenly force the big budget studios to hire decent writers. They’ll just keep making eye candy with the existing tech.

            It’s identical to what happens in movies: most of the big eye-candy blockbusters have lousy writing and acting, and all of the budget ends up in pyro and CGI. That doesn’t inherently make CGI bad, though, and there are some few exceptions where there is a big effects budget *and* a great story.

          • Ninjitsu
          • 4 years ago

          I guess TwoEars’ point is more about money distribution. I think.

        • LostCat
        • 4 years ago

        I say there are enough games with both that I could’ve stuck with the ones with good graphics and still had more than my fill of great games.

      • chuckula
      • 4 years ago

      [quote<]Just give me a solid RPG story and good gameplay mechanics and I'm happy. [/quote<] It is pitch black. You are likely to be eaten by a grue.

    • BryanC
    • 4 years ago

    Mixing and matching GPUs is unlikely to be very useful, unless the GPUs are very similar. Going from 36 to 40 FPS, while requiring a custom code path for this combination of Intel & Nvidia GPU, seems like a marginal gain for a large amount of work, when you consider all the combinations of GPUs people might use. Especially when you consider that this exposes 2x the driver bugs (or maybe more, because it will use less well tested code).

      • the
      • 4 years ago

      The coding may not be as proprietary as you think. The example here was the Intel GPU doing post-processing work handed off from the nVidia GPU. These tasks are likely written independently from the main rendering pipeline. The DX12 scheduler can pass this on to available hardware as a pipelined function generically. Thus, if there were only a single GPU in a system, that GPU would context switch between the main rendering pipeline and the post-processing functionality (alternatively, the GPU could do both at once if it supports multiple simultaneous contexts).

      Though you are spot on about driver bugs from using multiple GPUs. Hopefully with DX12 being a thin layer, there isn’t much room for big bugs to appear. I’d still expect a rocky launch just for being on the bleeding edge.

        • BryanC
        • 4 years ago

        I don’t think the DX12 scheduler has the ability to pass work on to available hardware without a special code path, because that would imply the scheduler can transfer data between GPUs without the developer knowing. This would defeat the purpose of a low-level API.

        Because there are so many combinations of hardware that could be used, and DX12 is a low-level framework, properly “using all the GPUs” will require developers to implement complicated scheduling code themselves.

        I don’t see this happening very often.

      • xeridea
      • 4 years ago

      The level of control with multi-GPU is what makes it useful. Instead of alternate frame rendering, you can have multiple cards rendering the same frame (different sections), which reduces latency and stuttering. With the less capable Intel GPU, they had it doing a smaller task, but it was more useful than sitting idle. As developers have said before, the new APIs are easier to work with because you have more control and less to worry about with optimization of the unknown. This eliminates the need for constant CF/SLI profiles and endless hours of tweaking to work around DX11.

        • the
        • 4 years ago

        The UE4 demo wasn’t even using split frame rendering. Rather, the workload distribution was task-based: both the nVidia and Intel GPUs worked on a single entire frame. That’s really powerful in terms of flexibility.

        I see split frame rendering working best with two or more identical GPUs as they’re guaranteed to have the same performance and functionality.

        There is also the real possibility of a system with two discrete GPUs and one integrated GPU. Split-frame the main rendering across the discrete pair and let the integrated GPU handle all the post-processing, as in the UE4 example.

        Things are finally getting exciting again in graphics.

          • xeridea
          • 4 years ago

          Idea:
          Dynamic split frame rendering.

          If 2 GPUs are being used that are unequal, SFR could be used, and the performance monitored. If a game notices that one GPU is taking longer to render its part, it would vary the workload, so instead of 50/50, it would do like 60/40.

          Or it could just split the screen up into, say, a 4×4 grid, and each GPU would claim squares to work on; this would be better for scenes where different parts of the screen have more action or detail.
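
          A toy sketch of that first idea in C++, with made-up names, thresholds, and step size, just to illustrate the feedback loop: nudge the split line each frame based on how long each GPU took to render its slice of the previous frame.

            // Nudge the horizontal split between two GPUs toward balance,
            // based on measured per-slice GPU times from the previous frame
            // (e.g. from timestamp queries).
            struct SplitState {
                float split = 0.5f;  // fraction of the frame given to GPU 0
            };

            float UpdateSplit(SplitState& s, float gpu0Ms, float gpu1Ms)
            {
                const float step = 0.02f;           // move the boundary at most 2% per frame
                if (gpu0Ms > gpu1Ms * 1.05f)        // GPU 0 is the bottleneck: shrink its slice
                    s.split -= step;
                else if (gpu1Ms > gpu0Ms * 1.05f)   // GPU 1 is the bottleneck: grow GPU 0's slice
                    s.split += step;

                // Keep both slices non-degenerate.
                if (s.split < 0.1f) s.split = 0.1f;
                if (s.split > 0.9f) s.split = 0.9f;
                return s.split;
            }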

            • the
            • 4 years ago

            I’d hate to break it to you, but both of those techniques have been implemented a decade ago.

            Split frame rendering has traditionally been dynamic: a perfect split down the middle of a frame rarely divides up the workload between the two GPUs evenly. Back when nVidia first introduced quad SLI with the 7900GX2, you could even have the driver draw the line where it would split the screen under DX9. ([url=https://techreport.com/review/11006/quad-sli-under-the-microscope/2<]Check out the ancient history.[/url<])

            The concept of tiling a screen into small chunks for load distribution isn't new either. PowerVR is based around the tiling concept throughout the graphics pipeline (though PowerVR doesn't scale across multiple GPUs, they do this internally to a degree). [url=https://techreport.com/review/8826/ati-crossfire-dual-graphics-solution/3<]AMD used this when they first introduced Crossfire.[/url<]

            So what stopped split frame rendering and tiling a frame buffer from being used today? DirectX 10. It was a massive API change which focused more on the simplistic alternate frame rendering for multiple-GPU scaling. DX10 also changed how GPUs work by moving toward unified shaders. Between the software and hardware changes, techniques like split frame rendering and tiling a frame buffer simply broke. Until today with Vulkan/Mantle and DX12, not much effort has been put into actually fixing them.

      • HisDivineOrder
      • 4 years ago

      For better or worse, the very idea of DirectX 12 (and low level access in general) is putting more of the onus of driver work on the developers themselves versus the card manufacturers. Games that are tailored closer to the metal are going to require the developers to fix them versus driver teams working to fix the driver in the middle model we’ve had for years now.

      That’s an advantage AND a disadvantage. We’re counting on the same developers who do or do not update their games post-release to be more responsible for keeping games running in the future when games are old.

      So… I think “driver bugs” are the least of our problems, really.

        • BryanC
        • 4 years ago

        I think we’re saying the same thing here. Do we live in a universe where the multi-adapter mode that this article discusses has any real impact on games?

        I think the answer is clearly no. It is too complex of a problem, with very marginal gains, and no one is incentivized correctly to do the work: not the developers (who need to move on to the next thing rather than continuing to support new and complex compositions of heterogeneous GPUs), not the GPU vendors, and low-level graphics APIs by nature can’t fix it either.

    • kuttan
    • 4 years ago

    Finally there is a reason for having an Intel IGP, and yep, an AMD one too. IGPs can assist the discrete GPUs while gaming. Putting an IGP that would otherwise always sit unused to work is a really smart move.

    • NTMBK
    • 4 years ago

    You’ll all be sorry you didn’t buy APUs!

    (Okay, probably not)

    • chuckula
    • 4 years ago

    The workload sharing (if done properly) in DX12 and Vulkan is quite interesting because it could take away a bunch of the problems we have seen with timing glitches in alternate-frame-rendering in traditional multi-GPU setups.

    Also, technically you don’t have to have complete duplication of all the data for a scene between the video memories of two different cards, since each card can hold data for only a portion of the scene or for certain portions of the rendering process, and the final scene is then a composition of the outputs from both cards. (Some duplication of data between the cards is likely, but it doesn’t have to be a complete copy, which effectively halves the size of available VRAM in current setups.)

      • DrDominodog51
      • 4 years ago

      Wouldn’t the composition of the 2 portions of the frame cause more cpu load?

        • chuckula
        • 4 years ago

        Yes but didn’t you hear, DX12 fixed CPU load… there’s billions & billions of cycles free for that now!

      • Laykun
      • 4 years ago

      Complete duplication of data is necessary even if you’re rendering different portions of a scene on two different cards, as it’s hard to know at run time exactly what resources you’re going to need for each part of the screen. And when the scene shifts, you’d need to move data across the PCI-E bus to the other GPU, which wouldn’t be ideal. Likewise, doing different parts of the rendering process on different GPUs will often create data dependencies, where one GPU is stalled waiting for another to finish; again, not ideal. Duplicating data on both GPUs has the huge benefit of not having to sync anything between two cards over PCI-E or move resources between GPUs at run time.

      If you completely duplicate the data though it’d be great to see something like shadow map generation shared/balanced between GPUs and the results synced with each other once they’re done, along with split frame rendering for VR setups.

        • chuckula
        • 4 years ago

        How about this: One GPU handles everything pre-rasterization and then another GPU handles everything post-rasterization in a pipeline setup.

        It might not be the most efficient setup, but technically you could split the data into stages at that point so GPU #2 would only need to receive the rasterized output from GPU #1 then apply fragment shaders & other post-processing. That might be similar to what Nvidia did in that demonstration cited in the story.

    • Shambles
    • 4 years ago

    Making use of the integrated GPU is good.

    Being able to mix and match GPUs from different generations and manufacturers would be better.

      • Duct Tape Dude
      • 4 years ago

      [quote<]Being able to mix and match GPUs from different generations and manufacturers would be better.[/quote<] I believe I've heard them using AMD/nvidia GPUs together as well. DX12 sees everything as a pool of resources.
