AMD unveiled its Mantle graphics programming layer during a press event in Hawaii two months ago. The announcement immediately sent waves through the PC gaming community, and in its wake, we heard about a number of games adopting the API—games from Battlefield 4 to Thief to Star Citizen. However, AMD divulged comparatively little about how Mantle works or about the benefits we can expect from it. The lack of concrete information about Mantle spawned lots of speculation and debate, but most of it wasn't very enlightening.
Fortunately, we now know some specifics. At its APU13 developer conference in San Jose, California, AMD invited journalists and developers to listen to hours' worth of keynotes and sessions by Mantle luminaries. We didn't just hear from the API's architects; we also listened to some of its more illustrious early adopters, a few of whom helped develop Mantle in collaboration with AMD.
Among the speakers were Guennadi Riguer and Brian Bennett, the two AMD staffers who created Mantle; Johan Andersson of EA DICE, the man behind the Frostbite engines that power the Battlefield series; and Jurjen Katsman, the CEO of Nixxes, a Dutch studio that's porting the next Thief game to the PC. (Nixxes can also be credited with porting Deus Ex: Human Revolution, Hitman Absolution, and Tomb Raider to Windows.)
Altogether, the Mantle presentations and talks at APU13 amounted to well over three hours of material. Much of that material was laden with game programming jargon and cryptic PowerPoint diagrams, and almost all of it was presented by developers with a knack for talking really, really fast. What follows is some of the information we managed to glean from those sessions—and from talking with a few of those folks one-on-one.
What on earth is Mantle?
Before we get started, we should probably talk a little bit about what Mantle is.
Mantle is a new application programming interface, or API, for real-time graphics that's intended to be a substitute for Direct3D and OpenGL. Mantle is designed to cut much of the overhead associated with those APIs, and in many respects, it's meant to operate at a lower level, closer to the metal, than they do. In that sense, Mantle is similar—but not identical—to the low-level APIs used to develop games on consoles like the Xbox One and the PlayStation 4.
At present, Mantle support is limited to Windows systems with graphics processors based on AMD's Graphics Core Next architecture. Games written using Mantle will run on discrete Radeons from the HD 7000 series onward, and they'll work on upcoming Kaveri APUs, too. (The GCN graphics architecture has made its way into some other notable silicon, including the SoCs inside of both the PlayStation 4 and the Xbox One, but Mantle does not, to our knowledge, currently support those.)
All of this talk of being "close to the metal" refers to a classic tradeoff in programming interfaces, especially in real-time graphics. A programming interface may choose to go very low level by exposing control over the smallest details of the hardware, giving the developer access to exact buffer sizes and the like. Doing so can allow programmers to extract the best possible performance out of a particular piece of silicon. However, applications written for low-level APIs can become dependent on the presence of specific hardware. When a new chip architecture comes along, a "close to the metal" application may run poorly or even refuse to run on the new silicon. In order to maintain broader compatibility and flexibility, higher-level APIs restrict access to hardware-specific features and expose a simpler set of capabilities that presumably will be available across multiple chip architectures.
Console APIs can afford to be fairly low-level, since console hardware doesn't change for years at a stretch. By contrast, the high-level nature of Direct3D is the bit of magic that allows us to run decade-old PC games on brand-new graphics cards without issue.
In Mantle's case, according to Riguer, AMD has lowered the abstraction level in some areas but "not across the board." DICE's Johan Andersson described the traditional approach as "middle-ground abstraction," where a compromise is struck between performance and usability. Mantle, by comparison, offers "thin low-level abstraction" that exposes how the underlying hardware works. Riguer boiled it down further by comparing Mantle to driving a car with a manual transmission—more responsibility, but also more fun.
Also, while Graphics Core Next is the "hardware foundation" for Mantle, AMD's Guennadi Riguer and some of the other Mantle luminaries at APU13 made it clear that the API is by no means tied down to GCN hardware. Some of Mantle's features are targeted at GCN, but others are generic. "We don't want to paint ourselves in a corner," Riguer explained. "What we would like to do with Mantle is to have [the] ability to innovate on future graphics architectures for years to come, and possibly even enable our competitors to run Mantle." Jurjen Katsman of Nixxes was even bolder in his assessment, stating, "There's nothing that I can see from my perspective that stops [Mantle] from running on pretty much any hardware out there that is somewhat recent."
Of course, technical feasibility isn't the only obstacle in the way of Nvidia's hypothetical adoption of Mantle. We'll discuss this again in a little more detail at the end of the article. But first...
The problem with Direct3D
To understand why AMD created Mantle, it helps to know about some of the pitfalls of development with current, vendor-agnostic APIs. That model involves a substantial amount of overhead, and it apparently puts much of the optimization burden on driver developers, leaving game developers with limited control over how the hardware runs their software.
Katsman was particularly critical, calling Direct3D "extremely unpredictable" and complaining that, in some titles, "50% of your CPU time is spent by the driver and by Direct3D doing something that you're not quite sure about." AMD's Riguer blamed that high overhead partly on the fact that graphics drivers have "no straightforward way to translate API commands to GPU commands" and are "not all that lean and mean." In consoles, where the APIs are closer to the metal, Katsman said overhead amounts to something like "a few percent" of total CPU time.
The slide above, taken from the Nixxes presentation, outlines some of Katsman's grievances with Direct3D in more detail.
Among those grievances is the performance hit caused by the driver compiling shaders at "undefined times" in the background. Katsman noted that, in Deus Ex: Human Revolution, one of Nixxes' PC ports, shader compilation caused the game to stutter—which, in turn, led players to complain online. For what it's worth, we did notice some ugly stuttering in our own testing of that game, although it's not clear if those slowdowns were caused by this specific problem.
Another issue with Direct3D is the developer's lack of control over GPU memory. Riguer explained that consoles let developers achieve "much greater visuals than on [the] PC with comparable or greater memory configs." Katsman provided some background information about why that is. "In general, [with] Direct3D, if you destroy and recreate resources all the time, the API is too slow to do that, so you're stuck having a fixed amount of resources that you cache and you keep around," Katsman said. "Memory usage on PC is actually far higher, and we're not really getting anything in return."
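To make the caching pattern Katsman describes a little more concrete, here's a minimal sketch in Python. It isn't Direct3D code, and none of the names come from the APU13 presentations: `TexturePool`, `acquire`, `release`, and `idle_bytes` are our own hypothetical stand-ins, and plain byte buffers play the role of GPU resources. The idea is simply that released resources are parked for reuse rather than destroyed, which avoids slow create/destroy calls but keeps memory tied up.

```python
class TexturePool:
    """Hypothetical pool: parks released textures for reuse instead of destroying them."""

    def __init__(self, create_fn):
        self._create = create_fn   # the expensive "create resource" call (assumed)
        self._idle = {}            # (width, height) -> list of parked textures
        self.created = 0           # how many times we paid the creation cost

    def acquire(self, width, height):
        idle = self._idle.get((width, height))
        if idle:
            return idle.pop()      # reuse a parked texture, no creation cost
        self.created += 1
        return self._create(width, height)

    def release(self, texture, width, height):
        # Park the texture rather than freeing it; its memory stays allocated.
        self._idle.setdefault((width, height), []).append(texture)

    def idle_bytes(self):
        # Memory held by textures nobody is using right now.
        # Assumes textures are byte buffers, as in this illustration.
        return sum(len(t) for parked in self._idle.values() for t in parked)


# Illustration only: a 4-bytes-per-pixel buffer stands in for a GPU texture.
pool = TexturePool(lambda w, h: bytearray(w * h * 4))
first = pool.acquire(256, 256)
pool.release(first, 256, 256)
second = pool.acquire(256, 256)   # comes from the pool, not from create_fn
```

In this toy version, whatever `idle_bytes` reports between a release and the next acquire is memory that's allocated but doing no work, which is roughly the trade-off Katsman is complaining about.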
There's also the overhead associated with draw calls, Direct3D's basic commands to place and manipulate objects on the screen. Packing in the amount of detail in today's games requires lots of draw calls for each frame, and that leads to what developers call the small-batch problem. In Riguer's words, "You hit a wall after so many draw calls per frame." The limit is usually around 3,000-5,000 draw calls per frame, although very skilled developers can purportedly manage 10,000 or more. According to Katsman, developers must "jump through a lot of hoops" and come up with "new and clever ways to have fewer draw calls." The barrier to increasing the number of draw calls per frame lies not with the hardware, Katsman added, but with the API.
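A rough sketch can show why the wall exists and what the hoop-jumping looks like. The Python below is our own illustration, not anything shown at APU13. `max_draw_calls` assumes a fixed CPU cost per draw call, and the figures fed to it are made-up round numbers, not measurements. `batch_by_state` shows one common way to cut the call count, grouping objects that share a mesh and material so that each group could be submitted in a single call. Both function names are hypothetical.

```python
from collections import defaultdict


def max_draw_calls(frame_budget_us, cpu_cost_per_call_us):
    """Upper bound on draw calls if the CPU spent the whole frame issuing them."""
    return frame_budget_us // cpu_cost_per_call_us


def batch_by_state(objects):
    """Group (mesh, material, position) tuples by their shared mesh and material."""
    batches = defaultdict(list)
    for mesh, material, position in objects:
        batches[(mesh, material)].append(position)
    return dict(batches)


# Assumed numbers: a 20,000-microsecond frame (50 FPS) and 5 microseconds of
# CPU time per draw call. Neither figure comes from AMD or Nixxes.
ceiling = max_draw_calls(20_000, 5)

# A toy scene: 500 rocks and 300 trees, one draw call each if submitted naively.
scene = [("rock", "granite", (x, 0, 0)) for x in range(500)]
scene += [("tree", "bark", (x, 0, 5)) for x in range(300)]
batches = batch_by_state(scene)
print(len(scene), "objects ->", len(batches), "batched draws")
```

The per-call cost is the term the API and driver control, which fits Katsman's point that the limit lies with the API rather than the hardware. Batching only works around that cost, and it restricts how much variety a scene can contain.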
Katsman then decried the fact that driver optimizations are "almost required" for new games. Anyone who's ever had to download multiple beta driver updates to support a new PC game will be all too familiar with that problem. Developers are, in effect, unable to make their games work well by themselves. "I think that's actually very harmful and doesn't really contribute to users getting a good experience from the games they buy," said Katsman.
Finally, PC games underutilize multi-core processors. Four-, six-, and eight-core chips aren't uncommon in modern gaming PCs, but AMD's Riguer said that "very few of those cores are available for driving graphics today." Katsman elaborated on this point, noting that developers must expect drivers to spawn extra threads. He brought up this hypothetical scenario: "If the system has eight cores, then as an app, we should probably only use five, because who knows, the driver may still use another three or so." That truly is a hypothetical scenario, though. In practice, Katsman pointed out, most games "flatten off at one core."
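Katsman's eight-core arithmetic amounts to a small sizing rule. The Python sketch below is ours, not his: `app_worker_count` is a hypothetical helper, and the default reserve of three cores simply mirrors the guess in his quote rather than any documented driver behavior.

```python
import os


def app_worker_count(total_cores, reserved_for_driver=3):
    """Cores the game claims for itself, leaving headroom for driver threads."""
    # Always keep at least one worker, even on machines with very few cores.
    return max(1, total_cores - reserved_for_driver)


# os.cpu_count() can return None when the count is unknown, so fall back to 1.
detected_cores = os.cpu_count() or 1
workers = app_worker_count(detected_cores)
```

With eight cores, the rule yields the five workers from Katsman's example. The catch is that the reserve is a guess, since the application can't know how many threads the driver will actually spawn.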