Civ: Beyond Earth with Mantle aims to end multi-GPU microstuttering

The next installment in Sid Meier's Civilization series, Civilization: Beyond Earth, comes out tomorrow. The folks at AMD have been working with its developer, Firaxis, to optimize the game for Radeon graphics cards. Most notably, Firaxis and AMD have ported the game to work with AMD's lightweight Mantle graphics API.

Predictably, AMD and Firaxis report that Mantle lowers the game's CPU overhead, allowing Beyond Earth to play smoother and deliver higher frame rates on many systems. They've even provided a nice bar graph with average FPS showing AMD in the lead.

That's all well and good, I suppose (although *ahem* the R9 290X they used has 8GB of RAM). But average FPS numbers won't tell you about gameplay smoothness or responsiveness. What's more interesting is how AMD and Firaxis have tackled the thorny problem of multi-GPU rendering in Beyond Earth.

Both CrossFire and SLI, the multi-GPU schemes from AMD and Nvidia, handle the vast majority of today's games by divvying up frames between GPUs in interleaved fashion. Frame one goes to GPU one, frame two to GPU two, frame three back to GPU one, and so on. This technique is known as alternate-frame rendering (AFR). AFR does a nice job of dividing the workload between GPUs so that everything scales well for the benchmarks. Both triangle throughput and pixel processing benefit from giving each GPU its own frame.
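For the curious, the interleaving described above boils down to a round-robin assignment. Here's a minimal sketch in Python; the function name and the one-based numbering are my own illustration, not anything from AMD's or Nvidia's drivers.

```python
def afr_gpu_for_frame(frame_number: int, gpu_count: int = 2) -> int:
    """Return the one-based GPU that renders a one-based frame number
    under simple round-robin alternate-frame rendering (hypothetical helper)."""
    return (frame_number - 1) % gpu_count + 1


# Frames 1 through 6 on a two-GPU setup alternate between GPU 1 and GPU 2.
schedule = [afr_gpu_for_frame(n) for n in range(1, 7)]
print(schedule)  # [1, 2, 1, 2, 1, 2]
```

Real drivers do a lot more than this, of course, since they have to manage queues and synchronize data between the cards.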

Unfortunately, AFR doesn't always do as good a job of improving the user experience as it does of improving (or perhaps inflating) average FPS scores. The timing of frames processed on different GPUs can go out of sync, causing a phenomenon known as multi-GPU micro-stuttering. We've chronicled this problem in our initial FCAT article and, most extensively, in our epic Radeon HD 7990 review. AMD has attempted to fix this problem by pacing the delivery of frames to the display, much as Nvidia has done for years with its frame metering tech. But frame pacing is imperfect and, depending on how a game's internal simulation timing works, may lead to perfectly spaced frames that contain out-of-sync visuals.
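To see how out-of-sync timing turns into micro-stuttering, consider a toy model, sketched here in Python. It assumes two GPUs that each take a fixed 40 ms per frame, with the second GPU starting some offset after the first. Those numbers and the function names are made up for illustration; real frame times vary constantly.

```python
def afr_completion_times(frame_count, render_ms, offset_ms):
    """Completion time (ms) of each frame when two GPUs alternate frames.

    Toy model: GPU 1 starts at t=0, GPU 2 starts offset_ms later, and
    every frame takes render_ms on whichever GPU draws it.
    """
    times = []
    for i in range(frame_count):
        gpu = i % 2          # 0 for the first GPU, 1 for the second
        already_done = i // 2  # frames this GPU finished before this one
        times.append(gpu * offset_ms + (already_done + 1) * render_ms)
    return times


def intervals(times):
    """Gaps between consecutive frame completions."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


in_sync = intervals(afr_completion_times(7, render_ms=40, offset_ms=20))
out_of_sync = intervals(afr_completion_times(7, render_ms=40, offset_ms=5))

print(in_sync)      # [20, 20, 20, 20, 20, 20]
print(out_of_sync)  # [5, 35, 5, 35, 5, 35]
```

Both runs deliver six frame intervals in 120 ms, so the FPS average is identical. Only the second one would feel like it's running at something closer to the long 35-ms frame time.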

Making AFR work well is a Hard Problem. It's further complicated by variable display refresh schemes like G-Sync and FreeSync that attempt to paint a new frame on the screen as soon as it's ready. Pacing those frames could be a hot mess.

In a similar vein, virtual reality headsets like the Oculus Rift are extremely sensitive to input lag, the delay between when a user's head turns and when a visual response shows up on the headset's display. If that process takes too long, the user may get vertigo and go all a-chunder. Inserting a rendering scheme like AFR with frame metering into the middle of that feedback loop is a bad proposition. Frame metering intentionally adds latency to some frames in order to smooth out delivery, and AFR itself requires deeper queuing of frames, which also adds latency.

At the end of the day, this collection of problems has conspired to make AFR—and multi-GPU schemes in general—look pretty shaky. AFR is fragile, requires tuning and driver support for each and every game, and doesn't always deliver the experience that its FPS results seem to promise. AMD and Nvidia have worked hard to keep CrossFire and SLI working well for their users, but we at TR only recommend buying multi-GPU solutions when no single GPU is fast enough for your purposes.

Happily, game developers and the GPU companies seem to be considering other approaches to delivering an improved experience with multi-GPU solutions, even if they don't over-inflate FPS averages. Nvidia vaguely hinted at a change of approach during its GeForce GTX 970 and 980 launch when talking about VR Direct, its collection of features aimed at the Oculus Rift and similar devices. Now, AMD and Firaxis have gone one better, throwing out AFR and implementing split-frame rendering (SFR) instead in the Mantle version of Beyond Earth.

AMD provided us with an explanation of their approach that's worth reading in its entirety, so here it is:

With a traditional graphics API, multi-GPU arrays like AMD CrossFire™ are typically utilized with a rendering method called "alternate-frame rendering" (AFR). AFR renders odd frames on the first GPU, and even frames on the second GPU. Parallelizing a game's workload across two GPUs working in tandem has obvious performance benefits.

As AFR requires frames to be rendered in advance, this approach can occasionally suffer from some issues:

· Large queue depths can reduce the responsiveness of the user's mouse input
· The game's design might not accommodate a queue sufficient for good mGPU scaling
· Predicted frames in the queue may not be useful to the current state of the user’s movement or camera

Thankfully, AFR is not the only approach to multi-GPU. Mantle empowers game developers with full control of a multi-GPU array and the ability to create or implement unique mGPU solutions that fit the needs of the game engine. In Civilization: Beyond Earth, Firaxis designed a "split-frame rendering" (SFR) subsystem. SFR divides each frame of a scene into proportional sections, and assigns a rendering slice to each GPU in AMD CrossFire™ configuration. The "master" GPU quickly receives the work of each GPU and composites the final scene for the user to see on his or her monitor.

If you don’t see 70-100% GPU scaling, that is working as intended, according to Firaxis. Civilization: Beyond Earth’s GPU-oriented workloads are not as demanding as other recent PC titles. However, Beyond Earth’s design generates a considerable amount of work in the producer thread. The producer thread tracks API calls from the game and lines them up, through the CPU, for the GPU's consumer thread to do graphics work. This producer thread vs. consumer thread workload balance is what establishes Civilization as a CPU-sensitive title (vs. a GPU-sensitive one).

Because the game emphasizes CPU performance, the rendering workloads may not fully utilize the capacity of a high-end GPU. In essence, there is no work leftover for the second GPU. However, in cases where the GPU workload is high and a frame might take a while to render (affecting user input latency), the decision to use SFR cuts input latency in half, because there is no long AFR queue to work through. The queue is essentially one frame, each GPU handling a half. This will keep the game smooth and responsive, emphasizing playability, vs. raw frame rates.

Let me provide an example. Let's say a frame takes 60 milliseconds to render, and you have an AFR queue depth of two frames. That means the user will experience 120ms of lag between the time they move the map and that movement is reflected on-screen. Firaxis' decision to use SFR halves the queue down to one frame, reducing the input latency to 60ms. And because each GPU is working on half the frame, the queue is reduced by half again to just 30ms.

In this way the game will feel very smooth and responsive, because raw frame-rate scaling was not the goal of this title. Smooth, playable performance was the goal. This is one of the unique approaches to mGPU that AMD has been extolling in the era of Mantle and other similar APIs.
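The arithmetic in AMD's example is easy enough to write down. This Python sketch uses the 60-ms frame and two-frame queue from the statement above, and it assumes the idealized case AMD describes: a perfectly even split between GPUs and no cost for compositing the slices. The function names are mine.

```python
def afr_input_latency_ms(frame_ms, queue_depth):
    """Lag when each queued frame takes frame_ms to render under AFR."""
    return frame_ms * queue_depth


def sfr_input_latency_ms(frame_ms, gpu_count):
    """Lag with a one-frame queue whose work is split evenly across GPUs.

    Assumes a perfect split and zero compositing overhead, as in AMD's example.
    """
    return frame_ms / gpu_count


FRAME_MS = 60  # frame render time from AMD's example

print(afr_input_latency_ms(FRAME_MS, queue_depth=2))  # 120
print(afr_input_latency_ms(FRAME_MS, queue_depth=1))  # 60
print(sfr_input_latency_ms(FRAME_MS, gpu_count=2))    # 30.0
```

In practice the split won't be perfectly even and the master GPU has to spend some time compositing, so treat 30 ms as the best case.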

All I can say is: thank goodness. Let's hope we see more of this kind of thing from AMD and major game studios in the coming months and years. Multi-GPU solutions don't have to double their FPS averages in order to achieve smoother animations or improved responsiveness. I'd much rather see a multi-GPU team producing more modest increases that the user can actually feel and experience.

Of course, while we're at it, I'll note that if you measure frame times instead of FPS averages, you can more often capture the true improvement offered by mGPU solutions. AMD has been a little slower than Nvidia to adopt a frame-time-sensitive approach to testing, but it's clearly a better way to quantify the benefits of this sort of work.
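Here's a quick Python sketch of why frame times tell you more than an FPS average. The two traces are invented for illustration, and the nearest-rank percentile is just one simple way to do the math, not necessarily what any particular tool uses.

```python
import math


def average_fps(frame_times_ms):
    """Average frames per second over a list of frame times in milliseconds."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def percentile_frame_time(frame_times_ms, pct):
    """Nearest-rank percentile: the frame time that pct percent of frames
    come in at or under."""
    ordered = sorted(frame_times_ms)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


steady = [20] * 100       # hypothetical trace: every frame takes 20 ms
stuttery = [5, 35] * 50   # hypothetical trace: frames alternate 5 ms and 35 ms

print(average_fps(steady), average_fps(stuttery))  # 50.0 50.0
print(percentile_frame_time(steady, 99))           # 20
print(percentile_frame_time(stuttery, 99))         # 35
```

Both traces average 50 FPS, but the 99th-percentile frame time exposes the uneven one right away.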

Fortunately, AMD and Firaxis have built tools into Beyond Earth to capture frame times. I have been working on other things behind the scenes this week and haven't yet had the time to make use of these tools, but I'm pleased to see them there. You can bet they'll figure prominently into our future GPU articles and reviews.
