We tested Mantle performance versus Direct3D in Windows 8.1 using a couple of different processors, a Kaveri-based AMD A10-7850K APU and an Intel Haswell-based Core i7-4770K. The idea was to test in a CPU-constrained performance scenario using two processors with different levels of performance. In fact, I had hoped to show a lower level of CPU performance by including another AMD APU, a 65W Richland-based A10-6700. However, its performance turned out to be almost identical to that of the 95W Kaveri 7850K, so I held it out of our final results in order to keep things simple.
The main video card we used was a Radeon R9 290X card from XFX. This 290X comes with a custom cooler and sustains its peak Turbo clock almost constantly, even under the heaviest of loads. It essentially eliminates the clock speed and thermal variance issues we've seen with stock-cooled 290X cards. (I'll be writing more about this card soon.) To ensure the GPU wasn't the performance constraint, we tested BF4 at 1920x1080 on the "high" image quality presets, which is fairly easy work for a video card of this power. We also tested at these same settings using a GeForce GTX 780 Ti, in order to see how Nvidia's Direct3D driver fares compared to AMD's D3D and Mantle implementations.
We captured performance info while playing through a two-minute-long section of BF4 three times on each config. You can click the series of buttons below to see frame-time plots from one of the test runs for each config we tested.
Even the raw plots readily show Mantle producing lower frame times and more total frames than Direct3D does with the same R9 290X card.
The known issue with occasional stuttering rears its head in one plot, for the 4770K with Mantle. You can't see the full size of the frame time spike on the plot, but it's 295 milliseconds—nearly a third of a second. We didn't see this sort of hiccup all that often, but it did happen during some test runs, including the one we plotted for the 4770K.
AMD has made some big claims for performance improvements from Direct3D to Mantle, and the numbers from the A10-7850K appear to back them up. The leap from an average of 69 FPS to 110 FPS is considerable by any standard, particularly for an API change that apparently produces the same visuals. Even better, our latency-focused metric, the 99th percentile frame time, tends to agree that Mantle is substantially faster than D3D in this case. Mantle also outperforms Direct3D in combination with the Core i7-4770K, but the differences aren't quite as dramatic.
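For readers who want to see how the two headline metrics relate to the raw frame times, here is a minimal Python sketch. It assumes the capture is simply a list of per-frame render times in milliseconds and uses a nearest-rank percentile, which may differ in detail from the method behind our graphs. The function names and the sample data are hypothetical, made up for illustration, and are not our test results.

```python
import math

def average_fps(frame_times_ms):
    """Average FPS: frames rendered divided by total elapsed seconds."""
    total_seconds = sum(frame_times_ms) / 1000.0
    return len(frame_times_ms) / total_seconds

def percentile_frame_time(frame_times_ms, pct=99.0):
    """Nearest-rank percentile: the smallest frame time such that at
    least pct percent of all frames completed at or below it."""
    ordered = sorted(frame_times_ms)
    rank = math.ceil(pct * len(ordered) / 100.0)
    return ordered[max(rank, 1) - 1]

# Hypothetical capture: 98 quick frames plus two slow ones.
sample = [10.0] * 98 + [20.0, 40.0]
fps = average_fps(sample)
p99 = percentile_frame_time(sample, 99.0)
print(f"average: {fps:.1f} FPS, 99th percentile: {p99:.1f} ms")
```

In this made-up sample, the FPS average barely registers the two slow frames, while the 99th percentile frame time doubles relative to the typical frame. That sensitivity to slow frames is why we lean on the latency-focused number alongside the average.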
One thing we didn't expect to see was Nvidia's Direct3D driver performing so much better than AMD's. We don't often test different GPU brands in CPU-constrained scenarios, but perhaps we should. Looks like Nvidia has done quite a bit of work polishing its D3D driver for low CPU overhead.
Of course, Nvidia has known for months, like the rest of us, that a Mantle-enabled version of BF4 was on the way. You can imagine that this game became a pretty important target of optimization for the company during that span. Looks like its work has paid off handsomely. Heck, on the 4770K, the GTX 780 Ti with D3D outperforms the R9 290X with Mantle. (For what it's worth, although frame times are very low generally for the 4770K/780 Ti setup, the BF4 data says it's still mainly CPU-limited.)
The "time spent beyond X" graphs are our indicator of "badness," of how long frame production times exceed several key thresholds. Those intermittent stuttering episodes with the early Mantle driver show up in the beyond-50-ms results for the A10-7850K, even though we didn't see a hiccup of this size in every run. Since we're showing the median result from three runs, the spike we plotted for the 4770K doesn't show up at all here. (There were no such spikes in the other two test sessions.)
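As a rough illustration of how a "time spent beyond X" figure can be computed, and why a spike from a single run can vanish from a median, consider the Python sketch below. It assumes the metric adds up only the portion of each frame time that exceeds the threshold, then takes the median across runs. The function names are hypothetical, and apart from the 295-ms spike mentioned above, the frame times are invented for the example.

```python
from statistics import median

def time_beyond(frame_times_ms, threshold_ms):
    """Total milliseconds spent past the threshold, counting only the
    excess portion of each frame that ran long."""
    return sum(t - threshold_ms for t in frame_times_ms if t > threshold_ms)

def median_time_beyond(runs, threshold_ms):
    """Median of the per-run totals across several test runs."""
    return median(time_beyond(run, threshold_ms) for run in runs)

# Three hypothetical runs; only the first contains a 295-ms hiccup.
runs = [
    [12.0, 15.0, 295.0],
    [12.0, 14.0, 18.0],
    [11.0, 13.0, 16.0],
]

for threshold in (50.0, 16.7):
    result = median_time_beyond(runs, threshold)
    print(f"median time beyond {threshold} ms: {result:.1f} ms")
```

With these made-up runs, the median beyond 50 ms is zero, because two of the three runs never cross that threshold. The lone spike only affects the median if a similar hiccup turns up in at least one other run.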
The big takeaway here comes from the "time spent beyond 16.7 ms" plot. You need to produce a frame every 16.7 milliseconds to achieve a smooth 60-FPS rate of animation. Mantle moves the A10-7850K much, much closer to that goal, even with that one big latency spike in the picture. If AMD can eliminate those hiccups, then slower CPUs like the 7850K should be capable of delivering a much smoother gaming experience than they can with Direct3D.