AMD’s Graphics Core Next architecture premiered around this time last year on the Radeon HD 7000 series. The architecture brought with it a slew of improvements, which we covered at length here. Before long, GCN had spread through much of AMD’s desktop graphics lineup, providing fierce opposition to the competition from Nvidia.
When AMD got around to updating its mobile offerings, though, it did something a little disappointing. The company decided to use its older TeraScale architecture to power mid-range and low-end members of the Radeon HD 7000M series, from the 7600M on down. Those offerings were built using TSMC’s 40-nm fab process, as well, instead of the finer 28-nm process used to fab newer, GCN-powered chips.
AMD did eventually introduce GCN-driven Radeon HD 7700M, 7800M, and 7900M series products to service the higher end of the mobile market. However, cheaper and thinner gaming notebooks have essentially been stuck with re-branded last-gen Radeons for the better part of a year.
Happily, that’s all about to change. AMD officially announced the first members of the Radeon HD 8000M series earlier this week, and those offerings are due out early next year. All of them are based on the GCN architecture and built using TSMC’s 28-nm fab process, and AMD promises some meaty performance gains over the 40-nm TeraScale parts they’re supposed to replace.
We’ve been able to get one of those newcomers, the Radeon HD 8790M, into our labs. As we reported a few days ago, AMD has rejiggered its mobile branding somewhat, so the Radeon HD 8700M series is meant to succeed the Radeon HD 7600M series. As luck would have it, we have representatives of both lineups on hand: a 7690M and an 8790M.
A cursory look at the picture above shows that the 8790M’s graphics chip, code-named Mars, is quite a bit smaller than Thames, the 40-nm slab of silicon that powers the 7690M. Mars measures about 77 mm² by our count, while Thames takes up 118 mm². That difference is at least partly due to the fact that Mars is built using the finer 28-nm process, so its transistors are smaller.
We’d love to compare transistor counts at this point, but unfortunately, AMD is refusing to disclose complete specifications for the Radeon HD 8000M series until January 7. What little we’ve been able to glean about our test samples is listed in the table below:
| GPU | Shader ALUs | Core clock | Memory clock | Memory interface | Memory | Fab process | Die size |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Radeon HD 7690M | 480 | 600 MHz | 800 MHz | 128-bit | 1GB GDDR5 | 40 nm | 118 mm² |
| Radeon HD 8790M | 384 | 900 MHz | 1000 MHz | 128-bit | 2GB GDDR5 | 28 nm | ~76.5 mm² |
The 8790M has fewer shader ALUs than its predecessor, but at least in our test sample, its clock speeds are higher—and it comes packed with more memory. Also, of course, the 8790M is based on AMD’s latest GPU architecture. Interestingly, AMD tells us it’s made no refinements to the GCN architecture in the Radeon HD 8000M series. No new features have been added, either, beyond those that were already present in proper, GCN-powered 7000M-series parts.
The new architecture and higher clocks bode well for performance, despite the 8790M’s lower ALU count. AMD’s internal benchmarks show an increase of 20-50% from the 7670M to the 8770M, and one would expect the 7690M to trail the 8790M by a similar margin. Since we have the latter two GPUs at our disposal, we’re going to put that assumption to the test.
A novel approach to mobile GPU testing
Traditionally, testing mobile GPUs involves using pre-assembled laptops. That can make comparisons with other offerings tricky, especially when you want to benchmark a previous-gen part that may only be available inside older systems with outdated CPUs.
With the help of AMD, we’ve tried a different approach this time. AMD supplied the aforementioned GPUs—the Radeon HD 8790M and 7690M—as bare MXM modules, each with its own dedicated cooler. For our test platform, AMD sent us an MXM to PCI Express x16 adapter alongside an off-the-shelf desktop processor, motherboard, and memory. Using this setup, we’re able to test mobile GPUs free from the confines of notebooks or proprietary qualification hardware.
What you see here is an Intel Core i7-3770K processor, a Gigabyte Z77X-UD3H motherboard, four gigs of AMD memory, AMD’s MXM adapter, and our two mobile Radeons. AMD also threw in a 500GB Seagate hard drive, but we swapped that for a Crucial solid-state drive from our own supply of test hardware. Benchmarking games is a lot quicker on an SSD.
Since we’re testing with a very fast desktop CPU, we’re able to ensure that the processor isn’t a primary performance bottleneck. That means our GPUs should be free to fulfill their performance potential. We follow this same approach when testing desktop graphics cards, and it has served us well. Of course, in this particular case, the processor we’re testing with is a fair bit quicker than what you’d find in even a top-of-the-line gaming notebook. Our scores may be a little higher as a result.
Benchmarking mobile GPUs with a desktop platform also rules out battery testing, for obvious reasons. However, elegant solutions to that problem are rare in any event. Every notebook is bound to be different—some will have smaller batteries, some will have larger displays, and others will couple gaming GPUs with slim enclosures and power-sipping CPUs. At least with our test setup, we can offer a look at the power consumption delta between new and old mobile Radeons, and that’s exactly what we’ve done.
Sadly, we’re still working on getting a competing mobile GPU from Nvidia. The benchmarks on the next few pages will show you how the new Radeon compares to the old one, but they won’t tell you how either stacks up against a rival GeForce. Our apologies. That said, knowing how the two generations compare—not to mention what kinds of games can be played, and at what settings—is very valuable information, especially given the dearth of mobile GPU benchmarks out there. We think we’re better off making this information available now and building upon it later, rather than waiting, possibly indefinitely, for the stars to align.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and we reported the median results. Our test systems were configured like so:
| Component | Details |
| --- | --- |
| Processor | Intel Core i7-3770K |
| North bridge | Intel Z77 Express |
| Memory size | 4GB (2 DIMMs) |
| Memory type | AMD Memory DDR3 SDRAM at 1600MHz |
| Chipset drivers | INF update 126.96.36.1991, Rapid Storage Technology 11.6 |
| Audio | Integrated Via audio with 6.0.01.10800 drivers |
| Hard drive | Crucial m4 256GB |
| Power supply | Corsair HX750W 750W |
| OS | Windows 8 Professional x64 Edition |

| GPU | Driver revision | GPU base clock (MHz) | Memory clock (MHz) | Memory size (MB) |
| --- | --- | --- | --- | --- |
| AMD Radeon HD 8790M | Catalyst 9.011 RC2 | 900 | 1000 | 2048 |
| AMD Radeon HD 7690M | Catalyst 9.011 RC2 | 600 | 800 | 1024 |
Thanks to Corsair and Crucial for helping to outfit our test rig—and AMD for supplying the test platform and mobile GPUs, as well.
AMD-specific optimizations for texture filtering and tessellation were disabled in the control panel. Other image quality settings were left at the control panel defaults. Vertical refresh sync (vsync) was disabled for all tests.
We used the following test applications:
Some further notes on our methods:
- We used the Fraps utility to record frame rates while playing a 90-second sequence from the game. Although capturing frame rates while playing isn’t precisely repeatable, we tried to make each run as similar as possible to all of the others. We tested each Fraps sequence five times per GPU in order to counteract any variability. We’ve included frame-by-frame results from Fraps for each game, and in those plots, you’re seeing the results from a single, representative pass through the test sequence.
We measured total system power consumption at the wall socket using a P3 Kill A Watt digital power meter. The monitor was plugged into a separate outlet, so its power draw was not part of our measurement. The GPUs were plugged into our MXM to PCI Express adapter, which was itself plugged into a motherboard on an open test bench.
The idle measurements were taken at the Windows desktop with the Aero theme enabled. The GPUs were tested under load running Skyrim at its Ultra quality preset.
The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
We tested Battlefield 3 by playing through the start of the Kaffarov mission, right after the player lands. Our 90-second runs involved walking through the woods and getting into a firefight with a group of hostiles, who fired and lobbed grenades at us.
We tested at 1440×900 using the game’s “Medium” detail preset.
We should preface the results below with a little primer on our testing methods. Along with measuring average frames per second, we delve inside the second to look at frame rendering times. Studying the time taken to render each frame gives us a better sense of playability, because it highlights issues like stuttering that can occur—and be felt by the player—within the span of one second. Charting frame times shows these issues clear as day, while charting average frames per second obscures them.
To get a sense of how frame times correspond to FPS rates, keep in mind that a frame time in milliseconds translates to an instantaneous rate of 1000 divided by that number: 50 ms corresponds to 20 FPS, 33.3 ms to 30 FPS, and 16.7 ms to 60 FPS.
We’re going to start by charting frame times over the totality of a representative run for each system. (That run is usually the middle one out of the five we ran for each GPU.) These plots should give us an at-a-glance impression of overall playability, warts and all.
Right off the bat, we can see the 8790M is doing quite a bit better than its predecessor. Both solutions exhibit occasional latency spikes, though.
We can slice and dice our raw frame-time data in other ways to show different facets of the performance picture. Let’s start with something we’re all familiar with: average frames per second. While this metric doesn’t account for irregularities in frame latencies, it does give us some sense of overall performance. We can also demarcate the threshold below which 99% of frames are rendered, which offers a sense of overall frame latency, excluding fringe cases. (The lower the threshold, the more fluid the game.)
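For readers curious about the arithmetic, here’s a minimal sketch of how these two summary statistics can be derived from a list of per-frame render times, such as those Fraps logs. The function names and sample numbers are our own inventions for illustration; this isn’t the actual analysis tooling.

```python
import math

def average_fps(frame_times_ms):
    # Total frames divided by total elapsed time, in seconds.
    return len(frame_times_ms) / (sum(frame_times_ms) / 1000.0)

def percentile_frame_time(frame_times_ms, pct=99):
    # The frame time at or below which pct% of frames were rendered.
    ordered = sorted(frame_times_ms)
    index = math.ceil(len(ordered) * pct / 100) - 1
    return ordered[index]

# Invented sample: a mostly smooth run with a few latency spikes.
times = [16.7] * 97 + [40.0, 45.0, 60.0]
print(round(average_fps(times), 1))      # overall frames per second
print(percentile_frame_time(times, 99))  # 99% of frames finish this fast or faster
```

Note that a high average FPS and a low 99th-percentile frame time measure different things: the first can look healthy even when the second reveals frequent spikes.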
We’re looking at a 58% increase in average frame rates and a 59% drop in 99th-percentile frame times. That’s a pretty impressive improvement from one generation to the next.
Now, the 99th-percentile result only captures a single point along the latency curve. We can show you that whole curve, as well. With single-GPU configs like these, the right-hand side of the graph—and especially the last 5% or so—is where you’ll want to look. That section tends to be where the best and worst solutions diverge.
Finally, we can rank solutions based on how long they spent working on frames that took longer than a certain number of milliseconds to render. Simply put, this metric is a measure of “badness.” It tells us about the scope of delays in frame delivery during the test scenario. Here, you can click the buttons below the graph to switch between different millisecond thresholds.
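To make the “badness” arithmetic concrete, here’s a minimal sketch of the time-beyond-threshold calculation as we understand it: for every frame slower than the cutoff, only the excess time past the cutoff is accumulated. The sample numbers are invented for illustration.

```python
def time_beyond(frame_times_ms, threshold_ms):
    # Sum only the portion of each frame that ran past the threshold.
    return sum(t - threshold_ms for t in frame_times_ms if t > threshold_ms)

# Invented sample run: five frames, two of them over 33.3 ms.
times = [20.0, 35.0, 60.0, 16.0, 50.0]
print(round(time_beyond(times, 33.3), 1))  # ms spent beyond the 33.3 ms (30 FPS) mark
print(time_beyond(times, 50.0))            # ms spent beyond the 50 ms (20 FPS) mark
```

Counting only the excess, rather than the whole frame time, keeps the metric focused on how badly delivery was delayed rather than on how many frames happened to be slow.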
The 8790M isn’t just faster on average. It also spends a lot less time working on high-latency frames, which makes for more fluid, stutter-free animations and smoother gameplay.
For this test, I shamelessly stole Scott’s Borderlands 2 character and aped the gameplay session he used to benchmark the Radeon HD 7950 and GeForce GTX 660 Ti earlier this month. The session takes place at the start of the “Opportunity” level. As Scott noted, this section isn’t precisely repeatable, because enemies don’t always spawn in the same spots or attack in the same way. We tested five times per GPU and tried to keep to the same path through the level, however, which should help compensate for variability.
Here, too, we tested at 1440×900. All other graphics settings were maxed out except for hardware-accelerated PhysX, which isn’t supported on these Radeons.
Yikes. Neither GPU does a good job of keeping frame times consistently low, as evidenced by the large number of spikes on both plots.
The 8790M does pull off much better average frame rates and 99th-percentile frame times, though. Then again, that 54.3-ms frame time works out to an 18.4 FPS frame rate, which is hardly anything to brag about.
According to our percentile graph, things start to go awry around the 95th percentile on both GPUs. So, about 5% of frames take significantly longer to render than the rest. That roughly tracks with what Scott measured in Borderlands 2 with the Radeon HD 7950. (This time, though, we were using newer drivers supplied by AMD.)
Both GPUs spend a fair bit of time working on frames that take more than 33.3 ms or 16.7 ms to render. However, the 8790M is obviously faster and doesn’t linger too long on frames that take more than 50 ms or so. Our seat-of-the-pants impression echoes this result. Borderlands 2 is responsive and very much playable on the 8790M, even if the latency spikes damage the illusion of motion to some degree, making animations appear not completely fluid.
Far Cry 3
This is our first time testing Far Cry 3. I picked one of the first assassination missions, shortly after the dramatic intro sequence and the oddly sudden transition from on-rails action shooter to open-world RPG with guns.
The game was run at 1440×900 with MSAA disabled and the “Medium” graphics quality preset selected.
Far Cry 3 is a very graphically intensive game, and at the image quality settings we’ve chosen, both cards struggle intermittently with delivering smooth gameplay. The Y axis on the plot has been stretched to show the 7690M’s lone spike to around 180 ms a third of the way into the run. (Yes, it happens every time.) However, you can see the 8790M suffers from relatively frequent spikes over 40 ms, especially toward the beginning of the run.
At the start of the run, those spikes manifest as a kind of jumpiness or jerkiness in the animation. Oddly enough, the effect is worse than in Borderlands 2, even though the spikes tend to be shorter.
The 8790M’s 99th-percentile frame time is lower here than in Borderlands 2, at least. 39.8 ms works out to around 25 FPS.
Our latency plot shows the 8790M’s frame latencies are fine through about 97% of the run. Only the last 3% of frames seem to pose a problem. Because the 8790M is so much faster, though, even its worst frame times aren’t too much higher than the 7690M’s are generally.
The 8790M spends very little time working on frames that take more than 50 ms or 33.3 ms to render. That reflects what we see in the frame-time plot, where the 8790M’s latency spikes are generally short and clearly become infrequent after the first 500 frames or so. There are only two exceptions where unusually high spikes disrupt gameplay.
The 7690M only sees one exceptional spike, but because the GPU is relatively slow, it nevertheless spends over two thirds of the run working on frames with latencies above 33.3 ms. Far Cry 3 feels sluggish and choppy on that GPU.
Here, too, I pilfered Scott’s saved game and attempted to replicate his gameplay session, which involves walking through a crowd in the Chinatown level. As you can see in the video, the crowd is very dense, and there are plenty of fancy shader and tessellation effects in play.
Testing was conducted at 1440×900 with MSAA disabled and using the “High” quality preset.
Yeah, so, this isn’t really much of a contest. The 7690M performs abysmally here, with huge and extremely frequent latency spikes. Even the average frame rate alone—just 22 FPS—shows how poorly the last-gen part handles itself.
The 8790M fares much better at these settings, with nearly double the average frame rate and half the 99th-percentile frame time. Admittedly, 53.2 ms is still rather high (it works out to about 18.8 FPS), and there are more high-latency frames than in Far Cry 3. However, the 8790M is fine through about 96% of the run, and subjectively, it feels smooth overall.
I haven’t had a chance to get very far into Sleeping Dogs myself, but TR’s Geoff Gasior did, and he got hooked. From the small glimpse I’ve received of the game’s open-world environment and martial-arts-style combat, I think I can see why.
The game’s version of Hong Kong seems to be its most demanding area from a performance standpoint, so that’s what we benchmarked. We took Wei Shen on a motorcycle joyride through the city, trying our best to remember we were supposed to ride on the left side of the street.
We benchmarked Sleeping Dogs at 1440×900 using a tweaked version of the “High” quality preset, where we disabled vsync and knocked both antialiasing and SSAO down to “Normal.” We had the high-resolution texture pack installed, too.
Along with Battlefield 3, Sleeping Dogs is perhaps one of the best-behaved games on the Radeon HD 8790M. The 99th-percentile frame time is nice and low, and latency spikes are both small and infrequent. The game looks and plays great.
On the 7690M, it’s another story. I suspect that GPU’s smaller memory capacity is a hindrance here, since we’re using the high-resolution texture pack. Either way, the game hangs in a very disruptive fashion every few seconds, which makes driving through the busy streets of Hong Kong a dangerous and scary experience. More than once, I had to restart a test run after veering off course when the game skipped.
Over the last few pages, we’ve seen that the Radeon HD 8790M is much quicker than the 7690M. Now, we can see that performance increase doesn’t come with substantially higher power consumption; the 8790M draws only 2W more under load. Not only that, but it draws less power than the 7690M at idle. Thanks to AMD’s ZeroCore Power feature, which is exclusive to 28-nm, GCN-powered parts, power utilization falls even lower when the display goes to sleep.
Note that these numbers show power draw for the whole system, including the Core i7-3770K, which has a 77W power envelope. Total power use on a notebook would probably be much lower.
We would have included noise and temperature numbers, but the MXM GPU modules AMD sent us have very different cooling solutions, neither of which you can expect to see in actual notebooks. Noise and temperature measurements for these samples would be pointless at best and misleading at worst.
You know the saying: better late than never.
I think that applies to the Radeon HD 8000M series. Mid-range and low-end gaming notebooks have been saddled with 40-nm GPUs based on AMD’s old TeraScale architecture for almost a year. The 8000M series finally rights that wrong by bringing 28-nm, GCN goodness to lower price tiers and power envelopes. AMD hasn’t broken new ground here; it’s simply made a year-old architecture available to more folks.
As our benchmarks have shown, the wait has been worth it. The Radeon HD 8790M beats the pants off its predecessor, and it does so while consuming less power at idle and only marginally more under load.
Before we sign off, we should remind readers that clock speeds and memory configurations will vary in the wild. So, not all of the 8790M or 7690M GPUs you find out there will be equivalent to those we tested. Some will feature slower DDR3 memory and may have lower core clock speeds, as well.