At the same time, it appears those of us still paying attention to, you know, the most powerful consumer computing platform are living in some sort of a magical future-land where our most persistent gripes have been replaced by difficult choices between multiple amazing options. Want a quiet case? Easily done. Want a case capable of housing powerful hardware? Easily had, as well. Want a case that’s both at once? Also readily available. Need a power supply? This one’s modular and makes zero noise at idle. Lousy keyboard got you down? Here, take your pick from ten different mechanical offerings with four different switch types. This one will massage your fingertips as you type.
Decent computer parts can still be had for cheap, but if you want to pay more in order to get something that’s higher quality, the choices now are better than ever. Component makers are increasingly catering to the desires of PC hobbyists, and frankly, we could get used to it. Already are used to it, really, except when we’re occasionally surprised by another nifty example of the trend.
We were a little startled recently when we received a package containing nothing but the prybar you see above. No, as far as we know, this isn’t a new motion controller for Left 4 Dead, although that would be awesome. Instead, it’s just a big, metal implement. We weren’t sure what to make of it until several days later, when the following arrived:
Without the prybar, I would have surely chipped a tooth trying to get that crate open, so thank goodness.
Inside was the subject of our attention today, the GeForce GTX 690. Yes, this is the new uber-high-end graphics card from Nvidia that packs two of the GK104 graphics processors found in the GeForce GTX 680. You can probably tell from the picture above that the GTX 690 looks a little different from your typical graphics card. Here are a couple more shots, closer up, to give you a better sense of things.
Yes, the GTX 690 looks distinctive. What may not be obvious from the pictures is that the card’s sleek lines and metallic color palette are not a plasticky imitation of something better, as one might expect given the history here. Instead, this premium graphics card is built with premium materials. The silver-colored portions of the cooling shroud are chromium-plated aluminum, and the black piece between them is made of magnesium. Beneath the (still plastic) windows on either side of the cooling fan, you can see the nickel-plated fins of the card’s dual heatsinks. Oh, and there’s a bit of a light show, too, since the green “GeForce GTX” lettering is LED-illuminated.
Touch a fingertip to the cool, solid surfaces of the GTX 690, and you can feel the extra expense involved. Use that fingertip to give the cooling fan a spin, and you’ll feel it there, too, in a motion that’s perceptibly smoother than most. No expense was spared on the materials for this card, and it shows in little ways that, we’ll admit, not everyone will appreciate. We can’t help but like it, though. In terms of look and feel, if the GTX 690 has a rival among current video cards, it may be XFX’s aluminum-shrouded Radeons. But you’ll need two of those in order to approach the GTX 690’s performance.
Nvidia tells us it has invested heavily in tuning the acoustics of the GTX 690’s cooler, as well. Beyond the fancy fan mechanism, the base plate beneath the fan features ducted air channels. Mounted on the board are very low profile capacitors, intended to reduce turbulence in the air flowing across the heatsinks. Time constraints have kept us from disassembling our GTX 690 yet, but below are a couple of stock pictures of the areas in question. As you can see, the GTX 690’s cooler is designed to send air flowing in two directions: half toward the back of the card and outside of the case, and half toward the front of the card, into the PC enclosure.
All told, Nvidia expects the GTX 690’s cooler to be very quiet for what it is—quieter even than some of the firm’s single-GPU cards. We’ll test that claim shortly, of course.
Specs like EyeMasters
| | Base clock (MHz) | Boost clock (MHz) | ALUs | Texture units | ROPs | Memory transfer rate | Memory interface width | Max power |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GeForce GTX 680 | 1006 | 1058 | 1536 | 128 | 32 | 6 GT/s | 256-bit | 195W |
| GeForce GTX 690 | 915 | 1019 | 3072 | 256 | 64 | 6 GT/s | 2 x 256-bit | 300W |
The GeForce GTX 690’s specifications are eye-popping, which is mostly what you’d expect from an SLI-on-a-stick graphics card. All of the GK104’s units are enabled, so many of the key rates are twice the GTX 680’s. Since the GTX 690 is a dual-GPU pairing, of course, the peak graphics rates shown in the table above are somewhat less connected to reality than usual. Applications may or may not take advantage of all of that power depending on many things, some of which we’ll discuss shortly.
The GTX 690 does have loads of bandwidth on tap, though. Between the two GPUs is a PCI Express bridge chip supplied by PLX; it has 48 lanes of PCIe Gen3 connectivity, 16 lanes routed to each GPU and 16 lanes connected to the host system.
Although the prior-generation GeForce GTX 590 performed more like a couple of down-spec GTX 570s, Nvidia has been able to reach relatively higher with this new card. Some of the credit goes to the Kepler generation’s new GPU Boost dynamic voltage and frequency scaling feature, which raises clock speeds to take advantage of any available thermal headroom. The GTX 690’s “base” clock is lower than the GTX 680’s by quite a bit, but the 690 has more range built into it. The 690’s “boost” clock of 1019MHz isn’t far from the GTX 680’s boost clock of 1058MHz. If the workload and the ambient conditions allow enough headroom, the GTX 690 should operate at something close to its boost clock rate—sometimes at even higher frequencies than that. As a result, Nvidia expects the GTX 690 to perform very similarly to a pair of GTX 680s in SLI.
That’s pretty impressive in the grand scheme, since the GTX 680 can claim to be the world’s fastest GPU. However, the GK104 graphics processor isn’t exactly a heavyweight, by most other standards; it’s a mid-sized chip with a modest power envelope and a 256-bit memory interface. In many ways, the GK104 is more like the GF114 chip that powers the GeForce GTX 560 Ti than it is like the GF110 chip that powers the GTX 580. That lineage is probably why the GTX 690 has “only” 2GB of memory per GPU—4GB in total, but effectively 2GB for all intents and purposes. (The Radeon HD 7970, by contrast, has 3GB for a single GPU.) The GK104’s middleweight status was no doubt helpful when Nvidia was attempting to cram two GPUs onto a single card in a reasonable power envelope. In fact, the GTX 690’s max power rating of 300W is 65W lower than the GTX 590’s.
The GeForce GTX 590 (left) versus the GTX 690 (right)
In many other respects, the GTX 690 mirrors its predecessor. The board is 11″ long, occupies two expansion slots, and requires a pair of 8-pin aux power inputs. The display outputs include a trio of dual-link DVI ports and a single mini-DisplayPort connector. The GTX 590 is just very, uh, plasticky by comparison.
| | Peak pixel fill rate (Gpixels/s) | Peak bilinear filtering int8/fp16 (Gtexels/s) | Peak shader arithmetic (TFLOPS) | Peak rasterization rate (Gtris/s) | Memory bandwidth (GB/s) |
| --- | --- | --- | --- | --- | --- |
| GeForce GTX 580 | 37 | 49/49 | 1.6 | 3.1 | 192 |
| GeForce GTX 680 | 34 | 135/135 | 3.3 | 4.2 | 192 |
| GeForce GTX 590 | 58 | 78/78 | 2.5 | 4.9 | 328 |
| GeForce GTX 690 | 65 | 261/261 | 6.5 | 8.2 | 384 |
| Radeon HD 7970 | 30 | 118/59 | 3.8 | 1.9 | 264 |
| Radeon HD 6990 | 53 | 159/80 | 5.1 | 3.3 | 320 |
| Radeon HD 6990 AUSUM | 56 | 169/85 | 5.4 | 3.5 | 320 |
The GTX 590 is also quite a bit slower than the 690, in theory, as is nearly every other video card out there. Only the Radeon HD 6990 with its special “AUSUM” overclocking switch thrown really competes on any of the key rates, like memory bandwidth and texture filtering—and in other important respects, it’s not even close. We’re not likely to see a true competitor for the GTX 690 until AMD takes the wraps off of its dual-Tahiti card. Frankly, we’re kind of surprised to have made it this far into 2012 without seeing AMD’s dual-GPU entry for this generation, since they’ve been talking about it for some time. Until that product arrives, the GTX 690 is pretty much in a class by itself.
That brings us, inevitably, to the question of price. Given the GTX 690’s premium materials and performance, Nvidia has decided to slap a price tag on this puppy that reads: $999.99, one penny short of a grand. I believe that makes the GTX 690 the most expensive consumer graphics card ever. The one-grand sticker essentially doubles the GTX 680’s list price, so it makes a sort of sense. Still, you’d kind of hope for some sort of volume discount when buying two GPUs together, wouldn’t you?
I dunno. I’m not sure the folks who would pony up for this sort of card will care that much.
One thing that this, er, formidable price tag could do is keep demand in line with the limited supply of these cards. Most folks are keenly aware that the supply of GK104 chips is rather tight right now, since the GTX 680 is tough to find in stock anywhere. Furthermore, the dual-GPU cards of the last generation, the Radeon HD 6990 and the GeForce GTX 590, seem to have been in short supply throughout their model runs. We expect the GTX 690 to reach online store shelves this week, but we have few illusions about them being plentiful, at least initially.
A testing conundrum
As you might recall, we’ve been skeptical about the merits of multi-GPU solutions like the GeForce GTX 690 since we published this article last fall. That piece introduced some new ways to think about gaming performance, and the methods we proposed immediately highlighted some problems with SLI and CrossFire.
Multi-GPU schemes generally divide the work by asking a pair of GPUs to render frames in alternating fashion—frame 1 to GPU 0, frame 2 to GPU 1, frame 3 to GPU 0, and so on. The trouble is, the two GPUs aren’t always in sync with one another. Instead of producing a series of relatively consistent frame delivery times, a pair of GPUs using alternate frame rendering will sometimes oscillate between low-latency frames and high-latency frames.
To illustrate, we can zoom in on a very small chunk of one of our test runs for this review. First, here’s how the frame times look on a single-GPU solution:
Although frame times vary slightly on the single-GPU setup, the differences are pretty small during this short window of time. Meanwhile, look what happens on a CrossFire setup using two of the same GPU:
You can see that alternating pattern, with a short frame time followed by a long one. That’s micro-stuttering, and it’s a potentially serious performance issue. If you were simply to measure this solution’s performance in average frames per second, of course, it would look pretty good. Lots of frames are being produced. However, our sense is that the smoothness of the game’s animation will be limited by those longer frame times. In this short window, adding a second GPU appears to reduce long-latency frames from about 29 ms to about 23 ms. Although the FPS average might be nearly doubled by the presence of all of those low-latency frames, the real, perceived impact of adding a second card would be much less than a doubling of performance.
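To put numbers on that intuition, here's a small Python sketch using hypothetical frame times loosely modeled on the plots above (the specific values are ours, chosen for illustration only):

```python
# Hypothetical frame times in milliseconds: a single GPU rendering steadily
# at ~29 ms, and an alternate-frame-rendering pair oscillating between
# ~8 ms and ~23 ms frames.
single_gpu = [29.0] * 60
afr_pair = [8.0, 23.0] * 30

def fps_average(frame_times_ms):
    # Frames delivered divided by total seconds elapsed.
    return len(frame_times_ms) / (sum(frame_times_ms) / 1000.0)

print(round(fps_average(single_gpu), 1))  # 34.5 FPS
print(round(fps_average(afr_pair), 1))    # 64.5 FPS -- nearly doubled

# But animation smoothness is bounded by the long frames in the pattern:
worst = max(afr_pair)
print(worst, round(1000.0 / worst, 1))    # 23.0 ms, an effective ~43.5 FPS
```

The FPS average nearly doubles, yet the slow half of the oscillation still arrives at only a ~43 FPS cadence, which is closer to what the eye actually sees.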
This problem affects both SLI and CrossFire, including multi-GPU graphics cards like the GTX 690. How much micro-stuttering you find can vary from one moment to the next. In this example, we can see a little bit of jitter from the GTX 690, but it’s fairly minimal.
However, it appears that the degree of jitter tends to grow as multi-GPU solutions become more performance-constrained. That’s bad news in our example for the older dual-GPU graphics cards:
Ouch. If this trend holds up, the more you need higher performance from a multi-GPU solution, the less likely it is to deliver. Kind of calls the value proposition into question, eh?
Things get even trickier from here, for several reasons. Both AMD and Nvidia acknowledged the multi-GPU micro-stuttering problem when we asked them about it, but Nvidia’s Tom Petersen threw us for a loop by asserting that Nvidia’s GPUs have had, since “at least” the G80, a built-in provision called frame metering that attempts to counteract the problem.
The diagram above shows the frame rendering pipeline, from the game engine through to the display. Frame metering attempts to smooth out the delivery of frames by monitoring frame times and, as necessary, adding a slight delay between a couple of points on the timeline above, T_render and T_display. In other words, the GPU may try to dampen the oscillating pattern characteristic of micro-stuttering by delaying the display of completed frames that come “early” in the sequence.
We think frame metering could work, in theory, with a couple of caveats. One obvious trade-off is the slight increase in input lag caused by delaying roughly half of the frames being rendered, although the impact of that should be relatively tiny. The other problem is the actual content of the delayed frames, which is timing-dependent. The question here is how a game engine decides what time is “now.” When it dispatches a frame, the game engine will create the content of that image—the underlying geometry and such—based on its sense of time in the game world. If the game engine simply uses the present time, then delaying every other frame via metering will cause visual discontinuities, resulting in animation that is less smooth than it should be. However, Petersen tells us some game engines use a moving average of the last several frame times in order to determine the “current” time for each frame. If so, then it’s possible frame metering at the other end of the graphics pipeline could work well.
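To make the idea concrete, here's our own rough sketch of how such a metering scheme might behave. To be clear, this illustrates the concept only; Nvidia hasn't disclosed the details of its actual implementation:

```python
def meter_display_times(frame_times_ms):
    """Hypothetical frame metering: each finished frame is displayed no
    earlier than one 'average interval' after the previous frame, where
    the interval is a moving average of recent frame times. Early frames
    get delayed slightly; late frames are shown as soon as they're done."""
    display_times = []
    completed = 0.0   # running completion time of each rendered frame
    last_shown = 0.0
    recent = []
    for t in frame_times_ms:
        completed += t
        recent.append(t)
        target_interval = sum(recent[-4:]) / len(recent[-4:])
        shown = max(completed, last_shown + target_interval)
        display_times.append(shown)
        last_shown = shown
    return display_times

# An oscillating 8/23 ms pattern settles toward an even ~15.5 ms cadence:
times = meter_display_times([8.0, 23.0] * 5)
intervals = [round(b - a, 1) for a, b in zip(times, times[1:])]
print(intervals)  # [23.0, 13.0, 18.0, 15.5, 15.5, 15.5, 15.5, 15.5, 15.5]
```

Note the trade-offs discussed above are visible here: the early frames are held back (added input lag), and the smoothing only works if the game engine's timing assumptions cooperate.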
A further complication: we can’t yet measure the impact of frame metering—or, really, of any multi-GPU solution—with any precision. The tool we use to capture our performance data, Fraps, writes a timestamp for each frame at a relatively early point in the pipeline, when the game hands off a frame to the Direct3D software layer (T_ready in the diagram above). A huge portion of the work, both in software and on the GPU, happens after that point.
We’re comfortable with using Fraps for single-GPU solutions because it captures frame times at a fixed point in what is essentially a feedback loop. When one frame is dispatched, the system continues through the process and moves on to the next, stopping at the same point in the loop each time to record a timestamp.
That feedback loop loses its integrity when two GPUs handle the work in alternating fashion, and things become particularly tricky with other potential delays in play. Fraps has no way of knowing when a buffer flip has happened at the other end of the pipeline, especially if there’s a variable metering wait involved—so frame delivery could be much smoother in reality than it looks in our Fraps data. By the same token, multi-GPU schemes tend to have some additional latency built into them. With alternate frame rendering, for instance, a frame completed on the secondary GPU must be transferred to the primary GPU before it can be displayed. As a result, it’s possible that the disparity between frame display times could be much worse than our Fraps data show, as well.
So, what to do if you’re us, and you have a multi-GPU video card to review? The best we can say for our Fraps data is that we believe it’s accurate for what it measures, the point when the game engine presents a frame to Direct3D, and that we believe the frame times it captures are at least loosely correlated to the actual display times at the other end of the pipeline. We can also say with confidence that any analysis of multi-GPU performance based solely on FPS averages is at least as wrong as what we’re about to show you. We had hoped to have some new tools at our disposal for this article, including a high-speed camera we ordered, but the camera didn’t arrive in time for this review, unfortunately. We will have to follow up with it at a later date. For now, we’ll have to march ahead with some big, hairy caveats attached to all of our performance results. Please keep those caveats in mind as you read the following pages.
In order to take full advantage of high-end graphics cards these days, you’ve got to ask a lot of ’em. That’s why we decided to conduct our testing for this review with a trio of monitors, all Dell U2410s, each with a display resolution of 1920×1200.
Together, they have a collective resolution of about six megapixels, roughly 50% more pixels than the 30″ monitor we usually use for GPU testing. The increased resolution and complexity made it fairly easy to push the limits of these multi-GPU setups. We even had to go easy on the image quality settings in some cases to maintain playable frame rates.
Most of our multi-GPU pairings were built from cards we’ve tested before, but our GTX 680 team had one brand-new member: Zotac’s GeForce GTX 680 AMP!, a product just announced today. Obviously, that’s not a stock cooler, but it is very swanky. This is an AMP! edition, so its default clock speeds are quite a bit higher than a stock GTX 680’s. The base and boost frequencies are 1111MHz and 1176MHz, well above the stock 1006/1058MHz speeds. Even more impressively, perhaps, the Zotac card’s memory speed is 1652MHz, up from 1502MHz stock. We suspect memory bandwidth may be an important performance limiter on the GTX 680, so the higher RAM speeds are noteworthy. Zotac is asking $549 for this card, 50 bucks above the stock GTX 680’s list price.
For the purposes of this review, we committed the heinous crime of dialing back the Zotac GTX 680 card’s base and memory clock speeds to match the other card in the SLI pairing, which was a standard-issue GTX 680. We’re worried about GPUs being out of sync, after all, and we didn’t want to make matters worse with a mismatch. (We did the same with the XFX Radeon HD 7970, bringing it back to stock clocks to match the other card.) The thing is, the utilities we had on hand wouldn’t let us straightforwardly control the Zotac card’s boost clock, so perfect symmetry eluded us.
With the GTX 680, that is kind of the way of things, though. Nvidia expects slightly variant performance from every GTX 680 card thanks to GPU Boost, which will adjust to the particulars of a card’s thermals, the individual chip’s properties, and such. Two GTX 680s in SLI aren’t likely to run at exactly the same speed, since the thermal conditions at one spot in a system will vary from those at another. Nvidia anticipates that the frame metering capabilities in the GK104 will keep frame delivery consistent, regardless.
Oh, and please note that we tested the Radeon HD 6990 with its “AUSUM” switch enabled, raising its clock speed and PowerTune limits. We saw no reason not to test it in that configuration, given what it is.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and we’ve reported the median result.
Our test systems were configured like so:
| Memory size | 16GB (4 DIMMs) |
| Memory type | DDR3 SDRAM at 1600MHz |
| Chipset drivers | INF update, Rapid Storage Technology Enterprise 188.8.131.5220 |
| Audio | Integrated, with Realtek 184.108.40.20626 drivers |
| Hard drive | F240 240GB SATA |
| OS | Windows 7 Ultimate x64 Edition, Service Pack 1, DirectX 11 June 2010 Update |
| Graphics | GeForce GTX 680 + Zotac GTX 680, with GeForce GTX drivers |
| | Radeon HD 6990 AUSUM, with Catalyst 12.4 + 12.3 CAP 1 drivers |
| | Radeon HD 7970, with Catalyst 12.4 + 12.3 CAP 1 drivers |
| | Radeon HD 7970 + XFX HD 7970, with Catalyst 12.4 + 12.3 CAP 1 drivers |
Thanks to Intel, Corsair, and Gigabyte for helping to outfit our test rigs with some of the finest hardware available. AMD, Nvidia, and the makers of the various products supplied the graphics cards for testing, as well.
Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults. Vertical refresh sync (vsync) was disabled for all tests.
We used the following test applications:
Some further notes on our methods:
- We used the Fraps utility to record frame rates while playing a 90-second sequence from the game. Although capturing frame rates while playing isn’t precisely repeatable, we tried to make each run as similar as possible to all of the others. We tested each Fraps sequence five times per video card in order to counteract any variability. We’ve included frame-by-frame results from Fraps for each game, and in those plots, you’re seeing the results from a single, representative pass through the test sequence.
We measured total system power consumption at the wall socket using a Yokogawa WT210 digital power meter. The monitor was plugged into a separate outlet, so its power draw was not part of our measurement. The cards were plugged into a motherboard on an open test bench.
The idle measurements were taken at the Windows desktop with the Aero theme enabled. The cards were tested under load running Skyrim at its Ultra quality settings with FXAA enabled.
We measured noise levels on our test system, sitting on an open test bench, using an Extech 407738 digital sound level meter. The meter was mounted on a tripod approximately 10″ from the test system at a height even with the top of the video card.
You can think of these noise level measurements much like our system power consumption tests, because the entire systems’ noise levels were measured. Of course, noise levels will vary greatly in the real world along with the acoustic properties of the PC enclosure used, whether the enclosure provides adequate cooling to avoid a card’s highest fan speeds, placement of the enclosure in the room, and a whole range of other variables. These results should give a reasonably good picture of comparative fan noise, though.
- We used GPU-Z to log GPU temperatures during our load testing.
The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
The Elder Scrolls V: Skyrim
Our test run for Skyrim was a lap around the town of Whiterun, starting up high at the castle entrance, descending down the stairs into the main part of town, and then doing a figure-eight around the main drag.
Since these are pretty capable graphics cards, we set the game to its “Ultra” presets, which turns on 4X multisampled antialiasing. We then layered on FXAA post-process anti-aliasing, as well, for the best possible image quality without editing an .ini file. We also had the high-res texture pack installed, of course. Although it’s not pictured above, the total display resolution was 5760×1200.
These first three plots show the raw data from a single test run, the rendering times for each individual frame, shown in milliseconds. Notice that because we’re thinking in terms of frame latency, lower numbers are better. For reference, we’ve included a table on the right showing the conversions from frame times to FPS.
As you can see, the GTX 690 performs essentially identically to two GTX 680s in SLI. Throughout the test run, the GTX 690’s frame latencies remain below 22 milliseconds or so, and they’re often under the magical 16.7 millisecond mark that, if it’s steady, translates into 60Hz or 60 FPS. Some of the other cards don’t fare as well, especially the Radeon HD 6990, whose spike-riddled plot reveals frame times that are often rather high. The GTX 590’s plot looks more like a cloud than a line, suggesting that it has some jitter going on, as well.
Just looking at the FPS average, the GTX 690 ties the GTX 680 SLI team, well ahead of anything else. Two Radeon HD 7970s in CrossFire, surprisingly enough, aren’t any faster than a GeForce GTX 590.
Of course, we’ve established that FPS averages don’t tell the whole story. We can get a better sense of the overall frame latency picture by looking at the 99th percentile frame time. Simply put, for each card, this number means that 99% of all frames were rendered in x milliseconds or less. Since we’re looking at a point where the vast majority of frames have been completed, the effects of any micro-stuttering oscillations will be reflected in this result.
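In code, the computation might look like this simple Python sketch, using made-up frame times:

```python
import math

def percentile_frame_time(frame_times_ms, pct=99.0):
    # "99% of all frames were rendered in x ms or less": sort the frame
    # times and take the smallest value covering pct% of the frames.
    ordered = sorted(frame_times_ms)
    index = max(0, math.ceil(pct / 100.0 * len(ordered)) - 1)
    return ordered[index]

# Hypothetical run: 98 quick frames plus a couple of micro-stutter spikes.
times = [15.0] * 98 + [40.0, 45.0]
print(percentile_frame_time(times))       # 40.0 ms -- the tail, not the mean
print(round(sum(times) / len(times), 2))  # the mean is only 15.55 ms
```

Notice how the spikes dominate the 99th-percentile figure even though they barely move the average, which is exactly why this metric catches micro-stuttering that an FPS average hides.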
Switching to this more latency-centered indicator does some interesting things for us. First and foremost, it brings the GTX 690 and the GTX 680 SLI back to the pack. Those two are only a couple of milliseconds ahead of a single GTX 680 in this measurement. Oddly enough, the Radeon HD 7970 CrossFire config looks to have higher latencies at the 99th percentile than a single 7970 card does. Worst of all for AMD, the Radeon HD 6990 looks like a basket case. Going by FPS alone, the 6990 would appear to be just a few ticks behind the 7970. A look at the latency picture reveals the gulf between the 6990 and everything else.
Then again, 99th percentile frame times are just one point along a whole latency curve, and we can show you how that curve looks.
With multi-GPU products in the mix, these latency curves are more interesting than ever. You can see that the GTX 690 and the 680 SLI config are evenly matched throughout the test run, with no real weaknesses. Both solutions deliver frames quickly throughout, although their frame latencies rise, nearly to meet the single GTX 680’s, in the last 5% of frames.
The curve for the Radeon HD 7970 CrossFire setup tells quite the story. Although the dual 7970s deliver half of their frames much more quickly than a single card, their frame times rise at a sharper angle beyond 50%, eventually crossing over at around 82-83%. For the last 17% or so of frames delivered, the single Radeon HD 7970 is quicker. We’re likely seeing two halves of a multi-GPU jitter pattern illustrated in the 7970 CrossFire’s more rapidly ascending curve, and in the final analysis, the single 7970 may be the better of the two solutions.
We can also quantify “badness,” the slowdowns and delays one encounters while playing a game, by looking at the amount of time spent rendering frames above a certain threshold. The theory here is that the more time spent on long-latency frames, the more interruption you’re likely to perceive while playing a game.
We’ve chosen several noteworthy thresholds. The first, 50 milliseconds, equates to 20 FPS. We figure if the frame rate drops below 20 FPS for any length of time, most folks are likely to perceive a slowdown. The next two, 33.3 ms and 16.7 ms, equate to 30 and 60 FPS, respectively, which are traditionally important performance thresholds for gamers. Our three thresholds also equate to 60Hz, 30Hz, and 20Hz, the first three quantization points for a 60Hz display with vsync enabled. If you go beyond any of these points, you’ll be waiting at least one more vertical refresh interval before updating the screen.
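Our "badness" accounting works roughly like the following Python sketch (the frame times are hypothetical, and the exact bookkeeping in our charts may differ in detail):

```python
def time_beyond(frame_times_ms, threshold_ms):
    # Sum, across all frames longer than the threshold, the portion of
    # each frame's rendering time that falls past the threshold.
    return sum(t - threshold_ms for t in frame_times_ms if t > threshold_ms)

# Hypothetical frame times (ms): two slowdowns in an otherwise smooth run.
times = [14.0, 60.0, 15.0, 55.0, 14.0]
print(time_beyond(times, 50.0))            # 15.0 ms beyond the 50-ms mark
print(round(time_beyond(times, 33.3), 1))  # 48.4 ms beyond 33.3 ms
print(round(time_beyond(times, 16.7), 1))  # 81.6 ms beyond 16.7 ms
```

The lower the threshold, the pickier the test: a card can score zero at 50 ms yet rack up plenty of time beyond 16.7 ms.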
As you might have expected, only the Radeon HD 6990 suffers any really substantial slowdowns, and even it doesn’t waste too much time working on frames above 50 milliseconds.
When we ratchet the threshold down to 16.7 ms, the GTX 690 and 680 SLI really separate themselves from the pack. A single GTX 680 card spends about three times as long as a GTX 690 or two GTX 680s in SLI above 16.7 milliseconds—so in a really picky way, the GTX 690 is measurably better at minimizing wait times for those worst-case frames.
Notably, the Radeon HD 7970 single-card and CrossFire configs are essentially tied here. Adding a second Radeon doesn’t appear to help at all in the most difficult cases.
Batman: Arkham City
We did a little Batman-style free running through the rooftops of Gotham for this one.
All of the plots for this game show lots of spikes, or occasional long frame times, something we’re used to seeing from Arkham City—doesn’t seem to matter much whether you’re dealing with one GPU or two.
The AMD and Nvidia offerings are much more competitive with one another here than they were in Skyrim, at least in the FPS sweeps.
Switching to the 99th percentile frame times whittles away the gap between the multi-GPU solutions and their single-chip equivalents. The GTX 690 still looks good here, but it’s no quicker than a pair of 7970s.
Although all of the solutions had spiky lines in the initial plots of frame times, the latency curve illustrates how two of the solutions, the single Radeon HD 7970 and the Radeon HD 6990, produce a higher proportion of long-latency frames than everything else. Again, the 6990’s sharper rise from the halfway point to about 88% suggests some longer frame times as part of a micro-stuttering pattern. However, the single 7970 struggles mightily with that last 7% of frames, all of which take longer than 60 milliseconds to render. Interestingly, in this case, adding a second card for CrossFire essentially eliminates those struggles.
The GTX 690 again looks excellent throughout, even surpassing the 7970 CrossFire config.
In this test scenario, with either Radeons or GeForces, you can substantially reduce slowdowns by adding a second GPU. That’s true for the GTX 680/690, and it’s true for the Radeon HD 7970, as well. The multi-GPU options look pretty darned good in that light.
Battlefield 3

We tested Battlefield 3 with all of its DX11 goodness cranked up, including the “Ultra” quality settings with both 4X MSAA and the high-quality version of the post-process FXAA. Our test was conducted in the “Operation Guillotine” level, for 60 seconds starting at the third checkpoint.
The multi-GPU plots look… cloudy. Could prove interesting.
Notice the disparity between the FPS average and the 99th percentile frame times. Although the 7970 CrossFire config is far and away at the top of the FPS charts, it’s only slightly quicker than the GTX 690 at the 99th percentile. Why is that?
Well, both the 7970 CrossFire and the Radeon HD 6990 show a classic micro-stuttering split, as their frame times rise sharply at around the 50th percentile mark. The disparity is greater for the 6990, which looks to be more performance constrained. Even with that handicap, the 7970 CrossFire config at least matches the GTX 690 and 680 SLI across the latter half of the latency curve, which is why it still has a slight advantage at the 99th percentile. In other words, the 7970 CrossFire config is still a very competitive performer, even though it doesn’t quite live up to its towering advantage in the FPS results.
The worst performer here is the GeForce GTX 590, whose last 10% of frames take between 60 and 120 milliseconds to render.
I like the filtering effect we get from these three thresholds. If you’re looking to avoid real slowdowns, consider the 50-ms results, where only the 6990 and the GTX 590 really show any notable issues. At 33.3 ms, the three multi-GPU solutions from the current generation are in a class by themselves, while the Radeon HD 7970 carves out a clear advantage over the GeForce GTX 680. You’re better off with either current-gen single GPU than you are with a dual-GPU card from the prior generation, though. Crank the limit down to 16.7 ms, and the ranks of the top three remain the same, but all of the other solutions look to be pretty similar.
Crysis 2

Our cavalcade of punishing but pretty DirectX 11 games continues with Crysis 2, which we patched with both the DX11 and high-res texture updates.
Notice that we left object image quality at “extreme” rather than “ultra,” in order to avoid the insane over-tessellation of flat surfaces that somehow found its way into the DX11 patch. We tested 60 seconds of gameplay in the level pictured above, where we gunned down several bad guys, making our way across a skywalk to another rooftop.
You’ll see a couple of unexpected things in these results. First, the GeForce GTX 680 SLI is quite a bit slower than the GTX 690, which is unusual. Second, neither of those solutions performs very well compared to the Radeon HD 7970 CrossFire team. Why?
Well, looks like we found a bug in Nvidia’s graphics drivers. Here are the FPS averages for the GTX 680 SLI rig, runs one to five: 62, 44, 43, 44, 44. And here are the averages for the GTX 690: 59, 52, 51, 52, 50. In both cases, the first test run is faster than all of the rest. The GTX 590 suffered from the same problem, but none of the single-card configs did, nor did any of the Radeon setups. Looks like something bad happens when you exit and load a saved game on the Nvidia multi-GPU setups. After you’ve done that once, performance drops and doesn’t come back until you exit the game completely and start it up again. For whatever reason, the GTX 680 SLI setup suffers more from this issue than the GTX 690 does.
I briefly considered re-testing the GTX 690 and company by exiting the game between runs, but I figure this problem is one that anybody who plays Crysis 2 will encounter. Seems like fair game to include it. Multi-GPU solutions tend to be prone to these sorts of issues, anyhow.
While we’re at it, these results are affected by a problem with the Radeons, as well. Let’s zoom in on the very beginning of the test runs to get a closer look.
One of the first motions of our test run, after loading a saved game, is to turn around 180° or so and face another direction. When we do so on the Radeon cards, we get some long delays, multiple frames that take 60 milliseconds or more. I didn’t know whether to blame the game engine or the Radeons, and I was considering doing a quick spin-around move to warm the GPU caches before starting the test run—until I started testing the GeForces, and the slowdowns became vastly less pronounced, almost imperceptible in most cases. You can see a single 70-millisecond frame from the GTX 690 above, but the following frames clock in at under 50 ms. There’s a lot more white area under those two Radeon lines, which means more time spent waiting. Again, I figured this problem was fair game to include, in the grand scheme of things.
Notice, also, that you can see the multi-GPU jitter patterns in both the GTX 690 and the 7970 CrossFire plots above. The GTX 690’s is less pronounced, but it’s still unmistakable.
Even with these two issues affecting the different camps, we still have some clear takeaways from these results. Of course, the Radeons all pay for that slowdown at the beginning of the test run in our 50-ms threshold results. There’s no escaping that.
Beyond that, the prior-generation multi-GPU cards look really quite poor, with the two worst latency curves from the 50th percentile on up and, thus, the most time spent beyond each of our three thresholds. You’re obviously better off with a single GTX 680 or 7970 than with a GTX 590 or 6990.
Finally, even with the Nvidia driver issue, the GTX 690 comes out of this test looking quite good in terms of the overall latency picture and its ability to avoid slowdowns.
These power consumption numbers aren’t quite what we’d expected, based on prior experience. Driving three monitors appears to change the math in some cases. For example, the Radeon HD 7970’s ZeroCore power feature evidently doesn’t work with three displays attached. Both the single 7970 and the primary card in our CrossFire team refused to enter the ZeroCore state when the display dropped into power-save mode. Their fans never stopped spinning. Not only that, but the idle power consumption numbers for a single 7970 are quite a bit higher than we saw with a single display attached. (And we’re not measuring the display power consumption here, just the PC’s power draw at the wall socket.) We’re also not sure why the GeForce GTX 680 SLI rig pulled more power with the display in power-save mode, but it did.
Given everything, the GTX 690’s power consumption is remarkably low, both when idle and when running Skyrim across three displays, which is how we tested under load. Our GTX 690-based test system pulled fully 100W less than the same system with a GTX 590 installed.
Noise levels and GPU temperatures
The GTX 690’s acoustic profile is, if anything, even more impressive than its power draw numbers. Nvidia’s new dual-GPU card achieves noise levels very similar to a single GeForce GTX 680, which is one of the quieter cards on the market. The 690 doesn’t come by its low decibel readings cheaply, either—it maintains lower GPU temperatures under load than almost anything else we tested.
If you’re wondering why the single Radeon HD 7970 produced higher noise levels and temperatures than the 7970 CrossFire config, well, so am I. Part of the issue, I think, is that we have an asymmetrical CrossFire team with different coolers on the two cards; using them together seems to trigger a different fan speed policy. Also, of course, noise is not additive, so adding a second card doesn’t always lead straightforwardly to higher decibel readings. Another contributor may be relatively higher GPU utilization in the single-card config, since 7970 CrossFire performance doesn’t appear to scale well in Skyrim. We may have to try testing with a different game next time.
Well, now we have some performance numbers for the GeForce GTX 690. How correct they are, we’re not entirely sure. I will say this: even though we’ve not accounted for the potentially positive effects of frame metering, the GeForce GTX 690 looks to be the fastest overall graphics card on the planet. The GTX 690 even does well in our latency-sensitive metrics. Although it’s rarely twice as fast as a GeForce GTX 680 in terms of 99th-percentile frame times, the GTX 690’s overall frame latency picture, as measured in Fraps, is generally superior to the GTX 680’s by a nice margin. The GTX 690 also does a solid job of minimizing those painful instances of long-latency frames, clearly improving on the GTX 680 in that regard.
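For readers unfamiliar with the 99th-percentile frame time metric, the idea is simple: sort the per-frame render times and report the value at or below which 99% of frames finished. Here's a minimal sketch using the nearest-rank method; the function name and sample data are our own, for illustration only.

```python
import math

# Minimal nearest-rank percentile for frame times: the smallest render
# time such that at least pct% of all frames finished at or below it.
def percentile_frame_time(frame_times_ms, pct=99):
    s = sorted(frame_times_ms)
    rank = math.ceil(len(s) * pct / 100)
    return s[rank - 1]

# 100 frames: 99 quick ones and a single 80 ms spike.
frames = [16.7] * 99 + [80.0]
print(percentile_frame_time(frames, 99))   # 16.7: 99% of frames finished this fast
print(percentile_frame_time(frames, 100))  # 80.0: the single worst frame
```

Unlike an FPS average, which that lone 80-ms spike barely dents, the high-percentile view tells you how the slowest frames you actually notice are behaving.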
Since we’re not entirely confident in our numbers, I’ll offer a few subjective impressions from my testing, as well. I started the process with the dual 7970 CrossFire team, and I chose the image quality settings for testing based on how well that config could run each game. My goal was to stress the multi-GPU solutions appropriately without cranking up the image quality options so far the games would be unplayable on the single-GPU cards.
I was initially surprised to see how easily the 7970 CrossFire config could be pushed to its limits at a six-megapixel resolution. The settings I chose yielded playable but not entirely fluid animation on the 7970 CrossFire rig (with the exception of BF3, which was stellar on the 7970s with the quality maxed out). I worried about whether these games would truly be workable on a single 7970, but it turns out that I shouldn’t have. For the most part, playability wasn’t substantially compromised on a single card. However, playability was compromised when I switched over to the Radeon HD 6990. Although its FPS averages were generally higher than a single 7970’s, the experience of playing games with it was instantly, obviously worse. You might have guessed that by looking at our latency-focused numbers from Fraps, but the subjective experience backed that up.
From there, I switched to the green camp and the GeForce GTX 590, which was a bit of an upgrade from the 6990 in terms of overall smoothness—and it wasn’t such a basket case in Skyrim. When I swapped in a single GTX 680, though, the experience changed. The GTX 680 felt like a clear improvement in playability, after having tested the 6990 and GTX 590 back to back before it. The power of a single, fast GPU is something to be respected. I remember seeing the GTX 680’s FPS average at the end of the first Arkham City test run and being shocked at how low the number was (the card averaged 32 FPS) given the quality of the seat-of-the-pants experience.
Then again, I wasn’t getting much sleep during this period, and I’d overclocked my entire nervous system via copious amounts of tasty Brazilian coffee. Cerrado Gold, baby. Breakfast of champions.
The GTX 680 SLI config and the GTX 690 came next, and subjectively, the experience offered by the two was indistinguishable. Both were obviously faster in places where the GTX 680 felt strained, and I’d say they offered a better experience overall—and thus the best of any config I tested. However, it seemed like they’d still run into occasional, brief episodes of sluggishness that one didn’t experience on a single GTX 680.
You can make of those subjective impressions what you will. They’re in no way scientific, although I did try to throw in a big word here and there.
Given our seat-of-the-pants impressions and our test results, I’m pretty confident in offering a generally positive recommendation for the GeForce GTX 690. No, it’s not “twice as fast” as a GeForce GTX 680 in any meaningful sense, and a coldly rational buyer probably wouldn’t want to pay twice the GTX 680’s price for it. However, it is as quick as two GTX 680s in SLI, which makes it the highest-performance video card we’ve ever used. Furthermore, Nvidia has gone above and beyond with the GTX 690’s industrial design, materials, acoustics, and power efficiency, all of which are exemplary, outstanding, and most other positive words you might wish to use.
If you’re serious about buying one of these cards, you probably understand the logic of such a purchase better than I do. I’m not sure how one would justify the price, but Nvidia has given folks lots of shiny little excuses—and they’ve muted the drawbacks like excess noise, too. There’s not much not to like here, other than that fantastic, breathtaking, prodigious price tag. I suspect some folks will overcome that one obstacle without much trouble.
As for the lingering questions about multi-GPU micro-stuttering and the effectiveness of frame metering, we have several things in the works. There’s a tremendous amount of work still to be done, and we have a lot of other projects on our plate, so be patient with us, folks. Shortly before the completion of this review, we did finally receive that high-speed camera we’d ordered. We’ve already managed to capture a serviceable video at 240 FPS, four times our display’s refresh rate. The resolution isn’t too high, but it’s enough to show whether the animation is smooth. Have a look:
You can easily see the strobe of the monitor’s CCFL backlight, and since we had vsync disabled, several instances of tearing are clearly visible. We think this tool should allow for some worthwhile comparisons, eventually.
All the cool kids follow me on Twitter.