The video card market has been surprisingly static in the second half of 2011, so Nvidia’s recent introduction of a new product—the GeForce GTX 560 Ti 448—was a happy occasion on several counts. First, it was a chance for something different to perhaps offer a little extra value to Christmas shoppers. Second, it was an opportunity for us to revisit some fancy new GPU testing methods with the latest games. Thus, we fired up the graphics test systems in Damage Labs, installed titles like Skyrim and Batman: Arkham City, and set to work. Finishing up this review has taken a little longer than we’d have liked, but we’ve managed to include some fresh insights on several fronts. Read on for our take.
Above is a picture of the video card that prompted this little get-together, Zotac’s version of the GeForce GTX 560 Ti 448. As you may know, today’s video cards are carefully calibrated beasts, based on one of several different chips and tailored to deliver a particular mix of price and performance. This newcomer slots in between two very well established offerings, the GeForce GTX 560 Ti and the GeForce GTX 570. One artifact of this product’s late addition to the Nvidia lineup is its awkward name, which is meant to signify its place in the world using an especially long accumulation of letters and numbers. In fact, the “448” in the name refers to the number of shader ALUs enabled on the card’s GF110 GPU.
Yep, that’s right. Although the rest of the GeForce GTX 560 series is based on the smaller GF114 chip, this new card packs the big daddy, the GF110. We’ve already described the Fermi graphics architecture, on which the GF110 chip is based, in some detail. The fundamental building block of this architecture is a unit known as the SM, or streaming multiprocessor, which is nearly a complete GPU unto itself. The GF110 has 16 SMs, along with six memory interfaces and six corresponding ROP partitions. Nvidia has spun out several products based on the GF110 chip, including the pricey GTX 580, with all units enabled, and the GTX 570, with 15 SMs and five memory interface/ROP partition pairs enabled.
The GeForce GTX 560 Ti 448 takes this selective trimming operation a tiny bit further. Only 14 of its SMs are enabled, but in every other way—clock speeds, the number of memory interfaces and ROP partitions, the works—the GTX 560 Ti 448 is identical to the GTX 570. The consequences of this change are so minor as to be nearly imperceptible. The loss of an additional SM means the GTX 560 Ti 448 will have a little less shader arithmetic throughput, texture filtering capacity, and geometry processing ability than the 570. However, the GTX 560 Ti 448 has the exact same memory bandwidth, pixel fill rate, and triangle rasterization rate. Combine that with the fact that the Zotac card we’re reviewing is clocked higher than Nvidia’s baseline speed, at 765MHz rather than 732MHz, and the GTX 560 Ti 448 comes vanishingly close to the GTX 570 in terms of key graphics throughput rates.
| |Peak pixel fill rate (Gpixels/s)|Peak bilinear integer filtering (Gtexels/s)|Peak bilinear FP16 filtering (Gtexels/s)|Peak shader arithmetic (GFLOPS)|Peak rasterization rate (Mtris/s)|Memory bandwidth (GB/s)|
|---|---|---|---|---|---|---|
|GeForce GTX 560 Ti| | | | | | |
|Asus GTX 560 Ti TOP| | | | | | |
|GeForce GTX 560 Ti 448| | | | | | |
|Zotac GTX 560 Ti 448| | | | | | |
|GeForce GTX 570|29.3|43.9|43.9|1405|2928|152|
|GeForce GTX 580|37.1|49.4|49.4|1581|3088|192|
|Radeon HD 6870|28.8|50.4|25.2|2016|900|134|
|Radeon HD 6950|25.6|70.4|35.2|2253|1600|160|
|Radeon HD 6970|28.2|84.5|42.2|2703|1760|176|
Then again, the hot-clocked Asus GTX 560 Ti card that we’ve included in the table above—and in our tests on the following pages—is theoretically faster in several key categories, including texture filtering and shader arithmetic, thanks to its even higher clock speeds. You can see how the small amount of daylight between these different products tempted Nvidia to call this card a GTX 560 Ti, even though it’s based on a different chip.
Both the GTX 570 and the GTX 560 Ti 448 have several other advantages over the regular GTX 560 Ti, however. Chief among them is more memory bandwidth, which will likely translate into better performance. Also, the GF110 has substantially higher peak polygon rasterization rates and a little bit more geometry processing potential, all told. This property doesn’t generally affect current games, unless they have strange polygon-count inflation problems, but it may impact performance in future games that put DirectX 11 tessellation to truly good use. For now, we can show you the difference between the GF110-based 560 Ti 448 and the GF114-based vanilla GTX 560 Ti using a quick, synthetic tessellation benchmark.
Note that even the GF114-based card produces much higher scores than today’s fastest Radeons. Since those same Radeons are altogether competitive with the Nvidia cards in current games, you can probably surmise that the difference between the GF110 and GF114 may not amount to much for a while. Still, it is a real, hardware-based distinction between the chips that matters for a specific sort of performance.
As you can see in the picture of the Zotac card above, the GTX 560 Ti 448 sports dual SLI connectors that will allow it to participate in three-way SLI teams, something the regular GTX 560 Ti cannot do. The 560 Ti 448 card also has a little bit more video memory, 1280MB instead of 1024MB, which may help occasionally at very high resolutions, when video memory is running low.
Left to right: Asus GTX 560 Ti TOP, Zotac GTX 560 Ti 448, and an older Zotac GTX 570
Another perk of grabbing Zotac’s GTX 560 Ti 448: like newer versions of the GTX 570, it’s been squeezed into a more reasonable 9″ card length, versus the 10.5″ length of the first wave of 570s. That change may help with fit in mid-sized cases.
Nvidia’s suggested price for the GeForce GTX 560 Ti 448 is $289.99; the Zotac card above will run you $299.99 at Newegg and comes with a copy of Battlefield 3. Such pricing not only lands squarely between the GTX 560 Ti (~$240) and the GTX 570 (~$340), but also between the Radeon HD 6950 (~$250) and 6970 (~$350). Like Toyotathon, though, Nvidia insists the GTX 560 Ti 448 is a limited-time offer. These cards will only ship to North America and Europe, and only via select board makers. When this allocation of GPUs is exhausted, that’s all she wrote. In fact, Nvidia tells us the limited nature of the product run is one reason it didn’t pull out a more suitable name, like GeForce GTX 565, to assign to these cards.
Some refinements to our methods
A few months ago, we reconsidered the way we test video-game performance and proposed some new methods in the article Inside the second: A new look at game benchmarking. The basic argument of that article was that the traditional approach of measuring speed in frames per second has some pretty major blind spots. For instance, one second is an eternity in terms of human perception. A bunch of fast frames surrounded by a handful of painfully slow ones can average out to an “acceptable” rate in FPS, even when the fluidity of the game has been interrupted. (We opened a whole other can of worms when we applied these insights to multi-GPU systems, but that is a story for another day.)
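To make that blind spot concrete, here's a quick Python sketch with fabricated numbers: a single second of gameplay containing one nasty stall still averages out to a healthy-looking FPS figure.

```python
# Fabricated example: one second of gameplay with 55 quick 10-ms frames
# plus a single 450-ms stall -- a hitch anyone would notice.
frame_times_ms = [10] * 55 + [450]

total_seconds = sum(frame_times_ms) / 1000   # 1.0 second in all
avg_fps = len(frame_times_ms) / total_seconds

print(avg_fps)               # 56.0 "FPS" -- a perfectly respectable average
print(max(frame_times_ms))   # 450 ms -- the interruption the average hides
```

A 56 FPS average sounds great, but nobody who sat through that 450-ms hiccup would describe the experience as smooth.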
We weren’t quite sure what folks would think of our proposed new methods, but the response so far has been overwhelmingly positive. Most folks embraced the idea of a new approach, and many of you wrote in to offer your suggestions on how we might improve our methods going forward. Since then, several things have happened.
For one, while I was preoccupied with reviewing new CPUs, Cyril took the ball and ran with it, testing both Battlefield 3 and Skyrim using our proposed new methods. Folks seemed to like those articles, and in both cases, Cyril was able to pinpoint performance issues that a simple FPS measurement would have missed.
Meanwhile, behind the scenes in conversations with TR editors and others, I’ve slowly sifted through your suggestions to figure out which of them might prove worthwhile to us. We’ve rejected some interesting ideas simply because we think they’d be too complicated for mass consumption, and we’ve passed on some others because they didn’t necessarily apply to the sort of performance we’re after. The goal of a real-time graphics system is to produce frames regularly at relatively low latencies, within a window established by the limits of display technology and human perception. Measuring properties like “variance” without reference to the realities involved doesn’t appeal to us.
We have come up with one refinement to our methods that we think is helpful, though. In past articles, in order to highlight cases where a particular config ran into performance problems, we reported the number of frame times that were longer than a given time period for each card, usually 50 ms. We kind of pulled that number out of a hat, but 50 ms corresponds to about 20 FPS at a steady rate. We think that’s slow enough that the illusion of motion is being threatened. A collection of too many frame latencies beyond about 50 ms wouldn’t produce a good experience in most games. By counting the number of frame times above 50 ms for each config, we were able to offer a sense of which ones had potentially problematic performance problems (picked a peck of pickled peppers).
This approach, though, has two problems. First, in certain cases where the prevailing frame times rose above 50 ms for most GPUs, the faster GPU would of course produce more frames above 50 ms. We didn’t want to penalize the faster solution, so we had to be very careful about how we set our threshold in each test scenario.
The second problem is related: a simple count of the frame times longer than a certain threshold fails to consider the time element involved. For instance, take the two example performances below. They’re fabricated but possible.
The first card, the ReForce, produces several frames in 51 milliseconds during its test run. That’s not great, but three frames at 51 ms probably wouldn’t interrupt the flow of a game too badly. The second card, the Gyro, has only one long-latency frame, but it’s a doozy: 200 ms, a fifth of a second and an undeniable interruption in gameplay. Here’s how our long-latency frame count would look for these two cards:
Whoops. The Gyro comes out looking better in that chart, even though it’s obviously doing a poorer job of delivering fluid motion. The solution we’ve devised? Rather than counting the number of frames above 50 ms, we can add up all of the time spent working on frames beyond our 50-ms threshold. For our example above, the outcome would look like so:
Those three 51-ms frames only contribute 3 ms to the total time spent waiting beyond our threshold, while that one 200-ms frame contributes much more. I think this result captures the relative severity of the interruptions in gameplay fluidity quite nicely. This technique also does away with any concerns about the faster card being penalized for producing more frames.
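For the curious, both metrics are trivial to compute from a list of frame times. This little Python sketch uses the fabricated ReForce and Gyro traces from the example above (the surrounding 16-ms frames are made up; only the three 51-ms frames and the one 200-ms frame matter):

```python
# Fabricated frame-time traces (milliseconds) for the two imaginary cards.
reforce = [16, 16, 51, 16, 51, 16, 51, 16]
gyro    = [16, 16, 16, 200, 16, 16, 16, 16]

THRESHOLD_MS = 50

def frames_beyond(times, threshold=THRESHOLD_MS):
    """Old metric: count of frames that took longer than the threshold."""
    return sum(1 for t in times if t > threshold)

def time_beyond(times, threshold=THRESHOLD_MS):
    """New metric: total time spent working on frames past the threshold."""
    return sum(t - threshold for t in times if t > threshold)

print(frames_beyond(reforce), frames_beyond(gyro))  # 3 vs. 1: Gyro "wins"
print(time_beyond(reforce), time_beyond(gyro))      # 3 ms vs. 150 ms: ReForce wins
```

The count metric flatters the Gyro, while the time-beyond-threshold metric correctly flags its 200-ms frame as the bigger interruption.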
I should note that, although we cooked up this new method months ago in a frenetic conversation at Starbucks during IDF, a TR reader named Olaf later wrote in and pointed out this exact problem with the time element of the frame rate count, in response to one of Cyril’s articles. Olaf, you nailed it. We’re going with the technique of adding up time spent beyond 50 ms from here on out.
The time element of individual frames also scuttled one of our favorite suggestions for augmenting the presentation of our results: histograms showing the distribution of frame times. At first, that seemed like a nice idea. However, when we actually created one, it looked like so:
The problem? These are real data from our tests, and as you’ll see later, what separates the performance of some GPUs from others here is an abundance of long frame times in some cases. Some of the GPUs devote quite a bit of time to processing long-latency frames, so those frames are very important to consider. Yet the severity of those long-latency frames is entirely obscured in this histogram. As a simple count, they’re overwhelmed in the chart by the many thousands of low-latency frames produced by all of the GPUs. It’s a purty picture, but I don’t think it adds much to our analysis.
That’s a shame, because I was ready to bust out the fancy stuff:
But it was ultimately pointless. We’ll keep looking for ways to better analyze and present our results in the future. I think what we’ve developed so far is pretty strong, though, with our one new refinement.
For what it’s worth, I’ve also rejiggered things behind the scenes a little bit to make sure that, where possible, we’re sampling all five runs from each game separately and then picking the median result. That way, the amount of time spent beyond 50 ms is the time spent during a single, representative run—and not the result of the occasional system performance blip due to a background task. The one place where that doesn’t work is the 99th percentile frame times, where we’ve found we get more coherent results by analyzing the data from all five runs as a group.
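In code form, that scheme looks something like the Python sketch below. The function names are ours for illustration, not from any real tool, and the nearest-rank percentile is just one simple way to compute a 99th percentile.

```python
import math
import statistics

def time_beyond(frame_times_ms, threshold=50.0):
    """Total time (ms) spent working on frames past the threshold."""
    return sum(t - threshold for t in frame_times_ms if t > threshold)

def median_run_time_beyond(runs, threshold=50.0):
    """Score each test run separately, then report the median run's result,
    so a one-off background-task blip can't pollute the number."""
    return statistics.median(time_beyond(run, threshold) for run in runs)

def pooled_99th_percentile(runs):
    """For 99th-percentile frame times, pool all runs into one sample
    and take a simple nearest-rank percentile."""
    pooled = sorted(t for run in runs for t in run)
    rank = max(1, math.ceil(0.99 * len(pooled)))
    return pooled[rank - 1]
```

Keeping the runs separate for the time-beyond-threshold metric means one anomalous run gets discarded as an outlier, while pooling all five runs gives the percentile calculation a larger, more stable sample.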
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and we’ve reported the median result.
Our test systems were configured like so:
| | |
|---|---|
|North bridge|X58 IOH|
|Memory size|12GB (6 DIMMs)|
|Memory type|Corsair Dominator CMD12GX3M6A1600C8 DDR3 SDRAM at 1333MHz|
|Memory timings|9-9-9-24 2T|
|Chipset drivers|INF update, Rapid Storage Technology 10.8.0.1003|
|Audio|Integrated, with Realtek 18.104.22.16882 drivers|
|Graphics|Dual Radeon HD with Catalyst 11.11b drivers|
| |Radeon HD 6970 with Catalyst 11.11b drivers|
| |Asus GeForce GTX 560 Ti DirectCU II TOP 1GB with ForceWare 290.36 beta drivers|
| |GeForce GTX 560 Ti 448 1280MB with ForceWare 290.36 beta drivers|
| |GeForce GTX 570 1280MB with ForceWare 290.36 beta drivers|
|Storage|F240 240GB SATA SSD|
|Power supply|PC Power & Cooling Silencer 750 Watt|
|OS|Windows 7 Ultimate x64 Edition with Service Pack 1 and the DirectX 11 June 2009 Update|
Thanks to Intel, Corsair, Gigabyte, and PC Power & Cooling for helping to outfit our test rigs with some of the finest hardware available. AMD, Nvidia, and the makers of the various products supplied the graphics cards for testing, as well.
Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults. Vertical refresh sync (vsync) was disabled for all tests.
We used the following test applications:
- Batman: Arkham City
- Battlefield 3
- Call of Duty: Modern Warfare 3
- The Elder Scrolls V: Skyrim
- Fraps 3.4.7
- GPU-Z 0.5.5
Some further notes on our methods:
- We used the Fraps utility to record frame rates while playing a 90-second sequence from the game. Although capturing frame rates while playing isn’t precisely repeatable, we tried to make each run as similar as possible to all of the others. We tested each Fraps sequence five times per video card in order to counteract any variability. We’ve included frame-by-frame results from Fraps for each game, and in those plots, you’re seeing the results from a single, representative pass through the test sequence.
- We measured total system power consumption at the wall socket using a Yokogawa WT210 digital power meter. The monitor was plugged into a separate outlet, so its power draw was not part of our measurement. The cards were plugged into a motherboard on an open test bench. The idle measurements were taken at the Windows desktop with the Aero theme enabled. The cards were tested under load running Modern Warfare 3 at a 2560×1600 resolution with 4X AA and 16X anisotropic filtering.
- We measured noise levels on our test system, sitting on an open test bench, using an Extech 407738 digital sound level meter. The meter was mounted on a tripod approximately 10″ from the test system at a height even with the top of the video card. You can think of these noise level measurements much like our system power consumption tests, because the entire system’s noise levels were measured. Of course, noise levels will vary greatly in the real world along with the acoustic properties of the PC enclosure used, whether the enclosure provides adequate cooling to avoid a card’s highest fan speeds, placement of the enclosure in the room, and a whole range of other variables. These results should give a reasonably good picture of comparative fan noise, though.
- We used GPU-Z to log GPU temperatures during our load testing.
The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
The Elder Scrolls V: Skyrim
Our test run for Skyrim was a lap around the town of Whiterun, starting up high at the castle entrance, descending the stairs into the main part of town, and then doing a figure-eight around the main drag.
Since these are pretty capable graphics cards, we set the game to its “Ultra” preset, which turns on 4X multisampled antialiasing. We then layered FXAA post-process antialiasing on top, for the best possible image quality without editing an .ini file.
By any measure, all of these graphics cards aced this test. Honestly, we didn’t expect this outcome. Cyril found that slightly lower quality settings at 1080p were a bit of a challenge for the GeForce GTX 560 Ti and Radeon HD 6950. We resisted jumping up to a higher (and less common) resolution in part because we thought these settings would be challenging enough. Yet in our tests, none of the cards spent much time at all above 50 ms or even 40 ms.
Why the difference? Two possibilities come to mind. One, we’re using newer video drivers all around. And two, our test system has a much beefier Core i7-980X processor, lending credence to the claims that this game is somewhat CPU-bound.
Whatever the case, you’re good with any of these cards in Skyrim, and the differences between them are minor—three milliseconds of frame latency for 99% of the frames produced, as our percentile chart shows. As one might expect given the theoretical numbers at the opening of this review, there’s virtually no difference between Zotac’s GTX 560 Ti 448 and the stock-clocked GeForce GTX 570.
Batman: Arkham City
We did a little Batman-style free running through the rooftops of Gotham for this one.
We had hoped to test Arkham City with its DirectX 11 features enabled, but the DX11 code path in the first release of the game was essentially broken, with way too many performance pitfalls on any video card. The game has since been patched, but we’re not yet confident it’s truly fixed, given the user feedback so far. So we’re sticking with testing this game—which is truly excellent, by the way—the same way we’ve played it: in DirectX 9 mode at the highest possible resolution and image quality settings otherwise.
Notice how the Radeons range into 40-ms frame times quite a bit more often than the GeForces here. The FPS average doesn’t really capture that important bit of reality. The GeForce GTX 560 Ti and the Radeon HD 6970 are virtually tied in terms of FPS, but the GTX 560 Ti returns 99% of its frames in 26 milliseconds or less. For the 6970, that figure is 35 ms. This one is a clear win for Nvidia.
Long-latency frames above 50 ms are a bit more of a problem for the Radeons, too. Any of the GeForces will give you smoother gameplay overall.
Once again, though, even our most sophisticated tools can’t detect a meaningful difference in performance between the GTX 560 Ti 448 and its bigger brother, the GTX 570. Heck, even the regular GTX 560 Ti’s performance isn’t much different, really.
Battlefield 3

Now for something a little more intensive: Battlefield 3 with all of its DX11 goodness cranked up, including the “Ultra” quality settings with both 4X MSAA and the high-quality version of the post-process FXAA. We tested in the opening portion of the “Thunder Run” level, from the first checkpoint through the initial tank battle and a few moments after that.
What’s this? Way too many long-latency frames during the tank battle on all of the GeForces. Just by looking at the frame time plot, we can spot the problem. The Radeons aren’t similarly afflicted.
Yeah, so all of those FPS numbers we’ve reported for years? Not very confident in them now. The GeForces capture the top three spots in the average FPS sweeps, yet they clearly are slower in ways that matter. The 99th percentile frame time results capture that fact.
The two GF110-based GeForces spent roughly three-quarters of a second each working on frames beyond our 50-ms threshold. That is an awful lot of time spent waiting during the course of a 90-second snippet of gameplay. In practice, we know those longer frames tend to take about 80 milliseconds to produce, which corresponds to a steady-state frame rate of 13 FPS. By the seat of our pants, the impact on the game experience is this: a little more of the “fog of war” feeling during battle than the game developers probably intended, along with some difficulty getting a fix on your next target. The tank battle itself is pretty much the opposite of a fast-twitch scenario, though, so the problem isn’t felt as intensely as it might be elsewhere in the game.
More troubling is the fact that Cyril saw similar frame latency spikes on GeForce cards in battle on the “Rock and a hard place” map. We resolved then to test other areas of the game, and now that we have, a pattern appears to be emerging. Hrm.
Call of Duty: Modern Warfare 3
Ok, so the Call of Duty game engine is older than dirt, but this game has made more money than, well, a rounding error in the federal budget, which is way more than most of us can say. I figured we could ratchet up the resolution to four megapixels, max out the image quality, and at least provoke some reaction from these cards.
Yeah, not so much. The only news here, in my view, beyond the fact that all of these cards are incredibly competent for this game, is how small the differences are between the various solutions. Console-focused games like this one simply don’t stress the pain points in lower-end solutions like the GF114 GPU, things like memory bandwidth and geometry processing throughput. The question is: why pay more than you would for a GeForce GTX 560 Ti or a Radeon HD 6950? Clearly, not for higher performance in this sort of game.
I feel like I should have thrown in a vastly more expensive video card and maybe also a vastly less expensive one, so maybe we’d get some drama in these results somewhere. Seriously, 11 watts of total system power consumption separate the contenders at idle and only nine watts separate the top four while running a game? Sheesh. Take your pick, folks.
Noise levels and GPU temperatures
Well, at least our noise and temperature results are somewhat interesting. The Zotac GTX 560 Ti 448 card we’re reviewing comes out looking pretty good overall—a little louder at idle than everything else, though the difference subjectively is minor, and a little quieter under load than the rest of the bunch. That fits with our overall impression that the Zotac cooler has a decent acoustic profile. It’s not tuned to maintain unnecessarily low GPU temperatures under load, as the Asus GTX 560 Ti card is, and that’s fine with us. We’ll tinker with the fan speeds if we want to overclock, but the defaults should prioritize quiet operation.
The Zotac GeForce GTX 560 Ti 448 card we’ve tested is functionally almost identical to the GeForce GTX 570. More than anything, it’s a bit of a temporary Christmas price cut on the 570. We’re favorably inclined toward such things, and for as long as this card is available, nobody should buy a GTX 570. The GTX 560 Ti 448 is the same basic thing for less money.
We’re also generally pleased with Zotac’s rendition of this holiday special. We like the shorter board length, higher clock speeds, reasonably quiet cooler, and the included coupon for a copy of Battlefield 3. All in all, a very solid offering from Zotac.
The surprising outcome of our testing is the relatively minor differences in performance between the cheapest two cards, the GeForce GTX 560 Ti and Radeon HD 6950, and the more expensive offerings, which cost about $100 more. In fact, in our admittedly limited set of tests, we saw larger differences between the two major GPU brands than we did between the different rungs on their respective product lineups. For instance, Arkham City runs better on all of the GeForces than on any of the Radeons, while the reverse is true (in more dramatic fashion) for Battlefield 3. I had kind of expected our tighter focus on GPU frame times and on specific performance stumbles to lead to more separation between the various product segments, not less. Then again, we are dealing with relatively minor differences even in theory, all told, once you factor in the high clocks on our Asus GTX 560 Ti and such. There’s just not a lot of differentiation to be had between $240 and $350 or so.
Given that fact, I’m going to hit you with a very specific recommendation to wrap up this review. If you have a single monitor at a resolution of 1920×1200 or less, the cards to buy are the GeForce GTX 560 Ti and the Radeon HD 6950. There’s no compelling reason to spend more, given the level of performance these cards deliver, and good luck choosing between the two on the basis of our performance results. Either one should serve you well, although if you favor Battlefield 3, you may want to opt for the Radeon. On the other hand, if you’ve moved up to that next class of display at 2560×1440 or better, you may want to consider paying extra for the GeForce GTX 560 Ti 448. You might get by just fine with a lower-end card, but the additional memory size and bandwidth over the regular GTX 560 Ti will probably pay off here and there—not in spades, but enough to justify a little more expense. With that qualification, we’ll throw a TR Recommended award to the Zotac 560 Ti 448. It’s essentially a GTX 570 for 300 bucks, and that we like.
With that said, we’ll apologize once again for the lack of drama in this particular review. We consider this one a down-payment on a future review with similar methods but broader scope, both of games and graphics cards, with some tougher challenges for all involved.