Nvidia’s GeForce GTX 680 graphics processor reviewed

At Nvidia’s GPU Technology Conference in 2010, CEO Jen-Hsun Huang made some pretty dramatic claims about his company’s future GPU architecture, code-named Kepler. Huang predicted the chip would be nearly three times more efficient, in terms of FLOPS per watt, than the firm’s prior Fermi architecture. Those improvements, he said, would go “far beyond” the traditional advances chip companies can squeeze out of the move to a newer, smaller fabrication process. The gains would come from changes to the chip’s architecture, design, and software together.

Fast forward to today, and it’s time to see whether Nvidia has hit its mark. The first chip based on the Kepler architecture is hitting the market, aboard a new graphics card called the GeForce GTX 680, and we now have a clear sense of what was involved in the creation of this chip. Although Kepler’s fundamental capabilities are largely unchanged versus the last generation, Nvidia has extensively refined and polished nearly every aspect of this GPU with an eye toward improved power efficiency.

Kepler was developed under the direction of lead architect John Danskin and Sr. VP of GPU engineering Jonah Alben. Danskin and Alben told us their team took a rather different approach to chip development than what’s been common at Nvidia in the past, with much closer collaboration between the different disciplines involved, from the architects to the chip designers to the compiler developers. An idea that seemed brilliant to the architects would be nixed if it didn’t work well in silicon or didn’t serve the shared goal of building a very power-efficient processor.

Although Kepler is, in many ways, the accumulation of many small refinements, Danskin identified the two biggest changes as the revised SM (or shader multiprocessor, the GPU’s processing “core”) and a vastly improved memory interface. Let’s start by looking at the new SM, which Nvidia calls the SMX, because it gives us the chance to drop a massive block diagram on you. Warm up your scroll wheels for this baby.

Logical block diagrams of the Kepler SMX (left) and Fermi SM (right). Source: Nvidia.

To some extent, GPUs are just massive collections of floating-point computing power, and the SM is the locus of that power. The SM is where nearly all of the graphics processing work takes place, from geometry processing to pixel shading and texture sampling. As you can see, Kepler’s SMX is clearly more powerful than past generations, because it’s over 700 pixels tall in block diagram form. Fermi is, like, 520 or so, tops. More notably, the SMX packs a heaping helping of ALUs, which Nvidia has helpfully labeled as “cores.” I’d contend the SM itself is probably the closest analog to a CPU core, so we’ll avoid that terminology. Whatever you call it, though, the new SMX has more raw computing power—192 ALUs versus 32 ALUs in the Fermi SM. According to Alben, about half of the Kepler team was devoted to building the SMX, which is a new design, not a derivative of Fermi’s SM.

The organization of the SMX’s execution units isn’t truly apparent in the diagram above. Although Nvidia likes to talk about them as individual “cores,” the ALUs are actually grouped into execution units of varying widths. In the SMX, there are four 16-ALU-wide vector execution units and four 32-wide units. Each of the four schedulers in the diagram above is associated with one vec16 unit and one vec32 unit. There are eight special function units per scheduler to handle, well, special math functions like transcendentals and interpolation. (Incidentally, the partial use of vec32 units is apparently how the GF114 got to have 48 ALUs in its SM, a detail Alben let slip that we hadn’t realized before.)

Although each of the SMX’s execution units works on multiple data simultaneously according to its width—and we’ve called them vector units as a result—work is scheduled on them according to Nvidia’s customary scheme, in which the elements of a pixel or thread are processed sequentially on a single ALU. (AMD has recently adopted a similar scheduling format in its GCN architecture.) As in the past, Nvidia schedules its work in groups of 32 pixels or threads known as “warps.” Those vec32 units should be able to output a completed warp in each clock cycle, while the vec16 units and SFUs will require multiple clocks to output a warp.
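The unit counts above are easy to sanity-check with a little arithmetic. The sketch below is ours, not Nvidia's; it assumes the simplest possible model, in which a unit retires one thread per ALU lane per clock and pipeline latency is ignored. The function and constant names are made up for illustration.

```python
import math

WARP_SIZE = 32       # threads (or pixels) per warp, per the article
SMX_PER_GK104 = 8    # the GK104 carries eight SMX blocks

# Vector execution units in one SMX, as (width in ALUs, number of units).
SMX_UNITS = [(16, 4), (32, 4)]


def alus_per_smx(units=SMX_UNITS):
    """Total ALUs across all vector execution units in one SMX."""
    return sum(width * count for width, count in units)


def clocks_per_warp(unit_width, warp_size=WARP_SIZE):
    """Clocks a unit of the given width needs to output one full warp.

    Simplified assumption: one thread per lane per clock, no pipeline latency.
    """
    return math.ceil(warp_size / unit_width)


if __name__ == "__main__":
    print("ALUs per SMX:", alus_per_smx())
    print("ALUs per GK104:", alus_per_smx() * SMX_PER_GK104)
    for width in (32, 16, 8):  # vec32, vec16, and the eight SFUs per scheduler
        print(f"{width}-wide unit: {clocks_per_warp(width)} clock(s) per warp")
```

Under that model, the four vec16 and four vec32 units add up to the 192 ALUs Nvidia quotes per SMX, and only the vec32 units can turn out a warp every clock.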

The increased parallelism in the SMX is a consequence of Nvidia’s decision to seek power efficiency with Kepler. In Fermi and prior designs, Nvidia used deep pipelining to achieve high clock frequencies in its shader cores, which typically ran at twice the speed of the rest of the chip. Alben argues that arrangement made sense from the standpoint of area efficiency—that is, the extra die space dedicated to pipelining was presumably more than offset by the performance gained at twice the clock speed. However, driving a chip at higher frequencies requires increased voltage and power. With Kepler’s focus shifted to power efficiency, the team chose to use shorter pipelines and to expand the unit count, even at the expense of some chip area. That choice simplified the chip’s clocking, as well, since the whole thing now runs at one speed.

Another, more radical change is the elimination of much of the control logic in the SM. The key to many GPU architectures is the scheduling engine, which manages a vast number of threads in flight and keeps all of the parallel execution units as busy as possible. Prior chips like Fermi have used lots of complex logic to decide which warps should run when, logic that takes a lot of space and consumes a lot of power, according to Alben. Kepler has eliminated some of that logic entirely and will rely on the real-time compiler in Nvidia’s driver software to help make scheduling decisions. In the interests of clarity, permit me to quote from Nvidia’s whitepaper on the subject, which summarizes the change nicely:

Both Kepler and Fermi schedulers contain similar hardware units to handle scheduling functions, including, (a) register scoreboarding for long latency operations (texture and load), (b) inter-warp scheduling decisions (e.g., pick the best warp to go next among eligible candidates), and (c) thread block level scheduling (e.g., the GigaThread engine); however, Fermi’s scheduler also contains a complex hardware stage to prevent data hazards in the math datapath itself. A multi-port register scoreboard keeps track of any registers that are not yet ready with valid data, and a dependency checker block analyzes register usage across a multitude of fully decoded warp instructions against the scoreboard, to determine which are eligible to issue.

For Kepler, we realized that since this information is deterministic (the math pipeline latencies are not variable), it is possible for the compiler to determine up front when instructions will be ready to issue, and provide this information in the instruction itself. This allowed us to replace several complex and power-expensive blocks with a simple hardware block that extracts the pre-determined latency information and uses it to mask out warps from eligibility at the inter-warp scheduler stage.
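To make the idea in that passage concrete, here is a toy model of compiler-directed scheduling. It is only a sketch under our own assumptions: the whitepaper doesn't describe how the latency information is encoded, so the `stall` field, the `Instr` class, the round-robin pick, and the one-issue-per-clock limit are all hypothetical. The point is simply that the hardware reads a precomputed wait count and masks the warp from eligibility, instead of checking register dependencies itself.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    stall: int  # hypothetical compiler-supplied wait (in clocks) before this warp may issue again


def run(warps, max_cycles=1000):
    """Issue at most one instruction per clock, round-robin over eligible warps.

    `warps` is a list of instruction lists. Returns (cycle, warp_index, name) tuples.
    """
    pc = [0] * len(warps)        # next instruction index for each warp
    ready_at = [0] * len(warps)  # first cycle at which each warp is eligible again
    trace = []
    last = -1                    # last warp issued, used for round-robin order
    for cycle in range(max_cycles):
        if all(pc[w] == len(warps[w]) for w in range(len(warps))):
            break
        # Mask out finished warps and warps still inside their compiler-declared wait.
        eligible = [w for w in range(len(warps))
                    if pc[w] < len(warps[w]) and ready_at[w] <= cycle]
        if not eligible:
            continue  # every unfinished warp is masked this cycle
        w = min(eligible, key=lambda i: (i - last - 1) % len(warps))
        instr = warps[w][pc[w]]
        trace.append((cycle, w, instr.name))
        pc[w] += 1
        ready_at[w] = cycle + 1 + instr.stall
        last = w
    return trace


if __name__ == "__main__":
    program = [Instr("mul", 2), Instr("add", 0)]
    for cycle, warp, name in run([program, program]):
        print(cycle, warp, name)
```

With only two warps in flight, the model idles for a cycle while both wait on their multiplies. Add more warps and the gaps fill in, which is the usual GPU answer to latency.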

The short story here is that, in Kepler, the constant tug-of-war between control logic and FLOPS has moved decidedly in the direction of more on-chip FLOPS. The big question we have is whether Nvidia’s compiler can truly be effective at keeping the GPU’s execution units busy. Then again, it doesn’t have to be perfect, since Kepler’s increases in peak throughput are sufficient to overcome some loss of utilization efficiency. Also, as you’ll soon see, this setup obviously works pretty well for graphics, a well-known and embarrassingly parallel workload. We are more dubious about this arrangement’s potential for GPU computing, where throughput for a given workload could be highly dependent on compiler tuning. That’s really another story for another chip on another day, though, as we’ll explain shortly.

The first chip: GK104
Now that we’ve looked at the SMX, we can dial back the magnification a bit and consider the overall layout of the first chip based on the Kepler architecture, the GK104.

Logical block diagram of the GK104. Source: Nvidia.

You can see that there are four GPCs, or graphics processing clusters, in the GK104, each nearly a GPU unto itself, with its own rasterization engine. The chip has eight copies of the SMX onboard, for a gut-punching total of 1536 ALUs and 128 texels per clock of texture filtering power.

The L2 cache shown above is 512KB in total, divided into four 128KB “slices,” each with 128 bits of bandwidth per clock cycle. That adds up to double the per-cycle bandwidth of the GF114 or 30% more than the biggest Fermi, the GF110. The rest of the specifics are in the table below, with the relevant comparisons to other GPUs.

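| GPU | ROP pixels/clock | Texels filtered/clock (int/fp16) | Shader ALUs | Rasterized triangles/clock | Memory interface width (bits) | Estimated transistor count (millions) | Die size (mm²) | Fabrication process node |
|---|---|---|---|---|---|---|---|---|
| GF114 | 32 | 64/64 | 384 | 2 | 256 | 1950 | 360 | 40 nm |
| GF110 | 48 | 64/64 | 512 | 4 | 384 | 3000 | 520 | 40 nm |
| GK104 | 32 | 128/128 | 1536 | 4 | 256 | 3500 | 294 | 28 nm |
| Cypress | 32 | 80/40 | 1600 | 1 | 256 | 2150 | 334 | 40 nm |
| Cayman | 32 | 96/48 | 1536 | 2 | 256 | 2640 | 389 | 40 nm |
| Pitcairn | 32 | 80/40 | 1280 | 2 | 256 | 2800 | 212 | 28 nm |
| Tahiti | 32 | 128/64 | 2048 | 2 | 384 | 4310 | 365 | 28 nm |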

In terms of basic, per-clock rates, the GK104 stacks up reasonably well against today’s best graphics chips. However, if the name “GK104” isn’t enough of a clue for you, have a look at some of the vitals. This chip’s memory interface is only 256 bits wide, all told, and its die size is smaller than the middle-class GF114 chip that powers the GeForce GTX 560 series. The GK104 is also substantially smaller, and made up of fewer transistors, than the Tahiti GPU behind AMD’s Radeon HD 7900 series cards. Although the product based on it is called the GeForce GTX 680, the GK104 is not a top-of-the-line, reticle-busting monster. For the Kepler generation, Nvidia has chosen to bring a smaller chip to market first.

Die shot of the GK104. Source: Nvidia.

The GK104 poses, no duckface, next to a quarter for scale

Although Nvidia won’t officially confirm it, there is surely a bigger Kepler in the works. The GK104 is obviously more tailored for graphics than GPU computing, and GPU computing is an increasingly important market for Nvidia. The GK104 can handle double-precision floating-point data formats, but it only does so at 1/24th the rate it processes single-precision math, just enough to maintain compatibility. Nvidia has suggested there will be some interesting GPU-computing related announcements during its GTC conference in May, and we expect the details of the bigger Kepler to be revealed at that point. Our best guess is that the GK100, or whatever it’s called, will be a much larger chip, presumably with six 64-bit memory interfaces and 768KB of L2 cache. We wouldn’t be surprised to see its SM exchange those 32-wide execution units for 16-wide units capable of handling double-precision math, leaving it with a total of 128 ALUs per SM. We’d also expect full ECC protection for all local storage and off-chip memory, just like the GF110.

The presence of a larger chip at some point in Nvidia’s future doesn’t mean the GK104 lacks for power. Although it “only” has four 64-bit memory controllers, this chip’s memory interface is probably the most notable change outside of the SMX. As Danskin very carefully put it, “Fermi, our memory wasn’t as fast as it could have been. This is, in fact, as fast as it could be.” The interface still supports GDDR5 memory, but data rates are up from about 4 Gbps in the Fermi products to 6 Gbps in the GeForce GTX 680. As a result, the GTX 680 is essentially able to match the GeForce GTX 580 in total memory bandwidth, at 192 GB/s, while having a data path that is one-third narrower.
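The bandwidth math here is simple enough to check by hand, but a quick sketch makes the trade-off explicit. Peak bandwidth is the bus width times the per-pin data rate, divided by eight bits per byte. The function name is ours, and the 4 Gbps figure for the GTX 580 is the article's "about 4 Gbps" taken as a round number.

```python
def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s.

    bus_width_bits: total interface width in bits
    data_rate_gbps: per-pin data rate in gigabits per second
    """
    return bus_width_bits * data_rate_gbps / 8  # 8 bits per byte


if __name__ == "__main__":
    print("GTX 680:", memory_bandwidth_gbs(256, 6), "GB/s")  # 256-bit at 6 Gbps
    print("GTX 580:", memory_bandwidth_gbs(384, 4), "GB/s")  # 384-bit at roughly 4 Gbps
```

Both cards land at 192 GB/s: the GTX 680 makes up for its narrower bus entirely with faster memory.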

The other novelty in the GK104 is Nvidia’s first PCI Express 3.0-compatible interconnect, which doubles the peak data rate possible for GPU-to-host communication. We don’t expect major performance benefits for graphics workloads from this faster interface, but it could matter in multi-GPU scenarios or for GPU computing applications.

Several new features
On this page, we intend to explain some of the important new features Nvidia has built into the GK104 or its software stack. However, in the interests of getting this review posted before our deadline, we’ve decided to put in a placeholder, a radically condensed version of the final product. Don’t worry, we’ll fix it later in software—like the R600’s ROPs.

  • GPU Boost — As evidenced by the various “turbo” schemes in desktop CPUs, dynamic voltage and frequency schemes are all the rage these days. The theory is straightforward enough. Not all games and other graphics workloads make use of the GPU in the same way, and even relatively “intensive” games may not cause all of the transistors to flip and thus heat up the GPU quite like the most extreme cases. As a result, there’s often some headroom left in a graphics card’s designated thermal envelope, or TDP (thermal design power), which is generally engineered to withstand a worst-case peak workload. Dynamic clocking schemes attempt to track this headroom and to take advantage of it by raising clock speeds opportunistically.

    Although the theory is fairly simple, the various implementations of dynamic clocking vary widely in their specifics, which can make them hard to track. Intel’s Turbo Boost is probably the gold standard at present; it uses a network of thermal sensors spread across the die in conjunction with a programmable, on-chip microcontroller that governs Turbo policy. Since it’s a hardware solution with direct inputs from the die, Turbo Boost reacts very quickly to changes in thermal conditions, and its behavior may differ somewhat from chip to chip, since the thermal properties of the chips themselves can vary.

    Although distinct from one another in certain ways, both AMD’s Turbo Core (in its CPUs) and PowerTune (in its GPUs) combine on-chip activity counters with pre-production chip testing to establish a profile for each model. In use, power draw for the chip is then estimated based on the activity counters, and clocks are adjusted in response to the expected thermal situation. AMD argues the predictable, deterministic behavior of its DVFS schemes is an admirable trait. The price of that consistency is that it can’t squeeze every last drop of performance out of each individual slab of silicon.

    GPU Boost is essentially a first-generation crack at a dynamic clocking feature, and it combines some traits of each of the competing schemes. Fundamentally, the logic is more like the two Turbos than it is like AMD’s PowerTune. With PowerTune, AMD runs its GPUs at a relatively high base frequency, but clock speeds are sometimes throttled back under atypically high GPU utilization. By contrast, GPU Boost starts with a more conservative base clock speed and ranges into higher frequencies when possible.

    The inputs for Boost’s decision-making algorithm include power draw, GPU and memory utilization, and GPU temperatures. Most of this information is collected from the GPU itself, but I believe the power use information comes from external circuitry on the GTX 680 board. In fact, Nvidia’s Tom Petersen told us board makers will be required to include this circuitry in order to get the GPU maker’s stamp of approval. The various inputs for Boost are then processed in software, in a portion of the GPU driver, not in an on-chip controller. The combination of software control and external power circuitry is likely responsible for Boost’s relatively high clock-change latency. Stepping up or down in frequency takes about 100 milliseconds, according to Petersen. A tenth of a second is a very long time in the life of a gigahertz-class chip, and Petersen was frank in admitting that this first generation of GPU Boost isn’t everything Nvidia hopes it will become in the future.

    Graphics cards with Boost will be sold with a couple of clock speed numbers on the side. The base clock is the lower of the two—1006MHz on the GeForce GTX 680—and represents the lowest operating speed in thermally intensive workloads. Curiously enough, the “boost clock”—which is 1058MHz on the GTX 680—isn’t the maximum speed possible. Instead, it’s “sort of a promise,” according to Petersen, the clock speed at which the GPU should run during typical operation. GPU Boost performance will vary slightly from card to card, based on factors like chip quality, ambient temperatures, and the effectiveness of the cooling solution. GTX 680 owners should expect to see their cards running at the Boost clock frequency as a matter of course, regardless of these factors. Beyond that, GPU Boost will make its best effort to reach even higher clock speeds when feasible, stepping up and down in increments of 13MHz.

    Petersen demoed several interesting scenarios to illustrate Boost behavior. In a very power-intensive scene, 3DMark11’s first graphics test, the GTX 680 was forced to remain at its base clock throughout. When playing Battlefield 3, meanwhile, the chip spent most of its time at about 1.1GHz—above both the base and boost levels. In a third application, the classic DX9 graphics demo “rthdribl,” the GTX throttled back to under 1GHz, simply because additional GPU performance wasn’t needed. One spot where Nvidia intends to make use of this throttling capability is in-game menu screens—and we’re happy to see it. Some menu screens can cause power use and fan speeds to shoot skyward as frame rates reach quadruple digits.

    Nvidia has taken pains to ensure GPU Boost is compatible with user-driven tweaking and overclocking. A new version of its NVAPI allows third-party software, like EVGA’s slick Precision software, control over key Boost parameters. With Precision, the user may raise the GPU’s maximum power limit by as much as 32% above the default, in order to enable operation at higher clock speeds. Interestingly enough, Petersen said Nvidia doesn’t consider cranking up this slider overclocking, since its GPUs are qualified to work properly at every voltage-and-frequency point along the curve. (Of course, you could exceed the bounds of the PCIe power connector specification by cranking this slider, so it’s not exactly 100% kosher.) True overclocking happens by grabbing hold of a separate slider, the GPU clock offset, which raises the chip’s frequency at a given voltage level. An offset of +200MHz, for instance, raised our GTX 680’s clock speed while running Skyrim from 1110MHz (its usual Boost speed) to 1306MHz. EVGA’s tool allows GPU clock offsets as high as +549MHz and memory clock offsets up to +1000MHz, so users are given quite a bit of leeway for experimentation.

    Although GPU Boost is only in its first incarnation, Nvidia has some big ideas about how to take advantage of these dynamic clocking capabilities. For instance, Petersen openly telegraphed the firm’s plans for future versions of Boost to include control over memory speeds, as well as GPU clocks.

    More immediately, one feature exposed by EVGA’s Precision utility is frame-rate targeting. Very simply, the user is able to specify his desired frame rate with a slider, and if the game’s performance exceeds that limit, the GPU steps back down the voltage-and-frequency curve in order to conserve power. We were initially skeptical about the usefulness of this feature for one big reason: the very long latency of 100 ms for clock speed adjustments. If the GPU has dialed back its speed because the workload is light and then something changes in the game—say, an explosion that adds a bunch of smoke and particle effects to the mix—ramping the clock back up could take quite a while, causing a perceptible hitch in the action. We think that potential is there, and as a result, we doubt this feature will appeal to twitch gamers and the like. However, in our initial playtesting of this feature, we’ve not noticed any problems. We need to spend more time with it, but Kepler’s frame rate targeting may prove to be useful, even in this generation, so long as its clock speed leeway isn’t too wide. At some point in the future, when the GPU’s DVFS logic is moved into hardware and frequency change delays are measured in much smaller numbers, we expect features like this one to become standard procedure, especially for mobile systems.

  • Adaptive vsync — Better than dumb vsync.
  • TXAA — Quincunx 2.0, or Nvidia erects a narrower tent.
  • Bindless textures — Megatexturing in hardware, but not for DX11.
  • NVENC — Hardware video encoding, or right back atcha, QuickSync.
  • Display output improvement — Eye-nvidi-ty.
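For those who like to see the GPU Boost numbers laid out, the sketch below works through the clock ladder described in the first bullet: a 1006MHz base, 13MHz steps, and roughly 100 ms per change. Everything beyond those three figures is our own assumption. The `next_clock` function is a hypothetical, stripped-down decision rule that looks only at power draw; the real algorithm lives in Nvidia's driver, also weighs utilization and temperature, and can drop below the base clock when performance isn't needed. The wattage readings and the `max_mhz` ceiling are made-up inputs, and we assume each 13MHz step costs one full latency period.

```python
import math

BASE_MHZ = 1006        # GTX 680 base clock
BOOST_MHZ = 1058       # GTX 680 advertised boost clock
STEP_MHZ = 13          # size of one clock step
STEP_LATENCY_MS = 100  # approximate time per clock change, per Petersen


def steps_above_base(clock_mhz):
    """Number of 13MHz steps between the base clock and the given clock."""
    return (clock_mhz - BASE_MHZ) / STEP_MHZ


def ramp_time_ms(from_mhz, to_mhz):
    """Time to move between two clocks, assuming one latency period per step."""
    return math.ceil(abs(to_mhz - from_mhz) / STEP_MHZ) * STEP_LATENCY_MS


def next_clock(clock_mhz, power_w, power_limit_w, max_mhz):
    """One decision of a hypothetical power-only boost loop.

    Step down when over the limit (never below base in this toy model),
    step up when under it (never above max_mhz), otherwise hold.
    """
    if power_w > power_limit_w:
        return max(BASE_MHZ, clock_mhz - STEP_MHZ)
    if power_w < power_limit_w:
        return min(max_mhz, clock_mhz + STEP_MHZ)
    return clock_mhz


if __name__ == "__main__":
    print("Boost clock is", steps_above_base(BOOST_MHZ), "steps above base")
    print("1110MHz is", steps_above_base(1110), "steps above base")
    print("Base to 1110MHz takes about", ramp_time_ms(BASE_MHZ, 1110), "ms")

    clock = BASE_MHZ
    for _ in range(10):  # ten decisions with made-up power readings
        clock = next_clock(clock, power_w=160, power_limit_w=195, max_mhz=1110)
    print("Clock after ten steps with headroom:", clock, "MHz")
```

If the one-latency-per-step assumption holds, climbing from the base clock to the 1110MHz we saw in games takes the better part of a second, which is why the frame-rate targeting feature gave us pause.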

The GeForce GTX 680

Now that we’ve looked at the GPU in some detail, let me drop the specs on you for the first card based on the GK104, the GeForce GTX 680.

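| Card | Base clock (MHz) | Boost clock (MHz) | Shader ALUs | Textures filtered/clock | ROP pixels/clock | Memory transfer rate | Memory interface width (bits) | Idle/peak power draw |
|---|---|---|---|---|---|---|---|---|
| GeForce GTX 680 | 1006 | 1058 | 1536 | 128 | 32 | 6 GT/s | 256 | 15W/195W |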

The GTX 680 has (as far as we know, at least) all of the GK104’s functional units enabled, and it takes that revised memory interface up to 6 GT/s, as advertised. The board’s peak power draw is fairly tame, considering its positioning, but not perhaps considering the class of chip under that cooler.

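| Card | Peak pixel fill rate (Gpixels/s) | Peak bilinear filtering (Gtexels/s) | Peak bilinear FP16 filtering (Gtexels/s) | Peak shader arithmetic (TFLOPS) | Peak rasterization rate (Mtris/s) | Memory bandwidth (GB/s) |
|---|---|---|---|---|---|---|
| GeForce GTX 560 | 29 | 58 | 58 | 1.4 | 1800 | 134 |
| GeForce GTX 560 Ti 448 | 29 | 41 | 41 | 1.3 | 2928 | 152 |
| GeForce GTX 580 | 37 | 49 | 49 | 1.6 | 3088 | 192 |
| GeForce GTX 680 | 32 | 129 | 129 | 3.1 | 4024 | 192 |
| Radeon HD 5870 | 27 | 68 | 34 | 2.7 | 850 | 154 |
| Radeon HD 6970 | 28 | 85 | 43 | 2.7 | 1780 | 176 |
| Radeon HD 7870 | 32 | 80 | 40 | 2.6 | 2000 | 154 |
| Radeon HD 7970 | 30 | 118 | 59 | 3.8 | 1850 | 264 |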

Multiply the chip’s capabilities by its clock speeds, and you get a sense of how the GTX 680 stacks up to the competition. In most key rates, its theoretical peaks are higher than the Radeon HD 7970’s—and our estimates conservatively use the base clock, not the boost clock, as their basis. The only deficits are in peak shader FLOPS, where the 7970 is faster, and in memory bandwidth, thanks to Tahiti’s 384-bit memory interface.
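Here is that multiplication spelled out for the GTX 680, as a rough sketch. The per-clock figures and the 1006MHz base clock come from the spec tables earlier in this review. The one outside assumption is that each ALU retires one fused multiply-add, counted as two FLOPs, per clock. The article doesn't state that, but it reproduces the 3.1 TFLOPS figure. The function and key names are ours.

```python
def peak_rates(clock_mhz, rop_pixels_per_clk, texels_per_clk, alus,
               tris_per_clk, flops_per_alu_per_clk=2):
    """Theoretical peak rates from per-clock capabilities and a clock speed.

    flops_per_alu_per_clk=2 assumes one fused multiply-add per ALU per clock.
    """
    ghz = clock_mhz / 1000
    return {
        "pixel_fill_gpixels_s": rop_pixels_per_clk * ghz,
        "bilinear_gtexels_s": texels_per_clk * ghz,
        "shader_tflops": alus * flops_per_alu_per_clk * ghz / 1000,
        "raster_mtris_s": tris_per_clk * clock_mhz,
    }


if __name__ == "__main__":
    # GTX 680 at its base clock: 32 ROP pixels, 128 texels, 1536 ALUs, 4 triangles per clock
    for name, value in peak_rates(1006, 32, 128, 1536, 4).items():
        print(f"{name}: {value:.2f}")
```

Swap in 1058 for the clock to see what the same card would be rated at if we had used the boost clock instead.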

With that said, you may or may not be pleased to hear that Nvidia has priced the GeForce GTX 680 at $499.99. On one hand, that undercuts the Radeon HD 7970 by 50 bucks and should be a decent deal given its specs. On the other, that’s a lot more than you’d expect to pay for the spiritual successor to the GeForce GTX 560 Ti—and despite its name, the GTX 680 is most definitely that. Simply knowing that fact may create a bit of a pain point for some of us, even if the price is justified based on this card’s performance.

Thanks to its relatively low peak power consumption, the GTX 680 can get away with only two six-pin power inputs. Strangely, Nvidia has staggered those inputs, supposedly to make them easier to access. However, notice that the orientation on the lower input is rotated 180° from the upper one. That means the tabs to release the power plugs are both “inside,” facing each other, which makes them harder to grasp. I don’t know what part of this arrangement is better than the usual side-by-side layout.
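As a quick check on why two six-pin plugs are enough, the sketch below adds up the power a board may draw within the PCI Express limits: 75W from the x16 slot, 75W per six-pin connector, and 150W per eight-pin connector. Those limits come from the PCIe specification rather than from this review, and the helper name is ours.

```python
PCIE_SLOT_W = 75   # power available from a PCIe x16 slot
SIX_PIN_W = 75     # power available from one six-pin auxiliary connector
EIGHT_PIN_W = 150  # power available from one eight-pin auxiliary connector


def board_power_limit_w(six_pin=0, eight_pin=0):
    """In-spec power available to a card with the given auxiliary connectors."""
    return PCIE_SLOT_W + six_pin * SIX_PIN_W + eight_pin * EIGHT_PIN_W


if __name__ == "__main__":
    limit = board_power_limit_w(six_pin=2)
    peak = 195  # GTX 680 peak board power, in watts
    print(f"Limit {limit}W, peak {peak}W, headroom {limit - peak}W")
```

That leaves 30W of in-spec headroom over the card's rated peak.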

The 680’s display outputs are a model of simplicity: two dual-link DVI ports, an HDMI output, and a full-sized DisplayPort connector.

At 10″, the GTX 680 is just over half an inch shorter than its closest competitor, the Radeon HD 7970.

Our testing methods
This review marks the debut of our new GPU test rigs, which we’ve already outed here. They’ve performed wonderfully for us, with lower operating noise, higher CPU performance in games, and support for PCI Express 3.0.

Oh, before we move on, please note below that we’ve tested stock-clocked variants of most of the graphics cards involved, including the Radeon HD 7970, 7870, 6970, and 5870 and the GeForce GTX 580 and 680. We agonized over whether to use a Radeon HD 7970 card like the XFX Black Edition, which runs 75MHz faster than AMD’s reference clock. However, we decided to stick with stock clocks for the higher-priced cards this time around. We expect board makers to offer higher-clocked variants of the GTX 680, which we’ll happily compare to higher-clocked 7970s once we get our hands on ’em. Although we’re sure our decision will enrage some AMD fans, we don’t think the XFX Black Edition’s $600 price tag would have looked very good in our value scatter plots, and we just didn’t have time to include multiple speed grades of the same product.

As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and we’ve reported the median result.
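For the curious, reporting the median rather than the mean keeps one bad run from dragging the result around. A minimal sketch, with made-up sample numbers and a helper name of our own:

```python
from statistics import mean, median


def reported_result(runs):
    """Median of at least three benchmark runs."""
    if len(runs) < 3:
        raise ValueError("need at least three runs")
    return median(runs)


if __name__ == "__main__":
    runs = [60.1, 60.4, 45.0]  # hypothetical FPS results; the third run hit a hiccup
    print("median:", reported_result(runs))
    print("mean:  ", round(mean(runs), 1))
```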

Our test systems were configured like so:

Processor: Core i7-3820
Motherboard: Gigabyte
Chipset: Intel X79
Memory size: 16GB (4 DIMMs)
Memory type: Corsair Vengeance CMZ16GX3M4X1600C9 DDR3 SDRAM at 1600MHz
Memory timings: 9-9-11-24
Chipset drivers: INF update, Rapid Storage Technology Enterprise
Audio: Integrated, with Realtek drivers
Hard drive: Corsair F240 240GB SATA
Power supply: Corsair
OS: Windows 7 Ultimate x64 Edition, Service Pack 1, DirectX 11 June 2010 Update

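| Graphics card | Driver revision | GPU core clock (MHz) | Memory clock (MHz) | Memory size (MB) |
|---|---|---|---|---|
| Asus GeForce GTX 560 Ti DirectCU II TOP | | 900 | 1050 | 1024 |
| Zotac GeForce GTX 560 Ti 448 | | 765 | 950 | 1280 |
| Zotac GeForce GTX 580 | ForceWare | 772 | 1002 | 1536 |
| GeForce GTX | | 1006 | 1502 | 2048 |
| Matrix Radeon HD 5870 | | 850 | 1200 | 2048 |
| Radeon HD 6970 | Catalyst | 890 | 1375 | 2048 |
| Radeon HD | | 1000 | 1200 | 2048 |
| Radeon HD 7970 | Catalyst | 925 | 1375 | 3072 |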

Thanks to Intel, Corsair, and Gigabyte for helping to outfit our test rigs with some of the finest hardware available. AMD, Nvidia, and the makers of the various products supplied the graphics cards for testing, as well.

Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults. Vertical refresh sync (vsync) was disabled for all tests.

We used the following test applications:

Some further notes on our methods:

  • We used the Fraps utility to record frame rates while playing a 90-second sequence from the game. Although capturing frame rates while playing isn’t precisely repeatable, we tried to make each run as similar as possible to all of the others. We tested each Fraps sequence five times per video card in order to counteract any variability. We’ve included frame-by-frame results from Fraps for each game, and in those plots, you’re seeing the results from a single, representative pass through the test sequence.

  • We measured total system power consumption at the wall socket using a Yokogawa WT210 digital power meter. The monitor was plugged into a separate outlet, so its power draw was not part of our measurement. The cards were plugged into a motherboard on an open test bench.

    The idle measurements were taken at the Windows desktop with the Aero theme enabled. The cards were tested under load running Skyrim at its Ultra quality settings with FXAA enabled.

  • We measured noise levels on our test system, sitting on an open test bench, using an Extech 407738 digital sound level meter. The meter was mounted on a tripod approximately 10″ from the test system at a height even with the top of the video card.

    You can think of these noise level measurements much like our system power consumption tests, because the entire system’s noise levels were measured. Of course, noise levels will vary greatly in the real world along with the acoustic properties of the PC enclosure used, whether the enclosure provides adequate cooling to avoid a card’s highest fan speeds, placement of the enclosure in the room, and a whole range of other variables. These results should give a reasonably good picture of comparative fan noise, though.

  • We used GPU-Z to log GPU temperatures during our load testing.

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

Texture filtering
We’ll begin with a series of synthetic tests aimed at exposing the true, delivered throughput of the GPUs. In each instance, we’ve included a table with the relevant theoretical rates for each solution, for reference.

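| Card | Peak pixel fill rate (Gpixels/s) | Peak bilinear filtering (Gtexels/s) | Peak bilinear FP16 filtering (Gtexels/s) | Memory bandwidth (GB/s) |
|---|---|---|---|---|
| GeForce GTX 560 | 29 | 58 | 58 | 134 |
| GeForce GTX 560 Ti 448 | 29 | 41 | 41 | 152 |
| GeForce GTX 580 | 37 | 49 | 49 | 192 |
| GeForce GTX 680 | 32 | 129 | 129 | 192 |
| Radeon HD 5870 | 27 | 68 | 34 | 154 |
| Radeon HD 6970 | 28 | 85 | 43 | 176 |
| Radeon HD 7870 | 32 | 80 | 40 | 154 |
| Radeon HD 7970 | 30 | 118 | 59 | 264 |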

The pixel fill rate is, in theory, determined by the speed of the ROP hardware, but this test usually winds up being limited by memory bandwidth long before the ROPs run out of steam. That appears to be the case here. Somewhat surprisingly, the GTX 680 manages to match the Radeon HD 7970 almost exactly, even though the Radeon has substantially more potential memory bandwidth on tap.

Nvidia’s new toy comes out looking very good in terms of texturing capacity, more than doubling the performance of the GeForce GTX 580 in the texture fill and integer filtering tests. Kepler’s full-rate FP16 filtering allows it to outperform the 7970 substantially in the final test. In no case does the GTX 680’s relatively lower memory bandwidth appear to hinder its ability to keep up with the 7970.

Tessellation and geometry throughput

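| Card | Peak rasterization rate (Mtris/s) | Memory bandwidth (GB/s) |
|---|---|---|
| GeForce GTX 560 | 1800 | 134 |
| GeForce GTX 560 Ti 448 | 2928 | 152 |
| GeForce GTX 580 | 3088 | 192 |
| GeForce GTX 680 | 4024 | 192 |
| Radeon HD 5870 | 850 | 154 |
| Radeon HD 6970 | 1780 | 176 |
| Radeon HD 7870 | 2000 | 154 |
| Radeon HD 7970 | 1850 | 264 |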

Although the GTX 680 has a higher theoretical rasterization rate than the GTX 580, the GK104 GPU has only half as many setup and tessellator units (aka PolyMorph engines) as the GF110. Despite that fact, the GTX 680 achieves twice the tessellation performance of Fermi. The GTX 680 even exceeds that rate in TessMark’s 64X expansion test, where it’s nearly three times the speed of the Radeon HD 7970. We doubt we’ll see a good use of a 64X geometry expansion factor in a game this year, but the Kepler architecture clearly has plenty of headroom here.

Shader performance

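| Card | Peak shader arithmetic (TFLOPS) | Memory bandwidth (GB/s) |
|---|---|---|
| GeForce GTX 560 | 1.4 | 134 |
| GeForce GTX 560 Ti 448 | 1.3 | 152 |
| GeForce GTX 580 | 1.6 | 192 |
| GeForce GTX 680 | 3.1 | 192 |
| Radeon HD 5870 | 2.7 | 154 |
| Radeon HD 6970 | 2.7 | 176 |
| Radeon HD 7870 | 2.6 | 154 |
| Radeon HD 7970 | 3.8 | 264 |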

Our first look at the performance of Kepler’s re-architected SMX yields some mixed, and intriguing, results. The trouble with many of these tests is that they split so cleanly along architectural or even brand lines. For instance, the 3DMark particles test runs faster on any GeForce than on any Radeon. We’re left a little flummoxed by the fact that the 7970 wins three tests outright, and the GTX 680 wins the other three. What do we make of that, other than to call it even?

Nonetheless, there are clear positives here, such as the GTX 680 taking the top spot in the ShaderToyMark and GPU cloth tests. The GTX 680 improves on the Fermi-based GTX 580’s performance in five of the six tests, sometimes by wide margins. Still, for a card with the same memory bandwidth and ostensibly twice the shader FLOPS, the GTX 680 doesn’t appear to outperform the GTX 580 as comprehensively as one might expect.

GPU computing performance

This benchmark, built into Civ V, uses DirectCompute to perform compression on a series of textures. Again, this is a nice result from the new GeForce, though the 7970 is a smidge faster in the end.

Here’s where we start to worry. In spite of doing well in our graphics-related shader benchmarks and in the DirectCompute test above, the GTX 680 tanks in LuxMark’s OpenCL-driven ray-tracing test. Even a quad-core CPU is faster! The shame! More notably, the GTX 680 trails the GTX 580 by a mile—and the Radeon HD 7970 by several. Nvidia tells us LuxMark isn’t a target for driver optimization and may never be. We suppose that’s fine, but we’re left wondering just how much Kepler’s compiler-controlled shaders will rely on software tuning in order to achieve good throughput in GPU computing applications. Yes, this is only one test, and no, there aren’t many good OpenCL benchmarks yet. Still, we’re left to wonder.

Then again, we are in the early days for OpenCL support generally, and AMD seems to be very committed to supporting this API. Notice how the Core i7-3820 runs this test faster when using AMD’s APP driver than when using Intel’s own OpenCL ICD. If a brainiac monster like Sandy Bridge-E can benefit that much from AMD’s software tuning over Intel’s own, well, we can’t lay much fault at Kepler’s feet just yet.

The Elder Scrolls V: Skyrim
Our test run for Skyrim was a lap around the town of Whiterun, starting up high at the castle entrance, descending the stairs into the main part of town, and then doing a figure-eight around the main drag.

Since these are pretty capable graphics cards, we set the game to its “Ultra” preset, which turns on 4X multisampled antialiasing. We then layered on FXAA post-process antialiasing, as well, for the best possible image quality without editing an .ini file.

At this point, you may be wondering what’s going on with the funky plots shown above. Those are the raw data for our snazzy new game benchmarking methods, which focus on the time taken to render each frame rather than a frame rate averaged over a second. For more information on why we’re testing this way, please read this article, which explains almost everything.

Frame time in milliseconds, with the equivalent FPS rate
8.3 120
16.7 60
20 50
25 40
33.3 30
50 20

If that’s too much work for you, the basic premise is simple enough. The key to creating a smooth animation in a game is to flip from one frame to the next as quickly as possible in continuous fashion. The plots above show the time required to produce each frame of the animation, on each card, in our 90-second Skyrim test run. As you can see, some of the cards struggled here, particularly the GeForce GTX 560 Ti, which was running low on video memory. Those long waits for individual frames, some of them 100 milliseconds (that’s a tenth of a second) or more, produce less-than-fluid action in the game.

Notice that, in dealing with render times for individual frames, longer waits are a bad thing—lower is better, when it comes to latencies. For those who prefer to think in terms of FPS, we’ve provided the handy table at the right, which offers some conversions. See how, in the last plot, frame times are generally lower for the GeForce GTX 680 than for the Radeon HD 7970, and so the GTX 680 produces more total frames? Well, that translates into…

…higher FPS averages for the new GeForce. Quite a bit higher, in this case. Also notice that some of our worst offenders in terms of long frame times, such as the GeForce GTX 560 Ti and the GTX 560 Ti 448, produce seemingly “acceptable” frame rates of 41 and 50 FPS, respectively. We might expect that FPS number to translate into adequate performance, but we know from looking at the plot that’s not the case.
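To make that point concrete, here is a minimal Python sketch of how an FPS average can look fine while hiding a few long frames. The frame times below are hypothetical values chosen for illustration, not data from our test runs, and the function names are our own.

```python
# Hypothetical frame times in milliseconds (illustrative only, not measured data):
# 57 quick frames plus three long ones.
frame_times_ms = [16.7] * 57 + [100.0, 120.0, 150.0]


def ms_to_fps(frame_time_ms):
    """Convert one frame time in milliseconds to its equivalent frame rate."""
    return 1000.0 / frame_time_ms


def average_fps(times_ms):
    """Average FPS over a run: total frames divided by total seconds elapsed."""
    total_seconds = sum(times_ms) / 1000.0
    return len(times_ms) / total_seconds


print(round(ms_to_fps(16.7)))                 # matches the 16.7 ms row of the table
print(round(average_fps(frame_times_ms), 1))  # roughly 45 FPS, which sounds acceptable
print(max(frame_times_ms))                    # yet the worst frame took 150 ms
```

The average lands in seemingly comfortable territory even though three frames in this made-up run took a tenth of a second or longer.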

To give us a better sense of the frame latency picture, or the general fluidity of gameplay, we can look at the 99th percentile frame latency—that is, 99% of all frames were rendered during this frame time or less. Once we do that, we can see just how poorly the GTX 560 Ti handles itself here compared to everything else.
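For the curious, a 99th percentile frame time can be computed along these lines. This is a sketch that assumes the simple nearest-rank definition of a percentile; our actual tooling may interpolate differently, and the sample data is hypothetical.

```python
import math


def percentile_frame_time(frame_times_ms, percentile):
    """Nearest-rank percentile: the smallest frame time such that at least
    `percentile` percent of all frames took that long or less."""
    ordered = sorted(frame_times_ms)
    rank = math.ceil(percentile * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical run of 200 frames: mostly 16 ms, with three slow frames.
frame_times_ms = [16.0] * 197 + [40.0, 60.0, 110.0]

print(percentile_frame_time(frame_times_ms, 50))  # 16.0, the typical frame
print(percentile_frame_time(frame_times_ms, 99))  # 40.0, exposing the slow tail
```

Because the metric looks at the slowest frames rather than the typical one, a card that stumbles occasionally gets caught even when its average looks healthy.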

We’re still experimenting with our new methods, and I’m going to drop a couple of new wrinkles on you here today. We think the 99th percentile latency number is a good one, but since it’s just one point among many, we have some concerns about using it alone to convey the general latency picture. As a bit of an experiment, we’ve decided to expand our look at frame times to cover more points, like so.

This illustrates how close the matchup is between several of the cards, especially our headliners, the Radeon HD 7970 and GeForce GTX 680. Although the GeForce generally produces frames in less time than the Radeon, both are very close to that magic 16.7 ms (60 FPS) mark 95% of the time. Adding in those last few percentage points, that last handful of frames that take longer to render, makes the GTX 680’s advantage nearly vanish.

Our next goal is to focus more closely on the tough parts, places where the GPU’s performance limitations may be contributing to less-than-fluid animation, occasional stuttering, or worse. For that, we add up all of the time each GPU spends working on really long frame times, those above 50 milliseconds or (put another way) below about 20 FPS. We’ve explained our rationale behind this one in more detail right here, if you’re curious or just confused.
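As a rough sketch of that calculation, the Python below sums only the portion of each frame time that exceeds the 50 ms threshold. Counting the excess rather than the whole frame is our assumption here, and the frame times are hypothetical.

```python
def time_beyond_threshold(frame_times_ms, threshold_ms=50.0):
    """Total milliseconds spent past the threshold, summed over all long frames.
    Only the portion of each frame above the threshold is counted (an assumption)."""
    return sum(t - threshold_ms for t in frame_times_ms if t > threshold_ms)


# Hypothetical run with one big 180 ms hitch and one smaller 62 ms hitch.
frame_times_ms = [16.0, 18.0, 180.0, 17.0, 62.0, 16.0]

print(time_beyond_threshold(frame_times_ms))  # (180 - 50) + (62 - 50) = 142.0
```

A card that never crosses the threshold scores zero, so this number grows only when the animation is really suffering.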

Only the two offenders we’ve already identified really spend any significant time working on really long-to-render frames. The rest of the pack (and I’d include the GTX 580 in this group) handles Skyrim at essentially the highest quality settings quite well.

Batman: Arkham City
We did a little Batman-style free running through the rooftops of Gotham for this one.

Several factors converged to make us choose these settings. One of our goals in preparing this article was to avoid the crazy scenario we had in our GeForce GTX 560 Ti 448 review, where every card tested could run nearly every game adequately. We wanted to push the fastest cards to their limits, not watch them tie a bunch of other cards for adequacy. So we cranked up the resolution and image quality and, yes, even enabled DirectX 11. We had previously avoided using DX11 with this game because the initial release had serious performance problems on pretty much any video card. A patch has since eliminated the worst problems, and the game is now playable in DX11, so we enabled it.

This choice makes sense for benchmarking ultra high-end graphics cards, I think. I have to say, though, that the increase in image quality with DX11 tessellation, soft shadows, and ambient occlusion isn’t really worth the performance penalty you’ll pay. The image quality differences are hard to see; the performance differences are abundantly obvious. This game looks great and runs very smoothly at 2560×1600 in DX9 mode, even on a $250 graphics card.

The GTX 680 again takes the top spot in the FPS sweeps, but as you can see in the plots above, all of the cards produce some long frame times with regularity. As a result of those higher-latency frames, the GTX 680 ties the 7970 in the 99th percentile frame time metric.

A broader look at the latency picture shows that the GTX 680 generally produces lower-latency frames than the 7970, which is why its FPS average is so high. However, that last 1% gives it trouble.

Lots of trouble, when we look at the time spent on long-latency frames. What happened to the GTX 680? Well, look up at the plots above, and you’ll see that, very early in our test run, there was a frame that took nearly 180 ms to produce—nearly a fifth of a second. As we played the game, we experienced this wait as a brief but total interruption in gameplay. That stutter, plus a few other shorter ones, contributed to the 680’s poor showing here. Turns out we ran into this problem with the GTX 680 in four of our five test runs, each time early in the run and each time lasting about 180 ms. Nvidia tells us the slowdown is the result of a problem with its GPU Boost mechanism that will be fixed in an upcoming driver update.

Battlefield 3
We tested Battlefield 3 with all of its DX11 goodness cranked up, including the “Ultra” quality settings with both 4X MSAA and the high-quality version of the post-process FXAA. We tested in the “Operation Guillotine” level, for 60 seconds starting at the third checkpoint.

Blessedly, there aren’t many wrinkles at all in BF3 performance from any of the cards. The 99th percentile frame times mirror the FPS averages, and all is well with the world. Even the slow cards are just generally slow and not plagued with excessively spiky, uneven frame times like we saw in Arkham City. This time, the GeForce GTX 680 outperforms the Radeon HD 7970 in every metric we throw at it, although its advantage is incredibly slim in every case.

Crysis 2
Our cavalcade of punishing but pretty DirectX 11 games continues with Crysis 2, which we patched with both the DX11 and high-res texture updates.

Notice that we left object image quality at “extreme” rather than “ultra,” in order to avoid the insane over-tessellation of flat surfaces that somehow found its way into the DX11 patch. We tested 90 seconds of gameplay in the level pictured above, where we gunned down several bad guys, making our way up the railroad bridge.

The GTX 680 just trails the 7970 in the FPS average, but its 99th percentile frame time falls behind a couple of other cards, including the Radeon HD 7870. Why? If you look at the plot for the GTX 680, you can see how, in the opening portion of the test run, its frame times range regularly into the 30-millisecond range. That’s probably why its 99th percentile frame time is 32 milliseconds—or, translated, roughly 30 FPS—and therefore nothing to worry about in the grand scheme. The GTX 680 devotes almost no time to really long frames, and its performance is quite acceptable here—just not quite as good as the 7970’s during those opening moments of the test sequence.

Serious Sam 3: BFE
We tested Serious Sam 3 at its “Ultra” quality settings, only tweaking it to remove the strange two-megapixel cap on the rendering resolution.

How interesting. Generally, this is one of those games where a particular sort of GPU architecture tends to do well—Radeons, in this case. However, the GeForce GTX 680 is different enough from its siblings that it utterly reverses that trend, effectively tying the Radeon HD 7970.

Power consumption

We’re pretty pleased with the nice, low power consumption numbers our new test rigs are capable of producing at idle. Not bad for quad memory channels, Sandy Bridge Extreme, and an 850W PSU, eh?

Although the entire system’s power draw is part of our measurement, the display is not. The reason we’re testing with the display off is that the new Radeons are capable of going into a special ultra-low-power mode, called ZeroCore Power, when the display goes into standby. Most of the chip is turned off, and the GPU cooling fans spin down to a halt. That allows them to save about 12W of power draw on our test system, a feat the GTX 680 can’t match. Still, the 680’s power draw at idle is otherwise comparable to the 7970’s, with only about a watt’s worth of difference between them.

We’re running Skyrim for this test, and here’s where Kepler’s power efficiency becomes readily apparent. When equipped with the Radeon HD 7970, our test rig requires over 40W more power under load than it does when a GeForce GTX 680 is installed. You can see why I’ve said this is the same class of GPU as the GeForce GTX 560 Ti, although its performance is a generation beyond that.

Since we tested power consumption in Skyrim, we can mash that data up with our performance results to create a rough picture of power efficiency. By this measure, the GTX 680 is far and away the most power-efficient performer we’ve tested.
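The mash-up itself is simple division. Here is a minimal sketch, with hypothetical card names and numbers standing in for our measurements, of ranking cards by average FPS per watt of system power under load.

```python
# Hypothetical (average FPS, system watts under load) pairs; not our measured results.
results = {
    "Card A": (80.0, 300.0),
    "Card B": (75.0, 340.0),
}


def fps_per_watt(avg_fps, system_watts):
    """Frames per second delivered for each watt the whole system draws."""
    return avg_fps / system_watts


# Sort card names from most to least efficient.
ranked = sorted(results, key=lambda name: fps_per_watt(*results[name]), reverse=True)

for name in ranked:
    print(name, round(fps_per_watt(*results[name]), 3))
```

Note that this uses whole-system power, as our measurements do, so the GPU's own efficiency advantage is somewhat diluted by the rest of the test rig.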

Noise levels and GPU temperatures

Even though the Radeon HD 7970 can turn off its cooling fan when the display goes into power-save, it doesn’t convey any measurable advantage here. The GTX 680 essentially adds nothing to our system’s total noise levels, which consist almost entirely of noise from the (very quiet) CPU cooler.

Under load, the GTX 680’s cooler performs admirably, maintaining the same GPU temperature as the 7970 while generating substantially less sound pressure. Of course, the GTX 680’s cooler has quite a bit less power (and thus heat) to deal with, but Nvidia has a long tradition of acoustic excellence for its coolers, dating back to at least the GeForce 8800 GTX (though not, you know, NV30).

We’re not terribly pleased with the fan speed profile AMD has chosen for its stock 7970 cards, which seems to be rather noisy. However, we should note that we’ve seen much better cooling and acoustic performance out of XFX’s Radeon HD 7970 Black Edition, a card with slightly higher clock speeds. It’s a little pricey, but it’s also clearly superior to the reference design.

Going scatter-brained
The scatter plot of power and performance on the previous page has inspired me to try a bit of an experiment. This is just for fun, so feel free to skip ahead if you’d like. I’m just curious to see what we can learn by mashing up some other bits of info with our overall performance data across all of the games we tested.

This one isn’t really fair at all, since we haven’t normalized for the chip fabrication process involved. The three GPUs produced on a 28-nm process are all vastly superior, in terms of performance per area, to their 40-nm counterparts. The difference in size between the GeForce GTX 580 and the Radeon HD 7870, for roughly equivalent performance, is comical. The GTX 680 looks quite good among the three 28-nm chips, with higher performance and a smaller die area than the 7970.

The next few scatters are for the GPU architecture geeks who might be wondering about all of those graphics rates we’re always quoting and measuring. Here’s a look at how the theoretical peak numbers in different categories track with delivered performance in games. What we’re looking for here is a strong or weak correlation; a stronger correlation should give us a nice collection of points roughly forming a diagonal line, or something close to it.

The first couple of plots, with rasterization rate and FLOPS, don’t show us much correlation at all between these properties and in-game performance. The final three begin to fall into line a little bit, with memory bandwidth and ROP rate (or pixel fill) being most strongly correlated, to my eye. Notice that the GeForce GTX 680 is apparently very efficient with its memory bandwidth, well outside of the norm.
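I’m judging these correlations by eye, but one could put a number on them. The sketch below computes a Pearson correlation coefficient by hand; the choice of Pearson’s r is my assumption rather than anything used in the plots, and both data sets are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient: +1 or -1 means the points fall on a
    straight line, while values near 0 mean little linear relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical peak rates and game FPS for four cards (illustrative only).
peak_rate = [1.0, 2.0, 3.0, 4.0]
fps_tracks_rate = [30.0, 42.0, 48.0, 61.0]   # points roughly form a diagonal line
fps_ignores_rate = [50.0, 30.0, 55.0, 35.0]  # points scatter with no clear trend

print(round(pearson_r(peak_rate, fps_tracks_rate), 2))
print(round(pearson_r(peak_rate, fps_ignores_rate), 2))
```

The first pair scores close to 1, the numeric version of a tidy diagonal; the second lands much nearer to zero.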

These results led me to wonder whether the correlations would grow stronger if we subbed in the results of directed tests instead of theoretical peak numbers. We do have some of that data, so…

ShaderToyMark gives us the strongest correlation, which shouldn’t be too much of a surprise, since it’s the most game-like graphics workload among our directed tests. Otherwise, I’m not sure we can draw too many strong conclusions from these results, other than to say that the GTX 680 sure looks to have an abundance of riches when it comes to FP16 texture filtering.

With a tremendous amount of information now under our belts, we can boil things down, almost cruelly, to a few simple results in a final couple of scatter plots. First up is our overall performance index, in terms of average FPS across all of the games we tested, matched against the price of each card. As usual, the most desirable position on these plots is closer to the top left corner, where the performance is higher and the price is lower.

The GeForce GTX 680 is slightly faster and 50 bucks less expensive than the Radeon HD 7970, so it lands in a better position on this first plot. However, if we switch to an arguably superior method of understanding gaming performance and smoothness, our 99th percentile frame time (converted to FPS so the plot reads the same), the results change a bit.

The GTX 680’s few instances of higher frame latencies, such as that apparent GPU Boost issue in Arkham City, move it just a couple of ticks below the Radeon HD 7970 in overall performance. Then again, the GTX 680 costs $50 less, so it’s still a comparable value.
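To illustrate how a card can be a touch slower yet remain a comparable value, here is a small sketch that converts a 99th percentile frame time to FPS and divides by price. The card names, prices, and frame times are hypothetical, chosen only to mirror a $50 price gap.

```python
def frame_time_to_fps(frame_time_ms):
    """Convert a frame time in milliseconds to the equivalent frame rate."""
    return 1000.0 / frame_time_ms


def fps_per_100_dollars(frame_time_ms, price_usd):
    """Equivalent FPS delivered per $100 of purchase price."""
    return frame_time_to_fps(frame_time_ms) / price_usd * 100.0


# Hypothetical cards: A is $50 cheaper, B has a slightly better 99th percentile.
card_a = fps_per_100_dollars(frame_time_ms=26.0, price_usd=500)
card_b = fps_per_100_dollars(frame_time_ms=25.0, price_usd=550)

print(round(card_a, 2), round(card_b, 2))
```

With these made-up inputs, the cheaper card delivers slightly more equivalent FPS per dollar despite its longer 99th percentile frame time.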

The truth is that, either way you look at it, there is very little performance difference between these two cards, and any difference is probably imperceptible to the average person.

GeForce GTX 680
March 2012

We’ve already established that the GTX 680 is more power efficient than the Radeon HD 7970 (at least when running a game, if not sitting idle), and it’s quieter, too. In fact, there is very little not to like about the GeForce GTX 680. With this GPU, Nvidia catches up to AMD on a whole host of fronts overnight, from power efficiency and performance to process tech and feature set. Nvidia was even able to supply us with working software that uses its H.264 video encoder, something AMD has yet to do for the Radeon HD 7970 and friends. All of those considerations lead us, almost inescapably, to one conclusion: the GeForce GTX 680 has earned itself an Editor’s Choice award for being the most desirable video card in its class.

That honor comes with some small caveats, though. For one, if you require the absolutely fastest single-GPU video card available, with price as no object, then you’ll probably want to check out some of the higher-clocked versions of the Radeon HD 7970, like XFX’s Black Edition. We figure a slight bump in GPU core and memory clocks ought to put the 7970 solidly over the top versus a stock-clocked GTX 680 like the one we tested—and we don’t yet have any information about what board makers will be doing with the GTX 680. You’ll have to be very finely attuned to bragging rights for any of this to matter, though. Fortunately, AMD and Nvidia are so attuned, and I expect to see higher-clocked variants of both cards hitting the market in the coming weeks in an attempt to establish a clear winner. That should be fun to watch.

Also, the GeForce GTX 680 is a massive generational improvement, extracting roughly twice the performance of the GeForce GTX 560 Ti from a similar class of GPU. Still, we’re a little disappointed Nvidia isn’t passing along more of those gains to consumers in the form of higher performance per dollar, as has happened in the past. Half a grand is a lot to ask for a mid-sized chip on a card with a 256-bit memory interface. We had a similar complaint when AMD introduced the Radeon HD 7970, and at that time, we expressed the hope that competition from Nvidia would drive prices down. Now, we’re having to face the reality that the problem isn’t really lack of competitive fire at the GPU companies; it’s the limited number of 28-nm wafers coming out of TSMC, which makes the chips for both firms. The number of good chips per wafer is likely an issue, too. AMD and Nvidia will probably be able to sell all of the chips they can get at current prices for a while, simply because of supply constraints.

We hate to be a Debbie Downer here, though, so we’ll mention something else. The GTX 680 is part of a family of Kepler-based products, and as the fastest member so far, it’s bound to command a premium. But we expect to see a whole range of new cards and GPUs based on Kepler to be hitting the market in the coming months, almost all of them more affordable than this one. Given the amazing efficiency of the Kepler architecture, we expect really good things to follow—and we’ll be there, making up ridiculous ways to plot the goodness.

If you’re a total nerd, you can always follow me on Twitter.

0 responses to “Nvidia’s GeForce GTX 680 graphics processor reviewed”

  1. I’m not so sure I agree. What is really the difference between a $300 video card and a $500 video card? $500 is 125kg of pennies (a very large bucket) and $300 is 75kg of pennies (a somewhat smaller, but still large bucket).

    I needed to choose a GTX 680 for my rig because my rig is a Mini-ITX box the size of a toaster. If I stick a 580 in it, it will *die*.

    But even looking at it from a performance perspective, the 680 runs circles around the 580, which is enough to justify the price, and, oh, wait, hold on, it consumes *less* power? Yup. How is that technological advancement not worth the extra 66% in cost (which is already justified from performance alone)?

  2. Stick to the current generation. When Big Kepler GK110 turns up in October it will have to face off against AMD Sea Islands HD 8970 for this Christmas battle. Do not expect AMD to be sitting idle. Sea Islands is expected to be an improved architecture plus AMD will look to go for 400+ sq mm die because the 28nm process would be well understood and robust. Also the Big Kepler GK110 will be a compute heavy die cause Nvidia is talking about 2.5 – 3 times the Fermis (Tesla M2090) DP perf. So a lot of die size goes towards compute features like DP perf, ECC, wider memory subsystem required for compute apps (384 or 512 bit),newer features like virtual memory. These will significantly affect perf scaling in gaming scenarios compared to GTX 680. Moreover Nvidia needs to look at what specs it is manufacturable. The larger the die more the leakage, more difficult the yields. TSMC now has a finished wafers agreement at 28nm vs a good die agreement at 40nm. So Nvidia cannot afford to sell a product with rubbish yields as was the case with Fermi. Expectations are for a 30% gaming improvement. So hold on to your horses. I still feel Nvidia might come out on top given they are likely to go for a huge 550 sq mm die. But until we compare final products we can’t say.

  3. I’ll stop writing inefficient, bloated code the day I stop writing code damn you!

  4. Silicon scaling has been “coming to an end” since the days of the P4 Prescott, maybe even earlier than that if you believe everything you read. When we reach the “end” there will be some new tech to circumvent it instead.

    Hard disk areal density has been getting close to its theoretical limit for ages, yet we keep marching forwards and must now be at a point where density is 20x higher than when perpendicular recording busted through the last naysayer’s “end is nigh” barrier.

    I’ll believe it’s reached the end when things stop getting faster, better and smaller, and if that ever happens, perhaps people can FINALLY STOP WRITING INEFFICIENT, BLOATED CODE and actually start making software advances instead.

  5. yea high end pc’s are more expensive but also better ! full hd, maxed graphics, that is some sweet eye candy you’re paying for if you buy this … if you have some hd screen anyway, but yea of course it’s very expensive, i bought my hd5870 500$ and now i kinda regret as it’s laggy on bf3 🙁 (might be my cpu actually..), but i’ll just save and invest later.. new comp etc, if you really want such a card wait a few months price will drop by a good bunch already. or get some 7850/70 maybe sounds fairly enough.

  6. currently yea but it’s nice to invest in a card that handle future games, just saying :p

  7. GPU Boost sounds interesting. Is the delay always 100 ms, or is that just an “oh, we measured that much once” type of typical number? In other words, *does it vary*? Can the driver get interrupted by high CPU utilisation?

  8. This isn’t high-end, only close.

    The only thing that’s “high-end” is the price of these.

  9. Too much focus on the high-end. Even at an enthusiast website I doubt many will bother buying either this one or AMD’s offer.

  10. They ARE available every day at and sites like (and the price for EVGA models is being kept at MSRP level even by Newegg), there’s just very limited daily stock of them which gets quickly bought in about 10-20 minutes.

  11. I really don’t see how AMD failed this time. First of all NVIDIA came with ONE graphic card and for a long time this is it. Everybody is talking that this is 560Ti replacement. Ok, let me see the highend Kepler!!!. Wait… there nothing else to launch for Nvidia. And if you think there are so many people who can afford to buy this, think again (by the way I’m living in Romania and here the GTX680 is more expensive ~500-550Euro so that means 700-750$). On the other hand AMD is having almost a full range and can play with prices if they want. Now Nvidia can change their gaming logo to “GTX680 the way was meant…”

  12. Yield of 28nm are awful, meaning it brings no cost benefits.

    Plus the US destroyed the world’s economy and they can’t manage to get sane enough again to actually really fix that.

  13. If you’re going to use a positive-pressure chassis like the Raven 2 (or the much less ugly Fortress 2) then you probably want an integrated water-cooler from Corsair or the like as well, to take advantage of the case’s intake pressure.

    Then, of course, you wouldn’t be replacing the heat-sink fan combination on the GPU but rather sticking with stock exhausting solutions, as the increased pressure provided by the chassis would keep the fans spinning lower and quieter as a result.

    This is essentially what I did with my Fractal Define R3. I added intake fans wherever possible and my 2x HD6950 2GB cards and 4.8GHz 2500k stay near silent, powered by a Silverstone X650 Gold PSU.

  14. The 7970’s die is more than 30% larger than that of the 680.

    But I’m hypothesizing that the 680 will turn out like the 8800GT. And the 9800GTX’s die was 25% larger than that of the 8800GT (eventually called 9800GT).

    If Big Kepler’s die is about the same size as that of the 7970, that would mean the 680 is almost perfectly analogous to the 8800GT in its day. The only difference would be that Nvidia properly named the 680, as opposed to the 8800GT’s painfully inaccurate moniker.

    And once we consider the 680’s lack of innovation in the compute space, the argument for “Big Kepler” becomes even stronger. Compute clients don’t care about midrange parts, they go for the biggest and best they can get. If there is a Big Kepler, the 680’s lackluster compute abilities won’t matter.

    And don’t worry about price. The 680 is priced as its name suggests it should be priced. Just don’t be surprised if we see a GTX 685 (or GTX 680+) at the $500 price point in the near future.

  15. Both AMD and Nvidia can try and have some huge GPUs made that would crush both Tahiti and Kepler, then they’d list them at 2x the price and you would still buy a GTX 680. Would that make you happier? Note that making a bigger and faster GPU won’t necessarily make the GTX 680 or HD7970 any cheaper.

  16. The 680 IS mid-range. This is the 8800GT all over again, but this time, Nvidia got the naming right.

    It is a bit depressing that Nvidia’s mid-range Kepler competes with AMD’s best Sea Islands product. If AMD had its act together, we could have the 680 named something like 660 and save at least $200.

  17. I can confirm the part @CPUs, here the i5 2500K is more expensive than when it was launched…..insane.

  18. I’ve noticed this as well, prices here in the UK have not come down at all since the launch of the last generation of cards. In fact if anything, they’ve gone up, and the problem isn’t only confined to GPUs. The Core i7 2600K cost around £200 shortly after launch, then, the price rose to £250 – £260, and now, the price is back down to £220 – what a bargain!

    In fact, I’ve still got the email for 4th May 2011 from :

    “Intel Core i7-2600K 3.40GHz (Sandybridge) Socket LGA1155 Pro
    save £15.40
    £215.99 inc VAT”

  19. Ok look stock vs stock GTX 680 > HD 7970

    But the GTX 680 did not usher in a new era in which super high end performance becomes available to the masses. Hell even the “mid-end” HD 7870 is 50% more expensive in my country compared to what i paid for my GTX 560Ti last year, right after launch (also i need to add that even this card hasn’t come down in price much since).

    So W…T…F is going on? Why are the prices so high? Will they come down soon? No? When then?

  20. I did. And it’s a very good point but it’s not the whole story which is why I mentioned the 7970 vs 580 comparison – the 580 uses more total power but the total system noise is lower – because of the fan.

    As for multiple data points, sure, go for it. I can always skip over that graph 😀

  21. Is anybody else worried that they’re not going to fill in the mid-range properly? AMD already have that weird gap where 78xx is still quite expensive with fairly high power draw and 77xx isn’t at all compelling for price/performance. So far nVidia only seem to have 2 Kepler chips about – the mobile 640/650/660 part with 384 shaders and the desktop 680 part with 1536.

    I want to see something in the 1152 – 768 shader ballpark with performance that’s not crippled by poor memory bandwidth and a reasonable price, performance and power draw balance. That would be *real* competition for AMD, because as yet they have nothing decent to fight back with from their current range.

    Unfortunately I’m rather worried that won’t happen, because their mobile range is packed out with Fermi in the crucial 675/670 bracket, with that one Kepler part somehow magically filling in for the 660/650/640 area. Extrapolating to how the desktop and mobile lines usually correspond to each other, I don’t see a prospect for a desktop 660 any time soon. Which is a damn shame.

    For perspective, I have a 560 in my notebook that I want to replace with something with similar thermals and 25-30% better performance. Should be feasible with 28nm. I await a contender.

  22. That doesn’t change the performance/watt perspective, though. It does, however, mean that both cards compete, which is what I like to see. No 8800GTX/2900XT murdering going on here.

  23. Did you read the part of my post about overall system noise and heat dissipation from the case? I agree that the abilities of the heat-sink make an enormous difference to the sound level a system will generate, but it’s not the whole story.

    As for graphing thoroughly, I don’t know about smilingcrow, but I never heard mention of multiple graphs. Just a graph based on more than one data point. An average of 3 games, perhaps?

  24. This is all assuming that nVidia will “delay” the bigger Kepler; indications are more that it would never have been around in time to fight this particular battle anyway. Not that it matters, because they’re fighting it damn well on their own terms now. Funny how things can work out.

    I think the problem here is less that AMD didn’t bring their “A” game, and more that their A game hasn’t been that exciting since the 4/5 series launches.

  25. It could end up like the nVidia “hardware decode” on the AGP 6800GT / GTX. It was FUBAR from the start, drivers were promised, they never appeared. Man, that was a while ago.

    I hope this is not the case. We have no evidence either way as yet; probably best not to base a buying decision on it, though.

  26. This is a very solid, well-done review.

    I really love how early in the article, very key aspects of the architecture are explained (hardware implementation vs. compiler) and what they mean.

    The benchmarking is usual high standards fare.

    The conclusions are smart and informative, not just an afterthought.

    Kudos, TR and Scott. Well done and very enjoyable.

  27. Seems my HD7950 is happy enough at 1150mhz as well, with the fan set to kill and voltage at HD7970 level. So there you go.

    edit: which is, given the low stock clocks of a ‘standard’ HD7950, a staggering 43% overclock

  28. Talk about the truth.. there isn’t a single 680 available at GL to all of you who want one, these cards are not gonna sell for their listed price for long.

  29. Actually in terms of the amount of power used, I would claim the 7xxx series to have NV beat depending upon what you are looking for… The ZeroCore power feature where the chip can basically just turn itself totally off dang near is a neat feature I wish my GTX285 had… and with the ZeroCore power feature enabled and the monitor off, the Radeons are using a decently less amount of power than the 680. In terms of peak power consumption during load… I can really less care other than for knowing how large of a PSU I need to have… but I would say for a good chunk of MY computers life, it sits there idle with the monitor turned off, so ZeroCore is something that is appealing to me.

    Granted, like almost everyone else here I am hardly in the market for a 500+ dollar videocard… so this is just hypothetical for me as well… but… considering that the 7970 and the 560 are (really, more or less) identical in terms of performance, I would spend the extra 50 for the 7970 mainly for the ZeroCore feature. But then again for me I kind of see it as “well I’m already spending 500+ dollars for a uber-card… whats another 50 at this point… ”

    And the reason I do not turn off my computer is because it’s also doubling up as a file server for the rest of the house… hence why it runs 24/7 and why the ZeroCore is something I want…

  30. Er, I get that more performance at less watts is a good thing. I’m just not getting why you need it graphed so thoroughly. It’s one of those metrics where if it’s in the ballpark then it’s pretty much a wash, but if it’s very different then you really don’t need a slew of graphs. And you keep explaining it in ways that are long on words and short on concrete. And you completely ignore the very important role of the heatsink in all of this. I’ve responded point by point and all you do is reword the same points. My point is that by the time it matters you don’t really need a graph to know it. What’s your point?

  31. Spunjii and I tried but you are still completely missing the point; I’m outta here.

  32. performance per dB – yes! – thank you. That’s what smilingcrow is really asking for.

  33. He’s being downvoted? I see -1 on two posts and +1 on one. That’s not too bad, LOL.

    I am a *freak* for silent computing. If I can hear the fan at all it bothers me. Unless I'm actually playing a game, in which case I can't hear the fan because the speakers or headphones are on. I have an extremely decent power supply sitting on the shelf because I replaced it with a Seasonic X-560, at a cost of $120 (not cheap for me!), so I could get a PSU with a fan that doesn't spin at idle power levels. My next purchase is a Noctua 120mm fan to replace the stock fan on a Cooler Master Hyper 212 because that is too loud. All of that is said to establish my silent computing cred.

    Perhaps, if silent computing is the goal, what would make most sense is to have performance per dB graphed. That would account for all the variables and would be a tangible metric. It would mean something *really tangible*. You see, all that's needed for the 7970 to be every bit as quiet as the 680 is for Asus or someone else to sell one with a custom cooler, or for you to slap on your own water cooler. Suddenly performance per watt loses all bearing on which card is quietest, completely. That's my point. Performance per watt does not determine the sound pressure level that a card emits - the heatsink does and the absolute wattage does and the permitted temperature does.

    The GTX 580 is quieter than the HD 7970 but it is a slower, hotter (so much worse performance per watt) card. But according to what's being said here - that performance per watt directly relates to a card's sound pressure level - the GTX 580 should be much louder than the 7970. But it's not; it's quieter.

    All of this to say that I don't see why performance per shader is any less interesting than performance per watt, and I don't think you need more than one game tested, realistically. And now I think performance per dB should accompany or even replace performance per watt.

    And in a perfect world all of this would be done in a closed case and system temperatures would be recorded and plotted as well, but TR's test bench isn't even set up in a way that makes that possible (open cases are used).

  34. I’m not aware of any stock coolers that are silent as opposed to quiet which is what I was referring to; this is in relationship to GPUs with a TDP of 100W+ as there are a number of fanless graphics cards using lower power GPUs.
    Decibels aren’t subjective as it’s a scientific measurement as much as watts, millimetres etc.

    I think PPW is interesting for a number of reasons as I outlined previously. I’m personally interested in silent computing but I’m not looking for TR etc to provide me performance per decibel figures. For a start none of the coolers are silent so the data is of no interest to me. If you want silent high performance computing then you are looking at using 3rd party heatsinks, fans etc. I can get that info from sites such as SilentPCReview but it would be good to get PPW data from the more mainstream tech sites as it’s an important metric along with the GPU die size when estimating the best card to use for such niche areas.

  35. Thanks Spunjii.

    Maybe a simpler way to describe it is that scenarios 1, 2 and 4 above all face a power cap due to various constraints. e.g. 100W maximum.
    So in each situation you are looking to see which GPU can give the best performance whilst keeping under 101W.
    If we were talking CPUs it would be easy, just buy a Sandy Bridge derivative or wait for IB.
    Bulldozer’s performance was disappointing but its PPW makes it a grand folly for anyone facing a power constraint but also requiring the highest possible performance. I’ve seen figures that show it as having 50% of the PPW of SB.

    So I’m interested to see PPW figures for GPUs also but I’m not suggesting that we need to see figures for every game. For games it would be good to take at least one game where each platform is strong and another where they are matched and see if there is any correlation between their performance and power consumption. Even having 3 data sets and averaging them would be more meaningful by itself.

    Personally I’d be very interested how they perform in terms of PPW when running non gaming tasks; CUDA, OpenCL etc.

  36. I don’t think the 680 is faster in enough benchmarks to warrant calling it out-and-out “faster”; “quieter” is a completely useless metric in graphics card reviews considering that most graphics cards purchased will not be using the stock cooler; and it doesn’t have ZeroCore.

    I think the 680 is better, but not by that much. And, it’s not nearly as bad as past nVidia launches have been for AMD.

  37. performance per watt is a cool metric but getting in detail about it across different games is sorta… tedious and irrelevant. smilingcrow seems to want to know decibel per performance since sound is his concern and that is extremely subjective. There is no finite empirical comparison for what smilingcrow wants… Buy the one with the quietest cooler if that is what you want. Seems like you can run your own numbers and figure out things without having tech report cater to your unique concerns. They’ve already done extensive reviews on different coolers. I think the MSI Frozr is quietest and they put that on every card they distribute.

    Given time the 680 GTX will have that cooler too.

  38. Flip-mode, you seem to be arguing against a point of view that you don’t understand.

    While you are correct that actual power use at full load is a crucial metric in low power/noise systems, it is still useful to know which cards are netting you the most performance at a given wattage in order to squeeze the most performance out of a power/thermally constrained system. It’s no good buying the highest power card you can fit in the envelope if the performance improvement over a low power card is minimal. More data points for their performance/watt graph (i.e. more games tested) would make that decision a better informed one for users who are looking to build quiet systems. You’re right that performance/watt *alone* is a poor statistic for making decisions, but in combination with other knowledge (e.g. maximum power draw) it is a potent tool for getting the absolute maximum performance from a lower-power/heat/noise system.

    This brings me onto your claim that the heat-sink is the most important factor for silent computing with a GPU – that is a thoroughly incorrect assertion. Overall system heat is a direct result of the power draw of all system components, and removing that heat requires fans, which means noise (I’m sure you know that). Having a quiet (or silent), efficient heat-sink on your GPU won’t help if your case is still having to unload 200W of heat from the PSU and chassis via its fans. The maximum fan-noise tolerated is different for everyone and it may well be worth tolerating a little extra noise for a lot more performance, or vice versa. It’s a huge grey area and exploring it is helped by knowing more of the facts (e.g. performance/watt) beforehand.

    I have built a lot of low-noise, high-power systems because I enjoy the challenge and I can verify that most enthusiasts have no idea what “silent computing” really means. It’s a game in itself, much like extreme overclocking, and I acknowledge that it’s a niche.

    Sorry for the length of that, but I felt smilingcrow should get some proper defence here because he’s been downvoted for no good reason.

  39. There are two things that matter when reviewing video cards.

    How well it performs in current games, and how well one would expect it to perform in future games.

    Since every graphics card in the past 3 years can render the Source engine with max settings at 1080p, there is no need to see another card do the same thing. And since the Source engine is 8 years old, it’s not going to help us see how the card will do in the future either.

  40. I got news for you, when you’re already running the game with max settings and never seeing a slow down, a new card is not going to help make the game look any better.

  41. Well, I disagree with every one of those because what is more important than performance per watt in every one is the *actual power* that is consumed under load. You could have a low end card that gets fantastic performance per watt and just sips power at load but that still doesn't provide enough performance. You could have a high end card that gets terrific performance per watt and has massive amounts of performance but that still consumes too much power and puts off too much heat for a thermally constrained environment. So, again, what matters in all those situations you mentioned is *actual power*, and performance / watt is just telling you how much performance you're getting for that actual power - but it can't be practically applied because you can't shop for it and it might not even be a differentiator, by which I mean it's conceivable that you have a whole family of cards that offer very similar performance per watt, so that metric isn't helping you choose a damn thing.

    So:

    1. Silent computing: it's the heatsink that matters most.

    2. Thermally constrained environments: it's actual power consumption that matters most.

    3. Reduced electricity: actual power consumption.

    4. Anemic power supply: actual power consumption.

    I can see performance per watt being a serious part of the discussion if you are going to deploy a fleet of these things and you have some kind of total power budget that's something like 25 kW and you want to maximize your computational power for that power limit. But we're not talking desktop computing or gaming anymore.

  42. Practical applications for performance per watt (PPW):

    1. Silent computing – already mentioned.

    2. Thermally constrained environments.
    If you have a smaller form factor case there is a lower limit on how much heat can be dissipated so to maximise performance at a given TDP you want to look at PPW.

    3. Reduced electricity bill.
    If you are FOLDING or similar where your GPU is loaded 24/7 then looking at PPW makes sense.

    4. Upgrading a retail PC.
    If you have a retail PC with a fairly anaemic power supply and wish to upgrade the GPU without the additional expense of buying a new P/S then PPW is significant.
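The power-capped cases above (2 and 4 in particular) boil down to picking the fastest card that fits under a wattage limit, with PPW showing how efficiently each candidate uses that budget. A minimal sketch in Python; the card names and every number below are made up for illustration and are not measurements from the review, and `Card` and `best_under_cap` are hypothetical names:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Card:
    name: str
    avg_fps: float      # hypothetical average frame rate
    load_watts: float   # hypothetical power draw under load

    @property
    def ppw(self) -> float:
        """Performance per watt: frames per second per watt under load."""
        return self.avg_fps / self.load_watts


def best_under_cap(cards: List[Card], cap_watts: float) -> Optional[Card]:
    """Return the fastest card whose load power fits the cap, or None."""
    eligible = [c for c in cards if c.load_watts <= cap_watts]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c.avg_fps)


# Illustrative data only; not taken from the review.
CARDS = [
    Card("Card A", avg_fps=40.0, load_watts=80.0),
    Card("Card B", avg_fps=55.0, load_watts=100.0),
    Card("Card C", avg_fps=70.0, load_watts=180.0),
]

pick = best_under_cap(CARDS, cap_watts=100.0)
for card in CARDS:
    print(f"{card.name}: {card.avg_fps:.0f} FPS, {card.load_watts:.0f} W, {card.ppw:.3f} FPS/W")
print("Pick under 100 W:", pick.name if pick else "none")
```

Note that the selection here is by absolute performance within the cap; PPW only describes how well each card uses its power, which is roughly the distinction the two sides of this thread are arguing over.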

  43. They can, the terrans have evolved in Wings of Liberty. They can haz orbital dropped supplies also.

  44. Both of these cards are overpowered for most displays, but it does seem like the 7970 has a slight advantage where it matters – less stutter – and this is only in high res displays which most of us probably don’t have and won’t be buying. So in general, the performance of these cards is the same.

    Sigh, who knew fabrication of silicon would cost so much these days. But the price might be fair given that silicon scaling might be coming to an end.

  45. appreciate the acknowledgement, there are many who lack the character to do the research and agree if required.

    instead they push lies and try to shift the discussion away.


  46. I was trying to get you to tell me some practical application for the performance/watt metric because I don’t think there is one, but when I think about it there’s no practical application for performance/shader either.

  47. all 120hz monitors I’ve seen are 1920×1080 at most. I’m pretty happy with my Samsung S23A700, but there’s room for improvement for sure.

  48. Nvidia Investor and all around TR annoyance indeego here. Sticking with my [email protected] 1920×1080 (Sorry aspect ratios don’t bug me like they do many here.) There hasn’t been a game that crushes my system that I play. I am about 1-2 years behind the curve though…

    To be honest both my work servers, workstations, home laptop, home workstation are all vastly overpowered for their real needs and very little on the technology landscape has excited me lately. I would like to get PCIe SSD put in at work but no real rush.

    I’m much more interested in upgrading my LCD to 120Hz and not have it suck, but all the 120Hz monitors I’ve read about have either serious issues, drawbacks, crap I don’t need, or poor resolution.

  49. It’s *not* babby’s first troll. Silicondoc is well known on XS / 4chan/g/ where he was banned from XS a long whiles ago. You’ll hear his cry of ‘RED ROOSTER’ from reddit.

  50. At stock voltage (set at 1.175v), the typical overclock seems to be ~1150mhz, so I guess you’re right. I was basing my argument on my HD7950, of which the stock GPU voltage is lower with corresponding lower OC.

  51. Perhaps there is more than one right answer to this, depending on how you look at things… from a financial standpoint or a technical standpoint.

  52. difference being that I had no trouble finding HD 7970’s at launch, the only trouble if any was in finding the exact brand I wanted but there were plenty of HD 7970’s available.

    GTX 680…. none, there is talk that some are coming.

    understand I’m overall pretty happy with GTX 680, it raised the bar and personally I’m not that depressed about the price given it dropped the high end down another $50 after AMD did it with HD 7970… it’s all good to me and that the performance bar has been raised is all the better.

    Nvidia is also pushing to fix some glaring flaws that have been around forever which is great but I suspect after a month AMD will retake the crown given how well their core scales compared to Nvidia’s 680… still great cards from both companies regardless.

  53. Want a reasonably quiet PC? Buy something like Thermalright Macho or similar for CPU, Arctic Cooling’s Twin Turbo 2 or compatible for your GPU and a fanless/semi fanless PSU from Super Flower.

    Put all that into a Silverstone Raven 02 or similar case with good airflow, get rid of mechanical HD’s, profit: low temps, good overclocking capability. All components are cooled equally and the natural direction of heat dissipation (upwards) helps to keep components cool too. My idle temps are 35C and 50C when gaming.

  54. He does have one valid point, and with the wrong numbers still. The GTX 680 does in fact reliably beat the previous green champion GTX 580 in terms of FPS, but only by at least 20%, not 35%. The rest of his “logic” breaks down from there.

  55. It seems clear now that this was intended to replace the 560 Ti (not the 570/580-based ones either), giving consumers the 2560×1600 performance at the $250-ish range. When nVidia was publicly saying, “Wow, we expected more!” at the 7970 launch, they were internally rubbing their hands together in glee because suddenly their card went from being a mid-range value card to a new high end contender. All because AMD did not bring its A game.

    AMD failed. nVidia, being nVidia, saw an opportunity to undercut them just enough to get that as their headline, but not enough to substantially matter in the long run. AMD is unlikely to lower their price because the nVidia card still has its mid-range card heritage on its sleeve, its 2GB of memory, so the 3GB-based 7970 and 7950 can claim the larger memory buffer warrants the higher cost. So AMD’ll sit up at the high end of cost and have the same performance as the 2GB, slightly less expensive 680 while nVidia revels in being the slightly less expensive option, yet substantially more expensive than they’d ever dreamed of being able to get away with.

    This lets them postpone the bigger, better, badder-ass versions of Kepler that were supposed to make this part look like a chump for six months. Then when AMD finally has something to throw out that actually progresses performance forward… BAM, Kepler Prime shows up and blows it away. Parts stockpiled for months, paid for by the premium they tossed onto a part that was meant to be a 560 Ti ($250-ish) replacement.

    Taking a card that is clearly designed around replacing the Compute-deficient 560 Ti line and then marking it up at twice the last card’s price is quite an achievement. Even more impressive, though, is getting everyone to think it’s a bargain. AMD failed when it got greedy; and nVidia has now continued that greed down the line.

    If you think about it, the whole thing makes a lot of sense though. nVidia’s losing money on Tegra, which has yet to really take off in a very PROFITABLE way. AMD’s losing money hand over fist due to the crap that is Bulldozer (casting a dark shadow across all products set to inherit its updated core technology), the now ancient netbook APU they’re still selling mostly unchanged after over a year, and the delays on Trinity.

    The scene was set for a milking. Consumers are now cattle and we’re getting milked with both hands, green and red.

  56. it’s my special mix. that’s why i’m always so lovable just like silicondoc!!

  57. I’ll have what you’re smoking. Only I want 2x the dose as I’m having a bad day.

  58. The context of the discussion was me responding to your desire to see a scatter plot for FPS per shader by saying that I thought performance per watt was more interesting. You then seemed sidetracked by talking about power supplies which wasn’t something I mentioned.
    The data is there if you want to plot FPS per shader yourself but there is very little power data.

    My personal interest is in using a nVidia card to accelerate tasks using CUDA in Adobe CS5.5/6 not gaming so silence is still important to me.
    I wouldn’t buy the GTX 680 as the TDP is too high but I’m wondering how the lesser Kepler cards will perform and at this point looking at the efficiency of this card is the best guess I have. It doesn’t look great on the Compute side but I’ll wait until some benchmarks running CS5.5 have been released.

    As for over-clocking I’ve never found it to be inherently incompatible with silent computing. The trick is to over-clock at stock voltage or to use a gentle voltage bump. The limit is set by the efficiency of your cooling and the thermal thresholds that you are happy to utilise.

    With Ivy Bridge I think ~4GHz for an i7 should be achievable in silence.
    As for Kepler I don’t know yet as I haven’t explored custom cooling solutions and the range is still not fully released.
    Sometimes you get better efficiency by buying a higher TDP GPU and down-clocking and under-volting it rather than buying a lesser GPU and over-clocking it, especially if you need to over-volt it.

    I’m not in a hurry as CS6 isn’t released yet so I’ll wait to see how things unfold. Who knows Adobe may have moved away from only supporting CUDA but I doubt it which is a shame as AMD have decent Compute performance now.


  60. The 680 went on sale and then quickly sold out… in exactly the same way that the 7970 did when it finally launched. What’s good for AMD is good for NVIDIA and vice-versa.

  61. you don’t need to slap on a new cooler to do 1200mhz, you don’t have to overvolt and the VRM’s do it without any effort.

    don’t lie, it’s already been done, results universal and easy to find on the web.

    no additional voltage required, no exotic cooling required, it only uses 17 watts more….. literally effortless.

  62. VR-Zone is strange?…. minuscule?

    overclocked HD 7970 wins more than it loses, no new mystery driver required, performance available today, get over it, stop lying about it and for heavens sake please gain some balance.

    the pathetic part of your position is in thinking that 15 reviews that don’t cover overclocking at high res surpass 1 that does.

    data is data, if the 15 reviews don’t have the data their value is less than 0.

    you’ll likely forever be a punching bag given your bias but at least examine the data while making yourself the easy target….. if I cared how a high end card performed at 1280 X 1024 I’d be like you…. exceptionally weak in position.

  63. great I want to buy a GK 110 right now…. oh wait I can’t and GTX 680 is the best Nvidia has and will have for quite a while to come.

    AMD is going to release a new version of HD 7970 eventually perhaps it’d only be fair to talk b.s. about it now like you are about phantom product from Nvidia.

  64. Turn off your computer. Take off your 3d Vision glasses. Put on some sunscreen. Go outside.

  65. I’m with you, man. But at the same time it’s important to remember that prices will come down. I just looked this morning and prices for the GTX 580 are astoundingly good at this point – they are priced where the GTX 570 was priced 3-4 weeks ago. They’ve literally dropped $200 in price in the last 6 weeks. GTX 570s can now be had for well under $300. The prices of the new generation are really crappy, but the prices for the top models of the last generation are becoming fantastic. But those deals only last for a short time, so right this minute is the time to buy a GTX 580 or GTX 570.

  66. So you want to see performance per watt benchmarked across a broad selection of games so that you can build a silent computer that can be overclocked without getting hot? OK, I’m shaking my head and wondering if you’re just being stubborn, but OK. For someone with your extended standards it’s probably more practical and more effective to just pick the fastest thing you can afford and put a water block on it.

    FWIW I’ve got a 5870 which is one of the loudest cards reviewed here. When I’m playing a game I don’t much hear it, because I’ve either got headphones on or speakers on. When I’m not playing a game it’s silent, and all I hear is that damned Cooler Master Hyper 212, which most people claim is nearly silent.

  67. Funny then how nearly every review site was impressed and said more or less “Nvidia hit it out of the park”.
    I guess that red fan bias has lobotomized the brain.

  68. The GTX680 comes in at least 35% ahead of the GTX580, Nvidia’s single-GPU card king for the past one and a half years, which cost $500+ and still resides well over the $400 mark.
    For any company to release a card with the much higher performance range that the GTX680 exhibits not only at a lower price than their current champion but far below the competing design that it also beats is pure fantasy.
    Claiming the GTX680 is a mid range product at a top price merely outlines the enormous lead Nvidia holds over not just its last iteration, but also its competition AMD.
    AMD has lost miserably, has nothing to compete with the “top end” Nvidia product, and hasn’t matched the current GTX680, yet charges more.
    Your fantasy attack on Nvidia outlines your immense bias, and in reality is humiliation for AMD whose top card costs more, performs less, and is not their mid range product but the best they have to offer.
    If Nvidia released the top card you claim makes the GTX680 a mid range product (which beats the top end of AMD) they would have to charge at least $850 for it currently.
    Facing reality is something AMD fanboys should do quickly, instead of stirring the pot with twisted lies and falsely impudent “opinions”.

  69. The GTX680 has won in nearly every test there has been. Denying it by attacking me or picking a strange never used artificially minuscule talking point won’t change that.
    Its win is outlined on every review conclusion.
    AMD fanboys are going insane as usual claiming a driver update or overclocking can level the playing field, but even that leaves the 7970 losing.
    Now there’s a 2Ghz 680 in the works already by Zotac.
    There’s no future for 7970 winning, it even loses in triple screen large resolutions, and has 1/4th the special features of the Nvidia card.
    Attacking me personally changes none of that, you amd loser.

  70. The point is to look at the GPUs from AMD and NVidia and see how they compare for efficiency. Personally I prefer silent computing and there is a limit on how much heat can be dissipated silently. The efficiency is one indicator that can help to determine which GPU is more suitable.

    Choosing a suitable PSU for a card such as this is a mundane task and can easily be determined.

  71. The ACTUAL facts are that Kepler is priced as a high end product, and marketed against another high end product. Such a simple concept.

  72. More than price determines what a midrange and high-end video card is. The fact remains that in the Kepler family the GK 104 is a midrange product. The GK 110 is that family’s high end. Such a simple concept.

  73. I agree with this, the RWT article is very enlightening on explaining how Nvidia managed to achieve what they did. There seems to be a lot of reviews saying how “amazing” Kepler is, when in fact it’s actually a disappointing low cost high margin product that’s marketed as something that’s high end, which it clearly isn’t due to the massive simplification of the GPU’s compute capabilities.

    A lot of people also seem to be claiming that Kepler has good performance per watt and per mm^2, which is true, but completely unsurprising. This is simply the 28nm equivalent to a HD5870; of course it’s going to have a high FLOPS throughput and good power efficiency, but we all know in real world scenarios when the work load requires more general dynamic loads and the drivers have not been optimised for these applications, the efficiency of Kepler will not be so good.

    Despite all this, it’s hard to argue against the direction Nvidia has chosen, given the state of games today. They’re simply not taking advantage of the general compute capabilities of GPUs and it doesn’t look like they will be in the near future. Suffice to say though, I do wish GPUs could be used for more than just graphics and act as more of a co-processor to the CPU; it sort of feels like we’re going backwards again.

  74. What I mean is that AMD rushed their new toys out ASAP, possibly ahead of their original schedule. You get “beta” drivers as a result. That’s the context of “why” there were only “beta” drivers available at launch. That’s often the price though of bleeding edge.

    And anyway, that “beta” moniker is somewhat misleading, as typically “beta” only means it hasn’t got WHQL cert. yet from Microsoft. As anyone who uses Windows can tell you, WHQL means jack squat.

  75. I can’t stomach a 500 dollar GPU. If that becomes the norm I really will go console. I like to stay at or under 250. Of course I don’t game at the huge resolutions either.

    I was actually kind of let down by the Arkham framerate (given the lower resolution). Why did they test on such a lower resolution on that one game? Was it some sort of technical limitation?

  76. Um, what you’re saying isn’t making sense. Total heat dissipation and performance per watt… are… separate issues. Total heat dissipation is going to depend on the size of the chip and watts per square mm – a larger chip with more surface area will be able to dissipate more heat without actually being at a higher temperature – it’s about surface area. Total heat dissipation has nothing to do with performance per watt.

    Performance per watt doesn’t have anything to do with overclocking either.

    Performance per watt would be important to know if you’re in a server situation, running things at load 24/7, and you’re trying to gauge whether or not you should replace some equipment, or, if you already know you’re buying, which product is going to give you the most performance in a rack that has a 15 kW power limit.

    Performance per watt is essentially at the very bottom of the list of factors upon which to base any purchasing decision. It’s nice to know from an academic standpoint and it’s nice to know here because Nvidia has been making a bunch of claims about it, but other than that it’s not very meaningful in the desktop environment. You really just need to know what the load power consumption is so you can make sure your PSU is good enough, and you need to know the idle power consumption so you can see how much things cost to run at the power level they’ll be spending the great majority of their time at.

  77. There is a limit on how much heat you can dissipate from a GPU whilst keeping noise levels under control which is why for high end cards the performance per watt is important. If you have to raise the TDP too high to achieve a competitive performance the thermals, noise levels and over-clocking headroom will likely be compromised.
    This isn’t the highest end card so it’s not so critical but it does give an idea of what the architecture can achieve.

  78. What do you mean though? We’re talking about drivers here, not exactly the cards. And before 301.10, there’s also another WHQL driver that’s only weeks old as well.

  79. You sound like one. Me, I haven’t had a Nvidia card since the original and venerable Geforce, and the only reason I’m not ordering one now is that I already bought a HD7950.

  80. The HD7970 does not do 1200mhz ‘with no effort’. Slap on a huge cooler and overvolt with the intent of frying VRMs, and it may do 1200mhz, yes.

  81. I was ready to agree until I saw the overclocking results at VR Zone and TechPowerUp.

    GTX 680 doesn’t scale well because of hardware limitations; even at 1350mhz it loses more than it wins against an HD 7970 at 1200mhz, and it’s well known that all of the HD 7970’s hit 1200mhz with no voltage, heat or noise issues…… if AMD raises the official clocks Nvidia will be looking all the worse for wear despite still being a paper launch.

    surprised at how weak the compute and OpenCL perf was as well, a real shift for Nvidia as they are moving to distinguish their enterprise from desktop product.

  82. just give up on the car thing already… geez.

    so many factors go into what makes a car what it is beyond the stupid horsepower numbers…. just give it up already.

  83. you didn’t read any reviews, don’t lie, what you did was search for any possible credit to give Nvidia, it’s why you get used as a punching bag instead of considered as having a point.

  84. I read about 15 reviews, with actual fps numbers and end user experience while gaming… so, you’re way behind me dummy.

  85. AMD came out first & set the bar, Nvidia in a paper launch raised it because they knew where it was.

    AMD can do a few things with little downside if any.

    GTX 680 scales poorly, its limits are obvious and unfortunately hardware based, a 1350mhz 680 gpu loses more than it wins against an HD 7970 at 1250mhz and it’s no secret HD 7970 hits 1200mhz with no effort and no notable downside.

    Nvidia served notice to be sure that they have something but it’s a surprise that AMD even has a high end competitor given they weren’t going for one having given up on the practice years ago and reserving dual gpu for that segment.

  86. I agree a whole slew of propaganda.

    really weak at best and straight out of the brochure.

  87. After you justify the price premium on OC’ing the 7970, you want the price knocked down to $425, and the OC’ed model at $525, revealing what you really believe. So no, the price premium cannot be justified.

  88. Did you read the same review I did?

    “The GeForce GTX 680 is slightly faster and 50 bucks less expensive than the Radeon HD 7970, so it lands in a better position on this first plot. However, if we switch to an arguably superior method of understanding gaming performance and smoothness, our 99th percentile frame time (converted to FPS so the plot reads the same), the results change a bit.

    The GTX 680’s few instances of higher frame latencies, such as that apparent GPU Boost issue in Arkham City, move it just a couple of ticks below the Radeon HD 7970 in overall performance. Then again, the GTX 680 costs $50 less, so it’s still a comparable value.

    The truth is that, either way you look at it, there is very little performance difference between these two cards, and any difference is probably imperceptible to the average person.”
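
For anyone unsure what “99th percentile frame time (converted to FPS)” means in the quoted passage, here is a minimal sketch of the conversion. The frame-time trace is made up for illustration, and the nearest-rank percentile method is an assumption; the review does not say exactly how it computes the figure.

```python
import math

def percentile_frame_time(frame_times_ms, pct=99.0):
    """Nearest-rank percentile: the frame time that pct% of frames meet or beat."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    ordered = sorted(frame_times_ms)
    rank = math.ceil(pct * len(ordered) / 100.0)  # 1-based rank
    return ordered[max(rank, 1) - 1]

def frame_time_to_fps(frame_time_ms):
    """Convert a per-frame render time in milliseconds to frames per second."""
    return 1000.0 / frame_time_ms

# Hypothetical trace (not review data): 97 smooth frames plus three slow ones.
trace_ms = [16.0] * 97 + [20.0, 25.0, 40.0]

average_fps = frame_time_to_fps(sum(trace_ms) / len(trace_ms))
p99_ms = percentile_frame_time(trace_ms)
p99_fps = frame_time_to_fps(p99_ms)

print(f"average: {average_fps:.1f} FPS")
print(f"99th percentile: {p99_ms:.1f} ms ({p99_fps:.1f} FPS)")
```

In this made-up trace the average lands a little above 61 FPS while the 99th percentile figure is 40 FPS, which is why a card can lead on one plot and trail on the other.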

  89. Aside from tessellation performance, he’s arguing marketing bulletpoints. “Active dynamic cooling”, really?

  90. 2 ms and 2 FPS difference is now “thoroughly trounces”. Didn’t you get the memo?

  91. Well that’s the difference 3 months later will make, and that’s the price of being an early adopter.

  92. That’s what seems to have changed from the Fermi direction, the shift from one card that’s meant to cater to both graphics and computing to two lines tailored more for one or the other.

    I’m guessing the pros of this approach are more efficiency and lower cost of manufacturing for example, but the cons may be more development expenses and potentially higher costs and/or prices on the computing models since they may not utilize the consumer line’s “mass production” advantages to drive adoption costs down.

  93. Nvidia having the benefit of seeing what clocks AMD is using pushed their card above them…. this gives them the nod and that is fair enough.

    The problem for Nvidia is poor scaling during overclocking, so while their card will hit 1300+MHz, bandwidth bottlenecks hurt them noticeably, at which point an AMD card running 1250MHz can trade blows with the best GTX 680 has to offer while being closer in power efficiency.

    I personally like the bad press AMD is getting because I hope it’ll force a $100 or better price adjustment especially when it looks like AMD’s HD 7970 will beat GTX 680 once factory overclocked to 1200mhz in outright performance at 2560 X 1600 given how well it does at 1920 X 1080.

    going to wait a month and see how it all fleshes out, AMD came out early and Nvidia exploited it but while sort of impressive Nvidia took advantage of a lowered bar….. if AMD decides to compete using price then FANTASTIC, WOOHOO drop it $200 if you like I’ll take 2, or AMD could be smart and bump the official frequencies and let the inherent advantages they have do the work for them while adjusting their price to match Nvidia’s price dollar for dollar.

    regarding power consumption AMD loses on power consumption by 40 watts which is a big win for Nvidia although I personally don’t believe it’s compelling given both cards are 40+ watts under the previous generation consuming GTX 570 levels of power….. even overclocked neither card will stress my $50.00 550watt Naxn psu.

    p.s. it’s surprising to see AMD actually have a single gpu high end card that can compete favorably with Nvidia instead of pushing the dual gpu discussion.

    P.P.S. I found the overclocking information at VR-Zone. TechPowerUp did some overclocking, but they used fewer games and lower resolutions, which don’t stress the video cards, limiting the value.

    Both reviews have value, as VR-Zone offers up more data and TPU offers up more info on the difficulties of pushing GTX 680.

    wish the results used 2560 X 1600 and wonder if it’s pressure by an interested party that had those numbers omitted.

  94. Read your post, then looked at the BF3 page, and your post makes no sense. The 680 and 7970 are just 2 ms apart on the 99th percentile frame time and just 2 FPS apart on average. So the 7970’s performance in BF3 seems completely acceptable. The only problem is the 7970’s price, which I’ve been griping about since it launched; I didn’t need a 680 review to tell me that the 7970 didn’t provide great FPS / $.

    Also keep in mind the 4 MP resolution the game is being run at. Any less than that and these cards will rip through BF3.

  95. I’d find it hard to believe Nvidia would give up the GPGPU crown so easily, considering their history with programmable GPUs and GPGPU as a whole. I’d believe strongly in the dedicated GPGPU rumors going around.

  96. A 6950 is adequate to play BF3 on High settings and stay above 60FPS. But I had to overclock it. This was at 1920×1200.

    Unfortunately, Ultra settings, and AA absolutely kill performance on it where it drops to 30fps and isn’t consistent.

    But, the quality difference between High and Ultra really isn’t that big of a deal, especially when you are running around shooting people. In fact, all those effects only detract from being able to easily spot people (like motion blur, wtf).

  97. That is correct, you need an ‘active’ display port adapter, they’re listed specifically as active adapters and they run about $30. I figured this out the hard way. Unless they changed this with the 7970. 3870 and older you could run two DVI outputs and a component/svideo/rca jack at the same time, you can’t do that with newer cards.

  98. Not really impressed, I was expecting a lot more since Nvidia was talking trash when the 7970 was released..

  99. Please read the RWT article on Kepler. It seems Nvidia has taken a U-turn in its design efforts with Kepler. I don’t think it’s fair to put down the AMD architecture, and you’ll see why once you read that article. It also gives a good explanation of why the Kepler core was weak in the OpenCL compute benchmarks. I don’t think AMD left anything on the table with Tahiti; it is a more balanced design than Cypress, but it does lose to Kepler in GFX-heavy workloads due to that trade-off.


  100. Looks like Dave at RWT did some good analysis on this topic yesterday, seems to have similar expectations:

    “Given this situation, it seems highly likely that Nvidia’s upcoming compute products will use a core that is tuned for general purpose workloads. It will be a derivative of Kepler, to re-use as much of the engineering effort as possible, but with several significant changes.” Interesting read.

  101. “performance (however you want to define this) and costs.”

    And the Chevy can match them in performance, so if the cost is raised, all of a sudden it’s a higher class of car according to your criteria.

  102. I have no idea what your point is.

    Here is mine, as clear as I can make it. Products compete against similar priced products because from a consumer perspective all that matters is performance (however you want to define this) and costs. Therefore, since these cards are both a similar price they’re in the same range. It doesn’t matter what number designation they’re given or what bin a company decides to put it in.

    I hope I’ve explained myself well, as I am done with this conversation.

  103. A bit more information on NVENC from nvidia:

    All Kepler GPUs also incorporate a new hardware-based H.264 video encoder, NVENC. Prior to the introduction of Kepler, video encoding on previous GeForce products was handled by encode software running on the GPU’s array of CUDA Cores. While the CUDA Cores were able to deliver tremendous performance speedups compared to CPU-based encoding, one downside of using these high-speed processor cores to process video encoding was increased power consumption.

    By using specialized circuitry for H.264 encoding, the NVENC hardware encoder in Kepler is almost four times faster than our previous CUDA-based encoder while consuming much less power.

    It is important to note that an application can choose to encode using both NVENC hardware and NVIDIA’s legacy CUDA encoder in parallel, without negatively affecting each other. However, some video pre-processing algorithms may require CUDA, and this will result in reduced performance from the CUDA encoder since the available CUDA Cores will be shared by the encoder and pre-processor.

    NVENC provides the following:

    Can encode full HD resolution (1080p) videos up to 8x faster than real-time. For example, in high performance mode, encoding of a 16 minute long 1080p, 30 fps video will take approximately 2 minutes.

    Support for H.264 Base, Main, and High Profile Level 4.1 (same as Blu-ray standard)

    Supports MVC (Multiview Video Coding) for stereoscopic video—an extension of H.264 which is used for Blu-ray 3D.

    Up to 4096×4096 encode

    We currently expose NVENC through proprietary APIs, and provide an SDK for development using NVENC. Later this year, CUDA developers will also be able to use the high performance NVENC video encoder. For example, you could use the compute engines for video pre-processing and then do the actual H.264 encoding in NVENC. Alternatively, you can choose to improve overall video encoding performance by running simultaneous parallel encoders in CUDA and NVENC, without affecting each other’s performance.

    NVENC enables a wide range of new use cases for consumers:

    HD videoconferencing on mainstream notebooks

    Sending the contents of the desktop to the big screen TV (gaming, video) through a wireless connection

    Authoring high quality Blu-ray discs from your HD camcorder

    A beta version of Cyberlink MediaEspresso with NVENC support is now available on the GeForce GTX 680 press FTP. Support will be coming soon for Cyberlink PowerDirector and Arcsoft MediaConverter.
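
The “8x faster than real-time” figure above is easy to sanity-check. This is only a back-of-the-envelope sketch of that arithmetic; the function names are made up for illustration and have nothing to do with the NVENC SDK itself.

```python
def encode_minutes(video_minutes, speed_vs_realtime):
    """Wall-clock minutes needed to encode a clip at a given multiple of real time."""
    if speed_vs_realtime <= 0:
        raise ValueError("speed must be positive")
    return video_minutes / speed_vs_realtime

def encoded_frames_per_second(source_fps, speed_vs_realtime):
    """Frames the encoder must process per second to sustain that multiple."""
    return source_fps * speed_vs_realtime

# Figures from the quoted text: a 16 minute, 30 fps clip at up to 8x real time.
print(encode_minutes(16, 8))             # minutes of encoding
print(encoded_frames_per_second(30, 8))  # frames encoded per second
```

So the quoted “approximately 2 minutes” is just 16 / 8, and sustaining it means pushing roughly 240 frames of 1080p through the encoder every second.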

  104. so wait how do you what??? I can’t run two DVI’s and an HDMI monitor? That’s just flippin lame.

  105. You’re arguing features; AMD has better multi-monitor support… I agree that Nvidia has tuned theirs to do better, and it shows it’s better with several things that will most likely only get worse with clock speed increases. (Frame stutter, anyone?)

  106. This is true but if you look at the chip design this is the competitor for the 78XX line, what happens when the real deal hits in the coming months. Nvidia can compete on price all day long once their supply constraints are fixed this one is simply cheaper to make.

  107. Kepler has the better cards for sure, with higher performance per watt. That doesn’t necessarily translate to better performance when overclocked, but it sure is likely, especially considering that both chips are made on basically the same manufacturing process.

  108. I agree in your assessment. There’s two things I think of in reaction to your statement though:

    1. If we add a bit of rumor/speculation to it, I think it would be fair to guess that nV could have had the original price/performance blowout (this chip would have creamed 7870 level chips and below, GK100/110 possibly creaming 7970 level chips, etc). …That’s all assuming good enough yields, availability etc, and a decision to compete that way, which they didn’t take.

    2. All said and done, the end results for me aren’t decisively one-sided. So no, no particular blowout as far as my wallet’s mindshare is concerned. However, after years of “Bigger, Hotter, Faster” from nV, followed by the then dubious claims of “3x more efficient!!1!” from the CEO, I’d have to say nVidia’s success in their new direction has completely blown away my previously held conception of them and their architecture. I suppose by extension, that means the overall view of their technology (and performance/efficiency) this go-around is pretty mind-blowing-worthy too, once you take the money and market segmentation out of it. 🙂

  109. Not to be dismissive, but to be totally honest I don’t much care about power consumption while playing a game other than for sizing the PSU. That’s the major importance of power consumption under load in my opinion: determining the necessary PSU. So FPS per watt is completely unimportant to me unless you’re talking about HD 2900 XT where the difference was insane and then you don’t really need a scatter chart to tell you what you already know.

  110. The Heatie can’t even catch Kepler. So much for amd overclocking, rather so little.

  111. I’m aware of that, but it only covers one game, whereas if you look at other reviews you see that the differences in power consumption and frame rates can vary a lot between the two platforms depending on the game. So only having one data point isn’t very meaningful when there are large variances.

  112. Why did I get rated down? Newer AMD cards only support two active devices at one time unless you get active adapters.

  113. Well, if it follows the trend, the “big Kepler” will have fp64 at 1/16th fp32 – 50% better but still pretty awful for fp64-heavy workloads, esp considering that the AMD cards have fp64 with full IEEE 754 compliance at 1/4 fp32.

    (Previously the midrange had fp64 at 1/12 fp32 and the 470/480/570/580 had 1/8, and now the “redefined midrange” has 1/24, so 1/16 is the natural guess for “big Kepler.”)

    Handicapping fp64 this heavily does save nV a bit of die area without impacting performance in current games, but the main reason it’s been scaled back has got to be that nV wants to force people to use Tesla. However, since GCN is so much better for GPGPU than AMD’s previous architectures, some people will turn to that instead.
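
To spell out the arithmetic behind the “50% better” claim, here is a small sketch using only the ratios quoted in this comment. The 1/16 figure for “big Kepler” is the commenter's guess, not a confirmed spec, and these are fp64 rates relative to each chip's own fp32 rate, not absolute throughput.

```python
from fractions import Fraction

def relative_gain(new_ratio, old_ratio):
    """Fractional improvement of new_ratio over old_ratio (0.5 means 50% better)."""
    return new_ratio / old_ratio - 1

# fp64 rate as a fraction of fp32 rate, as quoted in the comment.
gk104 = Fraction(1, 24)             # the "redefined midrange"
big_kepler_guess = Fraction(1, 16)  # speculative
gcn = Fraction(1, 4)                # AMD figure quoted above

print(relative_gain(big_kepler_guess, gk104))  # gain of 1/16 over 1/24
print(gcn / gk104)                             # GCN ratio vs GK104 ratio
print(gcn / big_kepler_guess)                  # GCN ratio vs the guessed ratio
```

Going from 1/24 to 1/16 is exactly a one-half improvement, and a 1/4 ratio is six times the 1/24 ratio, before accounting for any difference in the two chips' fp32 rates.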

  114. Well, we’ll see how availability goes, but yeah, going by previous releases, especially on new processing nodes, I’m not too optimistic. And, as always in this industry, few cards+high demand will equal higher prices 🙁

  115. That’s already there, man.

  116. I feel like we’re arguing past each other.

    The prices of these two cards are close enough that they are in the same “range” no matter what AMD or Nvidia labels them.

    Now, obviously the Nvidia card is a little bit faster and a little bit cheaper, so it’s a better buy, but claiming that Nvidia is beating AMD’s high end card with a mid range card when they’re within 10% cost of each other is silly.

  117. I’d rather see a chart of frames per watt as it is the power envelope that defines cards as much as price. There is a finite limit on how high the TDP can go so having a power efficient architecture helps.

  118. I think it would be an excellent illustration of the efficiency of shaders from one generation to the next – for instance, much was made of the efficiency of the 6900 vliw4 shaders, but I’ve harbored the opinion that the 6800 vliw shaders are extremely efficient as well and it would be nice to see them compared. And then GCN to vliw4 / vliw5. Same on the Nvidia side to see one generation to the next.

    And at the end of the day, even though Nvidia and ATI define shaders differently they can still be compared based upon our understanding of the most fundamental unit. So for GTX 680 that’s 1536 “shader” while for 7970 its 2048 “shaders”.
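
A rough sketch of what an “FPS per shader” number would look like, using the shader counts quoted above. The frame rates are placeholders, not review results, and the calculation ignores clock speed differences between the two cards.

```python
def fps_per_shader(fps, shader_count):
    """Frames per second delivered per shader ALU (clock speed ignored)."""
    if shader_count <= 0:
        raise ValueError("shader_count must be positive")
    return fps / shader_count

# Shader counts from the comment; the 60 FPS values are hypothetical.
cards = {
    "GTX 680": {"shaders": 1536, "fps": 60.0},
    "HD 7970": {"shaders": 2048, "fps": 60.0},
}

for name, card in cards.items():
    per_1000 = 1000 * fps_per_shader(card["fps"], card["shaders"])
    print(f"{name}: {per_1000:.1f} FPS per 1000 shaders")
```

With equal frame rates the card with fewer shaders comes out ahead per shader by the ratio 2048/1536, about a third, which is the kind of gap such a scatter plot would expose.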

  119. They gave me a UPS tracking number and it left Hodgkins, Illinois last night 😉

    That being said, launch availability always sucks. I would have preferred eVGA but I’ll take the Galaxy as they are all reference cards anyway. Looks like everyone is out of stock now, but it doesn’t seem to be affecting prices at all.

  120. Well, cost is relative. If it ends up being midrange in Nvidia’s Kepler lineup, then it’s a midrange card. That it’s 500 dollars doesn’t matter when the top end card is 850 and the low end is 250. It’s midrange alright… I’m not really going to buy it; I agree it’s a high premium for a middle-of-the-road product. As the market sits right now they can charge it and people will pay it. I forecast it’s under 300 in 6 months, lol. When AMD cuts their 7970 down to 400 to compete (as soon as supply constraints are lifted, I’m sure) things will get interesting. No matter how good the 680 is, I’m not in the market for it or any Nvidia card; sadly they don’t support more than 3 monitors.

  121. “The one scatter plot that I would have loved to have seen would have been FPS per shader”

    Since a "shader" is defined so differently between architectures (esp. between Nvidia and AMD), how is this going to show meaningful info?

  122. The really scary thing is the position this chip will fill in the nvidia lineup this year. This chip was supposed to compete with the 78XX line. Despite the 680 name the actual chip name indicates a more… workstation like high end consumer card is in the works and on the way. That is a scary proposition.

  123. Cool, I hope you get it 😉 The card seems to be sold out on Newegg, and NCIX just has a bunch of stars next to availability. But they’re green stars at least!

  124. So what, doesn’t mean they won’t turn it on. Kepler came 3 months after Tahiti, no one is complaining, fanboi much?

  125. “I should go ask $40 k for the family Chevy Cruise because apparently that would make it a luxury car.”

    If you tried, then yes your car would be competing with $40k cars, and no one would buy it because it would fall well short of expectations.

  126. I see a big price drop from AMD in the future, especially if more Nvidia cards are as compelling this generation, like $100ish drops from the top down.

  127. Are you guys reading the same review I was? Reading these comments before the actual review led me to expect a total Nvidia blowout, but that just isn’t the case at all. The 99th percentile frame times are slightly in the Radeon’s favour (which is the more important number, right?), and the traditional FPS numbers are very, very close. The Reds take idle power consumption, by quite a bit under “display off”, while load numbers favour Green by a margin.

    The two cards are very close performers, the only thing that really drags the Radeons down is their price. Quite frankly though they were overpriced to begin with, and after seeing how fast the 680’s have been sold out, and all the rumours of TSMC manufacturing issues, I highly doubt you’ll be able to get the 680 at launch prices for very long anyway. Maybe you’ll be lucky to get it at all…


  128. Yeah, everything too hypothetical. And I’m usually against averages. If I eat 1 chicken and you starve, the average says we are both well fed.

    Stalls are probably worse than a couple FPS less on average.

    Anyway, I just wanted to point out that the 680 is NOT faster than the 7970 (or at least that’s very arguable). Every other metric looks to favour the 680 (price, watts, noise), so it looks like it’s the better choice right now.
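
The chicken example can be put in frame-time terms. This is a toy sketch with invented traces, and the 50 ms stall threshold is an arbitrary assumption: two runs with the same average FPS, only one of which stalls.

```python
from statistics import mean

def average_fps(frame_times_ms):
    """Average frame rate implied by a list of per-frame render times."""
    return 1000.0 / mean(frame_times_ms)

def count_stalls(frame_times_ms, threshold_ms=50.0):
    """Number of frames that took longer than threshold_ms to render."""
    return sum(1 for t in frame_times_ms if t > threshold_ms)

# Invented traces: both add up to 2000 ms over 100 frames.
steady = [20] * 100            # every frame takes 20 ms
stally = [16] * 95 + [96] * 5  # mostly faster, with five long hitches

for name, trace in (("steady", steady), ("stally", stally)):
    print(name, average_fps(trace), max(trace), count_stalls(trace))
```

Both traces average 50 FPS, yet the second one has five frames of 96 ms each, which is exactly the kind of stall an average hides.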

  129. 20k is about right. you can get like a hyundai accent for about 15. i bought my acreage for 75k. housing is cheap in this part of the country. in alberta you can’t get a house for 300k in any kind of city.

  130. Whew, finally had a chance to read some of this review. The GTX 680 is awesome! but the new norm in card pricing really hurts bad.

    It’s very impressive what Nvidia did with power consumption and die size here; the improvements there must come as a surprise to a lot of people, and a good surprise at that. The 680 and the 7970 are both fantastic cards, with the 680 sporting some definite advantages.

    Awesome review; I applaud. The "scatter brain" page is the coolest new feature I've seen in a GPU review since "inside the second", which was the coolest new feature in a GPU review since people started reviewing GPUs. No other site is thinking at the same analytical level. It's awesome.

    The one scatter plot that I would have loved to have seen would have been FPS per shader.

    I'm really hoping the pricing situation has everything to do with TSMC and that as 28 nm improves prices will plummet. And it would be great to see the Radeon driver team take it up a notch; it's not too much to ask that the new card launch with a driver ready.

  131. “Still, we’re a little disappointed Nvidia isn’t passing along more of those gains to consumers in the form of higher performance per dollar, as has happened in the past. Half a grand is a lot to ask for a mid-sized chip on a card with a 256-bit memory interface.”

    Thank you for mentioning this, Mr. Wasson.

    It’s a great card and a big leap for Nvidia but the value proposition of these cards just doesn’t add up right now. I am far more interested in future, more reasonably priced GPUs in the Kepler line.

  132. Yeah, solitaire is a much better game to enjoy with such high detail levels

  133. those prices are in line with prices here. it’s a 1.40 a litre for gas, bread is 3.99, and milk is about 3.99 for 2 litres.

  134. It's the first one here, blind man. Learn to read.

  135. The essential scatter chart is Price/FPS, and is the only one not here….

    Waiting for SLI benchmarks

  136. Yes there is, but the engine itself is designed and heavily optimized for consoles, not designed to shower the user with an exceptional eye candy experience.

  137. You mean Rage? Based on the game engine that was designed exclusively to wring every last bit of performance out of seven year old consoles?

  138. It only supports two unless you get an active display port adapter, which are about $30 a piece.

  139. I really only look at two sites for video card reviews, Tech Report and TechPowerUp. With TR, it’s more about the “quality”, while with TPU it’s the “quantity.” TR has high quality data and different kinds of data, while TPU has fewer kinds of data but quite a lot of it, and also uses a very large benchmark suite.

  140. Great opening to the article Scott… I enjoy topics that actually cover a multitude of avenues, including what has been said in the past and what is happening now. Great way to start things up.

    Bindless textures – I hope this doesn’t just wash out all the textures like Megatexture does and makes everything look like a crap stain.

    Good article. It makes me wonder how well the 680 will age since they lobotomized a lot of the logic from the card itself, while the 6970 retains it (if graphics in video games ever get over consolization). Zerocore still seems to be quite a unique feature as well. Here’s hoping AMD further improves it, I’m sure we’ll see Nvidias version in their next generation of cards.

    Scott, consider adding Rift to the benchmarks. Some of these new games aren’t cutting it; on the highest settings, a big Rift battle can really drive cards to a halt. I don’t know how easy it would be to reproduce such battles though. :l

  141. I wonder if Damage was on something when writing this. He kept mentioning “drop [graph/diagram/etc] on us”. 😛

  142. Well, he did note the whole currency conversion thing. Funny things happen with those for us too:

    EVGA 680 on newegg: $499
    EVGA 680 on 4 699:- SEK

    4 699 SEK in US$: 694…

    A couple of years back it was even worse (now it’s regressed to basically covering our 25% VAT) – could purchase stuff from the US at half-price compared to purchasing domestic, including shipping… Just have to have someone over there buy it and ship it under false label to make sure customs don’t seize it. 😛

    At least I assume there’s something like that going on from what he wrote. There’s similar things going on all over though – norwegian salaries are astronomical to us swedes, but of course everything costs way more as well over there…

  143. Are your pills in the medicine cabinet, or the nightstand?

    Please remember to drink a full glass of water when you take ’em.

  144. This is Nvidia’s second gen Fermi family card vs AMD’s first gen GCN card; I have a feeling AMD’s second gen GCN card is going to be a whopper even if they don’t go down a process node. Nvidia clearly took a lot away from their Fermi design experience with this card and AMD will probably do the same with their next gen. I think the 680’s real competition is an overclocked 7870 this generation, which improves performance dramatically and still keeps a low power profile (see TR’s OC tests). It’s also a smaller die than the 680’s, so AMD can probably price it around $300-325, making all of Nvidia’s last gen chips very bad bargains indeed.

  145. There’s all this sour pussing that GTX 680 isn’t $299 which was the usual Charlie D semi-accurate leak and no other source for it that I’ve seen, and it sure is fueling red hatred and many saying they won’t buy an inflated mid chip…
    By the looks of it, it’s a perfect red amd fanboy hate filled rumor – first they get to scream the card is never coming out at $299, then after they get to scream it didn’t come out at $299 – and they pass around no one should pay evil rich greedy Nvidia $500 for a mid range $300 chip.
    This sways the haters, loserz, fools, and fanboys to amd red…
    Remember that whole rumor was Charlie D the AMD fanboy news site of the internet.

    I would love to see where this percentage of performance increase came in under the current top dog price… by like $170 because the GTX580 was $470 or still is slightly less…

    So it seems to me a mindless FANTASY, and it’s a good con job by raging red fans to stir up more fake Nvidia hatred.

  146. Remember, this is little Kepler. Big Kepler will ease all your concerns (and should launch in the May-June time frame, according to the most plausible rumors).

  147. I see, so active dynamic cooling, TXAA, new PhysX in game on the fly unique destruction, FXAA to the max, Nvidia supremely better tessellation at any amd overclock comparison, DoF dx11 supremity, shall I go on ?
    So your card being slower is worth $70 more right now because you think it might overclock better ? I’ve seen OC to OC and the 7970 loses badly still, it only picks up one slight win in the benches.
    It’s nice to hope as a red fan, but the hope and speculation is in vain.

  148. heh, at 1920×1080 with AA/AF cranked up, the 680 is roughly an order of magnitude faster than a Radeon 3870. Yay, progress!

  149. “They’re reviewing hardware, not games”

    Interesting, I thought the review was to see how the hardware performed in games. I mean I only play a few source games so this review basically tells me nothing on how it will perform vs a last gen card.

    No offense Scott

  150. “Don’t worry, we’ll fix it later in software—like the R600’s ROPs.”

    Oh, you guys went there.

  151. Apparently the nvidia GK110 will usher in a new class called the uberomg class. Guess I should go ask $40 k for the family Chevy Cruise because apparently that would make it a luxury car.

  152. I expect they will do exactly this except instead of calling it 7975 they will call it 8970.

  153. 2.5 months is suddenly 4 months late, soon it will be 6 or 8 I’m sure.
    also Faiakes the GTX680 more than clearly outperforms the AMD crapster 7970.
    Would you like to buy mine btw ?
    It is on average at least 18% faster, it wins in triple monitor 5000+x1000+ resolutions, and there’s only a very small 1 or 2 games that fanboys can pick out at certain settings to claim the GTX680 didn’t sweep every single benchmark game used in over 2 years.
    $70 bucks less already (7970’s are $569.99)

  154. BF3 would be more than playable on a single 30″ you just need to turn MSAA down!

    There is less need for high levels of AA at 1600p.

    It can be done…..

  155. I’m not so sure that makes much sense. That’s like blaming Intel for AMD’s lacklustre performance. I know marketing wise it may sound cool (and boy is it good marketing) to say that Nvidia just decided to not release GK110 because GK104 was “good enough” but that really doesn’t make any sense. The performance delta between 7970 and 680 will largely be imperceptible in most games. Depending on the review we are seeing AVG and low framerate diffs of 5 to 10 frames (sometimes AMD, but more Nvidia). Of course there’s a few outliers, but by and large no one will be able to tell the difference.

    Since when has that been the case?

    There is an unmistakable difference between a 6970 and a 580. This isn’t the case this time and it has everything to do with the lack of GK110’s appearance and you can’t blame AMD for that. Let’s not fake the funk. If Nvidia could release GK 110 it would have done so instead of relying on “dice-roll” GPU boost to differentiate the card’s performance.

    Since I run Linux I don’t even buy AMD’s cards, but the Nvidia 680 launch to me seems…..well underwhelming. 480 might have been hot, but it did bring performance and likewise so did 580. This time we have Nvidia’s version of a 7970 and I’m not so sure it warrants the kudos it’s receiving.

    AMD could drop the price of the 7970 to $450 and that’s much higher than the 5 or 6 series sold for. This means that 670 will be 400 or very high 3’s. 7870 and 660 will be very high 2’s or low 3’s. This is bad all the way around and if AMD releases a 7980 matching or besting the performance of the 680 we’ll see high prices in the mid range until the next refresh. This is before 690 and 7990 (and Nvidia could counter a 7980 with a 685). Who do we blame then?

  156. wtf. you make 1000 dollars a week? damn, that’d be nice. I live on less than minimum wage, but i DO get that while going to school….

  157. If it gets more than a hundred fps, chances are that you wouldn’t worry about its performance.

  158. AMD’s Tahiti being 5 to 6 times faster than the GK104 in FP64 shouldn’t make the choice hard for you.

  159. Most people are correct when they’re saying that this is a mid-range ‘part’ that performs like- and is subsequently priced against- the competition’s high-end.

    If you want to argue the semantics, realize that we should expect Nvidia to ship a GK100 part with much better GPGPU performance, but worse gaming performance per watt across the board.

    If they don’t do this, they’d be ceding GPGPU to AMD, and GPGPU is the bread and butter of Nvidia’s margins.

    There will be a big daddy Kepler.

  160. I just want them to make TOR run well on my HD3000 at 1366×768. It already runs like a dream (as it should!) on my HD6950’s at 2560×1600.

  161. Scott have you considered ‘Sandra 2012’ or ‘GPU Caps Viewer’ for your OpenCL testing?

  162. I know, I don’t want to go back to having to shell out $300 dollars for an entry level Intel CPU. Case in point. The Celeron 300A was a bargain at just under $200 dollars at launch, and Intel only made it because the first Celeron, at 266Mhz and 150 some dollars. Both of those CPUs only ever launched because of pressure at the “budget” end of the spectrum from AMD. Now you can get an i5 for under 200 bucks. Yup, competition is good for us consumers.

  163. It’s a $50 difference, and Deanjo specifically said he doesn’t care how much something costs just what random bin the manufacturer decides to put it in.

  164. I think I was wrong when I said poor yields as the issue appears to be more a matter of not enough wafers, so your description of supply constraints is more accurate.

  165. Hmm. After seeing some reviews that pit OCed GTX 680 VS OCed HD 7970, things don’t look so grim for us AMD fanbois after all.

    At stock, GTX 680 clearly is a better value. However, when you start to overclock both these cards, it seems the HD 7970 scales much better, thus its price premium could very well be justified.

    I’d suggest this to AMD: Revise the Tahiti silicon, stock clock it at 1.2+ GHz for the core and 6+ GT/s for the memory, name the new SKU as HD 7975 or whatever, price it at $525 USD, drop HD 7970 to $425. Then again, as HD 7970 was clocked pretty weak (relative to potential at least) to begin with, this could have been their plan all along, only waiting to see Nvidia’s counterattack.

    Nvidia could try to follow suit with a higher clocked SKU, probably GTX 680 Ultra… But like I said before, Kepler doesn’t seem to scale as well with clockspeed as Tahiti does.

  166. Are you trying to troll or try to be sarcastic in any way? If yes – you shouldn’t try that again because you’re very bad at it. Source engine might be ancient but it is constantly being improved and still DOES show difference in performance even with modern cards, go read up the Anandtech’s today’s review of GTX680. It doesn’t matter if you will get a 300 fps on GTX680 and 270fps on 7970 or some other insanely high fps numbers – the only thing that matters in video card COMPARISONS is the DIFFERENCE in performance. As far as for Rage – its engine is currently unusable for a proper benchmarking, Anandtech did a good article on explaining why, go read it up sometime. Not to mention it is unknown if any other relevantly popular game will use it in future in its current form.

  167. sure, but for 50$ cheaper than the stock 7970? that’s what they’d need to do to get to the same level, nm make it compelling.

  168. they did that a while ago. this thing is neutered in regards to its gpgpu computing.

  169. I could see binned 7970’s way OC’d (and maybe even overvolted) and sold as the 7980.

  170. Rage is useless for benchmarking, since it tries to render at 60 fps at all times, adjusting image quality on the fly if needed.

  171. It would be interesting to post a note on performance per transistor as opposed to performance per mm^2. That should clarify any issues with regards to process shrinks, even if the numbers are very approximate compared to a flat mm^2 number. From that perspective, the architectural changes from Fermi to Kepler would really lend to crazy increase in transistor efficiency, whereas the jump from Cayman to Tahiti would be less impressive.

    I have a feeling that GK110 will perform around 20% better in games while making large gains in GPGPU applications, but that’s just opinion.

  172. Interesting, nVidia and AMD switch their approach to GPU computing, AMD going for more general purpose while nVidia going for more specialized solutions.

    Depending on the compiler might not be great for general computing – ex. Itanium’s compiler struggles – but for a GPU that deals with much more parallelism it sounds somewhat sensible imho.

    I’m assuming nVidia will split their product lines into “gaming” and “computing” GPUs instead of having a jack of all trades solution, I dunno if this is confirmed or not yet, anyone know?

    Curious to see how this plays out.

  173. Umm, if you have a 120Hz 3D capable monitor, it is capable of rendering 120fps in 2D.

  174. To all the people who didn’t get it:
    It means between Intel’s CPUs and NVidia’s new GPU, it is not a good time to be AMD.

  175. But still what? Why would you want 100fps if your monitor can only display 60? Of course having an average of more than 60 in benchmark situations makes sense, if that translates to more fluid gameplay with less dips below the desired framerate, but the 99th percentile number is a better way of measuring that.

  176. Really need that to see what the full potential is, that and everyone is already flaming the comments about what chip has more OC potential 😛

  177. Yeah I don’t think either company really cares to have supply constraints, doesn’t matter if the GPU sells for 500 when they can’t move enough.

  178. For this gen of card it is aimed at midrange, ignore the price and the name it is clearly not their top tier folding machine.

    See page 2

  179. I know what you mean, you only get the thumbs down from people who didn’t read page 2 but skipped to the conclusion.

    Regardless of name this card is destined for the eventual price point of 300 or less for this generation. While there will likely be a GPU under this one in price for this gen there will also be one above it, rational thinking makes that middle of the road, aka midrange.

  180. And apparently the poor yields at the foundry which messes with the supply and demand curve.

  181. You’ll still be fine with it, just as I’ll be fine with my GTX 460, but boy it really does make me hungry for an upgrade. Gonna have to wait for the 7850’s competitor first though.

  182. 3d capable 120hz monitor? Sure .. most of them are 16*10 rez but still 😀 its 20fps more than the desired 100

  183. If we are talking about clock potential of each product I think that they are either matched or the 680 might edge out the 7970 for more potential. If we are talking about driver updates, the 680 is newer and probably has more performance gains from driver refinements coming up than the 7970 does. The raw stats favor the 680 in all categories but memory interface. Yeah, the GPU is smaller, but so what? That obviously isn’t affecting the performance much.

    I think both products have lots of headroom for more power hungry versions to hit the market so if we are looking for a 7970 mkII we should look for a 685 or something that turns it up a notch.

    While I expect the AMD offering to rule multi monitor arrays for years to come, until AMD cuts prices we are looking at the Nvidia offering winning on reference specs with better performance per watt/dollar.

    End of the day neither company can produce enough to meet market demands so doesn’t really matter what they cost if you can’t buy them.

  184. The die is considerably smaller than a GTX 560 Ti’s and look at what this can do. It’ll be interesting to see if they plan to make a bigger chip.

    The only problem I see is one of little need for this kind of speed considering the current game selection. BF3 is of course (maybe) an exception for people who play it.

  185. Outstanding. The price is [u<]not[/u<] right, but the product is outstanding nonetheless. (And here I thought I'll be fine with my GTX 460 SE for a while yet.)

  186. I’m wondering if these power consumption numbers lend hope that there could be a single-slot midrange card in the 6XX series’ near future. Maybe not the 660, but perhaps the 650? Hmmm…

    I’ve been saying for a couple of months now that I will most likely be replacing my aging GTX 260 (core 216) with a GTX 660. Very much looking forward to that review.

    Going to a brand new Ivy Bridge 3570K with the GTX 660 and a Samsung 830 256GB coming from my current Core 2 Quad Q9300 with 750GB spinning disk… It’s gonna be a good summer 🙂

  187. A lot of companies would kill to [i<]drop balls[/i<] like this, regardless of how genital that sounds.

  188. It’s a new game, that looks like a game from 2006 or 7. A common occurrence these days.

    I did actually play a good chunk of it on a 8800GT and it ran well at 1680×1050.

  189. AMD really needs to find a way to improve Battlefield 3 performance on their cards. It’s the only AAA title (since Metro and Crysis 2 are sadly niche) that would require a >$500 GPU for playing at max settings, yet for whatever reason, AMD hasn’t tuned it right (despite it being one of their Gaming Evolved titles) and nVidia thoroughly trounces them here.

  190. No World of Warcraft benchmark? I need to know if I can run MoP with everything cranked!

  191. I’d like to see SWTOR in video card reviews as well – at least once 1.2 patch drops. I know the engine is not optimized and kind of a mess right now, but still.

  192. Over 100 fps? If you have a monitor capable of actually displaying 100 or more frames per second, you’re looking at 1920×1080. At that resolution, 100 fps is easy.

  193. Yea, it would – check out Anandtech’s review:

  194. two $500 cards and still can’t play BF3, are you kidding me, Imma call Yuri and tell him to make some evil soviet graphics card, so me and my comrades could play BF3 with over 100FPS.

  195. [quote<]Who cares if Nvidia's "midrange" is as good as AMD's "highrange" if they cost the same amount?[/quote<] How 'bout they both drop prices so we could have something closer to "freerange"! :O

  196. @OneArmedScissor
    Sorry, I missed the context of your post as I was obsessing on power efficiency and missed your point. It’s good to see review sites mixing it up with regard to which platform they use as it gives more data points for comparison. I definitely agree that it’s important to see how it fares in all senses on a Sandy Bridge platform which must be the first choice for most non partisan people prepared to spend $500 on a GPU.

    I figured any regular posters here will immediately understand why the power numbers are so different; namely due to the different configurations used. So likewise they wouldn’t be fooled by the numbers into thinking it requires a more powerful PSU than it actually does.

    The choice of power supply is generally a red herring though as it doesn’t significantly impact the data unless they use an old inefficient unit.
    Here are figures for the 1.2KW Antec behemoth that Anandtech used versus a Seasonic 350W Gold rated PSU. They show the efficiency at an output of ~250W DC and the corresponding AC input which is a good ballpark figure for a power efficient gaming rig (Sandy Bridge) using this GPU:

    Antec TPQ 1200 86.5% (289W AC input)
    Seasonic SS-350GTM 92.4% (271W AC input)

    So even such an extreme comparison only shows a difference at the wall socket of 18W or less than 7%.
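
    The wall-socket figures above follow from dividing the DC load by the PSU efficiency. Here is a minimal Python sketch of that arithmetic, using only the two efficiency numbers quoted in this post. The function name `ac_input_watts` is just an illustrative choice.

```python
def ac_input_watts(dc_output_watts, efficiency):
    """AC power drawn at the wall for a given DC load and PSU efficiency (0..1)."""
    return dc_output_watts / efficiency


dc_load = 250.0  # watts DC, the ballpark load used in the post

antec_ac = ac_input_watts(dc_load, 0.865)     # Antec TPQ 1200 at 86.5%
seasonic_ac = ac_input_watts(dc_load, 0.924)  # Seasonic SS-350GTM at 92.4%

gap_watts = antec_ac - seasonic_ac
gap_percent = 100.0 * gap_watts / seasonic_ac  # relative to the more efficient unit

print(f"Antec:    {antec_ac:.0f} W AC")
print(f"Seasonic: {seasonic_ac:.0f} W AC")
print(f"Gap:      {gap_watts:.1f} W ({gap_percent:.1f}%)")
```

    Measuring the gap against the Antec figure instead gives a slightly smaller percentage. Either way it stays under 7%.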

    [url<][/url<] [url<][/url<]

    The Kingwin 1KW Platinum unit has an efficiency of 92.9% at 250W DC which is even better than the Seasonic 350W Gold. Power supplies have really improved a lot over the last 3 years or so and the two 1KW units linked above have peak efficiency around 400W so not unreasonable.

    [url<][/url<]

    The area where power supply choice has more of an impact is at idle as it’s at such low wattages that the efficiency curve really drops off. It’s not large in absolute terms but more as a percentage.

  197. Didn’t nVidia open up to 4 monitors on this one? I remember seeing something about it, but that was back in the rumor stage.

  198. Who cares if Nvidia’s “midrange” is as good as AMD’s “highrange” if they cost the same amount?

  199. Well, the GTX 680 is roughly 10% faster and 10% cheaper. So while it’s an improvement over the Radeon 7970, it’s not much of a leap, more of a nudge. It would be extremely easy for AMD to provide a 7970 MkII that surpasses the GTX 680. So yeah, I see this as a catch-up product.

    If I were buying a $500 video card today, I’d get the GTX 680.

    However, I’m still happy with my GTX 460!

    PS – I wonder if this is purely a launch of opportunity for Nvidia. I mean, maybe Nvidia has something really big almost ready to go, and figured they could just do a GTX 680 to compete with AMD for a couple of months before dropping the hammer…?

  200. But would it stress these graphics cards enough to show much difference between any of ’em? They’re reviewing hardware, not games.

  201. Don’t worry it runs more than fine….I’d rather they stick to newer games here than benchmark old engines that run fine on almost anything out these days.

  202. [quote<]Well, if the new method of measuring graphic performance has some merit (and I think so), the 680 is slower than the 7970 (see the 99 percentile per dollar chart)[/quote<] The 99th percentile method, while important, is not the only metric by which to measure performance. A card with a lower 99th percentile frame time will deliver a smoother experience than a card with higher fps but higher frame times. However, the card with the higher fps would provide a competitive advantage to an advanced gamer because it will render more frames (and thus more information on opponents) during fast sweeps which are second nature in competitive twitch shooters. In an ideal world, you'd want both, even as a competitive gamer, because you don't want your view to stall in fast-paced battles. I'm just pointing out a scenario where frame time by itself is not useful as a sole metric of performance (and this is why TR reports both average fps and 99th percentile frame times). Then again, I'm neither a competitive gamer nor in the market for a $500 card, so it's a hypothetical discussion on my part.
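
    To make the distinction concrete, here is a small Python sketch comparing average fps with the 99th percentile frame time. The two frame-time traces are made up for illustration, and the nearest-rank percentile used here is an assumption. It is not necessarily how TR computes its numbers.

```python
def average_fps(frame_times_ms):
    """Average frames per second over a run, from per-frame times in ms."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def percentile_frame_time(frame_times_ms, pct=99):
    """Nearest-rank percentile: pct% of frames rendered in this time or less."""
    ordered = sorted(frame_times_ms)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct*n/100
    return ordered[max(rank, 1) - 1]


# Hypothetical traces, 100 frames each.
steady = [20.0] * 100             # every frame takes 20 ms
spiky = [12.0] * 95 + [80.0] * 5  # mostly fast, with a few long stalls

for name, trace in (("steady", steady), ("spiky", spiky)):
    print(f"{name}: {average_fps(trace):.1f} fps average, "
          f"99th percentile {percentile_frame_time(trace):.0f} ms")
```

    The spiky trace wins on average fps but loses badly on the 99th percentile frame time, which is the situation described above.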

  203. Having just completed a second read, I keep coming back to just one sentence in the ‘Conclusions’ page:

    [quote<]there is very little performance difference between these two cards, and any difference is probably imperceptible to the average person[/quote<] Now I'd just like to see someone start to reduce the size of these cards so I can get one that will fit into the CoolerMaster case I've selected for my all-in-one gaming/HTPC (currently 'stuck' with the XFX 6850, which is no bad deal at all).

  204. So, this pretty much tells me I’m going to get a 7950 and overclock it to 7970 and beyond speeds and spank the 680. The 7870 lingers in the 25ms-35ms+ frame rendering window a little too much at 1080p resolutions to make it attractive. Waiting for that price drop on the 7950 . . .

    Note that the 7950/7970 and GTX 680 are reviewed at resolutions higher than 1080p for the most part. On my 24″ screen, 1080p looks great (Avatar blew my mind – backlit LED LCDs FTW!!). So, dropping down to 1080p (even from 1920×1200) will result in boosted FPS and reduced rendering times. THAT is the price-point king right now, IMO.

  205. [quote<]The 7970 also include a full video encoder (h.264 included)[/quote<] That apparently isn't turned on even with the newest drivers...

  206. This thing has the die-size, memory interface, approximate SM(X) layout and board complexity of a mid-range chip. The names and layouts look similar in all cases:

    GF10[b<]4[/b<] = GTX 460 = 8SM's* + 4x64-bit memory controllers
    GF11[b<]4[/b<] = GTX 560 = 8SM's** + 4x64-bit memory controllers
    GK10[b<]4[/b<] = GTX 6[i<][b<]8[/b<][/i<]0 = 8SMX's + 4x64-bit memory controllers

    Whilst I'm impressed with what nvidia have done, the production cost of this must be similar to the GF114 and as Scott mentions, I'm a little disgusted that they're asking for $500 for this. I guess it's sad that competition is poor enough to allow Nvidia to do this, I just hope AMD secure enough design wins that they can keep up the fight.

    [i<]edit:
    * 7 SM's, one disabled because of poor yields
    ** Technically the product with the actual GTX560 name only had 7 too but we all know nvidia loves to muck around with naming schemes. The [b<]real[/b<] GTX560 is the Ti[/i<]

  207. Well, Anand shows that the GTX 680 is 20% slower in Crysis Warhead and slightly slower running Metro 2033, and the GTX 680 seems to be dramatically slower running OpenCL code.
    The 7970 also include a full video encoder (h.264 included) And 50% more memory.
    (Might make a difference with gaming with 3 screens at 1920×1080?)

    PS: I thumbed you up anyways as its true that there is little reason (cant say none ) to go for the 7970.

  208. Counter-attack. I love it! Are they going to try a Zergling run-by into the main, too? nVidia better lift their supply depots.

  209. Unfortunately, the PC’s will be hamstrung by consoles for a very very long time to come, if not forever.

    Early rumors say that the new Xbox will have the equivalent of an HD4000 series card under the hood. Right now, that is ALREADY 3 generations old, and the damn thing is about a year or so before it even comes out at the earliest. This means that the GTX 700 / HD 8000 might be out by then, further increasing the gap. Not to mention the fact that the next Xbox will probably have a longer shelf life than the 360 considering all consoles have to worry about is 30fps @ 720p for a while.

  210. A Great Read as always. TR continues to have the best reviews on the internet.

    I have to say, this card makes you wonder what they have hiding for a GTX 690. If I didn’t have a 6950 that already ran everything quite nicely, this would be the high-end card to get. Makes me look forward to the midrange as well.

  211. I’ve read several of the GTX 680 reviews on the ‘net, and once again TR’s is clearly the best-authored. Scott, you should give seminars.

  212. Well, if the new method of measuring graphic performance has some merit (and I think so), the 680 is slower than the 7970 (see the 99 percentile per dollar chart). The performance per dollar is slightly better, though.

  213. Can Rage be meaningfully benchmarked when the engine is designed to scale workload and graphical quality to hit a static fps target?

  214. I also see we have a full-sized HDMI connector. I hate my 560 Ti’s mini HDMI ports. They are just so fragile.
    As for the review, great as always.

    As for the card itself, I want 2 :) I can only imagine the silky smooth 3D performance, or just the 120Hz gaming silk too… But for $1000 they are going to have to wait until I at least get that 27 inch Asus 3D Vision 120Hz monitor.
    Hummm my income tax check is otw!

    Too bad wife would castrate me if i spent half of it on toys:)

  215. [quote<]Nvidia is dropping the ball here. Playing catch up. By next gen I see AMD dominating the market if this keeps up.[/quote<] If by "dropping the ball" you mean that they are releasing a faster product that costs less and uses less power, then you, sir, are correct.

  216. That’s not a bad idea, either. The Phoronix Test Suite has stuff like that, too, doesn’t it?

  217. Agreed with this. Out of Mercedes Benz C, E, S class, the C-Class is “low” end, but that low end is way above my tiny wallet. 😀

  218. Higher shader and texture output lower ROP and memory bus. This has been AMD’s strategy since 58×0.

    Nvidia is dropping the ball here. Playing catch up. By next gen I see AMD dominating the market if this keeps up.

  219. Source engine? That’s like the ancient HL2 thing for consoles? That runs fine on 8800GT?

    Rage benchmarks would be interesting though. That thing will be reused in some new games as far as I understand.

  220. Yes, I agree good design is important. But there is a small advantage still for the smaller supply. Moral of the story is buying a huge supply is pretty pointless. Looking at Anandtech’s article further, the power figures probably have more to do with the fact they tested with an overclocked Extreme Edition CPU. That probably drives the power draw up a significant amount.

  221. At 1/24th FP32, I’d think people would be looking seriously at using software emulation instead of the hardware support.

    fp32 has 8 exponent bits and 24 significand bits; fp64 has 11 exponent bits and 53 significand bits. A simple “double-single” type emulation gives only 8 exponent bits but 48 significand bits and should run at about 1/4 – 1/5 fp32. That may well be good enough for a lot of people. See e.g. the papers linked from the best answer to [url=<]this stackoverflow question[/url<]. Even a quad-single, with 96 significand bits (though still only 8 bits exponent range), should outperform nV's native double format. Getting things to play nice with NaNs, infinities, and all the details of IEEE 754 would be complicated and definitely slow things down a lot, and the handicapped hardware support may well be faster than that. But if you can avoid those and don't need the three exponent bits I'd think software emulation would be a win.
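
    For anyone curious what “double-single” looks like, here is a rough Python sketch. Python has no native fp32 type, so every operation is rounded to fp32 through `struct`. That is an emulation trick for illustration, not what a GPU kernel would do. The helper names (`f32`, `two_sum`, `to_ds`, `ds_add`) are made up for this example, and it shows the precision gain only. It says nothing about the speed estimates above.

```python
import struct


def f32(x):
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def two_sum(a, b):
    """Knuth's TwoSum in fp32: returns (s, err) with s + err == a + b exactly."""
    s = f32(a + b)
    bb = f32(s - a)
    err = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, err


def to_ds(x):
    """Split a value into a (hi, lo) pair of fp32 numbers."""
    hi = f32(x)
    lo = f32(x - hi)
    return hi, lo


def ds_add(a, b):
    """Add two double-single numbers (a simple, non-IEEE-strict variant)."""
    s, e = two_sum(a[0], b[0])
    e = f32(e + f32(a[1] + b[1]))
    hi = f32(s + e)
    lo = f32(e - f32(hi - s))  # renormalize so hi carries the leading bits
    return hi, lo


x = 1.0 + 2.0 ** -30  # needs more than 24 significand bits

plain = f32(f32(x) + f32(x))      # ordinary fp32 addition
hi, lo = ds_add(to_ds(x), to_ds(x))

print("exact       :", repr(2 * x))
print("plain fp32  :", repr(plain))
print("double-single:", repr(hi + lo))
```

    Plain fp32 drops the small term entirely, while the (hi, lo) pair carries it along in `lo`. As noted above, NaNs, infinities and the narrower exponent range are not handled here.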

  222. I’m impressed by the performance/die area/power consumption of GK104. It’s certainly a huge leap in terms of efficiency for Nvidia. Of course that’s just one side of the coin. The other side is that this is an overpriced chip and not all that much faster than AMD’s high end. I wonder if these prices have to do with 28 nm being more expensive than previous nodes. There’s rumors that TSMC raised the wafer prices and neither AMD nor Nvidia have deals on working chips only like previously.

    In any case 7970 is now even more overpriced and should drop down to 449$ fast. Which makes things for AMD a bit complicated since they’ll have to drop the rest of the lineup down a notch too. That’s of course a good thing for us consumers, finally some competition!

    While GTX680 is smaller than the usual suspect for Nvidia high end, it’s not exactly mid range either. It might be though that GK100/110 will end up Tesla/Quadro only, and GTX 680 will remain the high end for GeForce. And in any case I highly doubt they could keep the excellent gaming efficiency of GK104 with a chip geared more towards compute, like Tahiti.

    All in all Kepler is a breath of fresh air so far. I might actually consider Maxwell for my next update after my current 7970 since Nvidia updated their surround support to work on single cards. There’s some sweet features both Kepler and Tahiti bring to the table. I can’t wait for 8000-series to see how AMD strikes back. 🙂

  223. They could actually publish their data using dynamic charting software – something like amCharts:

    [url<][/url<]

    That way you could zoom in on the desired sector. They are completely configurable - there are lots of other HTML5 charting packages about. Would be a real leader on this side of things if they did that.

  224. I wasn’t talking about efficiency, but the difference between what people should actually expect with their own computer, and what many websites’ test setups show.

    At best, those charts are really only useful for judging what PSU a particular configuration warrants, but their configurations are typically unrepresentative of what people use. Hexus keeps things in perspective, and that’s why I mentioned it.

    You don’t judge what PSU you need based on frames per watt of a graphics card.

    I don’t think Anandtech is actually using a modern PSU. They almost never mention it, and it’s been a long time since the last time I noticed.

    Regardless, with a 1,000w or 1,200w PSU, that potentially puts even their load figures below the 20% DC level, which the 80+ ratings do not take into account. At that range, even brand new, Platinum rated PSUs still tend to take a nose dive in efficiency.
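
    A quick Python sketch of that load-fraction point. The 20% floor is the lowest test point referred to above, and the two DC loads are hypothetical round numbers rather than measurements from any review.

```python
LOWEST_RATED_POINT = 20.0  # percent of rated output, per the post above


def load_percent(dc_load_watts, rated_watts):
    """DC load as a percentage of the PSU's rated output."""
    return 100.0 * dc_load_watts / rated_watts


# Hypothetical DC loads: a lighter gaming load and the ~250 W ballpark.
for dc_load in (150, 250):
    for rated in (350, 1000, 1200):
        pct = load_percent(dc_load, rated)
        note = "below the rated range" if pct < LOWEST_RATED_POINT else "within the rated range"
        print(f"{dc_load} W on a {rated} W unit: {pct:.1f}% ({note})")
```

    With these made-up loads, the 1,000w and 1,200w units dip under 20% at the lighter load, while the 350w unit never does.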

  225. It depends more on the design of the PSU and its 80Plus rating.
    A decent 80Plus Gold 1KW unit is achieving 90% efficiency at just over 200W DC output whereas a 350W Gold unit tops out at 92.4% at 250W DC.
    So the difference is tiny really. Links to two Seasonic Gold PSUs:

    [url<][/url<] [url<][/url<]

    The bigger difference is the design of the PSU and not the output. A 1KW Gold unit will be more efficient than a poorly designed 500W unit even at a load of 250W.

  226. Not the 7950 vs the 7970 but in a comparison of a 78xx series vs a 79xx series I would indeed call the 78xx the midrange card based on gpu capabilities regardless if they were priced the same. Just as I would call a Corvette the high end vehicle and the Camaro the midrange.

  227. I’ll just jump in here and say that while the wattage of the PSU shouldn’t make a difference, the efficiency of the PSU is non-linear. I’m no EE, I just know the pattern I’ve seen in testing. Especially at lower draws (excepting very low % of peak output, where most supplies show the highest efficiency), an overkill supply is probably going to be less efficient. Anand running a rig that is probably ~400watt draw on a 1200 watt supply (I checked their testing) is likely going to show higher draw at the wall than if they’d used a 600 or even 750watt supply. That and not all supplies are equal. I do know using a power supply far larger than most people would use makes their results pretty meaningless.

  228. Thanks, Scott for the nice new render time graphs. They look pretty good. Any chance we can get them with more detail? I’d be glad to write you a script or something that would process your raw frame time data to generate more detail on the % of frames axis.
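
    Something along these lines, as a rough Python sketch. It assumes the raw data is just a list of per-frame times in ms (the sample trace below is invented) and uses a simple nearest-rank lookup. Points are given in tenths of a percent so the tail of the “% of frames” axis can be sampled more finely.

```python
def percentile_curve(frame_times_ms, per_mille_points):
    """Frame time at each requested point, given in tenths of a percent.

    Uses the nearest-rank method: the value at point p is the smallest
    frame time such that at least p/1000 of all frames are at or below it.
    """
    ordered = sorted(frame_times_ms)
    n = len(ordered)
    curve = []
    for p in per_mille_points:
        rank = (p * n + 999) // 1000  # integer ceiling of p*n/1000
        curve.append((p / 10.0, ordered[max(rank, 1) - 1]))
    return curve


# Invented trace: 1000 frames, mostly quick with a thin slow tail.
trace = [16.0] * 900 + [25.0] * 80 + [40.0] * 15 + [70.0] * 5

# Coarse in the middle, finer near the tail where the differences show up.
points = [500, 900, 950, 990, 995, 999]
curve = percentile_curve(trace, points)

for pct, ms in curve:
    print(f"{pct:5.1f}% of frames rendered in {ms:.0f} ms or less")
```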

  229. Right now I’m picturing blue and green Spartans taking turns bagging a red Spartan.

  230. Yup 7870 is looking good. If Nvidia has no answer in that price segment by the middle of May I’m probably getting one for my Ivy setup.

  231. If that’s correct, it’s about damned time. AFAIK, ATI has supported 3 simultaneous displays across most of its line for some time, while NVidia thought 2 was good enough for everyone that didn’t want to buy multiple cards 😛

  232. Nice Review.

    Nice silicon.

    No source engine game benchmark = disappointing though.

    edit – and Scott would I be correct in assuming the tested res was 2560×1600 throughout?

    If so would this affect the time in your graphs if the res was turned down to 1680×1050?

  233. @ OneArmedScissor
    Hexus use Batman Arkham City for which the GTX 680 is 33% faster than the HD 7970. So as the nVidia equipped system consumed 4W less at load that equates to a large difference in efficiency; frames per watt.
    Reviews use different tests when measuring power figures so best to look at how well the card performs in the tests to get the big picture. i.e. the actual efficiency.
    No point in suggesting card x has decent power consumption just because it is low if the frame rate is particularly bad for that game.

    As for the use of a 1KW psu I don’t think it makes much difference if they are using a modern unit. I doubt the difference would be more than 5 to 10% which is not that significant. E.g. if the difference is 50W with a 1kw psu and 10% less with a lower rated psu the gap is then 45W.

  234. We did average the perf for the value stuff. Re-read. We didn’t for power since we tested power use in Skyrim.

  235. i’m pleasantly surprised. given how late this was, i was expecting a rehash of the 4xxx vs 2xx series again. I’m glad nVidia’s managed to get power consumption under control instead of just aiming for “bigger is better” that they seemed to be doing the last 2 generations.

    Also, to me the cards seem to trade blows in the games you chose. I’m not sure why you only chose one (skyrim) for value considerations. That game was a clear win for nVidia. If, say, you used Battlefield 3, i’m sure the value and FPS/watt scores would be much closer. Crysis 2 would probably be a win for AMD. Why not average the performance over all the games?

  236. I bet the release date depended as much on driver refinement as it did on actual production.

  237. First of all, great review Scott. This card is quite impressive. I am not a big fan of nVidia, especially JHH, but this really deserves a pat on the back. I mean it beats the 7970 in price, performance, power, noise, almost everything. What’s even more frightening is that nVidia has a more powerful GK110. GK104 is actually mid-range *YIKES*

  238. When we get something that runs the code. It’s not available yet. But it may sound less thrilling once I flesh out my description, even though the theory makes some sense.

  239. Congratulations to Team Green. After a few lackluster generations they’ve managed to catch back up to AMD. I can only hope that the renewed competition will lead to lower prices for consumers. The 7970 launch was overpriced and I think it’s clear that that price will be falling shortly.

  240. When can we expect the quality/performance investigation of TXAA to be added? It *sounds* good, but most things do when described by the people who made it…

  241. The ranges are based on price points, not relative performance. If AMD only made a 7950 and 7970 and both sold for over $500, would you call the 7950 a “low end” card just because there isn’t one below it?

  242. Bring on the mid range 660 with lower price than the 7870, and equivalent performance, and i might buy some nvidia again after 10 years.

  243. [quote<]Let the exciting game of driver optimization one-upmanship begin![/quote<] The 301.10 drivers were released mere hours before reviews were published. It had already begun before reviews were officially out.

  244. Also, for the die size, power consumption, and performance, this looks like the first NVidia product that interests me in a while. For so long they’ve been churning out massive dies and more or less bruteforcing the problem while ATi was cleverly focusing on smaller, more profitable designs that were “fast enough”. With the 7970 versus the GTX 680, it looks like the tables have turned.

  245. Yes, especially when Fermi devoted many transistors to resources that didn’t benefit most users. That’s a lot of wasted die size multiplied by every GPU they sold, cutting into profit margins for very little benefit – such a small percentage of people are going to need GPGPU performance.

  246. Midrange in a product family line. Not what you consider midrange in your wallet. Also consider that there have been far more expensive top of the line launches at the $800 + mark in the past.

  247. Midrange and $500 do not belong in the same sentence. Even $350 for the HD 7870 is beyond my definition of midrange. $250 and below is more like it, especially when there are no games that really push even the $350 cards.

  248. Yes… while the number of people that talk about OpenCL and CUDA on this website is huge, the number of people that actually use it is tiny (and most of them are doing it for BOINC or some other hobby project). A whole bunch more people play games, however.

    It will be interesting to see the gaming vs. compute balance when the GK110 parts come out this summer. I think we might see some (Krogoth inspired) unimpressive increases in gaming performance but some big jumps in compute performance.

  249. They also use different games when measuring power load (TR – Skyrim, Anand – Metro 2033, Hexus: Batman AC).

  250. I love the internet – downrated for a non-fanboy factually correct assessment of the situation.

    Anyone who has been following the market the last few years should understand that AMD has foregone trying to make the “big die” GPUs to compete with Nvidia, instead favoring smaller chips that can perform at a given price point. They have had a lot of success – look at the 6850/6870 for example.

    Nvidia just took this idea and put their own spin on it this generation and is obviously doing very well with it from a performance/thermals perspective.

    Unlike the fanboys I embrace technological innovation and competition. AMD and Nvidia still compete in a way that I wish AMD and Intel were. I just hope all of this drives a much higher bar for console graphics so PCs aren’t hamstrung by consolitis as quickly as they were within a year of the 360/PS3 launch.

  251. [quote<]NVENC — Hardware video encoding, or right back atcha, QuickSync.[/quote<] Time to encode is not the only thing to consider however especially with video encoding. Quality plays a huge factor and this is something that TR has to investigate further in depth. Fast is good but if it looks like crap (like all GPU assisted encoders have so far) then its value diminishes greatly.

  252. the 680 supports up to 4 displays.

  253. well, what’s not correct is that AMD’s CPUs are cheaper than Intel’s. But otherwise you’re right – they’re worse in every way.

  254. Yeah. We do have a Tesla C2050 running in the lab, but Teslas can cost multiple thousands of dollars. So for personal and cost-sensitive machines, I prefer cheaper (consumer-level) products. On my personal machine, I am currently running 5850 as it offers 400 gflops fp64 for cheap. 79xx offers a natural upgrade path for me, but prices are a bit too high currently.

    Hopefully there will be a Tesla product based on a bigger/better Kepler chip. Looks like GTX 680 is not artificially cut-down, i.e. the chip design itself is incapable of handling FP64 any faster. On the GTX 480 etc, they artificially clipped the performance on consumer cards.

  255. I was hoping the 680 would let me switch to a single card solution (currently running dual 570’s). I can’t justify the cost when the performance is basically the same. Based on today’s reviews I’ll be sticking with my 570’s until the next gen cards come out.

  256. So Nvidia has gone with separate Kepler-GL and Kepler-CL versions. I think this is the right direction, as the two targets have increasingly different requirements for the type of compute units.

  257. All the Newegg listings for different GTX 680 cards show that the “shader” clock speed is still doubled compared to the base clock speed. The review doesn’t make any note of that – are these listings just the retailer assuming otherwise?

  258. [quote<]I was hoping that 4 months late to the game[/quote<] How does January 2012 (when the 7970 actually went on sale) - late March 2012 equal four months?

  259. Disappointed that Nvidia’s midrange processor outdoes AMD’s top of the line at a cheaper price?

  260. I’m confused. I see A LOT of headroom for this card. It theoretically has only one major bottleneck, and that’s the 256-bit memory interface, right? Otherwise, if you look at page 14, it gets more performance from a smaller GPU and has more flops to throw around, assuming it can circumvent that choke point (something that will only affect high-res gaming on multi-monitor setups etc.). Reading around, the card is almost 10-20 percent slower than it should be for those clocks. On top of that it supposedly has insane OC headroom (though I find it to be a bit pricey to risk burning up).

    I think AMD will win the price war but Nvidia has put out a compelling product, for their core consumer. AMD still features more features for high end users though with their eyefinity setups.

  261. This card is pretty nice. I switched to ATI at my last upgrade due to all the problems I had with my older games on my previous Nvidia cards, though, so spending $500 to see if they fixed it seems pretty risky.

    Not that ATI doesn’t have problems though. I bet they still haven’t fixed that annoying mouse cursor bug in SC2. Overall though ATI has better driver support for the games I play which is pretty important.

    But yeah, I am not sure I am going to be getting any of these. I have a 5850 right now and while these are certainly faster, I don’t think they are $400+ faster. Oh well, here is hoping on the next gen I guess.

    Oh, excellent review BTW. I really like the new line graphs. Combined with the time spent above 50ms this gives me a very good idea of actual performance.

  262. I’m a bit surprised that it’s that good. Not as superior as Nvidia was bragging, though. As said, a little overclocking and Tahiti passes it by. What is more surprising is that AMD didn’t provide any response in the form of an HD7980 (like the HD4890, which was basically a super-clocked HD4870) even when they had ample time to do so and Tahiti has a lot of OC potential. Now it seems that Kepler got all the mind-share victories here.

  263. Very happy to see Nvidia has implemented native DisplayPort with this generation.

  264. “The short story here is that, in Kepler, the constant tug-of-war between control logic and FLOPS has moved decidedly in the direction of more on-chip FLOPS. The big question we have is whether Nvidia's compiler can truly be effective at keeping the GPU's execution units busy. Then again, it doesn't have to be perfect, since Kepler's increases in peak throughput are sufficient to overcome some loss of utilization efficiency.”

    Whilst the "peak throughput" is sufficient to overcome loss of utilization today, it won't be tomorrow. It applies.

  265. This is true, but you can never tell how long that support will last for the Kepler architecture in future. It’s a variable that’s largely dependent on funding, which is very unpredictable in this day and age.

    Take for instance the GTX 280, which I have. It suffered from unoptimised drivers for years before the release of Battlefield 3, which for some reason prompted Nvidia to actually give some form of a damn. GPU usage was always sitting around 60%-70% in games such as Bad Company 2 and Formula 1 2010/11. This problem will exist in a greater form for Kepler, as it already does in cards like the HD4000 series. My brother owns an HD4890 and it suffers from severe GPU under-utilisation in newer titles. Where the card could be churning out 40+ FPS, it’s stuck at 30 FPS. That’s not something I want a £400+ card to suffer from in a year or two’s time.

  266. Kepler and GCN both use substantially new architectures. I don’t see how what you’ve said applies.

  267. Clearance sales for existing 5xx stock are about to happen. 😉

    Might be a good time to shop around for a 560Ti on the cheap.

  268. If you use more than 3 displays and power them with only one card, Nvidia does not make a product for consumers that will support that.

  269. FYI the 680 looks to have even more. I still agree performance is great, but price is what sells cards. Assuming TSMC can fix their manufacturing issues, AMD has always priced their products more aggressively and probably will now. If multi-monitor setups are important to you, AMD still has that one in the bag.

  270. Page 14 is an amazing page.

    It demonstrates many of the accomplishments of both the 680 and 7970. It shows how Nvidia has done more with less in many situations and also how much apparent headroom there should be for GPU performance improvement through driver updates coming down the pipe. It also sheds light on the achievements in the 7970 relative to some of its lesser “potential” demonstrating that AMD has finally nailed down some decent driver engineers.

    Going forwards I expect that the 680 will achieve an order of magnitude more after a couple of driver updates, but I’m still even more impressed with what AMD has gotten out of the 7970, all things considered. I do worry the memory interface for the 680 might be a limiting factor in its performance down the line, but for now it’s still awesome.

  271. AMD is going to counter-attack by cutting the 7970’s price and encouraging vendors to factory-overclock 7970s to make them match the 680. (FYI, the 7970 has a ton of overclocking headroom.)

  272. But the difference is that Nvidia enjoys much greater industry support than AMD does. You see a lot more games optimized through Nvidia’s TWIMTBP program compared to AMD’s.

  273. Impressive. Really exciting to see nVidia top AMD with a great product, right after a good showing from AMD. Competition’s great when both participants are at the top of their game.

    I kept looking at the graphs, though, and thought that the Radeon HD 7870 did pretty well keeping up with the big boys at a much saner price point.

    And now… Let the exciting game of driver optimization one-upmanship begin!

  274. The somewhat untold story (or often missed part of the story) is that this chip is a mid-range GPU. It is priced and named as the 680 only because AMD’s highest end parts are comparable in performance. It is the market that lets this spiritual successor to the 460/560Ti be the new 680 and priced at $499. Just imagine what the “real” 680 could do (GK110).

    That alone makes this amazing from a technical perspective. All fanboys aside, Nvidia has out-AMD’ed AMD in making a small chip with high performance. Whereas AMD will have refreshed parts out no doubt by Fall of this year, Nvidia will still have the GK110 waiting in the wings and on a more mature 28nm process as well. Based on history I would guess GK110 to be at least 50% faster. EDIT: 50% faster than the 680 – I’m not predicting what AMD’s performance will be at the same price point.

    2012 looks to be the best year for GPU technology in a long time.

  275. GK104 seems to be a high-end part in the guise of a mid-range part. The transistor count is almost as high as Tahiti’s and easily surpasses Pitcairn’s. I can’t imagine what the GK110’s transistor count will end up being, but it is the main reason why Nvidia didn’t want to deploy it yet. TSMC’s 28nm process doesn’t seem to be so hot.

    GK104 makes a trade-off for more texture processing power by sacrificing some shading power when compared to Tahiti.

    It shows this in games. The vast majority of demanding games don’t need the shading power that Tahiti offers, but always welcome more texture power and that’s how the GK104 is able to pull itself ahead.

    Anyway, it seems Nvidia made the right choice by settling for GK104, which still manages to give Tahiti some serious competition, instead of gambling on a behemoth that would end up getting delayed by yield issues. That would have ended up being a repeat of the Fermi (GTX 480) debacle.

    GK104 is about as power efficient as Tahiti, but it is certainly more transistor count efficient.

    I can’t wait until Nvidia can distill lesser versions of GK104 without killing performance so they can give the 77xx and 78xx some much-needed competition. As a side effect, I suspect that Nvidia is about to discount their existing 5xx stock.

  276. Great review, although I feel the article should mention the GPU’s reliance on driver optimisations in the conclusion. The reason is that further down the line, when GK104 is no longer the latest architecture, there is a good chance the card won’t perform well in newer games when compared to the HD7000 series. This issue will get much worse when Nvidia decides to release more derivatives of its Kepler architecture, as each one requires optimised drivers. It’ll be the same issue that plagued the HD4000, HD5000 and HD6000 series of AMD cards.

  277. The one I trust the most is Hexus. They historically have used more realistic test setups than other sites, and their low power figures show it.

    Anandtech's is closer to TR's, but they also tend to use a 1,000+ watt PSU, and that jacks up all of their power figures and exacerbates the difference.

  278. you’re a crazy man. they probably tried, but amd’s cpu’s suck SO BAD, they still lose.

  279. I think Nvidia came to market with a deal and a powerhouse. I’m not buying just yet but it will lead to a purchase for certain in the next 6 months. I don’t think this product is just a win for Nvidia but also consumers.

  280. let me fix that.

  281. The 301.10 drivers are WHQL; that ought to say enough. No need to show your fanboyism by using profanity.

  282. Well, NVIDIA has their Tesla market for that. Fermi in GeForce products was also capped much like this.

  283. “why the -? what part of it is incorrect?”

    IT MADE SENSE AND YOU DID NOT TYPE IN ALL CAPS!!!!!!!!!!!!!!!!!!!!!!!

  284. Awesome card and great review as usual!
    Really like the changes they made to the architecture. It’s not really a new architecture, but the changes made make it a beast of a GPU, while being so small.
    The price is too high though. Sure it beats the 7970 overall, but to soundly beat it (actually humiliate AMD) NVIDIA should call this GTX 660 Ti and price it @ $400.

    But as you said, Scott, this should be a problem related to supply @ TSMC and there’s nothing we can do about it, except complain about how high the prices are.

    Now if GK104 is like this, I’m anxiously waiting to see what GK106 will be (competitor to 7870). If NVIDIA follows its own trend it will be around 1/2 of GK104 in terms of resources. Will be interesting to see what kind of performance that yields, considering how efficient Kepler is. And hopefully the price will be at the usual $200-250 for mid-range and not the 7870/50 inflated prices…

  285. *Looking at LuxMark OpenCL rendering results where Intel CPU performs better with AMD APP than Intel’s own ICD*

    -So, despite what Intel has done to them in past years (maybe still now, who knows?) in the infamous compiler optimization ‘scandal’, AMD still chooses not to discriminate against Intel in its own SDK. I wept at AMD’s pureness…


  286. Disappointed that FP64 rate is only 1/24th FP32. Many of my workloads are FP64. (I do OpenCL programming).
    I guess a 7950 it is for me then, when the prices come down.

  287. Wow, that is actually much more impressive than I thought it was going to be. Beats the 7970 all over the shop: fps and power consumption, smaller die and lower price.

    I’m impressed.

  288. so.. essentially, there is no reason to get the 7970? the 680 is faster, uses less power, is quieter, supports h.264 encoding, physx (whether you care or not, it’s still there), AND costs less. looks like amd is in a similar position on cpu’s and gpu’s now…..

    why the -? what part of it is incorrect?

  289. Since no one has mentioned it already over here, I’ll say something that many have been saying on other sites now.

    7970 launches – no official drivers on launch day
    680 launches – 301.10 drivers released on launch day

    And note, it’s not a f**king “beta” driver.

  290. I read that too. However, while absolutely plausible, according to the very same rumor sites, MS has at least three different next gen consoles in the works under three different names 😛

    Only Apple rumors manage to do (way) worse than next-gen console related ones.

  291. Pretty good showing. It’s a very fast card for monitors much larger than I will ever own. I’d be interested in seeing what the next step down (sub-$200 range) is capable of.

    Also, looking at the plots, it appears that nVidia needs to drop the price on lower-priced parts. The GTX 560 Ti has nearly identical performance-per-dollar-spent (around 28fps and $250 vs roughly 55fps and $500 for the 680) and it doesn’t have the luxury of being a premium part. Clearly 15-20% price cuts across the board are in order.
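
    For anyone who wants to check the performance-per-dollar comparison, here is a minimal sketch using only the rough figures quoted in this comment (eyeballed from the plots, not measured data). The function and variable names are made up for illustration.

```python
# Rough figures quoted in the comment above; these are eyeball estimates
# from the review's scatter plot, not exact benchmark results.
cards = {
    "GTX 560 Ti": {"fps": 28, "price_usd": 250},
    "GTX 680": {"fps": 55, "price_usd": 500},
}


def fps_per_dollar(fps, price_usd):
    """Average frames per second delivered per dollar of card price."""
    return fps / price_usd


for name, card in cards.items():
    value = fps_per_dollar(card["fps"], card["price_usd"])
    print(f"{name}: {value:.3f} fps per dollar")

# Relative value of the cheaper card versus the premium one.
ratio = fps_per_dollar(28, 250) / fps_per_dollar(55, 500)
print(f"560 Ti value relative to 680: {ratio:.2f}x")
```

    On these numbers the two cards land within a couple of percent of each other in fps per dollar, which is the point being made: the cheaper card is not offering the better value one would normally expect further down the range.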

  292. I would’ve gotten this if it cost a bit less to replace my GTX470… looks like now I have to wait until this ridiculous price comes down, which should be a year from now.

    Otherwise, it looks like a very well behaved card. Any folding PPD tests?

  293. Unfortunately Nvidia has burned its bridges with previous console graphics deals and all signs are pointing towards AMD GPUs being inside all three next gen consoles.

  294. @Scott Wasson

    Great review. Something to note though:

    Your power consumption results seem a bit odd. Most other reviews place the HD7970 and GTX 680 a lot closer together. Turbo Boost also contributes to considerable variability in the GTX 680's power consumption depending on the game it is running. If it isn't too difficult, it would be a good idea in future to test power consumption per game benchmarked, like HardOCP have done.

  295. Well, theoretically, here’s a nice candidate GPU for a next-gen 200W console based on COTS parts 😛

  296. Disappointed, I must say.

    I was hoping that 4 months late to the game, nVidia would clearly outperform AMD, which would speed up improvements in both their product lines.

    I am pleased with the new Load consumption and noise numbers though. That alone would push me to buy the 680.

  297. I was kinda hoping for some overclocked results (from an OCed 680 and some of those factory OCed 7970s that are prevalent).