A closer look at some GeForce GTX 680 features

When we first reviewed the GeForce GTX 680, we were in our usual rush to push out an article at the time of the product’s unveiling, after a solid week of poking and prodding Nvidia’s new GPU. We were also attempting some of the more complex analysis of real-time graphics performance we’d ever done, and these things take time to produce.

Something had to give.

Our choice was to take a mulligan on writing up several of the new hardware and software features built into the GTX 680 and its GK104 graphics processor, with the intention of fixing the oversight later. We figured omitting some stuff and patching it later was pretty fitting for a GPU introduction, anyhow.

Well, we’ve been working on writing up those features, and given additional time to describe them, we indulged our penchant for ridiculous detail. The word count shot up like Apple’s stock price, and the result is this article, because there’s no way we’re not totally milking this thing for more readership. Read on for more detail than you probably deserve to have inflicted on you concerning the GeForce GTX 680’s new dynamic clocking scheme, antialiasing capabilities, adaptive vsync, and more.

Nvidia silicon gets forced induction

As evidenced by the various “turbo” schemes in desktop CPUs, dynamic voltage and frequency tech is all the rage these days. The theory is straightforward enough. Not all games and other graphics workloads make use of the GPU in the same way, and even relatively “intensive” games may not cause all of the transistors to flip and thus heat up the GPU quite like the most extreme cases. As a result, there’s often some headroom left in a graphics card’s designated thermal envelope, or TDP (thermal design power), which is generally engineered to withstand a worst-case peak workload. Dynamic clocking schemes attempt to track this headroom and to take advantage of it by raising clock speeds opportunistically.

Although the theory is fairly simple, the various implementations of dynamic clocking vary widely in their specifics, which can make them hard to sort. Intel’s Turbo Boost is probably the gold standard at present; it uses a network of thermal sensors spread across the die in conjunction with a programmable, on-chip microcontroller that governs Turbo policy. Since it’s a hardware solution with direct inputs from the die, Turbo Boost reacts very quickly to changes in thermal conditions, and its behavior may differ somewhat from chip to chip, since the thermal properties of the chips themselves can vary.

Although distinct from one another in certain ways, both AMD’s Turbo Core (in its CPUs) and PowerTune (in its GPUs) combine on-chip activity counters with pre-production chip testing to establish a profile for each model of CPU or GPU. In operation, then, power draw for the chip is estimated based on the activity counters, and clocks are adjusted in response to the expected thermal situation. AMD argues the predictable, deterministic behavior of its dynamic voltage and frequency scaling (DVFS) schemes is an admirable trait. The price of that consistency is that it can’t squeeze every last drop of performance out of each individual slab of silicon.

Nvidia’s take on this sort of capability is called GPU Boost, and it combines some traits of each of the competing schemes. Fundamentally, its logic is more like the two Turbos than it is like AMD’s PowerTune. With PowerTune, AMD runs its GPUs at a relatively high base frequency, but clock speeds are sometimes throttled back during atypically high GPU utilization. By contrast, GPU Boost starts with a more conservative base clock speed and ranges into higher frequencies when possible.

The inputs for Boost’s decision-making algorithm include power draw, GPU and memory utilization, and GPU temperatures. Most of this information is collected from the GPU itself, but I believe the power use information comes from external circuitry on the GTX 680 board. In fact, Nvidia’s Tom Petersen told us board makers will be required to include this circuitry in order to get the GPU maker’s stamp of approval. The various inputs for Boost are then processed in software, in a portion of the GPU driver, not in an on-chip microcontroller. This combination of software control and external power circuitry is likely responsible for Boost’s relatively high clock-change latency. Stepping up or down in frequency takes about 100 milliseconds, according to Petersen. A tenth of a second is a very long time in the life of a gigahertz-class chip, and Petersen was frank in admitting that this first generation of GPU Boost isn’t everything Nvidia hopes it will become in the future.

Graphics cards with Boost will be sold with a couple of clock speed numbers on the side. The base clock is the lower of the two—1006MHz on the GeForce GTX 680—and represents the lowest operating speed in thermally intensive workloads. Curiously enough, the “boost clock”—which is 1058MHz on the GTX 680—isn’t the maximum speed possible. Instead, it’s “sort of a promise,” says Petersen; it’s the clock speed at which the GPU should run during typical operation. GPU Boost performance will vary slightly from card to card, based on factors like chip quality, ambient temperatures, and the effectiveness of the cooling solution. GeForce GTX 680 owners should expect to see their cards running at the Boost clock frequency fairly regularly, regardless of these factors. Beyond that, GPU Boost will make its best effort to reach even higher clock speeds when feasible, stepping up and down in increments of 13MHz.

The GPU ramps up to 1100MHz in our Arkham City test run
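
To make the stepping arithmetic concrete, here’s a minimal Python sketch. The clock figures and the 13MHz step are the numbers quoted above, but the decision rule, the function names, and the power and temperature limits are our own placeholders, not Nvidia’s actual driver logic, which hasn’t been published.

```python
BASE_MHZ = 1006   # GTX 680 base clock
BOOST_MHZ = 1058  # advertised boost clock
STEP_MHZ = 13     # size of one Boost step


def steps_above_base(clock_mhz):
    """Number of whole 13MHz steps between the base clock and clock_mhz."""
    return (clock_mhz - BASE_MHZ) // STEP_MHZ


def next_clock(current_mhz, power_w, power_limit_w, temp_c, temp_limit_c, max_mhz):
    """Hypothetical policy: step up while there is headroom, else step down.

    The limits passed in are placeholders; the real inputs and thresholds
    used by the driver are not public.
    """
    has_headroom = power_w < power_limit_w and temp_c < temp_limit_c
    if has_headroom:
        return min(current_mhz + STEP_MHZ, max_mhz)
    return max(current_mhz - STEP_MHZ, BASE_MHZ)


print(steps_above_base(BOOST_MHZ))  # 4
print(steps_above_base(1110))       # 8
```

By this arithmetic, the advertised boost clock sits four steps above base, and a card running at 1110MHz is eight steps up.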

Petersen demoed several interesting scenarios to illustrate Boost behavior. In a very power-intensive scene, 3DMark11’s first graphics test, the GTX 680 was forced to remain at its base clock throughout. When playing Battlefield 3, meanwhile, the chip spent most of its time at about 1.1GHz—above both the base and boost levels. In a third application, the classic DX9 graphics demo “rthdribl,” the GTX throttled back to under 1GHz, simply because additional GPU performance wasn’t needed. One spot where Nvidia intends to make use of this throttling capability is in-game menu screens—and we’re happy to see it. Some menu screens can cause power use and fan speeds to shoot skyward as frame rates reach quadruple digits.

Nvidia has taken pains to ensure GPU Boost is compatible with user-driven tweaking and overclocking. A new version of its NVAPI allows third-party software, like EVGA’s slick Precision software, control over key Boost parameters. With Precision, the user may raise the GPU’s maximum power limit by as much as 32% above the default, in order to enable operation at higher clock speeds. Interestingly enough, Petersen said Nvidia doesn’t consider cranking up this slider overclocking, since its GPUs are qualified to work properly at every voltage-and-frequency point along the curve. (Of course, you could exceed the bounds of the PCIe power connector specification by cranking this slider, so it’s not exactly 100% kosher.) True overclocking happens by grabbing hold of a separate slider, the GPU clock offset, which raises the chip’s frequency at a given voltage level. An offset of +200MHz, for instance, raised our GTX 680’s clock speed while running Skyrim from 1110MHz (its usual Boost speed) to 1306MHz. EVGA’s tool allows GPU clock offsets as high as +549MHz and memory clock offsets up to +1000MHz, so users are given quite a bit of room for experimentation.

EVGA’s Precision with the three main sliders maxed out.

Although GPU Boost is only in its first incarnation, Nvidia has some big ideas about how to take advantage of these dynamic clocking capabilities. For instance, Petersen openly telegraphed the firm’s plans for future versions of Boost to include control over memory speeds, as well as GPU clocks.

More immediately, one feature exposed by EVGA’s Precision utility is frame-rate targeting. Very simply, the user is able to specify his desired frame rate with a slider, and if the game’s performance exceeds that limit, the GPU steps back down the voltage-and-frequency curve in order to conserve power.

We were initially skeptical about the usefulness of this feature for one big reason: the very long latency of 100 ms for clock speed adjustments. If the GPU has dialed back its speed because the workload is light and then something changes in the game—say, an explosion that adds a bunch of smoke and particle effects to the mix—ramping the clock back up could take quite a while, causing a perceptible hitch in the action. We think that potential is there, and as a result, we doubt this feature will appeal to twitch gamers and the like.
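
A rough sketch shows why the latency worries us. The limiter below is a guess at how frame-rate targeting might behave, and the ramp estimate assumes the driver moves one 13MHz step per 100 ms; Nvidia hasn’t said whether it can jump several steps at once, so treat the result as a pessimistic bound rather than a measurement.

```python
STEP_MHZ = 13          # one Boost step, per Nvidia
STEP_LATENCY_MS = 100  # approximate time per clock change, per Petersen


def frame_rate_target_step(current_mhz, measured_fps, target_fps, min_mhz, max_mhz):
    """Hypothetical limiter: shed one step when above target, add one when below."""
    if measured_fps > target_fps:
        return max(current_mhz - STEP_MHZ, min_mhz)
    if measured_fps < target_fps:
        return min(current_mhz + STEP_MHZ, max_mhz)
    return current_mhz


def ramp_time_ms(from_mhz, to_mhz):
    """Time to move between two clocks if each step costs the full latency."""
    steps = -(-abs(to_mhz - from_mhz) // STEP_MHZ)  # ceiling division
    return steps * STEP_LATENCY_MS


print(ramp_time_ms(1006, 1110))  # 800
```

Under those assumptions, climbing from the base clock back up to 1110MHz would take most of a second, which is plenty long enough to notice.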

However, in our initial playtesting of this feature, we’ve not noticed any problems. We need to spend more time with it, but Kepler’s frame rate targeting may prove to be useful, even in this generation, so long as its clock speed leeway isn’t too wide. At some point in the future, when the GPU’s DVFS logic is moved into hardware and frequency change delays are measured in much smaller numbers, we expect features like this one to become standard procedure, especially for mobile systems.

Vsync starts taking night courses

You’re probably familiar with vsync, or vertical refresh synchronization, if you’ve spent much time gaming on a PC. With vsync enabled, the GPU waits for the display to finish updating the screen—something most displays do 60 times per second—before flipping to a new frame buffer. Without vsync, the GPU may flip to a new frame while the display is being updated. Doing so can result in an artifact called tearing, where the upper part of the display shows one image and the lower portion shows another, creating an obvious visual discontinuity.

Tearing is bad and often annoying. In cases where the graphics processor is rendering frames much faster than the display’s refresh rate, you may even see multiple tears onscreen at once, which is pretty awful. (id Software’s Rage infamously had a bad case of this problem when it was first released, compounded by the fact that forcing on vsync via the graphics driver control panel didn’t always work.) Vsync is generally a nice feature to have, and I’ll sometimes go out of my way to turn it on in order to avoid tearing.

However, vsync can create problems when GPU performance becomes strained, because frame flips must coincide with the display’s refresh rate. A 60Hz display updates itself every 16.7 milliseconds. If a new frame isn’t completed and available when an update begins, then the prior frame will be shown again, and another 16.7 ms will pass before a new frame can be displayed. Thus, the effective frame rate of a system with a 60Hz display and vsync enabled will be 60Hz, 30Hz, 20Hz, and so on—in other words, frame output is quantized. When game performance drops and this quantization effect kicks in, your eyes may begin to notice slowdowns and choppiness. For this reason, lots of folks (especially competitive gamers) tend to play without vsync enabled, choosing to tolerate tearing as the lesser of two evils.
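
The quantization is easy to work out by hand, but here’s a small Python sketch of it anyway. It assumes plain double-buffered vsync and a constant render time per frame, which real games never quite deliver; the function name is ours.

```python
import math


def vsync_frame_rate(render_ms, refresh_hz=60.0):
    """Effective frame rate when every flip must wait for a display refresh."""
    interval_ms = 1000.0 / refresh_hz
    # A frame that misses one refresh has to wait for the next one.
    intervals = max(1, math.ceil(render_ms / interval_ms))
    return refresh_hz / intervals


print(vsync_frame_rate(10.0))  # 60.0
print(vsync_frame_rate(17.0))  # 30.0
print(vsync_frame_rate(34.0))  # 20.0
```

Note the cliff: a frame that takes 17 ms instead of 16 ms cuts the displayed rate in half.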

Nvidia’s new adaptive vsync feature attempts to thread the needle between these two tradeoffs with a simple policy. If frame updates are coming in at the display’s refresh rate or better, vsync will be enforced; if not, vsync will be disabled. This nifty little compromise will allow some tearing, but only in cases where the smoothness of the on-screen animation would otherwise be threatened.

An illustration of adaptive vsync in action. Source: Nvidia.
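
The policy is simple enough to express in a few lines of Python. This is our own reading of the rule as Nvidia described it, with a constant render time assumed for simplicity; the function names are made up, and the driver surely makes the decision with more care than this.

```python
def adaptive_vsync_on(render_ms, refresh_hz=60.0):
    """Enforce vsync only when frames arrive at least as fast as the display refreshes."""
    return render_ms <= 1000.0 / refresh_hz


def adaptive_frame_rate(render_ms, refresh_hz=60.0):
    """Displayed frame rate under the adaptive policy."""
    if adaptive_vsync_on(render_ms, refresh_hz):
        return refresh_hz          # capped at the refresh rate, no tearing
    return 1000.0 / render_ms      # vsync off: full rate, tearing possible


print(adaptive_frame_rate(10.0))  # 60.0
print(adaptive_frame_rate(20.0))  # 50.0
```

A 20 ms frame would be held to 30 FPS by traditional vsync on a 60Hz display; here it runs at 50 FPS, at the cost of possible tearing.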

If the idea behind this feature sounds familiar, perhaps that’s because we’ve heard of it before. Last October, in an update to Rage, id Software added a “Smart vsync” option that sounds very familiar:

Some graphics drivers now support a so called “swap-tear” extension.
You can try using this extension by setting VSync to SMART.
If your graphics driver supports this extension and you set VSync
to SMART then RAGE will synchronize to the vertical retrace of
your monitor when your computer is able to maintain 60 frames per
second and the screen may tear if your frame rate drops below 60 frames
per second. In other words the SMART VSync option trades a sudden drop
to 30 frames per second with occasional screen tearing. Occasional
screen tearing is usually considered less distracting than a more
severe drop in frame rate

So yeah, nothing new under the sun.

In the past, games have avoided the vsync quantization penalty via buffering, or building up a queue of one or two frames, so a new one is generally available when it comes time for a display refresh. Triple buffering has traditionally been the accepted standard for ensuring smoothness. When we asked Nvidia about the merits of triple buffering versus adaptive vsync, we got a response from one of their software engineers, who offered a nice, detailed explanation of the contrasts, including the differences between triple buffering in OpenGL and DirectX:

There are two definitions for triple buffering. One applies to OGL and the other to DX. Adaptive v-sync provides benefits in terms of power savings and smoothness relative to both.

  • Triple buffering solutions require more frame-buffer memory than double buffering, which can be a problem at high resolutions.
  • Triple buffering is an application choice (no driver override in DX) and is not frequently supported.
  • OGL triple buffering: The GPU renders frames as fast as it can (equivalent to v-sync off) and the most recently completed frame is displayed at the next v-sync. This means you get tear-free rendering, but entire frames are effectively dropped (never displayed) so smoothness is severely compromised and the effective time interval between successive displayed frames can vary by a factor of two. Measuring fps in this case will return the v-sync off frame rate which is meaningless when some frames are not displayed (can you be sure they were actually rendered?). To summarize – this implementation combines high power consumption and uneven motion sampling for a poor user experience.
  • DX triple buffering is the same as double buffering but with three back buffers which allows the GPU to render two frames before stalling for display to complete scanout of the oldest frame. The resulting behavior is the same as adaptive vsync (or regular double-buffered v-sync=on) for frame rates above 60Hz, so power and smoothness are ok. It’s a different story when the frame rate drops below 60 though. Below 60Hz this solution will run faster than 30Hz (i.e. better than regular double buffered v-sync=on) because successive frames will display after either 1 or 2 v-blank intervals. This results in better average frame rates, but the samples are uneven and smoothness is compromised.
  • Adaptive vsync is smooth below 60Hz (even samples) and uses less power above 60Hz.
  • Triple buffering adds 50% more latency to the rendering pipeline. This is particularly problematic below 60fps. Adaptive vsync adds no latency.

Clearly, things get complicated in a hurry depending on how the feature is implemented, but generally speaking, triple buffering has several disadvantages compared to adaptive vsync. One, the animation may not be as smooth. Two, there’s substantially more lag between user inputs and screen updates. Three, it uses more video memory. And four, triple buffering can’t be enabled via the graphics driver control panel for DirectX games. Nvidia also contends that adaptive vsync is more power-efficient than the OpenGL version of triple buffering.
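
The uneven frame spacing the engineer describes for DirectX-style triple buffering can be reproduced with a toy timeline. The Python sketch below is a simplified model of our own, not anything Nvidia supplied: it assumes a constant 25 ms render time on a 60Hz display, and it assumes the render-ahead queue never fills when the GPU is slower than the display, so the GPU never stalls.

```python
REFRESH_US = 16667  # one 60Hz refresh interval, in microseconds


def ceil_div(a, b):
    return -(-a // b)


def double_buffered_vblanks(render_us, frames):
    """Refresh index at which each frame appears with plain vsync."""
    shown, start = [], 0
    for _ in range(frames):
        done = start + render_us
        vblank = ceil_div(done, REFRESH_US)
        shown.append(vblank)
        start = vblank * REFRESH_US  # GPU waits for the flip before starting again
    return shown


def render_ahead_vblanks(render_us, frames):
    """Refresh index per frame when the GPU keeps rendering into a spare buffer."""
    shown, last = [], 0
    for k in range(1, frames + 1):
        # Show each frame at the first refresh after it completes,
        # but never two frames on the same refresh.
        vblank = max(last + 1, ceil_div(k * render_us, REFRESH_US))
        shown.append(vblank)
        last = vblank
    return shown


def gaps(vblanks):
    """Refresh intervals between successive displayed frames."""
    return [b - a for a, b in zip(vblanks, vblanks[1:])]


print(gaps(double_buffered_vblanks(25000, 6)))  # [2, 2, 2, 2, 2]
print(gaps(render_ahead_vblanks(25000, 6)))     # [1, 2, 1, 2, 1]
```

In this model, double buffering settles into a steady 30 FPS, while the render-ahead queue averages 40 FPS but alternates between one and two refresh intervals per frame. That alternation is the uneven sampling the engineer mentions.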

    Overall, we’re compelled by these arguments. However, we should note that we’re very much focusing on the most difficult cases, which are far from typical. We’ve somehow managed to enjoy fast action games for many years, both with and without vsync and triple buffering, before adaptive vsync came along.

    Whether or not you’ll notice a problem with traditional vsync depends a lot on the GPU’s current performance and how the camera is moving through the virtual world. Petersen showed a demo at the Kepler press day involving the camera flying over a virtual village in Unigine Heaven where vsync-related stuttering was very obvious. When we were hunting for slowdowns related to vsync quantization while strafing around in a Serious Sam 3 firefight, however, we couldn’t perceive any problems, even though frame rates were bouncing around between 30 and 60 FPS on our 60Hz display.

    With that said, we only noticed very occasional tearing in Sam 3 with adaptive vsync enabled, and performance was generally quite good. We’ve also played quite a bit of Borderlands on the GTX 680 with both FXAA and adaptive vsync, and the experience has consistently been excellent. If you own a video card capable of adaptive vsync, I’d recommend enabling it for everyday gaming. I’m convinced it is the best current compromise.

    We should note, though, that this simple optimization isn’t really a solution to some of the more vexing problems with ensuring consistent frame delivery and fluid animation in games. Adaptive vsync doesn’t really address the issues with uneven frame dispatch that lead to micro-stuttering in multi-GPU solutions, as we explored here, nor is it as smart about display synchronization as the HyperFormance feature in Lucid’s upcoming Virtu MVP software. We’ll have more to say about Kepler and micro-stuttering in the coming weeks, so stay tuned.

    Because adaptive vsync is a software feature, it doesn’t have to be confined to Kepler hardware. In fact, Nvidia plans to release a driver update shortly that will grant this capability to some older GeForce products, as well.

    Some antialiasing improvements

    Nvidia has revealed a couple of new wrinkles in its antialiasing capabilities to go along with Kepler’s introduction. The first of those relates to FXAA, the post-process antialiasing technique that has been adopted by a host of new games in recent months.

    FXAA is nifty because it costs very little in terms of performance and works well with current game engines; many of those engines use techniques like deferred shading and HDR lighting that don’t mix well with traditional, hardware-based multisampling. At the same time, FXAA is limited because of how it works—by looking at the final, rendered scene, identifying jaggies using an edge-detection algorithm, and smoothing them out. Such post-process filters have no access to information from the sub-pixel level and thus do a poor job of resolving finely detailed geometry. They can also cause the silhouettes of objects to shift and crawl a bit from one frame to the next, because they’re simply guessing about the objects’ underlying shapes. (You’ll see this problem in the complex outlines of a gun in a first-person shooter.) Still, this brand of antialiasing is a darn sight better than nothing, which is what PC gamers have had in way too many games in recent years.

    During the Kepler press event, Nvidia claimed “FXAA was invented at Nvidia,” which is technically true, but is kind of like claiming the producers of Keeping up with the Kardashians invented pimping. They didn’t invent it; they just perfected it. I believe Alexander Reshetov at Intel Labs was the first to describe (PDF) this sort of post-process reconstruction filter. AMD later implemented MLAA in its Catalyst drivers when it introduced the Radeon HD 6800 series. Nvidia did the world the great service of developing and releasing its own variant of MLAA, which looks good and is very fast, in a form that’s easy for developers to integrate into their games. FXAA has since become very widely adopted.

    FXAA softens all edges effectively in Skyrim.

    The big news with Kepler is that Nvidia is finally integrating an FXAA option into its control panel, so end users can force on FXAA in games that wouldn’t otherwise support it. The disadvantage of using the control panel to invoke this feature is that FXAA’s edge smoothing will be applied to all objects on the screen, including menus and HUDs. Games that natively support FXAA avoid applying the filter to those elements, because it can cause some blurring and distortion. The upside, obviously, is that games with poor antialiasing support can have FXAA added.

    I tried forcing on FXAA in one of my old favorites, Borderlands, which has always desperately needed AA help. To my surprise, the results were uniformly excellent. Not only were object edges smooth, but the artifacts I’ve noticed with AMD’s MLAA weren’t present. The text in menus didn’t have strangely curved lines and over-rounded corners, and black power lines against a bright sky weren’t surrounded by strange, dark halos. FXAA is superior to MLAA in both of these respects.

    Post-process filters like this one may be a bit of a cheat, but the same can be said for much of real-time graphics. That’s fine so long as the illusion works, and FXAA can work astonishingly well.

    Building on the success of FXAA, Nvidia has another antialiasing method in the works, dubbed TXAA, that it hopes will address the same problems with fewer drawbacks. TXAA isn’t just a post-process filter; instead, it combines low levels of multisampling with a couple of other tricks.

    The first of those tricks is a custom resolve filter that borrows subpixel samples from neighboring pixels. We’ve seen custom filters of this sort before, such as the tent filters AMD introduced way back in its Radeon HD 2000-series cards. Routines that reach beyond the pixel boundary can provide more accurate edge smoothing and do an excellent job of resolving fine geometric detail, just as AMD’s CFAA tent filters did. However, they have the downside of slightly blurring the entire screen. I kind of liked AMD’s narrow tent filter back in the day, but most folks reacted strongly against the smearing of text and UI elements. Sadly, AMD wound up removing the tent filter option from its drivers for newer Radeons, though it’s still exposed in the latest Catalysts when using a Radeon HD 5870. TXAA is distinct from AMD’s old method because it’s not a tent filter; rather than reducing the weight of samples linearly as you move away from the pixel center, TXAA may potentially use a bicubic weighting function. That should, in theory, reduce the amount of blurring, depending on how the weighting is applied.
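
To illustrate the difference between linear and cubic weighting, here’s a short Python sketch. Nvidia hasn’t settled on TXAA’s actual filter, so the cubic below is a stand-in: it’s the well-known Mitchell-Netravali family with B and C both set to 1/3, chosen by us purely as an example. The `resolve` helper and the sample layout are likewise hypothetical.

```python
def tent_weight(distance, radius=1.0):
    """Linear falloff: full weight at the pixel center, zero at the radius."""
    return max(0.0, 1.0 - abs(distance) / radius)


def cubic_weight(distance, b=1.0 / 3.0, c=1.0 / 3.0):
    """Mitchell-Netravali cubic; the B and C defaults are an arbitrary choice."""
    x = abs(distance)
    if x < 1.0:
        return ((12 - 9 * b - 6 * c) * x**3
                + (-18 + 12 * b + 6 * c) * x**2
                + (6 - 2 * b)) / 6.0
    if x < 2.0:
        return ((-b - 6 * c) * x**3
                + (6 * b + 30 * c) * x**2
                + (-12 * b - 48 * c) * x
                + (8 * b + 24 * c)) / 6.0
    return 0.0


def resolve(samples, weight_fn):
    """Weighted average of (distance_from_pixel_center, value) samples.

    Assumes at least one sample has a nonzero weight.
    """
    total = sum(weight_fn(d) for d, _ in samples)
    return sum(weight_fn(d) * v for d, v in samples) / total


# One sample at the pixel center, one borrowed from half a pixel away.
samples = [(0.0, 1.0), (0.5, 0.0)]
print(resolve(samples, tent_weight))
print(resolve(samples, cubic_weight))
```

How much blurring a given filter produces depends on how quickly its weights fall off for samples borrowed from neighboring pixels, which is exactly the part Nvidia says is still undecided.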

    TXAA has two basic quality levels, one based on 2X multisampling and the other based on 4X MSAA, both with custom resolve. The key to making TXAA effective will be ensuring that the resolve filter is compatible with deferred shading and HDR lighting, so it will be able to touch every edge in a scene, not just a subset, as too often happens with multisampling these days. Because this resolve step will happen in software, it will not take advantage of the MSAA resolve hardware in the GPU’s ROP units. Shader FLOPS, not ROP throughput, will determine TXAA performance.

    The other trick TXAA employs will be familiar to (former?) owners of Radeon X800 cards. The “T” in TXAA stands for “temporal,” as in the temporal variation of sample patterns. The idea is to increase the effective sample size for antialiasing by switching between different subpixel sample patterns from frame to frame. Back in the day, we found ATI’s implementation of this feature to create a “fizzy” effect around object edges, but that was long enough ago that the world was literally using a different display technology (CRTs). LCDs may tolerate this effect more graciously. Also, we expect Nvidia may use subtler variance in its sample patterns in order to avoid artifacts. How this method will combine with a kernel filter that reaches beyond the pixel boundary is a very interesting question.

    TXAA edge smoothing comparison. Source: Nvidia.
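
The temporal trick is easier to picture with a toy example. In the Python sketch below, two 2X sample patterns alternate from frame to frame; the sample positions are invented for illustration and are not the patterns Nvidia or ATI actually use.

```python
# Hypothetical 2X sample patterns, as (x, y) offsets inside a unit pixel.
PATTERN_EVEN = ((0.25, 0.25), (0.75, 0.75))
PATTERN_ODD = ((0.75, 0.25), (0.25, 0.75))


def pattern_for_frame(frame_index):
    """Alternate between the two patterns on successive frames."""
    return PATTERN_EVEN if frame_index % 2 == 0 else PATTERN_ODD


def positions_covered(first_frame, frame_count):
    """Distinct sample positions touched over a run of consecutive frames."""
    covered = set()
    for i in range(first_frame, first_frame + frame_count):
        covered.update(pattern_for_frame(i))
    return covered


print(len(positions_covered(0, 1)))  # 2
print(len(positions_covered(0, 2)))  # 4
```

Any single frame still has only two samples per pixel, but over two frames the eye is shown four distinct positions. Whether that reads as smoother edges or as fizz depends on the display and on how far apart the patterns are.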

    Nvidia claims the first TXAA mode will exceed the quality of 8X multisampling with the same performance overhead as 2X MSAA. We suspect that, with the right tuning, the combination of techniques Nvidia intends to use in TXAA could work quite nicely—that the whole may seem greater than the sum of the parts. Then again, TXAA may fail to overcome its drawbacks, like inherent blurring, and wind up being unpopular. Will TXAA end up being regarded as Quincunx 2.0, or will it be something better? It’s hard to say.

    The trouble is, TXAA looks to be a moving target. For one thing, all of the information we’ve relayed above came with the caveat that TXAA will be highly tweakable by the game developer. Different implementations may or may not use temporal jittering, for instance. For another, TXAA is still very much a work in progress, and the folks at Nvidia who we interviewed admitted some of the specifics about its operation have yet to be decided, including the exact weighting method of the resolve filter.

    Nvidia did have one live demo of TXAA running at its Kepler press event, but we don’t have a copy of the program, so we can’t really evaluate it. TXAA will initially be distributed like FXAA, as something game developers can choose to incorporate into their games. The feature may eventually find its way into a control panel option in Nvidia’s graphics drivers, but I wouldn’t hold my breath while waiting for it. We probably won’t get to see TXAA in action until games that use it arrive. The first title may be the upcoming MMO Secret World, which is slated to ship in June. After that, Borderlands 2 should hit this September with TXAA support. Nvidia tells us both Crytek and Epic are working with TXAA, as well.

    Since it is entirely a software algorithm, TXAA could in theory work on Fermi-based GPUs, as well as Kepler. However, Nvidia’s story on whether it will enable TXAA on Fermi-derived products seems to be shifting. We got mixed signals on this front during the Kepler press event, and the latest information we have is that TXAA will not be coming to Fermi cards. We think this sort of artificial limitation of the technology would be a shame, though, so here’s hoping Nvidia has a change of heart. One thing we know for sure: unlike FXAA, TXAA won’t work with Radeons.

    Bindless textures

    One of Kepler’s new capabilities is something Nvidia calls “bindless textures,” an unfamiliar name, perhaps, for a familiar feature. Ever since John Carmack started talking about “megatexturing,” the world of real-time graphics has been on notice that the ability to stream in and manage large amounts of texture data will be an important capability in the future. AMD added hardware support for streaming textures in its Southern Islands series of GPUs, and now Nvidia has added it in Kepler. Architect John Danskin said this facility will lead to a “dramatic increase” in the number of textures available at runtime, and he showed off a sample scene that might benefit from the feature, a ray-traced rendition of a room whose ceiling is covered entirely with murals.

    Unfortunately, neither AMD’s nor Nvidia’s brand of streaming textures is supported in DirectX 11.1, and we’re not aware of any roadmaps from Microsoft for the next generation of the API. For the time being, Kepler’s bindless texture capability will be exposed to programmers via an OpenGL extension.

    Hardware video encoding

    For a while now, video encoding has been hailed as an example of a compute-intensive task that can be accelerated via GPU computing. However, even highly parallel GPU computing can’t match the efficiency of dedicated hardware—and these days, nearly everything from smartphone SoCs to Sandy Bridge processors comes with an H.264 video encoding block built in. Now, you can add GK104 to that list.

    Nvidia says the GK104’s NVENC encoder can compress 1080p video into the H.264 format at four to eight times the speed of real-time playback. The encoding hardware supports the same H.264 profile levels as the Blu-ray standard, including the MVC extension for stereoscopic 3D video. Nvidia cites power efficiency as NVENC’s primary advantage over GPU-compute-based encoders.

    AMD claims to have added similarly capable video encoding hardware to its 7000-series Radeons, but we have yet to see a program that can take advantage of it. The length of the delay, given the fact that the 7970 has been out for roughly four months now, makes us worry there could be some sort of hardware bug holding things up. Fortunately, Nvidia was able to supply a beta version of CyberLink’s MediaEspresso with NVENC support right out of the chute.

    Although we were able to try out NVENC and test its performance, I’d caution you that the results we’re about to share are somewhat provisional. Many of these hardware encoders are essentially black boxes. They sometimes offer some knobs and dials to tweak, but making them all behave the same as one another, or the same as a pure-software routine, is nearly impossible—especially in MediaEspresso, which offers limited control over the encoding parameters. As a result, the quality levels involved here will vary. Heck, we couldn’t even get Intel’s QuickSync to produce an output file of a similar size.

    We started with a 30-minute sitcom recorded in 720p MPEG2 format on a home theater PC and compressed it into H.264/AAC format at 1280×720. Our target bitrate for NVENC and the MediaEspresso software encoder was 10 Mbps. We weren’t able to raise the bitrate for Intel’s QuickSync beyond 6 Mbps, so it was clearly doing a different class of work. We’ve still reported the results below; make of them what you will.

    While encoding with NVENC, our GTX 680-equipped test system’s power consumption hovered around 126W. When using the software encoder running on the Core i7-3820, the same system’s power draw was about 159W, so the GK104’s hardware encoder has obvious benefits for power efficiency.

    For what it’s worth, all three of the output files contained video that looked very nice. We couldn’t easily detect any major difference between them with a quick check using the naked eye; none had any obvious visual artifacts. A more rigorous quality and performance comparison might turn up substantial differences, but we think most end users will likely find NVENC and QuickSync acceptable for their own video conversions. We may have to revisit this ground in a more detailed fashion in the future.

    Better display support

    For several generations, AMD has had a clear advantage over Nvidia in the number and variety of displays and output standards its GPUs have supported. The Kepler generation looks to close that gap with a host of notable additions, including (at last!) the ability to drive more than two displays simultaneously.

    The default output config on the GeForce GTX 680 consists of a pair of DVI ports (both dual-link), an HDMI port, and a full-sized DisplayPort connector. The HDMI output is compliant with version 1.4a of the HDMI standard, and the DisplayPort output conforms to version 1.2 of that spec. The GTX 680 can drive 4K-class monitors at resolutions up to 3840×2160 at 60Hz.

    Impressively, connecting monitors to all four of the card’s default outputs simultaneously shouldn’t be a problem. The GTX 680 can drive four monitors at once, three of which can be grouped into a single large surface and used as part of a Surround Gaming setup. That’s right: SLI is no longer required for triple-display gaming. Nvidia recommends using the fourth monitor as an “accessory display” for email, chatting, and so on. I recommend just buying three displays and using alt-tab, but whatever.

    The GeForce camp is looking to catch up to Eyefinity on the software front, as well, with a host of new driver features in the pipeline. Most notably, it will be possible to configure three displays as a single large surface for the Windows desktop while keeping the taskbar confined to one monitor. Also, users will be able to define custom resolutions. Last but perhaps niftiest, Surround Gaming is getting a new twist. When playing a game with bezel compensation enabled, the user will be able to punch a hotkey in order to get a peek “behind” the bezels, into space that’s normally obscured. If it works as advertised, we’d like that feature very much for games that aren’t terribly smart about menu placement and such.

    In fact, writing about these things makes us curious about how Eyefinity and Surround Gaming stack up with the latest generation of games. Hmmmmmm…

    So when can I buy one?

    Since the GTX 680’s introduction, we’ve noted that its availability at online retailers in North America has been spotty at best. We asked Nvidia for some insight into the situation, and they offered a couple of nuggets.

    First, although we don’t have exact numbers, we’re told that online retailers are receiving shipments of GTX 680 boards every day. The key to finding one, it seems, is signing up to receive notifications at Newegg or the like whenever cards come back in stock. Although they sell out quickly, there is purportedly a decent number of cards trickling into the market each day.

    Second, an upcoming change to how GTX 680 cards are manufactured may provide some relief. Right now, as is often the case with new graphics card launches, Nvidia is having all GTX 680 cards manufactured for it on contract. The firm is then distributing those cards to partners like EVGA and MSI for resale. Very soon, in “early April,” it will be enabling its board partners to begin manufacturing their own renditions of the GTX 680. We’ll likely see cards with custom board designs and non-reference coolers hitting the market as a result of this change. Once this conversion happens, the flow of new cards should increase, helping to meet some of the pent-up demand. Nvidia admits that fully satisfying the demand for the GTX 680 may still prove to be a challenge, but it expects the availability picture to improve at least somewhat with the change in manufacturing arrangements.

    Dude, you should totally follow me on Twitter.

    Comments closed
      • hechacker1
      • 10 years ago

      The Spyder software comes with a LUT loader that can be set to load every minute, which works, but is a kludge workaround.

      I agree I wish Windows, or AMD/Nvidia would just let us lock it already.

      • CBHvi7t
      • 10 years ago

      “frame-rate targeting” _ Jey! finally!
      A few games had that feature and that is where it belongs, but it was time that they have it in the driver for the games that fail to have it and the hardware support to react on it. Of course one could fix the clock-rate, based on in game testing, before starting the game.

      • hechacker1
      • 10 years ago

      Guys, I love your benchmarks, but something is bugging me.

      We’re still benching and considering a 60FPS limit as the optimum experience, and the 99th percentile graphs reflect this.

      With the 690, and CF7970, I think it’s time to show those graphs against the 120FPS limit. It would really change the way the data looks, and it would inform us 120FPS users a lot about the gameplay experience.

      As it stands now, even on old games like TF2 (and not even the highest setting) along with 1080p, it cannot maintain 120FPS! It gets close, but the stuttering induced is really interesting. I’d have to lower quality settings, or set a cap on fps.


      And thanks for the podcasts.

      • sigher
      • 10 years ago

      I am wondering about something, I hear the yield of the 28nm parts is abysmal, so I’m wondering if that will have an effect on the lifetime of the GPU if you do an overclock compared to the older generation, I mean if the traces are so small and the number of chips that even come off the wafer is already so small then the ones that work can’t be that robust either in all cases can they?
      And as I recall even intel had an issue with one small part of their chip being sensitive to the small traces over time.

      • Deanjo
      • 10 years ago

      IIRC, Leadtek has actually been doing the Nvidia reference board designs for quite a few years (dating back to the GF 3/4 days)

      • Forge
      • 10 years ago

      You must not have been paying attention for the last 5+ years. Nvidia has been making the FIRST RUN PCBs *ONLY* for a very long time. I can be certain back to GF4 days, and I think it’s been this way since TNT2, IE forever. Card comes out. Nvidia ships finished cards to partners for sticker branding. Partners have custom boards half-ready to almost-ready at that point. Second run of cards, Nvidia sells GPU/RAM bundles only, few to no PCBs. It optimizes Nvidia’s profits and gets products out on announcement day.

      ATI used to do things this way, with Sapphire being ATI’s semi-internal reference card maker (Built By ATI, BBA). Later ATI spun Sapphire out fully to being just another OEM, and stopped releasing BBA cards to retail. I think that was due to pressure from the OEMs, who didn’t want to compete with an “official” card.

      • marraco
      • 11 years ago

      That’s official inflation: a lie

      http://www.shadowstats.com/

      And also, that's average inflation. Not all products vary the same.

      5% average inflation on 4 years=22%
      6% average inflation on 4 years=26%
      7% average inflation on 4 years=30%

      • stupido
      • 11 years ago

      No worries… AMD is “culprit on duty”… 😀 as consumer are…

      • Krogoth
      • 11 years ago

      The problem is yields. The evidence is mounting that TSMC’s 28nm process is becoming a repeat of 40nm, where the earlier chips had yield problems. Nvidia and AMD are combating this problem by setting their prices high to avoid diminishing their supplies. They know that the cards will sell well because there are still enough buyers who are willing to buy their cards at their current prices. (Supply/Demand).

      GK104 isn’t exactly a small chip. It is almost as large as Tahiti itself and quite a bit larger than Pitcairn (Tahiti’s mid-range).

      • Meadows
      • 11 years ago

      You’d need 50% inflation over the course of the last 2 years for that theory to work, which is *madness*.

      • Bensam123
      • 11 years ago

      You can’t get an HDMI to VGA unless your video card specifically supports converting to it. HDMI/Displayport/DVI are all digital signals and you can adapt from one to another.

      The DVI port has a built in analog signal in the jack. Thats the four pins around the — You can’t convert directly from one of the above digital formats to analog without an active converter, which are around $30-40. You can find an adapter for things like HDMI to VGA, but your video card specfically has to specify it, it’s not standard in any of the above jack formats besides DVI.

      • clone
      • 11 years ago

      ok so to be clear, AMD releases an at the time clearly superior product and prices it $50 under it’s direct competition…. everyone complains that it’s overpriced and AMD is being unfairly greedy…. because they had the fastest most power efficient and cheapest product on the market at the time.

      this is bad and it’s all AMD’s fault…. for bringing out better for less while costing less…. somehow this makes AMD greedy.

      then Nvidia gets it’s product out onto the market and it’s a little better in almost every way while also being cheaper than AMD’s card but ppl still want to complain about price because they ….. want to blame.

      so here we are the fastest cards on the planet are now officially $100 less than the previous generation and that’s “AMD’s fault”?

      because Nvidia has something better in the wings due out possibly before Christmas?

      it’s freaking the beginning of April and ppl are complaining that the fastest cards on the planet are $100 less than they were last year while being also faster and more power efficient because prices will probably drop 8 months from now?

      and this somehow is AMD’s fault…. progress is AMD’s fault, lower prices are AMD’s “fault”… Nvidia is the …. “victim in all of this?”

      the consumer is the “victim” in all of this?

      oh yeah and did I mention that it’s all “AMD’s fault”… for now having the fastest cards being faster, more power efficient, $100 less costly and offering better feature support.

      talk about a tough crowd, some silly used the words inflation regarding faster cards that cost less and are more power efficient… lol.

      • clone
      • 11 years ago

      Nvidia passing off manufacturing to board partners and allowing them to customize the PCBs is a return to the old days of inconsistent quality….. it was Nvidia that brought it all back in house back in the late 1990s which arguably helped substantially in putting them on top and now they are sending it back out into the wild?

      a sad day in video cards the moment I read the news…. now companies like ECS will get the work and screw up their own BIOSes and offer inconsistent fixes.

      do your research when buying and make certain to get one made by Nvidia, I’d be leery of anything else.

      • sweatshopking
      • 11 years ago

      what? inflation has been slightly high since 2008, but not astronomical. Quit being a crazy man.

      • marraco
      • 11 years ago

      Yes, it’s called inflation. That’s the price to pay for quantitative easing, and printing money to save the rich.

      • marraco
      • 11 years ago

      I enabled FXAA on a 8800GT, so it should be easy to do on Fermi cards.

      Just install a Kepler driver on a 7xxx or later geforce, (you need to tweak a text file), and enjoy adaptive Vsync and FXAA.

      • Parallax
      • 11 years ago

      Sort of. Whenever Win7 or Vista start a full-screen program using directx (be it a game or aero popup window) one of the first things directx does is reset the Look-Up-Table to default (i.e. none). Of course directx and windows usually fail to restore the LUT afterward. All we need is a way to force the calibrated LUT in the driver, much like Anti-Aliasing and other settings can be.

      There are some 3rd party utilities that try to restore the calibrated LUT on resolution changes and such, but often don’t work properly when using alt-tab and don’t work at all with other games. A few displays have hardware LUTs built into them, but are all very high-end and expensive.

      • fredsnotdead
      • 11 years ago

      Is this related to the problem where Win7 loses the color profile when the popup window (forget what it’s called) asks for permission to run some software?

      • Airmantharp
      • 11 years ago

      Something that has me interested in these cards is the apparent inclusion of three TMDS links, which would be one more than AMD has been putting into their cards.

      This is obviously necessary for two DVI and one HDMI port to be used concurrently, and it would mitigate the problem that many have been seeing when using AMD cards to power three displays, but having to use DisplayPort for only one of them, and getting cross-screen tearing as a result of the different synchronization timings between the outputs.

      With this support along with the adaptive vertical sync and improved FXAA-like TXAA support, these cards seem to be stacking up very well, and the 4GB versions may make a very lucrative option for high-resolution and multi-display gaming.

      • Kurotetsu
      • 11 years ago

      In addition, DisplayPort->VGA and HDMI->VGA adapters are pretty cheap on Monoprice. So you can get support for 3 VGA displays if you absolutely need that (for some bizarre reason).

      • lycium
      • 11 years ago

      Knowing and citing that Alex Reshetov developed MLAA, and your perfect writing style, are what make you the best in this business, Scott 🙂

      • Meadows
      • 11 years ago

      “why the hell am I getting tearing at FPS levels below my refresh rate!?”

      Because *that's how it works*. The better question is: why do you get tearing in CoD?

      • Forge
      • 11 years ago

      Ahh, senility. That TXAA writeup linked to the Quincunx page, which reminded me of “Quadcox”. I sincerely hope that one day at Nvidia HQ, some engineering guy, snickering, walked by some PR guy’s office and said “Hey, you hear? Quadcox is a hit!”.

      That PR guy is probably cursing me to this day. Either that, or it ended up being nothing, and only some TR folks remember.

      Quadcox 2.0! WTFAA!

      • Litzner
      • 11 years ago

      I have read your recommendation of using adaptive VSync over the other options in most situations, so I decided to give it another try, as I had tried it when I first got my GTX 680, and experienced terrible tearing.

      So far I have tried it in two games, the first of which is CoD:MW3, a game that I am ALWAYS above 60fps in, and I experienced a LOT of tearing any time I turned left or right, it was small, and not nearly as bad as not running Vsync at all, but it was everywhere. So that was a bust to solve tearing issues in that game.

      I then tried it in RIFT, a game where I am always below 60 fps in heavily populated areas, due to a lot of rendering being processed by the CPU (Gamebryo engine…), and not very well optimized for full GPU usage. At 35-45fps in that game, I would get the same small tearing everywhere when turning left or right with adaptive VSync forced, which blew my mind, why the hell am I getting tearing at FPS levels below my refresh rate!? I turn adaptive VSync off in the same area and tearing goes away.

      So far for me, adaptive Vsync has not been working very well at all.

      • Fighterpilot
      • 11 years ago

      These new 680s certainly have some elegant engineering in them.
      Very nice article Mr Wasson.
      They sure do look like a great card,bravo Nvidia.

      • smilingcrow
      • 11 years ago

      Thanks for the explanation.
      I’d still have been interested to see the power data for the i7-2600K system at idle and whilst using Quick Sync as that could be compared with the same data for the nVidia system to see how much power the encoding processes consumed.
      Not a like for like comparison but good enough really.

      • UberGerbil
      • 11 years ago

      “From the pinout on the DVI connectors for the card it looks as though it only supports one VGA monitor, which is sad even though they're touting compatability as feature of their card. Yes, digital is great and all, but there are people that still own monitors/tvs with VGA inputs on them.”

      I noticed that too. And you're right, lots of people still have one -- I have one myself (a 15" LCD so old VGA input is all it has that I use for tertiary things like email or media playlists). Lots of people have one. But more than one? Yes, it's a slight flaw the other DVI is -D instead of -I, but it's *very* slight. I'm actually surprised there's a second DVI at all, rather than HDMI.

      • Bensam123
      • 11 years ago

      I’d really like to see how adaptive v-sync compares to Lucid’s solution… This really intrigues me. As a gamer I’ve always kept my v-sync off for just the reasons you specified, just the same as I never turn on triple buffering. Yet, there are some games that reallllly need it.

      “Impressively, connecting monitors to all four of the card’s default outputs simultaneously shouldn’t be a problem. The GTX 680 can drive four monitors at once…”

      Which is good. I’ve owned Radeons for ages and what really got me with my latest one was when I tried to enable three displays and found out you can only have two active displays at once unless you buy an active DisplayPort adapter, which is quite a bit more expensive than just a DVI>HDMI or Displayport>DVI adapter.

      From the pinout on the DVI connectors for the card it looks as though it only supports one VGA monitor, which is sad even though they’re touting compatability as feature of their card. Yes, digital is great and all, but there are people that still own monitors/tvs with VGA inputs on them.

      On bindless textures you originally described it as a form of megatextures, yet I was under the impression that a megatexture is just one giant texture spread out over an entire area (like the ground)… not the streaming capabilities of textures…?

      • Damage
      • 11 years ago

      Since our Core i7-3820-based test system lacked an IGP, we couldn’t test QuickSync on the same system. We tested QuickSync on a Core i7-2600K-based system without a discrete graphics card. Given the differences, we didn’t see the point in recording power draw numbers while testing.

      We may do a fuller comparison later, all on the same system, but not today.

      • smilingcrow
      • 11 years ago

      “While encoding with NVEnc our test system’s power consumption hovered around 126W. When using the software encoder the same system’s power draw was about 159W.”

      So what was the power consumption when you used Quick Sync? Strange that you omitted that info!

      • Kurkotain
      • 11 years ago

      The twitter bit at the end cracked me up… 😀

      • HisDivineOrder
      • 11 years ago

      Thank AMD. I mean, if you believe the arguments from before that it was nVidia’s fault for not showing up with competition and that’s why AMD was overpriced, then I guess we can now extend that argument to say it is now AMD’s fault for overpricing/failing to release a high end card, allowing nVidia to release their mid-Kepler as their high-Kepler and sit on the real high end until such time as AMD decides to compete.

      Competition is great… when the competition shows up.

      • UberGerbil
      • 11 years ago

      Well said.

      • Meadows
      • 11 years ago

      As asked elsewhere before (https://techreport.com/discussions.x/22653?post=624754), is the latency of GPU Boost a variable, or a constant? Does it get faster at higher framerates, or does it analyse time intervals rather than the last X frames? If the videocard shuts down from bad overclocking, and Windows resets the driver, does the feature persist, or does it need a reboot to recover?

      • Chrispy_
      • 11 years ago

      What worries me is that the pricing of this “mid-Kepler” is $500, and the sweet-spot is usually at $199.

      Traditionally when nvidia cuts a chip in half (as the GF106 and 116 are effectively half a GF104 and 114) we lose about half the performance but the cards aren’t half the price.

      Based on nothing but approximate costs of the preceding two generations, the GK106 will be $299 and the GK108 will be $199 which means nvidia will be bumping up the price of the sweet spot by $100 and offering an overly watered-down product at the oh-so-important $199.

      Maybe I’m trying to extrapolate where it’s not applicable to this generation of chips. There’s no evidence yet to support that the GK106 will actually be half a GK104.

      • Stargazer
      • 11 years ago

      I find it hard to form any definite opinion since I don’t tend to personally buy more than one card using the same type of GPU, and review sites don’t tend to provide enough information to make any meaningful conclusions.

      Yes, reviews typically include noise/temperature measurements, but these tend to be for *open air* systems. This is great if you yourself happen to use an open air system, but most of us use *cases* for our computers. The problem is that some coolers are designed to blow hot air out through the back of the case (high-end reference cards tend to do this), while others are not. There are many benefits to blowing the hot air out (less heat inside the case means that other components are not heated up as much (leading to potentially increased fan noises for them), and the case fans do not have to work as hard to keep down the heat of the system), but these benefits are meaningless in the open air test systems. Instead, the trade-offs that are made when designing these heat-exhausting coolers means that it is often easier to get good open-air performance from non-exhausting coolers.

      This means that the noise/temperature data that is achieved in open air test systems are not necessarily representative of the noise/temperature situation we will get once we put our graphics cards inside our cases. Unfortunately, it tends to be hard to find data about what happens in a "real" computer system.

      I fully understand why review sites prefer using open air systems for noise/temperature investigations. Not only is it much simpler to use open air systems, but if you do decide to use a case, there are also tons of different variables that can affect the outcome (what kind of case you're using, what other components are being used, air-flow in your system, ...). However, even if each review site only used one specific case setup, you would still be able to form a decent impression by looking at the results from several different sites. At the very least, it would be interesting to see sporadic tests using different types of coolers *inside cases* to see how the different cooling solutions compare there (<hint>it might be nice to do this once 3rd party coolers for the 680 come out</hint>).

      Unfortunately, the prevalence of open air test systems could also act to encourage 3rd party card providers to use non-exhausting coolers, even if that is not necessarily always the most optimal solution. Since 3rd party cards tend to mainly differentiate themselves based on their cooling solutions (and sometimes the amount of factory overclock), it is in their best interests to perform well in this regard. With the majority of reviews employing open air test systems, this encourages them to optimize for this scenario, even if you sometimes could get better results in an in-case system with some other solution. It's... unfortunate.

      • Voldenuit
      • 11 years ago

      Any chance of seeing screen capture comparisons of video encoding output using different code paths? Anandtech has been doing that on its video encoding comparisons for a while.

      • kamikaziechameleon
      • 11 years ago

      I’m a huge fan of good 3rd party coolers, i love the MSI frozr coolers. Its worth waiting just to get my hands on a card packing one of those.

      • Damage
      • 11 years ago

      Hmm. I must have gotten bad info somehow there. I’ve updated the article. Thanks.

      • ImSpartacus
      • 11 years ago

      Don’t even remind us of the 2900XT…

      • Parallax
      • 11 years ago

      All these new features and Nvidia still can’t handle color profiles correctly. All we need is for the driver to store the LUT like it’s supposed to and NOT change it. Not when changing resolutions, not when rebooting, not when switching from windowed to fullscreen, etc….

      Using an AMD card currently and still need to use a 3rd party utility ($30) to hold the profile. At least this works most of the time because AMD was smart enough not to lock access to the LUT like Nvidia.

      Think setting up eyefinity or surround is hard? Try successfully holding profiles on 2 monitors, because it’s a nightmare at best.

      Somebody please write an article and send it to both video camps. I’ll even help you write it (I’m serious!).

      • ImSpartacus
      • 11 years ago

      Yes, it’s good that TR can differentiate itself with in-depth looks at notable products.

      • Dysthymia
      • 11 years ago

      Really appreciate all the in-depth info. ( :
      Can’t wait until I can afford some new hardware.

      • trek205
      • 11 years ago

      “In fact, Nvidia plans to release a driver update shortly that will grant this capability to GeForce GTX 500-series products, as well. Wouldn’t you love to be at the meeting where they decided to give this feature to owners of GTX 560s but not owners of GTX 480s? The reasoning must be fascinating.”

      um Nvidia said right there on their site that with a future driver update that adaptive vsync feature will apply to 8000 series and above.

      “As with Control Panel FXAA, Adaptive VSync will be rolled out to all GeForce 8-series and later users in the near future.”

      http://www.geforce.com/whats-new/articles/nvidia-geforce-gtx-680-R300-drivers-released/

      EDIT: sorry that was not actually meant as a reply to Jon.

      • Jon
      • 11 years ago

      FINALLY, I can’t wait to try out 2D surround again. I complained about the current state of Nvidia 2D surround in a forum post: https://techreport.com/forums/viewtopic.php?f=3&t=80788

      Now....prices need to drop just a little. Let's hope this isn't a dream.

      • Deanjo
      • 11 years ago

      “The length of the delay, given the fact that the 7970 has been out for roughly four months now, makes us worry there could be some sort of hardware bug holding things up.”

      I'm always leery about video capability claims when it comes to Radeons. They have claimed so much over the years and later got called out on them such as hardware video encoding and HDCP compliance in the X1xxx series which later proved false, UVD in the 2900's, then there was XvBA in linux which was supposed to be out in 2008 and took 3 years for them to finally release something that could be used.

      Oh ya, even though you couldn't see much of a difference a comparison shot of the encodes would have been nice.

      • MadManOriginal
      • 11 years ago

      I’m glad you posted this expansion as a separate article, because I likely would have missed it if it had been inserted into the regular review article.

      • Johnny5
      • 11 years ago

      *removes cover*
      It’s an AMD, baby!

      • Yeats
      • 11 years ago

      You’re the best, hold me.
