My experience with testing the 6600 GT in SLI has led me to issue one big correction to my original take on SLI. Initially, I reported that using SLI was fraught with lockups and general system stability problems. That was true for me at first, but as I tested the 6600 GTs in SLI, I didn't run into any such problems. In fact, the 6600 GT SLI rig was a model of stability. That was a bit disconcerting, because I didn't change BIOS revisions, video drivers, or anything else of note since testing the 6800 Ultra cards. With the exact same configuration, the Asus 6600 GT cards in SLI ran flawlessly.
This experience led me to go back to the 6800 Ultra cards and try them again in SLI. To my surprise, they were stable the second time around, as well. In fact, I set them up to run a demo loop for hours, just to make sure, and they didn't crash once. This experience gave me a new respect for the basic stability of SLI systems. I'm not sure what the problem was the first time around, but it seems to have resolved itself without any major BIOS, driver, or hardware changes. I'm cautiously optimistic that Asus and NVIDIA have created a reasonably stable platform here.
However, that doesn't mean that SLI is without its drawbacks. This technology does involve some substantial compromises, especially at this early stage in its existence. First and foremost in any discussion of using 6600 GT cards in SLI is the issue of memory use. Although SLI doubles up on rendering power by using a pair of cards together, it does not double the effective amount of available graphics memory. Each graphics card in an SLI config must store its own texture, geometry, and frame buffer data independently. Also, the primary card connected to the video output must store frame buffer data coming in from the second card over the SLI connector. There's no doubt some video memory overhead associated with coordination between the two cards, too. As a result, a pair of 128MB graphics cards in an SLI rig will act, effectively, as a traditional graphics solution with 128MB of memory, or perhaps slightly less.
This is a notable drawback, because the extra rendering power of a second card would likely be most helpful at high resolutions with antialiasing and texture filtering cranked up, which are precisely the same situations where a 128MB graphics card often runs out of steam. You'll see what I'm talking about when we get to the test results.
SLI also suffers from some compatibility problems at this stage of the game. NVIDIA has published a list of applications that scale well with SLI technology, and as of today, that list is only 21 items long. Of those 21 apps, four are benchmarks and one (Unreal Engine 3) is a technology demo for a game in development. That leaves sixteen games that currently scale well with SLI, according to NVIDIA. There are some big names on that list, including the three major game releases we've chosen for testing, but the vast majority of games aren't on it.
The reality is that SLI doesn't "just work" out of the box with any game or application that one might wish to run on it. In fact, SLI runs in either of two modes, and neither mode is compatible with all existing applications. The first of those two modes is known as alternate frame rendering, or AFR for short. AFR distributes the graphics load between two GPUs by having each GPU render every other frame in a scene, so that GPU 0 would render frames 1, 3, 5, 7, and 9, while GPU 1 would render frames 2, 4, 6, 8, and 10. By buffering a couple of frames ahead of what's being output to screen, AFR can distribute the graphics load quite evenly. Simple and clean.
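To make the frame assignment concrete, here is a minimal Python sketch of the round-robin split described above. The function name `afr_gpu_for_frame` and the `schedule` dictionary are hypothetical, chosen only for illustration; this models the scheduling idea, not NVIDIA's driver code.

```python
def afr_gpu_for_frame(frame_number, num_gpus=2):
    """Return the GPU index that renders a 1-based frame number.

    Assumes plain round-robin assignment, as in the description of AFR:
    odd frames go to GPU 0 and even frames go to GPU 1.
    """
    return (frame_number - 1) % num_gpus


# Build the schedule for the first ten frames.
schedule = {gpu: [] for gpu in range(2)}
for frame in range(1, 11):
    schedule[afr_gpu_for_frame(frame)].append(frame)

for gpu, frames in schedule.items():
    print(f"GPU {gpu} renders frames {frames}")
```

Run as written, this lists frames 1, 3, 5, 7, and 9 for GPU 0 and frames 2, 4, 6, 8, and 10 for GPU 1, matching the example in the text.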
NVIDIA says AFR mode is generally the best way to handle SLI load balancing, because it tends to scale best. AFR allows for real geometry scaling, not just pixel shading and fill rate scaling. Unfortunately, AFR isn't as broadly compatible with applications as one would hope. NVIDIA cites several possible snags with AFR compatibility and scaling. Rendering to a texture presents problems, because the system may have to transfer the rendered texture to the other GPU. Also, blurring effects like the one in Painkiller's "demon mode" cause problems, because they need to push data forward from one frame to the next.
SLI's other, more compatible mode is known as split-frame rendering or SFR. SFR subdivides the screen and renders part of each frame on each GPU. With what NVIDIA refers to as "balanced loading" active, SFR dynamically subdivides the screen in different proportions in response to graphics load.
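NVIDIA hasn't detailed how balanced loading picks its split, so the following Python sketch is only one plausible way such a scheme could work, not the actual algorithm. It assumes the screen is divided by a single horizontal line, that each GPU's render time for the previous frame is known, and that the cost is roughly uniform within each region. The function `rebalance_split` and its `gain`, `lo`, and `hi` parameters are all hypothetical.

```python
def rebalance_split(split, time_gpu0, time_gpu1, gain=0.5, lo=0.1, hi=0.9):
    """Nudge the split line toward equal render times on both GPUs.

    split     -- fraction of the screen height rendered by GPU 0 (0 < split < 1)
    time_gpu0 -- time GPU 0 spent on its region last frame
    time_gpu1 -- time GPU 1 spent on its region last frame
    gain      -- how far to move toward the estimated balance point (assumed)
    lo, hi    -- clamp so neither GPU's region shrinks to nothing (assumed)
    """
    if time_gpu0 <= 0 or time_gpu1 <= 0:
        return split  # no usable measurement; leave the split alone

    # Estimated cost per unit of screen height in each region.
    cost0 = time_gpu0 / split
    cost1 = time_gpu1 / (1.0 - split)

    # Split at which cost0 * s == cost1 * (1 - s).
    target = cost1 / (cost0 + cost1)

    new_split = split + gain * (target - split)
    return min(hi, max(lo, new_split))


# Toy scene: the lower part of the screen is three times as costly per line.
split = 0.5
for _ in range(20):
    time_gpu0 = 1.0 * split          # GPU 0 renders the cheap upper region
    time_gpu1 = 3.0 * (1.0 - split)  # GPU 1 renders the costly lower region
    split = rebalance_split(split, time_gpu0, time_gpu1)

print(f"GPU 0 share of the screen settles near {split:.3f}")
```

In this toy case the split line drifts downward until GPU 0 handles about three quarters of the screen, which is the point where both GPUs take the same time per frame.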
Even with balanced loading, SFR doesn't typically scale as well as AFR. AFR better distributes the vertex processing load and is more likely to achieve a near-2X performance speedup. However, SFR is more broadly compatible than AFR.
In order to enforce the proper and compatible SLI mode for various games, NVIDIA has elected to expose SLI through the game profiles function of its video drivers. Profiles store one's preferred settings for various games, and in the case of SLI, NVIDIA has pre-programmed the appropriate mode for various games (AFR, SFR, or single-GPU) into the driver.
These settings are the result of NVIDIA's own extensive QA testing, and users can modify these settings only by creating a separate profile for the game. Even then, users may only turn SLI on or off. Manually specifying AFR or SFR mode will require some hacking around.
If an application doesn't yet have a profile, it will generally default to single-GPU rendering, although NVIDIA allows one to choose "multi-GPU rendering" as a global default in the driver. In this case, apps without profiles appear to default to SFR mode.
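The selection behavior described above can be summarized in a short Python sketch. This is an illustration of the decision logic as I understand it, not NVIDIA's driver code: the `SliMode` enum, the `select_mode` function, and the profile entries are all hypothetical, and the SFR fallback reflects only what the driver appears to do for apps without profiles.

```python
from enum import Enum


class SliMode(Enum):
    SINGLE_GPU = "single-GPU"
    AFR = "AFR"
    SFR = "SFR"


# Hypothetical profile table; the executable names are placeholders.
PROFILES = {
    "example_game_a.exe": SliMode.AFR,
    "example_game_b.exe": SliMode.SFR,
    "example_game_c.exe": SliMode.SINGLE_GPU,
}


def select_mode(executable, profiles, multi_gpu_default=False):
    """Pick a rendering mode for an application.

    A profile always wins. Without one, the app runs on a single GPU
    unless the global "multi-GPU rendering" default is enabled, in which
    case it is assumed to fall back to SFR.
    """
    if executable in profiles:
        return profiles[executable]
    return SliMode.SFR if multi_gpu_default else SliMode.SINGLE_GPU


print(select_mode("example_game_a.exe", PROFILES).value)
print(select_mode("unlisted_game.exe", PROFILES).value)
print(select_mode("unlisted_game.exe", PROFILES, multi_gpu_default=True).value)
```

The three calls cover a profiled game, an unprofiled game with default settings, and an unprofiled game with the global multi-GPU default switched on.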
One of the more intriguing questions about SLI is what actually happens over the card-to-card SLI link. NVIDIA says the card-to-card Scalable Link Interface, or SLI, does not use the HyperTransport protocol, as some have speculated. Instead, it's a digital interconnect of NVIDIA's own making that offers over a gigabyte per second of bandwidth. NVIDIA uses this link to transmit synchronization, display, and pixel data between the cards. Generally, PCI Express handles game data, such as textures and geometry, as it does in single-card setups. Although PCI Express x16 might seem to have enough bandwidth to handle all of the data necessary for SLI's operation, NVIDIA cites internal bandwidth limitations in some chipsets as part of the impetus behind the SLI connection. (NVIDIA has said that Intel's current PCI-E chipsets offer only about 3GB/s of bandwidth to the video card and about 1GB/s from it.)