Nvidia’s GeForce GTX 280 graphics processor

If the GPU world were a wildlife special on the National Geographic channel, the G80 processor that powers GeForce 8800 GTX graphics cards would be a stunningly successful apex predator. In the nearly two years that have passed since its introduction, no other single-chip graphics solution has surpassed it. Newer GPUs have come close, shrinking similar capabilities into smaller, cooler chips, but that’s about it. The G80 is still the biggest, baddest beast of its kind—a chip, as we said at the time, with “the approximate surface area of Rosie O’Donnell.” After it dispatched its would-be rival, the Radeon HD 2900 XT, in an epic mismatch, AMD gave up on building high-end GPUs altogether, preferring instead to go the multi-GPU route. Meanwhile, the G80 has sired a whole range of successful offspring, from teeny little mobile chips to dual-chip monstrosities like the GeForce 9800 GX2.

Of course, even the strongest predator has a limited time as king of the pride, and the G80’s reign is coming to a close. Today, its true heir arrives on the scene in the form of the GT200 graphics processor powering the GeForce GTX 200-series graphics cards. Despite being built on a smaller chip fabrication process, the GT200 is even larger than the G80, and it packs nearly twice the processing power of its progenitor.

This new contender isn’t content with just ruling the same territory, either. Nvidia has ambitious plans to expand the GPU’s processing domain beyond real-time graphics and gaming, and as the GPU computing picture becomes clearer, those plans seem increasingly viable. Join us as we dive in for a look at this formidable new processor.

The GT200 GPU: an overview

The first thing to be said about the GT200 is that it’s not a major departure from Nvidia’s current stable of G80-derived GPUs. Instead, it’s very much a refinement of that architecture, with a multitude of tweaks throughout intended to improve throughput, efficiency, and the like. The GT200 adds a handful of new capabilities at the edges, but its core graphics functionality is very similar to current GeForce 8- and 9-series products.

As any graphics expert will tell you, determining what’s changed involves the study of Chiclets, of course. Nvidia has laid out the Chiclets in various flavors and patterns in order to convey the internal organization of GT200. Behold:

A logical block diagram of the GT200 GPU. Source: Nvidia.

Shiny, but with a chewy center!

Arranged in this way, the Chiclets have much to tell us. The 10 large groups across the upper portion of the diagram are what Nvidia calls thread processing clusters, or TPCs. TPCs are familiar from G80, which has eight of them onboard. The little green boxes inside of the TPCs are the chip’s basic processing cores, known in Nvidia’s parlance as stream processors or SPs. The SPs are arranged in groups of eight, as you can see, and these groups have earned their own name and acronym, for the trifecta: they’re called SMs, or streaming multiprocessors.

Now, let’s combine the power of all three terms. 10 TPCs multiplied by three SMs times eight SPs works out to a total of 240 processing cores on the GT200. That’s an awful lot of green Chiclets and nearly twice the G80’s 128 SPs, a substantial increase in processing potential—not to mention chewy, minty flavor.

One of the key changes in the organization of the GT200 is the increase from two to three SMs inside of each thread processing cluster. The TPCs still house the chip’s texture addressing and filtering hardware (brown Chiclets), but the ratio of SPs to texturing units has increased by half, from 2:1 to 3:1. We’ve seen a growing bias toward shader power versus texturing over time, and this is another step in that direction. Even with the change, though, Nvidia remains more conservative on this front than AMD.

The lower part of the diagram reveals a corresponding rise in pixel-pushing power with the increase in ROP (raster operator) partitions from six on the G80 (GeForce 8800 GTX) and four on the G92 (GeForce 9800 GTX) to eight on the GT200. Since each ROP partition can output four pixels at a time, the GT200 can output 32 pixels per clock. And since each ROP partition also hosts a 64-bit memory controller, the GT200’s path to memory is an aggregated 512 bits wide.

In short, the GT200 has a whole lot of pretty much everything.

One thing it lacks, however, is support for DirectX 10.1. Some folks had expected Nvidia to follow AMD down this path, since AMD introduced DX10.1 support in its Radeon HD 3000 series last fall. DX10.1 introduces extensions that expose greater control over the GPU’s antialiasing capabilities, among other things. Nvidia says its GPUs can handle some DX10.1 capabilities, but not all of them. That prevents it from claiming DX10.1 support, since Microsoft considers it an all-or-nothing affair. Curiously, though, Nvidia says it is working with game developers to support a subset of DX10.1 extensions, even though Microsoft may not be entirely pleased with the prospect. I believe that work includes addressing problems with antialiasing and game engines that use deferred shading, one of the places where DX10.1 promises to have a big performance impact. Curiouser and curiouser: Nvidia is cagey about exactly which DX10.1 capabilities its GPUs can and cannot support, for whatever reason.

The chip: Large and in charge

When I say the GT200 has a whole lot of everything, that naturally includes transistors: roughly 1.4 billion of them, more than double the 681 million transistors in the G80. Ever faithful to its dictum to avoid the risk of transitioning to a new fab process with a substantially new design, Nvidia has stuck with a 65nm manufacturing technology, and in fact, it says the GT200 is the largest chip TSMC has ever fabricated.

Mounted on a board and covered with a protective metal cap, the GT200 looks like so:

Holy Moses.

When I asked Nvidia’s Tony Tamasi about the GT200’s die size, he wouldn’t get too specific, preferring only to peg it between 500 and 600 mm². Given that, I think the reports that GT200’s die is 576 mm² are credible. Whatever the case, this chip is big—like “after the first wafer was fabbed, the tide came in two minutes late” big. It’s the Kim Kardashian’s butt of the GPU world.

To give you some additional perspective on its size, here’s a to-scale comparison Nvidia provided between a GT200 GPU and an Intel “Penryn” 45nm dual-core CPU.

Source: Nvidia.

Such a large chip can’t be inexpensive to manufacture, since defect rates tend to rise exponentially with chip area. Nvidia almost seems to revel in having such a big chip, though, and it does have experience in this realm. It certainly seems as if Nvidia’s last large chip, the G80, worked out pretty well. Perhaps they’re not crazy to do this.

If you’re curious about what’s where on the die, have a look at the helpfully colored diagram below.

The GT200’s basic layout. Source: Nvidia.

Tamasi noted that the shader cores look very regular and recognizable because they are made up of custom logic, like on a CPU, rather than being the product of automated logic synthesis. He also pointed out that, if you count ’em, the GT200 has exactly the number of on-chip shader structures you’d expect, with 10 TPCs easily visible and no extras thrown in to help increase yields. Of course, nothing precludes Nvidia from selling a GT200-based product with fewer than 10 TPCs enabled, either.

You may be wondering, with a chip this large, about power consumption—as in: Will the lights flicker when I fire up Call of Duty 4? The chip’s max thermal design power, or TDP, is 236W, which is considerable. However, Nvidia claims idle power draw for the GT200 of only 25W, down from 64W in the G80. They even say GT200’s idle power draw is similar to AMD’s righteously frugal RV670 GPU. We shall see about that, but how did they accomplish such a thing? GeForce GPUs have many clock domains, as evidenced by the fact that the GPU core and shader clock speeds diverge. Tamasi said Nvidia implemented dynamic power and frequency scaling throughout the chip, with multiple units able to scale independently. He characterized G80 as an “on or off” affair, whereas GT200’s power use scales more linearly with demand. Even in a 3D game or application, he hinted, the GT200 might use much less power than its TDP maximum. Much like a CPU, GT200 has multiple power states with algorithmic determination of the proper state, and those P-states include a new, presumably relatively low-power state for video decoding and playback. Also, GT200-based cards will be compatible with Nvidia’s HybridPower scheme, so they can be deactivated entirely in favor of a chipset-based GPU when they’re not needed.

As you may have noticed in the photograph above, the GT200 brings back an innovation from the G80 that we hadn’t really expected to see again: a separate, companion display chip. This chip is similar in function to the one on the G80 but is a new chip with additional capabilities, including support for 10-bit-per-color-channel scan out. GT200 cards will feature a pair of dual-link DVI outputs with HDCP over both links (for high-res HD movie playback), and Nvidia claims they will support HDMI via a DVI-to-HDMI adapter, although our sample of a production card from XFX didn’t include such an adapter. GT200 boards can also support DisplayPort, but they’ll require a custom card design from the vendor, since Nvidia’s reference design doesn’t include a DisplayPort, er, display port. (Seriously, WTH?)

Incidentally, if you’re going to be playing HD movies back over one of those fancy connections, you’ll be pleased to learn that Nvidia has extended the PureVideo logic in the GT200 to handle decoding of files encoded with the VC-1 and WMV9 codecs, as well as H.264.

The cards: GeForce GTX 280 and 260

The GT200 GPU will initially ship in two different models of video cards from a variety of Nvidia partners. The board you see below is an example of the big daddy, the GeForce GTX 280, as it will ship from XFX. This is the full-throttle implementation of the GT200, with all 240 SPs and eight ROP partitions active. The GTX 280’s core clock speed will be 602MHz, with SPs clocked at 1296MHz. It comes with a full gigabyte of GDDR3 memory running at 1107MHz, for an effective 2214MT/s.

Obviously, this board has a dual-slot cooler, and it’s covered in a full complement of body armor composed half of metal (mostly around back) and half of plastic (you figure it out). We first saw this all-encompassing shroud treatment in the GeForce 9800 GX2. I suppose it’s possible this provision could actually reduce return rates on the cards simply by protecting them from rough handling or—heh—would-be volt-modders. I worry about the shroud trapping in heat, but I noticed when tearing one apart that the metal plate on the back side of the card apparently acts as a heat radiator for the memory chips mounted back there.

From end to end, the GTX 280 card is 10.5″ long and, as I’ve mentioned, its TDP is 236W. To keep this puppy fed, you’ll need a PSU with one eight-pin PCIe aux power connector and one six-pin one.

As with the 9800 GX2, the GTX 280’s SLI connectors are covered by a rubber cap, to keep the “black monolith” design theme going. Popping it off reveals dual connectors, threatening the prospect of three-way GTX 280 SLI. Heck, the only really exposed bit of the GTX 280 is the PCIe x16 connector, which is (of course) PCIe 2.0 compliant.

Oddly enough, the GTX 280 is slated to be available not today, but tomorrow, June 17th. You will be expected to pay roughly $649 for the privilege of owning one, which is, you know, a lot. If it’s any consolation, the XFX version of the GTX 280 ships with a copy of Assassin’s Creed, which is stellar as far as bundled games go and a nice showcase for the GTX 280.

The GeForce GTX 260 appears to use the same basic board design and cooler as the GTX 280, but it gets by with two six-pin aux power connectors. This card uses a somewhat stripped-down version of the GTX 280, with two thread processing clusters and one ROP partition disabled. As a result, the GTX 260 has 192 stream processors, a 448-bit path to memory, and reduced texturing and pixel-pushing power compared to the GTX 280. Clock rates are 576MHz for the core, 1242MHz SPs, and 999MHz memory. The deletion of one memory interface also brings another quirk: the GTX 260’s total memory size is 896MB, which is kinda weird but probably harmless.

Initially, Nvidia told us to expect GTX 260 cards to sell for $449, but last week, they revised the price down to $399. Could they be anticipating potent competition from the Radeon camp, or are they just feeling generous? Who knows, but we’ll take the lower price. GeForce GTX 260 cards aren’t slated for availability until June 26. By then, we expect to see another interesting option in the market, as well.

Here’s a quick picture of a GTX 260 card completely stripped of its shroud and cooler. I had a devil of a time removing that stuff. The GT200 GPU remains enormous.

GPU compute breaks out

Speaking of enormity, Nvidia is certainly talking up the potential of the GT200 and its siblings for applications beyond traditional real-time graphics processing. This isn’t just idle talk, of course; the potential of GPUs for handling certain types of computing problems has been evident for some time now. Quite a few tasks require performing relatively simple transforms on large amounts of similar data, and GPUs are ideal for streaming through such data sets and crunching them. Starting with the G80, Nvidia has built provisions into each of its graphics processors for GPU-compute applications. The firm has also developed CUDA, a C-like programming interface for its DX10-class GPUs that it offers to the world free of charge via a downloadable SDK.
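
To give a flavor of the programming model, here is a minimal sketch of my own, not code from Nvidia’s SDK: a CUDA program is mostly ordinary C, with a kernel function that runs across thousands of GPU threads and a host-side launch that says how many.

```
// A minimal CUDA sketch (illustrative only, not from Nvidia's SDK): each GPU
// thread scales one element of an array. The kernel is plain C with a few
// extensions such as __global__, blockIdx, and threadIdx.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // one thread per element
    if (i < n)
        data[i] *= factor;
}

int main()
{
    const int n = 1 << 20;
    float *d_data;
    cudaMalloc((void **)&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    // Launch enough 256-thread blocks to cover the whole array.
    scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n);
    cudaDeviceSynchronize();

    cudaFree(d_data);
    printf("done\n");
    return 0;
}
```

The same source compiles for anything from a lowly GeForce 8-series part to a GTX 280; the hardware decides how many of those threads actually run at once.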

The first GPU-enabled applications came from specific industries that confront particularly difficult computing problems that are good candidates for acceleration via parallel processors like GPUs: oil and gas exploration, biomedical imaging and simulation, computational fluid dynamics, and such things. Both Nvidia and ATI (now AMD) have been showing demos of such applications to the press for some time now. Both companies have even re-branded their GPUs as parallel compute engines and sold them in workstation- and server-style configurations. Indeed, surely one of the reasons Nvidia can justify building large GPUs like the G80 and GT200 is the fact that those chips can command high margins inside of Tesla GPU-compute products.

Impressive as they have been, though, such applications haven’t typically had broad appeal. Nvidia hopes the next wave of GPU-compute programs will include more consumer-oriented software, and it’s fond of pointing out that it has shipped over 70 million CUDA-capable GPUs since the introduction of the GeForce 8—a considerable installed base. Meanwhile, the company has quietly been buying up and investing in makers of tools and software with GPU-computing potential. At its press event for the GT200, Nvidia and its partners showed off a number of promising consumer-level products in development that use CUDA and the GPU to deliver new levels of performance. Among them:

  • Adobe showed a work-in-progress version of Photoshop that uses the GPU to accelerate image display and manipulation. The demo consisted of loading up a 442 megapixel image—a 2GB file—and working with it. GPU accelerated zooming allowed the program to flow instantaneously between a full-image view and close examination of a small section of the image, then move back out again. The image could be rotated freely, in real time, as well. Using another tool, the Adobe rep loaded up a 3D model of a motorcycle and was able to paint directly on its surface. He then grabbed a bit of vector art, stamped it on the surface of the model, and it sort of “melted” into place, moving with the bike as the view rotated. Again, the program’s responses were instant and fluid.
  • Nvidia recently purchased a company called Rayscale that has developed a ray-tracing application for the GPU. Their software mixes traditional GPU rasterization via OpenGL with ray-tracing via CUDA to create high-quality images with much better reflections than possible with rasterization alone. At present, the company’s founders said, the software isn’t quite able to render images in real time; one limiter is the speed penalty exacted by performing a context switch from graphics mode to CUDA. Nvidia says it’s working on improving the speed of such switches.
  • A firm called Elemental is preparing several video encoding products that use GPU acceleration, including a plug-in for Adobe Premiere and a stand-alone transcoder called BadaBoom. The company showed a demonstration of a very, very quick video transcode and claimed BadaBoom could convert an MPEG2 file to H.264 at a rate faster than real-time playback—a huge improvement over CPU-based encoding. The final product isn’t due until August, but Nvidia provided us with an early version of BadaBoom as we were in the late stages of putting together this review. We haven’t yet had time to play with it, but we’re hoping to conduct a reasonably good apples-to-apples comparison between video encoding on a multi-core CPU and a GPU, if we can work out the exact quality settings used by BadaBoom. Obviously, H.264 video encoding at the speeds Elemental claims could have tremendous mass-market appeal.
  • Stanford University’s distributed computing guru, Vijay Pande, was on hand to show off a Folding@Home client for Nvidia GPUs—at last! Radeons have had a Folding client for some time now, of course. The GeForce client has the distinction of being developed in CUDA, so it should be compatible with any GeForce 8 or newer Nvidia GPU. Pande said the GeForce GTX 280 can simulate protein folding at a rate of over 400 nanoseconds per day, or over 500 ns/day if the card’s not driving a display. We have a beta copy of the Folding client, and yep, it folds. However, we’re not yet satisfied with the performance testing tools Nvidia supplied alongside it, so we’ll refrain from publishing any numbers of our own just yet. I believe the client itself shouldn’t be too far from public release. (If you haven’t yet, consider joining Team TR and putting that GPU to good use.)
  • Last but certainly not least, Manju Hegde, former CEO of Ageia, offered an update on his team’s progress in porting the physics “solvers” for the PhysX API to the GPU in the wake of Nvidia’s buyout of Ageia. He said they started porting the solvers to CUDA roughly two and a half months ago and had them up and running within a month. Compared to the performance of a Core 2 Quad CPU, Hegde said the GeForce GTX 280 was up to 15X faster simulating fluids, 12X faster with soft bodies, and 13X faster with cloth and fabrics. (I believe that puts the GTX 280’s performance at roughly six to 10 times that of Ageia’s own PhysX hardware, for what it’s worth.) Their goal is to make sure all current hardware-accelerated PhysX content works with the GPU drivers.

    Hegde also pointed out that game developers have become much more open to using hardware physics acceleration in their games since the acquisition, with 12 top-flight titles signing on in the first month, versus two titles in Ageia’s two-and-a-half years in existence. Among the games currently in development that will use PhysX are Natural Motion’s Backbreaker football sim and the sweet-looking Mirror’s Edge.

    One question we don’t know the answer to just yet is how well hardware physics acceleration will coexist with 3D graphics processing, especially on low-end and mid-range GPUs. Hegde showed a striking “Creature from the Deep” demo that employs soft bodies, force fields, and particle debris at the event, but he later revealed that demo used two GPUs, one for graphics and the other for physics. Again, context switching overhead is an issue here. We expect to have an early PhysX driver to play with later this week. We’ll have to see how it performs.


The creature from the deep wriggles like Jello thanks to PhysX

Shader processing


Block diagram of a TPC. Source: Nvidia.

Partially thanks to its push into GPU computing, Nvidia has been much more open about some details of the GT200’s architecture than it has been with prior GPU designs. As a result, we can take a look inside of a thread processing cluster and see a little more clearly how it works. The diagram above shows one TPC. Each TPC has three streaming multiprocessors (SMs), eight texture addressing/filtering units, and an L1 cache. For whatever reason, Nvidia won’t divulge the size of this L1 cache.

Inside of each SM is one instruction unit (IU), eight stream processors (SPs), and a 16K pool of local, shared memory. This local memory can facilitate inter-thread communication in GPU compute applications, but it’s not used that way in graphics, where such communication isn’t necessary.
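
As a rough illustration of what that shared memory is for (my own sketch, kernel only; the host-side setup looks like the earlier example), here is the classic CUDA pattern: threads in a block stage data into __shared__ storage, synchronize, and then cooperate on it, in this case to produce a per-block sum.

```
// Sketch of inter-thread communication through an SM's shared memory pool.
// Launched with 256 threads per block; each block reduces 256 inputs to one
// partial sum using the on-chip __shared__ scratch space.
__global__ void block_sum(const float *in, float *block_totals, int n)
{
    __shared__ float scratch[256];            // lives in the SM's local store

    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    scratch[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();                          // make every thread's load visible

    // Tree reduction: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            scratch[tid] += scratch[tid + stride];
        __syncthreads();
    }

    if (tid == 0)
        block_totals[blockIdx.x] = scratch[0];
}
```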

For a while now, Nvidia has struggled with exactly how to characterize its GPUs’ computing model. At last, the firm seems to have settled on a name: SIMT, for “single instruction, multiple thread.” As with G80, GT200 execution is scalar rather than vector, with each SP processing a single pixel component at a time. The key to performance is keeping all of those execution units fed as much of the time as possible, and threading is the means by which the GT200 accomplishes this goal. All threads in the GT200 are managed in hardware by the IUs, with zero cost for switching between them.

The IU manages things in groups of 32 parallel threads Nvidia calls “warps.” The IU can track up to 32 warps, so each SM can handle up to 1024 threads in flight. Across the GT200’s 30 SMs, that adds up to as many as 30,720 concurrent hardware threads in flight at any given time. (G80 was similar, but peaked at 768 threads per SM for a maximum of 12,288 threads in flight.) The warp is a fundamental unit in the GPU. The chip’s branching granularity is one warp, which equates to 32 pixels or 16 vertices (or, I suppose, 32 compute threads). Since one pixel equals one thread, and since the SPs are scalar, the compiler schedules pixel elements for execution sequentially: red, then green, then blue, and then alpha. Meanwhile, inside of that same SM, seven other pixels are getting the exact same treatment in parallel.
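
To make the branching-granularity point concrete, consider a toy kernel (again, my own sketch rather than anything from Nvidia, and kernel-only):

```
// Threads 0-31 of a block form one warp. If every thread in a warp takes the
// same side of the branch below, that warp runs a single path at full speed;
// if a warp's threads split across it, the hardware serializes both paths for
// that warp. Divergence between different warps costs nothing extra.
__global__ void branchy(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n)
        return;

    if (in[i] > 0.0f)
        out[i] = sqrtf(in[i]);
    else
        out[i] = -in[i];
}
```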

Should the threads in a warp hit a situation where a high-latency operation like a texture read/memory access is required, the IU can simply switch to processing another of the many warps it tracks while waiting for the results to come back. In this way, the GPU hides latency and keeps its SPs occupied.

That is, as I understand it, SIMT in a nutshell, and it’s essentially the model established by the G80. Of course, the GT200 is improved in ways big and small to deliver more processing power more efficiently than the G80.

One of those improvements is relatively high-profile because it affects the GT200’s theoretical peak FLOPS numbers. As you may know, each SP can contribute up to two FLOPS per clock by executing a multiply-add (MAD) instruction. On top of that, each SP has an associated special-function unit that handles things like transcendentals and interpolation. That SFU can also, when not being used otherwise, execute a floating-point multiply instruction, contributing another FLOP per clock to the SP’s output. By issuing a MAD and a MUL together, the SPs can deliver three total FLOPS per clock, and this potential is the basis for Nvidia’s claim of 518 GFLOPS peak for the GeForce 8800 GTX, as well as its estimate of 933 GFLOPS for the GeForce GTX 280.
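
For reference, those peak figures fall straight out of the SP counts and shader clocks, using the 8800 GTX’s 1.35GHz and the GTX 280’s 1296MHz shader speeds (a quick back-of-the-envelope check on my part, not an official derivation):

$$
128~\mathrm{SPs} \times 3~\mathrm{FLOPS/clock} \times 1.35~\mathrm{GHz} \approx 518~\mathrm{GFLOPS}
$$
$$
240~\mathrm{SPs} \times 3~\mathrm{FLOPS/clock} \times 1.296~\mathrm{GHz} \approx 933~\mathrm{GFLOPS}
$$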

Trouble is, that additional MUL wasn’t always accessible on the G80, leading some folks to muse about the mysterious case of the missing MUL. Nvidia won’t quite admit that dual-issue on the G80 was broken, but it says scheduling on the GT200 has been massaged so that it “can now perform near full-speed dual-issue” of a MAD+MUL pair. Tamasi claims the performance impact of dual-issue is measurable, with 3DMark Vantage’s Perlin noise test gaining 16% and the GPU cloth test gaining about 7% when dual-issue is active. That’s a long way from 33%, but it’s better than nothing, I suppose.

Another enhancement in GT200 is the doubling of the size of the register file for each SM. The aim here is, by adding more on-chip storage, to allow more complex shaders to run without overflowing into memory. Nvidia cites improvements of 35% in 3DMark Vantage’s parallax occlusion mapping test, 6% in GPU cloth, 5% in Perlin noise, and 15% overall with Vantage’s Extreme presets due to the larger register file.

Another standout in the laundry list of tweaks to GT200 is a much larger buffer for stream output from geometry shaders. Some developers have attempted to use geometry shaders for tessellation, but the large amount of data they produced caused problems for G80 and its progeny. The GT200’s stream out buffer is six times the size of G80’s, which should help. Nvidia’s own numbers show the Radeon HD 3870 working faster with geometry shaders than the G80; those same measurements put the GT200 above the Radeon HD 3870 X2.

The GT200 as a general compute engine. Source: Nvidia.

The diagram above sets the stage for the final two modifications to the GT200’s processing capabilities. Nvidia likes to show this simplified diagram in order to explain how the GPU works in CUDA compute mode, when most of its graphics-specific logic won’t be used. As you can see, the Chiclets don’t change much, although the ROP hardware is essentially ignored, and what’s left is a great, big parallel compute machine.

One thing such a machine needs for scientific computing and the like is the ability to handle higher precision floating-point datatypes. Such precision isn’t typically necessary in graphics, especially real-time graphics, so it wasn’t a capability of the first DirectX 10-class GPUs. The GT200, however, adds the ability to process IEEE 754R-compliant, 64-bit, double-precision floating-point math. Nvidia has added one double-precision unit in each SM, so GT200 has 30 total. That gives it a peak double-precision computational rate of 78 GFLOPS, well below the GPU’s single-precision peak but still not too shabby.
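
If I have the details right, and each of those units can retire one double-precision multiply-add (two FLOPS) per clock at the 1296MHz shader speed, the math works out neatly:

$$
30~\mathrm{DP\ units} \times 2~\mathrm{FLOPS/clock} \times 1.296~\mathrm{GHz} \approx 78~\mathrm{GFLOPS}
$$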

Another facility added to the GT200 for the CUDA crowd is represented by the extra-wide, light-blue Chiclets in the diagram above: the ability to perform atomic read-modify-write operations into memory, useful for certain types of GPU-compute algorithms.
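
A quick sketch of the kind of thing those atomic units enable (hypothetical code of mine, kernel only, not from Nvidia): a histogram where many threads may increment the same bin at once.

```
// Histogramming with atomic read-modify-write. Many threads may hit the same
// bin simultaneously; atomicAdd lets the memory subsystem arbitrate the
// updates without any explicit synchronization between blocks.
__global__ void histogram(const unsigned char *data, int n, unsigned int *bins)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAdd(&bins[data[i]], 1u);
}
```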

Peak shader arithmetic (GFLOPS)

                        Single-issue    Dual-issue
GeForce 8800 GTX             346            518
GeForce 9800 GTX             432            648
GeForce 9800 GX2             768           1152
GeForce GTX 260              477            715
GeForce GTX 280              622            933
Radeon HD 2900 XT                  475
Radeon HD 3870                     496
Radeon HD 3870 X2                 1056

So how powerful is the GT200’s shader array? With 240 cores operating at 1296MHz, it’s potentially quite formidable. The table above should put things into context.

As you’d expect, the GT200’s peak computational rate will depend on whether and how much it’s able to use its dual-issue capability to get that third FLOP per clock. We can probably expect that the GT200 will reach closer to its dual-issue peak than the G80 does to its own, but I suspect the GT200’s practical peak for graphics processing may be something less than 933 GFLOPS.

Nevertheless, the GeForce GTX 280 looks to be substantially more powerful than any other single-GPU solution, and it’s not far from the two dual-GPU cards we’ve listed, the Radeon HD 3870 X2 and the GeForce 9800 GX2. I should point out, however, that the GTX 280 just missed being able to claim a teraflop. Surely Nvidia intended to reach that mark and somehow fell just short. I believe we’ll see a GT200-based Tesla product with slightly higher shader clocks, so it can make that claim.

The GT200’s shader tweaks pay some nice dividends in 3DMark’s synthetic shader tests, as the GeForce GTX 280 grabs the top spot in each. The parallax occlusion mapping test is where the GT200’s larger register file is reputedly a big help, and both GeForce GTX cards top even the 9800 GX2 there, despite the fact that performance in that test scales well on the multi-GPU cards.

Neither multi-GPU solution scales well in the GPU cloth and particles benchmarks, however, and those cards are left to fend for themselves on the strength of a single GPU. Surprisingly, among the single-GPU options, the GT200 is only incrementally faster than the GeForce 8800 GTX and 9800 GTX in both tests.

The Radeons mount more of a challenge to the GeForces in the Perlin noise benchmark, but once again, the GTX 280 captures the top spot, and the hobbled GT200 in the GTX 260 nearly matches a pair of G92s on the 9800 GX2. Both the larger register file and the improved dual-issue on the GT200 are purported to help out in this test, and those claims are looking pretty plausible.

Texturing, ROP hardware, and memory interface

Ah, the basic math that—outside of shaders—determines so much of a GPU’s character. Let’s have a look at the numbers, and then we’ll talk about why they are the way they are.

                        Peak pixel     Peak bilinear       Peak bilinear FP16     Peak memory
                        fill rate      texel filtering     texel filtering        bandwidth
                        (Gpixels/s)    rate (Gtexels/s)    rate (Gtexels/s)       (GB/s)
GeForce 8800 GTX           13.8             18.4                18.4                  86.4
GeForce 9800 GTX           10.8             43.2                21.6                  70.4
GeForce 9800 GX2           19.2             76.8                38.4                 128.0
GeForce GTX 260            16.1             36.9                18.4                 111.9
GeForce GTX 280            19.3             48.2                24.1                 141.7
Radeon HD 2900 XT          11.9             11.9                11.9                 105.6
Radeon HD 3870             12.4             12.4                12.4                  72.0
Radeon HD 3870 X2          26.4             26.4                26.4                 115.2

Each of the GT200’s thread processing clusters has the ability to address and bilinearly filter eight textures per clock, just like in the G92. That’s up from the G80, whose TPCs were limited to addressing four textures per clock and filtering eight. As in both of those chips, the GT200 filters FP16 texture formats at half the usual rate. Because the new GPU has 10 TPCs, its texturing capacity is up, from 64 texels per clock in G92 to 80 texels per clock in GT200. That’s not a huge gain in texture filtering throughput, but Nvidia expects more efficient scheduling to bring GT200 closer to its theoretical peak than G92.

Meanwhile, the GT200’s ROP partitions runneth over. It has eight of ’em, a third more than the G80’s six and twice the number in the G92. Each of its ROP partitions can output four pixels per clock, which means the GT200 can draw pixels at a rate of 32 per clock cycle. As a result, the single-GPU GeForce GTX 280’s hypothetical peak pixel-pushing power surpasses even the GeForce 9800 GX2’s. Beyond the increase in number, the ROP hardware is largely unchanged, although it can now perform frame-buffer blends in one clock cycle instead of two, so the GT200’s blend rate is 32 samples per clock, versus 12 per clock on the G80.
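
Those per-clock figures map onto the table above via the GTX 280’s 602MHz core clock; a quick sanity check of my own:

$$
80~\mathrm{texels/clock} \times 0.602~\mathrm{GHz} \approx 48.2~\mathrm{Gtexels/s}
\qquad
32~\mathrm{pixels/clock} \times 0.602~\mathrm{GHz} \approx 19.3~\mathrm{Gpixels/s}
$$

Halve the first number for FP16 formats, and you have the 24.1 Gtexels/s in the table.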

To me, the GT200’s healthy complement of ROP partitions is the most welcome development of all because, especially on Nvidia’s GPUs, the ROP hardware plays a big role in antialiasing performance. Lots of ROP capacity means better frame rates with higher levels of antialiasing, which is always a good thing.

Another thing the wealth of ROP partitions provides is an ample path to memory, 512 bits in all. That kind of external bandwidth means the GT200 has to have lots of traces running from the GPU to memory and lots of space on the chip dedicated to I/O pads, and some folks have questioned the wisdom of such things. After all, the last example we have of a GPU with a 512-bit interface is the Radeon HD 2900 XT, and it turned out to be awfully large for the performance it delivered. Nvidia insists the primary limiter of the GT200’s size is its shader cores and says the I/O pad area is roughly in balance with them. Although the GT200 sticks with tried-and-true GDDR3 memory, it’s capable of supporting GDDR4 memory types, as well—not that it may ever be necessary. The GTX 280’s whopping 142 GB/s of bandwidth outdoes anything we’ve seen to date, even the dual-GPU cards.
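
The bandwidth number is similarly straightforward arithmetic (mine, not Nvidia’s): 512 bits is 64 bytes per transfer, and the GDDR3 runs at an effective 2214MT/s.

$$
64~\mathrm{bytes/transfer} \times 2.214~\mathrm{GT/s} \approx 141.7~\mathrm{GB/s}
$$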

Speaking of bandwidth, we’ve found that synthetic tests of pixel fill rate tend to be limited more by memory bandwidth than anything else. That seems to be the case here, since none of the cards reach anything close to a theoretical peak and the top four finish in order of memory bandwidth.

The texturing results prove to be more interesting, in part because the numbers and units don’t correspond to these GPUs’ abilities at all. They’re typically a little more than ten times the theoretical peak. I’ve looked at FutureMark’s whitepaper and even inquired directly with them about what’s going on here, but I haven’t yet received an answer. The results do appear to make sense for what this is: a relative comparison of FP16 texel fill rate.

RightMark’s fill rate test uses integer texture formats, so it’s a little different. Here, the GTX 280’s texel throughput essentially doubles that of the GeForce 8800 GTX. The GT200’s more efficient scheduling does seem to be helping a little bit, as well; the GTX 260 matches the GeForce 9800 GTX, despite having a slightly lower theoretical peak.

Texture filtering quality and performance

The GT200 carries over the same texture filtering algorithms used in the G80 and friends, so there isn’t much to say there. I suggest reading the texture filtering section of my G80 review for more discussion of this subject.

We should, however, pause to consider performance briefly, to see how the GT200’s filtering hardware handles different filtering levels compared to other GPUs. We’ve tested the GTX 280 both at its default settings and with the driver control panel’s “High quality” preset, which disables some sampling and trilinear filtering optimizations.

The GT200’s texture filtering performance scales more or less as expected, although we should note that the GTX 260 starts out roughly equivalent to the 9800 GTX and then drops off slightly as the aniso level increases. The GTX 260’s more efficient scheduling seems to give way to its slightly lower filtering capacity.

Antialiasing

As with texture filtering, so with antialiasing: the GT200’s AA hardware and capabilities are pretty much unchanged. We have tested performance, though, including the proprietary extensions to regular multisampled antialiasing offered by both Nvidia and AMD. The results below show how increasing sample levels impact frame rates. We tested in Half-Life 2 Episode Two at 1920×1200 resolution with the rest of the game’s image quality options at their highest possible settings.

Ok, so let’s get this out of the way. This is our first look at the GTX 280’s performance in an actual game, and wow. Yeah, so it’s fast.

Once you’re over that, you’ll notice that the GT200’s performance as sample counts rise tends to tail off pretty gradually—until we hit 8X multisampling, where it takes a pretty big hit. Interestingly enough, the Radeon HD 3870-based cards don’t lose much at all when going from 4X to 8X multisampling. The GT200’s saving grace, if it needs one, is Nvidia’s coverage sampled AA, which offers higher quality edge smoothing with very little additional overhead. CSAA 16X, in particular, is very nice. Nvidia’s latest GPUs offer this higher quality mode essentially for “free.”

But what, you ask, about AMD’s custom filter AA? Never fear. I have tested it, too, but it’s really tough to present the results in a graph. Instead, I’ve compiled them in a table.

Radeon HD 3870 X2 – Half-Life 2 Episode Two – AA scaling

Base MSAA    Sample              Narrow tent            Wide tent              Edge detect
mode         count     FPS       Samples    FPS         Samples    FPS         Samples    FPS
1X           1         98.0
2X           2         66.2      4          65.5        6          62.7
4X           4         65.0      6          47.5        8          46.2        12         37.7
8X           8         59.1      12         26.9        16         25.5        24         28.1

AMD’s custom filters grab samples from adjacent pixels and factor them in (the tent filters use a weighted average) to increase the effective sample count. This method has the effect of causing some amount of blurring throughout the entire screen, but it does tend to work. AMD’s tent filters can be particularly good at clarifying the details of fine geometry, like the tip of a sword or a power line in the distance.

Unfortunately, when combined with 4X AA or better, these custom filters exact a pretty serious performance penalty—not something we saw with the original R600 back in the day, for what it’s worth. I’ll be curious to see whether this weakness persists with newer R600-derived GPUs with more memory bandwidth and larger frame buffers.

By the way, I have said nice things in the past about the Radeon HD series’ tent filters, but my estimation of CFAA has sunk over time. The blurring effect seems to be more noticeable, and annoying, in some games than in others for whatever reason. Surely one reason is the increase in other sorts of post-processing filters in games generally. Right now, CSAA seems to have all of the advantages: no blurring, little to no performance penalty, and CSAA modes are accessible as an option in many newer games.

And now, we’re off to the races…

Our testing methods

As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and the results were averaged.

Our test systems were configured like so:

Processor: Core 2 Extreme QX9650 3.0GHz
System bus: 1333MHz (333MHz quad-pumped)
Motherboard: Gigabyte GA-X38-DQ6
BIOS revision: F9a
North bridge: X38 MCH
South bridge: ICH9R
Chipset drivers: INF update 8.3.1.1009, Matrix Storage Manager 7.8
Memory size: 4GB (4 DIMMs)
Memory type: 2 x Corsair TWIN2X20488500C5D DDR2 SDRAM at 800MHz
CAS latency (CL): 5
RAS to CAS delay (tRCD): 5
RAS precharge (tRP): 5
Cycle time (tRAS): 18
Command rate: 2T
Audio: Integrated ICH9R/ALC889A with RealTek 6.0.1.5618 drivers
Graphics:
  • Radeon HD 2900 XT 512MB PCIe with Catalyst 8.5 drivers
  • Asus Radeon HD 3870 512MB PCIe with Catalyst 8.5 drivers
  • Radeon HD 3870 X2 1GB PCIe with Catalyst 8.5 drivers
  • MSI GeForce 8800 GTX 768MB PCIe with ForceWare 175.16 drivers
  • XFX GeForce 9800 GTX 512MB PCIe with ForceWare 175.16 drivers
  • XFX GeForce 9800 GX2 1GB PCIe with ForceWare 175.16 drivers
  • GeForce GTX 260 896MB PCIe with ForceWare 177.34 drivers
  • GeForce GTX 280 1GB PCIe with ForceWare 177.26 drivers
Hard drive: WD Caviar SE16 320GB SATA
OS: Windows Vista Ultimate x64 Edition
OS updates: Service Pack 1, DirectX March 2008 update

Thanks to Corsair for providing us with memory for our testing. Their quality, service, and support are easily superior to those of no-name DIMMs.

Our test systems were powered by PC Power & Cooling Silencer 750W power supply units. The Silencer 750W was a runaway Editor’s Choice winner in our epic 11-way power supply roundup, so it seemed like a fitting choice for our test rigs. Thanks to OCZ for providing these units for our use in testing.

Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults. Vertical refresh sync (vsync) was disabled for all tests.

We used the following versions of our test applications:

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

Call of Duty 4: Modern Warfare

We tested Call of Duty 4 by recording a custom demo of a multiplayer gaming session and playing it back using the game’s timedemo capability. Since these are high-end graphics configs we’re testing, we enabled 4X antialiasing and 16X anisotropic filtering and turned up the game’s texture and image quality settings to their limits.

We’ve chosen to test at 1680×1050, 1920×1200, and 2560×1600—resolutions of roughly two, three, and four megapixels—to see how performance scales.

As expected, the GeForce GTX 280 outperforms any other single-GPU solution, cranking out over 50 frames per second at 2560×1600 resolution. However, the dual-GPU cards have a lot of fight in them: the Radeon HD 3870 X2 sticks with the GTX 260 at lower resolutions, and the 9800 GX2 simply trounces the GTX 280. The thing is, the picture changes at 2560×1600, where the GTX 260 pulls decisively ahead of the 3870 X2 and the GTX 280 closes the gap with the 9800 GX2.

Half-Life 2: Episode Two

We used a custom-recorded timedemo for this game, as well. We tested Episode Two with the in-game image quality options cranked, with 4X AA and 16X anisotropic filtering. HDR lighting and motion blur were both enabled.

The GeForce GTX cards look relatively stronger here, with the 280 basically matching the 9800 GX2 at 2560×1600. The GTX 260 keeps some distance between itself and the 3870 X2, too. Obviously, with everything but the single-GPU Radeons churning out nearly 60 frames per second at 1920×1200, most of these cards will handle Episode Two just fine on most displays.

Enemy Territory: Quake Wars

We tested this game with 4X antialiasing and 16X anisotropic filtering enabled, along with “high” settings for all of the game’s quality options except “Shader level,” which was set to “Ultra.” We left the diffuse, bump, and specular texture quality settings at their default levels, though. Shadow and smooth foliage were enabled, but soft particles were disabled. Again, we used a custom timedemo recorded for use in this review.

At last, the GTX 280 pulls ahead of the 9800 GX2 by a hair at the highest resolution. However, the GTX 260 has some trouble fending off the Radeon HD 3870 X2, which runs neck and neck with it.

By the way, the drop-off for the 9800 GTX at 2560×1600 is in earnest. I tested and re-tested it. I suspect the card may be running out of memory, although if that’s the case, I’m not sure why the GX2 isn’t affected, as well.

Crysis

Rather than use a timedemo, I tested Crysis by playing the game and using FRAPS to record frame rates. Because this way of doing things can introduce a lot of variation from one run to the next, I tested each card in five 60-second gameplay sessions.

Also, I’ve chosen a new area for testing Crysis. This time, I’m on a hillside in the recovery level having a firefight with six or seven of the bad guys. As before, I’ve tested at two different settings, with the game’s “High” quality presets and with its “Very high” ones, also.

Sadly, the GTX 280 is no magic bullet for Crysis performance, in case you were looking for one. Still, please note that the median low frame rate for the GTX 280 with the “High” quality settings is 25 FPS. That’s not too bad at all, and for this reason, Crysis feels eminently playable on the GTX 280. Of course, I had to go and pick a hillside with ridiculously long view distances and an insane amount of vegetation and detail for my new testing area, so folks will still say Crysis doesn’t run well. For what it’s worth, FPS averages on the FRAPS readout jump into the 40s if you turn around and face uphill. Not that it matters—avoiding low frame rates is the key to playability, and the GTX does that.

Then again, so does the 9800 GX2.

Assassin’s Creed

There has been some controversy surrounding the PC version of Assassin’s Creed, but I couldn’t resist testing it, in part because it’s such a gorgeous, well-produced game. Also, hey, I was curious to see how the performance picture looks for myself. The originally shipped version of this game can take advantage of the Radeon HD 3870 GPU’s DirectX 10.1 capabilities to get a performance boost with antialiasing, and as you may have heard, Ubisoft chose to remove the DX10.1 path in an update to the game. I chose to test the game without this patch, leaving DX10.1 support intact.

I used our standard FRAPS procedure here, five sessions of 60 seconds each, while free-running across the rooftops in Damascus. All of the game’s quality options were maxed out, and I had to edit a config file manually in order to enable 4X AA at this resolution. Eh, it worked.

Wow, the Radeons just look exceptionally strong here. Even the Radeon HD 2900 XT, which lacks DX10.1 support, comes out ahead of the GeForce 8800 GTX—a rare occurrence. With DX10.1, the Radeon HD 3870 isn’t too far behind the GTX 260, amazingly enough. The new GeForces do post solid gains over the older ones, though, and the SLI-on-a-stick 9800 GX2 doesn’t look so hot.

Race Driver GRID

I tested this absolutely gorgeous-looking game with FRAPS, as well, and in order to keep things simple, I decided to capture frame rates over a single, longer session as I raced around the track. This approach has the advantage of letting me report second-by-second frame-rate results.

The 9800 GX2 is fastest overall, but it wasn’t without its quirks. I had to copy in a new SLI profile file in order to get GRID to use both GPUs. Once that was installed, the GX2 obviously did very well. On a similar note, the Radeon HD 3870 X2 seems to have lacked a profile for this game, since it wasn’t any faster than a single 3870.

One oddity in the numbers is that the GTX 260 seems to be bumping up against a frame rate cap at 60 FPS most of the time. Only once, during a short period, does it reach above 60. I’m not sure what’s going on here. I tested and re-tested, confirmed that vsync was disabled, and the results didn’t change. My best guess is that the GTX 260 might be interacting with some sort of dynamic level-of-detail mechanism in the game engine. Interestingly, the GTX 280 rarely ranges below the 60 FPS level.

3DMark Vantage

And finally, we have 3DMark Vantage’s overall index. I’m pleased to have games that will challenge the performance of a new graphics card today, so we don’t have to rely on an educated guess about possible future usage models like 3DMark. However, I did collect some scores to see how the GPUs would fare, so here they are. Note that I used the “High” presets for the benchmark rather than “Extreme,” which is what everyone else seems to be using. Somehow, I thought frame rates in the fives were low enough.

The GT200’s enhanced processing engine serves it well in 3DMark Vantage. As I’ve mentioned, Nvidia claims the GT200’s larger register file has tangible benefits with Vantage’s complex shaders.

Power consumption

We measured total system power consumption at the wall socket using an Extech power analyzer model 380803. The monitor was plugged into a separate outlet, so its power draw was not part of our measurement. The cards were plugged into a motherboard on an open test bench.

The idle measurements were taken at the Windows Vista desktop with the Aero theme enabled. The cards were tested under load running Half-Life 2 Episode Two at 2560×1600 resolution, using the same settings we did for performance testing.

Well, not bad. The GeForce GTX cards pull less power at idle than the 9800 GTX or the Radeon HD 3870 X2. They’re not quite down to Radeon HD 3870 levels, but this is a much larger chip. When running Episode Two, the GT200 cards’ power draw shoots up by quite a bit, but remains well within reasonable limits.

Noise levels

We measured noise levels on our test systems, sitting on an open test bench, using an Extech model 407727 digital sound level meter. The meter was mounted on a tripod approximately 12″ from the test system at a height even with the top of the video card. We used the OSHA-standard weighting and speed for these measurements.

You can think of these noise level measurements much like our system power consumption tests, because the entire systems’ noise levels were measured, including the stock Intel cooler we used to cool the CPU. Of course, noise levels will vary greatly in the real world along with the acoustic properties of the PC enclosure used, whether the enclosure provides adequate cooling to avoid a card’s highest fan speeds, placement of the enclosure in the room, and a whole range of other variables. These results should give a reasonably good picture of comparative fan noise, though.

I wasn’t able to reliably measure noise levels for most of these systems at idle. Our test systems keep getting quieter with the addition of new power supply units and new motherboards with passive cooling and the like, as do the video cards themselves. Our test rigs at idle are too close to the sensitivity floor for our sound level meter, so I only measured noise levels under load. Even then, I wasn’t able to get a good measurement for the GeForce 8800 GTX; its cooler is just too quiet.

All of Nvidia’s new-look coolers are louder than the incredibly quiet dual-slot cooler on the 8800 GTX. The GTX 260 and 280 both put out a fairly noticeable hissing noise when they’re running games, as our readings suggest. I wouldn’t consider them unacceptable, because they’re nice and quiet at idle. And, as you’ll see, I think there’s a reason the new GPU coolers are louder.

GPU temperatures

Per your requests, I’ve added GPU temperature readings to our results. I captured these using AMD’s Catalyst Control Center and Nvidia’s nTune Monitor, so we’re basically relying on the cards to report their temperatures properly. In the case of multi-GPU configs, I only got one number out of CCC. I used the highest of the numbers from the Nvidia monitoring app. These temperatures were recorded while running the “rthdribl” demo in a window. Windowed apps only seem to use one GPU, so it’s possible the dual-GPU cards could get hotter with both GPUs in action. Hard to get a temperature reading if you can’t see the monitoring app, though.

Looks to me like the 8800 GTX is so much quieter than newer cards because it’s willing to let GPU temperatures climb much higher. 84°C is pretty warm, so I can’t complain too much about the acoustics of the later cards.

Conclusions

What you make of the GeForce GTX 280 may hinge on where you come down on the multi-GPU question. Clearly, the GTX 280 is far and away the new single-GPU performance champ, and Nvidia has done it again by nearly doubling the resources of the G80. Its performance is strongest, relatively speaking, at high resolutions where current solutions suffer most, surely in part because of its true 1GB memory size. And one can’t help but like the legion of tweaks and incremental enhancements Nvidia has made to an already familiar and successful basic GPU architecture, from better tuning of the shader cores to the precipitous reduction in idle power draw.

All other things being equal, I’d rather have a big single-GPU card like the GTX 280 than a dual-chip special like the Radeon HD 3870 X2 or the GeForce 9800 GX2 any day. Multi-GPU setups are fragile, and in some games, their performance simply doesn’t scale very well. Also, Nvidia’s support for multiple monitors in SLI and GX2 solutions is pretty dreadful.

The trouble is, things are pretty decidedly not equal. More often than not, the GeForce 9800 GX2 is faster than the GTX 280, and the GX2 is currently selling for as little as 470 bucks, American money. Compared to that, the GTX 280’s asking price of $649 seems mighty steep. Even the GTX 260 at $399 feels expensive in light of the alternatives—dual GeForce 8800 GTs in SLI, for instance—unless you’re committed to the single-GPU path.

Another problem with cards like the 9800 GX2 is simply that they’ve shown us that there’s more performance to be had in today’s games than what the GTX 260 and 280 can offer. One can’t escape the impression, seeing the benchmark results, that the GT200’s performance could be higher. Yet many of the changes Nvidia has introduced in this new GPU fall decidedly under the rubric of future-proofing. We’re unlikely to see games push the limits of this shader core for some time to come, for example. I went back and looked, and it turns out that when the GeForce 8800 GTX debuted, it was often slower than two GeForce 7900 GTX cards in SLI. No one cared much at the time because the G80 brought with it a whole boatload of new capabilities. One can’t exactly say the same for the GT200, but then again, things like a double-size register file for more complex shaders or faster stream-out for geometry shaders may end up being fairly consequential in the long run. It’s just terribly difficult to judge these things right now, when cheaper multi-GPU alternatives will run today’s games faster.

And then there’s the fact that AMD has committed itself to the multi-GPU path entirely for the high end. I can’t decide whether that legitimizes the approach or makes Nvidia the winner by default. Probably, it’s a little of both, although I dunno how that works. The folks at AMD are already talking big about the performance of R700, their next-generation dual-GPU video card, though. We’ll have to wait and see how these things play out.

Whatever happens there, Nvidia has opened up new selling points for its GPUs with CUDA and the apparent blossoming of a nascent GPU-compute ecosystem. Perks like PhysX acceleration and speed-of-light Photoshop work may make a fast GPU indispensable one day, and if that happens, the GT200 GPU will be ready to take full advantage.

Comments closed
    • Damage
    • 11 years ago

    Note: I’ve just updated the theoretical GPU capacity table in this review to correct the fill rate numbers for the GeForce GTX 260. The revised numbers are slightly lower. The performance results remain unaffected.

    • keldererik
    • 11 years ago

    3870X2 results in Race Driver Grid are wrong.

    If you rename GRID.exe to 3dmark06.exe and set catalyst AI on advanced you see double the FPS.

    Even patch 1.1 didn’t resolve the problem with 3870X2 cards (or even quad gpu).

    I got 140% better performance with 2x 3870X2’s.

      • Damage
      • 11 years ago

      No, the results are correct. This is the performance you’ll see with Cat 8.5 drivers. Renaming the exe is a nice trick, but you have to be sure you have the right name to invoke the right profile. Most folks aren’t going to bother. This is one of the obvious drawbacks of multi-GPU solutions.

      I believe there is a CrossFire profile in a newer driver, though. We’ll look into it for future testing.

      • willyolio
      • 11 years ago

      are you sure image quality didn’t suffer for the sake of FPS?

      • MadManOriginal
      • 11 years ago

      That’s odd…so the 3870×2 can run the game in Crossfire but it requires an /exe rename to trick the drivers in to doing so? Is it just a matter of AMD adding a profile then because otherwise I’d think a program that’s tricked like that would cause issues.

    • moritzgedig
    • 11 years ago

    PR job only. Too few people will afford this product to make it worth it. All it does is give Nvidia the performance crown and push the requirements, important to generate sales.
    The size of the chip is crazy and will generate a high ratio of 260s to 280s.
    I don’t see AMD dead, only PR-wise.
    I like single-chip solutions better (I can be sure I get the most out of it for any game), but the dual-chip-on-a-board way is the better / more cost-efficient approach to the high-end segment.
    In the segment I’m buying from, there will always be a single-chip solution.
    Sure, AMD needs to ship something better than the 3870, but I would say they are only 2 months behind atm, nothing they can’t regain.

    • 2x4
    • 11 years ago

    i wonder if there are any information on how this card handles far cry 2??

      • Meadows
      • 11 years ago

      You don’t need half of this card to handle Far Cry 2.

        • 2x4
        • 11 years ago

        where did you get this info from???

          • Meadows
          • 11 years ago

          A Far Cry 2 preview that was in one of the older shortbreads. The engine won’t be nearly as demanding as CryEngine 2, yet will feature some impressive stuff.

            • 2x4
            • 11 years ago

            thanks. i saw couple of vids on youtube and the game looks pretty amazing.

            • d0g_p00p
            • 11 years ago

            I don’t know I saw it at GDC (SF) and it was the showcase at Intel and Dell’s booth and it was running like crap. I thought it was Crysis when I first saw it because it looked exactly the same but with different weapons and without the biosuit.

            edit: Intel was running on a quad core 3.0Ghz something with a single 8800GTX and Dell had it running on a Q6600 with SLI 8800GTX and both were still pathetic. We will see.

            • Meadows
            • 11 years ago

            Optimization is the last thing developers do. They promised it will bring less load than CryEngine 2, so let’s be hopeful and place some faith in their claims.

            • poulpy
            • 11 years ago

            Say that to the people who have been “hopeful and place(d) some faith” in 3DRealms’ claims and eagerly awaiting DNF for 11 years straight and counting..

            PS: I can’t see anything running slower than CryEngine 2, so I’d expect this one to be faster anyway :p

            • d0g_p00p
            • 11 years ago

            After playing both Far Cry and Crysis and seeing Far Cry 2 live I honestly don’t have much hope. The engines that Crytek designs have always been top notch, however the code seems very sloppy performance wise. I hope I am wrong and you are right. However Crytek has a very poor record and I have not seen anything to change this fact.

            • Steba
            • 11 years ago

            Actually, Crytek has nothing to do with Far Cry 2. It’s using a completely different engine named “Dunia”. Only 2% of the original CryEngine is being used, so we can’t really base our opinions on past performance.

            • SPOOFE
            • 11 years ago

            Just goes to show how people can develop a bias against one thing or another without knowing any details about it.

    • axeman
    • 11 years ago

    Bah. “Card x is now outdated because it supports only DirectX n. Card y now supports DirectX n.1 …” Anyone else tired of the perpetual obsolescence of PC hardware? Maybe I’m just getting to be an old man.

    • WillBach
    • 11 years ago

    Nice review, Scott! The animal metaphors in the beginning were particularly good. The article hit my RSS feed while I was talking to my girlfriend via Skype, and I read her the first three paragraphs while imitating Steve Irwin [1]. She’s not normally interested in desktop graphics [2], but she almost fell out of her chair laughing. Thanks for a good piece of journalism that lightened our day 🙂

    [1] May he rest in peace.

    [2] She’s been spoiled by the Cray where she works; CUDA has yet to win her over.

    “…with ‘the approximate surface area of Rosie O’Donnell.’ Crikey, I wouldn’t want to wrestle her…”

    • SGT Lindy
    • 11 years ago

    Wow, what games can I run on it that really can use it? Only $469 at Newegg, geez, what a bargain.

    Makes that PS3 I want to get look cheap.

      • swaaye
      • 11 years ago

      Sure does: GF7, 512MB RAM, a half-baked software library, and an ambiguously powerful CPU, all for $400, eh? Yeah, I’m ignoring the Blu-ray aspect because I don’t watch movies much anymore and don’t have an HDTV anyway.

      Not that GTX 280 is extreme gaming value at $650, but it does what it’s meant to: be faster and better. I think the new GPU is meant as much for GPGPU and the pro market as it is for games. Flexible, interesting hardware.

        • Grigory
        • 11 years ago

        Yeah, I thought the same thing back in the days: “Why get a 3D graphics card when I can get a complete Atari 2600 console for that! Bah, humbug!”

        • SGT Lindy
        • 11 years ago

        I would suggest that the PS3 is extreme gaming value for the price. Especially compared to this new video card.

    • 2x4
    • 11 years ago

    So the GTX 280 was supposed to be faster than the 9800GX2, and that could justify the price.

    But now, in each benchmark you see the 9800GX2 on top, yet the GTX 280 costs $200 more!!!!

      • TechFiend
      • 11 years ago

      No kidding… I couldn’t believe the 280 came so close on the heels of my overkill investment in the 9800 GX2… but I have to say I love the GX2… I was getting a new rig and said what the heck, esp. w/ the Core 2 Duo E8400 being the value it is. I play Crysis, Assassin’s Creed, Rainbow Six Vegas 2, all w/ highest settings at 1920x1200, all with great frame rates, all in XP. Assassin’s Creed framerates on my system are much higher than in the review, but I’m attributing that to Vista in the review…

    • slash3
    • 11 years ago

    Maybe this was a typo in the system setup, but were different driver versions really used for the GTX 260 and GTX 280 cards (177.34 and 177.26, respectively)? If so, why?

      • 2x4
      • 11 years ago

      It could be a typo, because there is only one driver for the 200 series on Nvidia’s web site.

        • Meadows
        • 11 years ago

        It is neither, as the only drivers nVidia offers are labeled 177.35. I've also tried modding the .inf file, but the driver will still refuse to install on other cards. I hope they release an all-encompassing driver pack.

    • ReAp3r-G
    • 11 years ago

    what is that chip near the I/O shield? is that a separate I/O chip as seen on the early iterations of the G80 cards? might have missed a description of it in the article…anyone care to point out what that chip might be? thanks! =)

    • henfactor
    • 11 years ago

    Scott, is there an SLI review on the way? How ’bout 3-way SLI?

    (please say yes, please say yes 🙂)

      • SS4
      • 11 years ago

      Yes, I’m really looking for a 3-way SLI review. I’ve been browsing tons of hardware review websites and I only found normal SLI tests….

        • Forge
        • 11 years ago

        If you’re willing to drop the better part of $2000 on GPUs alone, as well as find a PSU with 3x 8-pin AND 3x 6-pin connectors, then you can have an exclusive!

        For that kind of money and electricity, it better pull off Crysis Very High 60+ FPS at 2560*1600, yet I somehow doubt it will.

        I mean, for the love of Gord; that’s nearly 750W in GPUs alone!!

          • MrJP
          • 11 years ago

          Anandtech found that a 1000W supply wasn’t enough for a 2-card GTX 280 SLI setup and had to add a second PSU:
          http://www.anandtech.com/video/showdoc.aspx?i=3334&p=19

          Interestingly, the 280 looks relatively worse for loaded power consumption in their tests than it did in TR's: 24W more than a 9800GX2, 21W more than a 3870X2. I wonder if that's something to do with the platform or benchmark used for the loaded testing, or an indication of variance between individual cards?

            • Meadows
            • 11 years ago

            I believe the difference simply comes from a more loaded system.

            • MrJP
            • 11 years ago

            Perhaps, but for most of the cards, the TR system had greater power draw – the GTX 280 was the only one that came in lower than Anand’s test. This changes the ranking between the cards fairly significantly: the 9800GX2 ends up being 27W worse than the GTX 280 in TR's test, compared with 24W better for Anand.

            TR:
            1. HD3870 (212W, -87W to GTX 280)
            2. 9800GTX (247W, -52W to GTX280)
            3. GTX 280 (299W)
            4. 3870X2 (318W, +19W to GTX 280)
            5. 9800GX2 (326W, +27W to GTX 280)

            Anandtech:
            1. HD3870 (212W, -101W to GTX280)
            2. 9800GTX (229W, -84W to GTX 280)
            3. 9800GX2 (289W, -24W to GTX 280)
            4. 3870X2 (293W, -20W to GTX280)
            5. GTX 280 (313W)

            TR vs AT:
            1. HD3870: 0W
            2. 9800GTX: +18W
            3. 9800GX2: +37W
            4. 3870X2: +25W
            5. GTX 280: -14W

            I’m not saying anything’s wrong here (and if asked to choose, I know whose numbers I’d put more faith in, especially since AT didn’t bother to mention how they measured it), but it highlights that you have to treat power numbers from a single benchmark as indicative only.
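
            For reference, a small Python sketch that reproduces the deltas above from the quoted wattage figures (no new measurements, just the arithmetic):

                # Loaded power draw in watts, as listed above for TR and Anandtech.
                tr = {"HD3870": 212, "9800GTX": 247, "9800GX2": 326, "3870X2": 318, "GTX 280": 299}
                at = {"HD3870": 212, "9800GTX": 229, "9800GX2": 289, "3870X2": 293, "GTX 280": 313}

                for card in tr:
                    print(card,
                          "| TR vs GTX 280:", tr[card] - tr["GTX 280"],
                          "| AT vs GTX 280:", at[card] - at["GTX 280"],
                          "| TR vs AT:", tr[card] - at[card])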

        • Fighterpilot
        • 11 years ago
          • Meadows
          • 11 years ago

          Some results are CPU-bound, with the rest being showcases of early optimization attempts.

    • Kaleid
    • 11 years ago

    I think these cards need the 2900XT-to-HD3870 treatment.

    Such as:
    1. A die shrink (I know 55nm is being worked on, but is it enough?)
    2. Ditch the expensive 512-bit GDDR3 bus and go for a more efficient 256-bit GDDR5 one (rough bandwidth math below)
    3. Card size reduction, power reduction, and a cooler card
    4. Which all in all hopefully leads to a hefty, much-needed price reduction

    Nvidia’s new cards are way OTT even for most enthusiasts. The software doesn’t keep up.
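
    A rough sketch of the bandwidth math behind point 2; the GDDR3 figures match the GTX 280's published specs, while the GDDR5 data rate is an assumed/rumored value, not a confirmed one:

        # Bandwidth (GB/s) = bus width in bits / 8 * effective data rate in Gbps.
        def bandwidth_gbs(bus_bits, data_rate_gbps):
            return bus_bits / 8 * data_rate_gbps

        print(bandwidth_gbs(512, 2.214))  # ~141.7 GB/s: GTX 280, 512-bit GDDR3
        print(bandwidth_gbs(256, 3.6))    # ~115.2 GB/s: hypothetical 256-bit GDDR5 at 3.6 Gbps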

      • coldpower27
      • 11 years ago

      A die shrink may help some, but it won’t be even remotely close to cheap till we reach 45nm; even a 55nm chip is still going to be expensive, at around the 8800 GTX's die size.

    • swaaye
    • 11 years ago

    That idle power draw is rather incredible, IMO. The beast is just a bit above a 3870! It’s twice the size of G80 and yet pulls ~half the power at idle! Granted, load draw is pretty nuts, but this single card is much more efficient while doing nothing than those dual GPU boards.

    I’m excited to see what ATI brings out. If only they can match 8/9800GTX and be impressively priced. Matching GTX 260 and putting the hurt on price (and watts) would be even better. NVIDIA appears to be considering ATI’s new stuff a threat to GTX 260 with their pricing.

    Oh, and 8800GTX still performs beautifully. Talk about longevity for a 19 month old 3D card.

    Crysis appears to just be ahead of its time. The GTX 280 scales over the 8800GTX about as well as it does in other games, but that doesn’t amount to much of a framerate increase, really. Putting two GTX 280s together ought to help out those who really, really want to make it go fast, though. It may be a major investment, but it will probably do the trick.

    • Crayon Shin Chan
    • 11 years ago

    Whatever happened to the ‘new single card will beat two older cards in SLI’ mantra they had a while back?

      • coldpower27
      • 11 years ago

      The GTX 280 doesn’t double texturing power, unfortunately, so that may be one of the reasons why it doesn’t scale as high as it should.

      It’s still a fairly impressive and very compute-dense card though.

    • odizzido
    • 11 years ago

    New card out and the 8800 still doesn’t work with UT2004. I have almost no interest in Nvidia’s GPUs anymore.

      • Nictron
      • 11 years ago

      Try the new 177.26 drivers, they solve the UT2004 hitching bug finally.

        • odizzido
        • 11 years ago

        holy f****** s***….serious? I guess the turtles hit the right combo of keys to fix it this year….lol year. I won’t be able to test it myself for 2.5 months, and if it works I will be quite happy……but damage done. Waiting for…two years?…for them to fix this bug is pure bull**** at best.

    • Lans
    • 12 years ago

    As a gamer, the only thing that stands out to me is the much larger output buffer for geometry shaders. Was the G80 / G92 so poorly optimized, or does Nvidia (through its developer relations) know there’ll be more geometry shader usage? It definitely feels like Nvidia is hiding something here when they still don’t support DX10.1 (cube map arrays) yet made the output buffer so much larger (geometry shaders on cube map arrays, or just longer geometry shaders with more outputs). It is clear what Nvidia is expecting from this, but we’ll have to see how this plays out and how many titles the GTX 200 will benefit in. The larger register file for shaders is nice too, but I would prefer to be able to run shaders faster in general instead of just playing certain games faster… I would definitely wait on buying a GTX 280 or 260.

    As a programmer, I am not sure what to make of the less-than-1/10 rate of double-precision math (78 GFLOPS) vs. single precision (933 GFLOPS). I suppose that is probably still at least 2x faster than doing it on the CPU? But it seems like a really niche market out there that would bite the bullet for that trade-off. The larger register file is pretty nice here, though. I am (edit: not) a die-hard Nvidia fan, so I would have to wait for the RV770, given it is rumored to break 1~1.2 TFLOPS, and see what kind of improvement they made. EDIT: I am not a die-hard Nvidia fan...
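
    As a rough sanity check on that "at least 2x" guess, here is a peak-vs-peak comparison; the CPU figure is an assumption for a ~3 GHz quad-core doing 4 double-precision FLOPs per core per cycle, not a measurement:

        # Peak throughput comparison (illustrative assumptions, not benchmarks).
        gpu_sp = 933e9          # GTX 280 single precision, from the review
        gpu_dp = 78e9           # GTX 280 double precision, from the review
        cpu_dp = 3.0e9 * 4 * 4  # assumed 3 GHz quad-core, 4 DP FLOPs/cycle/core = 48 GFLOPS

        print(gpu_dp / gpu_sp)  # ~0.084, i.e. roughly 1/12 of the SP rate
        print(gpu_dp / cpu_dp)  # ~1.6x the assumed CPU peak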

      • poulpy
      • 11 years ago

      On the double precision side of things the FireStream 9250 would -apparently- offer over 200 GFLOPS.

      http://www.dailytech.com/AMD+FireStream+9250+Stream+Processor+Breaks+1+Teraflop/article12093.htm

      • blubje
      • 11 years ago

      It actually might be significantly more than 2x faster. Because of the CPU’s memory model, if you are not doing cache-friendly accesses, you are limited to the bus bandwidth of 6.4 GB/s, and if your data accesses cannot be sufficiently predicted in time, perhaps much less, as the pipeline could stall waiting for data. On the GPU, the threads are scheduled in warps, hiding much of the memory latency; a relaxed memory model also means that the bandwidth is much higher as well (111 GB/s on the GTX 260).

      So it seems you could be much faster on the GPU if you are working with a large array and you can parallelize the algorithm; it could be much slower if you can’t parallelize it, though.
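
      A crude way to put a number on that argument, using only the bandwidth figures quoted above (real speedups depend on access patterns and on how much latency the warps actually hide):

          # Upper bound for a purely memory-bound, streaming kernel: the bandwidth ratio.
          cpu_bw = 6.4    # GB/s, quoted CPU bus bandwidth
          gpu_bw = 111.0  # GB/s, quoted GTX 260 memory bandwidth

          print(gpu_bw / cpu_bw)  # ~17x ceiling if the kernel does nothing but stream data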

    • Mystic-G
    • 12 years ago

    Seeing as very few games have actual DX10.1 support, I find AMD being better in this area at this point in time to be irrelevant.

    I’ll stick with DX9 on XP, thank you.

    • l33t-g4m3r
    • 12 years ago

    meadows: porkster of 08.

      • Krogoth
      • 12 years ago

      WHO WANTS SOME BACON?

      • Corrado
      • 11 years ago

      I laughed…

      WHATS YOUR POINT!?

    • Thanato
    • 12 years ago

    Can the 200s SLI with previous-gen cards? It would be cool if one could still use their old video card with the new one in SLI mode.

      • Meadows
      • 12 years ago

      No.

      Any other fairy tale questions you want answered, while I’m here?

        • Thanato
        • 12 years ago

        hehe, um, will there be an X2 version of the 260 or 280?
        Will a 260 SLI with a 280?

        • ludi
        • 12 years ago

        Just one: When does the fairy godmother turn you into a coachman?

      • Thanato
      • 12 years ago

      In fairytale land money isn’t an issue, right?

      I guess the 200s are a waste of money. Maybe a die shrink will make the cards cheaper and use less power. How much are 1200 watt power supplies? (rhetorical)

    • Fighterpilot
    • 12 years ago

    The new graphs do indeed look great…nice one, TR!
    The results on this new card are good…not too much power draw and noise for its size, but much too expensive for most people, I would think.
    The HD4850 looks to be the go-to card this round if early reports prove correct…for $200 it makes these new green cards look overpriced.

    • paulWTAMU
    • 12 years ago

    It’s a nice card but dear God in heaven…650 dollars for a card? I just built a friend of mine a nice midrange system for that much (including a decent monitor). I’m loving my 8800GT more and more the more I see of this next generation.

    • wingless
    • 12 years ago

    Anything is better than my lowly 2900XT at this point but I think I’ll wait for the 4870 to pop up before I give Newegg my hard earned cash. The GT200’s price is a direct result of that enormous die. The cost of production is ridiculous. For once, AMD may end up with a better performance/dollar due to a superior process and no need for over-engineering. AMD will be the “Intel” of GPUs if they can keep this up (well, we’ll see how the R700 performs, eh?). I think the price is right with the upcoming ATI cards. I can’t wait to see how they do next week.

    • Corrado
    • 12 years ago

    I’m not impressed, really. I can’t say that I’m disappointed, but it’s not impressive either. It’s about what I expected, maybe? I don’t know. I’m very ‘meh’ on the whole situation. There’s a plethora of awesome-performing, very inexpensive (relatively) cards out there right now… yet nVidia still tries to sell this for $650? I think that’s what gets me. nVidia is trying to play by the rules they created 3-4 years ago today, and it’s not working anymore. I think ATI changed the rules with the 3870 and its price/performance ratio, coupled with ATI’s much better multi-GPU scaling. It’s obviously not perfect (see GRID), but Crossfire, in my opinion, is leaps and bounds better than nVidia’s SLI. Most games scale VERY well with Crossfire, which is what I like to see, minus a lot of the quirks inherent in SLI.

    I myself just bought a 3870, and am upgrading from an X800GTO. I got the 3870 for $135, and it’s got a Zalman cooler on it, so it’s very high quality, and yet still very quiet. Because of this, I’m not really in the market for a new card for quite some time, except maybe for a 2nd card for multi-GPU. Even so, that won’t be for a year or more. This could also be the reason for my apathetic impression of this card.

    I won’t lie, I’m ATI/AMD biased, but I’m also not a blind fanboy. I’ve had plenty of nVidia and Intel products, and in fact am on a Core2Duo/i965 setup right now. I WANT AMD to beat nVidia and Intel, but I won’t lie to myself and believe it to be so when it’s obviously not. I just believe that, with regards to graphics, AMD is on a better path right now than nVidia, and this review reinforces those feelings for me. $650 is far too much for a video card, especially one that isn’t on the order of when I went from a Trident PCI card and software rendering in Quake to a Voodoo 1 and GLQuake… and even that was only $299.

    • Bensam123
    • 12 years ago

    The Chiclet comparison was great. I’ve never thought of GPUs that way, and I’m sure looking at differently colored rectangles on a diagram will remind me of this again. XD

    “I can’t decide whether that legitimizes the approach or makes Nvidia the winner by default.”

    Perhaps a bit of both. 3dfx’s death knell was going to dual/quad setups, but arguably there were a lot of other issues involved there.

    What’s better: having one really big GPU or two smaller GPUs? I think they’re one and the same as long as the two can work together well.

      • Corrado
      • 12 years ago

      Exactly. What’s better? A 400hp V8, or a 400hp supercharged V6? It doesn’t really matter if the torque and hp curves are relatively similar. The V6 may use less gas, but the V8 has fewer moving parts. If the end result is the same, most users don’t really care how it gets done.

    • herothezero
    • 12 years ago

    Clearly, the $600 I spent on the 8800GTX upon its initial release was money well-spent.

    Is anyone else besides me not seeing the value proposition of the GTX200?

      • jobodaho
      • 12 years ago

      Not me, that money could buy a PS3…

        • rosselt
        • 11 years ago

        I’ve had my 8800GTX and Core 2 @ 3.2GHz since Jan ’07, and it’s been great, but last month I got a 46″ 1080p Bravia and an X360, and since then I’ve only used my PC to stream stuff to the X360, browse, and read email.

        I guess what I’m trying to say, though, is that you still need a decent TV, preferably 1080p, with a console for it to be really worthwhile, which raises the price of the ‘package’; along with the games being more expensive, the argument that PCs cost more isn’t really justified.

      • leor
      • 12 years ago

      for real, my 8800GTX runs everything fine on my 30 inch monster. I’ve never had a graphics card stay relevant for so long.

      • Price0331
      • 12 years ago

      Nah, not really; spending that much money on a component is borderline insane to me, when the 3870 I bought for 250 dollars can run everything I want, including Inventor 08.

      But, whatever you feel is money well spent. ; ]

      • indeego
      • 11 years ago

      Clearly not, as that GPU is now half the cost.

    • d0g_p00p
    • 12 years ago

    Nice review. Guess it goes to show what a pile of crap Crysis is code wise.

      • eitje
      • 12 years ago

      d0g_p00p is taking care of issues before the main body of the squad gets there!

    • Forge
    • 12 years ago

    Most impressive. The heat and noise are not as bad as I was led to believe, and the performance is quite good. It's unfortunate that the price scaled up faster than performance did.

    Bring R700 and let’s get it on.

    • drfish
    • 12 years ago

    It’s so nice of nvidia not to release hardware that makes me feel bad about my 8800GTX… I have absolutely no inclination to upgrade my video card after reading that review. Long live the 8800GTX!

      • HalcYoN[nV]
      • 12 years ago

      Best video card ever. It really is. The only card I used for anywhere near as long was an ATI 9700. Otherwise, it has always been a six to nine month upgrade cycle, sometimes adding a second card for SLI, sometimes just a new card.

      Honestly, ATI may get this round with price/performance/practicality.

    • provoko
    • 12 years ago

    Good job nvidia, i’m glad it wasn’t just a 10 frame improvement this time around.

      • Meadows
      • 12 years ago

      It’s sad that some people had expectations such as yours there.

    • Hattig
    • 12 years ago

    Good Review.

    I like that nVidia have spent the time to create a solid basic architecture for forthcoming products over the next couple of years.

    On the other hand, when is AMD’s product coming out? Hopefully there will be a lot of competition – pricewise and performance-wise, between the products, even if ATI use X2 to get there.

    • leor
    • 12 years ago

    …

      • PRIME1
      • 12 years ago

      Now is that because of the size or the amount of hot air that….. er nevermind.

      I think that analogy can go all sorts of ways.

        • eitje
        • 12 years ago

        two slots.

        ’nuff said.

    • astraelraen
    • 12 years ago

    Why would I buy a GTX280 for 650+ when a 9800GX2 gives better performance for ~470?

      • StashTheVampede
      • 12 years ago

      a single gpu solution that probably has a longer “useful” life because of the newer tech?

        • leor
        • 12 years ago

        i’d rather have the 200 bucks . . .

    • PRIME1
    • 12 years ago

    It seems AMD will fall further behind in the GPU race.

    I think relying on dual gpus to compete against one gpu will be a difficult sell, especially with all the problems crossfire has.

    Not to mention that with a die shrink a dual 280 card will be very much a reality.

    Of course, for most people a 9600GT will be more than enough power; the number of PC games needing top-end cards keeps getting smaller and smaller.

      • Krogoth
      • 12 years ago

      Ok, please drop those green-shaded glasses.

      Nvidia started the whole recent dual-GPU-on-a-single-PCB business (well, 3dfx and ATI did it a long time ago).

      How is a mega-GPU package any better than two smaller GPU packages that deliver similar performance? From an economic and technical standpoint it is easier to sell two smaller dies than one huge one. Remember how C2Q versus Phenom turned out?

      What is funnier is that you revert to recommending a cheap, older design as being more than sufficient for most gamers’ needs, thus making the new GT2xx cards pointless.

        • PRIME1
        • 12 years ago

        With the driver problems and profile issues inherent in dual GPU setups, a single GPU is preferable.

        That is not just my conclusion, but the conclusion made in the article we are commenting on.

        Dual GPU is interesting and fun but a niche market at best and right now only the 9800GX2 comes close to what a single GT280 can do.

          • Jigar
          • 12 years ago

          “9800GX2 comes close to what a single GT280 can do.” If I have read the article correctly, it was more the GTX 280 following the 9800GX2 as far as the performance goes...

          • Krogoth
          • 12 years ago

          Nvidia drivers still have annoying issues, even with single-card solutions. Yet, their development team spends far more resources on niche solutions like dual-GPU and SLI. >_<

            • Meadows
            • 12 years ago

            SLI and Crossfire …

      • HalcYoN[nV]
      • 12 years ago

      As long as ATI continues to advance the Crossfire software, which you know they are focusing on with their decision to embrace the multi-GPU single slot model, most of your complaints will disappear. Crossfire already bests SLI in multi-monitor support.

      Again, the 8800GTX ruined us all. It was an amazing product that moved ahead light years. But now, AMD/ATI has a very compelling philosophy that is gaining traction for the balanced system mantra.

    • michael_d
    • 12 years ago

    Very disappointing scores from GTX280 especially in Crysis. 4870 will be very close and perhaps even beat it. 4870 x2 will wipe the floor with GTX280.

    • Lord.Blue
    • 12 years ago

    My 8800 GTS 320 is doing fine for me right now. Mass Effect looks amazing, WoW has never run better, and Source-based games are looking brilliant.

      • Mystic-G
      • 12 years ago

      Hahahaha, I’m with you there, man.

      Until games cause it to really lack in performance, I’ll watch the rest of these video cards come and go. I’m just mad that I paid $300 for my 320MB Superclocked in August of last year. Yeah, that was actually a cheap price for it. But that’s how the industry goes, I guess.

      By the time it throws in the towel, whatever I get by then will be worth the wait, I’m sure… DX11 support =p

      • Mithent
      • 12 years ago

      That’s the thing; unless you’re one of the people who apparently lose sleep over whether they can play Crysis on Very High, then there’s little need to move to one of these cards right now. They’re certainly faster than the old generation, but the old generation isn’t being sufficiently taxed yet by most people.

      I’d say, wait until these are needed, by which time they’ll be cheaper. Don’t pay early-adopter prices now unless what you have is inadequate today.

    • MrJP
    • 12 years ago

    Time for the big …

    • ssidbroadcast
    • 12 years ago

    Dang. nVidia does it again. Thanks Scott for posting that interesting sidenote on Assassin’s Creed 10.1 performance, tho.

    • Krogoth
    • 12 years ago

    It is just a freaking G9x on steroids.

    Reminds me of the Radeon X8xx: the company in question’s attempt to stretch its basic architecture to its practical limits.

    FYI, the X8xx was basically a Radeon 9700 PRO with more TMUs, pixel shaders, and pipelines, plus minor tweaks. Likewise, the GTX 280 deep down is a G80 with some tweaks, more streaming processors, and a wider memory bus.

    Performance is not stellar, but it is fast. It is another question whether there is anything worthwhile on the PC that truly needs its power. I am getting by with my aging X1900XT fine.

      • Meadows
      • 12 years ago

      HD gaming has been around for a very long time. It needs juice. Your card couldn’t cut it on release.

      Nehalem is Penryn on steroids is Conroe on steroids. Radeon HD 4000 is HD 3000 on steroids is HD 2000 on steroids. GeForce 7 is GeForce 6 on steroids is GeForce 5 (FX) on steroids. But enough of the fractal-looking sentences; what’s your point again? It’s still advancement. And with the tweaks nVidia has done, it’s more than sufficient to declare a new line of cards…

        • Krogoth
        • 12 years ago

        Nehalem has not much in common with Conroe/Penryn. It is a new architecture from the ground up. GeForce 6/7 have little in common with the embarrassing GeForce FX family.

        I cannot really say the same for the GT2xx family. They are nothing more than evolutionary tweaks of the G80, and the performance reflects that. It barely beats its G9x SLI counterparts while having similar loaded power and heat profiles.

        ATI just has to deliver a GPU with similar performance and much saner heat and power requirements to beat Nvidia at its own game. It will force Nvidia to drop prices on the GT2xx = competition; good for customers.

          • Meadows
          • 12 years ago

          I hope that happens, but again, you really shouldn’t be defaming a product that’s clearly new. Surely the scientists at nVidia aren’t so stupid as to need to rebuild the core from the ground up every other year. Think about it, for pity’s sake. They can’t change their minds so often. They’re not dumb; they don’t need 6 tries to get an architecture right (6 practical tries – as nVidia employs large clusters whose sole job is emulating future products and attempting to predict their behaviour, they don’t need to explore so many dead ends in real life).

          They created something efficient (as far as the G80 can go); there’s no reason to throw it away and spend millions or billions on developing a new core – a core that may, or may not, outdo the previous one (meaning it can be a gamble to some extent). Look at how ATI’s novel idea fumbled in the HD 2000 days.

          I’m sorry, but almost all players in the IT world will disappoint you if you expect something brand new every Christmas.

            • Krogoth
            • 12 years ago

            Nvidia’s executives are just being lazy, because of the lack of competition. I doubt that the GT2xx were products by engineers and innovators. They are products of cost-saving R&D and quicker returns, done by executives for shareholders’ needs.

            ATI tried the same thing with their R3xx, but Nvidia caught up with their G6x/G7x stuff and ATI never quite got their edge back. History may repeat itself with ATI being the one on top and Nvidia playing catch-up. However, Intel might render the mainstream discrete GPU into a niche if they begin to develop decent integrated GPUs.

            • Meadows
            • 12 years ago

            Intel would never do such a thing, even though they might be able to.

            • SPOOFE
            • 12 years ago

            “Nvidia’s executives are just being lazy”

            Or you’re just overly whiny.

            • Krogoth
            • 12 years ago

            They have no incentive to risk something completely new, because of the lack of competition.

            The GT2xx is one of the consequences of it.

            It is good, old human nature at work.

            BTW, I am not complaining about it, rather just observing reality for what it is.

            • Meadows
            • 12 years ago

            Then you’re one crappy observer.

            • Krogoth
            • 12 years ago

            Please put away those green-shaded glasses. It is making you look very silly. >_<

            Nvidia’s engineers did not really try at all this time around, because of corporate politics, not because they are incapable. Look at the golden example of this with Intel and Netburst. What is the root cause of both cases? Lack of competition.

            It is just too difficult to justify a $649 card that performs like 30% better than a $149 model that is commonly available on the market. Hell, it makes the 8800GTX look even more solid and worth the $599 at its official debut. I did not even use AMD examples!

            • eitje
            • 12 years ago

            I think that only someone who works for Nvidia, especially on their engineering team, would be able to speak to why this chip is what it is. You’re speculating on the internal aspects of the company at this point (as much as your detractors are also doing), and I don’t think any of us can safely do that.

            • Krogoth
            • 12 years ago

            There is plenty of indirect evidence that executive guys are holding the engineers by the balls with product development decisions.

            Just look at how little has changed since the 8800GTX’s debut from an architecture standpoint. The G9x and GT2xx are nothing more than die shrinks, subtle tweaking, and juicing of the same darn design. Look at how Nvidia has been aggressively pushing SLI and wanting to keep it exclusive to its platform for the past few years. They even made ridiculously stupid nonsense like Quad SLI.

            The priorities of their driver development team have not been that promising since the 8800GTX’s launch.

            • eitje
            • 11 years ago

            That could just as easily be the engineers sticking to their current design because they really like the elegance of it. Or it could be that staffing was severely reduced on the GT200 project, so they had to go with an incremental increase. Or they might’ve been working on something else entirely, and then had to scrap it because they wouldn’t be able to complete it in time.

            We could pretend we know what’s going on all day long!

            • Meadows
            • 11 years ago

            I believe they’re sticking to the design because it works and it’s efficient. Krogoth can’t be pleased though; games are shit, hardware is shit, what…

      • flip-mode
      • 12 years ago

      I don’t understand why you’re so put off by what is a very typical development process.

      Gf4 = Gf3++
      R400 = R300++
      Gf7 = Gf6++
      Gf9 = Gf8++
      R700 = R600++

      Basically Intel’s “tick-tock” has been going on in GPUs for a good long while.

      I’m fairly impressed by GT200 if only from a manufacturing standpoint. And it performs decently and the power consumption is nothing short of impressive considering the chip.

      The price is ridiculous in my opinion. With a chip that size, Nvidia really is in a tough spot if the R700 cards end up beating the G92 cards. Nvidia can drop G92 prices and ATI can follow with R700 prices, but the size of the GT200 certainly limits how far it can drop. R700 could possibly put the hurt on Nvidia if it’s small enough and fast enough.

        • Krogoth
        • 12 years ago

        Impressive from a manufacturing standpoint, only if the yields are actually good for something that huge.

          • SPOOFE
          • 12 years ago

          So you admit you have no idea whether it’ll be impressive or not? You sure shoot your mouth off with bloatedly overconfident rubbish for someone that has no idea.

            • Krogoth
            • 12 years ago

            Jumping to conclusions, eh?

            I am just clarifying my conditions for it being impressive from a manufacturing standpoint.

            It is fairly well known that huge monolithic die designs are prone to yield issues, simply because the probability of transistor and other circuit problems is mathematically higher when there are more parts (variables) involved.

          • eitje
          • 11 years ago

          My bet: yields are why they’re charging $650 per card.

      • henfactor
      • 12 years ago

      Umm, Krogoth, I get the sense you have something against Nvidia. Do you? (meant as a true question, not a personal attack 😉)

        • Krogoth
        • 12 years ago

        Not really; their execution of their latest products has been rather lackluster, for lack of a better word.

        Since the 8800GTX’s debut, their drivers have fallen short. They stubbornly refuse to make SLI more open when they could make more money on discrete video cards than on their so-so chipset solutions. The GeForce 9xxx family is an insult brand-name-wise when it was little more than a die shrink of the 8xxx family; the current GT2xx should be the real 9xxx family. They have the lead in the discrete market, yet fall short on the aforementioned stupid, stupid stuff.

        I am not an anti-Nvidia zealot. I have owned several Nvidia products in the past: Nforce 2, Nforce 4, GeForce 3, GeForce 4 Ti4400, GeForce 7800GTX. The only one on that list I was somewhat annoyed with was the Nforce 4’s broken extra features.

          • ish718
          • 11 years ago

          Yeah, Nvidia seems to always use the brute-force tactic (more stream processors, TMUs, more memory, etc…)

          I don’t see any real innovation going on there.
          ATI made some attempt with DX10.1, but that’s FAIL.

            • Meadows
            • 11 years ago

            So when ATI adds more memory, faster memory, more stream processors and more TMUs, that’s innovation?

            • ish718
            • 11 years ago

            It’s quite indisputable at this point that the HD4000 will do well, especially when you consider price/performance.

    • albundy
    • 12 years ago

    sadly i was expecting Shader Model 5.0 and better crysis fps. would like to see amd’s offering…

    • Umbragen
    • 12 years ago

    So ATI just needs to get close… And at half the sticker price, not that close.

    I should also mention, nothing about this card is going to get Vista out of its box and onto my PC. Is NVidia having a snit with Microsoft as well as Intel?

    • bogbox
    • 12 years ago

    What happened to the big, huge, green monster? (like the Hulk :)))

    It was beaten by a cheaper card? By an older card? WOW!

    What happened to the most expensive card of all, at $1000 (650 euros)? It got beaten by a $600 (400 euro) card?
    Nvidia promoted SLI as the ultimate gaming platform, like the 9800GX2, and got beaten at their own game.
    I’m not sorry for them. (Remember the FX days? Or the ATI R600?)
    They didn’t learn anything.
    Where are all the Nvidia fanboys?

    Great review, TR! Good decision on Assassin’s Creed! Look at that, all you fanboys and complainers!
    PS: I’m not an ATI fanboy; I bought an FX card myself! (still working well)

      • Meadows
      • 12 years ago

      …

        • bogbox
        • 12 years ago

        …

          • Meadows
          • 12 years ago

          I’ve been trying to stay cool and civilized but damn, you’re one hilarious idiot if I’ve ever seen one.

            • bthylafh
            • 12 years ago

            As civilized as you ever are, anyhow.

            • Meadows
            • 12 years ago

            I’d like to see you try and discuss anything with the above young babblefish.

            • bthylafh
            • 12 years ago

            I simply wouldn’t bother.

            • Meadows
            • 12 years ago

            I’m just trying to help the poor guy.

            • Jigar
            • 12 years ago

            Seems like you are the one who needs help… Be smart, buy the best for the money…

            • Jigar
            • 12 years ago

            Double post…

    • PrincipalSkinner
    • 12 years ago

    Power savings are nice but performance is nothing special. The price is mean.

    • Thresher
    • 12 years ago

    I am beginning to think that TR’s opinion on X2-type cards is getting a bit dated. Certainly, nVidia is still having issues with their drivers, but AMD has gotten it just about right. There are very few instances where the 3870X2 falls down on the job, and I suspect that nVidia will be aiming to copy that success.

    Personally, I do not want an atomic reactor inside my computer. I have enough problems with my computer room heating up whenever I play a game. If it works functionally the same, then I’m all for an X2/X4 or whatever.

      • Meadows
      • 12 years ago

      I don’t know whether you’ve noticed it, but the way TR tested the cards, your fabled X2 actually “warms the room more” than the GTX 280, runs a bit warmer too, and even the acoustics are no better (only by a hint; the difference is so small you might not pick it up). To add insult to injury, the GTX 280 consistently outperforms the X2 as well, with the solitary exception of Assassin’s Creed. I consider that pretty good.

        • Thresher
        • 12 years ago

        My point is that a single, massive GPU may not be the answer and that multichip solutions have become much more stable. I am not a fanboy, I have an 8800GTS right now. I don’t think there is a single, monolithic approach to the problem. GPUs are very, very multithreaded, there is no reason why doubling up on GPUs can’t be an effective solution.

    • donkeycrock
    • 12 years ago

    I think you should have included UT3 benchmarks.

    So when does the ATI 4870 launch, hopefully soon.

    • Jigar
    • 12 years ago

    I think Nvidia has finally made the mistake AMD was waiting for. If AMD can cash in on it, this would be the best time for consumers to get a better GPU once the 48xx series comes out…

      • Meadows
      • 12 years ago

      Well, architecturally there is little that is new – and they’ve learned to use their own technology well in the past 2 years or so.

      I imagine they just need to roll out a better driver to make good use of this. The 9800 GX2 may have 256 shader processors on it, but we know the efficiency of SLI all too well. I don’t know if I’m being optimistic, but I expect the GTX series to trounce everything as newer drivers surface.

      Although, it would be nice to see the HD 4 series perform too. Whichever side wins this time will get the chance to cash in a lot on it.

    • flip-mode
    • 12 years ago

    Scott, the new graph colors look excellent. They are much more legible than before. Good job on that.

    And Scott, an extra-special thank you for probably spending a lot of Father’s Day on this article. I know you have young ones. I spent yesterday squirting my kids with the hose and having a cookout with the family. Your hard work and sacrifice are appreciated.

    This is an excellent article. Job well done. Excellent detail, writing, and explanation. I hope you have a chance to take a couple of days off.

    As for the card itself, it is exciting, but far beyond my budget limit. Good job Nvidia.

      • flip-mode
      • 12 years ago

      I can’t believe after all the whining that has been done about the graphs that I’m the only one that bothers to say thanks, and I’m not even the one that makes the stink about it. The GT200 will come and go, but well colored graphs are surprisingly hard to come by.

        • Fastfreak39
        • 12 years ago

        I agree with the graph color assessment. The graphs look cleaner and more professional. Thanks for constantly looking for ways to improve your reviews, even in fine details like graph color. That being said, I’m looking forward to the AMD reviews. Nvidia needs a fire lit under their hindquarters; I think they can push closer to the edge than this.

      • Usacomp2k3
      • 12 years ago

      Thank you from me as well. So easy to read. Thanks!

    • Meadows
    • 12 years ago

    Performance is good so far, but I believe they’ll roll out better drivers in time.

    Temperatures and power draw are extremely impressive compared to the FUD that has been floating around them. If I had the money, I’d get one of these (though I’m inclined to remove my CPU bottleneck first).

      • ChronoReverse
      • 12 years ago

      Huh? We already knew idle would be good (it BETTER be for this generation), but load draw is more than the 2900XT.

      Sure that’s mitigated by the far higher performance but it’s still a lot in absolute terms.

    • bthylafh
    • 12 years ago

    No interest from me. Too big, too expensive, too much power draw. I’m with Microsoft (heh) in being annoyed by Nvidia ignoring DirectX 10.1’s all-or-nothing requirement and trying to get developers to go along with it.

    Maybe after they do a die-shrink and get full support for DX 10.1, I’d be interested, if I’m ready to upgrade from my 7900GS by then.

      • Meadows
      • 12 years ago

      Not sure if you’d noticed, but power draw was entirely acceptable.

        • bthylafh
        • 12 years ago

        Not for me…

      • BenBasson
      • 12 years ago

      If you can drop idle power consumption substantially with HybridPower, the load consumption will basically become irrelevant, assuming your PSU can support it. It’s probably idle 90% of the time.

      I know my 8800 GTX sits idle a lot more than it gets used, and is currently eating up 30W more than the GTX 280 while doing so, so the power draw for me would be a complete win.
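
      To put the duty-cycle point into numbers (these are hypothetical figures for illustration, not TR's measurements):

          # Weighted average draw, assuming the card idles 90% of the time.
          idle_w, load_w = 30.0, 180.0  # hypothetical per-card idle/load watts
          idle_fraction = 0.9           # assumed share of time spent idle

          avg_w = idle_fraction * idle_w + (1 - idle_fraction) * load_w
          print(avg_w)  # 45.0 W with these made-up figures: idle draw dominates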

    • ElderDruid
    • 12 years ago

    I’m not sure whether all these ridiculous, disappointing, incremental gains in performance are more about technical challenges or market manipulation. Being a glass-is-half-empty person, I’ll say the latter.

    Give me 60 FPS in Crysis already!

    I have an 8800 GTS, and I still don’t feel like this new “generation” is the killer upgrade I’ve been waiting for.

      • Spotpuff
      • 12 years ago

      Same.

      Great review as always, but I come away feeling underwhelmed by the performance boost compared to, say, the jump from the 7900 series to the 8800 series.

      Oh well. Nvidia can’t really control how games are coded and what features they include.

      • [TR]
      • 12 years ago

      I just bought an 8800GT a month and a half ago and am pretty happy with it.
      Like you, I’ll wait for lower prices and the inevitable revision in a few months to a year.
      But I suspect that what these new chips bring is more power to do what our cards do right now, not exactly some great DX8->DX9 leap kind of thing.
      DX10 is kind of waiting for people to forget about XP, but I always got the impression it doesn’t go too far beyond DX9 anyway in terms of visual effects. Maybe I’m wrong, though.

      • Meadows
      • 12 years ago

      You don’t need 60 fps in Crysis, thank you very much. You might not even need 40 most of the time. You may not realize it though.

        • Jigar
        • 12 years ago

        Can you please tell me how we can’t notice 60fps???

        Oh before you answer me back… you might want to read this ..

        §[<http://amo.net/NT/02-21-01FPS.html<]§

          • Meadows
          • 12 years ago

          I don’t give a flying paper bag about what the eye can see.
          I’m talking about what you need.

          Since I guess you meant maxing out Crysis, then I might add that it employs nice quality motion blur. Now that feature erases the need for 60, 50 or even 40 fps to feel fluid motion and enjoy the game, and as a matter of fact, it might be fun even at 30. I had the average fps under 30 when playing Crysis and I still finished it. Wonder how I survived that horrible torture.

          You don’t need 60 fps.
          Soon you won’t even need it in UT3 and the like. I can’t wait for the next patch, as it will unlock motion blur in the UT engine.

          Edit: I’ve read a bit into the link you gave but I stopped at the point where they said the eye is “capable of implementing motion blur”. If I want to laugh I go to comedycentral.com, not a site masquerading as serious.

            • Jigar
            • 12 years ago

            Two things I learned about you today… First, you are not an FPS player (and if you think you are… think again). Second, you are impatient; you should have read the article further, there was an explanation for it…

            • Meadows
            • 12 years ago

            I am an FPS player (also, but not mainly), and I prefer non-motion-blurred games at 60-75 fps, depending on what refresh rate the resolution in question runs at. I’m sorry, but I cannot help you should you keep opposing that fact.

            • Jigar
            • 12 years ago

            Nope, I won’t oppose any of your facts… The only thing I wanted to discuss here is that for you 60FPS might not matter, but for me, if I am playing UT3, a single drop in FPS means a missed frag, which is not tolerable. If you ask me how I miss the frag?? Play it in god mode 😉

            • Hdfisise
            • 12 years ago

            Pff, I used to play CSS at 20-30fps and it was fine. Moving to a better PC was cool, though, but not as big a move as people say it is.

            • JoshMST
            • 12 years ago

            Oh goodness, not the amo.net article. I originally wrote the 30 vs. 60 fps piece in early 1999, and the amo.net guy plagiarized it around 2001. I called him on it, and he changed a couple of paragraphs. I just dropped it after that.

            I removed the article from my site because of some pretty grave technical errors on my part, especially dealing with how humans actually see. The more research I have done actually leads me to believe that the human eye can actually “perceive” more fps than 60, or even 120… but it is the trained mind that can actually utilize the information coming to the brain.

            For example, if you take a person who has not played video games and have them mess around with a game at 30 fps, then have the same game at 120 fps, they would likely say, “Well, I guess the 2nd seems smoother, but I don’t know”. Now, you take a person that plays lots of fast action FPS games and have them do the same thing, they will have an instant response of “OMG THIS IS SO MUCH SMOOTHER”.

            The human eye has some really interesting optimizations and “shortcuts” for us to view the world. Our brain has far more, and with training it can more adequately and accurately perceive the world around us (read up on what USAFA has researched for training in this area for both their football players and pilots).

            Where we fall down is in the technical limitations of the hardware we have, both in terms of what the graphics cards can put out and, more importantly, what our monitors can do. I have done some seat-of-the-pants testing, and using a video card and monitor combination on an FPS that can hold a locked 120 fps/120 Hz results in a better gaming experience. When playing fast action games, the reduced latency results in faster reaction times, and the input from those reactions corresponds to higher scores in many games (assuming you are a good player to begin with).

            Anyway, more fps is better, though you may only notice it in certain scenarios. Supreme Commander is really limited in how it “feels” with more FPS, while games like Counter Strike can have a dramatic effect on your experience as well as your score.

            The important part is that the eye is not the limitation to our experience when it comes to FPS.
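
            One concrete piece of the latency argument is simply frame time; a quick sketch of the arithmetic:

                # Frame time is the minimum added delay between an event and its display.
                for fps in (30, 60, 120):
                    print(fps, "fps ->", round(1000 / fps, 1), "ms per frame")
                # 30 fps -> 33.3 ms, 60 fps -> 16.7 ms, 120 fps -> 8.3 ms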

            • bjm
            • 12 years ago

            Nobody buys these video cards to fulfill a need.

        • StashTheVampede
        • 12 years ago

        You don’t need 60fps for single player — I agree with this statement. There are numerous other games where 60fps wasn’t necessary (Operation Flashpoint was one that wasn’t ever very fast).

        In multi-player, I 100% disagree with the 60fps statement. The faster the game is displayed on my screen, the better I can react to the situation. I’ve played nearly every FPS since Castle Wolfenstein (especially the Quake-based ones), and more FPS = a better chance of winning.

      • ElderDruid
      • 12 years ago

      Plus….I don’t think AMD or nVidia have a prayer of ever seeing another TR Editor’s Choice award on a high end product until they stop all this BS and release a card that truly breaks new ground in performance.

    • ElderDruid
    • 12 years ago

    Post deleted by author.

      • INMCM
      • 12 years ago

      800×600, low settings

    • MaxTheLimit
    • 12 years ago

    God damn, that’s a lot of cash to fork out for so little performance gain. I see no reason here to upgrade from the 8800GT. I really hope the RV770 brings up some good numbers to make for some competitive pricing.

    • 2x4
    • 12 years ago

    when is the release date??

      • Stijn
      • 12 years ago

      …

    • Pettytheft
    • 12 years ago

    This is finally a big enough performance jump to make me consider upgrading my GTS640 but I’ll have to wait on a deal. The 260 is close but even though they said $400 it’ll probably be $450 for the first few months. Hopefully ATI is competitive this time around and we’ll get some early price drops and deals from retailers.
