Doing the math
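|Card||Peak pixel fill rate (Gpixels/s)||Peak bilinear texel filtering rate (Gtexels/s)||Peak bilinear FP16 texel filtering rate (Gtexels/s)||Peak memory bandwidth (GB/s)||Peak shader arithmetic (GFLOPS)|
|GeForce 9600 GT||10.4||20.8||10.4||57.6||208|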
|Palit GeForce 9600 GT||11.2||22.4||11.2||64.0||224|
|GeForce 8800 GT||9.6||33.6||16.8||57.6||336|
|GeForce 8800 GTS||10.0||12.0||12.0||64.0||230|
|GeForce 8800 GTS 512||10.4||41.6||20.8||62.1||416|
|GeForce 8800 GTX||13.8||18.4||18.4||86.4||346|
|GeForce 8800 Ultra||14.7||19.6||19.6||103.7||384|
|GeForce 9800 GTX||10.8||43.2||21.6||70.4||432|
|GeForce 9800 GX2||19.2||76.8||38.4||128.0||768|
|Radeon HD 2900 XT||11.9||11.9||11.9||105.6||475|
|Radeon HD 3850||10.7||10.7||10.7||53.1||429|
|Radeon HD 3870||12.4||12.4||12.4||72.0||496|
|Diamond Radeon HD 3870 1GB||13.3||13.3||13.3||55.7||531|
|Radeon HD 3870 X2||26.4||26.4||26.4||115.2||1056|
The table above shows how Diamond's HD 3870 1GB stacks up in theoretical terms. Its higher GPU core clock grants it more fill rate and shader power than the stock HD 3870, although its lower memory clock cuts bandwidth considerably. The 1GB card's memory bandwidth is still comparable to that of its GeForce 9600 GT and 8800 GT competition, though.
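To put rough numbers on that trade-off, here is a minimal sketch that computes the percentage difference between the two cards. It assumes the first, fourth, and fifth numeric columns of the table are peak pixel fill rate, memory bandwidth, and shader arithmetic; the dictionary keys and function names are made up for illustration.

```python
# Peak theoretical figures copied from the table above. The column meanings
# (pixel fill in Gpixels/s, bandwidth in GB/s, shader arithmetic in GFLOPS)
# are an assumption about the table layout.
stock_3870 = {"pixel_fill": 12.4, "bandwidth": 72.0, "shader": 496}
diamond_1gb = {"pixel_fill": 13.3, "bandwidth": 55.7, "shader": 531}


def percent_change(new, old):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def compare(card, baseline):
    """Percent change for every metric in the baseline."""
    return {key: percent_change(card[key], baseline[key]) for key in baseline}


deltas = compare(diamond_1gb, stock_3870)
for metric, delta in deltas.items():
    print(f"{metric}: {delta:+.1f}%")
```

By these figures, the Diamond card gains roughly 7% in fill rate and shader arithmetic while giving up a bit under 23% of the stock card's memory bandwidth.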
The more interesting question here involves overall performance. Not to give too much away, but the 3870 has somewhat underachieved versus the 9600 GT and 8800 GT given its raw shader FLOPS capacity. Why is that? One possibility is that the RV670 GPU's five-wide superscalar execution units don't process data as efficiently as Nvidia's scalar units. I'm not sold on that explanation, though. AMD has implemented all sorts of voodoo magic in its driver compiler, including serializing a pixel shader program for execution on that fifth ALU while another executes in vector fashion on the other four ALUs. Also, the performance of the 9600 GT argues against shader power being a primary constraint in today's games. The more likely explanations involve the RV670's relatively weak texturing capacity and the fact that R6x0-series GPUs, either by design or because of a rumored flaw in the ROP logic, cannot perform the resolve step for multisampled antialiasing in their ROP hardware; they must use the shader core for this task.
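One way to see the texturing imbalance on paper is to divide each card's peak shader arithmetic by its peak texel rate. The sketch below does that, assuming the second numeric column of the table is the bilinear texel filtering rate in Gtexels/s and the last is shader arithmetic in GFLOPS. The "FLOPs per texel" ratio is an illustrative metric of my own choosing, not something measured here, and it says nothing by itself about what limits performance in real games.

```python
# (texel rate in Gtexels/s, shader arithmetic in GFLOPS), copied from the
# table above; the column meanings are an assumption about its layout.
cards = {
    "GeForce 9600 GT": (20.8, 208),
    "GeForce 8800 GT": (33.6, 336),
    "Radeon HD 3870": (12.4, 496),
    "Diamond Radeon HD 3870 1GB": (13.3, 531),
}


def flops_per_texel(texel_rate, gflops):
    """Peak shader operations available per filtered texel (illustrative)."""
    return gflops / texel_rate


ratios = {name: flops_per_texel(*figures) for name, figures in cards.items()}

# List the cards from most texture-heavy to most shader-heavy.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.1f} FLOPs per texel")
```

The two GeForces land at about 10 FLOPs per texel, while both Radeons sit near 40, which fits the idea that the RV670 is comparatively short on texturing capacity relative to its shader power.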
Another possibility, I suppose, is that the RV670 doesn't compress and manage memory as efficiently as the GeForces do. If so, Diamond's 1GB card may be an answer.
3DMark lets us measure performance in some of our theoretical categories. In practice, sheer pixel throughput tends to be limited by memory bandwidth, which is why Diamond's 1GB card scores lower in single-texture fill rate than the 512MB GDDR4 version of the HD 3870. Multitextured fill rate hits no such limits; the 1GB card nearly reaches its theoretical peak capacity. However, that capacity is appreciably lower than the GeForce 9600 GT's, let alone the 8800 GT's.
The 3870 1GB shows its shader power, mixing it up with the GeForce cards from test to test. One intriguing result: the stock Radeon HD 3870's performance suffers in the simple vertex shader test, likely due to GDDR4's higher access latencies. With its GDDR3 memory, Diamond's 3870 1GB avoids that fate.
As ever, these results don't track perfectly with performance in actual games, although they do give us some insight. For gaming performance, we have... actual games.