WE’VE FOLLOWED the story of ATI’s slow march to leadership in the graphics hardware world closely. The R300 chip, which powers ATI’s high-end Radeon 9700 Pro card, has given ATI an indisputable lead in terms of features and performance over perennial rival NVIDIA. R300 features a rich array of datatypes, including high-precision floating-point color throughout its pipeline. This newfound precision will enable graphics hardware to encroach on the world of cinematic rendering traditionally owned by slow, software-only renderers.
Translation: it’s really, really cool.
Now comes the pivotal step in ATI’s march. Today the company is shipping its new sub-$200 graphics card based on the R300 chip, just in time for the Christmas shopping season. This card, the Radeon 9500 Pro, directly targets the heart of NVIDIA’s lineup: the GeForce4 Ti 4200. The 4200 has dominated the middle of the graphics card market since its introduction this past spring. Can ATI’s new card knock the GF4 Ti 4200 off its perch? Keep reading to find out.
Introducing the Radeon 9500 Pro
First things first. In order to understand best what the Radeon 9500 Pro is all about, you’d do well to go read my swanky intro to next-gen graphics chips, which explains exactly how and why these new chips are a step ahead of anything you’ve seen before. Then you’ll want to go read my review of the Radeon 9700 Pro. Since the 9500 Pro is based on the exact same ATI R300 chip, nearly everything I said about the 9700 Pro’s capabilities applies here, and I’ll refer to that article as needed throughout this review.
Once you have your prerequisite reading finished, allow me to dispel some notions you may have about the Radeon 9500 Pro. A few weeks back, ATI announced the 9500 Pro to the world and gave out test cards to select media outlets (read: not us; we’re too geeky-wonky with the in-depth stuff). At that time, you may have seen some benchmark scores for the 9500 Pro. Trouble is, those scores aren’t representative of what you’ll get with the finished product. Those early sample cards were simply Radeon 9700 cards underclocked with two of their four memory controllers disabled. As a result, those early test cards had only 64MB of memory and a different board design from the final product, which has 128MB of memory. Just keep that in mind, in case you had some preconceived notions about the 9500 Pro’s performance.
Speaking of board designs, let’s take a look at the production Radeon 9500 Pro card. This puppy is dressed up in ATI red, just like the 9700 Pro, but its memory chips are lined up across the top of the card, and the auxiliary power connector has moved. (Yes, you’ll need to plug an additional power lead into the 9500 Pro to give it enough juice to run. ATI includes a power adapter cable in case you need one, just as with the 9700 Pro.)
The Radeon 9500 Pro is a different card than the 9700 Pro
The usual array of VGA, S-Video, and DVI ports
The 9500 Pro’s memory chips line up in front of the aux power connector
So that’s what she looks like. Here’s what you need to know: the 9500 Pro runs at a 275MHz core clock, and it has a 128-bit DDR memory interface with a 540MHz effective clock rate. Hence the cheapness versus the Radeon 9700 Pro, which can run as much as $399. We’ll discuss the exact implications of the 9500 Pro’s specs in a couple of pages.
Radeon 9500 Pro cards apparently just started rolling off the production line, if our review unit is anything to go by. Have a look at the manufacture date:
Fresh from the oven
Anyhow, let’s see how this brand-new specimen performs.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.
Our test system was configured like so:
|Processor||Pentium 4 2.8GHz|
|Front-side bus||533MHz (133MHz quad-pumped)|
|Motherboard||VIA P4PB 400|
|Chipset drivers||4-in-1 4.43|
|Memory size||512MB (1 DIMM)|
|Memory type||Corsair XMS3200 PC2700 DDR SDRAM|
|Sound||Creative SoundBlaster Live!|
|Storage||Maxtor DiamondMax Plus D740X 7200RPM ATA/133 hard drive|
|OS||Microsoft Windows XP Professional|
|OS updates||Service Pack 1|
The test system’s Windows desktop was set at 1024×768 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.
We used the revision 18.104.22.16800 drivers for the ATI cards and NVIDIA’s Detonator 40.72 WHQL drivers for the NVIDIA cards.
We used the following versions of our test applications:
- MadOnion 3DMark 2001 SE Build 330
- Codecreatures Benchmark Pro
- Comanche 4 demo benchmark
- Quake III Arena v1.31
- Serious Magic texture download benchmark
- Serious Sam SE v1.07
- SPECviewperf 7.0
- VillageMark v1.17
- Unreal Tournament 2003 with 2136 patch
All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
You’ll notice that we’re testing the 9500 Pro against a range of cards. We’ve included a Radeon 9700 Pro card, plus a plain ol’ Radeon 9700. The Radeon 9700 is another new card from ATI, and it’s nothing more than a 9700 Pro that runs at a slightly lower clock speed and sells at a slightly lower price. That card will be interesting to watch, because its 275/540MHz clock rate is identical to the 9500 Pro’s, but it has twice the memory bandwidth thanks to its quad 64-bit memory controllers. The 9500 Pro has only two 64-bit memory controllers, so the contrast will tell us a lot about how memory bandwidth affects performance.
However, the most important matchup here, and the one to which we’ll devote most of our attention, is the Radeon 9500 Pro versus the GeForce4 Ti 4200, because they’re direct competitors. You’ll note that we’re using the new AGP 8X-capable version of the GF4 Ti 4200. (The 9500 Pro supports AGP 8X, too.) This card has 128MB of memory, and it runs at core and memory clock speeds of 250MHz and 512MHz, respectively. This is the card the 9500 Pro has to beat in order to fulfill its mission.
Finally, remember that we’re stuck once again reviewing an R300-based product with DirectX 8-class applications, at best. The 9500 Pro will be most impressive when it can make use of its floating-point color precision and the like, but the software to benchmark performance with FP color just isn’t here yet. We’ll be testing the 9500 Pro as most folks will have to use it for the time being: as a DX8-class chip.
Graphics performance very often comes down to pixel-pushing power at the end of the day, and this kind of power (fill rate) is generally a function of some basics of chip design and clock speed. The table below shows the key capacities and clock rates of the most common new GPUs, so you can see where the 9500 Pro fits in.
|Card||Core clock (MHz)||Pixel pipelines||Peak fill rate (Mpixels/s)||Texture units per pixel pipeline||Peak fill rate (Mtexels/s)||Memory clock (MHz)||Memory bus width (bits)||Peak memory bandwidth (GB/s)|
|GeForce4 MX 440 8X||275||2||550||2||1100||512||128||8.2|
|GeForce4 Ti 4200 8X||250||4||1000||2||2000||512||128||8.2|
|Radeon 9500 Pro||275||8||2200||1||2200||540||128||8.6|
|Radeon 9000 Pro||275||4||1100||1||1100||550||128||8.8|
|GeForce4 Ti 4400||275||4||1100||2||2200||550||128||8.8|
|GeForce4 Ti 4600||
|Radeon 9700 Pro||325||8||2600||1||2600||620||256||19.8|
The 9500 Pro looks a little different from the older chip designs above, because it has eight pixel pipelines capable of laying down only one texture per rendering pass each. By contrast, the GF4 Ti 4200 has a four-pipe design with two texture units per pipe. This important difference gives the 9500 Pro more than twice the pixel fill rate of the Ti 4200 (2200 versus 1000 Mpixels/s). All told, the 9500 Pro has higher pixel and texel fill rates than the Ti 4200, and more memory bandwidth, as well.
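For readers who want to check the table's arithmetic, here is a minimal sketch of how the peak figures fall out of the clock rates and unit counts. The `GpuSpec` class and its method names are made up for illustration; the input numbers come straight from the table, and the bandwidth figure assumes 1 GB = 1000 MB, which is what the table's values imply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuSpec:
    """Hypothetical container for the table's columns (not an ATI or NVIDIA API)."""
    name: str
    core_mhz: int    # core clock in MHz
    pipelines: int   # pixel pipelines
    tex_units: int   # texture units per pixel pipeline
    mem_mhz: int     # effective (DDR) memory clock in MHz
    bus_bits: int    # memory bus width in bits

    def pixel_fill(self) -> int:
        """Peak pixel fill rate in Mpixels/s: one pixel per pipeline per clock."""
        return self.core_mhz * self.pipelines

    def texel_fill(self) -> int:
        """Peak texel fill rate in Mtexels/s."""
        return self.pixel_fill() * self.tex_units

    def bandwidth_gbs(self) -> float:
        """Peak memory bandwidth in GB/s, assuming 1 GB = 1000 MB."""
        return self.mem_mhz * self.bus_bits / 8 / 1000


R9500_PRO = GpuSpec("Radeon 9500 Pro", 275, 8, 1, 540, 128)
TI_4200 = GpuSpec("GeForce4 Ti 4200 8X", 250, 4, 2, 512, 128)
R9700_PRO = GpuSpec("Radeon 9700 Pro", 325, 8, 1, 620, 256)

for card in (R9500_PRO, TI_4200, R9700_PRO):
    print(f"{card.name}: {card.pixel_fill()} Mpixels/s, "
          f"{card.texel_fill()} Mtexels/s, {card.bandwidth_gbs():.1f} GB/s")
```

These are theoretical peaks only; the synthetic tests show how close each card gets in practice.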
Here’s how the numbers translate into performance in 3DMark’s synthetic fill rate tests.
The 9500 Pro beats the Ti 4200 by a fairly small margin in both pixel and texel fill rate tests. Notice how much slower the 9500 Pro is than the Radeon 9700 in the single-textured test. No doubt the 9500 Pro’s lesser memory bandwidth is holding it back here. However, the 9500 Pro ties with the identically clocked Radeon 9700 in the multitexturing test.
Here are our results for 1280×1024 resolution compared to the chips’ theoretical peak fill rates.
The 9500 Pro proves to be very efficient with multi-texturing, as are all of the cards.
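By "efficient," I mean the measured fill rate expressed as a fraction of the theoretical peak. A rough sketch of that calculation follows. The function name is my own, and the measured value is a placeholder chosen purely for illustration, not a score from our tests.

```python
def fill_rate_efficiency(measured: float, peak: float) -> float:
    """Return measured throughput as a fraction of the theoretical peak."""
    if peak <= 0:
        raise ValueError("peak fill rate must be positive")
    return measured / peak


# Theoretical peak for the 9500 Pro: 275 MHz x 8 pipelines x 1 texture unit.
peak_mtexels = 275 * 8 * 1

# Placeholder measurement for illustration only; NOT a result from this review.
measured_mtexels = 1980.0

print(f"{fill_rate_efficiency(measured_mtexels, peak_mtexels):.0%} of peak")
```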
This next test will show us how efficient the 9500 Pro is at discarding pixels that won’t be visible once the final 3D scene is rendered. ATI’s HyperZ suite of technologies aims to eliminate occluded pixels, so valuable processing time isn’t wasted. Our occlusion detection test, VillageMark, is a torture test for this sort of thing.
Here you can see the R300’s HyperZ tech in action. ATI has included a feature called EarlyZ in the R300 chip that basically eliminates occluded pixels, and the payoff is obvious. Even the faster (clock speed-wise) and more expensive GF4 Ti 4600 can’t keep pace with the 9500 Pro here. Newer technology wins out over brute force.
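To illustrate why testing depth before shading pays off, here is a toy model. It is a sketch of the general early-Z idea, not a description of how ATI's hardware implements it. The `render` function and the one-dimensional "screen" are inventions for this example.

```python
def render(fragments, width, early_z):
    """Draw fragments into a 1-D depth buffer and return how many were shaded.

    fragments: (x, depth) pairs in submission order; a smaller depth is nearer.
    """
    depth = [float("inf")] * width
    shaded = 0
    for x, z in fragments:
        if early_z:
            # Depth test first: occluded fragments are discarded unshaded.
            if z >= depth[x]:
                continue
            shaded += 1
            depth[x] = z
        else:
            # Shade first, then depth-test the result (late Z).
            shaded += 1
            if z < depth[x]:
                depth[x] = z
    return shaded


WIDTH = 4
# A near wall drawn first, followed by two farther layers hidden behind it.
scene = ([(x, 1.0) for x in range(WIDTH)]
         + [(x, 5.0) for x in range(WIDTH)]
         + [(x, 9.0) for x in range(WIDTH)])

print("late Z shaded: ", render(scene, WIDTH, early_z=False))
print("early Z shaded:", render(scene, WIDTH, early_z=True))
```

In this toy scene the savings depend on the near geometry being submitted first; drawn back to front, every fragment would pass the depth test and be shaded either way.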
Pixel shader performance
For the sake of next-generation games and 3D apps, perhaps nothing matters more than pixel shader performance. Once high-level shading languages become the norm for graphics developers, pixel-processing power will be the new performance bottleneck. These pixel shader tests will give us some indication of how these cards will perform when that day comes. We’ll start with 3DMark’s pixel shader tests, and then move to NVIDIA’s own ChameleonMark. Remember that, in every case, we are only using DirectX 8.0/8.1-class pixel shading functions, nothing terribly fancy.
The Radeon 9500 Pro outperforms the GF4 Ti 4200 by 50 to 100% in 3DMark’s pixel shader tests. The more complex Advanced Pixel Shader test widens the gap between the two.
ATI’s next-gen R300 chip simply outclasses the GeForce4 Ti chip in pixel-processing capacity, because the R300 has double the pixel shaders the GF4 Ti does. For next-gen 3D apps, it will be no contest.
Now that we’ve measured pixel fill rates and shading capacity, we’ll make a couple more stops in our synthetic tests before moving on to real games and 3D apps. The tests below measure polygon processing power, both in the newer, more flexible vertex shader units common to all the chips here, and in older, fixed-function transform and lighting units. The Radeon 9500 Pro has four next-gen vertex shader units running in parallel, each with a 128-bit vector processor and a 32-bit scalar processor, so the 9500 Pro ought to be very fast here.
As expected, the Radeon 9500 Pro trounces the GeForce4 Ti 4200 in vertex shader throughput. Now let’s look at fixed-function transform and lighting, which is more widely used in today’s games. Also note, as a rule, that graphics chips with vertex shader units tend to implement T&L as a vertex shader program rather than with a separate, fixed-function hardware T&L unit.
Here again, the 9500 Pro bests the Ti 4200, but the gap is narrower.
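As for what "T&L as a vertex shader program" means in practice, the sketch below shows the sort of per-vertex work involved: a matrix transform plus a single diffuse lighting term. It is plain Python for illustration, not actual shader code, and it makes simplifying assumptions: one directional light, and a normal already expressed in the same space as the light direction. All function names are made up.

```python
import math


def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def vertex_program(position, normal, mvp, light_dir):
    """Transform one vertex and compute a single diffuse lighting term.

    position, normal, light_dir: 3-component lists; light_dir points toward the light.
    mvp: combined model-view-projection matrix (4x4, row-major).
    """
    clip_pos = mat_vec(mvp, position + [1.0])  # the "transform" half
    n = normalize(normal)
    l = normalize(light_dir)
    diffuse = max(0.0, sum(a * b for a, b in zip(n, l)))  # the "lighting" half
    return clip_pos, diffuse


IDENTITY = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
pos, lit = vertex_program([1.0, 2.0, 3.0], [0.0, 0.0, 1.0], IDENTITY, [0.0, 0.0, 2.0])
print(pos, lit)
```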
AGP write performance
For the sake of completeness, I’ll include a round of tests of AGP texture download performance. What we’re talking about here is the ability to move rendered images from a graphics card’s local memory over the AGP bus into main memory. Games don’t generally have a need to transfer data to main memory, but applications like video processing tools and high-quality rendering programs do. Please see my article on this subject if you want to know more.
NVIDIA fixed the AGP texture download problem in its latest driver release, but ATI (and Matrox) have told us they are not currently planning to dedicate resources to address this problem. Until that happens, ATI cards won’t be entirely suitable for real-time digital video editing systems, and they will be limited in their ability to save images rendered in Direct3D-based programs to main memory or to disk. We’ll keep watching to see if the situation changes.
Quake III Arena
Finally, we’re ready for the game tests, where all this theory gets put into practice. We’ll start with Quake III Arena, which is too old to make use of pixel or vertex shaders. We’re testing Q3A with a recorded demo from a CPL match between fatality and Daler. You can grab the demo from our server here, at least until we find out the thing is copyrighted somehow.
The NVIDIA cards reign supreme in this older game, as the 4200 beats the 9500 Pro and the GF4 Ti 4600 even whups up on the Radeon 9700 Pro.
The shader-aware Comanche 4 is another story. For the most part, this game is limited by CPU or system bottlenecks, but when the resolution increases, the ATI cards carve out a lead. The 9500 Pro beats both the Ti 4200 and the Ti 4600 in this DX8 game.
Codecreatures Benchmark Pro
Codecreatures’ cool little demo of its gaming engine software uses pixel shaders to nice effect, and the Radeon 9500 Pro handles this one a little faster than the Ti 4200.
Unreal Tournament 2003
UT 2003 is the first great DirectX 8-class first-person shooter, and it’s a perfect battleground for the cards we’re testing today. We tested with HardOCP’s UT2003 benchmarking utility, at both low- and high-detail settings.
As the UT2003 announcer would say (shortly before I throttle his neck violently): “Ownage!” The 9500 Pro outruns even the GF4 Ti 4600 in UT2003, demonstrating that all our theory about pixel and vertex shading power isn’t just a bunch of talk. The 9500 Pro’s technology advantages make a difference in real-world games.
Serious Sam SE
To keep things even, I used Serious Sam’s “default quality” add-on, so the game engine’s auto-tuning features would be held in check.
Like Quake III Arena, Serious Sam SE is an OpenGL game that doesn’t use pixel or vertex shaders, and it shows. The 9500 Pro is slower than the Ti 4200 at all but the highest resolution. Notice, though, how the gap between the two cards grows smaller as the display resolution goes up. The 9500 Pro’s fill rate advantage shows through here.
Now, let’s look at our ever-funky Serious Sam second-by-second frame rate graphs to see how all these average frame rate numbers get generated.
All the cards perform quite similarly in terms of peaks, valleys, and the like. If you look closely, you can see the shape of the cards’ lines change as they move from being limited by the game engine, poly throughput, or driver execution (as are all cards at 640×480) to being limited by fill rate and memory bandwidth. For instance, the Ti 4200’s line changes dramatically at 1280×1024, and at 1600×1200, the Ti 4600 and Radeon 9500 Pro lines take on a similar shape. The Radeon 9700 and 9700 Pro don’t appear to be too terribly fill-rate limited at 1600×1200.
The 9500 Pro performs well in that great arbiter of forum bragging rights, 3DMark. Of course, we expected this, because we saw the 9500 Pro whup up on the GF4 Ti cards in 3DMark’s synthetic tests. Let’s see what happens in 3DMark’s gaming tests.
The 9500 Pro consistently outpaces the Ti 4200, and it steals a victory over the Ti 4600 in the Lobby test.
SPEC’s viewperf suite tests a different type of 3D app: lots of polygons, not many textures, lots of wireframes, and ample use of OpenGL lighting. The 9500 Pro holds up well, nearly keeping up with its more expensive R300-based siblings.
We’ll bust up our antialiasing performance analysis into two components: edge AA and texture AA. There’s a lot of theory behind the antialiasing capabilities of the GeForce4 Ti and R300 chips. If you’re not familiar with it, go here and read up. You’ll get the basics of the Radeon 9500/9700 cards’ edge antialiasing capabilities, plus screenshots of the R300 chip’s AA in action. Because the 9500 Pro is based on the same chip as the 9700 Pro, its image output, including AA, is identical to the 9700 Pro’s.
To refresh your memory, the R300 implements multisampled antialiasing with a programmable (non-grid) jitter pattern and gamma-corrected blends. The chip can collect two, four, or six samples per pixel. Multisampled AA is efficient because it avoids unnecessary texture reads, and the R300 adds even more efficiency by using a color compression engine between the chip and the color buffer when AA is active.
In other words: looks good, goes fast. Here’s the proof.
You can see how the GeForce4 cards’ lines shoot downward at a much sharper angle than the Radeon cards’ as the number of AA samples increases. By the time we reach 4X AA at 1280×1024, the Radeon 9500 Pro is running much faster than the GeForce4 Ti cards. In fact, the Radeon 9500 Pro at 6X AA is faster than the GeForce4 Ti 4600 at 4X AA.
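A quick aside on the "gamma-corrected blends" mentioned earlier: the idea is to average a pixel's samples in linear light rather than averaging the stored, gamma-encoded values directly. The sketch below shows the difference for a single edge pixel. The gamma value of 2.2 is my assumption for illustration (the value the R300 actually uses isn't stated here), and the function names are made up.

```python
GAMMA = 2.2  # assumed display gamma, for illustration only


def resolve_naive(samples):
    """Average the stored (gamma-encoded) sample values directly."""
    return sum(samples) / len(samples)


def resolve_gamma_corrected(samples, gamma=GAMMA):
    """Convert samples to linear light, average, then re-encode."""
    linear = [s ** gamma for s in samples]
    mean = sum(linear) / len(linear)
    return mean ** (1.0 / gamma)


# One edge pixel with four samples: half covered by black, half by white.
edge = [0.0, 0.0, 1.0, 1.0]

print("naive:          ", resolve_naive(edge))
print("gamma-corrected:", round(resolve_gamma_corrected(edge), 3))
```

The gamma-corrected result comes out brighter than the naive 0.5, and once the display applies its own gamma curve, that brighter stored value is what lands at half intensity on screen.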
Edge antialiasing is all well and good, but I think texture AA has a more dramatic impact on image quality. You can get a dose of texture filtering theory and screenshots here.
I used Quake III to test the cards’ ability to handle the various degrees of anisotropic and trilinear texture filtering of which they’re capable. As with edge AA, the 9500 Pro can handle more samples than the GeForce4 Ti cards. We don’t have results for 16X aniso for the GF4 Ti cards because they max out at 8X.
The trend here is similar to the trend for edge AA. As the filtering strength increases, the Radeon chips perform relatively better.
Doing the DX9 thing
I installed the Radeon 9500 Pro in my Shuttle SB51G cube, which is currently home to an early release candidate of DirectX 9 and ATI’s DX9 demos for the Radeon 9700 Pro. My goal was to find out how well the 9500 Pro could handle the floating-point color modes used in these demos, because I had some concern the 9500 Pro’s limited memory bandwidth would make these high-color modes practically unworkable. To my surprise, the 9500 Pro ran all of the demos with nary a hiccup or slowdown. Some absolutely gorgeous rendering is possible on the 9500 Pro in real time, and it’s hard to believe this kind of power is available for under $200.
In case you missed it, I’ll post again here some of the screenshots I took from ATI’s demos. All of these demos look just like this, and run fluidly, on the Radeon 9500 Pro.
I structured this review so that, I hope, you could get a clear sense of the Radeon 9500 Pro’s capabilities. I won’t belabor the point now. The Radeon 9500 Pro outperforms its chief rival, the GeForce4 Ti 4200, in nearly every important way. Only in older games can the Ti 4200 snag a win or two. The 9500 Pro has significantly more computational power, in terms of both pixel and vertex processing, than either the GeForce4 Ti 4200 or its big brother, the Ti 4600. Its eight-pipeline design gives the 9500 Pro real-world performance advantages, and the whole of the R300 chip is simply more efficient than the GeForce4 Ti GPU. The more advanced features you turn on (shaders, edge antialiasing, texture filtering), the more the Radeon chips jump ahead of the GeForce4s. And the Radeon 9500 Pro has amazing cinematic rendering capabilities waiting to be unlocked when DirectX 9 arrives in earnest.
It’s no contest. NVIDIA’s product line is a generation behind, and with the debut of the Radeon 9500 Pro, there’s little reason left to buy a GeForce4. I would take a Radeon 9500 Pro over a GeForce4 Ti 4200, easy, but don’t stop there. I’d rather have a 9500 Pro than a GeForce4 Ti 4600, too. The Ti 4600’s few advantages in older games aren’t as impressive to me as the 9500 Pro’s merits.
ATI says the Radeon 9500 Pro will jump out of the gate with a $20 rebate on its retail list price of $199. That kind of pricing puts the 9500 Pro nearly on par with the GeForce4 Ti 4200 8X, and I’d expect street prices to be lower than list, especially once ATI’s manufacturing partners like Hercules and Tyan get rolling. At prices like that, the 9500 Pro should become very popular, almost overnight. Also, consider this: with NVIDIA’s high-end GeForce FX slated to arrive no sooner than February, it could be four to six months before NVIDIA has a proper, mid-priced answer to the 9500 Pro. That reality will leave a lot of folks asking: why wait? The 9500 Pro is a compelling reason not to.