For more than a year, the GeForce4 MX and GeForce4 Ti 4200 chips have occupied the mainstream and performance segments of NVIDIA’s product catalog, but they’re being put to pasture in favor of new additions to the GeForce FX line. In 2002, the GeForce4 Ti 4200 received glowing accolades from reviewers and held the PC graphics price/performance crown for the vast majority of the year; many of us will be sad to see the product taken out behind the barn. The GeForce4 Ti 4200 will be replaced by NVIDIA’s new GeForce FX 5600 Ultra, which will have a tough act to follow.
The GeForce4 MX, on the other hand, wasn’t favored by many, and its lack of hardware pixel and vertex shaders isn’t likely to be missed by anyone. The GeForce4 MX’s replacements, the NV34-powered GeForce FX 5200 and 5200 Ultra, have rather small shoes to fill, so it should be easy for them to impress.
What are the capabilities and limitations of NVIDIA’s new NV31 and NV34 graphics chips and the new GeForce FX cards they’ll be rolling out on? Do these new products share enough technology with NVIDIA’s high-end NV30-based products to be worthy of the GeForce FX name, or is NVIDIA still keeping the mainstream a generation behind? Read on to find out.
Cinematic computing across the line
In a bold move that lays to waste NVIDIA’s much-criticized “MX” philosophy of introducing new low-end graphics chips a generation behind the rest of its lineup, NVIDIA’s new NV31 and NV34 chips both support Microsoft’s latest DirectX 9 spec and even offer a little extra functionality above and beyond DirectX 9’s official requirements. Here’s a quick rundown of the features shared by NV30, NV31, and NV34.
- Vertex shader 2.0+ – NV30’s support for vertex shader 2.0+ carries over to NV31 and NV34, with all the bells and whistles included. Vertex shader 2.0+ offers some extra functionality over vertex shader 2.0, making the former a little more flexible.
- Pixel shader 2.0+ – NV31 and NV34 also inherit all the features and functionality of NV30’s pixel shader 2.0+, which supports more complex pixel shader programs than even Microsoft requires for DirectX 9. In total, NV31 and NV34 support pixel shader programs a maximum of 1024 instructions in length. Most of ATI’s R300-derived GPUs support pixel shader 2.0, whose maximum program length is only 64 instructions, though ATI’s latest Radeon 9800 and 9800 Pro use an “F-buffer” to support shader programs with a theoretically “infinite” number of instructions. At least for now, ATI’s “F-buffer” will only be available on high-end graphics cards, which means NVIDIA still has the edge on mainstream cards.
Will NV3x’s support for more complex pixel shader programs than even DirectX 9’s requirements go unused? Maybe not. In this .plan update, id Software programmer John Carmack acknowledges that he’s already hit the R300’s limits:
For developers doing forward looking work, there is a different tradeoff — the NV30 runs fragment programs much slower, but it has a huge maximum instruction count. I have bumped into program limits on the R300 already.
Games that don’t venture beyond the DirectX 9 spec won’t make use of the NV3x’s support for longer pixel shader programs, but some developers will probably take advantage of support for extra-long shader programs where available.
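The instruction-count limits above can be made concrete with a quick sketch. The limit figures are the ones cited above; the lookup table and helper function are ours, not any NVIDIA or ATI API:

```python
import math

# Single-pass pixel shader instruction limits cited above. "inf" models
# ATI's F-buffer, which multipasses arbitrarily long programs.
MAX_PS_INSTRUCTIONS = {
    "NV3x (ps 2.0+)": 1024,
    "R300 (ps 2.0)": 64,
    "R350 (F-buffer)": math.inf,
}

def fits_single_pass(program_length, chip):
    """True if a shader of this length fits the chip's single-pass limit."""
    return program_length <= MAX_PS_INSTRUCTIONS[chip]

# A hypothetical 200-instruction "forward looking" shader, as in
# Carmack's scenario, fits NV3x but busts the R300's limit:
for chip in MAX_PS_INSTRUCTIONS:
    print(chip, fits_single_pass(200, chip))
```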
- Pixel shader precision – Like NV30, NV31 and NV34 support a maximum internal floating point precision of 128 bits within their pixel shaders. NV3x can also scale down its pixel shader precision to 16-bits of floating-point color per channel, or 64-bits overall, to yield better performance in situations where 128 bits of internal precision is just too slow.
Unfortunately, it’s hard to compare the NV3x’s pixel shader precision directly with the R300’s. The R300 supports only one level of floating point pixel shader precision, which, at 96 bits, falls between NV3x’s support for 64- and 128-bit modes. Based on the results of early reviews, it looks like NV30’s performance with 128-bit pixel shader precision is a little slow, but the chip can sacrifice precision to improve performance.
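The precision tradeoff is easy to sketch with Python’s struct module, which can round values through IEEE half and single precision; these serve as stand-ins for the 64-bit (4×16-bit) and 128-bit (4×32-bit) shader modes, while R300’s 96-bit mode (4×24-bit floats) has no standard-library equivalent:

```python
import struct

def round_fp16(x):
    """Round through IEEE half precision -- 4 x 16 bits = 64-bit mode."""
    return struct.unpack('e', struct.pack('e', x))[0]

def round_fp32(x):
    """Round through IEEE single precision -- 4 x 32 bits = 128-bit mode."""
    return struct.unpack('f', struct.pack('f', x))[0]

# 100 dependent multiplies, re-rounding after each op as shader
# hardware would between instructions
x16 = x32 = 1.0
for _ in range(100):
    x16 = round_fp16(x16 * 1.001)
    x32 = round_fp32(x32 * 1.001)

print(x32)  # stays close to the exact answer, 1.001**100 ~= 1.10512
print(x16)  # drifts visibly after only 100 operations
```

The half-precision result drifts after just 100 dependent operations, which is exactly why dropping to 64-bit mode is a quality-for-speed trade.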
The above features are all key components of NVIDIA’s CineFX engine, which means that NV31 and NV34 are both prepared for the “Dawn of cinematic computing” that NVIDIA has been pushing since its NV30 launch. This catch phrase, of course, refers to really, really pretty visual effects that should be easy for developers to create using the additional flexibility offered by NV3x’s support for complex shader programs and high color modes. That’s the theory, anyway. NVIDIA will need to equip a large chunk of the market with CineFX-capable products before developers start targeting the technology, but bringing CineFX to the masses is what NV31 and NV34 are all about.
Differentiating features, or the lack thereof
If NV30, NV31, and NV34 share so many key features, how does NVIDIA differentiate between them? First, let’s deal with the easy stuff:
[Table: NV30, NV31, and NV34 compared on lossless color & Z compression, memory interface, transistor count (millions), manufacturing process, and RAMDACs]
Both NV30 and NV31 use lossless color and Z-compression to improve antialiasing performance, but those features have been left off NV34 (likely to reduce the NV34’s transistor count). The lack of color compression will hinder NV34’s antialiasing performance, and the chip won’t support NVIDIA’s new Intellisample antialiasing technology. Losing Z-compression won’t help performance, either, with AA or in general use.
All of NVIDIA’s NV3x chips will have a 128-bit memory interface, but NV31 and NV34 will use DDR-I memory chips. NVIDIA wouldn’t reveal how fast the memory on its various NV31 and NV34 flavors will run, but at the very least we know that cards will have less memory bandwidth than the vanilla GeForce FX 5800. Currently, the fastest DDR-I-equipped consumer graphics cards use DDR-I memory at 650MHz, which offers just over 10GB/s of memory bandwidth on a 128-bit bus; to equal the GeForce FX 5800’s 12.8GB/s of memory bandwidth, NVIDIA would have to use DDR-I memory chips clocked at 800MHz, which is very unlikely.
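The bandwidth arithmetic is simple enough to check. A quick sketch, treating the quoted clocks as effective DDR data rates and 1GB/s as 10^9 bytes per second:

```python
# Peak memory bandwidth: bytes/sec = (bus width in bits / 8) x effective
# (DDR) memory clock. Clock figures are the ones cited in the text.
def bandwidth_gbps(bus_width_bits, effective_clock_mhz):
    """Peak memory bandwidth in GB/s (1 GB = 10^9 bytes)."""
    return bus_width_bits / 8 * effective_clock_mhz * 1e6 / 1e9

print(bandwidth_gbps(128, 650))  # fastest current DDR-I: 10.4 GB/s
print(bandwidth_gbps(128, 800))  # 800MHz needed to hit the 5800's 12.8 GB/s
```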
All of NVIDIA’s NV3x chips will be manufactured by TSMC. Although NV31 will use the same 0.13-micron manufacturing process as NV30, NV34 will use the older, more established 0.15-micron manufacturing process. NVIDIA wouldn’t reveal NV31 or NV34’s final clock speeds. Those speeds have been decided, but they won’t be released until actual reviews hit the web. It doesn’t take much faith to believe that the 0.13-micron NV31 will run at higher clock speeds than the 0.15-micron NV34. Because NVIDIA is guarding the clock speeds of its new chips so closely, it’s almost impossible to speculate on each chip’s performance potential. One wonders why NVIDIA is being so secretive.
There are, however, no secrets when it comes to NV31 and NV34’s integrated RAMDACs. NV34 integrates two 350MHz RAMDACs, while NV31 uses 400MHz RAMDACs. Honestly, NV34’s 350MHz RAMDACs shouldn’t hold many users back. The GeForce4 MX’s 350MHz RAMDACs support 32-bit color in resolutions of 2048×1536 at 60Hz, 1920×1440 at 75Hz, and 1920×1200 at 85Hz. I can think of precious few instances where a relatively low-end NV34-based graphics card would be paired with an ultra high-end monitor capable of resolutions and refresh rates higher than that.
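Those resolution figures square with a 350MHz RAMDAC if you run the pixel-clock math. The ~30% blanking overhead below is our assumption (typical of CRT timings of the era), not an NVIDIA number:

```python
# Required RAMDAC pixel clock: active pixels per second times a blanking
# overhead factor. The 1.30 factor is an assumption, not a spec value.
BLANKING_OVERHEAD = 1.30

def pixel_clock_mhz(width, height, refresh_hz, overhead=BLANKING_OVERHEAD):
    """Approximate RAMDAC pixel clock needed for a display mode, in MHz."""
    return width * height * refresh_hz * overhead / 1e6

# The three modes quoted above all fit comfortably under 350MHz:
for mode in [(2048, 1536, 60), (1920, 1440, 75), (1920, 1200, 85)]:
    clk = pixel_clock_mhz(*mode)
    print(mode, round(clk), "MHz,", "fits" if clk <= 350 else "too fast")
```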
Now that we’ve gone over the easy stuff, it’s probably a good idea to pause and take a deep breath. Things are about to get messy.
Deciphering the pipeline mess
Lately, a bit of a fuss has been made over the internal structure of NV30’s pixel pipelines and how many pixels the chip is capable of laying down in a single clock cycle. NV30’s internal layout is unconventional enough to confuse our trusty graphics chip chart, which only works with more traditional (or at least more clearly defined) graphics chip architectures.
What do we know about NV30 for sure? That it can render four pixels per clock for color+Z rendering, and eight pixels per clock for Z-rendering and stencil, texture, and shader operations. Only newer titles that use features like multi-texturing and shader programs will be able to unlock NV30’s ability to render eight pixels per clock cycle. In fact, even in id’s new Doom game, NV30 will only be rendering eight pixels per clock “most” of the time. That “most” is straight from NVIDIA, too.
If that explains NV30, what about NV31 and NV34? According to NVIDIA, both NV31 and NV34 have four pixel pipelines, each of which has a single texture unit. A 4×1-pipe design makes the chips similar to ATI’s Radeon 9500, but comparing NV31 and NV34 with NV30 is more complicated. You didn’t think you were going to get off easy this time, did you?
Because NVIDIA has explicitly stated that NV31 and NV34 are 4×1-pipe designs, it’s probably safe to assume that there are no situations where either chip can lay down more than four textures in a single clock cycle, and it doesn’t look like there are any situations where NV31 or NV34 can lay down more than four pixels per clock cycle, either.
According to NVIDIA, NV31 will be roughly half as fast as NV30 in situations where NV30 can lay down eight pixels per clock (Z-rendering and stencil, texture, and shader operations). Part of that speed decrease will come from the lack of a second texture unit per pixel pipeline, but NV31 will also be slower because it has “less parallelism” in its programmable shader than NV30. NVIDIA isn’t saying NV31 has half as many shaders as NV30 or that its shader is running at half the speed of NV30’s, just that the shader has “less parallelism.” If NV31’s performance is tied to the amount of parallelism within its shader, a betting man might wager that NV31 achieves “roughly half” the speed of NV30 when dealing with shader operations because NV31’s programmable shader has roughly half the parallelism of NV30’s.
Like NV31, NV34’s pixel pipelines have half as many texture units as NV30’s, and its programmable shader has “roughly half” as much parallelism. NV31 and NV34 have more in common with each other than they do with NV30, but at least partially because of its lack of color and Z compression, NV34 won’t be quite as fast as NV31. According to NVIDIA, NV34’s performance is very similar to NV31’s in situations where NV30 is capable of rendering four pixels per clock, and about 10% slower than NV31 in situations where NV30 would be capable of rendering eight pixels per clock. Those comparative performance estimates refer to non-antialiased scenes; all bets are off when antialiasing is enabled.
Of course, these relative performance claims for NV30, NV31, and NV34 assume that the chips are running at identical clock speeds, which certainly won’t be true for all cards based on the chips and may not even be true for any. Additionally, any manufacturer’s performance claims should be taken with a grain of salt, at least until independent, verifiable benchmarks are published.
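To put the relative claims above in fill-rate terms, here’s a sketch using a hypothetical 500MHz core clock for all three chips. NVIDIA has released no real clock speeds, and raw fill rate ignores the shader-parallelism difference, so treat this as arithmetic, not a benchmark:

```python
CORE_MHZ = 500  # hypothetical, for equal-clock comparison only

# Pixels per clock, per NVIDIA's claims:
# (color+Z rendering, Z/stencil/texture/shader operations)
PIXELS_PER_CLOCK = {
    "NV30": (4, 8),
    "NV31": (4, 4),
    "NV34": (4, 4),
}

for chip, (color_z, z_only) in PIXELS_PER_CLOCK.items():
    print(f"{chip}: {color_z * CORE_MHZ / 1000:.1f} Gpixels/s color+Z, "
          f"{z_only * CORE_MHZ / 1000:.1f} Gpixels/s Z/stencil/shader")
```

At identical clocks, NV31 and NV34 land at half of NV30’s throughput exactly where NV30 would be rendering eight pixels per clock, which matches NVIDIA’s “roughly half” characterization.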
Now that we know about the chips, let’s move on to the cards they’ll be riding on.
NV31: GeForce FX 5600 Ultra
NVIDIA will bring NV31 to market as the GeForce FX 5600 Ultra, a performance-oriented product that will replace NV25/28-based graphics cards at a retail price of under $200. The GeForce FX 5600 Ultra’s suggested $199 price tag lines the card up nicely against ATI’s new Radeon 9600 Pro, which will become available in the same timeframe. Both cards will have four pixel pipelines with a single texture unit per pipe, but whether the GeForce FX 5600 Ultra can match the Radeon 9600 Pro’s 400MHz core clock speed remains to be seen.
What might GeForce FX 5600 Ultras look like? Let’s have a peek.
As you can see, the GeForce FX 5600 Ultra doesn’t need a Dustbuster.
And there was much rejoicing.
Seriously, listen. You can actually hear the rejoicing because there isn’t a vacuum whining in the background. The GeForce FX 5600 Ultra’s reference heat sink looks like something one might find on a GeForce4 Ti 4600, and won’t eat up any adjacent PCI slots. Still, it’s usually a good idea to leave the closest PCI slot to an AGP card empty to improve air flow.
From the picture above, we can see that NVIDIA has gone with those nifty BGA memory chips on its GeForce FX 5600 Ultra reference card, which also uses a standard output port config. Thankfully, because NV31 integrates dual RAMDACs and dual TMDS transmitters, there’s nothing stopping manufacturers from building GeForce FX 5600 Ultras with dual DVI or even dual VGA ports.
NVIDIA wouldn’t confirm how many PCB layers are required by the GeForce FX 5600 Ultra, but the company did say that the boards would have fewer layers than the 12-layer GeForce FX 5800 Ultra. According to NVIDIA, plenty of different manufacturers are building their own GeForce FX 5600 Ultra boards; we might even see a few manufacturers stray from NVIDIA’s reference design with unique GeForce FX 5600 Ultra-based products.
Although NVIDIA makes no official mention of a non-Ultra GeForce FX 5600, I wouldn’t be surprised to see a few popping up in OEM systems. According to NVIDIA, “Ultra” is what sells on retail shelves, so manufacturers won’t be putting much retail emphasis on non-Ultra products. However, I doubt that means that shelves won’t be populated with non-Ultra GeForce FX 5800s since it appears that few GeForce FX 5800 Ultras will be available at all.
NV34: GeForce FX 5200 and 5200 Ultra
Unlike NV31, which only comes in one flavor (at least for now), NV34 will see action in the GeForce FX 5200 and GeForce FX 5200 Ultra, which will be priced at $99 and $149, respectively. The GeForce FX 5200 cards are aimed at mainstream markets, and will all but purge the GeForce4 from NVIDIA’s line. Unfortunately, the GeForce4 MX isn’t quite dead yet. The much-maligned MX will hang out in the sub-$80 price range as a value part, extending the incredible legacy of NVIDIA’s GeForce2 architecture.
Depending on which GeForce FX 5200 we look at, ATI’s Radeon 9600 or 9200 will be the competition. Unfortunately, without clock speeds or samples of the GeForce FX 5200 and GeForce FX 5200 Ultra, it’s hard to speculate about the cards’ performance relative to ATI’s mainstream offerings. At the very least, the Radeon 9600’s antialiasing performance should easily eclipse that of the GeForce FX 5200 Ultra because of the latter’s lack of color and Z-compression.
NVIDIA’s NV31 and NV34 press materials include pictures of the GeForce FX 5200 and 5200 Ultra, but the GeForce FX 5200 Ultra’s picture is identical to that of the GeForce FX 5600 Ultra. Since we’ve already checked out that glossy, let’s have a look at the vanilla GeForce FX 5200.
How does a passively-cooled, DirectX 9-compatible graphics card for under $100 sound? Deadly, at least for the GeForce4 MX. Who wouldn’t spend an extra $20 for a couple of extra pixel pipelines and a side order of DirectX 9?
If the GeForce FX 5600 Ultra’s board layout is identical to the GeForce FX 5200 Ultra’s, then the latter should be available with BGA memory chips. It looks like the vanilla GeForce FX 5200 will use older TSOP memory chips; it won’t require any extra power, so it doesn’t have an on-board power connector.
As with its GeForce FX 5600 Ultra cards, NVIDIA isn’t revealing the board layer requirements of the GeForce FX 5200 other than to say that the boards will require fewer layers than the 12-layer GeForce FX 5800 Ultra. I’d expect the same manufacturers that will be building their own GeForce FX 5600 Ultra cards will also be making their own GeForce FX 5200s.
Selling features ahead of performance
Because NVIDIA hasn’t announced the clock speeds of its NV31- and NV34-based graphics products, it’s really difficult to pass any kind of judgment on what each card’s performance might look like. NVIDIA has released some no doubt carefully massaged benchmarks comparing its new GeForce FX chips with a handful of older GeForce4s and ATI’s recently replaced Radeons, but a manufacturer’s own benchmarks should always be taken with a healthy dose of salt. The fact that NVIDIA is being so secretive about so many crucial details, from core and memory clock speeds to the internal structure of the shaders, is troubling. At least they’re being up front about the number of pixel pipelines and texture units in each chip.
With few details about the internal structure of NVIDIA’s new performance and mainstream chips and no hint of clock speeds, analyzing NV31 and NV34 is tough. There’s no new technology in either chip, but both extend NVIDIA’s support for DirectX 9 features all the way down the line to even the $99 GeForce FX 5200. That even a budget $99 graphics card will feature support for vertex and pixel shader 2.0+ and full 128-bit internal floating point precision is impressive, and those features alone should give marketing managers plenty of options when it comes to dazzling consumers.
But will these new GeForce FX cards sell on their DirectX 9 support alone? That depends.
Against ATI’s Radeon 9200s, the GeForce FX 5200 should do well, and rightly so. Consumers looking for a graphics card at that price point will more than likely be swayed by the GeForce FX 5200’s generation lead over the Radeon 9200 when it comes to DirectX support. Those consumers also aren’t likely to be bothered by the GeForce FX 5200’s potentially poor antialiasing performance, since few are likely to enable that feature anyway. I am, however, concerned about the GeForce FX 5200 line’s performance in real DirectX 9 titles, especially when using full 128-bit floating point precision. NV34 may have full hardware support for DirectX 9, but supporting DirectX 9 applications and running them with acceptable frame rates are two very different things.
Ideally, the higher clock speeds facilitated by the GeForce FX 5600 Ultra’s 0.13-micron core should yield performance comparable with ATI’s mid-range Radeon 9600 Pro, but it’s hard to say for sure. At least in terms of their support for DirectX 9 features, the Radeon 9600 Pro and GeForce FX 5600 Ultra are on roughly even ground. NVIDIA will have the edge when it comes to supporting longer program lengths in its pixel and vertex shaders. Unfortunately, NV31’s shaders seem to be half as powerful as NV30’s, which isn’t encouraging. Based on what NVIDIA has revealed about the GeForce FX 5600 Ultra, it looks like the card could be significantly slower than the GeForce FX 5800, especially in next-generation titles like the new Doom.
In the end, NVIDIA should be commended for bringing NV30’s CineFX engine down to a price point that everyone can afford; it’s cinematic computing for the masses. However, the fact that NVIDIA is being so tight-lipped about the clock speeds of its new cards, even though those clock speeds have apparently been finalized, sets off all sorts of alarms in my head. NVIDIA’s carefully-picked selection of marketing-tuned benchmarks isn’t enough to allay my concerns, and I have more questions about these products than NVIDIA has straight answers. Color me a skeptic, but it doesn’t look like NVIDIA’s NV31- and NV34-based products will dominate the mainstream marketplace. We’ll have to wait and see.