Texturing
We've talked quite a bit about the top portion of that SP cluster, but not much about the lower part. Attached to each group of 16 SPs is a texture address and filtering unit. Each one of these units can handle four texture address operations (basically grabbing a texture to apply to a fragment), for a total of 32 texture address units across the chip. These units run at the G80's core clock speed of 575MHz, not the 1.35GHz of the SPs. The ability to apply 32 textures per clock is formidable, even if shader power is becoming relatively more important. Here's how the math breaks down versus previous top-end graphics cards:
| | Core clock (MHz) | Pixels/clock | Peak fill rate (Mpixels/s) | Textures/clock | Peak fill rate (Mtexels/s) | Effective memory clock (MHz) | Memory bus width (bits) | Peak memory bandwidth (GB/s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GeForce 7900 GTX | 650 | 16 | 10400 | 24 | 15600 | 1600 | 256 | 51.2 |
| Radeon X1950 XTX | 650 | 16 | 10400 | 16 | 10400 | 2000 | 256 | 64.0 |
| GeForce 8800 GTS | 500 | 20 | 10000 | 24 | 12000 | 1600 | 320 | 64.0 |
| GeForce 8800 GTX | 575 | 24 | 13800 | 32 | 18400 | 1800 | 384 | 86.4 |
So in theory, the G80's texturing capabilities are quite strong; its 18.4 Gtexel/s theoretical peak isn't vastly higher than the GeForce 7900 GTX's, but its memory bandwidth advantage over the G71 is pronounced. As for pixel fill rates, both ATI and Nvidia seem to have decided that about 10 Gpixels/s is sufficient for the time being.
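For readers who want to check the arithmetic, here is a minimal sketch of how the peak figures in the table fall out of the clock speeds and unit counts. The function and variable names are my own, chosen for illustration. The inputs come straight from the table, and these are theoretical peaks, not measured results.

```python
# Peak theoretical rates derived from the specifications in the table.
# Tuple layout (an assumption made for this sketch):
# (core MHz, pixels/clock, textures/clock, effective memory MHz, bus width in bits)
CARDS = {
    "GeForce 7900 GTX": (650, 16, 24, 1600, 256),
    "Radeon X1950 XTX": (650, 16, 16, 2000, 256),
    "GeForce 8800 GTS": (500, 20, 24, 1600, 320),
    "GeForce 8800 GTX": (575, 24, 32, 1800, 384),
}


def peak_rates(core_mhz, pixels_per_clock, textures_per_clock, mem_mhz, bus_bits):
    """Return (Mpixels/s, Mtexels/s, GB/s) theoretical peaks."""
    pixel_fill = core_mhz * pixels_per_clock    # MHz x pixels per clock
    texel_fill = core_mhz * textures_per_clock  # MHz x textures per clock
    bandwidth = mem_mhz * bus_bits / 8 / 1000   # bits -> bytes, MB/s -> GB/s
    return pixel_fill, texel_fill, bandwidth


if __name__ == "__main__":
    for name, specs in CARDS.items():
        px, tx, bw = peak_rates(*specs)
        print(f"{name}: {px} Mpixels/s, {tx} Mtexels/s, {bw:.1f} GB/s")
```

Note that the texel rate uses the 575MHz core clock rather than the 1.35GHz SP clock, since that is the clock the texture units run at.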
The G80's texture address and filtering units. Source: NVIDIA.
The G80's texturing abilities are also superior to the G71's in a way that our results above don't show. The G71 uses one of the ALUs in each pixel shader processor to serve as a texture address unit. This sharing arrangement is sometimes very efficient, but it can cause slowdowns in texturing and shader operations, especially when the two are tightly interleaved. The G80's texture address units are decoupled from the stream processors and operate independently, so that texturing can happen freely alongside shader processing, just like, dare I say it, ATI's R580+.
More impressive than the G80's texture addressing capability, though, is its capacity for texture filtering. You'll see eight filtering units and four address units in the diagram above, if you can work out what "TA" and "TF" mean. The G80 has twice the texture filtering capacity per address unit of the G71, so it can do either 2X anisotropic filtering or bilinear filtering of FP16-format textures at full speed, which is to say 32 pixels per clock. (Aniso 2X and FP16 filtering combined happen at 16 pixels per clock.) These units can also filter textures in FP32 format for extremely high precision results. All of this means, of course, that the G80 should be able to produce very nice image quality without compromising performance.
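One way to see where the 32 and 16 pixels-per-clock figures come from is a simplified throughput model. This is my own back-of-the-envelope sketch, not Nvidia's description of the hardware. It assumes eight clusters (32 address units at four per cluster), eight filtering units per cluster, and a hypothetical "filter cost" of one unit-slot per pixel for plain bilinear, two for either 2X aniso or FP16, and four for both combined.

```python
CORE_MHZ = 575        # texture units run at the core clock
CLUSTERS = 8          # assumed: 32 address units / 4 per cluster
TA_PER_CLUSTER = 4    # texture address units per cluster
TF_PER_CLUSTER = 8    # texture filtering units per cluster


def filtered_pixels_per_clock(filter_cost):
    """Pixels per clock under the simplified model described above.

    filter_cost is the assumed number of filtering-unit slots one pixel
    consumes: 1 = bilinear, 2 = 2X aniso or FP16, 4 = both combined.
    """
    address_limit = CLUSTERS * TA_PER_CLUSTER                 # 32 per clock
    filter_limit = (CLUSTERS * TF_PER_CLUSTER) // filter_cost
    return min(address_limit, filter_limit)


if __name__ == "__main__":
    for label, cost in [("bilinear", 1), ("2X aniso or FP16", 2), ("both", 4)]:
        ppc = filtered_pixels_per_clock(cost)
        print(f"{label}: {ppc} pixels/clock, {ppc * CORE_MHZ} Mtexels/s")
```

Under these assumptions, plain bilinear filtering is limited by the address units, while the doubled filtering capacity absorbs the extra cost of 2X aniso or FP16 before throughput drops.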