Le chips
The RV630 GPU will power cards in the Radeon HD 2600 line. AMD has scaled down this mid-range chip in a number of dimensions, as the handy block diagram below helps illustrate.

Logical block diagram of the RV630 GPU. Source: AMD.

The original R600 has a total of four SIMD units, each of which has 16 execution units in it. Each execution unit contains five ALUs, for a total of 320 stream processors. As you can see in the middle of the diagram above, the RV630 has three SIMD units, and each of those has only eight execution units onboard. That adds up to 120 stream processors, which is still quite a few and vastly more than the 32 SPs in the competing GeForce 8600. (For what it's worth, the smaller number of units per SIMD should improve the RV630's efficiency when executing shaders with dynamic branches, since the chip's basic branch granularity is determined by the width of the SIMD engine.)
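For those who like to check the math, here's a minimal sketch of how those stream processor totals fall out. It assumes five ALUs per execution unit, which is what makes the quoted totals work; the function and constant names are ours, purely for illustration.

```python
# Assumption: each execution unit in an R600-family SIMD holds five ALUs,
# and each ALU is counted as one "stream processor."
ALUS_PER_EXECUTION_UNIT = 5


def stream_processors(simd_units: int, units_per_simd: int) -> int:
    """Return the stream processor count for a given SIMD layout."""
    return simd_units * units_per_simd * ALUS_PER_EXECUTION_UNIT


# SIMD layouts as described in the text.
r600_sps = stream_processors(simd_units=4, units_per_simd=16)
rv630_sps = stream_processors(simd_units=3, units_per_simd=8)

print(f"R600:  {r600_sps} stream processors")
print(f"RV630: {rv630_sps} stream processors")
```

The same function covers any other chip in the family, provided the five-ALU assumption holds for it.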

AMD has also scaled down the RV630's texturing and pixel output capabilities by reducing the number of texture processing units to two and leaving only a single render back-end. As a result, the RV630 can filter eight texels and output four pixels per clock. That's a little weak compared to the competition; the GeForce 8600 has essentially double the per-clock texturing and render back-end capacity of the RV630.
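The per-clock figures above can be sketched the same way. The per-unit rates here (four texels per texture unit, four pixels per render back-end) are inferred from the totals in the text rather than quoted from AMD, and the GeForce 8600 numbers simply apply the "essentially double" estimate, so treat them as approximate.

```python
# Assumed per-unit rates, inferred from the article's totals:
# 2 texture units -> 8 texels/clock, 1 render back-end -> 4 pixels/clock.
TEXELS_PER_TEXTURE_UNIT = 4
PIXELS_PER_RENDER_BACKEND = 4


def per_clock_rates(texture_units: int, render_backends: int) -> tuple[int, int]:
    """Return (filtered texels per clock, pixels written per clock)."""
    texels = texture_units * TEXELS_PER_TEXTURE_UNIT
    pixels = render_backends * PIXELS_PER_RENDER_BACKEND
    return texels, pixels


rv630_texels, rv630_pixels = per_clock_rates(texture_units=2, render_backends=1)

# Rough estimate only: the text says the GeForce 8600 has "essentially double"
# the RV630's per-clock capacity.
geforce_8600_texels = 2 * rv630_texels
geforce_8600_pixels = 2 * rv630_pixels

print(f"RV630:        {rv630_texels} texels/clock, {rv630_pixels} pixels/clock")
print(f"GeForce 8600: ~{geforce_8600_texels} texels/clock, ~{geforce_8600_pixels} pixels/clock")
```

Keep in mind these are per-clock numbers; actual fill rates also depend on clock speeds, which aren't covered here.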

The RV630 retains the R600's basic cache structure, with separate L1 caches for textures and vertices, plus an L2 texture cache, but it halves the size of the L2 texture cache to 128KB. At 128 bits, the RV630's path to memory is only a quarter the width of the R600's, but it's comparable to competing GPUs in this class.

Thanks to this crash diet, the RV630 is made up of an estimated 390 million transistors, down precipitously from the 700 million transistors packed into the R600. That still makes the RV630 a heavyweight among mid-range GPUs. The G84 GPU in the GeForce 8600 series is an estimated 289 million transistors and is manufactured on TSMC's 80nm process. We've measured it at roughly 169 mm².

The RV630 GPU.

Although TSMC manufactures the RV630 on a smaller 65nm fab process, we measured it at about 155 mm². (If you'd like to do a quick visual size comparison, we have a picture of the G84 in our GeForce 8600 review. All of our reference coins are approximately the same size as a U.S. quarter.)
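To put those die measurements in perspective, here's a back-of-the-envelope sketch of transistor density for the two chips, using only the estimated transistor counts and measured areas quoted above. The "ideal" scaling ratio is our own assumption for comparison (area shrinking with the square of the process node); real chips rarely scale that cleanly.

```python
# Figures quoted in the text: estimated transistor counts (millions),
# measured die areas (mm^2), and fab process nodes (nm).
chips = {
    "RV630": {"transistors_m": 390, "area_mm2": 155, "process_nm": 65},
    "G84": {"transistors_m": 289, "area_mm2": 169, "process_nm": 80},
}


def density(chip: dict) -> float:
    """Millions of transistors per square millimeter."""
    return chip["transistors_m"] / chip["area_mm2"]


rv630_density = density(chips["RV630"])
g84_density = density(chips["G84"])
density_ratio = rv630_density / g84_density

# Idealized area scaling between the two nodes. This is an assumption for
# comparison only; node names are nominal and layouts differ.
ideal_ratio = (chips["G84"]["process_nm"] / chips["RV630"]["process_nm"]) ** 2

print(f"RV630: {rv630_density:.2f}M transistors/mm^2")
print(f"G84:   {g84_density:.2f}M transistors/mm^2")
print(f"Density ratio: {density_ratio:.2f}x (ideal 80nm -> 65nm shrink: {ideal_ratio:.2f}x)")
```

By these rough numbers, the RV630 packs in about 2.5 million transistors per mm² to the G84's 1.7 million or so, which is in the neighborhood of what the process shrink alone would suggest.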

The RV630's partner in crime in the Radeon HD 2400 series is a featherweight, though.

Logical block diagram of the RV610 GPU. Source: AMD.

In order to bring it down to its diminutive size, AMD's engineers chopped the RV610 to two shader SIMDs with just four execution units each, or 40 SPs in all. They left only one texture unit and one render back-end, so it can filter four texels and write out four pixels per clock. They also replaced the R600's more complex vertex and texture cache hierarchy with a unified vertex/texture cache, and they reduced the memory path to 64 bits.

The result is a GPU whose 180 million transistors fit into a space only 7 mm by 10 mm, or 70 mm², when manufactured at 65nm. Nvidia's G86 GPU on competing GeForce 8300, 8400, and 8500 cards is larger in every measure, with 210 million transistors packed into a 132 mm² area via an 80nm process. Here's a quick visual comparison of the two. Sorry about the goo on the Radeon chips; it's really hard to clean that stuff off, even with engine cleaner.

The G86 GPU.

The RV610 GPU.

The RV610 is smaller than the active portion of Sean Penn's brain, yet it has a full DirectX 10 feature set. Well, almost full: the Radeon HD 2400 series' multisampled antialiasing tops out at four samples, though it can add more samples using custom tent filters that grab samples from neighboring pixels. Given the excellent image quality and minimal performance penalty we've seen from tent filters in the Radeon HD 2900 XT, that's no great handicap.