Development of the Radeon X800 chip

We've heard that ATI started work on a chip code-named R400, and then decided to change direction to develop the R420. Why the mid-course correction, and what will become of the remains of the R400 project?

Nalasco: When we generate our roadmaps, we're always looking multiple years ahead, and circumstances are obviously going to change over that span of time. If you look at the development cycle for a new architecture, you're talking in the vicinity of a couple of years. One of the things that happened in our case is that we picked up additional design wins and partnerships with Nintendo and Microsoft, and that obviously requires some rethinking of how the resources in the company are allocated. So what you're really seeing is that we had to make sure we could continue delivering the roadmap we had promised for our desktop chips while also meeting these new demands, and we're confident that we're going to be able to do that.

Clearly, we've been able to execute with the X800, and we expect to continue executing on the same kind of schedule, producing a new architecture roughly every year and a product refresh every six months. The codenames are used a little loosely early on in the design stages, but in this case I would attribute the change mostly to a rearrangement of priorities to meet the needs of our business.

ATI chose not to support pixel shader precision above 24 bits per color channel in the X800 series. What were the tradeoffs you faced, and why did you make this choice?

Nalasco: This came up several years ago, when the DirectX 9 specifications were being put together. At the time, there was a lot of research going into how much precision you would ideally need to render the quality of effects we were targeting. Obviously, the higher the precision you shoot for, the more complex the hardware has to be to support it, and you run into diminishing returns. After a great deal of research, the conclusion was that for that generation of products, going back to the Radeon 9700 series, 24-bit precision was going to give us enough headroom, in terms of how we could improve the quality of graphics, to make it a reasonable target.

One thing to consider when you're looking at these things is that it might not sound like a huge difference between 24 bits and 32 bits, but what you're basically talking about is 33% more hardware. You need 33% more transistors, your data paths have to be 33% wider, and your registers have to be 33% wider. To put that into perspective, that's roughly equivalent to adding another four pipelines onto an X800-type chip just to support the additional precision. And then we have to make the choice of, "What kind of visual benefit is that going to provide? What kind of programming benefit is that going to provide?" and weigh that against improving performance.
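The 33% figure follows directly from the ratio of the two precisions. A minimal sketch of the arithmetic, as an illustration rather than ATI's internal analysis, assuming hardware width scales linearly with per-channel precision:

    # Widening per-channel precision from 24 to 32 bits grows datapaths,
    # registers, and ALUs roughly in proportion to the bit width.
    fp24_bits = 24
    fp32_bits = 32

    extra_width = fp32_bits / fp24_bits - 1
    print(f"Extra hardware width: {extra_width:.0%}")  # prints "Extra hardware width: 33%"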

The conclusion that we came to was that, really, what's limiting graphics right now is less the flexibility of the programming interface and more just the raw performance that we're able to achieve when executing shaders. As a good example, on the Radeon 9800 series, we were able to support a maximum instruction length of 160 instructions in the pixel shader. And if you were to write a shader that was 160 instructions long, what you would find was that it was not, in most cases, able to run in real time. You were running at just a few frames per second.
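To see why a maximum-length shader fell short of real time, a rough back-of-envelope model helps. The pipeline count, clock, per-clock throughput, resolution, and overdraw figures below are assumptions for illustration, not quoted specifications:

    # Crude throughput model for a Radeon 9800-class part (assumed figures).
    pipelines = 8              # pixel pipelines
    clock_hz = 380e6           # approximate core clock
    instr_per_pipe_clock = 1   # assume ~1 shader instruction per pipe per clock

    shader_rate = pipelines * clock_hz * instr_per_pipe_clock  # instructions/second

    pixels_per_frame = 1600 * 1200   # assumed render resolution
    overdraw = 2.5                   # assumed average overdraw factor
    shader_length = 160              # instructions per pixel (the 9800's limit)

    instr_per_frame = pixels_per_frame * overdraw * shader_length
    print(f"Estimated frame rate: {shader_rate / instr_per_frame:.1f} fps")  # ~4 fps

Under these assumptions the shader is ALU-bound at a few frames per second, consistent with the point above.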

If you now increase that limit by a factor of five or ten, as we've done in the X800, you haven't really expanded what a programmer can do just by raising the instruction limit, because we're making chips that are targeted at the real-time gaming market.

Increasing the instruction count without increasing performance is an example of something that isn't going to provide a near-term benefit for a gamer. If you instead devote those extra transistors to increasing performance, you can run those 160-instruction shader programs at a much higher speed, so a lot of techniques that previously weren't feasible in real time now become feasible. The same applies to many of the features that have been considered for some of the newer chips, such as dynamic branching and vertex texture fetch. These can all provide benefits in some cases, but not to the same extent that a raw performance boost can. So I guess that's where the design tradeoff came from. Judging by the early reports of how people have welcomed the X800 architecture, we're pretty confident that we made the right decision in this case.
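Applying the same rough model to an X800-class part shows how much extra pipelines and clock speed move the needle. Again, the figures are assumptions for illustration rather than quoted specifications:

    # Raw shader throughput, pipelines * clock (assumed, approximate figures).
    r9800_rate = 8 * 380e6     # Radeon 9800-class part
    x800_rate = 16 * 500e6     # X800-class part

    speedup = x800_rate / r9800_rate
    print(f"Raw shader throughput increase: ~{speedup:.1f}x")  # ~2.6x

Any per-clock efficiency improvements would come on top of that multiplier, which is what turns shaders that used to run at a few frames per second into candidates for real-time use.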