On the X800's new features
TR: Tell us about Shader Model 2.0b. This spec was created to expose the new abilities of the Radeon X800 series, right?
Nalasco: Yeah. As you probably know, in DirectX 9 there's different shader models. There's a Shader Model 2 and a Shader Model 3 that are part of DirectX 9. There's also a 2.x kind of generic shader model which allows you to specify through caps bits, one by one, the set of instructions that your particular hardware supports that are in between the 2.0 and 3.0 feature sets. Now the issue is that when you do this, you're going to have a different feature set, potentially, for every piece of hardware. So the 2.a and 2.b are basically shader profiles that you can use when you're compiling HLSL shader code, and those profiles will ensure that you're able to take full advantage of the hardware base that you're compiling for. So 2.b was designed to reflect the hardware capabilities, specifically, of the X800 series.
TR: What are the highlights of 2.b?
Nalasco: The main improvements in 2.b versus 2.0 are increased instruction counts. As I said, we increased it from 160 instructions in the 9800 which supports 2.0 to 1,536 instructions in 2.b. The limits were increased for all different types of instructions. You can have up to 512 arithmetic instructions, 512 texture instructions, and also those 512 arithmetic instructions can be divided up into both scalar and vector instructions, so you can have 512 scalar and 512 vector instructions, whereas in 2.0 it was limited to a total of 96 instructions for the vector and scalar and 32 for the texture instructions.
Another thing that was added was a facing register that allows you to determine which direction a given polygon that you're rendering is facing. And this is useful for things where you want to run a different algorithm on each side of the polygon. Two-sided lighting algorithms are one example of that.
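The two-sided lighting idea can be illustrated with a small sketch. This is plain Python rather than shader code, and it only assumes that the hardware hands the shader a per-pixel front/back flag; the names `two_sided_diffuse` and `front_facing` are hypothetical and do not come from the interview or from any shader specification.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def two_sided_diffuse(normal, light_dir, front_facing):
    """Lambert diffuse term that lights both sides of a polygon.

    normal and light_dir are unit 3-vectors (light_dir points toward the
    light); front_facing stands in for the value a facing register would
    supply for the pixel being shaded.
    """
    if not front_facing:
        # Back face: flip the normal so it points out of the visible side.
        normal = tuple(-c for c in normal)
    return max(0.0, dot(normal, light_dir))


n = (0.0, 0.0, 1.0)
light = (0.0, 0.0, -1.0)  # the light sits behind the front face
print(two_sided_diffuse(n, light, front_facing=True))   # 0.0
print(two_sided_diffuse(n, light, front_facing=False))  # 1.0
```

Without the facing flag, the back side of the polygon would need a second pass or duplicated geometry with reversed normals to get the same result.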
I guess those are the main improvements in 2.b versus 2.0. Oh, sorry, the other was increasing the temp registers. We increased the number of temporary registers we support from 12 to 32: 12 in 2.0 and 32 in 2.0b.
Those three things are all centered around increasing the number of resources that are available to a shader when you're doing your compilation. So when you write your HLSL shader, the likelihood that it will compile successfully is going to be much higher on an X800-class product because you now have more resources that you can use in your shader, and you're less likely to have to break it up into multiple passes. And again, that allows you to simplify your programming.
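The resource figures quoted in this answer can be written down and checked against a shader's needs. What follows is a minimal sketch, not ATI or Microsoft code: the names `LIMITS_2B`, `TEMPS_2_0` and `over_budget` are hypothetical, the example shader is made up, and the numbers are only those stated in the interview rather than values taken from an official specification.

```python
# Pixel shader resource limits for the 2.b profile, as quoted in the interview.
LIMITS_2B = {"vector": 512, "scalar": 512, "texture": 512, "temps": 32}
TEMPS_2_0 = 12  # temporary registers quoted for 2.0


def over_budget(usage, limits):
    """Return the names of resources whose usage exceeds the given limits."""
    return sorted(k for k, limit in limits.items() if usage.get(k, 0) > limit)


# The quoted instruction total is the sum of the three instruction categories.
total_instructions = sum(LIMITS_2B[k] for k in ("vector", "scalar", "texture"))
print(total_instructions)  # 1536

# A hypothetical shader that needs 20 temporary registers.
shader = {"vector": 300, "scalar": 120, "texture": 40, "temps": 20}
print(over_budget(shader, LIMITS_2B))             # []
print(over_budget(shader, {"temps": TEMPS_2_0}))  # ['temps']
```

A shader that lands in the second case is the kind that would otherwise have to be broken up into multiple passes.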
TR: I've read that the F-buffer is only available in OpenGL because it isn't exposed in DirectX 9. Is that correct?
Nalasco: Yeah, that is the case currently, the main reason being that the F-buffer is really designed to work with multi-pass shader algorithms. It's designed to do them very efficiently by only multipassing the pixels in the scene that require multipassing. And also, when it does the multipass, it doesn't require you to write the results out to the framebuffer first. It allows you to multipass completely internally in the pixel shader, which generates higher efficiency.
One issue in DirectX is that the compiler has to be aware of these capabilities, and we don't create the HLSL compiler; it's created by Microsoft. In order to take advantage of the F-buffer, Microsoft would have to create a compiler that's specifically aware of that and would allow us to report the ability to compile infinite length shaders. And their compiler would have to take on the task of breaking shaders up into the required number of passes to fit within our hardware, which does add a fair amount of complexity.
In OpenGL, it's easy for us to address that, because in OpenGL, we basically write the compiler for GLSL, which breaks it down into our hardware-level shading language. That's one of the reasons it's targeted at OpenGL.
The other thing is that the F-buffer is really of most benefit when you're running very long shaders. Shorter shaders shouldn't have to multipass on this hardware. These long shaders generally don't run in real time, and are therefore much more applicable to a workstation or digital content creation type market, where real time is not as big of a concern as just getting a very good quality image.
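The benefit of multipassing only the pixels that need it can be shown with a toy model. This sketch is an assumption-laden illustration, not a description of how the X800 hardware or driver actually works: it treats each pixel as needing some number of instructions, lets one pass execute up to `per_pass_limit` of them, and keeps unfinished pixels in a FIFO instead of writing them to the framebuffer. The function name `run_with_fifo` and the example costs are hypothetical.

```python
from collections import deque


def run_with_fifo(pixel_costs, per_pass_limit):
    """Count passes and pixel-passes when only unfinished pixels are re-run.

    pixel_costs maps a pixel id to the number of shader instructions it
    needs; per_pass_limit is how many instructions one pass can execute.
    """
    queue = deque(pixel_costs.items())  # unfinished work waits in a FIFO
    passes = 0
    pixel_passes = 0
    while queue:
        passes += 1
        for _ in range(len(queue)):
            pixel, remaining = queue.popleft()
            pixel_passes += 1
            remaining -= per_pass_limit
            if remaining > 0:
                # Not finished: only this pixel goes on to the next pass.
                queue.append((pixel, remaining))
    return passes, pixel_passes


costs = {"a": 100, "b": 100, "c": 100, "d": 1200}  # one long-shader pixel
passes, work = run_with_fifo(costs, per_pass_limit=512)
print(passes, work)         # 3 6
print(passes * len(costs))  # 12 pixel-passes if every pass touched every pixel
```

In this model the three short pixels finish in the first pass and only the long one is carried forward, which halves the total work compared with re-running every pixel on every pass.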
TR: So it won't be exposed in Shader Model 2.0b either?
Nalasco: No.
TR: How will 3Dc normal map compression be exposed in DirectX?
Nalasco: 3Dc is basically a new texture format that can be recognized by our hardware. DirectX 9 has a facility for supporting custom texture formats called FourCC codes. You can basically specify one of these codes, and our driver will recognize it and take advantage of our hardware capabilities when you read that instruction.
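A four-character code is just four ASCII characters packed into a 32-bit value, with the first character in the lowest byte. The sketch below shows that packing in Python. The helper name `make_fourcc` is hypothetical, and the use of "ATI2" as the code for 3Dc is an assumption for illustration; the interview does not name the code.

```python
def make_fourcc(code):
    """Pack four ASCII characters into a 32-bit little-endian integer."""
    data = code.encode("ascii")
    if len(data) != 4:
        raise ValueError("a FourCC needs exactly four ASCII characters")
    return int.from_bytes(data, "little")


# "ATI2" is assumed here as the 3Dc format code; it is not stated in the interview.
fmt = make_fourcc("ATI2")
print(hex(fmt))  # 0x32495441
```

An application would pass a value like this wherever the API expects a texture format, and a driver that knows the code maps it to the hardware format.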
So the only thing that is not currently in DirectX 9 is sort of a way to gracefully fall back. We basically want to promote 3Dc as an open standard, which means we have no proprietary kind of specification for this compression standard. What we'd like to see in DirectX 9 is a graceful fallback, so that if you try to use 3Dc textures on hardware that doesn't have explicit support for it, that there's kind of a graceful way to handle that. Currently, if you try to use a 3Dc texture on hardware that doesn't support it, you have an undefined response or an unspecified response. And that's something that makes it a little bit more difficult to handle if you're a game developer. So right now you have to have specific code paths in your game to support 3Dc, although the changes from a non-3Dc code path are very minimal. We are hoping to eventually get a more graceful implementation put into DirectX 9 or a future version of DirectX.
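Until such a fallback exists in the API, the separate code path described above comes down to asking whether the format is supported before using it. A minimal sketch follows, assuming the application has some way to query format support; the predicate `is_supported`, the function name, and the format names "ATI2" and "A8R8G8B8" are all placeholders chosen for illustration, not details given in the interview.

```python
def choose_normal_map_format(is_supported, preferred="ATI2", fallback="A8R8G8B8"):
    """Pick a texture format, falling back when the preferred one is missing.

    is_supported is a caller-supplied predicate standing in for whatever
    capability query the graphics API provides.
    """
    return preferred if is_supported(preferred) else fallback


# Two hypothetical devices, described only by the formats they accept.
newer_card = {"ATI2", "A8R8G8B8"}.__contains__
older_card = {"A8R8G8B8"}.__contains__
print(choose_normal_map_format(newer_card))  # ATI2
print(choose_normal_map_format(older_card))  # A8R8G8B8
```

Making the check explicit avoids the undefined behavior that comes from handing an unsupported format to the hardware.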