This question came up in the late stages of writing my Radeon X1000 series review, and I just got confirmation from ATI yesterday. Turns out that the vertex shaders in the Radeon X1000 series GPUs don’t support a notable Shader Model 3.0 feature: vertex texture fetch. As it sounds, this capability allows the vertex shaders to read from texture memory, which is important because texture memory is sometimes treated as general storage in programmable GPUs. Vertex texture fetch is useful for techniques like displacement mapping, where the vertex and pixel shaders need to share data with one another.
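To make the displacement-mapping use case concrete, here is a small CPU-side sketch of what vertex texture fetch lets a vertex shader do. This is not real shader or driver code; the function names and the tiny height map are invented for illustration. The "texture" is a height map, and the "vertex shader" fetches a height per vertex and pushes the vertex along its normal.

```python
# Illustrative CPU model of displacement mapping via vertex texture fetch.
# All names here are invented; no real GPU API is shown.

def fetch_texel(height_map, u, v):
    """Nearest-neighbor texture fetch, as a vertex shader might do it."""
    h, w = len(height_map), len(height_map[0])
    x = min(int(u * w), w - 1)
    y = min(int(v * h), h - 1)
    return height_map[y][x]

def displace(vertices, height_map, scale=1.0):
    """Displace each vertex along its normal by the sampled height."""
    out = []
    for (px, py, pz), (nx, ny, nz), (u, v) in vertices:
        d = fetch_texel(height_map, u, v) * scale
        out.append((px + nx * d, py + ny * d, pz + nz * d))
    return out

# A flat 2x2 patch facing +z; the height map raises only one corner.
height_map = [[0.0, 0.0],
              [0.0, 1.0]]
vertices = [
    ((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0)),
    ((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.9, 0.0)),
    ((0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.9)),
    ((1.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.9, 0.9)),
]
displaced = displace(vertices, height_map)
# Only the (1,1) corner is pushed along +z.
```

The key point is the per-vertex texture read inside the vertex stage; without vertex texture fetch, that read has to happen somewhere else in the pipeline.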
I asked ATI’s David Nalasco about this issue, and he suggested a possible workaround for this limitation:
No, vertex texture fetch is not supported. However, since the X1000
family does all pixel shader calculations with FP32 precision, just like
the vertex shader, it is possible to get the same results using the
render to vertex buffer capability. Basically, you do a quick pre-pass
where you render to a special linear buffer in which each pixel
represents a vertex. Textures can be used to modify each vertex through
the pixel shader, and the result is then read back into the vertex
shader. The result is fast vertex texturing with full filtering
support, without requiring any special hardware in the vertex shader
engine.

Note that render to vertex buffer is possible in R4xx as well, but is limited to FP24 which could cause precision issues in some cases.
Such a workaround would likely involve a performance penalty, but I doubt it would be a major hit. The larger issue is probably just the fact that the workaround would require special consideration from developers, because the GPUs lack a straightforward vertex texture fetch capability.
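The data flow Nalasco describes can be sketched on the CPU. The following is only a model of the two-pass structure, with invented function names, not real Direct3D code: pass one "renders" to a linear buffer in which each pixel holds one vertex, with the texture fetch done by the pixel shader; pass two binds that buffer as a vertex stream, so the vertex shader itself never touches a texture.

```python
# CPU model of the render-to-vertex-buffer (R2VB) workaround.
# All names are illustrative; no real GPU API is shown.

def r2vb_prepass(vertices, height_map, scale=1.0):
    """Pass 1: one 'pixel' per vertex. The pixel shader samples the
    height map and writes the displaced position into a linear buffer."""
    buf = []
    for (px, py, pz), (nx, ny, nz), (u, v) in vertices:
        h, w = len(height_map), len(height_map[0])
        d = height_map[min(int(v * h), h - 1)][min(int(u * w), w - 1)] * scale
        buf.append((px + nx * d, py + ny * d, pz + nz * d))
    return buf  # this linear buffer is then bound as a vertex stream

def vertex_pass(stream):
    """Pass 2: the vertex shader just consumes precomputed positions."""
    return [pos for pos in stream]

height_map = [[0.0, 0.0],
              [0.0, 1.0]]
quad = [
    ((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0)),
    ((1.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.9, 0.9)),
]
positions = vertex_pass(r2vb_prepass(quad, height_map))
```

The end result matches what a direct vertex texture fetch would produce; the cost is the extra pre-pass and the developer effort of restructuring the code around it.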
Because it’s easier for the apes to sling shit than talk.
I hear ya and agree. Your Intel HT and dual-core analogy only applies to games. HT and dual core are not specifically for games, nor were they touted as game enhancers.
HT and dual core both work and do not need threaded apps for you to feel the performance enhancement. That's a different topic; rest assured, both HT and dual core work right out of the box on all apps. When you run more than one app, there is more than one thread. Sure, if a single app is written for dual threads, great. But HT and dual core really help multitasking in all situations.
Only people that own dual cores seem to know this.
Before everyone goes crazy and blows this out of proportion, like certain other websites are doing… I suggest people visit Beyond3D where this has been discussed and analysed fully.
ATI meets the SM3.0 specifications. The specs do not require this feature. ATI even provides an alternative hardware implementation of obtaining the identical results.
As it stands, Nvidia’s implementation is broken… well, not broken, but so slow that no developer would use it. ATI’s implementation does not suffer the same performance penalties as Nvidia’s.
§[<http://www.beyond3d.com/reviews/ati/r520/index.php?p=02<]§ §[<http://www.beyond3d.com/forum/showpost.php?p=588805&postcount=23<]§ [This has been posted in this thread already, but it seems everyone has ignored those posts]
I understand your idealism, but does it really matter when 95% of the people out there only care about the who-has-the-highest-FPS game.
Using threading effectively requires a very deep rewrite (or better, fresh coding from the ground up), increases testing demands considerably, and adds a lot of time to the schedule. And even with hyperthreading and now dual-core, the hardware base is relatively low; moreover, the resulting code will generally run slightly slower on single-core/non-HT CPUs. And the results aren’t going to show up all that clearly in reviews.
On the other hand, adding eye-candy is relatively cheap incremental work and it shows up (sometimes spectacularly) in review screen shots. Plus the developers get a lot of support (and sometimes even marketing money) from the GPU companies to implement it.
So yes, game companies are very likely to implement features from DirectX 9.
Flashy eye-candy and good game-play are not mutually exclusive, even though that sometimes seems the case.
You probably do, alas, unless ATI can get the driver to implement the conversion automatically. See the second-to-last paragraph here:
§[<http://www.beyond3d.com/reviews/ati/r520/index.php?p=02<]§ Of course, hardware-specific code-paths are nothing new to game developers, but whenever they encounter one they have to do a cost-benefit analysis; in this case, they might just blow off the feature altogether (especially if it isn't particularly performant in either nVidia's or ATI's implementation).
I usually play eye-candy games once through on easy (if at all), then return it to the store. As a video card, sure it should support the greatest technology available. But does it really matter if all it means is you aren’t going to get some tertiary, optional and rarely implemented functionality in a game which obviously passes the burden of its worth onto “pretty trees” and “shiny metal.”
I’ve been hearing about all of these great new video technologies that manifest themselves as ultimately throw-away features. Do you really think video game designers are going to bend over for this feature? How long has the P4 with HT existed? And almost no games are threaded today. Even with dual-core, it’s still not happening. The only instances where we’ll see this feature is on cookie-cutter (probably FPS) games that are forced into eye-candy to differentiate their stagnant gameplay from other incremental video games.
Honestly, people, do you really even care about playing that game?
I hope ATI can pull out of this rut. I’m sure their engineers are beating their heads against the wall for all the hard work they did for such a lukewarm launch.
There just had to be something…
And if all I use is HLSL instead of low level shader instructions, do I really need to care how it is implemented?
Actually, it does meet the specification. The alternative method is a hardware solution as well.
But why have it supported in hardware if it’s going to run just as slow as passing it through the pixel shader (i.e. nVidia’s implementation)? Just because they support every SM3.0 spec, it doesn’t mean it’s going to run well.
I’m sure ATI did it intentionally. However, I think hardware developers need to learn that for most purchasers “perception is the better part of reality”. Ergo, whether or not it affects the performance of the card, because the card does not meet the SM 3.0 spec in hardware, they will lose sales to techies who believe this is important. I think it’s a bad move, knowing that hardware junkies want this kind of support on the silicon, not in the software.
ATI has implemented enough so that they technically meet the spec without making the feature actually usable. More explanation here: §[<http://www.beyond3d.com/reviews/ati/r520/index.php?p=02<]§
Ha! I was right about the ATI and unified shaders architecture.
I have a feeling that ATI did this intentionally rather than actually forgetting to implement it. This is the first step to developing software for a unified shader architecture, showing how a pixel shader can pretty much do the same functions as a vertex shader. However, in this case the pixel shader is just being used as a piggyback for the vertex shader, but with filtering support.
Nvidia’s implementation is supposed to be extremely slow, so slow that few developers have bothered with it.
§[<http://www.beyond3d.com/forum/showpost.php?p=588805&postcount=23<]§
This is kind of a big deal, isn’t it?
Anyone?
Is vertex texture fetch an optional feature of Shader Model 3.0?
Or does this mean that the X1000 series doesn’t support all of SM 3.0?
Edit: follow-up after some Google research.
Yes, vertex texture fetch is a non-optional part of Shader Model 3. However, even if the hardware cannot do the texturing itself, the drivers can still do it in software to be SM3 compliant. That’d be dog slow, but it’d allow you to support SM3. I don’t know whether the ATI drivers do this, or whether the X1000 just fails when vertex texture SM3 code is run.
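For what that software fallback would look like, here's a rough sketch of the idea, with entirely invented names and no real driver code: if the vertex shader can't fetch textures, the driver could sample the texture on the CPU for every vertex before each draw call and feed the results in as a plain per-vertex attribute. Running that loop every frame, for every vertex, is exactly why it would be dog slow.

```python
# Hypothetical sketch of a driver-side software fallback for vertex
# texture fetch. All names are invented for illustration only.

def cpu_fallback_draw(vertex_uvs, texture, draw):
    """Pre-sample the texture on the CPU, then hand the sampled values
    to the draw call as an extra vertex attribute."""
    sampled = []
    for (u, v) in vertex_uvs:
        h, w = len(texture), len(texture[0])
        sampled.append(texture[min(int(v * h), h - 1)][min(int(u * w), w - 1)])
    # The vertex shader now reads 'sampled' as a plain attribute instead
    # of fetching from the texture itself. Repeating this CPU loop every
    # frame for every vertex is the performance problem.
    return draw(sampled)

tex = [[0.0, 0.5],
       [0.25, 1.0]]
result = cpu_fallback_draw([(0.0, 0.0), (0.9, 0.9)], tex, lambda s: s)
```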
One useful link:
§[<http://msdn.microsoft.com/archive/default.asp?url=/archive/en-us/directx9_c_Summer_04/directx/graphics/ProgrammingGuide/ProgrammablePipeline/HLSL/ShaderModel3/ShaderModel3.asp<]§
I imagine that this limitation will become more apparent as more SM3.0 games hit the market.