TR interviews NVIDIA’s Tony Tamasi

AS NVIDIA’S VICE PRESIDENT OF Technical Marketing, Tony Tamasi is more than just a marketing guy. He listens to customer feedback, helps define product specs, and then once the products are ready to roll, he acts as NVIDIA’s liaison and chief technical evangelist to the rest of the world. As you might imagine, he’s had his hands full with the launch of the GeForce 6 series of GPUs, but he was kind enough to take time and answer some of our questions about NVIDIA’s impressive new graphics processors.

Dates and details on the GeForce 6800 family

When will the GeForce 6800 Ultra arrive in stores?

Tamasi: By Memorial Day the 6800 Ultra will be available, and by July 4th, the full line of the 6800 series will be broadly available.

On the non-Ultra, how much memory will it have?

Tamasi: The $299 card?

Yeah.

Tamasi: That’s actually up to the add-in card guys. There will be versions, I suspect, with 128 and 256MB, but that’s more up to the add-in card guys than us, really.

Will that card have a 256-bit path to memory?

Tamasi: Yes it will.

Will it be DDR, DDR2, or DDR3 memory?

Tamasi: DDR1.

That combination of specs sounds like a tall order at $299. Can you guys make money selling it at that price?

Tamasi: If we couldn’t, we wouldn’t. [Laughter.]

We’ve heard that the GeForce 6800 Ultra GPU is 222 million transistors. How do you guys count transistors? Do you count all SRAM/cache, etc?

Tamasi: The only way we really know how to give an accurate transistor count is to count up all transistors on the chip, and that’s everything. So that number includes caches, FIFOs, register files. It’s all transistors. It’s not just logic transistors.

Are you willing to divulge die sizes?

Tamasi: No, we don’t typically divulge that stuff. It’s big. [Laughter.]

Are you counting the same way for this one as for the NV30 series and past GPUs?

Tamasi: Yep. We’ve counted transistors the same way since we’ve talked about transistor counts. In fact, I’m not sure why anyone would ever throw out a transistor count for a chip that wasn’t actually the transistor count of the chip.

Edge antialiasing

We noticed some interesting things about GeForce 6-series antialiasing in our review. Is the GeForce 6800’s 8X antialiasing mode 4X supersampling plus 2X multisampling?

Tamasi: The current mode that’s actually in the control panel is a 4X super/2X multi, and that will work in both OpenGL and D3D. We actually do have a 4X multi/2X super mode that a driver, probably within the next few weeks, is going to enable as well.

Does GeForce 6800 antialiasing do anything at scan-out that won’t be picked up in screenshots? If so, what is it doing and in which modes?

Tamasi: The resolve pass—when you typically multisample you have to do a resolve pass—can either be done as another pass in the frame buffer or at scan-out. In the case of, like, if you’re doing, say, 4X multisampling, that resolve pass is actually done what we call “on the fly.” We don’t take a separate pass and write another buffer.

So if you take screenshots, you need to… there are a couple of utilities that will do the right thing and a couple that will not. In fact, our drivers now basically do the right thing. In other words, when you grab a frame, it will give you a post-resolve image as opposed to a non-multisampled image.

Now, does that apply in all your multisampled modes?

Tamasi: Yeah. This resolve on the fly technology works for any multisampling mode.

What about screenshots from 3DMark03? When you use its image quality tool, does it produce the correct output?

Tamasi: If you select AA with 3DMark, then you’ll get the correct frame grabs.

ATI has touted “gamma-correct blending” for antialiasing in the R300 series. Does the GeForce 6800 Ultra have this feature, and if not, why not?

Tamasi: It does, and I want to be really specific about this, because there’s a lot of confusion about it. There’s a great deal of difference between gamma correction and gamma adjustment. What ATI does is a gamma adjustment to gamma 2.2, which can be correct depending on your display, and that’s essentially what we do, as well. Gamma correction would typically mean you could do an adjustment to any gamma, and that would require a shader pass.

 

Shader models

The GeForce 6800 Ultra’s pixel shader performance is way up from your previous-generation GPU.

Tamasi: Yep.

Are the NV40 pixel shaders derived from NV30-series shaders, or are they a clean-sheet design?

Tamasi: It’s a clean-sheet design. About the only thing they have in common is you could draw a block diagram and some of the blocks might look similar, but the code is all new.

One of the GeForce 6800’s more important new features is Shader Model 3.0. Can you tell us briefly about Shader Model 3.0? How it will benefit gamers?

Tamasi: A couple of ways. There’s two big hunks of Shader Model 3, vertex and pixel shading.

On the vertex side, what Shader Model 3 brings is really three things: a much richer programming model, so you get longer programs, you get more interesting flow control. So from a developer’s perspective, they can do a lot of interesting things in Shader Model 3 that either they couldn’t do before in Shader Model 2 at all, or they can do much more efficiently in Shader Model 3. So, for example, complex character animation. When you’re skinning a character, you can actually branch and skip over pieces of code that would be unused in Shader Model 3, which would be a nice performance win, whereas in Shader Model 2 you’d have to execute that.
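
To make that concrete, here is a minimal vs_3_0-style sketch of the kind of branch he describes. This is not NVIDIA’s code; the four-influence layout and all names are our own illustration.

    // Hypothetical vs_3_0 skinning fragment: skip work for unused bone influences.
    // Under Shader Model 2 every influence would be blended unconditionally.
    float4x4 Bones[64];              // bone palette (size is illustrative)
    float4x4 WorldViewProj;

    struct VS_IN
    {
        float4 Pos     : POSITION;
        float4 Weights : BLENDWEIGHT;    // up to four influences per vertex
        float4 Indices : BLENDINDICES;
    };

    float4 SkinVS(VS_IN v) : POSITION
    {
        float3 skinned = 0;
        for (int i = 0; i < 4; i++)      // loop over influences
        {
            if (v.Weights[i] > 0.0f)     // SM3 branch: skip zero-weight bones
            {
                int idx = (int)v.Indices[i];
                skinned += mul(v.Pos, Bones[idx]).xyz * v.Weights[i];
            }
        }
        return mul(float4(skinned, 1.0f), WorldViewProj);
    }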

There are some new features in vertex shader 3.0. There’s a thing called vertex texture fetch which allows applications to actually access texture memory from vertex processing, which can be used for a lot of things including real displacement mapping, where you access a height field and then displace vertices in the vertex shader.
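
Roughly, the HLSL for that looks like the sketch below; the height map, sampler, and scale factor are made-up names, and the tex2Dlod call is the vertex texture fetch he is referring to.

    // Hypothetical vs_3_0 displacement mapping via vertex texture fetch:
    // a height map bound to a vertex sampler displaces each vertex along its normal.
    sampler2D HeightMap;         // sampler the application binds for vertex texturing
    float4x4  WorldViewProj;
    float     DisplaceScale;

    struct VS_IN
    {
        float4 Pos    : POSITION;
        float3 Normal : NORMAL;
        float2 UV     : TEXCOORD0;
    };

    float4 DisplaceVS(VS_IN v) : POSITION
    {
        // tex2Dlod is required here; vertex shaders have no implicit LOD.
        float  height    = tex2Dlod(HeightMap, float4(v.UV, 0, 0)).r;
        float3 displaced = v.Pos.xyz + v.Normal * height * DisplaceScale;
        return mul(float4(displaced, 1.0f), WorldViewProj);
    }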

One of the, probably, most overlooked but maybe most interesting features is one called geometry instancing, which essentially allows developers to batch up what previously would have been lots of small transactions, lots of small models, into very large indices of models and transmit those efficiently across the bus and into graphics—particularly applications that do what we call lots of “little dude rendering.” Real-time strategy games are a great example of this, where you might have hundreds of relatively low-polygon-count models running around. Previously, you’d have to basically make a draw call for each one of those models, and that can be really inefficient. You know, it can load your CPU down, and you can have poor graphics utilization. Using geometry instancing, you can basically batch all that up into many times fewer draw calls. Typically tens and sometimes hundreds of times fewer draw calls, which will reduce your CPU utilization, allow your frame rates to improve, as well as improve your efficiency with your graphics processor.
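
A hedged sketch of the shader side of geometry instancing under Direct3D 9: the mesh comes from stream 0, and the application advances a second stream once per instance (via SetStreamSourceFreq) to supply a per-instance world matrix. All names here are illustrative.

    // Hypothetical vs_3_0 shader for instanced "little dude" rendering.
    float4x4 ViewProj;

    struct VS_IN
    {
        float4 Pos  : POSITION;
        float2 UV   : TEXCOORD0;
        // Per-instance data from the second vertex stream:
        float4 Row0 : TEXCOORD1;
        float4 Row1 : TEXCOORD2;
        float4 Row2 : TEXCOORD3;
        float4 Row3 : TEXCOORD4;
    };

    struct VS_OUT
    {
        float4 Pos : POSITION;
        float2 UV  : TEXCOORD0;
    };

    VS_OUT InstanceVS(VS_IN v)
    {
        VS_OUT o;
        float4x4 world = float4x4(v.Row0, v.Row1, v.Row2, v.Row3);
        o.Pos = mul(mul(v.Pos, world), ViewProj);   // one draw call covers many instances
        o.UV  = v.UV;
        return o;
    }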

That’s on the vertex side. On the pixel side, it’s much the same. You have a much richer programming environment, so you have very, very long programs, many orders of magnitude more instructions than Shader Model 2 provides. You have a real flow control model, so you get support for loops and branches and a call stack, just like you get in a real programming environment, and of course for Shader Model 3, the required precision is FP32, so you don’t get any artifacts that might have been due to partial precision. You can still get access to partial precision, but now anything less than FP32 becomes partial precision. Essentially, the required precision for Shader Model 3 is FP32.

What do gamers get out of this? Well, they’re going to get titles or content that either looks better or runs faster or both.

I’d like to clarify something about Pixel Shader 3.0 programs. Some of the literature mentions instruction length limits “greater than or equal to 512,” while others say the limit is 65,536 instructions. What’s the story?

Tamasi: The minimum number of slots is 512, but if you support looping, you can execute many more instructions than that. So it’s a combination of… it’s basically flow control is the big reason for that. Shader Model 3 allows you to do flow control, so you can do loops and branches, and Shader Model 2 does not. There is a new profile, which ATI kind of announced at GDC, which is their 2.0b profile, which basically supports what they claim to be 512 instructions, but there’s no flow control, no changes in precision, no loops, no branching—none of the new features, so to speak, of Shader Model 3, just 512 instructions in one pass. They don’t basically support loops or branching. Our hardware supports the full Shader Model 3 model, so you get 512 slots, so to speak, and with loops and branching you can execute 65,000 instructions.
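
To illustrate the slots-versus-executed-instructions distinction, here is a hypothetical ps_3_0 fragment: its static length is only a dozen or so slots, but the loop re-runs that body up to 128 times per pixel, so the number of instructions actually executed is far larger than the program’s stored length. The names and the height-field ray march itself are just an example.

    // Hypothetical ps_3_0 ray march: short static program, large dynamic instruction count.
    sampler2D HeightMap;
    float3    RayStep;       // march step in texture space (a uniform, for simplicity)

    float4 RayMarchPS(float3 start : TEXCOORD0) : COLOR
    {
        float3 p   = start;
        float  hit = 0.0f;
        for (int i = 0; i < 128; i++)             // up to 128 iterations of a ~6-slot body
        {
            float h = tex2Dlod(HeightMap, float4(p.xy, 0, 0)).r;
            if (p.z <= h) { hit = 1.0f; break; }  // SM3 dynamic branch out of the loop
            p += RayStep;
        }
        return float4(hit.xxx, 1.0f);
    }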

About dynamic flow control in real-time pixel shaders. Branching and conditionals seem to have the potential to produce some relatively costly pipeline stalls. What direction are you guys giving game developers about how to avoid these scenarios?

Tamasi: Basically, use them carefully. [Laughter.] You’re absolutely right. If you don’t use branching properly, it can be a performance loss, but there’s lots of scenarios where it can be a performance win. In fact, our mermaid demo uses branching quite effectively. The shader for the mermaid itself is actually one large shader, and it branches to determine whether it’s skin or the scale of what we call the fish-scale costume. We’ve been quite explicit about, you know, make sure you’re using branching to your application’s benefit. You’re right in that it’s not “free.” In fact, it’s not free on a CPU, either. It’s just that when you talk about a parallel pipeline like a graphics processor, executing a branch becomes a little bit trickier.
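
Below is a hypothetical ps_3_0 shader in the spirit of that mermaid example, with a made-up material mask deciding which of two paths a pixel takes. It is only a sketch of the pattern, not the demo’s actual shader.

    // One shader, one dynamic branch selecting between two material paths per pixel.
    sampler2D DiffuseMap;
    sampler2D MaterialMask;      // assumed mask: > 0.5 means "scales," otherwise "skin"
    float3    LightDir;

    float4 SkinOrScalePS(float2 uv : TEXCOORD0, float3 n : TEXCOORD1) : COLOR
    {
        float3 albedo = tex2D(DiffuseMap, uv).rgb;
        float  ndotl  = saturate(dot(normalize(n), -LightDir));

        float3 color;
        if (tex2D(MaterialMask, uv).r > 0.5f)
        {
            // "Scale" path: cheap glossy-looking term (illustrative only).
            color = albedo * ndotl + pow(ndotl, 32.0f) * 0.4f;
        }
        else
        {
            // "Skin" path: softer, wrapped diffuse term (illustrative only).
            color = albedo * saturate(ndotl * 0.5f + 0.5f);
        }
        // Branches are cheapest when nearby pixels take the same path; divergence
        // within a group of pixels costs both sides, which is the stall he warns about.
        return float4(color, 1.0f);
    }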

 

Shader Model 3.0 in real-time apps

One of your examples of a complex pixel shader at Editor’s Day was a skin shader for Gollum from Lord of the Rings with subsurface scattering. The presentation said that shader required 135 instructions, 14 texture accesses, and 259 FLOPS per pixel to compute.

Tamasi: Yeah, that was just the subsurface scatter component.

What kind of shader lengths are viable for real-time applications with the GeForce 6 series? Can you give me a ballpark?

Tamasi: Hundreds of instructions. Frankly, it depends on the nature of the math, what you’re doing, how many texture accesses, that kind of thing, but to give you a feel of it, at that same Editor’s Day, the folks from Epic gave a demonstration of Unreal Engine 3, and they commented that most of their shaders are between 50 and 150 instructions long.

I’m curious about this: Developers will probably be writing shaders in a high-level shading language like HLSL, which will then be compiled for the target hardware, if I understand correctly. What would a developer writing in HLSL do differently if his target were Shader Model 3.0 versus Shader Model 2?

Tamasi: Basically, they’ll write in HLSL, and really the right way to think about it is that there are two levels of compilation. The API, DirectX, will do what I would call kind of a pre-compilation to whatever runtime target, whether it’s Shader Model 2.0, Shader Model 2.0b, or Shader Model 3.0. Then, once the API does that work, there’s actually a compiler in the driver. Anybody who builds hardware has a compiler in their driver which will take the API instruction set and turn it into essentially machine code for the hardware.

So from a developer’s perspective, they write in HLSL, and if they want to support Shader Model 3, they’ll write code that requires loops and branching and has long shaders, and the API will deal with that. If they want to target hardware that supports something less than Shader Model 3, they’ll have to write HLSL code with that in mind. And basically, there’s a profile for that that Microsoft provides. It was actually part of DirectX 9 initially. Shader Model 3 was actually in the API in DirectX 9. DirectX 9.0c will essentially enable it from a hardware perspective.

Can you give us some quick examples of effects possible in real time with Shader Model 3.0 that aren’t possible with Shader Model 2.x?

Tamasi: There’s a lot of sophisticated shadowing and lighting algorithms that you can do that would either be, not necessarily impossible, but just very impractical with Shader Model 2. For example, you can early exit in Shader Model 3 from a shader that might require execution of hundreds or potentially many hundreds of instructions in Shader Model 2, which might be impractical from a performance perspective. You can do true branching, which simply lets you do things that you can’t do in Shader Model 2.

One of the examples, from our own developers, is the physics demonstration that we gave at Editor’s Day that actually provides, with Shader Model 3, a feedback path between the pixel and the vertex processing. In that particular demonstration, what the developer did was displace a geometry field to create essentially a mountainous scene, and then they compute the physics for the particle system entirely in the graphics processor. They actually compute what we would call motion vectors in the pixel shader and they feed those motion vectors back into the vertex processor and use vertex texture fetch to read the motion data to move the particle system around. So it’s a completely GPU-driven particle system, for example.
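
A loose sketch of that feedback loop, with invented texture names: the first pass is a pixel shader writing updated particle positions into a floating-point render target, and the second is a vertex shader fetching that data back to place the particles. Velocity would be updated into a second target via multiple render targets; it is omitted here to keep the sketch short.

    // --- Pass 1: ps_3_0 simulation pass rendered over a full-screen quad ---
    sampler2D PositionTex;       // previous positions, one texel per particle
    sampler2D VelocityTex;       // previous velocities
    float     DeltaT;
    float3    Gravity;

    float4 UpdateParticlePS(float2 uv : TEXCOORD0) : COLOR
    {
        float3 pos = tex2D(PositionTex, uv).xyz;
        float3 vel = tex2D(VelocityTex, uv).xyz + Gravity * DeltaT;
        return float4(pos + vel * DeltaT, 1.0f);   // new position, written to the FP target
    }

    // --- Pass 2: vs_3_0 pass that reads the updated state back as geometry ---
    sampler2D UpdatedPositions;  // the render target produced by pass 1
    float4x4  ViewProj;

    float4 ParticleVS(float2 particleUV : TEXCOORD0) : POSITION
    {
        float3 pos = tex2Dlod(UpdatedPositions, float4(particleUV, 0, 0)).xyz;
        return mul(float4(pos, 1.0f), ViewProj);
    }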

There’s a lot of things like that that are possible with Shader Model 3, but frankly, I think the biggest win for Shader Model 3, and from what you’ve read from developers or if you’ve talked to them you’ll hear pretty much the same thing, is that Shader Model 3 fundamentally just makes it easier on developers. As far as I can tell, that’s the biggest win for everyone, because it gives them a real programming model that they’re used to. When’s the last time you wrote a C program that didn’t have a branch in it? So they get a real programming model. They don’t have to worry about instruction set limits and what I call “coding inside out.” They can just kind of write their shaders and not have to worry about, “Gee, is this 96 instructions?” or whatnot. And frankly, the feature set is complete enough that they can just kind of code away and get the effect that they want. And frankly, it can be done more simply and easily in Shader Model 3, so from a productivity perspective, they’re going to be much happier.

That, I think, in combination with the fact that NV4x does 64-bit floating-point framebuffer blending and texture filtering, has really made it a lot easier for developers to do high-quality shading content.

What about some examples of shaders where FP32 precision produces correct results and FP24 produces visible artifacts?

Tamasi: You don’t have to listen to me, you can listen to the statements by Tim Sweeney. They’ve got a number of lighting algorithms that produce artifacts with FP24. In general, what you’re going to find is that the more complex the shader gets, the more complex the lighting model gets, the more likely you are to see precision issues with FP24. Typically, if you do shaders that actually manipulate depth values, then again you might see issues with FP24.

And I think lastly, the big issue is that there is no standard for FP24, quite honestly. There is a standard for FP32. It’s been around for about 20 years. It’s IEEE 754. People, when they write a particularly sophisticated program, they kind of expect it to produce precision that you’re somewhat familiar with, and single-precision floating point on CPUs has been FP32 for years. I think from that perspective it’s much more consistent. They don’t have to worry about special-casing things. They don’t have to worry about, “Gee, whose FP24 is it?” since there is no standard. If someone implemented FP24 this way, it might be different on someone else’s hardware, that kind of thing. But generally, the more complex the lighting algorithm, or if they actually manipulate depth, the more likely you are to run into precision issues with FP24.

 

Far Cry and shader models

We’ve seen the Far Cry screenshots you all released with Shader Model 3.0 effects.

Tamasi: Actually, those are Shader 2 or Shader 3. That’s right.

One of the effects we’re seeing is a “pseudo displacement mapping” effect, isn’t it?

Tamasi: Yeah. “Virtual displacement mapping,” “parallax mapping,” there’s been a number of terms for that.

Any idea how many instructions long the shader program is that produces this effect?

Tamasi: That effect actually is reasonably inexpensive from a number of… I think it’s less than ten for that one particular piece of that effect. It’s actually less than ten shader instructions to do that.
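
For reference, the textbook parallax-mapping offset really is only a handful of instructions. This generic version (not Crytek’s actual shader; the scale and bias values are typical defaults, not Far Cry’s) shifts the texture lookup by a height-scaled, tangent-space view vector.

    // Generic parallax ("virtual displacement") mapping offset.
    sampler2D DiffuseMap;
    sampler2D HeightMap;
    float2    ScaleBias;         // e.g. (0.04, -0.02), tuned per material

    float4 ParallaxPS(float2 uv : TEXCOORD0, float3 eyeTS : TEXCOORD1) : COLOR
    {
        float  h     = tex2D(HeightMap, uv).r * ScaleBias.x + ScaleBias.y;
        float2 offUV = uv + h * normalize(eyeTS).xy;   // offset along the tangent-space view
        return tex2D(DiffuseMap, offUV);
    }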

Will we see a Shader Model 2.0 path for GeForce FX with this same effect in Far Cry?

Tamasi: Yeah, the images that you’ve seen from Far Cry, the current path, those are actually Shader Model 2.0, and anything that runs Shader Model 2.0 should be able to produce those images.

NV40 internals

Looking at some of your presentations, it appears each NV40 pixel shader unit, and I guess there are two in each pixel pipeline, can work a couple of different ways: it can perform a three-component vector operation and a single-component scalar op in one clock cycle, or it can perform a pair of two-component vector operations per clock. Do you have any examples of what type of graphics operations could take advantage of this capability?

Tamasi: Well, there’s a new rage, so to speak, in terms of shading effects, what we would call post-processing effects—glows and blurs and things of that nature, or other lens effects. Most of those effects tend to be two-dimensional, because you’re typically operating on the entire image, and therefore, if it’s two dimensional, it just has XY coordinates. So, from a coordinate system perspective, those are two-component type operations, and those are all nice wins when you can do parallelized operations.
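
Post-processing passes like these spend most of their time on float2 texture-coordinate math, which is the sort of work that maps onto paired two-component issue. A generic example, with assumed names:

    // Hypothetical 7-tap horizontal blur: all coordinate math is two-component.
    sampler2D SceneTex;
    float2    TexelSize;         // (1/width, 1/height) of the source image

    float4 BlurPS(float2 uv : TEXCOORD0) : COLOR
    {
        float4 sum = 0;
        for (int i = -3; i <= 3; i++)
            sum += tex2D(SceneTex, uv + float2(i, 0) * TexelSize);
        return sum / 7.0f;
    }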

Inside of the pixel pipeline, you’ve got two of the FP32 pixel shaders in each pixel pipe. Can both of them do parallel vector operations per clock?

Tamasi: Yep. The way to think about it is that you can dual (or more) issue instructions per shader unit, and then you can co-issue between them as well, so, in fact, you can have four, or in some cases more than four, instructions being issued on a single pixel pipeline: two independent instructions in shader unit one and another two independent instructions in shader unit two. We also have mini-ALUs in each of those shader units, as well, which also can have instructions issued to them. We gave a shader example that actually had up to seven instructions being executed in parallel in one pass.

TR would like to thank Tony Tamasi for his time and patience in answering our questions. 

Comments closed
    • SplitScreen
    • 16 years ago

    I was rather disappointed that the interviewer did not even ask about the 6800’s shader quality in Far Cry. It IS a $500 card but the screenshots when compared to the 9800 XT left a lot to be desired. (www.tomshardware.com) In addition it claims to be capable of all this high dynamic range lighting and OpenEXR stuff. Now if it can’t do Far Cry properly that sort of reminds me of the 5200. Remember it had disappointing DX8 acceleration (initial sign of trouble) and dismal DX9 acceleration (deceleration?).

    I hope it’s just a driver issue and the retail versions of the cards have resolved the problem.

    • PerfectCr
    • 16 years ago

    The real question is… willl it run Leisure Suit Larry 5 in all it’s glory? If it does, I’m gettin’ a 6800 😛

    • palhen
    • 16 years ago

    And now, were is that ATI card we are waiting for?

    • Freon
    • 16 years ago

    Wow, that’s one of the best interviews I’ve read in a while. Well, I guess I stopped reading them because they are usually just give canned marketing department answers, which defeats the purpose.

    Good stuff.

    • Wintermane
    • 16 years ago

    On 32 vs 24 what sweeney was getting at was in his GAME code because he was dealing with some long shaders already and alot of pixel fiddling he ALREADY had hit the point where 24 bit was making mistakes. Now realize the guy actauly wants 64 bit or more;/ Still for that game 32 bit is enough to not produce noticeable artifacts and 24 isnt its as simple as that.

    Did sweeney go out of his way to make 24 bit not work well enough? No he just did one hell of alot of massaging of some pixels to get some realy nifty effectsthat would run fast enough on the bloody hardware he expected at launch time… you know the stuff bleeding edge game coders DO!

    Now the reason its important is sweney is far from the only dev doing complex shaders nowfor near future games. Alot of extra effects in near term games come via extra shader passes and with 24 bit if you enable too many of those effects at once you will corrupt the final results beyond tollerable levels. Its as simple as that.

    As all devs write up more effects to add into the mix the total length of the sahder ops being done on some of the pixels grows beyond the level 2.0 can cope with gracefully and ALSO to the point that the little errors 24bit provides grow into something a player will notice and whine about. Now imagine what happens in barbie dream date 10 when you enable the boobmapping and the fasion enhancment mapping and the super hair gell mapping and the fasion lights and the artistic fog mode all at once and your shader length for one pixel tops 100 or so and boom errors snowball and 2.0 shaders cry mommy and so on. Nothing is worse then poorly mapped boobies!

    As for 16 bit nvidia offers both 128 bit and 64 bit float as in 4 32 or 16 bit float values. They didnt bother with 24 bit because .to be blunt they didnt need to. 24 bit came and went as far as usefullness between the generations of nvidia cards and the 6800 simply doesnt need it when it can do 32 bit plenty fast.

      • reever
      • 16 years ago

      You’re talking about a game engine that will be finished in 2006, see the problem?

      24 bit came and went as far as usefullness between the generations of nvidia cards and the 6800 simply doesnt need it when it can do 32 bit plenty fast.
      —-

      The 6800 will still be using fp16 for some things, go figure, i guess that is still useful as most effects won’t show differences, wait a minute, most effects won’t show a difference, what a concept.

        • Wintermane
        • 16 years ago

        I remember back a bit they said the same sort of thing oh most games dont use blah and besides most effects you wont notice the difference.

        The problem was the effects I cared about DID. Now the simple fact is I dont intend to get any card untill I need to and at that point IF there is some spiffy thingy in some spiffy game I realy want it will edge out other cards if it has the ability to either A do it quickly enough or B do it at all.or C do it nicely.

        Currently there are no games I realy care about so no I dont care to get a card for any new games. But when I do its entirely possible ill get a 6000 series card over a 420variant simply due to one or two games having something I WANT TO SEE.

    • Wintermane
    • 16 years ago

    The reason I dont expect a cheap r420 chip is they didnt make a cheap r300 chip. They have nothing like the 5200 which is now down to as little as 50 bucks. And I dont expect it of them any time soon.

      • vortigern_red
      • 16 years ago

      The R300/NV30 gen of chips are completley different to the newer cards. FX 5200 is NOT an NV30/35 with half the piplines disabled and ATI do have cards that compete with FX5200 they are called the 9200’s both IHVs cards are treated as DX8 capable by games developers.

      Tims game engine used in current games uses a few DX8 shaders 3 years(?) after the geforce3 arrived, I don’t think you would want to be running a 6800 when his PS3/32bit game engine debuts. LOL

        • Wintermane
        • 16 years ago

        Odd I could have sworn ati still used 1-2 gen old tech in its low end chips.

          • vortigern_red
          • 16 years ago

          Yes they do use 1 gen old tech in the low end cards plus some current tech and possibly even older tech. Err… just like NV. If you are refering to the unusable PS2.0 of the FX5200 its just that, unused.

      Anyway my point is it is flawed to assume that last gen line ups will be comparable to this gen. Even if you want to do that ATI did disable quads on two of its top end chips to make mid range chips, but when you only have two quads you can’t disable another to make a low end chip 🙂

          This is something we have not seen from NV before, all their chips were either different chips or clock speed bins.

          Anyway you are probably right R3xx based chips would make pretty decent low end models whereas NV would no doubt like to move on from NV3x pretty quickly.

    • WaltC
    • 16 years ago

    Part One.

    I particularly enjoyed dissecting these remarks:

    /[

      • WaltC
      • 16 years ago

      Part Two.

      /[

        • vortigern_red
        • 16 years ago

        Walt, I don’t suppose the “C” stands for Concise?

        • Entroper
        • 16 years ago

        You know, all the “Tommy T.” crap just makes your posts even more annoying. Your arguments *[

          • WaltC
          • 16 years ago

          /[

        • Kevin
        • 16 years ago

        It’s “Sweeney” with an “e” at the end. As a fellow “Sweeney” myself, I find it quite annoying when I see that last “e” left off when it’s supposed to be there.

        Not to be a spelling nazi or anything. 😉

          • WaltC
          • 16 years ago

          Thanks–it was unintentional and I’ll certainly correct it…;)

      • ciparis
      • 16 years ago

      Speaking of annoying ;). Regarding Tim’s 24/32 statement, Walt, you chose to argue a point which requires a different interpretation of what was said than was arguably the speakers intent when taken in context, and then you proceeded as if your interpretation was the only possibility worthy of consideration.

      Now, yes, he did say something which, when closely examined, does not necessarily or explicitly state what a hypothetical average reader might have been intended to derive from the statement, given the overall context of “should a discerning gamer care if a card can do this, or not?”. And had I seen that sort of wording in a television ad, you bet I would assume that the important bit (and all they were really saying) was exactly what they didn’t say.

      “Spot the Bullshit” is a fun game – I do it alot when reading advertisements or watching television.

      But with all due respect, when applied to conversational or interview situations – especially in a medium like this one – such precision is not necessarily a successful display of intelligence. Some might consider it pedantic.

      To explain what I mean, in what I have to admit is a rather pedantic level of detail myself (if the shoe fits.. ;)),

      You chose to interpret “I can write software which artifacts at fp24 but not at fp32” as
      – I can write a hypothetical peice of software
      – said software would artifact on fp24, but not fp32

      – instead of –
      – I can write software which still has discernable artifacts on fp24
      – I can not write software which still has discernable artifacts on fp32

      …and then argued the point (apparently taking that assumed meaning as fact) that they really didn’t say anything relevant at all (and expanded upon that in some detail, even questioning their ability discern relevance to begin with).

      It’s hard not to notice the irony.

      Now, it has to be conceded that, given his wording, either of those could be the correct interpretation. But that’s just the point: given the context of the statement (in other words, taking into consideration the, I think, fairly obvious intent of the questions posed to him), unless Tim’s intent was to mislead or unless his statement was taken out of context before we ever saw it, it’s a reasonable argument (_not_ assumption) that Tim’s intent was to illustrate something he considers a real-world advantage of the technology, and not just to toss out a red-herring.

      It doesn’t benefit an argument when an interpretation unconfirmed by context but not precluded by syntax is taken as presumptive intent (or worse, presented as an interpretation made obvious by syntax but whose true meaning or relevance managed to elude even the person making the statemement) and conclusions are drawn based upon that, especially given an assertion as dramatic as “I think Sweeney/Tamasi/et al have missed a basic point to 3d rendering here”.

      Jumping to conclusions (especially without pointing out that an assumption has been made, and when around so many other people skilled in playing the spot-the-bullshit game) is sometimes perceived not as a convincing argument, but rather as invitation to draw into question the motivation of the person making the assumption.

        • tu2thepoo
        • 16 years ago

        quote:
        https://techreport.com/onearticle.x/6603

        • hmmm
        • 16 years ago

        You make a number of good points. I think Walt went a little overboard here, and the irony of his statements is quite clear.

        That said, I do want to comment on this little bit:

        /[

          • Damage
          • 16 years ago

          The interview was conducted live via telephone.

            • WaltC
            • 16 years ago

            Yes, it was my impression that it was a telephone interview.

          • ciparis
          • 16 years ago

          Good comment 🙂 you’re right that the main point of that paragraph, contrasting the creation of a strategically worded, lawyer-reviewed, liability-inducing advertisement with something that lacked the benefit (if you want to call it that) of such a process didn’t really need that unqualified comment about the medium.

          I meant to be referring stereotypically to situations where you might think, had the speaker and listener (in this case the reader) been in the same room using all of their senses and having a thorough sense context, the meaning would have been more effectively and/or clearly conveyed. I was also thinking (too much caffeine today) about the longevity of communications, lending them to easy review and re-review, where a casual or carelessly-phrased remark can take on a life of its own, and the increasing loss of context that that originally imperfect communication can take on during that life, through snips and resnips, subjecting it to analysis ad-nauseum, etc.

          I didn’t have enough information to make a specific reference to this case (or even to conclude that the medium had had any effect), so I considered taking it out completely, but I left it in as a bit of a soap-box aside. It’s a frequent topic of discussion for the voices inside my head.

          • WaltC
          • 16 years ago

          /[

        • WaltC
        • 16 years ago

        /[

          • ciparis
          • 16 years ago

          Because defending marketing-speak from that company is something I consider a waste of my time 🙂

          That comment is in answer to “It seems like your only arguments against what I’ve said are that I *may* have reached the wrong conclusions. Why is it that you do not extend the same level of critique to Tamasi’s conclusions as well, then?”, btw. Fair question. Unfortunately, if it’s coming from a PR rep, I don’t even pay much attention to what they say anymore.. even some of the old 3Dfx marketing folks seem to have been sent to the School of Spin and had their own gamers sense of value surgically removed, and sending comments, however appropriate, in the direction of any such graduate is something I think is akin to dropping thoughts into a black hole. A couple of websites (this one included) do a good enough job I think of wading through that stuff and shining a light where appropriate, and they can make themselves heard. And I’m not criticising people for participating in those discussions, I just personally feel like I’m pissing in the wind if I bother with it.

          (boredome alert, feel free to stop reading)

          And, to be clear, my original post wasn’t intended to be about your critique in general; it was specifically about one quote from a game company (which I took to be a direct quote, since it was Tim’s comment in quotation marks) and where you went with that one quote (and the subsequent affect that that could have on the rest of the article from a reader’s point of view). What I was trying to say in the “display of intelligence” paragraph was that an argument, however thoughtfully constructed or well-articulated, can lose credibility if it looks like someone’s intention has been discounted in the process (which you may not have done at all, but that’s how it read to me). Hell, you might not even have been putting forth a direct quote, but rather a paraphrase of what the NVidia guy said - in which case the entire situation changes (how’s that for irony? :)).

          • ciparis
          • 16 years ago

          /[

      • SplitScreen
      • 16 years ago

      WaltC, I hope you realise that Tamasi was probably referring to UnrealEngine3 which will require a DirectX 9 compliant card as the “absolute minimum”. Not the one used in UT2004. A demo was shown to some industry guys. I wouldn’t be quick to argue on that point.

    • emi25
    • 16 years ago

    #28, Yes! Last year nVidia sucked at ps 2.0 performance.
    Please don’t think at 3.0 spec only for gaming.
    If we have programms that can use that power from the GPU, I think you’ll see the benefits.

    l[

    • Lordneo
    • 16 years ago

    anyone know where i can find those pics of frycry with the 6800 ?

    thanks

      • Lordneo
      • 16 years ago

      nevermind , found em.

    • Lordneo
    • 16 years ago

    typo ?

    wouldnt want to give out the wrong impression.
    page 2.
    “Tamasi: A couple of ways. There’s two big hunks of Shader Model 3, vertex and pixel shading. ”

    He meant Chunks ?… or did he really say hunks ?

      • atidriverssuck
      • 16 years ago

      hunk, while being an attractive man, also means a big piece of something. I’m sure it was hunk. You haven’t come across the term before?

        • indeego
        • 16 years ago

        Perhaps he said “Monk” and was merely being philosophically suggestive

    • Dposcorp
    • 16 years ago

    Great job, Scott.
    It is awsome that you were able to land someone high up from Nvidia at the right time, and ask a lot of good questions.
    Just as the 6800 series is starting to come out. :0

    Hopefully you can land someone from ATI at the right time as well.

    Excellent intereview, and keep up the good work.

      • Pete
      • 16 years ago

      Agreed, this was an excellent interview. Thanks, Scott. Glad you got the AA thing cleared up. You saved yourself another needlessly elaborate/pedantic email (sorry about that). 🙂

    • Wintermane
    • 16 years ago

    I think many people are misreading what nvidia is realy saying.

    Alot of this 2.0 vs 3.0 is more about now you can actualy do this this and this in a GAME instead of a demo.

    Now as for r420 vs nv40 we wont know till they both are retail and lets be frank for most of us that isnt the real question the real question is how much will the 6600 or 6400 or whatever cost and what will IT do.

    I expect to see a 6400 with 8 pipes and 4 shaders.

    I also expect either a 4 pipe 2 shader or 4 pipe 1 shader 6200.

    I do NOT expect to see all that cheap of an r420 variant.

    See I dont give a rats arse what the high end stuff does what I care most about is when do these cards get cheap enough to wind up in dell systems and thus wind up being REALY looked at by the big time devs.

    Imagine what kinds of games will hit after a few million dells with 6200s are sold… Barbie dream date 3d with boobmapping and with support for billions and billions of pastel colors you never thought possible before!;/

      • vortigern_red
      • 16 years ago

      Why don’t you expect to see cheaper R420? Rumor suggests they are using a similar system of disabling quads to NV and R420 is reportedly a smaller die.

    • emi25
    • 16 years ago

    With current Geforce FX cards, ( 4×2 architecture ) they need to cheat.
    They need to release new drivers “optimized”.
    They choose wrong.

    Now I hope they learned the right lesson.

    • PerfectCr
    • 16 years ago

    /me sits back and enjoys the fanboy argument!

    “NVIDIA CHEATED!”
    “WHO NEEDS PS 3.0?”
    “R420 will KILL NV40!”

    Please.

      • MadCatz
      • 16 years ago

      You know its funny. It seems as if this is much more of an Nvidia Fan-boy thing as I have not yet seen any “ATi rules”/”Nvidia Sucks” posts anywhere, yet people keep swearing that they are there. Very ironic. However, if I have missed a few posts someone should let me know as this argument about ATi/Nvidia fanboys is starting to piss me off.

        • ripfire
        • 16 years ago

        It’s easy to spot these arguments. Just read those who already predicts what’s going to happen even though the card hasn’t even been released yet.

    • Krogoth
    • 16 years ago

    [soapbox]
    Damm, is it me or it’s just deja vu? I remember you folks arguring that PS 2.0 wasn’t going to used for quite a while with the Radeon 9700 PRO. Back when it’s was offically released. So it was pretty pointless to get the Radeon 9700 PRO other to able to play at 1600×1200 4xAA/8xAF with a high FPS. You could get around fine with the Geforce 4 Ti series which was true for the most part. Then, it was the driver cheats with Nvidia back with the lackluster 5800U and it’s inferior spawn the FX 5200 and FX5600. Since, the NV3x wasn’t made to the exact Directx 9 specs (the core used FP32 at 128bits while, the spec needed FP24 with 24bits).

    My point is who really cares about all this FP and VS/PS crap? Maybe the programmers but, your averge joe gamer (60-70% of the market) will not give a crap. As long as his video card can run his favorite game ( examples: Sims, CS and BF1942) at 640×480 at a playible framerate. He’s not going to care about the image quality or all this AA/AF crap.

    It’s only the hardcore, perfectists (top 5% of the market) who blow air and complain that their shinly new video card can’t do PS/VS 3.0. It doesn’t score the highest score on 3Dmarks and some other sythatic benchmark, or it’s IQ is only noticibly worse then it’s compedator only under a screenshot.

    [/steps off soapbox]

      • ripfire
      • 16 years ago

      *** Holds Voodoo2 card *** Who needs 32b color?

      • hmmm
      • 16 years ago

      You have a point, but then lose it when you start boohooing 16x12x32 @ 4xAA/8xAnsio. I would bet that your average joe gamer would much rather have that than 640x480x16 with no AA or ansio.

      They might not care what makes it look good, but they will notice that it does look good.

        • Krogoth
        • 16 years ago

        Your average gamer will be stuck with a Dell/Concrap “gaming” PC. That uses a FX5200/GF4MX/R9200 which cannot run at such a resoultion with a decent framerate. Besides, most of the PC gaming market is screwed anyway, since the big gaming publishers are more interested in the larger but, more profitible console market. Just check your local gaming store’s shelfing space to see the infurible proof of this. So in the end your averege joe gamer in the future will end-up using his PC for only real work. While, the Xbox2/PS3 will do all the gaming with his TV which is stuck with a craptastic resolution.

    • daniel4
    • 16 years ago

    /[

      • atidriverssuck
      • 16 years ago

      rumour has it they show up on the Nvidia’s “the way it’s meant to be played” spamvertisement.

    • DukenukemX
    • 16 years ago

    Basically it just proves that Shader Model 3.0 is gonna look just like 2.0. I can see why ATI didn’t add support for 3.0 in their R420 cards. While in theory 3.0 is suppose to be faster we really don’t know until DX9.0c is released and or ATI’s new R420 is released.

    The whole idea of 3.0 making it easer for developers is wrong. Because there isn’t just 3.0 is there? There’s 2.0 and 1.4 and 1.1 for example. If you were a game developer you would make sure even Geforce 3 or Radeon 8500 owners could run your game. So at the end it’s actually more work.

    Unless Nvidia plans on releasing l[

      • WaltC
      • 16 years ago

      /[

    • Doug
    • 16 years ago

    Good article. It would seem that the NV40 has a number of bases covered.

    The more I read about the NV40, the more I will be in line in July to get one (tho depends on release of Doom 3 as if that is delayed, I wont buy the card till then).

    My GF4 TI 4400 is playing Farcry beautifully with most details on the second highest at 1024×768 with 35-40 fps and that is all on a 2 Ghz P4.

    At this stage, it looks like ATI may be doing catch up … not that I have anything against ATI … I just buy the best card at the time. With the NV40 supporting Linux and XP 64 (in basic modes), it looks like the better card (at least for me!).

      • JustAnEngineer
      • 16 years ago

      Are there such a huge number of 3D games being released for Linux that you’re worried about driver support for the latest features on a $500 3D gaming card?

        • vortigern_red
        • 16 years ago

        I hate to say it but I have an ATI 9700 and I don’t need 3D acceleration in linux, but I do use 2 displays and in order to get that working on ATI cards you have to install the ATI driver (not fun! I have still not got it working with 2.6 kernal although its not to bad with 2.4). Nvidias easier to install and more functional drivers are a big plus point in their favor IMHO.

        ATI do have a driver included with Win64 which supports 2D and D3D, its not very exiting but it is a beta OS!

    • Disco
    • 16 years ago

    There’s something I don’t get. If someone can make everyone ‘oohhh’ and ‘aawwww’ over some shader effects that only take 10 instructions, who is going to bother writing up 512+ instruction effects? Is this one of those scenerios where it won’t happen but someone COULD if they really wanted to…

    Seems a bit overblown to me. I guess we’ll only know for sure when someone actually writes a rediculously long effect that can’t be done without 3.0 (i.e. not on ATI) that is really useful and cannot be shorted to a more reasonable number of instructions. Someone let me know when this occurs outside of a demo please… thanks

      • just brew it!
      • 16 years ago

      It’s a forward-looking feature. If GPU makers always limited the hardware’s capabilities to what current software will use, we would all still be running S3 Virge “3D decelerator” cards.

        • TheCollective
        • 16 years ago

        q[

          • vortigern_red
          • 16 years ago

          *[

            • just brew it!
            • 16 years ago

            q[

            • vortigern_red
            • 16 years ago

            http://www.nvidia.com/page/geforce_6800.html

            • droopy1592
            • 16 years ago

            It’s all marchitecture.

            We will know who’s faster in time. Nvidia’s already cheating, so they must feel threatened.

            • vortigern_red
            • 16 years ago

            I think I’ll wait a bit before jumping to the conclusion they are already cheating.

            But its not just NV its ATI also, there are very few PS2.0 games about and certainly Farcry is virtually unplayable on my radeon 9700 (370/315 thats with A64 3000+ and 1gig ram) at top quality settings let alone with AA/AF. I’ve stopped playing it until I get my new NV40/R420. 🙂

            • droopy1592
            • 16 years ago

            it’s been proven by the reduction in mip mapping and filter in common benchmarks.

            Also, it’s funny how Nvidia said FP16 and Int12 was good enough precision with no artifacts, now all of a sudden fp24 will show artifacts but not fp32 (since it’s standard)

            • Hattig
            • 16 years ago

            I agree there … FP16 did prove to be inadequate as that HiDR demo proves – looking like a dog on current nVidia hardware and beautiful on ATI – the 24-bit implementation being good enough – a good balance between image quality and speed.

            I just get the impression that ATI make cards that are fast with ‘good enough’ features for gamers – 24-bit instead of 32-bit. 32-bit probably won’t be required for a while, by that time ATI will have R500 cards out that do support it, and hard-core gamers will upgrade anyway, so no loss. In the meantime, the new game will run acceptably on the older hardware because it is fast, whilst not looking as hot.

            Still, you’ve gotta praise nVidia for being more advanced than ATI in terms of functionality now, and I’m sure it will prove useful in many areas.

            • Entroper
            • 16 years ago

            https://techreport.com/reviews/2004q2/geforce-6800ultra/index.x?pg=26

            In general, though, I agree with what you're saying. 24-bit was the perfect compromise for the previous generation, as it won't be practical to run lengthy pixel shaders on those cards anyway. Even for this year's games, I don't see shader model 3.0 providing significant benefits over 2.0. However, owners of a 6800 Ultra will be pleased when they can run next year's games at acceptable framerates and graphics quality.

            • vortigern_red
            • 16 years ago

            Proven! That its a cheat and not a bug in release drivers? Not to my satisfaction anyway

            • just brew it!
            • 16 years ago

            OK, I stand corrected.

            But still… why get bent out of shape? It’s like getting mad at AMD for introducing the AMD64 instruction extensions, just becase Microsoft doesn’t have an OS that supports it yet.

            • vortigern_red
            • 16 years ago

            I use both win64 beta and mandrake linux 64 with my athlon64 🙂

            But you have a point. BTW I was not getting “bent out of shape” :-). I am not arsed what they say really, I will buy either NV40 or R420 as soon as I can for the increase in performance they will offer over my current card.

            Heh, I bought a Radeon 8500 because my Geforce2 did not have pixel and vertex shaders LOL I don’t think it saw more than 2 or 3 shaders outside of 3Dmark before I replaced it with a 9700.

    • BooTs
    • 16 years ago

    Bleh. Need to read this again tomorrow; its too late for me to grasp it all in one go. I need to get back in to my routine of reading this type of stuff everyday.

    • BRiT
    • 16 years ago

    How come the current Nvidia 6800 AA screenshots don’t seem to show any benefit of gamma-adjusted/corrected? The ATI R3x0 AA screenshots do show benefits of gamma-adjusted/corrected.

    –|BRiT|

    • Ardrid
    • 16 years ago

    Great article guys. I just got a few questions for you:

    1. Tamasi mentioned that the 6800 Ultra does gamma-correction or gamma adjustment. Is this currently enabled or is it going to be enabled in the official drivers, along with the alternate 8X mode?

    2. The Far Cry screenshots (the “real” ones, not the comparison ones that were using PS1.1) were using a combination of SM2.0 and SM3.0, correct? And I’m assuming the virtual displacement mapping was a result of SM2.0, since VS3.0 can do real displacement mapping.

    Once again, great article.

      • vortigern_red
      • 16 years ago

      *[

      • ripfire
      • 16 years ago

      You have to realize that SM2.0 is now a subset in SM3.0. Basically, SM3.0 is SM2.0 with extra features, like flow control.

        • Ardrid
        • 16 years ago

        Yes, I’m aware of that. I’m wondering whether or not, and I’m pretty sure they are, using virtual displacement mapping, rather than the real displacement that VS3.0 provides.

          • ripfire
          • 16 years ago

          Ahh. I see what you mean.

          I thought that real displacement mapping was already part of SM2.0 because R300 (and parhelia) already has it except for NV30 and NV35.

    • Forge
    • 16 years ago

    An interesting read from Mr. T-buffer. I just wish there had been less of a gap between the GF6800 Ultra reviews and store availability. I greatly prefer ATI’s launch-as-shipping method.

      • Damage
      • 16 years ago

      Tarolli? 🙂
