Can I get back to you later about that poly?
Our man Dissonance wrote up a nice explanation of the Kyro's "what you see is what you draw" approach to 3D rendering. This approach is known as deferred rendering, and it's quite different from the more common method, immediate mode rendering. I suggest you see Dissonance's write-up for a visual representation of the differences between the two approaches, but I will make a clumsy attempt to sum them up.

Immediate mode
Immediate mode renderers, including nearly every other 3D graphics card you can buy, draw three-dimensional scenes the way the Pentagon processes its budget. They chew through oodles of bandwidth and processing power to get where they're going, then throw out the extra stuff at the end. Immediate mode chips process all the polygons in a scene, apply shading and textures, then send each pixel hurtling down the pipeline with a Z value, or depth information, attached to it. This depth information can use nearly as much bandwidth as the color information for the pixel: these days, a pixel will generally have 24 bits of Z data and 32 bits of color and transparency info. Only at the end of this process, when the chip goes to draw the pixel into the scene, does it use the Z information, stored in a Z-buffer, to determine whether one pixel overlaps with another.

Take, for instance, the fundamental object that defines all 3D game scenes: crates. (This I learned from Old Man Murray.) Odds are, any 3D scene will have a number of crates, and probably some barrels, too. Some of those crates and barrels will be behind others, so that crate A obstructs our view of barrel B. An immediate mode renderer processes things in whatever order they come down the pipe. It may draw barrel B in its entirety, then draw crate A in front of it. Only when it comes time to draw the actual pixels for crate A will the chip determine that barrel B shouldn't be visible in the scene—and by then, it's already done all the work to draw barrel B. Only the completed scene is sent to the display, so all you'll ever see is crate A.
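For the code-minded, here is a rough sketch of the crate-and-barrel case. It is a toy, not a model of any real chip: the 8x8 screen, the flat rectangles standing in for objects, and the names `shade` and `draw_immediate` are all assumptions made up for illustration.

```python
# Toy immediate-mode renderer: every pixel is shaded first, depth-tested last.
# Screen size, rectangle "objects", and all names here are illustrative assumptions.

WIDTH, HEIGHT = 8, 8
FAR = float("inf")

def shade(name):
    """Stand-in for the expensive shading and texturing work."""
    return name[0].upper()

def draw_immediate(objects):
    color = [["." for _ in range(WIDTH)] for _ in range(HEIGHT)]
    zbuf = [[FAR for _ in range(WIDTH)] for _ in range(HEIGHT)]
    shaded = 0
    for name, x0, y0, x1, y1, z in objects:   # submission order, not depth order
        for y in range(y0, y1):
            for x in range(x0, x1):
                pixel = shade(name)           # the work is done up front
                shaded += 1
                if z < zbuf[y][x]:            # the depth test comes last
                    zbuf[y][x] = z
                    color[y][x] = pixel
    return color, shaded

# Barrel B is farther away (larger z) and arrives first; crate A then covers it.
scene = [
    ("barrel", 2, 2, 6, 6, 10.0),   # 4x4 = 16 pixels
    ("crate", 1, 1, 7, 7, 5.0),     # 6x6 = 36 pixels
]
image, shaded = draw_immediate(scene)
visible = sum(ch != "." for row in image for ch in row)
print(f"shaded {shaded} pixels to show {visible}")
```

The barrel's pixels all get shaded and then all get overwritten, which is the wasted effort in question.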

In other words, sending a polygon to an immediate mode renderer is like handing Rosie O'Donnell a gift certificate to Luby's.

This process of drawing pixels that will be obscured by others is known as overdraw. Overdraw is the scourge of efficiency in real-time 3D graphics, and it feeds the number-one performance problem for graphics cards these days: memory bandwidth bottlenecks. Memory bandwidth limitations are the reason many newer 3D graphics cards aren't much faster than their predecessors. An old GeForce DDR, for instance, will outrun a brand-new GeForce2 MX 400, because the older card's 128-bit DDR memory interface gives it the edge.
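To put rough numbers on that, here is a back-of-the-envelope sketch. Only the 24-bit Z and 32-bit color figures come from the description above; the resolution, frame rate, overdraw factors, and the simple read-Z, write-Z, write-color traffic model are assumptions picked for illustration, and texture fetches are ignored entirely.

```python
# Back-of-the-envelope frame buffer traffic for an immediate mode renderer.
# Only the 24-bit Z and 32-bit color sizes come from the text; everything else
# (resolution, frame rate, overdraw factor, traffic model) is an assumption.

Z_BYTES = 3        # 24 bits of depth
COLOR_BYTES = 4    # 32 bits of color and transparency

def framebuffer_traffic(width, height, fps, overdraw):
    """Rough bytes per second, assuming each drawn pixel costs one Z read,
    one Z write, and one color write. Texture reads are not counted."""
    per_pixel = Z_BYTES + Z_BYTES + COLOR_BYTES
    return width * height * overdraw * per_pixel * fps

base = framebuffer_traffic(1024, 768, 60, overdraw=1.0)
heavy = framebuffer_traffic(1024, 768, 60, overdraw=3.0)
print(f"{base / 1e6:.0f} MB/s with no overdraw")
print(f"{heavy / 1e6:.0f} MB/s at 3x overdraw")
```

Under this simple model the traffic scales linearly with the overdraw factor, so every extra layer of hidden crates and barrels costs as much bandwidth as a layer you can actually see.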

Deferred rendering
To avoid overdraw and better use resources, the Kyro II chip takes things in a different order than immediate mode renderers. Heck, it takes a very different approach altogether. The Kyro uses tile-based rendering, segmenting the display into small sections and processing each portion in turn. Busting the screen into tiles gives the Kyro several advantages. Among them:

  • Hidden surface removal — The Kyro chip maintains a list of polygons for each tile, and it determines early in the rendering process which polygons in the tile will be visible and which will be occluded. Making this determination for an entire scene at once would be difficult, but breaking the scene down into chunks makes it feasible.

  • Deferred rendering — Only after the removal of hidden surfaces are the remaining pixels shaded and textured. In this way, overdraw is dramatically reduced. That means there's less per-pixel shading work (not to be confused with the per-vertex lighting that is the L in T&L) and much less bandwidth used pulling texture data from the graphics card's memory.

  • On-chip operations — The Kyro has an on-chip "tile buffer" in which it stores the entire contents of the tile. The chip can then perform a number of operations on those pixels, like pixel blending and layering on additional textures, without accessing the card's frame buffer (or Z-buffer) memory. As if the memory bandwidth savings alone weren't enough, the Kyro's makers claim this arrangement allows them to use lots of internal precision in such operations, improving image quality.
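
The three steps in the list above can be sketched in a few lines. This is a loose illustration under stated assumptions, not the Kyro's actual hardware logic: the 4x4 tile size, the flat rectangles standing in for polygons, and the names `bin_into_tiles` and `draw_deferred` are all hypothetical.

```python
# Toy tile-based deferred renderer. Tile size, data layout, and function names
# are hypothetical; this follows the steps in the text, not real Kyro hardware.

TILE = 4
WIDTH, HEIGHT = 8, 8

def shade(name):
    """Stand-in for the expensive shading and texturing work."""
    return name[0].upper()

def bin_into_tiles(objects):
    """Build a per-tile list of the rectangles that touch each tile."""
    bins = {}
    for obj in objects:
        _, x0, y0, x1, y1, _ = obj
        for ty in range(y0 // TILE, (y1 - 1) // TILE + 1):
            for tx in range(x0 // TILE, (x1 - 1) // TILE + 1):
                bins.setdefault((tx, ty), []).append(obj)
    return bins

def draw_deferred(objects):
    frame = [["." for _ in range(WIDTH)] for _ in range(HEIGHT)]
    shaded = 0
    for (tx, ty), tile_objs in bin_into_tiles(objects).items():
        # Stand-in for the on-chip tile buffer: find the nearest surface
        # for each pixel in the tile before any shading happens.
        nearest = {}
        for name, x0, y0, x1, y1, z in tile_objs:
            for y in range(max(y0, ty * TILE), min(y1, (ty + 1) * TILE)):
                for x in range(max(x0, tx * TILE), min(x1, (tx + 1) * TILE)):
                    if (x, y) not in nearest or z < nearest[(x, y)][0]:
                        nearest[(x, y)] = (z, name)
        # Deferred step: shade only the survivors, then write the tile out.
        for (x, y), (_, name) in nearest.items():
            frame[y][x] = shade(name)
            shaded += 1
    return frame, shaded

# Same scene as before: a far barrel hidden entirely behind a nearer crate.
scene = [
    ("barrel", 2, 2, 6, 6, 10.0),   # 4x4 = 16 pixels, all hidden
    ("crate", 1, 1, 7, 7, 5.0),     # 6x6 = 36 pixels, all visible
]
frame, shaded = draw_deferred(scene)
print(f"shaded {shaded} pixels")
```

In this toy, the hidden barrel never reaches the shading step at all, and the depth comparison lives in a small per-tile structure rather than a full-screen Z-buffer.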
Taken together, these benefits are formidable. It's hard not to think the deferred rendering approach will have a prominent place in the future of real-time 3D graphics.