Eight bits of precision isn't enough even for full range static image display. Images with a wide range usually come out fine, but restricted range images can easily show banding on a 24-bit display. Digital television specifies 10 bits of precision, and many printing operations are performed with 12 bits of precision.

32 bits just isn't enough... And after 3dfx fought the move from 16-bit color for half a year! The Carmack won't stand for any of that nonsense this time around, either. He ends his little dissertation on the need for 64 bits with a flourish: "64 bit pixels. It is The Right Thing to do. Hardware vendors: don't you be the company that is the last to make the transition." Ouch.
The situation becomes much worse when you consider the losses after multiple operations. As a trivial case, consider having multiple lights on a wall, with their contribution to a pixel determined by a texture lookup. A single light will fall off towards 0 some distance away, and if it covers a large area, it will have visible bands as the light adds one unit, two units, etc. Each additional light from the same relative distance stacks its contribution on top of the earlier ones, which magnifies the amount of the step between bands: instead of going 0,1,2, it goes 0,2,4, etc. Pile a few lights up like this and look towards the dimmer area of the falloff, and you can believe you are back in 256-color land.
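The step-doubling Carmack describes is easy to see with a little arithmetic. Here's a toy sketch (the 8-bit quantization model and the sample values are illustrative assumptions, not engine code): each light's contribution is quantized to one of 256 levels before being summed, so two equal lights make every band twice as tall.

```python
LEVELS = 255  # 8 bits per channel

def quantize(intensity):
    """Truncate a 0..1 intensity to a representable 8-bit level.
    This per-light quantization is where the banding comes from."""
    return int(intensity * LEVELS)

# Sample a dim falloff region at closely spaced points (illustrative values).
samples = [i / 1000 for i in range(0, 21)]  # intensities 0.000 .. 0.020

# One light: adjacent bands differ by 1 unit (0, 1, 2, ...).
one_light = [quantize(s) for s in samples]

# Two identical lights: each is quantized first, then summed,
# so the bands now step by 2 units (0, 2, 4, ...).
two_lights = [2 * quantize(s) for s in samples]

print(sorted(set(one_light)))   # [0, 1, 2, 3, 4, 5]
print(sorted(set(two_lights)))  # [0, 2, 4, 6, 8, 10]
```

With higher-precision pixels the quantization happens once, at display time, instead of once per light, and the stacked error never accumulates.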
The most interesting thing in Carmack's update, however, is a mention that current hardware-based 3D rendering techniques could be adapted to match advanced software rendering programs:
Mark Peercy of SGI has shown, quite surprisingly, that all Renderman surface shaders can be decomposed into multi-pass graphics operations if two extensions are provided over basic OpenGL: the existing pixel texture extension, which allows dependent texture lookups (matrox already supports a form of this, and most vendors will over the next year), and signed, floating point colors through the graphics pipeline. It also makes heavy use of the existing, but rarely optimized, copyTexSubImage2D functionality for temporaries.

Quake III, by contrast, renders in 10 passes, I believe. So when people say there's still a lot of room for improvement in 3D graphics hardware, don't let the nifty checkbox feature set on that fancy new AGP card fool you.
This is a truly striking result. In retrospect, it seems obvious that with adds, multiplies, table lookups, and stencil tests you can perform any computation, but most people were working under the assumption that there were fundamentally different limitations for "realtime" renderers versus offline renderers. It may take hundreds or thousands of passes, but it clearly defines an approach with no fundamental limits.
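The core of the multi-pass idea can be sketched in a few lines. This is a toy model, not Peercy's actual scheme: the pass and blend names are invented for illustration, the "framebuffer" is just a list, and the dependent texture lookups and signed floats the real decomposition needs are omitted. What it shows is how a shader expression breaks down into one blend operation per pass.

```python
# Toy multi-pass renderer: each pass combines one "texture" into the
# framebuffer using a single blend operation, mimicking how fixed-function
# hardware builds up a shader expression one pass at a time.

def render_pass(framebuffer, texture, blend):
    """One rendering pass: apply a blend op per pixel."""
    ops = {
        "replace":  lambda dst, src: src,        # overwrite
        "add":      lambda dst, src: dst + src,  # additive blend
        "modulate": lambda dst, src: dst * src,  # multiplicative blend
    }
    op = ops[blend]
    return [op(d, s) for d, s in zip(framebuffer, texture)]

# Per-pixel shader expression to evaluate: diffuse * lightmap + glow
# (three pixels, illustrative values)
diffuse  = [0.8, 0.5, 0.2]
lightmap = [1.0, 0.5, 0.25]
glow     = [0.1, 0.1, 0.1]

fb = [0.0, 0.0, 0.0]
fb = render_pass(fb, diffuse,  "replace")   # pass 1: lay down diffuse
fb = render_pass(fb, lightmap, "modulate")  # pass 2: multiply in lighting
fb = render_pass(fb, glow,     "add")       # pass 3: add emissive term

# The passes compute the same thing as evaluating the expression directly.
direct = [d * l + g for d, l, g in zip(diffuse, lightmap, glow)]
assert fb == direct
```

Each extra term in the expression costs another pass, which is why a full Renderman shader can balloon into hundreds or thousands of them; but as the article notes, nothing in the construction ever hits a hard wall.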