Hyper-Threading and resource sharing
Jon Stokes' discussion of Hyper-Threading's resource sharing arrangements gave me some ideas for testing, and I checked with Jon to see what he thought of them. The results below are my fault, but Jon gets credit for helping if you like them.

I decided to use Linpack, which can visually represent L1 and L2 cache size and performance, to illustrate HT's division of the L2 cache between logical processors. To do so, I ran a Quake III Arena botmatch in a 640x480 window on the Windows desktop. Then, with Q3A running, I kicked off Linpack. The game's "r_smp" variable was set to zero in all cases. Here's what I found.

With Hyper-Threading enabled, L2 cache performance changes dramatically. The total cache bandwidth available is half what it is without HT, assuming an overworked FPU isn't the culprit. Also, the Q3A + HT config peaks at a matrix size of about 192K, earlier than the Q3A + non-HT config. Effectively, the cache is smaller because it is being shared between two logical processors. This is just as Jon's article predicted.
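
For readers wondering how a sweep over matrix sizes can expose cache size at all, here is a minimal sketch of the idea in Python. It is not Linpack and not the tool used for these tests. The function names (daxpy, sweep), the total_elems parameter, and the list of sizes are all my own assumptions for illustration, as is reading the matrix sizes as KB; only the 192 and 270 figures echo the text. The sketch runs the same simple floating-point kernel over progressively larger working sets and reports a rate for each. In a compiled benchmark, that rate drops once the working set no longer fits in a given cache level. In pure Python, interpreter overhead will likely swamp the effect, so treat this as a picture of the method rather than a usable measurement.

```python
import time
from array import array


def daxpy(a, x, y):
    """y[i] += a * x[i], the kind of inner kernel Linpack-style solvers lean on."""
    for i in range(len(x)):
        y[i] += a * x[i]


def sweep(sizes_kb, total_elems=200_000):
    """Time daxpy over working sets of the given sizes (assumed to be in KB).

    Each size does roughly the same total work, so the rates are comparable.
    Returns a list of (size_kb, mflops) pairs.
    """
    results = []
    for size_kb in sizes_kb:
        # Two arrays of 8-byte doubles share the working set.
        n = max(1, size_kb * 1024 // (2 * 8))
        x = array("d", [1.0]) * n
        y = array("d", [0.0]) * n
        passes = max(1, total_elems // n)
        start = time.perf_counter()
        for _ in range(passes):
            daxpy(0.5, x, y)
        # Guard against a zero reading from a coarse timer.
        elapsed = max(time.perf_counter() - start, 1e-9)
        flops = 2 * n * passes  # one multiply and one add per element
        results.append((size_kb, flops / elapsed / 1e6))
    return results


if __name__ == "__main__":
    # Hypothetical size list; 192 and 270 are the sizes discussed in the text.
    for size_kb, mflops in sweep([16, 64, 192, 270, 512, 1024]):
        print(f"{size_kb:5d} KB  {mflops:8.2f} MFLOPS")
```

Running a sweep like this alone, and then again while a second program competes for the same cache, is the shape of the experiment described here: if the competing program takes a share of the cache, the drop-off should arrive at a smaller working-set size.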

Also, that "hitch" in the Q3A + HT performance at around 270K matrix size is no fluke. What you're seeing above is an average of three Linpack runs, but the individual Linpack runs all exhibited the same quirk:

I suspect this "hitch" shows us something about how Intel's HT logic manages cache sharing, but I won't venture a guess beyond that.

Before you worry too much about losing cache space and bandwidth with Hyper-Threading, though, read on. I showed these results to Jon, and he suggested turning the tables a bit:

I wonder how much Q3A actually benefits from the cache in the first place and is therefore affected by HT. I recall that the original Quake didn't suffer too much a hit on the cacheless Celeron, because even when it has a cache it dirties the d-cache quite a bit. So if you have Q3A mostly dirtying the cache with data that it's not going to reuse, and then you have Linpack trying to store matrices in the cache at the same time, then I would expect the Linpack performance to suffer much more than the Q3A performance. In other words, if you ran the same tests, but benchmarked _Q3A's framerate_ rather than Linpack, my tentative hunch is that you'd see that Q3A's performance degrades much slower from hyperthreading under the same conditions as Linpack.
To check Jon's hunch, I tested Q3A performance with Linpack running, both with and without Hyper-Threading enabled.

The results were just what we suspected. The effects of Hyper-Threading's resource sharing mechanisms will vary greatly depending on the type of applications used.