Howdy all. I’ve been hard at work in Damage Labs setting up new test systems with Windows 7. This is a particularly agonizing chore for me, because I want to be sure to set up everything properly and perfectly the same (as much as possible) between the different systems. Also, generally what happens is I go into this process looking to incorporate as many new benchmarks as possible, but then for various reasons (time constraints, poor application performance scaling, lack of counters for timing operations, software licensing/DRM restrictions), I end up using many of the same tests as in the past generation of results. That kind of looks to be repeating itself in this new round of CPU tests, although I do expect to add a few new games, 7-Zip, and Windows Live Movie Maker, at least—along with new versions of a great many applications.
The question of the day, however, has to do with power management features. Typically, we’ve left features like SpeedStep and Cool’n’Quiet disabled for our general performance tests, only enabling them when we do power efficiency testing. We’ve disabled them for multiple reasons, mainly because they can affect performance results in some tests. The picCOLOR application benchmark we’ve used for a while, for instance, does many short, quick operations spaced a little bit apart. As a result, the CPUs don’t have time to ramp up their clock speeds for each operation, and the results come out lower than with power management disabled.
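For anyone who wants to poke at this effect on their own machine, here's a rough sketch in Python of the access pattern described above: short bursts of CPU work with idle gaps in between, timed against the same bursts run back to back. To be clear, this is not picCOLOR or anything from our test suite. The function names, loop sizes, and gap lengths are all made up for illustration, and whether you see any difference will depend on your CPU, OS power profile, and how quickly the clocks ramp.

```python
import time


def busy_work(iterations):
    """CPU-bound loop standing in for one short operation (hypothetical workload)."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def time_bursts(bursts, iterations, idle_gap_s):
    """Time each burst on its own, sleeping idle_gap_s seconds between bursts.

    The sleep gives the CPU a chance to drop to a lower clock state,
    so each new burst may start before the clock has ramped back up.
    """
    durations = []
    for _ in range(bursts):
        start = time.perf_counter()
        busy_work(iterations)
        durations.append(time.perf_counter() - start)
        time.sleep(idle_gap_s)
    return durations


def mean_ms(durations):
    """Average duration in milliseconds."""
    return sum(durations) / len(durations) * 1000.0


if __name__ == "__main__":
    # Assumed values: 20 bursts, with a 250 ms idle gap in the spaced case.
    spaced = time_bursts(bursts=20, iterations=200_000, idle_gap_s=0.25)
    back_to_back = time_bursts(bursts=20, iterations=200_000, idle_gap_s=0.0)
    print(f"spaced bursts:       {mean_ms(spaced):.2f} ms per burst")
    print(f"back-to-back bursts: {mean_ms(back_to_back):.2f} ms per burst")
```

Run it once with power management enabled and once with it disabled, and compare the spaced numbers. If the spaced bursts come out noticeably slower only with power management on, that's the ramp-up penalty at work.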
I also have a sense that, generally speaking, when cases like this occur, AMD processors are more likely to be negatively affected than Intel processors, in part because AMD chips tend to drop to lower clock speeds at idle and in part because of a history of real problems with AMD CPUs, CnQ, and performance.
The question is: Isn’t that fair game, though? "Balanced" is the default power profile in Vista and Win7, and the vast majority of folks are going to want to have these power-saving features enabled in order to cut down on the noise, heat, and power consumption of their systems. In a case like the picCOLOR benchmark, I think we have a simple solution: use a workload more like what a real user would run, with higher-resolution images and longer operations. We can work with the software’s developer on that. In other cases, well, most of our benchmarks already reflect real-world use pretty well and aren’t so affected by SpeedStep and Cool’n’Quiet. If they are, the odds are pretty good that a real user might experience the same drop-off in performance, and even if it’s not perceptible, there’s no reason not to include it in our performance measurements.
I’m of two minds on this question. The one sticking point that keeps me from making the switch and leaving power management features enabled is the possibility that our test results will be rendered unreliable in some cases, either due to big differences in outcome from one run to the next or to the occasional outright incompatibility between these features and a piece of software (which we’ve seen in some games in days past). And I’m on a deadline here, so that prospect frightens me more than you might expect. Still, it’s 2009, I’ve been using CnQ and SpeedStep on my own desktop and laptop systems for years, and I consider them integral features of a modern CPU. Seems like it might be time to test ’em like we use ’em.
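One way to put a number on that run-to-run worry is the coefficient of variation: the standard deviation of a set of repeated runs divided by their mean. Here's a minimal sketch, assuming you have a handful of scores from repeated runs of the same test. The function names and the 2% cutoff are hypothetical, picked only for illustration, not a threshold we actually use.

```python
import statistics


def run_to_run_variation(scores):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return statistics.stdev(scores) / statistics.mean(scores)


def looks_repeatable(scores, max_variation=0.02):
    """True if the runs vary by no more than max_variation (assumed 2% cutoff)."""
    return run_to_run_variation(scores) <= max_variation


if __name__ == "__main__":
    # Made-up example scores, not real benchmark results.
    example_runs = [90.0, 100.0, 110.0]
    print(f"variation: {run_to_run_variation(example_runs):.1%}")
    print(f"repeatable: {looks_repeatable(example_runs)}")
```

Comparing that figure for the same test with power management on and off would show whether the features really make the results less repeatable, or whether the fear is overblown.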
Hmm. What do you all think?