Get past all that, however, and the first question that comes to mind (to us anyway) is “But is it any faster?” In the past, the tendency has been for newer Microsoft operating systems to run more slowly on older hardware, but offer performance benefits on well-equipped systems. Windows XP offers a new twist in that it’s designed to bridge the gap between the 9x kernel and the NT kernel. How does Windows XP stack up from a performance standpoint, compared not only to Windows 2000, but to Windows ME?
Apart from InfoWorld’s results, which show WinXP to be much slower than Win2K, not much research has been done on the performance characteristics of Microsoft’s new operating system. Since Windows XP stands to replace both Windows 2000 and Windows ME, it seems appropriate to compare the new OS to its predecessors on a variety of tests; simply throwing a couple of Office tests at these OSes won’t do them justice. Additionally, given the past importance of hardware on the results, testing on one platform would be inadequate. Therefore, we’ve compiled a wide array of synthetic and real-world benchmarks, on both a high-end and low-end test platform, to determine the ultimate Windows performance king.
Is WinXP a dud? Is the 9x core really that dated? Is there a reason to upgrade from Windows 2000? Read on and find out.
Before we get into the performance benchmarks, it’s worth taking a moment to consider the operating systems in question. We’ve chosen the most recent (pre-WinXP) versions of Microsoft’s business and home operating systems to compare to the newly released WinXP.
Using Windows 2000 is a no-brainer here, but the choice of ME might ruffle a few feathers. Some might argue that Windows 98 or 98SE would be a better choice. However, Microsoft (at least their marketing department) claims that WinME is the pinnacle of the 9x kernel and the immediate predecessor to Windows XP. We decided to take MS at their word, which means ME is the best choice to represent the 9x kernel. We don’t use 9x-based OSes for benchmarking much here at TR; we’re NT snobs, so it’s all 2000 or XP. Because XP forever banishes the 9x core from Microsoft’s OS stable, it’s only fair that we give it one last chance to go down in a blaze of glory.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run three times, and the results were averaged.
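For readers who want to reproduce the averaging step, here is a minimal sketch in Python. It assumes each test yields one numeric score per run; the function name `summarize_runs` and the sample scores are hypothetical placeholders, not numbers from our tests. It also reports the run-to-run spread, which is handy when judging whether a small lead means anything.

```python
from statistics import mean, stdev


def summarize_runs(scores):
    """Return (average, sample standard deviation) for repeated runs of one test."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return mean(scores), stdev(scores)


# Hypothetical placeholder scores for three runs; NOT results from this article.
runs = [40.0, 41.0, 42.0]
avg, spread = summarize_runs(runs)
print(f"average: {avg:.1f}, run-to-run spread: {spread:.1f}")
```

With only three runs the spread is a rough estimate at best, so treat it as a sanity check rather than a formal error bar.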
Our test systems were configured as follows:
The test systems’ Windows desktops were set at 1024×768 in 32-bit color at a 75Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.
For both WinME and WinXP, the System Restore utility was disabled. Otherwise, the OSes were left in their default configurations with no tweaking. Both XP and 2000 were installed on NTFS partitions; WinME was installed on a FAT32 partition due to its lack of NTFS support.
We used the following versions of our test applications:
- SiSoft Sandra Standard 2001.3.7.50
- ZD Media Business Winstone 2001 1.0.2
- ZD Media Content Creation Winstone 2001 1.0.2
- SysMark 2001 Patch 3
- POV-Ray for Windows version 3.1g
- ScienceMark 1.0
- SPECviewperf 6.1.2
- MadOnion 3DMark 2001 Build 200
- Quake III: Team Arena
- Max Payne v1.02
All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
SiSoft Sandra’s Stream memory benchmark is up first. Ideally, we want to see identical values for each OS, as this is really just a straight up memory bandwidth test. There’s really no reason for our OSes to differ here, but we’ll check just to be thorough.
FPU-wise, the OSes are identical.
With ALU scores being essentially identical, we know that none of the OSes are constraining things on the memory bandwidth front. This applies for both the high-end and low-end systems, which differ both in amount of RAM and memory bandwidth.
With that formality out of the way, let’s get onto some more interesting tests.
First up for our real-world tests are ZD’s Winstone suites. These tests are important for two reasons: First, they represent the kind of usage you would likely find in a real work environment. Additionally, they should provide a good counterpoint to InfoWorld’s tests, which indicate that, at least with Office XP, business tasks are much slower in Windows XP. The Winstone suite, however, uses Office 2000 and a much more diverse set of tests.
Our high-end machine sees a steady and measurable increase in performance going from Windows ME, to Windows 2000, to Windows XP. While the margin of WinXP’s lead is small, it’s certainly outside the margin of error. WinME falls just below 2000 in performance, a possible sign of its aging 9x core.
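One simple way to sanity-check a claim like "outside the margin of error" is sketched below in Python. The rule used here is an assumption for illustration, not a formal statistical test and not necessarily how we judged our own numbers: a lead counts only if the leader's worst run still beats the trailer's best run. The function names and scores are hypothetical placeholders.

```python
def percent_lead(leader_runs, trailer_runs):
    """Percentage by which the leader's average exceeds the trailer's average."""
    leader_avg = sum(leader_runs) / len(leader_runs)
    trailer_avg = sum(trailer_runs) / len(trailer_runs)
    return (leader_avg - trailer_avg) / trailer_avg * 100.0


def lead_is_outside_noise(leader_runs, trailer_runs):
    """Assumed rule: every leader run beats every trailer run (higher is better)."""
    return min(leader_runs) > max(trailer_runs)


# Hypothetical placeholder scores; NOT results from this article.
os_a = [51.0, 51.5, 52.0]
os_b = [50.0, 50.0, 50.0]
print(f"lead: {percent_lead(os_a, os_b):.1f}%")
print("outside run-to-run noise:", lead_is_outside_noise(os_a, os_b))
```

If the run ranges overlap, the safer call is a draw, which is how we treat several of the XP versus 2000 results later on.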
On the low-end machine, the results are less dramatic. The Business Winstone test isn’t that taxing to begin with, and it’s likely the 256MB of RAM in our low-end box that’s keeping XP and 2000 from putting in better performances. Still, despite their larger footprints, the two NT-based OSes don’t lose any ground to WinME.
The more demanding Content Creation test sees our high-end machine pull off a slightly more decisive victory in XP than it did in our Business test. Windows ME really suffers here; it seems the 9x core just can’t keep up with the higher demands of the Content Creation test.
Even on the low-end machine, WinME scores well behind the other OSes. WinXP takes a bigger lead over 2000 with this test, making it the clear performance leader on both hardware platforms.
Like ZD’s Winstone tests, SysMark runs two test suites to simulate general business and content creation environments. Will SysMark’s results continue the trends seen with the ZD tests?
For Office Productivity, yes. At the high end, XP takes another victory over 2000, and Windows ME is once again third. The differences in performance aren’t huge, but XP still manages to establish a clear lead.
Our low-end machine continues XP’s winning streak, but with a smaller margin than the high-end box. Windows ME’s core shows its age, placing a distant third.
Moving to the more demanding Internet Content Creation tests, WinXP takes its first loss to Win2K on our high-end test bed. With only System Restore disabled, it’s possible this loss is due to some of the additional features that XP has over Windows 2000, such as the new GUI. Windows ME continues to take a beating.
The situation is repeated on the low end; WinXP loses to 2000 by a small margin, while WinME brings up the rear.
Next we move our testing parameters to the world of number crunching with ScienceMark. Running simulations of things I haven’t a prayer of understanding, ScienceMark at least produces some results that are easy to read. Let’s take a look.
At the high end, XP and 2000’s overall scores closely follow each other, and Windows ME lags again. ScienceMark is a computationally intensive test, and as such it sees a significant performance increase moving to the NT core. With results so close between XP and 2000, I’m going to have to call this one a draw.
With our low-end hardware, the high-end trend persists as 2000 and XP are too closely matched to call. ME’s 9x core prevents ScienceMark from leveraging the full computational potential of the hardware, and it stays in last place.
From scientific math to 3D rendering, the POV-Ray test measures the time required to perform a raytrace rendering.
POV-Ray cares more about a machine’s hardware than its operating system. Given equal hardware, all the operating systems offer essentially the same performance on this test.
Delving further into the world of 3D, we come to SPEC’s viewperf suite, designed to measure 3D workstation performance. SPEC’s tests run the gamut, from wire frames to fully textured objects.
The SPEC scores all displayed very similar trends; the benchmark displayed no preference for Windows 2000 or Windows XP on either test system, turning in virtually identical numbers on each test for each of the NT kernel operating systems. The Windows ME SPEC scores seem to indicate a driver problem with the high-end test machine; it scored substantially lower than the low-end machine on all of the tests. Regardless, WinME once again comes in decidedly last.
All work and no play makes this benchmarking monkey, well, bored; let’s get to some gaming benchmarks. Taking care of our synthetic Direct3D needs is MadOnion’s 3DMark 2001.
Windows 2000 gets a marginal victory over XP with our high-end system, but one well within the margin of error. Windows ME trails significantly; it seems even gamers have no reason to choose the 9x core.
At the low end, there’s little difference between scores, almost certainly because of a video card bottleneck; our MX is likely flailing to keep up in 3DMark 2001, no matter what OS is running.
From DirectX to OpenGL, Vulpine’s GLMark synthetic benchmark is up next.
The performance picture continues 3DMark’s Direct3D pattern: scores for the high-end show XP and 2000 tied, with WinME lagging appreciably. Video card limitations result in a three-way tie on the low-end system.
Enough synthetic benchmarks; let’s move into the real world with some Quake III: Team Arena. Will 2000 and XP continue to match each other? Can Windows ME claw its way back into the picture? Let’s find out.
Team Arena’s Fastest graphics setting has a great, um, personality. Unfortunately, it’s severely lacking in eye candy. It appears our high-end Windows ME driver issue is rearing its ugly head again, as the high-end system gets embarrassed by the low-end box. That anomaly aside, with the detail turned down, all of the operating systems put in a relatively good performance.
Moving to High Quality changes things somewhat. Windows 2000 and XP are still neck and neck, while Windows ME takes a huge hit. It’s difficult to say if the high-end system’s WinME performance can be blamed on our driver issue or not; since this time it manages to beat the low-end system, we’ll blame the low numbers on Windows ME itself and not our driver bug.
With everything maxed out, our high-end picture really doesn’t change. While ME’s score looks more and more like a credible performance, it’s still well below the XP/2000 tag team. On the low end, things get more interesting. Windows 2000 pulls out a much better slideshow than both XP and ME as the GeForce2 MX struggles with the high-resolution environment. It’s unclear why Windows 2000 does better in this scenario, but it’s largely academic, since nobody is going to want to play Quake III with their frame rate in the teens anyway.
To round out our benchmarking suite, Max Payne makes an appearance. Max contrasts Team Arena’s OpenGL with Direct3D rendering.
In a result that will surprise no one, Windows XP and Windows 2000 are in a dead heat, while Windows ME brings up the rear, albeit not as badly as in some of the other tests. The trend is the same on the high-end and low-end machines.
Max (no pun intended) out the resolution and the high-end picture remains the same. The low end gets pretty barren, as Windows XP and Windows ME both fail to complete the benchmark. At least we’ve learned that if you want to run a Max Payne slide show at 1600×1200 on a GeForce2 MX, Windows 2000 is your best choice. If you ever decide this is a good idea, seek help.
Toss aside the WPA, the bundled services, and the new GUI (especially the new GUI) and you’re left with this fact: The numbers don’t lie. Regardless of how you feel about all the aforementioned goodies that Microsoft claims make Windows XP better than its predecessors, based on our tests, there’s no reason to call XP a performance dud.
Given all of the new, err, features, it’s actually somewhat surprising how close the performance race is between XP and 2000. The fact that XP does so well with SysMark and the Winstone tests would indicate that it’s quite a business performer, at least with the software used in those benchmark suites. It’s difficult to say why InfoWorld’s results were so different; one possible suspect is not enough RAM, but that’s difficult to verify since they didn’t reveal how much memory was in their test systems.
Still, if you’re running Windows 2000, XP isn’t much of a compelling upgrade in terms of raw performance. By themselves, the performance gains you will see don’t justify the upgrade price and additional baggage. WinXP doesn’t really need anything above our low-end system to be useful, though it does seem to have an appetite for RAM. Luckily, RAM is the cheapest it’s ever been.
Windows XP signifies the end of the 9x core. Based on our tests, users of WinME (and likely any other 9x-based OS) should upgrade immediately. WinME just can’t keep up with the NT core, and upgraders will see large improvements in stability as well.
Between the new GUI, the WPA and the feature creep, there are plenty of reasons to bag on Windows XP, but performance isn’t one of them. Bloated or not, from a performance standpoint, Windows XP is a worthy successor to 2000.