The Damage Report

The great upgrade: Tales from the ancient past
— 10:51 AM on July 14, 2010

Our current contest challenges readers to relate their first PC upgrade experiences for a chance to win a copy of Just Cause 2.  We already have quite a few good entries, and skimming through them caused me to think back to one of my first PC upgrades—and wouldn't you know, I wrote an article about it and posted it on the web.  With performance graphs!  The year was 1997.  I think.  Pentiums were all the rage, 3D acceleration was dawning, and I somehow had good things to say about the awful Quantum Bigfoot hard drive I'd picked up in an online auction. Just like 3dfx, I never quite finished what I'd promised, but it's a fun jog down memory lane, regardless.

MAY 19: THE FIRST STEPS

Now that my computer is alive again, I can tell you about what I've been doing and why I haven't been able to update this page. Having finished my big papers for the term, I took some time and upgraded my computer system. I started with a Gateway 2000 system with a 100Mhz Pentium and an STB Powergraph 64 video card. This was a decent setup, with 40Mb of RAM and plenty of nice cards, drives, etc., but it was a tad on the slow side. For one thing, around the time I bought my PC, Gateway decided that the advent of EDO memory meant they didn't need to include a level 2 cache in their systems. They have since repented, but my motherboard had no cache and no place to add one. Anyhow, for a number of good but very techno-nerd intensive reasons, I decided to see how much I could improve this system's performance. I snagged an STB Lightspeed 128 graphics card (with 2Mb of MDRAM) in an online auction for 38 bucks—no kidding. The thing delivers blisteringly fast video. Then I ordered a new motherboard, following the advice I found at Tom's Hardware Guide. The motherboard is an Abit IT5H, which has a 512K L2 cache, a Pentium-style processor socket and a so-very-nifty "soft menu" BIOS—in other words, it's a jumperless motherboard that can be tweaked to no end via software.

I replaced the video card and the motherboard, using the memory and processor from my previous motherboard to populate the new one. Then I strapped a heat sink with a fan on top of my processor's current heat sink for a cooling double-whammy. Now I've got it all up and running, with my system bus running at an atmospheric 75Mhz—faster than a new Pentium II system's bus—and my processor overclocked to 112.5Mhz. Preliminary benchmark results are available here: 

Before: P5-100Mhz, 66Mhz bus, 430FX, no L2 cache, STB Powergraph 64 video card
After:  P5-112.5Mhz, 75Mhz bus, 430HX, 512K L2 cache, STB Lightspeed 128 video card

Test                                Before     After      Performance gain
WinTune 97:
Dhrystone MIPS (integer)            175        209        19.43%
Whetstone MFLOPS (floating point)   53         63         18.87%
Video speed MP/sec                  13         31         138.46%
Create Window                       0.00919    0.00484    47.33%
Scroll Text                         9.19       2.15       76.61%
Draw lines/curves                   8.16       5.22       36.03%
Draw filled objects                 3.00       1.11       63.00%
Destroy window                      0.02530    0.01050    58.50%
RAM read average Mb/sec             157        234        49.04%
RAM write average Mb/sec            84         95         13.10%
RAM copy average Mb/sec             49         70         42.86%
WinQuake Timedemo:
demo1 320x200 frames/sec            23.1       33.7       45.89%
demo2 320x200 frames/sec            24.6       34.9       41.87%
demo1 640x480 frames/sec            9.4        13.8       46.81%
demo2 640x480 frames/sec            10.8       15.6       44.44%

Average performance increase                              49.48%
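For anyone who wants to check the arithmetic, here is a minimal Python sketch of how the gain column appears to be computed. It assumes the WinTune window and drawing tests are timings where lower is better (the original doesn't say so, but that is the reading that reproduces the listed percentages). The function and variable names are mine.

```python
# Each entry: (test name, before, after, higher_is_better).
# Treating the window/drawing tests as timings (lower is better) is an
# assumption; it is the reading that reproduces the published percentages.
RESULTS = [
    ("Dhrystone MIPS (integer)", 175, 209, True),
    ("Whetstone MFLOPS (floating point)", 53, 63, True),
    ("Video speed MP/sec", 13, 31, True),
    ("Create Window", 0.00919, 0.00484, False),
    ("Scroll Text", 9.19, 2.15, False),
    ("Draw lines/curves", 8.16, 5.22, False),
    ("Draw filled objects", 3.00, 1.11, False),
    ("Destroy window", 0.02530, 0.01050, False),
    ("RAM read average Mb/sec", 157, 234, True),
    ("RAM write average Mb/sec", 84, 95, True),
    ("RAM copy average Mb/sec", 49, 70, True),
    ("WinQuake demo1 320x200", 23.1, 33.7, True),
    ("WinQuake demo2 320x200", 24.6, 34.9, True),
    ("WinQuake demo1 640x480", 9.4, 13.8, True),
    ("WinQuake demo2 640x480", 10.8, 15.6, True),
]


def percent_gain(before, after, higher_is_better):
    """Improvement relative to the old system, in percent."""
    change = after - before if higher_is_better else before - after
    return change / before * 100


gains = {name: percent_gain(b, a, hib) for name, b, a, hib in RESULTS}
average = sum(gains.values()) / len(gains)
quake = [g for name, g in gains.items() if name.startswith("WinQuake")]

print(f"Average gain: {average:.2f}%")
print(f"WinQuake-only gain: {sum(quake) / len(quake):.2f}%")
```

The overall average comes out at 49.48%, matching the table. The WinQuake runs alone average 44.75%.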

The testing was by no means scientific, but the results do give some indication of the upgrade's effectiveness. The long and the short of it is that I spent under $200, including shipping and handling, and now my system runs about 45% faster in real-world conditions. Also, in the future, when I want to go even faster, I can buy a Pentium, Pentium MMX, AMD K6, or some other processor and replace my tired ol' P100. Adding a faster processor could conceivably increase performance over 100%, and it would be fairly cheap. This motherboard's system bus can run at 83Mhz, as well, so the potential performance gain with even a 166Mhz chip is pretty formidable.

The moral of the story? Well, for one thing, I'm much more of a tech freak than you probably thought. Beyond that, the important thing to know is that you don't have to spend a zillion bucks on a new PC every few years to keep up with the rest of the world. These things are modular and a lot of the parts can be reused. Finally, after this experience, I'll probably build a new PC before I buy one again. If you want something done right, do it yourself.

MAY 28: ADD DEPTH AND COLOR

I haven't dropped off the face of the planet, honest. I've been busy with Real Life and playing with my new toy, a Diamond Monster 3D card.

As you know from my last update, I've recently upgraded my computer. Fiddling with it has kept me from spending time on the web page here. After some twiddling with memory timing settings, I think I've got a very stable setup. Initially, overclocking my Pentium 100 to 112.5Mhz and running the system bus at 75Mhz (rather than the usual 60 or 66Mhz speeds) caused me some odd cold-boot problems. Programs would crash occasionally for about the first 3 minutes after I turned on the computer—not every time, but often enough to kinda scare me. I seem to have banished those problems, however, without having to clock my processor back to 100Mhz or the bus back to 66, by adjusting some BIOS settings. Everything's peachy now, and very fast.
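The clock numbers here follow from the usual relationship on Pentium-class boards: core clock equals bus clock times a multiplier. Below is a small sketch of that arithmetic. The 1.5x multiplier for the P100 is background knowledge rather than something stated above, the nominal "66Mhz" bus is written as 66.67, and the function name is made up for illustration.

```python
def core_clock_mhz(bus_mhz: float, multiplier: float) -> float:
    """Core clock is the bus clock scaled by the CPU's multiplier."""
    return bus_mhz * multiplier


MULTIPLIER = 1.5  # assumed multiplier for a Pentium 100

stock = core_clock_mhz(66.67, MULTIPLIER)        # stock P100 on a 66Mhz bus
overclocked = core_clock_mhz(75.0, MULTIPLIER)   # same chip on a 75Mhz bus

print(round(stock), overclocked)
# The bus speeds up by the same proportion as the core, which is why
# memory and cache benefit too, not just the processor.
print(f"Bus speedup: {(75.0 / 66.67 - 1) * 100:.1f}%")
```

Running the same multiplier on the faster bus gives the 112.5Mhz figure, and the bus itself ends up about 12.5% quicker.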

The last item I want to tell you about is my new Diamond Monster 3D card, the crowning achievement of my little upgrade scheme. Based on the 3Dfx Voodoo chip set, this thing really is a monster. This card is intended solely for 3D acceleration, so it works in conjunction with a standard video card and just takes over when it's asked to generate 3D graphics.

PC video cards with 3D acceleration have become all the rage of late in the tech world, but few of the current cards provide a compelling 3D experience. They're just too slow to deliver the kind of fluid motion one would like. I'm here to tell you the 3Dfx-based cards are a glaring exception to that rule. These pups render around 2 million polygons per second. A Sony Playstation, until recently the best of the home game consoles, reportedly renders only 300,000 polygons per second, by contrast. But I could quote statistics to you all day long and you wouldn't begin to understand what this thing will do. You simply have to see it to believe it.

Quake, the biggest and baddest 3D game around, runs at a fluid 30-frames-per-second pace on the Monster 3D, even with my sad old P100 processor feeding it data. That is, Quake runs in fluid motion at high resolutions (up to 640x480) in over 65,000 colors, not just 256 colors like the software-only version of the game. The effect is stunning, almost cinematic, as one glides from room to room in an utterly convincing three-dimensional environment. Any one of the individual frames of animation from this game could easily pass for a scene like I used to render in a 3D ray-tracing program on my Amiga 3000 a few years back. Rendering a single complex scene (i.e., one frame of animation) could take around three hours on that computer. Things change.

3D is the next big step for desktop computers, and it's here now. A number of games, including MechWarrior 2 and Tomb Raider, support this card directly. Pod, an oh-so-hip racing game, makes my PC look like a Daytona USA arcade machine, only better. (I love racing games.) Other games use Microsoft's Direct 3D to access the 3Dfx Voodoo hardware, and that works fine, too. Word has it even Bill Gates has been giving Direct 3D demos with a Voodoo-equipped system.

You can get a 3Dfx Voodoo-based card with a full 4Mb of memory for about $150 now. Check out the excellent Operation 3.D.F.X. web site for all the newest developments and info on where to buy one of these things. If you don't want to buy one, you still owe it to yourself to find someplace where you can take a look at one running some 3D-accelerated applications, just so you can get a sense of what's possible now.

JULY 18: COMPARE AND CONTRAST

It's time for a late-night, insomnia-inspired update. I've been meaning to update the page for a while with comments on a whole load of tech-related things, but I've been too busy doing tech-related stuff to stop and write about it. Now that I can't sleep for thinking about it, I'll try to run through the list—or at least the highlights—of the interesting toys I've been testing.

First up on the list is a report on my ongoing system upgrade. A few weeks ago, I snagged a massive 4.3 gig Quantum hard drive in yet another online auction. My original 1 gig drive was getting terrifyingly crowded, even with compression running. The new Quantum has a killer peak transfer rate (16.6 Mb/sec) and supports direct-memory-access transfers at full speed. It's also a big drive—as in physical size, like Bill Clinton's gut—and Quantum claims, as a result, the thing will produce some very high sequential transfer rates. In other words, it reads big, one-shot files very quickly. Pulling streaming video off the drive, for instance, should be a breeze. On a more down-to-earth level, Netscape seems to load more quickly.

Overall, I really like the new drive. Two things about my experience with it stand out as pleasant surprises. First, this (enhanced IDE) drive works surprisingly well with my older (also enhanced IDE) Western Digital drive. No matter how I've configured them or what nasty, beta drivers I've installed, these two drives have always talked virtually flawlessly. That's a nice surprise, because I've heard horror stories from folks installing enhanced IDE drives from different manufacturers. The second unexpected little joy has been the fact that this new Quantum drive is almost whisper silent. My original hard drive grunches and t-t-t-t-t-t-talks to me every time it's accessed. Using the Quantum, my system now feels faster just because it doesn't sound like it's putting out so much effort to get something done. Much better.

Adding a new hard drive to my system allowed me to play with the next toy I want to say a bit about, Windows NT. Win NT version 4 now has Win95's clean, decent user interface, and NT is reportedly a much more advanced operating system under the surface. Assuming Microsoft gets NT right, I'd have every reason to switch to this snazzy new operating system. However, my verdict is still out on this one. I've installed both NT and 95 on my computer, and I still boot into Win95 by default. I'd probably be more gung-ho about seriously trying to migrate to a new OS if I hadn't set up something like 10 new system and software configurations in the past month or so. I'm over-teched. For a while, at least, Win95 will remain my main OS.

Part of my tech exhaustion comes from having set up a new motherboard, two new graphics cards, and a new hard drive for my dad's PC over the July 4th weekend. That's right—I gave my dad's computer my patented Total System Upgrade. He's moved from a 90Mhz Pentium based on an ancient Intel chipset to a 100Mhz Pentium (just a bit overclocked) on an Abit motherboard like mine, a 128-bit graphics card, a Quantum 4.3 Gb hard drive, and a 3Dfx-based 3D graphics card. This upgrade turned out to be a rather big job for one weekend, all things considered, but his system is much improved.

To give you some idea how much improved it is, I can offer you a subjective comparison. His system is now just about identical to mine, with which I'm quite familiar, of course. In comparison to my home system, I offer my impressions of my brand-spanking new PC at work. This bad-boy Micron is a Pentium MMX machine with all the goodies (512K cache, Intel's latest TX chipset, and 32 megs SDRAM, for you fellow geeks). (Its arrival is another source of my tech exhaustion.) The first thing I did to it—before ever turning it on—was overclock the 166Mhz processor to 200. It worked like a charm, and I've never looked back.

The startling thing is not just how little subjective performance difference there is between my 100Mhz non-MMX Pentium machine at home and my 200Mhz MMX Pentium machine at work. The startling thing is that, for many day-to-day tasks, my 100Mhz home machine seems to have the better end of that difference. Why? My guess is that, for one thing, the graphics card I have here at home is quite a bit faster than the one on my PC at work. (My home graphics card is an STB Lightspeed 128 with special MDRAM memory. It's dangerously quick.) Also, Intel's older HX chipset is probably a bit faster than the new TX chipset in certain key ways. Finally, both the 100 and 200Mhz systems run at a bus speed of 66Mhz. There are real limits to how much performance one can squeeze out of a computer by turning up only the main processor's clock speed.

For a final insult to the shiny (well, face it, it's flat beige), new 200Mhz MMX machine, I installed a couple of 3D games on both machines. The work PC has an S3 Virge/DX-based 2D/3D combo card; my home PC has a 3Dfx-based Monster 3D card. One of those games was Pod, a racer I've enjoyed quite a bit on my home PC. The demo version of it that came with my work PC supposedly uses both MMX and the Virge 3D card to enhance speed and visual quality. The version I run at home is tuned for the 3Dfx card. At home, Pod rivals anything I've seen in the arcades. The visuals are stunning and frame rates are high and smooth. At work, Pod just stinks. It's ugly and slow enough to be virtually unplayable. I had a similar experience with Psygnosis' Wipeout demo, which accesses both 3D cards via Microsoft's Direct3D.

Ted Whatshisname, CEO of Gateway 2000, recently verbally slapped Intel's marketing types by saying something tantamount to sacrilege in the computer industry: "Speed is not a feature." In a way, obviously, he was very right. The specific kinds of speed I get from my 100Mhz home PC matter much more to me than what I get from my 200Mhz MMX PC at work.

AUGUST ??: COMPLETE AND EVALUATE

Coming soon.

A note on GeForce GTX 480 noise levels
— 11:27 AM on April 15, 2010

Now that GeForce GTX 400-series graphics cards are out in the wild, although in limited numbers, I should say something quickly about the main issues for which these cards, and especially the GTX 480, have gained a reputation: power, noise, and heat. I talked about this some on the latest podcast, but I don't think I communicated it all that well in the context of our GeForce GTX 480 and 470 review.

I feel like the GTX 480 is getting a bit of a bad rap.

Yes, the GF100 cards' performance isn't all that it should be, and that's almost certainly due to the fact that the GPUs wouldn't reach Nvidia's projected clock speeds with all of the units onboard enabled. Typically in such cases, and again almost surely in this one, the established power and thermal envelopes are a constraining factor. High clock speeds might be possible by giving the chip additional voltage, but doing so would push the GPU's power draw, heat, and cooling demands into unreasonable territory. The GF100 is a large chip, and such problems can be compounded for a number of reasons by a large die area and lots of transistors.

Dealing with these issues is a balancing act, one that every chip company has faced to one degree or another in turning out a product. Competitive issues aside, I think the balance Nvidia has struck with the GeForce GTX 480 is largely a reasonable one. You can look at the numbers we measured in our review, but the basics are pretty clear. For power draw and GPU temperatures, the GTX 480 stays within the generally established boundaries for the industry. That's not to say that the Radeon HD 5870 doesn't look a darn sight better in terms of power draw, but a test system equipped with a single GTX 480 draws 60W less than the same system with dual Radeon HD 5870s in CrossFire. We're not talking about a paragon of efficiency here, but the GTX 480 isn't out on the bleeding edge, either. No new PSU standards were created with the introduction of this product.

Similarly, Nvidia has obviously biased the fan profiles on the GTX 480 toward lower noise levels rather than lower GPU temperatures, but the GPU temperature readings we got for the card were only a few degrees higher than what we saw from the Radeon HD 5870.

More notably, the GTX 480's cooler is an impressive bit of engineering that attempts to mitigate the effects of the GPU's relatively high power consumption—and thus heat production. Have a look at the noise levels we measured while running a real game, Left 4 Dead, that generally produces higher power draw numbers than most:

Once more, the Radeon HD 5870 is quieter—but only the 1GB version with the stock cooler. The slightly overclocked Asus Matrix card with 2GB of RAM was louder than the GTX 480. I don't want to overstate it, but heck, another example might be considered a victory of sorts: the GTX 480 is quieter than the GeForce GTX 295, even though the GTX 480 draws about the same amount of power under load.

If you think that doesn't count for much, you're forgetting the bad old days of the GeForce FX 5800 Ultra, when Nvidia had some similar problems with a new GPU and attempted to make up for it by reducing image quality in multiple ways (skimping on texture filtering and dropping down to lower-fidelity texture formats, mostly) and strapping a cooler to the side of the card that we derisively dubbed the Dustbuster. If Nvidia is compromising on image quality with the GTX 400 series, we sure haven't detected it yet. The texture filtering algorithm looks to be the same as in the other recent GeForces, which is to say excellent. And a single GTX 480 is nowhere near as loud as ye olde Dustbuster. I'm hesitant to compare across such vast differences in time and equipment, but have a look at these numbers:

A single FX 5800 Ultra was nearly 9 dB louder than a GTX 480. Both objectively and subjectively, the difference between the two is huge.

To take these iffy cross-review comparisons even further, we measured the Radeon HD 4890 at 50.6 dB under load not long ago, slightly higher than the 49.9 dB at which we measured the GTX 480. Our SLI noise level results prove the GTX 480's cooler is capable of spinning up to higher speeds, but a single card just didn't go there during the course of our testing. We tested on an open test bench, and your results may vary in either direction in an enclosure, depending on cooling and venting. Still, that bit about the GTX 480 fitting well within the established boundaries of the market applies.
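Decibels are logarithmic, so it can help to turn these gaps into ratios. Here's a quick sketch of the standard conversion from a level difference to a sound-power ratio, using the figures quoted above. The function name is mine, and the result says nothing about perceived loudness, which is a separate question.

```python
def power_ratio(delta_db: float) -> float:
    """Sound-power ratio corresponding to a level difference in dB."""
    return 10 ** (delta_db / 10)


# FX 5800 Ultra vs. GTX 480: "nearly 9 dB" louder
print(round(power_ratio(9.0), 1))

# Radeon HD 4890 (50.6 dB) vs. GTX 480 (49.9 dB)
print(round(power_ratio(50.6 - 49.9), 2))
```

That works out to roughly eight times the sound power for the Dustbuster, versus about 17% more for the Radeon HD 4890.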

Remember, when you hear an incensed fanboy spouting off about how awful the GTX 480's noise and heat levels are, that his point only applies in a very limited sense, relative to some slightly better competition. I wouldn't hesitate to put a GeForce GTX 480 into a gaming rig of my own on that basis. Yes, I would prefer the Radeon HD 5870's overall combination of attributes, especially in terms of price and performance. But let's be clear: in an undeniably tough situation, Nvidia has avoided the temptation to reach the highest possible performance levels at the cost of reasonable power draw and acoustics. Folks seem to be missing that fact, which has caused Nvidia to send out its viral drones to spread the message. Such silliness shouldn't be necessary. This is one lesson we're happy they've learned, and I'd hate to fail to acknowledge it.

Core issues
— 12:07 AM on April 8, 2010

Keeping your cores straight isn't easy these days.  Here's a quick cheat sheet.

CUDA core - Not a core
Shader multiprocessor - Might be a core
Stream processor - Not a core
SIMD engine - Kind of a core
Turbo Core - Neither a core nor a turbo
Turbo Boost - Not a turbo, but a boost
Core Duo - Two cores
Core 2 - Also two cores
Core 2 Quad - Four cores
Core i3 - Two cores
Core i5 - Two or four cores
Core i7 - Four or six cores
Core microarchitecture - Not a core, but includes multiple cores and microarchitectures

Taking a crack at the great CPU-GPU balance question
— 10:16 AM on December 8, 2009

We get some interesting questions and article suggestions via email from time to time, and we just don't have time to address them all with a proper article, unfortunately. I received one such message recently that asks a burning question we've seen posed in many ways over the years. Let me just reprint this reader's question for you:

I'm just throwing a suggestion for an article if you ever get bored/run out of ideas. I always hear about older CPU's bottlenecking newer video cards, so it would be cool to actually see this tested. Especially since I always here people say "at Athlon X2 will heaviy bottleneck a HD 48xx series GPU..." and lines along those statements. However I don't think I've ever seen anyone substantiate those claims with actual data. I can't distinguish BS answers from factual answers, since everyone seems to have their own opinions and views. I think it would be great to see an article investigating this. Of course I understand you probably are quite busy a lot of the time, but I figured its worth a shot in providing a suggestion or giving you an idea.

This topic never goes away, but it is a rather difficult question to answer definitively, because it's endlessly complex (or something close to it). Here's my attempt at a quick answer, which I figured some folks might find interesting.

----

Yeah, that is an interesting question. Complicated, too. Much depends on the workload you're using, both for the CPU and the GPU. We haven't focused an article on just this question, but we have looked at performance scaling in various ways.

Here's an example with one GPU and multiple CPUs at different display resolutions:

And here's another with multiple GPUs on one fast CPU at different resolutions:

The reality is that you need to have the right balance of CPU and GPU power for the display settings and resolution chosen in a particular game. But, as the first graph there shows, all of the CPUs we tested will average nearly 60 FPS in Far Cry 2, so the GPU is the primary bottleneck.

You have to drop down to the very slowest PC processors in order for the CPU to become any kind of bottleneck in a recent game, especially if it's a console port, since console CPUs are dreadfully slow. Even the Pentium E6300 can sustain 30+ FPS in Far Cry 2:

Of course, this whole equation will change with a different game or different visual quality settings (or switching from DX10 to DX9) in this same game. But generally speaking, these days, even a $90 Athlon dual-core is likely to run most games well, with the possible exception of more complex PC-native RTS games and the Great Exceptions, Crysis and Crysis Warhead. Note the frame rates consistently in excess of 120 FPS for Left 4 Dead 2 and Wolfenstein in our Lynnfield review, for instance. I do advise gamers to avoid quad-core CPUs with really low clock speeds. A higher-frequency dual-core is a better bet when the going gets rough.
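One rough way to picture the balance: each frame takes as long as the slower of the CPU's work and the GPU's work. The sketch below is a toy model with made-up per-frame costs, not measurements from any of our reviews. It assumes CPU time stays constant across resolutions while GPU time scales with pixel count, which real games only approximate.

```python
def fps(cpu_ms: float, gpu_ms: float) -> float:
    """Frames per second when the slower stage sets the pace."""
    return 1000.0 / max(cpu_ms, gpu_ms)


def bottleneck(cpu_ms: float, gpu_ms: float) -> str:
    """Name the stage that limits the frame rate."""
    return "CPU" if cpu_ms > gpu_ms else "GPU"


# Hypothetical per-frame costs, chosen only to illustrate the idea.
CPU_MS = {"slow dual-core": 20.0, "fast quad-core": 8.0}
GPU_MS_PER_MEGAPIXEL = 10.0
RESOLUTIONS = {"1280x1024": 1280 * 1024, "1920x1200": 1920 * 1200}

for cpu_name, cpu_ms in CPU_MS.items():
    for res_name, pixels in RESOLUTIONS.items():
        gpu_ms = GPU_MS_PER_MEGAPIXEL * pixels / 1e6
        print(f"{cpu_name:15s} {res_name:10s} "
              f"{fps(cpu_ms, gpu_ms):5.1f} FPS, "
              f"{bottleneck(cpu_ms, gpu_ms)}-bound")
```

In this toy setup, the slow CPU only holds things back at the lower resolution. At the higher one, both CPUs land on the same frame rate because the GPU sets the pace.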

I'm not sure we can dedicate an article to this issue soon—this is a very tough question to answer definitively—but we do try to provide the information you need in our reviews to make a smart buying decision. I hope this helps a little!

Remembering the Gotbyte incident
— 5:24 PM on November 11, 2009

During this celebration of our 10th anniversary, it seems only fitting to look back on a fun episode from our long-ago past. Back when the Pentium 4 was the new hotness, only not quite the hotness it became with Prescott at 90 nm, we featured a review of this processor against its natural competitor, AMD's original Athlon.

Around that time, an online retailer named Gotbyte was building a new website for itself, and its web developer apparently liked the idea of populating that site with some respectable-looking content. Like our review. Only that developer didn't like the idea of using his own bandwidth to serve the images in the review, so he simply embedded ours. Hey, free hosting. Only we noticed some anomalies when poking through our web logs and soon discovered they'd stolen our article. Our solution? We moved the original images for our article to another directory, changed the HTML in our review to include the new image location, and modified the original images—still embedded in the retailer's website—to send a rather different message. The article then appeared on Gotbyte's website more or less in the form you see below.

And yes, Gotbyte eventually apologized, took down the stolen review, and fired its web developer. But the fun we got out of it was almost worth the hassle. Be sure to keep scrolling to see that last image.

THERE'S NO LOVE LOST between the Intel and AMD camps these days. Both sides know they're in for the fight of their lives, and both are bringing spectacular advances to the desktop PC market with regularity. The latest salvos in the desktop wars are a whole new microarchitecture from Intel and a revamped Athlon platform from AMD. The P4 and Athlon DDR in a head-to-head, take-no-prisoners benchmark brawl.

We've rounded up a 1.5GHz P4 system from Intel and tossed it into the ring with a 1.2GHz DDR system from AMD. To test our contenders' mettle, we've run them through a grueling gauntlet of benchmarks, from the highly synthetic to in-the-mud, real-world application tests.

Before the opening bell sounds, let's review our contenders' qualifications.

The Pentium 4's advantages
The Pentium 4 has been endowed by Intel with a number of natural advantages, not least of which is its incredible ability to ramp clock frequencies and, hand-in-hand with it, a hair-raising 1.5GHz top clock speed. Impressive as the GHz numbers are, though, the real story with the P4 is its ability to move data around inside the system. From the front-side bus to its RDRAM memory interface to its north-south bridge link, the P4 has a considerable advantage over the Athlon platform, at least on paper.

Let me slow down and run some of the numbers by you. The P4 has a 100MHz, "quad-pumped" front-side bus between itself and the rest of the system. To confuse you, we will, as always, refer to this bus interchangeably as 100MHz and 400MHz, whatever suits our purposes. This 400MHz monster can pump through up to 3.2GB of data per second. Coupled nicely with that bus are the P4's dual channels of PC800 Rambus DRAM, which can also push through 3.2GB of data per second at peak. Further down, in the less-exotic bowels of the system, the Intel 850 chipset has a 266MHz "hub"-style link between its north and south bridge chips. (Though Intel doesn't use directional terminology, the chips' purposes are basically the same as in most other contemporary PCs.)

In every one of these cases, the P4 has a system bandwidth advantage over the Athlon. If nothing else, the Pentium 4 platform has plenty of room to grow. And it ought to deliver a serious whuppin' at memory-intensive tasks.

The Athlon's advantages
Meanwhile, the Athlon's great advantage over the Pentium 4 is, well, the Athlon chip itself. AMD has created a wondrous thing in this processor, a marvel of x86-compatible design. Athlons have already easily outpaced the PIII in the megahertz race, and they're at least as fast, clock for clock, as any PIII chip. The Pentium 4 may run at higher clock speeds, but it does so by virtue of a very long instruction pipeline. The length of that pipeline hampers the P4's clock-for-clock performance, so that a 1.5GHz Pentium 4 isn't necessarily any faster than, say, a 1GHz Pentium III.

But then many things aren't as they seem once the theoretical performance numbers start flying around. For instance, the Pentium 4 talks to its L2 cache over a 256-bit-wide connection, while the Athlon's L2 cache interface is only 64 bits wide. However, the Athlon Thunderbird's dual-ported, on-chip cache is probably just as good as the P4's.

But I digress. The advantages for the Athlon here include excellent clock-for-clock performance, especially in floating-point math, where the P4 is relatively weak and the Athlon is quite strong.

To bolster the Athlon's already strong performance, AMD has introduced a pair of platform enhancements. There's a new front-side bus speed of 266MHz, up from 200MHz. And there's the 760 chipset's ability to address double data rate (DDR) SDRAM. DDR was created to combat the high prices (and latencies) of RDRAM, and the DDR vs. Rambus struggle is a running subtext of the AMD-Intel conflict. The 133MHz variety of DDR memory, dubbed PC2100, can push through 2.1GB/sec at peak. That's not as much as the P4's dual RDRAM channels, but it's twice the speed of conventional PC133 SDRAM.
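The peak figures quoted for these buses are simple products of clock, transfers per clock, and bus width. Here's a back-of-the-envelope sketch. The bus widths (64 bits for the front-side bus and SDRAM, 16 bits per RDRAM channel) are background assumptions that the text doesn't spell out, and the helper name is made up.

```python
def peak_mb_per_sec(clock_mhz, transfers_per_clock, width_bytes, channels=1):
    """Theoretical peak bandwidth in MB/sec (never reached in practice)."""
    return clock_mhz * transfers_per_clock * width_bytes * channels


# Widths below are assumptions: 8 bytes for FSB/SDRAM, 2 bytes per RDRAM channel.
p4_fsb = peak_mb_per_sec(100, 4, 8)                  # 100MHz, quad-pumped
pc800_dual = peak_mb_per_sec(400, 2, 2, channels=2)  # PC800 RDRAM, two channels
pc2100 = peak_mb_per_sec(133, 2, 8)                  # 133MHz DDR SDRAM
pc133 = peak_mb_per_sec(133, 1, 8)                   # 133MHz SDRAM

print(p4_fsb, pc800_dual, pc2100, pc133)
print(pc2100 / pc133)
```

Under those assumptions, both P4 links come out at 3200MB/sec, PC2100 lands at about 2.1GB/sec, and the DDR figure is exactly double PC133's.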

In short, even though the Athlon is running at a clock rate 300MHz lower than the P4's, we're expecting big things out of this 1.2GHz DDR test rig.

Our testing methods
Let's cover what we tested and how, so the rules are clear up front. We've included not only our two contenders for the crown, but a couple of older systems for reference: a 1.1GHz Athlon with PC133 SDRAM, and an 800MHz Pentium III. Just because we can.

We chose to test in Windows 2000 rather than in Win9x/ME for a simple reason: Win2K is much, much better than Win9x/ME, and anyone putting down a big enough chunk o' change to buy a P4 system ought to know it. Once the next rev of Win2K, named Windows XP, makes it out the door, Win9x/ME will finally be put out to pasture. Yes, even for gamers. Win2K is making big strides in this area, and we expect Windows XP to dominate the desktop market in six months to a year. Nobody buying a P4 today ought to use it for any length of time with WinME or the like. Our decision to test with Windows 2000 may make the P4 look relatively stronger than it would in Win9x/ME, based on the scores we've seen around the web. But we think that's fair, under the circumstances.

As ever, we did our best to deliver clean benchmark numbers. All tests were run at least twice, and the results were averaged.

Our Pentium 4 test system contained these components:

Processor: Intel Pentium 4 processor at 1.4 and 1.5GHz

Motherboard: Intel D850GB - Intel 850 chipset - 82850 memory controller hub (MCH), 82801BA I/O controller hub (ICH2)

Memory: 256MB PC800 DRDRAM memory in two 128MB RIMMs

Video: NVIDIA GeForce 2 Ultra 64MB (Detonator 3 version 6.31 drivers)

Audio: Creative SoundBlaster Live!

Storage: IBM 75GXP 30.5GB 7200RPM ATA/100 hard drive

...while our comparison systems varied only with respect to the motherboard, memory, and CPU. The Athlon DDR box looked like this:
Processor: AMD Athlon 1.2GHz CPU on a 266MHz (DDR) bus

Motherboard: Gigabyte GA7-DX motherboard - AMD 761 North Bridge, Via VT82C686B South Bridge

Memory: 256MB PC2100 DDR SDRAM in two 128MB DIMMs

For the Athlon/KT133 system, we used:
Processor: AMD Athlon 1.1GHz CPU on a 200MHz (DDR) bus

Motherboard: Abit KT7-RAID motherboard - Via Apollo KT133 chipset - VT8363 North Bridge, VT82C686A South Bridge

Memory: 256MB PC133 SDRAM in two 128MB DIMMs

Similarly, we included a Pentium III test system (though only at 800MHz, we thought it would be a useful reference point) using these components:
Processor: Intel Pentium III 800EB (Coppermine) CPU at 800MHz on a 133MHz bus

Motherboard: Asus P3V4X motherboard - Via Apollo Pro 133 chipset - VT82C694X North Bridge, VT82C596B South Bridge

Memory: 256MB PC133 SDRAM in two 128MB DIMMs

We used the following versions of our test applications:

  • SiSoft Sandra Standard 2000.3.6.4
  • Compiled binary of C Linpack port
  • ZD Content Creation Winstone 2000
  • LAME 3.70
  • SPECviewperf 6.1.2
  • ps5bench 1.1 Intermediate
  • Adobe Photoshop 5.5
  • POV-Ray for Windows version 3.1g
  • 3DMark 2000 Pro build 335
  • Quake III Arena 1.17
  • Quake III Team Arena Internet demo
  • MDK2 Internet demo
  • Expendable Internet demo

In the Quake III Arena timedemo tests, we used the game defaults for "Normal" and "High Quality" rendering, with a few exceptions. For the "High Quality" tests, texture detail was set to maximum and the "high" geometry settings were enabled, as well.

The test systems' Windows desktop was set at 1024x768 in 32-bit color at a 75Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.

The performance tests
Memory performance
Let's get this DDR versus Rambus thing out in the open right away. We'll start with Linpack, which measures memory bandwidth using matrices of large, floating-point numbers. Linpack tests a range of data matrices of different sizes, so it stresses everything from the L1 cache out to main system RAM. Look at the graphs to see what I mean:

Interesting, no? So you may be wondering, "Who won?" That depends. In terms of peak performance, the Athlon systems won, with much higher numbers on the tests involving smaller data sets. There are probably a couple of reasons for the Athlon's higher peaks here. First, the Athlon's L1 data cache is 64KB, and it's obviously very fast; by comparison, the P4 has a rather small 8KB L1 data cache. Once the Athlon moves out of its L1 cache, past 64KB matrix sizes, performance drops. Second, the Athlon's floating point math unit is excellent, which should help it process cached data very quickly.

But the shape of the graph tells a story, and those peaks are only part of it. Notice how all the scores start to drop sharply between the 256K and 320K matrix sizes. That's the Linpack test moving beyond the L1 and L2 caches and into main memory. Thanks to a smart cache design, the Athlon is able to hold on a little longer, to about 320K, before suffering a drop-off.
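One way to read those knees in the curve is to compare each data size against cache capacity. The sketch below is only a rough model under assumptions of mine: it takes the Athlon's L2 to be 256KB and exclusive of its 64KB L1 data cache (so the two add up to about 320KB of unique data), and the P4's L2 to be 256KB and inclusive of its 8KB L1. The L1 sizes come from the discussion above; the L2 sizes and the exclusive/inclusive labels are assumptions, not something we measured.

```python
def effective_cache_kb(l1_kb, l2_kb, exclusive):
    """Unique data the L1 + L2 pair can hold, in KB."""
    # An exclusive L2 holds only lines that are not in L1, so capacities add.
    # An inclusive L2 duplicates L1's contents, so L2 alone sets the limit.
    return l1_kb + l2_kb if exclusive else l2_kb

def where_it_fits(working_set_kb, l1_kb, l2_kb, exclusive):
    """Name the smallest level that can hold the whole working set."""
    if working_set_kb <= l1_kb:
        return "L1"
    if working_set_kb <= effective_cache_kb(l1_kb, l2_kb, exclusive):
        return "L2"
    return "main memory"

# Assumed cache configurations (see the note above).
ATHLON = dict(l1_kb=64, l2_kb=256, exclusive=True)
P4 = dict(l1_kb=8, l2_kb=256, exclusive=False)

for size_kb in (8, 64, 256, 320, 512):
    print(size_kb, where_it_fits(size_kb, **ATHLON), where_it_fits(size_kb, **P4))
```

Under those assumptions, a 320K data set still fits in the Athlon's caches but spills to main memory on the P4, which lines up with the Athlon hanging on a little longer in the graph.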

So the Athlon looks pretty good until we hit main system memory. Then it turns ugly.

The P4's RDRAM channels allow it to keep a nice, steady pace as the data size rises toward 2MB. The SDRAM-based systems all suffer by comparison; even the DDR box drops to under half the speed of the P4 system. If the race goes to the steady, the P4 wins here easily.

Sandra's modified STREAM benchmark offers another perspective on the data, but it's basically the same dynamic. The Pentium 4's impressive theoretical advantages in memory throughput are even more remarkable in practice. DDR SDRAM does help the Athlon along a bit, but comparatively, DDR looks kind of weak here.
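For reference, the theoretical peaks behind that comparison can be worked out from bus width and transfer rate. This is a back-of-the-envelope sketch: the bus widths and transfer rates are nominal figures I am assuming for each memory type in our test systems, the function name is my own, and real-world throughput will always come in lower.

```python
def peak_bandwidth_mb_s(bus_bytes, mega_transfers_per_s, channels=1):
    """Theoretical peak bandwidth in MB/s: width x transfer rate x channels."""
    return bus_bytes * mega_transfers_per_s * channels

# Nominal figures, assumed for illustration.
configs = {
    # Each RDRAM channel is 16 bits (2 bytes) wide at 800 MT/s; the 850 uses two.
    "Dual-channel PC800 RDRAM": peak_bandwidth_mb_s(2, 800, channels=2),
    # SDRAM DIMMs are 64 bits (8 bytes) wide.
    "PC2100 DDR SDRAM": peak_bandwidth_mb_s(8, 266),
    "PC133 SDRAM": peak_bandwidth_mb_s(8, 133),
}

for name, mb_s in configs.items():
    print(f"{name}: {mb_s} MB/s")
```

On paper, then, the P4's memory subsystem has roughly a 50% edge over DDR and about three times the peak of PC133.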

That's not the whole story, though. Memory bandwidth is important, but it's influenced by a number of things, not least of which is a processor's ability to pull in data. There are solid reasons to believe the Athlon just isn't as geared toward maximizing memory bandwidth as is the P4. Beyond that, memory performance is also measured in terms of latency, and SDRAM systems have often shown real-world benefits over RDRAM thanks to lower access latencies.

So let's reserve judgment while we dive into some less synthetic tests, then see

Content Creation Winstone

Here's our first indication of how close a matchup this really is. Our 1.1GHz PC133 Athlon rig loses out to the P4, but the DDR box has enough extra oomph to pull out a win.

POV-Ray 3D rendering
This test serves as a correction of sorts to my initial Pentium 4 review. In that review, I tested the P4 in the POV-Ray 3D rendering program, and it lost badly to a 1.1GHz Athlon, especially in rendering "ntreal.pov". Further testing has confirmed that I messed up badly in reporting the results for "ntreal.pov," however, making the Athlon look much faster than it was.

The corrected results below paint a more accurate picture.

The Athlon's monster FPU rips through this one quite a bit faster, but the P4 isn't that far off the game. Still, that's two real-world apps down, and two victories for the Athlon DDR system.

LAME MP3 encoding
We've used a version of the LAME MP3 encoder that doesn't have any special optimizations for AMD's 3DNow or Intel's SSE/SSE2, so like POV-Ray, this one is all about x87 floating-point performance.

You were expecting something else? Another win for the Athlon system, with the P4 again performing fairly respectably in an FPU-intensive test.

In fact, the P4 downright whupped the PIII on this one. Makes me wonder if all the fuss about the P4's relatively weak FPU isn't a bit overblown.

SPECviewperf workstation graphics
SPEC's viewperf suite of graphics tests measures performance using a range of high-end, workstation-class 3D applications and tasks. Theoretically, these tests ought to stress a number of things: FPU processing power, memory performance, AGP implementation, etc.

 

 

The results are mixed, with the suite of tests exposing nicely the truth about these systems' performance: either system can be faster, depending on the task. I won't dare speculate about why a given system is faster on a given test (there are too many variables involved), but in light of the P4's defeats on most of our previous tests, it's safe to conclude the P4 is relatively strong in 3D graphics-related tasks.

Quake III Arena gaming performance
Now into the gaming action...

 

 

If there's one thing to know about Pentium 4 performance, it's that these puppies love Quake. The P4 finally wins a real-world test outright and impressively, showing the Athlon a thing or two.

Now in the past, I've attributed the P4's dominance in Quake to its bandwidth advantages in memory and on the front-side bus. Some have speculated that the P4 benefits from the way Quake III is written, essentially suggesting the game's instruction mix is close to optimal for the P4. While that may be very much true, note that the Athlon benefits greatly from a faster bus and memory, as well.

Of course, once we get to the higher resolutions, the pack bunches up as the video card's fill rate becomes the primary bottleneck.

 

Team Arena is much the same story as the original Quake III, but it's CPU-bound even at 1024x768. Again, the P4 rolls.

MDK2 gaming performance
Now for another OpenGL game. Will the P4 again dominate?

Not exactly. The Athlon DDR box cranks here, turning in as impressive a win in MDK2 as the Pentium 4 did in Quake III.

3DMark 2000
On to 3DMark, where Intel products have always performed relatively well. Suspiciously so, in fact. Has MadOnion, the company behind 3DMark, rigged this one in Intel's favor?

If they did, it didn't matter. The Athlon DDR system rips through 3DMark, stunning the P4. This is not the result we'd expected based on what we've seen in the past. Notice that the Athlon/PC133 system trails the P4 by over a thousand points. Obviously, the DDR rig's extra system bandwidth makes the difference.

3DMark also does a CPU rating, and oddly, the Athlon loses this one...

But once you get to the individual game tests, things turn around.
Now for a real Direct3D game...

Expendable Direct3D performance
We've included both average and minimum frame-rate numbers here, because Expendable is smart enough to record both things. Averages do matter, but the minimum frame rates are the real performance killers in any game.
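To see why the two numbers can tell different stories, here is a small Python sketch. The frame-rate samples are invented for illustration and have nothing to do with Expendable's actual output; the function name is mine, too.

```python
def summarize_fps(samples):
    """Return (average, minimum) for a list of frame-rate samples."""
    if not samples:
        raise ValueError("no frame-rate samples")
    return sum(samples) / len(samples), min(samples)

# Hypothetical samples: mostly smooth, with one ugly dip.
smooth_with_dip = [90, 95, 88, 20, 92]
average, minimum = summarize_fps(smooth_with_dip)
print(f"average {average:.1f} fps, minimum {minimum} fps")
```

The average still looks healthy in a case like this, but the dip is what you would actually feel while playing.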

Additional considerations
So now we've seen how they stack up performance-wise. Before drawing our conclusions, let's take a second to weigh some other considerations. There are some hardware-related platform differences between these two setups worth noting. In particular, the AMD Socket A chips are fragile, subject to damaged or cracked cores during the course of what ought to be routine handling. Poorly made heat sinks can destroy an Athlon chip during mounting, and it happens. We've seen it. AMD needs to do something to improve this situation, because simple upgrades are now too often fraught with peril.

Also, Athlon chips run very hot, and without a heat sink attached, they can burn up inside of a few seconds. That's no big deal for the most part, since attaching a heat sink to one's CPU ought to be a matter of course. But for hardware tinkerers like us, it's an additional worry.

The Pentium 4 doesn't have these problems, but the improvements come at a price: while the Athlon will work in a standard ATX case, the P4 requires a new, ATX12V power connector (part of a revised ATX spec backed by Intel). If you're upgrading your PC in the same case, that means you're going to need a new power supply. Not only that, but the P4 avoids having its core crushed by requiring a new, four-screw heat sink mounting bracket around the processor. Many existing cases will require modification to host a P4 properly. However, I have heard that some manufacturers are now supplying add-in heat sink support trays with their P4 motherboards, which could eliminate the need for a case mod.

 


The P4 in its socket, which is flanked by a pair of black plastic heat sink supports
Note also the ATX12V power connector next to the socket
 

Overclockers should keep in mind that the Athlon offers them more options than the P4. Intel processors are set at a locked multiplier, so the only means of overclocking them is via bus speed increases. Socket A chips, on the other hand, can be modified with a pencil to allow multiplier adjustments, and many motherboards allow menu-driven overclocking in the system BIOS. Like the P4, Athlons allow bus speed overclocking, but any time bus speeds run too far out of spec in any system, unpredictable (read: bad) things start happening.
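The arithmetic behind those two overclocking routes is simple: core clock is the base bus clock times the multiplier. Here's a quick sketch, assuming a 1.2GHz Athlon runs a 9x multiplier on a 133MHz base clock (the 266MHz DDR bus); the overclocked settings are hypothetical examples, not settings we tested.

```python
def core_clock_mhz(bus_mhz, multiplier):
    """Core clock is the base bus clock times the CPU multiplier."""
    return bus_mhz * multiplier

# Assumed stock settings: 133MHz base clock at 9x, sold as a "1.2GHz" part.
stock = core_clock_mhz(133, 9)

# Two hypothetical routes to a similar core speed:
via_multiplier = core_clock_mhz(133, 10)  # bus stays in spec
via_bus = core_clock_mhz(148, 9)          # bus runs about 11% out of spec

print(stock, via_multiplier, via_bus)
```

Both routes land near the same core speed, but only the bus route drags the rest of the system out of spec along with the CPU.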

Finally, there's the future to consider. The current P4 will be replaced in a few months by a newer revision CPU that sits in a 478-pin socket, so drop-in CPU upgrades are out of the question. AMD is preparing a new revision of the Athlon, code-named Palomino, that may or may not work properly in current Socket A motherboards. In both cases, a motherboard upgrade may be necessary to upgrade much past the CPU speeds we've seen here.

Summing it all up
The combo of a 1.2GHz Athlon, 266MHz bus, and DDR SDRAM makes up the heart of the fastest PC you can buy at present. The Pentium 4 isn't far behind, but in a broad majority of our tests, the Athlon DDR system is fastest. If money is no object and you want the fastest PC on the block, buy or build yourself an Athlon DDR rig.

Factor price into the equation, and it's an even easier call to make. At 1.5GHz, Pentium 4 processors run about $600 from bargain-priced mail-order vendors. A 128MB RIMM of PC800 RDRAM currently costs about $160. By contrast, a 1.2GHz Athlon will set you back around $300, and DDR SDRAM costs about a dollar per megabyte, or about $130 for 128MB. For TR readers looking to build their own systems, the Athlon is simply a better deal. No question about it.
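To put rough numbers on it, here is the CPU-plus-memory math for 256MB of RAM in two 128MB modules, as in our test systems. This sketch uses only the approximate street prices quoted above and leaves out motherboards, which I haven't priced here; the function name is just for illustration.

```python
def cpu_plus_memory_cost(cpu_price, price_per_128mb, modules=2):
    """Cost in dollars of a CPU plus the given number of 128MB modules."""
    return cpu_price + price_per_128mb * modules

p4_cost = cpu_plus_memory_cost(600, 160)      # 1.5GHz P4 + 2 x PC800 RIMMs
athlon_cost = cpu_plus_memory_cost(300, 130)  # 1.2GHz Athlon + 2 x DDR DIMMs

print(f"P4: ${p4_cost}, Athlon: ${athlon_cost}, gap: ${p4_cost - athlon_cost}")
```

By that math the gap is $360 before you even pick a motherboard.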

Those of you looking to buy OEM systems from Dell, Micron, et al., will have a harder decision to make. In the OEM market, the math is different. It's not necessarily the case that an Athlon-based system will always be a better deal than a P4 system, just because the component costs are lower on the open market. You'll have to shop carefully, scrutinize specs, and do your homework to figure out which OEM system is a better deal.

Next, we should note that DDR memory and motherboards like the one we've tested here are only just becoming available now, in early March, despite the fact we first reviewed an AMD 760-based system upon its unveiling last October. What's more, rumors are flying about motherboard manufacturers deep-sixing their 760 mobos (or seriously curtailing production) in anticipation of Via's DDR Athlon chipset. We expect the AMD 760 chipset to have a rather limited lifetime because of this. AMD has used its chipsets to seed the market and bolster the Athlon platform, but they have been willing to cede the market to Via's chipsets once the Via products have arrived. That doesn't mean 760-based motherboards are a bad choice, however. AMD has provided mostly adequate support for their chipset products in the past, and we expect them to do so going forward.

Delays have harmed the 760, but so have competing PC133 SDRAM products. The PC133 system we tested here wasn't far off the DDR rig in most tests, but motherboards based on Via's new KT133A chipset have shown even better performance. (The KT133A chipset supports PC133 SDRAM, but also supports a 266MHz front-side bus.) In fact, the scores we've seen around the web are hard to ignore. It's clear the Athlon gains quite a bit more from the faster front-side bus than it does from the addition of DDR SDRAM. Folks looking for optimum performance on a budget will want to look into KT133A-based Athlon solutions. We'll be reviewing a KT133A mobo from Abit soon.

Finally, I should say a few words for our loser. Though the Pentium 4 didn't come out on top in these tests, Intel can certainly keep its head held high. The soaring memory bandwidth and Quake III performances we've seen from the P4 bode well for the future. With the 850 chipset and dual RDRAM channels, Intel has built a world-class infrastructure for the Pentium 4. Once Intel moves to a .13-micron fab process later this year, P4 clock speeds ought to climb rapidly, and when they do, the P4 is gonna be a screamer.

Here I will pause to note something about RDRAM: $160 for 128MB really isn't bad. I hate to say it, but in light of the amazing memory performance we've seen from the P4 in these tests, the price premium over DDR SDRAM is worth every penny.

Ugh. I feel dirty now.

Pentium 4-optimized code in newer applications is likely to help the P4 down the road, too. But then most apps could run faster on the Athlon with the right compiler tricks. We'll explore the depths of specially-optimized applications in our next article, so stay tuned. 

 


Contemplating SpeedStep and Cool'n'Quiet in performance testing
— 11:07 AM on August 28, 2009

Howdy all.  I've been hard at work in Damage Labs setting up new test systems with Windows 7.  This is a particularly agonizing chore for me, because I want to be sure to set up everything properly and perfectly the same (as much as possible) between the different systems.  Also, generally what happens is I go into this process looking to incorporate as many new benchmarks as possible, but then for various reasons (time constraints, poor application performance scaling, lack of counters for timing operations, software licensing/DRM restrictions), I end up using many of the same tests as in the past generation of results.  That kind of looks to be repeating itself in this new round of CPU tests, although I do expect to add a few new games, 7-Zip, and Windows Live Movie Maker, at least—along with new versions of a great many applications.

The question of the day, however, has to do with power management features.  Typically, we've left features like SpeedStep and Cool'n'Quiet disabled for our general performance tests, only enabling them when we do power efficiency testing.  We've disabled them for multiple reasons, mainly because they can affect performance results in some tests.  The picCOLOR application benchmark we've used for a while, for instance, does many short, quick operations spaced a little bit apart.  As a result, the CPUs don't have time to ramp up their clock speeds for each operation, and the results come out lower than with power management disabled.

I also have a sense that, generally speaking, when cases like this occur, AMD processors are more likely to be negatively affected than Intel processors, in part because AMD chips tend to drop to lower clock speeds at idle and in part because of a history of real problems with AMD CPUs, CnQ, and performance.

The question is: Isn't that fair game, though?  "Balanced" is the default power profile in Vista and Win7, and the vast majority of folks are going to want to have these power-saving features enabled on their systems in order to cut down on the noise, heat, and power consumption of their systems.  In a case like the picCOLOR benchmark, I think we have a simple solution: use a workload more like a real user would, with higher-resolution images and longer operations.  We can work with the software's developer on that.  In other cases, well, most of our benchmarks already reflect real-world use pretty well and aren't so affected by SpeedStep and Cool'n'Quiet.  If they are, the odds are pretty good that a real user might experience the same drop-off in performance, and even if it's not perceptible, there's no reason not to include it in our performance measurements.

I'm of two minds on this question.  The one sticking point that keeps me from making the switch and leaving power management features enabled is the possibility that our test results will be rendered unreliable in some cases, either due to big differences in outcome from one run to the next or to the occasional outright incompatibility between these features and a piece of software (which we've seen in some games in days past).  And I'm on a deadline here, so that prospect frightens me more than you might expect.  Still, it's 2009, I've been using CnQ and SpeedStep on my own desktop and laptop systems for years, and I consider them integral features of a modern CPU.  Seems like it might be time to test 'em like we use 'em.
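One way to keep an eye on that run-to-run worry would be a simple spread check like the sketch below. To be clear, this is a hypothetical illustration rather than a description of our actual test procedure: the function names, the 3% tolerance, and the sample timings are all made up.

```python
from statistics import mean

def relative_spread(run_times):
    """(slowest - fastest) / mean, as a fraction of the mean."""
    if len(run_times) < 2:
        raise ValueError("need at least two runs to compare")
    return (max(run_times) - min(run_times)) / mean(run_times)

def is_repeatable(run_times, tolerance=0.03):
    """True if the runs agree to within the tolerance (an arbitrary 3% here)."""
    return relative_spread(run_times) <= tolerance

# Hypothetical timings in milliseconds, not measured results.
steady_runs = [1000, 1010, 990]   # spread of 2% of the mean
jumpy_runs = [1000, 1200, 1000]   # one run far slower than the others

print(is_repeatable(steady_runs), is_repeatable(jumpy_runs))
```

A test that fails a check like this with power management enabled, but passes with it disabled, would be a good candidate for a closer look.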

Hmm.  What do you all think?


Gearing up for Lynnfield and Win7
— 11:34 AM on August 24, 2009

The advent of new mainstream CPUs from Intel and a major (and promising) new release of Windows has us reworking all of our CPU test rigs in Damage Labs, in preparation for a busy period of testing.  After using the same basic hardware and software on our CPU test systems for quite a while, this seems like an appropriate time to revamp them.  To that end, boxes have been arriving via UPS and FedEx for the past few days, resulting in this excellent pile of new gear in the corner:

Yep, that ought to do it.

On the left there are new 610W PC Power & Cooling Silencer PSUs that the folks at OCZ were kind enough to send out.  We've been using older GameXStream 700W power supplies for at least a couple of years now, and I figured it was time for an update.  The Silencers are some of our favorite PSUs, and these are noticeably quieter than GameXStreams, which weren't bad for their day.  We've backed down on the wattage rating a little in hopes of getting more efficient PSU performance when the test rigs are at idle, while keeping the right connector payload for a powerful graphics card.

Speaking of which, those are Asus GeForce GTX 260 TOP cards right next door to the PSUs.  Switching to these GeForces should reduce power consumption by roughly 30W at idle versus our previous Radeon HD 4870s.  I also kind of like the idea of going with a third-party GPU vendor instead of going AMD-on-AMD for CPU test rigs, just on principle.  Thanks to Asus for sending its excellent TOP rendition of the GTX 260.  This is a higher-clocked card that shouldn't be the cause of many GPU bottlenecks, to say the least.

Western Digital hard drives will be the storage engines powering our new test rigs.  Those are Caviar RE3 1TB drives stacked up there.  We briefly considered SSDs, but given the big SSD performance delta when going from a new to used state, that didn't seem like a savvy choice for CPU test systems.  Too many issues.  Not to mention the capacity constraints.  These RE3s are very nice drives that should suit our needs perfectly.  Props to WD for helping out here.

On the far right of the picture is a pair of Corsair Dominator DDR3 DIMMs intended for Lynnfield processors. These puppies are rated for 1600MHz operation at a CAS latency of 8 with only 1.65V of juice. They even auto-tune themselves to those settings via built-in profiles, in concert with the right motherboards.

Several of the right motherboards are stacked up behind the DIMMs, including the Gigabyte microATX P55 board I mentioned the other day, the P55M-UD4.  Sitting on top of them are a couple of Lynnfield-ready CPU coolers.  The big dawg from Thermalright is already up and running here now.  It seems to be quiet and effective without being especially heavy, interestingly enough.

As you might imagine, I'm looking forward to testing with our new systems.  The next step is to get Windows 7 up and running.  After that, I'll be spending as much time as I can, within limits, trying out new applications we may want to add to our CPU test suite.  If you have suggestions, now is the time to offer them.  We have limits to what we can include, especially since each new benchmark takes time to set up and confirm, but I would like to add a few new things this time around.  Remember: the best candidates are easily timed, repeatable, don't have crazy DRM restraints, are CPU-bound, and are fairly widely used by consumers on desktop PCs.
