The intent of these wide, sweeping changes is clear: to inflict as much pain on the industry as possible in the shortest time window.
What I meant to say was that Intel clearly intends to clean up the last vestiges of the circa-1990s PC platform at once, weeding out weaknesses and pulling open bottlenecks. The marketing spin on all of this says it’s about enhancing the user experience and making the PC a better citizen in the “digital home,” where networked PCs replace VCRs and other such media devices. For once, I’m somewhat persuaded by the spin, because many of these changes should make computing smoother and easier, better suited to the playback of high-definition audio and video. However, this major overhaul of the PC isn’t just about making a better TiVo replacement. There’s much more to it than that.
We’ve tested the whole shebang, from the Intel 915 and 925X Express chipsets to new processors including the Pentium 4 model 560 at 3.6GHz. We’ve tested PCI Express graphics cards from ATI and NVIDIA, and we’ve benchmarked Maxtor’s impressive new MaXLine III Serial ATA hard drives with support for Native Command Queuing. Read on to learn more about what each of these changes means for you and to see how this first wave of next-generation PC hardware performs.
The heart of the matter
At the heart of Intel’s PC overhaul is that too-often overlooked component, the core logic chipset. These two chips act as the traffic cop inside a personal computer, allowing all the devices to communicate and function together. Most of the major features you see on a checklist from Dell or HP are conferred by a system’s chipset, as well. Today, Intel is introducing a lineup of three new 900-series Express chipsets, the 925X, 915P, and 915G. I’ll give you a brief overview of this lineup’s new features, and then we’ll look at the new stuff in more depth. If you’re confused by some of the terminology, hang on, because we’ll be explaining much of it on the following pages.
The three 900-series chipsets have a lot in common, as one might expect. Their north bridge chips, or memory controller hubs (MCH) as Intel likes to call them, include a PCI Express X16 interface for graphics, replacing AGP, and a new memory controller capable of working with DDR2 memory.
As in the past, Intel has enabled and disabled features on its MCH chips to create three distinct products. The 925X is the high-end chip; it will have faster internal timings in its memory controller and support for ECC memory to enhance data integrity for workstations. The 915P is Intel’s mainstream chip; unlike the 925X, it retains support for DDR memory. And the 915G is essentially the 915P plus built-in graphics.
All three of these north bridge chips talk to the other chip in the set, the south bridge, or I/O Controller Hub (ICH) in Intel-speak, via a new, PCI Express-like link, dubbed DMI, that has a data rate of 1GB/s in each direction for a total of 2GB/s. Until now, Intel’s chipsets had been saddled with Intel’s Accelerated Hub interconnect running at 266MB/s, so this change is welcome.
The change is also necessary, because the I/O-oriented south bridge will now be doing lots more input and output. There are four flavors of the new ICH6 chip, and all of them share some common features, including four Serial ATA 150 ports, one ATA/100 port, eight USB 2.0 ports, Gigabit Ethernet, support for Intel’s new High Definition Audio, and four lanes of PCI Express expansion capacity. That features list represents an upgrade in almost every category save one: it’s down one ATA/100 port. Intel’s obviously ready to move the market away from ATA hard drives. Still, the ICH6 series retains support for all sorts of legacy I/O standards, including up to six PCI slots, just in case you’re a glutton for punishment.
As in the past, the ICH models with support for disk arrays, or RAID, get an “R” at the end of their names. Also, some models of the ICH will now come with 802.11g wireless networking capability. Those models will get a W attached to their names. So at the end of the day, you have four variants of the new ICH: the vanilla ICH6, ICH6R, ICH6W, and the super-deluxe ICH6RW. Make mine an RW, please.
That’s the 10,000-foot overview of the 915 and 925X Express series chipsets that bring all these new features to the PC for the first time. Now let’s talk about some of the important features in more detail.
I wasn’t kidding when I said ye olde PCI bus is slow. In its most common implementation in desktop PCs, at 32 bits and 33MHz, the PCI bus has a theoretical peak bandwidth of 133MB/s, which is shared between all devices. To give you some perspective, a single Serial ATA hard drive interface runs at 150MB/s, and Gigabit Ethernet runs at roughly 125MB/s. Asking a PCI-based bus to host an SATA RAID array and a Gigabit Ethernet controller is like asking Alec Baldwin to read through F.A. Hayek’s The Road to Serfdom: you could practically watch its lips move.
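To put numbers on that mismatch, here’s a back-of-the-envelope sketch in Python. The function and constant names are my own, and the 33.33MHz figure is the nominal clock behind the rounded “33MHz” label; the SATA and Gigabit Ethernet rates are the ones quoted above.

```python
# Peak theoretical bandwidth of a parallel bus: width in bytes times clock rate.
# Assumes one transfer per clock, as on conventional 32-bit/33MHz PCI.
PCI_WIDTH_BITS = 32
PCI_CLOCK_MHZ = 33.33  # nominal clock behind the rounded "33MHz" label


def bus_bandwidth_mb_s(width_bits: int, clock_mhz: float) -> float:
    """Return peak bandwidth in MB/s for a bus doing one transfer per clock."""
    return width_bits / 8 * clock_mhz


pci = bus_bandwidth_mb_s(PCI_WIDTH_BITS, PCI_CLOCK_MHZ)

# Interface rates quoted in the text for two devices that might share the bus.
SATA_150_MB_S = 150
GIGE_MB_S = 125  # 1000 Mbit/s divided by 8

demand = SATA_150_MB_S + GIGE_MB_S
print(f"PCI peak: {pci:.0f} MB/s, shared by every device on the bus")
print(f"SATA + GigE peak demand: {demand} MB/s ({demand / pci:.1f}x the bus)")
```

In other words, those two devices alone could ask for roughly twice what the bus can deliver on its best day.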
Not only does PCI lack bandwidth, but its shared bus architecture requires arbitration between devices that want to transfer data and involves contention between upstream and downstream communications. PCI’s limitations have forced chipset makers to integrate ever more functionality into their chipsets, as Intel did when it hung a Gigabit Ethernet interface off the north bridge in its previous-generation 875P chipset.
At 33MHz and 32 bits, ye olde PCI is decidedly slow and wide. PCI Express, meanwhile, is the epitome of the new thinking in internal PC communications links, the “fast and narrow” approach, a more serialized way of transmitting data with lower pin counts and higher signaling rates. Despite the dorky name, PCI Express doesn’t actually share all that much with PCI, save some memory addressing and device initialization similarities so drivers and operating systems don’t need major plumbing changes to work with the new standard.
In fact, PCI Express is downright network-like on several levels. On the lowest, physical layer, PCI Express uses pairs of dedicated, unidirectional links to transfer data between devices. A pair of links in a PCI-E connection is known as a “lane,” and each lane offers 250MB/s of bandwidth in each direction, upstream and downstream. Because PCI Express lanes are point-to-point affairs, there are no worries about shared bandwidth, and because the lanes are bidirectional, there’s no contention between sending and receiving data.
The slowest possible PCI Express configuration is a PCI Express X1 slot, where a device gets 250MB/s of bandwidth in each direction, or 500MB/s in full duplex. However, like NIC teaming in an Ethernet network, PCI Express lanes can be teamed up to deliver more bandwidth between devices. For graphics, sixteen PCI Express lanes will connect a graphics card to the rest of the system for a total bandwidth of 8GB/s, full duplex. That’s a whopping amount more than the current, PCI-derived standard for graphics, AGP 8X, which tops out at 2.1GB/s.
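The lane math is simple enough to sketch in a few lines of Python. The names here are mine, and the only inputs are the per-lane and AGP 8X figures quoted above, so treat it as an illustration rather than anything out of the spec.

```python
LANE_MB_S_PER_DIRECTION = 250  # per-lane figure quoted in the text
AGP_8X_MB_S = 2100             # AGP 8X peak quoted in the text (2.1GB/s)


def pcie_bandwidth_mb_s(lanes: int, full_duplex: bool = False) -> int:
    """Peak bandwidth of a PCI Express link with the given lane count."""
    per_direction = lanes * LANE_MB_S_PER_DIRECTION
    return per_direction * 2 if full_duplex else per_direction


for lanes in (1, 4, 8, 16):
    one_way = pcie_bandwidth_mb_s(lanes)
    both_ways = pcie_bandwidth_mb_s(lanes, full_duplex=True)
    print(f"X{lanes}: {one_way} MB/s each way, {both_ways} MB/s full duplex")

x16 = pcie_bandwidth_mb_s(16, full_duplex=True)
print(f"X16 full duplex vs. AGP 8X: {x16 / AGP_8X_MB_S:.1f}x")
```

Note that the 8GB/s figure counts both directions at once; in any one direction, an X16 link peaks at 4GB/s.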
The similarities between PCI-E and a network don’t stop at the physical layer, either. PCI Express also employs a packet-based protocol for data transmission, and it uses packet header information to reserve bandwidth for delay-sensitive data streams with eight different traffic classes. These facilities should make PCI-E ideal for more than just dedicated connections between devices. PCI Express should become a standard for internal PC communications, just as AMD’s HyperTransport is now.
The PCI-E physical layer spec allows for X1, X2, X4, X8, X12, X16, and X32 lane widths, but the initial connector specs call only for X1, X4, X8, and X16 slots. X4 and X8 slots may make appearances in servers soon, but for desktop systems, expect to see X1 slots for expansion and X16 slots for graphics.
Obviously, PCI Express will bring more bandwidth to the PC platform, but more importantly, it establishes a new foundation for PC expansion standards. PCI on desktop PCs hasn’t changed radically since its inception, but PCI Express has the engineering headroom and a practical set of options for expansion, when needed.
Intel’s engineers have given the 915 and 925X Express memory controllers the ability to work with memory conforming to the new DDR2 standard. Presently, the original DDR memory type generally tops out at 400MHz, but DDR2 memory starts there and goes up. The first round of DDR2 memory runs as fast as 533MHz, but DDR2 is expected to climb to 667 and 800MHz in the future.
DDR2 memory has been tweaked in various ways to allow for higher clock speeds and, one would hope, eventually more performance than DDR memory. Among other changes, DDR2 memory chips include on-die termination, higher densities, longer burst lengths, a different signaling scheme, and lower operating voltages (1.8V) than original-recipe DDR memory. Many of these changes could cause higher access latencies, but those should be offset by higher clock frequencies.
The DDR2 spec also requires fine-pitch ball-grid array (FPBGA) packaging for DDR2 memory chips, so the TSOP chip package common on many DDR modules (save for our DIMM pictured above) won’t be present.
DDR2 memory is not backward compatible with DDR, so you’ll have to chuck your DIMMs if you’re looking to upgrade. DDR2 modules have 240 pins instead of 184, and they have a different notch placement to prevent confusion or inadvertent calamity.
At 533MHz, DDR2 modules will have a peak theoretical bandwidth capacity of 4.3GB/s. Since the 915 and 925X chipsets have dual-channel DDR2 memory controllers, that’s a total peak of 8.6GB/s. However, the Pentium 4’s front-side bus currently tops out at 6.4GB/s, so the CPU won’t get to enjoy the full benefits of DDR2 memory. More likely beneficiaries include PCI Express X16 graphics cards and the built-in graphics core inside the 915G Express chipset.
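Here’s a quick sketch of where those bandwidth numbers come from. It assumes 64-bit data paths for both a memory channel and the Pentium 4’s front-side bus, an assumption of mine that the figures above imply but don’t state, and the function name is purely illustrative.

```python
def peak_gb_s(effective_mhz: float, width_bits: int = 64, channels: int = 1) -> float:
    """Peak theoretical bandwidth in GB/s (1 GB = 10^9 bytes)."""
    bytes_per_transfer = width_bits / 8
    return effective_mhz * 1e6 * bytes_per_transfer * channels / 1e9


ddr2_single = peak_gb_s(533)            # one DDR2 channel at 533MHz
ddr2_dual = peak_gb_s(533, channels=2)  # dual-channel controller
fsb = peak_gb_s(800)                    # 800MHz quad-pumped front-side bus

print(f"DDR2-533, one channel:  {ddr2_single:.1f} GB/s")
print(f"DDR2-533, two channels: {ddr2_dual:.1f} GB/s")
print(f"Pentium 4 800MHz FSB:   {fsb:.1f} GB/s")
print(f"Memory bandwidth the CPU can't reach: {ddr2_dual - fsb:.1f} GB/s")
```

The unrounded dual-channel result lands a hair under the 8.6GB/s you get by doubling the rounded 4.3GB/s figure, but the conclusion is the same: about a quarter of the memory subsystem’s peak bandwidth is out of the CPU’s reach.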
More power requires… more power
Although the new motherboards Intel supplied us for review fit the ATX form factor, they have power connectors similar to those in the new BTX chassis standard. The 20-pin main ATX power connector has been upgraded to an EATX-style 24-pin connector like those found on server boards. Like some server boards, the Intel mobos were content to run off a 20-pin power connector if needed. Also, next to the four-pin ATX 12V connector on each board is a four-pin Molex connector, old-school hard drive style, for auxiliary power. Obviously, Intel is making provisions for its 100W Prescott Pentium 4 processors to get enough juice.
Fortunately, Intel is also taking steps to make sure those monster GeForce 6800 Ultra cards will get enough power. The power supply Intel shipped with the review equipment came with a distinctive new six-pin power connector, and NVIDIA’s new PCI Express-ready GeForce 6800GT came with a port for just such a plug.
This plug can replace the dual Molex connectors NVIDIA used on its GeForce 6800 Ultra card. For those of you who just bought a new 800W power supply, adapters from dual Molex connectors to the new six-pin plug should be available.
These new chipsets bring along with them an improved Serial ATA standard from Intel known as AHCI. This standard adds some new features to the Serial ATA specification, including device hot plugging and a form of tagged command queuing officially known as Native Command Queuing. Both of these features are similar to those provided by the SCSI standard prevalent in the server and workstation world, but they’re now coming to the everyday desktop PC.
The 915 and 925X chipsets also support the ATAPI standard on their four Serial ATA ports, so they should be ready to host SATA optical drives.
The biggest news here is Native Command Queuing. NCQ puts some smarts in the hard disk drive’s control logic, allowing it to reorder the execution of requests in order to optimize for what’s happening with the hard drive mechanism itself: where the head is seeking across the drive and where the platter is spinning under the head. By queuing up multiple commands and executing them out of order, the hard drive may be able to grab data more efficiently than it could by simply executing commands one after another, minimizing the near-eternal (in computer time) delays caused by seek times and rotational latencies.
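To make the idea concrete, here’s a toy Python sketch of command reordering. It is emphatically not how any real drive’s firmware works: it models only head position along a made-up range of track numbers, ignores rotational position entirely, and uses a simple greedy “nearest request first” rule that I picked for illustration. The queue contents and starting position are hypothetical.

```python
def head_travel(start: int, order: list[int]) -> int:
    """Total distance the head moves servicing requests in the given order."""
    position, total = start, 0
    for target in order:
        total += abs(target - position)
        position = target
    return total


def reorder_nearest_first(start: int, pending: list[int]) -> list[int]:
    """Greedy reordering: always service the closest pending request next."""
    remaining = list(pending)
    position, order = start, []
    while remaining:
        nearest = min(remaining, key=lambda track: abs(track - position))
        remaining.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


# Hypothetical queue of five requests, identified by track number.
queue = [900, 100, 850, 150, 800]
start = 500

in_order = head_travel(start, queue)
reordered = reorder_nearest_first(start, queue)
out_of_order = head_travel(start, reordered)

print(f"Arrival order:   {queue} -> {in_order} tracks of head travel")
print(f"Reordered queue: {reordered} -> {out_of_order} tracks of head travel")
```

With this particular queue, servicing requests in arrival order bounces the head back and forth across the platter, while the reordered queue sweeps one way and then the other. A real drive has to weigh rotational latency as well, which is where having the smarts in the drive itself pays off.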
The NCQ spec looks fairly robust, with all the sorts of provisions necessary to make such a feature work. Drives with NCQ can initiate DMA transfers through the host controller themselves, and they can aggregate interrupts, so only one interrupt is generated when multiple commands complete close together. NCQ has the potential to help performance substantially during periods of intensive disk activity, when multiple applications are making requests for data simultaneously. We’ll test that theory shortly.
To complement these SCSI-like features, the ICH6R south bridge has RAID support for its four SATA ports built in. Intel’s chipset RAID will do RAID level 1, or mirroring, and RAID 0, striping, but not both together. RAID 0+1, RAID 10, and RAID 5 are not mentioned in Intel’s docs, unfortunately. However, the RAID controller can support a pair of independent, two-drive RAID arrays. The ICH6R also now supports the designation of a hot spare drive and auto array rebuild for RAID 1 arrays.
More impressive still is Intel’s Matrix RAID technology. Matrix RAID is the RAID type nearly every enthusiast has probably wanted, whether he knew it or not. This feature allows the user to create a pair of RAID arrays of different types across only two drives. Each drive can have two partitions. On each drive, partition 0 could be part of a RAID 0 array, and partition 1 could be part of a RAID 1 array. Thus, the user would get, effectively, a pair of RAID drives, one using striping for improved performance and the other using mirroring for data integrity. Put your OS and applications on the RAID 0 array for faster boot and load times, and store your critical data on the RAID 1 array so you won’t lose it if one of the drives crashes. Nifty, eh?
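If you’re wondering how the capacity works out, here’s a small sketch under simple assumptions: two identical drives, each split into one partition for the striped array and one for the mirrored array, with no space lost to metadata. The function name and the 250GB/100GB example sizes are mine, not Intel’s.

```python
def matrix_raid_volumes(drive_gb: int, stripe_part_gb: int) -> dict[str, int]:
    """Usable capacity of the two arrays on a two-drive Matrix RAID setup.

    drive_gb:       capacity of each of the two (identical) drives
    stripe_part_gb: size of the partition on each drive given to RAID 0
    """
    if not 0 < stripe_part_gb < drive_gb:
        raise ValueError("stripe partition must be smaller than the drive")
    mirror_part_gb = drive_gb - stripe_part_gb
    return {
        "raid0_gb": 2 * stripe_part_gb,  # striping adds both partitions together
        "raid1_gb": mirror_part_gb,      # mirroring stores the same data twice
    }


volumes = matrix_raid_volumes(drive_gb=250, stripe_part_gb=100)
print(f"RAID 0 volume (OS and apps):   {volumes['raid0_gb']}GB")
print(f"RAID 1 volume (critical data): {volumes['raid1_gb']}GB")
```

The trade-off is the usual one: every gigabyte you shift from the mirrored partition to the striped one buys two gigabytes of fast, unprotected space at the cost of one gigabyte of protected space.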
Audio gets more definition
In the annals of product naming, Intel’s new High Definition Audio distinguishes itself with the most vanilla name possible for the feature it represents. Still, it’s not confusing and involves no torturous capitalization tricks, so I’d best not complain too much.
High Definition Audio provides (wait for it) high-definition audio on the PC, built right into the chipset. This new specification aims to replace the current AC97 audio spec. HD Audio allows for up to eight channels of digital audio at up to 24 bits of precision at 192KHz sample rates. That’s enough fidelity for PCs and PC-based devices to reproduce all of the major consumer electronics audio standards, including Dolby Digital Surround EX, DTS, and THX, given the proper software support.
HD Audio also improves over AC97 from an I/O standpoint, with support for dynamic bandwidth allocation, flexible use of DMA streams for audio input or output, and a clock signal that’s generated on the south bridge chip itself, not on the codec chip (or chips).
Of course, standards for digital audio on the PC only sound as good as their implementations, and the Intel implementation on its D925XCV motherboard is fairly representative of what most motherboard makers seem to be doing in several respects. Intel has chosen a Realtek ALC880 codec chip, which is the HD Audio successor to Realtek’s wildly popular ALC650-series codecs, found in what seems like every motherboard we’ve reviewed in the past year or so.
For output, the ALC880 can do digital-to-analog conversion for eight channels of audio at up to 24 bits and 192KHz, believe it or not, with a claimed signal-to-noise ratio of 100dB. Its S/PDIF output is limited to 24 bits and 96KHz. For recording, the ALC880 has three stereo analog-to-digital converters that peak at 24 bits and 96KHz; the claimed S/N ratio is 85dB.
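For a sense of scale, here’s a rough calculation of the raw, uncompressed PCM data rates those numbers imply. The function name is mine, and the sketch ignores any framing or protocol overhead on the link between the south bridge and the codec.

```python
def pcm_bits_per_second(channels: int, bits_per_sample: int, sample_rate_hz: int) -> int:
    """Raw PCM data rate: every channel carries one sample per sample period."""
    return channels * bits_per_sample * sample_rate_hz


# Peak playback: eight channels at 24 bits and 192KHz.
playback = pcm_bits_per_second(channels=8, bits_per_sample=24, sample_rate_hz=192_000)

# Peak recording on one stereo ADC: two channels at 24 bits and 96KHz.
recording = pcm_bits_per_second(channels=2, bits_per_sample=24, sample_rate_hz=96_000)

print(f"Playback:  {playback / 1e6:.3f} Mbit/s ({playback / 8 / 1e6:.3f} MB/s)")
print(f"Recording: {recording / 1e6:.3f} Mbit/s ({recording / 8 / 1e6:.3f} MB/s)")
```

Even at full tilt, eight-channel playback amounts to only a few megabytes per second, which goes some way toward explaining why flexible DMA streams and bandwidth allocation matter more here than sheer link speed.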
In other words, the first implementations of HD Audio do indeed seem to deliver high-definition audio, at least in terms of precision and sample rates. This ain’t no SoundBlaster Audigy card, claiming 24 bits when the DAC can only do 16. Whether or not the ALC880, situated on a PC motherboard, really produces good sound is another story.
I haven’t had time to conduct extensive listening tests to get a good subjective take on HD Audio, but I did listen to some MP3s on it with a decent pair of speakers, and I can at least say this: it doesn’t totally suck. That is, of course, more than one can say for an awful lot of built-in motherboard audio these days, so that’s something. But you probably won’t be prying my VIA Envy24HT-based PCI card with fancy DACs away from me any time soon.
The really good news here is that Intel has established an excellent new baseline for PC audio, much better than the AC97 stuff we’ve seen to date.
Integrated graphics gets faster, less Extreme
Apparently, someone in Intel marketing figured out that calling its uber-high-end Pentium 4 chip the Extreme Edition probably didn’t jibe with calling its integrated chipset video Intel Extreme Graphics. Accordingly, Intel’s new integrated graphics core has been given the more modest name of Graphics Media Accelerator 900.
Fortunately, the more modest name is paired up with a significantly beefed-up graphics core. The GMA 900 features four pixel pipelines running at 333MHz, as opposed to the single pipe of its predecessor in the 845G and 865G chipsets. Its 1.3Gtexel/s peak fill rate matches that of the GeForce FX 5200 Ultra. The extra memory bandwidth the 915G chipset gets from DDR2 533MHz memory should help boost performance, as well.
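That fill rate figure falls straight out of the pipeline count and clock speed, as this little sketch shows. It assumes each pipeline can lay down one texel per clock, which is my assumption rather than something Intel’s spec sheet spells out here, and the function name is illustrative.

```python
def peak_fill_rate_mtexels(pipelines: int, core_mhz: int,
                           texels_per_pipe_per_clock: int = 1) -> int:
    """Peak theoretical texel fill rate in Mtexels/s."""
    return pipelines * core_mhz * texels_per_pipe_per_clock


gma900 = peak_fill_rate_mtexels(pipelines=4, core_mhz=333)
print(f"GMA 900 peak fill rate: {gma900} Mtexel/s (~{gma900 / 1000:.1f} Gtexel/s)")
```

As always with peak fill rates, this is a ceiling; memory bandwidth shared with the rest of the system will decide how close the GMA 900 gets to it.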
Read further down the spec sheet, and the GMA 900 starts to sound formidable, at least in the world of integrated graphics. The GMA 900 supports DirectX 9, OpenGL 1.4, and Pixel Shader 2.0, at least on its spec sheet. Intel chooses to offload vertex shader work to the CPU, but the Prescott processor includes some SSE3 instructions specifically designed to accelerate vertex shader calculations.
However, Intel only claims the GMA 900 has 1.5 times the performance of the 865G’s Extreme Graphics, so don’t expect miracles. I think I saw some pixel shader effects on it when running UT2004 benchmarks, but the GMA 900 crashed out of Far Cry and refused to run the excellent DX9 “rthdribl” demo. The GMA 900 may give ATI’s Radeon IGP chipsets a run for their money, but Intel needs to work on its drivers a little first. And gamers can forget about it, regardless.
Fortunately, though, Intel has bolstered the GMA 900 with support for HDTV, including 1080i and 720p resolutions and component outputs, in addition to VGA and DVI. In fact, our Intel 915G test board arrived with a PCI-E X16 riser card sporting a DVI output.
Intel is launching a range of new processors for its 915 and 925X Express chipsets, all of which come in a new “land grid array” type package that has, oddly enough, no pins. It simply has 775 connector pads on its underside.
The LGA775 processors fit into a funky motherboard socket that has pins protruding from it. Like so:
This arrangement makes the processor vastly less susceptible to bent or broken pins. The question is whether it makes motherboards more prone to the same things.
So far, that’s not been my experience at all. Having dealt with my share of bent pins and damaged CPUs (tip: never drop an Athlon 64 while trying to shoehorn it into a small form factor box), I have to say that I’ve felt more comfortable dealing with the LGA775 stuff during testing. The processors are remarkably sturdy now, of course. You could play Tiddlywinks with the darned things. And CPUs generally cost quite a bit more than motherboards, anyhow.
But the motherboards don’t seem too terribly delicate. The pins are spaced close together enough that they form a pretty solid surface, and the socket mech itself tends to protect them. I’d rather have the pins in there than out on the CPU. Still, I’m curious to see how these sockets wear over time, and how well some of the motherboard makers manage to handle their return policies now that it’s their turn to deal with bent pins.
In addition to everything else it’s introducing today, Intel is unleashing a new range of processors in the LGA775 package, including one truly new clock speed, the Pentium 4 “Prescott” processor at 3.6GHz. In fact, aside from the Pentium 4 3.4GHz Extreme Edition for LGA775, I believe all of the CPUs Intel will be supplying in the new package are based on the 90nm Prescott core.
As expected, Intel has assigned model numbers to these new processors, easing the emphasis off of clock speeds, as AMD has already done. The new 3.6GHz version of the Prescott Pentium 4 gets a model number of 560. Here’s a table with all the numbers.
Although older, Northwood-based processors tend to offer better performance and dissipate less heat at a given clock speed, the Prescott will apparently be Intel’s workhorse CPU going forward.
Cooling the Pentium 4 560
To aid in dissipating the 115-plus watts of heat generated by a 3.6GHz Prescott CPU, Intel supplied an interesting new cooler with our review equipment. Check it out:
This cooler is designed to protect the processor from damage by chopping off your freaking fingers if they get too close.
I like it, though, because the BTX-style four-pin power connector on the motherboard gives it the ability to ramp fan speeds up and down in a linear fashion as needed. This setup isn’t nearly as noticeable as the transitions between speeds common to multi-stage cooling fans, which can be annoying.
Both ATI and NVIDIA are getting in on the action with PCI Express video cards. We were able to test one card from each company with Intel’s new PCI Express chipsets. Let’s have a look at them.
The NVIDIA card is a GeForce 6800GT based on the NV40 chip plus NVIDIA’s HSI chip, which bridges between the GPU’s AGP interface and the motherboard’s PCI Express X16 connection. NVIDIA says this chip talks to the GPU at two times the speed of AGP 8X, so PCI Express data rates should be faster than with AGP 8X, despite the bridge chip.
ATI’s Radeon X600 XT is the PCI Express version of the Radeon 9600 XT. Unlike NVIDIA, ATI has chosen to re-spin its Radeon chips with built-in, native PCI Express interfaces. Unfortunately for us, the X600 XT also has a higher memory clock speed than the 9600 XT, so direct comparisons between the 9600 XT and X600 XT will be a little bit iffy.
We’ve heard endless discussions about the potential performance impact of these two companies’ approaches to PCI Express. We may be able to settle some of the dispute with the test results in the following pages.
First, a few notes on how we labeled our graphs. This is, uhm, a bit of a complex product launch, so we tested multiple processors on multiple chipsets in order to include all the relevant info. I believe we’ve done it, but listen up. The graphs are labeled for both the CPU tested and the chipset involved. The common thread among Intel chipsets is the Pentium 4 3.4GHz Extreme Edition processor, which we’ve labeled as a “Pentium 4 XE 3.4GHz”. Those scores are the best ones to use for comparing chipsets.
The Prescott CPUs are running at different speeds here. The Pentium 4 3.4E is the Prescott on Socket 478 with the older 875P chipset. The Pentium 4 560 is the 3.6GHz Prescott in the new LGA775 package.
You’ll notice that all of the new Intel chipsets and CPUs are highlighted in the graphs for easy reading. The other stuff is in darker colors.
Keep in mind that the X600 XT is running at a higher memory clock speed than the 9600 XT. We did our CPU and chipset testing with these Radeon cards, but we tested gaming with the GeForce 6800GT cards, as well, particularly because those cards run at the same speed on both AGP and PCI-E. Note, also, that the scores labeled “915G/GMA” are not using an external graphics card but the 915G chipset’s built-in graphics.
Finally, we’ve updated the BIOS on our 875P platform (the Abit IC7-G motherboard) since our last set of CPU tests, and we found we got better performance out of it. We were able to use better memory timings, as well, so the 875P is a small amount faster all around.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.
Our test systems were configured like so:
|Processor||Athlon 64 3800+ 2.4GHz; Athlon 64 FX-53 2.4GHz||Pentium 4 3.4GHz Extreme Edition; Pentium 4 3.4’E’GHz||Pentium 4 3.4GHz Extreme Edition; Pentium 4 560 3.6GHz||Pentium 4 3.4GHz Extreme Edition|
|Front-side bus||HT 16-bit/800MHz downstream; HT 16-bit/800MHz upstream||800MHz (200MHz quad-pumped)||800MHz (200MHz quad-pumped)||800MHz (200MHz quad-pumped)|
|Motherboard||MSI MS-6702E||Abit IC7-G||Intel D925XCV||Intel D915GUX|
|North bridge||K8T800 Pro||82875P MCH||925X MCH||82915G MCH|
|Chipset drivers||4-in-1 v.4.51||INF Update 22.214.171.1242||INF Update 126.96.36.1994||INF Update 188.8.131.524|
|Memory size||1GB (2 DIMMs)||1GB (2 DIMMs)||1GB (2 DIMMs)||1GB (2 DIMMs)|
|Memory type||Corsair TwinX XMS3200LL DDR SDRAM at 400MHz||Corsair TwinX XMS3200LL DDR SDRAM at 400MHz||Micron DDR2 SDRAM at 533MHz||Micron DDR2 SDRAM at 533MHz|
|RAS to CAS delay||3||3||4||4|
|Hard drive||Maxtor MaXLine III 250GB SATA 150|
|Graphics 1||Radeon 9600 XT 128MB AGP with CATALYST 4.6 drivers||Radeon X600 XT 128MB PCI-E with CATALYST 4.6 drivers||Radeon X600 XT 128MB PCI-E with CATALYST 4.6 drivers|
|Integrated graphics (915G)||Integrated Graphics Media Accelerator with 184.108.40.20681 drivers|
|Graphics 2||GeForce 6800GT 256MB AGP with 61.45 drivers||GeForce 6800GT 256MB PCI-E with 61.45 drivers|
|OS||Microsoft Windows XP Professional|
|OS updates||Service Pack 1, DirectX 9.0b|
All tests on the Intel systems were run with Hyper-Threading enabled, unless otherwise specified.
Thanks to Corsair for providing us with memory for our testing. If you’re looking to tweak out your system to the max and maybe overclock it a little, Corsair’s RAM is definitely worth considering.
The test systems’ Windows desktops were set at 1152×864 in 32-bit color at an 85Hz screen refresh rate, with the exception of the 915G with GMA, which was at 1024×768 in 32-bit color at 85Hz. Vertical refresh sync (vsync) was disabled for all tests.
We used the following versions of our test applications:
- Cachemem 2.65MMX
- SiSoft Sandra 2004 (9.89)
- Compiled binary of C Linpack port from Ace’s Hardware
- NewTek Lightwave 7.5
- Cinebench 2003
- POV-Ray for Windows 3.6
- ScienceMark 2.0 beta (23SEP03 build)
- Sphinx 3.3
- LAME 3.96 (build from mitiok.cjb.net)
- Xmpeg 5.0.3 with DivX Video 5.11
- Comanche 4 demo
- Quake III Arena 1.31
- Splinter Cell 1.2
- Unreal Tournament 2004 3236
- Far Cry 1.1
- HD Tach 2.61
- Serious Magic 3D Image Download Benchmark
- Iometer 2003.12.16
- RightMark 3DSound 1.01
The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
So how does DDR2 memory handle, and does the 925X chipset offer much performance advantage over the 915G?
Well, well. The old 875P chipset with DDR400 memory comes out on top in the memory bandwidth sweeps.
Linpack shows us a visual of how fast the CPU can compute data matrices of various sizes, from those stored in the on-chip cache to those that only fit into main memory. Unfortunately, in the case of the Pentium 4 Extreme Edition chips, their 2MB L3 caches can hold any data set Linpack throws at them, so we don’t get a look at main memory performance. The Pentium 4 560, though, nicely outpaces the 875P system in Linpack, showing us a hint of better performance from DDR2 memory after we get into data sizes larger than its 1024K L2 cache.
Here’s a surprise. For some reason, I half expected DDR2 memory to have higher access latencies, but that’s not the case. The Pentium 4 Extreme Edition takes longer to get to memory on the 875P chipset than on the 915G and 925X. Then again, it’s all very close, and the tables are turned with the Prescott chips, where the 875P shows slightly lower access latencies. No surprise that the AMDs are fastest, though, in overall memory bandwidth and access latencies. The Athlon 64’s built-in memory controller is very tough to beat.
Don’t let the 3D graphs scare you. The graphs are indulgent, but they’re useful, too. I’ve arranged them manually in a very rough order from worst to best, for what it’s worth. Shorter bars are generally better. I’ve also colored the data series according to how they correspond to different parts of the memory subsystem. Yellow is L1 cache, light orange is L2 cache, and orange is main memory. The red series, if present, represents L3 cache. Of course, caches sometimes overlap, so the colors are just an interesting visual guide.
Ok, so the order from highest to lowest latency is totally a rough estimate. Don’t pay much attention to the order in which the graphs are presented. Instead, look at how much higher the memory latencies are on the 900-series chipsets at the highest step sizes, 2048 and 4096. Obviously, the 915/925 memory controller behaves very differently with DDR2 than the 875P does with DDR. Although our single sample point on the last page showed decent latencies for the DDR2 chipsets, the reality is that it depends very much on how memory is being accessed. It’s hard to say which chipset is generally quicker at accessing memory, based on these results. The one thing we can say with certainty is that AMD’s integrated memory controller is very, very quick.
We’ll get right down to gaming with the GeForce 6800GT cards. This may be a better chipset comparison than the Radeon gaming tests because of the different clock speeds between Radeon X600 XT and 9600 XT. Remember, though, that in this case the GPU is talking to PCI Express through a bridge chip.
We’ve kept the game resolution low in order to give the chipsets and PCI Express a chance to work their magic. At higher resolutions, we’d run into the limitations of the GeForce 6800GT card’s fill rate, memory bandwidth, and pixel shader performance, obscuring the impact of PCI-E and the new chipsets. I’ve tested both UT2004 and Far Cry at medium and high quality settings, to see if larger texture sizes or additional vertex and command traffic will strain the AGP bus and allow PCI Express to excel.
Unreal Tournament 2004
Quake III Arena
In all five of the games we tested with the GeForce 6800GT, we see no performance advantage to the 915/925 chipsets and PCI Express. The 875P chipset is ever so slightly faster than the 925X most of the time, with only Quake III Arena allowing the 875P to separate a little.
Again, we’ve kept the resolution low with the Radeon cards. We want to highlight the impact of PCI-E and chipset differences, and we also want to minimize the impact of the higher memory clocks on the X600 XT versus the 9600 XT.
I’ve also included performance numbers for the Intel 915G chipset’s Graphic Media Accelerator 900 in the results below. Matching up to the Radeon X600 isn’t really a fair fight, but you should get an idea about its performance. Unfortunately, Comanche 4 wouldn’t run on the GMA 900 because of its lack of a vertex shader. Also, Far Cry crashed immediately on initializing the graphics engine on the GMA 900, so I don’t have any results for it.
Unreal Tournament 2004
Quake III Arena
The 915G and 925X look a little better here. Is it because of the X600 XT’s slight memory clock speed advantage, or because of the X600 XT’s native PCI Express interface? Hard to say. But let’s try something a little different . . . .
Back when we wrote this article, the folks at Serious Magic devised a test for measuring the speed of downloading an image from the graphics card back into main system memory. At the time, the graphics drivers for Direct3D were abysmally slow at getting data back off the card, but both ATI and NVIDIA updated their drivers to fix the problem, and transfer speeds went up. Now, with PCI Express, transfer speeds should go up again. Let’s see what happens.
Both the ATI and NVIDIA cards show improvement in pulling down data from graphics memory via PCI Express, but the Radeon X600 XT is clearly the faster of the two. The GeForce 6800GT is only about 14MB/s faster via PCI-E than via AGP 8X, while the Radeon is twice as fast via PCI Express, and significantly faster overall. This synthetic benchmark looks like a vindication of sorts for ATI’s native PCI Express implementation, but it still raises some intriguing questions. First, why are both cards’ data transfer rates so low, relatively speaking? A PCI Express X16 link should provide 4GB/s of bandwidth for data transfers from the video card to main memory. Second, why is the ATI card almost exactly twice as fast at pulling data back from the video card via PCI Express as it is via AGP 8X? I don’t have any answers, but I am curious to learn why that might be.
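For reference, the 4GB/s figure quoted above can be reproduced from the published bus parameters. The sketch below is only back-of-the-envelope arithmetic: the function names are mine, and the numbers are theoretical peaks per direction, not anything we measured.

```python
# Peak theoretical bandwidth, in bytes per second, from published bus parameters.
# Function names are illustrative; these are ceilings, not measured results.

def pcie_gen1_bandwidth(lanes: int) -> float:
    """Per-direction bandwidth of a first-generation PCI Express link."""
    transfers_per_second = 2.5e9   # 2.5 GT/s per lane
    encoding_efficiency = 8 / 10   # 8b/10b line code: 8 data bits per 10 bits sent
    bits_per_second = lanes * transfers_per_second * encoding_efficiency
    return bits_per_second / 8


def agp_bandwidth(multiplier: int) -> float:
    """Bandwidth of a 32-bit AGP bus with a 66.66MHz base clock."""
    base_clock_hz = 66.66e6
    bytes_per_transfer = 4
    return base_clock_hz * multiplier * bytes_per_transfer


pcie_x16 = pcie_gen1_bandwidth(16)
agp_8x = agp_bandwidth(8)
print(f"PCI Express x16: {pcie_x16 / 1e9:.1f} GB/s per direction")  # 4.0 GB/s
print(f"AGP 8X:          {agp_8x / 1e9:.1f} GB/s")                  # 2.1 GB/s
```

On paper, then, the X16 link has roughly twice the headroom of AGP 8X in each direction, which makes the low readback rates all the more puzzling.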
Ricky Houghton first brought us the Sphinx benchmark through his association with speech recognition efforts at Carnegie Mellon University. Sphinx is a high-quality speech recognition routine that needs the latest computer hardware to run at speeds close to real-time processing. We use two different versions, built with two different compilers, in an attempt to ensure we’re getting the best possible performance.
Prescott loves Sphinx, and Sphinx loves the Prescott. The 925X system with the Prescott 3.6GHz takes the top spot by a hair, but the Prescott 3.4GHz on the 875 offers generally the same performance.
LAME MP3 encoding
We used LAME to encode a 101MB 16-bit, 44KHz audio file into a very high-quality MP3. The exact command-line options we used were:
lame --alt-preset extreme file.wav file.mp3
Chipsets don’t tend to make a lot of difference for MP3 encoding, as you can see.
DivX video encoding
This new version of XMPEG includes a benchmark feature, so we’re reporting scores in frames per second now.
Here’s another test where, as with Sphinx, the Pentium 4 Prescott at 3.6GHz looks to push the 925X over the top. The 875P is faster with like processors.
NewTek’s Lightwave is another popular 3D animation package that includes support for multiple processors and is highly optimized for SSE2. Lightwave can render very complex scenes with realism, as you can see from the sample scene, “A5 Concept,” below.
We’ve tested the processors with one and two rendering threads to see if Hyper-Threading helps.
Here’s another test where chipsets just don’t matter much. The Prescott at 3.6GHz manages to beat out the Athlon 64 3800+ in both cases, though.
POV-Ray is the granddaddy of PC ray-tracing renderers, and it’s not multithreaded in the least, because it’s designed to be a cross-platform application. POV-Ray also relies more heavily on x87 FPU instructions to do its work, and it contains only minor SIMD optimizations.
Again, chipsets aren’t much of a factor, but look at the nice performance increase for the Prescott at 3.6GHz. The Pentium 4 560 is much faster than the 3.4E. Then again, the Athlon 64 is far and away the fastest.
Cinebench is based on Maxon’s Cinema 4D modeling, rendering, and animation app. This revision of Cinebench measures performance in a number of ways, including 3D rendering, software shading, and OpenGL shading with and without hardware acceleration. Cinema 4D’s renderer is multithreaded, so it takes advantage of Hyper-Threading, as you can see in the results.
Our final rendering test is another CPU showcase; chipsets don’t matter much. The Pentium 4 chips all do well in Cinebench rendering, and the Pentium 4 560 manages to get fairly close to the Extreme Edition chips in overall performance.
ScienceMark is optimized for SSE, SSE2, and 3DNow!, and it is multithreaded, as well. In the interest of full disclosure, I should mention that Tim Wilkens, one of the originators of ScienceMark, now works at AMD. However, Tim has sought to keep ScienceMark independent by diversifying the development team and by publishing much of the source code for the benchmarks at the ScienceMark website. We are sufficiently satisfied with his efforts, and impressed with the enhancements to the 2.0 beta revision of the application, to continue using ScienceMark in our testing.
ScienceMark’s problem-solving tests show us what we’ve come to expect by now: that performance differences between the 875P, 915G, and 925X are fairly minor, and that the 915G and 925X tend to trail the 875P slightly when chipsets do matter.
The Prescott Pentium 4s put on a show in DGEMM, displaying their improved SSE2 performance. Again, chipsets aren’t much of a factor.
Now we’ll dive into the south bridges, where the new ICH6 can get more of a workout. We’ve cut the contenders down to three here, because the 915G and 925X share the same south bridge chips.
RightMark3D measures CPU utilization with a number of audio tasks. We can see how well Intel’s High Definition Audio implementation works.
Overall, the 925X does pretty well in these tests, showing CPU utilization generally similar to the 875P’s AC97 audio engine. The VIA audio controller on the K8T800 Pro uses much more CPU time, consistent with what we’ve seen from it in the past. However, there are several mitigating factors here. First, I don’t believe RightMark3D tests with any sort of high-definition audio streams that require mixing of high-bit-rate audio data, so it’s not really a torture test we’re seeing. Second, I’m always wary of CPU utilization tests that report numbers with Hyper-Threading enabled. Generally, software doesn’t seem to get a very accurate representation of CPU utilization with HT turned on, because one of the two logical CPUs will be sitting idle. Then again, I’m not sure it’s accurate not to have it turned on, either. Just something to keep in mind.
USB performance
We used HD Tach to measure USB transfer rates to a Maxtor DiamondMax D740X hard drive in a USB 2.0 drive enclosure.
The 925X achieves faster read speeds and lower CPU utilization than the 875P chipset. The K8T800 Pro with the Athlon 64 is faster yet, but at the price of much higher CPU utilization. Out of curiosity, I turned off Hyper-Threading and re-ran the test on the 925X. CPU utilization was then reported at 19.4%, still much lower than the K8T800 Pro system.
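The Hyper-Threading caveat is easier to see with numbers. A system-wide utilization figure is typically an average across logical CPUs, so a single-threaded load reads lower when a second, idle logical CPU is counted. The sketch below uses a made-up 30% load purely for illustration; it is not one of our results, and the function name is mine.

```python
def reported_utilization(per_cpu_busy):
    """Average busy fraction across logical CPUs, as a system-wide meter would show it."""
    return sum(per_cpu_busy) / len(per_cpu_busy)


# Hypothetical single-threaded load that keeps one logical CPU 30% busy.
ht_off = reported_utilization([0.30])        # one logical CPU visible
ht_on = reported_utilization([0.30, 0.0])    # second logical CPU sits idle

print(f"HT off: {ht_off:.0%}")
print(f"HT on:  {ht_on:.0%}")
```

The same work shows up as half the utilization with HT on, which is why numbers from HT-enabled and HT-disabled systems shouldn’t be compared at face value.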
Here we get to see whether Native Command Queuing has any measurable benefits. I used Iometer with both workstation and database access patterns to simulate real-world disk loads. Note that there are two sets of results for the 925X. One of them is without Native Command Queuing, using the built-in Microsoft disk driver in WinXP. The other uses Intel’s Application Accelerator for RAID 4.0 driver, which enables Native Command Queuing support. Not surprisingly, that’s the result labeled “NCQ” in the graphs.
In all cases, we’re using Maxtor’s MaXLine III SATA 150 drive that features a 16MB buffer. This is a pre-production drive with NCQ support.
Without NCQ, the 925X chipset is very closely comparable to the 875P and K8T800 Pro chipsets. But with both access patterns, Native Command Queuing shows higher transaction rates, lower response times, and only negligible spikes in CPU utilization (below about 3%) versus the non-NCQ configs.
Here is a feature that folks should be lining up for. Hard drives are the slowest components in a modern PC, and the 925X with Native Command Queuing delivers SCSI-like performance in a Serial ATA drive. We’ll have to test RAID with NCQ soon.
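To illustrate why command queuing helps under loads like these, here is a toy model, not a description of how the MaXLine III’s firmware actually works. It assumes seek cost is simply the distance between logical block addresses and ignores rotational position, which real drives also account for. The function name and the request stream are made up for the example; the queue depth of 32 matches the maximum number of outstanding commands NCQ allows.

```python
import random


def total_seek_distance(requests, queue_depth):
    """Sum of head movement when servicing requests with a reorder window.

    queue_depth=1 models a drive that must service commands in arrival order;
    larger values let the drive pick the nearest pending request first.
    """
    pending = []
    incoming = iter(requests)
    head = 0
    travelled = 0
    while True:
        # Top up the queue from the arrival stream.
        while len(pending) < queue_depth:
            nxt = next(incoming, None)
            if nxt is None:
                break
            pending.append(nxt)
        if not pending:
            return travelled
        # Service the pending request closest to the current head position.
        target = min(pending, key=lambda lba: abs(lba - head))
        pending.remove(target)
        travelled += abs(target - head)
        head = target


# Hypothetical random-access workload, loosely in the spirit of a database pattern.
rng = random.Random(42)
workload = [rng.randrange(1_000_000) for _ in range(2000)]

in_order = total_seek_distance(workload, queue_depth=1)
queued = total_seek_distance(workload, queue_depth=32)
print(f"in-order head travel: {in_order}")
print(f"queued head travel:   {queued}")
```

With a random workload, the queued case should travel far less than the in-order case, which is the basic mechanism behind the higher transaction rates and lower response times in the Iometer results.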
Those of you who were looking for earth-shaking performance differences out of Intel’s new chipsets may be disappointed, but realistically, most of the changes are not of the sort easily measurable via common benchmarks or applications. No, the 915G and 925X chipsets aren’t really faster in gaming with PCI Express graphics cards, but we saw the same thing back when AGP 8X arrived. That doesn’t mean we don’t need a better, faster path to the graphics card; it just shows that game developers tend to write their applications with the limitations of their target hardware in mind. As of right now, that target is probably a graphics card with 128MB of memory, AGP 4X, and a DirectX 8-class GPU. Depressing, but true. Applications that take advantage of PCI Express in a big way will come along sooner or later.
As for DDR2 memory, at 533MHz, it’s a little disappointing, because it isn’t really faster than DDR400. However, remember that we were testing with first-gen Micron DIMMs with relatively conservative timings. We may see better performance yet out of fancy performance DIMMs like the Kingston HyperX or Corsair XMS2 stuff. If not, well, DDR2 probably won’t be worth the price premium for a while yet. I have here an Abit motherboard based on the 915P chipset with DDR400 memory support. I’m curious to see how it performs. Boards like it may be the best choice for those looking to get into a PCI Express system right away.
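The DDR2 result is less surprising once peak bandwidth and access latency are separated. The sketch below is a rough illustration only: the CAS latencies are assumed values typical of each memory type, not the timings of the DIMMs in our test systems, and it assumes a dual-channel configuration with 64-bit channels. The function names are mine.

```python
def peak_bandwidth_gbs(data_rate_mts, channels=2, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s for a given transfer rate."""
    return data_rate_mts * 1e6 * bus_bytes * channels / 1e9


def cas_latency_ns(cas_cycles, data_rate_mts):
    """Time for the CAS delay alone, in nanoseconds."""
    clock_mhz = data_rate_mts / 2   # DDR moves data twice per clock cycle
    return cas_cycles / clock_mhz * 1000


# Assumed, illustrative timings: CL2 for DDR400, CL4 for DDR2-533.
for name, rate, cas in [("DDR400", 400, 2), ("DDR2-533", 533, 4)]:
    print(f"{name}: {peak_bandwidth_gbs(rate):.1f} GB/s peak, "
          f"{cas_latency_ns(cas, rate):.1f} ns CAS delay")
```

Under those assumptions, DDR2-533 offers about a third more peak bandwidth but waits roughly half again as long on each access, so the two can easily wash out in real applications. Tighter DDR2 timings would shift that balance.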
Obviously, the biggest performance win of them all with the new chipsets is Serial ATA with Native Command Queuing. Its performance alone would be enough to sway me away from the older Pentium 4 platform and perhaps from an AMD-based one, as well. We’ll have to measure it more thoroughly in time, but based on what we’ve seen so far, I expect NCQ will cut boot times, among other things. It’s just the right thing to do, and now we can have it, complete with RAID 0 and 1, without paying for SCSI. We can even have data integrity and extra performance with two drives, thanks to Matrix RAID.
The 915G’s integrated graphics seems to be an improvement, but the graphics driver needs work in order to make the GMA’s claim of DirectX 9 support seem credible. For what it will be asked to do, the GMA 900 should be just fine. Just don’t ask it to run Far Cry.
The rest of the changes to the PC platform are a little harder to quantify. I need to play with High Definition Audio a little more using a proper 5.1 or 7.1 surround sound system and a high-quality audio source before I feel qualified to pronounce it a complete success, but it’s at least decent. I’m a little shocked how capable Realtek’s ALC880 codec turned out to be. Eight channels of 24-bit, 192KHz audio is a heckuva new baseline for PC audio capabilities.
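To put that new baseline in perspective, the raw data rate of uncompressed PCM audio is just channels times sample size times sample rate. The quick calculation below compares the figures above against CD audio; the function name is mine, and the CD comparison is added only for scale.

```python
def pcm_bitrate(channels, bits_per_sample, sample_rate_hz):
    """Raw data rate of uncompressed PCM audio, in bits per second."""
    return channels * bits_per_sample * sample_rate_hz


hd_audio = pcm_bitrate(8, 24, 192_000)   # eight channels of 24-bit, 192kHz audio
cd_audio = pcm_bitrate(2, 16, 44_100)    # stereo CD audio, for scale

print(f"HD Audio peak: {hd_audio / 1e6:.1f} Mbit/s")   # 36.9 Mbit/s
print(f"CD audio:      {cd_audio / 1e6:.2f} Mbit/s")   # 1.41 Mbit/s
print(f"Ratio:         {hd_audio / cd_audio:.1f}x")    # 26.1x
```

That works out to about 4.6MB/s of audio data at the top setting, trivial for the memory subsystem to carry but a big step up from what the codec had to handle under AC97.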
So what should we make of the whole package, including Intel’s new LGA775 Pentium 4 Prescott processors? Well, right now, the AMD64 platform seems to have the lead in terms of overall CPU performance, gaming performance, and memory performance, despite the arrival of PCI Express and DDR2 memory. The Athlon 64 has power consumption and thermal characteristics superior to any system based on an LGA775 processor. Also, the Athlon 64 unambiguously has support right now for 64-bit operating systems and applications as they become available. All in all, no small set of advantages.
AMD also seems to have a big advantage in terms of product availability at the high end of the market. As of today, I couldn’t find a single Pentium 4 3.4E listed for sale on PriceWatch, and here we are reviewing a 3.6GHz model. Craziness. Intel needs to launch silicon, not paper.
However, with the 915 and 925X Express chipsets, Intel has innovated mightily in ways that deliver a better overall user experience and a better overall PC platform. Of course, everyone will benefit from some of these changes, including Athlon 64 buyers, once competent PCI Express chipsets arrive for the Athlon 64. But Intel’s implementation of all these new technologies is here now, seems reasonably solid, and is poised to become the new PC platform standard over the next six to twelve months. Taken together, all these improvements add up to a pretty compelling argument for 915/925X-based systems, assuming they’re sufficiently available. I’m cautiously optimistic, and I’m intrigued to start reviewing new 915/925X motherboards, higher-performance DDR2 DIMMs, and PCI Express graphics cards. That optimism may turn into an all-out recommendation, especially if Intel can turn on its 64-bit extensions and get its CPU heat problems reined in a bit.