AMD’s Athlon XP 2800+ and NVIDIA’s nForce2

TODAY we get the chance to give you a preview of some very interesting products coming down the pike from AMD and NVIDIA. This is a somewhat unique opportunity. Both products are noteworthy for their performance, and they make a very potent combination together. However, this particular combination of parts won’t be widely available for some time yet.

AMD’s last desktop processor launch was the Athlon XP 2600+, and those chips are only just now becoming available to PC manufacturers. End-user versions of the 2600+ probably won’t be widely available for a few weeks, based on everything I’ve heard. It may take even longer than that.

Similarly, NVIDIA announced its nForce2 core-logic chipset in mid-July (we wrote up a technology preview at the time), and the company promised retail product availability in August. The month of August came and went, and no products showed up. We didn’t hear a peep out of NVIDIA about nForce2’s schedule, but the fact of the product delay was obvious.

Now, AMD and NVIDIA are teaming up to introduce the combination of the Athlon XP 2800+ processor and the nForce2 chipset. AMD is taking a novel approach to this product launch, making Athlon XP 2700+ and 2800+ chips available through certain PC makers as “a limited edition gaming microprocessor.” Those systems should be available “in November”, which pretty much means the end of that month. The 2800+ won’t be available at retail until next year.

NVIDIA says the nForce2 really, really is coming soon now, too. Now that I’ve waited as long as I have, I’m going to adopt the consummate Missourian’s stance on this one: I’ll believe it when I see it.

But I have played with the preproduction versions of AMD’s Athlon XP 2800+ and Asus’s nForce2 motherboard, and they are both quite nice. The Athlon XP 2800+ runs at 2.25GHz on a 333MHz bus, and the nForce2 feeds it data from dual banks of DDR333 memory. The combination is nearly as potent as my Kansas City Chiefs’ offense. Read on to find out why.


The press test kit: Athlon XP 2800+ on a pre-production Asus nForce2 mobo

Leveraging proactive synergies for a win-win result. Cough.
The Athlon XP 2800+ is (along with the 2700+) the first Athlon to support a 333MHz front-side bus, which is a welcome development to AMD fans who have been clamoring for a faster bus for some time now. There’s just something about getting beaten by the Pentium 4 and its fancy-pants 533MHz bus in every bandwidth-intensive benchmark everywhere that makes a faster bus sound like a good idea. We toyed around with raising an Athlon XP’s bus from 266MHz to 333MHz a few months ago, and we liked the promise of a faster bus paired up with a faster Athlon XP.

The 2.25GHz Athlon XP 2800+ is fed by NVIDIA’s nForce2 “system platform processor” (north bridge), which talks to dual banks of DDR333 memory. We’ll talk more about NVIDIA’s chipset in a few pages, because we’re going to focus on the CPU side of things first by comparing the Athlon XP 2800+ to its Pentium 4 competition. However, you should note that nForce2’s memory controller combines dual memory banks, data pre-fetching, and caching in order to improve memory access performance. You’ll want to keep that in mind as you see the benchmark scores.

 

Our processor testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.

Our test systems were configured like so:

| | Athlon XP | Pentium 4 DDR | Pentium 4 RDRAM |
|---|---|---|---|
| Processor | AMD Athlon XP 2200+ 1.8GHz / AMD Athlon XP 2600+ 2.13GHz / AMD Athlon XP 2800+ 2.25GHz | Intel Pentium 4 2.53GHz / Pentium 4 2.8GHz | Pentium 4 2.8GHz |
| Front-side bus | 266MHz (133MHz DDR) / 333MHz (166MHz DDR) | 533MHz (133MHz quad-pumped) | 533MHz (133MHz quad-pumped) |
| Motherboard | Asus A7N-8X (pre-release sample) | Abit SR7-8X | Asus P4T533C |
| Chipset | NVIDIA nForce2 | SiS 648 | Intel 850E |
| North bridge | nForce2 SPP | 648 | 82850E MCH |
| South bridge | nForce2 MCP-T | 963 | 82801BA ICH2 |
| Chipset drivers | 2.77 | SiS AGP 1.10 | Intel Application Accelerator 6.22 |
| Memory size | 512MB (2 DIMMs) | 512MB (1 DIMM) | 512MB (4 RIMMs) |
| Memory type | Corsair XMS3200 PC2700 DDR SDRAM | Corsair XMS3200 PC2700 DDR SDRAM | Samsung PC1066 Rambus DRAM |

All three systems shared the following components:

Graphics: ATI Radeon 9700 Pro 128MB (Catalyst 7.76 drivers)
Sound: Creative SoundBlaster Live!
Storage: Maxtor DiamondMax Plus D740X 7200RPM ATA/100 hard drive
OS: Microsoft Windows XP Professional
OS updates: Service Pack 1

Thanks to Corsair for providing us with DDR333 memory for our testing. If you’re looking to tweak out your system to the max and maybe overclock it a little, Corsair’s RAM is definitely worth considering. Using it makes life easier for us as we’re dealing with brand-new chipsets and pre-production motherboards, because we don’t have to worry so much about stability and compatibility.

The test systems’ Windows desktops were set at 1024×768 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.

We used the following versions of our test applications:

All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

 

Memory performance
Our first set of tests will highlight just how a faster bus and a new chipset can benefit the Athlon XP. We’ll start out with SiSoft’s Sandra memory bandwidth benchmark. This one uses extensive buffering, streaming SIMD extensions, nitrous oxide injectors, and rocket fuel to cram as much data as possible in and out of memory, so it will show us something close to theoretical peak memory throughput.

The Athlon XP 2800+ comes out looking amazing here, tying our DDR333-based Pentium 4 systems in a test customarily dominated by the Intel chips. You can see the increase of about 550MB/s over the 266MHz-bus Athlons. Both Athlons and P4s take PC2700 memory near its 2.7GB/s peak data rate. Our P4 test rig with PC1066 RDRAM comes out well ahead of all the DDR333 systems, but RDRAM isn’t exactly a growth industry these days.
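For reference, the theoretical peaks behind these numbers can be worked out with simple arithmetic: transfers per second times bytes per transfer. The sketch below is only an illustration. The bus widths (64-bit front-side buses and DDR channels, two 16-bit RDRAM channels on the 850E board) are assumptions that are not stated in this article, and the function and label names are made up for the example.

```python
def peak_mb_s(base_clock_mhz, transfers_per_clock, width_bytes, channels=1):
    """Theoretical peak in MB/s: clock x transfers per clock x bytes x channels."""
    return base_clock_mhz * transfers_per_clock * width_bytes * channels


# Nominal "133", "166" and "533" MHz clocks are really multiples of 33.33MHz.
BASE_133 = 400 / 3
BASE_166 = 500 / 3
BASE_533 = 1600 / 3

# Bus widths are assumptions: 8 bytes for the FSBs and DDR, 2 bytes per RDRAM channel.
peaks = {
    "Athlon XP FSB, 266MHz": peak_mb_s(BASE_133, 2, 8),
    "Athlon XP FSB, 333MHz": peak_mb_s(BASE_166, 2, 8),
    "PC2700 DDR, one channel": peak_mb_s(BASE_166, 2, 8),
    "Pentium 4 FSB, 533MHz": peak_mb_s(BASE_133, 4, 8),
    "PC1066 RDRAM, two channels": peak_mb_s(BASE_533, 2, 2, channels=2),
}

for label, mb_s in peaks.items():
    print(f"{label}: {mb_s:.0f} MB/s")

# Theoretical gain from moving the Athlon XP's bus from 266MHz to 333MHz.
fsb_gain = peaks["Athlon XP FSB, 333MHz"] - peaks["Athlon XP FSB, 266MHz"]
print(f"FSB gain: {fsb_gain:.0f} MB/s")
```

Under those assumptions, the jump from a 266MHz to a 333MHz bus is worth a little over 530MB/s on paper, which is in the same neighborhood as the increase measured here.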

Cachemem gives us a bit of a different picture, because it’s not so aggressive about pushing data through by whatever means necessary. Also, cachemem shows both read and write performance.

Here, the Athlon XP can’t quite keep up with the P4 with its hyper-aggressive prefetching and larger L2 cache.

Now we’ll look at memory access latency, which is every bit as important as bandwidth in the grand scheme of things.

Here’s where the 333MHz bus really helps. It also doesn’t hurt that we’re running the DDR333 memory synchronous with the 333MHz front-side bus on the 2800+ system. All of the other systems here are running memory at a different base clock speed than their front-side busses.

Finally, we have Linpack, which shows us a nice picture of the cache and memory subsystems in our test systems. The size of the data matrix Linpack is processing increases from left to right, so the left side of the picture shows L1 cache at work, the middle shows L2 cache, and the right shows main memory access speeds. (Linpack also tests FPU performance to some degree, since it’s crunching big floating-point numbers.)

This is about what we’ve come to expect from Athlons and P4s here, with one notable exception: the Athlon XP rig beats out all DDR333 systems above matrix sizes of about 1.2MB. This is new territory for the Athlon XP.
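To make the left-to-right reading of the Linpack graph concrete, here is a rough sketch that maps a matrix's memory footprint onto the cache level it would fit in. The cache sizes are assumptions not given in this article (64KB L1 data and 256KB L2 for the Athlon XP, 8KB L1 data and 512KB L2 for the Pentium 4), and the model ignores details such as the Athlon's exclusive cache arrangement, so treat it as a back-of-the-envelope aid rather than a description of how Linpack itself works.

```python
# Assumed cache sizes in bytes; these figures are not taken from the article.
CACHES = {
    "Athlon XP": {"L1": 64 * 1024, "L2": 256 * 1024},
    "Pentium 4": {"L1": 8 * 1024, "L2": 512 * 1024},
}


def matrix_bytes(n, element_bytes=8):
    """Footprint of an n x n matrix of double-precision values."""
    return n * n * element_bytes


def level_for(working_set_bytes, caches):
    """Smallest level of the hierarchy that holds the whole working set."""
    for level in ("L1", "L2"):
        if working_set_bytes <= caches[level]:
            return level
    return "main memory"


for n in (30, 50, 150, 400):
    size = matrix_bytes(n)
    placement = {cpu: level_for(size, c) for cpu, c in CACHES.items()}
    print(f"{n}x{n} ({size / 1024:.0f}KB): {placement}")
```

A 400x400 matrix of doubles comes to about 1.2MB, which lands in main memory on both chips under these assumptions.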

So initial indications are very positive for the 2800+ and its revamped memory subsystem. Let’s see how these numbers translate into performance in real applications.

 

Business Winstone

The nForce2/2800+ combo opens up a huge lead in Business Winstone, topping the nearest P4 system by 8 points. The bus speed increase obviously helps out here. You can see a marked jump in scores from 2600+ (with 266MHz bus) to 2800+ (with 333MHz bus).

Content Creation Winstone

Here’s another test where the Pentium 4 has traditionally dominated, but not today. The 2800+ knocks off the P4 with DDR333 and nearly catches the PC1066 RDRAM rig.

 

LAME MP3 encoding
We used LAME 3.92 to encode a 101MB 16-bit, 44kHz audio file into a high-quality, variable-bit-rate MP3. The exact command-line options we used were:

lame -v -b 128 -q 1 file.wav file.mp3

Here are the results…

None of these chips is really memory bound when compressing audio with LAME, but the 2800+ does show a fairly linear improvement over the 2600+.
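If you want to reproduce a timed command-line test like this one, a small harness that runs the command several times and averages the wall-clock time does the job. What follows is a minimal sketch, not the harness used for this article: the function name `time_command` is made up, and the demo times a do-nothing Python process as a stand-in so the example runs without LAME installed.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=2):
    """Run cmd `runs` times; return (mean seconds, list of per-run seconds)."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)  # raises if the command fails
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), samples


# Stand-in workload. To time the encode above, pass the LAME command instead:
# ["lame", "-v", "-b", "128", "-q", "1", "file.wav", "file.mp3"]
mean_s, samples = time_command([sys.executable, "-c", "pass"], runs=2)
print(f"mean {mean_s:.3f}s over {len(samples)} runs")
```

Averaging at least two runs helps smooth out one-off hiccups such as a cold disk cache on the first pass.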

DivX video encoding
Xmpeg can encode video files using the popular DivX format, which produces very high quality video in relatively small amounts of space. For this test, we took a 279MB video file, encoded in MPEG2 format at DVD quality, and converted it to a 37MB DivX file. We used the “medium” quality/speed setting on the DivX encoder, and we turned off audio processing. Otherwise, all settings were left at their defaults.

Now this is a result we weren’t expecting! The 2800+ system snags another oft-P4-dominated test away from the P4 2.8GHz/DDR333 rig. Obviously, the nForce2 and faster bus help out greatly here. Of course, nothing can touch RDRAM in this bandwidth-hungry benchmark.

 

Quake III Arena

The Athlon can’t quite top the P4 here, but it’s close.

3DMark2001 SE

The Athlon XP hasn’t beaten the fastest P4 in 3DMark for a long time, but here it tops the DDR333 system and nearly busts the 15,000 mark in the process.

Serious Sam SE

The Athlon XP wins this one outright. Check out the 20 fps boost from 2600+ to 2800+.

Comanche 4

This one is neck and neck until the RDRAM rig butts in.

Unreal Tournament 2003
You’ll notice some P4/DDR scores missing from the UT2003 results. We tried and tried, but the SiS 648 chipset in our P4 DDR333 system just wouldn’t play nice with our Radeon 9700 graphics card. The thing locked up at the same point every time we ran the UT2K3 benchmark. This kind of problem is a known issue with brand-new AGP 8X cards and mobos. We did have a few stability problems outside of UT2K3, but we were able to get past those and record benchmark scores. Not so here.

The Athlon XP edges out the RDRAM P4 system by just a hair. Heck, it’s not even a whole hair. Just part of a hair. Just by a split end.

 

SPECviewperf

There’s not much to say here. The Athlon XP 2800+ dominates the new viewperf suite, winning every single test outright—and some by large margins.

 

Speech recognition
Sphinx is a high-quality speech recognition routine that needs the latest computer hardware to run at speeds close to real-time processing. We use two different versions, built with two different compilers, in an attempt to ensure we’re getting the best possible performance.

There are two goals with Sphinx. The first is to run it faster than real time, so real-time speech recognition is possible. The second, more ambitious goal is to run it at about 0.8 times real time, where additional CPU overhead is available for other sorts of processing, enabling Sphinx-driven real-time applications.

Few things are sweeter than seeing a new system config make big strides in Sphinx. When we initially started testing with Sphinx, everything we threw at it ran well below real time, and now, we’re breaking the 0.8 mark. This is yet another test where the Athlon XP was rarely able to compete with the Pentium 4, but the nForce2/333MHz bus combo breaks through to nearly 0.8 times real time.
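The "times real time" figure is just processing time divided by the length of the audio being recognized, so lower is better. Here is a small sketch of that arithmetic, with hypothetical function names and made-up timings rather than anything taken from Sphinx itself:

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio length; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def describe(rtf, target=0.8):
    """Classify a real-time factor against the two goals described above."""
    if rtf <= target:
        return "meets the 0.8 target"
    if rtf < 1.0:
        return "faster than real time"
    return "slower than real time"


# Illustrative timings only: 60 seconds of audio processed in 45, 54 and 72 seconds.
for seconds in (45, 54, 72):
    rtf = real_time_factor(seconds, 60)
    print(f"{seconds}s for 60s of audio: {rtf:.2f}x real time, {describe(rtf)}")
```

At 0.8 times real time, roughly a fifth of the CPU's time is left over for whatever application is consuming the recognized speech.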

POV-Ray 3D rendering
We’ll use a pair of POV-Ray tests this time around, our usual “chess2” scene and a new benchmarking function built into the new 3.5 release of POV-Ray. The built-in bench stresses a number of POV-Ray functions, including some now optimized for the Pentium 4’s SSE2 instruction set extensions.

Any way you slice it, the Athlon XP is fastest at POV-Ray rendering. Bus speeds and memory architectures don’t tend to matter so much here. However, the official benchmark shows more promising results for the Pentium 4 than our usual test scene does.

 

ScienceMark

ScienceMark is generally the domain of the Athlon, and the 2800+ defends its turf admirably here. Here are the results for some of the component tests:

This is a familiar split result. Primordia runs best on the P4, while the rest is all Athlon XP.

 

Investigating nForce2
Now that you’ve seen all the CPU results, you’re probably wondering how much of the Athlon XP 2800+’s impressive performance increase is caused by its faster 333MHz front-side bus and how much is attributable to the nForce2 chipset. That’s the question we aim to answer with our next set of tests, where we’ll throw the nForce2 into the ring versus its two top competitors: VIA’s KT333 and KT400 chipsets. (We would have included more competing chipsets, but frankly, after our last Athlon chipset roundup and VIA’s domination of Socket A market share, there’s little point.)

Before we throw the nForce2 to the wolves, however, we should review the tools it brings to this fight. You owe it to yourself to go read my preview of the nForce2 for a true overview of its features. To sum up, the incarnation of the nForce2 we’re playing with here today is loaded to the gills with all the latest goodies: AGP 8X, a reworked memory controller with dual-bank DDR support, USB 2.0, IEEE 1394, ATA/133, dual Ethernet controllers, and Dolby Digital 5.1 audio. And, of course, it supports a 333MHz front-side bus.


The “south bridge” chip of the nForce2 set, the MCP handles I/O duties and battles Tron

Probably the most important update to nForce2 is the completely redesigned memory controller, which incorporates a number of changes intended to help performance. Since memory controllers are the primary determinant of core-logic chipset performance, these changes are especially noteworthy. nForce2’s memory controller uses a newer, more aggressive data pre-fetch algorithm that avoids stalling on memory reads and writes (apparently a bit of a problem with the first nForce). The controller’s buffering and memory addressing have been improved, as well. Also, nForce2 now supports three separate address lines, one for each DIMM slot, so that the controversial “super-stability” fallback mode is no longer needed.

I asked Drew Henry, NVIDIA’s Senior Director of Platform Product Management, about nForce2’s DASP pre-fetch and “L3” caching mechanism. Some folks had wondered whether this arrangement was necessary—or perhaps even counterproductive—now that the Athlon “Palomino” chips include their own data pre-fetch mechanisms. He assured me DASP would work in concert with the Athlon XP’s pre-fetch just fine, and that, generally, more pre-fetching is better.

Mr. Henry also downplayed the nForce2’s support for DDR400 memory. Currently, DDR400 chips from only two RAM vendors, Samsung and Micron, are validated by NVIDIA for use with nForce2. (The JEDEC DDR400 spec, of course, is even further off.) He said “basic computer science” dictates that a memory bus that runs at the same speed as the front-side bus is better, anyhow. NVIDIA would presently prefer to run memory synchronously and use aggressive memory timings instead of higher clock speeds. This configuration is, apparently, especially helpful with dual-channel DDR.

(That’s particularly interesting in light of the VIA KT333’s seemingly magical ability to squeeze a little extra performance out of DDR333 memory on a 266MHz bus with nearly no latency penalty.)
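The synchronous-versus-asynchronous distinction comes down to the ratio between the front-side bus's base clock and the memory's base clock. As a quick illustration, the sketch below expresses the configurations discussed in this article as exact ratios. The helper names are invented for the example, and the nominal clocks are treated as multiples of 33.33MHz, which is an assumption about how the "133", "166" and "200" figures are derived.

```python
from fractions import Fraction

STEP_MHZ = Fraction(100, 3)  # assumed 33.33MHz step behind the nominal clocks


def base_clock(multiple):
    """Base clock in MHz: 4 -> 133.33, 5 -> 166.67, 6 -> 200."""
    return multiple * STEP_MHZ


def fsb_to_memory_ratio(fsb_mhz, mem_mhz):
    """Exact FSB:memory clock ratio; 1 means the two run synchronously."""
    return Fraction(fsb_mhz) / Fraction(mem_mhz)


configs = {
    "333MHz bus with DDR333": (base_clock(5), base_clock(5)),
    "266MHz bus with DDR333": (base_clock(4), base_clock(5)),
    "333MHz bus with DDR400": (base_clock(5), base_clock(6)),
}

for name, (fsb, mem) in configs.items():
    ratio = fsb_to_memory_ratio(fsb, mem)
    mode = "synchronous" if ratio == 1 else "asynchronous"
    print(f"{name}: {ratio.numerator}:{ratio.denominator} ({mode})")
```

Only the first configuration runs 1:1. The other two force the chipset to bridge two different clock domains, which is the situation NVIDIA says it would rather avoid.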

NVIDIA claims dual-channel memory can indeed help performance even without integrated graphics, where the extra memory bandwidth is really needed. Even though the Athlon XP’s 333MHz bus runs no faster than a single PC2700 DIMM, dual-bank configs can supposedly improve access latencies, allow for simultaneous memory reads and writes, and help with AGP transfers and other types of I/O not destined to travel over the front-side bus.
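A bit of arithmetic shows where that extra bandwidth could go. The sketch below compares the peak rate of the front-side bus with one and two channels of PC2700 and sets the difference against the peak rate of an AGP 8X port. The 64-bit bus and channel widths and the AGP 8X figures (a 32-bit port at roughly 533 million transfers per second) are assumptions that are not spelled out in this article, and these are paper peaks, not measured throughput.

```python
def peak_mb_s(mega_transfers_per_s, width_bytes, channels=1):
    """Theoretical peak in MB/s for a bus or memory channel."""
    return mega_transfers_per_s * width_bytes * channels


DDR333_MT = 1000 / 3  # 166.67MHz base clock, two transfers per clock
AGP8X_MT = 1600 / 3   # assumed: 66.67MHz base clock, eight transfers per clock

fsb = peak_mb_s(DDR333_MT, 8)               # Athlon XP 333MHz bus, assumed 64 bits wide
single = peak_mb_s(DDR333_MT, 8)            # one PC2700 channel
dual = peak_mb_s(DDR333_MT, 8, channels=2)  # two PC2700 channels
agp8x = peak_mb_s(AGP8X_MT, 4)              # assumed 32-bit AGP port

surplus = dual - fsb  # memory bandwidth the CPU cannot consume by itself

print(f"FSB {fsb:.0f} MB/s, single channel {single:.0f} MB/s, dual channel {dual:.0f} MB/s")
print(f"surplus beyond the FSB: {surplus:.0f} MB/s, AGP 8X peak: {agp8x:.0f} MB/s")
```

On paper, then, the second channel leaves enough headroom to service an AGP 8X card at full tilt while the CPU saturates its own bus, though real-world traffic rarely looks like that.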

All of this makes intuitive sense to me, but the potential improvements with dual-bank memory still seem fairly minor. Let’s put the theory to test.

 

Our chipset testing methods
We tested the nForce2 in several configurations: with an Athlon XP 2800+ and 333MHz bus/memory speeds with dual-channel memory, the same configuration with single-bank RAM, and with an Athlon XP 2200+ at 266/333MHz bus/memory speeds in a dual-bank config.

We had to resort to two different processor speeds for our testing because our KT400 motherboard wouldn’t boot with the Athlon XP 2800+. Both of our new preproduction Athlon XP chips, the 2600+ and 2800+, have faulty thermal diodes. AMD assures us this won’t be a problem in production chips, but neither of our KT400 motherboards would run with our review sample chips. Their thermal safety mechanisms would kick in and shut the system down right after boot. Turning off the thermal safety features in the BIOS didn’t help.

So we’ve tested the KT333 and nForce2 at 333/333MHz bus/memory speeds, and we’ve thrown in results for both of those chipsets plus the KT400 with an Athlon XP 2200+ on a 266MHz bus. I’ve varied the colors on the graphs to highlight the processor and bus speed differences. The 2200+/266 FSB configs are lighter colors than the 2800+/333 FSB configs.

As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.

Our test systems were configured like so:

| | nForce2 | KT333 | KT400 |
|---|---|---|---|
| Processor | AMD Athlon XP 2200+ 1.8GHz / AMD Athlon XP 2800+ 2.25GHz | AMD Athlon XP 2200+ 1.8GHz / AMD Athlon XP 2800+ 2.25GHz | AMD Athlon XP 2200+ 1.8GHz |
| Front-side bus | 266MHz (133MHz DDR) / 333MHz (166MHz DDR) | 266MHz (133MHz DDR) / 333MHz (166MHz DDR) | 266MHz (133MHz DDR) |
| Motherboard | Asus A7N-8X (pre-release sample) | Epox 8K3A+ | Abit IT7-MAX2 |
| Chipset | NVIDIA nForce2 | VIA KT333 | VIA KT400 |
| North bridge | nForce2 SPP | VT8367 | VT8377 |
| South bridge | nForce2 MCP-T | VT8233A | VT8235 |
| Chipset drivers | 2.77 | VIA 4-in-1 4.42v(a) | VIA 4-in-1 4.43v |
| Memory size | 512MB (2 DIMMs) | 512MB (1 DIMM) | 512MB (1 DIMM) |
| Memory type | Corsair XMS3200 PC2700 DDR SDRAM | Corsair XMS3000 PC2700 DDR SDRAM | Corsair XMS3000 PC2700 DDR SDRAM |

All three systems shared the following components:

Graphics: ATI Radeon 9700 Pro 128MB (Catalyst 7.76 drivers)
Sound: Creative SoundBlaster Live!
Storage: Maxtor DiamondMax Plus D740X 7200RPM ATA/100 hard drive
OS: Microsoft Windows XP Professional
OS updates: Service Pack 1

The test systems’ Windows desktops were set at 1024×768 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.

We used the following versions of our test applications:

All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

 
Memory performance

On a 333MHz bus in a single-bank configuration, the nForce2 essentially ties the KT333. With dual banks, the nForce2 pulls ahead. On a 266MHz bus, the three chipsets are in a dead heat.

Cachemem shows some real differences between the different chipsets. The nForce2 is fastest at 333/333, but when the bus runs asynchronously from the memory interface, the KT400 and KT333 are both faster.

The nForce2 does indeed gain lower latencies from running a second bank of memory. Remarkably, the KT400 shows the lowest latency here, even though the bus and memory clocks are out of sync.

 

Business Winstone

The nForce2 absolutely creams the competition in Business Winstone. It’s a full 10 points ahead of the KT333 when both are running a 333/333 bus/memory. Even at 266/333, the nForce2 is quite a bit faster than the VIA chipsets.

Content Creation Winstone

Content Creation Winstone isn’t such a lopsided victory, but the nForce2 does come out on top again.

3DMark2001 SE

The 3DMark results are mixed. The nForce2 is fastest at 333/333, but its relative performance drops off quite a bit at 266/333.

Sphinx speech recognition

Sphinx is similar to 3DMark. The advantages of dual-bank RAM and the 333/333 bus with nForce2 are impressive, yet the game changes in the 266/333 config.

 
Conclusions
Well, what can I say? The Athlon XP 2800+ and the nForce2 are a very powerful combination. If you can get a hold of them together, you’ll have a system that can take on today’s Pentium 4 2.8GHz systems and win. Of course, by that time, most folks expect the Pentium 4 to be running at over 3GHz with Hyper-Threading enabled, and it may well be running on dual-bank DDR chipsets of its own.

Even so, the Athlon XP 2800+ looks very promising. The bus speed increase delivers exactly the kind of performance boosts in exactly the places one would hope to see. And its 2.25GHz clock speed ensures the CPU is hungry enough to take advantage of the additional bus bandwidth. If AMD can deliver enough of these babies for the smaller custom PC builders like Alienware, there should be some very, very sweet pre-built rigs available this Christmas.

I just wish we were seeing widespread product availability sooner. AMD has paper launched its processors so far ahead—the 2700+ will be available before the end of 2002, but the 2800+ won’t arrive until the first quarter of ’03—that I can’t help but worry about the fate of the 2900+ or whatever comes next. How far out has AMD pushed its schedule?


But will it fly?

That said, AMD can still compete on price with Intel. Here’s a rundown of AMD’s new pricing down through its processor lineup:

2800+ $397 (each in quantities of 1,000)
2700+ $349
2600+ $297
2400+ $193
2200+ $183
2100+ $174
2000+ $155
1900+ $139

Those prices are competitive with Intel’s, but they aren’t all severely cheaper. AMD has clearly enjoyed the pricing freedom its new model number rating system has afforded.
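One way to see how the pricing is structured is to look at what each step up the model-number ladder costs. The sketch below takes the list prices above and computes the premium per 100 rating points between adjacent models. The "dollars per 100 points" metric is only an illustrative yardstick made up for this example, not anything AMD publishes.

```python
# Model rating -> list price in dollars (1,000-unit quantities), from the list above.
PRICES = {
    2800: 397, 2700: 349, 2600: 297, 2400: 193,
    2200: 183, 2100: 174, 2000: 155, 1900: 139,
}


def step_premiums(prices):
    """Dollars per 100 rating points between each pair of adjacent models."""
    models = sorted(prices)
    premiums = {}
    for lower, upper in zip(models, models[1:]):
        extra_dollars = prices[upper] - prices[lower]
        premiums[(lower, upper)] = extra_dollars / (upper - lower) * 100
    return premiums


premiums = step_premiums(PRICES)
for (lower, upper), dollars in premiums.items():
    print(f"{lower}+ to {upper}+: ${dollars:.0f} per 100 points")
```

By this measure, the steps below the 2400+ cost less than $20 per 100 points, while every step above it runs about $50.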

As for the nForce2, NVIDIA has really smashed one outta the park here. This revision of nForce finally lives up to our expectations from a company like NVIDIA, and it comes at a time when NVIDIA could use a bit of a lift. We will review the nForce2’s rich set of features (including audio, graphics, and networking capabilities) more thoroughly once we have final products in our sweaty little paws. I’m hopeful that will be sooner, as in the next few weeks, rather than later. With nForce2’s excellent performance, NVIDIA has served notice to VIA that the Socket A chipset market is now a two-horse race. This is one fight I can’t wait to see. 
