Workstation platforms compared

THE WORKSTATION COMPUTER is an enigmatic beast. Back in the day when I was young and computing power wasn’t so easy to come by, workstations were much easier to define. These particularly powerful computer systems came in sleek, stylish enclosures, and were attached to massive, gorgeous flat-screen displays as large as 19 or 21 inches. What’s more, workstation makers packed fast, high-res color graphics and exotic, RISC-inspired processors into those boxes, and they ran advanced multitasking operating systems with slick GUIs manipulated by optical mice.

In other words, they weren’t far from your average $1400 personal computer on display at Best Buy, with the obvious exception that today’s cheap PCs have gobs more computing power than those dinosaurs. Speaking of dinosaurs, yesterday’s workstation makers are now flirting with dinosaur status. SGI bled all its engineering talent to companies like 3dfx, ATI, and NVIDIA, and Sun is, well, not high on my list of stock picks, let’s just say. The workstation world ain’t what she used to be.

Over time, workstations themselves have been transformed—as have most servers, home computers, and—heh—pocket calculators—from proprietary boxes running proprietary software (various vendor-specific flavors of Unix) to x86-based systems running either the same Windows OS your grandma’s PC runs or the latter-day One True Unix, Linux. As a result, workstations are rather difficult to separate from your everyday, run-of-the-mill desktop Pee Cee. Indeed, the “workstation-class PC” has become an appendage of the larger PC market.

Through the magic of different drivers, purpose-engineered physical incompatibilities, and newly dreamed-up marketing names, pedestrian chips like GeForces, Athlons, Pentiums, and Radeons become exotic Quadros, Opterons, Xeons, and Fire GLs. The chips aren’t really that different, but the price differences can be rather impressive—and if you want a dual-processor system based on the Pentium 4 “Netburst” architecture, for instance, you’re gonna have to pony up the cash for a pair of Xeons.

Because the barriers between desktop- and workstation-class parts are semi-transparent, spotting the boundaries between the two worlds isn’t easy. Dell will sell you a Pentium 4-based Precision “workstation” with nearly all the same parts in it as in an Optiplex. In fact, Intel’s higher-end 875P chipset, with its exclusive Performance Acceleration Technology to improve memory access latency, is targeted at high-end desktop systems, enthusiasts, and—you guessed it—workstations.

So, you may be asking, what really distinguishes a workstation from a built-to-the-hilt desktop PC? The short answer: premium parts with higher price tags and, one hopes, some tangible benefits in terms of performance, capability, and reliability. For one thing, workstations tend to support ECC memory, in order to keep cosmic rays from scrambling your bits while you work on your CAD drawings. Also, workstations often have SCSI storage subsystems, with smarter drive control logic and higher spindle speeds. Then there are the aforementioned workstation chips. Workstation-class graphics chips tend to differ from their desktop counterparts in terms of drivers; the workstation cards’ drivers have optimized OpenGL code certified for use with high-end applications for tasks like design, engineering, and content creation. In CPUs, AMD and Intel have chosen to disable multiprocessing capabilities on their desktop processors, so they can charge premiums for CPUs that work in pairs. Workstation CPUs also sometimes get other enhancements, such as larger caches or wider paths to memory, in order to improve performance.

That’s about it, in a nutshell. Workstations are typically deployed where performance, stability, and compatibility are most important. They are also, as you might imagine, very smooth machines to use. Let’s take a look at several typical workstation-class PC configurations and see how they compare.

The processors
The systems we’re looking at today are based on x86 CPUs from AMD and Intel, the Opteron and Xeon, respectively.


$1176 worth of processors—at least back when we bought ’em

If you’re familiar with Pentium 4 processors, then Xeons ought to look mighty familiar. Today’s Xeon is essentially the same chip as the Pentium 4, sold under a different name. It has the same Netburst architecture and is made on the same 0.13-micron fab process. Unlike the P4, however, Xeons are capable of running in multi-processor configurations. Xeons also come in a slightly larger package than the P4, with 604 pins to the Pentium 4’s current 478. The Xeon model we’re testing today is the 2.66GHz version with 512K of on-chip L2 cache and a 533MHz front-side bus.

When we started work on this article, the Xeon 2.66GHz was exactly the same price ($294 a pop, American money) as the Opteron 240 to which we’ll be comparing it. Since then, Intel has countered AMD’s Opteron with some aggressive moves, including cutting prices, adding a 1MB L3 cache to some models, and moving to an 800MHz front-side bus with dual channels of DDR400 memory. We invited Intel to participate in our comparison with newer Xeon models, but it elected not to do so. Newer Xeon models (and motherboards to support them) are just now becoming available, so you may not see too many pre-assembled workstation rigs based on them just yet, anyhow.


The Xeon’s 604-pin package

AMD’s Opteron chip may not be as familiar to most folks as the Xeon, because the desktop variant of the Opteron hasn’t yet hit store shelves. This new chip, based on AMD’s K8 “Sledgehammer” architecture, brings a number of enhancements over the K7 architecture in the Athlons XP and MP. The Opteron packs an on-chip, dual-channel memory controller to reduce memory access latencies and allow for better performance scaling as the number of processors in a system rises. Also, the Opteron supports Intel’s SSE2 instruction set (in addition to AMD’s own 3DNow!), enabling accelerated SIMD computations with double-precision floating point datatypes. Many workstation apps use SSE2, especially for 3D rendering, so this addition is important. The Opteron’s larger, 1MB L2 cache won’t hurt, either. On the speed front, AMD has lengthened the K8’s pipeline to 12 stages (from the K7’s 10) and moved to a new, 0.13-micron silicon-on-insulator fabrication process in order to help the chip run faster and cooler.

Finally and perhaps most importantly, the Opteron is a true 64-bit processor. Through the use of AMD’s 64-bit extensions to the x86 instruction set architecture (ISA), the Opteron can run 64-bit operating systems and applications. This 64-bit capability breaks several barriers, most notably the 4GB limit on directly addressable memory. In the workstation market, the x86 PC’s traditional 4GB memory barrier can be a crippling problem, so AMD probably won’t have to work too hard to make the case for 64-bit computing here. Operating systems and applications will have to be recompiled in order to support AMD64, but that work is already happening in both the Windows and Linux universes. The AMD64 ISA also includes more registers, or temporary on-chip storage slots, than the 32-bit x86 ISA. Recompiled applications may show substantial performance gains on AMD64 even if they can’t take advantage of AMD64’s expanded memory address space, because the chip won’t have to resort to cache accesses as often.
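For a sense of where that 4GB figure comes from, here’s a quick back-of-the-envelope sketch. It only counts how many byte addresses a flat pointer of a given width can name. The helper name is ours, and actual processors may wire up fewer address bits than the pointer width implies, so treat the 64-bit number as a theoretical ceiling rather than a spec.

```python
# Illustrative arithmetic only: bytes reachable with a flat pointer of a given width.
GIB = 2 ** 30  # bytes in one binary gigabyte


def addressable_bytes(pointer_bits):
    """Return how many distinct byte addresses a pointer of this width can name."""
    return 2 ** pointer_bits


if __name__ == "__main__":
    # 32-bit x86: the familiar 4GB barrier.
    print(addressable_bytes(32) // GIB, "GB with 32-bit pointers")
    # 64-bit pointers: a theoretical ceiling, not what any real chip implements.
    print(addressable_bytes(64) // GIB, "GB with 64-bit pointers")
```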

We haven’t yet tested the Opteron with 64-bit software, but you’ll see shortly that it performs quite well running 32-bit code. AMD’s 64-bit extensions haven’t diminished the K8’s 32-bit performance.


The Opteron’s imposing 940-pin underbelly

Asus SK8N: Single-processor Opteron with nForce3 Pro 150
The first of the three workstation platforms we’re comparing today is based on NVIDIA’s nForce3 Pro 150 chipset. This Opteron core-logic chipset is a single-chip solution intended only for single-processor systems. Asus was first to market with the nForce3 Pro, and as far as I know, the SK8N motherboard is still the only nForce3 Pro mobo available.


Asus’ SK8N has an unconventional but clean layout

Because we’re dealing with an Opteron here, the core-logic chipset arrangement is a bit different from a traditional system. The dual-channel Opteron memory controller is integrated on the processor, so there’s no conventional north bridge, and memory access doesn’t happen over the front-side bus. What’s more, NVIDIA has folded the remaining north bridge functions into a single chip along with the usual south bridge I/O capabilities. The Opteron processor communicates with the lone nForce3 Pro chip over a HyperTransport connection. HyperTransport provides the key plumbing for K8-based systems, offering a high-bandwidth transport over pairs of narrow, unidirectional links with high clock speeds. In the case of the nForce3 Pro 150, those links provide 3.6GB/s of peak bandwidth.

The nForce3 Pro supports most of the latest standards, including AGP 8X, ATA/133, and USB 2.0. This chipset doesn’t have a native Serial ATA controller, but NVIDIA says one of the chipset’s three ATA/133 channels can be overclocked and bridged to support two SATA devices (a master and a slave) at 150MB/s. Asus chose not to go that route, opting instead for a Promise RAID controller. Curiously, NVIDIA’s datasheet also says the nForce3 Pro supports RAID levels 0, 1, and 0+1, but the SK8N manual mentions only RAID via the Promise controller.

The SK8N lacks some of the high-end features present in many workstation systems today, like 64-bit/66MHz PCI or PCI-X slots and AGP Pro. Also, surprisingly, there’s no Gigabit Ethernet support, only a 10/100 Ethernet controller based on the Ethernet MAC in the nForce3 Pro. The SK8N isn’t certified for NVIDIA’s SoundStorm, either. Asus chose Realtek’s low-cost ALC650 codec for the SK8N, and supplies Realtek audio drivers with the board.

Unlike the memory controllers in most desktop systems, though, the one in the Opteron chip requires registered DIMMs in order to operate properly. (We tried to get the system to boot with non-registered DIMMs, to no avail.) Registered memory relieves some of the electrical load on the memory controller, but it does so at the expense of one clock cycle of additional memory access latency. Given the proximity of Opteron’s built-in memory controller, that’s a fair trade-off. The K8 memory controller also supports ECC memory for improved data integrity, though it doesn’t require ECC DIMMs.

All in all, the nForce3 Pro is a decidedly low-end workstation chipset, especially as implemented on the Asus SK8N. With the exception of its single-processor limitation, though, it fits in nicely with the other platforms we’re looking at here. Expect the desktop versions of nForce3 for the upcoming Athlon 64 to look very similar to the nForce3 Pro.

MSI’s 9130 K8T Master2: A K8T800-based dual Opteron
MSI’s 9130 K8T Master2 is a very interesting low-end workstation and server motherboard. It’s based on VIA’s do-everything K8T800 chipset, which can scale from 4- and 8-way servers to single-CPU desktop systems based on the Athlon 64. The first thing you’ll notice about the 9130 K8T Master2 mobo is those two matching 940-pin sockets for Opteron chips.


Dual sockets for dual-processor action

Yep, this puppy’s a dually. VIA’s K8T800 gives the MSI 9130 several advantages over the SK8N, and multiprocessor operation is one of them. VIA’s K8T800 chipset also has a more traditional layout than the nForce3 Pro, with separate north bridge and south bridge chips. VIA’s proprietary, HyperTransport-like V-Link interconnect links the two at a rate of 1.06GB/s. By breaking things out into two chips, VIA can update the south bridge silicon independently from the north bridge, or vice-versa. VIA’s 8237 south bridge includes a true Serial ATA drive controller with support for RAID 0 and 1, plus a pair of ATA/133 interfaces. The 8237 also has a six-channel AC’97 audio controller, which MSI pairs up with an ALC210A codec. (The audio ports themselves are located on a PCI slot plate that connects to the 9130 via a header.)

Much like the SK8N, the use of a Realtek codec prevents the 9130 from earning VIA’s Vinyl Audio designation. MSI bypasses VIA’s Fast Ethernet controller, as well, and incorporates a Broadcom GigE chip.


A block diagram of the K8T800. Source: VIA.

The MSI 9130 has only 32-bit/33MHz PCI slots, but the K8T800 is capable of supporting faster PCI standards, all the way up to PCI-X, by an unorthodox arrangement in which a VIA PCI controller chip hangs off of the north bridge’s AGP 8X port. This setup could be useful for servers, where PCI-based graphics would suffice, but not for workstations, where the AGP port would best be dedicated to a graphics card. Accordingly, MSI has given the 9130 an 8X AGP Pro slot.

The MSI 9130’s talents are considerable, but MSI chose a peculiar cost-saving measure in designing this board. Although the 9130 has dual Opteron processors, each with its own dual-channel DDR memory controller, MSI connected only one of the two CPUs to DIMM slots. The second CPU has no local memory, just as indicated in the diagram above. This is not a typical arrangement for multiprocessor Opteron systems, and VIA says the K8T800 works fine with multiple processors using multiple memory controllers. The MSI 9130’s performance isn’t bad, as you’ll see soon, but it has half the memory bandwidth of optimal dual-Opteron configurations, and the system’s second processor must always resort to non-local memory access. What’s more, the 9130’s multiprocessor config offers less redundancy. If CPU 0 fails, CPU 1 cannot access memory, and the system croaks. And last but not least, you’ll need to use 2GB DIMMs if you want to reach the 9130’s max of 8GB RAM, because it has only 4 DIMM slots.

The 9130 does have another leg up over its nForce3 Pro competition, though, thanks to VIA’s Hyper8 technology. Hyper8 is a complete implementation of the fastest link afforded by the HyperTransport spec. On the K8T800, the HyperTransport connection between the north bridge and the primary processor is 16 bits wide in each direction and runs at 800MHz, yielding 6.4GB/s of bandwidth. NVIDIA’s solution, by contrast, has only 3.6GB/s of peak bandwidth between the CPU and the nForce3 Pro chip. To highlight this difference, VIA has released a HyperTransport analysis tool, and as you can see in the screenshots, it indicates the nForce3 Pro has an 8-bit upstream link and a 16-bit downstream link, both of which run at 600MHz.
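If you want to check those bandwidth numbers yourself, the math is simple enough to sketch. The snippet below assumes HyperTransport moves data on both edges of the clock (two transfers per cycle) and adds the upstream and downstream links together to get the aggregate figure. The function name is ours, not anything from VIA’s tool.

```python
def ht_link_gb_per_s(width_bits, clock_mhz):
    """Peak one-way bandwidth of a HyperTransport link in GB/s.

    Assumes two transfers per clock cycle (double data rate).
    """
    transfers_per_second = clock_mhz * 1_000_000 * 2
    bytes_per_transfer = width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# K8T800: 16 bits in each direction at 800MHz.
k8t800_total = ht_link_gb_per_s(16, 800) + ht_link_gb_per_s(16, 800)

# nForce3 Pro: 16 bits downstream and 8 bits upstream, both at 600MHz.
nforce3_total = ht_link_gb_per_s(16, 600) + ht_link_gb_per_s(8, 600)

print(f"K8T800:      {k8t800_total:.1f} GB/s")
print(f"nForce3 Pro: {nforce3_total:.1f} GB/s")
```

Under those assumptions, the totals line up with the 6.4GB/s and 3.6GB/s figures quoted for the two chipsets.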


HT analyzer on the K8T800


HT analyzer on the nForce3 Pro

With memory already local to the processor, a faster HyperTransport link should primarily affect AGP performance, along with other forms of direct-memory-access I/O.

Tyan’s Tiger i7505: Intel E7505 chipset with dually Xeons
Tyan’s Tiger i7505 mobo is based on Intel’s E7505 chipset, a.k.a. Granite Bay, which is the direct precursor to the Canterwood chipset now at the top of Intel’s Pentium 4 lineup. The E7505 north bridge is very much like Canterwood, only it runs at lower frequencies, with a 533MHz bus and dual channels of DDR266 memory. The E7505, of course, also supports multiprocessor configurations.


The Tiger i7505 is loaded

Xeon systems are more conventional than Opterons in terms of chipset layout, with the memory controller residing on the north bridge chip, as ever. In the case of the E7505, the front-side bus and dual channels of DDR266 memory are matched at 4.3GB/s each. The Xeons share memory and bus bandwidth between themselves.
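Here’s a rough sketch of why the bus and memory numbers match. It assumes a 133MHz base clock, a 64-bit (8-byte) front-side bus that is quad-pumped, and 64-bit DDR channels that transfer twice per clock. Those widths are our assumptions for the arithmetic, and the helper name is made up for illustration.

```python
BASE_CLOCK_MHZ = 133  # nominal base clock shared by the bus and the memory


def peak_gb_per_s(base_mhz, transfers_per_clock, width_bytes=8, channels=1):
    """Peak bandwidth in GB/s for a bus with the given pumping, width, and channel count."""
    transfers_per_second = base_mhz * 1_000_000 * transfers_per_clock
    return transfers_per_second * width_bytes * channels / 1e9


# Front-side bus: quad-pumped, one 64-bit path shared by both Xeons.
fsb = peak_gb_per_s(BASE_CLOCK_MHZ, transfers_per_clock=4)

# Memory: two channels of DDR266, each double-pumped and 64 bits wide.
memory = peak_gb_per_s(BASE_CLOCK_MHZ, transfers_per_clock=2, channels=2)

print(f"Front-side bus: {fsb:.1f} GB/s")
print(f"Dual DDR266:    {memory:.1f} GB/s")
```

Both work out to roughly 4.3GB/s, which is the sense in which the two are matched.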

The E7505 north bridge can connect directly to an Intel PCI-X chip, and a second PCI-X chip can hang off of that one, if needed. That’s a decent arrangement, especially since the connection between the north and south bridge chips on the E7505 is only 266MB/s. Tyan chose to include only 32-bit, 33MHz PCI slots on the i7505, though.

Nevertheless, the Tiger i7505 itself bristles with ports, slots, and sockets. Tyan’s characteristically conservative approach to motherboard design feels right at home in the workstation market, where overclocking and flamboyance aren’t exactly orders of the day. Unlike the Opteron boards, the Tiger i7505 doesn’t require—and indeed won’t operate with—registered DIMMs. The board does support both ECC and non-ECC memory types, though. As with all these dual-channel designs, DIMMs must be installed in pairs for optimal performance.

The i7505 includes both an AGP Pro slot and Gigabit Ethernet, the latter courtesy of an Intel PCI Ethernet chip. Like our other two contestants, the i7505 uses a Realtek codec chip to translate AC’97 audio from the south bridge. The older ICH4 south bridge chip on the Tiger only supports ATA/100, and no RAID. To augment the board’s disk I/O capabilities, Tyan has chosen a Promise Serial ATA RAID controller.

Despite its lack of 64-bitness, the dual Xeon i7505 is one sweet workstation-class system. With Hyper-Threading enabled, the dual Xeons show up as four logical processors in utilities like Windows Task Manager, causing a little twinge of excitement in the heart of any true geek.

About the tests
We’ll be testing the three workstation platforms we’ve just described against a pair of familiar foes: the Pentium 4 3.2GHz and Athlon XP 3200+. Now, before you market segmentation purists get your briefs bunched up, let me explain why. First, there’s the issue of price. A pair of Xeon 2.66GHz chips or Opteron 240s will set you back a little more than a single Athlon XP 3200+, or a little less than a Pentium 4 3.2GHz. The comparison seems only fair, in that regard. Then there’s the fact that at least one of these two platforms, the Pentium 4 with 875P chipset, is actually considered a workstation-class product by its manufacturer. (The P4 with an 865 chipset is Intel’s mainstream desktop combo.) We would, thus, be remiss not to include it.

The next facet of our tests to cause potential bunching effects is our choice of graphics card, which is the decidedly non-workstation-class GeForce FX 5900 Ultra. Never mind that there’s a Quadro FX based on the same chip to match it; the card we’ve used for testing doesn’t fit the workstation mold, and we own up to it. Truth is, we blew our budget on the workstation-class processors, because they are the focus of our attention today. The better-tuned and certified OpenGL drivers that come with a Quadro would only have made a difference in a couple of our benchmarks, anyhow: SPECviewperf and the OpenGL portions of Cinebench. Otherwise, most of the tests we employed are platform-bound.

Finally, we threw in a few gaming tests at the end of our benchmarks, just for fun. I’m sure product managers everywhere will be aghast. We plead guilty to whatever labels you want to throw on us for this torrid act of sedition—even the “enthusiast” tag. We are bad, bad men. We don’t know how we sleep at night.

Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.

Our test systems were configured like so:

System | Athlon XP | K8T800 Opteron | nForce3 Opteron | Xeon | Pentium 4
Processor | Athlon XP ‘Barton’ 3200+ 2.2GHz | AMD Opteron 240 1.4GHz; 2 x AMD Opteron 240 1.4GHz | AMD Opteron 240 1.4GHz; 2 x AMD Opteron 240 1.4GHz | 2 x Xeon 2.66GHz | Pentium 4 3.2GHz
Front-side bus | 400MHz (200MHz DDR) | HT 16-bit/800MHz downstream; HT 16-bit/800MHz upstream | HT 16-bit/600MHz downstream; HT 8-bit/600MHz upstream | 533MHz (133MHz quad-pumped) | 800MHz (200MHz quad-pumped)
Motherboard | Asus A7N8X Deluxe v2.0 | MSI 9130 | Asus SK8N | Tyan Tiger i7505 | Abit IC7-G
North bridge | nForce2 SPP | K8T800 | nForce3 Pro | E7505 MCH | 82875P MCH
South bridge | nForce2 MCP-T | VT8237 | - | 82801DB ICH4 | 82801ER ICH5R
Chipset drivers | nForce Unified 2.45 | 4-in-1 v.4.49; AGP 4.42 | AGP 3.34; ATA 3.44; Audio 5.10.0.5100 | INF Update 5.0.2; IAA 2.3.0.2160; Audio 5.10.0.5250 | INF Update 5.0.1015; ATA 5.0.1007.0; Audio 5.10.0.5250
BIOS revision | 1005 | 1.0 | 1002 | 1.01 | 1.6
Memory size | 1GB (2 DIMMs) | 1GB (2 DIMMs) | 1GB (2 DIMMs) | 1GB (4 DIMMs) | 1GB (2 DIMMs)
Memory type | Corsair TwinX XMS4000 DDR SDRAM at 400MHz | Infineon PC2700 registered ECC DDR SDRAM at 333MHz | Infineon PC2700 registered ECC DDR SDRAM at 333MHz | Corsair TwinX XMS3200LL DDR SDRAM at 266MHz | Corsair TwinX XMS4000 DDR SDRAM at 400MHz
Hard drive | Seagate Barracuda V 120GB ATA/100 | Seagate Barracuda V 120GB SATA 150 | Seagate Barracuda V 120GB ATA/100 | Seagate Barracuda V 120GB ATA/100 | Seagate Barracuda V 120GB SATA 150
Audio | nForce2 MCP/ALC650 | Creative SoundBlaster Live! | nForce3 Pro/ALC650 | ICH4/ALC650 | ICH5/ALC650
Graphics | GeForce FX 5900 Ultra (all systems)
OS | Microsoft Windows XP Professional (all systems)
OS updates | Service Pack 1, DirectX 9.0b (all systems)

All tests on the Pentium 4 and Xeon systems were run with Hyper-Threading enabled.

One note about sound. The MSI 9130’s audio ports are located on a PCI slot plate that connects to the 9130 via a header. We first received the 9130 sans manual or audio connectors, so we tested it with a separate sound card. We used built-in audio solutions on the other platforms.

Thanks to Corsair for providing us with memory for our testing. If you’re looking to tweak out your system to the max and maybe overclock it a little, Corsair’s RAM is definitely worth considering.

The test systems’ Windows desktops were set at 1152×864 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.

We used the following versions of our test applications:

All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

Benchmark results

Memory performance
Our synthetic memory tests should help show us how the Opteron benefits from its on-chip memory controller, among other things.

On the bandwidth front, the Opterons easily outrun the Athlon XP, although the Pentium 4 is still king. The dual Xeons, with their dual DDR266 memory config, don’t fare so well in Sandra. Cachemem’s bandwidth test, however, generally seems a little more indicative of real-world performance, and the Xeons do well there. Notice how the dual Opteron K8T800 is slower than the same board with only one processor. That’s because the second CPU is saddled with non-local memory access.

Also note how the K8T800 pulls slightly ahead of the nForce3 Pro in these memory bandwidth tests. That’s an unexpected outcome, because the two systems share the same memory controller, embedded in the Opteron processor. VIA and MSI have pulled off a minor coup in beating out the nForce3 Pro here.

Linpack shows us the impact of the Opteron’s large 1MB L2 cache. Unfortunately, at only 1.4GHz, the Opteron 240 can’t quite keep up with the Athlon XP or the Xeons until their caches are exhausted. The Pentium 4 3.2GHz with dual DDR400 is a monster in Linpack, beating out the Opteron on larger matrix sizes, even with a smaller 512K L2 cache.

With its integrated memory controller, the Opteron shows us some very low latencies for memory access. Only the Pentium 4 is able to keep up by virtue of its very high internal clock frequencies, fast bus, and PAT-enabled north bridge chip. The dual Opteron system suffers the effects of non-local memory access rather acutely here, as one might expect.

3ds max rendering
We begin our 3D rendering tests with Discreet’s 3ds max, one of the best known 3D animation tools around. 3ds max is both multithreaded and optimized for SSE2, making it a perfect playground for our Xeons and Opterons. We rendered a couple of different scenes at 1024×465 resolution, including the Island scene shown below. Our testing techniques were very similar to those described in this article by Greg Hess. In all cases, the “Enable SSE” box was checked in the application’s render dialog.

The single-processor Pentium 4 system ties the Xeon on the simpler Earth-Apollo.max scene, but the duallies pull ahead in the more complex Island scene. Intriguingly, the Opteron’s enhancements over the Athlon XP aren’t enough to make up for the 800MHz difference between the 1.4GHz Opteron 240 and the 2.2GHz Athlon XP 3200+.

Lightwave rendering
NewTek’s Lightwave is another popular 3D animation package that includes support for multiple processors and is heavily optimized for SSE2. Lightwave can render very complex scenes with realism, as you can see from the sample scene, “A5 Concept,” below.

Where 3ds max is self-tuning, though, Lightwave is not. Users may choose the number of rendering threads to execute in Lightwave. We tried a number of different configurations and came up with the following. The single-processor Opteron and Athlon systems always performed best with a single thread. The multi-processor and Hyper-Threaded systems were somewhat trickier, so we’ve reported scores with two, four, or eight threads, depending on the situation. In all cases, we’ve reported the fastest score.
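The selection rule itself is easy to sketch: time the same render at each thread count, then keep the quickest. The function name and the timings in the usage lines are hypothetical placeholders, not our measured results.

```python
def fastest_config(render_times):
    """Given {thread_count: seconds}, return (thread_count, seconds) for the quickest run."""
    return min(render_times.items(), key=lambda item: item[1])


# Hypothetical render times in seconds, keyed by thread count.
example_times = {1: 100.0, 2: 60.0, 4: 55.0, 8: 58.0}

threads, seconds = fastest_config(example_times)
print(f"Report the {threads}-thread result: {seconds:.1f} seconds")
```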

The Xeons perform exceptionally well here, but the dual Opterons are right on their heels. The Opteron’s ability to execute SSE2 code allows our single-Opteron systems to outrun the Athlon XP 3200+, which is the only processor in our round-up without SSE2 capability.

POV-Ray rendering
POV-Ray is the granddaddy of PC ray-tracing renderers, and it’s not multithreaded in the least. Don’t ask me why—seems crazy to me. POV-Ray also relies more heavily on x87 FPU instructions to do its work, because it contains only minor SIMD optimizations.

We’ve tested with a pair of scenes: the old “chess2.pov” scene we’ve been using forever, and the POV-Ray 3.5 benchmark scene, which apparently tests some of the SIMD-optimized codepaths in this new POV-Ray release.

Clock speed and FPU prowess are the keys here, and our Athlon XP and P4 systems have the advantage.

Cinebench 2003 rendering and shading
Cinebench is based on Maxon’s Cinema 4D modeling, rendering, and animation app. This revision of Cinebench measures performance in a number of ways, including 3D rendering, software shading, and OpenGL shading with and without hardware acceleration.

Cinema 4D’s renderer is multithreaded, so it takes advantage of Hyper-Threading and SMP. For the Athlon XP and the single-CPU Opteron systems, I’ve reported the single-processor results. For the rest of the systems, I’ve reported the multi-threaded results, which in all cases were notably faster.

In the CPU-based Cinema 4D renderer, the Xeons simply rule. This test has always responded well to Hyper-Threading, and doing it on two different processors at once results in a big performance win.

The OpenGL-based shading tests are captured by the fast single-CPU P4 and Athlon systems.

SPECviewperf workstation graphics
SPECviewperf simulates the graphics loads generated by various professional design, modeling, and engineering applications. It’s an interesting test for workstation-class systems, although, as we’ve noted, we don’t really have a workstation-class graphics card (or at least driver) in our test systems.

viewperf doesn’t really benefit from a second CPU, which is obvious from looking at the test results. Thus, the P4 and Athlon XP systems trade back and forth in the top two slots, for the most part. One interesting note: in four of the six tests, the K8T800 outruns the nForce3 Pro, sometimes by a fair margin. We may be seeing VIA’s Hyper8 advantage in action.

ScienceMark
I’d like to thank Alex Goodrich for his help working through a few bugs in the 2.0 beta version of ScienceMark. Thanks to his diligent work, I was able to complete testing with this impressive new benchmark, which is optimized for SSE, SSE2, and 3DNow! and is multithreaded, as well.

In the interest of full disclosure, I should mention that Tim Wilkens, one of the originators of ScienceMark, now works at AMD. However, Tim has sought to keep ScienceMark independent by diversifying the development team and by publishing much of the source code for the benchmarks at the ScienceMark website. We are sufficiently satisfied with his efforts, and impressed with the enhancements to the 2.0 beta revision of the application, to continue using ScienceMark in our testing.

The molecular dynamics simulation models “the thermodynamic behaviour of materials using their forces, velocities, and positions”, according to the ScienceMark documentation. Sounds simple enough, right?

Primordia “calculates the Quantum Mechanical Hartree-Fock Orbitals for each electron in any element of the periodic table.” In our case, we used the default element, Argon.

The next test measures performance in AES encryption.

In the three tests above, the Athlon XP and Pentium 4 are consistently the best performers, even though ScienceMark obviously benefits from SMP systems. The dual Xeon and dual Opteron systems trade off in the third-place slot.

The Blas tests below measure matrix multiplication performance, much like Linpack. However, these tests are optimized in various ways, and we can see how the different codepaths perform. The SGEMM test measures single-precision floating-point math, where 3DNow! and SSE are able to help, while DGEMM is double-precision, so only SSE2 can accelerate these calculations.

The Pentium 4 and Xeon systems come out looking very good in these tests. The most interesting thing to note, however, is the relative performance on different codepaths. In SGEMM, the Athlon XP is faster in 3DNow!, while the Opteron is faster with SSE. On the DGEMM test, the Pentium 4 screams with SSE2 and packed data, but the Athlon XP nearly keeps up using only its FPU. The Opteron, meanwhile, performs just about the same with SSE2 packed, SSE2 scalar, and x87 assembly—oddly balanced across the board.

LAME MP3 encoding
We used LAME 3.92 to encode a 101MB 16-bit, 44KHz audio file into a very high-quality MP3. The exact command-line options we used were:

lame --alt-preset extreme file.wav file.mp3

Unfortunately, LAME isn’t multithreaded.
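If you’d like to time an encode like this yourself, a minimal wrapper might look like the sketch below. It assumes a `lame` binary is on your path and that file.wav exists. The helper names are ours, and this is not the harness we used for the numbers in this article.

```python
import subprocess
import time


def lame_command(wav_path, mp3_path):
    """Build the argument list for a LAME encode with the extreme preset."""
    return ["lame", "--alt-preset", "extreme", wav_path, mp3_path]


def time_encode(wav_path, mp3_path):
    """Run the encode once and return the elapsed wall-clock time in seconds.

    Assumes the lame executable can be found on the PATH.
    """
    start = time.perf_counter()
    subprocess.run(lame_command(wav_path, mp3_path), check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Show the command that would be run; call time_encode() to actually run it.
    print(" ".join(lame_command("file.wav", "file.mp3")))
```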

As we’ve seen all along, our 1.4GHz Opterons just can’t keep up with the pack. The Xeons smoke ’em.

DivX video encoding
We tested DivX encoding using a new and different movie clip than the one we’ve used in past tests, so don’t let these scores throw you. Xmpeg is partially self-tuning, and we noted that it chose the SSE2 Optimized iDCT on the Opteron processors. Xmpeg and DivX are also multithreaded, so a second processor should speed things up.

These are remarkable results from the Opteron. The dual Opteron 240 system pulls out a rare victory over the dual Xeon 2.66GHz system, and more impressively, the single Opteron 240 systems nearly catch up with the Athlon XP 3200+. The Opteron’s integrated memory controller and SSE2 support give it way more clock-for-clock oomph than the Athlon XP, which was itself no slouch.

Sphinx speech recognition
Ricky Houghton first brought us the Sphinx benchmark through his association with speech recognition efforts at Carnegie Mellon University. Sphinx is a high-quality speech recognition routine that needs the latest computer hardware to run at speeds close to real-time processing. We use two different versions, built with two different compilers, in an attempt to ensure we’re getting the best possible performance.

There are two goals with Sphinx. The first is to run it faster than real time, so real-time speech recognition is possible. The second, more ambitious goal is to run it at about 0.8 times real time, where additional CPU overhead is available for other sorts of processing, enabling Sphinx-driven real-time applications.
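Those two goals boil down to a simple ratio of processing time to audio length. The sketch below shows the arithmetic; the function names, the threshold labels, and the sample numbers are ours for illustration and aren’t part of the Sphinx benchmark itself.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Return processing time as a multiple of the audio's length (lower is faster)."""
    return processing_seconds / audio_seconds


def classify(rtf):
    """Label a real-time factor against the two goals described above."""
    if rtf <= 0.8:
        return "headroom for other processing"
    if rtf < 1.0:
        return "faster than real time"
    return "slower than real time"


# Hypothetical example: 8 seconds of CPU work to recognize a 10-second clip.
rtf = real_time_factor(8.0, 10.0)
print(f"{rtf:.2f} times real time: {classify(rtf)}")
```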

The Pentium 4 just throttles the competition here, taking the top spot by a wide margin. With a fast P4 system, Sphinx-driven speech recognition apps could well be a reality. I half expected the Opterons to set new records in Sphinx, but I suspect we’ll need DDR400 memory and higher clock speeds from the Opteron in order for it to match the Pentium 4.

PICCOLOR image analysis
We thank Dr. Reinert Muller with the FIBUS Institute for pointing us toward his PICCOLOR benchmark. This image analysis and processing tool is partially multithreaded, and it shows us the results of a number of simple image manipulation calculations. The overall score is indexed to a Pentium III 1GHz system based on a VIA Apollo Pro 133. In other words, the reference system would score a 1.0 overall.
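
For anyone unfamiliar with indexed scores, the idea can be sketched in a few lines of Python. This assumes each result is a time where lower is better; we don't know exactly how PICCOLOR combines its individual tests into the overall number, so the sketch covers a single result only. The function name and the timings are hypothetical.

```python
def indexed_score(system_time, reference_time):
    """Index a timing result to a reference system.

    Assumes lower times are better, so a system that finishes in half the
    reference time scores 2.0 and the reference itself scores 1.0.
    """
    if system_time <= 0 or reference_time <= 0:
        raise ValueError("times must be positive")
    return reference_time / system_time


# Hypothetical timings in seconds, not PICCOLOR output.
reference = 12.0
print(indexed_score(reference, reference))  # the reference system: 1.0
print(indexed_score(4.0, reference))        # three times as fast: 3.0
```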

Remarkably, the wildly different Opteron 240 and Xeon 2.66GHz systems tie exactly in terms of overall score. Here are the results of the individual tests…

The K8T800 system stumbles on the two video tests. I’m unsure what the problem was, but the low score was consistent across all of our benchmark runs. We’ll have to investigate further. Otherwise, the individual scores are interesting. Some tests, like Float Interpolation, are clearly multithreaded. Others, like ArrayIndex, are not.

Just for fun: gaming performance

Quake III Arena
We were able to test at least one game, Ye Olde Quake III Arena, with multithreading. The “r_smp 1” console command invokes Q3’s multithreaded mode, which we tested alongside its regular mode.

Whoa. The dual Opteron K8T800 system darn near knocks off the Pentium 4 3.2GHz in Q3A—not exactly what I’d expected to see. Even with only one CPU, the Opteron 240/K8T800 system performs very well, nearly matching the Athlon XP 3200+. The nForce3 Pro is a fair bit slower than the K8T800 here, which suggests VIA’s faster HyperTransport link really can make a difference. I’m going to throw the rest of the gaming results at you without much comment. These are workstation systems, after all.

3DMark03

Comanche 4

Serious Sam SE

Unreal Tournament 2003

Wolfenstein: Enemy Territory

Ok, I have to make some comment. The Xeon and Opteron systems are, at least with the GeForce FX 5900 card we’re using, very competent gaming machines. The Opterons and Xeons trade places enough that I can’t say either one is particularly better than the other, but they’re both pretty decent. The K8T800 earns special distinction for outrunning the nForce3 Pro pretty darn consistently in gaming and graphics benchmarks.

Conclusions
Workstations are typically very specialized machines, so you’ll have to decide for yourself which of these platforms makes the most sense for your particular needs. If you are faced with the choice of a top-of-the-line single-processor PC or a lower-end workstation-class system, you can see from our tests that the choice may be quite a dilemma. Clearly, in most tasks, the faster single-processor system outperforms a slower dual-processor workstation. However, for tasks that are easily parallelizable, like 3D rendering, multiprocessor systems can be worth investigating. Also, most workstations aren’t bought on the tightest of budgets, from what I gather, and dual-processor rigs promise the highest outright performance when price isn’t an object. Driver certifications and the like for your particular application may play into the decision, too. If you’re choosing between Xeons and Opterons, the choice is also tough. Unfortunately, our dual Opteron test motherboard didn’t have DIMM slots for the second processor’s memory controller, so the Opteron may have been shortchanged a little bit. Then again, the version of Windows we used for testing wasn’t NUMA-aware, anyhow.
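
The fast-single-versus-slower-dual dilemma described above can be roughed out with Amdahl's law. That's our own back-of-the-envelope addition here, not something we measured. In this minimal Python sketch, the function names and the relative per-CPU speeds are hypothetical, and it ignores Hyper-Threading, memory bandwidth, and everything else that shapes real benchmark results.

```python
def multi_cpu_speed(per_cpu_speed, parallel_fraction, cpus=2):
    """Effective speed of a multiprocessor box under Amdahl's law.

    per_cpu_speed is relative to some baseline CPU (baseline = 1.0), and
    parallel_fraction is the share of the work that can be spread across CPUs.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return per_cpu_speed / (serial_fraction + parallel_fraction / cpus)


def break_even_fraction(single_speed, per_cpu_speed, cpus=2):
    """Parallel fraction at which the multi-CPU box ties the single CPU.

    A result above 1.0 means the multi-CPU box can never catch up; a result
    at or below 0.0 means it is never slower.
    """
    if cpus < 2:
        raise ValueError("cpus must be at least 2")
    ratio = per_cpu_speed / single_speed
    return (1.0 - ratio) / (1.0 - 1.0 / cpus)


# Hypothetical: each workstation CPU is 70% as fast as the single desktop CPU.
print(round(break_even_fraction(1.0, 0.7), 2))   # share of work that must be parallel
print(round(multi_cpu_speed(0.7, 0.9), 2))       # a highly parallel job, such as rendering
print(round(multi_cpu_speed(0.7, 0.2), 2))       # a mostly serial job
```

With those made-up speeds, the dual box only pulls ahead once well over half the work runs in parallel, which squares with the rendering results versus the single-threaded tests.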

Clearly, the Xeons at 2.66GHz outran the dual Opteron 240 setup in most of our tests. With its faux-quad-processors via Hyper-Threading, our dual Xeon Tiger i7505 system ripped through multithreaded tests, absolutely devastating the competition in Cinema 4D rendering. As new Xeon DP chips with 1MB cache, higher core clock speeds, and 800MHz front-side bus speeds with dual-channel DDR400 memory become available, the Xeon should become even more formidable. Obviously, Intel is not taking the Opteron threat lightly.

However, the Opteron itself showed quite a bit of promise, tying or beating its older sibling, the Athlon XP 3200+, in several tests despite a cavernous 800MHz clock-frequency gap. With 64-bit applications, a 64-bit OS, and a proper memory configuration, the Opteron could be a deadly fast workstation config. The addition of an integrated memory controller and SSE2 puts the Opteron in good position to take on a charging pair of Xeons without getting gored. Most important for the Opteron, though, is getting the clock speed up. The new 2GHz Opteron chips should be much more potent competitors, even against Intel’s 3.2GHz Netburst-based processors. (I’m not kidding—keep an eye on TR for more soon.)

Among the Opteron chipsets, the K8T800 looks like the better workstation platform right now. The nForce3 Pro doesn’t have any great flaws, and its single-chip approach is elegant, but the K8T800 offers more. The K8T800 supports multiple CPUs and Serial ATA, and its faster HyperTransport link between CPU and chipset brings better performance whenever AGP is invoked. I’m not too enamored with the MSI 9130’s funky memory config, but its single-processor performance is impressive.

Comments closed
    • Anonymous
    • 16 years ago

    Well, anyway……there’s time to make a good up-to-date benchmark of opteron vs. xeon, and with Good Motherboards using the proper chipsets (intel’s for xeon and amd’s for opteron) and top-of-the-line CPUs (Opteron 246 and Xeon 3.06 at this moment for dual)

    Without such a kind of benchmark……you can’t compare in equal conditions those platforms.
    The barebones systems are not often the best platform and configurations.

    I hope these guys do it soon, because many people are interested on this.

    There’s also the problem that many benchmarks don’t reflect a good or useful results because they don’t support SMP.

    So proper hardware and proper benchmarks are needed (using 3d software which support SMP like in this report is a good way indeed)

    • Anonymous
    • 16 years ago

    I think you also should have used the Tyan K8W dual Opteron mainboard with DDR banks for both processors, since I guess it is better than MSI’s one; it has AMD’s 3 chipsets for Opteron, which would probably produce different results in benchmarking.

    I see it is very hard to get an up-to-date and very reliable benchmark which helps to decide more clearly what workstation one may want to build.

    Please, if possible, make another report on this topic. It’s very useful and please include the latest technology and the best configuration possible for a workstation unit.

    Thanks a lot.

    • Anonymous
    • 16 years ago

    I recognize this is a hard and valuable work (doing benchmarks). But… why did you use xeon 2.x Ghz and Opteron 240 CPUs ?

    Would have been much better to use the last models available, like Xeon 3.06 and Opteron 246.

    it’s a pity you didn’t use them..
    The market moves fast, and these benchmarks are not up-to-date 🙁

    Thanks a lot for your effort.

      • Anonymous
      • 16 years ago

      🙄 You really think that these guys here have gobs of money, don’t ya? 🙄

        • Anonymous
        • 16 years ago

        I don’t know……..but they could borrow the CPUs from the companies for benchmarking, or something like that. I don’t think it would be a problem for these guys.

        Hope they do another benchmark as they wrote.

          • Anonymous
          • 16 years ago

          Did you even read the comments?? Damage himself said that the only ones to be blamed here were the companies… As he pointed out, Intel and AMD don’t want “enthusiast” sites to compare their workstation chips with desktop chips. So, in few words… would they lend those chips? hell no!

            • Anonymous
            • 16 years ago

            Well, anyway……there’s time to make a good up-to-date benchmark of opteron vs. xeon, and with Good motherboards using the proper chipsets (intel’s chipsets for xeon and amd’s for opteron).

            Without such a kind of benchmark……you can’t compare in equal conditions those platforms.

            I hope these guys do it soon, because many people are interested on this.

    • Anonymous
    • 16 years ago

    Fuck ya you guys rock with the indepth report with a sense of humor too. Now I really have a clear idea of what to get … NOT! Keep up the great work.

    • BlueDjinn
    • 16 years ago

    For the record:

    http://www.pcmag.com/article2/0,4149,1274138,00.asp

    PC Magazine has posted a complete review of the dual 2GHz Power Mac G5, giving it a four out of a possible five rating. "When Apple's Steve Jobs introduced the Apple Power Mac G5 this summer as the fastest personal computer any company had built to date, we took it with a grain of salt... After testing a loaded ($4,349 direct, after we opted for more RAM and upgraded graphics) dual 2.0GHz Power Mac G5 on a range of high-end content creation applications and comparing the results with a similarly configured (and priced) Dell Precision 650 Workstation running dual 3.06-GHz Xeon processors, we see that indeed the G5 is generally as fast as the best Intel-based workstations currently available"

    • Anonymous
    • 16 years ago

    WHY WHY WHY are there never any Dually Athlon setups benchmarked in these numerous ‘workstation comparisons’ on AMD fansites?!! WHY?! We have Xeon duallies, we have Opteron duallies, why not Athlon duallies?

    Like, for example, http://www.overclockers.com/tips1097/ using XP 2500 Bartons or XP 2600 T-Breds oc'd to 2.2 GHz 150 FSB.

    I agree G5s could have been included, but for price/performance, can't beat Athlon duallies.....

    • Ardrid
    • 16 years ago

    To those of you who want a look at how the 2.0GHz Opteron/Athlon 64 performs in comparison to a P4 3.0GHz, an Athlon XP 3200+ and a Dual Xeon 3.06 w/1MB cache in gaming/workstation/general content benchmarks, I direct you to the following link:

    http://www.anandtech.com/cpu/showdoc.html?i=1856&p=1

    As usual, Anand, or in this case Wesley Fink, does an excellent job. Sept 23rd is only 8 days away :D

    • Anonymous
    • 16 years ago

    21, I don’t like the thing you ‘said’/’inferred’ about Scott. It’s a lie.

    • Anonymous
    • 16 years ago

    #31, you wrote a courteous post, both times. #32, take a hike.

    • redpriest
    • 16 years ago

    You know, I just don’t understand all the fuss about the whole memory only being local to one processor. The memory latency to the processor that doesn’t have the memory local to it is *still* 100 ns, and that’s pretty frigging fast when you compare it to other smp systems. Granted, having memory local to both processors would be nice but you’d really only reap the benefits under a NUMA OS.

    • Anonymous
    • 16 years ago

    /me loves damage =)

    • IntelMole
    • 16 years ago

    Am I the only one here that thinks this was actually a good article?

    I read, and more importantly, understood the article, its limitations (wrt graphics card, processor speeds etc.), and accepted them as perfectly valid compromises.

    Am I right in saying, however, that a NUMA-aware OS would make little difference in this particular instance? Follow what I’m thinking here: A NUMA-aware OS will put data in memory that can be accessed faster by the processor using it. If the memory configuration on the dual Opteron board is 4×0, the OS cannot put the data for CPU2 closer than on CPU1’s memory. NUMA-awareness simply cannot do anything in this particular situation…

    A 2×2 setup, however, would be a completely different story.

    Can’t wait for that 2GHz article Damage, really whet my appetite now.

    Oh, and did I mention? Good review,
    -Mole

      • just brew it!
      • 16 years ago

      With the exception of the totally off-base bit about CPU redundancy (which I already griped about in my previous post), I thought it was a good article too.

      • Anonymous
      • 16 years ago

      I don’t think you’re correct. A non-NUMA OS on an SMP system will attempt to evenly distribute its process load between the two processors on the theory that that is the best use of resources.

      A NUMA aware OS, on the other hand, ought to bias the processor with better access to memory somewhat, and that bias ought to result in higher performance (depending on the task, of course).

        • IntelMole
        • 16 years ago

        With only two processors, the bias usually won’t actually come to anything with any SMP-aware applications. At least, I can’t see it having much effect. It’s still gonna spill a second thread over to CPU2, because CPU1 is busy.

        But with single-threaded apps, yeah, I could see that having an effect…
        -Mole

    • wagsbags
    • 16 years ago

    My only complaint is that there was no mention of how fast the opterons have gotten. Does anybody else get the impression that the opterons are really getting their asses kicked w/o a 64 bit os?

    • Anonymous
    • 16 years ago

    Not meaning to throw more oil on the fire, but this post should have been titled “Intel vs. AMD”, you left out the better workstation platforms like SGI and Sun. Hell, even Apple’s new G5 should’ve been in there, be it only for its 64-bit CPU.

      • Anonymous
      • 16 years ago

      When does Apple get 64 bit OS?

      • Sargent Duck
      • 16 years ago

      /quote “Not meaning to throw more oil on the fire, but this post should have been titled “Intel vs. AMD”, you left out the better workstation platforms like SGI and Sun. Hell, even Apple’s new G5 should’ve been in there, be it only for its 64-bit CPU” /end quote

      Did you not read Damage’s post a little bit ago. He said they bought everything with their own money. In case you didn’t get it, let me repeat. They took out their wallet, and with their own money, bought the parts. I would like to know how many people can buy a SGI and a Sun and an overpriced G5, along with the opterons and Xeons out of THEIR own pockets. I’m sure they would love to review a Sun/SGI/G5 if you were willing to supply them with it.

      BTW, congrats Damage. An excellent article. I seem to be one of the few *cough cough* that actually read the introduction and realized that the systems were being compared on a price basis. All the CPUs were within a few dollars of each other, which will help people who are looking to get an upper but not top of the line workstation.

    • cRock
    • 16 years ago

    On NUMA: The Opteron architecture presents main memory to the OS just as any other x86 system would – i.e. as one contiguous block. Hence, the memory is uniform as far as the OS knows (even if it isn’t physically uniform). NUMA never enters the picture. AMD has always claimed that the HyperTransport bus is so fast, that it doesn’t matter if main memory is split between procs. We’ll find out when you get that Tyan board…

    • Damage
    • 16 years ago

    Looks like we have some rabid gerbils here today. Let me try to answer some of your questions and address some of your concerns. Many of them are addressed in the text of the article, but I will try to say a little more where it’s needed.

    First, some of you AMD fanboys are up in arms about our choice of chips. Remember, please, that we purchased the Opterons and Xeons with our own money, and the Opteron 240s cost exactly the same amount, to the dollar, as the 2.66GHz Xeons, when we bought them. We did ask both AMD and Intel if they would supply us with processors for testing, and they both refused to do so. (They don’t like enthusiasts comparing their workstation chips to desktop CPUs. Makes them twitch to think about it.) As a result, you’re seeing this comparison at price parity. If you AMD fanboys don’t like what you’re seeing, complain to your favorite company for its PR decisions.

    And these are the latest CPUs. They are not “old” by any standard, even if they aren’t the top speed bins.

    On the same front, we said it once and we’ll say again, the P4 and Athlon XP chips were in the same price range as dual Xeons and Opterons. That’s why we included them.

    Next, we had no intention of slighting the Opteron by using a motherboard with only one CPU connected to DIMM slots. In fact, I had no idea this board was set up in this way–why should it be??–until I got well into the review process. Heck, I’d purchased 2GB of registered DDR333 memory in order to have 4 DIMMs, or one for each memory channel. Turns out I didn’t need them, which was news to me. We will have to get our hands on that Tyan board.

    But, again, we were using WinXP Pro, which isn’t NUMA aware (look it up, kids), so I’m not sure a better memory config would have mattered in WinXP. We did play around with a quad-channel dual Opteron box (server–no AGP) with Windows Server 2003, which has a NUMA-aware kernel, but it appeared the kernel quanta were too long to get good scores from our low-level programs like cachemem or decent performance in other workstation-type apps. The reality is that we’ll have to wait for the 64-bit OS flavors to arrive (they are all in beta now) before we get optimal performance from Opterons. Our comparison simply reflected this reality.

    Adi expressed concern over what makes a system a workstation and whether these parts qualify for the tag, and he whinged a little about the FX 5900 card. Those issues were addressed in the article. Next time, we’ll make a graph of Adi’s reading comprehension scores. I will say one thing about why we didn’t use a Radeon: our access to 64-bit drivers. Stay tuned. 😉

    Finally, this is our first real foray into workstation-class PC testing. We’ve dabbled in 760MPXes and the like in the past, but we’ve never committed the resources to make something like this happen. We’d prefer constructive criticism (and not 200 copies of the same, obvious stuff) to fanboy invective, so we can improve our efforts next time around. Now that we have our foot in the door, we’d prefer you not slam it in there.

      • eitje
      • 16 years ago

      q[

        • adisor19
        • 16 years ago

        q[

          • eitje
          • 16 years ago

          oh yeah, Xeons don’t have individual banks of memory. 🙂

          i guess i’ve just gotten so used to thinking about server-class CPUs like the Opteron as having their own RAM banks that I’d forgotten some CPUs still don’t do that. 😉

      • droopy1592
      • 16 years ago

      Good Job Damage. I had figured they were price competitive. That’s AMDs choice. They claim price gives a product a certain “quality” status.

      • adisor19
      • 16 years ago

      Fair enough, can’t wait to see the graph 😛

      Still we can argue on and on about the fact that just because a board accepts a workstation grade cpu does not make it a workstation grade board..

      Also, what does 64bit drivers have to do with all this ?? If you tested with WinXP 32bit edition, then 32bit drivers were used… unless you’re planning to retest everything with the leaked 64 bit version of WinXP 😀 (you little naughty Damage you !!! 😉 ) hehe

      Adi

      • Anonymous
      • 16 years ago

      Damage: It would be very interesting to test the difference between SMP Opterons in single and dual memory access configurations, and under NUMA and non-NUMA aware operating systems.

      I think you are incorrect in asserting that Windows XP being non-NUMA aware is a non-factor in this sort of situation. NUMA, for those that don’t know, means Non Uniform Memory Access and describes system architectures in which access to memory in some locations does not have the same performance characteristics as memory in other locations. A NUMA aware operating system (Linux kernels 2.6 and perhaps recent 2.4 iirc are NUMA aware) will arrange things such that processes are “closer” to the physical memory they are using.

      I would expect a NUMA aware OS to perform better on the dual Opteron system reviewed than a non-NUMA OS, all other things being equal.

      As far as the rest of the review is concerned, having access to some performance information is better than having access to none, and everybody should try to understand the realities of access to equipment and the cost of that equipment that face small websites, especially as they try and take on higher-end hardware. 2cpu.com faces the same problems. It is simply unfortunate that a non-optimal Opteron config was used, and that the intro contains hyperbole to the extent that people don’t realize that this was a workstation review based on similar cost rather than of the top example of every configuration. I didn’t realize that until I read through the comments myself.

      • DrDillyBar
      • 16 years ago

      Wonderful article as always. Attention to detail. 🙂

    • mattsteg
    • 16 years ago

    From all the complaints I get the feeling people expect TR to have access to infinite money and/or hardware. That’s simply not the case.

    • Anonymous
    • 16 years ago

    I thought the review was rather entertaining. Even though I have a feeling this review was delayed. But it does give some insight on how opteron compares to the other chips that are out. Not too bad.

    Hey can you throw in a 4-way opteron next time? that would be really neat.

      • R2P2
      • 16 years ago

      q[

    • just brew it!
    • 16 years ago

    From the article:

    q[

      • Damage
      • 16 years ago

      Ack. You gotta be right about that.

    • Steel
    • 16 years ago

    To all the people complaining about the memory configuration of the dual Opteron: that’s how i[

      • eitje
      • 16 years ago

      well, you could swap 4 DIMM slots for one processor with 2 slots for each processor. with current DDR DIMM configs, you’d still be getting a max of 4 GB per proc, 8 GB total, with no latency for that second proc.

        • IntelMole
        • 16 years ago

        And you would route that… how exactly?

        Part of the reason they do this is that it’s simpler to design…

        read: cheaper 😀

        Eventually, competition should force all motherboard manufacturers to route the wires the sensible 2×2 way, but that’s a little off yet, with only the K8W having it IIRC,
        -Mole

          • eitje
          • 16 years ago

          if the K8W can do it, MSI could’ve done it as well. and you can only imagine, even if it was more costly, what sort of performance they’d be getting out of that board. 🙂

            • Steel
            • 16 years ago

            Yeah, but look how *[

            • HiggsBoson
            • 16 years ago

            Actually conspiracy theorists will like this rumor:

            Scuttlebutt is that Intel is pressuring motherboard manufacturers to design their layouts without separate DIMM slots for each processor as a way to /[

            • Steel
            • 16 years ago

            My guess is the DIMM sockets need to be near the CPU socket to keep the traces short. Mind you, this is only a guess 😉 .

            • just brew it!
            • 16 years ago

            So put 4 next to one CPU, and 4 next to the other one, like the Tyan board does. It’s not the end of the world if you lose a PCI slot in the process, most motherboards include almost everything you need on-board these days anyway. I seriously doubt that many people actually i[

            • Anonymous
            • 16 years ago

            Another rumor says that its because of video drivers not liking NUMA memory together with an AGP slot. So all boards that have a AGP slot if you research is limited to memory on one CPU.

            The K8W is touted as shipping in August, but I have not seen a site with it in stock.

            • Anonymous
            • 16 years ago

            This burning question about what performance hits are caused by non-local memory for CPU1 could be answered if there are any old comparisons between any server-type dual CPU board and any desktop-type ATX-sized board. This has nothing to do with Opterons, so we need not search only Opteron benchmarks. As you’ll notice, all ATX-sized duallies make the memory compromise, whether Opteron, Athlon, P4, Xeon or whatever. Is there such a comparison anyone has seen from older reviews, e.g. a server-class 12”x13” board for Athlons versus an ATX-sized board?

            This also brings up the important point that mobo manufacturers have been making these compromised boards for years now. If performance was being so badly hit, why would they bother since the market for duallies must’ve been pretty niche anyway? On the other hand, it could be that it didn’t make a difference so far, but once a NUMA-aware OS becomes standard, it will. In which case, the compromise boards should be history quite soon.

            The key question regarding the Thunder K8W is whether, for most practical graphics and digital-content applications, the extra local memory for CPU1 will outweigh MSI’s use of the (possibly) superior and 800Mhz FSB supporting Master2-FAR. It’s a trade-off between latency (on the MSI) and slower FSB and slightly older AMD chipset design (the Tyan). Maybe the Tyan really shines in server disciplines and huge databases etc. and not really when you apply a photoshop filter or render a Maya image.

            Anyway, the key would be to find an older review which might’ve compared similar boards with similar CPUs – one 12”x13” and one 12”x9.6”. Such a benchmark should clear up things.

    • Anonymous
    • 16 years ago

    well, this is my third post in a row, but the more I read this article the more worked up I get.

    First, Xeon has gone up to 3.06 GHz on the 533MHz bus with 1MB L2 cache (no 800 MHz bus for Xeon yet). The clock increase from 2.66 to 3.06 is 15%, and tack on a bit more performance gain for the increased L2 cache. So the performance of the 2.66 GHz Xeon is not that far from the top model.

    But the opteron has gone up from 1.4 GHz to 2.0 GHz, something like a 43% increase. The performance of a model 240 compared to the top model is a pretty big difference.

    And while the improvements on the Xeon have been noted by the author, he has failed to mention that the Opteron tested here is the lowest model, instead giving the impression that it IS the latest available.

    So to compare a 1.4 GHz Opteron to a 2.66 GHz Xeon is hardly “an all-out workstation platform brawl”, more like kicking the baby.

    Then the price comparison… well P4 Xeon has been on the market for a MUCH longer period of time than Opteron, it has had a much longer market saturation time and time for price to ramp down. The Opteron has only been out for a few months, that its price has come down to a comparable level with Xeon is already a wonder. However, if the article is meant to be price conscious, why is there a P4 3.2 GHz and an Athlon XP 3200+? And if it IS meant to be price conscious, then the opening of the article shouldn’t really claim it is about the latest workstation CPUs, since both sides have already significantly improved their line of workstation CPUs.

    So perhaps like I mentioned earlier, it is likely the case that this article was written a few months ago, and the author forgot to publish it at the time.

    However, if the reason for the choice is due to budget constraint (which the author alluded to a few times, and if fully understandable), he nevertheless should not make misleading claims that the article is about the latest CPUs.

    I sincerely would like the author to address these problems, because it really feels like a biased review.

      • Anonymous
      • 16 years ago

      sorry, the sentence should read:

      “(which the author alluded to a few times, and is fully understandable)”

      • Anonymous
      • 16 years ago

      Hmmm….you seem like such a smart fella. Maybe you should review some gear and then give a write-up of it. What’s that? Too busy? Well that’s a shame. You sure sound like you could do one 10 times better than Damage.

        • BabelHuber
        • 16 years ago

        But not smart enough to use the ‘Edit’ button 🙂

          • Anonymous
          • 16 years ago

          can’t edit anonymous posts.

        • Anonymous
        • 16 years ago

        I don’t claim to be able to do a better job, merely that some of the choices that the author made are not clear to me, and I just wish him to clear up the matter and explain his choices. The choices do not seem to make sense in light of many other reviews on the web, which if you have the intelligence you should be able to find and read.

        I am trying to be honest rather than condescending, however, since I don’t seem to be able to resist your baiting for a bit of flaming I guess I am not much better than you, but then again, I didn’t claim to be better than anyone.

          • Anonymous
          • 16 years ago

          I have found and read some. For example, I was the one who linked the review in post #18, so I guess I’m smarter than you thought.

          And if it wasn’t your intention to sound condescending, perhaps you could approach it from a better angle, because that was how your comments came across.

      • HiggsBoson
      • 16 years ago

      q[

    • R2P2
    • 16 years ago

    What’s with using 4 and 8 threads for the dual Opteron in Lightwave? If you only used 1 thread for the single, it’d make sense to show 2 threads for the dual, wouldn’t it? The article explains why extra threads were used with the Hyperthreaded CPUs, but that doesn’t explain the Opteron.

      • Damage
      • 16 years ago

      The Opterons were consistently faster with 4 and 8 threads than with 2 in Lightwave. I guess I should have put the scores in, but that’s the deal.

    • Anonymous
    • 16 years ago

    another thing:

    is the article’s author going to reply about the complaints or would he simply wave them off as fanboy rants?

      • Anonymous
      • 16 years ago

      10 to 1 what he’ll do is just delete these comments. Check back tomorrow to see if everything he just doesn’t like is censored.

        • HiggsBoson
        • 16 years ago

        Um, seriously if all you’re going to do is ad hom and troll why are you bothering to post? That’s just unnecessary.

    • Anonymous
    • 16 years ago

    “We’ve rounded up the latest workstation CPUs and motherboards for an all-out workstation platform brawl” – by Scott Wasson

    1.4 GHz Opteron is hardly the “latest workstation CPU”

    did the author write this article several months ago and forgot to post it then?

    • Anonymous
    • 16 years ago

    this review is really interesting – first dual sledge assessment I’ve seen using only one bank of memory. really should have changed the review material to focus on that, rather than a pitch for an all out dualie comparison.

    • Anonymous
    • 16 years ago

    Gah. With Opteron procs at 2.2, I expected more of TR; you know, an over-the-top monster CPU war. Well, in any case, some of the stuff is informative, but is much below your usual level.

      • Steel
      • 16 years ago

      What 2.2GHz Opteron? It currently tops out at 2GHz with the x46 models.

    • wesley96
    • 16 years ago

    Don’t you love all those AGs missing the whole point and thinking the comparison was about pitting P4 3.2GHz against Opteron 1.4GHz? 🙂 I say they jumped right to the graphs.

      • Steel
      • 16 years ago

      No kidding.

    • Anonymous
    • 16 years ago

    What a surprise! 3.2Ghz P4 beats 1.4Ghz Opteron!!! Thanks for the info… Next time, try something like P4 prescott 3.6 against thunderbird 700MHz… I’m longing to know who’ll win…

      • indeego
      • 16 years ago

      RTFA
      UTFA.

        • Anonymous
        • 16 years ago

        U? “Understand” ?

          • indeego
          • 16 years ago

          You got it! yay! read damage’s post below.

          I think the only thing they should be criticized for is not having a real workstation class graphics card in there, (which they recognize, kudos to them) since that seems the only thing left out, but I doubt it would change the benches all that much. Then again, if they went through the effort of buying this expensive stuff, at least get a low end workstation graphics card and try it out with that.

    • Anonymous
    • 16 years ago

    #7: Ouch! But probably too true. :\

    Otherwise, Wasson (or should I say Wusson?), you could have saved yourself a whole lot of typing and us a whole lot of reading by just listing the systems you tested and saying the P4 beat everything in almost every test.

    • Anonymous
    • 16 years ago

    "Unfortunately, our dual Opteron test motherboard didn't have DIMM slots for the second processor's memory controller, so the Opteron may have been shortchanged a little bit." Well, I've never seen TR stoop so low ... talk about a sadly biased report. As the 'writer' and the rest of us know, the whole point of the on-board memory controller is to significantly improve bandwidth, particularly in multi-CPU configurations. Why pick a ludicrously crippled MSI board? If that's all you could get your hands on, then why show it at all? "Shortchanged a little", my arse! This being the first reply I've ever posted during the lifetime of TR, I might as well tell it like it is: Hey Wasson! You're a freaking wus. That was by far the least impartial 'report' you've ever had on your site. Hell, you hardly even pointed out these were old 1.4GHz Opterons ... talk about shrouded misguidance. AMD are screwing up their marketing campaign quite well enough without your pathetic help.

      • Anonymous
      • 16 years ago

      (and #8): Fanboyism shows itself again.

      “old Opteron” – it’s newer than either the P4 or Xeon processors it’s up against, so is it “old”? When more boards and processors are around, you’ll get reviews to match.

      Get a grip. You can scale performance on your own. If you don’t want to learn anything, then don’t read the review, and don’t whine like spoiled children.

        • Anonymous
        • 16 years ago

        “You can scale performance on your own” -> Please consider looking at alternate reviews where various flavours of K8/Xeon/P4/AXP are compared. You’ll discover that scalability is _very_ different for these 4 processors.
        This means that testing a single model number of each of these processors does not tell the whole story (especially when you compare low-end Xeon/Opteron parts with high-grade AXP/P4 parts).
        Call it fanboy-ism if you like; I would simply say that while this review lets us know how K8/1.4, Xeon 2.6 and P4 3.2 compare, it is meaningless for understanding how the different platforms compare.

          • Anonymous
          • 16 years ago

          It’s information. Most people won’t be spending the current thousands of dollars for a new motherboard and pair of processors to get Opteron’s greatest possible performance.

          Every little tidbit of information helps, and considering prices, it gives you a better idea of what your money will accomplish on the different platforms.

            • wesley96
            • 16 years ago

            Indeed, the starting point of this evaluation was that the Xeon and the Opteron in question were in the same price range.

            • Dissonance
            • 16 years ago

            And don’t forget the fact that a pair of the tested Opterons/Xeons was just a little less than a single Pentium 4 3.2GHz, and a little more than an Athlon XP 3200+.

    • adisor19
    • 16 years ago

    These mobos are NOT workstation grade.. at best I’d qualify them as the cheap people’s choice for duallies or workstation-class CPU holders.. seriously, having no 64-bit PCI slots is the first sign that they are not workstation mobos.

    I would be really happy to see a review of the K8W from Tyan, where each Opteron has access to its own memory, where there is an AGP slot ready for action, and where there is a 64-bit PCI slot where I can plug in a FireWire800 adapter or a serious SATA RAID controller. I mean, you can’t expect much out of 133MB/s from the old PCI bus… you need 64-bit 66MHz.

    Also, what was that joke of a 5900FX card doing there ?? I’d rather you put in a Trident than that excuse of a “DX9+” card 🙁 Seriously, put a Radeon next time.

    Nice review overall (obviously unbiased and accurate AND TRUSTWORTHY) but I feel its point was missed, as these are arguably not workstation mobos…

    Adi

    • leor
    • 16 years ago

    I don’t understand why MSI would do something as silly as have no memory connection to the second opteron processor. I wonder how any of the other chips would have performed if their memory bandwidth was halved.

    • indeego
    • 16 years ago
      • Anonymous
      • 16 years ago

      my thoughts exactly.

      On topic, it seems the Opteron really needs to scale frequency-wise if AMD wishes to effectively compete with Intel’s Xeon. Fortunately, it seems performance scales much better with K8 than with Netburst. Exciting times ahead.

      • adisor19
      • 16 years ago

      Huh ??? you didn’t post anything as the first post.. quite lame indeego 🙁

      Let’s read the article first and see what it’s all about, rather than just reserving your status as first post with nothing to say..

      Adi

        • Ardrid
        • 16 years ago

        I’m a little irked about the use of an Opteron 240. I’m assuming that’s all Damage could get at the time. From what I’ve seen elsewhere, the Opteron scales beautifully with every increase in clock speed, not to mention with every processor added to the configuration, and the 246 (2.0GHz) pretty much hands the 3.2GHz its ass, to be blunt. I’m looking forward to Sept 23rd and the Tech-Report’s review of the Athlon 64 and the Opteron 246, if it’s coming, that is 🙂
