Chipset Serial ATA and RAID performance compared

Geoff Gasior

STORAGE SUBSYSTEMS DON’T GET nearly enough attention, though they’re arguably the most important subsystem of a modern PC. Of course, they store all of a system’s data—an increasingly precious resource that most of us don’t back up nearly often enough. Storage subsystems are often the slowest components in a modern PC, as well. Hard drives are essentially mechanical devices, and even with ever-growing platter densities, their seek times can amount to a practical eternity in computer time. Modern storage subsystems have a trick up their sleeves, though. RAID arrays have the potential to improve storage performance dramatically by spreading data over multiple drives. They can also improve redundancy, allowing a system to survive one or more drive failures with no data loss.

If RAID’s speed and redundancy aren’t enough to pique your interest, maybe its price will. Serial ATA RAID support is included in most of today’s core logic chipsets, so it’s essentially free. Chipset RAID has been around for a while, of course, but only the most recent core-logic chipsets from Intel and NVIDIA support arrays of up to four drives and the highly coveted RAID 5.

We’ve spent a couple of months running Intel’s ICH7R and NVIDIA’s nForce4 Serial ATA RAID controllers through our exhaustive suite of storage tests, and the results are something to behold. We started with a single drive and worked our way up through RAID levels 0, 1, 10, 0+1, and 5 with two, three, and even four hard drives. Read on to see which RAID controller reigns supreme and how the different RAID levels compare in performance.

RAID refresher
Before we dive into an unprecedented, er, array of benchmark results, we should take a moment to unravel the oft-misunderstood world of RAID. Depending on who you believe, RAID either stands for Redundant Array of Independent Disks or Redundant Array of Inexpensive Disks. We’re inclined to side with “independent” rather than “inexpensive,” since RAID arrays can just as easily be built with uber-expensive 15K-RPM SCSI drives as they can with relatively inexpensive Serial ATA disks. RAID arrays aren’t necessarily redundant, as you’ll soon see, so there’s little point in squabbling over the acronym.

There are a myriad of RAID levels to choose from, but we’ll be focusing our attention on RAID 1, 0, 10, 0+1, and 5. Those are the array types supported by today’s core-logic chipsets, and they’re also the most common RAID options for add-in cards. Each array level offers a unique blend of performance, redundancy, and capacity, which we’ve outlined below. For the sake of simplicity, we’ll assume that arrays are being built with identical drives of equal capacity.

  • RAID 0 — Otherwise known as striping, RAID 0 improves performance by spreading data across multiple disks in an array. Data is striped in blocks, whose size can range from a few kilobytes all the way up to 256KB or more. The size of these blocks is consistent throughout the array and known as the stripe size. Users can usually define the stripe size when configuring the array.

    RAID 0 arrays can be created with as few as two drives or with as many drives as there are open ports on the RAID controller. Adding more drives should improve performance, but those performance gains come with an increased risk of data loss. Because RAID 0 spreads data over multiple disks, the failure of a single drive will destroy all the data in an array. The more drives in a RAID 0 array, the greater the chance of a drive failure.

    While it may be more prone to data loss than other levels, RAID 0 does offer the best capacity of any conventional array type. The capacity of a RAID 0 array is equal to the capacity of one of the drives in the array times the total number of drives in the array, with no space wasted on mirrors, parity, or other such luxuries.


    RAID 0 (left) and RAID 1 (right)
  • RAID 1 — Also known as mirroring, RAID 1 duplicates the contents of a primary drive on an auxiliary drive. This arrangement allows a RAID 1 array to survive a drive failure without data loss. The enhanced fault tolerance of a RAID 1 array (or of most other RAID levels) is no substitute for a real data backup solution, of course. If the primary drive in a RAID 1 array is plagued with viruses, malicious software, or data loss due to user error, the mirrored auxiliary drive will suffer the same fate simultaneously.

    Although RAID 1’s focus is on redundancy, mirroring can improve performance by allowing read requests to be distributed between the two drives (although not all RAID 1 implementations take advantage of this opportunity). RAID 1 is one of the least efficient arrays when it comes to storage capacity, though. Because data is duplicated on an auxiliary drive, the total capacity of the array is only equal to the capacity of a single drive.

  • RAID 10 — This array type combines RAID 0 and RAID 1, and it enjoys the benefits of both striping and mirroring. Data is striped across a pair of mirrored arrays, allowing for better performance without sacrificing redundancy. Combining mirroring and striping allows RAID 10 to enjoy the performance perks of each, making this array type particularly potent. RAID 10 arrays can also survive multiple drive failures without data loss, provided that at least one drive in each mirrored array remains unscathed.


    RAID 10: An array of striped (red) mirrors (blue)

    RAID 10’s attractive blend of performance and redundancy does come at a price, though. Arrays require an even number of at least four drives to implement, and capacity is limited to just half the total capacity of the drives used.

  • RAID 0+1 — RAID 0+1 is similar to RAID 10 in that it combines striped and mirrored components, but it does so in reverse. Rather than striping an array of mirrors, RAID 0+1 mirrors an array of stripes. Like RAID 10, RAID 0+1 can benefit from the performance characteristics of both mirroring and striping. However, a RAID 0+1 array can only tolerate one drive failure without data loss. The failure of one drive will destroy a RAID 0+1 array’s mirrored component, effectively reducing it to RAID 0.


    RAID 0+1: An array of mirrored (blue) stripes (red)

    Like RAID 10, RAID 0+1 implementations require an even number of at least four drives. Array capacity is equal to half the total capacity of drives in the array.

  • RAID 5 — Referred to as striping with parity, RAID 5 attempts to combine the performance benefits of RAID 0 with improved fault tolerance that doesn’t consume half the drives in an array. Like RAID 0, RAID 5 stripes data across multiple disks in blocks of a given size. This setup allows the array to use multiple disks for both read and write requests, allowing for performance that should rival that of a similar striped array.

    Since striped arrays are a disaster as far as fault tolerance is concerned, RAID 5 adds a measure of fault tolerance using a little binary math. Parity data is calculated for each rank of blocks written to the array, and that parity data is spread across the drives in the array. This data can be used to reconstruct the contents of a failed drive, allowing RAID 5 arrays to survive a single drive failure with no data loss.


    RAID 5: Striping with parity

    Parity’s real benefit isn’t so much that it allows for a measure of fault tolerance, but that this fault tolerance can be achieved with little capacity sacrifice. Because only one parity block is needed for each rank of blocks written to the array, parity data only consumes the capacity of one drive in the array. This results in an array capacity equal to the total capacity of all drives in the array minus the capacity of just one drive—a huge improvement over the mirror overhead of RAID 1, 10, and 0+1 arrays. Parity also allows fault-tolerant RAID 5 arrays to be created with as few as three drives.

    Parity calculations, however, can introduce significant computational overhead. Parity data must be calculated each time data is written to the disk, creating a potential bottleneck for write performance. High-end RAID 5 add-in cards typically avoid this bottleneck by performing parity calculations with dedicated hardware, but current chipset-based RAID 5 implementations rely on the CPU to perform parity calculations.
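The "little binary math" behind RAID 5 parity is usually a bytewise XOR across the blocks in a rank. What follows is a minimal sketch of that idea, not a description of how the Intel or NVIDIA drivers implement it; the block contents and the `xor_blocks` helper are made up for illustration.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# One rank of a hypothetical four-drive array: three data blocks plus parity.
data_blocks = [b"RAID", b"five", b"demo"]
parity = xor_blocks(data_blocks)

# Simulate losing the drive that held the second data block, then rebuild
# its contents from the surviving data blocks and the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)
```

XORing the surviving blocks with the parity block yields the missing block, which is why a single failed drive can be reconstructed while a second failure cannot.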

 

The competitors
ATI, Intel, NVIDIA, SiS, ULi, and VIA core logic chipsets all feature various flavors of RAID, but the only widely available options that offer support up to four drives and RAID 5 are Intel’s ICH7R south bridge and NVIDIA’s nForce4 family. ULi’s new M1575 south bridge does support up to four-drive RAID 5, but it’s only just becoming available on a small handful of ATI CrossFire motherboards—products that weren’t available when we began our mammoth testing spree several months ago. VIA’s VT8251 south bridge will also support four-drive RAID 5, but that chip currently exists only on reference boards, and even then without complete drivers.


Intel’s ICH7R south bridge

When Intel revealed the ICH7R in May of this year, the south bridge was the first core-logic chipset component to support RAID 5. The ICH7R also supports RAID 0, 1, and 10, but arrays are limited to four Serial ATA drives. Those Serial ATA drives will enjoy support for Native Command Queuing (NCQ) and 300MB/s host transfer rates, though. The ICH7R also includes an interesting feature called Matrix RAID, which allows users to create RAID 0 and RAID 1 volumes spanning just two drives. We’ve looked at Matrix RAID’s performance extensively in the past, so we won’t be covering it today.


NVIDIA’s nForce4 SLI MCP

Our second competitor is NVIDIA’s nForce4 SLI MCP, the south bridge component of the nForce4 SLI Intel Edition chipset. Like the ICH7R, the nForce4 chip supports RAID 0, 1, and 5. However, NVIDIA has elected to support RAID 0+1 instead of RAID 10. (We’d obviously prefer RAID 10 for its greater fault tolerance.)
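To put a rough number on that fault tolerance gap, here is a small sketch that enumerates drive failures in a four-drive array. It assumes a simplified model in which an array survives as long as one complete copy of the data is still reachable; real controllers may be stricter and refuse to run a RAID 0+1 array after any second failure. The drive names and pairings are hypothetical.

```python
from itertools import combinations

DRIVES = ("A", "B", "C", "D")

# RAID 10: a stripe across two mirrored pairs, (A, B) and (C, D).
RAID10_MIRRORS = [{"A", "B"}, {"C", "D"}]
# RAID 0+1: a mirror of two striped sets, (A, B) and (C, D).
RAID01_STRIPES = [{"A", "B"}, {"C", "D"}]


def raid10_survives(failed):
    # Every mirrored pair needs at least one working drive.
    return all(pair - failed for pair in RAID10_MIRRORS)


def raid01_survives(failed):
    # At least one striped set must be completely intact.
    return any(not (stripe & failed) for stripe in RAID01_STRIPES)


def survivable(check, failures):
    """Count the failure combinations of a given size the array survives."""
    return sum(check(set(combo)) for combo in combinations(DRIVES, failures))


print("RAID 10 :", survivable(raid10_survives, 2), "of 6 two-drive failures")
print("RAID 0+1:", survivable(raid01_survives, 2), "of 6 two-drive failures")
```

Under this model both layouts shrug off any single failure, but RAID 10 survives twice as many of the possible two-drive failures as RAID 0+1.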

While the ICH7R’s RAID support is limited to Serial ATA drives, the nForce4 also supports RAID with plain old ATA drives. This capability allows the south bridge to handle arrays with up to eight drives (four ATA and four Serial ATA), although support for NCQ and 300MB/s host transfer rates will obviously be limited to the Serial ATA drives.

NVIDIA first made RAID 5 support available in its core logic chipsets with the nForce4 SLI Intel Edition, but the same RAID controller is used across the nForce4 Ultra, SLI, SLI Intel Edition, and SLI X16 chipsets. Only the latter two currently offer RAID 5, but according to NVIDIA, that capability can be added to the nForce4 Ultra and SLI chipsets with a motherboard BIOS update.

While we’re on the subject of chipsets, it should be noted that there’s a significant gap in interconnect bandwidth between the Intel and NVIDIA chipsets. The nForce4 SLI Intel Edition’s interconnect between the north bridge and south bridge chips tops out at only 1.6GB/s, while the Intel chipset enjoys 2GB/s of bandwidth. 400MB/s isn’t chump change, but our benchmark results should provide more insight on whether the discrepancy in interconnect bandwidth makes much of a difference.

Software RAID with Windows XP
Every time we do a RAID performance comparison, I get emails asking why we haven’t included Windows’s software RAID. Windows 2000, XP, and several flavors of Windows Server are capable of creating software RAID volumes, regardless of a system’s chipset, using dynamic disks. However, several limitations keep us from directly comparing OS-level software RAID with chipset implementations. First, some array levels, such as RAID 5, are only available in server versions of Microsoft’s operating systems. Support for these levels can be hacked into Windows XP with a hex editor, but they’re not officially sanctioned. We wouldn’t recommend relying on an OS hack to provide redundant storage. The creation of a Windows software RAID array also requires that the operating system already be installed, which prevents users from creating a striped system volume. That limitation doesn’t exist for any of our chipset RAID implementations, which can easily be implemented for system drives.

1.6TB of Caviar
RAID arrays should be configured with identical drives for best results, and we’ve rounded up a quartet of Western Digital Caviar RE2 hard drives for this comparison. The Caviar RE2 is an enterprise drive that’s apparently optimized for RAID environments, and its impressive single-drive performance has already earned it a TR Editor’s Choice award. The RE2 also weighs in at 400GB, allowing us to configure RAID 0 and 5 arrays with over 1TB of total capacity. Sorry Paris, that’s hot.
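For anyone doing the capacity math at home, the rules from our RAID refresher boil down to a few lines of arithmetic. This is only a sketch: the `array_capacity_gb` function and its drive-count checks are our own illustration, and it counts capacity in the decimal gigabytes that drive makers advertise.

```python
def array_capacity_gb(level, drives, drive_gb=400):
    """Usable capacity of an array built from identical drives."""
    if level == "0":
        if drives < 2:
            raise ValueError("RAID 0 needs at least two drives")
        return drives * drive_gb          # no space spent on redundancy
    if level == "1":
        if drives != 2:
            raise ValueError("this sketch models two-drive mirrors only")
        return drive_gb                   # one drive's worth of space
    if level in ("10", "0+1"):
        if drives < 4 or drives % 2:
            raise ValueError("needs an even number of at least four drives")
        return drives * drive_gb // 2     # half lost to mirroring
    if level == "5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least three drives")
        return (drives - 1) * drive_gb    # one drive's worth of parity
    raise ValueError(f"unknown RAID level: {level}")


for level, drives in [("0", 4), ("5", 4), ("0", 3), ("10", 4), ("1", 2)]:
    print(f"RAID {level} x{drives}: {array_capacity_gb(level, drives)}GB")
```

With four 400GB drives, that works out to 1,600GB for RAID 0 and 1,200GB for RAID 5, while RAID 10 and 0+1 are left with 800GB.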


Four Western Digital Caviar RE2 400GB hard drives
 

Test notes
Rather than overriding manufacturer defaults, we prefer to let RAID controllers choose their own stripe sizes for striped arrays. These defaults are usually what the manufacturer considers to be an optimal configuration for the RAID controller, and we wouldn’t want to test a configuration that was anything less than optimal. Unfortunately, the chipset RAID controllers from Intel and NVIDIA default to different stripe sizes for RAID 0 arrays: 128KB for the ICH7R and 64KB for the nForce4. This difference in stripe size could impact performance in some tests, so keep it in mind. The default stripe size for RAID 10/0+1 and 5 arrays is 64KB for both chipsets, so it’s not an issue.
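To see why stripe size can matter, consider how a striped array maps a logical offset to a physical drive. The sketch below uses the textbook round-robin layout; we don't know the exact on-disk layout either chipset uses, so treat the `locate` and `drives_touched` helpers as illustrative assumptions.

```python
KB = 1024


def locate(offset, stripe_size, drives):
    """Map a logical byte offset to (drive index, byte offset on that drive)."""
    stripe_index, within = divmod(offset, stripe_size)
    row, drive = divmod(stripe_index, drives)
    return drive, row * stripe_size + within


def drives_touched(offset, length, stripe_size, drives):
    """Return the set of drives hit by a contiguous request."""
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return {stripe % drives for stripe in range(first, last + 1)}


# A 128KB read from the start of a two-drive RAID 0 array.
print(drives_touched(0, 128 * KB, 64 * KB, 2))    # 64KB stripes
print(drives_touched(0, 128 * KB, 128 * KB, 2))   # 128KB stripes
```

In this model, a 128KB request is split across both drives with 64KB stripes but lands on a single drive with 128KB stripes. Smaller stripes spread individual requests over more spindles, while larger stripes leave the other drive free to service a different request.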

Because we’re masochists, er, thorough, we tested each chipset with a wide variety of arrays, including RAID 1, 0, 10, 0+1, and 5. RAID 1 and RAID 10/0+1 arrays were limited to two and four drives, respectively, but we busted out a few variations for our RAID 0 and RAID 5 tests. RAID 0 arrays were tested with two, three, and four drives, while RAID 5 arrays were tested with three and four drives. We’ve also included single-drive test results for reference.

Our testing methods
All tests were run three times, and their results were averaged, using the following test systems.

Component: Intel system | NVIDIA system
Processor: Pentium 4 3.4GHz Extreme Edition (both systems)
System bus: 800MHz (200MHz quad-pumped) (both systems)
Motherboard: Asus P5WD2 Premium | MSI P4N Diamond
BIOS revision: 0422 | 1.30
North bridge: Intel 955X MCH | NVIDIA nForce4 SLI SPP
South bridge: Intel ICH7R | NVIDIA nForce4 SLI MCP
Chipset drivers: Chipset 7.2.1.1003, AHCI/RAID 5.1.0.1022 | NVIDIA ForceWare 7.13
Memory size: 1GB (2 DIMMs) | 1GB (2 DIMMs)
Memory type: Micron DDR2 SDRAM at 533MHz (both systems)
CAS latency (CL): 3 | 3
RAS to CAS delay (tRCD): 3 | 3
RAS precharge (tRP): 3 | 3
Cycle time (tRAS): 8 | 8
Audio: ICH7R/ALC882D | Creative P17
Graphics: Radeon X700 Pro 256MB with CATALYST 5.7 drivers (both systems)
Hard drives: Western Digital Caviar RE2 400GB SATA (both systems)
OS: Windows XP Professional (both systems)
OS updates: Service Pack 2, DirectX 9.0C (both systems)

Our test systems were powered by OCZ PowerStream power supply units. The PowerStream was one of our Editor’s Choice winners in our latest PSU round-up.

We used the following versions of our test applications:

The test systems’ Windows desktop was set at 1280×1024 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.

All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

 

WorldBench overall performance
WorldBench uses scripting to step through a series of tasks in common Windows applications. It then produces an overall score. WorldBench also spits out individual results for its component application tests, allowing us to compare performance in each. We’ll look at the overall score, and then we’ll show individual application results alongside the results from some of our own application tests.

There’s a spread of more than ten points between the fastest and slowest configurations in WorldBench, with the RAID 5 arrays trailing by a significant margin. Looking at the overall results, the ICH7R has a slight edge in single-drive, RAID 1, and RAID 10 performance, while the nForce4 has a small lead with RAID 5 and with one of our RAID 0 configs. WorldBench’s results for individual application tests should shed more light on how things stack up.

Multimedia editing and encoding

MusicMatch Jukebox

Windows Media Encoder

Adobe Premiere

VideoWave Movie Creator

Of WorldBench’s multimedia editing and encoding tests, Premiere is the only application to really show a significant difference in performance between the various RAID levels. RAID 0 proves faster than any other configuration, but RAID 10/0+1 and even RAID 1 aren’t far behind. Neither are our single-drive configurations, for that matter. RAID 5 is another story. Parity overhead appears to be crippling RAID 5’s performance in Premiere, resulting in scores that are much slower than our other configurations.

The Premiere test crawls on our RAID 5 array with both the Intel and NVIDIA systems, but NVIDIA’s implementation is clearly faster. NVIDIA also has a slight lead over Intel with some of the other array configurations, although those results only differ by a few seconds.

 

Image processing

Adobe Photoshop

ACDSee PowerPack

Our Photoshop results are pretty close, but ACDSee shows the RAID 5 configurations way behind the rest of the pack again. As with our Premiere results, RAID 0 and 10/0+1 prove to be the fastest array levels, although there’s little performance difference between RAID 0 arrays with two, three, and four drives. Single-drive and RAID 1 performance is a little slower than the striped arrays but significantly faster than RAID 5.

Multitasking and office applications

Microsoft Office

Mozilla

Mozilla and Windows Media Encoder

Performance doesn’t deviate much in WorldBench’s office and multitasking tests, but RAID 5 is clearly slower in the Office XP test.

Other applications

WinZip

Nero

RAID 5 continues to slow our system in WorldBench’s WinZip and Nero tests. In WinZip, the ICH7R and nForce4 continue to trade blows across each RAID test, with neither emerging as a clear overall leader. However, the ICH7R looks better in Nero, where its single-drive, RAID 1, and RAID 10 performances are significantly faster than those of the nForce4. NVIDIA still manages to beat Intel when it comes to RAID 5 performance, though.

 

Boot and load times
To test system boot and game level load times, we busted out our trusty stopwatch.

Our ICH7R and nForce4 systems use different motherboards with different POST sequences, so it’s hard to compare boot times between them. However, the results clearly show that the time needed to scan for and initialize a RAID array on either chipset slows the boot process by enough for our single-drive configurations to come out on top.

Results in our level load time tests are more varied, but the single-drive configurations prove tough to beat. The nForce4 is noticeably slower than the ICH7R with a number of configurations in the Far Cry test and in a couple of instances in DOOM 3.

Array rebuild times
Our array rebuild tests simulate a drive failure by removing and formatting one drive in an array. The system is booted with the drive disconnected to ensure that the array functions properly without it, and we time the rebuild process after the drive is reconnected. Curiously, our ICH7R RAID 10 configuration wouldn’t accept our formatted drive back into the array, citing a hardware error. We had to use another drive, a 500GB Hitachi DeskStar 7K500, to get the RAID 10 array to rebuild. We didn’t have any such problems with our RAID 1 or 5 rebuilds on the ICH7R.

Reconstructing a RAID array takes a while, regardless of the system or array configuration. The ICH7R’s RAID 5 rebuilds are a little quicker than those of the nForce4, but the tables turn with the RAID 1 array.

 

File Copy Test – Creation
File Copy Test is a pseudo-real-world benchmark that times how long it takes to create, read, and copy files in various test patterns. File copying is tested twice: once with the source and target on the same partition, and once with the target on a separate partition. Scores are presented in MB/s.

Striping flexes its muscles in FC-Test’s file creation tests, and performance scales nicely from two to four drives. The performances of the two-drive RAID 0 and RAID 10/0+1 arrays are nearly identical, as are the performances of our single-drive and RAID 1 configurations. RAID 5 performance is dismal, though. Parity must be calculated every time data is written to the array, and in a file creation test, that’s all the time.
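Part of the reason writes hurt so much is that updating a single block in a parity array typically means touching the parity block as well. A common textbook approach is a read-modify-write: read the old data and old parity, XOR the old data out of the parity and the new data in, then write both back. We can't say whether Intel's or NVIDIA's drivers work this way, so consider this sketch, including the `small_write` helper and its sample blocks, purely illustrative.

```python
def xor(a, b):
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(old_data, old_parity, new_data):
    """Return the updated parity block for a single-block overwrite.

    The caller pays for two reads (old data, old parity) and
    two writes (new data, new parity) to change one block.
    """
    return xor(xor(old_parity, old_data), new_data)


# A hypothetical rank of three data blocks and their parity.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor(xor(d0, d1), d2)

# Overwrite the middle block and update parity without reading d0 or d2.
new_d1 = b"ZZZZ"
new_parity = small_write(d1, parity, new_d1)

# Recomputing parity from scratch gives the same result.
full_recompute = xor(xor(d0, new_d1), d2)
print(new_parity == full_recompute)
```

Either way, every write to the array drags extra I/O and XOR work along with it, which fits with the dismal file creation results.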

Although both RAID 5 implementations are slow in this test, the nForce4’s performance with parity is consistently ahead of that of the ICH7R. However, the ICH7R proves faster with three- and four-drive RAID 0 arrays.

 

File Copy Test – Read

The nForce4 dominates our FC-Test read results, sweeping the ICH7R nearly across the board. Intel pulls within striking distance with the ISO test pattern, which consists of a small number of very large files, but it’s not even close otherwise.

Looking at overall array performance, RAID 5 finally finds some redemption and even leads the field in a couple of test patterns. RAID 10 and 0+1 also perform comparably well. Notably, though, RAID 1 doesn’t offer much of a performance boost over a single-drive system. Overall, read performance doesn’t scale nearly as aggressively with RAID as file creation performance does, despite the fact that three- and four-drive arrays continue to cluster at the front of the pack.

 

File Copy Test – Copy

These tests combine both read and write operations, and that write component is enough to sink our RAID 5 arrays. RAID 10/0+1 and 0 arrays provide a healthy performance boost over our RAID 1 and single-drive configurations, and our multi-drive RAID 0 arrays dominate almost across the board.

Speaking of domination, the nForce4 reigns supreme again with all but the ISO test pattern. This performance advantage over the ICH7R is most apparent with RAID 0 arrays and virtually nonexistent with single-drive and RAID 1 configurations.

 

File Copy Test – Partition-to-partition copy

These partition-to-partition copy results play out much like the copy tests, with the nForce4 leading the ICH7R in all but the ISO test pattern. Again, we see RAID 5 stuck in the doldrums, while our multi-drive RAID 0 arrays lead the field.

 

iPEAK multitasking
We recently developed a series of disk-intensive multitasking tests to highlight the impact of command queuing on hard drive performance. You can get the low-down on these iPEAK-based tests here. The mean service time of each drive is reported in milliseconds, with lower values representing better performance.

Multi-drive RAID 0 arrays thrive in our first round of iPEAK tests. RAID 10/0+1 performance is also impressive, and it’s clear that our single-drive configuration is out of its depth. RAID 5 performance is more mixed, for once. The nForce4’s RAID 5 arrays don’t fare nearly as poorly as the ICH7R’s, although the performance delta between the two chipsets is much smaller with other array types.

 

iPEAK multitasking – cont’d

RAID 0 and 10/0+1 continue to dominate our iPEAK results, in some cases providing performance that’s more than four times faster than a single drive. Again, we see big performance gaps between Intel and NVIDIA’s RAID 5 implementations, with the ICH7R on the losing end of that battle in each test. Otherwise, the two chipsets trade blows, with neither able to put the other away.

 

HD Tach
We tested HD Tach with the benchmark’s full variable zone size setting. Oddly, our MSI nForce4 motherboard had problems consistently completing the test with arrays larger than 1TB. We had to swap in Gigabyte’s GA-8N-SLI Royal, which uses the same nForce4 SLI Intel Edition chipset, to get results with our four-drive RAID 5 and three- and four-drive RAID 0 arrays. The Gigabyte board’s performance with other arrays was identical to that of the MSI, so the results should be comparable.

Although sustained read performance scales very well from one to two drives, adding a third or fourth drive yields much more modest performance gains. Unfortunately, mirroring doesn’t improve read speeds much, either. There’s no performance difference between our single-drive and RAID 1 configurations, and RAID 10/0+1 performance is nearly identical to that of a two-drive RAID 0 array. At least RAID 5 fares relatively well.

When we move to writes, RAID 5 assumes its position at the back of the field, with single-drive configurations running two to three times faster. Again, we see diminishing returns scaling beyond two drives. While Intel’s RAID 0 and 10 arrays have a clear lead over the nForce4, the ICH7R’s RAID 5 performance is especially poor.

The host transfer rate of our Caviar RE2 hard drives is limited to 150MB/s, so our burst speed results aren’t particularly impressive. The arrays appear to hit a wall around 250MB/s, but there are significant gains to be had going from a single drive to a striped array. Mirroring doesn’t improve burst speeds at all, though, and the ICH7R’s RAID 1 array is curiously but consistently slow in this test.

RAID 1 bounces back with a strong performance in HD Tach’s random access time test, but only on the Intel side. The ICH7R’s RAID 10 array also fares well here, but the results for the nForce4 and our other arrays are pretty close overall.

CPU utilization results are also close when we take into account HD Tach’s +/- 2% margin for error for this test. Despite that margin for error, the ICH7R’s CPU utilization appears to be consistently lower than the nForce4’s.

 

IOMeter – Single drive – Transaction rate
IOMeter simulates multi-user file server, web server, workstation, and database loads, and measures a system’s transaction rate, response time, and CPU utilization. There’s a lot of data to present here, so we’ve started by segmenting our results by RAID level, or in this case, a lack of RAID. We should note that IOMeter wouldn’t run on the nForce4 with 128 or 256 outstanding I/O requests, so you won’t find scores for the NVIDIA chip for either. We’ve seen similar behavior from various storage controllers in the past, including the ICH7R with earlier drivers, so it’s probably nothing to be too worried about.

Single-drive transaction rates scale better on the ICH7R than on the nForce4, with the NVIDIA chip only revving up under heavier loads.

 

IOMeter – Single drive – Response time

Single-drive response times are very close between the two platforms, with the ICH7R only a hair faster.

 

IOMeter – Single drive – CPU utilization

The ICH7R has a slight edge when it comes to CPU utilization, as well, although the nForce4 is barely over 0.5%.

 

IOMeter – RAID 1 – Transaction rate
Next we’ll move to our RAID 1 results and see what mirroring can do under multi-user loads.

The ICH7R’s transaction rates are generally higher than those of the nForce4 with RAID 1, in part because the NVIDIA controller seems to hit a bottleneck with a load of 16 or more outstanding I/Os. Rather than scaling smoothly, the nForce4’s performance seems to jump dramatically between plateaus. At the beginning of each plateau, the nForce4 is faster than the ICH7R, though.

 

IOMeter – RAID 1 – Response time

We don’t see any plateaus in our RAID 1 response time results. Although the nForce4 is a little faster at some load levels, the ICH7R is more responsive under heavier loads—and overall.

 

IOMeter – RAID 1 – CPU utilization

RAID 1 CPU utilization is low across the board, but the ICH7R seems to build a slightly stronger appetite for CPU cycles as the load increases.

 

IOMeter – RAID 0 – Transaction rate
From mirroring to striping, RAID 0 is up next. Here we have results for two-, three-, and four-drive arrays on each controller.

The ICH7R’s RAID 0 transaction rates scale beautifully under load, as they do from two to four drives. The nForce4’s performance isn’t as pretty, though. NVIDIA’s RAID 0 arrays hit a transaction rate ceiling with loads above 16 outstanding I/Os, although that ceiling increases as more drives are added to the array. Raise the roof, er, or something. Either way, the ICH7R’s RAID 0 arrays look much better equipped to handle heavy multi-user loads.
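A transaction rate ceiling has a predictable effect on responsiveness. Little's law ties the two together: mean response time equals the number of outstanding requests divided by throughput. The sketch below uses a made-up ceiling of 400 transactions per second, not a figure from our charts, to show what happens once the rate stops climbing.

```python
def mean_response_time_ms(outstanding_ios, transactions_per_second):
    """Little's law: mean response time = outstanding requests / throughput."""
    return 1000.0 * outstanding_ios / transactions_per_second


# Hypothetical controller whose transaction rate is capped at 400 per second.
CEILING = 400
for load in (16, 64, 256):
    print(f"{load:3d} outstanding I/Os: {mean_response_time_ms(load, CEILING):.0f} ms")
```

Once throughput is pinned at the ceiling, each doubling of the load doubles the average wait, so a controller that keeps scaling will look far more responsive under heavy loads.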

 

IOMeter – RAID 0 – Response time

Intel continues to lead as we move to IOMeter’s response time results, where the ICH7R’s RAID 0 arrays are easily more responsive than those of the nForce4.

 

IOMeter – RAID 0 – CPU utilization

CPU utilization continues to be pretty low across the board, although the ICH7R does consume more CPU cycles as it handles loads of 128 and 256 outstanding I/Os.

 

IOMeter – RAID 10/0+1 – Transaction rate
Can the ICH7R’s dominance in IOMeter continue as we move to RAID 10 and 0+1 arrays?

It certainly can. The nForce4 bonks early, and performance levels off somewhere between four and 16 outstanding I/Os. Intel’s ICH7R keeps on ticking, scaling aggressively under heavier loads and eventually doubling the nForce4’s peak transaction rate with the web server test pattern.

 

IOMeter – RAID 10/0+1 – Response time

This one isn’t even close, folks. The ICH7R’s response times are much lower than the nForce4’s here.

 

IOMeter – RAID 10/0+1 – CPU utilization

Despite its significant performance advantage over the nForce4, the ICH7R’s CPU utilization continues to be reasonably low.

 

IOMeter – RAID 5 – Transaction rate
Parity time. RAID 5 rounds out our IOMeter array segmentation, and we have results for three- and four-drive configurations.

And the nForce4 hits the wall again. Transaction rates for RAID 5 arrays on the nForce4 start to trail off as we hit four outstanding I/Os, and it’s all over by the time we get to 16. That’s really a shame since the nForce4 looks competitive at first.

To be fair, the ICH7R also hits a wall, but its performance doesn’t start to plateau until the load reaches 64 outstanding I/Os. Since the Intel chip doesn’t slow down at all with the web server test pattern, which is made up exclusively of read operations, it seems likely that the ICH7R’s RAID 5 transaction rates are being constrained by write-related parity overhead.

 

IOMeter – RAID 5 – Response time

Given our previous results, it’s no surprise that the ICH7R’s RAID 5 response times are significantly faster than those of the nForce4.

 

IOMeter – RAID 5 – CPU utilization

Despite the need to calculate parity, the CPU utilization of our RAID 5 arrays is quite reasonable. That’s a surprising result, and it makes me wonder whether both companies’ drivers aren’t limiting the CPU resources allocated to parity calculations.

 

IOMeter – ICH7R – Transaction rate
Now that we’ve compared the ICH7R and nForce4’s performance at each RAID level, we’re busting out a set of graphs that show how performance scales across different RAID levels for each controller. Since CPU utilization was so low across the board, we won’t present those results again.

Our ICH7R scaling graphs show that RAID is really a no-brainer for multi-user environments. Even the two-drive arrays improve on single-drive performance by a significant margin, and the four-drive RAID 0’s peak transaction rate is scary by comparison. Note that the RAID 5 arrays are much more competitive in the web server test, where they don’t have to calculate parity for any write operations. It’s also interesting to see comparatively strong performances by the mirrored arrays in the read-dominated web server test. Mirroring can improve read performance, after all.

 

IOMeter – ICH7R – Response time

Our rainbow of response time results clearly illustrates why multi-user server environments have been using RAID for years. It really is that much faster, and the more drives you add, the more responsive the system becomes under heavier loads.

 

IOMeter – nForce4 – Transaction rate
How does array performance scale on the nForce4?

Nicely, that is, until you hit that pesky performance ceiling above 16 outstanding I/Os. It looks like we’re missing three- and four-drive RAID 0 results in the web server graph, but they’re there, hiding right under the three- and four-drive RAID 5 results. Take away the parity requirements of write operations, and the nForce4’s RAID 5 arrays perform just like RAID 0 arrays in IOMeter.

 

IOMeter – nForce4 – Response time

Again, the performance of three- and four-drive RAID 0 arrays on the nForce4 is identical to that of three- and four-drive RAID 5 arrays with the read-dominated web server test pattern. We don’t see nearly as much separation in the nForce4’s response time results as we saw with the ICH7R, though.

 
Conclusions
After several months of testing, reams of benchmark results, and a stunningly excessive number of graphs, we’ve learned a few things about the performance of the RAID controllers inside Intel’s ICH7R and NVIDIA’s nForce4. Before summarizing the results of the title match between the two chipsets, we should take a moment to make some general observations about the performance of chipset-level RAID.

RAID has always been a proven performer in demanding multi-user environments, and our IOMeter results bear that out even for two-drive RAID 1 arrays. There’s always been a question of whether arrays do much for desktops, though. Looking at our WorldBench and load time results, it’s obvious that RAID doesn’t improve the performance of typical desktop applications by a significant margin. However, our FC-Test and iPEAK results show that moving from a single-drive configuration to a multi-drive RAID array can dramatically increase file creation, copy, and read performance. RAID can also improve storage subsystem responsiveness during disk-intensive multitasking. The fault tolerance of RAID 1, 10, 0+1, and 5 arrays can also pay obvious dividends for desktop systems, as anyone who has lost data due to a drive failure can attest.

In theory, RAID 5 is probably the most attractive array type for those looking to balance redundancy with storage capacity, and it’s a new checklist feature for core logic chipsets. Unfortunately, the write performance of chipset-level RAID 5 implementations is pretty dismal. It’s slow enough to affect file transfers and desktop applications, and even IOMeter takes a hit when parity comes into play. Curiously, CPU utilization for chipset RAID 5 arrays is very low, suggesting that there’s plenty of parity-crunching processor power to spare. Chipset RAID 5 arrays may be bottlenecked elsewhere, or drivers may limit the number of CPU cycles used to calculate parity in an attempt to keep the system’s processor free for other tasks. Either way, there’s a hefty price to be paid for striping with parity in today’s core-logic chipsets.
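To put numbers on the balance between redundancy and capacity, here is a rough sketch of the usable capacity each array type offers when built from identical drives. The function name and the 250GB drive size are assumptions picked for illustration, and the figures ignore formatting overhead.

```python
def usable_capacity_gb(level: str, drives: int, drive_gb: float) -> float:
    """Usable capacity of an array of identical drives at a given RAID level."""
    if level == "0":
        if drives < 2:
            raise ValueError("RAID 0 needs at least two drives")
        return drives * drive_gb  # pure striping, no redundancy
    if level == "1":
        if drives < 2:
            raise ValueError("RAID 1 needs at least two drives")
        return drive_gb  # every drive holds the same data
    if level in ("10", "0+1"):
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 and 0+1 need an even number of drives, four or more")
        return drives / 2 * drive_gb  # half the drives hold mirror copies
    if level == "5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least three drives")
        return (drives - 1) * drive_gb  # one drive's worth of space goes to parity
    raise ValueError(f"unsupported RAID level: {level}")


# Four hypothetical 250GB drives at each fault-tolerant level, plus RAID 0.
for level in ("0", "10", "0+1", "5"):
    print(f"RAID {level}: {usable_capacity_gb(level, 4, 250.0):.0f}GB usable")
```

With four drives, RAID 5 gives up a quarter of the raw capacity to parity, while RAID 10 and 0+1 give up half to mirroring.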

Apart from general observations about RAID performance, we can also draw several conclusions about the merits of the Intel ICH7R and NVIDIA nForce4 RAID controllers. Which one proves better very much depends on the application. Based on our test results, the nForce4 looks like a superior RAID platform for desktop systems. The nForce4’s performance in the majority of our FC-Test and iPEAK tests is better than that of the ICH7R, and in single-user tests, NVIDIA’s RAID 5 implementation is less crippling than Intel’s.

While the nForce4 performs strongly in single-user tasks, it’s easily outclassed by the ICH7R under multi-user loads in IOMeter. We’ve seen NVIDIA RAID controllers hit a wall in IOMeter before, and the nForce4 has real problems scaling array performance beyond 16 outstanding I/Os. The ICH7R, on the other hand, scales beautifully under increasingly heavy loads, and it even doubles the transaction rate of the nForce4 in some cases. This stellar IOMeter performance makes the ICH7R the clear choice for servers and other demanding multi-user environments, even if it’s a slightly less attractive option for desktop systems.