Memory subsystem performance
We figure the graph below is just complicated enough to weed out the lightweights and keep our readership from getting all big and unmanageable. Fortunately, it's not really all that difficult to read. You're just seeing how much bandwidth the memory subsystem on each CPU can deliver at different block sizes, which tend to correspond with different caches. For instance, the 1MB block size ought to spill into the L3 cache on the Phenoms.
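To make the block-size-to-cache mapping concrete, here is a minimal sketch of the reasoning. The cache capacities are our assumptions for a Phenom II-class chip (64KB L1 data, 512KB L2 per core, 6MB shared L3), not figures taken from the graph. The names `ASSUMED_CACHES` and `level_for_block` are made up for illustration. It also ignores subtleties like exclusive cache hierarchies, so treat it as a rough guide only.

```python
# Assumed cache capacities in bytes, smallest to largest. Illustrative only.
ASSUMED_CACHES = [
    ("L1", 64 * 1024),
    ("L2", 512 * 1024),
    ("L3", 6 * 1024 * 1024),
]


def level_for_block(block_bytes, caches=ASSUMED_CACHES):
    """Return the smallest cache level that can hold the whole block."""
    for name, capacity in caches:
        if block_bytes <= capacity:
            return name
    # Nothing on the chip is big enough, so the test is hitting DRAM.
    return "main memory"


if __name__ == "__main__":
    for size in (32 * 1024, 1024 * 1024, 256 * 1024 * 1024):
        print(size, level_for_block(size))
```

Under those assumed sizes, a 1MB block is too big for the L2 and lands in the L3, while a 256MB block is far too big for any cache.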
One noteworthy result here: at that 1MB block size, the Phenom II's L3 cache bandwidth is higher than the Phenom X4 9950's, even though the 9950's L3 cache runs at 2GHz, or 200MHz faster than the Phenom II's.
Since it's difficult to see the results once we get into main memory, let's take a closer look at the 256MB block size:
Although our Core 2 test systems have relatively fast DDR3 memory, their memory bandwidth appears to be limited by their front-side bus speeds. With integrated memory controllers, all of the Phenoms can transfer data from main memory faster, and the Phenom IIs make some nice gains over the original Phenom in this department. I suspect the Phenom II's larger L3 cache and more aggressive data prefetch algorithm deserve some of the credit for this result. Of course, the Core i7 is faster still thanks to its onboard triple-channel DDR3 memory controller.
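The front-side bus argument comes down to simple arithmetic: theoretical peak bandwidth is the transfer rate times the bus width times the number of channels. The sketch below works through it with example numbers that are our assumptions rather than the actual test system configurations: a 1333 MT/s, 64-bit front-side bus, dual-channel DDR3-1333, and triple-channel DDR3-1066. The function name `peak_gb_per_s` is hypothetical. Measured bandwidth will always come in below these peaks.

```python
def peak_gb_per_s(mega_transfers_per_s, bus_bytes=8, channels=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 1000 MB here)."""
    return mega_transfers_per_s * bus_bytes * channels / 1000.0


# Assumed example configurations, not the review's test systems.
fsb = peak_gb_per_s(1333)                         # 64-bit front-side bus
dual_ddr3 = peak_gb_per_s(1333, channels=2)       # dual-channel DDR3-1333
triple_ddr3 = peak_gb_per_s(1066, channels=3)     # triple-channel DDR3-1066

# With the memory controller behind the bus, the slower link sets the ceiling.
fsb_system_ceiling = min(fsb, dual_ddr3)

if __name__ == "__main__":
    print("FSB ceiling:", fsb_system_ceiling)
    print("Dual-channel DDR3:", dual_ddr3)
    print("Triple-channel DDR3:", triple_ddr3)
```

With these assumed numbers, the bus tops out at roughly half of what the memory behind it could supply, which is the sort of bottleneck an integrated memory controller removes.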
We have noted before that the Phenom's L3 cache appears to contribute some delay to the whole memory subsystem. Even though the Phenom II's L3 cache is three times the size and runs 200MHz slower, the Phenom II is nearly as quick at getting out to main memory as the Phenom X4 9950. Not bad.
Below are 3D graphs of memory access latencies at various block and step sizes for the Phenom II and some of its closer rivals. We've color-coded them as a rough guide, so don't read too much into the exact boundaries. Yellow roughly corresponds to the chip's L1 cache, light orange to the L2 cache, red to the L3 cache, and dark orange to main memory.
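For the curious, here is a rough sketch of the access pattern behind a latency test of this sort: walk a chain of dependent loads through a block of a given size, hopping by a given step. This is not the tool we used, and the helper names (`build_chain`, `walk`, `ns_per_hop`) are made up for illustration. We also assume 8-byte elements. In Python the interpreter overhead swamps any cache effects, so the timings are meaningless as hardware measurements. A real test would do the same walk in C or assembly.

```python
import time


def build_chain(block_bytes, stride_bytes, elem_bytes=8):
    """Build a circular chain in which each visited slot holds the next index."""
    n = block_bytes // elem_bytes
    step = max(1, stride_bytes // elem_bytes)
    chain = [0] * n
    slots = list(range(0, n, step))
    # Link each visited slot to the next one, wrapping back to the start.
    for here, there in zip(slots, slots[1:] + slots[:1]):
        chain[here] = there
    return chain


def walk(chain, hops):
    """Follow the chain; each load depends on the result of the previous one."""
    i = 0
    for _ in range(hops):
        i = chain[i]
    return i


def ns_per_hop(chain, hops=100_000):
    """Average time per hop in nanoseconds (dominated by the interpreter)."""
    start = time.perf_counter_ns()
    walk(chain, hops)
    return (time.perf_counter_ns() - start) / hops


if __name__ == "__main__":
    # Sweep a small grid of block and step sizes, as the graphs do.
    for block in (16 * 1024, 256 * 1024, 4 * 1024 * 1024):
        for stride in (64, 256):
            print(block, stride, ns_per_hop(build_chain(block, stride)))
```

Because every hop depends on the previous load, the processor can't overlap the accesses, which is why this pattern exposes latency rather than bandwidth.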
The Phenom II's L3 cache is indeed pretty quick, although the Core i7-920's is larger and quicker still. The Core 2 Q9400 lacks an L3 cache, but has a larger L2 instead.