Memory performance
The 1066MHz bus on the Extreme Edition ought to allow for more memory bandwidth. Does it deliver?
Yes, the P4 Extreme Edition 3.46GHz (dubbed the P4 XE 3.46GHz in our graphs) gets a nice bump in Sandra memory bandwidth. Look at the Cachemem results, however, and you can see how the P4 Prescott's more aggressive prefetching of data into the L2 cache produces better bandwidth scores, even on an 800MHz bus. This is one reason I'd like to see a Prescott with a 1066MHz bus; I suspect the Prescott could take better advantage of it.
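For a rough sense of how much headroom the faster bus offers on paper, here is a quick back-of-the-envelope sketch. It assumes a 64-bit (8-byte) bus and treats the quoted MHz figures as effective transfer rates; neither detail is spelled out above, and the function name is purely illustrative. These are theoretical peaks, not what Sandra or Cachemem will report.

```python
# Back-of-the-envelope peak bandwidth for a front-side bus.
# Assumptions (not stated in the text above): the bus is 64 bits (8 bytes)
# wide, and the quoted "MHz" figure is the effective transfer rate.

BUS_WIDTH_BYTES = 8  # assumed 64-bit data path


def peak_bus_bandwidth_gbs(effective_mhz, width_bytes=BUS_WIDTH_BYTES):
    """Return theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return effective_mhz * 1_000_000 * width_bytes / 1_000_000_000


for mhz in (800, 1066):
    print(f"{mhz}MHz bus: {peak_bus_bandwidth_gbs(mhz):.1f} GB/s peak")
```

Under those assumptions, the 1066MHz bus has about a third more peak bandwidth than the 800MHz bus. How much of that shows up in practice depends on the memory subsystem and on how well the CPU keeps the bus busy.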
Linpack shows us, visually, how the P4 Extreme Edition's copious on-chip cache dwarfs that of all competitors. With data matrix sizes up to 2MB, calculations are performed almost entirely in on-chip cache, where they are very quick.
I should mention, though, that the P4 Extreme Edition's total effective cache size is 2MB, even though the chip has an 8K L1 data cache, a 512K L2 cache, and a 2MB L3 cache. Intel's cache hierarchy is not exclusive, so total effective cache size isn't additive. The L2 cache mirrors the contents of the L1 data cache, and the L3 cache mirrors the contents of the L2. This arrangement is distinct from the exclusive caches on recent AMD processors, where a 128K L1 data cache and 1024K L2 cache would lead to a total effective cache size of 1152K.
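The arithmetic behind those two numbers can be sketched in a few lines. This is a simplified model that assumes a strictly inclusive hierarchy holds no more unique data than its largest level, while a strictly exclusive one holds the sum of all levels; the function name and its parameters are illustrative only.

```python
# Simplified model of "effective" cache capacity.
# Assumption: in an inclusive hierarchy each larger level mirrors the smaller
# ones, so unique capacity equals the largest level; in an exclusive
# hierarchy no data is duplicated, so the capacities add up.


def effective_cache_kb(level_sizes_kb, exclusive):
    """Return the total effective cache size in KB for the given levels."""
    if exclusive:
        return sum(level_sizes_kb)
    return max(level_sizes_kb)


# P4 Extreme Edition: 8K L1 data, 512K L2, 2MB (2048K) L3, not exclusive
p4_xe = effective_cache_kb([8, 512, 2048], exclusive=False)

# AMD example from the text: 128K L1 data, 1024K L2, exclusive
amd = effective_cache_kb([128, 1024], exclusive=True)

print(f"P4 XE effective cache: {p4_xe}K")
print(f"AMD effective cache:   {amd}K")
```

With the sizes quoted above, this gives 2048K (2MB) for the Extreme Edition and 1152K for the exclusive AMD arrangement.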
The faster bus on the Extreme Edition shows a small but measurable improvement in our memory access latency benchmark. Let's look at this effect in more detail with our fancy 3D graphs.