Memory subsystem performance
Our first few tests are synthetic benchmarks that let us inspect the performance of the cache and memory subsystems.
This test is nicely multithreaded, so the caches from all available cores contribute to the throughput measured. You may be surprised to see that the Phenom II X6 1100T achieves higher bandwidth than the FX-8150 at the smaller block sizes, but remember it has six L1 caches where the FX has four. The more apt comparison may be the Phenom II X4 980, with four cores and a 3.7GHz clock frequency. The FX's L1 caches will cover block sizes up to 64KB, and the FX-8150 is faster than the Phenom II X4 980 at each step from 2KB to 64KB. Then again, with only four cores, Sandy Bridge's L1 caches are faster still.
The 256KB to 1MB block sizes are L2 cache territory, and the FX's L2 caches don't look to be especially fast, either, though they do largely outperform the Phenom II X4 980's. Bulldozer's L2 caches may lack for speed, but they're large. At the 4MB data point, the rest of the CPUs are into their L3 caches, while the FX is still in its L2 coverage area. The next step up in block size is 16MB, which is right at the outer edge of the FX's effective total cache capacity: its 8MB of L3 cache doesn't replicate the contents of its 8MB of L2 cache, so the two add up to 16MB. The FX-8150 again delivers the highest throughput at the 16MB block size, but not by much.
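The capacity arithmetic behind that 16MB figure can be sketched in a few lines of Python. This is only an illustration: the function name `effective_capacity` is hypothetical, and it treats the L3 as strictly exclusive of the L2, which is a simplification of how a real cache hierarchy behaves.

```python
MB = 1024 * 1024

def effective_capacity(l2_bytes, l3_bytes, exclusive):
    """Rough upper bound on the unique data held across L2 and L3.

    Hypothetical helper for illustration; real hierarchies are messier.
    """
    if exclusive:
        # L3 does not replicate L2 contents, so the capacities add up.
        return l2_bytes + l3_bytes
    # Inclusive L3: everything in L2 is duplicated in L3, so L3 is the bound.
    return l3_bytes

# FX-8150 figures from the text: 8MB of L2 plus 8MB of non-replicating L3.
fx_total = effective_capacity(8 * MB, 8 * MB, exclusive=True)
print(fx_total // MB)  # 16
```

With `exclusive=False`, the same inputs would yield 8MB, which is why a 16MB block would spill well past the caches on a design whose L3 duplicates its L2.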
Some of the credit for the FX-8150's strong showing here no doubt goes to its use of 1866MHz DIMMs. However, we've tried 1866MHz memory on the older CPU cores in Llano, and our Stream results topped out at around 15GB/s. Bulldozer's smart data prefetchers and large L2 caches deserve credit for taking good advantage of the available memory bandwidth.
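For a sense of scale, here is a back-of-the-envelope sketch of the theoretical peak bandwidth of 1866MHz memory. It assumes a dual-channel configuration with 64-bit (8-byte) channels and reads "1866MHz" as 1866 million transfers per second; none of those details are stated above, and the function name `peak_bandwidth_gbs` is made up for this example.

```python
def peak_bandwidth_gbs(transfer_rate_mts, channels=2, bus_bytes=8):
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumes dual-channel memory with 8-byte-wide channels by default.
    """
    return transfer_rate_mts * 1e6 * bus_bytes * channels / 1e9

peak = peak_bandwidth_gbs(1866)
print(f"{peak:.1f} GB/s")              # 29.9 GB/s
print(f"{15 / peak:.0%} of peak")      # 50% of peak
```

Under those assumptions, the roughly 15GB/s we saw from Llano works out to about half of the theoretical peak, which gives some idea of how much headroom better prefetching can exploit.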
Measuring memory access latencies has gotten to be tricky with the advent of Turbo-style clock speed ramping, because latencies are reported in CPU cycles, and we can't be sure what clock speed a chip was running at during the test. Nevertheless, we've chosen to report access latencies with the caveat that our guesses about likely frequencies for these CPUs may be incorrect.
If we're right, the FX comes out looking pretty good, with access latencies comparable to competing Sandy Bridge parts, despite its larger caches. Again, the use of 1866MHz memory may be helping the FX here.
For what it's worth, our tool reports Bulldozer's L1 data cache latency at 3 cycles, L2 at 18 cycles, and L3 at 65 cycles.
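To show why the frequency guess matters, the sketch below converts those cycle counts into nanoseconds at two candidate clock speeds. The 3.6GHz and 4.2GHz values are assumptions for illustration (the FX-8150's base and peak Turbo clocks), not frequencies measured during this test, and `cycles_to_ns` is a hypothetical helper.

```python
def cycles_to_ns(cycles, freq_ghz):
    """Convert a latency in CPU cycles to nanoseconds at a given clock speed."""
    # One cycle at f GHz lasts 1/f nanoseconds.
    return cycles / freq_ghz

# Cycle counts reported by the tool, per the text above.
latencies = {"L1": 3, "L2": 18, "L3": 65}

# Assumed clock speeds; the real frequency during the test is unknown.
for freq_ghz in (3.6, 4.2):
    for level, cycles in latencies.items():
        ns = cycles_to_ns(cycles, freq_ghz)
        print(f"{level} at {freq_ghz}GHz: {ns:.2f} ns")
```

The same cycle count maps to a noticeably shorter time at the higher clock, so a wrong guess about Turbo behavior shifts any latency figure expressed in time.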