AVX performance
None of the benchmarks you've seen on the preceding pages make use of AVX instructions, with the exception of the AES subset used in TrueCrypt. At this point in time, finding applications or benchmarks that make use of AVX isn't easy. Fortunately, I was able to find several, and at least one program looks to be reasonably well optimized for Bulldozer: AIDA64 from FinalWire. AIDA64 includes several small, synthetic tests that can be accelerated with Bulldozer's new instructions.

To measure the impact of those new instructions on performance, I tested both the FX-8150 and the Core i5-2500K in two configurations: with and without Windows 7 SP1 installed. Service Pack 1 is required for AVX support, so testing without it has the effect of disabling AVX. The results marked "No AVX" below are those without SP1.
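For readers wondering why an OS update can switch AVX on or off, here is a minimal sketch of the check that software is generally expected to perform before using AVX. The bit positions come from the published CPUID and XCR0 definitions; the function name `avx_usable` and the idea of passing in raw register values are our own illustrative assumptions, not anything taken from AIDA64 or Sandra.

```python
# Bit positions as documented for CPUID leaf 1 (ECX) and the XCR0 register.
CPUID_1_ECX_OSXSAVE = 1 << 27  # OS has enabled XSAVE and XGETBV
CPUID_1_ECX_AVX = 1 << 28      # CPU implements AVX
XCR0_SSE_STATE = 1 << 1        # OS saves and restores XMM registers
XCR0_AVX_STATE = 1 << 2        # OS saves and restores upper YMM halves


def avx_usable(cpuid_1_ecx: int, xcr0: int) -> bool:
    """Hypothetical helper: True only if the CPU and the OS both support AVX.

    cpuid_1_ecx is the ECX value returned by CPUID leaf 1; xcr0 is the
    value XGETBV would return for register 0 (ignored if OSXSAVE is clear).
    """
    cpu_has_avx = bool(cpuid_1_ecx & CPUID_1_ECX_AVX)
    os_uses_xsave = bool(cpuid_1_ecx & CPUID_1_ECX_OSXSAVE)
    if not (cpu_has_avx and os_uses_xsave):
        return False
    # The OS must manage both the XMM and the YMM register state.
    needed = XCR0_SSE_STATE | XCR0_AVX_STATE
    return (xcr0 & needed) == needed
```

An AVX-capable chip running an OS that does not manage the YMM state fails this check, so well-behaved software falls back to its older code path. That fallback is presumably what the "No AVX" results capture.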

The first test, CPU Hash, uses AMD's XOP instructions on Bulldozer. The next two, FPU Julia and FPU Mandel, make use of FMA4. Tamas Miklos of FinalWire, maker of AIDA64, tells us these benchmarks were developed using pre-release Bulldozer systems. He further says:

Our code is 100% optimized for Bulldozer. We don't see much room for improvement. We've had a chance to talk to AMD, and we've explained (in details) how our benchmarks work, and what tricks we use on Bulldozer. They seemed to be content about what we do and how we do on Bulldozer, and they didn't tell us any hints on possible improvements.

So we should have a reasonable opportunity to see Bulldozer's full potential with the latest instructions.

Here's another one of those occasional instances where the Phenom II X6 was already faster than Sandy Bridge, and again, the FX-8150 is a little faster still. Looks like there's a roughly 10% gain with AVX/XOP instructions in use.

The two tests above make use of FMA4, and these really aren't the sort of results we were anticipating. In both cases, the Phenom II X6 1100T is faster than the FX-8150 with AVX and FMA4 enabled. Hrmph.

We do have another round of AVX tests, from the latest version of SiSoft's Sandra. We don't have any word on how well these tests are optimized for Bulldozer or whether they use XOP and FMA4. They do appear to make use of AVX on the FX-8150, though.

Although these quick benchmarks are labeled "Multimedia" in Sandra, in truth they're simply fractal computations like the AIDA64 Julia and Mandel tests.
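To give a sense of what a fractal test asks of the FPU, here is a rough sketch of a textbook escape-time Mandelbrot loop. It is not the code from AIDA64 or Sandra, and the function name and the `max_iter` default are our own assumptions. The thing to notice is that each pass through the loop is a handful of multiplies feeding directly into adds, which is the pattern a fused multiply-add instruction is meant to speed up.

```python
def mandel_iterations(cr: float, ci: float, max_iter: int = 256) -> int:
    """Count iterations of z = z*z + c before |z| exceeds 2 (hypothetical helper).

    cr and ci are the real and imaginary parts of c. Returns max_iter
    if the point has not escaped by then.
    """
    zr = zi = 0.0
    for n in range(max_iter):
        # Escape test: |z|^2 > 4 is a multiply followed by a multiply-add.
        if zr * zr + zi * zi > 4.0:
            return n
        # z = z*z + c, split into real and imaginary parts.
        # Each new component ends in a multiply-add.
        zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
    return max_iter
```

A real benchmark would run this sort of loop over many points at once using vector registers, so it leans on both the wider AVX data paths and the multiply-add units.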

Well, Bulldozer looks great in the integer test, but the FPU results don't look much different than they did in AIDA64. At least the FX-8150 is faster than the Phenom II X6, I guess.

We're disheartened by these results, but AVX is still early in its life, so we're hesitant to draw any definitive lessons from them. Late in the review process, AMD did supply us with custom-built versions of the x264 video encoder that use XOP and FMA4. We'll have to try those out soon.