Because LuxMark uses OpenCL, we can use it to test both GPU and CPU performance, and even to compare performance across different processor types. OpenCL code is by nature parallelized and is compiled at run time, so it should adapt well to new instructions. For instance, Intel and AMD both offer installable client drivers (ICDs) for OpenCL on x86 processors, and both support AVX. The AMD APP driver even supports Bulldozer's and Piledriver's distinctive instructions, FMA4 and XOP. We've used the Intel ICD on the Intel processors and the AMD ICD on the AMD chips, since that was the fastest config in each case.
We'll start with CPU-only results.
So, two things. First, the results for the 4950HQ are not a fluke. I had some worry that the Iris Pro graphics drivers might have installed a different version of the Intel OpenCL ICD that was responsible for the 4950HQ's victory, but that's not so. The 4950HQ is also faster than the 4770K when using AMD's APP driver. Looks like that L4 cache is finally showing us some potential.
Second, it appears Intel's OpenCL ICD doesn't yet support FMA on Haswell. I'd expect that instruction to come in very handy here. Perhaps a future update will correct that oversight.
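For readers unfamiliar with FMA, here's a rough sketch of the arithmetic a fused multiply-add performs: it computes a*b + c as one operation with a single rounding step, rather than a multiply followed by a separate add. This is only an illustration in Python, not anything LuxMark or either OpenCL driver actually runs. The helper name `fma_exact` and the input values are made up for the example, and the fused result is emulated with exact rational arithmetic rather than the hardware instruction.

```python
from fractions import Fraction

def fma_exact(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add (hypothetical helper).

    The product and sum are computed exactly as rationals, then
    rounded to a double once, which is what a hardware FMA does.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

# Inputs chosen so the exact product, 1 - 2**-60, cannot be held in a double.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Separate multiply and add: the product rounds to 1.0 before the add.
unfused = a * b + c

# Fused: one rounding at the very end keeps the tiny remainder.
fused = fma_exact(a, b, c)

print("multiply then add:", unfused)
print("fused multiply-add:", fused)
```

On hardware with FMA support, the fused form is also a single instruction instead of two, which is why it should help in FLOPS-heavy code like this.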
Now we'll see how a Radeon HD 7950 performs when driven by each of these CPUs.
It's hard to beat a modern GPU for this sort of FLOPS-intensive work. I'd sure like to see a Haswell with proper FMA support make a run at it, though.
We can try combining the CPU and GPU computing power by asking both processor types to work on the same problem at once.
Now, let's pull the discrete GPU out of the test systems and see how their IGPs perform in OpenCL.
AMD has based its sales pitch for APUs on converged computing and OpenCL acceleration. Looks to me like Intel isn't willing to cede any ground to its competitor here. Using its eDRAM cache, the Iris Pro 5200 IGP nearly triples the performance of the A10's Radeon IGP. Even the scaled-back HD Graphics configs in the 3770K and 4770K outperform the A10's integrated graphics.
The Cinebench benchmark is based on Maxon's Cinema 4D rendering engine. It's multithreaded and comes with a 64-bit executable. This test runs with just a single thread and then with as many threads as CPU cores (or threads, in CPUs with multiple hardware threads per core) are available.
STARS Euler3d computational fluid dynamics
Euler3D tackles the difficult problem of simulating fluid dynamics. Like MyriMatch, it tends to be very memory-bandwidth intensive. You can read more about it right here.
At the very end of our regular suite of benchmarks, we have a bright and shining example of how the 4950HQ's eDRAM cache can make a difference. True to expectations, it's good for computational fluid dynamics. OK, so maybe that's not the best justification for bringing a GT3e config to a socketed CPU, but I'd still like to see that happen.