Here is a link to a detailed and very informative article benchmarking the A9X with workloads other than the "magical" Geekbench: http://www.pcworld.com/article/3006268/ ... aptop.html
I won't repost every CPU benchmark, but PCWorld's findings mirror my own experience after actually using an admittedly first-generation, non-optimized Core M for over a year now: while the Core M is by no means a performance powerhouse as far as Intel CPUs go, Apple needs to at bare minimum double the performance of the A9X's CPU cores before it can honestly claim to beat a 2014-era Core M in any meaningful way. The gulf in graphics power isn't as large, but Apple's favorable GPU benchmarks fall apart when apples-to-apples tests are run and Apple can't lean on low-precision 16-bit GPU paths that make its GPU look better than it actually is.
Chipworks has released a die shot, and what they found was rather interesting: it is only a dual-core design, just like the Core M. The general expectation before launch was a triple- or quad-core design to leapfrog its A8X predecessor. The A9X is faster, but the lack of additional cores makes the performance jump rather small. Increasing the core count would help put the system on par with the Intel platforms in multithreaded workloads, and considering the size of a single core on the A9X die, going quad-core wouldn't be too difficult.
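To make the multithreading point concrete, here's a minimal Python sketch (illustrative only; it assumes a multi-core host and doesn't control for clocks or thermals the way a real benchmark would) showing how wall time for a fixed CPU-bound workload falls as worker count rises, which is exactly where an extra pair of cores would pay off:

```python
# Rough sketch: how a fixed, CPU-bound workload scales with worker count.
import time
from multiprocessing import Pool

def burn(n):
    # Integer-heavy loop to keep one core busy.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    work = [5_000_000] * 8  # eight equal chunks of work
    for workers in (1, 2, 4):
        start = time.perf_counter()
        with Pool(workers) as pool:
            pool.map(burn, work)
        print(f"{workers} worker(s): {time.perf_counter() - start:.2f}s")
```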
Furthermore, the shared L3 cache for the entire SoC has been removed.
Apple does have rather large L2 caches for its CPU cores, so this isn't as big an issue on the CPU side, but it could easily hurt the performance of the other SoC components that used that cache. It would be interesting to see whether any of the other logic blocks on the SoC grew their own exclusive caches to compensate, compared to the A8X.
Regardless, Apple appears to be on the level of a Sandy Bridge chip without SMT, clock-for-clock and core-per-core. That is still respectable today, especially in the context of its power consumption. However, it isn't the über-chip Geekbench likes to make it out to be. I also have a few minor issues with that PCWorld article. The zip test really needed to explore how much acceleration the A9X got from coprocessors: AES encryption/decryption isn't handled by the CPU core itself but by a dedicated crypto unit, which is one of the reasons ARM cores in general can post extremely high crypto scores despite otherwise weak CPU scores. The crypto unit can handle compression/decompression work too. A more interesting comparison would use compression and encryption algorithms that aren't accelerated on either platform.
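As a rough illustration of that last point, here's a minimal Python sketch (it assumes the third-party `cryptography` package; buffer sizes are just for illustration). AES-GCM typically hits dedicated AES hardware on both x86 and ARMv8, while ChaCha20 has no dedicated crypto instruction on either platform, so the second number is closer to an apples-to-apples software comparison:

```python
# Sketch: accelerated cipher (AES-GCM, hardware AES on most x86/ARMv8
# chips) vs. ChaCha20-Poly1305, which has no dedicated instruction on either.
import os, time
from cryptography.hazmat.primitives.ciphers.aead import AESGCM, ChaCha20Poly1305

data = os.urandom(32 * 1024 * 1024)  # 32 MiB test buffer
nonce = os.urandom(12)

for name, cipher in [("AES-256-GCM", AESGCM(os.urandom(32))),
                     ("ChaCha20-Poly1305", ChaCha20Poly1305(os.urandom(32)))]:
    start = time.perf_counter()
    cipher.encrypt(nonce, data, None)
    elapsed = time.perf_counter() - start
    print(f"{name}: {len(data) / elapsed / 2**20:.0f} MiB/s")
```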
I will note that much of the vaunted GPU power of the A9X appears to come from using lower-precision FP16 math, trading image quality for higher performance. Once those fast code paths are removed by professional-grade benchmarking software, even the "incompetent" 2014 Broadwell Core M GPU effectively ties the A9X's GPU, and Skylake flat-out destroys it: http://images.techhive.com/images/artic ... -large.png
While 16-bit precision is common in mobile, since it uses less power and image quality is less of a concern there, Intel is adding it to its Kaby Lake GPUs. They'll be on equal footing in a year, and the comparison with the A10X will be very interesting.
Nvidia will also be doing some optimization around 16-bit floats next year with Pascal.
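To show what that precision trade-off actually costs, here's a tiny NumPy sketch (NumPy's float16 stands in for GPU FP16 here): with only a 10-bit mantissa, FP16 silently drops small contributions that FP32 keeps, which is the kind of error that surfaces as banding or crushed detail in rendered output:

```python
import numpy as np

# FP16 has a 10-bit mantissa (~3 decimal digits); FP32 has 23 bits.
base, delta = 1.0, 1e-4  # think of delta as a small lighting term

print(np.float32(base) + np.float32(delta))  # 1.0001 -- preserved
print(np.float16(base) + np.float16(delta))  # 1.0    -- rounded away
```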
The article also includes a detailed discussion of our friend Geekbench, and judging by the spin its developers gave the author, I don't think Geekbench comes out looking like a proper benchmark.
I would argue that some of the subtests are fair and comparable across platforms. However, they also tend to fall into Linus's criticism of working sets that fit into the L2 cache, if not the L1.
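A crude way to see the effect Linus describes is to time the same number of random accesses against working sets that do and don't fit in cache. A minimal NumPy sketch (sizes are illustrative; real cache capacities vary by chip):

```python
import time
import numpy as np

rng = np.random.default_rng(0)
for size_kb in (64, 256, 8 * 1024, 128 * 1024):  # roughly L1/L2 up to DRAM
    n = size_kb * 1024 // 8                      # float64 elements
    data = np.zeros(n)
    idx = rng.integers(0, n, size=2_000_000)     # random access pattern
    start = time.perf_counter()
    data[idx].sum()                              # the gather forces real loads
    print(f"{size_kb:>7} KiB working set: {time.perf_counter() - start:.3f}s")
```

A benchmark whose working set stays on the small end of that sweep never touches main memory, which flatters chips with fast caches regardless of how their memory subsystems actually compare.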
I do think that PCWorld approached the problem with a more real-world focus but somewhat dropped the ball on making the tests equivalent across platforms. Granted, it isn't easy in iOS's walled-garden app ecosystem to find true equivalents, but that is where they should have started when choosing applications for performance testing, not the other way around.*
(*There is still value in finding Windows application equivalents on iOS, as that indicates what work a user can bring over from PC to tablet. However, that approach is better suited to a general device review than to the scientific performance testing their article was attempting.)
Once again, the A9X is a powerful tablet SoC, at least as far as ARM goes. At 147 mm², roughly 25 mm² larger than a full-bore desktop Skylake part like the 6700K, it darn well better be a strong tablet chip. However, sites like The Verge and Ars Technica that spend 15 minutes running Geekbench and then declare victory for Apple are *not* doing proper hardware performance reviews and can't be trusted as reliable sources of information.
In fairness, TSMC's 16 nm FinFET process has virtually the same density as its lamented 20 nm process, so a die larger than Intel's isn't too surprising. Intel still holds a manufacturing edge here.
The more interesting comparison is that Apple's SoC is larger than most other tablet SoCs as well.
With true 14 nm production more than a year away at TSMC, and 10 nm further out for all the players, I'm curious just how large a die Apple is willing to produce for mobile. Efficiency can still be improved, but any major performance leap will have to come from additional parallelism, and that means more transistors and larger dies.