CPU-Z changes benchmarking algorithm for accuracy with Ryzen

We don't use it here at TR, but the CPU-Z benchmark has served me well in my personal life as a figurative hand-to-the-forehead for poorly performing systems. Starting with the latest CPU-Z release, version 1.79, benchmark results will be quite different from those gathered using previous versions. The change was prompted by an issue with one of the benchmarking algorithms used by the software.

CPU-Z's developers say they saw unexpectedly excellent results on AMD's Ryzen CPUs that ran counter to the new processors' real-world performance. After a thorough investigation, the team discovered that Ryzen CPUs were executing a certain sequence of integer instructions in a way that avoided an intentional delay, producing improperly inflated benchmark numbers.

Ordinarily, that kind of automatic optimization would be welcome, but upon further investigation, the CPU-Z team failed to replicate that behavior with Ryzen CPUs in real-world situations. Furthermore, the team says that due to the extreme unlikelihood of that specific sequence of instructions showing up in non-benchmark software, it felt it would be best to revise CPU-Z to reflect real-world results more accurately.

The CPU-Z team says the new benchmark computes a two-dimensional noise function in a way that a game might use to generate procedural data. The benchmark is written in C++ and uses SSE2 instructions in the 64-bit version of the app. The 32-bit version soldiers on with the legacy x87 instructions. If you'd like to see how your chip measures up, you can grab CPU-Z 1.79 at the program's download page.
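
For readers unfamiliar with the term, a two-dimensional noise function maps a coordinate pair to a repeatable pseudo-random value. The sketch below is a generic value-noise routine of the sort a game might use for procedural terrain. It is not CPU-Z's benchmark code, which has not been published; the names `hash2d`, `smooth`, `blend`, and `valueNoise2d` and the mixing constants are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical lattice hash: maps integer grid coordinates to a repeatable
// value in [0, 1]. The multipliers are arbitrary odd mixing constants.
double hash2d(std::uint32_t x, std::uint32_t y) {
    std::uint32_t h = x * 374761393u + y * 668265263u;
    h = (h ^ (h >> 13)) * 1274126177u;
    h ^= h >> 16;
    return static_cast<double>(h) / 4294967295.0;
}

// Smoothstep curve so the interpolation has zero slope at grid points.
double smooth(double t) { return t * t * (3.0 - 2.0 * t); }

// Linear interpolation between a and b.
double blend(double a, double b, double t) { return a + (b - a) * t; }

// Value noise: hash the four surrounding grid corners and blend them.
// Assumes finite coordinates that fit in a 64-bit integer after flooring.
double valueNoise2d(double x, double y) {
    assert(std::isfinite(x) && std::isfinite(y));
    const double fx = std::floor(x);
    const double fy = std::floor(y);
    // Go through int64 so negative coordinates wrap predictably to uint32.
    const auto ix = static_cast<std::uint32_t>(static_cast<std::int64_t>(fx));
    const auto iy = static_cast<std::uint32_t>(static_cast<std::int64_t>(fy));
    const double tx = smooth(x - fx);
    const double ty = smooth(y - fy);
    const double top = blend(hash2d(ix, iy), hash2d(ix + 1u, iy), tx);
    const double bottom = blend(hash2d(ix, iy + 1u), hash2d(ix + 1u, iy + 1u), tx);
    return blend(top, bottom, ty);
}
```

A benchmark built on something like this would call `valueNoise2d` over a grid of coordinates and keep the results. On a 64-bit x86 build, compilers emit SSE2 scalar instructions for this kind of `double` arithmetic by default, which is consistent with the article's description of the 64-bit version.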

0 responses to “CPU-Z changes benchmarking algorithm for accuracy with Ryzen”

  1. Ya the “usual crowd” is disappearing, me included, and he is one of the reasons. Some of his comments/reactions make the youtube comment section seem more civil.

    Brings back memories of snakeoil and long banned others……………

  2. I have been reading TR for a while now as my other Tech sites are so inundated with video ads in the margins. I have kept quiet until now as I didn’t see what all the fuss was…

    I simply buy the best bang for the buck part best suited to my needs. Whether it’s AMD vs Intel, or AMD/(ATI) vs nVidia. Yes sometimes the software or games I use have made a difference in the purchase decision, but usually I am building for the long haul, so what I am using now isn’t the end all. I usually end up buying right in the middle of the family line, be it CPU or GPU, unless there is a great sale going on or a well priced part totally outperforms its price point. As soon as the dollar to performance graph begins to taper off, that’s where I buy. I have gone back and forth between the big brands over the years, especially the GPU. Yes, the last few years the CPU purchases have been Intel, but I have bought AMD in the past and if Ryzen is as good as they claim in the midrange price point, I could have a choice to make. You are really only stuck with your choice if your MoBo is still worth keeping, then you are mated to its socket type. Otherwise, you owe Intel or AMD nothing.

    When it comes to electronics, it’s what have you done for me/the market lately. And I don’t care about the server market at all, the IT guys can worry about that. You could suck at Server, but if you make a great Desktop CPU, I’m sold. (Yes I know the CPU Mfrs. need Server sales to sustain profits).

    In the end I want AMD (both GPU and CPU), Intel, and nVidia to be competitive all the time because then we the consumer benefit. Because then they will be motivated to use competitive pricing.

  3. 2 Things:

    I can’t believe I spent 15 minutes of my only day off in 2+ weeks reading through all these comments and somehow found it relaxing.

    Second, I can’t believe that people have such an emotionally vested interest in the products one company makes over another, to the extent that one relatively minor insignificant piece of software, that may or may not have contained a “cheat” aimed at one or another cpu, generates 63+ comments, some very long, arguing over the merits of one cpu vs another.

    Can we just agree on the following:

    AMD’s Ryzen is a significant improvement over the previous offerings.

    Ryzen CPUs are not as good as comparable Intel CPUs, if price were no factor.

    Ryzen CPU desirability needs the qualifier “for the money”, i.e. Ryzen R5s are great FOR THE MONEY.

    It will be interesting to see what happens when/if Intel releases a mainstream 6-core processor.

  4. More like synthetic benchmarks are stupid, IMO. If we want to know how device X performs for real, then it’s better to use a real application.

  5. Then don’t cite irrelevant benchmark cheats from over a decade ago. It’s just as relevant, and equally annoying. Not directing that at you, because you were explaining something already brought up, just explaining my justification for using that quote.

    Also, knock off the social justice cop-out crap. I don’t give unnecessary respect to, or worship genocide or holocaust events. They are what they are. Tiptoeing around them just causes people to: A: worship them religiously, B: forget. Nobody’s advocating for genocide here, don’t be a jerk who tries to insinuate I’m associating genocide with a benchmark. I’m not. Obviously. The concept is relevant, not mass murder. JESUS CHRIST, WHY WOULD YOU EVEN IMPLY THAT. YOU MONSTER! OH THE HUMANITY!

    The saying has a relevant point. You’re essentially missing the forest for the trees. (there, happy?) None of this discussion about decade old benchmark cheating is relevant to today’s news.

  6. Please do not cite a quote inspired by genocide to describe habitual performance optimizations for computer games. Jesus Christ, man.

  7. A Single Death is a Tragedy; a Million Deaths is a Statistic.

    The thing is about quack3, is that it was one game which was massively blown out of proportion and fixed afterward. Nvidia did this with almost every game on the FX, and denied doing it, or called it optimization. They haven’t stopped cheating either, as they’ve evolved regular cheating into blatant sabotage via Gameworks.

    Nvidia’s FX cheating was so widespread that ATI eventually implemented a “feature” called “AI” that copied it to a lesser degree, and was more upfront about it. I think this set back general performance as well, as ATI started to put more effort into “AI” than general optimization, which led to some of ATI’s driver reputation. The x1900 was a beast though, and ATI dropped most of the questionable hacks by that point.

    I just think it’s rather hypocritical to bring up one pointless game in the era of such rampant cheating that it’s been professionalized, and alternative APIs are necessary to work around the cruft. Especially when this is a CPU article, and is most likely an optimization instead of a cheat. How did we even get here? Oh yeah. chuckula.

  8. I think Intel gets some blame for the lack of AVX in the CPU-Z benchmark. It won’t run on Pentiums or Celeries. But Intel doesn’t get any blame for whatever AMD did to circumvent it.

  9. It ultimately depends on what the benchmark is, and what it measures. For a benchmark designed for the cutting edge, an AVX(2) component would be welcome, but there are tons of CPUs still being made today without any support for it. CPU-Z’s 32-bit version flogs x87 instructions because a hardware FPU is assumed for all Win32-capable processors manufactured in the last 20 years. This results in a quantity that can be compared to other processors regardless of age, to make an educated guess where they all line up to one another in a speed hierarchy. The 64-bit version flogs SSE2 because it’s a baseline part of the x86_64 specification intended to replace the old FPU stack, and thus appropriate for comparing different processor families. Whenever we roll around to x86_128, we can probably assume AVX2 as a baseline feature there. But that’s a ways off.

  10. It’s funny, I actually ran a 2.8 GHz Northwood P4 with hyperthreading and a 5900XT as a Linux box for years. I’d describe that combo as good within its limitations, but easy to push too far. Thanks for the link!

  11. Nah. CPU-Z doesn’t need to be modified because it inflates Ryzen’s numbers, every app needs to be optimized to reflect what CPU-Z was reporting. 🙂

  12. It sounds like the code was a weird benchmarking corner case that actually did nothing, and RyZen (correctly) optimized it out. This is a smart move on AMD’s part.

    CPU-Z was also correct to change their benchmarking algorithm, since how good a CPU is at eliding superfluous instructions is not something you really want a general-purpose benchmark to focus on.

    Nothing to see here, move along.

  13. Trying not to be partisan at all — the opposite in fact.

    There’s a tendency on many forums to post arguments along the lines of “it doesn’t matter if AMD does something bad ‘cus Intel and NVidia did these other bad things.” It’s basically the high-tech equivalent of “he started it!!”

    I’m not particularly attached to any of the companies — it’s the argument style that I find annoying.

  14. @Redocbew
    If game devs would not totally fail at even the most basic software optimizations, we would not need synthetics.

    A top game engine nowadays can run HD-maxed on five-year-old hardware. That’s if you can properly program it.

  15. In translation: Ryzen was smarter than CPU-Z’s intentional cheating and that had to be corrected.

  16. Supporting AVX2 is a pain in the arse. If you’re writing desktop software, you have a few options:

    1) Make AVX2 a minimum requirement for your software. Great, except you lock out 95% of the market, between the pre-Haswell Intel systems, Atoms, Pentiums and Celerons that still don’t support it, and pre-Carrizo AMD systems.

    2) Manually write two versions of every library that you want to use AVX2. In the Microsoft C++ compiler you have to choose a single CPU feature level for each library you compile, meaning you get either SSE2 compatibility *or* AVX2 compatibility. You can manually duplicate your libraries and select between them at runtime, but it's a hassle.

    3) Pay for an Intel compiler license (on top of the Visual Studio license you also need), and use its auto-dispatch mechanism to compile your libraries for multiple different instruction sets and have it deal with the runtime selection of versions.

    1 is basically a no-go, 2 is a pain to maintain, and 3 is expensive. Choose your poison.
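
Option 2 in the comment above, manual runtime selection, can be sketched roughly as follows. This is a minimal illustration under stated assumptions: `sumBaseline`, `sumAvx2`, `cpuHasAvx2`, and `selectSum` are hypothetical names, the "AVX2" function is a scalar stand-in so the example builds on any machine, and the feature check relies on the GCC/Clang built-in `__builtin_cpu_supports` rather than anything MSVC-specific.

```cpp
#include <cassert>
#include <vector>

// Baseline path: what you would ship for every x86-64 CPU (SSE2 level).
float sumBaseline(const std::vector<float>& values) {
    float total = 0.0f;
    for (float v : values) total += v;
    return total;
}

// Stand-in for the AVX2 path. In a real project this would live in its own
// library or translation unit built with /arch:AVX2 (MSVC) or -mavx2
// (GCC/Clang); here it is plain scalar code so the sketch builds anywhere.
float sumAvx2(const std::vector<float>& values) {
    float total = 0.0f;
    for (float v : values) total += v;
    return total;
}

// Hypothetical feature check. Only GCC and Clang on x86 are covered; other
// toolchains would query CPUID themselves, which is left out of this sketch.
bool cpuHasAvx2() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

using SumFn = float (*)(const std::vector<float>&);

// Pick the implementation once at startup; callers go through the pointer.
SumFn selectSum() {
    SumFn chosen = cpuHasAvx2() ? &sumAvx2 : &sumBaseline;
    assert(chosen != nullptr);
    return chosen;
}
```

A program would call `selectSum()` once and reuse the returned pointer. The maintenance burden the commenter mentions comes from keeping every duplicated routine in sync, which this toy example hides because both versions are identical.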

  17. The main problem seems to be that CPU-Z was using “busy-work” in its benchmark, and Ryzen’s scheduler is clever enough to optimize the busy-work out. This is not unique to Ryzen, as many processors over the years have made these poorly written benchmarks look foolish.

  18. So, CPU-Z changes code that functions perfectly well on the Ryzen to make sure Intel stays on top? We can’t let AMD win this benchmark, can we?

    Ridiculous. Especially the negative publicity about cheating. I could understand this attitude if the Ryzen produced FALSE results.

  19. GeForce FX was a highly unusual case. It reminded me a bit of NetBurst. There’s a nice article delving deep into the architecture:

  20. In theory it could be intentional, by having a management core monitor the instruction stream for a specific and rare group of instructions. Not sure how feasible it is, because it would require quite a bit more complexity in the management core and pipeline. The cost/benefit isn’t there IMO, unlike with GPUs. Such complexity would likely be quite expensive, and that budget could have been used for full 256-bit cores instead of cheating on a benchmark.

    TL;DR: It could be done, but it’s highly unlikely.

  21. I’m thinking the silent majority have probably picked up on that. It’s not the sort of thing which is done here in general unless there’s a pretty good case for it.

  22. I hope they do a ‘most downvoted’ giveaway at the next TR giveaway mate, and you end up with a Coffee Lake combo and a big SSD or something equally outrageous for your efforts.

    Solid A+


  23. It’s called the Straw Man argument and it’s about all he has to work with.

  24. Yeah, as I said in another reply, I didn’t mean to imply any nefarious behavior on the part of anyone. I think it was just that Ryzen wasn’t executing the benchmark in the way that CPU-Z intended and as a result it was giving abnormally good numbers. A bug, not a hack/trick or optimization.

  25. Yeah, when I wrote the piece I didn’t intend to insinuate any nefarious intent on the part of AMD *or* the CPU-Z guys. Just a coincidence as far as I'm concerned.

  26. I don’t normally get into this sort of petty partisanship, but I must admit your articulation of this situation gave me quite the chuckle.

  27. In fact, Transmeta’s old Crusoe processors could do exactly this and bypass that extraneous work to produce higher benchmark results. It wasn’t doing any sort of application detection; rather, it just detected that the benchmark itself wasn’t doing any meaningful work and completely skipped a test. I wish I could remember the benchmark’s name, as its results were roughly a factor of 10 off with respect to the Crusoe and they too had to rewrite their code to better reflect real-world performance.

  28. “AMD got caught with their hands in the cookie jar because there *was* a way in the AMD code to detect that it was a bench marking program because it's the kind of code that only bench marking programs use.”

    And what *was* the way they detected the code? That'd be pretty strong evidence that they had their hand in the cookie jar. CPU-Z didn't even say this. What CPU-Z said is that the instruction stream used ran exceptionally well on Ryzen but that mix of instructions is not representative of real world workloads, so they changed it. Basically it sounds like CPU-Z used instructions that fell into the edge case scenario and AMD designed Ryzen to handle those edge cases rather well. (The definitive answer on this point would require CPU-Z to release that instruction stream, which is something I'm curious to see.)

  29. I seriously doubt they were detecting this specific benchmark. The benchmark was probably doing some calculations, stashing the results in machine registers, then overwriting them with new results without using them or storing them to RAM. A sufficiently clever optimizer in the CPU could notice that those instructions have no net effect, and decide to skip them entirely. There’s already a lot of dataflow analysis going on in a modern CPU (to implement out-of-order execution), so it’s not really a stretch to do something like this.

    Compilers sometimes generate crappy code (especially when fed crappy source code as input), so doing optimizations like this at the hardware level probably has some value in the real world.

    IOW, no nefarious intent required.
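
The pattern this comment guesses at (results computed, then overwritten without ever being used) is easy to picture at the source level. The sketch below is only an analogy: CPU-Z has not published the instruction sequence, and whether Ryzen skipped work in this way is the commenter's speculation. `busyWorkDiscarded` and `busyWorkConsumed` are hypothetical names, and the multiplier is an arbitrary odd constant.

```cpp
#include <cassert>
#include <cstdint>

// Busy-work whose result is thrown away. Each iteration overwrites `scratch`
// and nothing ever reads the final value, so an optimizer is free to drop
// the whole loop without changing what the program observably does.
void busyWorkDiscarded(std::uint32_t iterations) {
    std::uint32_t scratch = 0;
    for (std::uint32_t i = 0; i < iterations; ++i) {
        scratch = i * 2654435761u;  // overwritten next iteration, never read
    }
    (void)scratch;  // silences the unused-variable warning; produces no output
}

// The same amount of arithmetic, but every iteration feeds a running
// checksum that is returned to the caller, so each step depends on the
// previous one and the final value is observable.
std::uint32_t busyWorkConsumed(std::uint32_t iterations) {
    assert(iterations > 0);
    std::uint32_t checksum = 0;
    for (std::uint32_t i = 0; i < iterations; ++i) {
        checksum = (checksum ^ i) * 2654435761u;  // unsigned math wraps mod 2^32
    }
    return checksum;
}
```

A benchmark written like the second function, which also prints or stores the returned checksum, gives neither the compiler nor the hardware a shortcut, since the final value depends on every iteration.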

  30. It is tangential but there is a point that benchmarks should have all moved over to AVX which is true. Artificial market segmentation sucks as Intel thinks you should pay more to use AVX despite users buying hardware capable of executing it (*cough* Celerons and Pentiums *cough*).

  31. If those chaps who develop CPU-Z would be kind enough to release the combination of instructions, an investigation could be done by someone.

    I find it very unlikely AMD did this to cheat CPU-Z because the CPU-Z code is written in C++. Any compiler updates/changes could potentially change the instructions generated and avoid the accelerated instructions.

    AMD would have to check every version of CPU-Z to make sure the compiler version wasn’t changed and issue microcode updates if it was. I don’t think AMD has the manpower or money for that, especially considering how unimportant CPU-Z is.

  32. Yes, CPU-Z is not particularly useful; instead, have a look at all the other benchmarks where Ryzen beats Intel’s best 8-core processor.

  33. What about all those other benchmarks where the 8-core Ryzen beat the 6900k, are you going to demand an immediate and thorough investigation, or do you want to ignore those?

  34. CPU microcode is pretty analogous to driver code, and I expect it could make a good guess at what program it’s running (though I’m still skeptical that that’s what happened here).

  35. Irritating, isn’t it? He’s been called out multiple times for that behavior by several regular posters here. He consistently takes preemptive swings at his “usual” fanboys, as if he wants to invite them in and bring the entire discussion down to that level.

  36. Ryzen’s a CPU. There is no driver code that detects CPUZ to cheat a benchmark. It’s a hardware optimization that runs certain instructions faster than normal. CPUZ just happened to be using code that matched the optimization. Any application that takes advantage of this should run faster on Ryzen, not just CPU-Z.

  37. Oh, I got it all — I just think teeing off on Intel for market segmentation seems way too tangential to the actual issue.

    I completely agree that it’s beer’o’clock, though. Cheers!

  38. No, that’s not what he’s saying. CPU-Z had an oddity in their SSE2-centric x86_64 benchmark: Ryzen appears to have executed the instructions so fast it counts as working above its weight class in other typical speed metrics, and CPU-Z’s revised the benchmark to correct for that. Chuckula’s bitching that AVX should have been used because it’s newer (which it is, by a lot) even though Intel’s lower end kit lacks the instruction set entirely for product segmentation purposes and AMD’s pre-Zen silicon is at a substantial disadvantage when running AVX. This means that while an AVX benchmark would be useful for determining how different silicon handles AVX code, it wouldn’t run at all on large swathes of existing Intel hardware and would make AMD’s old kit look even worse than it would otherwise.

    So *the* is complaining that Intel's using AVX as a market segmentation tool, chuckula's gone full AVX über alles and ranting about grand conspiracies and engineering money being blown on a benchmark utility about as useful in the scheme of things as Linux kernel bogomips, everyone's screaming at each other, and I think I want a beer. Does that clear anything up?

  39. “Ordinarily, that kind of automatic optimization would be welcome, but upon further investigation, the CPU-Z team failed to replicate that behavior with Ryzen CPUs in real-world situations. Furthermore, the team says that due to the extreme unlikelihood of that specific sequence of instructions showing up in non-benchmark software, it felt it would be best to revise CPU-Z to reflect real-world results more accurately.”

    So... the part where CPU-Z couldn't replicate the speed up in real world conditions and that the pattern of code is almost completely confined to bench marking software means what to you exactly? Because to people that read the article, it seems to completely counter your entire argument. AMD got caught with their hands in the cookie jar because there *was* a way in the AMD code to detect that it was a bench marking program because it's the kind of code that only bench marking programs use. That it didn't need to detect this specific benchmark is not relevant to the fact it was detecting code used almost exclusively in various bench marking software and the acceleration was not able to be replicated under other circumstances, at least according to CPU-Z. Now if you have some argument why CPU-Z is not credible in its assertions, I'm all ears, but that is a completely different argument than you are currently making and have made across the thread.

  40. Let me get this straight: CPU-Z had an oddity in their benchmark that made RyZen look much better than it does anywhere else, and the bad guy in the story is… Intel?


  41. Intel loves to drag their feet in making certain features standard across its line up as it prevents further market segmentation. I’m halfway convinced that SSE2 is only standardized in Intel’s line up as it is a prerequisite for x86-64 support. As weird as it sounds, Intel was still shipping some 32 bit only Atom chips in 2014.

    Intel should bite the bullet and make more things standard for the greater good of their platform. Software extension fragmentation sucks.

    I may be a lowly individual without direct say in the matter but software developers *do* have weight. Intel considered releasing their own variant of 64 bit extensions but MS killed that idea by stating they'd only support a single x86-64 platform and were already working on porting Windows to AMD's design. We need more companies to call Intel out on their intentional platform fragmentation.

  42. I think it was performing culling a bit too aggressively and leading to some odd failure cases, if this elderly article's anything to go by.

  43. That was Nvidia’s GeforceFX line, and there was a whole LOT of very aggressive driver optimization going on there. Much of this was at least a little understandable, as the architecture was wildly unsuited to all but the least demanding Direct3D 9 titles. I never had much trouble with them in Linux; there (and for OpenGL in general) they were a mild to moderate upgrade over the preceding Geforce4 series at the same price points. That said, there was enough misbehavior and inconsistent driver behavior that it deserves to be called out as shenanigans. I still feel sorry for anyone who bought a GeforceFX 5950 over a Radeon 9800.

    ATi had their quack3.exe tribulations, too. They’ve done other gross stuff too, and they still come out smelling like roses compared to hijinks by early low tier GPU manufacturers like SiS, Trident, S3, and others so long ago. The quest for optimization and stopgap solutions until driver refactoring and reliable custom codepaths are user-transparent is built on a road *full* of potholes.

  44. And a certain sequence of instructions is also what caused Intel to nuke TSX from orbit. The real indicator of cheating is to find some sort of detection mechanism for the benchmark and determine if the same sequence of instructions is universally accelerated (i.e. not just for that benchmark). So far all that has been discovered is that CPU-Z used an instruction stream that ran really well on Ryzen but didn't reflect performance of Ryzen in the greater software ecosystem. Ryzen is a new architecture and anomalies like this can and do happen.

  45. I’m really curious what the specific sequence of instructions actually is. Agner's Optimization manual was just updated yesterday and it does appear that several instructions do execute remarkably faster on Ryzen than modern Intel chips. This of course is not universal for all instructions. This could easily be a case of that benchmark fitting into a well optimized case. If it was something really nefarious on AMD's end, there would be some sort of detection for this benchmark in play if it wouldn't optimize a similar instruction sequence in the same way (see Quack 3.exe).

  46. 1. Go ahead. Fling your powerless internet points at me, at other posters, and into the wind until someone goads you into having a stroke because *someone said something mean about Intel,* and see how much good they do you then.

    2. Six years old, and still nowhere near ubiquitous. If AVX were vital, Intel would probably offer it in the Celerons and Pentiums they sell by the jillions, yet it mostly shows up in HPC applications and some multimedia encoders. You might as well argue that CPU-Z should incorporate AES-NI and TSX instructions because they're vitally important for tiny niche markets, everyone should therefore have them, and we should hold the world's biggest and most insanely toxic weenie roast in history to burn all the hardware that doesn't support them into ashes. I wouldn't really care what you do, because most of what you do here is scream for or at multinational companies that hold you in less regard than a farmer grants the life of a single spider in his wheat field.

  47. Anyone remember the old days of benchmark cheating? I can’t remember if it was ATI or Nvidia, but one of them had a driver that would mess with 3Dmark (2001 I think) so that it wouldn’t render objects that fell outside the visible path of the benchmark. The cheating wasn’t discovered until someone messed with the benchmark code to allow free-look and saw how everything vanished at a certain point but only with a certain driver. Talk about blatant cheating…

  48. “The usual crowd reads the headline and gets all excited…”

    Sounds like you have pretty clearly conceived ideas of what this “usual crowd” will think, when none of that crowd has apparently turned up yet.

    Lest you confuse me with someone who gives a damn too… I’ve stuck with Intel for all of my work machines, and regard Ryzen with vague professional curiosity.

    I’m just confused where you’re seeing this “usual crowd”, when the only person around here lately whose responses are guaranteed to follow a very predictable format is you.

  49. You obviously have a great understanding of machine code, performance and errors. :rolleyes:

  50. 1. You deserved it.

    2. AVX is a *SIX YEAR OLD* instruction set that's supported by both Intel and AMD chips that are generally considered obsolescent around here. I get a little tired of hearing people claim that there should be no testing of vitally important CPU features that have been sold in practically every major chip for over half a decade. How would you feel if I ran around screaming that no Vulkan/DX12 benchmarks should be run on modern video cards because the total number of modern high-end AMD and NVidia GPUs that have been sold with strong support for those APIs is a drop in the bucket compared to the last half-decade of consumer CPU shipments?

  51. It sounds like a good change for the CPU-Z team to make regardless of any possible shenanigans which helped discover it.

    The rest is meh, for me. Synthetic benchmarks have long since been a target for benchmark shenanigans, and while it’s always worthwhile to call that out it’s not anything new or even all that unexpected.

  52. What fanboyish nonsense? It looks like AMD got caught redhanded here.

    Incidentally, when the story came out about RyZen freezing when a certain highly specific sequence of integer operations was performed using the chip, it was considered no big deal because a microcode update could fix the issue for that specific instruction sequence.

    Well, it looks like microcode can “fix” “problems” for other instruction sequences too, like the CPU-Z micro-benchmark apparently.

  53. Seriously, SSE2’s everywhere* and is executed in a fashion across different architectures that’s broadly comparable. AVX is still a mostly niche set of instructions with widely varying performance characteristics; as TwistedKestrel said, for the purpose of comparing like to like and getting a feel for how a wide range of CPUs handles common tasks, it’s a safe bet.

    * Granted, somebody out there’s still got an Athlon MP box rumbling along as a server in a closet and may take offense to this, but I’m not worried about them.

    edit: Hey, what a shocker, chuckula’s magic middle finger of three downvotes strikes within five minutes of me posting this. Grow a thicker skin, go back to bed, or get the hell out, chuckie.

  54. Or, y’know… maybe you’re the one people are sat around eating their popcorn while watching.
    Ever think to wait until someone posts fanboyish nonsense *before* calling out everyone’s fanboyish nonsense?

  55. “The benchmark is written in C++ and uses SSE2 instructions in the 64-bit version of the app.”

    Well the CPU-Z benchmark was pretty much worthless before this update and if it's only using 16 year old SSE2 instructions from 2001 and couldn't even be bothered to use 6 year old AVX instructions from 2011, then it's at least consistently worthless.

  56. “After a thorough investigation, the team discovered that Ryzen CPUs were executing a certain sequence of integer instructions in a way that avoided an intentional delay, producing artificially-inflated benchmark numbers.”

    TIME TO GET THE POPCORN!

    [Funniest part about this story: The usual crowd reads the headline and gets all excited that RyZen's performance will magically triple now that the software is "optimized" for its "superior" architecture. Then they read the actual story (at least some of them).]