AMD’s Ryzen Threadripper 2990WX CPU reviewed

Just over a year ago, AMD’s Ryzen Threadripper CPUs delivered more cores to demanding users for less money than the competition, and that formula proved successful in re-establishing AMD as a player in high-end desktop systems. As a matter of fact, AMD says its 16-core, 32-thread Threadripper 1950X was its best-selling high-end desktop part. Despite the copious compute resources on offer from the top-end Threadripper, the company heard from customers who wanted even more from their high-end desktops.

AMD listened to those folks, and this morning, the company is unleashing even more multithreaded performance from the same X399 platform that underpinned first-generation Threadrippers in the form of the $1799 Ryzen Threadripper 2990WX. As we’ve known it would for some time, the 2990WX will let builders put 32 cores and 64 threads in any X399 motherboard with nothing more than a firmware update. AMD recently flew us all the way to Maranello, Italy, home of a little racing team called Scuderia Ferrari, to give us a look under the 2990WX’s hood and to make the point that this chip is one heck of a fast CPU.

| Model | Cores/threads | Base clock (GHz) | Peak boost clock (GHz) | L2 cache (MB) | L3 cache (MB) | TDP | Suggested price |
|---|---|---|---|---|---|---|---|
| Threadripper 2990WX | 32/64 | 3.0 | 4.2 | 16 | 64 | 250 W | $1799 |
| Threadripper 2970WX | 24/48 | | | 12 | 64 | | $1299 |
| Threadripper 2950X | 16/32 | 3.5 | 4.4 | 8 | 32 | 180 W | $899 |
| Threadripper 1950X | 16/32 | 3.4 | 4.2 | 8 | 32 | | $999 |
| Threadripper 2920X | 12/24 | 3.5 | 4.3 | 6 | 32 | | $649 |
| Threadripper 1920X | 12/24 | 3.5 | 4.2 | 6 | 32 | | $799 |
| Threadripper 1900X | 8/16 | 3.8 | 4.2 | 4 | 16 | | $549 |

Adding more cores and threads to a Threadripper would seem to invite scaling out other parts of its platform, as well, but AMD wanted to remain within the bounds of first-generation X399 motherboards to keep an upgrade path for owners of those boards intact—much as it plans to for Socket AM4 builders through 2020. As the company tells it, that meant no changes to the pinout of the TR4 socket and no new memory channels that would require a move to single-socket-server-like motherboards.

The Threadripper 2990WX soldiers on with the same quad-channel memory arrangement as the 16-core Threadripper 1950X—and thus the same potential memory bandwidth—despite having double the number of active cores under its heat spreader. Those constraints posed challenges that AMD had to work around as it figured out how to cram even more into the same socket.

The Infinity Fabric topology of the Threadripper 2950X (and 1950X). Source: AMD

The key to scaling out Threadripper, as with most AMD projects of late, is the use of the Infinity Fabric on-die and on-package interconnect. The Infinity Fabric let AMD join together two eight-core Ryzen dies (also known as Zeppelins) to form the first Threadrippers. Two-die Threadripper multi-chip modules (or MCMs) enjoy a 50 GB/s bi-directional link over the Infinity Fabric, or roughly the equivalent of two channels of DDR4-3200 RAM.

The Infinity Fabric topology of the Threadripper 2990WX. Source: AMD

Achieving a die-to-die connection for every Zeppelin on the Threadripper 2990WX multi-chip module comes at a cost to that inter-die bandwidth. Each of the four dies on the 2990WX MCM has a 25 GB/s bi-directional connection to every other die on the package, assuming DDR4-3200 in the motherboard’s memory slots. That bandwidth is roughly equivalent to a single channel of memory, and it’s quite a bit lower than the 42 GB/s die-to-die bandwidth that the company specifies for inter-die communication on fully-connected Epyc multi-chip modules.
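The memory-channel comparisons AMD draws for these links come down to simple arithmetic. Here's a rough sketch in Python, assuming a standard 64-bit (8-byte) DDR4 channel, a detail the figures above imply but don't spell out. The function name is ours, for illustration only.

```python
# Peak theoretical bandwidth of DDR4 memory channels.
# Assumption: each channel is 64 bits (8 bytes) wide, as is standard for DDR4.

BYTES_PER_TRANSFER = 8  # one 64-bit channel moves 8 bytes per transfer


def channel_bandwidth_gbs(mega_transfers_per_s: float, channels: int = 1) -> float:
    """Return peak bandwidth in GB/s (10^9 bytes per second)."""
    return mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


one_channel = channel_bandwidth_gbs(3200)      # compare with the 25 GB/s link
two_channels = channel_bandwidth_gbs(3200, 2)  # compare with the 50 GB/s link
quad_channel = channel_bandwidth_gbs(3200, 4)  # the whole X399 memory subsystem

print(f"1 channel of DDR4-3200:  {one_channel:.1f} GB/s")
print(f"2 channels of DDR4-3200: {two_channels:.1f} GB/s")
print(f"4 channels of DDR4-3200: {quad_channel:.1f} GB/s")
```

By that math, a single DDR4-3200 channel peaks at about 25.6 GB/s, which is why a 25 GB/s link is "roughly" one channel's worth and a 50 GB/s link is roughly two.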

AMD concedes that saturating this die-to-die link could have a negative effect on performance for applications that care about bandwidth above all else. That said, our experience suggests applications that saturate memory bandwidth are rare in single-user computing, although users seriously shopping for a 32-core, 64-thread CPU probably have one foot in client workloads and one in the data center or high-performance computing center. Even so, AMD is likely safe to take this bet for the 2990WX’s target audience.

The fact that some dies on the 2990WX have access to resources like memory and connected graphics cards, while others do not, creates a challenge for the operating system in pairing programs with resources for the best performance. To help out the OS, the 2990WX will always run in a non-uniform memory access, or NUMA, topology. In contrast, the Threadripper 1950X and Threadripper 2950X give their owners the option of running in either a local (or NUMA) memory-access mode or a distributed mode that presents the entire MCM as one uniform memory access domain.

Each I/O-capable die on the 2990WX is its own NUMA node, and each compute-only die is its own NUMA node. Because of the 2990WX’s always-on NUMA topology, the operating system will attempt to schedule threads on the die where their associated memory resides first before spilling threads over to the compute dies (where memory latency is always worst-case thanks to the round trip over the Infinity Fabric to an I/O die, all the way out to memory and back).

AMD says the performance implications of this unevenness on a fully-loaded 2990WX are less of a concern than they might seem at first blush. The company says workloads that scale up to 64 threads are generally less concerned with memory latency than they are with bandwidth, a fact that the 2990WX’s design is well-positioned to take advantage of in its mission to scale out. AMD says it’s also worked with Microsoft to increase awareness of the unusual memory-access topology of the Threadripper 2990WX multi-chip module, and that it’s continuing to work with Redmond to refine the way the chip and OS work together for better performance in the future.

Just because the Threadripper 2990WX always operates as a group of NUMA nodes doesn’t mean users are out of luck when it comes to maximizing application performance. Not all software is NUMA-aware, and other programs may be overwhelmed by having 32 cores and 64 threads at their disposal. To accommodate those applications, AMD will give owners the option to power down two Zeppelins through its Ryzen Master utility, leaving the chip with 16 cores and 32 threads. If that’s not good enough, Ryzen Master can disable as many as three Zeppelins to leave the 2990WX with eight cores and 16 threads.
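The core counts for those reduced modes fall straight out of the chip's four-die layout. The sketch below is only a back-of-the-envelope model: the constants come from the figures above, but the function name is hypothetical and has nothing to do with how Ryzen Master itself works.

```python
# Model of the 2990WX's active cores and threads as dies are disabled.
# Assumptions: four eight-core Zeppelin dies, two threads per core (SMT).

TOTAL_DIES = 4
CORES_PER_DIE = 8
THREADS_PER_CORE = 2


def active_resources(disabled_dies: int) -> tuple[int, int]:
    """Return (cores, threads) left active after disabling some dies.

    The article only describes disabling zero, two, or three dies;
    other values are accepted here purely for arithmetic's sake.
    """
    if not 0 <= disabled_dies < TOTAL_DIES:
        raise ValueError("at least one die must remain active")
    cores = (TOTAL_DIES - disabled_dies) * CORES_PER_DIE
    return cores, cores * THREADS_PER_CORE


for disabled in (0, 2, 3):
    cores, threads = active_resources(disabled)
    print(f"{disabled} dies disabled: {cores} cores, {threads} threads")
```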

As a second-generation Ryzen CPU, the Threadripper 2990WX benefits from three major refinements. First, AMD’s Precision Boost 2 technology lets Threadrippers respond gracefully to changing load conditions, resulting in what AMD fellow Joe Macri called “a much more usable machine.” First-generation Threadrippers relied on a four-cores-active boost speed before dropping down to their all-cores-active speed, a performance characteristic that Macri likened to going over a cliff. Precision Boost 2 manages boost speeds in a more linear fashion in a continuous curve from one-core loads to all-core loads, and it can adjust clock speeds in 25-MHz increments to support that mission.

Second, Extended Frequency Range 2 (XFR 2) allows the 2990WX to take advantage of ambient conditions and beefy cooling hardware to deliver better sustained performance under multi-threaded workloads. Unlike the first generation of XFR, which applied a fixed offset to both single-core and all-core clock speeds when conditions allowed, XFR 2 only affects multithreaded speeds.

Finally, the Threadripper 2990WX incorporates the Zen+ microarchitecture, born from GlobalFoundries’ 12LP process. 12LP allowed AMD to use better-performing transistors in critical parts of the Ryzen die, resulting in better cache and memory latencies. In the case of the Ryzen Threadripper 2950X—for which we’ll have a separate review shortly—the 12LP process also allowed AMD to raise the peak single-core clock speed to 4.4 GHz. By a hair, that’s the fastest stock single-core clock that AMD has shipped on a Ryzen CPU so far. The Threadripper 2990WX only tops out at a single-core clock speed of 4.2 GHz, though.

That deficit might seem strange at first, but AMD says finding enough dies with similar enough performance characteristics to allow for the same single-core clock speeds on both the 2950X and 2990WX was a challenge. That makes sense when you consider that AMD continues to select only the top five percent of Ryzen dies for use in Threadrippers to start with. The company likely needs to leave itself enough 4.4-GHz-capable silicon to make 2950Xes, and statistics probably favor assembling sets of four 4.2-GHz capable dies in sufficient quantities to make 2990WXes.

 

For more information on the second-generation Ryzen Threadripper lineup, as well as an unboxing video that goes through some of the hardware we’re testing with today, be sure to check out our second-generation Threadripper unveiling. Otherwise, let’s get to testing.

 

Introducing the Mechanical TuRk

This review marks a major change in our CPU testing methods—possibly the biggest change in our approach to any hardware testing since we began collecting and digesting frame-time data a while back. You see, every CPU review we’ve produced over the past couple years has involved manually running every single benchmark on every single system under test, manually recording those results, and manually entering them into Excel for sorting and graphing.

That method simply doesn’t scale when you have large numbers of CPUs to benchmark, multi-thousand-word reviews to write, and tight deadlines to deliver it all. It also leads to torturous all-nighters in the lead-up to reviews, and those are exceedingly bad for long-term human health and mental performance. All told, our methods weren’t serving us or you, the audience, any longer.

Over the past few weeks, our code monkey Bruno Ferreira has been exploring how to automate our CPU tests so that we can balance the competing demands of performing in-depth testing, getting launch-day coverage out the door, and respecting the limits of the human body. That work has been an overwhelming success, and the end product is a little utility called the Mechanical TuRk. We’ve been able to wrap up the vast majority of our CPU tests into a single easy-to-install package that requires one button click to run and no human involvement to log or collect the results.

Even better, Bruno went above and beyond by creating a utility for us called Suleiman that automates the process of collecting data from each test system, transferring it into Excel, sorting it, and graphing it—all over the local network in the TR labs. That means most of our productivity testing is only limited by the number of systems I can build and run simultaneously, not the amount of time I can stay awake before falling asleep on my feet. Watching test results flow through Suleiman and into Excel automagically is downright intoxicating.

For all its automation, the TuRk doesn’t alter a key value behind our testing: the availability and reproducibility of the benchmarks behind our work. All of the utilities and tests it runs are still free to download and run individually, or they reside on public websites. Through head-to-head testing, we’ve confirmed that the overhead of the tool is vanishingly small and has no material effect on the benchmark numbers it collects.

We won’t be making the TuRk package itself available to the community, but interested parties can still verify our work independently or compare the performance of their own systems by downloading and running any particular benchmark or benchmarks of interest.

All told, the Mechanical TuRk will ultimately let us give you more of what you want from TR: more in-depth tests on more systems in more high-quality reviews. The value of this tool can’t be overstated for our work. Three cheers for Bruno for putting this tool together for us in a short timeframe and making it production-ready.

For the moment, however, our work in testing and refining the TuRk has left us short on time to perform some of the manual benchmarking we still need to do in order to make a complete TR review of the Ryzen Threadripper 2990WX. The most prominent victim of our time crunch is the DAW Bench suite, which still requires quite a bit of active time for every chip we need to test. We apologize for the omission of those numbers from this review, and we’ll be collecting them ASAP for a follow-up article (along with a number of other interesting content-creation workloads that weren’t natural fits for this piece). Thanks in advance for your patience—the long-term payoff will be worth it.

Our testing methods

As always, we did our best to deliver clean benchmarking numbers. We ran each benchmark at least three times and took the median of those results. Our test systems were configured as follows:

Processor Intel Core i7-8086K
CPU cooler Corsair H110i 280-mm closed-loop liquid cooler
Motherboard Gigabyte Z370 Aorus Gaming 7
Chipset Intel Z370
Memory size 16 GB
Memory type G.Skill Flare X 16 GB (2x 8 GB) DDR4 SDRAM
Memory speed 3200 MT/s (actual)
Memory timings 14-14-14-34 2T
System drive Samsung 960 Pro 512 GB NVMe SSD

 

Processor AMD Ryzen Threadripper 2990WX, AMD Ryzen Threadripper 2950X
CPU cooler Enermax Liqtech TR4 240-mm closed-loop liquid cooler
Motherboard Gigabyte X399 Aorus Xtreme
Chipset AMD X399
Memory size 32 GB
Memory type G.Skill Flare X 32 GB (4x 8 GB) DDR4 SDRAM
Memory speed 3200 MT/s (actual)
Memory timings 14-14-14-34 1T
System drive Samsung 970 EVO 500 GB NVMe SSD

 

Processor AMD Ryzen Threadripper 1950X, AMD Ryzen Threadripper 1920X
CPU cooler AMD Wraith Ripper
Motherboard Gigabyte X399 Aorus Gaming 7
Chipset AMD X399
Memory size 32 GB
Memory type G.Skill Flare X 32 GB (4x 8 GB) DDR4 SDRAM
Memory speed 3200 MT/s (actual)
Memory timings 14-14-14-34 1T
System drive Samsung 960 EVO 500 GB NVMe SSD

 

Processor Core i9-7980XE, Core i9-7960X, Core i9-7900X, Core i7-7820X
CPU cooler Corsair H150i Pro 360-mm closed-loop liquid cooler
Motherboard Gigabyte X299 Designare EX
Chipset Intel X299
Memory size 32 GB
Memory type G.Skill Flare X 32 GB (4x 8 GB) DDR4 SDRAM
Memory speed 3200 MT/s (actual)
Memory timings 14-14-14-34 1T
System drive Intel 750 Series 400 GB NVMe SSD

Our test systems shared the following components:

Graphics card Nvidia GeForce GTX 1080 Ti Founders Edition
Graphics driver GeForce 398.82
Power supply Thermaltake Grand Gold 1200 W (Intel X299); Seasonic Prime Platinum 1000 W (AMD Threadripper 2950X/2990WX); Corsair RMx 850 W (AMD Threadripper 1950X/1920X); Seasonic SS660-XP2 660 W (Core i7-8086K)

Some other notes on our testing methods:

  • All test systems were updated with the latest firmware, graphics drivers, and Windows updates before we began collecting data, including patches for the Spectre and Meltdown vulnerabilities where applicable. As a result, test data from this review should not be compared with results collected in past TR reviews. Similarly, all applications used in the course of data collection were the most current versions available as of press time and cannot be used to cross-compare with older data.
  • Our test systems were all configured using the Windows Balanced power plan, including AMD systems that previously would have used the Ryzen Balanced plan. AMD’s suggested configuration for its CPUs no longer includes the Ryzen Balanced power plan as of Windows’ Fall Creators Update, also known as “RS3” or Redstone 3.
  • Unless otherwise noted, all productivity tests were conducted with a display resolution of 2560×1440 at 60 Hz. Gaming tests were conducted at 1920×1080 and 144 Hz.

Our testing methods are generally publicly available and reproducible. If you have any questions regarding our testing methods, feel free to leave a comment on this article or join us in the forums to discuss them.
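For readers reproducing our numbers at home, the "run at least three times and take the median" step described at the top of this section is easy to mimic. What follows is a minimal sketch, not the Mechanical TuRk's actual code; the function name and sample scores are hypothetical.

```python
# Reduce repeated benchmark runs to a single reported figure via the median.
from statistics import median

MIN_RUNS = 3  # the article states each benchmark ran at least three times


def summarize_runs(results):
    """Return the median of a list of benchmark results."""
    if len(results) < MIN_RUNS:
        raise ValueError(f"need at least {MIN_RUNS} runs, got {len(results)}")
    return median(results)


# Hypothetical scores from three runs of the same benchmark.
runs = [412.0, 405.5, 409.2]
print(f"Reported result: {summarize_runs(runs)}")
```

The median shrugs off a single outlier run (say, one interrupted by a background task) in a way that a simple average would not.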

 

Memory subsystem performance

The AIDA64 utility includes some basic tests of memory bandwidth and latency that will let us peer into the differences in behavior among the memory subsystems of the processors on the bench today, if there are any.

With the same memory speeds and timings, the Skylake-X chips run away in AIDA64’s memory read test, but the results for writes and copies are much more closely matched among our quad-channel contenders. The Threadripper 2950X proves an especially eager writer and copier in these tests.

AIDA64’s memory latency test is especially interesting given that we have X-series and WX-series Threadrippers in our stable. On the Threadripper 2990WX, Windows seems to be working to keep the memory latency test scheduled on a core with “near” access, so we get the lowest possible latency figure we would expect from the part. As more and more threads are scheduled on the chip and spill onto its compute dies, we might expect latencies to creep upwards, but we can’t test that expectation with AIDA64 alone.

The 2950X, on the other hand, doesn’t seem to benefit from that same NUMA-awareness, so we get a result more in line with Threadrippers past.

Some quick synthetic math tests

AIDA64 also includes some useful micro-benchmarks that we can use to tease out broad differences among CPUs on our bench. The PhotoWorxx test uses AVX2 and AVX-512 on compatible CPUs. The CPU Hash integer benchmark uses AVX and Ryzen CPUs’ Intel SHA Extensions support, while the single-precision FPU Julia and double-precision Mandel tests use AVX2 with FMA.

While PhotoWorxx uses AVX-512, FinalWire suggests that it’s also quite sensitive to memory bandwidth. That might explain the negative scaling in performance we see as core counts climb among our Skylake-X chips—there isn’t a concurrent increase in bandwidth to feed the i9-7960X or i9-7980XE. Despite being limited to AVX2 instructions and having less theoretical SIMD throughput, the Threadrippers don’t trail that far behind the Skylake-X pack.

As we mentioned, CPU Hash uses Intel’s SHA Extensions on compatible chips to accelerate certain cryptographic functions. As the Threadrippers’ disproportionately large results in this test show, they support those extensions. Intel’s chips do not.

FinalWire says both the single-precision Julia and double-precision Mandel tests use AVX-512, as well, and as the plucky Core i7-7820X shows, those instructions can let compatible chips punch far above their weight class when SIMD throughput is asked for. The Core i9-7960X and i9-7980XE lead the pack by no small margin in this synthetic, too. However, the Threadripper 2990WX plants itself among the Core i9-7900X, the Core i9-7960X, and i9-7980XE by dint of its sheer core count.

We don’t normally include AIDA64’s ray-tracing benchmarks in our selection of synthetics, but our slate of high-end desktop CPUs warrants pulling this one out. These tests support AVX-512, as well, and they allow the Skylake-X chips on the bench to achieve some truly jaw-dropping throughput—especially from the i9-7960X and i9-7980XE. Only the Threadripper 2990WX’s core count allows it to stay in the mix here.

Now that we’ve oohed and aahed at the theoretical mathematical prowess of our test subjects, let’s see how they handle real-world workloads.

 

Javascript

The usefulness of Javascript microbenchmarks for comparing browser performance may be on the wane, but these tests still allow us to tease out some single-threaded performance differences among CPUs. As part of our transition to using the Mechanical TuRk to benchmark our chips, however, we’ve had to switch to Google’s Chrome browser so that we can automate these tests. Chrome does perform differently on these benchmarks than Microsoft Edge, our previous browser of choice, and so it’s vitally important not to cross-compare these results with older TR reviews.

On the whole, Chrome seems to be kinder to AMD CPUs than Edge. The Threadripper 2990WX can generally outpace the Core i9-7960X and Core i9-7980XE, and it tends to hang right with the Core i9-7900X. It’s impressive that we don’t have to sacrifice much, if any, single-threaded responsiveness to get 32 cores in an X399 motherboard.

WebXPRT 3

The WebXPRT 3 benchmark is meant to simulate some realistic workloads one might encounter in web browsing. It’s here primarily as a counterweight to the more synthetic microbenchmarking tools above.

WebXPRT isn’t entirely single-threaded—it uses web workers to perform asynchronous execution of Javascript in some of its tests. Perhaps that’s part of why this test lets second-generation Threadrippers run away from the high-end desktop pack, trailing only the Core i7-8086K.

Our Javascript tests suggest users shouldn’t have to worry about giving up too much single-threaded responsiveness with even the Threadripper 2990WX in exchange for its 32 cores. Let’s see how it handles our range of multithreaded benchmarks now.

 

Compiling code with GCC

Our resident code monkey, Bruno Ferreira, helped us put together this code-compiling test. Qtbench records the time needed to compile the Qt SDK using the GCC compiler. The number of jobs dispatched by the Qtbench script is configurable, and we set the number of threads to match the hardware thread count for each CPU.

Qtbench does scale with thread count, but its returns quickly diminish around 16 cores. Still, the Threadripper 2990WX ekes out the barest win here.

File compression with 7-Zip

The free and open-source 7-Zip archiving utility has a built-in benchmark that occupies every core and thread of the host system.

So here’s the first bump in the road for the Threadripper 2990WX. Despite what should be a class-leading complement of cores and threads in this benchmark, the chip falls to the back of the pack in 7-Zip’s compression test. We asked AMD whether it had observed similar behavior with 7-Zip compression, and the company does believe that some refinement may be required to see full performance from the 2990WX in this benchmark.

I’ve also asked some other reviewers about their experiences with the chip in this benchmark, and it does appear that we’re looking at a Windows-specific performance issue. In at least one benchmark of 7-Zip compression under Linux that the reviewers at Techgage shared with me, the 2990WX leads the pack, as we would expect.

It’ll be interesting to see what happens if AMD does manage to help the 7-Zip developers straighten out compression performance, because the 2990WX’s decompression speeds are eye-popping.

Just to drive home the point that the 2990WX’s compression performance under 7-Zip is abnormal, witness its performance in AIDA64’s benchmark of the Zlib compression algorithm. Presuming 7-Zip’s developers can uncork this Threadripper’s full capabilities with some tuning, we could be in for some great compression numbers to go with the chip’s incredible decompression performance in the future.

Disk encryption with Veracrypt

May as well get the bad results out of the way early. The Veracrypt benchmark also doesn’t seem to like something about the 2990WX, at least in its accelerated AES portion.

Move to the pure number-crunching demands of the unaccelerated Twofish algorithm, however, and the 2990WX stretches its incredible integer muscle once again.

 

Cinebench

The evergreen Cinebench benchmark is powered by Maxon’s Cinema 4D rendering engine. It’s multithreaded and comes with a 64-bit executable. The test runs with a single thread and then with as many threads as possible.

Cinebench’s single-threaded benchmark is less kind to our Threadrippers than our Javascript tests were. The Skylake-X chips have a slight edge in this test, while the mainstream-desktop i7-8086K demonstrates its world-beating single-threaded performance.

One doesn’t run Cinebench for its single-threaded mode, though. This benchmark is all about stretching every core and thread possible, and the Threadripper 2990WX does just that in this test.

Blender

Blender is a widely-used, open-source 3D modeling and rendering application. The app can take advantage of AVX2 instructions on compatible CPUs. We chose the “bmw27” test file from Blender’s selection of benchmark scenes to put our CPUs through their paces.

Our Cinebench result wasn’t a fluke. The Threadripper 2990WX shaves an incredible 50 seconds off the Core i9-7980XE’s render time. We don’t often see step-function increases in performance like this.

Corona

Corona, as its developers put it, is a “high-performance (un)biased photorealistic renderer, available for Autodesk 3ds Max and as a standalone CLI application, and in development for Maxon Cinema 4D.”

The company has made a standalone benchmark with its rendering engine inside, so it’s a no-brainer to give it a spin on these CPUs.

Corona is another out-of-the-park home run for the Threadripper 2990WX.

Indigo

Indigo Bench is a standalone application based on the Indigo rendering engine, which creates photo-realistic images using what its developers call “unbiased rendering technologies.”

Despite its general willingness to devour every core and thread we can throw at it, Indigo’s “Bedroom” scene doesn’t favor the Threadripper 2990WX like our other rendering benchmarks have. Perhaps this is another software-related rough spot that can be polished out with future updates.

The “Supercar” scene is kinder to the Threadripper 2990WX, but its performance is no better than that of its 32-threaded stablemates. The Core i9-7960X and i9-7980XE demonstrate that further scaling is still possible, so perhaps the Threadrippers still have some performance waiting to be tapped.

 

Handbrake

Handbrake is a popular video-transcoding app that recently hit version 1.1.1. To see how it performs on these chips, we converted a roughly two-minute 4K source file from an iPhone 6S into a 1920×1080, 30 FPS MKV using the HEVC algorithm implemented in the x265 open-source encoder. We otherwise left the preset at its default settings.

Handbrake proves another case where our Threadrippers hit a scaling wall that the Core i9-7960X and Core i9-7980XE do not.

SPECwpc WPCcfd

Computational fluid dynamics is an interesting and CPU-intensive benchmark. For years and years, we’ve used the Euler3D benchmark from Oklahoma State University’s CASElab, but that benchmark has become more and more difficult to continue justifying in today’s newly-competitive CPU landscape thanks to its compilation with Intel tools (and hence a baked-in vendor advantage). 

Ahead of this review, we set out to find a more vendor-neutral and up-to-date computational fluid dynamics benchmark than the wizened Euler3D. As it happens, the SPECwpc benchmark includes a CFD test constructed with Microsoft’s HPC Pack, the OpenFOAM toolkit, and the XiFoam solver. More information on XiFoam is available here. SPECwpc allows us to yoke every core and thread of our test systems for this benchmark. The result is reported as a multiple of SPEC’s reference system performance, namely a Lynnfield Xeon X3430 with 8 GB of dual-channel RAM.

As our long experience with Euler3D has suggested, CFD workloads tend to bottleneck on memory bandwidth, perhaps explaining the large cluster of Threadrippers and Skylake-X CPUs in the middle of our chart. Somehow, though, the Threadripper 2950X and 2990WX break through to take the top spots in this benchmark.

SPECwpc NAMD

The SPECwpc benchmark also includes a Windows-ready implementation of NAMD. As its developers describe it, NAMD “is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Based on Charm++ parallel objects, NAMD scales to hundreds of cores for typical simulations and beyond 500,000 cores for the largest simulations.” Our ambitions are considerably more modest, but NAMD seems an ideal benchmark for our many-core single-socket CPUs.

As its developers suggest, NAMD has no trouble scaling with every core and thread we can throw at it. The Threadripper 2990WX is far and away the fastest chip here.

That’s it for our first round of productivity testing with the 2990WX. While many applications in our test suite have no trouble yoking every core and thread the 2990WX has to offer, others delivered performance far below the chip’s apparent potential. AMD may have some work ahead of it to ensure that even multithreaded software can take full advantage of its uberchip.

Next up, we’re going to do something completely ill-advised for the heck of it by running our CPU-bound gaming stress tests on the 2990WX. Buckle up.

 

Crysis 3

Even at more than five years of age, Crysis 3 remains one of the most punishing games one can run. With an appetite for CPU performance and graphics power alike, this title remains a great way to put the performance of any gaming system in perspective.


Crysis 3 ought to be a best-case scenario for the 2990WX, but even so, the chip trails the pack. At least the game seems to handle 32 cores and 64 threads gracefully, as cutting over to the 2990WX’s 16-core, 32-thread mode actually results in a minor decrease in performance.


These “time spent beyond X” graphs are meant to show “badness,” those instances where animation may be less than fluid—or at least less than perfect. The formulas behind these graphs add up the amount of time our graphics card spends beyond certain frame-time thresholds, each with an important implication for gaming smoothness. Recall that our graphics-card tests all consist of one-minute test runs and that 1000 ms equals one second to fully appreciate this data.

The 50-ms threshold is the most notable one, since it corresponds to a 20-FPS average. We figure if you’re not rendering any faster than 20 FPS, even for a moment, then the user is likely to perceive a slowdown. 33 ms correlates to 30 FPS, or a 30-Hz refresh rate. Go lower than that with vsync on, and you’re into the bad voodoo of quantization slowdowns. 16.7 ms correlates to 60 FPS, that golden mark that we’d like to achieve (or surpass) for each and every frame.

To best demonstrate the performance of these systems with a powerful graphics card like the GTX 1080 Ti, it’s useful to look at our three strictest graphs. 8.3 ms corresponds to 120 FPS, the lower end of what we’d consider a high-refresh-rate monitor. We’ve recently begun including an even more demanding 6.94-ms mark that corresponds to the 144-Hz maximum rate typical of today’s high-refresh-rate gaming displays.
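To make those thresholds concrete, here's a rough Python sketch of a "time spent beyond X" calculation. It assumes that only the portion of each frame time past the threshold counts toward the total, which is our reading of "time spent beyond" rather than a formula spelled out above. The function names and sample frame times are hypothetical.

```python
# A sketch of a "time spent beyond X" calculation.
# Assumption: only the part of each frame time past the threshold is counted.


def threshold_ms_for_fps(fps: float) -> float:
    """Frame-time budget in milliseconds for a given frame rate."""
    return 1000.0 / fps


def time_spent_beyond(frame_times_ms, threshold_ms: float) -> float:
    """Total milliseconds accumulated past threshold_ms across all frames."""
    return sum(max(0.0, t - threshold_ms) for t in frame_times_ms)


# Hypothetical frame times in milliseconds, for illustration only.
sample_frame_times = [6.0, 7.5, 9.0, 20.0, 55.0]

for fps in (20, 30, 60, 120, 144):
    limit = threshold_ms_for_fps(fps)
    beyond = time_spent_beyond(sample_frame_times, limit)
    print(f"{fps:>3} FPS -> {limit:5.2f} ms threshold, {beyond:6.2f} ms beyond")
```

In this made-up sample, only the single 55-ms frame crosses the 50-ms line, while every frame slower than 6.94 ms adds to the tally at the 144-Hz threshold. That's why the strictest graphs are the most revealing with a fast graphics card.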

While the fact that the 2990WX only holds our GTX 1080 Ti up for about three seconds of our one-minute test run at the 8.3-ms mark might sound impressive, that’s not a great result in this company. Even the Threadripper 1920X makes our graphics card spend about half as long past that mark, and the rest of our chips manage a second or less of hold-up time at this threshold.

 

Deus Ex: Mankind Divided

Thanks to its richly detailed environments and copious graphics settings, Deus Ex: Mankind Divided can punish graphics cards at high resolutions and make CPUs sweat at high refresh rates.


Turn Deus Ex: Mankind Divided loose on every one of the 2990WX’s cores and threads, and you will have a bad time. Cutting the core count in half leads to better performance, but the 2990WX is still overshadowed by two-die Threadrippers, not to mention every Skylake-X part in our stable.


At the 8.3-ms mark, our time-spent-beyond-X analysis throws the 2990WX’s struggles into sharp relief. Running a 1920×1080 monitor with a massively powerful graphics card like the GTX 1080 Ti is a stress test, to be sure, but the 2990WX just wilts under the pressure. Choosing settings that force a game to become CPU-bound on the 2990WX is simply not advisable.

 

Grand Theft Auto V

Grand Theft Auto V’s lavish simulation of Los Santos and surrounding locales can really put the hurt on a CPU, and we’re putting that characteristic to good use here.


Grand Theft Auto V is no better equipped to take advantage of the 2990WX’s thread count than Deus Ex: Mankind Divided, but at least switching off half the chip leads to a major increase in performance for both average frame rates and 99th-percentile frame times.
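For anyone who wants to compute the same two summary figures from their own frame-time logs, here's a minimal sketch. It uses the nearest-rank definition of a percentile, which is an assumption on our part since several definitions exist, and the function names and sample data are hypothetical.

```python
# Average frame rate and 99th-percentile frame time from a frame-time log.
# Assumption: the percentile uses the nearest-rank method.
import math


def average_fps(frame_times_ms) -> float:
    """Frames rendered divided by total elapsed time, in frames per second."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def percentile_frame_time(frame_times_ms, pct: float = 99.0) -> float:
    """Frame time at or below which pct percent of frames were delivered."""
    ordered = sorted(frame_times_ms)
    rank = math.ceil(pct * len(ordered) / 100.0)  # 1-indexed nearest rank
    return ordered[max(rank, 1) - 1]


# Hypothetical log: mostly smooth 10-ms frames with two 40-ms hitches.
log = [10.0] * 98 + [40.0] * 2

print(f"Average FPS: {average_fps(log):.1f}")
print(f"99th-percentile frame time: {percentile_frame_time(log):.1f} ms")
```

The two hitches barely dent the average frame rate in this made-up log, but they show up plainly in the 99th-percentile figure. That's the reason we report both.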


Our time-spent-beyond-X measurements show just how large an improvement the 2990WX’s 16-core mode provides in GTA V. With half of its cores active, the chip spends less than a quarter as much time holding up our GTX 1080 Ti as it does with every cylinder firing.

 

Assassin’s Creed Origins

Assassin’s Creed Origins isn’t just striking to look at. It’ll happily scale with CPU cores, and that makes it an ideal case for our test bench.


Our Assassin’s Creed Origins numbers don’t paint a pretty picture for the 2990WX or 2950X at first glance. However, the Denuvo DRM baked into Assassin’s Creed Origins prevented me from retesting to confirm or exclude a performance problem with these parts. That said, I am comfortable in saying that Origins isn’t terribly happy on the 2990WX, especially in its full 32-core, 64-thread glory. A sub-60-FPS average frame rate is not something the unsuspecting gamer wants to see from their GTX 1080 Ti. Flipping on the 2990WX’s 16-core mode helps, but it still can’t help the GTX 1080 Ti run as quickly as it does with the Skylake-X chips or the first-generation Threadrippers. All told, we likely need to re-run some tests to confirm the behavior we’re seeing in this title from the 2950X and 2990WX.


 

 

Far Cry 5


Far Cry 5’s Montana mountain setting has plenty to say about the end times, but that doesn’t mean one should have to suffer apocalyptic performance from their gaming PC in the bargain. Weirdly enough, the low frame rates and high 99th-percentile frame times we saw in this title didn’t go away when we started bumping up the resolution to shift our performance bottleneck onto the graphics card. Perhaps Far Cry 5 just isn’t expecting to be run on a CPU with a server-like core and thread count. Indeed, shifting our 2990WX into its 16-core form greatly improved the gameplay experience.


Our time-spent-beyond-X measurements prove that it’s essential to cut down the 2990WX’s core and thread count if you’re unwisely attempting to push Far Cry 5’s frame rates to their limits on this chip.

Overall, we wouldn’t put too much stock in our gaming tests on the way to a final judgment of the Threadripper 2990WX or any other CPU architecture that descends from the data center to the desktop. High-end desktop chips and high-refresh-rate gaming generally do not go together well, and we knew that going into these tests. We forged ahead just to see what would happen. Even so, we weren’t expecting the 2990WX to perform as weakly as it did in many of our CPU-bound games.

It’s nice that the Ryzen Master utility offers users the ability to cut down the chip’s core and thread counts to combat potential performance issues like the ones we saw. Even so, having to deactivate cores and threads in order to achieve even decent baseline performance at lower resolutions with fast graphics cards could limit the 2990WX’s appeal for framerate-hungry single-PC streamers who might want every bit of the chip’s processing power at their disposal. AMD did demonstrate the chip encoding a stream of a 4K game at its presentation in Italy, and perhaps that mostly-GPU-bound scenario is how flush-with-cash streamers enamored of AMD CPUs will want to plan around this part.

Speaking of which, we wanted to run our usual single-PC gaming and streaming tests with the fully-enabled Threadripper 2990WX to see how it stacked up against the Core i9-7980XE, but Far Cry 5 is our title of choice for that testing right now, and you can imagine how that might turn out given our test results above. We’re exploring alternative titles that we can perform that testing with, and we’ll gather performance results and share them as soon as we’re able.

 

A quick look at power consumption and efficiency

To get a sense of just how the chips we’re testing today balance power consumption and performance, we’re going to run our usual back-of-the-napkin numbers using each part’s performance in the Blender “bmw27” benchmark and some handy observations from our Watts Up power meter.

First, we’ll compare the instantaneous power draw of our test systems under load while they run the Blender “bmw27” benchmark. If we stopped our analysis here, we might call the Threadripper 2990WX a huge power hog, but that wouldn’t be correct.

Recall that the 2990WX completes our Blender benchmark in record time—far faster than any other chip on the bench. Take that time-to-completion in seconds and multiply it by the instantaneous power draw of the chip, and we get an estimate of the total energy each chip on the bench needs to complete the “bmw27” test scene.

The 2990WX may have the highest instantaneous power draw of any high-end desktop chip on our test bench by a wide margin, but its startlingly quick completion of that workload lets it consume the least energy of any chip on the bench by our estimate. Not only is the 2990WX the winner on sheer performance, it also needs the least energy to finish the job when we ask it to do something that scales well across all of its cores.
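The estimate described above is just power multiplied by time. As a rough Python sketch of that arithmetic, assuming power draw stays constant for the whole run: the function name and the wattage and time figures below are hypothetical placeholders, not our measured results.

```python
def task_energy_kj(power_watts, seconds):
    """Estimate task energy in kilojoules as power (W) times time (s)."""
    return power_watts * seconds / 1000.0


# Hypothetical (system power draw in watts, render time in seconds) pairs.
chips = {
    "fast, power-hungry chip": (400.0, 60.0),
    "slower, lower-power chip": (250.0, 120.0),
}

for name, (watts, secs) in chips.items():
    print(f"{name}: {task_energy_kj(watts, secs):.1f} kJ")
```

In this made-up case the chip with the higher instantaneous draw still uses less total energy because it finishes in half the time, which is the same effect we describe for the 2990WX.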

This scatter plot might help visualize the balance between power consumption and performance for the chips we’re testing. The best results on this chart will tend toward the lower-left corner, where energy consumption is lowest and time-to-completion is shortest. While there’s a cluster of chips that take about the same amount of time and expend the same amount of energy to finish our test workload, the Threadripper 2990WX just stands out all the more here.

 

Conclusions

AMD’s Ryzen Threadripper 2990WX often proves to be the best-performing high-end desktop CPU on the planet, but its potential isn’t fully realized yet.

When this chip rips, it really rips. We’ve never seen the kind of speed the 2990WX delivers in Blender and Corona rendering before. The 2990WX also turns in chart-topping performances in HPC workloads like the SPECwpc NAMD benchmark. Handbrake transcoding with the x265 encoder and the Indigo “Supercar” scene are about the only tests we’ve yet run where Intel’s i9-7980XE (and i9-7960X) take large leads over AMD’s uberchip.

In other cases, software doesn’t seem to know quite what to make of the 2990WX, as evidenced by our 7-zip compression results, the Indigo benchmark’s “Bedroom” test scene, and our Veracrypt AES test. AMD acknowledges that the 2990WX’s hardware might be in front of the software in some cases, so some patience might be required until developers can tune their applications to get the most from the chip. I’m fairly confident devs will be able to smooth out those wrinkles with time, but we have to judge the Threadripper 2990WX as it stands today.

One might be tempted to think a chip with this many cores and threads can handle both work and play, and that might be true for high-resolution gaming at sub-100-Hz refresh rates, but we’d still advise caution with the 2990WX after hours. We ran our high-refresh-rate 1920×1080 gaming tests on the 2990WX just to see what would happen, and it trails far behind the rest of the high-end desktop pack. On top of that, its sheer thread count and NUMA-ness seem capable of causing show-stopping playability problems with some titles, as our Far Cry 5 tests demonstrated.

To get around those issues, AMD lets owners shut off the 2990WX’s compute dies or even reduce the chip to just one eight-core, 16-thread die through the Ryzen Master utility, and those measures do work. Still, those are Band-Aids, not panaceas. If for some reason you primarily envision yourself running this chip in its 16-core mode, the Threadripper 2950X is a much better choice and costs much less.

We’re not holding gaming performance against the 2990WX, to be clear—high-end desktop platforms are rarely the right choice for gamers trying to push the most frames possible. Given the behavior we saw, though, developers may need to begin thinking about NUMA nodes and single-socket-server-like core counts as they refine today’s titles and prototype tomorrow’s.

If AMD works with software developers to address the issues we observed with some of our multithreaded benchmarks, we’ll immediately crown the Threadripper 2990WX as the chip to get in the vast majority of ultra-high-end desktops. For the moment, though, I’m holding off on the full-throated endorsement a TR Editor’s Choice award would imply.

That said, if your workload already scales well on the 2990WX and your time is money, I’d still recommend this chip today. Stay tuned as we finish even more testing on our stable of CPUs to get a complete picture of the newly reshaped high-end desktop landscape the Threadripper 2990WX heralds.

Comments closed
    • Mr Bill
    • 1 year ago

    Nice review Jeff! Under ‘Our Testing Methods’ could you please put a “cores/threads/base clock” line? It would make it much easier to run back and check when looking at a particular result graph. I’m sure many other people are as unfamiliar as I am with the rest of the pack for any particular CPU review.

    • Goty
    • 1 year ago

    Seems like much of the 2990WX slowness in games might be attributable to a shoddy NVIDIA driver: [url]https://www.golem.de/news/32-kern-cpu-threadripper-2990wx-laeuft-mit-radeons-besser-1808-136016.html[/url]

      • DreadCthulhu
      • 1 year ago

      I just saw that as well, and you ninja’d me in posting it. 😉 Golem.de swapped in a Vega 64 and found that in 4 of the 5 games tested, the Vega had massively better results than a GTX 1080 Ti. I would guess that Nvidia’s driver has some oddness that doesn’t work well with very high-core count CPUs.

        • dragontamer5788
        • 1 year ago

        Windows’s scheduler is the same for kernel-level tasks as well as user-space tasks.

        If it’s a general Windows Scheduler issue, then it would make sense that NVidia’s drivers would have some issues. We can assume that AMD figured out how to avoid the issue when they wrote Vega64’s driver.

        • K-L-Waster
        • 1 year ago

        As discussed on the 2950X review thread, that doesn’t explain how golem.de somehow only managed ~50 FPS from a Vega 64 at 720p. That card should be doubling those FPS or more at that res.

        There’s something fishy in the entire test they’re doing. Yes, the 1080 TI comes off even worse, but there’s no way a wonky NVidia driver explains the Vega 64 results *also* being ludicrously low.

      • Anonymous Coward
      • 1 year ago

      Good to see there is an explanation, because it did strike me as pretty odd, that huge performance impact from extra cores.

    • moose17145
    • 1 year ago

    I am sure I am not the first to wonder this… but… what I would be particularly interested in would be the TR 2990WX vs. its Epyc counterpart.

    The two chips are VERY similar… but Epyc also has some clear advantages over its TR counterpart. It would be interesting to see how those differences affect performance. Especially in some of the applications where the 2990WX stumbled.

    What I would also be curious about is the virtualization performance of this vs. its Epyc counterpart. One of my buddies is looking to replace his home ESX server, and has been debating about a TR vs. an Epyc chip. I am sure Epyc would be the clear winner… but if it’s only a difference of 5% more performance for the Epyc chip then it may be worth it to just go with the TR chip instead. This is, after all, a HOME ESX server… not one that is sitting in a datacenter. As a result, additional hardware costs are something to more heavily consider.

    Out of curiosity, how difficult would it be to include some kind of virtualization benchmark for CPUs? I am honestly asking. I am unsure what would all go into adding a virtualization benchmark. If it’s something that could be added to the Mechanical TuRK and not take up much additional time, or if adding it would just turn into a huge time sink for each new article.

      • Jeff Kampman
      • 1 year ago

      The most comparable Epyc CPU is probably the 7551P. It has 32 cores, 64 threads, a 2-GHz base clock, and a 3-GHz peak boost speed. Most people would probably find that to be insufficient for client use, and you can’t do anything about it via overclocking. It’s about $500 more expensive than the 2990WX.

      If you want the highest-performance Epyc, the 7601 is about $4600 and still only tops out at 3.2 GHz for peak single-core speed. There’s really nothing else like the 2990WX as far as core count and peak clock speed go—it’s a perfect blend for those who need client-like responsiveness and sheer multi-threaded grunt.

      As far as virtualization benchmarks go, VMWare has one but the hardware requirements are impractical for dabblers: [url]https://www.vmware.com/products/vmmark.html[/url]

        • Goty
        • 1 year ago

        I wonder how much power/thermal headroom disabling the various uncore bits on the compute dice of the 2990WX actually freed up vs something like the 7601. That’s an awfully big clockspeed gap to make up in just 70W of TDP. It would be interesting to see a 7601 pushed to similar speeds.

          • moose17145
          • 1 year ago

          I agree. But… we might also see real life power consumption numbers that are closer than what paper numbers suggest.

        • moose17145
        • 1 year ago

        Thank you for the reply!

        And I figured the Epyc chips would have a slightly lower base clock (while costing more money), but at the same time each of the 4 dies has its own memory channels attached to it, plus, as you mentioned, more bandwidth between each die. I wonder if those things would be enough to bump the Epyc chip’s performance ahead of (or at least on par with) the 2990WX despite the clock speed decrease. I do not expect you to go out and purchase one of these things just to benchmark… it was more a thought out of curiosity about how much this chip is hobbled compared to its big data-center oriented brother.

        And holy crud the system requirements on VMMark! Wow yea okay I can definitely see why you are not testing virtualization performance! I was hoping there was something smaller out there that could be tested on a machine with a few gigs of ram and maybe easily integrated into the Mechanical Turk just to get a general idea about which CPUs might be better for running VMs either in a smaller ESX home server or inside VMWare Player… especially since that seems like a potentially great application for the 2990WX… but… Oof…

        Great Article Jeff! And I hope your new Mechanical TuRK serves you well for future reviews!

    • yeeeeman
    • 1 year ago

    Jeff, I want to know, is this lack of knowledge or bad intentions? How come other sites could read about game mode and run all gaming tests in this mode, which for all intents and purposes makes 2990WX perform like a 2700X?
    Please clarify since I believed that you are a professional hardware journalist.
    Also, a CPU like 2990WX need more focus on actual work programs and less on games, since the story of games is simple and can be summarized in one line.
    Thanks!

      • Srsly_Bro
      • 1 year ago

      How dare you question Lord Kampman in his kingdom. For he can do no wrong and is ever righteous in his quest to bring us unbiased computer reviews.

      Srsly, tho, this site is similar to Reddit in terms of group think. You aren’t able to question much of anything without getting down-voted. Just accept it as-is or find a site that tests the way you prefer. Glhf

        • derFunkenstein
        • 1 year ago

        I guess you guys missed that the graphs have both 16-core and 32-core results.

          • Srsly_Bro
          • 1 year ago

          I was speaking in a general sense. If he f’ed up, I’m not defending that, which appears he did.

          Thanks bro.

        • yeeeeman
        • 1 year ago

        I see some people have downvoted me, but probably I said something true if they had such a reaction.
        Actually, a lot of reviews sites just installed games, double clicked and posted results. That is not how you stay professional, but I guess that since audience is dumber day by day, you can get away with this.
        I just remember the days when Anand was writing articles for Anandtech. You could see that they meant hard work of documenting, testing and retesting. Nowadays, you only hear that Intel is better because you have 5% more performance in games and that is it…

          • auxy
          • 1 year ago

          You didn’t even read the review!! There are results on every game in game mode!! [b]There are more non-game benchmarks than games[/b] This is the only site that did it right you blithering facile tool!

            • Waco
            • 1 year ago

            This.

            Odd times when I find myself consistently agreeing with auxy. That takes special talent by the OP. 😛

            • jihadjoe
            • 1 year ago

            Hammond^WYeeeeman, you blithering idiot!

          • just brew it!
          • 1 year ago

          Did you even read the review before commenting? Because the only two possibilities are that you didn’t, or that you have some serious problems with reading comprehension.

          Jeff tested both with game mode enabled and disabled (and presented results for both). It’s right there in the charts.

          In addition to the gaming tests, the review presents results for synthetic tests, Javascript/web performance, code compilation, compression, decompression, encryption, 3D rendering, video transcoding, scientific/engineering computational workloads, absolute power consumption, and efficiency.

          To top it all off, the article even notes that doing gaming benchmarks on a HEDT CPU like this is “ill-advised”.

          So the review basically did the opposite of everything you’re complaining about. And THAT’S why you got downvoted.

        • Redocbew
        • 1 year ago

        A few days ago while I’m in the car stopped at a stoplight I see a dude on the sidewalk. He’s walking apparently with some destination in mind. He’s carrying on his back an extremely rustic looking pack. He’s wearing some kind of roughspun one-piece robe. He’s got a branch from a tree shorn of any smaller branches to create a walking stick. He’s also got little white dots painted around his eyeballs. He leans over to look in the window at me and says: Take me to your leader!

        True story, bro.

        I feel like I have just about as much chance of understanding that dude as I do this “why you no use game mode!” post, and so I fling the useless Internet points accordingly. Is that not how it’s done?

          • derFunkenstein
          • 1 year ago

          Well? How did this story end? Did you give him a ride?

            • Redocbew
            • 1 year ago

            The light turned green, and off I went. It was kind of anticlimactic really. Despite the phrase being well known I was left wondering(among other things) who my leader would be, so I’m not sure where I would have taken him even if I had given him a lift.

            • chuckula
            • 1 year ago

            I read both of those posts with Matthew Mcconaughey’s voice in my head.

            Alright Alright Alright!

          • Srsly_Bro
          • 1 year ago

          That’s a great story, bro.

        • Wonders
        • 1 year ago

        [quote]You aren't able to question much of anything without getting down-voted.[/quote]

        To those unable to distinguish earnest, constructive questions from drive-by implications thinly veiled as "questions" -- let these poor souls face the full might of the Thumbs of the Silent Majority. As for you, Srsly_Bro, it's clear that it will take you years and lots of effort to become an informed web citizen. That's not a put down. Just the rationale behind my suggestion that you keep more of your thoughts to yourself going forward. In doing so, you'll be making the world a slightly better place.

      • chuckula
      • 1 year ago

      [url=https://www.youtube.com/watch?v=q6EoRBvdVPQ]YEE[/url]

      I'd also like to point out that the same people who call me the world's biggest paid Intel shill* and accuse me of making the most biased posts are curiously never, ever, around to attack crap like that.

      * Still waiting on my check.

        • Shobai
        • 1 year ago

        Maybe they came, saw, and down voted?

      • Goty
      • 1 year ago

      [quote]Also, a CPU like 2990WX need more focus on actual work programs and less on games, since the story of games is simple and can be summarized in one line.[/quote]

      So then why does it matter how they benchmarked the games?

      • moose17145
      • 1 year ago

      Suggest in the future you more closely read the review before posting.

      You then might see that they DID test the Chip in game mode that way. You may ALSO then note then that they did more NON-GAMING testing than gaming testing, and all throughout the gaming testing are pointing out that this is really NOT what this chip is meant for, but that they are doing it anyways (which is still important to those who may want to use the same system for both work AND play).

    • chµck
    • 1 year ago

    odd that the 2990 does so poorly in AES encryption, far worse than the 1st generation, while the 2950 does fine

    • karma77police
    • 1 year ago

    A new generation of Nvidia video cards (RTX 2080, or whatever it might be called) will really show how badly the entire Ryzen CPU line performs. The performance gap between Ryzen and its Intel counterparts will be even bigger and will really creep into 2k and 4k resolutions. Don’t shoot the messenger.

    • Mr Bill
    • 1 year ago

    Does TuRk imply an Alan Turing reference? Amazon has [url=https://www.mturk.com/]Mturk[/url] which they say is a mechanical turk. Ah I found it!

    [quote]The name Mechanical Turk was inspired by "The Turk", an 18th-century chess-playing automaton made by Wolfgang von Kempelen that toured Europe, beating both Napoleon Bonaparte and Benjamin Franklin.[/quote]

    From the Wikipedia entry for Amazon Mechanical Turk.

    • just brew it!
    • 1 year ago

    The uneven performance is not entirely surprising, given the NUMA and interconnect architecture. If your primary use case is one of the ones where it beats the crap out of everything else, this looks like a no-brainer. For other workloads, you’re not getting much for your extra $, and in many cases you’re actually getting less.

    I’m not entirely convinced that fixing the speed bumps all comes down to application optimization though. I suspect that OS thread schedulers have some catching up to do, to make most efficient use of this architecture. The purported discrepancy in 7-Zip performance between Windows and Linux hints at this; the Linux kernel which was used in the test may have already incorporated updates to make it 4-die-Threadripper-aware.

    If this comes down mostly to OS thread scheduling issues, the applications where this chip underperforms may not need updates to fix the problem. When the OS thread scheduler catches up, performance of the affected applications should get a “free” performance boost.

    OTOH, even with a fully aware scheduler, there will always be some problematic workloads which hit scaling issues once you start spilling over into the cores which don’t have a direct path to system RAM. That’s just a fact of life when dealing with NUMA.

      • Mr Bill
      • 1 year ago

      Once again, open source is quicker to adapt to platform changes.

        • Klimax
        • 1 year ago

        Move fast, break things..

          • just brew it!
          • 1 year ago

          There’s definitely some truth to the “break things” part, and the degree to which it is true is very dependent on the specific distro we’re talking about.

          Based on my recent attempt to set up a new Linux-based home file server, IMO Ubuntu 18.04 LTS isn’t quite ready for prime time. I’m staying away for now, and will probably take it for another test drive after the 18.04.1 point release.

    • HERETIC
    • 1 year ago

    Read a few reviews, and I think this comment by Ian at AT sums it up well-
    “For most users, the 2950X is enough. For the select few, the 2990WX will be out of this world.”

    Your move Intel. Get some of that 12nm goodness our way……………………………

      • just brew it!
      • 1 year ago

      Process size numbers are largely meaningless these days. 14nm, 12nm, 11nm… meh.

    • plonk420
    • 1 year ago

    regarding 7zip performance, someone found this with Win10 vs Linux performance

    [url]https://i.imgur.com/X02plS8.png[/url]

      • Jeff Kampman
      • 1 year ago

      Yep, like I noted in the review I have talked to other reviewers who use Linux and it appears there are going to be some teething pains with this chip as Windows 10 catches up to it.

        • Mr Bill
        • 1 year ago

        Windows product segmentation? Surely they know how to write a thread scheduler (if that is the problem, I’m no programmer).
        Edit: I guess it's not product segmentation, just something new… [quote]AMD says it's also worked with Microsoft to increase awareness of the unusual memory-access topology of the Threadripper 2990WX multi-chip module, and that it's continuing to work with Redmond to refine the way the chip and OS work together for better performance in the future.[/quote]

          • just brew it!
          • 1 year ago

          Of course they know how to write a thread scheduler. But if the scheduler hasn’t yet been tuned to properly weigh the tradeoffs peculiar to the 4-die Threadripper, it may be making poor scheduling decisions for certain workloads.

          Given Linux’s domination of the HPC segment, they probably pay more attention to details like this. It would not surprise me at all if the Linux thread scheduler is currently a little ahead of Microsoft’s when it comes to tuning for chips like this. Give Microsoft a little time, and they’ll catch up.

          Optimally scheduling threads on an architecture with both SMT [i]and[/i] NUMA is tricky!

            • Mr Bill
            • 1 year ago

            +3 Good to know. [quote]Optimally scheduling threads on an architecture with both SMT and NUMA is tricky![/quote] You are referring to OS architecture in this case (since this is a NUMA only CPU)?

            • just brew it!
            • 1 year ago

            You don’t want to force applications to worry about which cores their threads run on to get good performance. This really should be handled by the OS.

            That said, modern OSes provide mechanisms which allow applications to pin processes and threads to specific cores; so applications which [i]want[/i] to manage stuff at this level do have some degree of control. But that's an ugly solution since it forces the application to deal with details of the specific CPU architecture it is running on. Other than some very niche special cases (e.g. dedicated userspace device driver that does polled I/O to eliminate interrupt latency), you shouldn't [i]need[/i] to explicitly manage how your threads are getting scheduled.
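As an illustration of the pinning mechanism mentioned in the comment above, here is a minimal Python sketch using the standard library's `os.sched_setaffinity`, which is only available on some platforms such as Linux. The helper name `pin_to_cpus` is hypothetical, and which logical CPU numbers map to which die on a given Threadripper is not something this sketch knows.

```python
import os


def pin_to_cpus(cpus):
    """Restrict the calling process to the given logical CPUs."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("os.sched_setaffinity is not available here")
    os.sched_setaffinity(0, cpus)  # pid 0 refers to the caller
    return os.sched_getaffinity(0)


if hasattr(os, "sched_getaffinity"):
    original = os.sched_getaffinity(0)
    # Pin to the lowest-numbered CPU we are currently allowed to use.
    print(pin_to_cpus({min(original)}))
    # Restore the original mask so the rest of the program is unaffected.
    os.sched_setaffinity(0, original)
```

An application would need to discover the machine's topology before choosing a CPU set, which is exactly the architecture-specific burden the comment calls ugly.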

            • cygnus1
            • 1 year ago

            No, he’s definitely referring to the CPU architecture. He’s pointing out that these TR chips have complex NUMA topology AND all the cores are SMT enabled, meaning they can run 2 threads simultaneously. The reason that makes it tricky is that when you actually run 2 threads on a single core, they’re both running slower than if they’d been run on separate physical cores. So when scheduling a thread to run on a partially loaded CPU, the scheduler now has to decide between different performing NUMA nodes AND whether or not to schedule on a logical core (what an extra SMT/Hyperthreading core is called). At a low load, threads should load onto the NUMA node(s) connected to RAM, and stay bound to the node connected to the RAM where they are stored. But once you run out of those cores, it’s hard to say which will perform better, a physical core on the slower NUMA nodes or on a logical core on the already busy RAM connected NUMA node? I don’t envy the thread scheduler programmers job.

            • Mr Bill
            • 1 year ago

            OK, I see what you and JBI mean. Thanks for explaining. [quote]...AND all the cores are SMT enabled, meaning they can run 2 threads simultaneously. [/quote]

            • dragontamer5788
            • 1 year ago

            SMT is “hyperthreading”. The 2990wx is BOTH an SMT and NUMA chip.

            UMA (uniform memory-architecture) is the typical way of doing things.

            • just brew it!
            • 1 year ago

            [quote]UMA (uniform memory-architecture) is the typical way of doing things.[/quote]

            It's typical for desktops. It's less typical for servers. It is not entirely surprising that HEDT platforms might straddle the line between the desktop and server worlds, and go NUMA.

            Ever since the move to IMCs, it's unavoidable that any system with a high enough core count to require multiple CPU dies -- whether in multiple sockets, or a single-socket MCM -- will be NUMA. There's just no way around it.

            • Waco
            • 1 year ago

            Even within CPUs, various cores are closer/further away from the memory controller(s) on-die. This is measurable on ring-bus Intel chips and even more pronounced on the mesh topology dies.

            NUMA is the way going forward – hopefully all consumer OSes keep up.

      • ptsant
      • 1 year ago

      More generally, the difference in Win vs Linux is … astounding.

      [url]https://www.phoronix.com/scan.php?page=article&item=2990wx-linwin-scale[/url]

      I can't say if this is enough to change the competitive landscape because there is no Intel counterpart in this test, but the differences between OSs are often in the 20+% range.

    • wiak
    • 1 year ago

    please use standalone x264 and x265 binaries, or ffmpeg for that matter, as ffmpeg is what most software uses under the hood. heck, even AMD itself uses it in AMD ReLive, Google uses it for YouTube, and it's also used in Shotcut

    • RtFusion
    • 1 year ago

    Great review as always.

    Very interesting if TuRK can be adapted to other components like GPU or SSD testing?

    • gamoniac
    • 1 year ago

    [quote]We're not holding gaming performance against the 2990WX, to be clear—high-end desktop platforms are rarely the right choice for gamers trying to push the most frames possible.[/quote]

    Thanks for the review. For this CPU, it would have been better if TR isn't focusing so much on the gaming performance, as you have pointed out above.

    • cipfab
    • 1 year ago

    Could you please test with DAWBench like you did with the Threadripper 1950X? AMD had “bad” performance for audio, and I wonder if that is still the case.

      • Jeff Kampman
      • 1 year ago

      It’s on the to-do list once I finish the 2950X review.

    • gerryg
    • 1 year ago

    [quote]CPU Hash uses Intel's SHA Extensions on compatible chips to accelerate certain cryptographic functions. As the Threadrippers' disproportionately large results in this test show, they support those extensions. Intel's chips do not. [/quote]

    I'm not a CPU nerd, but that statement sounds weird. Intel doesn't support their own extensions??

      • Waco
      • 1 year ago

      They restrict them to Xeons IIRC.

      EDIT: Nevermind, read chuckula’s reply below.

        • chuckula
        • 1 year ago

        Actually, the SHA Extension was originally proposed by Intel but it is not implemented in the current Xeons.

        Original specification from 2013: [url]https://software.intel.com/en-us/articles/intel-sha-extensions[/url]

        Currently the main Intel chips that do support it are actually Goldmont Atoms: [url]https://en.wikipedia.org/wiki/Goldmont[/url]

        Expect to see it more fully supported in at least Ice Lake, and it might sneak into Cascade Lake.
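For readers who want to check whether their own chip advertises the SHA extensions discussed here, a minimal Python sketch follows. It assumes an x86 Linux system, where the kernel lists the feature as the `sha_ni` flag in `/proc/cpuinfo`; the function name `has_sha_extensions` is hypothetical.

```python
def has_sha_extensions(cpuinfo_text):
    """Return True if any 'flags' line lists the sha_ni feature flag."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "sha_ni" in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(has_sha_extensions(f.read()))
    except OSError:
        # Not Linux, or /proc is unavailable in this environment.
        print("no /proc/cpuinfo on this system")
```

Taking the file contents as a string keeps the parsing separate from the file access, so the check can be run against saved `cpuinfo` dumps from other machines.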

          • Waco
          • 1 year ago

          Shows how much I pay attention to those features. 😛

          I am looking into acceleration of deduplication with SHA extensions with Epyc nodes – Intel wasn’t even a consideration due to lack of memory bandwidth and PCIe lanes. :/

          • just brew it!
          • 1 year ago

          That’s pretty amusing. Likely some combination of one or more of: design cycles lining up in a way that was favorable to AMD getting this feature out the door; product segmentation decisions; and Intel prioritizing other features over this one.

            • willmore
            • 1 year ago

            They were probably aiming it at their 10nm chips, but those didn’t happen, and they never backported it to the rushed 14nm++(+?). I keep forgetting how many plusses we are up to.

    • LocalCitizen
    • 1 year ago

    may i ask how 16c / 32t mode was accomplished? bios option?

    another site reports that using the Ryzen Master software to enable Game Mode (1 die, 8 cores, dual memory channels, requires a reboot) improves game performance significantly. in particular, in far cry 5 at 1080p, the 2990wx in game mode was nearly as good as the 8700k.

    edit: oops i missed the segment in the conclusion

      • chuckula
      • 1 year ago

      [quote<]may i ask how 16c / 32t mode was accomplished? bios option? [/quote<] What kind of amateurs do you take TR for? The answer to your question is obvious: Plasma Torch!

        • LocalCitizen
        • 1 year ago

        but but the world’s smallest chainsaw is gathering dust

    • Mr Bill
    • 1 year ago

    As expected, Threadripper 2 does not make a very good dedicated video game controller. But I suspect with the right operating system scheduling, you could be running heavy computational model(s) and still have enough left over to play a good game.

    • benedict
    • 1 year ago

    There must be something wrong with AC:Origins test. 2950 should outperform 1950 by 3-7% and it does so in all other tests. It’s architecturally the same, but with higher frequency so there’s absolutely no reason for the 2950 to flop on that test.

    • USAFTW
    • 1 year ago

    The 2990WX is insane for people whose work can take advantage of so many threads, but I was shocked to see how badly it did in gaming. Then again, that’s really not the point of either of these chips I suppose.
    The 2950 is a bit more balanced it seems, and it gets a nice price cut to further undercut Intel.
    AMD seems to have successfully captured lightning in a bottle. While Intel scratches its head trying to get 10nm going, AMD could gain some nice marketshare, hopefully.
    If only their GPUs were any good. Sigh. I guess it was too much to ask AMD for both CPU and GPU prowess.

      • Kretschmer
      • 1 year ago

      AMD marketshare has generally not tracked relative performance. On one hand they have to fight public perception as an inferior good, on the other hand fanbois keep them afloat during the dark days. The net is a firm that bumbles along without making large gains.

    • dragontamer5788
    • 1 year ago

    Hmm, it definitely seems like the 32-core Threadripper is incredibly niche. With these benchmarks, the only task that really makes sense is rendering (Blender Cycles or whatever).

    It seems like compilers are now single-thread bound (at least, with this many cores). Maybe the linker step (which has traditionally been single-threaded). Other major tasks like H264 or Audio encoding, are also not really benefiting that much from the additional power.

    In my own tests, Magix Vegas 15 barely uses 30% of my 1950x. So it too is single-threaded bound.

    ———–

    So it seems like a processor purely designed for the 3d rendering market. An important market for sure, but it just seems too impractical for virtually everyone else. EDIT: I guess Threadripper 2990WX might be good for a “cheap server” option. EPYC chips require more-expensive RDIMMs, while you can buy ECC UDIMMs on Threadripper. So a ~128GB RAM server will probably be cheaper on the Threadripper 2990WX, rather than an EPYC.

    With that being said, the 2950x looks incredible.

    • Neutronbeam
    • 1 year ago

    Ripping good review Jeff–and thanks for it!

    • karma77police
    • 1 year ago

    It looks like these CPUs suck, especially in gaming. If I had to pay for a 2990WX, I would get the Intel 18-core instead… a much better CPU all around. You also don’t have to micromanage anything with an Intel CPU. AMD delivered a number of cores but in reality didn’t really deliver anything. As soon as Nvidia releases a new video card, this performance gap between AMD and Intel in gaming will creep into 2K and 4K resolutions in favor of Intel.

    There is a really good reason for this… this is not a true 32/64 CPU but 4 CPUs slapped together, so in reality AMD didn’t do anything outstanding… to me, a waste of money.

      • K-L-Waster
      • 1 year ago

      They do suck at gaming, but that’s a bit like saying a Ford Super-Duty sucks on the race track compared to a Mustang. Well, yeah, but not the point.

        • chuckula
        • 1 year ago

        Irony alert: AMD chose Ferraris for the vehicle at their “totally not trying to influence the media” day.

        They should have brought out pimped-out [s<]18[/s<] [u<]make that 32[/u<] wheelers.

        • brucethemoose
        • 1 year ago

        Nonsense.

        [url<]https://youtu.be/C3tfJjVuY-o[/url<]

      • thx1138r
      • 1 year ago

      If you’re talking about gaming and serious multithreading, a better option would seem to be to get two systems. I reckon you could build an i7-8086K system and a 1950X system for the price of a single i9-7980 system.
      Don’t get me wrong, the i9-7980 is a very powerful chip, but it’s not great for gaming, it’s not the best multi-tasking HEDT chip any more and it’s particularly poor value when compared with the 2950X.

    • ermo
    • 1 year ago

    I think it was a clever move by AMD to hold back the best bins of the Zen+ dies for the 2950X, as it now provides genuinely superior performance in e.g. games compared to the 2700X.

    This way you don’t end up in the situation that intel finds itself in, where the HEDT platform is slower than its consumer platform in single-thread dominated scenarios, despite costing quite a bit more.

    Well played AMD.

    [i<][b<]Mandatory chuckula disclaimer:[/b<][/i<] I'm not implying that intel doesn't make excellent CPUs. They clearly do.

      • chuckula
      • 1 year ago

      [quote<]where the HEDT platform is slower than its consumer platform in single-thread dominated scenarios[/quote<] Oh, I'll take the 8086K [and soon 9900K] any day of the week if it's a single-threaded or even single-digit threaded performance you want.

        • ermo
        • 1 year ago

        Don’t get me wrong: Assuming that intel implements mitigations to the best of their ability for Meltdown, Spectre and L1 Terminal Fault, I can easily see myself aiming for an i9-9900K over the 2700X since I’m contemplating a VR build to replace my relatively ancient delidded 3770k.

        However, the R7 2700X is looking nice for the platform price, the TR 2950X is mighty tempting (I’m going to be doing a fair bit of FEA going forward) and Zen2 is juuust around the corner.

        Decisions, decisions.

    • ermo
    • 1 year ago

    Cheers for the review and nice to see the Mechanical TuRk perform as hoped.

    If you have the hardware, would you be interested in doing automated benchmarks of the following CPUs:

    – i7-920
    – i5-2500k or 2550k
    – i7-2600k or 2700k
    – i5-3570k
    – i7-3770k
    – i7-4570k
    – i7-4770k or 4790k
    – i7-5775C
    – i5-6500k
    – i7-6700k
    – i7-7700k
    – i7-8086k
    – Phenom II X4 980 BE
    – Phenom II X6 1100T
    – FX-8170 (Bulldozer)
    – FX-6300 (Piledriver)
    – FX-8370 (Piledriver)
    – Athlon X4 880K (Steamroller, DDR3)
    – Athlon X4 845 (Excavator, DDR3)
    – Athlon X4 970 (Excavator, DDR4)
    – Ryzen 3 2200G (just the CPU)
    – Ryzen 5 2400G (just the CPU)
    – R5-1600X
    – R7-1800X
    – R5-2600X
    – R7-2700X

    Would be a pretty massive undertaking, but it should provide some excellent scaling data in the post-Meltdown and -Spectre world and possibly drive a nice few upgrades via your partner links?

      • Kretschmer
      • 1 year ago

      We need more contemporary i3 and i5 benchmarks.

      • Khali
      • 1 year ago

      I asked about this a while back. The answer was that it’s just too big of a job, since they would have to round up the proper chipset motherboards and memory for each CPU.

      If you’re really curious, head over to [url<]http://cpu.userbenchmark.com/[/url<].

    • fellix
    • 1 year ago

    Looks like the new XFR clock boost could result in negative performance scaling for TR2 in some cases. There’s still some fine-tuning to be done at the firmware and OS level for this platform, particularly for the asymmetrical NUMA operation.

    • crystall
    • 1 year ago

    The new logo truly shows the Judas Priest-inspired lineage of the Threadripper name.

      • MileageMayVary
      • 1 year ago

      The cores are Threadrippers, the prices are Painkillers.

    • Unknown-Error
    • 1 year ago

    2950X, especially for $899, looks good.

    • psuedonymous
    • 1 year ago

    I think the pursuit of More Cores is going to run up against a big challenge pretty soon: the GPU. In the embarrassingly parallel workloads where these lots-of-OK-cores chips excel, a GPU excels even more by dint of greater memory bandwidth and even more cores. Even in raytracing, where high core count CPUs have typically held their edge, literally the just-about-to-release generation of GPUs has raytracing acceleration as its headlining feature.

    I suspect there are going to be few workloads where a many-cores CPU is going to be competitive with a few-but-fast cores CPU and spending the difference on a GPU.

      • brucethemoose
      • 1 year ago

      The problem is paying someone to write GPU code, vs massaging or just spawning multiple instances of x86 code.

        • spiketheaardvark
        • 1 year ago

          I’m just a PhD who kludges together some Python to do relatively simple tasks. A lot of what I do falls under the embarrassingly parallelizable definition. Even I can play the spawn-multiple-x86-instances game. I looked into using a GPU and decided quickly that I’ve got other things to do, and “just run it overnight” is surprisingly efficient when it comes to my time. In theory, the GPU in my laptop could do the job faster than our 16-core workstation, but not with me doing the coding.
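The “spawn multiple instances” approach described here can be sketched in a few lines of Python. This is a minimal illustration, not anyone's actual workflow: `analyze_sample` is a hypothetical stand-in for one independent unit of work, and the standard-library `multiprocessing.Pool` hands those units out to worker processes.

```python
import multiprocessing as mp


def analyze_sample(seed: int) -> int:
    """Hypothetical stand-in for one independent, CPU-bound unit of work."""
    value = seed
    for _ in range(100_000):
        # Cheap integer churn (a linear congruential step) to keep a core busy.
        value = (value * 1103515245 + 12345) % 2**31
    return value


def run_parallel(seeds, workers=None):
    """Map analyze_sample over seeds using a pool of worker processes.

    workers=None lets multiprocessing start os.cpu_count() processes.
    """
    with mp.Pool(processes=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return pool.map(analyze_sample, list(seeds))
```

Called from a script's `if __name__ == "__main__":` section, `run_parallel(range(64))` would fan 64 jobs out across the available cores. Because the jobs share nothing, throughput should scale roughly with core count until memory bandwidth or I/O gets in the way.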

          • psuedonymous
          • 1 year ago

          This is where Larrabee/Xeon Phi fit in.

            • brucethemoose
            • 1 year ago

            Xeon Phi is marketed as an AI accelerator atm.

            Intel said they’re working on a new design for that niche for 2020/2021, but until then I don’t think Phi will be particularly competitive with big server CPUs.

            • Waco
            • 1 year ago

            The Phi is actually quite good if you can run dense code that relies on both vector width and gobs of memory bandwidth. I sit next to nearly 10,000 of them and they can crank through big problems far faster than anything that has been built before. GPUs need not apply, you can’t get data in/out of them fast enough and they don’t have anywhere near enough local memory to be useful yet.

            • Eversor
            • 1 year ago

            Not really. These don’t have the same branch predictors as said workstation, and you also need to make full use of AVX-512 on top of that. Otherwise you’re just better off with these many-core x86 chips.

        • Mr Bill
        • 1 year ago

        Not to mention trying to use a GPU to manage large amounts of memory and storage. CPUs definitely have their niche.

      • chuckula
      • 1 year ago

      GPUs are great at certain parallel workloads.

      However, that does not mean that GPUs are great at [b<]all[/b<] parallel workloads.

      • Eversor
      • 1 year ago

      A GPU does not work the way you think. They are fast at those tasks because it’s code without branches that can run in simple cores that can mostly do multiply+add. The second one thread branches, it’s done.

      These x86 cores can run complex, continuously branching code without such issues. They can also run some vector loads (non-branching code), as GPUs do.

      To each problem, its own solution.

    • thx1138r
    • 1 year ago

    The 2990WX looks like a great chip for the people that need such a highly threaded CPU, but I’m really looking forward to the separate 2950X review. That seems to be the real winner here, usually pretty close to the 7980 in performance at half the price.

      • chuckula
      • 1 year ago

      Man, the AMD crowd is really touchy today when they are downthumbing posts that compliment the 2950X but don’t push the 2990WX hard enough.

        • thx1138r
        • 1 year ago

        For all the talk about the 2990WX it’s really a niche chip, and a small niche at that. Great for those few people that can regularly make use of that number of threads, but as this review shows it can actually be a hindrance for the average power-user on the street.
        The bad news for AMD is that the actually useful chip, the 2950X, has been completely overshadowed in the run-up to this launch. Barring a few small exceptions in gaming, it outperforms its predecessor quite admirably, and at a lower price point.

        • Unknown-Error
        • 1 year ago

        The 2950X is within striking distance of the 7980XE in multithreaded workloads, and its max boost of 4.4 GHz gives it decent single-threaded performance, all for $899. That’s somewhat of a steal. On the other hand, I am not convinced of the usefulness of the 2990WX.

          • Waco
          • 1 year ago

          In apps that can scale well, the 2990WX delivers lower task energy usage [i<]and[/i<] the best performance. Those two rarely go hand in hand! That said, if you're looking at 32 cores the cost of an extra few DIMMs and a workstation/server motherboard for Epyc isn't much of a price increase. You do lose that absurdly high (for such a high core count part) single-threaded clock boost if you go Epyc though.

            • Unknown-Error
            • 1 year ago

            [quote<]In apps that can scale well, the 2990WX delivers lower task energy usage and the best performance. Those two rarely go hand in hand![/quote<] Yeah, but then the same can be said about the 7980XE and 7960X. If you look at the AIDA64 FPU Julia/Mandel and FP32/FP64 Ray-Tracing benchmarks, thanks to AVX-512, the 7980XE and 7960X absolutely massacre their opponents - [url<]https://techreport.com/review/33977/amd-ryzen-threadripper-2990wx-cpu-reviewed/3[/url<]

            • Waco
            • 1 year ago

            Yup! Totally depends on the application.

            • Mr Bill
            • 1 year ago

            Indeed…[quote<]Ahead of this review, we set out to find a more vendor-neutral and up-to-date computational fluid dynamics benchmark than the wizened Euler3D. As it happens, the SPECwpc benchmark includes a CFD test constructed with Microsoft's HPC Pack, the OpenFOAM toolkit, and the XiFoam solver. More information on XiFoam is available here. SPECwpc allows us to yoke every core and thread of our test systems for this benchmark.[/quote<]

    • JosiahBradley
    • 1 year ago

    Would love to see the test done on Server 2019. This is a workstation part after all.

      • Krogoth
      • 1 year ago

      I doubt it would have a meaningful impact for the applications that would be affected by possible OS thread scheduling and NUMA issues.

    • ET3D
    • 1 year ago

    I imagine that we’d see some threading optimisations in Windows 10 version 1903, released in June 2019, which will help 2990WX performance.

    • techguy
    • 1 year ago

    Thanks for the review, excellent as always, and thanks for performing the Handbrake test as that has allowed me to make a purchase decision (or rather, delay one).

      • brucethemoose
      • 1 year ago

      Encoding should definitely be run in multiple instances on a CPU like this. I was kinda sad TR didn’t test that, actually.

      IIRC someone made an encoder GUI that automatically splits single videos into multiple clips, encodes them in parallel, then stitches them together at the end, but I can’t remember what it was… It’s not StaxRip or MeGUI, and certainly not HandBrake (which is kinda over-rated IMO).
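The split, encode in parallel, then stitch idea can be sketched as follows. This is an assumption-laden illustration, not how any particular GUI does it: the helper names are hypothetical, the slices are equal-length, and the commands rely only on ffmpeg's standard `-ss`, `-t`, `-i`, and `-c:v libx264` options plus the input format of its concat demuxer.

```python
def chunk_ranges(total_seconds: float, n_chunks: int):
    """Split [0, total_seconds) into n_chunks equal (start, duration) pairs."""
    length = total_seconds / n_chunks
    return [(i * length, length) for i in range(n_chunks)]


def encode_command(src: str, start: float, duration: float, dst: str):
    """Build an ffmpeg argument list that encodes one slice of src with x264."""
    return [
        "ffmpeg",
        "-ss", f"{start:.3f}",    # seek to the start of this slice
        "-t", f"{duration:.3f}",  # encode only this many seconds
        "-i", src,
        "-c:v", "libx264",
        dst,
    ]


def concat_list(parts):
    """Text for ffmpeg's concat demuxer: one "file '<name>'" line per part."""
    return "".join(f"file '{p}'\n" for p in parts)
```

Each command could be launched with `subprocess.Popen` so that several slices encode at once, and the finished parts joined with `ffmpeg -f concat -safe 0 -i parts.txt -c copy out.mkv`. A real tool would also need to choose cut points carefully and handle audio across the joins, which this sketch ignores.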

        • techguy
        • 1 year ago

        I agree, though if the individual instances run slower than competing parts the only way to scale is to run more simultaneous transcodes and that’s not always an option for me. I think I’ll go Quadro P2000 rather than upgrade my CPU again.

    • Krogoth
    • 1 year ago

    Great workstation/low-end server-tier CPU, but it appears to be marred by its topology in some applications and might be memory bandwidth limited in some cases (the quad-channel DDR4 can’t keep all those 32 cores happy, and the odd topology doesn’t help either).

    • NTMBK
    • 1 year ago

    Nice, a side dish of screen tearing to go with that thread ripping.

      • NTMBK
      • 1 year ago

      I mean, obviously, if you buy this thing to play games, you’re mad. It wasn’t designed for it. Go buy a normal Ryzen. But the massive gulf in results here is interesting. I don’t think it’s just the core count: in some of those gaming benchmarks the extra cores [i<]hurt[/i<] performance, so clearly having any threads scheduled on those dies hurts the game. I think the extra hop needed to get to memory (or the GPU, or storage) really hurts. It's such a strange product.

        • chuckula
        • 1 year ago

        [quote<]in some of those gaming benchmarks the extra cores hurt performance,[/quote<] And what did Chuckuladamus [url=https://techreport.com/discussion/33968/amd-second-generation-ryzen-threadripper-cpus-revealed?post=1086024<]predict?[/url<] [quote<]All of these chips are getting into the realm of performance tradeoffs to accommodate higher core counts. There are going to be situations where there's nearly perfect scaling, negative scaling -- and I don't only mean in single-threaded applications -- and in-between scaling AMD is claiming with rendering benchmarks.[/quote<]

          • Krogoth
          • 1 year ago

          Of course

          These multi-core chips aren’t going to be straightforward choices. The topology of their logic, memory bandwidth and cache are going to be decisive factors.

        • Krogoth
        • 1 year ago

        It is a pure HEDT SKU. Skylake-X and Threadripper I chips aren’t exactly optimal choices for gaming either.

      • tipoo
      • 1 year ago

      *Rip and Tear from Doom already playing in my head*

    • derFunkenstein
    • 1 year ago

    [quote<]We won't be making the TuRk package itself available to the community, but interested parties can still verify our work independently or compare the performance of their own systems by downloading and running any particular benchmark or benchmarks of interest.[/quote<] Definitely the right call. No reason to let other websites steal your work and benefit from it. If they wanted it they could do it themselves. 😀 Nice work there, Bruno! edit: i had a correction but it was factually incorrect. Ignore.

      • morphine
      • 1 year ago

      Thanks! 🙂 I’m really happy for the opportunity to be let loose on something that’ll save tons of work in the future. Some of the work was touch-and-go as I learned some intricacies as I went along, but it’s very rewarding to see the utilities put to great use by Jeff.

    • Kretschmer
    • 1 year ago

    Countdown to Alienware bundling one of these with a 1070 and/or people buying Threadripper to watch Netflix and play World of Warcraft?

      • chuckula
      • 1 year ago

      Threadripper 2 is the ULTIMATE gaming CPU!

      Try running the new 24-thread CPU coin-mining DLC that we are adding to all our games on those pathetic “Intel” processors and get back to us about your precious frame time performance!

        • Kretschmer
        • 1 year ago

        Hey if you’re playing a AAA game, streaming to Twitch, unzipping, and encoding a DVD, Threadripper 2 will dominate with its MANLY CORE COUNT.

          • Srsly_Bro
          • 1 year ago

          Manly Core is the evolution of Magny Cour.

          • maroon1
          • 1 year ago

          Yeah, unzipping, but not zipping. The 2990WX has less than half the performance of the 7980XE when compressing.

          Gaming performance is horrible. It can’t get 60fps in some games, and it has horrible frame times, especially in Far Cry 5, and that’s without running anything else in the background.

          You could disable half of the cores to improve gaming performance, but if you do that, then the 7980XE will beat it in multitasking anyway (gaming with a background application running), because it’s 18 fast cores vs. 16 slow cores. In other words, the 2990WX is not a good choice for gaming and multitasking at the same time. You either get poor gaming performance with good multi-core performance, or, if you use the gaming mode (disabling half of the cores), you get worse performance than the 7980XE.

            • Redocbew
            • 1 year ago

            Threadripper 2 is not the ultimate gaming CPU? I am disappoint.

            • Kretschmer
            • 1 year ago

            (We are engaging in satire. But doing it faster than you, because of MANLY CORES.*)

            *Not really, I’m on a wimpy four-core i7. Don’t tell anyone.

            • Redocbew
            • 1 year ago

            If I’m doing it on a lowly i3, then does that mean I have higher SPT(Satire Per Thread)?

    • Waco
    • 1 year ago

    Those gaming results (and 7zip) are just weird. Something is clearly causing threads to be scheduled funkily on the 2nd gen Threadrippers; on paper they’re a clean upgrade from the 1950X in everything even for the 2950X.

    Still – if you need cores, AMD clearly knocked this one out of the park!

      • chuckula
      • 1 year ago

      I SAY MOAR

      YOU SAY COAR!

      MOAR!

        • Waco
        • 1 year ago

        COAR!

      • Krogoth
      • 1 year ago

      It is entirely [b<]NUMA-related[/b<] and the nature of being an MCM chip. I suspect the games in question are dancing around each of the cores on the Threadripper 2 dies. I'm not sure if Windows 10's thread scheduler is at fault or not. It would be interesting to see how forcing CPU affinity could affect results. Threadripper 2's topology behaves more like a quad-socket system, so it shouldn't be that surprising that it doesn't fare that well in single- and dual-threaded applications.
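For anyone wanting to try the affinity experiment, a process's allowed cores are usually expressed as a bitmask with one bit per logical processor. A minimal sketch of building such a mask follows; the assumption that logical processors 0 to 15 all sit on a single die is hypothetical and would need checking against the real topology of the machine.

```python
def affinity_mask(logical_cpus) -> int:
    """Return a bitmask with bit n set for each logical processor n."""
    mask = 0
    for cpu in logical_cpus:
        mask |= 1 << cpu
    return mask


# Hypothetical layout: logical processors 0-15 all belong to one die.
ONE_DIE = range(16)

if __name__ == "__main__":
    print(format(affinity_mask(ONE_DIE), "X"))  # prints FFFF
```

On Windows, a hexadecimal mask like this should be usable with the `start /affinity` command in cmd.exe to launch a game pinned to those processors, which would show whether keeping threads on one die changes the results.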

        • Waco
        • 1 year ago

        The 2950X is only 2 dies, just like the 1950X. The 1950X shouldn’t be faster in any of the tests, yet it pulls away pretty dramatically in a few.

        It’ll be interesting to see how these chips fare in those tests after a few updates roll out.

          • moose17145
          • 1 year ago

          With regard to gaming, Linus Tech Tips actually noted an interesting observation: Precision Boost 2 seems to be a possible culprit for the 1950 besting the 2950.

          The reason for this was that Precision Boost 2 was using its finer granularity to hit intermediate clocks vs. the “all or nothing” approach of Precision Boost v1. So as long as the 1950 still had thermal headroom, it would boost up to its max and just hold the higher frequency, vs. the 2950 settling into a more thermally happy middle ground as far as clock speeds are concerned.

            • Waco
            • 1 year ago

            That would be an interesting side effect. Based on the other wonky results though, I’m hoping it’s a small contributor versus the major reason.

        • just brew it!
        • 1 year ago

        [quote<]Threadripper 2's topology behaves more like a quad-socket system[/quote<] It's even worse than that -- it is effectively a quad-socket topology with memory channels connected to only two of the sockets! If the OS's thread scheduler is erroneously putting threads on the cores with no direct path to memory when there are still cores available on the other dies, that's really gonna hurt.

          • Waco
          • 1 year ago

          It’s not really any worse than having your process consume memory in excess of your local node, though. It’s just that you hit it immediately for all accesses instead of having anything local.

          There’s clearly something funny with the scheduler though – the 1950X shouldn’t be winning anything versus the 2950X.

    • kokolordas15
    • 1 year ago

    All good with the review but may i ask what made you turn the ram speed down from ~3800 to 3200 on 8086k?

    • tahir2
    • 1 year ago

    Hmm I want to talk about this Mechanical TuRk called Suleiman… Are the Muslims taking over again?

      • morphine
      • 1 year ago

      Two separate entities, and [url=https://en.wikipedia.org/wiki/Mechanical_Turk<]Mechanical TuRk[/url<] is an appropriate name (Forge came up with the idea) for an automation tool. Meanwhile, [url=https://en.wikipedia.org/wiki/Suleiman_the_Magnificent<]Suleiman[/url<] (ruler of Turks, even mechanical ones) is the master utility that collects results from one or more TuRks running on separate systems and dumps the data into Excel automagically.

        • psuedonymous
        • 1 year ago

        Would that network be considered a suzerainty?

          • Captain Ned
          • 1 year ago

          No, more like a vilayet.

        • tahir2
        • 1 year ago

        I like it… shame irony doesn’t translate well over the interweb.

      • Unknown-Error
      • 1 year ago

      Huh?

    • chuckula
    • 1 year ago

    A few more serious observations:

    1. In GCC you’ve clearly hit Amdahl’s law. I’ll bet that the dominant time sink in that test is now in the non-multithreaded linker. Newer linker implementations like the gold linker or LLVM might show better scaling.

    2. 7 zip is just plain weird right now. The 7980XE shouldn’t win at compression by the margin it does and the 2990WX shouldn’t win at decompression by the margin it does either.
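Amdahl's law, mentioned in the first point, can be put into numbers with a short Python sketch. The 90% parallel fraction below is an illustrative guess, not a measurement of the GCC test.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when only parallel_fraction of the work scales with cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # 0.90 is an assumed figure chosen purely for illustration.
    for cores in (8, 16, 32):
        print(cores, round(amdahl_speedup(0.90, cores), 2))
```

Under that assumed figure, going from 16 to 32 cores lifts the ideal speedup only from 6.4x to about 7.8x, and no core count can push it past 10x. A serial step such as linking would cap scaling in just this way.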

      • shank15217
      • 1 year ago

      Windows 10 isn’t optimized for 4 NUMA cores; it’s a ‘desktop’ OS. Real power users should look into Linux if they want to use this processor.

        • Waco
        • 1 year ago

        More likely the scheduler just needs a bit of tuning. The 2950X only has 2 NUMA domains, just like the 1950X.

        • Yan
        • 1 year ago

        Phoronix’s results for [url=https://www.phoronix.com/scan.php?page=article&item=amd-linux-2990wx&num=4<]Indigo[/url<] are very different from Tech Report's: they show the 2990WX well ahead. Perhaps some of the surprising results are caused by the software or the OS itself. Of course, that doesn't help much if you have to use Windows or a Windows program. Edit: and [url=https://www.phoronix.com/scan.php?page=article&item=2990wx-linux-windows&num=2<]another test[/url<] shows that 7Zip is about twice as fast under Linux as under Windows with the 2990WX.

          • just brew it!
          • 1 year ago

          Yup. Clearly the Linux thread scheduler is better tuned for this chip than the Windows one. I expect this will be remedied in a future Windows update.

            • Mr Bill
            • 1 year ago

            It’s also apparent, as you noted, that MS is not as responsive with Windows updates to manufacturers’ input (AMD’s notification and help) as the open source community is with Linux.

          • Eversor
          • 1 year ago

          This can’t be up voted enough.

          As you point out, this should be software or scheduler based. It has also occurred in Ashes of the Singularity when Ryzen first came out, until they patched the game to account for CCX latency differences.

          The whole benchmark suite shows Windows 10 getting thoroughly trounced by all Linux distros.

          • Mr Bill
          • 1 year ago

          Found this [url=https://www.realworldtech.com/forum/?threadid=179265&curpostid=179405<]RWT thread talking about the Phoronix results[/url<]

    • chuckula
    • 1 year ago

    Fear not!
    In a completely unrelated move, Intel has released this slick Hollywood [url=https://youtu.be/92gP2J0CUjc?t=1m32s<]response outlining its new HEDT strategy![/url<]

      • thx1138r
      • 1 year ago

      Actually Intel’s new strategy seems to be to decrease the need for HEDT chips, hence their purported new 8-core desktop i9.

      • Srsly_Bro
      • 1 year ago

      Intel’s holy hand grenade will be out in 2020™.

        • chuckula
        • 1 year ago

        Until then we’re stuck with a trojan rabbit.

          • K-L-Waster
          • 1 year ago

          Well, it does have big naaasty teeth….

          • anotherengineer
          • 1 year ago

          Rabbits need trojans, yes they do.

      • Redocbew
      • 1 year ago

      And here I was thinking that Chuckula clearly thought Threadripper was the funniest joke in the world.

        • Ummagumma
        • 1 year ago

        If you know Chuckula then you know he is a flaming Intel fanboi.

        Therefore any AMD product is a funny joke to Chuckula.

      • Mr Bill
      • 1 year ago

      That rabbit’s dynamite!
