Intel’s Core i9-7900X CPU reviewed, part one

Intel’s Core X CPUs are here, and we’re kicking off this new era with the highest-end chip in the lineup so far: the Core i9-7900X. As it traditionally does for its high-end desktop platform, the company is repurposing silicon from its upcoming Skylake Xeons to serve as Skylake-X chips. That means some unusually large changes are in store for us enthusiasts as Skylake makes its transition from mainstream desktops to the data center.

The Core X family of CPUs needs a new socket, LGA 2066, and a new platform, called X299. We’ve already covered the Core X lineup and the X299 platform, as well as the entry-level Kaby Lake-X CPUs for that platform, in a dedicated article. The short take is that X299 is an evolutionary step forward from the X99 platform. It keeps that platform’s quad-channel memory support (out of a total of six on at least some Skylake-X dies) and pairs it with a chipset powered by a lot of the same DNA present in the Z270 platform. If you need to brush up on Core X before reading on, feel free.

Now, back to Skylake-X. The fundamental pipeline of this chip isn’t much different from the various Skylake and Kaby Lake desktop parts that we’ve known and loved for almost two years now. We never did a deep dive into the Skylake architecture, but compared to Haswell and Broadwell, the basic Skylake integer pipeline is wider and can have more going on at once. To prevent this wider engine from burning power on execution of the wrong instructions, Skylake also features a better branch predictor than Haswell and Broadwell, according to Intel.

The basic Skylake core as seen in the Core i7-6700K and friends. Source: Intel

That’s a gross oversimplification, of course, but it’s generally how Intel has improved its chips over the past few years. Let’s have a look at the big differences between mainstream and server Skylake CPUs now.

AVX-512 unhinges its jaw

The first big change in Skylake-X is support for the AVX-512 instruction set. These new instructions add important new capabilities to Intel’s SIMD implementation, including scatter-gather support, dedicated state and mask registers, and much, much more. To support this generation of AVX, the chip’s vector data registers are now twice as wide, and there are twice as many of them. These wider registers are fed with more load and store bandwidth. Skylake-X can now handle two 64-byte loads and one 64-byte store per cycle, compared to a single 64B load and a single 32B store per cycle in mainstream Skylake.
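To get an intuition for what those dedicated mask registers buy programmers, here's a scalar Python sketch of a merge-masked fused multiply-add across a 16-lane vector. This is our own illustration of the concept, not Intel's semantics verbatim: lanes whose mask bit is clear keep their old destination value instead of the computed one.

```python
def masked_fmadd(a, b, c, mask):
    """Merge-masked FMA: lanes with a set mask bit get a[i]*b[i] + c[i];
    masked-off lanes pass the destination value (c[i]) through unchanged."""
    return [a[i] * b[i] + c[i] if (mask >> i) & 1 else c[i]
            for i in range(len(a))]

# 16 lanes, with only the even-numbered lanes enabled
a = [2.0] * 16
b = [3.0] * 16
c = [1.0] * 16
mask = 0b0101010101010101
result = masked_fmadd(a, b, c, mask)  # even lanes: 7.0, odd lanes: 1.0
```

Hardware does all 16 lanes in one instruction, of course; the point is that predication like this removes the branches that used to make irregular loops hard to vectorize.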

In addition to its wider and more numerous registers, the Skylake-X core has a dedicated AVX-512 fused multiply-add (FMA) unit on top of the pair of 256-bit-wide AVX FMA units in Skylake-S. This unit handles only AVX-512 FMAs, and it resides on port five of the Skylake-X unified scheduler. Since the pair of 256-bit FMAs can execute a single AVX-512 FMA in parallel alongside the dedicated AVX-512 unit, throughput for that common instruction is effectively doubled in the best case compared to mainstream Skylake.

The AVX register structure in Skylake-X CPUs. Source: Intel

While those performance improvements may sound impressive, real-world AVX-512 performance comes with caveats. Firing up those monster SIMD units draws a large amount of power (and therefore produces more heat), so Skylake-X CPUs might be forced to clock below Intel's specified Turbo speeds (or not Turbo at all) when executing AVX-512 instructions. That clock-speed tradeoff might result in lower-than-expected performance from AVX-512 code, and Intel says developers will need to weigh the expected performance gain of their AVX-512 code against the clock-speed drop it might incur. Mixed workloads with only a small proportion of AVX-512 instructions in the overall mix are apparently not an ideal case for speedups from that SIMD hardware, either.
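That tradeoff can be captured with a crude Amdahl-style model. The function and numbers below are our own hypothetical sketch, not Intel's guidance: the vectorizable fraction of a workload speeds up, but the whole core pays the lower AVX clock.

```python
def effective_speedup(simd_speedup, avx_clock_ghz, base_clock_ghz, avx_fraction):
    """Crude model: a fraction of the work speeds up by simd_speedup,
    but the entire core runs at the reduced AVX clock while it executes."""
    clock_ratio = avx_clock_ghz / base_clock_ghz
    time_with_avx = (1 - avx_fraction) + avx_fraction / simd_speedup
    return clock_ratio / time_with_avx

# Hypothetical figures: 4x SIMD gain, clocks dropping from 4.0 to 3.6 GHz
lightly_vectorized = effective_speedup(4, 3.6, 4.0, avx_fraction=0.1)  # < 1.0
heavily_vectorized = effective_speedup(4, 3.6, 4.0, avx_fraction=0.9)  # > 2.7
```

With only 10% of the work vectorized, the clock penalty swamps the gain and the code actually runs slower; at 90%, it's a clear win. That's exactly the mixed-workload caveat Intel is warning about.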

| Model | Base clock (GHz) | Turbo clock (GHz) | Turbo Boost Max 3.0 clock (GHz) | Cores/threads | L3 cache | PCIe 3.0 lanes | Memory support | TDP | Socket | Price (1K units) |
|---|---|---|---|---|---|---|---|---|---|---|
| i9-7980XE | | | | 18/36 | | | | | LGA 2066 | $1999 |
| i9-7960X | | | | 16/32 | | | | | | $1699 |
| i9-7940X | | | | 14/28 | | | | | | $1399 |
| i9-7920X | | | | 12/24 | | | | | | $1199 |
| i9-7900X | 3.3 | 4.3 | 4.5 | 10/20 | 13.75MB | 44 | Quad-channel DDR4-2666 | 140W | | $999 |
| i7-7820X | 3.6 | 4.3 | 4.5 | 8/16 | 11MB | 28 | | | | $599 |
| i7-7800X | 3.5 | 4.0 | NA | 6/12 | 8.25MB | | Quad-channel DDR4-2400 | | | $389 |
| i7-7740X | 4.3 | 4.5 | NA | 4/8 | 8MB | 16 | Dual-channel DDR4-2666 | 112W | | $339 |
| i5-7640X | 4.0 | 4.2 | NA | 4/4 | 6MB | | | | | $242 |

What’s more, not every Core X chip in the lineup will enjoy the same boost in SIMD performance from AVX-512. Only the Core i9 series of CPUs will ship with the dedicated AVX-512 FMA. The Core i7-7800X and Core i7-7820X will still have the wider registers for AVX-512, but they’ll only execute instructions using the pair of 256-bit AVX units common to all Skylake chips. This exercise in segmentation might surprise people expecting a uniform performance increase from AVX-512 across all the CPUs that support it. (The Kaby Lake-X Core i5-7640X and Core i7-7740X won’t support AVX-512 at all.)

 Because of those caveats, we may be waiting a while for mainstream desktop applications that can really take advantage of all the extra parallelism on offer from these new instructions. Scientific-computing, deep-learning, and financial-services folks will probably be drooling for AVX-512, but regular Joes and Janes probably won’t see any major speedups until companies recompile their software (at the very least). That assumes AVX-512 is coming to mainstream Intel CPUs, as well.

 

Bigger L2 caches for better performance

Skylake-X also has a much different cache allocation per core compared to its mainstream counterparts. Instead of the relatively small 256KB L2 cache (or mid-level cache, in Intel parlance) in Skylake-S and Broadwell-E, each Skylake-X core enjoys a whopping 1MB of private L2. In support of AVX-512, the bandwidth between the L1 data cache and the L2 cache has been increased to 128 bytes per cycle. On top of that size increase, Intel quadrupled the associativity of the cache, from four ways in Skylake-S to 16 ways in Skylake-X. Intel says the move to a larger private cache lets programmers keep usefully large data structures close to the core, and the result is higher performance. Pretty cut and dried.
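One neat consequence of quadrupling capacity and associativity together — assuming the usual 64-byte cache lines, which is our assumption rather than something Intel spelled out here — is that the number of sets, and therefore the set-index bits, stays the same. A quick sanity check:

```python
def cache_sets(size_bytes, ways, line_bytes=64):
    """Number of sets = capacity / (associativity * line size)."""
    assert size_bytes % (ways * line_bytes) == 0
    return size_bytes // (ways * line_bytes)

skylake_s_l2 = cache_sets(256 * 1024, ways=4)    # 256KB, 4-way
skylake_x_l2 = cache_sets(1024 * 1024, ways=16)  # 1MB, 16-way
# Both come out to 1024 sets, so address-to-set indexing is unchanged.
```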

Intel says it undertook this change because it felt its older architectures placed too much emphasis on data sharing through the L3 caches. In turn, Skylake-X’s architects reduced the shared last-level cache allocation to as much as 1.375MB per core, compared to as much as 2.5MB per core for Broadwell-E chips. This last-level cache isn’t inclusive of the L2 caches, and it serves as a victim cache for the L2. The tradeoff for this rebalancing of cache allocations is a higher L3 cache access latency, according to Intel.

Getting meshy

Finally, Intel is abandoning the ring topology it has used to connect CPU cores in its many-core CPUs for several generations. In place of the ring, the company is introducing a new (or at least new outside of the Knights Landing accelerator) mesh interconnect topology that promises several improvements. First off, Intel says its mesh interconnect delivers lower latency and higher bandwidth than the ring bus, all while operating at a lower frequency and voltage. Those last two characteristics are important, because they should result in lower power consumption from the interconnect portion of the chip as it scales up.
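For a back-of-envelope sense of why a mesh scales better than a ring, compare average hop counts between nodes. This is purely structural intuition on idealized topologies — not Intel's latency figures, and real interconnect stops include more than just cores.

```python
from itertools import product

def avg_ring_hops(n):
    """Mean shortest-path hops between distinct stops on a bidirectional ring."""
    total = sum(min(d, n - d) for d in range(1, n))
    return total / (n - 1)

def avg_mesh_hops(rows, cols):
    """Mean Manhattan distance between distinct nodes on a 2D mesh."""
    nodes = list(product(range(rows), range(cols)))
    dists = [abs(r1 - r2) + abs(c1 - c2)
             for (r1, c1) in nodes for (r2, c2) in nodes
             if (r1, c1) != (r2, c2)]
    return sum(dists) / len(dists)

ring_18 = avg_ring_hops(18)      # ~4.8 hops on average
mesh_3x6 = avg_mesh_hops(3, 6)   # 3.0 hops on average
```

For 18 stops, a bidirectional ring averages roughly 4.8 hops between distinct nodes while a 3×6 mesh averages 3.0, and the gap only widens as node counts grow — ring distance scales with n, mesh distance with roughly the square root of n.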

Intel also says the mesh design allows it to include units like I/O, inter-socket interconnects, and memory controllers in a modular, scalable way as core counts increase. The company claims that distributing these elements across the chip using the mesh minimizes undesirable “hot spots” of activity that might otherwise constrain cores’ access to those critical resources and limit performance.

Mesh architecture conceptual representation. Red lines represent horizontal and vertical wires connecting CPU components; green squares represent switches at intersections. Source: Intel

The mesh design should also offer a boon for applications that need to do a lot of inter-core communication. The last-level cache in Skylake-X is distributed across each core, and thanks to the more uniform access characteristics of the mesh, Intel claims that application developers no longer have to worry about non-uniform latencies when accessing data in those caches. Cores should also enjoy more uniform access characteristics when accessing the die’s I/O and memory controller, as well.

Multiple ring buses, as seen in high-core-count Haswell Xeons.

Previously, the shared L3 caches on a chip might have resided on different rings, requiring cores to communicate across the buffered switches formerly used to join discrete rings on the die. These switches added latency on top of that incurred by traversing the ring bus in the first place—something that Intel gave customers the opportunity to avoid in past chips with a “cluster-on-die” mode that turned each ring into something resembling a NUMA domain of its own. The mesh topology in Skylake-X should make headaches from the non-uniform distribution and access latencies of resources among rings a thing of the past.

As for characteristics of the Skylake-X silicon itself, Intel honchos clammed up when we asked about die size and transistor count. The company believes that disclosing this information will lead to unfounded conclusions from its competitors about the quality of their chip designs and process technologies compared to Intel’s. Only the paranoid survive, we suppose.

 

The Core i9-7900X and LGA 2066 in the flesh
We’ve already touched on the Core X CPU family and the X299 platform in depth, but it’s good to get an up-close look at the Core i9-7900X and Asus Prime X299-Deluxe motherboard that Intel has provided us for testing. 

Clockwise from upper left: Core i7-5960X, Core i7-6950X, Core i9-7900X, Core i7-7700K, Ryzen 5 1700X

Outwardly, little has changed that would help us identify the Core i9-7900X at first glance. The eagle-eyed will note more rounded corners on the edges of the integrated heat spreader, but that’s about it.

From left to right: Core i7-5960X, Core i7-6950X, Core i9-7900X

Flipping these chips over reveals a dense forest of surface-mounted components, but it’s otherwise hard to notice the extra 55 lands on the Core i9-7900X. If you want to count, we’ll wait.

That outward similarity might lead one to believe that LGA 2011 and LGA 2066 chips are interchangeable between X99 and X299 motherboards, and the LGA 2066 socket itself doesn’t dispel the notion. Its dimensions are the same as those of LGA 2011, but chips for one socket absolutely will not work in the other. Don’t make an expensive mistake by eyeballing it. The only things builders can carry over from LGA 2011 systems are their DDR4 kits and cooling hardware.

Intel sent us home with Asus’ ultra-ritzy Prime X299-Deluxe motherboard to host the Core i9-7900X. This board boasts everything one might want out of a high-end platform: four USB 3.1 Gen 2 ports, built-in 802.11ac and 802.11ad Wi-Fi, tasteful RGB LEDs, and a bevy of PCIe x16 slots. Asus also offers two M.2 slots, one of which allows gumstick SSDs to stand up vertically for better cooling. Even if both M.2 slots are occupied, the horizontal M.2 SSD can still enjoy plenty of heat-dissipation potential thanks to an integrated heatsink for the slot and chipset. This mobo even comes with a Thunderbolt 3 card and an add-on fan control board with several extra headers.

Our testing methods

As always, we did our best to collect clean test numbers. We ran each of our benchmarks at least three times, and we’ve reported the median result. Our test systems were configured like so:

Processor: Ryzen 7 1800X
Motherboard: Gigabyte Aorus AX370-Gaming 5
Chipset: AMD X370
Memory size: 16 GB (2 DIMMs)
Memory type: G.Skill Trident Z DDR4 SDRAM
Memory speed: 3866 MT/s (rated), 3200 MT/s (actual)
Memory timings: 15-15-15-35 1T
System drive: Intel 750 Series 400GB NVMe SSD

 

| | Core i7-5960X | Core i7-6950X | Core i9-7900X (DDR4-2666) | Core i9-7900X (DDR4-3200) | Core i7-7700K |
|---|---|---|---|---|---|
| Motherboard | Gigabyte GA-X99-Designare EX | Gigabyte GA-X99-Designare EX | Asus Prime X299-Deluxe | Asus Prime X299-Deluxe | Gigabyte Aorus GA-Z270X-Gaming 8 |
| Chipset | Intel X99 | Intel X99 | Intel X299 | Intel X299 | Intel Z270 |
| Memory size | 32GB | 64GB | 32GB | 64GB | 16GB |
| Memory type | G.Skill Trident Z DDR4 SDRAM | G.Skill Trident Z DDR4 SDRAM | Crucial Ballistix Elite DDR4 SDRAM | G.Skill Trident Z DDR4 SDRAM | G.Skill Trident Z DDR4 SDRAM |
| Memory speed | 3600 MT/s (rated), 2666 MT/s (actual) | 3200 MT/s | 2666 MT/s | 3200 MT/s | 3866 MT/s (rated), 3200 MT/s (actual) |
| Memory timings | 15-15-15-35 2T | 16-18-18-38 2T | 16-17-17-37 2T | 16-18-18-38 2T | 15-15-15-35 2T |
| System drive | Corsair Neutron XT 480GB SATA SSD | Corsair Neutron XT 480GB SATA SSD | Samsung 960 EVO 500GB NVMe SSD | Samsung 850 Pro 512GB SATA SSD | Samsung 960 EVO 500GB NVMe SSD |

They all shared the same common elements:

Storage: 2x Corsair Neutron XT 480GB SSD, 1x Kingston HyperX 480GB SSD
Discrete graphics: Nvidia GeForce GTX 1080 Ti Founders Edition
Graphics driver version: GeForce 382.33
OS: Windows 10 Pro with Creators Update
Power supply: Seasonic Prime Titanium 1000W

Thanks to Intel, Corsair, Kingston, Asus, Gigabyte, Cooler Master, G.Skill, and AMD for helping us to outfit our test rigs with some of the finest hardware available.

Some further notes on our testing methods:

  • The test systems’ Windows desktops were set at a resolution of 3840×2160 in 32-bit color. Vertical refresh sync (vsync) was disabled in the graphics driver control panel.

  • For our Ryzen systems, we used the AMD Ryzen Balanced power plan included with the company’s most recent chipset drivers. We left our Intel systems on Windows’ default Balanced power plan.

The tests and methods we employ are usually publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

 

Memory subsystem performance

Since Skylake-X has such a drastically different cache allocation versus its predecessors, we simply had to run it through SiSoft’s bandwidth and latency tests. We’ll start with the Sandra utility’s cache and memory bandwidth test.

To examine the effect of the increased L2 cache size in an intuitive way, we started with the single-threaded test. To interpret these results, remember that the block size identifies which level of cache the test is exercising. From 2KB to 32KB on all of these CPUs, we’re in the L1 cache. Past that point, we’re generally in L2 for most of these CPUs out to 256KB for Haswell, Broadwell, and Kaby Lake, 512KB for Zen, and 1MB for Skylake-X. Finally, we spill over into L3. 
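That mapping from block size to cache level can be sketched as a tiny helper. The defaults below describe the i9-7900X as a sketch of the idea — 32KB L1 data and 1MB L2 per core, with the 13.75MB figure being the chip's shared L3 total:

```python
def cache_level(block_kb, l1_kb=32, l2_kb=1024, l3_kb=14080):
    """Which cache a single-threaded test of a given working-set size
    mostly exercises, ignoring edge effects near each capacity boundary."""
    if block_kb <= l1_kb:
        return "L1"
    if block_kb <= l2_kb:
        return "L2"
    if block_kb <= l3_kb:
        return "L3"
    return "DRAM"

skx_256k = cache_level(256)                 # still in L2 on Skylake-X
bdw_256k = cache_level(256, l2_kb=256)      # right at the L2 limit on Broadwell-E
bdw_512k = cache_level(512, l2_kb=256)      # spilled into L3 on Broadwell-E
```

A 512KB or 1MB working set that stays in L2 on Skylake-X has already spilled into the slower L3 on a 256KB-L2 chip, which is exactly where the bandwidth curves diverge.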

The most interesting result in the chart above unsurprisingly occurs between the 512KB mark and 1MB, where we’re testing most of these chips’ L2 caches. Thanks to their 1MB L2s, each Skylake-X core can sustain much higher bandwidth at the 1MB block size than the competition. Save for the Zen core, which is still in its L2 at 512KB, all of the older Intel chips are already into their L3 caches.

Sandra’s multithreaded cache and memory bandwidth test shows the combined bandwidth from every core on each chip. In turn, we have to account for the combined size of each cache level on each chip when mapping block sizes to the cache that we’re in at a given size. That’s because the same source block is being split among multiple cores. If that sounds complicated, it is, and there are a lot of moving parts to account for here between core counts and clock speeds. Still, this measurement is a good way of showing just how much data these chips can move around in total.

Because Sandra supports AVX-512 instructions, and because Skylake-X doubles the L1 cache load bandwidth available per core, we get much more throughput compared to older Intel CPUs at most block sizes. Since the test is still hitting the i9-7900X’s L1 cache from 64KB out to 256KB, for example, the increased load bandwidth for that cache lets the 7900X turn in incredible throughput increases across the chip. That trend continues as the test moves into the 7900X’s L2 cache, where it can move roughly 50% more data compared to the Core i7-6950X, itself no slouch. All told, the i9-7900X sets a high new bar for cache throughput—not an easy task in this company.

Next, let’s look at some tests of main memory bandwidth from the popular AIDA64 utility.

Interesting. Though the i9-7900X takes a commanding lead over the i7-6950X in AIDA64’s read tests with both DDR4-2666 and DDR4-3200 RAM, the Skylake-X chip falls slightly behind Broadwell-E in write bandwidth, and it needs DDR4-3200 to match the i7-6950X in the copy test. None of these chips are slouches, to be sure, but the results for the i9-7900X aren’t quite as eye-popping as they were in Sandra’s cache tests. Time to talk latency.

Sandra also offers a detailed benchmark for cache and memory latencies. To reduce the effect of prefetching on this result, we use the “in-page random” access pattern. Like the single-threaded cache bandwidth test above, this test isn’t multithreaded, so it’s easy to keep track of what cache is being measured at each block size. 
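A classic way to build a prefetcher-defeating latency test is pointer chasing: link memory slots into one random cycle so every load depends on the result of the previous one. Here's a conceptual Python sketch of the structure — real tools do this over raw memory in C or assembly, and Python timing wouldn't reveal cache behavior, so this only illustrates the access pattern:

```python
import random

def build_chase(n_slots, seed=1):
    """Link slots into a single random cycle. Each load's address comes
    from the previous load, so accesses serialize and stride prefetchers
    have no predictable pattern to latch onto."""
    order = list(range(n_slots))
    random.Random(seed).shuffle(order)
    nxt = [0] * n_slots
    for i in range(n_slots):
        nxt[order[i]] = order[(i + 1) % n_slots]
    return nxt

def chase(nxt, steps, start=0):
    """Walk the cycle; a benchmark would time this loop per step."""
    idx = start
    for _ in range(steps):
        idx = nxt[idx]
    return idx
```

Because the slots form one cycle, walking n steps from any starting slot visits every slot exactly once and returns to the start — which is also a handy correctness check for the generator.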

As Intel suggested it would, the L2-L3 rebalancing on Skylake-X offers more bandwidth from its L2 caches at the same latency as before, at the cost of slightly higher latencies in the L3. Seems like a fine tradeoff to us, given the apparent benefits of the larger L2.

Compared to past Intel architectures, Skylake-X lags in main-memory access latencies. That increase may be offset somewhat by the potentially much higher hit rate in the L2 cache, though.

Some quick synthetic math tests

AIDA64 offers a useful set of built-in directed benchmarks for assessing the performance of the various subsystems of a CPU. The PhotoWorxx benchmark uses AVX2 on compatible CPUs, while the FPU Julia and Mandel tests use AVX2 with FMA.

 

PhotoWorxx doesn’t seem to benefit from the improvements in the Core i9-7900X, but the Skylake-X CPU enjoys a commanding lead in our tests of floating-point prowess. At least in these synthetics, the i9-7900X seems to justify a price tag more than twice that of the Ryzen 7 1800X. Let’s see whether this strong opening translates into real-world performance.

 

Javascript performance

The usefulness of Javascript benchmarks for comparing browser performance may be on the wane, but these collections of tests are still a fine way of demonstrating the real-world single-threaded performance differences among CPUs. These tests may be an ideal stage for Skylake-X’s improved Turbo Boost Max 3.0 implementation.

Turbo Boost Max 3.0 doesn’t go all the way toward eliminating the single-threaded performance gap we tend to see between mainstream and high-end Intel chips using the same architecture, but the feature does let the i9-7900X get most of the way there—and in Kraken, the Skylake-X chip does catch the i7-7700K. That’s good news for folks who like a responsive machine in both lightly-threaded and all-out workloads.

Compiling code in GCC

Our resident code monkey, Bruno Ferreira, helped us put together this code-compiling test. Qtbench records the time needed to compile the Qt SDK using the GCC compilers. The number of jobs dispatched by the Qtbench script is configurable, and we set the number of threads to match the hardware thread count for each CPU.

Bang. The i9-7900X neatly shaves 20% off the compile time of our previous champion, the Core i7-6950X, and it does it for hundreds of dollars less.

7-Zip file compression

Put the i9-7900X to work zipping and unzipping archives, and it delivers a small-to-moderate boost over its Broadwell predecessor. Memory speeds don’t seem to make the least bit of difference to the 7900X’s performance, though.

VeraCrypt disk encryption

VeraCrypt is a continuation of the handy TrueCrypt project. This disk-encryption utility lets us test throughput for both the hardware-accelerated AES algorithm and the implemented-in-software Twofish cipher.

The AES half of our VeraCrypt testing does appear to be memory-bound for once, so DDR4-3200 allows for a whopping 2.4 GB/s higher throughput. Twofish doesn’t enjoy any such speedup from faster memory, although the i9-7900X still handily pulls away from the i7-6950X.

 

Cinebench

The Cinebench benchmark is powered by Maxon’s Cinema 4D rendering engine. It’s multithreaded and comes with a 64-bit executable. The test runs with a single thread and then with as many threads as possible.

In Cinebench’s single-threaded test, the i9-7900X once again proves its Turbo Boost mettle against the Core i7-7700K. Core for core, the Skylake-X chip also hands in a roughly 16% improvement over the Core i7-6950X. Pretty darn good considering the low-single-digit performance improvements we’ve come to expect from Intel’s generation-to-generation advances.

Blender

Blender is a widely-used open-source 3D modeling and rendering application. The app can take advantage of AVX2 instructions on compatible CPUs. We chose the “bmw27” test file from Blender’s selection of benchmark scenes to put our CPUs through their paces.

The i9-7900X carves up our test file in fine form with a 12.5% decrease in render time over its predecessor. The i7-6950X is probably looking at its original suggested price tag and sweating a bit by now. Once again, faster memory makes no difference in performance for the i9-7900X.

Handbrake video transcoding

Handbrake is a popular video-transcoding app that recently hit version 1.0. To see how it performs on these chips, we’re switching things up from past reviews. Here, we converted a roughly two-minute 4K source file from an iPhone 6S into a 1920×1080, 30 FPS MKV using the HEVC algorithm implemented in the x265 open-source encoder. We otherwise left the preset at its default settings.

Noticing a pattern yet? The i9-7900X completes this job in 14% less time than the i7-6950X needs, and its performance doesn’t vary with memory speed.

 

picCOLOR

It’s been a while since we tested CPUs with picCOLOR, but we now have the latest version of this image-analysis tool in our hands courtesy of Dr. Reinert H.G. Mueller of the FIBUS research institute. This isn’t Photoshop; picCOLOR’s image analysis capabilities can be used for scientific applications like particle flow analysis. In its current form, picCOLOR supports AVX2 instructions, multi-core CPUs, and simultaneous multithreading, so it’s an ideal match for the CPUs on our bench. Check out FIBUS’ page for more information about the institute’s work and picCOLOR.

picCOLOR offers an interesting shake-up in our results for once. Although the i9-7900X still totally outruns every other chip in our stable, this time it’s trailed by the Ryzen 7 1800X instead of the Core i7-6950X. Faster memory still doesn’t do squat for the 7900X, though.

STARS Euler3D

Euler3D tackles the difficult problem of simulating fluid dynamics. It tends to be very memory-bandwidth intensive. You can read more about it right here. We configured Euler3D to use every thread available from each of our CPUs.

Interesting. Euler3D seems to lean on the memory subsystem in ways that are particularly amenable to high performance with the i7-6950X and not so much so for the i9-7900X. Not every workload benefits from Intel’s rebalanced cache hierarchy with Skylake-X, it seems.

Digital audio workstation performance

DAWBench is a popular addition to our CPU test suite, and we’re now working directly with the creator of DAWBench, Vin Curigliano, to refine our testing methods. As part of that collaborative effort, Vin provided us with a beta version of the DAWBench DSP 2017 benchmark. We’re leaving DAWBench’s virtual instrument test on the bench this time around, however, since these high-powered Intel CPUs tend to max out the benchmark and an updated version of the test isn’t quite ready yet.

DAWBench DSP 2017 relies on the freely-available Shattered Glass Audio SGA1566 VST plugin. We used the 64-bit version of this VST in our testing. DAWBench DSP lets us enable instances of this plugin until the session becomes unresponsive. We used Reaper as our host DAW for the test, and we monitored the project using a Focusrite Scarlett 2i2 interface with the company’s latest USB ASIO drivers. We set a 96 kHz sampling rate and used two ASIO buffer depths: a punishing 64 and a slightly-less-punishing 128.

As we’ve come to expect, the i9-7900X delivers a modest performance improvement in this test versus the Core i7-6950X. Strangely, the 7900X-and-DDR4-3200 pairing actually performs worse than its DDR4-2666-equipped configuration, though. We’ll have to monitor this test and see whether it’s a behavior that changes as the X299 platform matures.

 

A quick look at power consumption and energy efficiency

Skylake-X’s performance improvements wouldn’t be worth much if they came with a corresponding decrease in energy efficiency. We can get a rough idea of whether the Core i9-7900X is as efficient as it is fast by monitoring our test system’s power consumption at the wall with our trusty Watts Up power meter and estimating the total amount of energy it needs to complete a task. Our observations have shown us that Blender consumes about the same amount of power at every stage of the bmw27 benchmark we test with, so it’s an ideal guinea pig for this kind of calculation. First, though, let’s check idle and peak load power consumption numbers.

At idle, the X299 platform paired with the Core i9-7900X sips only a bit more power than the Core i7-7700K. Under load, however, the 7900X system draws the most power of the bunch by a wide margin. Of course, peak power draw only tells part of the efficiency story.

To really get a sense of how efficient the Core i9-7900X is, we need to take the task energy consumed over the course of our Blender benchmark into account. Not only does the Core i9-7900X cut 94 seconds off the Ryzen 7 1800X’s bmw27 render time, it does it while expending only just a bit more power to do so. That’s impressive performance per watt.
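Task energy is just average wall power multiplied by time-to-completion, which is why a higher-wattage chip can still be the more efficient one. A quick sketch with hypothetical round numbers (not our measured figures):

```python
def task_energy_kj(avg_watts, seconds):
    """Task energy in kilojoules: average wall power times time to finish."""
    return avg_watts * seconds / 1000.0

# Hypothetical systems: the hotter chip finishes so much sooner
# that it consumes less total energy for the job.
fast_hot  = task_energy_kj(avg_watts=250, seconds=200)  # 50.0 kJ
slow_cool = task_energy_kj(avg_watts=190, seconds=294)  # ~55.9 kJ
```

Peak wattage alone would crown the slower system; task energy tells the opposite story.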

 

Conclusions

It’s time once more to sum up our results using our famous scatter plots. To spit out this final index, we take the geometric mean of each chip’s results in our real-world productivity tests, then plot that number against retail pricing gleaned from Newegg. Where Newegg prices aren’t available, we use a chip’s original suggested price.
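For the curious, the geometric mean we use is simply the exponential of the mean of logarithms — it keeps any one benchmark from dominating the index the way an arithmetic mean would:

```python
import math

def geomean(results):
    """Geometric mean of positive benchmark scores:
    exp(mean(log(x))), equivalent to the nth root of the product."""
    return math.exp(sum(math.log(x) for x in results) / len(results))

# A 4x outlier only shifts the geomean by a factor of 4**(1/n),
# rather than dragging the average toward itself linearly.
balanced = geomean([2.0, 8.0])  # 4.0
```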

Our value scatter tells the entire story of the Core i9-7900X: if your workload scales to many threads, this chip is generally the one to run it on. The server version of Skylake delivers an unusually large performance boost for a modern Intel CPU revision in many tasks. Core for core and thread for thread, the already-beastly Core i7-6950X can sometimes lag the 7900X in the range of 10% to 20%. All that oomph comes for a jaw-dropping $724 less than the 6950X’s initial suggested price, too. Competition is a wonderful thing.

In a milestone for Intel’s high-end desktop platform, the Core i9-7900X mostly ends the tradeoff between single-threaded swiftness and multi-threaded grunt typical of some older Intel high-end desktop chips. For lightly-threaded workloads, the i9-7900X’s improved Turbo Boost Max 3.0 behavior lets it trail our single-thread-favorite Core i7-7700K by only a few percentage points at most. In typical desktop use, then, the i9-7900X and its TBM 3.0-enabled brethren should feel about as snappy as their mainstream desktop cousins. I need to get the i9-7900X paired up with a GeForce GTX 1080 Ti soon to see whether that single-threaded performance translates to similar gameplay smoothness.

I’ll also need to explore Skylake-X overclocking in depth soon. Thanks to immature firmware and monitoring utilities, I didn’t feel comfortable pushing my 7900X too hard at this point. That said, I see a lot of promise for overclocking this chip. My test system made it to the Windows desktop without a hiccup at an astounding 4.7 GHz on all cores, and thermal limits seem as though they’ll be the primary obstacle to fully exploiting that speed. I’ll explore Skylake-X’s overclocking potential more once the X299 platform has had a bit more time in the oven (and once I’ve picked up some seriously beefy cooling hardware in the meantime).

We’ve always been loath to recommend the top-end CPU in Intel’s high-end desktop family (and yes, that is this chip for the moment). Despite the Ryzen-inspired price reshuffling that’s coming with Core X, the i9-7900X still isn’t a great value. The star of the Core X lineup may actually be the Core i7-7820X, whose eight cores and 16 threads have clocks similar to those of the i9-7900X. You may lose a couple of cores in the bargain, but even so, the i7-7820X should perform better than a Ryzen 7 1800X for not that much more money. We hope to play with one of these more attainable Skylake-X CPUs soon.

Of course, the performance of the Core i9-7900X is beyond question: it’s the fastest single-socket CPU we’ve ever tested. The X299 platform may need a little polish yet to let Core X chips really shine, but the performance bar the i9-7900X is already setting promises an exciting standoff this summer as AMD prepares its Ryzen Threadripper CPUs for launch. If you need as many cores and threads as possible from your desktop, times have never been more exciting. Stay tuned as we see whether the i9-7900X has got game.

BIF
BIF
5 years ago

I still want to see useful [email protected] performance for CPUs and GPUs.

*** !HINT! !HINT! ***

Just sayin’.

kamikaziechameleon
kamikaziechameleon
5 years ago

finally more than stinking 6 cores for a consumer desktop. My cell phone had 8 core a number of years past. Intel has been reluctant to broaden their consumer CPU thread count. But its happening and its glorious.

mat9v
mat9v
5 years ago
Reply to  chuckula

I wonder about clocks, 7900X as stated by Intel should be running at 4Ghz all core boost and 6950X was 3.4Ghz if I remember correctly? It seems that IPC for new SkyLakeX arch is not so good in some tasks. Cinebench: 7900X – 2205/4Ghz=551,25 6950X – 1892/3.4Ghz=556,47 Blender: 7900X – 203s for 4Ghz > for 1Ghz would be 812s 6950X – 232s for 3.4Ghz > for 1Ghz would be 788s Handbrake: 7900X – 271s for 4Ghz > for 1Ghz would be 1084s 6950X – 315s for 3.4Ghz > for 1Ghz would be 1071s I do not like IPC of those… Read more »

Kougar
Kougar
5 years ago
Reply to  Sahrin

I understand your point. But your point is inconsequential because AMD hasn’t produced Opterons in half a decade, and more importantly Threadripper hasn’t launched yet which makes the 1800X the best possible chip to compare against.

That you’d ignore these key details just makes you sound like either an industry shill or just another fanboy fanatic.

coolflame57
coolflame57
5 years ago
Reply to  weaktoss

So this is like comparing Michael Phelps to the fastest swimmer in some coastal village, just so that the villagers can understand just how fast Phelps is, right? The village swimmer is just a valuable point of reference for the villager(bil)s, and until they finish training their next pro swimmer, he is the fastest available.

MOSFET
MOSFET
5 years ago
Reply to  Sahrin

I thought Ryzen looked pretty good in this review, actually. Huh.

synthtel2
synthtel2
5 years ago
Reply to  Klimax

Nah. AVX-512 is still weaksauce compared to a GPU in absolute terms and in perf/W terms, and even with the inflation due to *coin, it’s still massively cheaper to hook six Hawaii/P10s to a single board than it is to get a half-dozen 7900Xes up and crunching. There may be some efficiency points in favor of the CPU (RAM access patterns etc) but they’re not even in the same league of brute force per your already-questionably-invested dollar. 64 SP FLOPS/cycle * 10 cores * 3.5 GHz = 2.24 TFLOPS, or ~half of one of those $640 480s. Etherium is a… Read more »

Klimax
5 years ago
Reply to  chuckula

I am sorry, but AVX-512 will likely mean that these chips will be very good for quite a few cryptocoins (vector rotation and co.).

Klimax
5 years ago

Quick note: x264 in recent builds should support AVX-512.

synthtel2
5 years ago
Reply to  Bauxite

I’m going to be contrary – I think Intel is doing it right this time. Wrong is what they do with Celerons and Pentiums, which makes it annoying for devs: if we want both AVX’s gains and wide support, we have to maintain both an SSE path and an AVX path. This Zen and SKL-X method means that the latest instruction set gets distributed across the board and devs don’t have to worry about it (or won’t, once it’s been out for as long as Sandy has). It decouples instruction set support from the width of the underlying execution. Having…

BobbinThreadbare
5 years ago
Reply to  Klimax

There are pro-monopolist economic theories.

YellaChicken
5 years ago
Reply to  Sahrin

How much alcohol did it take to convince yourself you had a valid point?

derFunkenstein
5 years ago
Reply to  chuckula

Lots of folks griped about effectively-halved AVX throughput on Ryzen. TR’s benchmark suite pretty well established how that was a mistake, too.

JMccovery
5 years ago
Reply to  Wirko

Hmm… Interesting.

Waco
5 years ago
Reply to  Sahrin

Perhaps the “it’s not released” part about Threadripper is news to you? They can’t compare something that doesn’t exist.

Top-end AMD stops at $500 these days, like it or not. There is no better comparison you can make until AMD starts shipping Threadripper.

Wirko
5 years ago
Reply to  JMccovery

A 16-core Opteron in liquid antideuterium?

Krogoth
5 years ago
Reply to  the

Aside from e-peen extreme overclocking benchmarking, I really don’t see this happening. This isn’t the late 1990s and early 2000s anymore.

The PC landscape has completely changed. Enthusiasts want performance without resorting to exotic cooling solutions or super-loud fans that belong on a 1U/2U box in a server room.

The i9-7900X is already pushing the envelope as it is, and Intel “recommends” using some kind of water cooling if you want to overclock the chip with all cores running.

lycium
5 years ago
Reply to  DrCR

Whoops, sorry for the unintentional red herring, it was Mike Abrash not Tom Forsyth who tells the story: [url]http://www.drdobbs.com/parallel/a-first-look-at-the-larrabee-new-instruc/216402188[/url]

the
5 years ago
Reply to  ronch

Unfortunately instead of sending nukes to each other, we’ll be getting them in our processor sockets with how much energy these chips will burn to edge each other out in performance.

NTMBK
5 years ago
Reply to  psuedonymous

Given the insane heat that Tom’s Hardware found in their tests, I wouldn’t try to cram the 7900X into a mini-ITX build.

psuedonymous
5 years ago

With single-threaded performance barely behind the flagship 115x parts, this is the first time I’ve considered moving to the HEDT platform. Previously, the loss of single-threaded performance was not worth the few cases where the extra cores would have helped.

Coupled with the jam-packed ASRock X299 ITX board, all I need now is a suitable (low-ish profile) heatsink that can fit between the risers but is beefier than the venerable NH-L9i, to handle the extra ~60 W of heat.

psuedonymous
5 years ago
Reply to  Fonbu

None yet. The larger Skylake-SP (Purley platform) chips have been announced, but AFAIK no Xeons have ever been mentioned for Basin Falls.

Anonymous Coward
5 years ago
Reply to  Klimax

I’m not sure anyone in the forum cheers monopolies [i]too[/i] much, but in the broader political landscape, there is a certain amount of freedom == no regulations == we all win thinking.

Anonymous Coward
5 years ago
Reply to  setaG_lliB

Mostly I think the quad channel memory enables the same level of performance per core. Not much else to say, is there?

psuedonymous
5 years ago
Reply to  Krogoth

[quote]28 PCIe 3.0 lanes on the platform which may limit your expansion options down the road (M.2 NVMe media and Thunderbolt 3).[/quote]

That’s still enough for me to build an ITX system with a 1080 Ti and three M.2 drives (one 960, one Optane, and one 2TB ‘slow’ M.2 once 3D NAND enters wider production and pushes prices down a bit more), all operating at full bandwidth. That’s something no other platform can do today, and none on the horizon are likely to do. The word from ASRock, current king of SFF boards, has been “LOLno, have you seen the size of that socket?!” on any non-ATX Threadripper boards.

And with single-threaded performance looking to match, or at least approach, the 7700K in some cases, I’m seriously considering moving to X299 with an i7-7820X. Ryzen’s single-threaded performance is not sufficient for me (and Threadripper looks to only be going downhill from there), and just hopping from the 6700K to the 7700K would not be worth the motherboard change.
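
For what it’s worth, the lane math behind that build works out exactly. A quick sketch (the x16 GPU and x4-per-drive splits are assumptions based on typical allocations, and the 28 usable CPU lanes match the i7-7820X tier):

```python
# Back-of-envelope PCIe 3.0 lane budget for the hypothetical ITX build above.
cpu_lanes = 28            # lanes from a 28-lane Skylake-X part

gpu_lanes = 16            # one graphics card at x16
nvme_lanes = 3 * 4        # three M.2 NVMe drives at x4 each

used = gpu_lanes + nvme_lanes
print(used, cpu_lanes - used)  # 28 used, 0 spare: it fits, with nothing left over
```

That zero-lane margin is the point of the comment: the build is possible, but anything more (Thunderbolt 3, a second GPU) would have to hang off the chipset instead.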

Klimax
5 years ago

More like it is a downvote for manufacturing a nonsensical position and a “meme”, a.k.a. making a strawman…

Klimax
5 years ago
Reply to  w76

As far as I can tell, nobody held that nonsensical position. You have manufactured a strawman. At least I haven’t seen any comment like that in the front-page discussions.

Klimax
5 years ago
Reply to  albundy

Yes. At the test level they look small, but throw a real load at it and you start seeing distinct advantages, like saving hours or days.

ronch
5 years ago
Reply to  chuckula

It’s World War 3 as far as Intel and AMD are concerned.

setaG_lliB
5 years ago

Yeah. Looks good and all, but considering how well Ryzen does in some of these benchmarks, I just can’t wait to see what Threadripper will do with quad channel memory.

jihadjoe
5 years ago
Reply to  Krogoth

There are another 24 PCIe lanes going to the PCH.
And you really want to plug those NVMe drives onto the PCH (instead of the CPU) because it’ll let you boot NVMe RAID.

DrCR
5 years ago
Reply to  lycium

Link to the read?

Anonymous Coward
5 years ago
Reply to  jts888

Yeah I think that captures the situation.

Anonymous Coward
5 years ago
Reply to  Voldenuit

All the server work I do has next to no need for a fancy interconnect; it’s separate processes doing their thing on VM slices of big servers, often one or two cores per VM. I’ve rented up to 32 cores in one VM, but there too, it was lots of separate processes working with minimal chatter between them. You make a good point that the mesh interconnect could be a step in the right direction for efficiency, but I wonder if the optimal solution for me would be AMD’s famously minimalistic approach to interconnects. Who needs a monolithic die…

ronch
5 years ago

I just kept looking at how the Ryzen 1800X stacks up against all the Intel chips. Still a damn good value in my book, as shown by the famous TR graph! Holy cow, I wouldn’t think twice about getting it if I were out for a high core count chipolata. I might sound like a shill right now (which I swear by my FX-8350 I’m not) but hey, I’m sure you all agree it’s a great choice for $460.

ronch
5 years ago
Reply to  Sahrin

On the contrary, it makes super-duper sense to put the best chip AMD has available [u]for purchase[/u] here.

ronch
5 years ago

Ok, I put this table together just for comparison:

[url]http://tinypic.com/r/2zfn7k0/9[/url]

You’re welcome.

Edit: some blokes here just love downthumbing my posts. Hope doing that makes you rich, guys.

curtisb
5 years ago
Reply to  chuckula

Rudy’s? It’s ok, but head on down to San Antonio for some Texas Pride instead. It’s worth the drive. 🙂

curtisb
5 years ago
Reply to  jts888

As was said in that article about the RAID enabler keys, those are nothing new on high-end workstation and server systems. RFID and NFC don’t have anything to do with those hardware keys.

Unknown-Error
5 years ago
Reply to  jts888

Damn. I wish I had your writing skills during my college days.

Star Brood
5 years ago
Reply to  Sahrin

Whoa dude, Threadripper hasn’t had its NDA lifted for reviewers (or hasn’t been sent to them). The 1800X is the biggest, fastest AMD CPU currently on the market.

lycium
5 years ago
Reply to  NTMBK

It’s a crying shame about AVX-512 because SSE is such a mess. The story of how AVX-512 came to be is a really funny and interesting one, worth reading about (Tom Forsyth).

synthtel2
5 years ago
Reply to  Krogoth

I’m running a G3258, DDR3-2133 CL9, and a cheesy SSD (Sandisk X300 512GB), and I’ve been manually controlling clocks of late because my mobo, my USB DAC, and pulseaudio are conspiring to ruin everything (long story). For the kind of workload in question here, there is a clear difference between 2.4 and 3.7 GHz. It’s not even subtle, and this thing at 2.4 is still waaay beyond Pentium Ds and A64x2s.

When running non-crappy code that doesn’t deal with large amounts of data 1.0 GHz is fine, but there is a lot of crappy code out there.

albundy
5 years ago

the non-synthetic tests say it all. the price/performance ratio makes it a questionable purchase. are the small gains worth the cost?

Krogoth
5 years ago
Reply to  Pancake

Crappy code is the real culprit here. You are being held back by [b]I/O throughput[/b] and memory bandwidth. Again, the CPU isn’t the issue unless you are running something prior to the Core 2 era.

Pancake
5 years ago
Reply to  nexxcat

Wut?? Consider yourself lucky if you’ve never had to open a PDF of a complex map on a single page – bajillions of polygons, lines, fills etc where it takes seconds to redraw a zoom or pan. Nothing to do with I/O bandwidth or paging. Take a look at that Anandtech review where opening a PDF is one of their CPU tests. Any of the Intel CPUs slaughter Ryzen there. Making maps is one of the things I do to present results to clients. I would imagine CAD and graphic design people would also appreciate fast CPUs.

nexxcat
5 years ago
Reply to  Pancake

[quote=”Pancake”]Au contraire. I’m talking about bloated PDFs which take an irritating amount of time to scroll through or redraw. I’m talking about web pages full of horrorshow JavaScript bloatage. I’m talking about the many seemingly trivial but irritatingly janky things that we do every day.[/quote]

A lot of that is memory and I/O bandwidth. If going back to bloated PDFs is janky for you, then likely core bits of memory have been swapped out. I don’t know how good Windows 10 is, but I know Windows Server 2008 R2? 2012? will still aggressively swap out bits of memory that aren’t being used. We did some benchmarking at my former place of employment, and the use case there was “flurry of activity, then 1-2 hours of quiet, then markets open.” When we simulated that in real time, the 1-2 hours of quiet was enough to shove memory into swap, and even though we were using enterprise SSDs, swap is still slow, and we’d get substantial, and otherwise unexplainable, slowness as a result.

They’re probably now waiting on Xeons based on these chips. The sweet spot, for us, was 2-socket servers with maxed cores each, when considering rack space and power.

chuckula
5 years ago

Judging by Kampman’s twitter feed it looks like he’s moved on to Austin for an Epyc press conference tomorrow.

Get some Rudy’s and/or Salt Lick while you’re there!

Pancake
5 years ago
Reply to  Krogoth

Au contraire. I’m talking about bloated PDFs which take an irritating amount of time to scroll through or redraw. I’m talking about web pages full of horrorshow JavaScript bloatage. I’m talking about the many seemingly trivial but irritatingly janky things that we do everyday.

Like me, posting on here instead of concentrating on my work…

Krogoth
5 years ago
Reply to  Pancake

Snappiness for single-threaded and dual-threaded loads is primarily bottlenecked by I/O throughput. The CPU hasn’t been much of an issue in this regard since we moved on to Core 2 and beyond.

It is a different story for multi-threaded workloads though.
