synthtel2
Gold subscriber
Gerbil Elite
Topic Author
Posts: 608
Joined: Mon Nov 16, 2015 10:30 am

Miscellaneous Ryzen thoughts

Fri Aug 18, 2017 4:25 pm

New stuff:
R7 1700 stock-clocked, ASRock AB350 Gaming-ITX/ac, 16 GB of RAM (still being dialled in but 2666CL16 / 2800CL18 or thereabouts).

Old stuff:
G3258 at 4.1-4.3 (recently 3.2-4.0 due to wear), ASRock Z97E-ITX/ac, 8GB RAM at 2133CL9.

Other stuff:
Sandisk X300 512GB, 4GB GTX 960, a 1080p60 monitor, Arch Linux.

==== ==== ==== ====

Remember this thread? That effect is definitely present here. It stalls in different places than it did before, but there are some patterns. In browsing and general OS/background stuff, most non-I/O waits that used to take 200+ msec have sped up somewhere between noticeably and massively, and most of the new hitches are under 50 msec. The improvements hint that the world may be more multithreaded than we generally give it credit for (a lot of that being Firefox in this case). A lot of those new shorter hitches went away or got better when I got the RAM clocked above 2133CL15, so maybe absolute latency is important for that. As for the rest, I'm suspicious of clock scaling. On the G3258, the intel_pstate governor (despite powersave mode) tended to keep clocks maxed and rely on C-states for power saving, whereas this 1700 both spends a lot more time at its idle P-state (1.55 GHz) and probably takes a lot longer to spin up from it.
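To put a number on "absolute latency", here is a rough sketch that converts CAS latency from cycles to nanoseconds for the RAM settings mentioned in this post. It only looks at tCL and ignores every other timing, so treat it as a first-order comparison rather than a measurement; the function name is made up for illustration.

```python
def cas_latency_ns(transfer_rate_mts, cas_cycles):
    """Convert a CAS latency in clock cycles to nanoseconds.

    DDR memory transfers twice per clock, so the clock period in ns
    is 2000 / (transfer rate in MT/s).
    """
    return cas_cycles * 2000.0 / transfer_rate_mts


# Settings mentioned in this post: (DDR4 rate in MT/s, CAS latency in cycles).
settings = {
    "2133CL15": (2133, 15),
    "2666CL16": (2666, 16),
    "2800CL18": (2800, 18),
}

for name, (rate, cl) in settings.items():
    print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns")
```

By this measure 2666CL16 comes out a couple of nanoseconds ahead of 2133CL15, and 2800CL18 lands in between.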

(Speaking of that thread, I haven't forgotten about it. I think implementing such tests as profiling on a naturally-big project is going to work better than a dedicated test suite, though, so that's on an opportunistic kind of schedule.)

I'm still at a low-ish sample size of games that are heavy enough to matter (this internet connection is garbage and downloading them is annoying), but the pattern I'm getting is that average framerates are unaffected or moderately improved, while frametime / worst case / hitchiness problems are massively improved. Average versus 99% frametime spreads are now in ranges most gamers would call normal, where on the G3258 they could be anywhere between normal and unplayable. Shadow of Mordor is the worst I ran into on the G3258, where it went to <5 fps for maybe a half second whenever I tried to look in a different direction (with an associated heavy multithreading attempt). SoM runs pretty well on the 1700, although regardless of available CPU power it clearly wasn't designed for the kind of fast view changes a mouse can produce. This paragraph does not account for gaming via Wine, which I'll get to in a bit.
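For anyone unsure why average fps can hide this kind of thing, here is a small sketch of the average versus 99% frametime comparison. The frametime lists are invented for illustration (they are not captures from either machine), and the percentile uses the simple nearest-rank method.

```python
import math


def frametime_summary(frametimes_ms):
    """Return (average fps, 99th-percentile frametime in ms)."""
    ordered = sorted(frametimes_ms)
    avg_ms = sum(ordered) / len(ordered)
    # Nearest-rank percentile: the frametime that 99% of frames do not exceed.
    rank = max(1, math.ceil(0.99 * len(ordered)))
    p99_ms = ordered[rank - 1]
    return 1000.0 / avg_ms, p99_ms


# Hypothetical captures: one steady, one with a handful of long hitches.
steady = [16.7] * 1000
hitchy = [15.0] * 980 + [200.0] * 20

for name, capture in (("steady", steady), ("hitchy", hitchy)):
    fps, p99 = frametime_summary(capture)
    print(f"{name}: {fps:.1f} fps average, {p99:.1f} ms 99% frametime")
```

The two made-up captures have similar averages, but the 99% frametime of the hitchy one is more than ten times worse, which is the kind of spread the average alone won't show.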

Game load times are often greatly improved, even for relatively simple games like Broforce. :wink:

Y'all already know what this upgrade was like for compiling and other known-multithreaded loads. 8)

==== ==== ==== ====

I've got a couple of Linux-specific tricks to put these threads to use:

Instead of swapping to disk, I swap to an LZ4-compressed RAMdisk. With more cores at work on the LZ4, I can act like I've got 20 or 24 GB of RAM and for many purposes hardly even notice a performance hit.
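If you want to check how well the compression is doing, the sketch below assumes the compressed RAM disk is a zram device called zram0 (the setup above doesn't strictly have to be zram, so adjust to taste). It reads the first three fields of the kernel's mm_stat file and reports how many bytes of data are stored per byte of RAM actually used. The helper names are made up for illustration.

```python
from pathlib import Path


def parse_mm_stat(text):
    """Parse the first three fields of a zram mm_stat line (all in bytes)."""
    fields = [int(value) for value in text.split()]
    return {
        "orig_data_size": fields[0],   # uncompressed size of the stored data
        "compr_data_size": fields[1],  # compressed size of the stored data
        "mem_used_total": fields[2],   # RAM actually consumed, incl. overhead
    }


def effective_ratio(stats):
    """Uncompressed bytes stored per byte of RAM actually used."""
    if stats["mem_used_total"] == 0:
        return 0.0
    return stats["orig_data_size"] / stats["mem_used_total"]


if __name__ == "__main__":
    # Assumed device name; change it if the swap device is not zram0.
    path = Path("/sys/block/zram0/mm_stat")
    if path.exists():
        stats = parse_mm_stat(path.read_text())
        print(f"effective compression ratio: {effective_ratio(stats):.2f}")
    else:
        print("no zram0 device found")
```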

Wine-staging with CSMT is awesome. The idea is that instead of doing all the DX-to-GL black magic on the game's render thread, Wine runs its own thread to handle it in parallel (including the actual OGL calls, I guess). The stable version of Wine apparently has something like this implemented, but it's implemented with emphasis on correctness and it isn't actually faster than doing the work inline. Wine-staging's version doesn't mind being a bit unsafe, which is very rarely an issue, and it brings Real Serious Performance. It's purportedly sometimes even faster (average fps at least) than running the game on Windows, which sounds plausible enough if the base game is doing a lot of non-render work on the render thread (since calling Wine is likely cheaper than calling the actual graphics driver). So far (again with a small sample size), this is all working for me just as described, though Wine still does result in notably less consistent frametimes than native.
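The general shape of that idea, as I understand it, is a producer/consumer queue: the game's render thread only enqueues calls, and a separate thread does the translation and submission. The sketch below is a toy illustration of that structure only. It is not Wine's code, the translate function is a made-up stand-in, and Python threads won't show a real speedup here.

```python
import queue
import threading


def translate(d3d_call):
    """Stand-in for the DX-to-GL translation work (hypothetical mapping)."""
    return "gl_" + d3d_call


def command_stream_worker(commands, submitted):
    """Drain the queue in order, translating and 'submitting' each call."""
    while True:
        call = commands.get()
        if call is None:  # sentinel: no more commands this frame
            break
        submitted.append(translate(call))


def run_frame(d3d_calls):
    commands = queue.Queue()
    submitted = []
    worker = threading.Thread(
        target=command_stream_worker, args=(commands, submitted)
    )
    worker.start()
    for call in d3d_calls:
        # The game's render thread only enqueues and moves on.
        commands.put(call)
    commands.put(None)
    worker.join()
    return submitted


print(run_frame(["clear", "draw", "present"]))
```

With a single consumer the calls still reach the driver in the order the game issued them, which is the property that has to hold for this to be safe at all.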

==== ==== ==== ====

I haven't (knowingly) run into that compiling bug, but the compiles I'm doing aren't the most likely to trigger it. This is no Gentoo.

Rumors of Ryzen's memory controller weirdness seem well-founded. The kit I'm using is G.Skill's F4-3000C15D-16GTZB, which according to this page is dual-rank Samsung E-die. I don't know what it actually is (Thaiphoon doesn't work in Wine, unsurprisingly), but it looks single-rank to me. It wouldn't boot over 2133 without manual tweaking. Lots of manual tweaking later, I had it seemingly stable (spoilers: it wasn't) at 3066 16-16-16-36 1.22V.

I'd like to take a moment here to discuss Ryzen's boot-time memory training, which is both wonderful and terribly sketchy. It's wonderful because in finding the limits on a ton of separate timings and timing groups, CMOS clear was only required once, and only once did obviously scary levels of glitchiness make it through to OS boot. Usually, if something is pushed too far, it'll just fail training and dump you back in the UEFI interface using JEDEC settings (but with the last custom ones conveniently saved). It's sketchy because it isn't deterministic. When pushed to its limits, it gives slightly different speeds on each boot (presumably due to variance in some subtimings or other). What if it messes up once in a blue moon and boots with timings that aren't stable despite any stress-testing you already threw at it? I don't like that thought, and I don't have any feel for how much margin it needs to eliminate that possibility.

Anyway, the plan was to dial in something more aggressive than I'd really want to run, throw all the stress testing at it, find the edge, run at it for a bit to confirm stability, then back off by 266 or so to get something I can count on. Well, stress testing doesn't work. It passed an overnight mixed stressapptest run (which I had kind of been assuming was decent because it's what Google uses, right?), I went on to testing through cautious use, and next thing I know a pile of random 775 permissions over in /var changed to 755s and I have to tell pacman to reinstall all packages to fix some kind of wifi packet loss problem. :o
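One observation on that, offered as an inference rather than a diagnosis: 775 and 755 differ by exactly one bit (group write), which is at least consistent with single flipped bits landing in filesystem metadata. A quick check of the arithmetic:

```python
import stat


def differing_bits(mode_a, mode_b):
    """Return the permission bits that differ between two modes."""
    return mode_a ^ mode_b


diff = differing_bits(0o775, 0o755)
print(oct(diff))                # which bits changed
print(bin(diff).count("1"))     # how many bits changed
print(diff == stat.S_IWGRP)     # whether it is exactly the group-write bit
```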

I'm really not trying to run any extreme settings here, but it is a big deal for performance, and it being impossible to tell where the limit is without accepting software damage makes this difficult.

Since I did give these some individual attention, here are the timings I arrived at in case it helps out anyone else:

3066 1.22V
tCL-tRCD-tRP-tRAS-tRC-cmdrate = 16-16-16-36-54-2T (GDM off)
tCWL = 14, tRTP = 11, tWR = 22, tRRD_S = 8, tRRD_L = 10, tFAW = 42
tWTR_S = 6, tWTR_L = 9, TrdrdScL = 6, TwrwrScL = 6, Trdwr = 10, Twrrd = 4, tCKE = 8
TwrwrSc = 1, TwrwrSd = 6, TwrwrDd = 6, TrdrdSc = 1, TrdrdSd = 6, TrdrdDd = 6
tRFC/tRFC2/tRFC4 follow the JEDEC values, scaled by the new clock divided by 2133

The tight timings there are on the tCL etc line, and the rest mostly have a bit of margin. Bump tFAW to 50+ while you're working on the rest of it for stability (especially at low voltage like this), and don't touch TwrwrSc/TrdrdSc because they have a huge effect on performance. tWR should be double tRTP, tCWL should probably be a bit less than tCL, and tFAW needs to be 4x tRRD_S at minimum. All of those rdrd/wrwr 6s got tended to in pairs rather than individually.
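The rules of thumb above are easy to get wrong when changing several values at once, so here is a small sanity checker for them. It encodes only the relationships stated in this post (not anything from a datasheet), and the JEDEC tRFC figure in the usage part is a hypothetical placeholder, since I haven't listed the real one for this kit.

```python
import math


def check_timings(t):
    """Return a list of rule-of-thumb violations for a dict of timings."""
    problems = []
    if t["tWR"] != 2 * t["tRTP"]:
        problems.append("tWR should be double tRTP")
    if t["tCWL"] >= t["tCL"]:
        problems.append("tCWL should be a bit less than tCL")
    if t["tFAW"] < 4 * t["tRRD_S"]:
        problems.append("tFAW needs to be at least 4x tRRD_S")
    return problems


def scale_trfc(jedec_cycles_at_2133, new_rate):
    """Scale a tRFC value from 2133 to a new rate, rounding up to be safe."""
    return math.ceil(jedec_cycles_at_2133 * new_rate / 2133)


# The 3066 settings posted above.
posted = {"tCL": 16, "tCWL": 14, "tRTP": 11, "tWR": 22, "tRRD_S": 8, "tFAW": 42}
print(check_timings(posted))

# 374 is a hypothetical JEDEC tRFC at 2133, for illustration only.
print(scale_trfc(374, 3066))
```

An empty list from check_timings means the posted values satisfy all three rules; it says nothing about whether they are actually stable.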

==== ==== ==== ====

Using lower clocks and more cores to whatever extent workable is about reliability, not power use, but this thing's lack of power use is impressive. If heatsink exhaust is anything to go by, 65W seems like a high estimate, even under full prime95. As for reliability, multicore load sees it running a nicely low 1060mV and 3.15-3.2 GHz.

Were I speccing it out again, I'd drop to an R5 1600 and spend that money on ECC RAM. Even if that RAM isn't as theoretically performant and the overclocking is real overclocking, at least it'd be clear how far is too far. As for the 6C/8C choice, I may appreciate every one of these cores in 2021, but for now 8C is comfortably overkill.
 
strangerguy
Gerbil First Class
Posts: 199
Joined: Fri May 06, 2011 8:46 am

Re: Miscellaneous Ryzen thoughts

Sun Aug 20, 2017 7:40 pm

I would have bought a 1700 non-X half a year ago if not for the flaky (and still flaky) 2x16GB DDR4-3000+ compatibility.

What a pity though, because now I'm more than willing to spend more than AMD's asking price on a CFL i7 for that same reason, plus ~30% more ST performance.

It's not surprising you got a massive gaming boost on a 1700 when the G3258 is simply awful for most gaming purposes, and was even at its launch. It's easily the most overrated chip in recent memory; the $110 Haswell i3 was sooooo much better it's not funny.
4790K 4.4GHz | Asus H81i-Plus | 16GB Crucial VLP DDR3 | GTX 1070 | 240GB Evo 840| 1TB M550 | 3TB Seagate 7200rpm | 650W Seasonic G | Corsair H100i + 250D
 
synthtel2

Re: Miscellaneous Ryzen thoughts

Mon Aug 21, 2017 1:53 pm

The G3258 was originally a stop-gap because I was broke, with a 4690K taking over later. I was content to hold off on games native to 8th-gen consoles until getting the 4690K, and the G3258 was theoretically quite competent at everything else I wanted to do. It just turns out to be a bigger deal for older games than benches generally showed, and when my 4690K died and I had to go back to the G3258 in April 2016, that was a real problem.
 
ptsant
Gerbil Team Leader
Posts: 259
Joined: Mon Oct 05, 2009 12:45 pm

Re: Miscellaneous Ryzen thoughts

Mon Aug 21, 2017 2:18 pm

synthtel2 wrote:
New stuff:
R7 1700 stock-clocked, ASRock AB350 Gaming-ITX/ac, 16 GB of RAM (still being dialled in but 2666CL16 / 2800CL18 or thereabouts).

3066 1.22V
tCL-tRCD-tRP-tRAS-tRC-cmdrate = 16-16-16-36-54-2T (GDM off)
tCWL = 14, tRTP = 11, tWR = 22, tRRD_S = 8, tRRD_L = 10, tFAW = 42
tWTR_S = 6, tWTR_L = 9, TrdrdScL = 6, TwrwrScL = 6, Trdwr = 10, Twrrd = 4, tCKE = 8
TwrwrSc = 1, TwrwrSd = 6, TwrwrDd = 6, TrdrdSc = 1, TrdrdSd = 6, TrdrdDd = 6
RFC/2/4 follow JEDEC times the new clock divided by 2133


I don't know what the exact specs of the memory are, but most kits that do 3000MHz+ are usually tuned for 1.35V and can occasionally tolerate more. You should have a look at the XMP profile to get an idea of most of the values. You can also tune the ODT (on-die termination) in some motherboards. Anyway, if you can get it to 2800CL16, you will be better than average. I am definitely satisfied with my 1700X at 2800MHz, even though theoretically the memory can do 3000.
 
blahsaysblah
Gerbil Elite
Posts: 580
Joined: Mon Oct 19, 2015 7:35 pm

Re: Miscellaneous Ryzen thoughts

Mon Aug 21, 2017 2:24 pm

Do you know if your board supports KVM/QEMU GPU pass through to give a Win 10 VM that GTX 960? I was looking at that board.
 
synthtel2

Re: Miscellaneous Ryzen thoughts

Mon Aug 21, 2017 4:55 pm

ptsant wrote:
I don't know what the exact specs of the memory are, but most kits that do 3000MHz+ are usually tuned for 1.35V and can occasionally tolerate more. You should have a look at the XMP profile to get an idea of most of the values. You can also tune the ODT (on-die termination) in some motherboards. Anyway, if you can get it to 2800CL16, you will be better than average. I am definitely satisfied with my 1700X at 2800MHz, even though theoretically the memory can do 3000.

1.22V is a reliability thing again. 1.35V is usually fine, but I'm beyond tired of random hardware failures and just don't want that to be an issue. AFAIK "tuned for 1.35V" should be more about the XMP profile than the hardware itself, and I'm fine with dropping some performance in the name of reliability.

Activating the XMP profile sets speed, voltage, tCL through tRC, tRRD/tFAW, tRFC/2/4, and maybe command rate (I didn't make note of that one). Lots of stuff is not set by it, and even at higher voltage a lot of values couldn't even approach what loading XMP and hitting go would leave them at. For reference, XMP on this kit is 3000, 1.35V, 15-16-16-16-35-51, tRRD_S / tRRD_L / tFAW = 6 / 8 / 32.

I haven't touched ODT etc, but on auto with sufficiently loose timings it goes to 3200 at 1.22V, and I suspect timings were the real reason 3200 at 1.20V failed. When I go back to this, I'll probably mess with ODT just for the sake of not leaving it on auto, and see if I can confirm 3333 or 3466 for margin.

blahsaysblah wrote:
Do you know if your board supports KVM/QEMU GPU pass through to give a Win 10 VM that GTX 960? I was looking at that board.

No idea, but if it's determinable from firmware options / kernel logs / etc and you can tell me what I'm looking for, I can check for you.
 
blahsaysblah

Re: Miscellaneous Ryzen thoughts

Tue Aug 22, 2017 7:22 am

synthtel2 wrote:
No idea, but if it's determinable from firmware options / kernel logs / etc and you can tell me what I'm looking for, I can check for you.

I haven't rolled my own Linux in a long time. I just use headless server VMs for dev work not related to Linux at all. I was looking ahead to my next build, maybe if something good comes along during the holidays.

RYZEN GPU PASSTHROUGH SETUP GUIDE: FEDORA 26 + WINDOWS GAMING ON LINUX

ArchLinux, PCI passthrough via OVMF Setting up IOMMU -> Enabling IOMMU & Ensuring that the groups are valid

1. It seems IOMMU and the other virtualization features need to be enabled in UEFI.
2. Then IOMMU needs to be enabled in the kernel.

And then you can use a small shell script to check whether the devices are in their own separate groups (whole groups are passed to the VM, so for example the GTX 960 and its HDMI audio would be in one group). Initially, it seems boards dumped all PCI-E devices into the same group. But maybe with the AGESA 1.0.0.6 firmware update, that is all fixed.

Only if you're personally interested, as it's a far-off build for me.
 
synthtel2

Re: Miscellaneous Ryzen thoughts

Tue Aug 22, 2017 11:37 pm

The script shown in both places outputs this:

IOMMU Group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 10 03:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43bb] (rev 02)
IOMMU Group 10 03:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b7] (rev 02)
IOMMU Group 10 03:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b2] (rev 02)
IOMMU Group 10 04:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 10 04:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 10 04:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 10 04:05.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 10 04:06.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 10 04:07.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 10 09:00.0 Network controller [0280]: Intel Corporation Device [8086:24fb] (rev 10)
IOMMU Group 10 0a:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
IOMMU Group 11 0b:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM206 [GeForce GTX 960] [10de:1401] (rev a1)
IOMMU Group 11 0b:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:0fba] (rev a1)
IOMMU Group 1 00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
IOMMU Group 2 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 3 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 4 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
IOMMU Group 5 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 6 00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 6 00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1454]
IOMMU Group 6 11:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device [1022:145a]
IOMMU Group 6 11:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Device [1022:1456]
IOMMU Group 6 11:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:145c]
IOMMU Group 7 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 7 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1454]
IOMMU Group 7 12:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device [1022:1455]
IOMMU Group 7 12:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU Group 7 12:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Device [1022:1457]
IOMMU Group 8 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
IOMMU Group 8 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 9 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1460]
IOMMU Group 9 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1461]
IOMMU Group 9 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1462]
IOMMU Group 9 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1463]
IOMMU Group 9 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1464]
IOMMU Group 9 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1465]
IOMMU Group 9 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1466]
IOMMU Group 9 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1467]

.... which looks like a success to me, not that I've done enough research on it to know. I didn't look at any of the UEFI/kernel config noted before running that.
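For anyone who wants to check a listing like this without eyeballing it, here is a rough sketch that parses output in the format above and tests whether a device shares its IOMMU group with anything other than its own functions. The "same bus:device slot" test is my own simplification of what "isolated" means, and the sample is just a few lines copied from the output above.

```python
import re
from collections import defaultdict

# Matches lines like: "IOMMU Group 11 0b:00.0 VGA compatible controller ..."
LINE_RE = re.compile(r"IOMMU Group (\d+) ([0-9a-f]{2}:[0-9a-f]{2})\.[0-7] (.*)")


def group_devices(output):
    """Map each IOMMU group number to a list of (slot, description)."""
    groups = defaultdict(list)
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            group, slot, description = match.groups()
            groups[int(group)].append((slot, description))
    return groups


def is_isolated(groups, keyword):
    """True if the group holding the matching device has only one slot in it."""
    for devices in groups.values():
        if any(keyword in description for _, description in devices):
            return len({slot for slot, _ in devices}) == 1
    return False


sample = """\
IOMMU Group 10 03:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43bb] (rev 02)
IOMMU Group 10 09:00.0 Network controller [0280]: Intel Corporation Device [8086:24fb] (rev 10)
IOMMU Group 10 0a:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
IOMMU Group 11 0b:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM206 [GeForce GTX 960] [10de:1401] (rev a1)
IOMMU Group 11 0b:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:0fba] (rev a1)
"""

groups = group_devices(sample)
print("GTX 960 isolated:", is_isolated(groups, "GeForce GTX 960"))
print("I211 NIC isolated:", is_isolated(groups, "I211"))
```

On this sample the GPU and its audio function sit alone in group 11, while the NIC shares group 10 with the other chipset devices.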
 
blahsaysblah

Re: Miscellaneous Ryzen thoughts

Wed Aug 23, 2017 7:58 am

Yes, the GPU is by itself. Thanks :)
 
synthtel2

Re: Miscellaneous Ryzen thoughts

Sat Sep 30, 2017 7:35 pm

With 1.0.0.6b (I skipped 1.0.0.6a) and a CPU replaced under warranty for the performance marginality problem, RAM now Just Works. The final config for medium-term stability evaluation is 1.20V / 2666 / 14-14-14-14-32-46 / everything else on auto. It'll probably end up at 1.25V for margin.

Just for kicks, I also set it to 1.35V and started upping clocks, and it got as high as 3333 / 20-20-20-20-40-60 with everything else on auto. 3466 / 20-20-20-20-40-60 broke something and it wouldn't boot anywhere above 2133 again until I did a CMOS clear.
