**** **** CPU-side **** ****
Excuse the lack of hard numbers in this section - in benchmarkable scenarios, it's fast enough that it just doesn't matter, and it only really bogs down when there are 100+ players nearby. On potato settings (except render distance 6000, textures, and resolution) with bank group swap off on an R7 1700 with DDR4-2666 CL14, I'm averaging probably 90-ish FPS across all scenarios. Dips to 60 happen often enough, and extremely dense fights may still bog it below 50.
The key was disabling bank group swap. If you've got an AMD CPU and PS2 is bogging badly in the big fights, you should try that. It actually slowed it down a bit for me in best-case (benchmarkable) scenarios, and seems to have made shadows slower in all cases, but helped out dense 200-player shadows-disabled scenarios by a third to half. As that's exactly where it most needed the help and I don't mind running it on potato settings for an extra frame or two, I'll call that a solid win.
Model quality (in-game setting) was the other non-obvious one (as it didn't affect anything benchmarkable). I thought for a bit turning it down had nearly solved everything, but it turns out it just makes it bog in fewer conditions; with it on low, 200 players spread out over a region may be fine when they wouldn't have been, but if they're all trying to contest one room it may be just as much of a problem as before. As it appears to primarily affect animation LOD distances, this makes sense.
The other particularly important setting for CPU load is shadow quality. Low and medium are fairly mild, as they keep the shadow frustum small (don't try to render shadows very far away from you). Medium -> high is the biggest step for the CPU, and ultra is mainly a GPU load increase.
Hardware-wise, it seems to care most about RAM latency, then bandwidth, then core clocks, then cores (assuming you've got at least 4C4T). Aside from the usual timings, I got noticeable gains from tightening tRRD and tFAW. It averages up to five and a half threads of utilization when heavily loaded on an 8C16T CPU, but it doesn't seem to make any noticeable difference in framerates beyond 4C4T. There does seem to be a touch more pop-in and similar jankiness (if you've played the game you know what I mean) at 4C4T than 4C8T.
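Since latency is the thing it cares about most, it can help to convert timings from cycles into nanoseconds before comparing memory kits. A minimal sketch of the standard conversion; the DDR4-3200 CL16 kit is a hypothetical comparison point, not something I tested:

```python
def timing_ns(cycles, data_rate_mts):
    """Convert a DRAM timing given in clock cycles to nanoseconds.

    DDR transfers twice per clock, so the memory clock in MHz is half
    the advertised data rate in MT/s.
    """
    clock_mhz = data_rate_mts / 2
    return cycles / clock_mhz * 1000


# The kit from this post: DDR4-2666 CL14.
print(f"DDR4-2666 CL14: {timing_ns(14, 2666):.2f} ns")

# Hypothetical comparison kit (not tested here).
print(f"DDR4-3200 CL16: {timing_ns(16, 3200):.2f} ns")
```

The same conversion works for any timing specified in cycles, tRRD and tFAW included.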
Even after disabling bank group swap, it doesn't seem to like Zen quite as much as Haswell. Given Intel's snappier uncore and PS2's sensitivity to memory, I'm not surprised.
I found a log file sitting around the game folder with a couple of interesting lines in it:
outofmemory.txt wrote:
MutexLocks: 10276319581
MutexWaits: 2179828962 (21.21% of MutexLocks)
Given that I don't leave the game open all day, we're looking at well into the hundreds of thousands of MutexWaits per second. PS2 has generated another of these on my brother's i5-4590 system, and his indicates only 1.69% MutexWaits. This could help explain some stuff.
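For anyone who wants to check the arithmetic on those counters, here's a rough sketch. The log doesn't say how long the session was, so the session lengths below are guesses for illustration only:

```python
# Counters copied from the log lines above.
MUTEX_LOCKS = 10_276_319_581
MUTEX_WAITS = 2_179_828_962


def wait_share(waits, locks):
    """Fraction of lock acquisitions that had to wait."""
    return waits / locks


def waits_per_second(waits, session_hours):
    """Average waits per second over an assumed session length."""
    return waits / (session_hours * 3600)


print(f"wait share: {wait_share(MUTEX_WAITS, MUTEX_LOCKS):.2%}")

# Session lengths are assumptions; the log does not record uptime.
for hours in (2, 4, 6):
    rate = waits_per_second(MUTEX_WAITS, hours)
    print(f"{hours} h session: about {rate:,.0f} waits/s")
```

Even a six-hour session still works out to over 100,000 waits per second on average.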
**** **** GPU-side **** ****
The setting simply called graphics quality rearranges a lot of the pipeline, and a lot of other settings don't have any effect unless it's on high. (Many don't have any effect anyway.) It's the heaviest on GPU load aside from ultra shadows and resolution itself.
GPU load fits a model of a flat per-frame cost plus a per-pixel cost plus a shadow mapping cost. There's no apparent square-of-pixel-count cost, and (though it doesn't matter for this) the CPU side seems to stay completely independent of resolution. Sampling shadow maps should have a noticeable per-pixel cost, but it seems to be too small to count accurately here (something about memory access patterns and caching meaning it approximates reading the whole map once and only once, probably). I gathered data for this looking out across Koltyr from a balcony on the VS warpgate, maxed render distance, holding a Tanto-P and fixing the view on a particular point of warpgate structure; it's a consistent, landscape-heavy, player-light view:
Setting, card ........ flat cost ........ per-pixel cost
GQ=low, RX 460 ....... 4 ms/frame ....... +3.2 ms/megapixel
GQ=med, RX 460 ....... 4 ms/frame ....... +4.5 ms/megapixel
GQ=low, GTX 960 ...... 4.5 ms/frame ..... +1.8 ms/megapixel
GQ=med, GTX 960 ...... 4.5 ms/frame ..... +2.5 ms/megapixel
GQ=high, GTX 960 ..... 5 ms/frame ....... +3.5 ms/megapixel
Ultra shadows add about 4 ms flat on either card (tested in the 1080p to 1440p region). Low/medium/high shadows cost a third to half that.
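To turn the figures above into a frame time estimate, the model is just flat cost plus per-megapixel cost times resolution, plus the shadow cost. A minimal sketch, assuming the table rows were measured without shadows (I'm treating the ultra shadow cost as purely additive) and that the GPU is the limit. It only describes that one warpgate view, so treat the results as ballpark:

```python
# (flat ms/frame, ms per megapixel), copied from the measurements above.
COSTS = {
    ("RX 460", "low"): (4.0, 3.2),
    ("RX 460", "med"): (4.0, 4.5),
    ("GTX 960", "low"): (4.5, 1.8),
    ("GTX 960", "med"): (4.5, 2.5),
    ("GTX 960", "high"): (5.0, 3.5),
}
ULTRA_SHADOWS_MS = 4.0  # "about 4 ms flat on either card"


def frame_ms(card, gq, width, height, ultra_shadows=False):
    """Estimated GPU time per frame in milliseconds."""
    flat, per_megapixel = COSTS[(card, gq)]
    megapixels = width * height / 1e6
    total = flat + per_megapixel * megapixels
    if ultra_shadows:
        total += ULTRA_SHADOWS_MS
    return total


def fps(ms):
    """Convert a frame time in milliseconds to frames per second."""
    return 1000.0 / ms


for shadows in (False, True):
    ms = frame_ms("GTX 960", "high", 1920, 1080, ultra_shadows=shadows)
    print(f"GTX 960, GQ=high, 1080p, ultra shadows={shadows}: "
          f"{ms:.1f} ms ({fps(ms):.0f} fps)")
```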
This is all weird, because usually Maxwell is faster at the flat overheads and shadows and GCN is faster at the per-pixel stuff. Not only is each walking away with the other's usual strength, they're doing so at rates greater than their paper specs would suggest. Shadows being equal makes some sense as I think PS2 uses variance shadow maps, which are likely to be bandwidth-bound on both cards at 112 GB/s (64 bits per shadow texel means they both run into that limit before ROP limits, where Nvidia usually gets a solid win on shadows due to their ROP power).
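The bandwidth ceiling in that argument is easy to put a number on. A back-of-envelope sketch: the 112 GB/s and 64 bits per texel come from the paragraph above, while the 4096x4096 map size is purely a hypothetical example (I don't know what resolution PS2 actually uses). It ignores caches, compression, and everything else sharing the memory bus:

```python
BANDWIDTH_BYTES_PER_S = 112e9  # both cards, per the text above
BYTES_PER_TEXEL = 64 // 8      # 64-bit variance shadow map texel


def texel_rate():
    """Upper bound on shadow texels moved per second at full bandwidth."""
    return BANDWIDTH_BYTES_PER_S / BYTES_PER_TEXEL


def pass_ms(map_size, passes=1):
    """Time in ms to touch every texel of a square map `passes` times."""
    total_bytes = map_size * map_size * BYTES_PER_TEXEL * passes
    return total_bytes / BANDWIDTH_BYTES_PER_S * 1000


print(f"ceiling: {texel_rate() / 1e9:.0f} Gtexels/s")

# Hypothetical map size, for scale only.
print(f"one pass over 4096x4096: {pass_ms(4096):.2f} ms")
```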
**** **** latency and rate of fire **** ****
The biggest non-obvious factor seems to be whether frames are piling up between the CPU and GPU. Basically, it helps out latency if it's always bound either by the CPU or an artificial limiter rather than the GPU. Artificial limiters are annoying, but the in-game option "smoothing" does a decent (not perfect) job of it dynamically, if you don't mind your framerate being capped to your monitor's refresh rate.
AMD drivers with PS2 may have some kind of latency issue not solved by that, the fix for which is turning down flip queue size (in the driver). The driver doesn't expose that option anymore, though, and it takes a regedit. Turning this (or pre-rendered frames for Nvidia) down in the driver isn't a full replacement for not being GPU-bound.
On the topic of artificial limiters, MaximumFPS in UserOptions.ini should never be higher than necessary. If there's a big (>2x) gap between your actual framerate and MaximumFPS, high-RoF infantry weapons won't fire quite as fast as they should. Smoothing seems to do something to counter this, but not always. (This can happen anyway when Connery is being janky - I think that's why the NC were demolishing everyone this April 1st.)
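A quick way to sanity-check the cap against that 2x rule of thumb is sketched below. The sample text stands in for UserOptions.ini; the [Rendering] section name and the value 250 are assumptions for illustration, so the lookup searches every section rather than relying on one. A real file may need more lenient parsing than this:

```python
import configparser

# Stand-in for UserOptions.ini. Section name and value are assumed.
SAMPLE_INI = """\
[Rendering]
MaximumFPS=250
"""


def read_maximum_fps(ini_text):
    """Return MaximumFPS from ini text, or None if the key is absent."""
    cfg = configparser.ConfigParser(strict=False, interpolation=None)
    cfg.optionxform = str  # keep key case as written in the file
    cfg.read_string(ini_text)
    for section in cfg.sections():
        if "MaximumFPS" in cfg[section]:
            return cfg[section].getint("MaximumFPS")
    return None


def cap_too_high(actual_fps, maximum_fps, ratio=2.0):
    """True if the cap exceeds the actual framerate by more than `ratio`."""
    return maximum_fps > ratio * actual_fps


cap = read_maximum_fps(SAMPLE_INI)
for actual in (60, 90, 140):
    verdict = "too high" if cap_too_high(actual, cap) else "ok"
    print(f"actual {actual} fps, MaximumFPS {cap}: {verdict}")
```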
I haven't been able to tell that the in-game option "reduce input latency" actually does anything other than occasionally cost an fps or two.