synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Planetside 2 per-component benchmarking

Tue Oct 11, 2016 11:24 pm

Planetside 2 is pretty heavy on CPUs and mostly single-threaded, and it's a twitchy shooter in which the resulting bad framerates are a pretty big deal. I've got much more trouble with it than most because I'm running it on Linux through Wine. In fact, it's the one thing for which I really don't have enough single-thread performance with an OC'ed G3258. To figure out what kind of upgrades and/or overclocking would help most, I benchmarked how downclocking specific components affects it. Since I was doing that anyway, I thought I'd post the results here.

//// //// disclaimers //// ////

I'm running this on Linux, and there is a very real performance hit associated with that. My results may or may not bear any resemblance to most people's. Planetside doesn't have a single-player mode of any sort, and it's impossible to get entirely away from other players doing whatever other players do. I did my best to minimize the effects of that, but it's not impossible something slipped through. Finally, my framerates with low player density are quite alright. It's when there are big fights with 100+ players in an area that things get unworkable, and that may load the system differently than what I'm looking at here.

I should have downclocked the core and uncore more independently, but it shouldn't make the results much different.

//// //// setup //// ////

I've got a Pentium G3258, a GTX 960, 8 GB of DDR3-2133 at 9-11-10-28-2, an ASRock Z97E-ITX/ac, and a 512 GB SanDisk X300. The display is basic 1080p60 garbage. My default clocks are 4.1 GHz core, 3.5 GHz uncore, 1442 MHz GPU core, and 7800 MHz (effective) VRAM. Main RAM is stock. It's running up-to-date Arch Linux as of Sunday. The kernel is 4.7.6 and the Nvidia driver is 370.28.

The game's graphics settings are mostly as low as they go. Exceptions are that textures are on medium, model quality is on medium (because low doesn't draw characters until they get well within deadly range), and draw distance is at 1500 (about the minimum I find practical). FOV is 72. I usually run at 1080p, but to fully focus on the CPU, I set the render resolution to 50% for this testing (this boosted the framerate by <5%).

To get some consistency in-game, I went to the VR training grounds, took a Scythe east to the canyon with the bridge, then south to the cluster of holographic Vanu well away from the spawn area. I climbed a small hill to their east and centered the view on a particular one of them to the west, leaving many Vanu holograms and my Scythe in view. This made things pretty consistent between tests, despite the inconsistent nature of the game.

I used the game's internal framerate monitor (accessible with alt-f). This doesn't output any data I could do statistics on, but it bounces around within a fairly narrow and consistent range, so I just recorded the lowest and highest common figures. Frametime variance subjectively feels pretty low.

For testing, I slowed down different components of my system to 3/4 of their original speed. The available clock options didn't always line up perfectly with that target, but they were pretty close, and I compensated for the difference when calculating effect sizes.
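As a rough illustration of the effect-size arithmetic: the post doesn't state the exact formula behind the chart, so this is only a sketch under one reasonable assumption, namely that effect size is the fractional framerate loss divided by the fractional clock reduction. The function names and the example clocks and framerates are made up for illustration.

```python
def effect_size(fps_base, fps_slow, clock_base, clock_slow):
    """Fraction of a clock reduction that shows up as a framerate reduction.

    1.0 means framerate scales one-to-one with the clock; 0.0 means no effect.
    """
    fps_loss = 1.0 - fps_slow / fps_base
    clock_loss = 1.0 - clock_slow / clock_base
    return fps_loss / clock_loss


def midpoint(low, high):
    """Collapse a recorded 'lowest-highest common figure' range to one number."""
    return (low + high) / 2.0


# Hypothetical example: core clock dropped from 4.1 GHz to 3.1 GHz
# (a step close to, but not exactly, 3/4 speed), with made-up fps ranges.
base = midpoint(40, 44)
slow = midpoint(32, 34)
print(round(effect_size(base, slow, 4.1, 3.1), 2))
```

Dividing by the actual clock ratio rather than a fixed 0.75 is what compensates for steps that don't land exactly on 3/4 speed. A result above 1.0 would mean the framerate fell by a larger fraction than the clock did.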

//// //// results //// ////

It's mostly about core clocks, but none of the items are useless:

[Image: chart of effect sizes per downclocked component; not preserved]

I did double-check my math on those effect sizes above one. My guess is that it shows up that way due to CPU time being spent on things that aren't per-frame.

On the uncore and bandwidth tests, there were some anomalies I cut out of the data. They didn't fit the pattern, and I suspect they had to do with the actions of other players on the map.

I'm at about 125% CPU usage when testing this stuff (where 200% is two cores fully loaded), so more cores will probably do nothing. The smaller parts all help, and keeping them good is going to be necessary, but it doesn't look like they can move anything from unplayable to playable. More cache might do something, but it's probably mostly in the same boat. It looks like if I want this to work well, I'm looking at Skylake-C or Kaby Lake-C at entirely unsafe clocks, which is not exactly what I had in mind for my next upgrade. :(
 
travbrad
Gerbil XP
Posts: 423
Joined: Mon Dec 08, 2008 5:39 pm

Re: Planetside 2 per-component benchmarking

Wed Oct 12, 2016 1:33 pm

Thanks for the comparison. I think you may be right that RAM speed is playing a role in the performance as well. I was on DDR3-1600 on Sandy Bridge and am on DDR4-2400 on Skylake, so it's hard to say for sure, but based on your testing it does look like it makes a difference. I have a feeling pure IPC and clock speeds still play the largest role though (also shown by your testing).

FWIW most bases actually run REALLY well on my new 6700K.  In that other thread I was talking about amp stations which are the "worst case scenario" for me right now.  Other bases generally run at 50-100% better framerates.  There is just something really messed up with amp stations.  I can even be at a base next to an amp station and if I raise my render distance my framerate will drop dramatically, then lower it (so that the amp station isn't being rendered) and framerate jumps back up.

Also when I upgraded from a GTX 660 to my GTX 970 I got better performance.  The weird thing is the game didn't utilize either card 100%, but still seemed to benefit from just using part of a faster card's potential.  It only uses 1-1.5GB of VRAM too so that shouldn't have anything to do with it either, although the 970 does have more memory bandwidth of course.

Also, the reason for the recent performance problems seems to be that they "upgraded" to a newer C++ compiler/Visual Studio. There's nothing we can really do about it, but they've at least acknowledged the problem. I'm just not sure if they have enough resources to fix it at this point with how much of their workforce has been cut. https://www.reddit.com/r/Planetside/comments/50qcpj/performance_update/
Last edited by travbrad on Wed Oct 12, 2016 2:23 pm, edited 1 time in total.
6700K @ 4.6ghz || ASUS Sabertooth Z170 S || Crucial Ballistix DDR4-2400 16GB
ASUS STRIX GTX 970 || EVGA Supernova 750W G2 || Noctua NH-D15 || Fractal Define R5
Crucial MX200 500GB || 2x WD Blue 6TB || 2x WDGreen 2TB
Philips 272G5DYEB || Dell U2312HM
 
Firestarter
Gerbil Elite
Posts: 773
Joined: Sun Apr 25, 2004 11:12 am

Re: Planetside 2 per-component benchmarking

Wed Oct 12, 2016 2:15 pm

thanks for the comparison

now I'm sad that I didn't spring for faster RAM when I built my i5-2500K PC (common wisdom was that it didn't really matter), but I'll definitely keep that in mind for my upgrade!
 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: Planetside 2 per-component benchmarking

Wed Oct 12, 2016 2:58 pm

travbrad wrote:
FWIW most bases actually run REALLY well on my new 6700K.  In that other thread I was talking about amp stations which are the "worst case scenario" for me right now.  Other bases generally run at 50-100% better framerates.  There is just something really messed up with amp stations.  I can even be at a base next to an amp station and if I raise my render distance my framerate will drop dramatically, then lower it (so that the amp station isn't being rendered) and framerate jumps back up.

It's interesting that your problem area is amp stations while mine is player counts. I'm curious about the technical reasons, but that sounds like a tough one to track down.

travbrad wrote:
Also when I upgraded from a GTX 660 to my GTX 970 I got better performance.  The weird thing is the game didn't utilize either card 100%, but still seemed to benefit from just using part of a faster card's potential.  It only uses 1-1.5GB of VRAM too so that shouldn't have anything to do with it either, although the 970 does have more memory bandwidth of course.

It seems like Planetside 2 doesn't have the most efficient GPU work issuance. If the CPU and GPU are pretty well balanced, each still spends a significant amount of time waiting on the other. It's a shame, because if it were more efficient, I could at least turn some graphics settings up a bit.

travbrad wrote:
Also, the reason for the recent performance problems seems to be that they "upgraded" to a newer C++ compiler/Visual Studio. There's nothing we can really do about it, but they've at least acknowledged the problem. I'm just not sure if they have enough resources to fix it at this point with how much of their workforce has been cut. https://www.reddit.com/r/Planetside/comments/50qcpj/performance_update/

Further down that thread is an explanation of their multithreading system, and it's a lot better than I thought! It makes it a bit befuddling why it uses my system in such a single-threaded way, though. It's likely that the render thread is the usual bottleneck, but having more cores might yet do something helpful. Maybe a quad-core isn't useless and my performance situation isn't hopeless? :)

I think I'll try to look at what individual threads are doing in a big fight later today. That might answer some questions.
 
travbrad
Gerbil XP
Posts: 423
Joined: Mon Dec 08, 2008 5:39 pm

Re: Planetside 2 per-component benchmarking

Wed Oct 12, 2016 3:41 pm

synthtel2 wrote:
Further down that thread is an explanation of their multithreading system, and it's a lot better than I thought! It makes it a bit befuddling why it uses my system in such a single-threaded way, though. It's likely that the render thread is the usual bottleneck, but having more cores might yet do something helpful. Maybe a quad-core isn't useless and my performance situation isn't hopeless? :)

I think I'll try to look at what individual threads are doing in a big fight later today. That might answer some questions.

With my 2500K I could only get the game to use about 200% (i.e., 2 out of my 4 cores), but I have seen screenshots from people who show the game using all of their cores at nearly 100%. I haven't been able to figure out how though. The game also doesn't use my GPU 100%, so it's really hard to tell what the bottleneck is and why the game isn't fully using either of them. I have twice as many threads now with the 6700K and now the game uses about 400% (i.e., 4 out of 8 threads), so for me at least the game seems to use only half of my CPU no matter how many threads I have, which sounds similar to what it is doing on your dual-core CPU. I really don't understand why though.
Screenshot after alt-tabbing out of big (100ish people) fight on my 2500K
Screenshot after alt-tabbing out of same size fight on my 6700K

EDIT:

Did some more testing by manually limiting which threads the game uses:
6700K limited to 2 threads
6700K limited to 4 threads

When manually limiting the affinity of the game it seems to actually be able to use 4 threads fully for me.  I still can't explain why it never did on my 2500K though, or why it's not using the full potential of your dual-core.  My framerate was about half when running 2 threads compared to 4 (and similar to the framerates you were reporting).  At the exact same fight I was getting roughly 50-60FPS with 2 threads, and 100-120FPS or so with 4 threads.  With all 8 threads I saw another small bump up to about 130-140FPS, even though the game wasn't actually using any more CPU power according to task manager.
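For anyone wanting to repeat the affinity test, here is a minimal sketch of setting it programmatically. The post doesn't say which tool was used, so everything here is an assumption: `affinity_mask` and `pin_process` are hypothetical helper names, the mask form is the hex value that the Windows `start /affinity` command accepts, and `os.sched_setaffinity` is only available on Linux.

```python
import os


def affinity_mask(cpus):
    """Return the bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def pin_process(pid, cpus):
    """Restrict a process to the given logical CPUs (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("os.sched_setaffinity is not available here")
    os.sched_setaffinity(pid, set(cpus))


# Hex mask for logical CPUs 0-3, in the form `start /affinity` expects on Windows.
print(format(affinity_mask(range(4)), "X"))  # prints "F"
```

On a CPU with Hyper-Threading, which logical CPUs are picked matters, since two of them can share one physical core. The screenshots don't show which ones were selected, so the 2-thread and 4-thread numbers above may not map cleanly onto 1, 2, or 4 physical cores.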

The more I look into this the more confused I get  :lol:
 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: Planetside 2 per-component benchmarking

Wed Oct 12, 2016 5:57 pm

Alright, I found more interesting stuff:

Threads in my previous test scene (windowed 1280*800 gives slightly better framerates than what I was doing yesterday?)
Threads when moderately loaded (looking towards a fight of 24-48 friendlies and 12-24 NC, biggest I could find at this time of day)

It might be noteworthy that while it says it's GPU-limited, it lies. It's a known issue. Also, I don't know why imgur converted stuff to jpeg. My system isn't that ugly. :-?

I don't have to actually be seeing chaos for my framerate to tank - looking towards it while it's within my draw distance is sufficient. As you can see, my framerate sucked pretty badly even in that non-crazy test. When I looked the opposite direction, it got back up to 50 fps or so and the thread situation looked more like in the other scene.

Anyway, it looks like the majority of the load is spread out nicely across tons of threads. That render thread hardly uses more than 40% in an ideal case, but even in a moderate fight on my machine, it's dropping to 30%. Overall CPU use seems strangely capped at 130%. A bit more of that load is system instead of userspace when loaded (maybe that's context switching?). Load (in the load average sense) gets higher when looking at a fight, but doesn't seem excessive. Other uses of CPU time in the midst of this are about 10% for wineserver and 6% for pulseaudio (b/c it's garbage), but any way that's sliced there's still at least a fifth of the system not in use.
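For numbers instead of screenshots, per-thread CPU time on Linux can be read from `/proc/<pid>/task/<tid>/stat`. The following is a sketch only, not the method used for the figures above; the helper names and the sample line (including the thread name) are hypothetical.

```python
import os


def parse_thread_ticks(stat_line):
    """Return (utime, stime) in clock ticks from a /proc/<pid>/task/<tid>/stat line."""
    # The thread name is in parentheses and may contain spaces, so split after
    # the last ')'. The remaining fields start at field 3 (state), which puts
    # utime (field 14) at index 11 and stime (field 15) at index 12.
    rest = stat_line.rsplit(")", 1)[1].split()
    return int(rest[11]), int(rest[12])


def cpu_percent(ticks_before, ticks_after, seconds, ticks_per_second):
    """Convert a tick delta over an interval into percent of one core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_second * seconds)


def read_thread_ticks(pid):
    """Map thread id -> (utime, stime) for every thread of a live process."""
    ticks = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        with open(f"/proc/{pid}/task/{tid}/stat") as f:
            ticks[int(tid)] = parse_thread_ticks(f.read())
    return ticks


# Hypothetical stat line for illustration only.
sample = "4321 (render thread) S 1 4321 4321 0 -1 4194304 100 0 0 0 250 50 0 0 20 0 30 0 1000 0 0"
user, system = parse_thread_ticks(sample)
print(user, system)  # prints "250 50"
```

Calling `read_thread_ticks` twice a known interval apart and feeding the deltas to `cpu_percent` with `os.sysconf("SC_CLK_TCK")` as the tick rate gives per-thread percentages. Keeping user and system ticks separate also makes it possible to check how much of the extra load under heavy fights is kernel time.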

None of the explanations I've got for the low system utilization make sense. That applies doubly if some people don't have that problem. It almost sounds like they're artificially limiting it, but that's, uh, not good.

Both your results and mine show that more cores would probably solve my performance problem, which is an excellent finding. :D Thanks for the extra data!
 
LostCat
Minister of Gerbil Affairs
Posts: 2107
Joined: Thu Aug 26, 2004 6:18 am
Location: Earth

Re: Planetside 2 per-component benchmarking

Wed Oct 19, 2016 7:16 pm

synthtel2 wrote:
It might be noteworthy that while it says it's GPU-limited, it lies.

You're talking about a DX9 game last I checked, which isn't even remotely capable of using the power of a modern CPU or GPU properly.  So it could be a lot of things working against it.
Meow.
 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: Planetside 2 per-component benchmarking

Thu Oct 20, 2016 1:57 am

DX9's expensive draw calls surely aren't helping anything, but the render thread that has to use DX9 is actually using less CPU time when my game is having the most trouble, so I doubt that's the core problem. I'd guess that it says it's GPU-limited because the CPU doesn't spend zero time waiting for the GPU, and that inefficiency could be pinned on DX9 (though I don't know enough about DX9 to say for sure).

I've got a theory on the low CPU utilization: things aren't pipelined well, such that when the render thread is running, the rest of the threads aren't. Context switching and other thread management could account for the rest of the gap (since they're submitting lots of small jobs into conventional threads), as well as possibly a bit spent in the render thread waiting for the GPU. Scheduling efficiency of ~70% within the multi-threaded portion (or a boost from multi-threading of ~1.4x) would be sufficient to explain my results then, and that's not unreasonable at all. This also meshes with reports that Planetside likes being given high priority with the OS.
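The ~70% and ~1.4x figures can be reproduced with a simple two-phase model. This is a reconstruction under stated assumptions, not necessarily how the numbers were derived: the render thread is assumed to run alone for its share of each frame, the worker threads are assumed to share both cores for the rest, and wineserver, pulseaudio, and GPU waits are ignored. The function name is hypothetical.

```python
def scheduling_efficiency(total_util, render_frac, cores):
    """Solve total_util = render_frac + cores * eff * (1 - render_frac) for eff."""
    return (total_util - render_frac) / (cores * (1.0 - render_frac))


# Figures from the earlier thread-usage post: ~130% total CPU use (1.3 cores)
# and a render thread at ~30% during a moderate fight, on a dual-core.
eff = scheduling_efficiency(total_util=1.3, render_frac=0.3, cores=2)
boost = 2 * eff  # effective speedup of the multithreaded phase on two cores

print(round(eff, 2), round(boost, 2))  # prints "0.71 1.43"
```

With perfect scheduling (`eff` of 1.0) the same model would predict 170% total CPU use on this machine, so the observed 130% cap is what implies the roughly 30% loss.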

The main problem with this theory is that most reasons for low scheduling efficiency should be more dependent on thread count than this. If it's only 70% on my dually, it should be truly garbage when given eight threads to work with. Maybe it's usually better on two threads and running it through Wine is wrecking things for me? I still have no idea what's with the results when it's artificially limited to fewer threads.

The engineering focus on low latency for Planetside could be a factor in bad pipelining. Throughput optimization dictates getting started on frame 2 as soon as frame 1 is off to the render thread, but latency optimization doesn't. The problem with that theory is just that a lack of pipelining gives only a small latency improvement in return for a big throughput penalty, especially on CPUs with more cores than two.
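The tradeoff can be put into numbers with a toy two-stage model. This is only a sketch: the stage times are made up, the pipelined case assumes a one-frame queue between simulation and rendering, and nothing here is known about how Planetside's engine actually schedules frames.

```python
def unpipelined(sim_ms, render_ms):
    """Simulate, then render, then start the next frame: stages never overlap."""
    interval = sim_ms + render_ms      # time between finished frames
    latency = sim_ms + render_ms       # input sampled -> frame finished
    return interval, latency


def pipelined(sim_ms, render_ms):
    """Simulate frame N+1 while frame N renders, with a one-frame queue."""
    interval = max(sim_ms, render_ms)  # the slower stage sets the pace
    # A finished simulation may have to wait for the render thread to free up.
    latency = max(sim_ms, render_ms) + render_ms
    return interval, latency


# Made-up stage times in milliseconds.
for name, model in (("unpipelined", unpipelined), ("pipelined", pipelined)):
    interval, latency = model(4, 6)
    print(f"{name}: {1000 / interval:.0f} fps, {latency} ms latency")
```

With these numbers the unpipelined version saves 2 ms of latency (10 ms against 12 ms) but runs at 100 fps instead of about 167 fps. In this toy model the latency cost of pipelining only appears when the render stage is the slower one.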
 
LostCat
Minister of Gerbil Affairs
Posts: 2107
Joined: Thu Aug 26, 2004 6:18 am
Location: Earth

Re: Planetside 2 per-component benchmarking

Sat Oct 22, 2016 2:01 pm

synthtel2 wrote:
DX9's expensive draw calls surely aren't helping anything, but the render thread that has to use DX9 is actually using less CPU time when my game is having the most trouble, so I doubt that's the core problem.

Aside from draw calls, it's also single core and can't access most of the shaders on your card (in type or amount) or related GPGPU functions (while it is possible with OpenCL, I doubt they're doing any of that if they can't be bothered to move on from DX9.)
It could be 'GPU bound' because it can't access any more than it already does.
That's about all I could say on the topic though, since I don't know/care about Linux or Planetside 2 (unless they actually upgrade its engine, which seems unlikely at this point even with things like the Skyrim and Bioshock upgrades going on.)
 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: Planetside 2 per-component benchmarking

Sat Oct 22, 2016 3:49 pm

Being single-core is mainly an issue because of the expensive draw calls, since all the stuff that isn't draw calls can still be decently multithreaded. While it's missing some features on the GPU side (including some that are a big deal for performance like compute shaders), it can still see and use all of a card's main resources (SPs, ROPs, etc). If a game isn't trying to render anything particularly fancy, a good DX9 renderer should be good for at least 2/3rds the GPU performance of a good DX11 renderer (and that would have to be a pretty impressive DX11 renderer indeed to create a gap like that).
 
LostCat
Minister of Gerbil Affairs
Posts: 2107
Joined: Thu Aug 26, 2004 6:18 am
Location: Earth

Re: Planetside 2 per-component benchmarking

Sat Oct 22, 2016 5:35 pm

Why the Warframe team added a DX10 renderer a while ago may or may not also be relevant
https://forums.warframe.com/topic/22605 ... g-hitches/
 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: Planetside 2 per-component benchmarking

Sat Oct 22, 2016 8:43 pm

I think you might be confusing shaders as in the programs that run on a GPU with shaders as in the stream processors of a GPU. Some of the terminology is a bit messy.

Shader compilation (first sense of the word "shader" and what's being talked about in that Warframe thread) is a potential hiccup point, for sure, and not a thing I was thinking about (I may be too used to the modern game dev world where it's mostly an avoidable problem). It uses more CPU time in annoyingly bursty patterns, and newer hardware and APIs make it easier to have lots of shader complexity in a game without running into trouble. It is a CPU-side problem, but I doubt it's giving me any trouble in Planetside, because Planetside doesn't have particularly complex or varied shaders (especially with all settings on low) and I'm not seeing any hiccups of that variety.

Shaders as in the stream processors of a GPU are all usable regardless of the API. Compute shaders and DX12/Vulkan can increase utilization of this kind of shader by a bit, but that's mostly about reducing downtime between different tasks, not enabling any more ideal-case compute power.
 
LostCat
Minister of Gerbil Affairs
Posts: 2107
Joined: Thu Aug 26, 2004 6:18 am
Location: Earth

Re: Planetside 2 per-component benchmarking

Sat Oct 22, 2016 9:48 pm

synthtel2 wrote:
I think you might be confusing shaders as in the programs that run on a GPU

Err, no.  That is what I've been talking about.
 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: Planetside 2 per-component benchmarking

Sat Oct 22, 2016 10:56 pm

LostCat wrote:
can't access most of the shaders on your card (in type or amount) or related GPGPU functions (while it is possible with OpenCL, I doubt they're doing any of that if they can't be bothered to move on from DX9.)
It could be 'GPU bound' because it can't access any more than it already does.

That post sounds like it was referring to shader hardware, though I'm a bit confused about the subject at this point. Shaders in the software sense are programs the game devs write, not anything about the card, in case that helps anything. I'm happy enough to ramble about this stuff in great detail, but I'm not sure what I'm supposed to be rambling about. :wink:
 
LostCat
Minister of Gerbil Affairs
Posts: 2107
Joined: Thu Aug 26, 2004 6:18 am
Location: Earth

Re: Planetside 2 per-component benchmarking

Sat Oct 22, 2016 11:09 pm

synthtel2 wrote:
That post sounds like it was referring to shader hardware, though I'm a bit confused about the subject at this point. Shaders in the software sense are programs the game devs write, not anything about the card, in case that helps anything. I'm happy enough to ramble about this stuff in great detail, but I'm not sure what I'm supposed to be rambling about.  :wink:

Right. DX9 could do limited amounts of vertex and pixel shaders, but not geometry, compute, or tessellation shaders.

I usually read up on whatever I can. Obviously I'm not a developer, but I like to be as informed as possible otherwise. This article speaks to the limits I've been mentioning: https://en.wikipedia.org/wiki/High-Level_Shading_Language

I wanted to play Planetside 2 eventually, but I kept hoping they were going to upgrade the engine...  :/
 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: Planetside 2 per-component benchmarking

Sun Oct 23, 2016 12:35 am

Ah, like that, alright. The use of the word "shader" that I didn't think of. XD

Who thought it was a good idea to overload the word with that many meanings, anyway?

Yeah, the deal is that vertex and pixel shaders are the critical ones, and the others are much more situational. Geometry and tessellation shaders could mostly be emulated by just throwing more mesh data into the renderer (at more of a performance hit), and most of the point of compute shaders is making new kinds of algos feasible. Graphically modern games make use of all the shader types to make some of the fancy new effects decently efficient, but the kind of graphical effects Planetside is packing are already perfectly efficient on vertex and pixel shaders. Since all of these shader types map to the same underlying hardware (with small exceptions), the lack of some shader types doesn't end up holding back PS2's GPU performance much.
 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: Planetside 2 per-component benchmarking

Sun Aug 27, 2017 1:55 pm

Well, here I am with 8C16T (and Wine CSMT if needed) that should by all rights play this game perfectly, but in the meantime they added some anti-cheat which is entirely borked in Wine. Arg.

I want to play this game enough that this may yet end in a dual-boot. Win10 + gaming on this internet connection would be a massive PITA though, especially if it's only booted often enough to play one game. Some manner of router QoS tech may be in order (not my forte).
 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: Planetside 2 per-component benchmarking

Sat Sep 09, 2017 4:56 pm

It isn't directly comparable due to the OS change and intervening game updates, of course, but on an R7 1700 with RAM at 2133 (for now) it's typically (mostly CPU-limited) 80-120 fps and never below 60. Now I want to max out this monitor at 144 though. :lol: I'll see what RAM does whenever I get around to clocking it properly.
 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: Planetside 2 per-component benchmarking

Sat Sep 30, 2017 7:37 pm

2133, 16-16-16-16-36-54 .... 137-144 fps
2933, 16-16-16-16-36-54 .... 153-159 fps
2666, 14-14-14-14-32-46 .... 152-161 fps

.... so framerate gains come to roughly 30% of the RAM clock increase, with latency apparently being a pretty big deal. The 2666 config has only a 3-7% absolute latency advantage to make up for its nearly 10% bandwidth (and data fabric) deficit against the 2933 config, but it hangs right with it.
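The arithmetic behind those percentages can be checked from the three results above. This is a sketch with some assumptions: each fps range is collapsed to its midpoint, and only the first timing (CAS) is converted to nanoseconds, so it covers just one end of the quoted latency range. The function names are hypothetical.

```python
def midpoint(low, high):
    return (low + high) / 2.0


def cas_latency_ns(transfer_rate_mts, cas_cycles):
    """First-word CAS latency in ns; the memory clock is half the transfer rate."""
    return cas_cycles * 2000.0 / transfer_rate_mts


# Framerate ranges and timings as posted above.
fps_2133 = midpoint(137, 144)   # 140.5
fps_2933 = midpoint(153, 159)   # 156.0

fps_gain = fps_2933 / fps_2133 - 1.0      # about 0.11
clock_gain = 2933 / 2133 - 1.0            # about 0.375
print(round(fps_gain / clock_gain, 2))    # prints "0.29"

lat_2666 = cas_latency_ns(2666, 14)       # about 10.5 ns
lat_2933 = cas_latency_ns(2933, 16)       # about 10.9 ns
print(round(1.0 - lat_2666 / lat_2933, 3))  # CAS-only latency advantage
print(round(1.0 - 2666 / 2933, 3))          # bandwidth deficit
```

The CAS-only latency advantage comes out a little under 4%, and the bandwidth deficit a little over 9%, which is consistent with the figures quoted in the post.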

Same scene, measurement method, and all that as before aside from a few minor settings boosts (still 50% render res / no shadows though, so very little GPU limitation and very similar CPU load pattern). Timings not noted were all auto (which actually works with 1.0.0.6b). I guess the multithreaded parts aren't the most bandwidth-heavy parts, but it's still worth a nice boost.
