A CPU only does work when it has something to do. Generally speaking, to have something to do it must have both a stream of instructions to run and the data those instructions need... and that's the rub. Sometimes the data is immediately available -- it's already sitting in a register, or is an immediate (constant) value in the instruction stream; more commonly, the data needs to come from memory. If the programmer is lucky or good, the data is already cached and the CPU doesn't have to go all the way out to main memory. If the data is in the L1 cache, it is available almost instantly; if it's in the L2 cache, there's a delay of a few cycles; in the L3 cache, a few more. And if it has to go all the way out to main memory, it's quite a few more cycles than that. (That's what the "Sandra Cache and Memory Latency" graph in TR's CPU reviews is showing you.) The more cycles the CPU has to wait, the less "utilized" it can be. If it is spending half the available cycles waiting on data, it's going to be "50%" busy (at best). And that's assuming the data is already in memory. What if it's sitting out on an SSD, or worse, a hard drive? Worse yet, what if that data is sitting somewhere out on the internet? And for web browsers, that's pretty much always the case: a web page (and the scripts and everything else associated with it) is a big wad of data, and it all needs to get pulled down from a server somewhere. For a CPU that measures its cycles in nanoseconds and its bandwidth in GB/s, the multi-millisecond latency and megabits-per-second speed of your typical net connection means a lot of time spent twiddling its thumbs (i.e. either idle or doing something else; either way, the browser's utilization of the CPU is going to be far from total).
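To make that latency ladder a bit more concrete, here's a rough, untuned sketch (my own, not anything from the review) of the pointer-chasing idea behind graphs like that: walk a buffer in a random order so every load depends on the previous one and the prefetcher can't hide the latency, then divide total time by the number of loads. The 64 MiB buffer size and iteration count are arbitrary placeholder assumptions.

    /* Rough sketch of a pointer-chasing latency probe: walk a big buffer in a
     * random cycle so each load depends on the one before it and the prefetcher
     * can't hide the latency.  Buffer size and iteration count are arbitrary. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N      (64UL * 1024 * 1024 / sizeof(size_t))  /* 64 MiB: past L3 on most CPUs */
    #define ITERS  (50UL * 1000 * 1000)

    int main(void)
    {
        size_t *buf = malloc(N * sizeof(size_t));
        for (size_t i = 0; i < N; i++)
            buf[i] = i;
        for (size_t i = N - 1; i > 0; i--) {    /* Sattolo shuffle: one big random cycle */
            size_t j = rand() % i;
            size_t t = buf[i]; buf[i] = buf[j]; buf[j] = t;
        }

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        size_t idx = 0;
        for (unsigned long i = 0; i < ITERS; i++)
            idx = buf[idx];                     /* every load waits on the previous one */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
        printf("~%.1f ns per load (ended at %zu)\n", ns / ITERS, idx);
        free(buf);
        return 0;
    }

Shrink the buffer to something that fits in L1 and the per-load time collapses to almost nothing; grow it past L3, as above, and you're looking at main-memory latency. Sweeping the buffer size is exactly how those latency curves get traced out.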
This is why the programs used to "burn" CPUs (like prime number generators, etc.) generally run entirely out of cache: the only way to really max out the CPU execution resources is to never have to wait on memory (let alone anything slower than memory). Even browser benchmarks run entirely locally, using data already cached in memory.
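For illustration (my sketch, not any particular burn-in tool), a minimal cache-resident burner can be as dumb as counting primes by trial division over a small fixed range: the working set is essentially just registers and the instruction stream, so one core sits pegged for as long as you let it run.

    /* Minimal sketch of a cache-resident "CPU burner": trial-division prime
     * counting over a small fixed range.  Nothing here ever has to wait on
     * main memory, so one core stays fully busy until you kill it. */
    #include <stdio.h>

    int main(void)
    {
        unsigned long pass = 0;
        for (;;) {                                   /* burn until Ctrl-C */
            unsigned long count = 0;
            for (unsigned n = 3; n < 100000; n += 2) {
                int is_prime = 1;
                for (unsigned d = 3; d * d <= n; d += 2)
                    if (n % d == 0) { is_prime = 0; break; }
                count += is_prime;
            }
            printf("\rpass %lu: %lu odd primes below 100000", ++pass, count);
            fflush(stdout);
        }
    }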
And of course a lot of programs aren't
trying to get high utilization out of the CPU. They aren't doing anything computationally intensive; in many cases, they're twiddling their thumbs waiting on input from the user. And once they've got it, they don't necessarily do a whole lot with it. Your typical IM app, for example, is just grabbing text from the keyboard and tossing it down the network pipe, then retrieving a response and displaying it; most of the time it isn't doing anything but waiting for a human at one end of the conversation or the other to do something (and for a CPU that operates in nanoseconds, the seconds or minutes of human timescales make for a lot of waiting). There may be some fancy GUI stuff going on, but it's nothing that will tax a modern CPU; even your typical word processor or email client, with all its spell-checking and whatnot, isn't really doing much by modern CPU standards.
Now, separate from all that is the scheduling the OS uses to make sure every active program gets a timeslice of the CPU. Some programs have higher priority than others, so they get relatively more timeslices (and will see higher CPU utilization), but all program threads get interrupted regularly. This was more of an issue in the era of single-core CPUs, but even now, no matter how many cores you have, on a typical desktop/laptop you probably have more than that many processes (or, more precisely, active threads). Meanwhile the OS itself also has to use the CPU to do its own housekeeping. So in ordinary use no program can fully occupy all the cores all the time, and in practice most aren't trying to: that background spellchecking in your word processor is running on a separate thread, but it's trying to stay out of the way (there's a sketch of that pattern right after this paragraph). Being a good citizen in a multitasking environment means leaving as much CPU (and every other resource, like memory and battery life) available for whatever else the user might be running at the same time. Of course, some programs genuinely need to crank a thread hard, at least temporarily. When that Photoshop filter or video encoder is churning, the OS scheduler will generally try to keep it on the same core (or cores, in the case of multiple worker threads) to maximize cache utilization, and you will see the CPU usage spike.
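Here's that good-citizen sketch (mine, not lifted from any real spellchecker): a background thread does a short burst of make-believe work, then naps so the core is free for whatever the user is actually doing. The work size and the 100 ms nap are arbitrary assumptions; compile with -pthread.

    /* Sketch of a "good citizen" background worker (think: the word processor's
     * spellchecker): a short burst of work, then a nap, so the thread never
     * hogs a core.  Work size and sleep interval are arbitrary placeholders. */
    #include <pthread.h>
    #include <time.h>
    #include <unistd.h>

    static void *background_check(void *arg)
    {
        (void)arg;
        for (;;) {
            /* pretend to spellcheck a paragraph: a brief burst of CPU work */
            volatile unsigned long sum = 0;
            for (unsigned long i = 0; i < 5000000UL; i++)
                sum += i;

            /* ...then get out of the way for 100 ms */
            struct timespec nap = { 0, 100 * 1000 * 1000 };
            nanosleep(&nap, NULL);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        pthread_create(&tid, NULL, background_check, NULL);
        pause();   /* stands in for the foreground work; Ctrl-C to quit */
        return 0;
    }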
What you're actually seeing in that spike, though, is an average spread out over time: certain hardware counters get updated periodically, and those are averaged and reported in Task Manager. It would be too much overhead to update those continuously, and we poor humans couldn't make sense of information coming in every nanosecond anyway, so what you see as some percentage utilization is the average utilization over a more reasonable timespan.
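For a concrete (Linux-flavored, and deliberately simplified) illustration of that averaging, here's a sketch that samples the kernel's aggregate CPU-time counters from /proc/stat twice, one second apart, and reports busy time as a fraction of that window. A real monitor reads every column and does this per core; the one-second window is just an assumption.

    /* Sketch of utilization as an average over a sampling window: read the
     * aggregate CPU counters from /proc/stat, wait a second, read them again,
     * and report busy time as a fraction of the delta.  Only the first seven
     * columns are read here; a real tool would count them all, per core. */
    #include <stdio.h>
    #include <unistd.h>

    static int read_cpu(unsigned long long *busy, unsigned long long *total)
    {
        unsigned long long user, nice, sys, idle, iowait, irq, softirq;
        FILE *f = fopen("/proc/stat", "r");
        if (!f) return -1;
        if (fscanf(f, "cpu %llu %llu %llu %llu %llu %llu %llu",
                   &user, &nice, &sys, &idle, &iowait, &irq, &softirq) != 7) {
            fclose(f);
            return -1;
        }
        fclose(f);
        *total = user + nice + sys + idle + iowait + irq + softirq;
        *busy  = *total - idle - iowait;
        return 0;
    }

    int main(void)
    {
        unsigned long long busy0, total0, busy1, total1;
        if (read_cpu(&busy0, &total0) != 0) return 1;
        sleep(1);                               /* the sampling window */
        if (read_cpu(&busy1, &total1) != 0) return 1;
        printf("CPU busy over the last second: %.1f%%\n",
               100.0 * (busy1 - busy0) / (double)(total1 - total0));
        return 0;
    }

Change that sleep to ten seconds and a brief burst of work gets diluted into a much smaller percentage, which is exactly why the number in a task monitor depends as much on the sampling window as on what the program is doing.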