Floating-point units in server-grade CPUs

Discussion of all forms of processors, from AMD to Intel to VIA.

Moderators: Flying Fox, morphine

Re: Floating-point units in server-grade CPUs

Posted on Tue Nov 02, 2010 11:20 am

just brew it! wrote:
Glorious wrote:
Shining Arcanine wrote:Floating point numbers map irrational numbers to rational ones, which is a hack to obtain an illusion that a machine word can represent more than is theoretically possible. If you know differently, then please enlighten me.

This is just a mess.

First off, computers are finitely precise, so if you want to use irrational numbers you're going to HAVE to use an approximate rational number for them. Integer/FP/whatever, if you're dealing with things like pi or e on a computer, you're using this "hack." Which, you know, is kind of a good thing? Otherwise cellphones, the internet, scientific computing, etc... would all be impossible.

Or to look at it another way, this "hack" is nothing more than the concept of "significant digits" that we all learned in high school math class, recast from the decimal realm to the binary realm.

SA -- FWIW I actually agree that many developers incorrectly use floating point values where a scaled integer, BCD representation, or even string would be more appropriate. But that doesn't change the fact that you seem to have completely missed the point.

I think I'd put it a little more bluntly. It's not a hack at all. It's a useful data structure at its core. It models data in a way that approximates (very well, actually) the real numbers.

He calls it a hack because he has some grudge against it or something. But the IEEE floating point standard is simply a well-documented data structure that is extremely useful for modeling floating point numbers.
Buub
Maximum Gerbil
Silver subscriber
 
 
Posts: 4192
Joined: Sat Nov 09, 2002 11:59 pm
Location: Seattle, WA

Re: Floating-point units in server-grade CPUs

Posted on Tue Nov 02, 2010 11:58 am

This thread is a mess :o
StuG
Graphmaster Gerbil
Silver subscriber
 
 
Posts: 1457
Joined: Wed May 23, 2007 11:19 pm
Location: Florida

Re: Floating-point units in server-grade CPUs

Posted on Tue Nov 02, 2010 12:19 pm

Shining Arcanine wrote:Floating point numbers map irrational numbers to rational ones, which is a hack to obtain an illusion that a machine word can represent more than is theoretically possible. If you know differently, then please enlighten me.


When you try to multiply or divide something by "a third" using fixed-point datatypes, what are you actually dividing/multiplying by? When you're doing anything that involves Pi or e, what value are you actually using?

Divide 8 / 3 using fixed-point datatypes. The value you get is an approximation.

The point is, ALL datatypes on some level resort to illusion, and they all have rounding errors. Floating point lets you store an enormously large range of numbers in a smaller space, and the trade-off is some degree of precision. But in the examples above, it is actually more precise than the alternative.
cphite
Gerbil Elite
 
Posts: 545
Joined: Thu Apr 29, 2010 9:28 am

Re: Floating-point units in server-grade CPUs

Posted on Tue Nov 02, 2010 12:27 pm

Buub wrote:I think I'd put it a little more bluntly. It's not a hack at all. ...

That's why I put quotation marks around the word "hack" in my reply!
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37514
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Floating-point units in server-grade CPUs

Posted on Tue Nov 02, 2010 12:28 pm

StuG wrote:This thread is a mess :o

That's because it used FP math.
There is a fixed amount of intelligence on the planet, and the population keeps growing :(
morphine
Gerbil Khan
Silver subscriber
 
 
Posts: 9934
Joined: Fri Dec 27, 2002 8:51 pm
Location: Portugal (that's next to Spain)

Re: Floating-point units in server-grade CPUs

Posted on Tue Nov 02, 2010 12:36 pm

morphine wrote:
StuG wrote:This thread is a mess :o

That's because it used FP math.

So are we suffering from the effect of inexact representation, NaNs, denormals, or all of the above?
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37514
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Floating-point units in server-grade CPUs

Posted on Tue Nov 02, 2010 12:57 pm

just brew it! wrote:
morphine wrote:That's because it used FP math.

So are we suffering from the effect of inexact representation, NaNs, denormals, or all of the above?


FDIV problems due to a bad lookup table.

This thread was supposed to look like this:

4195835.0/3145727.0 = 1.333 820 449 136 241 002 5

Instead we got this:

4195835.0/3145727.0 = 1.333 739 068 902 037 589 4
"Welcome back my friends to the show that never ends. We're so glad you could attend. Come inside! Come inside!"
Ryu Connor
Global Moderator
Gold subscriber
 
 
Posts: 3506
Joined: Thu Dec 27, 2001 7:00 pm
Location: Marietta, GA

Re: Floating-point units in server-grade CPUs

Posted on Wed Nov 03, 2010 9:10 am

In *general* terms, about 90% of the processing in the data center is integer. ~10% is floating point. Again, *generally* speaking, those that are doing heavy floating point are looking towards GPGPU as a solution.

While some may question the "need for FP" just because some workloads don't leverage it, having it in the CPUs makes total sense. Because when you need it, it has to be there.
While I work for AMD, my posts are my own opinions.

http://blogs.amd.com/work/author/jfruehe/
JF-AMD
Gerbil
 
Posts: 33
Joined: Wed Dec 09, 2009 11:27 am

Re: Floating-point units in server-grade CPUs

Posted on Wed Nov 03, 2010 10:19 pm

Buub wrote:He calls it a hack because he has some grudge against it or something. But the IEEE floating point standard is simply a well-documented data structure that is extremely useful for modeling floating point numbers.


Computers at their heart are approximations of deterministic Turing machines. Deterministic Turing machines consist of nothing more than an infinite tape, a finite alphabet of characters that can be written to the tape, a tapehead that can do random I/O operations on the tape and a state transition function that describes the behavior of the tapehead at each step. With that in mind, you could use the unary system and still have a computer. As far as deterministic Turing machines are concerned, operating on floating point numbers does not make them any more powerful than they already are. It just means that you have a little over 4 billion characters in the 32-bit case, plus a state transition function that can kill your Turing machine if it moves the machine to the NaN state. Integers require the same number of characters in the 32-bit case, but they have safe state transition functions that never kill the Turing machine. If you are careful and implement a few more characters, you could even merge the two state transition functions to get an even more complicated function, but it would inherit the NaN problem.

I have no "grudge" against floating point numbers, but the logical issues that they cause merit that they be classified as a hack. I mentioned that they were hacks to support the idea that they are not strictly needed for the operation of a modern computer, which supported the cause for my earlier question as to why people care about the performance of these operations on a CPU. The only person here who even attempted to answer that question was Ryu Connor, who was willing to answer it directly instead of dodging it.

With that said, I think most people here are trying to steer the discussion away from the original question by substituting discourse regarding what floating point numbers are to avoid the more difficult question as to why they care about the performance of CPU floating point units. There is a good page on Wikipedia describing this phenomenon:

http://en.wikipedia.org/wiki/Displacement_(psychology)

JF-AMD wrote:In *general* terms, about 90% of the processing in the data center is integer. ~10% is floating point. Again, *generally* speaking, those that are doing heavy floating point are looking towards GPGPU as a solution.

While some may question the "need for FP", just because some workloads don't leverage it, having it in the CPUs makes total sense. Because when you need, it has to be there.


I do not think your 10% figure is far from the truth. The need for floating point hardware in CPUs going forward seems overblown to me, outside of special cases that would move to either the GPU or specialized hardware anyway. Something Intel recently did in this area was implement a dedicated hardware video encoder in Sandy Bridge, which operates orders of magnitude faster than Sandy Bridge's CPU cores do. As time passes, the number of floating point operations done on CPUs will likely approach zero as things move away from the CPU. So many other things could benefit from the die area used by hardware floating point units that I am not certain how long you could reasonably argue those units are good to have "just in case": the cost of losing hardware floating point performance is continually shrinking, while the benefit of putting that die area into other things (e.g. more cores) stays constant or even grows. There is no floating point operation that cannot be done with integer arithmetic, albeit more slowly, so CPUs are in no way obligated to keep such units going forward. In that context, having these units perform well does not seem as important as other things, such as having more cores. It could be that AMD's decision to share a floating point unit between every two cores is an early step in this direction.
Disclaimer: I over-analyze everything, so try not to be offended if I over-analyze something you wrote.
Shining Arcanine
Gerbil Jedi
 
Posts: 1717
Joined: Wed Jun 11, 2003 11:30 am

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 12:27 am

Shining Arcanine, I think it's absolutely cool how you pulled off your [Defense of the Troll] (1 hour cooldown) ability, much like Prime1, Krogoth, et al, because you just completely ignored Glorious' comment (here, I'll help, click this link).

As per the above skill's description, the reason you did that is that he's completely proven you wrong, and if you were to acknowledge that, the thread would already be dead by now. Instead, you choose to continue in ignorance and your own dark entertainment without realising that the battle's already over.

It was fun until you did that.
Meadows
Grand Gerbil Poohbah
Silver subscriber
 
 
Posts: 3151
Joined: Mon Oct 08, 2007 1:10 pm
Location: Location: Location

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 1:10 am

Shining Arcanine wrote:I have no "grudge" against floating point numbers, but the logical issues that they cause merit that they be classified as a hack. I mentioned that they were hacks to support the idea that they are not strictly needed for the operation of a modern computer, which supported the cause for my earlier question as to why people care about the performance of these operations on a CPU...
With that said, I think most people here are trying to steer the discussion away from the original question by substituting discourse regarding what floating point numbers are to avoid the more difficult question as to why they care about the performance of CPU floating point units.

If you're going to use that strategy, then by all means, data structures in general are a "hack". They are not strictly needed. Same with compilers and high level languages. They're all strictly unnecessary. Nothing prevents us from writing all our code in machine language.

Well, except for the fact that nothing would get done... These things are all highly useful abstractions that allow larger problems to be solved, because they automate or abstract away tedious details. Same with floating point math.

And the reason FP is done in hardware rather than software is for the many orders of magnitude improvement in computational efficiency. Much like automatic handling of stack frames and other things processors do now that were once handled in explicit lines of code.
Buub
Maximum Gerbil
Silver subscriber
 
 
Posts: 4192
Joined: Sat Nov 09, 2002 11:59 pm
Location: Seattle, WA

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 4:02 am

So what am I supposed to do when I need a float/double? Pray that the CPU emulates an FPU for me?
Mothership: Thuban 1055T@3.7GHz, 12GB DDR3, M5A99X EVO, GTX470+Icy Vision Rev.2@840/3800, Vertex 2E 60GB
Supply ship: Sargas@2.8GHz, 12GB DDR3, M4A88TD-V EVO/USB3
Corsair: Macbook Air Ivy Bridge
Crayon Shin Chan
Minister of Gerbil Affairs
 
Posts: 2236
Joined: Fri Sep 06, 2002 11:14 am
Location: Malaysia

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 4:07 am

Shining Arcanine wrote:
Buub wrote:He calls it a hack because he has some grudge against it or something. But the IEEE floating point standard is simply a well-documented data structure that is extremely useful for modeling floating point numbers.


Computers at their heart are approximations of deterministic Turing machines. Deterministic Turing machines consist of nothing more than an infinite tape, a finite alphabet of characters that can be written to the tape, a tapehead that can do random I/O operations on the tape and a state transition function that describes the behavior of the tapehead at each step.


I never understood why everybody feels the need to describe a Turing machine in terms of outdated computer hardware nobody ever encounters (saying you use a tape punch daily is like saying we need FPUs, after all). Why can't they just say "an infinite data storage device with random accessibility (and while we're at it, tapes can't do random I/O), plus a limited set of characters that can be stored on said device"? And I have no clue what this state transition function is.
Mothership: Thuban 1055T@3.7GHz, 12GB DDR3, M5A99X EVO, GTX470+Icy Vision Rev.2@840/3800, Vertex 2E 60GB
Supply ship: Sargas@2.8GHz, 12GB DDR3, M4A88TD-V EVO/USB3
Corsair: Macbook Air Ivy Bridge
Crayon Shin Chan
Minister of Gerbil Affairs
 
Posts: 2236
Joined: Fri Sep 06, 2002 11:14 am
Location: Malaysia

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 5:35 am

I'm off to design a new type of FPU... a Fraction Processing Unit. Then everyone will be happy :D
Fernando!
Your mother ate my dog!
cheesyking
Minister of Gerbil Affairs
 
Posts: 2245
Joined: Sun Jan 25, 2004 7:52 am
Location: That London (or so I'm told)

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 5:45 am

Buub wrote:
Shining Arcanine wrote:I have no "grudge" against floating point numbers, but the logical issues that they cause merit that they be classified as a hack. I mentioned that they were hacks to support the idea that they are not strictly needed for the operation of a modern computer, which supported the cause for my earlier question as to why people care about the performance of these operations on a CPU...
With that said, I think most people here are trying to steer the discussion away from the original question by substituting discourse regarding what floating point numbers are to avoid the more difficult question as to why they care about the performance of CPU floating point units.

If you're going to use that strategy, then by all means, data structures in general are a "hack". They are not strictly needed. Same with compilers and high level languages. They're all strictly unnecessary. Nothing prevents us from writing all our code in machine language.

Well, except for the fact that nothing would get done... These things are all highly useful abstractions that allow larger problems to be solved, because they automate or abstract away tedious details. Same with floating point math.

And the reason FP is done in hardware rather than software is for the many orders of magnitude improvement in computational efficiency. Much like automatic handling of stack frames and other things processors do now that were once handled in explicit lines of code.


If you want to talk further on this, please send me a private message. It is not related to the question I asked originally at all, which I made clear in my other post. The following page describes what you are doing fairly well and it is not appreciated:

http://en.wikipedia.org/wiki/Displacement_(psychology)

Crayon Shin Chan wrote:So what am I supposed to do when I need a float/double? Pray that the CPU emulates an FPU for me?


Are you aware that compilers are capable of inserting integer instructions into a program that uses floating point numbers so that it behaves as though the hardware were doing the floating point operations, without using a single floating point instruction? If you are aware that compilers can do that, why are you asking a question to which you already know the answer?

Crayon Shin Chan wrote:
Shining Arcanine wrote:
Buub wrote:He calls it a hack because he has some grudge against it or something. But the IEEE floating point standard is simply a well-documented data structure that is extremely useful for modeling floating point numbers.


Computers at their heart are approximations of deterministic Turing machines. Deterministic Turing machines consist of nothing more than an infinite tape, a finite alphabet of characters that can be written to the tape, a tapehead that can do random I/O operations on the tape and a state transition function that describes the behavior of the tapehead at each step.


I never understood why everybody feels the need to describe a Turing machine in terms of outdated computer hardware nobody ever encounters (saying you use a tape punch daily is like saying we need FPUs, after all). Why can't they just say "an infinite data storage device with random accessibility (and while we're at it, tapes can't do random I/O), plus a limited set of characters that can be stored on said device"? And I have no clue what this state transition function is.


Magnetic tape was invented after the Turing machine was defined. Furthermore, the tape in a Turing machine is infinite, so it could never exist in physical form anyway, for the reasons you stated.
Disclaimer: I over-analyze everything, so try not to be offended if I over-analyze something you wrote.
Shining Arcanine
Gerbil Jedi
 
Posts: 1717
Joined: Wed Jun 11, 2003 11:30 am

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 6:33 am

Shining Arcanine wrote:
Crayon Shin Chan wrote:So what am I supposed to do when I need a float/double? Pray that the CPU emulates an FPU for me?

Are you aware that compilers are capable of inserting integer instructions into a program that uses floating point numbers so that it behaves as though the hardware were doing the floating point operations, without using a single floating point instruction?

Yes, and if you do this it executes very slowly. If the code does a lot of FP math, you'll be lucky to get code that executes at even 1/10th the speed of equivalent hardware floating point (it'll probably be significantly worse than that).

Multiple people have already posted valid examples of why a FPU is desirable in a general-purpose CPU; you continue to ignore these examples. Yes, we know that floating point is inexact. Time to move on.
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37514
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 9:06 am

Shining Arcanine wrote:
Buub wrote:
Shining Arcanine wrote:I have no "grudge" against floating point numbers, but the logical issues that they cause merit that they be classified as a hack. I mentioned that they were hacks to support the idea that they are not strictly needed for the operation of a modern computer, which supported the cause for my earlier question as to why people care about the performance of these operations on a CPU...
With that said, I think most people here are trying to steer the discussion away from the original question by substituting discourse regarding what floating point numbers are to avoid the more difficult question as to why they care about the performance of CPU floating point units.

If you're going to use that strategy, then by all means, data structures in general are a "hack". They are not strictly needed. Same with compilers and high level languages. They're all strictly unnecessary. Nothing prevents us from writing all our code in machine language.

Well, except for the fact that nothing would get done... These things are all highly useful abstractions that allow larger problems to be solved, because they automate or abstract away tedious details. Same with floating point math.

And the reason FP is done in hardware rather than software is for the many orders of magnitude improvement in computational efficiency. Much like automatic handling of stack frames and other things processors do now that were once handled in explicit lines of code.


If you want to talk further on this, please send me a private message. It is not related to the question I asked originally at all, which I made clear in my other post. The following page describes what you are doing fairly well and it is not appreciated:

http://en.wikipedia.org/wiki/Displacement_(psychology)


Just because you claim it doesn't make it so. This is a very good example. No displacement is occurring here. There is a statement by you, a related response by another, followed by a response by you. If you open the can of worms, you really cannot complain that you wanted a nightcrawler and got a wriggler. And as a side note, this really isn't a debatable issue. This has gotten just bizarre. There is a need, and you simply deny it.

I would love to see the conversations you will have with your boss when you tell him that everyone is wrong but you.
Sony a7
Sony Zeiss 55/1.8 SSM, 24-70/4 SSM
Minolta 17-35/2.8-4 D, 100-300 APO
TheEmrys
Minister of Gerbil Affairs
Silver subscriber
 
 
Posts: 2144
Joined: Wed May 29, 2002 8:22 pm
Location: Northern Colorado

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 10:20 am

Crayon Shin Chan wrote:
Shining Arcanine wrote:
Buub wrote:He calls it a hack because he has some grudge against it or something. But the IEEE floating point standard is simply a well-documented data structure that is extremely useful for modeling floating point numbers.


Computers at their heart are approximations of deterministic Turing machines. Deterministic Turing machines consist of nothing more than an infinite tape, a finite alphabet of characters that can be written to the tape, a tapehead that can do random I/O operations on the tape and a state transition function that describes the behavior of the tapehead at each step.


I never understood why everybody feels the need to describe a Turing machine in terms of outdated computer hardware nobody ever encounters (saying you use a tape punch daily is like saying we need FPUs, after all). Why can't they just say "an infinite data storage device with random accessibility (and while we're at it, tapes can't do random I/O), plus a limited set of characters that can be stored on said device"? And I have no clue what this state transition function is.

Wow, I didn't even notice that issue -- Turing Machines don't get to do random I/O! They always proceed one step at a time, reading the symbol at the current location to determine their next state.
So much for his academic credentials, if he won't even describe a Turing Machine accurately. :lol:
Core i7 920, 3x2GB Corsair DDR3 1600, 80GB X25-M, 1TB WD Caviar Black, MSI X58 Pro-E, Radeon 4890, Cooler Master iGreen 600, Antec P183, opticals
SNM
Emperor Gerbilius I
 
Posts: 6206
Joined: Fri Dec 30, 2005 10:37 am

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 10:53 am

So that's why it's always a tape... because Turing machines can't do random I/O, tapes don't lend themselves well to it (technically speaking, they can seek), and every other storage device invented since then can do random I/O.
Mothership: Thuban 1055T@3.7GHz, 12GB DDR3, M5A99X EVO, GTX470+Icy Vision Rev.2@840/3800, Vertex 2E 60GB
Supply ship: Sargas@2.8GHz, 12GB DDR3, M4A88TD-V EVO/USB3
Corsair: Macbook Air Ivy Bridge
Crayon Shin Chan
Minister of Gerbil Affairs
 
Posts: 2236
Joined: Fri Sep 06, 2002 11:14 am
Location: Malaysia

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 12:21 pm

Shining Arcanine wrote:I do not think your 10% figure is far from the truth. The need for floating point hardware in CPUs going forward seems overblown to me, outside of special cases that would move to either the GPU or specialized hardware anyway. Something Intel recently did in this area was implement a dedicated hardware video encoder in Sandy Bridge, which operates orders of magnitude faster than Sandy Bridge's CPU cores do. As time passes, the number of floating point operations done on CPUs will likely approach zero as things move away from the CPU. So many other things could benefit from the die area used by hardware floating point units that I am not certain how long you could reasonably argue those units are good to have "just in case": the cost of losing hardware floating point performance is continually shrinking, while the benefit of putting that die area into other things (e.g. more cores) stays constant or even grows. There is no floating point operation that cannot be done with integer arithmetic, albeit more slowly, so CPUs are in no way obligated to keep such units going forward. In that context, having these units perform well does not seem as important as other things, such as having more cores. It could be that AMD's decision to share a floating point unit between every two cores is an early step in this direction.


OK, this makes little sense; CPU companies are integrating more into the CPU, not pushing more of it outside. The first CPUs just had the integer hardware; everything else was done in software. They then built specialized FP hardware off-CPU as a co-processor to, you know, handle FP calculations. They then moved it into the CPU. They did the same thing for cache. The companies are now in the process of pulling everything into the CPU to decrease latency and increase performance: northbridge, PCIe, graphics, and other specialized HW. The video encoder on the newer Intel chips is just more FP work being pulled into an even more specialized piece of floating point HW that they now have room for on-die, or at least in-package, with the CPU. As the die keeps shrinking, AMD, Intel, IBM, Sun, etc. will all keep pulling more items into their CPUs...
tfp
Grand Gerbil Poohbah
 
Posts: 3068
Joined: Wed Sep 24, 2003 11:09 am

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 1:13 pm

Computers at their heart are approximations of deterministic Turing machines. Deterministic Turing machines consist of nothing more than an infinite tape, a finite alphabet of characters that can be written to the tape, a tapehead that can do random I/O operations on the tape and a state transition function that describes the behavior of the tapehead at each step.


There is nothing random about a deterministic Turing machine. :)

Floating-point allows us to store a much wider range of numbers than fixed point given the same space, and the trade-off is precision. They are no more of a "hack" than any other data type.
cphite
Gerbil Elite
 
Posts: 545
Joined: Thu Apr 29, 2010 9:28 am

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 5:53 pm

GPUs aren't used because basically 0% of software uses CUDA, OpenCL or DX Compute. Whether it's servers, consumer, etc.

Actually CUDA/OpenCL/Direct Compute are probably more common for servers and HPC than anything else right now. And almost all of that is CUDA because it's a 2-4 year old ecosystem, while OpenCL and Direct Compute are less than a year old basically (but not proprietary).

Moreover, GPUs are very brittle tools and perform poorly for software that has irregular control flow, data structures or communication between threads. In any of those cases, CPUs tend to win. And most efficient algorithms for FP workloads tend to use a lot of communication to reduce computation by many orders of magnitude.

Also, if you think floating point is unimportant, you need to get a clue. FP is an important element of computer performance. Some workloads have very little...and some may have no FP...but for general purpose computers, you want to have good FP support.

http://citeseerx.ist.psu.edu/viewdoc/do ... 1&type=pdf

That's a paper on the Xbox 360 - I quote: "First, for the game workload, both integer and floating-point performance are important." ... "In addition, several sections of the application lend themselves well to vector floating-point acceleration."

Most engineering software (e.g. used to design chips, airplanes, architecture, etc.) uses FP, as does a lot of HPC.

DK
dkanter
Gerbil
Gold subscriber
 
 
Posts: 10
Joined: Sat Dec 02, 2006 6:24 pm

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 5:55 pm

I use doubles to calculate the quadratic equation, and for a fuzzy logic project. It's in the hardware because a long time ago people realized this needed to be accelerated. Let's keep the FPU inside the CPU, it's more convenient for everybody that way.
Mothership: Thuban 1055T@3.7GHz, 12GB DDR3, M5A99X EVO, GTX470+Icy Vision Rev.2@840/3800, Vertex 2E 60GB
Supply ship: Sargas@2.8GHz, 12GB DDR3, M4A88TD-V EVO/USB3
Corsair: Macbook Air Ivy Bridge
Crayon Shin Chan
Minister of Gerbil Affairs
 
Posts: 2236
Joined: Fri Sep 06, 2002 11:14 am
Location: Malaysia

Re: Floating-point units in server-grade CPUs

Posted on Thu Nov 04, 2010 11:28 pm

dkanter wrote:Moreover, GPUs are very brittle tools and perform poorly for software that has irregular control flow, data structures or communication between threads. In any of those cases, CPUs tend to win. And most efficient algorithms for FP workloads tend to use a lot of communication to reduce computation by many orders of magnitude.
Yeah, and to expand on the communication part -- often people ignore the break-even points of using the GPU at all (based on the size of the problem). The overhead involved in getting your dataset into the GPU's accessible memory, arranging the data to be effective for GPU computation and getting it back out again can be substantial and make the overall gains much lower than just comparing the raw computation alone.

A research group at my alma mater published a workshop paper called "On the Limits of GPU Acceleration" where they model these ratios and calculate the break-even points for different problem types. And of course, the sentiment you're expressing above about the limited applicability of GPGPU to a narrow set of domains is a given.


Shining Arcanine wrote:Did you know that close to zero of the commonly-used server software requires a floating point unit?
Server is a hugely ambiguous and overloaded term; sometimes "server" just means managed machines in data centers, which could be running large scale machine learning batch jobs, data mining, simulations, or any number of things. What do you mean by server?
bitvector
Grand Gerbil Poohbah
 
Posts: 3234
Joined: Wed Jun 22, 2005 4:39 pm
Location: Mountain View, CA

Re: Are the Bulldozer FP capabilities being underestimated?

Postposted on Fri Nov 05, 2010 1:23 am

Shining Arcanine wrote:
Code: Select all
float i = 0;
while (i != 1) { i += 0.02; }
cout << "I iterated " << 50 * i << " times" << endl;


How many times does that iterate?

There is an inordinate number of examples where the rules of mathematics are violated by floating point numbers.


So, I have learned 2 things here:

1) SA writes crappy code, therefore all code must be crappy. QED
and
2) My PC, like most others in the world, can - and often does - violate the rules of mathematics billions of times per second. Quite frankly, this worries me a lot and I think I will write my congressman about it. We may be altering fundamental constants, or breaking the very laws of causality... FPUs should clearly be regulated by a new federal space-time protection agency.
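(To be fair, the behavior being mocked is real and trivially reproducible; a quick sketch in the same vein as the quoted snippet:)

```cpp
#include <cmath>

// Ten additions of 0.1 do not sum to exactly 1.0, because 0.1 has no
// exact binary representation. This is the same effect that keeps the
// quoted while loop from ever hitting i == 1 exactly.
bool tenth_sums_exactly() {
    double sum = 0.0;
    for (int i = 0; i < 10; ++i) sum += 0.1;
    return sum == 1.0;  // false under IEEE 754 arithmetic
}

// The conventional fix: compare against a tolerance instead of using ==.
bool close_enough(double a, double b, double eps = 1e-9) {
    return std::fabs(a - b) < eps;
}
```

None of which "violates mathematics": it is exactly the significant-digits behavior you'd expect from any finite representation.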
Saber Cherry
Gerbil XP
 
Posts: 303
Joined: Fri Mar 14, 2008 3:41 am
Location: Crystal Tokyo

Re: Floating-point units in server-grade CPUs

Postposted on Fri Nov 05, 2010 6:52 am

bitvector wrote:What do you mean by server?


Clearly, it's whatever he imagines will support his argument. :P
Glorious
Darth Gerbil
Gold subscriber
 
 
Posts: 7837
Joined: Tue Aug 27, 2002 6:35 pm

Re: Floating-point units in server-grade CPUs

Postposted on Fri Nov 05, 2010 7:38 am

This thread is a mess :o
GACKT ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !
Dual nehalem Xeon e5520 / 16 GB DDR3 / GTX 280
mph_Ragnarok
Graphmaster Gerbil
 
Posts: 1315
Joined: Thu Aug 25, 2005 7:04 pm

Re: Floating-point units in server-grade CPUs

Postposted on Fri Nov 05, 2010 10:50 am

Crayon Shin Chan wrote:I use doubles to calculate the quadratic equation, and for a fuzzy logic project. It's in the hardware because a long time ago people realized this needed to be accelerated. Let's keep the FPU inside the CPU, it's more convenient for everybody that way.


I used double variables in a homework assignment I did yesterday for my Numerical Analysis class. I was doing Newtonian mechanics simulations and at the scale I was doing them, the presence of hardware floating point units did not make a difference in whether or not it finished in an acceptable time-frame. If it did matter, I could have used the Runge-Kutta method instead of Euler's method to obtain solutions of the differential equations involved.
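For anyone curious, forward Euler for that kind of problem really is only a few lines. A sketch for a falling body with linear drag (the constants here are invented for illustration, not taken from the assignment):

```cpp
// Forward Euler for dv/dt = g - k*v (falling body with linear drag).
// With g = 9.8 and k = 0.2 the analytic terminal velocity is g/k = 49 m/s,
// and the iteration converges toward it as steps * dt grows.
double euler_velocity(double g, double k, double dt, int steps) {
    double v = 0.0;
    for (int i = 0; i < steps; ++i)
        v += dt * (g - k * v);  // v_{n+1} = v_n + dt * f(v_n)
    return v;
}
```

Runge-Kutta buys accuracy per step, not raw speed; either way each step is a handful of floating-point operations, which is why hardware FP is invisible at homework scale.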

Outside of legacy scientific computing software where you can wait months or even years for computations to finish, I am not sure why anyone would need a hardware floating point unit in their CPU. Processors are fast enough that the things that hardware floating point units made computable per unit time 10 years ago are computable per unit time with compiler-generated integer instructions today. Aside from legacy scientific computing software, there is no killer application that takes advantage of hardware floating point units in CPUs, because even if the CPU is as optimal as possible, it is still too slow. Having these calculations done on GPUs is the way forward and it is not just me who thinks this. The NCSA director made public comments on this recently, which are identical to what I am saying:

http://insidehpc.com/2010/11/02/ncsa-di ... computing/

General purpose logic is always slower than dedicated logic. This somewhat contradicts historical experience, but historically, since clock speeds increased with transistor budgets, economies of scale enabled companies like Intel to take advantage of higher clock speeds and greater transistor budgets from more advanced process technology and perform well enough that dedicated hardware could not compete from a performance/price perspective. Today, since you cannot get faster clock speeds from more advanced process technologies, you must add constraints on how things are done to continue scaling; specifically, those constraints are that the same function is done on independent data in parallel, which is stream processing. If you go further back in history to the advent of the CPU, you would find that simply doing things on the CPU placed constraints on how things are done, and it only makes sense that moving forward beyond what the CPU enabled would require additional constraints.
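The "same function on independent data" shape being described is exactly what something like SAXPY looks like; a minimal sketch:

```cpp
#include <vector>

// SAXPY: y[i] = a * x[i] + y[i]. Every element is computed independently
// of every other, which is why this loop maps cleanly onto SIMD lanes,
// GPU threads, or multiple cores with no communication between them.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] = a * x[i] + y[i];
}
```

Anything with a loop-carried dependence (each iteration needing the previous result) breaks this shape and is where the argument gets contested.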

Furthermore, it is difficult to scale floating point computation intensive calculations without doing the same functions independently of one another, and if you do them independently of one another, you have an application that exploits stream processing. It is so difficult to scale such calculations that, as far as I know, there does not exist a single application doing floating point computation intensive calculations that is not a stream processing application and yet can be accelerated by SMP CPUs. With that in mind, I do not see how the presence of a hardware floating point unit helped your project. It seems to me that you are crediting a very specific approximation of a deterministic Turing machine for what is given to you by a much larger category of approximations of deterministic Turing machines. How is that not the case?

bitvector wrote:
dkanter wrote:Moreover, GPUs are very brittle tools and perform poorly for software that has irregular control flow, data structures or communication between threads. In any of those cases, CPUs tend to win. And the most efficient algorithms for FP workloads tend to use a lot of communication to reduce computation by many orders of magnitude.
Yeah, and to expand on the communication part -- often people ignore the break-even points of using the GPU at all (based on the size of the problem). The overhead involved in getting your dataset into the GPU's accessible memory, arranging the data to be effective for GPU computation and getting it back out again can be substantial and make the overall gains much lower than just comparing the raw computation alone.

A research group at my alma mater published a workshop paper called "On the Limits of GPU Acceleration" where they model these ratios and calculate the break-even points for different problem types. And of course, the sentiment you're expressing above about the limited applicability of GPGPU to a narrow set of domains is a given.


If it goes slower below the break-even point, then that is fine. As long as it goes faster above it, it is likely that no one will care in 10 years. That is done all the time in computer programming.

bitvector wrote:
Shining Arcanine wrote:Did you know that close to zero of the commonly-used server software requires a floating point unit?
Server is a hugely ambiguous and overloaded term; sometimes "server" just means managed machines in data centers, which could be running large scale machine learning batch jobs, data mining, simulations, or any number of things. What do you mean by server?


A server is a machine in a standard ATX or blade case that is dedicated to handling multiple users. Perhaps I should have been more clear on that, as you are right that the term is too abstract to discuss specific things about it.
Disclaimer: I over-analyze everything, so try not to be offended if I over-analyze something you wrote.
Shining Arcanine
Gerbil Jedi
 
Posts: 1717
Joined: Wed Jun 11, 2003 11:30 am

Re: Floating-point units in server-grade CPUs

Postposted on Fri Nov 05, 2010 11:26 am

IIRC the FP units in modern day CPUs are doing double duty running SSE instructions too? So there is still a use for that. Simply put, the unit has been integrated, not taking up that much die space, and the vendors just leave it there. How much do you think Intel/AMD/SunOracle/etc can save if they take out the FPU from their cores? $5 from the chip price?
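For illustration, the same physical FP hardware does indeed execute the SSE vector instructions; a sketch using intrinsics (assumes an x86 target with SSE, which every x86-64 chip guarantees):

```cpp
#include <xmmintrin.h>  // SSE intrinsics; x86 only

// Add four packed floats with a single SSE instruction. On modern x86,
// even plain scalar float math is compiled to the scalar forms (addss,
// mulss, ...) of these same instructions, running on the same FP units.
void add4(const float* a, const float* b, float* out) {
    __m128 va = _mm_loadu_ps(a);  // unaligned load of 4 floats
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
}
```

So ripping out "the FPU" would also rip out the SIMD units that nearly all compiled code now depends on; there is no cheap subset to delete.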

Shining Arcanine wrote:A server is a machine in a standard ATX or blade case that is dedicated to handling multiple users. Perhaps I should have been more clear on that, as you are right that the term is too abstract to discuss specific things about it.
Great, all those 1U-4U rack mounted computers are not servers anymore. :roll:
The Model M is not for the faint of heart. You either like them or hate them.

Gerbils unite! Fold for UnitedGerbilNation, team 2630.
Flying Fox
Gerbil God
 
Posts: 24290
Joined: Mon May 24, 2004 2:19 am

Re: Floating-point units in server-grade CPUs

Postposted on Fri Nov 05, 2010 11:38 am

Shining Arcanine wrote:Outside of legacy scientific computing software where you can wait months or even years for computations to finish, I am not sure why anyone would need a hardware floating point unit in their CPU. Processors are fast enough that the things that hardware floating point units made computable per unit time 10 years ago are computable per unit time with compiler-generated integer instructions today. Aside from legacy scientific computing software, there is no killer application that takes advantage of hardware floating point units in CPUs, because even if the CPU is as optimal as possible, it is still too slow. Having these calculations done on GPUs is the way forward and it is not just me who thinks this.

Dude! Why weren't you around when Intel and AMD were spending all that time adding floating point hardware to their CPUs years ago? You could have saved them so much time and money! Obviously, they were misled as to this particular need.

... or maybe you need to get out more, and there are a helluva lot more applications that benefit greatly from hardware FP than you claim. Ever had to recompute a massive spreadsheet that took more than an hour with hardware FP? It would take days with software FP. And then there is stuff like simple gaming, and its close cousin simulation. AMD suffered big time in the K6 days because their hardware FP wasn't as good as Intel's; something they fixed with the Athlon. Not to mention all the scientific computing you just mentioned, which may or may not fit a CUDA-like model. Your view of the computing world appears to be exceedingly small.

Your approach reminds me of grid computing. You can push stuff into a grid and take advantage of massively parallel computational power. Something that might otherwise take days can be done in minutes, making very complex problems rather simple. That is, if it fits the grid paradigm. Of course, you have to re-architect the solution to this completely non-traditional paradigm. And random data access is very different -- you can't just query a SQL database. Grid data is distributed in chunked files around the grid for fast parallel access, but is extremely inefficient to access in a random access pattern. You could put an actual SQL database on the grid, but it would likely melt down as many thousands of processes try to access the data at the same time, since it's not designed for these sorts of access patterns.

The point is, as others have quite eloquently pointed out, not everything fits the GPU model, and even if it did, GPUs are not consistently available, consistently featureful, or even consistently of the same API. Maybe some day when GPU units are built into every processor, a la AMD Fusion, the streaming processors can be more closely integrated with the CPU. But that's a long way off. What we have now is clumsily integrated and must be explicitly accommodated.

Sorry, but you're completely off base in your analysis. Yes, for certain problems GPU-based solutions are awesome, just as for certain problems grid-based parallelism is awesome. But the problem must fit the solution space in this particular case, rather than the other way around.
Buub
Maximum Gerbil
Silver subscriber
 
 
Posts: 4192
Joined: Sat Nov 09, 2002 11:59 pm
Location: Seattle, WA
