 
chuckula
Minister of Gerbil Affairs
Topic Author
Posts: 2109
Joined: Wed Jan 23, 2008 9:18 pm
Location: Probably where I don't belong.

16 core CPUs with a single RAM channel are AWESOME

Wed Jun 06, 2018 7:55 pm

So since I'm a trendsetter and not a mere follower, I'm already way way past that 32 core Threadripper 2.

TIME FOR THREADRIPPER 3 BABY!

7nm? CHECK

64 cores on four dies of JUSTICE? CHECK

The EXACT SAME PLATFORM as Threadripper? DOUBLE CHECK THANK YOU AMD!

So given all that, and my ability to do elementary math that tends to elude some people around here, I must come to an inescapable conclusion:

A single channel of DDR4 on a 16 core CPU (and I don't mean an AllWinner SoC) is awesome!

So the next time you see a cheapy notebook with a single channel of DDR, just remember: If it's good enough for 16 cores, then it's good enough for your $300 Dell special!
4770K @ 4.7 GHz; 32GB DDR3-2133; Officially RX-560... that's right AMD you shills!; 512GB 840 Pro (2x); Fractal Define XL-R2; NZXT Kraken-X60
--Many thanks to the TR Forum for advice in getting it built.
 
Mikael33
Gerbil First Class
Posts: 107
Joined: Tue Mar 18, 2008 5:13 pm

Re: 16 core CPUs with a single RAM channel are AWESOME

Wed Jun 06, 2018 8:00 pm

How do I downvote this?
 
chuckula
Minister of Gerbil Affairs
Topic Author
Posts: 2109
Joined: Wed Jan 23, 2008 9:18 pm
Location: Probably where I don't belong.

Re: 16 core CPUs with a single RAM channel are AWESOME

Wed Jun 06, 2018 8:01 pm

Mikael33 wrote:
How do I downvote this?


You can't. But I do note that you don't have an argument as to why it's wrong, so you really should be asking why you can't downvote reality.
4770K @ 4.7 GHz; 32GB DDR3-2133; Officially RX-560... that's right AMD you shills!; 512GB 840 Pro (2x); Fractal Define XL-R2; NZXT Kraken-X60
--Many thanks to the TR Forum for advice in getting it built.
 
uni-mitation
Gerbil XP
Posts: 308
Joined: Mon Feb 04, 2013 1:28 am

Re: 16 core CPUs with a single RAM channel are AWESOME

Wed Jun 06, 2018 8:03 pm

Mikael33 wrote:
How do I downvote this?


You don't. Grab popcorn and enjoy being in a free country!

Take it easy.

uni-mitation
 
Mikael33
Gerbil First Class
Posts: 107
Joined: Tue Mar 18, 2008 5:13 pm

Re: 16 core CPUs with a single RAM channel are AWESOME

Wed Jun 06, 2018 8:10 pm

chuckula wrote:
Mikael33 wrote:
How do I downvote this?


You can't. But I do note that you don't have an argument as to why it's wrong, so you really should be asking why you can't downvote reality.

It wasn't a serious question. This topic doesn't seem very serious either; you're just continuing your silly shenanigans from the news article comment section.
 
Redocbew
Minister of Gerbil Affairs
Posts: 2495
Joined: Sat Mar 15, 2014 11:44 am

Re: 16 core CPUs with a single RAM channel are AWESOME

Wed Jun 06, 2018 8:14 pm

I wish I could downvote reality all the time, but that would be way too Black Mirror-ish if it were to actually happen, and I have a feeling all the people who I'd be downvoting would probably make up the majority anyway.
Do not meddle in the affairs of archers, for they are subtle and you won't hear them coming.
 
chuckula
Minister of Gerbil Affairs
Topic Author
Posts: 2109
Joined: Wed Jan 23, 2008 9:18 pm
Location: Probably where I don't belong.

Re: 16 core CPUs with a single RAM channel are AWESOME

Wed Jun 06, 2018 8:20 pm

Mikael33 wrote:
this topic doesn't seem very serious either


Oh, it's deadly serious assuming Lisa Su didn't flat-out lie at the end of AMD's webcast yesterday. Which, unlike the AMD fansquad around here, I actually watched live.

People tend to forget that I take technology, but not myself, seriously.

The story comments sections are littered with idiots who, quite curiously, tend to hold themselves in great esteem while still not being able to think through the ramifications of their object of worship.
4770K @ 4.7 GHz; 32GB DDR3-2133; Officially RX-560... that's right AMD you shills!; 512GB 840 Pro (2x); Fractal Define XL-R2; NZXT Kraken-X60
--Many thanks to the TR Forum for advice in getting it built.
 
Mikael33
Gerbil First Class
Posts: 107
Joined: Tue Mar 18, 2008 5:13 pm

Re: 16 core CPUs with a single RAM channel are AWESOME

Wed Jun 06, 2018 8:28 pm

chuckula wrote:
Mikael33 wrote:
this topic doesn't seem very serious either


Oh, it's deadly serious assuming Lisa Su didn't flat-out lie at the end of AMD's webcast yesterday. Which, unlike the AMD fansquad around here, I actually watched live.

People tend to forget that I take technology, but not myself, seriously.

The story comments sections are littered with idiots who, quite curiously, tend to hold themselves in great esteem while still not being able to think through the ramifications of their object of worship.

Well, what I really want to know is how many concurrent iterations of Crysis it can run at 60fps.
 
kvndoom
Minister of Gerbil Affairs
Posts: 2758
Joined: Sat Feb 28, 2004 11:47 pm
Location: Virginia, thank goodness

Re: 16 core CPUs with a single RAM channel are AWESOME

Wed Jun 06, 2018 8:57 pm

Redocbew wrote:
I wish I could downvote reality all the time, but that would be way too Black Mirror-ish if it were to actually happen, and I have a feeling all the people who I'd be downvoting would probably make up the majority anyway.

They'd probably send the bees after you if you downvote too much. :lol:
A most unfortunate, Freudian, double entendre is that hotel named "Budget Inn."
 
NTMBK
Gerbil XP
Posts: 371
Joined: Sat Dec 21, 2013 11:21 am

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 1:46 am

At this rate the entire "Hot forum threads" section will soon be full of Chucky's salty bitching.
 
ptsant
Gerbil XP
Posts: 397
Joined: Mon Oct 05, 2009 12:45 pm

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 2:10 am

chuckula wrote:
So since I'm a trendsetter and not a mere follower, I'm already way way past that 32 core Threadripper 2.

TIME FOR THREADRIPPER 3 BABY!

7nm? CHECK

64 cores on four dies of JUSTICE? CHECK

The EXACT SAME PLATFORM as Threadripper? DOUBLE CHECK THANK YOU AMD!


The logical thing to do for TR3 would be higher frequency and lower TDP. Also note that Zen2 will bring IPC improvements. So I don't think they will aim for a higher core count.
 
haugland
Gerbil
Posts: 15
Joined: Fri May 05, 2006 1:40 am

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 7:16 am

chuckula wrote:
(more of the same drab complaining)

You have made your point: You do not like AMD. We get it!

Now go away PLEASE.
 
derFunkenstein
Gerbil God
Posts: 25427
Joined: Fri Feb 21, 2003 9:13 pm
Location: Comin' to you directly from the Mothership

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 7:33 am

chuckula wrote:
Mikael33 wrote:
How do I downvote this?


You can't. But I do note that you don't have an argument as to why it's wrong, so you really should be asking why you can't downvote reality.

Some things are so ridiculous as to not require an argument.

I'm not mad. I'm not even disappointed. It's just expected at this point. ;)

Couldn't this have just been posted in your other myriad Threadripper threads?
I do not understand what I do. For what I want to do I do not do, but what I hate I do.
Twittering away the day at @TVsBen
 
Amiga500+
Gerbil
Posts: 76
Joined: Wed Sep 21, 2016 2:10 am

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 8:30 am

chuckula wrote:
So the next time you see a cheapy notebook with a single channel of DDR, just remember: If it's good enough for 16 cores, then it's good enough for your $300 Dell special!


Excellent.

Looking on gumtree right now for a 50 quid notebook just so I can experience some of the pure awesomeness.


Oh sage of all things Computers - Do you recommend an Intel Atom for exploring this nirvana of high performance?
 
Topinio
Gerbil Jedi
Posts: 1839
Joined: Mon Jan 12, 2015 9:28 am
Location: London

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 9:13 am

I don't see how this helps, or what the problem is here? Has anyone made a 16C CPU with a single-channel memory controller? I guess there are server builds where someone has put in only 1 DIMM, and there are workloads where that could be awesome as well as cost-effective (but probably not in the big picture).

Presumably, though, this is a case of whining about Threadripper 2 with its mere 4 memory channels. Because the i9s with their 4 memory channels are better, and releasing TR2 is a bad thing because it means Intel can't charge as much or stick to $300 4C CPUs with only 2 memory channels???

So yeah, why not? Appropriately priced, I'll take a 16C CPU with a single memory channel of DDR4 3200 delivering 25.6 GB/s -- after all, that's what the i7-3770K can do.
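For anyone who wants to check the 25.6 GB/s figure, here is a rough sketch of the arithmetic. It assumes a 64-bit (8-byte) wide channel, theoretical peak transfer rates, and that the i7-3770K comparison means two channels of DDR3-1600; the function name is made up for illustration.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s: channels x MT/s x bytes per transfer."""
    return channels * mega_transfers_per_s * bus_bytes / 1000

# One channel of DDR4-3200 (the hypothetical single-channel 16C part).
single_ddr4_3200 = peak_bandwidth_gbs(1, 3200)

# Assumption: i7-3770K running two channels of DDR3-1600.
dual_ddr3_1600 = peak_bandwidth_gbs(2, 1600)

print(f"1x DDR4-3200: {single_ddr4_3200:.1f} GB/s")
print(f"2x DDR3-1600: {dual_ddr3_1600:.1f} GB/s")
```

These are peak numbers only; sustained bandwidth in real workloads will be lower.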

It's surely not better (for anyone except Intel, Intel stockholders, or Intel fanbois) if AMD didn't release a less-cutdown EPYC than the cutdown EPYC that was already released last summer. Maybe it won't be as good as if it had all 8 memory channels active, but I don't think you can show that it's a Bad Thing at this point (and to say so is FUD) because there are no benchmarks showing any evidence that what AMD has engineered is worse IRL than what it (or Intel) has already released. Which it might be, but I suspect it will be better. And if it is worse, it'll be cheaper, because capitalism.
Desktop: 750W Snow Silent, X11SAT-F, E3-1270 v5, 32GB ECC, RX 5700 XT, 500GB P1 + 250GB BX100 + 250GB BX100 + 4TB 7E8, XL2730Z + L22e-20
HTPC: X-650, DH67GD, i5-2500K, 4GB, GT 1030, 250GB MX500 + 1.5TB ST1500DL003, KD-43XH9196 + KA220HQ
Laptop: MBP15,2
 
chuckula
Minister of Gerbil Affairs
Topic Author
Posts: 2109
Joined: Wed Jan 23, 2008 9:18 pm
Location: Probably where I don't belong.

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 9:44 am

Topinio wrote:
I don't see how this helps, or what the problem is here? Has anyone made a 16C CPU with a single-channel memory controller? I guess there are server builds where someone has put in only 1 DIMM, and there are workloads where that could be awesome as well as cost-effective (but probably not in the big picture).


I'm doing an amusing but realistic extrapolation to point out a scaling problem that's going to be an issue that's hard to solve. Especially when Lisa Su went out of her way to state that the existing Epyc platform is not changing, so DDR5 (which I actually don't think would help too much) or adding even more memory channels are not options. Furthermore, going past 8 memory channels is not only complicated but it's going to drive up the server price no matter how cheap the CPU is.

I'm curious to see what the future options are for getting these complex chips fed with data. HBM2 sounds interesting but in its current incarnations it's great for GPUs (high bandwidth) but not so great for CPUs (too much latency even if there is a lot of bandwidth). So what is the solution when you are slapping 64 cores that are supposed to have high performance into a socket and need to do things that go beyond L1 cache busy loops.

As for TR2, the main complaint isn't having quad-channel memory, it's the compromises that AMD is putting in to get the core count larger. Furthermore, I'm generally tired of every Intel product announcement being called a complete failure when the hard numbers don't seem to bear that out at all.
4770K @ 4.7 GHz; 32GB DDR3-2133; Officially RX-560... that's right AMD you shills!; 512GB 840 Pro (2x); Fractal Define XL-R2; NZXT Kraken-X60
--Many thanks to the TR Forum for advice in getting it built.
 
Amiga500+
Gerbil
Posts: 76
Joined: Wed Sep 21, 2016 2:10 am

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 10:18 am

chuckula wrote:
I'm curious to see what the future options are for getting these complex chips fed with data. HBM2 sounds interesting but in its current incarnations it's great for GPUs (high bandwidth) but not so great for CPUs (too much latency even if there is a lot of bandwidth). So what is the solution when you are slapping 64 cores that are supposed to have high performance into a socket and need to do things that go beyond L1 cache busy loops.


Well, the current chat is that Zen2 (or will it be Zen3) sockets will actually have a module dedicated to memory control.

I'd assume then that all CPU modules connect to this module via Infinity Fabric, and requests to system memory are then centralised from there. The disadvantage is an extra hop to main memory; the advantage is that inter-CPU IF and L3 clock rates could be decoupled from main memory clock rates.
 
Antimatter
Gerbil
Posts: 11
Joined: Fri Mar 18, 2011 9:14 am

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 10:26 am

It's not clear TR3 will have 64 cores. If your workload requires more memory bandwidth, buy the product that makes sense for your workload.

On the memory issue. Wouldn't eDRAM be a solution?
 
Amiga500+
Gerbil
Posts: 76
Joined: Wed Sep 21, 2016 2:10 am

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 11:51 am

Antimatter wrote:
It's not clear TR3 will have 64 cores.


I very much agree.

A 64 core "box" is probably a step too far even for the prosumer workstation community. You're reaching the level where companies invest in racks for HPC farms that users fire jobs on.

At that point, your market is almost infinitesimally small, to the point you're more liable to steal sales from (the more profitable) EPYC than generate new sales.


In fact, I'm not sure if the 32 core product is already beyond the point of cannibalising more from EPYC than generating new sales. I'm very interested in a 16C TR2 that uses the much more effective Zen+ boost algorithms to keep, say, 10 cores running very quickly when workloads flash in and out of embarrassingly parallel. Not at all interested in a 32C though.
Last edited by Amiga500+ on Thu Jun 07, 2018 11:55 am, edited 1 time in total.
 
Topinio
Gerbil Jedi
Posts: 1839
Joined: Mon Jan 12, 2015 9:28 am
Location: London

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 11:54 am

chuckula wrote:
Topinio wrote:
I don't see how this helps, or what the problem is here?

I'm doing an amusing but realistic extrapolation to point out a scaling problem that's going to be an issue that's hard to solve.

[...]

As for TR2, the main complaint isn't having quad-channel memory, it's the compromises that AMD is putting in to get the core count larger. Furthermore, I'm generally tired of every Intel product announcement being called a complete failure when the hard numbers don't seem to bear that out at all.

Ah, it's amusing to you. Cool. I guess you've pulled me in now. Your thought-experiment extrapolation doesn't start from reality, IMO, which is where engineering and business need to start.

IRL, no-one is going to engineer or go to market with an ecosystem with 1 memory channel supporting big 16C server CPUs, even ones which are binned down and sold as consumer products. There are 4 channels, that's plenty for 16C CPUs (TR1) and is probably okay for future 32C ones, though we'll see via benchmarking in a couple of months.

AMD's drop from the 8 channels EPYC has to the 4 for the consumer version makes sense because of factors including binning and motherboard manufacturing costs and consumer tolerance for buying an 8-DIMM kit. Now the ecosystem is in place, selling 32C CPUs on the same socket makes sense too. I don't expect performance to be awful, unless AMD's sillier than I think it is.

So, TR2 will be up to 32C on the same socket as TR1, and EPYC2 apparently will be up to 48C https://www.servethehome.com/amd-epyc-r ... er-socket/ likewise, with 64C EPYC to be on a new platform.

For 4 and 8 channels respectively, DDR4-3200 would deliver 102.4 GB/s and 204.8 GB/s for the whole CPU, which is reasonable.

What you seem to argue, which to me seems weird to get into it over, is that these bandwidths being equivalent to 3.2 GB/s and 4.3 GB/s per core if all cores are trying to do RAM RW at the same time is a Big Problem.

But if you want to apply that scenario you have to do it globally, and then you can also complain that the $13000 Intel Xeon Platinum 8180M has 4.6 GB/s per core, or the $3000 Intel Xeon Phi 7290 has 1.6 GB/s per core. And the Intel Core i9-7980XE for $2000 has 4.7 GB/s per core at twice the cost of where I'd expect AMD's top TR2 to launch -- i.e. AMD will probably be selling 32C with 102 GB/s (= 3.2 GB/s ea.) for $1000, versus Intel's 18C with 85 GB/s (= 4.7 GB/s ea.) for $2000.

But even this doesn't make sense, because if I have an ideally-scaling n-way memory-bound job -- the only scenario where your gedankenexperiment matters -- then if I run it on the top i9 I can get it running 18-way and being fed at 4.7 GB/s per core, but this is not better than on TR2 because when I run it on there 18-way I get 5.7 GB/s per core!

The only way to get TR2 to have less memory bandwidth than the top i9 would be to run more tasks on the AMD CPU than the Intel one, i.e. we go up to 32-way. But this is silly, and pointless. Or we go up to 32-way on both, but then we have to watch the Core i9 choke because it's running on HyperThreading and anyway has now 2.7 GB/s per thread??
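The per-core figures above can be reproduced with a quick sketch. It takes the totals as quoted in this post (102.4 GB/s for TR2, 85 GB/s for the top i9) and assumes an even split of peak bandwidth across concurrently running memory-bound tasks, which is an idealisation; the helper name is hypothetical.

```python
# Peak totals as quoted in the post (GB/s).
TR2_BW = 102.4  # 4 channels of DDR4-3200
I9_BW = 85.0    # Core i9-7980XE figure as quoted

def per_task_gbs(total_gbs: float, tasks: int) -> float:
    """Even split of peak bandwidth across concurrent memory-bound tasks."""
    return total_gbs / tasks

scenarios = [
    ("i9, 18-way", I9_BW, 18),
    ("TR2, 18-way", TR2_BW, 18),
    ("TR2, 32-way", TR2_BW, 32),
    ("i9, 32-way (HyperThreading)", I9_BW, 32),
]

for label, bandwidth, tasks in scenarios:
    print(f"{label}: {per_task_gbs(bandwidth, tasks):.1f} GB/s per task")
```

At equal task counts the TR2 figure is higher simply because its quoted total is higher; the per-task number only drops below the i9's when TR2 is asked to run more tasks.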

I don't get it.
Desktop: 750W Snow Silent, X11SAT-F, E3-1270 v5, 32GB ECC, RX 5700 XT, 500GB P1 + 250GB BX100 + 250GB BX100 + 4TB 7E8, XL2730Z + L22e-20
HTPC: X-650, DH67GD, i5-2500K, 4GB, GT 1030, 250GB MX500 + 1.5TB ST1500DL003, KD-43XH9196 + KA220HQ
Laptop: MBP15,2
 
Glorious
Gerbilus Supremus
Posts: 12343
Joined: Tue Aug 27, 2002 6:35 pm

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 11:56 am

WHY CAN'T I UPVOTE?
 
Waco
Maximum Gerbil
Posts: 4850
Joined: Tue Jan 20, 2009 4:14 pm
Location: Los Alamos, NM

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 11:57 am

chuckula wrote:
HBM2 sounds interesting but in its current incarnations it's great for GPUs (high bandwidth) but not so great for CPUs (too much latency even if there is a lot of bandwidth).

HBM/HBM2 are pretty comparable for latency...and they are both measurably faster than DRAM due to various factors (proximity to the memory controller being one of them).

Am I missing something?


EDIT: Xeon Phi chips have ~500 GB/s of bandwidth to the on package memory, so they're closer to 7 GB/s per core (assuming you aren't using every thread, where it drops to 1/4 that).
Victory requires no explanation. Defeat allows none.
 
Topinio
Gerbil Jedi
Posts: 1839
Joined: Mon Jan 12, 2015 9:28 am
Location: London

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 4:54 pm

Waco wrote:
EDIT: Xeon Phi chips have ~500 GB/s of bandwidth to the on package memory, so they're closer to 7 GB/s per core (assuming you aren't using every thread, where it drops to 1/4 that).

Yeah, alright, ignore me on the Phi aspect (never bothered with mine very much).

ETA: I also failed to capitalize the first letter of the word "Gedankenexperiment"...
Desktop: 750W Snow Silent, X11SAT-F, E3-1270 v5, 32GB ECC, RX 5700 XT, 500GB P1 + 250GB BX100 + 250GB BX100 + 4TB 7E8, XL2730Z + L22e-20
HTPC: X-650, DH67GD, i5-2500K, 4GB, GT 1030, 250GB MX500 + 1.5TB ST1500DL003, KD-43XH9196 + KA220HQ
Laptop: MBP15,2
 
chuckula
Minister of Gerbil Affairs
Topic Author
Posts: 2109
Joined: Wed Jan 23, 2008 9:18 pm
Location: Probably where I don't belong.

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 7:12 pm

Waco wrote:
chuckula wrote:
HBM2 sounds interesting but in its current incarnations it's great for GPUs (high bandwidth) but not so great for CPUs (too much latency even if there is a lot of bandwidth).

HBM/HBM2 are pretty comparable for latency...and they are both measurably faster than DRAM due to various factors (proximity to the memory controller being one of them).

Am I missing something?


EDIT: Xeon Phi chips have ~500 GB/s of bandwidth to the on package memory, so they're closer to 7 GB/s per core (assuming you aren't using every thread, where it drops to 1/4 that).


Am I missing something about HBM latency? I've never seen a concrete source saying it has great latency and have generally heard the opposite. While a Xeon Phi is theoretically a CPU, it is targeting the same massive-bandwidth but non-latency-sensitive applications that GPUs typically target.

Real numbers are scarce but this technical paper examining HBM in HPC applications pretty much states that HBM is great for bandwidth, not great for latency: https://arxiv.org/pdf/1704.08273.pdf

If you are using your CPU for large-scale stream number crunching I can definitely see the benefit, but move into a small random I/O scenario and it may not be worth it.
4770K @ 4.7 GHz; 32GB DDR3-2133; Officially RX-560... that's right AMD you shills!; 512GB 840 Pro (2x); Fractal Define XL-R2; NZXT Kraken-X60
--Many thanks to the TR Forum for advice in getting it built.
 
Waco
Maximum Gerbil
Posts: 4850
Joined: Tue Jan 20, 2009 4:14 pm
Location: Los Alamos, NM

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 8:04 pm

chuckula wrote:
Am I missing something about HBM latency? I've never seen a concrete source saying it has great latency and have generally heard the opposite. While a Xeon Phi is theoretically a CPU, it is targeting the same massive-bandwidth but non-latency-sensitive applications that GPUs typically target.

It's better than DRAM if you're using it in exclusive mode IME. We run latency and bandwidth sensitive applications - GPUs need not apply.
Victory requires no explanation. Defeat allows none.
 
chuckula
Minister of Gerbil Affairs
Topic Author
Posts: 2109
Joined: Wed Jan 23, 2008 9:18 pm
Location: Probably where I don't belong.

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 9:08 pm

Topinio wrote:
chuckula wrote:
Topinio wrote:
I don't see how this helps, or what the problem is here?

What you seem to argue, which to me seems weird to get into it over, is that these bandwidths being equivalent to 3.2 GB/s and 4.3 GB/s per core if all cores are trying to do RAM RW at the same time is a Big Problem.

But if you want to apply that scenario you have to do it globally, and then you can also complain that the $13000 Intel Xeon Platinum 8180M has 4.6 GB/s per core, or the $3000 Intel Xeon Phi 7290 has 1.6 GB/s per core. And the Intel Core i9-7980XE for $2000 has 4.7 GB/s per core at twice the cost of where I'd expect AMD's top TR2 to launch -- i.e. AMD will probably be selling 32C with 102 GB/s (= 3.2 GB/s ea.) for $1000, versus Intel's 18C with 85 GB/s (= 4.7 GB/s ea.) for $2000.

But even this doesn't make sense, because if I have an ideally-scaling n-way memory-bound job -- the only scenario where your gedankenexperiment matters -- then if I run it on the top i9 I can get it running 18-way and being fed at 4.7 GB/s per core, but this is not better than on TR2 because when I run it on there 18-way I get 5.7 GB/s per core!

The only way to get TR2 to have less memory bandwidth than the top i9 would be to run more tasks on the AMD CPU than the Intel one, i.e. we go up to 32-way. But this is silly, and pointless. Or we go up to 32-way on both, but then we have to watch the Core i9 choke because it's running on HyperThreading and anyway has now 2.7 GB/s per thread??

I don't get it.


There's a whole lot of back of the napkin calculating there but then there's real-world experiences by the same people who you'd be happy to hear think AVX-512 is pretty useless, who say that large-scale chips are pretty much bandwidth starved under lots of situations.

Here's a popular one, Y-cruncher
Because of the memory-intensive nature of computing Pi and other constants, y-cruncher needs a lot of memory bandwidth to perform well. In fact, the program has been noticably memory bound on nearly all high-end desktops since 2012 as well as the majority of multi-socket systems since at least 2006.


Here's another one: https://www.sisoftware.co.uk/2017/09/12 ... andra-sp2/

In algorithms heavily dependent on memory bandwidth or latency AVX512 cannot work miracles, but at least will extract the maximum possible compute performance from the CPU. SKUs with lower number of cores (8, 6, 4, etc.) likely to gain even more from AVX512.


Which is wonderful for lower core-count CPUs but starts to become a bigger and bigger problem as you scale the cores. Even if AMD only sticks to AVX2, a 64-core AVX2 system from AMD is going to have approximately comparable (maybe even greater) bandwidth requirements than a 32-core AVX-512 configuration. A 4 or even 8 channel DDR4 setup isn't going to be happy.
4770K @ 4.7 GHz; 32GB DDR3-2133; Officially RX-560... that's right AMD you shills!; 512GB 840 Pro (2x); Fractal Define XL-R2; NZXT Kraken-X60
--Many thanks to the TR Forum for advice in getting it built.
 
Waco
Maximum Gerbil
Posts: 4850
Joined: Tue Jan 20, 2009 4:14 pm
Location: Los Alamos, NM

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 11:04 pm

If Epyc 2 has 16 channels of DRAM per socket I'll be especially happy...but it's a pipe dream. Simulation HPC is a niche. :/
Victory requires no explanation. Defeat allows none.
 
synthtel2
Gerbil Elite
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: 16 core CPUs with a single RAM channel are AWESOME

Thu Jun 07, 2018 11:43 pm

Being bound by memory performance isn't automatically a problem any more than being bound by compute is automatically a problem. In isolation, neither says much of anything about the net perf/$ of the system. We're used to the expense all being focused on the compute side (and being memory-bound therefore being a Very Bad Thing) because Intel's been ramping up their margins at 4C8T instead of ramping up core counts for so long, but in this case we can be pretty sure that the marginal cost of the silicon for 32C is very mild compared to what they're selling for retail, and with all the reused parts initial costs to get this in the pipeline should also be low.

Memory bandwidth absolutely does matter (more than it gets credit for, I'd say), but there are also plenty of workloads that would rather just have more cores/threads. It doesn't have to be good at everything to be good. If someone's buying hardware of this grade without doing some basic research on what kind of hardware their workload actually likes, they've got bigger problems.

As for the thread title, try this.
 
ptsant
Gerbil XP
Posts: 397
Joined: Mon Oct 05, 2009 12:45 pm

Re: 16 core CPUs with a single RAM channel are AWESOME

Fri Jun 08, 2018 2:49 am

Topinio wrote:
But even this doesn't make sense, because if I have an ideally-scaling n-way memory-bound job -- the only scenario where your gedankenexperiment matters -- then if I run it on the top i9 I can get it running 18-way and being fed at 4.7 GB/s per core, but this is not better than on TR2 because when I run it on there 18-way I get 5.7 GB/s per core!

The only way to get TR2 to have less memory bandwidth than the top i9 would be to run more tasks on the AMD CPU than the Intel one, i.e. we go up to 32-way. But this is silly, and pointless. Or we go up to 32-way on both, but then we have to watch the Core i9 choke because it's running on HyperThreading and anyway has now 2.7 GB/s per thread??


Excellent explanation. I would add that a given chip does not have to make sense for ALL workloads to be a market success, especially if it is a "free" EPYC reject. Since cores also have their own private cache there can be many use cases where absolute core count matters much more than aggregate bandwidth (compilation, for example?).
 
Topinio
Gerbil Jedi
Posts: 1839
Joined: Mon Jan 12, 2015 9:28 am
Location: London

Re: 16 core CPUs with a single RAM channel are AWESOME

Fri Jun 08, 2018 3:45 am

chuckula wrote:
There's a whole lot of back of the napkin calculating there
... sure ...

chuckula wrote:
but then there's real-world experiences by the same people who you'd be happy to hear think AVX-512 is pretty useless,
Eh?

chuckula wrote:
who say that large-scale chips are pretty much bandwidth starved under lots of situations.
"a lot of people say"? I guess that Intel and AMD should be told, then they can stop designing these things? Also, all those buying them, they should save their money.

If these people you refer to are buying CPUs with lots of cores for jobs that are so memory-bound that they run out of bandwidth and starve the jobs when they use more than a small number of cores, then they are either buying it wrong, running it wrong, or both. Or, are these people who don't have enough clout to be involved in specification or procurement and just have to use what someone else has designed and built (e.g. a university HPC user, not able to build own cluster)?

chuckula wrote:
In fact, the program has been noticably memory bound on nearly all high-end desktops since 2012 as well as the majority of multi-socket systems since at least 2006.
So that particular program has been memory-bound on "the majority of multi-socket systems since at least 2006", i.e. Intel 2S servers since quad-channel DDR2-667 or so, i.e. 21.3 GB/s system = 10.7 GB/s per socket = 5.3 GB/s per core is where it started to be limited, unless they mean Clovertown when it's 2.7 GB/s.

"nearly all high-end desktops since 2012" implies SB-E, but that's 51 GB/s (4-channel DDR3-1600) i.e. 8.5 GB/s per core which is way higher. I'm also pretty sure that SB-E HEDT got a much higher percentage of theoretical peak memory bandwidth IRL than those old FSB-bound servers did. Does not add up.
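A rough sketch of how those historical per-core numbers fall out. The core counts are assumptions chosen to match the figures above: a 2S server with dual-core CPUs (quad-core for Clovertown) sharing quad-channel DDR2-667, and a six-core SB-E desktop on quad-channel DDR3-1600. An 8-byte channel width is also assumed, and the helper name is made up.

```python
def peak_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s for a set of memory channels."""
    return channels * mega_transfers_per_s * bus_bytes / 1000

# 2006-era 2S server: quad-channel DDR2-667 shared by two sockets (assumed).
server_total = peak_gbs(4, 667)
per_socket = server_total / 2
per_core_dual = per_socket / 2   # assumed dual-core CPUs
per_core_quad = per_socket / 4   # assumed quad-core Clovertown

# 2012-era HEDT: quad-channel DDR3-1600, assumed six-core SB-E.
sbe_total = peak_gbs(4, 1600)
sbe_per_core = sbe_total / 6

print(f"2S DDR2-667: {server_total:.1f} total, {per_socket:.1f} per socket, "
      f"{per_core_dual:.1f} per core (2C), {per_core_quad:.1f} per core (4C)")
print(f"SB-E DDR3-1600: {sbe_total:.1f} total, {sbe_per_core:.1f} per core")
```

On these assumptions the 2012 desktop has noticeably more peak bandwidth per core than the 2006 servers, which is the mismatch being pointed out.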

chuckula wrote:
In algorithms heavily dependent on memory bandwidth or latency AVX512 cannot work miracles, but at least will extract the maximum possible compute performance from the CPU. SKUs with lower number of cores (8, 6, 4, etc.) likely to gain even more from AVX512.
So memory-bound tasks don't magically gain from throwing even more compute capability at the non-compute-bound problem? 'k. And using the CPU's highest GFLOPS units "at least" delivers the highest GFLOPS? 'k.

I don't know why you would quote that.

chuckula wrote:
Which is wonderful for lower core-count CPUs but starts to become a bigger and bigger problem as you scale the cores.
I don't get it. What are you arguing here?

Some workloads which are CPU-bound and not memory-bound will perform better when CPUs with more performance come along; some workloads which are memory-bound and not CPU-bound won't. This should be obvious, as should that in workloads which are memory-bound on the socket and whose memory requirements scale with number of threads used, trying to use more threads won't ever help.
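To illustrate the last point, here is a minimal saturation model, not a benchmark. It assumes each thread of a memory-bound job wants a fixed amount of bandwidth and that the socket delivers at most its theoretical peak; the 8 GB/s per-thread demand is a made-up value and the function name is hypothetical.

```python
def delivered_gbs(threads: int, demand_per_thread_gbs: float, socket_bw_gbs: float) -> float:
    """Bandwidth actually delivered: scales with threads until the socket saturates."""
    return min(threads * demand_per_thread_gbs, socket_bw_gbs)

SOCKET_BW = 102.4  # GB/s, 4 channels of DDR4-3200 (figure used earlier in the thread)
DEMAND = 8.0       # GB/s per thread, hypothetical value for illustration

for n in (4, 8, 12, 13, 16, 32):
    total = delivered_gbs(n, DEMAND, SOCKET_BW)
    print(f"{n:2d} threads: {total:6.1f} GB/s total, {total / n:.1f} GB/s per thread")
```

Under these assumptions the total stops growing once the socket is saturated, so extra threads only shrink each thread's share. A compute-bound job never hits that ceiling and keeps scaling with cores.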

Are you arguing that companies (AMD??) should not be designing and producing newer CPUs which will help the first set, because that doesn't help the last set of use cases?

If so, that's ignoring the glaringly obvious fact that it's possible for those buying the machines for the last set to choose lower-core count CPUs on a socket e.g. pick the 1900X instead of the 1950X, or the 8156 instead of the 8180. Those buying for the first set can choose the higher core count option on their platform, e.g. the 8180 or the 1950X...
Desktop: 750W Snow Silent, X11SAT-F, E3-1270 v5, 32GB ECC, RX 5700 XT, 500GB P1 + 250GB BX100 + 250GB BX100 + 4TB 7E8, XL2730Z + L22e-20
HTPC: X-650, DH67GD, i5-2500K, 4GB, GT 1030, 250GB MX500 + 1.5TB ST1500DL003, KD-43XH9196 + KA220HQ
Laptop: MBP15,2
