 
dragontamer5788
Gerbil First Class
Posts: 180
Joined: Mon May 06, 2013 8:39 am

Re: Is x86 RISC or CISC?

Thu Nov 29, 2018 12:48 pm

Buub wrote:
dragontamer5788 wrote:
VLIW seems like a good idea, but in practice, even in the highly parallel GPU world, it's not really practical for the compiler to be making that decision. Modern GPUs are slightly "smarter" than VLIW and are kinda designed to be 10x SMT / hyperthreaded. So sure, there's a VLIW-like thing going on, with VPUs being allocated by the compiler. But the GPU is smart enough to realize when these cores aren't being fully utilized, and can quickly switch between threads to find more work.

"Seemed like a good idea" is the key here. It is impossible for the compiler to predict run-time behavior without a lot of additional information (for example PGO, which is itself imperfect).

VLIW failed because it was the wrong solution to the problem. IMHO, it was the opposite of the right solution.

Modern compilers are fantastic at optimizing code. People often don't realize just how much amazing "rocket science" there is in a modern compiler's optimizer. But, that being said, they can still only infer the programmer's intent and turn it into better code; they cannot predict runtime behavior, because there are so many external factors that affect the runtime path.


I would argue that VLIW failed because SIMD turned out to be a more practical way of reaching parallelism, and traditional registers a more practical way of describing it.

I'm sure that scalar x86 assembly is slower than Itanium. But SIMD / AVX2 code will knock your socks off. Skylake has 3 AVX2 pipelines, each of which does an 8x32-bit operation per clock cycle. That's a parallelism count of 24 FLOPs per core per clock tick. GPUs go one step further and only implement the SIMD instruction set, to a frankly ridiculous degree. Most GPU code executes at a 32x SIMD level: 32 FLOPs per core per clock tick. And since these SIMD cores are very simplified, GPUs manage to get more instruction pointers (Streaming Multiprocessors, in CUDA terms) than CPUs typically get cores.
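
To make the "8x32-bit per instruction" point concrete, here's a minimal C sketch using AVX2/FMA intrinsics. The function name fma8 and the assumption that n is a multiple of 8 are just for illustration (real code would handle the tail); the point is that each _mm256_fmadd_ps operates on eight 32-bit floats at once, and a core with multiple vector pipelines can retire several of these per clock, which is where those per-core FLOP counts come from.

#include <immintrin.h>  /* AVX2 / FMA intrinsics */
#include <stddef.h>

/* Multiply-accumulate eight floats at a time using 256-bit ymm registers.
   Assumes n is a multiple of 8 to keep the sketch short. */
void fma8(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i += 8) {
        __m256 va = _mm256_loadu_ps(&a[i]);   /* load 8 floats */
        __m256 vb = _mm256_loadu_ps(&b[i]);
        __m256 vd = _mm256_loadu_ps(&dst[i]);
        vd = _mm256_fmadd_ps(va, vb, vd);     /* 8 FMAs = 16 FLOPs in one instruction */
        _mm256_storeu_ps(&dst[i], vd);        /* store 8 results */
    }
}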

With numbers like that, Itanium's VLIW of 3 instructions per bundle just can't keep up with that kind of parallelism. Even though later Itanium processors issue 4 bundles per clock tick (12 instructions per clock!), they can't keep up with SIMD. And one of the best SIMD implementations is... well... x86 (after GPUs, of course. GPUs win at SIMD, but x86 seems to be the fastest "normal" instruction set with SIMD implemented).

EDIT: And it turns out that you can "cut dependencies" in traditional scalar assembly by simply using instructions like "xor eax, eax", which on modern x86 systems effectively works the way a "stop" does between instruction groups on Itanium. So modern compilers basically learned how to express VLIW-like parallelism in normal code. I think a future VLIW instruction set could be written to take advantage of the lessons learned in the last 20 years, but it's more likely that SIMD GPUs are just going to get bigger and better instead...
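
As a rough illustration of the "xor eax, eax" point (the function name sum_u32 is hypothetical, and the exact code any given compiler emits will vary): zeroing an accumulator is typically compiled to the xor zeroing idiom rather than "mov eax, 0", and modern x86 cores recognize that idiom as having no input dependency, so the new dependency chain starts fresh, much like an explicit stop between Itanium instruction groups.

#include <stddef.h>
#include <stdint.h>

/* Sum an array. Compilers typically emit "xor eax, eax" for "acc = 0":
   the xor form is recognized as a zeroing idiom by the renamer and breaks
   any dependency on the register's previous contents. */
uint32_t sum_u32(const uint32_t *a, size_t n)
{
    uint32_t acc = 0;              /* usually: xor eax, eax */
    for (size_t i = 0; i < n; i++)
        acc += a[i];               /* dependency chain starts here, not earlier */
    return acc;
}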
 
just brew it!
Gold subscriber
Administrator
Posts: 51942
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Is x86 RISC or CISC?

Thu Nov 29, 2018 10:37 pm

Don't discount external market factors!

If AMD hadn't upset the apple cart with x86-64, Intel (and the computing industry at large) might have devoted enough attention and resources to IA-64 to make it a viable path forward. Opteron ate a big chunk of Itanium's target market, squeezing it into a much smaller niche than Intel expected, and sucking resources away from it as Intel countered the AMD threat.

AMD's grafting of 64-bit extensions onto x86 (while preserving native compatibility with 32-bit code) was simultaneously a massive kludge, and a stroke of genius. One of those crazy events that changes the course of the tech industry.
Nostalgia isn't what it used to be.
 
Flying Fox
Gerbil God
Posts: 25458
Joined: Mon May 24, 2004 2:19 am
Contact:

Re: Is x86 RISC or CISC?

Thu Nov 29, 2018 11:22 pm

This paper of ours should get A+ no problem, right? :roll: /s

:lol:
The Model M is not for the faint of heart. You either like them or hate them.

Gerbils unite! Fold for UnitedGerbilNation, team 2630.
 
just brew it!
Gold subscriber
Administrator
Posts: 51942
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Is x86 RISC or CISC?

Thu Nov 29, 2018 11:58 pm

Flying Fox wrote:
This paper of ours should get A+ no problem, right? :roll: /s

:lol:

If someone can construct a coherent paper based on an internet forum thread, maybe they deserve the A? :lol:
Nostalgia isn't what it used to be.
 
dragontamer5788
Gerbil First Class
Posts: 180
Joined: Mon May 06, 2013 8:39 am

Re: Is x86 RISC or CISC?

Fri Nov 30, 2018 4:37 pm

just brew it! wrote:
Don't discount external market factors!

If AMD hadn't upset the apple cart with x86-64, Intel (and the computing industry at large) might have devoted enough attention and resources to IA-64 to make it a viable path forward. Opteron ate a big chunk of Itanium's target market, squeezing it into a much smaller niche than Intel expected, and sucking resources away from it as Intel countered the AMD threat.

AMD's grafting of 64-bit extensions onto x86 (while preserving native compatibility with 32-bit code) was simultaneously a massive kludge, and a stroke of genius. One of those crazy events that changes the course of the tech industry.


Maybe the advantages of EPIC / VLIW are completely different, and possibly still relevant today. Hear me out: EPIC / VLIW was supposed to be way faster. But AMD proved with their Opterons that they could build a chip just as fast as Itanium while keeping backwards compatibility. So obviously, EPIC / VLIW isn't really much faster than a typical architecture, at least for the typical server program.

But today, a huge amount of study has gone into the question of power efficiency. It would seem to me that EPIC / VLIW might be more power-efficient, since the hardware is handed explicit "parallelism points" it can scan for in the instruction stream, instead of having to rediscover the parallelism itself. The out-of-order engines of CPUs are well known to be relatively heavy power users: the Intel Atom didn't have one for a while (and Intel Atom Goldmont has a very small out-of-order engine: integer ops only).

So if Itanium were redesigned today as a low-power, explicitly parallel architecture... using the lessons of the past (i.e., register renaming is good; don't define 128 architectural registers, lol)... it might be interesting to look at.
 
chuckula
Gold subscriber
Gerbil Jedi
Posts: 1890
Joined: Wed Jan 23, 2008 9:18 pm
Location: Probably where I don't belong.

Re: Is x86 RISC or CISC?

Fri Nov 30, 2018 6:46 pm

dragontamer5788 wrote:
just brew it! wrote:
Don't discount external market factors!

If AMD hadn't upset the apple cart with x86-64, Intel (and the computing industry at large) might have devoted enough attention and resources to IA-64 to make it a viable path forward. Opteron ate a big chunk of Itanium's target market, squeezing it into a much smaller niche than Intel expected, and sucking resources away from it as Intel countered the AMD threat.

AMD's grafting of 64-bit extensions onto x86 (while preserving native compatibility with 32-bit code) was simultaneously a massive kludge, and a stroke of genius. One of those crazy events that changes the course of the tech industry.


Maybe the advantages of EPIC / VLIW are completely different, and possibly still relevant today. Hear me out: EPIC / VLIW was supposed to be way faster. But AMD proved with their Opterons that they could build a chip just as fast as Itanium while keeping backwards compatibility. So obviously, EPIC / VLIW isn't really much faster than a typical architecture, at least for the typical server program.

But today, a huge amount of study has gone into the question of power efficiency. It would seem to me that EPIC / VLIW might be more power-efficient, since the hardware is handed explicit "parallelism points" it can scan for in the instruction stream, instead of having to rediscover the parallelism itself. The out-of-order engines of CPUs are well known to be relatively heavy power users: the Intel Atom didn't have one for a while (and Intel Atom Goldmont has a very small out-of-order engine: integer ops only).

So if Itanium were redesigned today as a low-power, explicitly parallel architecture... using the lessons of the past (i.e., register renaming is good; don't define 128 architectural registers, lol)... it might be interesting to look at.


Another point in Itanium's favor: Spectre and most of the side-channel attacks don't work, or are at a minimum much less effective, because of the static nature of the code scheduling.
4770K @ 4.7 GHz; 32GB DDR3-2133; GTX-1080 sold and back to hipster IGP!; 512GB 840 Pro (2x); Fractal Define XL-R2; NZXT Kraken-X60
--Many thanks to the TR Forum for advice in getting it built.
 
blargh4
Gerbil
Posts: 11
Joined: Thu Oct 20, 2016 12:31 pm

Re: Is x86 RISC or CISC?

Fri Nov 30, 2018 6:47 pm

Nvidia still seems to think the idea behind those old Transmeta chips was a good one; their Denver/Carmel cores dynamically recompile/optimize an arm64 instruction stream, as necessary, into native VLIW ops that are stored in a big off-chip cache and feed a wide in-order core. The performance benchmarks I've seen were solid but not revelatory; I haven't seen any good perf/watt comparisons, though.
 
Buub
Maximum Gerbil
Posts: 4800
Joined: Sat Nov 09, 2002 11:59 pm
Location: Seattle, WA
Contact:

Re: Is x86 RISC or CISC?

Mon Dec 03, 2018 10:28 am

chuckula wrote:
Another point in Itanium's favor: Spectre and most of the side-channel attacks don't work, or are at a minimum much less effective, because of the static nature of the code scheduling.

My first reaction is: that's like saying bikes are better than cars because they don't need oil changes.
