
Intel’s Haswell CPU Microarchitecture

Posted: Sun Jan 20, 2013 7:59 pm
by biffzinker
I was reading the article "Intel’s Haswell CPU Microarchitecture" at Real World Technologies (http://www.realworldtech.com/haswell-cpu/) and ran across this paragraph at the bottom of the second page:

"One difference in Haswell’s decoding path is the uop queue, which receives uops from the decoders or uop cache and also functions as a loop cache. In Sandy Bridge, the 28 entry uop queue was replicated for each thread. However, in Ivy Bridge the uop queue was combined into a single 56 entry structure that is statically partitioned when two threads are active. The distinction is that when a single thread is executing on Ivy Bridge or Haswell, the entire 56 entry uop buffer is available for loop caching and queuing, making better use of the available resources."
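
To make the numbers in that paragraph concrete, here is a rough sketch of how many uop queue entries each active thread gets on one core. It is only a toy model of the quoted text, not a simulator: the function name and architecture labels are made up for illustration, and the even 28/28 split on Ivy Bridge/Haswell with two active threads is my assumption (the article only says "statically partitioned").

```python
# Toy model of the uop queue sizes quoted from the article; not a simulator.
# Names below are hypothetical, chosen only for this illustration.
SANDY_BRIDGE = "sandy_bridge"
IVY_BRIDGE = "ivy_bridge"
HASWELL = "haswell"

PER_THREAD_ENTRIES = 28  # Sandy Bridge: a 28-entry queue replicated per thread
COMBINED_ENTRIES = 56    # Ivy Bridge / Haswell: one combined 56-entry structure


def uop_queue_entries_per_thread(arch: str, active_threads: int) -> int:
    """Return the uop queue entries usable by each active thread on one core."""
    if active_threads not in (1, 2):
        raise ValueError("a core runs either one or two hardware threads")
    if arch == SANDY_BRIDGE:
        # Replicated queues: a lone thread still only sees its own 28 entries.
        return PER_THREAD_ENTRIES
    if arch in (IVY_BRIDGE, HASWELL):
        # Assumption: the static partition is an even split when both threads run.
        return COMBINED_ENTRIES // active_threads
    raise ValueError(f"unknown architecture: {arch}")


if __name__ == "__main__":
    for arch in (SANDY_BRIDGE, IVY_BRIDGE, HASWELL):
        for threads in (1, 2):
            entries = uop_queue_entries_per_thread(arch, threads)
            print(f"{arch}, {threads} active thread(s): {entries} entries each")
```

The only case where the generations differ in this model is a single active thread per core: 28 entries on Sandy Bridge versus 56 on Ivy Bridge/Haswell.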

Reminded me of the "disable core parking" thread and some of the mixed results I got with it enabled/disabled. Anyway, since the "disable core parking" thread is locked and I doubt the mods want a continuation, that's all I have.

Re: Intel’s Haswell CPU Microarchitecture

Posted: Mon Jan 21, 2013 8:03 am
by Glorious
biffzinker wrote:
Reminded me of the "disable core parking" thread and some of the mixed results I got with it enabled/disabled.


May I ask why?

Re: Intel’s Haswell CPU Microarchitecture

Posted: Mon Jan 21, 2013 9:46 am
by MadManOriginal
Glorious wrote:
biffzinker wrote:
Reminded me of the "disable core parking" thread and some of the mixed results I got with it enabled/disabled.


May I ask why?


Uh oh...IT'S A TRAP!

Re: Intel’s Haswell CPU Microarchitecture

Posted: Tue Jan 22, 2013 7:18 am
by Glorious
MadManOriginal wrote:
Uh oh...IT'S A TRAP!


Well, maybe, though that really wasn't my intent. I was trying to elicit more detail from the poster before I responded in full.

In general, I'm wondering what an obscure and rather deep intra-core detail in a specific processor or two has to do with the general Windows feature of core-parking.

To be blunt, I don't see how it would "remind" anyone of core-parking, let alone how it would even relate to it.

So I wanted to see if the poster would expand on what he was saying, because I was honestly at a loss.

Re: Intel’s Haswell CPU Microarchitecture

Posted: Fri Jan 25, 2013 12:27 pm
by biffzinker
I was thinking in regards to Sandy Bridge having the uop queue laid out as 28-28, and with core parking enabled half the buffer goes unused.
On Ivy Bridge/Haswell a single thread can use the whole 56 entry uop queue. With core parking enabled it should keep the buffer from being partitioned into the hardwired 28-28 split Sandy Bridge has.

Edit: See Figure 2 on page 3 of the article.

Re: Intel’s Haswell CPU Microarchitecture

Posted: Fri Jan 25, 2013 3:09 pm
by Glorious
biffzinker wrote:
I was thinking in regards to Sandy Bridge having the uop queue laid out as 28-28, and with core parking enabled half the buffer goes unused.


I'm not sure why that would be the case. Core-parking deals with cores, and what you are talking about is intra-core. I don't see how it would interact with core-parking unless logical cores were being "parked," which would mean that only hyper-threading Sandy Bridges would show any effect.

And, of course, if the logical cores are being parked, that would be an obvious performance difference by itself, right?

biffzinker wrote:
On Ivy Bridge/Haswell a single thread can use the whole 56 entry uop queue. With core parking enabled it should keep the buffer from being partitioned into the hardwired 28-28 split Sandy Bridge has.


Are you saying that core parking makes Ivy Bridge/Haswell behave like Sandy Bridge? Because I don't think that's right, at all.

Re: Intel’s Haswell CPU Microarchitecture

Posted: Fri Jan 25, 2013 3:39 pm
by biffzinker
I'm just saying that on Ivy Bridge core parking would allow the whole buffer to be assigned to one thread while the other thread is parked, unless a demanding workload is run.

Re: Intel’s Haswell CPU Microarchitecture

Posted: Mon Jan 28, 2013 8:49 am
by Glorious
biffzinker wrote:
I'm just saying that on Ivy Bridge core parking would allow the whole buffer to be assigned to one thread while the other thread is parked, unless a demanding workload is run.


Again, as I said, if that were true then you would only see an effect on hyperthreading chips.

And the feature is called "core-parking" but you are now talking about "thread parking." I don't know what that means, but if you actually mean the logical core can be parked, then, as I said, that would obviously have performance implications that transcend the structure of the decode queue. Namely, you aren't using the logical core at all.