After the long reign of the Pentium D and Core 2 processor series, Intel’s Hyper-Threading technology made a comeback late last year with the Core i7. Apparently, Microsoft felt inclined to tweak its next operating system to take better advantage of the technology. So points out InformationWeek, quoting a speech Microsoft Windows development chief Bill Veghte gave at the TechEd conference earlier this week:
The second thing that we’re excited to announce in terms of the cooperation and the work that’s been done is around hyper-threading. And obviously the work that Intel has done around hyper-threading across a multi-core system is absolutely critical for you. And so the work that we’ve done in Windows 7 in the scheduler and in the core of the system to take full advantage of those capabilities, ultimately we think together we can deliver a great and better experience for you.
As you’ll know if you perused our Core i7 review (or our evaluation of the original 3.06GHz Pentium 4), Hyper-Threading is an implementation of simultaneous multithreading that allows processors to juggle two logical threads per physical core.
The Core i7 may be somewhat of a niche choice due to high platform costs right now, but Intel also plans to feature Hyper-Threading in 45nm Lynnfield quad-core derivatives later this year, not to mention 32nm Westmere dual-core offerings shortly after that. Both variants should find their way into notebooks, too.
I thought HyperThreading was being able to stitch up a hole in a pair of pants very quickly. Seriously though, this gives Intel the edge, and confirms the saying I heard in college that Microsoft is in bed with Intel. If half of the software out there was tweaked for AMD processors, AMD might have an advantage. The only thing keeping AMD alive right now is the licensing fees that Intel has to pay to use their x86-64 technology. I’m rooting for AMD, and Microsoft should too. 🙂
Your ‘understanding’ pains me. Intel and MS are very buddy-buddy, but it’s not about money. Intel and MS both want to make the other guy’s stuff look good, since Windows on Intel hardware is the most common PC setup sold in the world.
Also, Intel does not pay AMD anything for x86-64, never has. The agreement between Intel and AMD basically lets each use things the other has without payments or penalties.
I remember seeing a chip with both the Intel AND Advanced Micro Devices logo on it. It was an old processor from the 80’s I think, that Intel and AMD developed together. If AMD was the top dog in chip sales, I would be rooting for Intel and hating AMD, but that is just my human nature. So this hyperthreading optimization in Windows 7 kind of irks me.
microsoft would say anything to make you buy windows 7
They don’t have to say much, as I love using 7; it has worked great for me. The only issue I have is that some of my older games won’t play at all (Space Empires 4). However, when they polish off XP Mode, that will hopefully change.
I don’t ever learn where the reply button is… :/
Which is precisely why it’s a good idea to make the scheduler smarter about using it.
EDIT: I need to learn to read better.
Sounds to me as beneficial as ReadyBoost.
Isn’t hyperthreading kind of irrelevant and pretty much useless technology?
Especially with 4+ core CPUs that are the main audience of Win7 (think ~5 years)
Free processing power is never a bad thing.
It’s not free. There’s an additional overhead on the CPU that clearly slows it down in cases where it’s not making use of many threads.
That’s not to say that it’s a true disadvantage, but it’s not a perfect idea, and I imagine that’s why Intel uses it sparingly, and AMD still avoids it altogether (though that has proved to be a bit of a mistake on the server end.)
Not only that, but the reason that HYPErthreading can exist at all within a core is because of its architecture: HYPErthreading, since the beginning, has merely been a method whereby more processing power is squeezed out of a core when it would otherwise be stalled. A more efficient core design would not leave enough processing room for HYPErthreading. The P4 had plenty of room in that regard…;) And that’s why HT was born there, literally as an afterthought.
Cyril uses the term “juggling” to describe what HT does with two threads per core. That’s perfectly accurate. Unlike when two cores run two threads simultaneously, HT instead merely multitasks between the two threads, and never runs them both at the same time. Depending on the software this can improve performance or it can degrade it, accordingly.
I don’t see anything “wrong” with HT, really. It can work in Intel cpus at times to an advantage–but because of its nature, as you point out, it can also reduce performance. I’ve always thought it chiefly served Intel as a marketing tool because people imagine that it allows one core to run two threads simultaneously, which it doesn’t.
Walt, you should really refrain from talking about things you don’t understand.
Simultaneous Multi-Threading (SMT) aka HyperThreading is tremendously useful and has shown performance gains of 30%, for die area costs of approximately 5-10% per core and in terms of power its almost free.
SMT/HT can simultaneously execute instructions from two different threads and can overlap a substantial amount of memory latency.
If you’ll notice, almost every server chip uses Multithreading: POWER5, POWER6, Niagara I, Niagara II, Rock, Fujitsu’s SPARC64, Itanium, the EV8 would have. GPUs use multithreading, as do many embedded CPUs.
SMT can issue and execute N threads simultaneously, some implementations may be able to retire from N threads as well.
Show me another single microarchitectural feature that can add 30% performance…
Show me another feature with such a favorable perf/watt and perf/mm2 ratio…
Yeah, it’s unfortunate that HT was given such a bad name by the P4. The reason why the Atom uses HT is because it gives more performance at a lower cost (in terms of power).
so you’re saying hyperthreading works by loading up two threads worth of data into cache but only ever executing one thread up to “a certain period of time” before ‘context switching’ threads.
from my understanding the whole point was that instruction re-ordering and parallelization was not perfect resulting in underutilisation of available resources. like only 2 instructions being issued on a 4-issue wide core. hyperthreading would ‘fill in’ the extra 2 slots with instructions from the other thread resulting in full utilisation rather than half.
i think your case matches what occurred on the p4 as it was 2-issue wide. due to the very long execution pipeline, re-ordering would hit the limit before it needed a result to issue more instructions. so instead of stalling it’d load instructions from the other thread.
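the slot-filling idea above can be sketched as a toy model. to be clear, this is purely illustrative: the function, the ILP numbers, and the 4-wide width are all made up here, not real CPU data or a real simulator.

```python
def cycles_to_drain(work, width=4):
    """Toy model of a `width`-issue core draining instruction streams.

    work: list of (instructions_remaining, ilp_per_cycle) per thread,
    where ilp_per_cycle caps how many independent instructions that
    thread can offer the issue slots each cycle.
    """
    remaining = [n for n, _ in work]
    ilp = [i for _, i in work]
    cycles = 0
    while any(remaining):
        slots = width
        # Fill issue slots from each thread in turn, up to its ILP cap.
        for t in range(len(remaining)):
            take = min(remaining[t], ilp[t], slots)
            remaining[t] -= take
            slots -= take
        cycles += 1
    return cycles

# One thread with ILP of 2 wastes half of a 4-wide core: 100 instructions
# take 50 cycles. Two such threads via SMT fill all four slots, so the
# core retires 200 instructions in the same 50 cycles.
solo = cycles_to_drain([(100, 2)])            # 50 cycles, half idle
smt = cycles_to_drain([(100, 2), (100, 2)])   # 50 cycles, fully busy
```

in this toy model SMT doubles throughput only because each thread alone under-fills the core; a single thread with ILP of 4 would leave no spare slots to fill, which is the point being made about efficient core designs.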
No he isn’t saying that at all, read up on how SMT works.
sorry… can you clarify which part of his reply i interpreted incorrectly?
i replied to this statement of his (without quoting originally):
You seem to speak pretty negatively of “HYPErthreading” for not seeing anything wrong with it.
One of my 3DS render times went from 17:20 on my X2 4200 from AMD to 6:22 on my i7. When I disable HT, the time goes back up to 12:45.
I’ll keep my hyperthreading, thank you very much.
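For what it's worth, the quoted render times work out to roughly a 2x gain just from enabling HT (a quick sanity check of the numbers above, nothing more):

```python
def seconds(mmss):
    """Convert a 'mm:ss' render time into seconds."""
    m, s = mmss.split(":")
    return int(m) * 60 + int(s)

# Render times quoted above: old X2 4200+, i7 with HT, i7 without HT.
x2, i7_ht, i7_no_ht = seconds("17:20"), seconds("6:22"), seconds("12:45")

ht_speedup = i7_no_ht / i7_ht   # roughly 2.0x from enabling HT
overall = x2 / i7_ht            # roughly 2.7x over the old X2
```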
Not for tasks that can actually use more hardware threads. Sure, most embarrassingly parallel tasks are server side, but 5 years from now we’ll have people complaining about how long transcoding their QuadHD videos takes…
The software will never match up with the hardware.
I believe that it’s more likely that you will do those tasks on GPUs via CUDA. And with 512/1024 cores on GPUs they will make those CPUs pathetic.
I don’t believe video encoding (one of my main time-wasters) is going to be running on my GPU soon. GPU power is increasing faster than CPU power still, but the difference between the two rates of change is lower than ever before and still narrowing.
Also, it won’t be CUDA, any more than GLide revolutionized 3D. The proprietary solutions get out to an early lead, but in the end the open standards win out. OpenCL is what we should be hoping/planning for.
That said, my other major CPU occupier is virtual machines, and HT is a godsend there. It’s also an area where GPUs are unlikely to take over.
There’s one key difference between DX and the really proprietary APIs: the latter were tied to specific hardware from a specific company. DX is not tied to specific hardware, so it’s not quite the same. I’m not certain, but I also believe MS smartly provides all the tools needed to use DX for free. DX vs. OpenGL isn’t the same as DX or OpenGL versus the early proprietary APIs that required specific hardware; CUDA is like the latter because it requires specific hardware.
DirectX is a lot more open than Glide ever was. Unlike CUDA and Glide, DirectX is vendor-neutral on the hardware side.
Also, I think OpenGL did win over Glide. It was very popular at the time and it probably did more to beat Glide into the ground than Direct3D did. It wasn’t until DX8 and especially DX9 that OpenGL really fell to the wayside.
Obviously you’ve never seen those SAP numbers.
I find it useful. But then I do things that are “embarrassingly parallel” that does not involve transcoding video or other graphics-intensive stuff.
Some of the Intel processors, including some that are a year old, don’t have Hyperthreading, while AMD has had it in all CPU’s up to the X2’s 5000 series.
I’m glad MS is using this technology even though not everyone will be able to use it.
Are you talking about the virtualization support that people were all up in arms about a week or two ago?
Because AMD has never supported multithreading on its cores. It has had multiple cores (the X2 CPUs) but never multiple threads per core.
Yeah, somebody had their jargon wrong. AMD = no hyperthreading ever
and for intel, it’s the P4’s, Atoms and i7’s. no hyperthreading on core or core 2
May be he was thinking HyperTransport.
“all CPU’s up to the X2’s 5000 series”?
I can’t come up with a theory that makes sense of this. I’m just chalking it up to random packet loss, or maybe a trial run of a very buggy AI turned loose on teh intratubes.
hopefully what that means is that the scheduler is smart enough to know what “logical” cores are paired together so it makes sure all physical cores are busy before burdening one of them with a second thread.
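that "spread across physical cores first" policy is easy to sketch. hypothetical toy code, not how the actual Windows 7 scheduler works:

```python
def pick_cpu(busy, siblings):
    """Pick a logical CPU for a new thread.

    busy: set of busy logical CPU ids.
    siblings: list of (a, b) pairs of logical CPUs sharing a physical core.
    """
    # First choice: a physical core with both logical CPUs idle.
    for a, b in siblings:
        if a not in busy and b not in busy:
            return a
    # Fallback: any idle logical CPU, even on an already-busy core.
    for a, b in siblings:
        for cpu in (a, b):
            if cpu not in busy:
                return cpu
    return None  # every logical CPU is busy

# i7-style layout: 4 physical cores, 2 logical CPUs each.
pairs = [(0, 1), (2, 3), (4, 5), (6, 7)]
busy = set()
for _ in range(4):
    busy.add(pick_cpu(busy, pairs))
# The first four threads land on four distinct physical cores,
# and only the fifth thread doubles up on a core.
```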
I believe I’ve read as such.
In that case, this will only make a difference for multicore processors, and won’t affect the single-core Atom that currently ships in most netbooks. Which is unfortunate, because if there was any processor that needed some extra performance, it’s certainly not the i7 (not that improving performance for any processor is a bad thing, but Atom needs all the tweaks it can get).
While I’d like to see extra performance squeezed out of the Atom, I think at this point you’re trying to get blood from a stone.
Sooo… not Crysis on my Atom. CURSES.
Foiled again!
Actually, the scheduler has been aware of that for some time (I think Win2K was the last kernel that was confused by virtual vs physical cores, and that may have been fixed in an SP, I don’t recall). But they’ve been improving the smarts of the kernel wrt node topology with every version (XP SP2 added NUMA support, Server 2K3 improved it, Vista / 2K8 improved it further) so it’s not surprising they’re still working at it. And now that they have both in-order cores (Itanium and Atom) as well as OoO cores (P4 and i7) doing SMT, there are probably several “right” ways for the scheduling algorithm to operate depending on the hardware it finds itself on (particularly since HT on the i7 is notably improved over the Replay dysfunction of the P4).
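On the Linux side, for comparison, the sibling topology the scheduler needs is exposed through sysfs files like /sys/devices/system/cpu/cpu*/topology/thread_siblings_list. A small parser sketch, using hard-coded sample strings rather than reading the real files (the layout shown is a hypothetical 4-core/8-thread i7):

```python
def parse_cpu_list(s):
    """Parse a sysfs-style CPU list like '0,4' or '0-1' into a tuple."""
    cpus = []
    for part in s.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return tuple(cpus)

# Sample thread_siblings_list contents for logical CPUs 0..7 (hard-coded
# here for illustration): each physical core pairs CPU n with CPU n+4.
samples = ["0,4", "1,5", "2,6", "3,7", "0,4", "1,5", "2,6", "3,7"]
cores = sorted(set(parse_cpu_list(s) for s in samples))
# cores groups the eight logical CPUs into four physical cores.
```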
Task Manager may have been aware that the cores were logical versus physical, but that doesn’t mean the scheduler was. IIRC, Windows XP had trouble dealing well with any multicore processor anyway.
UG did say XP SP2. Remember there’s a lot of Server 2003 code pulled in to SP2 and subsequent releases.
FINALLY! Now I’ll be able to play Crysis on my Atom!
Uhh… Not quite.
But can it play Crysis?
oops…
And Atom offers hyperthreading also.
As far as Windows is concerned, hyperthreading never went away: Windows Server got some tweaks for SMT in the Itanium builds.
Quiet, you.
This is DEFINITELY about the Atom, and not the i7.
Unlikely, given that single-core Atoms can’t really benefit from loading up the physical cores before the logical cores.
As a proud owner of an i7, I couldn’t care less WHY they are adding improved hyper-threading support… all I see is good news for me! 🙂
Likely, because they want to replace Windows XP with Windows 7 as the primary netbook system baseline.
As such, any performance gain on crippled hardware should be met with fanfare.