Hyper-Threading erratum rears its head in Skylake and Kaby Lake

CPUs can ship with bugs just as software can, and members of the Debian Linux community have uncovered what they claim is a serious one in Intel Skylake and Kaby Lake processors, including Skylake-X CPUs. In a message on the project's mailing list yesterday that was noticed by the eagle-eyed folks at HotHardware, the project says affected CPUs "could, in some situations, dangerously misbehave when hyper-threading is enabled." The message further recommends that users "disable hyper-threading immediately in BIOS/UEFI to work around the problem."

The project member claims the erratum in question is called "Short Loops Which Use AH/BH/CH/DH Registers May Cause Unpredictable System Behavior" in Intel documentation. The company describes some of the required conditions for the bug to occur:

"Under complex micro-architectural conditions, short loops of less than 64 instructions that use AH, BH, CH or DH registers as well as their corresponding wider register (e.g. RAX, EAX or AX for AH) may cause unpredictable system behavior. This can only happen when both logical processors on the same physical processor are active."

According to the Debian mailing-list message, the bug was first triggered by members of the OCaml community, who were able to demonstrate the issue using the OCaml compiler. Further investigation by members of the OCaml project isolated the behavior to Skylake CPUs with Hyper-Threading enabled. When the bug was triggered, the OCaml developers noted "compiler and application crashes [and] incorrect program behavior, including incorrect program output."

Intel says "due to this erratum, the system may experience unpredictable system behavior," and that a BIOS fix could prevent the issue from occurring. A spot check of the BIOS update history for the Z270 motherboards in the TR labs doesn't turn up any fixes that explicitly claim to address this issue. We've asked Intel about its plans to address this bug under Windows (presuming it hasn't been corrected already), and we'll let you know what we hear. Intel and Microsoft can update CPU microcode through Windows Update, so it's possible this issue has already been quietly patched for Windows without anybody hearing about it.

For what it's worth, we haven't noticed any unusual instability or crashes from our Skylake or Kaby Lake CPUs with Hyper-Threading enabled under Windows in our long history with those parts, so we probably won't be turning off the feature or advising most other people to do the same. Skylake CPUs have been in the wild since August 2015, and if this were a critical or easy-to-trigger bug, we'd likely have heard about it long before now. If you work with mission-critical data requiring absolute correctness or can't tolerate the possibility of application or system crashes, you might want to disable Hyper-Threading for the time being. Everybody else can likely wait and see what's up.
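
For Linux users who want a rough idea of whether this applies to their machine, here is a minimal Python sketch that reads /proc/cpuinfo-style text and reports whether the chip looks like a Skylake or Kaby Lake part with Hyper-Threading active. The model numbers in SUSPECT_MODELS are our assumption and do not come from the Debian message or Intel's documentation quoted above, and the function names are purely illustrative. A match doesn't mean a system will misbehave, and it says nothing about whether a microcode fix is already installed.

```python
# Rough check for "Skylake/Kaby Lake with Hyper-Threading active" on Linux.
# ASSUMPTION: these family-6 model numbers are illustrative; verify them
# against Intel's documentation before relying on the result.
SUSPECT_MODELS = {78, 94, 85, 142, 158}


def parse_cpuinfo(text):
    """Return the key/value fields of the first processor entry."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the first processor entry
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def hyper_threading_active(fields):
    """More logical CPUs (siblings) than physical cores means SMT is on."""
    try:
        return int(fields["siblings"]) > int(fields["cpu cores"])
    except (KeyError, ValueError):
        return False


def possibly_affected(fields):
    """True if the CPU matches the assumed models and has SMT active."""
    try:
        family = int(fields["cpu family"])
        model = int(fields["model"])
    except (KeyError, ValueError):
        return False
    return (
        fields.get("vendor_id") == "GenuineIntel"
        and family == 6
        and model in SUSPECT_MODELS
        and hyper_threading_active(fields)
    )


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            info = parse_cpuinfo(handle.read())
    except OSError:
        print("Could not read /proc/cpuinfo (not a Linux system?)")
    else:
        print("Possibly affected:", possibly_affected(info))
```

The check compares the siblings and cpu cores fields because the erratum requires both logical processors on a physical core to be active. With Hyper-Threading disabled in the BIOS, those two numbers should match and the sketch reports False.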
