Potholes on the road to the North Pole
I'm just lucky, I guess, but thanks to a unique set of circumstances in the test hardware I used, I got an education in low-level motherboard behavior. It's just as well; nothing blew up, and I get to relate this interesting story to you, dear reader.
While hooking up the Vapochill's power connectors, I followed the manual's instructions and hooked the ATX12V lead to the appropriate jack on the ChillControl board. Then, I hooked the included ATX12V cable from the ChillControl to the motherboard.
This configuration didn't work too well. When the evaporator temperature reached the point where the motherboard should get powered on, the power supply kept turning itself off and back on. The manual warns about this situation in a paragraph following the hookup instructions for the ATX12V lead. Here's a direct quotation:
Using the ATX12V socket on the ChillControl makes some PSU shut down. If this is the case, simply connect the ATX+12V directly to the motherboard instead.
That sounded more or less like what I was experiencing, so I followed the instructions, unplugging the power supply's ATX12V lead from the ChillControl board and plugging it into the motherboard. Once everything was up and running, however, I encountered some strange, intermittent errors. After being off for a long period of time (several hours or so), the Vapochill would run for several minutes and then generate an error message on the front display, error E041. Powering down the system and immediately turning it back on simply caused the error to recur, but I found that if I powered the box down and let it sit for a few minutes, it would power up normally.
The manual said that an E041 error indicated a "sensor short cut," a short-circuit in the evaporator's temperature sensor. That seemed strange to me, because the error cleared if I just left the system powered off for a few minutes, and short-circuits don't usually just fix themselves. However, since the error was so sporadic, it was a couple of days before I took the time to do some research on the Asetek forums.
Once I did, I found a variety of messages regarding these E041 errors, and based on those messages, it seemed the errors were related to the motherboard I was using, an Abit IT7-MAX2. The system was registering a short-circuit because the temperature sensor was hitting its maximum value of 65 degrees Celsius. Additionally, it seemed that the processors were, at least in some cases, getting significantly hotter than 65 degrees Celsius. I found several reports of people who had burned up one or even multiple CPUs due to this problem.
In the default configuration, the ChillControl computer regulates the ATX12V lead to the motherboard, not applying power until it is ready to release the reset line and boot the system. However, with certain power supplies (Antec units, for example, though not Enermax ones), having the ChillControl regulate the ATX12V lead causes problems. Asetek blames this issue on a vaguely formulated ATX standard, and says that the problem may even come and go on some power supplies depending on the motherboard used. If the problem occurs, the only way to get the system to boot is to connect the ATX12V lead from the power supply directly to the motherboard.
Unfortunately, the IT7-MAX2 apparently does not block ATX12V power from reaching the CPU while the reset line is held high. You can see where this is going. In the time it takes the evaporator to get down to operating temperature, the CPU is getting power. With no effective means of heat dissipation, it gets hot, and quickly. If you've raised the processor's voltage above stock, things get even worse. The result: the CPU reaches critically high temperatures before the evaporator can cool it off.
Those familiar with the Pentium 4 will say "But what about the processor's thermal protection mechanisms?" This is a good point. The Pentium 4 has previously been demonstrated to handle removal of a heatsink during active operation. Why the problem here?
I contacted Intel to get some explanations for this behavior. It turns out the Pentium 4 not only has the thermal throttling that many enthusiasts are aware of, it also has an even more severe action, called THERMTRIP in the CPU datasheets. THERMTRIP is asserted when the processor's internal temperature sensor indicates a temperature of approximately 135 degrees Celsius, a temperature at which permanent silicon damage may occur. When THERMTRIP is asserted, the processor shuts down all internal clocks in an attempt to reduce its temperature. Additionally, in order to comply with processor requirements, Pentium 4 motherboards must remove core voltage from the processor within 0.5 seconds when THERMTRIP is asserted. Of course, this arrangement is designed to protect the processor during normal operation.
When critical temperatures occur while the reset line is held high, things get complicated, for a couple of reasons. First, B0 stepping processors deassert THERMTRIP when RESET is asserted, which could obviously cause issues with the way the Vapochill keeps the system from powering up. Steppings C0 and later won't deassert THERMTRIP until the deassertion of the PWRGOOD signal (i.e. a complete power off of the system). However, in reality this doesn't mean the problem is solved with these later steppings, because it's likely that the motherboard won't correctly respond to a THERMTRIP event when the reset line is held high, which is the reason people have ended up with dead processors.
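For readers who find the stepping differences easier to follow as logic, here is a minimal sketch of the behavior as Intel described it to me. The function name and arguments are my own invention for illustration, not anything from the datasheet. It treats every stepping other than B0 as "C0 or later," and it assumes a full power-off clears THERMTRIP on B0 parts as well, which Intel did not explicitly confirm.

```python
def thermtrip_cleared(stepping, reset_asserted, pwrgood):
    """Return True if an already-asserted THERMTRIP would be deasserted.

    Hypothetical model of the behavior described above, not Intel's logic.
    """
    if not pwrgood:
        # PWRGOOD deasserted means a complete power-off of the system.
        # Assumption: this clears THERMTRIP on every stepping.
        return True
    if stepping == "B0":
        # B0 parts deassert THERMTRIP whenever RESET is asserted.
        return reset_asserted
    # C0 and later hold THERMTRIP until PWRGOOD is deasserted.
    return False


if __name__ == "__main__":
    for stepping in ("B0", "C0"):
        cleared = thermtrip_cleared(stepping, reset_asserted=True, pwrgood=True)
        print(stepping, "clears THERMTRIP under RESET:", cleared)
```

Note that this only models the processor side. As explained above, a C0 part keeping THERMTRIP asserted doesn't help if the motherboard ignores the signal while the reset line is held.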
Intel pointed out to me that the reset specifications for the Pentium 4 don't allow for the reset line to be held high for as long as is typical with the Vapochill's startup procedure. Since such an operation doesn't conform to Intel's specifications, any processor deaths which occur in this situation are basically out of their hands.
Personally, I'm somewhat torn on the importance of this problem. I felt compelled to report on it because I experienced it while working on this review, but at the same time, it is the sort of crazy coincidence one might expect when using technology such as phase-change cooling of a processor. Asetek is working hard to remedy these issues. They have contacted Abit about the problem and are doing their best to get it resolved. In the meantime, I would recommend purchasing an Enermax power supply to go with your Vapochill; they seem to work with the default ATX12V configuration regardless of motherboard. If you already have a power supply and it doesn't work with the stock ATX12V configuration, buy another one. A new PSU is a lot cheaper than a new 3.06GHz P4.
UPDATE: Asetek now has a beta firmware for their ChillControl unit which does not hold the RESET line high during initial startup. Rather, it uses control of the 12V supply to keep the system from starting, then applies 12V and briefly trips the RESET line (basically pushing the reset button) to start the system at the appropriate time. While this new firmware doesn't help the above compatibility issues with the IT7-MAX2, it should prevent CPU death. With the RESET line low, the processor should have no problem asserting THERMTRIP should the need arise.
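To make the difference between the two startup procedures concrete, here is a rough sketch that flags the dangerous condition: the CPU receiving 12V while RESET is held for an extended period. The phase names, the `Phase` structure, and the `risky_phases` helper are all hypothetical; this is my reading of the behavior described in this article, not Asetek's actual firmware logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    reset_asserted: bool   # is the ChillControl asserting RESET?
    switched_12v_on: bool  # is the ChillControl passing ATX12V through?
    brief: bool            # True for a momentary pulse, False for a long wait


def sequence(firmware):
    """Startup phases for each firmware, as described in the article."""
    if firmware == "original":
        return [
            Phase("cooldown", True, False, False),
            Phase("run", False, True, False),
        ]
    if firmware == "beta":
        return [
            Phase("cooldown", False, False, False),
            Phase("reset pulse", True, True, True),
            Phase("run", False, True, False),
        ]
    raise ValueError(f"unknown firmware: {firmware}")


def risky_phases(firmware, atx12v_direct):
    """Names of phases where the CPU has 12V while RESET is held at length.

    atx12v_direct=True models the workaround of plugging the PSU's ATX12V
    lead straight into the motherboard, bypassing the ChillControl.
    """
    return [
        p.name
        for p in sequence(firmware)
        if p.reset_asserted
        and not p.brief
        and (atx12v_direct or p.switched_12v_on)
    ]


if __name__ == "__main__":
    for fw in ("original", "beta"):
        for direct in (False, True):
            print(fw, "direct 12V" if direct else "switched 12V",
                  risky_phases(fw, direct))
```

Under these assumptions, only the original firmware combined with the direct ATX12V hookup produces a risky phase, which matches the failure reports: the cooldown period is when the CPU cooks.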