just brew it! wrote:Sounds like the RAM subsystem is right on the hairy edge of stability with the XMP profile. As you've already acknowledged, there's really no way to tell based on the info you have whether this is due to the RAM or the CPU (or even both... maybe you've got a "perfect storm" where both are just a little sub-par).
Your options at this point are to start replacing things to see what happens, or just back off and run the RAM at stock. I'd be inclined to opt for the latter, but it's your time and money.
Mmm hmm.
Right now I'm running the PSU test, but with the memory very slightly overvolted to 1.51V while at 1866. If this is stable... then I don't know what the heck is going on with my desktop. I don't have a second CPU or a second set of dual-channel DDR3-1866 or better RAM to swap in, so even if I do find errors, there's still the off-chance that the CPU's memory controller is the faulty part (or that both are).
Long-term stability testing is also needed. I think I'll be able to arrive at a conclusion tomorrow. 8 hours of OCCT or more while the RAM is set to stock speeds is probably more than enough - I think I'll try to aim for 12 hours.
Update: PSU test passes cleanly after 2 hours, with the RAM set to 1.51 V instead of the usual 1.5 V. Now the next step: should I do long-term stability testing at 1.51 V with XMP 1866 speeds, or do it with the memory at JEDEC stock speeds and latencies?
Update 2: Long-term stability testing passed with the RAM running in XMP 1866 mode, but with the voltage at 1.51 V. Weird. So now I have a dual-channel kit that refuses to run properly at its rated voltage.
Now I have a few options.
1. Run the RAM at stock speeds.
2. Run the RAM at 1.51V and suffer from tripled idle power and doubled access power.
3. Attempt to overclock the RAM by overvolting it to 1.65V and then playing with frequencies and timings, throwing power consumption to the wind in the process.
4. RMA the pair and be stuck with no desktop to use unless I can find some temporary DDR3 in the meantime.
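A rough way to put numbers on the voltage bumps in options 2 and 3 is sketched below. It assumes the textbook CMOS approximation that dynamic power scales with the square of the supply voltage at a fixed frequency. The function name and the approximation are assumptions for illustration, not measurements from this system, and leakage and any change in frequency or timings are ignored.

```python
def relative_dynamic_power(v_new: float, v_base: float = 1.5) -> float:
    """Ratio of dynamic power at v_new to dynamic power at v_base.

    Assumes P is proportional to V^2 at a fixed frequency (textbook CMOS
    approximation). Ignores leakage and any change in frequency or timings.
    """
    if v_base <= 0:
        raise ValueError("v_base must be positive")
    return (v_new / v_base) ** 2


if __name__ == "__main__":
    # Compare the two overvolted settings against the rated 1.5 V.
    for volts in (1.51, 1.65):
        ratio = relative_dynamic_power(volts)
        print(f"{volts:.2f} V vs 1.50 V: {100 * (ratio - 1):.1f}% more dynamic power")
```

Under that assumption alone, the 1.51 V setting in option 2 comes out around 1.3% above 1.5 V, and the 1.65 V setting in option 3 around 21%. Real DRAM power also depends on frequency, timings, and leakage, which this sketch leaves out.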
Oh well.
Update 3: PSU test failed again at XMP 1866, at the exact same point in the run. Hmm... Anyway, this time I opened up the case, took the RAM and graphics card out, disconnected the PCIe plugs, and cleaned the gold contacts on the RAM and the graphics card. They looked slightly dull, and someone elsewhere mentioned that I should clean them just to be very sure that it's not just a contact issue. Now that everything has been reinstalled, I'm going to run the PSU test again, but for a longer period of time. Hopefully any instability will be rooted out; I'm actually pretty confident that it's the RAM, in one way or another.
Also to note: when I removed the RAM sticks rather quickly, their heat spreaders were very warm to the touch. Almost uncomfortable, really. Is this normal?
(And I also learned that the video card will throw up a helpful message on POST if its PCIe plugs aren't connected. Shame on me.)
Update 4: OK, not a RAM contact issue, and not an ambient temperature issue (turned on AC today) - PSU test failed in 35 minutes.
I'm 99% confident the RAM is the problem, and that the RAM probably just got... bad enough that it can't maintain its XMP profile speeds at the specified 1.5V. I think I'll try to get the RAM replaced.
Update 5: Forcing all of the system's fans to 100% seems to fix the problem, too... but that's still preliminary. I need to do it again for at least 8 hours, with the AC turned off, to rule out or confirm insufficient case airflow. The system is currently configured with two front intakes and the CPU radiator exhausting to the back.