 
HorseIicious
Gerbil First Class
Topic Author
Posts: 128
Joined: Tue Aug 16, 2011 7:52 pm
Location: Florida

Help identifying possible faulty CPU

Mon Nov 04, 2013 1:29 pm

I've never experienced a bad CPU before, but I think I may have one in my i5-2500K... I need some help determining whether the CPU is indeed going bad, and/or what other tests I can run to identify the problem. A little back story... I have run this CPU on a 24/7 overclock with a +0.075 V main voltage offset for years (all other settings on Auto). I have a Cooler Master Hyper 212 for cooling. Typical idle temps were in the high 20s/low 30s °C, typical loads in the mid 50s °C, and stress testing got me to the high 60s/low 70s °C.

I began experiencing issues about a month ago, and at that time I reset everything back to stock settings. So all tests I have run have been on stock 3.3GHz and stock voltage.

The problem I am experiencing is with memory faults... Programs like 7zip (CRC), QuickPar (Verify Failed, Checksum Error, Memory Fault), Firefox/Chrome (random crashes all the time when just browsing text websites) and many others are giving me memory errors. I have unzipped the same files which are giving me CRC errors on a different system without any problems. I have also run QuickPar repairs that give me various errors on this main system - but the files are checked and repaired fine on another system (yet continually fail on this main system).
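One way to separate "bad archive" from "bad machine" is to hash the same file several times on the suspect system and compare the results. The following is a minimal sketch, not something anyone in the thread ran; the function names `file_digests` and `repeated_check` are hypothetical. It assumes that after the first read the file is mostly served from the OS file cache, so repeated passes exercise RAM and CPU more than the disk.

```python
import hashlib
import zlib


def file_digests(path, chunk_size=1 << 20):
    """Return (crc32, sha256 hex digest) for the file at `path`, read in chunks."""
    crc = 0
    sha = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)  # running CRC-32, same family 7-Zip reports
            sha.update(chunk)
    return crc & 0xFFFFFFFF, sha.hexdigest()


def repeated_check(path, runs=5):
    """Hash the same file `runs` times and return the set of distinct results."""
    return {file_digests(path) for _ in range(runs)}
```

On a healthy machine `repeated_check` should return a set with exactly one element. More than one distinct result for an unchanged file would point at corruption somewhere between storage, RAM, and CPU rather than at the file itself.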

I put each 8GB stick of RAM into another system and ran Memtest (4.3.5) for over 5 hours (3+ passes) on each stick by itself. All resulted in 0 errors. Then I put the sticks back into my main system and ran Memtest (with all 4 DIMM slots populated) for over 10 hours (3+ passes) - again 0 errors. This leads me to believe my motherboard (at least the DIMM slots) and RAM are okay. I have a PSU tester, and I also tested my PSU (without load); it says the PSU is fine (and nothing else indicates the PSU is going bad). I unseated/reseated my GPU, RAM, and all SATA connectors - still have the issue.

As an additional example, I ran Prime95 (on stock settings, and after doing all of the other memory tests), and within 3 minutes, 3 of my cores failed with, "rounding was 0.5 expected less than 0.4" - Edit: And the 4th core failed after 36 minutes ("rounding was 0.4981994629, expected less than 0.4")
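For readers unfamiliar with that message: Prime95 multiplies very large numbers using floating-point FFTs, and after the inverse transform every value should land very close to a whole number. The distance to the nearest integer is the round-off error, and a value near 0.5 means the result could not be trusted. The toy sketch below shows the idea on small numbers. It is not Prime95's code; `fft`, `multiply_digits`, and `ROUNDOFF_LIMIT` are hypothetical names, and the 0.4 limit is simply taken from the error text above.

```python
import cmath

ROUNDOFF_LIMIT = 0.4  # threshold quoted in the Prime95 message above


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def multiply_digits(a, b):
    """Multiply two little-endian base-10 digit lists via FFT.

    Returns (integer product, largest distance of any coefficient from an integer).
    """
    size = 1
    while size < len(a) + len(b):
        size *= 2
    fa = fft([complex(d) for d in a] + [0j] * (size - len(a)))
    fb = fft([complex(d) for d in b] + [0j] * (size - len(b)))
    raw = [v.real / size for v in fft([x * y for x, y in zip(fa, fb)], invert=True)]
    rounded = [round(v) for v in raw]
    max_roundoff = max(abs(v - r) for v, r in zip(raw, rounded))
    product = sum(c * 10 ** i for i, c in enumerate(rounded))
    return product, max_roundoff


def to_digits(number):
    """Little-endian decimal digits of a non-negative integer."""
    return [int(ch) for ch in reversed(str(number))]


product, roundoff = multiply_digits(to_digits(12345), to_digits(6789))
print(product == 12345 * 6789, roundoff < ROUNDOFF_LIMIT)
```

On working hardware the round-off for inputs this small is many orders of magnitude below the limit, so a reading of 0.498 at stock settings suggests a wrong value entered the calculation somewhere (CPU, cache, RAM, or the path between them), which is why the test alone cannot say which part is at fault.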

From what I've read, all I can figure is maybe the L2/L3 cache is going bad on my CPU.

Any input, advice, knowledge would be much appreciated.

Thanks
Last edited by HorseIicious on Mon Nov 04, 2013 1:39 pm, edited 1 time in total.
 
paco
Minister of Gerbil Affairs
Posts: 2083
Joined: Wed Jul 21, 2004 7:14 pm
Location: So Cal

Re: Help identifying possible faulty CPU

Mon Nov 04, 2013 1:37 pm

You need to put the memory from the "other" system into the one you are having trouble with and run prime95 and see what results you get.

Memtest can't stress test the memory; it can only check for faults in the memory itself. The memory could be failing under heavy load.
 
Ryu Connor
Global Moderator
Posts: 4369
Joined: Thu Dec 27, 2001 7:00 pm
Location: Marietta, GA

Re: Help identifying possible faulty CPU

Mon Nov 04, 2013 1:40 pm

A defective CPU will sometimes log a specific type of error in the Event Viewer. You'll need to look under the System log for a machine-check exception (MCE).
All of my written content here on TR does not represent or reflect the views of my employer or any reasonable human being. All content and actions are my own.
 
just brew it!
Administrator
Posts: 54500
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Help identifying possible faulty CPU

Mon Nov 04, 2013 1:43 pm

While it is conceivable that the CPU is failing (especially given the overclock), I would still suspect the motherboard, RAM, or PSU before the CPU. Memtest does not find all faults, and a failing motherboard or PSU can definitely cause symptoms like this as well.

Try running your stability tests with only 2 sticks of RAM. Then swap in the other two sticks and run with those instead. See if the symptoms change.

Look closely at the motherboard for any capacitors that are bulging or leaking.

Consumer PSU testers don't test the PSU under realistic loads, and don't check how clean the power on the rails is, so they will only find major failures (e.g. a rail that isn't working at all). If you have a spare PSU you can try, swap the PSU and see if the symptoms change.
Nostalgia isn't what it used to be.
 
HorseIicious
Gerbil First Class
Topic Author
Posts: 128
Joined: Tue Aug 16, 2011 7:52 pm
Location: Florida

Re: Help identifying possible faulty CPU

Mon Nov 04, 2013 1:53 pm

Thanks to all of you for your quick and informative replies. Another reason why I love this place.

Ryu Connor wrote:
A defective CPU will sometimes log a specific type of error in the Event Viewer. You'll need to look under the System log for a machine-check exception (MCE).


I just checked this and I don't see any MCE. All the logs are "Service Control Manager" with event ID 7036 and just basically log which services have stopped/started. I did a search for both "Machine-check-exception" and "MCE" and nothing turned up. Good to know though, thanks.
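A text search for "MCE" may come up empty even when hardware errors were logged, because on Windows 7 such events are normally recorded under the source WHEA-Logger rather than under that name. The sketch below filters the System log by that provider using the built-in `wevtutil` command. The helper names `build_whea_query` and `recent_whea_events` are hypothetical, and an empty result still does not prove the CPU is healthy.

```python
import platform
import subprocess

WHEA_PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_whea_query(max_events=20):
    """Build a wevtutil command listing the newest WHEA-Logger events in the System log."""
    xpath = f"*[System[Provider[@Name='{WHEA_PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",       # XPath filter on the event provider
        f"/c:{max_events}",  # maximum number of events to return
        "/rd:true",          # newest events first
        "/f:text",           # human-readable output
    ]


def recent_whea_events(max_events=20):
    """Run the query and return its text output (Windows only)."""
    if platform.system() != "Windows":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        build_whea_query(max_events), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `recent_whea_events()` from a command prompt on the affected machine returns an empty string when no such events exist.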

just brew it! wrote:
While it is conceivable that the CPU is failing (especially given the overclock), I would still suspect the motherboard, RAM, or PSU before the CPU. Memtest does not find all faults, and a failing motherboard or PSU can definitely cause symptoms like this as well.

Try running your stability tests with only 2 sticks of RAM. Then swap in the other two sticks and run with those instead. See if the symptoms change.

Look closely at the motherboard for any capacitors that are bulging or leaking.

Consumer PSU testers don't test the PSU under realistic loads, and don't check how clean the power on the rails is, so they will only find major failures (e.g. a rail that isn't working at all). If you have a spare PSU you can try, swap the PSU and see if the symptoms change.


Thanks for the info. I will try the tests you've recommended on the RAM, and reinspect the motherboard for bad caps, and report back later this evening. The only PSU I have that is suitable for a swap is in my HTPC. So I'll have to pull that later. I'll let you know what results I find.

Thanks again for the assistance.
 
Forge
Lord High Gerbil
Posts: 8253
Joined: Wed Dec 26, 2001 7:00 pm
Location: Gone

Re: Help identifying possible faulty CPU

Mon Nov 04, 2013 1:58 pm

Given your clean memtest runs, I'd suggest checking the disk for problems. I saw similar issues a few weeks back when a coworker's SSD was on the way out.
Please don't edit my signature for me. Thanks.
 
HorseIicious
Gerbil First Class
Topic Author
Posts: 128
Joined: Tue Aug 16, 2011 7:52 pm
Location: Florida

Re: Help identifying possible faulty CPU

Mon Nov 04, 2013 5:12 pm

just brew it! wrote:
While it is conceivable that the CPU is failing (especially given the overclock), I would still suspect the motherboard, RAM, or PSU before the CPU. Memtest does not find all faults, and a failing motherboard or PSU can definitely cause symptoms like this as well.

Try running your stability tests with only 2 sticks of RAM. Then swap in the other two sticks and run with those instead. See if the symptoms change.

Look closely at the motherboard for any capacitors that are bulging or leaking.

Consumer PSU testers don't test the PSU under realistic loads, and don't check how clean the power on the rails is, so they will only find major failures (e.g. a rail that isn't working at all). If you have a spare PSU you can try, swap the PSU and see if the symptoms change.


Okay, so I did a pretty good inspection of my motherboard (I've seen bad caps before, so I feel confident saying I don't have any obvious culprits). None leaking, none leaning, none bulging. All the caps look good.

Then I did two separate Prime95 tests with only 2 sticks (16GB 2x8GB) populated for each test - both tests failed within a few minutes. Then out of curiosity I pulled 2x4GB sticks from another system and tried those (so only 8GB 2x4GB). Currently Prime95 has been running for well over an hour with no faults on any core with the 2x4GB option. The only difference in the RAM (other than capacity) is that the 8GB sticks are rated at 1333, and the 4GB sticks at 1600 (all timings, voltages, brand/model etc. are identical). Also of note, the 4GB sticks were/are not running at 1600 (but instead at 1333) - even though I have BIOS set on all-auto for DRAM settings.
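The swap tests above amount to elimination: look for whatever is present in every failing run and absent from every passing run. A minimal sketch of that bookkeeping follows. The component labels and the `suspects` helper are hypothetical, and the model naively assumes one faulty part, so it cannot tell bad sticks apart from a board that has stopped getting along with those sticks.

```python
def suspects(runs):
    """Return components present in every failing run and absent from every passing run.

    `runs` is a list of (components, passed) pairs, where `components` is a set of labels.
    """
    failing = [set(parts) for parts, passed in runs if not passed]
    passing = [set(parts) for parts, passed in runs if passed]
    if not failing:
        return set()
    common = set.intersection(*failing)
    for parts in passing:
        common -= parts
    return common


# Prime95 results from this post, all on the same CPU, board, and PSU.
prime95_runs = [
    ({"cpu", "board", "psu", "8GB DIMMs"}, False),  # first 2x8GB pair failed
    ({"cpu", "board", "psu", "8GB DIMMs"}, False),  # second 2x8GB pair failed
    ({"cpu", "board", "psu", "4GB DIMMs"}, True),   # 2x4GB pair passed
]
print(suspects(prime95_runs))  # {'8GB DIMMs'}
```

Adding a run with a different board or PSU, as suggested earlier in the thread, would narrow things down further.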

Finally, I discovered Intel Processor Diagnostic Tool (https://downloadcenter.intel.com/Detail ... s&lang=eng) - and ran that, twice. Both runs gave me "PASS" results (regardless of what RAM was populating the system).

Really not sure how to proceed from here. Is my motherboard just randomly rejecting RAM that has worked perfectly for over a year? Don't really understand why the 8GB sticks error out so quickly, while the 4GB sticks seem stable...

Forge wrote:
Given your clean memtest runs, I'd suggest checking the disk for problems. I saw similar issues a few weeks back when a coworker's SSD was on the way out.


Thanks for the suggestion. I actually had installed a new SSD just a few weeks before this all started happening. I eventually pulled that drive, and returned it (thinking it was causing the issues). Then I went back to my previous drive (with the previous untouched Windows installation still there) - but continued having issues. I currently have installed a new SSD (as of about a week ago), with a fresh copy of W7x64 Pro directly from an Original OEM disc. Disk checks have revealed no faults with this drive yet...
 
Flying Fox
Gerbil God
Posts: 25690
Joined: Mon May 24, 2004 2:19 am

Re: Help identifying possible faulty CPU

Mon Nov 04, 2013 6:06 pm

Seems like RAM, or just failure when multiple sticks are in play. I would say that 3 passes of Memtest are not enough. I do those overnight with at least 20 passes.
The Model M is not for the faint of heart. You either like them or hate them.

Gerbils unite! Fold for UnitedGerbilNation, team 2630.
 
HorseIicious
Gerbil First Class
Topic Author
Posts: 128
Joined: Tue Aug 16, 2011 7:52 pm
Location: Florida

Re: Help identifying possible faulty CPU

Mon Nov 04, 2013 6:29 pm

Flying Fox wrote:
Seems like RAM, or just failure when multiple sticks are in play. I would say that 3 passes of Memtest are not enough. I do those overnight with at least 20 passes.


Yeah. Seems like it I guess. Maybe b/c each DIMM is 8GB it takes longer in memtest. I think 20 passes would probably take me over 2 days... I let one run overnight, and it was at 8+ hours and only 4 passes (and each pass seemingly took longer than the previous).
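As a rough check on that estimate, here is the arithmetic as a small sketch. It assumes every pass takes the same time, which understates the total if later passes really do run longer; `estimate_hours` is a hypothetical helper.

```python
def estimate_hours(observed_hours, observed_passes, target_passes):
    """Linear extrapolation: hours per observed pass times the number of passes wanted."""
    return observed_hours / observed_passes * target_passes


hours = estimate_hours(observed_hours=8, observed_passes=4, target_passes=20)
print(f"{hours:.0f} hours, about {hours / 24:.1f} days")
```

That gives 40 hours, or about 1.7 days, as a lower bound, so "over 2 days" is plausible once the slower later passes are counted.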

I am going to run the system with only the 8GB (2x4GB) that appears to be working fine (3 hours of Prime95 now, and still good). If the crashes/errors stop entirely then I'll put the other pair of matching 2x4GB sticks I have in (to be at 16GB 4x4GB) and test it like that for a bit (to see if I can rule out all motherboard/DIMM slot issues).

I guess if that all checks out then like you said, it must be the RAM sticks.

In the meantime, if anyone has any other test to try, let me know.
 
zenlessyank
Gerbil
Posts: 88
Joined: Tue Oct 16, 2012 5:15 pm

Re: Help identifying possible faulty CPU

Mon Nov 04, 2013 6:31 pm

I had a similar experience with my ASRock Extreme3 mainboard...seemed like memory errors yet they checked out in another machine...had high end Corsair PSU...tested everything...in the end it was a flaky motherboard...found a newer MSI board and haven't looked back in 2 years. Unless you have a black burnt spot on the CPU, I highly doubt your CPU is bad.

ASRock have improved their boards now, but earlier ones were somewhat flaky
MSI X58A-GD65, [email protected], EVGA 660 GTX SC SLI, Neutron 240 GB,2.75 TB Spindle Storage,ASUS DVD-RW, 6GB Patriot, Win 7 & 8.1 & Kubuntu Triple Boot, ASUS VK278 on DisplayPort.
 
HorseIicious
Gerbil First Class
Topic Author
Posts: 128
Joined: Tue Aug 16, 2011 7:52 pm
Location: Florida

Re: Help identifying possible faulty CPU

Mon Nov 04, 2013 8:46 pm

zenlessyank wrote:
I had a similar experience with my asrock extreme3 mainboard...seemed like memory errors yet they checked out in another machine...had high end corsair PS ...tested everything....in the end it was a flaky motherboard... found a newer MSI board and haven't looked back in 2 years. Unless you have black burnt spot on CPU, I highly doubt your CPU is bad.

ASRock have improved their boards now, but earlier ones were somewhat flaky


Thanks for the input. I think you're probably right. Despite the fact that I had no issues for so long, I'm beginning to think that it's either the board itself, or at the very least some sort of incompatibility between the board and the 8GB DIMMs. I'm going on 5 hours now of Prime95 running smoothly with no errors w/ the 8GB (2x4GB). So regardless, I feel confident ruling out CPU failure (phew). I really didn't think CPU failure was likely; it was just that all of the tests I had run to that point seemed to rule everything else out.

Anyway, I'll keep testing, and see if I can pinpoint where the exact problem is.
 
Forge
Lord High Gerbil
Posts: 8253
Joined: Wed Dec 26, 2001 7:00 pm
Location: Gone

Re: Help identifying possible faulty CPU

Mon Nov 04, 2013 9:43 pm

Glad it's not the disk. Those are not fun to recover.

The board is a likely suspect. I had a Z68 mobo last year, flaky POS. Made me quite upset, with troubleshooting mobo+CPU being pretty much one piece anymore.

I got a Z77 in a fit of pique and have been much happier. Having native USB3 instead of an add-in controller is nice, too.
Please don't edit my signature for me. Thanks.
