Troubleshooting a Crashing System

Posted on Wed Aug 31, 2011 8:49 am

I have a friend whose PC has been crashing more or less at random. Last night he brought it to me so I could try to troubleshoot it. I think I've exhausted my knowledge and resources on it, but I wanted to see if someone who knows more than I do has any ideas.

It's a P55 chipset Gigabyte motherboard, an i5-750 processor, a Sapphire Radeon 4870, two 2GB sticks of Mushkin DDR3-1333 memory, and a Seagate 500GB Spinpoint F3 hard drive, plus a basic CD-ROM drive.

Originally that computer had Obsidian series OCZ memory, but one of the two sticks was defective. Rather than RMAing it, he just used the one good stick until he got the Mushkin pair. I identified the memory issue with the OCZ set by running memtest on each stick for a few hours. One came back with problems, the other without. So I thought I had solved that successfully.

After about a full year of working (he always complained of crashing but was never very descriptive, and I discounted it as poor upkeep and bad software), he called me last night. Apparently his system was left on overnight, and when he came back to it the next day it was crashing pretty much constantly on start-up. A lot of the errors mentioned the file system, system recovery, and drivers, and he mentioned something about IRQ. I told him to try booting from his Windows CD and performing a repair, or a clean format and reinstall if that failed. It continued to crash when booting from the CD, even with the hard drive disconnected from the system.

So I had him bring it over so I could see things myself. I didn't have a lot of parts on hand to test with, but here are the configurations I tried and what did and didn't work.

First, I removed one of the memory sticks, left the hard drive disconnected, and tried to boot from the CD-ROM. This started off fine, but crashed when the "loading Windows" screen came up.

I tried this config with the other memory stick as well, and it took a little longer to crash, but still crashed.

The main obstacle to troubleshooting was that I had no way to swap out most components. The graphics card, memory, motherboard, and CPU had to stay in every configuration. The one part I had a spare of was the graphics card, since mine is an identical model, so I swapped my graphics card in...

And that worked. I was able to boot from his hard drive, or the CD, without any problems. Well... with some problems. Using the same GPU and the same monitor I use, his system had some artifacting that I have never seen on my own computer. I thought this was odd.

This made me think the GPU was the issue... but I wanted to at least try to isolate that. So I plugged his graphics card into my system (same model, same manufacturer). And it worked... perfectly normally in my system. The only difference was that after Windows loaded, it wanted to configure the new device, but it functioned fine, even running some games briefly.

After all that, it's really hard for me to say what's wrong. I'm leaning toward the motherboard. My logic is that it's highly unlikely that both sticks of memory are bad (unless they've been damaged by something in the motherboard), and the hard drive and graphics card both seem to work when taken out of that system. I'm also seeing strange distortion in the graphics when testing with an identical GPU (but then the whole system worked... which was off-putting)...
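To lay out that elimination logic, here is a minimal Python sketch. The run list is a simplified encoding of the tests described in this post, and the part names, outcome labels, and the `suspects` helper are hypothetical names made up for illustration. It also assumes a single faulty part, which may not hold.

```python
# Each run lists which of the friend's parts were installed, plus the outcome.
# Simplified from the post; labels are illustrative, not from any tool.
runs = [
    ({"motherboard", "cpu", "psu", "gpu", "ram_stick_1"}, "crash"),
    ({"motherboard", "cpu", "psu", "gpu", "ram_stick_2"}, "crash"),
    # Borrowed GPU in his system: booted, but with artifacts.
    # RAM is left out here because the post doesn't say which sticks were in.
    ({"motherboard", "cpu", "psu", "hdd"}, "artifacts"),
    # His GPU in the other PC: worked fine.
    ({"gpu"}, "ok"),
]


def suspects(runs):
    """Parts present in every bad run and never seen in a clean run."""
    bad = [parts for parts, outcome in runs if outcome != "ok"]
    good = [parts for parts, outcome in runs if outcome == "ok"]
    common = set.intersection(*bad) if bad else set()
    cleared = set().union(*good) if good else set()
    return common - cleared


print(sorted(suspects(runs)))  # ['cpu', 'motherboard', 'psu']
```

Under those assumptions the GPU is cleared, and the motherboard, CPU, and power supply all remain candidates, so more swaps would be needed to single out the motherboard.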

I really don't know what to do at this point. If I had parts, I could test more components. If he had wanted to leave it overnight, I could have tried a clean install on a spare hard drive, or run memtest overnight on the memory with my GPU installed instead of his... I also never disconnected his CD-ROM drive, now that I think about it, but... well, it worked.

Should I blame the graphics card? Even though his system still showed graphical glitches with my working card, and his card worked in my system fine?

What could I do to better understand the crashes? It seemed to crash in a dozen different ways, but only when starting to load Windows (the BIOS and early boot seemed fine). Connecting my own hard drive to his system and attempting to boot from that didn't work at all, but I don't know if that should necessarily...
Creamsteak
Gerbil First Class
 
Posts: 113
Joined: Wed May 14, 2008 6:29 pm

Re: Troubleshooting a Crashing System

Posted on Wed Aug 31, 2011 9:18 am

Start by running memtest86 overnight; that will point out defective RAM.
iMac Retina - 32GB- 1 TB SSD - M295X
elmopuddy
Gerbil Elite
Gold subscriber
 
 
Posts: 937
Joined: Thu Dec 27, 2001 7:00 pm
Location: Montreal, Canada

Re: Troubleshooting a Crashing System

Posted on Wed Aug 31, 2011 9:36 am

The power supply hasn't been mentioned, and it could cause a lot of this instability. That system shouldn't draw a lot of power, but if the PSU is failing it could be the culprit. I know it's a lot of work, but swap your PSU for his.
Hallucin8
Gerbil
 
Posts: 56
Joined: Mon Jun 28, 2010 9:12 am

Re: Troubleshooting a Crashing System

Posted on Sat Sep 03, 2011 10:13 am

I would suspect the motherboard or power supply. Check the motherboard for bulging or leaky caps and for excessive dust that might be changing the resistance between components. Maybe remount the CPU heatsink to make sure there aren't any hot spots due to poor contact, and reset the BIOS to defaults. Also get a flashlight and look inside the PCIe slots and the memory slots to make sure there is no corrosion and no bent pins (when you have strange intermittent problems, it could be something that simple).

Go into the BIOS utility and check the hardware monitor to make sure all voltages are in line with the ATX spec (http://www.formfactors.org/developer/specs/ATX12V_PSDG_2_2_public_br2.pdf page 13). If you can boot into Windows, install a utility that can monitor the voltages and write them all down. Next, run something like FurMark and see if the voltages change much or go out of spec. Let that run for a few minutes, and if it is still working OK, run something like Prime95 to stress the CPU and recheck the voltages. A weak power supply will probably show a large drop on the 12V rail, but it could also be noisy power, so your best bet is to try your power supply, run these tests again, and compare all the voltages to the previous results.

Also, make sure everything is running at stock speed and voltage. Check the motherboard manual to make sure DDR3-1333 is officially supported. Maybe try slowing the RAM down or changing slots. Hope that helps. Good luck.
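If you write the voltages down, a few lines of Python can do the comparison for you. This is only a rough sketch: the ±5% tolerance on the +12V, +5V, and +3.3V rails is what the ATX12V design guide lists, but the function names and the sample readings below are hypothetical, and BIOS or software sensor readings can be off, so treat a multimeter as the final word.

```python
# Nominal voltage and allowed tolerance (as a fraction) for the main rails.
# Assumes the +/-5% figures from the ATX12V design guide.
ATX_RAILS = {
    "+12V": (12.0, 0.05),
    "+5V": (5.0, 0.05),
    "+3.3V": (3.3, 0.05),
}


def check_rails(readings):
    """Return {rail: (reading, in_spec)} for each rail in readings."""
    report = {}
    for rail, volts in readings.items():
        nominal, tol = ATX_RAILS[rail]
        low, high = nominal * (1 - tol), nominal * (1 + tol)
        report[rail] = (volts, low <= volts <= high)
    return report


def rail_drop(idle, load):
    """Voltage drop from idle to load for rails present in both readings."""
    return {r: round(idle[r] - load[r], 3) for r in idle if r in load}


# Hypothetical readings written down at idle and while stress testing.
idle = {"+12V": 12.10, "+5V": 5.02, "+3.3V": 3.31}
load = {"+12V": 11.30, "+5V": 4.95, "+3.3V": 3.28}

for rail, (volts, ok) in check_rails(load).items():
    print(rail, volts, "OK" if ok else "OUT OF SPEC")
print(rail_drop(idle, load))
```

With these made-up numbers the 12V rail falls below the 11.40V lower limit under load, which is the kind of result that would point at the power supply.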
spiked_mistborn
Gerbil
 
Posts: 26
Joined: Fri Aug 06, 2010 11:01 pm

Re: Troubleshooting a Crashing System

Posted on Wed Sep 07, 2011 1:08 pm

Just a quick update for closure.

We tried out a replacement motherboard and everything magically worked. I tried to convince him to run memtest overnight and possibly do a clean install, but I doubt it will happen. Hopefully he at least RMAs the motherboard, since it's still under warranty. The power supply didn't seem to be the issue, but I didn't have a multimeter, so I never tested it thoroughly.
Creamsteak
Gerbil First Class
 
Posts: 113
Joined: Wed May 14, 2008 6:29 pm

