S.M.A.R.T. data worth concern?


Posted on Sun Jul 15, 2012 1:52 pm

I am curious about one particular entry: the UltraDMA CRC Error Count.

There's plenty of info on the internet about this error so I don't need an explanation or anything - I'm looking for advice.

It's a brand new (5 days old) WD Caviar Black 2TB (WD2002FAEX) and HD Tune is reporting a "Warning" for the CRC Error Count.

My question is: would you RMA the drive based on that alone? Probably not, right? You'd monitor it and see if it got worse. If it remained stable you'd probably keep the drive and chalk it up to ... whatever. (btw the error count is 11)

But what if the drive was having nothing but trouble staying online? Simple ops (copying files, running chkdsk or defrag) won't complete; they hang the system and force a reboot.

I haven't ruled out other causes, and maybe the drive is perfectly fine... I'm waiting on an external enclosure so I can test it outside the system. But after all the trouble, and now seeing this S.M.A.R.T. data warning, it feels like the last straw to me ... I've owned the drive for 5 days. I've rebooted more times in those 5 days than I have in the 2 years since I put the system together.

(btw the WD drive, a BD drive, headphones - all were tightly packed into a sturdy box with bubble wrap)

thanks for sharing any thoughts or ideas, I appreciate it!
canoli
Gerbil XP
Silver subscriber
 
 
Posts: 332
Joined: Fri Jul 18, 2008 9:55 pm

Re: S.M.A.R.T. data worth concern?

Posted on Sun Jul 15, 2012 2:30 pm

If the error count doesn't go higher from there, you can chalk it up to the lady of the lake.

If it does, RMA it.
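
For anyone who wants to script the "watch the count" approach, here is a rough sketch. It assumes smartmontools' `smartctl -A` attribute table rather than HD Tune (which the original poster used), and that the CRC counter shows up as attribute ID 199 with the raw value in the last column. The sample text is made up to mirror the count of 11 mentioned above, and the function names are hypothetical.

```python
# Illustrative sample only: shaped like a `smartctl -A` attribute table,
# not real output captured from the drive in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       11
"""


def crc_error_count(smart_text):
    """Return the raw value of attribute 199, or None if the row is absent."""
    for line in smart_text.splitlines():
        fields = line.split()
        # A full attribute row has 10 columns; the raw value is the last one.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


def verdict(previous, current):
    """Compare two readings taken some time apart."""
    return "rising" if current > previous else "stable"


if __name__ == "__main__":
    count = crc_error_count(SAMPLE)
    print("UDMA CRC errors:", count)
    print("trend vs. last reading of 11:", verdict(11, count))
```

Since the counter only ever goes up, a single nonzero reading says little; what matters is whether two readings taken days apart differ.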
Meadows
Grand Gerbil Poohbah
Silver subscriber
 
 
Posts: 3161
Joined: Mon Oct 08, 2007 1:10 pm
Location: Location: Location

Re: S.M.A.R.T. data worth concern?

Posted on Sun Jul 15, 2012 2:53 pm

Thanks, that's pretty much what I figured.

With all the trouble this drive is having .... now this ... well I guess I have to wait, test it in an external cage.
canoli
Gerbil XP
Silver subscriber
 
 
Posts: 332
Joined: Fri Jul 18, 2008 9:55 pm

Re: S.M.A.R.T. data worth concern?

Posted on Sun Jul 15, 2012 3:39 pm

DMA CRC errors could be due to a bad (or loose) cable.
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37673
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: S.M.A.R.T. data worth concern?

Posted on Sun Jul 15, 2012 4:01 pm

Thanks, and I haven't tried a fresh (new) cable yet. I'm waiting on an order from monoprice...hopefully the solution is as simple as that. :)
canoli
Gerbil XP
Silver subscriber
 
 
Posts: 332
Joined: Fri Jul 18, 2008 9:55 pm

Re: S.M.A.R.T. data worth concern?

Posted on Mon Jul 16, 2012 10:04 am

Have you run the full test in WD Diagnostic yet?
The Model M is not for the faint of heart. You either like them or hate them.

Gerbils unite! Fold for UnitedGerbilNation, team 2630.
Flying Fox
Gerbil God
 
Posts: 24378
Joined: Mon May 24, 2004 2:19 am

Re: S.M.A.R.T. data worth concern?

Posted on Mon Jul 16, 2012 12:05 pm

+1 for a bad or loose data cable (first), a flaky drive (second), or on a really long shot, maybe a coal-mine canary indicating that something is about to go astonishingly bad in your power supply.
He who laughs last, laughs first next time.
ludi
Gerbil Elder
 
Posts: 5431
Joined: Fri Jun 21, 2002 10:47 pm
Location: Sunny Colorado front range

Re: S.M.A.R.T. data worth concern?

Posted on Mon Jul 16, 2012 1:04 pm

ludi wrote:+1 for a bad or loose data cable (first), a flaky drive (second), or on a really long shot, maybe a coal-mine canary indicating that something is about to go astonishingly bad in your power supply.


I hate to scaremonger, but ludi's long shot is not as long as it sounds. We had a batch of faulty PSUs in some workstations over the last couple of years, and weird disk behaviour became an early warning system.
<insert large, flashing, epileptic-fit-inducing signature (based on the latest internet-meme) here>
Chrispy_
Gerbil Jedi
Gold subscriber
 
 
Posts: 1841
Joined: Fri Apr 09, 2004 3:49 pm

Re: S.M.A.R.T. data worth concern?

Posted on Mon Jul 16, 2012 1:14 pm

Chrispy_ wrote: I hate to scaremonger, but ludi's long shot is not as long as it sounds. We had a batch of faulty PSUs in some workstations over the last couple of years, and weird disk behaviour became an early warning system.

I have a problematic drive here myself (one I'm about to try to get replaced). I doubted the PSU theory in my case because I have three other drives that are completely asymptomatic. The same issue persisted across two motherboards and countless cable configurations, regardless of the SATA connector I used.
Meadows
Grand Gerbil Poohbah
Silver subscriber
 
 
Posts: 3161
Joined: Mon Oct 08, 2007 1:10 pm
Location: Location: Location

Re: S.M.A.R.T. data worth concern?

Posted on Mon Jul 16, 2012 2:04 pm

You say you know what the error means, but you are asking the wrong question: you are asking for opinions about the validity of your hard drive. This error has nothing to do with the HD, so your HD is fine and so is your MB/computer, though potentially not the data on the HD. The MB controller and the HD controller each add a Cyclic Redundancy Check (CRC) to the data that enters the cable; when the data is received, the CRC is checked and any errors are noted. A high UltraDMA CRC error count means that data is being corrupted between the MB controller and the HD, and the only thing between those two points is the cable and the plugs. This has nothing to do with the validity of your HD.

Replace the cable (best) or, at a minimum, re-route it to avoid electromagnetic interference from other stuff.

Normally, this error won't cause a data corruption problem, because when corruption is spotted the HD and MB controllers just keep resending until the data gets through correctly. You should check the Windows system event logs for any delayed-write or disk errors: if the cable is bad enough that the data never gets through even with retries, Windows will time out, and that potentially means something is corrupted.
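
To make the check-and-resend idea concrete, here is a toy simulation. It is only a sketch: `zlib.crc32` stands in for the link's CRC, the "cable" is a function that flips one random bit with some probability, and the names `send_frame` and `transfer` are made up for illustration. It does not model real SATA framing or timing.

```python
import random
import zlib


def send_frame(payload, flip_probability, rng):
    """Sender computes a CRC; the 'cable' may flip one bit in transit."""
    crc = zlib.crc32(payload)
    data = bytearray(payload)
    if rng.random() < flip_probability:
        bit = rng.randrange(len(data) * 8)
        data[bit // 8] ^= 1 << (bit % 8)
    return bytes(data), crc


def transfer(payload, flip_probability, max_retries, rng):
    """Resend until the CRC matches. Returns (delivered_ok, crc_errors)."""
    crc_errors = 0
    for _ in range(max_retries + 1):
        data, crc = send_frame(payload, flip_probability, rng)
        if zlib.crc32(data) == crc:
            return True, crc_errors
        crc_errors += 1  # this is what the error counter is tallying
    return False, crc_errors  # gave up: the OS would see a timeout


if __name__ == "__main__":
    rng = random.Random(2012)
    sector = bytes(512)
    for p in (0.0, 0.3, 1.0):
        ok, errors = transfer(sector, p, max_retries=3, rng=rng)
        print(f"flip probability {p}: delivered={ok}, crc_errors={errors}")
```

With a marginal link the transfer usually succeeds after a retry or two and only the counter shows anything; with a link that corrupts every frame, the retries run out, which corresponds to the time-out case described above.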
Last edited by P5-133XL on Mon Jul 16, 2012 2:16 pm, edited 1 time in total.
Put those spare CPU/GPU cycles to good use - Folding@Home
P5-133XL
Gerbil
 
Posts: 71
Joined: Fri Apr 18, 2008 4:52 am

Re: S.M.A.R.T. data worth concern?

Posted on Mon Jul 16, 2012 2:13 pm

P5-133XL wrote: You say you know what the error means, but you are asking the wrong question: you are asking for opinions about the validity of your hard drive. This error has nothing to do with the HD, so your HD is fine and so is your MB/computer. The MB controller and the HD controller each add a Cyclic Redundancy Check (CRC) to the data that enters the cable; when the data is received, the CRC is checked and any errors are noted. A high UltraDMA CRC error count means that data is being corrupted between the MB controller and the HD, and the only thing between those two points is the cable and the plugs. This has nothing to do with the validity of your HD.

Replace the cable (best) or, at a minimum, re-route it to avoid electromagnetic interference from other stuff.

While I agree that the cable is the most likely culprit, it isn't as cut-and-dried as you're implying. Flaky SATA controller or flaky logic board on the HD can also cause CRC errors.
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37673
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: S.M.A.R.T. data worth concern?

Posted on Mon Jul 16, 2012 2:17 pm

just brew it! wrote:
P5-133XL wrote: You say you know what the error means, but you are asking the wrong question: you are asking for opinions about the validity of your hard drive. This error has nothing to do with the HD, so your HD is fine and so is your MB/computer. The MB controller and the HD controller each add a Cyclic Redundancy Check (CRC) to the data that enters the cable; when the data is received, the CRC is checked and any errors are noted. A high UltraDMA CRC error count means that data is being corrupted between the MB controller and the HD, and the only thing between those two points is the cable and the plugs. This has nothing to do with the validity of your HD.

Replace the cable (best) or, at a minimum, re-route it to avoid electromagnetic interference from other stuff.

While I agree that the cable is the most likely culprit, it isn't as cut-and-dried as you're implying. Flaky SATA controller or flaky logic board on the HD can also cause CRC errors.


If it is the controller or the HD miscalculating the CRCs, then nothing will get through and you'll just have a dead drive, or no drives at all if it is the MB controller with the problem. With cables the problem is intermittent, depending on the EM fields producing crosstalk, so a retry may get through at a different time. If the problem is at the controller level, it will happen continuously and nothing will ever get through, even with retries.

So yes it is as cut-and-dried as I said it was.
Put those spare CPU/GPU cycles to good use - Folding@Home
P5-133XL
Gerbil
 
Posts: 71
Joined: Fri Apr 18, 2008 4:52 am

Re: S.M.A.R.T. data worth concern?

Posted on Mon Jul 16, 2012 2:24 pm

P5-133XL wrote: If it is the controller or the HD miscalculating the CRCs, then nothing will get through and you'll just have a dead drive.

Intermittent hardware failures can (and do) happen. Solder connections can be flaky, chips can overheat (or supply rails can be slightly out of spec causing chips to malfunction), etc.
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37673
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: S.M.A.R.T. data worth concern?

Posted on Wed Jul 25, 2012 9:29 am

Update:

Looks like it was indeed the cables that were causing the trouble - on both drives.

Once I hooked up the new cables (from monoprice; thanks to JustBrewIt and DancinJack for the recommendation) ... lo and behold all the problems disappeared.

The CRC Errors have stopped...copying files, HD Tune's benchmarks, Chkdsk, Defrag - all run perfectly fine now, as expected. No more strange delays on the Spinpoint either, which is really nice.


Thanks Everyone, for your help.
canoli
Gerbil XP
Silver subscriber
 
 
Posts: 332
Joined: Fri Jul 18, 2008 9:55 pm

Re: S.M.A.R.T. data worth concern?

Posted on Wed Jul 25, 2012 10:09 am

You're welcome!

It's nice when the fix for a vexing problem turns out to be a pair of 59 cent cables! :D

(I suppose it can be a little annoying too, though: "I spent all that time trying to figure this out, and it was this stupid cable all along!" :lol:)
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37673
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: S.M.A.R.T. data worth concern?

Posted on Wed Jul 25, 2012 8:29 pm

lol - yah I'm relieved it wasn't anything worse ... and then I think... I let this Spinpoint act up for a year, never really investigating, never really trusting it ... and all along it was the #$^@ cable?!
canoli
Gerbil XP
Silver subscriber
 
 
Posts: 332
Joined: Fri Jul 18, 2008 9:55 pm

Re: S.M.A.R.T. data worth concern?

Posted on Thu Jul 26, 2012 10:36 am

I've become sensitized to this issue because we had a RAID-1 volume in a server here at work a few years back where one of the drives kept dropping out of the RAID array every few days. The culprit turned out to be the SATA cable.
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37673
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

