S.M.A.R.T. data worth concern?

Posted: Sun Jul 15, 2012 1:52 pm
by canoli
I am curious about 1 particular entry - the UltraDMA CRC Error Count.

There's plenty of info on the internet about this error so I don't need an explanation or anything - I'm looking for advice.

It's a brand new (5 days old) WD Caviar Black 2TB (WD2002FAEX) and HD Tune is reporting a "Warning" for the CRC Error Count.

My question is: would you RMA the drive based on that alone? Probably not, right? You'd monitor it and see if it got worse. If it remained stable you'd probably keep the drive, chalk it up to ... whatever. (btw the error count is 11)

But what if the drive was also having nothing but trouble staying online? Simple ops (copying files, running chkdsk or defrag) fail to complete and hang the system, forcing reboots.

I haven't ruled out other causes - maybe the drive is perfectly fine ... I'm waiting on an external enclosure so I can test it outside the system. But after all the trouble, and now seeing this S.M.A.R.T. data warning, it feels like the last straw to me ... I've owned the drive for 5 days. I've rebooted more times in those 5 days than I have in the 2 years since I put the system together.

(btw the WD drive, a BD drive, headphones - all were tightly packed into a sturdy box with bubble wrap)

thanks for sharing any thoughts or ideas, I appreciate it!

Re: S.M.A.R.T. data worth concern?

Posted: Sun Jul 15, 2012 2:30 pm
by Meadows
If the error count doesn't go higher from there, you can chalk it up to the lady of the lake.

If it does, RMA it.
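
A rough way to do that "watch the count" check, sketched in Python. This assumes you capture the text output of smartmontools' `smartctl -A` (the thread itself used HD Tune, so that is a substitution), and that the drive reports the counter as attribute 199. The helper names `crc_error_count` and `verdict` are made up for illustration, and the sample output is illustrative apart from the raw value of 11 quoted above.

```python
from typing import Optional

# Illustrative sample of `smartctl -A` output; only the raw value (11)
# comes from the thread, the other columns are placeholders.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       120
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       11
"""

def crc_error_count(smartctl_output: str) -> Optional[int]:
    """Return the raw value of attribute 199, or None if it is not listed."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # An attribute row has 10 columns; the raw value is the last one.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None

def verdict(baseline: int, current: int) -> str:
    """Compare a saved baseline against a later reading."""
    return "rising" if current > baseline else "stable"

baseline = crc_error_count(SAMPLE)
print(baseline, verdict(baseline, baseline))
```

Save the first reading as the baseline, then rerun after a few days of normal use. The raw count never goes back down, so what matters is whether it keeps climbing.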

Re: S.M.A.R.T. data worth concern?

Posted: Sun Jul 15, 2012 2:53 pm
by canoli
Thanks that's pretty much what I figured.

With all the trouble this drive is having .... now this ... well I guess I have to wait, test it in an external cage.

Re: S.M.A.R.T. data worth concern?

Posted: Sun Jul 15, 2012 3:39 pm
by just brew it!
DMA CRC errors could be due to a bad (or loose) cable.

Re: S.M.A.R.T. data worth concern?

Posted: Sun Jul 15, 2012 4:01 pm
by canoli
Thanks, and I haven't tried a fresh (new) cable yet. I'm waiting on an order from monoprice...hopefully the solution is as simple as that. :)

Re: S.M.A.R.T. data worth concern?

Posted: Mon Jul 16, 2012 10:04 am
by Flying Fox
Run the full test in WD Diagnostic yet?

Re: S.M.A.R.T. data worth concern?

Posted: Mon Jul 16, 2012 12:05 pm
by ludi
+1 for a bad or loose data cable (first), a flaky drive (second), or on a really long shot, maybe a coal-mine canary indicating that something is about to go astonishingly bad in your power supply.

Re: S.M.A.R.T. data worth concern?

Posted: Mon Jul 16, 2012 1:04 pm
by Chrispy_
ludi wrote:
+1 for a bad or loose data cable (first), a flaky drive (second), or on a really long shot, maybe a coal-mine canary indicating that something is about to go astonishingly bad in your power supply.


I hate to scaremonger, but ludi's long shot is not as long as it sounds. We had a batch of faulty PSUs in some workstations over the last couple of years and weird disk behaviour became an early warning system.

Re: S.M.A.R.T. data worth concern?

Posted: Mon Jul 16, 2012 1:14 pm
by Meadows
Chrispy_ wrote:
I hate to scaremonger, but ludi's long shot is not as long as it sounds. We had a batch of faulty PSUs in some workstations over the last couple of years and weird disk behaviour became an early warning system.

I have a problematic drive here myself (that I'm about to try and get replaced). I doubted the PSU theory in my case because I have three other drives that are completely asymptomatic. The same issue persisted across two motherboards and limitless cable configurations, regardless of the SATA connector I used.

Re: S.M.A.R.T. data worth concern?

Posted: Mon Jul 16, 2012 2:04 pm
by P5-133XL
You say you know what the error means, but you are asking the wrong questions, because you are asking for opinions about the validity of your hard drive. This error has nothing to do with the HD, so your HD is fine and so is your MB/computer, but potentially not the data on the HD. The MB controller and the HD controller each add a Cyclic Redundancy Check (CRC) to the data they send over the cable; the receiving end recomputes the check and notes any errors. A high UltraDMA CRC error count means that data is being corrupted between the MB controller and the HD, and the only things between those two points are the cable and the plugs. This has nothing to do with the validity of your HD.

Replace the cable (best) or at a minimum re-route it to avoid electromagnetic interference from other stuff.

Normally, this error won't cause a data corruption problem, because when corruption is spotted the HD and MB controllers just keep trying to resend until the data gets through correctly. You should check the Windows system event logs for any delayed-write or disk errors, because if the cable is bad enough that the data never gets through even with retries, Windows will time out, and that will potentially mean something is corrupted.
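
To make the check-and-resend idea concrete, here is a toy simulation in Python. It is only a sketch: `zlib.crc32` stands in for the checksum the real link uses, the "cable" flips one bit at random, and the names `send_frame`, `noisy_cable`, `transfer`, and `max_retries` are invented for the example rather than taken from any SATA specification or driver.

```python
import random
import zlib
from typing import Tuple

def send_frame(payload: bytes) -> Tuple[bytes, int]:
    """Sender side: attach a CRC computed over the payload."""
    return payload, zlib.crc32(payload)

def noisy_cable(payload: bytes, error_rate: float, rng: random.Random) -> bytes:
    """Flip a single bit with probability error_rate (a stand-in for interference)."""
    if payload and rng.random() < error_rate:
        corrupted = bytearray(payload)
        corrupted[rng.randrange(len(corrupted))] ^= 0x01
        return bytes(corrupted)
    return payload

def transfer(payload: bytes, error_rate: float, rng: random.Random,
             max_retries: int = 5) -> Tuple[bytes, int]:
    """Resend until the receiver's CRC matches; return (data, crc_errors)."""
    crc_errors = 0
    for _ in range(max_retries + 1):
        data, crc = send_frame(payload)
        received = noisy_cable(data, error_rate, rng)
        if zlib.crc32(received) == crc:
            return received, crc_errors
        crc_errors += 1  # this is the kind of event the SMART counter tallies
    # Retries exhausted: the OS-level analogue is a timeout / disk error.
    raise TimeoutError(f"gave up after {crc_errors} CRC errors")

data, errors = transfer(b"some sector data", 0.5, random.Random(1), max_retries=50)
print(data == b"some sector data", errors)
```

With a moderately bad cable the data still arrives intact and only the error count grows. Push `error_rate` to 1.0 and every attempt fails, which is the timeout case worth looking for in the event logs.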

Re: S.M.A.R.T. data worth concern?

Posted: Mon Jul 16, 2012 2:13 pm
by just brew it!
P5-133XL wrote:
You say you know what the error means but you are asking the wrong questions because you are asking opinions about the validity of your hard drive. This error has nothing to do with the HD, so your HD is fine and so is your MB/computer. The MB controller and the HD controller add a Cyclic Redundancy Check (CRC) to the signal that enters the cable and then when the data is received it is decoded and any errors noted. A high UltraDMA CRC error count means that your data between the MB controller and the HD is being corrupted and the only thing between those two points is the cable and the plugs. This has nothing to do with the validity of your HD.

Replace the cable (best) or at the minimum re-route it to avoid electromagnetic interference from other stuff.

While I agree that the cable is the most likely culprit, it isn't as cut-and-dried as you're implying. A flaky SATA controller or a flaky logic board on the HD can also cause CRC errors.

Re: S.M.A.R.T. data worth concern?

Posted: Mon Jul 16, 2012 2:17 pm
by P5-133XL
just brew it! wrote:
P5-133XL wrote:
You say you know what the error means but you are asking the wrong questions because you are asking opinions about the validity of your hard drive. This error has nothing to do with the HD, so your HD is fine and so is your MB/computer. The MB controller and the HD controller add a Cyclic Redundancy Check (CRC) to the signal that enters the cable and then when the data is received it is decoded and any errors noted. A high UltraDMA CRC error count means that your data between the MB controller and the HD is being corrupted and the only thing between those two points is the cable and the plugs. This has nothing to do with the validity of your HD.

Replace the cable (best) or at the minimum re-route it to avoid electromagnetic interference from other stuff.

While I agree that the cable is the most likely culprit, it isn't as cut-and-dried as you're implying. A flaky SATA controller or a flaky logic board on the HD can also cause CRC errors.


If it is the controller or the HD miscalculating the CRCs, then nothing will get through and you'll just have a dead drive, or no drives (if it is the MB controller with the problem). With cables the problem is intermittent, depending upon the EM fields producing crosstalk, so a retry may get through at a different time. If the problem is at the controller level, it will happen continuously and nothing will ever get through, even with retries.

So yes it is as cut-and-dried as I said it was.

Re: S.M.A.R.T. data worth concern?

Posted: Mon Jul 16, 2012 2:24 pm
by just brew it!
P5-133XL wrote:
If it is the controller or the HD miscalculating the CRCs then nothing will get through and you'll just have a dead drive.

Intermittent hardware failures can (and do) happen. Solder connections can be flaky, chips can overheat (or supply rails can be slightly out of spec causing chips to malfunction), etc.

Re: S.M.A.R.T. data worth concern?

Posted: Wed Jul 25, 2012 9:29 am
by canoli
Update:

Looks like it was indeed the cables that were causing the trouble - on both drives.

Once I hooked up the new cables (from monoprice; thanks to JustBrewIt and DancinJack for the recommendation) ... lo and behold all the problems disappeared.

The CRC Errors have stopped...copying files, HD Tune's benchmarks, Chkdsk, Defrag - all run perfectly fine now, as expected. No more strange delays on the Spinpoint either, which is really nice.


Thanks Everyone, for your help.

Re: S.M.A.R.T. data worth concern?

Posted: Wed Jul 25, 2012 10:09 am
by just brew it!
You're welcome!

It's nice when the fix for a vexing problem turns out to be a pair of 59 cent cables! :D

(I suppose it can be a little annoying too, though: "I spent all that time trying to figure this out, and it was this stupid cable all along!" :lol:)

Re: S.M.A.R.T. data worth concern?

Posted: Wed Jul 25, 2012 8:29 pm
by canoli
lol - yah I'm relieved it wasn't anything worse ... and then I think...I let this spinpoint act up for a year, never really investigating, never really trusting it ... and all along it was the #$^@ cable?!

Re: S.M.A.R.T. data worth concern?

Posted: Thu Jul 26, 2012 10:36 am
by just brew it!
I've become sensitized to this issue because we had a RAID-1 volume in a server here at work a few years back where one of the drives kept dropping out of the RAID array every few days. The culprit turned out to be the SATA cable.