Crucial M4 SSD failure

Posted on Fri Oct 04, 2013 11:23 am

Well, after a few years of use my 512GB SSD finally decided to give up. Unfortunately it did so without any obvious warning. I only noticed when I recently attempted a full backup and every utility I tried, including the built-in Windows backup tool, failed, producing random "The IO operation at logical block address e953b98 for Disk 0 was retried" error messages in the Event Log. (And before you say anything: yes, I did try connecting this SSD to different controllers in different PCs. That didn't change anything, and my other drives still work perfectly on the same controller.) Fortunately I was still able to manually copy most of the data (a few files actually got corrupted, but they were not very important), and I also had old images of this drive on my external enclosure. Interestingly enough, CheckDisk (chkdsk) does not find any errors regardless of the switches used with it, and the "S.M.A.R.T" (more like "STUPID") thingie still shows the drive as "healthy" even with some suspicious-looking "Raw Values":

[Image: screenshot of the drive's S.M.A.R.T. attributes]
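
For reference, the logical block address in the Event Log message above can be turned into a byte offset on the disk. This is only a minimal sketch: it assumes the address is hexadecimal and that the disk reports 512-byte logical sectors, neither of which is stated in the message itself. The function name is made up for illustration.

```python
# Assumptions (not stated in the Event Log message): the address is
# hexadecimal and the disk uses 512-byte logical sectors.
SECTOR_BYTES = 512


def lba_to_byte_offset(lba_hex: str, sector_bytes: int = SECTOR_BYTES) -> int:
    """Convert a hexadecimal logical block address to a byte offset."""
    return int(lba_hex, 16) * sector_bytes


offset = lba_to_byte_offset("e953b98")
print(f"sector number: {int('e953b98', 16):,}")
print(f"byte offset:   {offset:,} (about {offset / 2**30:.1f} GiB into the disk)")
```

Running it against each address that shows up in the log would show whether the errors cluster in one region of the drive or are spread out.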
My subscription allows you people to exist on this site and makes me a better human being than you'll ever be
JohnC
Gerbil Jedi
Gold subscriber
 
 
Posts: 1881
Joined: Fri Jan 28, 2011 2:08 pm
Location: NY/NJ/FL

Re: Crucial M4 SSD failure

Posted on Fri Oct 04, 2013 11:43 am

Does Crucial have a tool to read the vendor-specific fields? It'd be interesting if their tool also reported healthy...
Z68XP-UD4 | 2700K @ 4.4 GHz | 16 GB | 770 | PCP&C Silencer 950 | XSPC RX360 | Heatkiller R3 | D5 + RP-452X2 | HAF 932 | 1 TB WD Black w/ SRT
Waco
Gerbil Elite
 
Posts: 744
Joined: Tue Jan 20, 2009 4:14 pm

Re: Crucial M4 SSD failure

Posted on Fri Oct 04, 2013 12:15 pm

No, they do not. At least not one that is publicly available.
JohnC
Gerbil Jedi
Gold subscriber
 
 
Posts: 1881
Joined: Fri Jan 28, 2011 2:08 pm
Location: NY/NJ/FL

Re: Crucial M4 SSD failure

Posted on Fri Oct 04, 2013 3:09 pm

I wonder if they'll shed some light on those fields and what they mean if you contact their tech support. Corsair was terrible in my experience, but perhaps Crucial will be more forthcoming.
Waco
Gerbil Elite
 
Posts: 744
Joined: Tue Jan 20, 2009 4:14 pm

Re: Crucial M4 SSD failure

Posted on Fri Oct 04, 2013 3:53 pm

Well, they are not defined in the official documents (this is why SMART utilities also give a generic description to them):
http://www.micron.com/~/media/Documents ... ibutes.pdf
Most likely these are related to the amount of data written to the drive.
JohnC
Gerbil Jedi
Gold subscriber
 
 
Posts: 1881
Joined: Fri Jan 28, 2011 2:08 pm
Location: NY/NJ/FL

Re: Crucial M4 SSD failure

Posted on Fri Dec 06, 2013 7:31 am

The amount of data written can be guesstimated from the (AD) value: 17 full cycles, which is really low. That's less than 1% of lifetime (see the (CA) attribute).

What I find useful about the screenshot is the (05) attribute. There were 8192 reallocated sectors, which is a nice round power of 2. Apparently there are 8192 overprovisioned blocks available for reallocation, the drive exhausted them all and now it cannot reallocate the 8193rd faulty block. It still works, but cannot read/use the faulty one.

I'm no expert, but my guess would be one of the chips died catastrophically (at least a part of it) and the drive ate up the overprovisioned blocks, probably in one go.
jurc11
Gerbil In Training
 
Posts: 4
Joined: Fri Dec 06, 2013 7:21 am

Re: Crucial M4 SSD failure

Posted on Fri Dec 06, 2013 7:52 am

Sounds like file system corruption to me.

Are you overclocking the original system by chance?
Ivy Bridge i5-3570K@4.0Ghz, Gigabyte Z77X-UD3H, 2x4GiB of PC-12800, EVGA 660Ti, Corsair CX-600 and Fractal Refined R4 (W). Kentsfield Q6600@3Ghz, HD 4850 2x2GiB PC2-6400, Gigabyte EP45-DS4P, OCZ Modstream 700W, and PC-7B.
Krogoth
Maximum Gerbil
Silver subscriber
 
 
Posts: 4402
Joined: Tue Apr 15, 2003 3:20 pm
Location: somewhere on Core Prime

Re: Crucial M4 SSD failure

Posted on Fri Dec 06, 2013 10:41 am

jurc11 wrote: What I find useful about the screenshot is the (05) attribute. There were 8192 reallocated sectors, which is a nice round power of 2. Apparently there are 8192 overprovisioned blocks available for reallocation, the drive exhausted them all and now it cannot reallocate the 8193rd faulty block. It still works, but cannot read/use the faulty one.


Or it means that the drive has failed one bad flash page (4KB), relocated it, and is reporting how many 512-byte sectors it moved when it did that.

Assuming that the drive hasn't had anything done to it, the SMART values are a little odd. If it suffered from a read failure due to a bad flash block, there should be a non-zero pending sector count as the drive should be waiting for the next write of data to the failed sectors to relocate them. Of course that assumes that it behaves, from a SMART perspective, similar to a mechanical drive. No guarantees there of course.

Patrick M
SecretSquirrel
Gerbil Jedi
Gold subscriber
 
 
Posts: 1698
Joined: Tue Jan 01, 2002 7:00 pm
Location: The Colony, TX (Dallas suburb)

Re: Crucial M4 SSD failure

Posted on Fri Dec 06, 2013 10:48 am

SecretSquirrel wrote: Or it means that the drive has failed one bad flash page (4KB) and relocated it and is reporting how many 512 byte sectors it moved when it did that.

Assuming that the drive hasn't had anything done to it, the SMART values are a little odd. If it suffered from a read failure due to a bad flash block, there should be a non-zero pending sector count as the drive should be waiting for the next write of data to the failed sectors to relocate them. Of course that assumes that it behaves, from a SMART perspective, similar to a mechanical drive. No guarantees there of course.

Patrick M


Actually there is a non-zero pending sector count. There are 2 sectors pending reallocation.

I agree with the hypothesis that the reallocation is failing because the drive had already exhausted its entire supply of 8192 spares.
jihadjoe
Gerbil Team Leader
 
Posts: 251
Joined: Mon Dec 06, 2010 11:34 am

Re: Crucial M4 SSD failure

Posted on Fri Dec 06, 2013 11:25 am

Wow, 512GB, that sucks.

Will a zero write or a format get you a working drive? Is it still under warranty?
Life doesn't change after marriage, it changes after children!
anotherengineer
Gerbil Elite
 
Posts: 546
Joined: Fri Sep 25, 2009 1:53 pm
Location: Timmins, ON Canada, Yes I know, Up in the sticks

Re: Crucial M4 SSD failure

Posted on Fri Dec 06, 2013 3:32 pm

SecretSquirrel wrote: Or it means that the drive has failed one bad flash page (4KB) and relocated it and is reporting how many 512 byte sectors it moved when it did that.


Just to clarify, my assumption that all the spares were exhausted stems from the OP's description of a single logical address appearing in the Event Log. Re-reading his post, I see that he doesn't specifically say it's the same address every time.

If it's the same address, then the spares may very well be exhausted. If there were spares available, that logical address would be reallocated to a different (working) physical address and it would no longer get reported.

Even if it's not the same address every time, if a large part of a flash chip is faulty, you'd get errors on many logical addresses, since none of them can be reallocated anymore.
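
The reasoning above can be sketched as a toy model. This is not how any real SSD firmware works (the class and its names are invented for illustration, and the pool size of 8192 is the hypothesis from this thread, not a documented figure); it only shows why an address keeps failing once a fixed spare pool runs dry.

```python
class ToySpareRemapper:
    """Toy model of a fixed spare pool; real firmware is far more complex."""

    def __init__(self, spare_count):
        self.spares_left = spare_count
        self.remapped = {}          # logical address -> index of the spare used
        self.unrecoverable = set()  # addresses that could not be remapped

    def report_bad(self, logical_address):
        """Return True if the address now points at working storage."""
        if logical_address in self.remapped:
            return True
        if self.spares_left == 0:
            # No spare left: the address stays faulty and keeps erroring.
            self.unrecoverable.add(logical_address)
            return False
        self.remapped[logical_address] = len(self.remapped)
        self.spares_left -= 1
        return True


# Hypothetical pool size taken from the discussion above.
drive = ToySpareRemapper(spare_count=8192)
results = [drive.report_bad(addr) for addr in range(8193)]
print(sum(results), "remapped;", len(drive.unrecoverable), "left faulty")
```

In this model the first 8192 faulty addresses are silently remapped, and every faulty address after that stays visible to the OS, which is the behavior the hypothesis predicts.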


jihadjoe wrote: I agree with the hypothesis that the reallocation is failing because the drive had already exhausted its entire supply of 8192 spares.


Hey, somebody agreeing with me! Wish my wife did that from time to time.

What I'm curious about is how many reallocations I'll get on my 128 gig M4. I'm on AD = 180 (6% of lifetime), with zero (01) and (05). Hope it stays that way for the next 31 years that I'll need to get to 100%.
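
The numbers above can be checked with some quick arithmetic. A rough sketch, assuming (AD) is an average erase-cycle count and that wear grows linearly with time; the post does not say how long the 128 gig drive has been in service, so the 2.0 years below is a hypothetical value picked only because it lands near the quoted 31 years.

```python
def rated_cycles(cycles_used, percent_used):
    """Cycle rating implied by a cycle count and its share of lifetime."""
    return cycles_used * 100 / percent_used


def years_to_full_wear(years_in_service, percent_used):
    """Remaining years until 100% wear, assuming linear wear over time."""
    return years_in_service * (100 - percent_used) / percent_used


rating = rated_cycles(180, 6)            # implied rating from AD = 180 at 6%
op_wear_percent = 17 / rating * 100      # the OP's 17 cycles against that rating
remaining = years_to_full_wear(2.0, 6)   # 2.0 years in service is an assumption

print(f"implied rating: {rating:.0f} cycles")
print(f"OP's drive wear: {op_wear_percent:.2f}% of lifetime")
print(f"years left at 6% after 2 years: {remaining:.1f}")
```

Under these assumptions the OP's drive had used well under 1% of its rated cycles, so ordinary wear-out looks like an unlikely explanation for the failure.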
jurc11
Gerbil In Training
 
Posts: 4
Joined: Fri Dec 06, 2013 7:21 am

Re: Crucial M4 SSD failure

Posted on Fri Dec 06, 2013 6:55 pm

jihadjoe wrote: Actually there is a non-zero pending sector count. There are 2 sectors pending reallocation.

I think you are off by a line on the data. There are two relocation events logged, but no sectors pending.
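
To make the "off by a line" point concrete, here is a small sketch of the three neighboring attributes. The IDs are the standard ATA SMART ones; the raw values are only those read out in this thread (8192 reallocated, 2 events, 0 pending), not taken from the screenshot directly, and the function is a made-up helper.

```python
# Standard ATA SMART attribute IDs.
REALLOCATED_SECTORS = 0x05   # Reallocated Sectors Count
REALLOCATION_EVENTS = 0xC4   # Reallocation Event Count
CURRENT_PENDING = 0xC5       # Current Pending Sector Count

# Raw values as discussed in this thread (assumed, not re-read from the image).
raw = {REALLOCATED_SECTORS: 8192, REALLOCATION_EVENTS: 2, CURRENT_PENDING: 0}


def summarize(values):
    """Return short notes on the reallocation-related attributes."""
    notes = []
    if values.get(REALLOCATED_SECTORS, 0) > 0:
        notes.append(f"{values[REALLOCATED_SECTORS]} sectors reallocated")
    if values.get(REALLOCATION_EVENTS, 0) > 0:
        notes.append(f"{values[REALLOCATION_EVENTS]} reallocation events")
    if values.get(CURRENT_PENDING, 0) > 0:
        notes.append(f"{values[CURRENT_PENDING]} sectors pending")
    else:
        notes.append("no sectors pending")
    return notes


for note in summarize(raw):
    print(note)
```

(C4) and (C5) sit on adjacent rows in most SMART utilities, which is an easy way to read the event count as the pending count.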

jihadjoe wrote: I agree with the hypothesis that the reallocation is failing because the drive had already exhausted its entire supply of 8192 spares.

Micron flash uses 2048-byte pages and 128K blocks, so two failed pages would equal 8192 standard 512-byte sectors. 8192 blocks would be 1GB. If that is the case, then it sounds more like an entire flash chip failed. That would be the easiest way to get 1GB of relocated sectors in only two events.
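
A quick arithmetic check of these figures, taking the page and block sizes at face value (they are the numbers quoted in this post, not checked against a datasheet). By this arithmetic the two readings of "8192" differ a lot in scale: 8192 sectors of 512 bytes come to 4 MiB, which is many more than two 2048-byte pages, whereas 8192 blocks of 128K do come to 1 GiB.

```python
SECTOR = 512          # bytes per standard logical sector
PAGE = 2048           # bytes per flash page, as quoted in the post
BLOCK = 128 * 1024    # bytes per flash block, as quoted in the post
REALLOCATED = 8192    # raw value of attribute (05) discussed in the thread

sectors_per_page = PAGE // SECTOR
sectors_per_block = BLOCK // SECTOR
bytes_if_sectors = REALLOCATED * SECTOR   # if (05) counts 512-byte sectors
bytes_if_blocks = REALLOCATED * BLOCK     # if (05) counts whole flash blocks

print(f"sectors per page:  {sectors_per_page}")
print(f"sectors per block: {sectors_per_block}")
print(f"8192 sectors = {bytes_if_sectors / 2**20:.0f} MiB")
print(f"8192 blocks  = {bytes_if_blocks / 2**30:.0f} GiB")
```

Which unit the drive actually counts in attribute (05) is not established anywhere in this thread, so both totals are shown.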

It is all kinda academic as the real question is "is the drive toast, and if not, can you trust it?" My answers are: 1) hard to say without more details and 2) nope.

--SS
SecretSquirrel
Gerbil Jedi
Gold subscriber
 
 
Posts: 1698
Joined: Tue Jan 01, 2002 7:00 pm
Location: The Colony, TX (Dallas suburb)

