just brew it! wrote:It seems this thread necro has taken an interesting twist...
Oh, never noticed it was a necro. Not that bad though, more of a zombie thing than an undead skeleton thing. It just barely had time to start to decompose.
just brew it! wrote:Sounds like something's flaky in your network or file server. That's a nightmare waiting to happen; silent data corruption is far worse than something that fails outright, since you may not realize you have a problem until some time in the future. Depending on your backup regime, you may not even have a backup with a good copy of the file any more (yes, this is an argument for archiving some of your backups permanently instead of recycling the media).
Flaky NIC, flaky network switch, or bad RAM in the file server (does the server have error-correcting RAM?) are a few potential culprits that come to mind.
That's the weird thing: I've done a few tests with copying, reading, writing, doing a parity check afterwards, etc. Nothing except Lightroom seems to be affected. Unfortunately, it's not like I have parity data for all my files just lying around, which is why I'm looking at a parity file generation scheme right now, to be able to actually test it.
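For what it's worth, here is a rough sketch of the kind of check I have in mind. Note that it is a plain SHA-256 checksum manifest and not real parity: it can tell me that a file has changed, but it cannot repair anything. The function names and the JSON manifest format are just my own assumptions for illustration, not something from an existing tool.

```python
import hashlib
import json
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large raw files are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing or no longer match."""
    bad = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or sha256_of(full) != expected:
            bad.append(rel)
    return sorted(bad)


def save_manifest(manifest, manifest_path):
    """Write the manifest as JSON; keep it outside the tree it describes."""
    with open(manifest_path, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)


def load_manifest(manifest_path):
    """Read back a manifest written by save_manifest."""
    with open(manifest_path, "r", encoding="utf-8") as handle:
        return json.load(handle)
```

The idea would be to run build_manifest once on a folder I know is good, save the result somewhere other than the share, and then run verify_manifest after each Lightroom session to see exactly which files changed and when.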
As far as this goes, it seems like it could be a couple of things.
* Option 1 - Lightroom itself is, for some reason, corrupting the files in the process of rendering the preview, which would mean it has some problem working against a folder on a network share.
* Option 2 - Some of the images were corrupted by errors during my restore of a failed drive in the array a few months ago.
* Option 3 - Problems with the RAID.
* Option 4 - Problem with the network.
* Option 5 - Problems with the server.
The thing is, the images I'm working with are my set of photos from my Christmas New Zealand trip, and I'm going through them for the third time in my iterative sorting approach. The first iteration is setting a single star and leaving the stuff not worth anything as unrated. The next iteration is rating up the shots that are better, etc. The 3rd rating marks stuff that might be shown to the people I was traveling with. The 4th rating is stuff put up for everybody, Facebook, etc. The 5th rating is stuff worthy of being put into the portfolio. So all the files that are corrupted have been read and shown fine at an earlier point. And yeah, as you say, silent corruption is far worse than anything else, and something that scares the bejesus out of me.
The thing is, I don't remember if I did the last iteration before I did the RAID restore. So it might be option 2 above... otherwise option 1 could be more likely. At this point I've ruled out options 3 and 4, since neither seems that likely when nothing else that I've found is affected. It's only Lightroom using raw images, and after the corruption, all other programs show the same raw corruption. Although I haven't really looked, I also haven't found a way to re-read the raw data and regenerate a new preview. And since I have very little storage on my workstation, only programs, games and a few bunches of txt documents, all the data I use comes from the file server, so option 3 or 4 should show its effect elsewhere too. And I don't see any errors in the port counters on the switch (Netgear GS-108T), etc. The backup was taken from the RAID array, and since the backups are fine, the corruption hasn't always been there.
As for option 5, yeah, there are a few issues with the server. When I rebuilt my workstation I reinstalled the server with Win 2008 R2 (from 2003), put in another 4 GB of RAM and switched the dual-core CPU to a quad-core. Sometimes I get hit with a bluescreen on that server, once every month or so perhaps... and I haven't really done anything about that yet. Might be a BIOS setting, might be the extra RAM, the 2008 drivers, etc. The old 2003 install was rock stable, so the mobo and all the add-in cards are good.
Sometime when I have the time I might get around to seeing what I can do about it. Right now I just recopy a file from a backup after it has been corrupted. Any and all ideas are very welcome, though.
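Since the backups are fine, one thing I could do in the meantime is compare the live folder against the backup byte for byte, to find the damaged files before Lightroom stumbles over them. A minimal sketch, assuming both trees have the same layout; the function name is made up for illustration, and a file I have deliberately changed since the backup would of course show up as a mismatch too.

```python
import filecmp
import os


def find_mismatches(live_root, backup_root):
    """Compare every file in the backup against the same path in the live tree.

    Returns (differing, missing): relative paths whose bytes differ, and
    relative paths that exist in the backup but not in the live tree.
    """
    differing = []
    missing = []
    for dirpath, _dirnames, filenames in os.walk(backup_root):
        for name in filenames:
            backup_file = os.path.join(dirpath, name)
            rel = os.path.relpath(backup_file, backup_root)
            live_file = os.path.join(live_root, rel)
            if not os.path.isfile(live_file):
                missing.append(rel)
            # shallow=False forces a real content comparison instead of
            # trusting file size and modification time alone.
            elif not filecmp.cmp(live_file, backup_file, shallow=False):
                differing.append(rel)
    return sorted(differing), sorted(missing)
```

Anything in the first list is a candidate for recopying from the backup. Reading both copies over the network also doubles as a crude test of options 3 and 4, since a flaky link or array should produce mismatches outside the Lightroom folders as well.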
just brew it! wrote:Yeah, that's another big caveat with RAID. You need to have a mechanism in place to notify you when drive failures occur; otherwise, RAID can cause more problems than it solves by laying time bombs for you!
Yeah, that's also on my to-do list, right after I have automated my weekly full backups of the RAID. Right now I have automatic backups of all system data to the RAID. The RAID gets backed up, or really synced, to my NAS with FreeFileSync manually on a biweekly/triweekly basis, but I've started setting up the automated batch profiles that are going into the scheduler.
As for the rest one "should" have, most of it is on the back burner, but I would love to have a nice syslog for all the networking/firewall and system stuff, together with decent graphing (Cacti) and a nice SMS/mail alarm system. Still, for a home system, some of it is pretty overkill, which is why I haven't done it yet.