Personal computing discussed

 
Dirge
Gerbil Jedi
Topic Author
Posts: 1620
Joined: Thu Feb 19, 2004 3:08 am

The best way to backup lossless audio data?

Mon Aug 09, 2010 6:02 pm

I have just begun ripping my small CD collection with .flac lossless compression.

I got to thinking it would be best, at least for my needs, to add PAR2 files to each album and include a checksum/hashes file. I would then back up everything to two separate external drives for redundancy.

I spotted the HashCheck Shell Extension and got to wondering if I should generate a .md5 or SFV checksum file for each album. Are these checksum files standardised, so that different applications can read them?

I want to keep this discussion purely about backup and redundancy, but if you are looking for a guide to using FLAC with EAC then look no further.

*Edited for clarification*
Last edited by Dirge on Mon Aug 09, 2010 10:00 pm, edited 1 time in total.
FDISK /MBR
 
Captain Ned
Global Moderator
Posts: 28704
Joined: Wed Jan 16, 2002 7:00 pm
Location: Vermont, USA

Re: The best way to backup lossless audio data?

Mon Aug 09, 2010 6:06 pm

The md5 hash function is a standard algorithm that should produce the same output hash no matter which application created it.
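For what it's worth, here is a minimal sketch of that interchange (the filenames are placeholders I made up). The line format md5sum emits in binary mode, `<32 hex digits> *<filename>`, is the same de facto standard that HashCheck and most other checksum tools read and write:

```shell
#!/bin/bash
# Hypothetical album folder with two freshly ripped tracks (placeholder data).
mkdir -p album && cd album
printf 'track one data' > '01 - Intro.flac'
printf 'track two data' > '02 - Outro.flac'

# Create the checksum file: one "<hash> *<filename>" line per track.
md5sum -b *.flac > ../album.md5

# Later, with any compliant tool on any OS, verify the tracks against it:
md5sum --check ../album.md5    # prints "<file>: OK" per intact track
```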
What we have today is way too much pluribus and not enough unum.
 
Flying Fox
Gerbil God
Posts: 25690
Joined: Mon May 24, 2004 2:19 am
Contact:

Re: The best way to backup lossless audio data?

Mon Aug 09, 2010 6:29 pm

Wouldn't this be the exact same scenario where you want some more redundancy when you are backing up regular files as well?
The Model M is not for the faint of heart. You either like them or hate them.

Gerbils unite! Fold for UnitedGerbilNation, team 2630.
 
Dirge
Gerbil Jedi
Topic Author
Posts: 1620
Joined: Thu Feb 19, 2004 3:08 am

Re: The best way to backup lossless audio data?

Mon Aug 09, 2010 9:51 pm

Flying Fox wrote:
Wouldn't this be the exact same scenario where you want some more redundancy when you are backing up regular files as well?


I didn't have that idea in mind, but I don't see why it should be any different. In the past I have created PAR2 files for ISOs I don't want to lose. I have never investigated the process of storing checksums/hashes in checksum files.
FDISK /MBR
 
Captain Ned
Global Moderator
Posts: 28704
Joined: Wed Jan 16, 2002 7:00 pm
Location: Vermont, USA

Re: The best way to backup lossless audio data?

Mon Aug 09, 2010 10:20 pm

Dirge wrote:
I have never investigated the process of storing checksums/hashes in checksum files

Well, an md5 won't save your data, unlike PAR2 files. All it will do is tell you it's been corrupted or altered.
What we have today is way too much pluribus and not enough unum.
 
Dirge
Gerbil Jedi
Topic Author
Posts: 1620
Joined: Thu Feb 19, 2004 3:08 am

Re: The best way to backup lossless audio data?

Sat Sep 25, 2010 7:03 am

Minor thread necro, but this is relevant to my OP. I found a utility called md5deep which can compute MD5, SHA-1, SHA-256, Tiger, or Whirlpool message digests/hashes. The great thing is that in recursive mode, md5deep will hash files in the current directory and in subdirectories. It works on Windows and Linux too. I actually have it running in Cygwin and use rsync to transfer files between drives.

My idea is to add par2 files to all the important data I cannot afford to lose and then create an md5 digest for the data I want to back up. I will keep at least two backups of my data and can periodically check they are bit perfect with md5deep. If one of the backups has errors I can always recover files from the redundant backup or par files. This way I should be able to maintain a backup of my data, with some redundancy, and know I won't be transferring any corrupted files in the process.
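That periodic check could be sketched roughly like this (a hedged example: the `verify_backup` helper, the `checksums.md5` filename, and the mount points are all my own assumptions, not anything the tools mandate):

```shell
#!/bin/bash
# Verify one backup copy against the checksum list stored alongside it.
# "md5sum --check --quiet" prints only the files that fail, so a clean run
# is silent and we report success ourselves.
verify_backup() {
  local dir="$1"
  if (cd "$dir" && md5sum --check --quiet checksums.md5); then
    echo "$dir: OK"
  else
    echo "$dir: CORRUPT -- restore the flagged files from the other copy or the PAR2 set"
  fi
}

# Hypothetical mount points for the two redundant drives:
# verify_backup /mnt/backup1
# verify_backup /mnt/backup2
```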
FDISK /MBR
 
just brew it!
Administrator
Posts: 54500
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: The best way to backup lossless audio data?

Sat Sep 25, 2010 9:41 am

I've been using my own poor-man's version of md5deep for a while:
#!/bin/bash
find . -xdev -type f -or -type l | sort | xargs --delimiter=\\n md5sum -b

This script generates a list of MD5 checksums for the entire directory tree rooted at the current directory. The checksums can be redirected into a file which can then be used later (via md5sum --check) to verify the checksums.
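To illustrate the round trip (with a made-up tree; the point is that the checksum list lives outside the tree, so it never checksums itself):

```shell
#!/bin/bash
# Build a tiny placeholder music tree so the example is self-contained.
mkdir -p music/album && printf 'flac bytes' > music/album/track.flac

# Generate the sorted checksum list for the whole tree...
(cd music && find . -xdev -type f -or -type l | sort | xargs --delimiter=\\n md5sum -b) > music.md5

# ...then later, verify every file from the same directory:
(cd music && md5sum --check ../music.md5)   # prints "./album/track.flac: OK"
```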
Nostalgia isn't what it used to be.
 
Usacomp2k3
Gerbil God
Posts: 23043
Joined: Thu Apr 01, 2004 4:53 pm
Location: Orlando, FL
Contact:

Re: The best way to backup lossless audio data?

Sat Sep 25, 2010 10:53 am

Why not just put it on a good raid1/5 setup and then not worry about it?
 
just brew it!
Administrator
Posts: 54500
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: The best way to backup lossless audio data?

Sat Sep 25, 2010 11:01 am

Usacomp2k3 wrote:
Why not just put it on a good raid1/5 setup and then not worry about it?

Because RAID is about maintaining availability in the event of a hardware failure, not backup.
Nostalgia isn't what it used to be.
 
Dirge
Gerbil Jedi
Topic Author
Posts: 1620
Joined: Thu Feb 19, 2004 3:08 am

Re: The best way to backup lossless audio data?

Sat Sep 25, 2010 1:49 pm

just brew it! wrote:
I've been using my own poor-man's version of md5deep for a while:
#!/bin/bash
find . -xdev -type f -or -type l | sort | xargs --delimiter=\\n md5sum -b

This script generates a list of MD5 checksums for the entire directory tree rooted at the current directory. The checksums can be redirected into a file which can then be used later (via md5sum --check) to verify the checksums.


Nice script JBI, I could have used something like that :). Though now I have discovered the awesomeness that is md5deep. Of course I would appreciate any other ideas for archiving my data and avoiding bit rot.

I read a rather interesting article by Robin Harris the other day: 50 ways to lose your data. :o Haha *bites nails*
FDISK /MBR
 
Flying Fox
Gerbil God
Posts: 25690
Joined: Mon May 24, 2004 2:19 am
Contact:

Re: The best way to backup lossless audio data?

Fri Oct 01, 2010 6:28 pm

"for /r" will get you similar effect under NT Command Prompt. ;)

A duplicate backup should be fine. I fail to see how adding PARs would significantly help on top of regular full backups. The risk isn't reduced that much over just backing up anyway.
The Model M is not for the faint of heart. You either like them or hate them.

Gerbils unite! Fold for UnitedGerbilNation, team 2630.
 
Crayon Shin Chan
Minister of Gerbil Affairs
Posts: 2313
Joined: Fri Sep 06, 2002 11:14 am
Location: Malaysia
Contact:

Re: The best way to backup lossless audio data?

Thu Jan 20, 2011 4:36 am

FLAC files should include their own MD5 sum. Use foobar2000 and look at the file properties.

I only use PAR2-style archiving for the really rare/hard-to-find albums, like doujin music from Comikets. Since there is no good PAR2 tool that can recurse into directories, I 7zip the directory and PAR2 that. Then I burn it to a DVD+R.
Mothership: FX-8350, 12GB DDR3, M5A99X EVO, MSI GTX 1070 Sea Hawk, Crucial MX500 500GB
Supply ship: [email protected], 12GB DDR3, M4A88TD-V EVO/USB3
Corsair: Thinkpad X230
 
zamb
Gerbil
Posts: 15
Joined: Wed Nov 26, 2008 9:38 am

Slightly off-topic (sorry)…

Thu Jan 20, 2011 5:09 am

#!/bin/bash
find . -xdev -type f -or -type l | sort | xargs --delimiter=\\n md5sum -b

Why not:
find . -xdev -type f -or -type l -print0 | xargs -0 md5sum -b

That’s much more robust and elegant. (I ignored the “sort” command as it’s not really needed.)

By the way: The “-b” parameter to “md5sum” is not needed under Unix/BSD/Linux systems.

Sorry for the off-topic (and the smugness involved).
Ziyad.
 
just brew it!
Administrator
Posts: 54500
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Slightly off-topic (sorry)…

Thu Jan 20, 2011 8:24 am

zamb wrote:
#!/bin/bash
find . -xdev -type f -or -type l | sort | xargs --delimiter=\\n md5sum -b

Why not:
find . -xdev -type f -or -type l -print0 | xargs -0 md5sum -b

That’s much more robust and elegant. (I ignored the “sort” command as it’s not really needed.)

By the way: The “-b” parameter to “md5sum” is not needed under Unix/BSD/Linux systems.

I agree that the use of the "-print0" and "-0" options is an improvement.

However...

The reason I sorted the list is so that you can subsequently use the "diff" command to compare the hash lists for two directory hierarchies, to instantly identify all files that differ or have been added/removed. You can't count on "find" outputting the files in sorted order, as the order is dependent on the type of the underlying filesystem.

Being "diff"-friendly is also the reason for the "-b" option -- even though it does not affect the checksum calculation when running on a *NIX system, it does prepend a '*' to each checksum in the output. So if you want to be able to use "diff" to compare checksum files which were generated on *NIX with checksum files which were generated on Windows, you need to use the "-b" option (on both sides).

zamb wrote:
Sorry for the off-topic (and the smugness involved).

Sorry for pointing out that some of the smugness was premature. :wink:
Nostalgia isn't what it used to be.
 
zamb
Gerbil
Posts: 15
Joined: Wed Nov 26, 2008 9:38 am

Re: Slightly off-topic (sorry)…

Fri Jan 21, 2011 2:50 am

just brew it! wrote:
The reason I sorted the list is so that you can subsequently use the "diff" command to compare the hash lists for two directory hierarchies, to instantly identify all files that differ or have been added/removed.

Good point.

just brew it! wrote:
Being "diff"-friendly is also the reason for the "-b" option -- even though it does not affect the checksum calculation when running on a *NIX system, it does prepend a '*' to each checksum in the output. So if you want to be able to use "diff" to compare checksum files which were generated on *NIX with checksum files which were generated on Windows, you need to use the "-b" option (on both sides).

Also, good point (and thanks for the information. I didn’t think that far ahead).

just brew it! wrote:
Sorry for pointing out that some of the smugness was premature. :wink:

No problem. You taught me something I didn’t know (the “*” in the checksum files) which is worth it.
Thank you.
Ziyad.
 
Alisha07
Gerbil In Training
Posts: 1
Joined: Fri Jul 08, 2011 1:09 am

Re: The best way to backup lossless audio data?

Fri Jul 08, 2011 1:11 am

**REMOVED**
Last edited by Kevin on Fri Jul 08, 2011 6:44 am, edited 1 time in total.
Reason: EDIT BY MOD: Spam removed, user banned, left post to explain necro.
 
Aphasia
Grand Gerbil Poohbah
Posts: 3710
Joined: Tue Jan 01, 2002 7:00 pm
Location: Solna/Sweden
Contact:

Re: The best way to backup lossless audio data?

Fri Jul 08, 2011 5:02 am

just brew it! wrote:
Usacomp2k3 wrote:
Why not just put it on a good raid1/5 setup and then not worry about it?

Because RAID is about maintaining availability in the event of a hardware failure, not backup.
You can say that again.

For some reason I have a problem with data corruption when working in Lightroom against a folder on my RAID array. It seems like the transfer, and working directly on the files over the network, can corrupt them. The only thing to do is reload the file from the backup and reimport it. And I can't really import everything locally, since my SSD can't hold a library with 30GB worth of photos.

Without backups, I would've been toast.
 
just brew it!
Administrator
Posts: 54500
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: The best way to backup lossless audio data?

Fri Jul 08, 2011 8:30 am

It seems this thread necro has taken an interesting twist... :wink:

Aphasia wrote:
You can say that again.

For some reason I have a problem with data corruption when working in Lightroom against a folder on my RAID array. It seems like the transfer, and working directly on the files over the network, can corrupt them. The only thing to do is reload the file from the backup and reimport it. And I can't really import everything locally, since my SSD can't hold a library with 30GB worth of photos.

Without backups, I would've been toast.

Sounds like something's flaky in your network or file server. That's a nightmare waiting to happen; silent data corruption is far worse than something that fails outright, since you may not realize you have a problem until some time in the future. Depending on your backup regime, you may not even have a backup with a good copy of the file any more (yes, this is an argument for archiving some of your backups permanently instead of recycling the media).

Flaky NIC, flaky network switch, or bad RAM in the file server (does the server have error-correcting RAM?) are a few potential culprits that come to mind.
Nostalgia isn't what it used to be.
 
Scrotos
Graphmaster Gerbil
Posts: 1109
Joined: Tue Oct 02, 2007 12:57 pm
Location: Denver, CO.

Re: The best way to backup lossless audio data?

Fri Jul 08, 2011 10:21 am

Oh I have something even better that took a while for me to logically figure out.

SQL server on a RAID 1 (mirror). One day, after a Windows update and reboot, it lost about 6 months. Literally. Everything on the primary drive was reset to February. Even the Event Logs had a giant gap. All our data is backed up each night, so we only lost a day of data. I have since learned about transaction logs and backing those up as well. But the primary drive was at Feb, while the secondary drive (where backups were stored) was current. It couldn't have been a restore point accidentally restored, as that wouldn't have jacked up our SQL databases, right? And files that shouldn't be covered by a restore point were missing, just gone, stuff I had moved over in the months after Feb. Windows Update said we had a crap-ton of updates, too, and WSUS was freakin' out about the server status.

As far as I can figure, in Feb the RAID got degraded and took one drive offline. It hummed along, not telling us it was broken, until 6 months later when, during a reboot, the RAID controller swapped which drives it thought were "good" and "degraded". It was during the "wtf happened here?" process that we saw the RAID was broken and, of course, kicked off a rebuild. At that point we probably destroyed any hope of quickly recovering that day's data. But then again, if the RAID thought the good drive was "bad", we were probably SOL at that point anyway.

The system was available. It never went down, per se, and even with a drive "failure" we had 100% uptime. So score another one for the "RAID is not a backup" camp: our data was wiped out, and without a real backup we would have been so hosed it ain't even funny.

On the other hand this gave us a "test your disaster recovery plan quarterly" check!
 
just brew it!
Administrator
Posts: 54500
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: The best way to backup lossless audio data?

Fri Jul 08, 2011 10:36 am

Yeah, that's another big caveat with RAID. You need to have a mechanism in place to notify you when drive failures occur; otherwise, RAID can cause more problems than it solves by laying time bombs for you!
Nostalgia isn't what it used to be.
 
Aphasia
Grand Gerbil Poohbah
Posts: 3710
Joined: Tue Jan 01, 2002 7:00 pm
Location: Solna/Sweden
Contact:

Re: The best way to backup lossless audio data?

Sat Jul 09, 2011 5:31 pm

just brew it! wrote:
It seems this thread necro has taken an interesting twist... :wink:
Oh, I never noticed it was a necro. Not that bad though; more of a zombie thing than an undead skeleton thing. It had just barely had time to start decomposing. :wink:


just brew it! wrote:
Sounds like something's flaky in your network or file server. That's a nightmare waiting to happen; silent data corruption is far worse than something that fails outright, since you may not realize you have a problem until some time in the future. Depending on your backup regime, you may not even have a backup with a good copy of the file any more (yes, this is an argument for archiving some of your backups permanently instead of recycling the media).

Flaky NIC, flaky network switch, or bad RAM in the file server (does the server have error-correcting RAM?) are a few potential culprits that come to mind.

That's the weird thing. I've done a few tests with copying, reading, writing, doing parity checks after that, etc. Nothing except Lightroom seems to be affected. Unfortunately it's not like I have parity data for all my files just lying around, which is why I'm in the process of looking at a parity-file generation scheme right now, to be able to actually test for it.

As far as this goes, it seems like it could be a couple of things.
* Option 1 - Lightroom itself is corrupting the files while rendering previews, and for some reason has a problem working against a folder on a network share.
* Option 2 - Some of the images were corrupted by errors during my restore of a failed drive in the array a few months ago.
* Option 3 - Problems with the RAID.
* Option 4 - Problems with the network.
* Option 5 - Problems with the server.

The thing is, the images I'm working with are my set of photos from my Christmas New Zealand trip, and I'm going through them for the third time in my iterative sorting approach. The first iteration is setting a single star and leaving the stuff not worth anything unrated. The next iteration is rating up the shots that are better, etc. The 3rd rating marks stuff that might be shown to the people I was traveling with. The 4th rating is stuff to put up for everybody, Facebook, etc. The 5th rating is stuff worthy of going into the portfolio. So all the files that are now corrupted had been read and displayed fine at an earlier point. And yeah, as you say, silent corruption is far worse than anything else, and something that scares the bejebus out of me.

The thing is, I don't remember if I did the last iteration before I did the RAID restore. So it might be option 2 above... otherwise option 1 could be more likely. At this point I've ruled out options 3 and 4, since neither seems that likely given that nothing else I've found is affected, only Lightroom using raw images (and after the corruption, all other programs show the same corruption in the raw file). Although I haven't looked hard, I also haven't found a way to re-read the raw data and regenerate a new preview. And since I have very little storage on my workstation, only programs, games, and a few bunches of text documents, all the data I use comes from the file server, so options 3 or 4 should have shown effects elsewhere too. I also don't see any errors in the port counters on the switch (Netgear GS-108T), etc. The backup was taken from the RAID array, and since the backups are fine, the corruption hasn't always been there.

As for option 5, yeah, there are a few issues with the server. When I rebuilt my workstation I reinstalled the server to Win 2008 R2 from 2003, put in another 4GB of RAM, and switched the dual-core CPU for a quad-core. Sometimes I get hit with a bluescreen on that server, once every month or so perhaps, and I haven't really done anything about that yet. Might be a BIOS setting, might be the extra RAM, 2008 drivers, etc. The old 2003 install was rock stable, so the mobo and all add-in cards are good.

Sometime when I have the time I might get around to seeing what I can do about it. Right now I just recopy the file from a backup after it has been corrupted. Any and all ideas are very welcome, though.


just brew it! wrote:
Yeah, that's another big caveat with RAID. You need to have a mechanism in place to notify you when drive failures occur; otherwise, RAID can cause more problems than it solves by laying time bombs for you!
Yeah, that's also on my to-do list, right after I've automated my weekly full backups of the RAID. Right now I have automatic backups of all system data to the RAID. The RAID gets backed up, or really synced, to my NAS with FreeFileSync on a biweekly/triweekly basis, manually, but I've started on the automated batch profiles that will go into the scheduler.

As for the rest of what one "should" have, most of it is on the back burner, but I would love to have a nice syslog for all the networking/firewall and system stuff, together with decent graphing (Cacti) and a nice SMS/mail alarm system. Still, for a home system some of it is pretty overkill, which is why I haven't done it yet.
