Ubuntu 8.04, trouble with filesystem corruption.

Posted on Fri Aug 08, 2008 12:18 pm

I've already posted this on the Ubuntu forums but haven't gotten a response; maybe some of the talented people here at TR can help.

Recently, I started to have trouble reading certain files on a 500 GB ext3 partition I have mounted for storage/backup purposes.

I checked /var/log/messages and found lots of entries like this from the past week or so:

Code:
Aug  6 21:02:49 tbird kernel: [114941.226727] attempt to access beyond end of device
Aug  6 21:02:49 tbird kernel: [114941.226735] sdb1: rw=0, want=23548135088, limit=976768002
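
To put numbers on log lines like the ones above: `want` and `limit` are both counted in 512-byte sectors, so a small script can show how far past the end of the partition a request was aimed. This is only a sketch. The function names are made up for illustration, and the regular expression assumes the exact message format shown in this post.

```python
import re

# Matches the second line of the kernel message, e.g.
#   sdb1: rw=0, want=23548135088, limit=976768002
BEYOND_RE = re.compile(r"(\w+): rw=(\d+), want=(\d+), limit=(\d+)")

SECTOR_BYTES = 512  # want/limit are reported in 512-byte sectors


def parse_beyond_end(line):
    """Return (device, want, limit) from a log line, or None if it does not match."""
    m = BEYOND_RE.search(line)
    if m is None:
        return None
    return m.group(1), int(m.group(3)), int(m.group(4))


def describe(line):
    """Summarise how far past the end of the device a request was aimed."""
    parsed = parse_beyond_end(line)
    if parsed is None:
        return None
    dev, want, limit = parsed
    return {
        "device": dev,
        "device_gb": limit * SECTOR_BYTES / 1e9,
        "wanted_gb": want * SECTOR_BYTES / 1e9,
        "ratio": want / limit,
    }


if __name__ == "__main__":
    sample = ("Aug  6 21:02:49 tbird kernel: [114941.226735] "
              "sdb1: rw=0, want=23548135088, limit=976768002")
    info = describe(sample)
    print("%(device)s: device is %(device_gb).1f GB, request ended at "
          "%(wanted_gb).1f GB (%(ratio).1fx the device size)" % info)
```

For the sdb1 line the request ends roughly 24 times further out than the partition is long, which looks more like a garbage block number in the filesystem metadata than a near miss at the end of the disk.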



So I unmount the partition, run fsck, and get lots of this:

Code:
Inode 5734439 has INDEX_FL flag set but is not a directory.
Clear HTree index? yes

Inode 5734439, i_size is 5245066037214557465, should be 0.  Fix? yes

Inode 5734439, i_blocks is 2978642201, should be 0.  Fix? yes

Inode 5734437 has INDEX_FL flag set but is not a directory.
Clear HTree index? yes

Inode 5734437, i_size is 7687059029158035282, should be 0.  Fix? yes

Inode 5734437, i_blocks is 1586999798, should be 0.  Fix? yes
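
The values being reported here are not just wrong, they are impossible for this partition. A quick sanity check, sketched below, compares each field with the partition capacity. It assumes the 976768002-sector `limit` from the kernel log earlier in this post and relies on ext2/ext3 counting `i_blocks` in 512-byte units; the helper name is hypothetical.

```python
SECTOR_BYTES = 512
PARTITION_SECTORS = 976768002  # "limit" reported for sdb1 in the kernel log


def implausible_fields(i_size, i_blocks, partition_sectors=PARTITION_SECTORS):
    """Return the names of inode fields that cannot fit on the partition.

    i_size is in bytes; i_blocks is counted in 512-byte units on ext2/ext3.
    """
    capacity = partition_sectors * SECTOR_BYTES
    bad = []
    if i_size > capacity:
        bad.append("i_size")
    if i_blocks * SECTOR_BYTES > capacity:
        bad.append("i_blocks")
    return bad


if __name__ == "__main__":
    # (i_size, i_blocks) pairs copied from the fsck output above
    reported = {
        5734439: (5245066037214557465, 2978642201),
        5734437: (7687059029158035282, 1586999798),
    }
    for inode, (size, blocks) in reported.items():
        print(inode, implausible_fields(size, blocks))
```

Both fields fail the check for both inodes, which suggests whole inodes were overwritten with unrelated data rather than a single bit flipping somewhere.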



Now, the partition is usable, but I lost a few files in the process. I ran the manufacturer's (Western Digital) diagnostics on the drive, and it checks out.

So I start going through the past logs to find out when this started, and found that the trouble started after this event:

Code:
Jul 29 02:21:33 tbird kernel: [70618.590538]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Jul 29 02:21:33 tbird kernel: [70618.590551] ata2: hard resetting link
Jul 29 02:21:33 tbird kernel: [70619.029895] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jul 29 02:21:33 tbird kernel: [70619.044092] ata2.00: configured for UDMA/133
Jul 29 02:21:33 tbird kernel: [70619.044105] ata2: EH complete
Jul 29 02:21:33 tbird kernel: [70619.046916] sd 1:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
Jul 29 02:21:33 tbird kernel: [70619.048232] sd 1:0:0:0: [sdb] Write Protect is off
Jul 29 02:21:33 tbird kernel: [70619.051277] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support


Any ideas what I should be checking? I'm not really familiar with Linux system error messages as opposed to Windows, so I don't know how worried to be. I have 3 SATA drives in the system on the same controller (plus a SATA DVD-RW) and only one has had issues, so I'm thinking maybe a defective or loose data (or power) connection; the partition mounted as root gets thrashed a lot more and hasn't had any issue like this before. But I also wonder if it might be some sort of bug, since the files that got corrupted were large files, over 1 GB in size. Suggestions or comments welcome.

Okay, now it's two days later and I notice some weird things in a directory listing of /usr/share/applications/screensavers... (some screensavers are not working). So I fsck /dev/sda1, and find more trouble. After repairing the root partition and rebooting, I lost a few files again, so I check /var/log/messages and notice this:

Code:
Aug  8 11:25:53 tbird kernel: [  347.083138] attempt to access beyond end of device
Aug  8 11:25:53 tbird kernel: [  347.083142] sda1: rw=0, want=34359738368, limit=316384047
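
One detail in the sda1 line stands out: 34359738368 is exactly 2^35. Assuming `want` is the first sector after the request and the filesystem uses 4096-byte blocks (both are assumptions, not something the log states), that corresponds to filesystem block 0xFFFFFFFF, a block pointer with every bit set. A rough sketch of the arithmetic, with hypothetical helper names:

```python
SECTOR_BYTES = 512
BLOCK_BYTES = 4096  # assumption: common ext3 block size, not confirmed by the log


def implied_block(want_sectors, block_bytes=BLOCK_BYTES):
    """Filesystem block whose last sector ends at `want_sectors`.

    Assumes the request covered exactly one filesystem block and that
    `want` is the first sector after it. Returns None if the value is
    not block-aligned under those assumptions.
    """
    sectors_per_block = block_bytes // SECTOR_BYTES
    if want_sectors % sectors_per_block != 0:
        return None
    return want_sectors // sectors_per_block - 1


def is_power_of_two(n):
    """True when n has exactly one bit set."""
    return n > 0 and n & (n - 1) == 0


if __name__ == "__main__":
    for dev, want in (("sda1", 34359738368), ("sdb1", 23548135088)):
        block = implied_block(want)
        print(dev, "want is a power of two:", is_power_of_two(want),
              "implied block:", None if block is None else hex(block))
```

If those assumptions hold, the kernel was asked to read a block number that no partition of this size could contain, which again points at bad metadata rather than the disk misreporting its size.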


Repeating over and over. I checked back in some previous logs, and see no other errors about /dev/sda1.

For what it's worth, I found this in the logs, so I've uninstalled VMware Workstation 6.0.3 for now. I don't know if it's important.

Code:
Aug  5 09:33:38 tbird kernel: [51113.674007] VMBlock warning: DentryOpRevalidate: invalid args from kernel
Aug  5 09:33:38 tbird kernel: [51113.674494] VMBlock warning: DentryOpRevalidate: invalid args from kernel
Aug  5 09:33:38 tbird kernel: [51113.675177] VMBlock warning: DentryOpRevalidate: invalid args from kernel
Aug  5 09:33:38 tbird kernel: [51113.675431] VMBlock warning: DentryOpRevalidate: invalid args from kernel


For now, I'm going to run memtest for a while to be sure, but then just cross my fingers. Doesn't look like a bad drive, since it's affected two drives. So I'm thinking it must be either hardware trying to fail or some sort of new bug. *sigh*
badger badger badger badger badger badger badger
axeman
Minister of Gerbil Affairs
 
Posts: 2019
Joined: Fri Jan 31, 2003 10:46 am

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Fri Aug 08, 2008 2:19 pm

Looks like general system flakiness to me -- RAM or motherboard. Basically, something is corrupting your filesystem meta-data. It is unclear whether the corruption is occurring in RAM (with the bad meta-data eventually getting flushed back to disk), or if the corruption is occurring during disk I/O.

I seriously doubt it is a kernel/driver bug, but I suppose anything is possible. Did you by chance update to a newer kernel around the time the corruptions started? Did you install any third-party device drivers (i.e. ones that didn't come from the Ubuntu repository)?
(this space intentionally left blank)
just brew it!
Administrator
 
Posts: 35311
Joined: Tue Aug 20, 2002 9:51 pm
Location: Somewhere, having a beer

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Fri Aug 08, 2008 6:54 pm

just brew it! wrote: Looks like general system flakiness to me -- RAM or motherboard. Basically, something is corrupting your filesystem meta-data. It is unclear whether the corruption is occurring in RAM (with the bad meta-data eventually getting flushed back to disk), or if the corruption is occurring during disk I/O.

I seriously doubt it is a kernel/driver bug, but I suppose anything is possible. Did you by chance update to a newer kernel around the time the corruptions started? Did you install any third-party device drivers (i.e. ones that didn't come from the Ubuntu repository)?


Well, it looks like I started getting these warnings ("VMBlock warning: DentryOpRevalidate: invalid args from kernel") when I upgraded from VMware 6.0.3 to 6.0.4 on the 22nd, so that is one possible suspect, since these are third-party kernel modules that aren't from Ubuntu.

Also, from the logs it looks like I upgraded from kernel 2.6.24-19-generic to 2.6.24-20-generic on the 27th, also just before the trouble started; the first suspicious log entry was on the 29th.

I ran memtest for around 10 passes today with no errors, so if it's a hardware issue, I would guess it's something more nebulous like motherboard flakiness rather than bad RAM. The system exhibits no sign of instability either, just this disk corruption problem. Thanks for suggesting third-party drivers or kernel upgrades, though; I definitely have done both recently.
badger badger badger badger badger badger badger
axeman
Minister of Gerbil Affairs
 
Posts: 2019
Joined: Fri Jan 31, 2003 10:46 am

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Fri Aug 08, 2008 7:39 pm

Just so we're clear here -- the problematic Ubuntu installation is the host OS, and you're hosting something else (or another instance of Ubuntu) in VMware, right? (I just realized that if you're dealing with VMs, I could easily be confused as to which system is actually giving you trouble.)

FWIW I've not been particularly thrilled with VMware on Ubuntu. Yes, you can make it work; but it seems to work reluctantly at best. I've been meaning to take VirtualBox for a test drive to see if it is any better...
(this space intentionally left blank)
just brew it!
Administrator
 
Posts: 35311
Joined: Tue Aug 20, 2002 9:51 pm
Location: Somewhere, having a beer

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Fri Aug 08, 2008 8:17 pm

VirtualBox just isn't very good compared to VMware in general. I am running VMware 6.0.3 in Ubuntu just fine. What troubles were you having?

As to the Ubuntu issue, I had this once where a drive acted flaky and just started to corrupt things. That drive still gave me issues later. I think it had to do with using Windows and Ubuntu on the same drive and then the MBR got hosed. As to you using kernel 2.6.24-20-generic, where did you get that from? I have 3 Ubuntu machines in my house and, as of today, they are all still using 2.6.24-19-generic. I forget the last number; it's like 30 or something if you check.
atryus28
Minister of Gerbil Affairs
 
Posts: 2140
Joined: Tue Apr 22, 2003 1:56 am

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Fri Aug 08, 2008 8:40 pm

What filesystem are you using? Is it ReiserFS? If it's corrupting your files on your hard drive it might move next to the files in your mind. Let us know if you find yourself having thoughts of murdering your wife.




/no help.
ssidbroadcast
Graphmaster Gerbil
 
Posts: 1376
Joined: Sun Jul 08, 2007 9:42 pm
Location: Bellingham, WA

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Fri Aug 08, 2008 8:42 pm

My guess is that you have one of two problems:

1. Flaky hardware. Disks can fail with certain combinations of circumstances that may not occur with the diagnostics. Or you could have problems with the controller, CPU, or RAM. If you are overclocking, set it back to standard, and set the memory timings to safe values.

2. Bad partition settings. Long ago, I set a Windows 98 partition to the wrong type, and the block numbers wrapped around and Windows scribbled all over my Linux partition. Things get very strange when partitions overlap or extend past the end of the drive. Check your partition table.
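
The partition-table check in point 2 boils down to two tests: no partition may run past the end of the disk, and no two partitions may overlap. A minimal sketch of that check follows. The sector counts are taken from the logs earlier in this thread, but the start sector of 63 and all the names are assumptions made for illustration.

```python
from typing import List, NamedTuple


class Partition(NamedTuple):
    name: str
    start: int    # first sector
    sectors: int  # length in sectors

    @property
    def end(self) -> int:
        """First sector after the partition."""
        return self.start + self.sectors


def check_layout(disk_sectors: int, parts: List[Partition]) -> List[str]:
    """Report partitions that run past the disk or overlap an earlier one."""
    problems = []
    furthest = None  # partition reaching furthest into the disk so far
    for p in sorted(parts, key=lambda part: part.start):
        if p.end > disk_sectors:
            problems.append(
                f"{p.name} extends {p.end - disk_sectors} sectors past the end of the disk")
        if furthest is not None and p.start < furthest.end:
            problems.append(f"{p.name} overlaps {furthest.name}")
        if furthest is None or p.end > furthest.end:
            furthest = p
    return problems


if __name__ == "__main__":
    # Sizes come from the logs in this thread; the start sector is a guess.
    sdb = [Partition("sdb1", start=63, sectors=976768002)]
    print(check_layout(976773168, sdb) or "layout looks sane")
```

With those numbers sdb1 fits inside the 976773168-sector disk, so if the guessed start sector is close to the real one, the table itself would not explain the errors. The real start values would have to come from the actual partition table.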
zzatz
Gerbil
 
Posts: 10
Joined: Fri Apr 07, 2006 10:11 pm

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Fri Aug 08, 2008 8:56 pm

atryus28 wrote: VirtualBox just isn't very good compared to VMware in general.

That's disappointing.

I am running VMware 6.0.3 in Ubuntu just fine. What troubles were you having?

General issues getting it to install properly, and (once that was resolved) problems with the system clock being wonky (too fast/slow). VMware seems to have significant issues with any sort of power management which causes the clock speed of the host CPU to vary. Clock also sometimes runs fast/slow even on systems with PM disabled...
(this space intentionally left blank)
just brew it!
Administrator
 
Posts: 35311
Joined: Tue Aug 20, 2002 9:51 pm
Location: Somewhere, having a beer

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Fri Aug 08, 2008 9:51 pm

I have to check again, but I got that taken care of and I didn't need to turn off my power management. I set something in the config file. I did this a few months ago and haven't bothered with it since because the issue was resolved. I found the answer in the VMware forums, though.

When was the last time you tried with Ubuntu?
atryus28
Minister of Gerbil Affairs
 
Posts: 2140
Joined: Tue Apr 22, 2003 1:56 am

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Sat Aug 09, 2008 5:49 am

ssidbroadcast wrote: What filesystem are you using? Is it ReiserFS? If it's corrupting your files on your hard drive it might move next to the files in your mind. Let us know if you find yourself having thoughts of murdering your wife.




/no help.

In the first two lines the OP stated:
axeman wrote: I've already posted this on the Ubuntu forums, but haven't got a response, maybe some of the talented people here at TR can help.

Recently, I started to have trouble reading certain files on a 500gb ext3 partition I have mounted for storage/backup purposes...
The best things in life are free.
http://www.gentoo.org
Guy 1: Surely, you will fold with me.
Guy 2: Alright, but don't call me Shirley.
titan
Grand Gerbil Poohbah
 
Posts: 3275
Joined: Mon Feb 18, 2002 6:00 pm
Location: Great Smoky Mountains

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Sat Aug 09, 2008 7:49 am

just brew it! wrote: Just so we're clear here -- the problematic Ubuntu installation is the host OS, and you're hosting something else (or another instance of Ubuntu) in VMware, right? (I just realized that if you're dealing with VMs, I could easily be confused as to which system is actually giving you trouble.)

FWIW I've not been particularly thrilled with VMware on Ubuntu. Yes, you can make it work; but it seems to work reluctantly at best. I've been meaning to take VirtualBox for a test drive to see if it is any better...


Yep, the problematic Ubuntu is the host OS. Off topic, I haven't been thrilled with VMware Workstation 6.x on _any_ OS. I use virtual machines extensively at work, and I'm still using 5.5.x on Windows XP because 6 always ends up crashing, sometimes BSOD'ing the host machine. I don't have flaky hardware, and I've even reinstalled the OS on the host machine from scratch. Back at home, VMware w/s 6 hasn't been so unstable, but I can name a few annoying bugs. In any case, I can do without VMware at home for a while, and see if my disk corruption returns.
badger badger badger badger badger badger badger
axeman
Minister of Gerbil Affairs
 
Posts: 2019
Joined: Fri Jan 31, 2003 10:46 am

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Sat Aug 09, 2008 8:56 am

atryus28 wrote: I have to check again, but I got that taken care of and I didn't need to turn off my power management. I set something in the config file. I did this a few months ago and haven't bothered with it since because the issue was resolved. I found the answer in the VMware forums, though.

When was the last time you tried with Ubuntu?

Probably a couple, maybe three months ago.

I should add that this was VMware Server, not VMware Workstation.
(this space intentionally left blank)
just brew it!
Administrator
 
Posts: 35311
Joined: Tue Aug 20, 2002 9:51 pm
Location: Somewhere, having a beer

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Sat Aug 09, 2008 9:34 am

I've been using kvm to develop my diskless stuff. It's limited in the hardware it supports in the guest, but it doesn't mess up the host. Getting the networking set up if you want to bridge it onto your LAN is a bit of a nuisance, but it does work reliably.

I don't have any ideas on the filesystem corruption though. I did see similar corruption a while ago on my work system, but that was running an old version of RHEL that had a buggy sata_nv driver, and a kernel upgrade fixed it. All I can suggest is to make sure you have all the latest updates installed and double-check all the cables and connectors. ext3 is a pretty solid filesystem; the problem is far more likely to be in the controller driver or the hardware itself.
notfred
Grand Gerbil Poohbah
 
Posts: 3494
Joined: Tue Aug 10, 2004 9:10 am
Location: Ottawa, Canada

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Sat Aug 09, 2008 9:10 pm

notfred wrote: ext3 is a pretty solid filesystem; the problem is far more likely to be in the controller driver or the hardware itself.

I can vouch for this. At my day job, we have a prototype product where we are pretty abusive with ext3. It's an embedded app where the system is frequently power cycled without warning. I initially had some misgivings about using ext3 in this environment, but based on our experiences over the past few months, ext3 is pretty robust even when (mis)used in this manner.

I'd still stop short of recommending that it be used this way in a production product, but I am definitely impressed.
(this space intentionally left blank)
just brew it!
Administrator
 
Posts: 35311
Joined: Tue Aug 20, 2002 9:51 pm
Location: Somewhere, having a beer

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Sun Aug 10, 2008 6:07 am

notfred wrote: I've been using kvm to develop my diskless stuff. It's limited in the hardware it supports in the guest, but it doesn't mess up the host. Getting the networking set up if you want to bridge it onto your LAN is a bit of a nuisance, but it does work reliably.

I don't have any ideas on the filesystem corruption though. I did see similar corruption a while ago on my work system, but that was running an old version of RHEL that had a buggy sata_nv driver, and a kernel upgrade fixed it. All I can suggest is to make sure you have all the latest updates installed and double-check all the cables and connectors. ext3 is a pretty solid filesystem; the problem is far more likely to be in the controller driver or the hardware itself.


It's interesting you mention this; my system is using the sata_nv driver as well. I also have some interesting messages at boot time regarding the disk drivers:

Code:
Aug  8 11:15:43 tbird kernel: [   30.684265] Driver 'sd' needs updating - please use bus_type methods
Aug  8 11:15:43 tbird kernel: [   30.684432] sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
Aug  8 11:15:43 tbird kernel: [   30.684442] sd 0:0:0:0: [sda] Write Protect is off
Aug  8 11:15:43 tbird kernel: [   30.684458] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug  8 11:15:43 tbird kernel: [   30.684498] sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
Aug  8 11:15:43 tbird kernel: [   30.684506] sd 0:0:0:0: [sda] Write Protect is off
Aug  8 11:15:43 tbird kernel: [   30.684520] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug  8 11:15:43 tbird kernel: [   30.684524]  sda:<4>Driver 'sr' needs updating - please use bus_type methods


I guess off to google to see what this means. :wink:
axeman
Minister of Gerbil Affairs
 
Posts: 2019
Joined: Fri Jan 31, 2003 10:46 am

Re: Ubuntu 8.04, trouble with filesystem corruption.

Posted on Sun Aug 10, 2008 7:17 am

Just a question... does an fsck perform a resize2fs as part of its process? I'm just thinking that perhaps the meta-data is saying your ext3 is a certain size when it physically isn't... This could be a bit of a long shot, I realize, but it may not be such a bad idea to run a resize2fs.
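
The size-mismatch idea can be tested without resizing anything: compare the block count and block size recorded in the superblock with the size of the partition. The sketch below assumes the text comes from `dumpe2fs -h` and contains `Block count:` and `Block size:` lines. The sample values are invented to match a partition of 976768002 sectors and are not taken from the system in question.

```python
import re

SECTOR_BYTES = 512


def superblock_field(text, name):
    """Pull an integer field such as 'Block count' out of dumpe2fs-style text."""
    m = re.search(r"^%s:\s+(\d+)\s*$" % re.escape(name), text, re.MULTILINE)
    if m is None:
        raise ValueError("field not found: %s" % name)
    return int(m.group(1))


def fs_fits_partition(text, partition_sectors):
    """True if the filesystem size in the superblock fits inside the partition."""
    fs_bytes = (superblock_field(text, "Block count")
                * superblock_field(text, "Block size"))
    return fs_bytes <= partition_sectors * SECTOR_BYTES


if __name__ == "__main__":
    # Hypothetical superblock summary, for illustration only.
    sample = ("Block count:              122096000\n"
              "Block size:               4096\n")
    print(fs_fits_partition(sample, 976768002))
```

If the check came back false on the real superblock, the filesystem would be claiming more space than the partition provides, and the size theory would be worth chasing further.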
Ubuntu 12.04 AMD64: E8200 // P35 // HD 4850 // 4GB
OS X 10.8.x: iMac12,2, MacBook 5,2
pedro
Gerbil First Class
 
Posts: 176
Joined: Fri May 11, 2007 5:13 am

