
NotFred VM not cleaning up - causes space issues

Posted: Mon Aug 11, 2008 9:58 pm
by dblind1
Hey guys. I just wanted to see if anyone else is having a problem with the NotFred VM. After a while, I get an error about writes outside the hda1 area. I did some investigation and might have found a reason. Checking the web interface, I found that the WU that was completed and, from what I can tell, sent is still in the 'instance' directory. The whole directory has grown to 126MB. I'm not sure where the rest of the space has gone, but the virtual disk is now at its full size, and I'm not sure how to find out what is using it. I know how to use ls -al and df in Linux, but is there any other command to try?

NotFred, could you check this out? I'm running VMServer 1.06 with your latest folding VM files (which rock with the a2 core, btw). It seems that after one or two completed cores the virtual disk fills up, so maybe you could add a scheduled job that checks for sent WUs (I have no clue how to tell whether they have been sent) and removes them after the next WU starts. Maybe this will allow Linux to release them to be deleted. I love the VM and would love for it to become as reliable as possible. I'm guessing that the timer not syncing with the host might be causing the built-in Stanford process to stop working. Anyway, if anyone else can duplicate this, I would appreciate it. My only other request, if a cleanup script isn't possible, is a larger max size on the virtual disk. 512MB is just a wee bit tiny with these large WUs.
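
On the question of commands beyond ls -al and df: du reports usage per directory, which can help show where the space has gone. The following is only a sketch; the function name largest_entries is made up for illustration, and it assumes the VM's du accepts the -x and -k options.

```shell
# Sketch: list the largest entries under a directory, biggest last.
# Sizes are in kilobytes (du -k); -x keeps du on a single filesystem.
# largest_entries is a hypothetical helper name, not part of the VM.
largest_entries() {
    dir=${1:-/}      # directory to inspect, defaults to /
    count=${2:-10}   # how many of the largest entries to show
    du -xk "$dir" 2>/dev/null | sort -n | tail -n "$count"
}
```

For example, largest_entries / 15 would show the 15 largest directories on the root filesystem. If df reports far more space in use than du can account for, the difference may be files that were deleted while a process still had them open; that space is only released once the process closes them.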

Re: NotFred VM not cleaning up - causes space issues

Posted: Tue Aug 12, 2008 8:38 am
by notfred
Thanks for this! A number of people have seen the writes outside hda1 area error, but I haven't had a chance to track down exactly what is going on. The lack of cleanup is a well-known issue in the Stanford Linux code: it doesn't seem to delete the work unit it has just finished. From my home system, which is running the SMP client natively on Linux:
grep "Could not delete" /opt/folding/FAHlog.txt
[10:35:24] - Warning: Could not delete all work unit files (6): Core file absent
[21:56:29] - Warning: Could not delete all work unit files (7): Core file absent
[13:13:09] - Warning: Could not delete all work unit files (8): Core file absent
[00:06:10] - Warning: Could not delete all work unit files (9): Core file absent

Looks like I need to increase the disk size. I've also heard that 512MB of memory doesn't quite do it for the a2 core, so I need to increase the memory as well.

Will try to get a new version out soon.

Re: NotFred VM not cleaning up - causes space issues

Posted: Tue Aug 12, 2008 11:12 pm
by dblind1
Great! Yeah, I had some weird work units get stuck. I cleaned them out manually and it appeared to work great. On the cleanup: maybe you could write a small script that greps FAHlog.txt for the line where it says that something02.dat was sent, then checks whether the results02.dat file still exists in the work folder, and if both are true does rm *02.* or something. Have it run once or twice a day. Or you could grep for the failed delete in FAHlog.txt and then have it remove that group of files. Put that script in a cron job and it will auto-clean for ya.
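
The second idea (grep for the failed delete, then remove that group of files) could be sketched roughly like this. It is not a tested fix for the VM. The warning text is taken from the log lines quoted earlier in the thread, but the "work" directory layout, the two-digit number in the file names (as in results02.dat), and the function name cleanup_last_failed_wu are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: remove leftover files for the work unit named in the most recent
# "Could not delete all work unit files (N)" warning in FAHlog.txt.
# ASSUMPTIONS: leftovers sit in a "work" directory next to the log and carry
# the two-digit number before the extension, e.g. results02.dat.
cleanup_last_failed_wu() {
    fah_dir=$1           # client directory, e.g. /opt/folding
    mode=${2:-dry-run}   # pass "delete" to actually remove files
    log="$fah_dir/FAHlog.txt"
    [ -f "$log" ] || return 0

    # Use only the newest warning, so numbers from older warnings that may
    # since have been reused by a running work unit are left alone.
    slot=$(grep 'Could not delete all work unit files' "$log" \
        | tail -n 1 \
        | sed -n 's/.*work unit files (\([0-9][0-9]*\)).*/\1/p')
    [ -n "$slot" ] || return 0

    # Zero-pad to two digits: 6 becomes 06.
    slot=$(printf '%02d' "$slot")

    for f in "$fah_dir"/work/*"$slot".*; do
        [ -f "$f" ] || continue
        if [ "$mode" = delete ]; then
            rm -f -- "$f"
        else
            echo "would remove: $f"
        fi
    done
}
```

Calling cleanup_last_failed_wu /opt/folding only prints what it would remove; adding delete as a second argument removes the files. The sketch does not check that the next WU has already started, so a cron job built on it would still need that safeguard before deleting anything.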

Just some ideas to help out! Looking forward to those updates! Thanks for the work you do on the folding stuff. It is nice to have a VM that is easy to use with nothing to configure. Performance is great as well. I'm running 4 of these on a dual quad and they are producing about 9k PPD; I was only getting about 4500 PPD with 4 Windows VMs running the Windows client.