I've been running the Linux Folding SMP client on as many as a dozen VMware guests over WinXP for the bulk of 2008. Lately, I've been "underperforming" and I haven't been able to pinpoint why. I think the recent problem is related to using the -advmethods flag to get the A2 cores. Something is badly amiss here, and it's frustrating.
Keeping the Linux SMP client running seems to be more art than science: random crashes, hangs, hangs at 100% completion, etc. Reboots, qfix runs, Ctrl+C, hop up and down while hitting yourself on the left side of the head with your keyboard.
As I mentioned, my PPD has taken a hit lately, and I've started paying attention to each Work Unit's Run, Clone, and Generation. Does everyone do this? Watch the Project RCG?
Now that I'm paying attention, I notice that several Ubuntu SMP guests are completing a Work Unit to 100%, failing to upload it properly, and then starting the same Work Unit again from scratch. They then complete it (AGAIN) to 100%, fail to upload it, and start the same PRCG from the beginning.
I've previously noticed weird download times in FahMon, where it says that a WU was downloaded 2-3 days ago when it clearly was downloaded only hours ago. I always put it down to calculation issues in FahMon. However, now that I look closely, the first download of the project really did happen days before, and the guest has been repeatedly processing the same Project, Run, Clone, and Generation ever since. Nice.
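For anyone who wants to check their own guests for the same thing, here is a rough Python sketch that counts how often each PRCG shows up in a client log. It assumes the log contains lines shaped like "Project: 2665 (Run 1, Clone 234, Gen 56)"; that pattern, the sample numbers, and the function names are my assumptions, so adjust the regex to whatever your FAHlog.txt actually prints.

```python
import re
from collections import Counter

# Assumed log line shape: "Project: 2665 (Run 1, Clone 234, Gen 56)".
# Adjust this pattern if your client's log prints it differently.
PRCG_PATTERN = re.compile(
    r"Project:\s*(\d+)\s*\(Run\s*(\d+),\s*Clone\s*(\d+),\s*Gen\s*(\d+)\)"
)


def count_prcg(log_lines):
    """Count how many times each (project, run, clone, gen) appears."""
    counts = Counter()
    for line in log_lines:
        match = PRCG_PATTERN.search(line)
        if match:
            counts[tuple(int(g) for g in match.groups())] += 1
    return counts


def repeated_prcg(log_lines, threshold=2):
    """Return only the PRCGs seen at least `threshold` times."""
    return {
        prcg: n for prcg, n in count_prcg(log_lines).items() if n >= threshold
    }


if __name__ == "__main__":
    import sys

    # Usage: python prcg_check.py /path/to/FAHlog.txt
    with open(sys.argv[1], errors="replace") as log_file:
        for prcg, n in sorted(repeated_prcg(log_file).items()):
            print("P%d R%d C%d G%d seen %d times" % (prcg + (n,)))
```

A repeat here is only a hint, not proof: the client may print the project line again after an ordinary restart, so check the surrounding log lines for a finished unit and a failed upload before blaming the loop.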
Now, most of the VMware Ubuntu guests that I have running have processed hundreds of WUs and been cloned numerous times, and they've been shut down hard countless times. Perhaps I need to do a new "clean" SMP guest build.
Some of you guys must be running into similar issues. Monitoring and playing with a dozen VMware Linux guests is a pain. I've learned a few things, which I'll post as I start watching more carefully.
- JP