Linux/OSX SMP core update increases throughput > 35%

Posted: Mon Aug 04, 2008 8:43 pm
by Ragnar Dan
This thread should be stickied until everyone on the team who's running a Linux (or Mac OS X) SMP folding client has made the update. Anyone who started SMP folding since July 9 should be fine already, but those who started before then may need to update.

Everyone who's running an SMP client should check to see if they've got the upgraded version of the FahCore_a2.exe core file. It's speeding up the calculations on 2662 WU's by over 35% on my Opteron, and makes the client utilize the CPU more efficiently. It appears that even machines with several cores will be able to run this core efficiently enough to make it useful to employ all cores in a single client. Since most of us are using VMware virtual machines to run the Linux SMP client, giving every core to one client may not be possible, but the update will still increase output no matter how many cores you use. I don't know why the Windows SMP clients can't benefit from this update, but since this core isn't available for them yet, it may well be worth moving to a VM running the Linux SMP client just for the output gain.

Warning: Do not upgrade in the middle of a WU unless it's running a different core, because it will trash the current WU if you do. So if you're running a Project 2619 or a Project 2662, let it complete before upgrading the core.

Here's how you can tell which core it's presently running. When it first starts crunching the WU, it says something like this:

[23:30:41] *------------------------------*
[23:30:41] Folding@Home Gromacs SMP Core
[23:30:41] Version 1.91 (2007)
[23:30:41]
[23:30:41] Preparing to commence simulation


That was the older core. There's a problem with the system updating automatically like it should. Here's what they say at the FoldingForum about it:

FoldingForum OP wrote:
We've released a new A2 SMP core to advanced methods (currently projects 2619, 2662) for OSX and Linux.
This core is much more efficient in terms of its CPU utilization than the A1 core or the first-generation A2 core, and we're excited about the greater scientific productivity it allows.

The new core version is 2.00 (2.01 for OSX); if you don't have a previous A2 core it should auto-download. If you have an A2 core prior to 1.95, please delete it. Those cores do not always download upgrades automatically and will run much slower than the new version.

Edit by Mod: Based on comments below, if your current WU is using FahCore_A2, updating will probably destroy it, so wait until it finishes.
-b


Here's how the new core looks when it starts up:

[15:33:14] *------------------------------*
[15:33:14] Folding@Home Gromacs SMP Core
[15:33:14] Version 2.00 (Wed Jul 9 13:11:25 PDT 2008)
[15:33:14] Preparing to commence simulation
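
If you'd rather not scroll through the log by eye, here is a minimal sketch that pulls the version out of the most recent startup banner. It assumes the log is the client's FAHlog.txt and that the "Version" line directly follows the "Gromacs SMP Core" line, as in the two excerpts quoted in this post. The function name is made up for illustration.

```shell
#!/bin/sh
# Print the version from the most recent "Gromacs SMP Core" banner in a log.
# $1: path to the client log (FAHlog.txt is assumed).
last_core_version() {
  awk '
    BEGIN { banner = -1 }
    /Folding@Home Gromacs SMP Core/ { banner = NR }      # remember the banner line
    /\] Version / && NR == banner + 1 { version = $3 }   # version is on the next line
    END { if (version != "") print version }
  ' "$1"
}
```

Run it as `last_core_version FAHlog.txt` from the client directory. Note that the banner doesn't name the core file, so this only tells you the version that last started, not which FahCore file it came from.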


On the completion of your next Linux (or Mac OS X) SMP WU, shut down the client, delete FahCore_a2.exe and restart the SMP client, which forces a new FahCore_a2.exe to download. With this change I went from nearly 28 minutes / frame on a 2662 to < 18 minutes / frame, and from ~900 PPD on those 2662's to over 1500 PPD, which is even higher than 2605 WU's! Here's the FoldingForum.org thread about it: "new A2 core released--on advanced methods". I can't seem to find the one I read earlier, though; I think that one was more helpful and had Dr. Pande mentioning various useful bits of information.
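
For anyone wondering how the frame times line up with the "> 35%" in the thread title, here is a quick worked calculation using the rounded figures from this post (28 and 18 minutes per frame, 900 and 1500 PPD). The inputs are approximate, so treat the results as ballpark. The helper names are only for illustration.

```shell
#!/bin/sh
# Percent change from an old value ($1) to a new value ($2).
pct_change() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

# Throughput gain in percent when time per frame drops from $1 to $2.
# Frames per hour scale as 1/time, so the gain is old/new - 1.
speedup_pct() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a / b - 1) * 100 }'
}

pct_change 28 18      # time per frame: -35.7
speedup_pct 28 18     # frames per hour: 55.6
pct_change 900 1500   # reported PPD: 66.7
```

So the 35% is the cut in time per frame. Measured as frames per hour the gain is larger, and the reported PPD jump is larger still.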

Anyway, UPGRADE AS SOON AS YOU CAN! It's WORTH THE EFFORT.
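
The delete step can be sketched roughly like this. It's a minimal sketch, not a tested upgrade tool: the function name and the directory argument are hypothetical, and the only safety check is that no process named FahCore_a2.exe is running. It can't tell whether a half-finished WU is sitting in the work directory, so that part is still on you.

```shell
#!/bin/sh
# Delete the A2 core file so the client fetches a fresh copy on next start.
# Refuses to act while a core with that name is still running.
# $1: client directory (hypothetical; use wherever your client lives).
remove_a2_core() {
  dir=$1
  core="$dir/FahCore_a2.exe"
  if pgrep -x FahCore_a2.exe > /dev/null 2>&1
  then
    echo "FahCore_a2.exe is still running; let the WU finish and stop the client first" >&2
    return 1
  fi
  if [ -f "$core" ]
  then
    rm -- "$core" && echo "removed $core"
  else
    echo "no FahCore_a2.exe in $dir; nothing to do"
  fi
}
```

Call it as `remove_a2_core /path/to/client` once the current WU has been sent and the client is shut down, then start the client again as usual.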

They say here that:
We haven't updated A1--this is just an update to A2. They are used by a different set of projects. A2 gives us a number of benefits, both scaling-wise and science-wise. As the new core settles in, we'll be offering more A2 projects and fewer A1. But that's the medium-term plan.


So the sooner we all get updated the better our production will be, and the fewer problems we'll have as a team.

Edit: There's mention of the new a2 core in notfred's Diskless Folding Virtual Appliance thread after the core's July introduction, by the way, and maybe elsewhere, but I only noticed this thing yesterday.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Mon Aug 04, 2008 11:13 pm
by Pegasus
I've got Linux SMP running under VMware with the newest client. It is running the A1 core. I added the -advmethods flag and deleted the A1 core file. It redownloaded the A1 core. Will it grab the A2 if it picks up a WU that needs it (now that I have advmethods enabled)?
They are on project 2605's.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Mon Aug 04, 2008 11:29 pm
by Ragnar Dan
Right. If you didn't have the file FahCore_a2.exe on your system, then it will download the latest version if you're ever assigned a WU that requires it. I used to be displeased with how these formerly slower-than-Project-2605 WU's reduced my production, but now they're the best ones I get.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Wed Aug 06, 2008 10:34 pm
by Pegasus
The A2 core and -advmethods added 800ppd to one of my Linux VMware setups (1300 to 2000ppd).

I have 2 VMware setups over 3 cores (because of GPU2 on 1 full core in XP). And the 2nd LinuxSMP client freezes if they both try running these new WUs with the A2 core and -advmethods. Although it works if I run just one with A2 core -advmethods and the other with A1 core.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Thu Aug 07, 2008 9:01 am
by notfred
Has anyone got any _a2 cores on a Quad yet? I have two Athlon X2 systems and as soon as I put the -advmethods flag on they got _a2 WUs for their next WU, but my Quad6600 running 2 SMP clients is still running _a1 WUs only. I've double checked that it has everything the same as my X2 machines, but no luck yet on the new core.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Thu Aug 07, 2008 1:30 pm
by JPinTO
My Quad Q9450 is running dual Linux SMP clients, and both are running 2662 WUs on the A2 core.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Sun Aug 10, 2008 5:08 pm
by just brew it!
Another data point... my Athlon64 X2 3800+ Linux box just finished processing a 2662 WU, and has started on another one. Looks like it results in a nice boost in PPD; it's doing around 1300 PPD (up from around 800 PPD before).

One minor anomaly though -- at the end of the first WU, it hung after uploading the results, and wouldn't download a new WU until I killed and restarted the client.

Guess I'll need to keep an eye on it over the next few days...

Edit: Thread stickied.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Sun Aug 10, 2008 6:30 pm
by SmokinJoe-Salem
You guys running multiple SMP clients: Are you running multiple instances in one virtual machine or multiple virtual machines?

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Sun Aug 10, 2008 11:17 pm
by Pegasus
I'm running 2 virtual machines via VMware server. Linux installed on both and running 1 FAH client on each (-smp).

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Wed Aug 13, 2008 6:46 pm
by notfred
I've seen the hangs at the end of the WU a few times now. A "ps" claims the process is in "T" state which means
man page wrote:
T Stopped, either by a job control signal or because it is being traced.

Seeing as they seem to be in a stopped state, I tried
killall -CONT FahCore_a2.exe
and that seemed to allow them to continue, and everything started up fine again without anything else needing to be done, i.e. the client keeps running and downloads the next WU correctly.
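
A slightly more selective variant of the same idea, as a rough sketch: look for cores that `ps` reports in a state starting with "T" and send SIGCONT only to those. It assumes a Linux procps-style `ps` where `comm` is the bare executable name, and the function names are made up for illustration.

```shell
#!/bin/sh
# Read "PID STAT COMMAND" lines on stdin and print the PIDs of processes
# named $1 whose state starts with T (stopped).
stopped_pids() {
  awk -v name="$1" '$3 == name && $2 ~ /^T/ { print $1 }'
}

# Send SIGCONT to every stopped process with the given name.
unstick() {
  ps -eo pid=,stat=,comm= | stopped_pids "$1" | while read -r pid
  do
    echo "resuming $1 (pid $pid)"
    kill -CONT "$pid"
  done
}
```

Usage would be `unstick FahCore_a2.exe`. Sending SIGCONT to a process that isn't stopped does nothing, so the plain killall is just as safe; this version only adds a record of which PIDs were actually stuck.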

If anyone has an account on the official Folding Forum, you may want to pass this tip along unless they are already aware.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Wed Aug 13, 2008 8:41 pm
by just brew it!
Good catch. I'll probably try setting up a cron job to automatically check for (and unstick) the stuck core.

I haven't seen any more of them after the first couple I got though; I wonder if they have temporarily stopped handing them out until they can deal with the hang issue?

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Wed Aug 13, 2008 9:35 pm
by PRIME1
hmm my Linux box is a P4 with HT.
http://folding.stanford.edu/English/FAQ-SMP#ntoc25

The SMP client was originally intended for multi-core CPUs, which generally do not support HT. For machines with 2 physical CPUs, we do recommend enabling HT for the SMP client as this presents the operating system with what looks like 4 logical processors (and our SMP client is intended for 4 processors). If you have 4 physical CPUs, we recommend against using HT, as this presents the operating system with 8 logical processors, which will make the SMP client run inefficiently (especially since the logical processors coming from HT run much slower than the normal ones).


Do the X2s or even C2Ds have hyperthreading?

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Wed Aug 13, 2008 9:43 pm
by just brew it!
PRIME1 wrote:
Do the X2s or even C2Ds have hyperthreading?

No.

Atom has it, and Intel has indicated that Nehalem will have it. But none of the current generation of mainstream x86 processors do.
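
If you want to check a particular Linux box rather than go by the CPU model, one rough way is to compare the "siblings" and "cpu cores" fields in /proc/cpuinfo. This is a sketch that assumes an x86 kernel exposing both fields; the function name is made up, and inside a VM you only see what the hypervisor presents.

```shell
#!/bin/sh
# Read /proc/cpuinfo-style text on stdin and report whether the first
# package exposes more logical processors ("siblings") than physical cores.
ht_status() {
  awk -F': *' '
    /^siblings/  && s == "" { s = $2 }
    /^cpu cores/ && c == "" { c = $2 }
    END {
      if (s == "" || c == "") print "unknown"
      else if (s + 0 > c + 0) print "HT on (" s " logical / " c " cores)"
      else print "HT off (" s " logical / " c " cores)"
    }'
}
```

Run it as `ht_status < /proc/cpuinfo`. If either field is missing it prints "unknown" rather than guessing.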

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Wed Aug 13, 2008 9:48 pm
by PRIME1
just brew it! wrote:
PRIME1 wrote:
Do the X2s or even C2Ds have hyperthreading?

No.

Atom has it, and Intel has indicated that Nehalem will have it. But none of the current generation of mainstream x86 processors do.

That's what I thought. So that's why that statement about the SMP client being for "4 processors" seems odd.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Wed Aug 13, 2008 9:56 pm
by just brew it!
PRIME1 wrote:
That's what I thought. So that's why that statement about the SMP client being for "4 processors" seems odd.

My understanding is that the code runs optimally on systems with 4 cores, but does reasonably well on dual core as well (by running 2 threads per core).

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Sat Aug 16, 2008 4:14 pm
by Ragnar Dan
notfred wrote:
I've seen the hangs at the end of the WU a few times now. A "ps" claims the process is in "T" state which means [...]

If anyone has an account on the official Folding Forum, you may want to pass this tip along unless they are already aware.

I added a post here about your discovery, and already got a reply from 7im that he's passed the info on to the Pande Group.

(And Stanford is having a network problem that has putatively just been resolved as I was reading the thread about it.)

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Sat Aug 16, 2008 6:06 pm
by just brew it!
FWIW I've turned in a couple more of the 2662s, and haven't seen a repeat of the "a2 hang" issue so far. So I guess it is intermittent...

Edit: Dang, with the increased production of this new SMP core, I might actually have a shot at getting back into the TR top 20 producers list (it's been quite a while since I've been there). Have parts for a Socket AM2 system... gotta get that puppy up and running! :D

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Sat Aug 16, 2008 11:30 pm
by Ragnar Dan
Since you're a Linux geek, how about a temporary shell script for sleeping 15 minutes and checking for it being at FINISHED_UNIT and then checking for that T state? :D

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Sun Aug 17, 2008 11:52 am
by notfred
Disclaimer, this is untested. You probably want to change the directory where it logs and the directory where you are folding.

#!/bin/sh
# Watch for clients that have uploaded a result but not gone on to fetch
# new work, and send SIGCONT to the stopped cores.
LOG=/etc/folding/hanglog.txt

# True if the last marker line in an instance's log is the upload marker,
# i.e. nothing has been logged about fetching new work since.
stuck_after_upload() {
  grep -E 'Number of Units Completed|Preparing to get new work unit' "/etc/folding/$1/FAHlog.txt" | tail -n 1 | grep -q 'Number of Units Completed'
}

while true
do
  # Run every 5 minutes
  sleep 300

  # Keep only the last 1000 lines of the log file
  if [ -f "$LOG" ]
  then
    tail -n 1000 "$LOG" > /tmp/hanglog.txt
    mv /tmp/hanglog.txt "$LOG"
  fi

  # For each instance directory (/etc/folding/1, /etc/folding/2, ...)
  instance=1
  while [ -d "/etc/folding/$instance" ]
  do
    echo "$(date)  Checking instance $instance" >> "$LOG"

    # Check for upload and not trying to download following
    if stuck_after_upload "$instance"
    then
      # Give the client a chance to continue
      echo "Potential stop found, waiting to see if it clears..." >> "$LOG"

      sleep 300
      if stuck_after_upload "$instance"
      then
        echo "Stop failed to clear, Continuing cores" >> "$LOG"
        killall -CONT FahCore_a2.exe
      fi
    fi

    instance=$((instance + 1))
  done
done
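
Since the script is untested, one way to try its detection logic without waiting for a real hang is to run the same grep pipeline against throwaway logs. The sketch below pulls that pipeline into a function that takes a log path. Only the two marker strings come from the script; the timestamps and the rest of the sample lines are made up.

```shell
#!/bin/sh
# Succeeds when the last marker line in the log ($1) is the upload marker,
# meaning no attempt to fetch new work has been logged since.
stuck_after_upload() {
  grep -E 'Number of Units Completed|Preparing to get new work unit' "$1" |
    tail -n 1 | grep -q 'Number of Units Completed'
}

# Build two throwaway logs: one that stops after the upload marker and
# one that goes on to fetch new work.
tmp=$(mktemp -d)
printf '%s\n' '[10:00:00] + Number of Units Completed: 12' > "$tmp/hung.txt"
printf '%s\n' '[10:00:00] + Number of Units Completed: 12' \
              '[10:00:05] - Preparing to get new work unit...' > "$tmp/ok.txt"

for f in hung ok
do
  if stuck_after_upload "$tmp/$f.txt"
  then echo "$f: looks stuck"
  else echo "$f: fine"
  fi
done
rm -r "$tmp"
```

One thing this makes visible: the check fires for any log whose last marker is the upload line, including a client that was shut down cleanly right after uploading. That's harmless here, because SIGCONT does nothing to a process that isn't stopped.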

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Mon Aug 18, 2008 1:07 pm
by farmpuma
How big are the 2662 and other a2 download and upload files?

edit: I notice the 2662 has the same number of atoms as the 2665, whose 22MB upload file makes it doubly distasteful to me.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Mon Aug 18, 2008 2:05 pm
by Flying Fox
farmpuma wrote:
How big are the 2662 and other a2 download and upload files?

edit: I notice the 2662 has the same number of atoms as the 2665, whose 22MB upload file makes it doubly distasteful to me.

I honestly don't care. It gives me 100ppd more so I am happy. 8)

----

farmpuma jedi post saving edit: You sir, obviously have a real (broadband) internet connection, you lucky fox. Try taking two to three hours to upload a finished 2665 WU, if the internet connection doesn't drop out halfway through, and then tell me how happy you are.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Mon Aug 18, 2008 3:36 pm
by Ragnar Dan
They're big all right:
[03:49:21] + Attempting to send results
[03:49:21] - Reading file work/wuresults_06.dat from core
[03:49:22]   (Read 26669664 bytes from disk)


That's the last 2662 I uploaded on this PC last night.

My X2-3800, though, Flying Fox, is running at 2080 MHz, and is giving me nearly 40% better output (i.e. nearly 400 PPD) than the Project 2605 WU's were giving. And it's running on an NF3 motherboard (which I think I've read people complain that nVidia never updated Windows drivers to allow X2's to run on, or some such thing).
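
To put that 26669664-byte result file in perspective, here is a quick estimate of upload time at a given sustained link speed. The two link speeds are hypothetical examples, not anyone's measured connection, and the calculation ignores protocol overhead and retries.

```shell
#!/bin/sh
# Minutes needed to send $1 bytes over a link sustaining $2 kilobits/s.
upload_minutes() {
  awk -v bytes="$1" -v kbps="$2" 'BEGIN { printf "%.0f\n", bytes * 8 / (kbps * 1000) / 60 }'
}

upload_minutes 26669664 33.6   # hypothetical dial-up uplink: 106
upload_minutes 26669664 512    # hypothetical broadband uplink: 7
```

Going the other way, farmpuma's two to three hours for a 22MB file works out to something like 16 to 24 kbit/s sustained.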

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Mon Aug 18, 2008 3:39 pm
by Ragnar Dan
farmpuma: I wouldn't edit someone else's post if I were a moderator, unless he violated forum rules.

----

farmpuma edit: Removed my misguided edit.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Mon Aug 18, 2008 4:16 pm
by Flying Fox
Ragnar Dan wrote:
My X2-3800, though, Flying Fox, is running at 2080 MHz, and is giving me nearly 40% better output (i.e. nearly 400 PPD) than the Project 2605 WU's were giving. And it's running on an NF3 motherboard (which I think I've read people complain that nVidia never updated Windows drivers to allow X2's to run on, or some such thing).
Very interesting. I am back to stock 2.0GHz on my X2, with notfred's diskless folding appliance the 2605's are doing 650ppd and the 2662 is now showing 750ppd.

On another note, I think farmpuma just wants to stay as a Jedi. :lol:

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Mon Aug 18, 2008 4:54 pm
by farmpuma
Ragnar Dan and I wrote:
farmpuma: I wouldn't edit someone else's post if I were a moderator, unless he violated forum rules.

----

farmpuma jedi post saving edit: Yeah, you're probably technically correct. I really shouldn't depend on the kindness of strangers just to remain a Gerbil Jedi as long as I possibly can. It's coming down to the wire with fewer than forty Jedi posts left and then I'll probably start slinging posts like buckshot.

Thanks for the logfile snippet. Am I correct in assuming the client didn't do any file compression before it uploaded?

You sir are indeed correct. It seems my misguided efforts have created a disturbance in the force, which was never my intention. I sincerely apologize to all whom I may have offended and I will remove said edits if any of you feel particularly violated, while doing my best to maintain thread continuity.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Mon Aug 18, 2008 5:00 pm
by farmpuma
Ragnar Dan wrote:
They're big all right:
[03:49:21] + Attempting to send results
[03:49:21] - Reading file work/wuresults_06.dat from core
[03:49:22]   (Read 26669664 bytes from disk)


That's the last 2662 I uploaded on this PC last night.

Thank you for the logfile snippet. Am I correct in assuming the client did no file compression before uploading?

edit: Improved grammar.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Mon Aug 18, 2008 5:23 pm
by Ragnar Dan
farmpuma: I didn't realize you were interested in maintaining your Gerbil ... title, or whatever they call it. The thing that I think is wrong about editing others' posts for other than official reasons is that people can't tell who wrote what any more, and the validity of going back to look at a post to resolve disputes is destroyed when those with site privileges can go back and modify things after the fact.

You could always make a farmpuma#2 account if you really thought it important, though.

And just FYI, I'm totally lost now with your posts. It reads like you asked a question and then answered it as though you were a different person, but in the order of answer first and then question.

It appears to me that they do not compress the upload data for some odd reason. I've wondered about that for some time, but so far haven't remembered to bring up the question on the folding forum site.

Flying Fox: Well, the X2-3800 I'm using is running straight from notfred's ISO, booting from a CD-R. If yours is running under VMWare that will reduce output of course, but even so that seems too much unless you're doing something serious that really eats cycles. I have noticed that 2662's really seem to notice RAM speed. On my C2D the change from the DDR2 running at 784 MHz to running at 980 MHz (RAM multiplier from 2x to 2.5x) was fairly significant. I'm trying to find a way to keep my Crucial Ballistix from committing suicide like the first pair I had in that machine did. In the end I decided to just let it do what it's going to do and buy a different brand if and when that becomes necessary.

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Sat Nov 01, 2008 2:28 pm
by Gerbil Jedidiah
I saw a video on YouTube about the SMP client. Apparently, the SMP client runs multiple cores that talk to each other about their progress. This helps speed up the folding process as they learn from each other's mistakes. That's why they recommend running 1 client on 4 cores instead of 2 clients on 2 cores (for quad core machines).

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Mon Nov 03, 2008 1:22 pm
by jeffry55
david00214 wrote:
I saw a video on YouTube about the SMP client. Apparently, the SMP client runs multiple cores that talk to each other about their progress. This helps speed up the folding process as they learn from each other's mistakes. That's why they recommend running 1 client on 4 cores instead of 2 clients on 2 cores (for quad core machines).


Uh-Oh ...................................... SMP folding clients talking to each other and learning!!?? :o Oh boy, The Terminator is coming for sure now. :lol:

Re: Linux/OSX SMP core update increases throughput > 35%

Posted: Wed Mar 11, 2009 4:28 pm
by adisor19
Unfortunately the OS X 10.5.6 CPU scheduler is still broken. Apple is still denying there is a problem to begin with, so I can't fold on my MBP.

Whenever I fold, my computer slows down to a crawl (when playing flash videos for example) as the scheduler is not treating "Niced" programs as Nice priority but rather as User priority, so it's competing with all my other programs for resources.

Here's hoping 10.6 will fix this, although from what I've seen in the leaked seeds, it doesn't look like it.

Someone at FAH needs to release a SINGLE CORE client for OS X. This is getting beyond ridiculous.

Adi