new to this, want to help. (notfred PXE)


Posted on Fri Jan 02, 2009 11:25 am

Hi all, I just downloaded the stuff in Notfred's diskless PXE boot information and am ready to start setting up my little network of 7 C2D computers to add to you all's efforts.

I guess my only stupid question is what team number and information do I need to put in the default file? And do I need to set up anything at the Folding@home website? Also, is this going to utilize both cores?

Thanks
Last edited by Shinare on Mon Jan 05, 2009 9:34 am, edited 1 time in total.
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help.

Posted on Fri Jan 02, 2009 11:31 am

Heh, ok, just looking at the file I see the answer to my first stupid question. I've just changed my name to "Shinare" after realizing on the F@H website that I must have already set myself up a long time ago for TR's folding stuff and forgot.

But I still want to make sure that I will be using both cores. In the default file I see SMPCPUS=4. Do I need to change that to 2 since I only have 2 cores?
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help.

Posted on Fri Jan 02, 2009 12:17 pm

Ok, so now I've answered all my own questions. :) I have them all up and running I guess. Hope my little bit helps.

James
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help.

Posted on Fri Jan 02, 2009 12:51 pm

Yep, seven C2Ds will be much more than a little help and welcome back to team 2630!
.* * M-51 * *. .The Whirlpool Galaxy.
farmpuma
Minister of Gerbil Affairs
Silver subscriber
 
 
Posts: 2305
Joined: Sun Mar 21, 2004 11:33 pm
Location: Soybean field, IN, USA, Earth .. just a bit south of John .. err .... Fart Wayne, Indiana

Re: new to this, want to help.

Posted on Sat Jan 03, 2009 1:35 pm

Sorry, I'm not familiar with the Notfred setup. You may want to post your SMPCPUS=4 question on a notfred thread so that others familiar with it can help you.

Which processors do you have?? With 7 Quads folding, you should get 30k-40k Points Per Day.

- JP
JPinTO
Gerbil Team Leader
 
Posts: 240
Joined: Sat Jun 30, 2007 6:02 am
Location: Toronto, Ontario

Re: new to this, want to help.

Posted on Sat Jan 03, 2009 6:35 pm

On a 2 processor setup, it won't make any difference.

It's about determining the number of copies of SMP folding to run: Number to run = Number of processors / SMPCPUs parameter
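For anyone following along, here is a rough sketch of that formula. The function name `smp_instances` is made up for illustration, and rounding down with a floor of one instance is an assumption (it is not taken from the diskless scripts); it is chosen only because it matches the statement that a 2 processor box behaves the same either way.

```python
def smp_instances(processors: int, smpcpus: int) -> int:
    """Copies of SMP folding to run: processors / SMPCPUS.

    Assumption: the division rounds down and never drops below one copy.
    """
    if processors < 1 or smpcpus < 1:
        raise ValueError("processors and smpcpus must be positive")
    return max(1, processors // smpcpus)


# A dual core gets one instance whether SMPCPUS is 4 or 2.
for cores in (2, 4, 8):
    print(cores, "cores, SMPCPUS=4 ->", smp_instances(cores, 4), "instance(s)")
```

Under that assumption, only a machine with 8 or more processors would start a second copy with SMPCPUS=4.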
notfred
Grand Gerbil Poohbah
 
Posts: 3647
Joined: Tue Aug 10, 2004 9:10 am
Location: Ottawa, Canada

Re: new to this, want to help.

Posted on Mon Jan 05, 2009 9:06 am

Thanks for the info. It looks as though it is running. I left it running over the weekend and was a little concerned when I came in to not be able to connect to the IP addresses that I was able to on Friday from the "server". But looking at the tftp program it looks as though they all got new IP addresses. Is this normal behavior?

Also, I installed just the regular FAH client on my workstation Friday but it looks as though it is only using 1 core (my computer says 51% utilization right now). Can I tell it somewhere to use both cores?

Thanks again.
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Mon Jan 05, 2009 9:56 am

Shinare wrote: Also, I installed just the regular FAH client on my workstation Friday but it looks as though it is only using 1 core (my computer says 51% utilization right now). Can I tell it somewhere to use both cores?

Thanks again.


If you're running the Win Console client, there's an option to run 'advanced configuration' during the fah configuration. Choose 'yes' to enter it and, under 'additional parameters', type '-smp' to enable multi-core.
Fold! And I don't mean your clothes!

Do you have a favorite gerbil recipe? Please share with the TR community!
flybywire
Gerbil Jedi
 
Posts: 1883
Joined: Wed Jun 16, 2004 1:28 pm
Location: Springfield, VA - USA

Re: new to this, want to help. (notfred PXE)

Posted on Mon Jan 05, 2009 10:32 am

flybywire wrote:
Shinare wrote: Also, I installed just the regular FAH client on my workstation Friday but it looks as though it is only using 1 core (my computer says 51% utilization right now). Can I tell it somewhere to use both cores?

Thanks again.


If you're running the Win Console client, there's an option to run 'advanced configuration' during the fah configuration. Choose 'yes' to enter it and, under 'additional parameters', type '-smp' to enable multi-core.


OK, I've removed the tray icon version and ran the console version using the advanced config. I told it to start as a service with everything else default except for the additional configuration which I put in -smp as told. Now when I rebooted it does not start, and in the event log I have:

"The Folding@home-CPU-[1] service terminated unexpectedly."

Sorry to be such a pain, I just want to make sure all the cores I have available are folding. Is there a way I can remove this service and try again?

Also, back to notfred's diskless PXE client: when I hit the IP address of the client, should I be seeing more than 1 instance? All I see is "Instance 1".
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Mon Jan 05, 2009 11:20 am

On a dual core processor, you'll only see 1 instance.

BTW you should generally try to keep the same machines on the same IPs in the DHCP server; there were reports from people that things tended to get messed up with my diskless stuff if they were not. I'm not sure if everything got fixed up in that area or not. It's complicated with the Stanford stuff using MPI over the loopback interface.
notfred
Grand Gerbil Poohbah
 
Posts: 3647
Joined: Tue Aug 10, 2004 9:10 am
Location: Ottawa, Canada

Re: new to this, want to help. (notfred PXE)

Posted on Mon Jan 05, 2009 11:42 am

OK, just making sure. I've forced the remote reboot on all of them and now they all say the correct IP address on their page.

On the Windows client, I removed that initial service I made and have re-set up based on the suggestion in another thread I found to make two directories and just make two services. (I did not specify SMP anywhere but it seems to be working.) I'm going to set this up this way on a few more E8500s that can run all the time.
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Mon Jan 05, 2009 12:26 pm

To run the Windows SMP client, you must run a different executable, available on this page. If I understand your post correctly, you're currently simply running 2 instances of the standard console client, each of which will (usually) run on its own core, and not interact, and which will produce less than 1/3 of the daily point totals the SMP client will on those C2D processors.

I always add the "-verbosity 9" parameter to any folding instance I run, so that any errors that occur will generate all possible messages available. It may be that if you do that the "terminated unexpectedly" error will list more information about it in the instance's FAHlog.txt file.

I've never run the Windows SMP client, and last I've seen people are frustrated with its unreliability, which is why many who run dedicated machines either run them as Linux machines for better production and greater reliability, or run Linux in a Virtual Machine hosted on Windows, which is what I do on the machine I'm posting from. Not many people who have control over machines others use will run a Linux VM, though, because it does eat cycles and often makes the machine noticeably sluggish, and the users will complain. But you'll probably want to find a better source than me for information about the Windows SMP clients available at the aforementioned link.
Ragnar Dan
Gerbil Elder
 
Posts: 5348
Joined: Sun Jan 20, 2002 6:00 pm

Re: new to this, want to help. (notfred PXE)

Posted on Mon Jan 05, 2009 1:43 pm

Hey thanks for that link. I downloaded the beta you linked me to and set it up, then ran the install.bat and got the "if you see this message twice it works." twice. So I guess it's working. Then made a shortcut to the new exe in there and added -smp to the end of it. Then ran it and set it up as a service and it continued to come up in SMP mode. So I killed it and started the service but it only ran on one core. So I went into the registry and added -smp to the command line for the service and it bombs out. It works well in SMP when I just run the shortcut I made, so I guess I won't be running it as a service, which stinks as I would have liked to install it on other computers that are also E8500s but are used periodically throughout the day by others.
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Tue Jan 06, 2009 9:37 am

Well, based on the last couple days, it looks like I'm getting about 10k points per day. I guess I was expecting more from the earlier comment. Anyway, I guess every little bit helps. :)
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Tue Jan 06, 2009 1:55 pm

Well, 10k per day would be pretty sweet in my book as I'm still struggling to reach 5k per day.

IIRC, you can run the Win SMP client as a service, but you have to set it up in Windows services (run, services.msc or in the msconfig tab .. or wherever, as I don't do services other than shutting them off) rather than using the client option.

Also watch out for machine IP address changes or even IP refreshes, which can freeze the SMP client.
.* * M-51 * *. .The Whirlpool Galaxy.
farmpuma
Minister of Gerbil Affairs
Silver subscriber
 
 
Posts: 2305
Joined: Sun Mar 21, 2004 11:33 pm
Location: Soybean field, IN, USA, Earth .. just a bit south of John .. err .... Fart Wayne, Indiana

Re: new to this, want to help. (notfred PXE)

Posted on Tue Jan 06, 2009 2:03 pm

Yah, I've noticed notfred's PXE clients changing IP addresses at what seems like random times. If I hit the new IP address with the browser, it still shows the old IP address in the info page. I then hit the remote reboot and reboot the client. Then it comes up on the same new IP with the correct IP listed. Not sure if I need to keep doing that (the rebooting part) but it seems to be doing that at least once a day for all machines.

This doesn't look to be the "set it and forget it" I was hoping for. :)
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Tue Jan 06, 2009 3:59 pm

You need to set up whatever is doing the DHCP to fix IP addresses to MAC addresses; that way they will not change.
notfred
Grand Gerbil Poohbah
 
Posts: 3647
Joined: Tue Aug 10, 2004 9:10 am
Location: Ottawa, Canada

Re: new to this, want to help. (notfred PXE)

Posted on Tue Jan 06, 2009 4:45 pm

OK, I see in the TFTP program that to do that you edit the ini under the DHCP section and type in MacAddress=IPAddress for each node. I've done that and hopefully that will fix things. They are booting up right now.
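In case it saves someone typing, here is a small sketch that prints one MacAddress=IPAddress line per node. The MAC addresses are made up, and the exact MAC formatting the ini file expects is an assumption; only the MacAddress=IPAddress shape comes from the post above.

```python
import ipaddress


def reservation_lines(nodes: dict[str, str]) -> list[str]:
    """Build 'MacAddress=IPAddress' lines, sorted by IP address.

    ipaddress.ip_address() raises ValueError on a malformed IP, which
    catches typos before they reach the ini file.
    """
    ordered = sorted(nodes.items(), key=lambda kv: ipaddress.ip_address(kv[1]))
    return [f"{mac}={ip}" for mac, ip in ordered]


# Hypothetical MAC addresses, one entry per folding node.
nodes = {
    "00:11:22:33:44:30": "192.168.1.30",
    "00:11:22:33:44:12": "192.168.1.12",
}
print("\n".join(reservation_lines(nodes)))
```

The output would still need pasting into the DHCP section of the ini by hand.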

Thanks for your reply!
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Wed Jan 07, 2009 11:13 am

Does it take longer for some work units than others? I have the SMP console running on a dual core 3.16GHz E8500 and it's doing about 1% of a 250000 step WU in 17 minutes. I have a dual core 2.6GHz E8200 (most of the computers in my farm are these) that is doing about 1% of a 500000 step WU in about 15 minutes. I would think the slower computer doing twice the work to get a single percentage point done would take at least twice as long, not do it quicker. I must be misunderstanding something, heh.
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Wed Jan 07, 2009 11:28 am

Of course, each WU type varies in terms of processing times. Take note of the project number (say, 2665) and you can look up the usual times that people are reporting.
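To put numbers on the two machines mentioned above, here is a quick back-of-the-envelope sketch. The helper names are made up for illustration; the step counts and minutes per percent are the ones quoted in the question.

```python
def steps_per_minute(total_steps: int, minutes_per_percent: float) -> float:
    """Simulation steps completed per minute, given the time for 1% of the WU."""
    return (total_steps / 100) / minutes_per_percent


def hours_per_wu(minutes_per_percent: float) -> float:
    """Estimated hours to finish the whole WU at the current pace."""
    return minutes_per_percent * 100 / 60


# E8500: 1% of a 250000 step WU in 17 minutes.
print(round(steps_per_minute(250_000, 17), 1), "steps/min,", round(hours_per_wu(17), 1), "h per WU")
# E8200: 1% of a 500000 step WU in 15 minutes.
print(round(steps_per_minute(500_000, 15), 1), "steps/min,", round(hours_per_wu(15), 1), "h per WU")
```

The slower chip shows more than twice the steps per minute, which is only possible if a step in one project costs far less than a step in the other. Step counts are not comparable across projects, so compare against times reported for the same project number.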
The Model M is not for the faint of heart. You either like them or hate them.

Gerbils unite! Fold for UnitedGerbilNation, team 2630.
Flying Fox
Gerbil God
 
Posts: 24141
Joined: Mon May 24, 2004 1:19 am

Re: new to this, want to help. (notfred PXE)

Posted on Wed Jan 07, 2009 11:58 am

Are you using something like fahmon or fahspy for monitoring?
Usacomp2k3
Gerbil God
 
Posts: 21240
Joined: Thu Apr 01, 2004 3:53 pm
Location: Orlando, FL

Re: new to this, want to help. (notfred PXE)

Posted on Wed Jan 07, 2009 12:08 pm

Nope, will look at both. Thanks! Like I said, new to this. :)
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Wed Jan 07, 2009 12:12 pm

Shinare wrote:Nope, will look at both. Thanks! Like I said, new to this. :)

In that case, I'll recommend them. I've used fahmon, and it worked great. Haven't personally used fahspy, although I think others here have.
Usacomp2k3
Gerbil God
 
Posts: 21240
Joined: Thu Apr 01, 2004 3:53 pm
Location: Orlando, FL

Re: new to this, want to help. (notfred PXE)

Posted on Wed Jan 07, 2009 1:18 pm

Thanks for that. I've loaded it up. It was a little odd trying to figure out where to point them but I think I figured it out. I pointed each client to something such as:

\\192.168.1.12\c\etc\folding\1\

I hope this is correct, it looks like it is working. This is what I got when I first loaded it:

Image

That one that's named "30" took rebooting like 4 times before it would come up. On the third try I decided to hook the monitor up to it and it said it could not contact Stanford or something like that and to check my DNS information. So I turned it off again and back on and watched it on the monitor and it came right up. However, when it came up I got a bunch of this business in the tftpd32 server:

...
Ack block 1685 ignored (received twice) [07/01 13:00:32.796]
Ack block 1685 ignored (received twice) [07/01 13:00:32.875]
Ack block 1685 ignored (received twice) [07/01 13:00:33.000]
Ack block 1685 ignored (received twice) [07/01 13:00:33.171]
Ack block 1685 ignored (received twice) [07/01 13:00:33.421]
Ack block 1685 ignored (received twice) [07/01 13:00:33.812]
Ack block 1685 ignored (received twice) [07/01 13:00:34.375]
Ack block 1685 ignored (received twice) [07/01 13:00:35.234]
Ack block 2659 ignored (received twice) [07/01 13:00:36.406]
Ack block 2659 ignored (received twice) [07/01 13:00:36.484]
Ack block 2659 ignored (received twice) [07/01 13:00:36.593]
Ack block 2659 ignored (received twice) [07/01 13:00:36.765]
Ack block 2659 ignored (received twice) [07/01 13:00:37.031]
...


Took forever to download the file it was downloading. Is this something to live with or is something wrong with my network?
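For sizing up how widespread this is, a small sketch that tallies the duplicate-ACK lines per block number from a saved copy of the tftpd32 log. The regular expression is written against the lines pasted above; the rest of the log format is an assumption. This only counts lines and does not diagnose the cause.

```python
import re
from collections import Counter

# Matches lines such as:
#   Ack block 1685 ignored (received twice) [07/01 13:00:32.796]
ACK_RE = re.compile(r"Ack block (\d+) ignored \(received twice\)")


def duplicate_acks(log_text: str) -> Counter:
    """Count how many times each block number was acknowledged twice."""
    return Counter(int(m.group(1)) for m in ACK_RE.finditer(log_text))


sample = """\
Ack block 1685 ignored (received twice) [07/01 13:00:32.796]
Ack block 1685 ignored (received twice) [07/01 13:00:32.875]
Ack block 2659 ignored (received twice) [07/01 13:00:36.406]
"""
for block, count in duplicate_acks(sample).most_common():
    print(f"block {block}: {count} duplicate ACK(s)")
```

A handful of blocks with many repeats each would look different from one or two repeats spread over every block, which might help narrow down where the transfer stalls.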

Here's the fahmon now that it's back up and running:

Image

EDIT: I also wanted to ask if anyone knows if there is a way in tftpd32 to specify a host name as well as an IP for a MAC address? I'm not sure if this had anything to do with "30" not coming up 4 times in a row but wanted to know if I could force a name on them anyway.
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Thu Jan 08, 2009 11:30 am

I wonder if you have a bad Ethernet cable or a bad connection in the RJ-45. Have you tried swapping cables?

I don't know if tftpd32 supports it, but there is a way in the DHCP protocol to specify hostname (Option 12) and my diskless stuff will respect the hostname given.
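For reference, Option 12 on the wire is just an option code byte (12), a length byte, and the name itself. A minimal sketch of that encoding follows; the function name and the hostname "folder30" are only examples, and this builds the option bytes only, not a full DHCP reply.

```python
def hostname_option(name: str) -> bytes:
    """Encode a DHCP Host Name option: code 12, length, then the name."""
    raw = name.encode("ascii")
    if not 1 <= len(raw) <= 255:
        raise ValueError("hostname must be 1 to 255 bytes long")
    return bytes([12, len(raw)]) + raw


# Example hostname; any per-node name would be encoded the same way.
print(hostname_option("folder30").hex(" "))
```

Whether a given DHCP server lets you set this per MAC address is a separate question from the encoding itself.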
notfred
Grand Gerbil Poohbah
 
Posts: 3647
Joined: Tue Aug 10, 2004 9:10 am
Location: Ottawa, Canada

Re: new to this, want to help. (notfred PXE)

Posted on Thu Jan 08, 2009 12:06 pm

Thanks for the suggestions. Replacing the patch cable and trying a different port on the switch both had no effect.

When restarted, it downloads the initrd thing and kernal64 just fine, very speedy. Then it downloads something like 192.168.1.30.A.1 and fills the log file with those ACK things and takes like 3 minutes. It must be something with that computer's onboard NIC I guess, as I don't think any of the other computers are exhibiting this behavior. However, I don't think I have had to reboot them abruptly. Maybe I should go pull the plug on one and see what happens. heh

Anyways, I poked around in tftpd32 for a little while trying to see if you could do something like MACAddress=IPAddress:HostName to specify "Folder30" as the host name but I didn't see anything. I'm using tftpd32 because that was what was in the instructions for the diskless PXE. Is there something better out there? I don't want to break the whole PXE boot thing going on.
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Thu Jan 08, 2009 1:41 pm

Just as an update, I am getting that ACK stuff on all the computers when they are downloading their backup files. *shrug*
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Thu Jan 08, 2009 1:55 pm

The backup files might be giant due to the well-known bug in the Stanford code where wuinfo.txt ends up huge. That could be the problem on that node.

For DHCP/TFTP server software, I don't really know about Windows as I use all Linux. I had a quick look at the tftpd32 documentation and didn't see any way to set the hostname on a per-host basis.
notfred
Grand Gerbil Poohbah
 
Posts: 3647
Joined: Tue Aug 10, 2004 9:10 am
Location: Ottawa, Canada

Re: new to this, want to help. (notfred PXE)

Posted on Thu Jan 08, 2009 2:07 pm

OK, I'll keep playing with stuff when I get time.

One odd thing, I just lost 4000 points according to my little image in my signature. It was saying over 12,000 for today about an hour ago. :(
Shinare
Gerbil XP
 
Posts: 352
Joined: Wed Jul 06, 2005 11:48 am

Re: new to this, want to help. (notfred PXE)

Posted on Thu Jan 08, 2009 2:11 pm

It shows 12,260 here.
Usacomp2k3
Gerbil God
 
Posts: 21240
Joined: Thu Apr 01, 2004 3:53 pm
Location: Orlando, FL
