 
Hoobie7
Gerbil
Topic Author
Posts: 21
Joined: Mon Oct 29, 2007 11:27 am
Contact:

Can I force 32 bit to run?

Fri Nov 02, 2007 12:42 pm

Ok, so I've now got three systems running in our farm. All three are C2D's. And I woke up today to all 3 hung at 100%.
[16:35:46] Completed 500000 out of 500000 steps  (100 percent)
[16:35:46] Writing final coordinates.
[16:35:46] Past main M.D. loop
[16:35:46] Will end MPI now
[16:36:46]
[16:36:46] Finished Work Unit:
[16:36:46] - Reading up to 3726048 from "work/wudata_01.arc": Read 3726048
[16:36:46] - Reading up to 1785888 from "work/wudata_01.xtc": Read 1785888
[16:36:46] goefile size: 0
[16:36:46] logfile size: 19006
[16:36:46] Leaving Run
[16:36:47] - Writing 5535342 bytes of core data to disk...
[16:36:47]   ... Done.
[16:36:47] - Shutting down core
[16:36:47]
[16:36:47] Folding@home Core Shutdown: FINISHED_UNIT


[08:55:35] Completed 500000 out of 500000 steps  (100 percent)
[08:55:35] Writing final coordinates.
[08:55:35] Past main M.D. loop
[08:55:35] Will end MPI now
[08:56:35]
[08:56:35] Finished Work Unit:
[08:56:35] - Reading up to 3723552 from "work/wudata_01.arc": Read 3723552
[08:56:35] - Reading up to 1780560 from "work/wudata_01.xtc": Read 1780560
[08:56:35] goefile size: 0
[08:56:35] logfile size: 25722
[08:56:35] Leaving Run
[08:56:38] - Writing 5534234 bytes of core data to disk...
[08:56:38]   ... Done.
[08:56:38] - Shutting down core
[08:56:38]
[08:56:38] Folding@home Core Shutdown: FINISHED_UNIT
[13:38:53] - Autosending finished units...
[13:38:53] Trying to send all finished work units
[13:38:53] + No unsent completed units remaining.
[13:38:53] - Autosend completed


[12:00:25] Completed 500000 out of 500000 steps  (100 percent)
[12:00:25] Writing final coordinates.
[12:00:25] Past main M.D. loop
[12:00:25] Will end MPI now
[12:01:26]
[12:01:26] Finished Work Unit:
[12:01:26] - Reading up to 3718704 from "work/wudata_03.arc": Read 3718704
[12:01:26] - Reading up to 1772124 from "work/wudata_03.xtc": Read 1772124
[12:01:26] goefile size: 0
[12:01:26] logfile size: 25098
[12:01:26] Leaving Run
[12:01:30] - Writing 5520326 bytes of core data to disk...
[12:01:30]   ... Done.
[12:01:30] - Shutting down core
[12:01:30]
[12:01:30] Folding@home Core Shutdown: FINISHED_UNIT
[12:55:36] - Autosending finished units...
[12:55:36] Trying to send all finished work units
[12:55:36] + No unsent completed units remaining.
[12:55:36] - Autosend completed


So is there a way to force them to run in 32 bit since there's a bug in 64bit SMP??

Thanks
Hoobie :cry:
 
cass
Minister of Gerbil Affairs
Posts: 2269
Joined: Mon Feb 10, 2003 9:12 am
Contact:

Fri Nov 02, 2007 1:26 pm

The old version I had would do it... I will have to look at the current version.

If notfred doesn't get to this before then, maybe I can help you tonight if I still have the old version tarballed.

Maybe you could try running from a CD-ROM? That supposedly works, if you have some extra CD-ROMs. That may be what I try tonight, because I have 4 diskless SMP nodes. Can't say for sure, since I haven't tried it yet. I also have enough HDDs to load debian64 up disk-based.
Last edited by cass on Fri Nov 02, 2007 1:29 pm, edited 1 time in total.
 
7im
Gerbil
Posts: 51
Joined: Sat Sep 02, 2006 6:20 pm

Fri Nov 02, 2007 1:29 pm

What makes you think this is a 64 bit problem?
 
cass
Minister of Gerbil Affairs
Posts: 2269
Joined: Mon Feb 10, 2003 9:12 am
Contact:

Fri Nov 02, 2007 1:34 pm

7im wrote:
What makes you think this is a 64 bit problem?


You mean there is a 32bit smp that works? :-)

I think the reference is to 64-bit SMP on notfred's diskless, PXE-booted memory-filesystem nodes running the new 6.x Linux client. That particular combination is a hangathon at the end of every WU. I gave up last night and just deleted two units, because nothing I did would make them finish. I probably should copy the results over SMB to a Windows machine and see if it would finish them. Hmm.
 
notfred
Maximum Gerbil
Posts: 4610
Joined: Tue Aug 10, 2004 10:10 am
Location: Ottawa, Canada

Fri Nov 02, 2007 2:17 pm

You just need to edit the .cfg file and remove the "default64" entry if you only want to run 32 bit. If this is CD it is isolinux.cfg, if USB then it is syslinux.cfg, if PXE boot then it is pxelinux.cfg/default.
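As a rough sketch of that edit (not part of notfred's tooling), the Python below drops any line whose first keyword is DEFAULT64 from the text of a syslinux-style config. The function name `strip_default64` and the sample text are made up for illustration. It assumes the entry sits on its own line, in the form "DEFAULT64 fold64" seen in the config posted later in this thread.

```python
def strip_default64(cfg_text: str) -> str:
    """Return cfg_text without any line whose first keyword is DEFAULT64.

    Hypothetical helper: assumes the 64-bit default is a single line such as
    "DEFAULT64 fold64". LABEL/KERNEL/APPEND lines are left untouched.
    """
    kept = []
    for line in cfg_text.splitlines(keepends=True):
        words = line.split()
        # Compare only the first keyword, case-insensitively.
        if words and words[0].upper() == "DEFAULT64":
            continue
        kept.append(line)
    return "".join(kept)


# Illustrative sample, shortened from the config quoted in this thread.
sample = (
    "DEFAULT fold\n"
    "DEFAULT64 fold64\n"
    "\n"
    "LABEL fold\n"
    "KERNEL kernel32\n"
    "\n"
    "LABEL fold64\n"
    "KERNEL kernel64\n"
)

print(strip_default64(sample))
```

The "LABEL fold64" section is deliberately left in place; only the default selection is removed. Back up isolinux.cfg, syslinux.cfg, or pxelinux.cfg/default before writing any result back.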
 
Hoobie7
Gerbil
Topic Author
Posts: 21
Joined: Mon Oct 29, 2007 11:27 am
Contact:

Fri Nov 02, 2007 7:13 pm

notfred - thanks I'll give that a shot next

This is the odd part: the third node that I just added is currently running off CD. So I'm really starting to wonder if I'm doing something wrong or if there's something odd about my network that's causing issues.

Hmm
Hoobie
 
Hoobie7
Gerbil
Topic Author
Posts: 21
Joined: Mon Oct 29, 2007 11:27 am
Contact:

Sat Nov 03, 2007 5:26 pm

ROFL - Oh, I'm so confused!

DEFAULT fold
DEFAULT64 fold64

LABEL fold
KERNEL kernel32
APPEND initrd=initrd USER=notfred2630 TEAM=2630 BIG=yes BACKUP=15 REBOOT=enabled INSTALL=yes

LABEL fold64
KERNEL kernel64
APPEND initrd=initrd USER=notfred2630 TEAM=2630 BIG=yes BACKUP=15 REBOOT=enabled INSTALL=yes


I deleted the bold part. Now it runs 32-bit, but it's two single instances. Is there a way to get 32-bit SMP to run? I know I'm a pain in the arse, but I really wanna know what's going on now! So the known 100% hang bug is only in the PXE flavor of 64-bit? They download WUs fine; could there be something wrong with my network?

Thanks for the help
Hoobie :-?
 
Flying Fox
Gerbil God
Posts: 25690
Joined: Mon May 24, 2004 2:19 am
Contact:

Sat Nov 03, 2007 5:30 pm

Unfortunately I don't think there is a 32-bit SMP client. If it runs 2 instances of the 32-bit single clients all your cores will still be fully utilized.
The Model M is not for the faint of heart. You either like them or hate them.

Gerbils unite! Fold for UnitedGerbilNation, team 2630.
 
notfred
Maximum Gerbil
Posts: 4610
Joined: Tue Aug 10, 2004 10:10 am
Location: Ottawa, Canada

Sun Nov 04, 2007 7:47 pm

No 32-bit SMP client for Linux from Stanford yet. Running as 32-bit, my program will detect all the processors and run an instance per processor, so it will use all your resources, but the points score will not be as high as SMP folding. The hang at 100% is just part of the SMP client's beta-ness, and it seems my program provokes it more than some others.
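To illustrate the "instance per processor" idea, here is a minimal Python sketch. It is not notfred's actual code; `plan_instances` and the `instanceN` directory names are hypothetical. It counts the processors and plans one single-threaded client working directory for each.

```python
import os


def plan_instances(cpu_count=None):
    """Return one hypothetical working-directory name per processor.

    If cpu_count is not given, fall back to os.cpu_count(), which can
    return None on some platforms; in that case assume one processor.
    """
    n = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return [f"instance{i}" for i in range(1, n + 1)]


# A dual-core machine such as a C2D gets two single-client instances.
print(plan_instances(2))
```

On a Core 2 Duo this plans two instances, which matches the "two single instances" seen after switching to 32-bit.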
