notfred wrote:The drive is shared on the network, so you could just copy from that to do a backup. A restore would require you to overwrite the files before a backup kicked in and then reboot the diskless folding instance.
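For anyone who wants to script the copy-from-the-share backup described above, here is a minimal sketch in Python. It assumes the share is already mounted or mapped somewhere on the machine doing the backup; the function name and both paths are made up for illustration and are not part of notfred's image.

```python
import shutil
import time
from pathlib import Path


def backup_share(share_dir, backup_root):
    """Copy the whole shared folding directory into a new timestamped folder.

    share_dir and backup_root are hypothetical paths chosen by the caller.
    """
    src = Path(share_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"fah-backup-{stamp}"
    # copytree refuses to write into an existing destination, so each run
    # makes a fresh snapshot instead of clobbering an older one.
    shutil.copytree(src, dest)
    return dest


# Example call (paths are placeholders, adjust to your own setup):
# backup_share(r"\\folding-vm\c", r"D:\fah-backups")
```

This only covers the backup half. The restore still has to be done the way the post describes: overwrite the files on the share before the next backup kicks in, then reboot the diskless instance.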
--- Opening Log file [June 10 14:21:40]
# SMP Client ##################################################################
###############################################################################
Folding@Home Client Version 6.02beta
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: /etc/folding/1
Executable: ./fah6
Arguments: -local -forceasm -verbosity 9 -smp
Warning:
By using the -forceasm flag, you are overriding
safeguards in the program. If you did not intend to
do this, please restart the program without -forceasm.
If work units are not completing fully (and particularly
if your machine is overclocked), then please discontinue
use of the flag.
[14:21:40] - Ask before connecting: No
[14:21:40] - User name: nomad8u (Team 33)
[14:21:40] - User ID: 126F3C4E19B42D8D
[14:21:40] - Machine ID: 1
[14:21:40]
[14:21:40] Loaded queue successfully.
[14:21:40] - Autosending finished units...
[14:21:40] Trying to send all finished work units
[14:21:40] + No unsent completed units remaining.
[14:21:40] - Autosend completed
[14:21:40]
[14:21:40] + Processing work unit
[14:21:40] Core required: FahCore_a1.exe
[14:21:40] Core not found.
[14:21:40] - Core is not present or corrupted.
[14:21:40] - Attempting to download new core...
[14:21:40] + Downloading new core: FahCore_a1.exe
[14:21:40] Downloading core (/~pande/Linux/x86//Core_a1.fah from www.stanford.edu)
[14:21:41] Initial: AFDE; + 10240 bytes downloaded
[14:21:41] Initial: B54E; + 20480 bytes downloaded
[14:21:41] Initial: D6C2; + 30720 bytes downloaded
[14:21:41] Initial: 9F08; + 40960 bytes downloaded
[14:21:41] Initial: C6C3; + 51200 bytes downloaded
[14:21:41] Initial: EBA8; + 61440 bytes downloaded
[14:21:41] Initial: 3141; + 71680 bytes downloaded
[14:21:41] Initial: D218; + 81920 bytes downloaded
[14:21:41] Initial: F7AC; + 92160 bytes downloaded
[14:21:41] Initial: 820B; + 102400 bytes downloaded
[14:21:41] Initial: 1B1E; + 112640 bytes downloaded
[14:21:41] Initial: C249; + 122880 bytes downloaded
[14:21:42] Initial: 5EBD; + 133120 bytes downloaded
[14:21:42] Initial: CD6C; + 143360 bytes downloaded
[14:21:42] Initial: 221C; + 153600 bytes downloaded
[14:21:42] Initial: DB18; + 163840 bytes downloaded
[14:21:42] Initial: 237E; + 174080 bytes downloaded
[14:21:42] Initial: AEEC; + 184320 bytes downloaded
[14:21:42] Initial: 4C66; + 194560 bytes downloaded
[14:21:42] Initial: AE1E; + 204800 bytes downloaded
[14:21:42] Initial: A37E; + 215040 bytes downloaded
[14:21:42] Initial: 8193; + 225280 bytes downloaded
[14:21:42] Initial: 9F05; + 235520 bytes downloaded
[14:21:42] Initial: AAA5; + 245760 bytes downloaded
[14:21:42] Initial: 6400; + 256000 bytes downloaded
[14:21:42] Initial: 6E3D; + 266240 bytes downloaded
[14:21:42] Initial: EA6B; + 276480 bytes downloaded
[14:21:42] Initial: 820A; + 286720 bytes downloaded
[14:21:42] Initial: DE6D; + 296960 bytes downloaded
[14:21:42] Initial: B97B; + 307200 bytes downloaded
[14:21:42] Initial: 9D5D; + 317440 bytes downloaded
[14:21:43] Initial: 91D7; + 327680 bytes downloaded
[14:21:43] Initial: BB3B; + 337920 bytes downloaded
[14:21:43] Initial: 611B; + 348160 bytes downloaded
[14:21:43] Initial: B290; + 358400 bytes downloaded
[14:21:43] Initial: B0AA; + 368640 bytes downloaded
[14:21:43] Initial: 6A85; + 378880 bytes downloaded
[14:21:43] Initial: BF10; + 389120 bytes downloaded
[14:21:43] Initial: A818; + 399360 bytes downloaded
[14:21:43] Initial: 90E1; + 409600 bytes downloaded
[14:21:43] Initial: 2869; + 419840 bytes downloaded
[14:21:43] Initial: CAFE; + 430080 bytes downloaded
[14:21:43] Initial: 414B; + 440320 bytes downloaded
[14:21:43] Initial: 9B7A; + 450560 bytes downloaded
[14:21:43] Initial: 33AA; + 460800 bytes downloaded
[14:21:43] Initial: B1D5; + 471040 bytes downloaded
[14:21:44] Initial: 0206; + 481280 bytes downloaded
[14:21:44] Initial: 11F4; + 491520 bytes downloaded
[14:21:44] Initial: 31B5; + 501760 bytes downloaded
[14:21:44] Initial: 46B2; + 512000 bytes downloaded
[14:21:44] Initial: 3113; + 522240 bytes downloaded
[14:21:44] Initial: 525A; + 532480 bytes downloaded
[14:21:44] Initial: 66F9; + 542720 bytes downloaded
[14:21:44] Initial: 9672; + 552960 bytes downloaded
[14:21:44] Initial: 9058; + 563200 bytes downloaded
[14:21:44] Initial: 49ED; + 573440 bytes downloaded
[14:21:44] Initial: 515D; + 583680 bytes downloaded
[14:21:44] Initial: CAC0; + 593920 bytes downloaded
[14:21:44] Initial: 0B15; + 604160 bytes downloaded
[14:21:44] Initial: 5A89; + 614400 bytes downloaded
[14:21:44] Initial: 0F31; + 624640 bytes downloaded
[14:21:44] Initial: 2BC3; + 634880 bytes downloaded
[14:21:44] Initial: 3C06; + 645120 bytes downloaded
[14:21:44] Initial: 89C7; + 655360 bytes downloaded
[14:21:44] Initial: 6C54; + 665600 bytes downloaded
[14:21:45] Initial: 8D4D; + 675840 bytes downloaded
[14:21:45] Initial: EA59; + 686080 bytes downloaded
[14:21:45] Initial: C563; + 696320 bytes downloaded
[14:21:45] Initial: 8D45; + 706560 bytes downloaded
[14:21:45] Initial: 9BD0; + 716800 bytes downloaded
[14:21:45] Initial: 130C; + 727040 bytes downloaded
[14:21:45] Initial: CDA1; + 737280 bytes downloaded
[14:21:45] Initial: 7681; + 747520 bytes downloaded
[14:21:45] Initial: 1110; + 757760 bytes downloaded
[14:21:45] Initial: EE35; + 768000 bytes downloaded
[14:21:45] Initial: E5E1; + 778240 bytes downloaded
[14:21:45] Initial: 4B97; + 788480 bytes downloaded
[14:21:45] Initial: 4D75; + 798720 bytes downloaded
[14:21:45] Initial: E268; + 808960 bytes downloaded
[14:21:45] Initial: FAC6; + 819200 bytes downloaded
[14:21:45] Initial: A625; + 829440 bytes downloaded
[14:21:45] Initial: A12A; + 839680 bytes downloaded
[14:21:45] Initial: 83A3; + 849920 bytes downloaded
[14:21:46] Initial: 3BEA; + 860160 bytes downloaded
[14:21:46] Initial: 5298; + 870400 bytes downloaded
[14:21:46] Initial: 4811; + 880640 bytes downloaded
[14:21:46] Initial: EB07; + 890880 bytes downloaded
[14:21:46] Initial: 83FC; + 901120 bytes downloaded
[14:21:46] Initial: FA4E; + 911360 bytes downloaded
[14:21:46] Initial: 2945; + 921600 bytes downloaded
[14:21:46] Initial: 6BC9; + 931840 bytes downloaded
[14:21:46] Initial: E495; + 942080 bytes downloaded
[14:21:46] Initial: 1050; + 952320 bytes downloaded
[14:21:46] Initial: 2070; + 962560 bytes downloaded
[14:21:46] Initial: 1083; + 972800 bytes downloaded
[14:21:46] Initial: 96E5; + 983040 bytes downloaded
[14:21:46] Initial: 3EEE; + 993280 bytes downloaded
[14:21:46] Initial: 84AC; + 1003520 bytes downloaded
[14:21:46] Initial: 3B6B; + 1013760 bytes downloaded
[14:21:46] Initial: 3030; + 1024000 bytes downloaded
[14:21:46] Initial: 4B95; + 1034240 bytes downloaded
[14:21:47] Initial: D9BC; + 1044480 bytes downloaded
[14:21:47] Initial: C5B8; + 1054720 bytes downloaded
[14:21:47] Initial: A5EF; + 1064960 bytes downloaded
[14:21:47] Initial: 28DC; + 1075200 bytes downloaded
[14:21:47] Initial: 0943; + 1085440 bytes downloaded
[14:21:47] Initial: 338A; + 1095680 bytes downloaded
[14:21:47] Initial: ADFC; + 1105920 bytes downloaded
[14:21:47] Initial: ED39; + 1116160 bytes downloaded
[14:21:47] Initial: D284; + 1126400 bytes downloaded
[14:21:47] Initial: 0057; + 1136640 bytes downloaded
[14:21:47] Initial: 3E65; + 1146880 bytes downloaded
[14:21:47] Initial: FCB5; + 1157120 bytes downloaded
[14:21:47] Initial: A7D8; + 1167360 bytes downloaded
[14:21:47] Initial: A564; + 1177600 bytes downloaded
[14:21:47] Initial: 7654; + 1187840 bytes downloaded
[14:21:47] Initial: 0848; + 1198080 bytes downloaded
[14:21:47] Initial: 471E; + 1208320 bytes downloaded
[14:21:47] Initial: A7F3; + 1218560 bytes downloaded
[14:21:48] Initial: FA59; + 1228800 bytes downloaded
[14:21:48] Initial: FBF2; + 1239040 bytes downloaded
[14:21:48] Initial: F54E; + 1249280 bytes downloaded
[14:21:48] Initial: 3023; + 1259520 bytes downloaded
[14:21:48] Initial: AB37; + 1269760 bytes downloaded
[14:21:48] Initial: 0896; + 1280000 bytes downloaded
[14:21:48] Initial: 756D; + 1290240 bytes downloaded
[14:21:48] Initial: C1E7; + 1300480 bytes downloaded
[14:21:48] Initial: 9AAC; + 1310720 bytes downloaded
[14:21:48] Initial: E5AF; + 1320960 bytes downloaded
[14:21:48] Initial: BBE3; + 1331200 bytes downloaded
[14:21:48] Initial: 3596; + 1341440 bytes downloaded
[14:21:48] Initial: 924C; + 1351680 bytes downloaded
[14:21:48] Initial: 30B7; + 1361920 bytes downloaded
[14:21:48] Initial: AEB7; + 1372160 bytes downloaded
[14:21:48] Initial: 7D25; + 1382400 bytes downloaded
[14:21:48] Initial: 0FEB; + 1392640 bytes downloaded
[14:21:48] Initial: 3131; + 1402880 bytes downloaded
[14:21:48] Initial: 755F; + 1413120 bytes downloaded
[14:21:49] Initial: 4800; + 1423360 bytes downloaded
[14:21:49] Initial: 1282; + 1433600 bytes downloaded
[14:21:49] Initial: B2A3; + 1443840 bytes downloaded
[14:21:49] Initial: 21E9; + 1454080 bytes downloaded
[14:21:49] Initial: 789E; + 1464320 bytes downloaded
[14:21:49] Initial: 8542; + 1474560 bytes downloaded
[14:21:49] Initial: 3A56; + 1484800 bytes downloaded
[14:21:49] Initial: D4FE; + 1490945 bytes downloaded
[14:21:49] Verifying core Core_a1.fah...
[14:21:49] Signature is VALID
[14:21:49]
[14:21:49] Trying to unzip core FahCore_a1.exe
[14:21:49] Decompressed FahCore_a1.exe (3625104 bytes) successfully
[14:21:49] + Core successfully engaged
[14:21:54]
[14:21:54] + Processing work unit
[14:21:54] Core required: FahCore_a1.exe
[14:21:54] Core found.
[14:21:54] Working on Unit 01 [June 10 14:21:54]
[14:21:54] + Working ...
[14:21:54] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 01 -checkpoint 15 -forceasm -verbose -lifeline 442 -version 602'
[14:21:54]
[14:21:54] *------------------------------*
[14:21:54] Folding@Home Gromacs SMP Core
[14:21:54] Version 1.74 (November 27, 2006)
[14:21:54]
[14:21:54] Preparing to commence simulation
[14:21:54] - Ensuring status. Please wait.
[14:22:12] - Assembly optimizations manually forced on.
[14:22:12] - Not checking prior termination.
[14:22:12] - Expanded 2436318 -> 12886013 (decompressed 528.9 percent)
[14:22:13]
[14:22:13] Project: 2605 (Run 6, Clone 143, Gen 61)
[14:22:13]
[14:22:13] Assembly optimizations on if available.
[14:22:13] Entering M.D.
[14:22:19] Calling FAH init
[14:22:20] in POPC
[14:22:20] Writing local files
[14:22:20] checkpoint)
[14:22:20] Read checkpoint
[14:22:20] steps (1 percent)
[14:22:20] Extra SSE boost OK.
[14:22:20] files
[14:22:20] Completed 5000 out of 500000 steps (1 percent)
[14:22:20] Extra SSE boost OK.
[14:34:32] Writing local files
[14:34:32] Completed 10000 out of 500000 steps (2 percent)
[14:46:43] Writing local files
[14:46:44] Completed 15000 out of 500000 steps (3 percent)
[14:58:55] Writing local files
[14:58:55] Completed 20000 out of 500000 steps (4 percent)
[15:11:02] Writing local files
[15:11:02] Completed 25000 out of 500000 steps (5 percent)
[15:23:06] Writing local files
[15:23:06] Completed 30000 out of 500000 steps (6 percent)
[15:35:08] Writing local files
[15:35:08] Completed 35000 out of 500000 steps (7 percent)
[16:59:44] Writing local files
[16:59:44] Completed 40000 out of 500000 steps (8 percent)
[16:59:44] Timered checkpoint triggered.
[16:59:44] Writing local files
[16:59:44] Completed 45000 out of 500000 steps (9 percent)
[16:59:44] Timered checkpoint triggered.
[16:59:44] Writing local files
[16:59:44] Completed 50000 out of 500000 steps (10 percent)
[16:59:44] Timered checkpoint triggered.
[16:59:44] Writing local files
[16:59:44] Completed 55000 out of 500000 steps (11 percent)
[16:59:44] Timered checkpoint triggered.
[17:00:55] Writing local files
[17:00:56] Completed 60000 out of 500000 steps (12 percent)
[18:14:48] Timered checkpoint triggered.
[18:14:48] Writing local files
[18:14:48] Completed 65000 out of 500000 steps (13 percent)
[18:14:48] Timered checkpoint triggered.
[18:14:48] Writing local files
[18:14:48] Completed 70000 out of 500000 steps (14 percent)
[18:14:48] Timered checkpoint triggered.
[18:14:48] Writing local files
[18:14:48] Completed 75000 out of 500000 steps (15 percent)
[18:14:48] Writing local files
[18:14:48] Completed 80000 out of 500000 steps (16 percent)
[18:16:16] Writing local files
[18:16:16] Completed 85000 out of 500000 steps (17 percent)
[18:28:27] Writing local files
[18:28:27] Completed 90000 out of 500000 steps (18 percent)
[18:40:38] Writing local files
[18:40:38] Completed 95000 out of 500000 steps (19 percent)
[18:52:46] Writing local files
[18:52:46] Completed 100000 out of 500000 steps (20 percent)
[19:05:32] Writing local files
[19:05:32] Completed 105000 out of 500000 steps (21 percent)
[19:17:53] Writing local files
[19:17:53] Completed 110000 out of 500000 steps (22 percent)
[19:30:06] Writing local files
[19:30:06] Completed 115000 out of 500000 steps (23 percent)
[19:42:22] Writing local files
[19:42:22] Completed 120000 out of 500000 steps (24 percent)
[19:54:34] Writing local files
[19:54:34] Completed 125000 out of 500000 steps (25 percent)
[20:06:40] Writing local files
[20:06:40] Completed 130000 out of 500000 steps (26 percent)
[20:18:52] Writing local files
[20:18:52] Completed 135000 out of 500000 steps (27 percent)
[20:21:40] - Autosending finished units...
[20:21:40] Trying to send all finished work units
[20:21:40] + No unsent completed units remaining.
[20:21:40] - Autosend completed
[20:31:00] Writing local files
[20:31:00] Completed 140000 out of 500000 steps (28 percent)
[20:43:09] Writing local files
[20:43:09] Completed 145000 out of 500000 steps (29 percent)
[20:55:18] Writing local files
[20:55:18] Completed 150000 out of 500000 steps (30 percent)
[21:07:27] Writing local files
[21:07:27] Completed 155000 out of 500000 steps (31 percent)
[21:19:40] Writing local files
[21:19:40] Completed 160000 out of 500000 steps (32 percent)
[21:31:54] Writing local files
[21:31:54] Completed 165000 out of 500000 steps (33 percent)
[21:44:05] Writing local files
[21:44:05] Completed 170000 out of 500000 steps (34 percent)
[23:01:11] Writing local files
[23:01:11] Completed 175000 out of 500000 steps (35 percent)
[23:01:11] Writing local files
[23:01:11] Completed 180000 out of 500000 steps (36 percent)
[23:01:11] Writing local files
[23:01:11] Completed 185000 out of 500000 steps (37 percent)
[23:01:11] Writing local files
[23:01:11] Completed 190000 out of 500000 steps (38 percent)
[23:01:11] Writing local files
[23:01:11] Completed 195000 out of 500000 steps (39 percent)
[23:01:11] Writing local files
[23:01:11] Completed 200000 out of 500000 steps (40 percent)
[00:19:58] Writing local files
[00:19:58] Completed 205000 out of 500000 steps (41 percent)
[00:19:58] Writing local files
[00:19:58] Completed 210000 out of 500000 steps (42 percent)
[00:19:58] Writing local files
[00:19:58] Completed 215000 out of 500000 steps (43 percent)
[00:19:58] Writing local files
[00:19:58] Completed 220000 out of 500000 steps (44 percent)
[00:19:58] Writing local files
[00:19:58] Completed 225000 out of 500000 steps (45 percent)
[00:19:58] Writing local files
[00:19:58] Completed 230000 out of 500000 steps (46 percent)
[00:26:04] Writing local files
[00:26:05] Completed 235000 out of 500000 steps (47 percent)
[00:37:46] Writing local files
[00:37:47] Completed 240000 out of 500000 steps (48 percent)
DriveEuro wrote:If you access the Windows share of the hard drive image over the network, what does it look like? Does it show as full? Are there some giant files on it? It appears that I've filled up the 512 MB space and it wants more space. FAH continues to make progress while it says this in FAHLOG.txt. I reset each VM and it goes away for a while, but it ends up coming back some hours later.
DriveEuro wrote:
notfred wrote:The number of files and folders is increasing as well. I suspect it is going into a loop in the /proc directory.
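To check whether the share really is filling up, and whether the file and folder counts keep growing between runs, something like the following rough sketch could be pointed at the mounted share. The function name and the choice of reporting the five largest files are assumptions for illustration.

```python
import os


def summarize_tree(root, top_n=5):
    """Return (file_count, dir_count, largest_files) for everything under root."""
    file_count = 0
    dir_count = 0
    sizes = []
    # os.walk does not follow symlinked directories by default, so it will
    # not chase a directory loop created through links.
    for dirpath, dirnames, filenames in os.walk(root):
        dir_count += len(dirnames)
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            file_count += 1
            sizes.append((size, path))
    largest = sorted(sizes, reverse=True)[:top_n]
    return file_count, dir_count, largest
```

Running it twice a few hours apart and comparing the counts would show whether the tree is growing on its own, and the `largest` list answers the "giant files" question directly.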
LumberJack wrote:Hey notfred. Thanks for creating this method of folding. I already have 2 friends with farms that are using it.
I have managed to configure a perfectly working virtual machine running notfred's on a dual-core CPU. It also saves to the USB stick and can restore from it. My method and configuration are different from yours, though. Since I'm basically a "noob" on this forum, I don't want to post the exact instructions unless you want me to. Thanks.
DriveEuro wrote:
DriveEuro wrote:That's run out of memory and the notorious Linux OOM Killer is killing off processes. Try increasing the amount of memory available for the VM.
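One way to confirm the OOM killer diagnosis is to search the guest's kernel log for its kill messages. The sketch below is only a rough helper: the exact wording of the message differs between kernel versions ("Kill process" on older kernels, "Killed process" on newer ones), and how you get at the kernel log on the diskless image is left open. The function name is made up for illustration.

```python
import re

# Matches both "Out of memory: Kill process 123 (name)" and
# "Out of memory: Killed process 123 (name)". Other kernels may word it
# differently, so treat a miss as "not found", not as proof of no OOM.
OOM_RE = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(lines):
    """Return a list of (pid, process_name) for each OOM-killer line found."""
    hits = []
    for line in lines:
        m = OOM_RE.search(line)
        if m:
            hits.append((int(m.group(1)), m.group(2)))
    return hits
```

If the FahCore process shows up in the result, raising the VM's memory allocation as suggested above is the thing to try first.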
nomad8u wrote:Hey notfred... I seem to have an issue with a VA running in VMware. It seems as though "time" is getting stuck per the FAH log. The hang check log looks normal, with a check every 5 minutes.
It was progressing normally at around 12:16 - 12:28/frame or so until the 34% mark. Then it seemed to "lose" 2 hours and jumped several % at the same time.
It looked like it was starting to "sync" back up after 46%, but the jump from 47-48% is wrong.
The core continues to run at what appears to be an appropriate speed (other than the fact it's driving FAHSpy nuts) and I'm aware that VMware has some time issues, but I haven't seen anything like this before. Any clue?
I just set up and launched a second VA, so we'll see if this is unique to the 1st instance I set up.
-snip-
[15:46:22] Timered checkpoint triggered.
[15:46:22] Writing local files
[15:46:22] Completed 65000 out of 500000 steps (13 percent)
[15:46:22] Timered checkpoint triggered.
[15:46:22] Writing local files
[15:46:22] Completed 70000 out of 500000 steps (14 percent)
[15:46:22] Timered checkpoint triggered.
[15:46:22] Writing local files
[15:46:22] Completed 75000 out of 500000 steps (15 percent)
[15:46:22] Timered checkpoint triggered.
[15:46:22] Writing local files
[15:46:22] Completed 80000 out of 500000 steps (16 percent)
[15:46:22] Timered checkpoint triggered.
[15:46:26] Writing local files
[15:46:26] Completed 85000 out of 500000 steps (17 percent)
-snip-
[18:09:35] Timered checkpoint triggered.
[18:09:35] Writing local files
[18:09:35] Completed 120000 out of 500000 steps (24 percent)
[18:09:35] Timered checkpoint triggered.
[18:09:35] Writing local files
[18:09:35] Completed 125000 out of 500000 steps (25 percent)
[18:09:35] Timered checkpoint triggered.
[18:09:35] Writing local files
[18:09:35] Completed 130000 out of 500000 steps (26 percent)
[18:09:35] Timered checkpoint triggered.
[18:09:35] Writing local files
[18:09:35] Completed 135000 out of 500000 steps (27 percent)
[18:09:35] Timered checkpoint triggered.
[18:09:35] Writing local files
[18:09:35] Completed 140000 out of 500000 steps (28 percent)
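The stalls and bursts in these excerpts can be picked out of FAHLOG.txt automatically by comparing the timestamps on consecutive "Completed" lines. This is a minimal sketch, not anything the client or FAHSpy does itself. The function names are made up, and the thresholds (under 1 minute or over 30 minutes per frame counts as suspicious) are assumptions based on the roughly 12 minute frames reported in this thread.

```python
import re

FRAME_RE = re.compile(
    r"^\[(\d\d):(\d\d):(\d\d)\] Completed (\d+) out of (\d+) steps"
)


def frame_times(lines):
    """Yield (steps_done, seconds_since_previous_frame) per 'Completed' line."""
    prev = None
    for line in lines:
        m = FRAME_RE.match(line.strip())
        if not m:
            continue
        h, mi, s, steps = (int(m.group(i)) for i in range(1, 5))
        now = h * 3600 + mi * 60 + s
        if prev is not None:
            # The log lines carry no date, so a smaller timestamp is taken
            # to mean the clock rolled past midnight exactly once.
            yield steps, (now - prev) % 86400
        prev = now


def suspicious_frames(lines, low=60, high=1800):
    """Return frames whose duration falls outside the assumed normal range."""
    return [(steps, d) for steps, d in frame_times(lines) if d < low or d > high]
```

Fed the main log above, this would flag the long gap before the 8 percent line and the zero-second frames that follow it, which is the "stuck, then catch up" pattern being described.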
nomad8u wrote:DriveEuro wrote:
Don't know what WUs you were getting that on, but now that I see it, that's the issue I mentioned above where I set up a second instance and had five 2665 WUs yak in a row. I killed the instance after the second WU crashed and increased the RAM allocation to 896, and it still crashed 3 more before I stopped it. Since I had the first instance running with the default 512 MB of RAM, I didn't want to allocate any higher than 896 on the second for fear of stalling Windows.
I'm now wondering if this may be related to the "run away files" issue. Since the 2665s can be very demanding, it may have been trying to page out RAM while the HD issue was going on in the background at the same time. Hmmmm.
Notfred, any ideas?
[08:34:43]
[08:34:43] *------------------------------*
[08:34:43] Folding@Home Gromacs SMP Core
[08:34:43] Version 1.74 (November 27, 2006)
[08:34:43]
[08:34:43] Preparing to commence simulation
[08:34:43] - Ensuring status. Please wait.
[08:35:00] - Assembly optimizations manually forced on.
[08:35:00] - Not checking prior termination.
[08:35:01] - Expanded 2420849 -> 12854153 (decompressed 530.9 percent)
[08:35:01]
[08:35:01] Project: 2605 (Run 0, Clone 540, Gen 46)
[08:35:01]
[08:35:01] Assembly optimizations on if available.
[08:35:01] Entering M.D.
[08:35:07] Calling FAH init
[08:35:08] in POPC
[08:35:08] Writing local files
[08:35:08] checkpoint)
[08:35:08] Read checkpoint
[08:35:08] Protein: Protein in POPC
[08:35:08] Writing local files
[08:35:09] Extra SSE boost OK.
[08:35:09] Writing local files
[08:35:09] Completed 0 out of 500000 steps (0 percent)
[08:35:09]
[08:35:09] Folding@home Core Shutdown: INTERRUPTED
[08:35:13] CoreStatus = 66 (102)
[08:35:13] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[08:35:13] Killing all core threads
Folding@Home Client Shutdown.