
Project: 2665 (Run 3, Clone 762, Gen 179)

Posted: Sun Jan 31, 2010 5:58 am
by MoneyGuyBK
Why is this project slowing down to a crawl?

As you can see in the log, it starts fast and then slows to a crawl:

The first log before I rebooted to try to remedy the issue:


[14:16:55] Project: 2665 (Run 3, Clone 762, Gen 179)
[14:16:55] 
[14:16:55] Entering M.D.
[14:17:01] Calling FAH init
[14:17:03] ater
[14:17:03] Writing local files
[14:17:03] rom checkpoint)
[14:17:03] Read checkpoint
[14:17:04] eps  (18 percent)
[14:17:04] ter
[14:17:04] Writing local files
[14:17:04] Completed 45000 out of 250000 steps  (18 percent)
[14:17:13] Extra SSE boost OK.
[14:41:43] Writing local files
[14:41:43] Completed 47500 out of 250000 steps  (19 percent)
[15:05:03] Writing local files
[15:05:04] Completed 50000 out of 250000 steps  (20 percent)
[15:28:20] Writing local files
[15:28:21] Completed 52500 out of 250000 steps  (21 percent)
[15:51:40] Writing local files
[15:51:40] Completed 55000 out of 250000 steps  (22 percent)
[16:16:43] Writing local files
[16:16:43] Completed 57500 out of 250000 steps  (23 percent)
[16:40:40] Writing local files
[16:40:40] Completed 60000 out of 250000 steps  (24 percent)
[17:05:22] Writing local files
[17:05:22] Completed 62500 out of 250000 steps  (25 percent)
[17:31:13] Writing local files
[17:31:13] Completed 65000 out of 250000 steps  (26 percent)
[17:58:07] Writing local files
[17:58:07] Completed 67500 out of 250000 steps  (27 percent)
[18:26:33] Writing local files
[18:26:33] Completed 70000 out of 250000 steps  (28 percent)
[18:56:29] Writing local files
[18:56:29] Completed 72500 out of 250000 steps  (29 percent)
[19:26:30] Timered checkpoint triggered.
[19:28:32] Writing local files
[19:28:33] Completed 75000 out of 250000 steps  (30 percent)
[19:58:33] Timered checkpoint triggered.
[20:02:26] Writing local files
[20:02:26] Completed 77500 out of 250000 steps  (31 percent)
[20:16:21] - Autosending finished units... [January 30 20:16:21 UTC]
[20:16:21] Trying to send all finished work units
[20:16:21] + No unsent completed units remaining.
[20:16:21] - Autosend completed
[20:32:26] Timered checkpoint triggered.
[20:39:08] Writing local files
[20:39:08] Completed 80000 out of 250000 steps  (32 percent)
[21:09:12] Timered checkpoint triggered.
[21:18:15] Writing local files
[21:18:15] Completed 82500 out of 250000 steps  (33 percent)
[21:48:19] Timered checkpoint triggered.
[21:59:50] Writing local files
[21:59:50] Completed 85000 out of 250000 steps  (34 percent)
[22:29:56] Timered checkpoint triggered.
[22:47:26] Writing local files
[22:47:26] Completed 87500 out of 250000 steps  (35 percent)
[23:17:28] Timered checkpoint triggered.
[23:44:53] Writing local files
[23:44:54] Completed 90000 out of 250000 steps  (36 percent)
[00:14:59] Timered checkpoint triggered.
[00:44:39] Writing local files
[00:44:40] Completed 92500 out of 250000 steps  (37 percent)
[01:14:43] Timered checkpoint triggered.
[01:44:44] Timered checkpoint triggered.
[01:48:48] Writing local files
[01:48:49] Completed 95000 out of 250000 steps  (38 percent)
[02:16:21] - Autosending finished units... [January 31 02:16:21 UTC]
[02:16:21] Trying to send all finished work units
[02:16:21] + No unsent completed units remaining.
[02:16:21] - Autosend completed
[02:18:49] Timered checkpoint triggered.
[02:48:50] Timered checkpoint triggered.
The following after a Reboot:


[02:53:31] Project: 2665 (Run 3, Clone 762, Gen 179)
[02:53:31] 
[02:53:32] Entering M.D.
[02:53:38] Calling FAH init
[02:53:40] ater
[02:53:40] Writing local files
[02:53:40] rom checkpoint)
[02:53:40] Read checkpoint
[02:53:41] eps  (38 percent)
[02:53:41] ter
[02:53:41] Writing local files
[02:53:41] Completed 96918 out of 250000 steps  (38 percent)
[02:53:48] Extra SSE boost OK.
[03:01:21] Writing local files
[03:01:21] Completed 97500 out of 250000 steps  (39 percent)
[03:24:53] Writing local files
[03:24:53] Completed 100000 out of 250000 steps  (40 percent)
[03:47:53] Writing local files
[03:47:53] Completed 102500 out of 250000 steps  (41 percent)
[04:10:38] Writing local files
[04:10:38] Completed 105000 out of 250000 steps  (42 percent)
[04:35:35] Writing local files
[04:35:36] Completed 107500 out of 250000 steps  (43 percent)
[05:05:36] Timered checkpoint triggered.
[05:07:20] Writing local files
[05:07:21] Completed 110000 out of 250000 steps  (44 percent)
[05:37:19] Writing local files
[05:37:19] Completed 112500 out of 250000 steps  (45 percent)
The processor is a 2.53 GHz Core 2 Duo.
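
The slowdown in the logs above can be quantified by measuring the time between successive "Completed ... steps" lines. The following is a minimal Python sketch of that idea, not a Folding@home tool: the function name `frame_seconds` and the regular expression are assumptions made for illustration, and the sample lines are copied from the first log. Because the timestamps carry no date, the sketch assumes no single interval exceeds 24 hours and treats a smaller timestamp as a midnight rollover.

```python
import re

# Matches lines such as
# "[14:41:43] Completed 47500 out of 250000 steps  (19 percent)".
# Truncated lines like "[14:17:04] eps  (18 percent)" are skipped on purpose.
FRAME_RE = re.compile(
    r"^\[(\d{2}):(\d{2}):(\d{2})\] Completed (\d+) out of (\d+) steps"
)


def frame_seconds(log_text):
    """Return [(steps_done, seconds_since_previous_progress_line), ...]."""
    results = []
    prev_seconds = None
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue
        hh, mm, ss, steps, _total = (int(g) for g in match.groups())
        seconds = hh * 3600 + mm * 60 + ss
        if prev_seconds is not None:
            # No date in the timestamp: the modulo handles a midnight rollover
            # (assumes the gap between two progress lines is under 24 hours).
            elapsed = (seconds - prev_seconds) % 86400
            results.append((steps, elapsed))
        prev_seconds = seconds
    return results


early = """\
[14:17:04] Completed 45000 out of 250000 steps  (18 percent)
[14:41:43] Completed 47500 out of 250000 steps  (19 percent)
"""

late = """\
[22:47:26] Completed 87500 out of 250000 steps  (35 percent)
[23:44:54] Completed 90000 out of 250000 steps  (36 percent)
[00:44:40] Completed 92500 out of 250000 steps  (37 percent)
"""

for label, text in (("early", early), ("late", late)):
    for steps, elapsed in frame_seconds(text):
        print(f"{label}: reached {steps} steps after {elapsed // 60} min {elapsed % 60} s")
```

On these excerpts, one percent took under 25 minutes at the start of the run and close to an hour by 36 to 37 percent. Running the same function on the post-reboot log would show whether the times return to the earlier level.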









Peace

Re: Project: 2665 (Run 3, Clone 762, Gen 179)

Posted: Mon Feb 01, 2010 3:00 am
by MoneyGuyBK
It finally finished, but I see a 7B (123) code. Could a mod check and tell me:
1) Why did this project act erratically?
2) Did I get points for it?
TIA


[00:55:20] Completed 242500 out of 250000 steps  (97 percent)
[01:16:55] Writing local files
[01:16:56] Completed 245000 out of 250000 steps  (98 percent)
[01:38:31] Writing local files
[01:38:31] Completed 247500 out of 250000 steps  (99 percent)
[02:00:47] Writing local files
[02:00:47] Completed 250000 out of 250000 steps  (100 percent)
[02:00:47] Writing final coordinates.
[02:00:48] Past main M.D. loop
[02:00:48] Will end MPI now
[02:01:48] 
[02:01:48] Finished Work Unit:
[02:01:48] - Reading up to 21310704 from "work/wudata_03.arc": Read 21310704
[02:01:48] - Reading up to 555876 from "work/wudata_03.xtc": Read 555876
[02:01:48] goefile size: 0
[02:01:48] logfile size: 221657
[02:01:48] Leaving Run
[02:01:50] - Writing 22094609 bytes of core data to disk...
[02:01:50]   ... Done.
[02:01:50] - Failed to delete work/wudata_03.sas
[02:01:50] - Failed to delete work/wudata_03.goe
[02:01:50] Warning:  check for stray files
[02:01:50] - Shutting down core
[02:03:51] 
[02:03:51] Folding@home Core Shutdown: FINISHED_UNIT
[02:03:51] 
[02:03:51] Folding@home Core Shutdown: FINISHED_UNIT
[02:03:55] CoreStatus = 7B (123)
[02:03:55] Sending work to server
[02:03:55] Project: 2665 (Run 3, Clone 762, Gen 179)


[02:03:55] + Attempting to send results [February 1 02:03:55 UTC]
[02:03:55] - Reading file work/wuresults_03.dat from core
[02:03:55]   (Read 22094609 bytes from disk)
[02:03:55] Connecting to http://171.64.65.64:8080/
[02:06:58] Posted data.
[02:06:58] Initial: 0000; - Uploaded at ~115 kB/s
[02:07:02] - Averaged speed for that direction ~138 kB/s
[02:07:02] + Results successfully sent
[02:07:02] Thank you for your contribution to Folding@Home.
[02:07:07] - Warning: Could not delete all work unit files (3): Core returned invalid code
[02:07:07] Trying to send all finished work units
[02:07:07] + No unsent completed units remaining.
[02:07:07] - Preparing to get new work unit...
[02:07:07] Cleaning up work directory
[02:07:07] + Attempting to get work packet
[02:07:07] Passkey found
[02:07:07] - Will indicate memory of 4047 MB
[02:07:07] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 7, Stepping: 10
[02:07:07] - Connecting to assignment server
[02:07:07] Connecting to http://assign.stanford.edu:8080/
[02:07:08] Posted data.
[02:07:08] Initial: 40AB; - Successful: assigned to (171.64.65.64).
[02:07:08] + News From Folding@Home: Welcome to Folding@Home
[02:07:08] Loaded queue successfully.
[02:07:08] Connecting to http://171.64.65.64:8080/
[02:07:13] Posted data.
[02:07:13] Initial: 0000; - Receiving payload (expected size: 4784985)
[02:07:17] - Downloaded at ~1168 kB/s
[02:07:17] - Averaged speed for that direction ~1062 kB/s
[02:07:17] + Received work.
[02:07:17] Trying to send all finished work units
[02:07:17] + No unsent completed units remaining.
[02:07:17] + Closed connections
[02:07:22] 
[02:07:22] + Processing work unit
[02:07:22] Work type a1 not eligible for variable processors
[02:07:22] Core required: FahCore_a1.exe
[02:07:22] Core found.
[02:07:22] Working on queue slot 04 [February 1 02:07:22 UTC]
[02:07:22] + Working ...
[02:07:22] - Calling 'mpiexec -np 4 -channel auto -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 04 -checkpoint 30 -verbose -lifeline 4432 -version 629'

[02:07:22] 
[02:07:22] *------------------------------*
[02:07:23] Folding@Home Gromacs SMP Core
[02:07:23] Version 1.74 (March 10, 2007)
[02:07:23] 
[02:07:23] Preparing to commence simulation
[02:07:23] - Ensuring status. Please wait.
[02:07:28] - Starting from initial work packet
[02:07:28] 
[02:07:28] Project: 2665 (Run 0, Clone 735, Gen 185)
[02:07:28] 
[02:07:29] Assembly optimizations on if available.
[02:07:29] Entering M.D.
[02:07:50] percent)
[02:07:51] - Starting from initial work packet
[02:07:51] 
[02:07:51] Project: 2665 (Run 0, Clone 735, Gen 185)
[02:07:51] 
[02:07:51] Entering M.D.
[02:08:01] GG in water
[02:08:01] Writing local files
[02:08:01] cal files
[02:08:02] Extra SSE boost OK.
[02:08:10] cal files
[02:08:10] Completed 0 out of 250000 steps  (0 percent)
[02:33:08] Writing local files
[02:33:08] Completed 2500 out of 250000 steps  (1 percent)
[02:52:53] - Autosending finished units... [February 1 02:52:53 UTC]
[02:52:53] Trying to send all finished work units
[02:52:53] + No unsent completed units remaining.
[02:52:53] - Autosend completed
Figured I would use some of that empty space you feel compelled to put at the bottom of all your posts.
Last credit   Last returned WU      Proj   Run   Clone   Gen
1920          2010-01-31 18:10:04   2665   3     762     179
ChelseaOilman




Peace

Re: Project: 2665 (Run 3, Clone 762, Gen 179)

Posted: Tue Feb 02, 2010 7:11 am
by MoneyGuyBK
Thanx Chelsea ..... empty space :lol: well I try to be different :!:
I was just about to complain why no one had bothered to answer :oops:

Did we figure out a reason for the fluctuations in the time per percent?






Peace