
Project: 3062 (Run 2, Clone 115, Gen 21)

Posted: Fri May 23, 2008 6:13 am
by DocJonz
I seem to be running on an 'unlucky' streak at the moment - this WU is running on a rig using the Linux 6.00beta2 client (the last upset on this machine was on the 18th April, and it runs 24/7).

It restarted the same WU and is currently at 22%. Unfortunately, I won't be around to stop and restart it just before it reaches the same 34% fall-over point :(

Are you guys interested in this type of stuff, or is it just clogging Forum space?

Code:

[14:32:44] 
[14:32:44] *------------------------------*
[14:32:44] Folding@Home Gromacs SMP Core
[14:32:44] Version 1.74 (November 27, 2006)
[14:32:44] 
[14:32:44] Preparing to commence simulation
[14:32:44] - Ensuring status. Please wait.
[14:33:01] - Assembly optimizations manually forced on.
[14:33:01] - Not checking prior termination.
[14:33:02] - Expanded 609919 -> 3263133 (decompressed 535.0 percent)
[14:33:02] - Starting from initial work packet
[14:33:02] 
[14:33:02] Project: 3062 (Run 2, Clone 115, Gen 21)
[14:33:02] 
[14:33:02] Assembly optimizations on if available.
[14:33:02] Entering M.D.
[14:33:08] Protein: p3062_lambdaProtein: p3062_lambda5_99sbExtra SSE boost OK.
[14:33:08] 
[14:33:08] Extra SSE boost OK.
[14:33:08] Writing local files
[14:33:08] Completed 0 out of 5000000 steps  (0 percent)
[14:48:02] Writing local files
[14:48:02] Completed 50000 out of 5000000 steps  (1 percent)
[15:00:05] Writing local files
[15:00:05] Completed 100000 out of 5000000 steps  (2 percent)
[15:16:23] Writing local files
[15:16:23] Completed 150000 out of 5000000 steps  (3 percent)
[15:32:43] Writing local files
[15:32:43] Completed 200000 out of 5000000 steps  (4 percent)
[15:49:01] Writing local files
[15:49:01] Completed 250000 out of 5000000 steps  (5 percent)
[16:05:21] Writing local files
[16:05:21] Completed 300000 out of 5000000 steps  (6 percent)
[16:21:42] Writing local files
[16:21:42] Completed 350000 out of 5000000 steps  (7 percent)
[16:38:02] Writing local files
[16:38:02] Completed 400000 out of 5000000 steps  (8 percent)
[16:54:23] Writing local files
[16:54:23] Completed 450000 out of 5000000 steps  (9 percent)
[17:10:42] Writing local files
[17:10:42] Completed 500000 out of 5000000 steps  (10 percent)
[17:27:10] Writing local files
[17:27:10] Completed 550000 out of 5000000 steps  (11 percent)
[17:43:38] Writing local files
[17:43:38] Completed 600000 out of 5000000 steps  (12 percent)
[18:00:05] Writing local files
[18:00:05] Completed 650000 out of 5000000 steps  (13 percent)
[18:16:30] Writing local files
[18:16:30] Completed 700000 out of 5000000 steps  (14 percent)
[18:20:53] - Autosending finished units...
[18:20:53] Trying to send all finished work units
[18:20:53] + No unsent completed units remaining.
[18:20:53] - Autosend completed
[18:32:55] Writing local files
[18:32:55] Completed 750000 out of 5000000 steps  (15 percent)
[18:49:19] Writing local files
[18:49:19] Completed 800000 out of 5000000 steps  (16 percent)
[19:05:44] Writing local files
[19:05:44] Completed 850000 out of 5000000 steps  (17 percent)
[19:22:10] Writing local files
[19:22:10] Completed 900000 out of 5000000 steps  (18 percent)
[19:38:34] Writing local files
[19:38:34] Completed 950000 out of 5000000 steps  (19 percent)
[19:54:58] Writing local files
[19:54:58] Completed 1000000 out of 5000000 steps  (20 percent)
[20:11:22] Writing local files
[20:11:22] Completed 1050000 out of 5000000 steps  (21 percent)
[20:27:49] Writing local files
[20:27:49] Completed 1100000 out of 5000000 steps  (22 percent)
[20:44:11] Writing local files
[20:44:11] Completed 1150000 out of 5000000 steps  (23 percent)
[21:00:35] Writing local files
[21:00:35] Completed 1200000 out of 5000000 steps  (24 percent)
[21:17:00] Writing local files
[21:17:00] Completed 1250000 out of 5000000 steps  (25 percent)
[21:33:25] Writing local files
[21:33:25] Completed 1300000 out of 5000000 steps  (26 percent)
[21:49:50] Writing local files
[21:49:50] Completed 1350000 out of 5000000 steps  (27 percent)
[22:06:14] Writing local files
[22:06:14] Completed 1400000 out of 5000000 steps  (28 percent)
[22:22:37] Writing local files
[22:22:37] Completed 1450000 out of 5000000 steps  (29 percent)
[22:39:03] Writing local files
[22:39:03] Completed 1500000 out of 5000000 steps  (30 percent)
[22:55:25] Writing local files
[22:55:25] Completed 1550000 out of 5000000 steps  (31 percent)
[23:11:49] Writing local files
[23:11:49] Completed 1600000 out of 5000000 steps  (32 percent)
[23:28:15] Writing local files
[23:28:15] Completed 1650000 out of 5000000 steps  (33 percent)
[23:44:41] Writing local files
[23:44:41] Completed 1700000 out of 5000000 steps  (34 percent)
[23:46:13] Warning:  long 1-4 interactions
[23:46:17] CoreStatus = 1 (1)
[23:46:17] Client-core communications error: ERROR 0x1
[23:46:17] Deleting current work unit & continuing...
Edit: I stopped the client at 24% (before I went to work) and restarted it. Now that I'm back from work, it's on 59% :D

Re: Project: 3062 (Run 2, Clone 115, Gen 21)

Posted: Fri May 30, 2008 1:08 am
by 7im
This is what the FAH WIKI says about reporting EUEs. You'll have to make your own determination case by case, WU by WU, if this stuff is necessary or just a clog. ;)

http://fahwiki.net/index.php/Early_Unit ... rting_EUEs

Re: Project: 3062 (Run 2, Clone 115, Gen 21)

Posted: Fri May 30, 2008 4:31 pm
by DocJonz
Thanks for the reply. Is this one classed as an EUE? The log doesn't mention it.

Re: Project: 3062 (Run 2, Clone 115, Gen 21)

Posted: Fri May 30, 2008 5:53 pm
by John Naylor
If a client stops processing a WU before its normal end, it is usually an EUE of some sort. A general rule I use about reporting them: if something is sent back, it will be dealt with automatically. If a WU just EUEs and then deletes itself, report it on here :) (as you have done)

Re: Project: 3062 (Run 2, Clone 115, Gen 21)

Posted: Fri May 30, 2008 6:38 pm
by 7im
What ^he^ said. ;)

Classically defined EUEs actually say Early_Unit_End in the log file. Here are those types: http://fahwiki.net/index.php/EUE_Types

The other type of "early end" is where a Core Status is shown, as in your example. Here is more info on those types of errors... http://fahwiki.net/index.php/CoreStatus_codes

CoreStatus = 1 (1) isn't specifically listed there (but should be), and the ERROR 0x1 is listed here: http://fahwiki.net/index.php/Error_0x0_and_0x1
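
For anyone who wants to check a log quickly, here is a rough Python sketch of the distinction described above: whether the log literally says Early_Unit_End, or whether it shows a CoreStatus line and a client-core error instead. The function name `classify_log` and the shape of the returned dict are made up for illustration. The patterns are based only on the log lines quoted in this thread, so other client or core versions may word things differently.

```python
import re

# Patterns assumed from the log excerpt quoted earlier in this thread.
CORE_STATUS = re.compile(r"CoreStatus = (\w+) \((\d+)\)")
CLIENT_ERROR = re.compile(r"Client-core communications error: ERROR (0x[0-9a-fA-F]+)")


def classify_log(lines):
    """Summarise how a work unit ended, based purely on log text (hypothetical helper)."""
    result = {
        "early_unit_end": False,  # log literally contains Early_Unit_End
        "core_status": None,      # value printed after "CoreStatus = "
        "client_error": None,     # e.g. "0x1"
        "deleted": False,         # client deleted the WU and moved on
    }
    for line in lines:
        if "Early_Unit_End" in line:
            result["early_unit_end"] = True
        match = CORE_STATUS.search(line)
        if match:
            result["core_status"] = match.group(1)
        match = CLIENT_ERROR.search(line)
        if match:
            result["client_error"] = match.group(1)
        if "Deleting current work unit" in line:
            result["deleted"] = True
    return result


# The last four lines of the log from the first post.
sample = [
    "[23:46:13] Warning:  long 1-4 interactions",
    "[23:46:17] CoreStatus = 1 (1)",
    "[23:46:17] Client-core communications error: ERROR 0x1",
    "[23:46:17] Deleting current work unit & continuing...",
]
summary = classify_log(sample)
print(summary)
```

Run against the tail of the log in the first post, this should flag a CoreStatus and an ERROR 0x1 with the WU deleted, and no Early_Unit_End, which matches the "other type of early end" case.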