BAD_WORK_UNIT P13000 - (327,0,16)

Posted: Wed May 07, 2014 8:11 am
by Dr.G
I have completed many WUs without problems, so I suspect this is a bad WU rather than a hardware fault, but it would be nice to find out for sure.

FAULTY project:13000 run:327 clone:0 gen:16 core:0x17 unit:0x00000027538b3db7530ff8202ba1583c
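For anyone cross-checking reports like this one, the status line above can be split into its project/run/clone/gen fields. This is a minimal sketch, assuming the `key:value` layout shown in the line above; `parse_wu` and `WU_FIELDS` are hypothetical names, not part of the FAH client.

```python
import re

# Hypothetical helper: the pattern mirrors the key:value pairs in the
# client's status line as quoted in this post.
WU_FIELDS = re.compile(
    r"project:(?P<project>\d+) run:(?P<run>\d+) clone:(?P<clone>\d+) "
    r"gen:(?P<gen>\d+) core:(?P<core>0x[0-9a-fA-F]+) "
    r"unit:(?P<unit>0x[0-9a-fA-F]+)"
)


def parse_wu(line):
    """Return the work-unit fields from a status line, or None if absent."""
    match = WU_FIELDS.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Project, run, clone and gen are decimal; core and unit stay as hex strings.
    for key in ("project", "run", "clone", "gen"):
        fields[key] = int(fields[key])
    return fields


line = ("FAULTY project:13000 run:327 clone:0 gen:16 core:0x17 "
        "unit:0x00000027538b3db7530ff8202ba1583c")
wu = parse_wu(line)
print(wu["project"], (wu["run"], wu["clone"], wu["gen"]))
```

The tuple printed at the end corresponds to the (327,0,16) in the thread title.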

00:53:22:WU01:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint
00:53:22:WU01:FS01:0x17:Max number of retries reached. Aborting.
00:53:22:WU01:FS01:0x17:ERROR:exception: Max Retries Reached
00:53:22:WU01:FS01:0x17:Saving result file logfile_01.txt
00:53:22:WU01:FS01:0x17:Saving result file log.txt
00:53:22:WU01:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
00:53:22:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
00:53:22:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13000 run:327 clone:0 gen:16 core:0x17 unit:0x00000027538b3db7530ff8202ba1583c
00:53:22:WU01:FS01:Uploading 3.12KiB to 140.163.4.231
00:53:22:WU01:FS01:Connecting to 140.163.4.231:8080
00:53:22:WU01:FS01:Upload complete
00:53:22:WU01:FS01:Server responded WORK_ACK (400)
00:53:22:WU01:FS01:Cleaning up
Many thanks!
Dr.G
(full details below)

09:01:14:WU01:FS01:Connecting to 171.67.108.201:80
09:01:15:WU01:FS01:Assigned to work server 140.163.4.231
09:01:15:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:1:Tahiti XT [Radeon R9 200/HD 7900/8970] from 140.163.4.231
09:01:15:WU01:FS01:Connecting to 140.163.4.231:8080
09:01:16:WU01:FS01:Downloading 4.84MiB
09:01:18:WU01:FS01:Download complete
09:01:19:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:13000 run:327 clone:0 gen:16 core:0x17 unit:0x00000027538b3db7530ff8202ba1583c
09:01:33:WU00:FS01:0x17:Saving result file logfile_01.txt
09:01:33:WU00:FS01:0x17:Saving result file checkpointState.xml
09:01:35:WU00:FS01:0x17:Saving result file checkpt.crc
09:01:35:WU00:FS01:0x17:Saving result file log.txt
09:01:35:WU00:FS01:0x17:Saving result file positions.xtc
09:01:38:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 01 -suffix 01 -version 704 -lifeline 3196 -checkpoint 15 -gpu 1 -gpu-vendor ati
09:01:38:WU01:FS01:Started FahCore on PID 2716
09:01:38:WU01:FS01:Core PID:5892
09:01:38:WU01:FS01:FahCore 0x17 started
09:01:38:WU01:FS01:0x17:*********************** Log Started 2014-05-06T09:01:38Z ***********************
09:01:38:WU01:FS01:0x17:Project: 13000 (Run 327, Clone 0, Gen 16)
09:01:38:WU01:FS01:0x17:Unit: 0x00000027538b3db7530ff8202ba1583c
09:01:38:WU01:FS01:0x17:CPU: 0x00000000000000000000000000000000
09:01:38:WU01:FS01:0x17:Machine: 1
09:01:38:WU01:FS01:0x17:Reading tar file state.xml
09:01:39:WU01:FS01:0x17:Reading tar file system.xml
09:01:39:WU01:FS01:0x17:Reading tar file integrator.xml
09:01:39:WU01:FS01:0x17:Reading tar file core.xml
09:01:39:WU01:FS01:0x17:Digital signatures verified
09:01:39:WU01:FS01:0x17:Folding@home GPU core17
09:01:39:WU01:FS01:0x17:Version 0.0.52
09:05:03:WU01:FS01:0x17:Completed 0 out of 5000000 steps (0%)
09:05:03:WU01:FS01:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
09:13:31:WU01:FS01:0x17:Completed 50000 out of 5000000 steps (1%)
09:21:42:WU01:FS01:0x17:Completed 100000 out of 5000000 steps (2%)
09:30:13:WU01:FS01:0x17:Completed 150000 out of 5000000 steps (3%)
09:38:24:WU01:FS01:0x17:Completed 200000 out of 5000000 steps (4%)
09:46:36:WU01:FS01:0x17:Completed 250000 out of 5000000 steps (5%)
09:55:06:WU01:FS01:0x17:Completed 300000 out of 5000000 steps (6%)
10:03:17:WU01:FS01:0x17:Completed 350000 out of 5000000 steps (7%)
10:11:48:WU01:FS01:0x17:Completed 400000 out of 5000000 steps (8%)
10:19:59:WU01:FS01:0x17:Completed 450000 out of 5000000 steps (9%)
10:28:10:WU01:FS01:0x17:Completed 500000 out of 5000000 steps (10%)
10:36:41:WU01:FS01:0x17:Completed 550000 out of 5000000 steps (11%)
10:44:52:WU01:FS01:0x17:Completed 600000 out of 5000000 steps (12%)
10:53:23:WU01:FS01:0x17:Completed 650000 out of 5000000 steps (13%)
11:01:34:WU01:FS01:0x17:Completed 700000 out of 5000000 steps (14%)
11:09:45:WU01:FS01:0x17:Completed 750000 out of 5000000 steps (15%)
11:18:16:WU01:FS01:0x17:Completed 800000 out of 5000000 steps (16%)
11:26:27:WU01:FS01:0x17:Completed 850000 out of 5000000 steps (17%)
11:34:58:WU01:FS01:0x17:Completed 900000 out of 5000000 steps (18%)
11:43:08:WU01:FS01:0x17:Completed 950000 out of 5000000 steps (19%)
11:51:19:WU01:FS01:0x17:Completed 1000000 out of 5000000 steps (20%)
11:59:50:WU01:FS01:0x17:Completed 1050000 out of 5000000 steps (21%)
******************************* Date: 2014-05-06 *******************************
12:08:01:WU01:FS01:0x17:Completed 1100000 out of 5000000 steps (22%)
12:16:32:WU01:FS01:0x17:Completed 1150000 out of 5000000 steps (23%)
12:24:43:WU01:FS01:0x17:Completed 1200000 out of 5000000 steps (24%)
12:32:54:WU01:FS01:0x17:Completed 1250000 out of 5000000 steps (25%)
12:41:24:WU01:FS01:0x17:Completed 1300000 out of 5000000 steps (26%)
12:49:36:WU01:FS01:0x17:Completed 1350000 out of 5000000 steps (27%)
12:58:07:WU01:FS01:0x17:Completed 1400000 out of 5000000 steps (28%)
13:06:17:WU01:FS01:0x17:Completed 1450000 out of 5000000 steps (29%)
13:14:29:WU01:FS01:0x17:Completed 1500000 out of 5000000 steps (30%)
13:22:59:WU01:FS01:0x17:Completed 1550000 out of 5000000 steps (31%)
13:31:11:WU01:FS01:0x17:Completed 1600000 out of 5000000 steps (32%)
13:39:43:WU01:FS01:0x17:Completed 1650000 out of 5000000 steps (33%)
13:47:54:WU01:FS01:0x17:Completed 1700000 out of 5000000 steps (34%)
13:56:05:WU01:FS01:0x17:Completed 1750000 out of 5000000 steps (35%)
14:04:36:WU01:FS01:0x17:Completed 1800000 out of 5000000 steps (36%)
14:12:46:WU01:FS01:0x17:Completed 1850000 out of 5000000 steps (37%)
14:21:17:WU01:FS01:0x17:Completed 1900000 out of 5000000 steps (38%)
14:29:28:WU01:FS01:0x17:Completed 1950000 out of 5000000 steps (39%)
14:37:39:WU01:FS01:0x17:Completed 2000000 out of 5000000 steps (40%)
14:46:09:WU01:FS01:0x17:Completed 2050000 out of 5000000 steps (41%)
14:54:20:WU01:FS01:0x17:Completed 2100000 out of 5000000 steps (42%)
15:02:51:WU01:FS01:0x17:Completed 2150000 out of 5000000 steps (43%)
15:11:02:WU01:FS01:0x17:Completed 2200000 out of 5000000 steps (44%)
15:19:12:WU01:FS01:0x17:Completed 2250000 out of 5000000 steps (45%)
15:27:44:WU01:FS01:0x17:Completed 2300000 out of 5000000 steps (46%)
15:35:55:WU01:FS01:0x17:Completed 2350000 out of 5000000 steps (47%)
15:44:26:WU01:FS01:0x17:Completed 2400000 out of 5000000 steps (48%)
15:52:37:WU01:FS01:0x17:Completed 2450000 out of 5000000 steps (49%)
16:00:48:WU01:FS01:0x17:Completed 2500000 out of 5000000 steps (50%)
16:09:18:WU01:FS01:0x17:Completed 2550000 out of 5000000 steps (51%)
16:17:29:WU01:FS01:0x17:Completed 2600000 out of 5000000 steps (52%)
16:26:00:WU01:FS01:0x17:Completed 2650000 out of 5000000 steps (53%)
16:34:11:WU01:FS01:0x17:Completed 2700000 out of 5000000 steps (54%)
16:42:22:WU01:FS01:0x17:Completed 2750000 out of 5000000 steps (55%)
16:50:50:WU01:FS01:0x17:Completed 2800000 out of 5000000 steps (56%)
16:59:00:WU01:FS01:0x17:Completed 2850000 out of 5000000 steps (57%)
17:07:30:WU01:FS01:0x17:Completed 2900000 out of 5000000 steps (58%)
17:15:40:WU01:FS01:0x17:Completed 2950000 out of 5000000 steps (59%)
17:23:51:WU01:FS01:0x17:Completed 3000000 out of 5000000 steps (60%)
17:32:22:WU01:FS01:0x17:Completed 3050000 out of 5000000 steps (61%)
17:40:33:WU01:FS01:0x17:Completed 3100000 out of 5000000 steps (62%)
17:49:04:WU01:FS01:0x17:Completed 3150000 out of 5000000 steps (63%)
17:57:14:WU01:FS01:0x17:Completed 3200000 out of 5000000 steps (64%)
18:05:25:WU01:FS01:0x17:Completed 3250000 out of 5000000 steps (65%)
******************************* Date: 2014-05-06 *******************************
18:13:56:WU01:FS01:0x17:Completed 3300000 out of 5000000 steps (66%)
18:22:07:WU01:FS01:0x17:Completed 3350000 out of 5000000 steps (67%)
18:30:37:WU01:FS01:0x17:Completed 3400000 out of 5000000 steps (68%)
18:38:47:WU01:FS01:0x17:Completed 3450000 out of 5000000 steps (69%)
18:46:58:WU01:FS01:0x17:Completed 3500000 out of 5000000 steps (70%)
18:55:29:WU01:FS01:0x17:Completed 3550000 out of 5000000 steps (71%)
19:03:39:WU01:FS01:0x17:Completed 3600000 out of 5000000 steps (72%)
19:12:10:WU01:FS01:0x17:Completed 3650000 out of 5000000 steps (73%)
19:20:21:WU01:FS01:0x17:Completed 3700000 out of 5000000 steps (74%)
19:28:31:WU01:FS01:0x17:Completed 3750000 out of 5000000 steps (75%)
19:37:01:WU01:FS01:0x17:Completed 3800000 out of 5000000 steps (76%)
20:21:11:WU01:FS01:0x17:Completed 3850000 out of 5000000 steps (77%)
20:49:45:WU01:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint
21:34:51:WU01:FS01:0x17:Completed 3800000 out of 5000000 steps (76%)
22:31:59:WU01:FS01:0x17:Completed 3850000 out of 5000000 steps (77%)
23:00:33:WU01:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint
23:27:40:WU01:FS01:0x17:Completed 3800000 out of 5000000 steps (76%)
******************************* Date: 2014-05-07 *******************************
00:24:48:WU01:FS01:0x17:Completed 3850000 out of 5000000 steps (77%)
00:53:22:WU01:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint
00:53:22:WU01:FS01:0x17:Max number of retries reached. Aborting.
00:53:22:WU01:FS01:0x17:ERROR:exception: Max Retries Reached
00:53:22:WU01:FS01:0x17:Saving result file logfile_01.txt
00:53:22:WU01:FS01:0x17:Saving result file log.txt
00:53:22:WU01:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
00:53:22:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
00:53:22:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13000 run:327 clone:0 gen:16 core:0x17 unit:0x00000027538b3db7530ff8202ba1583c
00:53:22:WU01:FS01:Uploading 3.12KiB to 140.163.4.231
00:53:22:WU01:FS01:Connecting to 140.163.4.231:8080
00:53:22:WU01:FS01:Upload complete
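
The full log shows three "Bad State detected" events, and the run never logged progress past step 3850000 (77%) before the core gave up. A short script can pull the same summary out of a long log. This is only a sketch, assuming log lines formatted like the ones above; `summarize_log` is a hypothetical helper, and the retry limit itself is not something the log states.

```python
import re

BAD_STATE = "Bad State detected"
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps")
CORE_RESULT = re.compile(r"FahCore returned: (\w+)")


def summarize_log(lines):
    """Count bad-state events, track the furthest step logged,
    and capture the status the core returned (if any)."""
    summary = {"bad_states": 0, "max_step": 0, "result": None}
    for line in lines:
        if BAD_STATE in line:
            summary["bad_states"] += 1
        progress = PROGRESS.search(line)
        if progress:
            summary["max_step"] = max(summary["max_step"], int(progress.group(1)))
        result = CORE_RESULT.search(line)
        if result:
            summary["result"] = result.group(1)
    return summary


# A few lines copied from the log above, as sample input.
sample = [
    "20:21:11:WU01:FS01:0x17:Completed 3850000 out of 5000000 steps (77%)",
    "20:49:45:WU01:FS01:0x17:Bad State detected... attempting to resume from last good checkpoint",
    "21:34:51:WU01:FS01:0x17:Completed 3800000 out of 5000000 steps (76%)",
    "00:53:22:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)",
]
print(summarize_log(sample))
```

Repeated failures at the same step across several machines would point toward the WU rather than the hardware, which is what the replies below try to establish.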

Re: BAD_WORK_UNIT P13000 - (327,0,16)

Posted: Wed May 07, 2014 10:03 am
by P5-133XL
So far 3 failures with no successes.

Re: BAD_WORK_UNIT P13000 - (327,0,16)

Posted: Wed May 07, 2014 10:48 am
by Dr.G
@ P5-133XL
Thanks for your reply!
by P5-133XL » Wed May 07, 2014 10:03 am
So far 3 failures with no successes.
Looks like this is a bad WU (?)

Re: BAD_WORK_UNIT P13000 - (327,0,16)

Posted: Wed May 07, 2014 6:09 pm
by P5-133XL
I personally do not label a WU bad till there are 6 failures and no successes. I will continue checking the WU periodically till it either succeeds or gets to 6.
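
The rule of thumb in this post can be written down as a tiny function. This is a sketch of one moderator's personal threshold as stated above, not an official server policy; `classify_wu` and its labels are hypothetical names.

```python
def classify_wu(failures, successes, threshold=6):
    """Apply the rule from the post: a WU is 'bad' only once it has
    `threshold` failures and no successes; any success clears it."""
    if successes > 0:
        return "ok"
    if failures >= threshold:
        return "bad"
    # Fewer than `threshold` failures and no success yet: keep checking.
    return "undecided"


print(classify_wu(failures=3, successes=0))
```

With the 3 failures reported so far and the default threshold of 6, the WU is still undecided under this rule.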

Re: BAD_WORK_UNIT P13000 - (327,0,16)

Posted: Fri May 23, 2014 5:58 am
by Dr.G
Any update on this particular WU? (please)

Re: BAD_WORK_UNIT P13000 - (327,0,16)

Posted: Fri May 23, 2014 6:14 am
by P5-133XL
It has only been given out to one more person (now 4). None have yet succeeded in completing it without error. My personal threshold for manually reporting a bad WU is 6 failures.

Re: BAD_WORK_UNIT P13000 - (327,0,16)

Posted: Fri May 23, 2014 7:52 am
by Dr.G
Thanks @P5-133XL

Re: BAD_WORK_UNIT P13000 - (327,0,16)

Posted: Fri May 23, 2014 1:37 pm
by 7im
Can we find someone with a lower personal threshold to tank this obviously bad work unit? 3 strikes and you're out!

Re: BAD_WORK_UNIT P13000 - (327,0,16)

Posted: Fri May 23, 2014 6:12 pm
by bruce
7im wrote: Can we find someone with a lower personal threshold to tank this obviously bad work unit? 3 strikes and you're out!
That's really the responsibility of the PG, specifically the project owner, not the Mods. (S)he is responsible for setting a global value on the server.

Re: BAD_WORK_UNIT P13000 - (327,0,16)

Posted: Wed Jul 09, 2014 9:47 am
by bollix47
Another folder was able to complete this WU successfully:

Hi xxxx (team xxxx),
Your WU (P13000 R327 C0 G16) was added to the stats database on 2014-06-11 13:04:22 for 36763.1 points of credit.

Re: BAD_WORK_UNIT P13000 - (327,0,16)

Posted: Fri Aug 08, 2014 5:51 pm
by Dr.G
@ bollix47

Thanks for the feedback!