Core 21 Projects spamming BAD_WORK_UNIT failures

Post by petem

Hardware:
Dedicated Linux box, folding with two GTX 1080s since last September, with no overclocks. It does not fold on the CPU (a quad-core i5-2300, if you're curious).
It has been folding for over 4 months without any unusual events (it's on a UPS, so no power-outage issues either).
No hardware or software changes in the past 4 months, and no automatic updates.

Starting today, I noticed both GPU slots getting "BAD_WORK_UNIT" errors so often that they eventually become "FAILED" slots and cease folding.
In the past month there was only 1 bad work unit error; today there have been over 35 on one slot alone.
My older GTX 970/980 Windows folding box is chugging along peacefully (but I haven't reviewed its logs to check for a recent flurry of errors).
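
For anyone who wants to check their own logs for the same pattern, here is a rough sketch of how the per-slot error counts could be tallied. It only assumes the "FahCore returned: ..." line format visible in the log in this post; the function name and the log path in the usage comment are my own placeholders, not anything provided by the client.

```python
import re
from collections import Counter

# Matches client-side result lines such as:
#   05:40:46:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
# Captures the slot (e.g. FS01) and the result code name.
RETURNED = re.compile(r":(FS\d+):FahCore returned: (\w+)")


def count_core_results(lines):
    """Return a Counter keyed by (slot, result), e.g. ('FS01', 'BAD_WORK_UNIT')."""
    counts = Counter()
    for line in lines:
        match = RETURNED.search(line)
        if match:
            counts[match.groups()] += 1
    return counts


# Three lines copied from the log in this post.
sample = [
    "03:25:31:WU00:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)",
    "05:40:46:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)",
    "05:41:33:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)",
]

for (slot, result), n in sorted(count_core_results(sample).items()):
    print(slot, result, n)

# Usage on a real log (the path is an assumption, adjust for your install):
# with open("/var/lib/fahclient/log.txt", errors="replace") as fh:
#     print(count_core_results(fh))
```

Counting only the "FahCore returned" lines avoids double counting, since each failed unit also logs a separate "Core Shutdown: BAD_WORK_UNIT" line.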

Recent folding log for slot 1, containing multiple BAD_WORK_UNIT (BWU) failures:

01:37:36:WU00:FS01:0x18:Completed 1000000 out of 5000000 steps (20%)
01:39:00:WU00:FS01:0x18:Completed 1050000 out of 5000000 steps (21%)
01:40:19:WU00:FS01:0x18:Completed 1100000 out of 5000000 steps (22%)
01:41:42:WU00:FS01:0x18:Completed 1150000 out of 5000000 steps (23%)
01:43:01:WU00:FS01:0x18:Completed 1200000 out of 5000000 steps (24%)
01:44:21:WU00:FS01:0x18:Completed 1250000 out of 5000000 steps (25%)
01:45:44:WU00:FS01:0x18:Completed 1300000 out of 5000000 steps (26%)
01:47:03:WU00:FS01:0x18:Completed 1350000 out of 5000000 steps (27%)
01:48:26:WU00:FS01:0x18:Completed 1400000 out of 5000000 steps (28%)
01:49:45:WU00:FS01:0x18:Completed 1450000 out of 5000000 steps (29%)
01:51:05:WU00:FS01:0x18:Completed 1500000 out of 5000000 steps (30%)
01:52:28:WU00:FS01:0x18:Completed 1550000 out of 5000000 steps (31%)
01:53:47:WU00:FS01:0x18:Completed 1600000 out of 5000000 steps (32%)
01:55:10:WU00:FS01:0x18:Completed 1650000 out of 5000000 steps (33%)
01:56:30:WU00:FS01:0x18:Completed 1700000 out of 5000000 steps (34%)
01:57:49:WU00:FS01:0x18:Completed 1750000 out of 5000000 steps (35%)
01:59:12:WU00:FS01:0x18:Completed 1800000 out of 5000000 steps (36%)
02:00:31:WU00:FS01:0x18:Completed 1850000 out of 5000000 steps (37%)
02:01:55:WU00:FS01:0x18:Completed 1900000 out of 5000000 steps (38%)
02:03:14:WU00:FS01:0x18:Completed 1950000 out of 5000000 steps (39%)
02:04:33:WU00:FS01:0x18:Completed 2000000 out of 5000000 steps (40%)
02:05:56:WU00:FS01:0x18:Completed 2050000 out of 5000000 steps (41%)
02:07:16:WU00:FS01:0x18:Completed 2100000 out of 5000000 steps (42%)
02:08:39:WU00:FS01:0x18:Completed 2150000 out of 5000000 steps (43%)
02:09:58:WU00:FS01:0x18:Completed 2200000 out of 5000000 steps (44%)
02:11:17:WU00:FS01:0x18:Completed 2250000 out of 5000000 steps (45%)
02:12:41:WU00:FS01:0x18:Completed 2300000 out of 5000000 steps (46%)
02:14:00:WU00:FS01:0x18:Completed 2350000 out of 5000000 steps (47%)
02:15:23:WU00:FS01:0x18:Completed 2400000 out of 5000000 steps (48%)
02:16:42:WU00:FS01:0x18:Completed 2450000 out of 5000000 steps (49%)
02:18:02:WU00:FS01:0x18:Completed 2500000 out of 5000000 steps (50%)
02:19:25:WU00:FS01:0x18:Completed 2550000 out of 5000000 steps (51%)
02:20:44:WU00:FS01:0x18:Completed 2600000 out of 5000000 steps (52%)
02:22:07:WU00:FS01:0x18:Completed 2650000 out of 5000000 steps (53%)
02:23:27:WU00:FS01:0x18:Completed 2700000 out of 5000000 steps (54%)
02:24:46:WU00:FS01:0x18:Completed 2750000 out of 5000000 steps (55%)
02:26:09:WU00:FS01:0x18:Completed 2800000 out of 5000000 steps (56%)
02:27:28:WU00:FS01:0x18:Completed 2850000 out of 5000000 steps (57%)
02:28:51:WU00:FS01:0x18:Completed 2900000 out of 5000000 steps (58%)
02:30:11:WU00:FS01:0x18:Completed 2950000 out of 5000000 steps (59%)
02:31:30:WU00:FS01:0x18:Completed 3000000 out of 5000000 steps (60%)
02:32:54:WU00:FS01:0x18:Completed 3050000 out of 5000000 steps (61%)
02:34:13:WU00:FS01:0x18:Completed 3100000 out of 5000000 steps (62%)
02:35:36:WU00:FS01:0x18:Completed 3150000 out of 5000000 steps (63%)
02:36:55:WU00:FS01:0x18:Completed 3200000 out of 5000000 steps (64%)
02:38:15:WU00:FS01:0x18:Completed 3250000 out of 5000000 steps (65%)
02:39:38:WU00:FS01:0x18:Completed 3300000 out of 5000000 steps (66%)
02:40:57:WU00:FS01:0x18:Completed 3350000 out of 5000000 steps (67%)
02:42:20:WU00:FS01:0x18:Completed 3400000 out of 5000000 steps (68%)
02:43:39:WU00:FS01:0x18:Completed 3450000 out of 5000000 steps (69%)
02:44:59:WU00:FS01:0x18:Completed 3500000 out of 5000000 steps (70%)
02:46:22:WU00:FS01:0x18:Completed 3550000 out of 5000000 steps (71%)
02:47:41:WU00:FS01:0x18:Completed 3600000 out of 5000000 steps (72%)
02:49:04:WU00:FS01:0x18:Completed 3650000 out of 5000000 steps (73%)
02:50:24:WU00:FS01:0x18:Completed 3700000 out of 5000000 steps (74%)
02:51:43:WU00:FS01:0x18:Completed 3750000 out of 5000000 steps (75%)
02:53:06:WU00:FS01:0x18:Completed 3800000 out of 5000000 steps (76%)
02:54:25:WU00:FS01:0x18:Completed 3850000 out of 5000000 steps (77%)
02:55:49:WU00:FS01:0x18:Completed 3900000 out of 5000000 steps (78%)
02:57:08:WU00:FS01:0x18:Completed 3950000 out of 5000000 steps (79%)
02:58:27:WU00:FS01:0x18:Completed 4000000 out of 5000000 steps (80%)
02:59:51:WU00:FS01:0x18:Completed 4050000 out of 5000000 steps (81%)
03:01:10:WU00:FS01:0x18:Completed 4100000 out of 5000000 steps (82%)
03:02:33:WU00:FS01:0x18:Completed 4150000 out of 5000000 steps (83%)
03:03:53:WU00:FS01:0x18:Completed 4200000 out of 5000000 steps (84%)
03:05:12:WU00:FS01:0x18:Completed 4250000 out of 5000000 steps (85%)
03:06:35:WU00:FS01:0x18:Completed 4300000 out of 5000000 steps (86%)
03:07:54:WU00:FS01:0x18:Completed 4350000 out of 5000000 steps (87%)
03:09:18:WU00:FS01:0x18:Completed 4400000 out of 5000000 steps (88%)
03:10:37:WU00:FS01:0x18:Completed 4450000 out of 5000000 steps (89%)
03:11:56:WU00:FS01:0x18:Completed 4500000 out of 5000000 steps (90%)
03:13:19:WU00:FS01:0x18:Completed 4550000 out of 5000000 steps (91%)
03:14:39:WU00:FS01:0x18:Completed 4600000 out of 5000000 steps (92%)
03:16:02:WU00:FS01:0x18:Completed 4650000 out of 5000000 steps (93%)
03:17:21:WU00:FS01:0x18:Completed 4700000 out of 5000000 steps (94%)
03:18:41:WU00:FS01:0x18:Completed 4750000 out of 5000000 steps (95%)
03:20:04:WU00:FS01:0x18:Completed 4800000 out of 5000000 steps (96%)
03:21:23:WU00:FS01:0x18:Completed 4850000 out of 5000000 steps (97%)
03:22:46:WU00:FS01:0x18:Completed 4900000 out of 5000000 steps (98%)
03:24:06:WU00:FS01:0x18:Completed 4950000 out of 5000000 steps (99%)
03:25:25:WU00:FS01:0x18:Completed 5000000 out of 5000000 steps (100%)
03:25:25:WU02:FS01:Connecting to 171.67.108.45:80
03:25:25:WU02:FS01:Assigned to work server 140.163.4.244
03:25:25:WU02:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GP104 [GeForce GTX 1080] from 140.163.4.244
03:25:25:WU02:FS01:Connecting to 140.163.4.244:8080
03:25:26:WU02:FS01:Downloading 2.54MiB
03:25:29:WU00:FS01:0x18:Saving result file logfile_01.txt
03:25:29:WU00:FS01:0x18:Saving result file checkpointState.xml
03:25:29:WU02:FS01:Download complete
03:25:29:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:10490 run:190 clone:0 gen:525 core:0x18 unit:0x000002768ca304f45537e8f9ef2fbc0a
03:25:30:WU00:FS01:0x18:Saving result file checkpt.crc
03:25:30:WU00:FS01:0x18:Saving result file log.txt
03:25:30:WU00:FS01:0x18:Saving result file positions.xtc
03:25:31:WU00:FS01:0x18:Folding@home Core Shutdown: FINISHED_UNIT
03:25:31:WU00:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
03:25:31:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:10490 run:311 clone:0 gen:554 core:0x18 unit:0x0000027c8ca304f45537e90c68283a8b
03:25:31:WU00:FS01:Uploading 6.65MiB to 140.163.4.244
03:25:31:WU00:FS01:Connecting to 140.163.4.244:8080
03:25:31:WU02:FS01:Starting
03:25:31:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_18.fah/FahCore_18 -dir 02 -suffix 01 -version 704 -lifeline 1097 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
03:25:31:WU02:FS01:Started FahCore on PID 3332
03:25:31:WU02:FS01:Core PID:3336
03:25:31:WU02:FS01:FahCore 0x18 started
03:25:31:WU02:FS01:0x18:*********************** Log Started 2017-02-01T03:25:31Z ***********************
03:25:31:WU02:FS01:0x18:Project: 10490 (Run 190, Clone 0, Gen 525)
03:25:31:WU02:FS01:0x18:Unit: 0x000002768ca304f45537e8f9ef2fbc0a
03:25:31:WU02:FS01:0x18:CPU: 0x00000000000000000000000000000000
03:25:31:WU02:FS01:0x18:Machine: 1
03:25:31:WU02:FS01:0x18:Reading tar file core.xml
03:25:31:WU02:FS01:0x18:Reading tar file system.xml
03:25:31:WU02:FS01:0x18:Reading tar file integrator.xml
03:25:31:WU02:FS01:0x18:Reading tar file state.xml
03:25:32:WU02:FS01:0x18:Digital signatures verified
03:25:32:WU02:FS01:0x18:Folding@home GPU core18
03:25:32:WU02:FS01:0x18:Version 0.0.4
03:25:38:WU00:FS01:Upload 22.56%
03:25:40:WU02:FS01:0x18:Completed 0 out of 5000000 steps (0%)
03:25:40:WU02:FS01:0x18:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
03:25:44:WU00:FS01:Upload 41.37%
03:25:50:WU00:FS01:Upload 54.53%
03:25:57:WU00:FS01:Upload 73.33%
03:26:05:WU00:FS01:Upload 91.20%
03:26:14:WU00:FS01:Upload complete
03:26:14:WU00:FS01:Server responded WORK_ACK (400)
03:26:14:WU00:FS01:Final credit estimate, 83089.00 points
03:26:14:WU00:FS01:Cleaning up
03:27:02:WU02:FS01:0x18:Completed 50000 out of 5000000 steps (1%)
03:28:23:WU02:FS01:0x18:Completed 100000 out of 5000000 steps (2%)
03:29:46:WU02:FS01:0x18:Completed 150000 out of 5000000 steps (3%)
03:31:05:WU02:FS01:0x18:Completed 200000 out of 5000000 steps (4%)
03:32:25:WU02:FS01:0x18:Completed 250000 out of 5000000 steps (5%)
03:33:48:WU02:FS01:0x18:Completed 300000 out of 5000000 steps (6%)
03:35:07:WU02:FS01:0x18:Completed 350000 out of 5000000 steps (7%)
03:36:31:WU02:FS01:0x18:Completed 400000 out of 5000000 steps (8%)
03:37:50:WU02:FS01:0x18:Completed 450000 out of 5000000 steps (9%)
03:39:09:WU02:FS01:0x18:Completed 500000 out of 5000000 steps (10%)
03:40:32:WU02:FS01:0x18:Completed 550000 out of 5000000 steps (11%)
03:41:51:WU02:FS01:0x18:Completed 600000 out of 5000000 steps (12%)
03:43:14:WU02:FS01:0x18:Completed 650000 out of 5000000 steps (13%)
03:44:34:WU02:FS01:0x18:Completed 700000 out of 5000000 steps (14%)
03:45:53:WU02:FS01:0x18:Completed 750000 out of 5000000 steps (15%)
03:47:16:WU02:FS01:0x18:Completed 800000 out of 5000000 steps (16%)
03:48:36:WU02:FS01:0x18:Completed 850000 out of 5000000 steps (17%)
03:49:59:WU02:FS01:0x18:Completed 900000 out of 5000000 steps (18%)
03:51:18:WU02:FS01:0x18:Completed 950000 out of 5000000 steps (19%)
03:52:38:WU02:FS01:0x18:Completed 1000000 out of 5000000 steps (20%)
03:54:01:WU02:FS01:0x18:Completed 1050000 out of 5000000 steps (21%)
03:55:20:WU02:FS01:0x18:Completed 1100000 out of 5000000 steps (22%)
03:56:43:WU02:FS01:0x18:Completed 1150000 out of 5000000 steps (23%)
03:58:02:WU02:FS01:0x18:Completed 1200000 out of 5000000 steps (24%)
03:59:22:WU02:FS01:0x18:Completed 1250000 out of 5000000 steps (25%)
04:00:45:WU02:FS01:0x18:Completed 1300000 out of 5000000 steps (26%)
04:02:04:WU02:FS01:0x18:Completed 1350000 out of 5000000 steps (27%)
04:03:28:WU02:FS01:0x18:Completed 1400000 out of 5000000 steps (28%)
04:04:47:WU02:FS01:0x18:Completed 1450000 out of 5000000 steps (29%)
04:06:06:WU02:FS01:0x18:Completed 1500000 out of 5000000 steps (30%)
04:07:29:WU02:FS01:0x18:Completed 1550000 out of 5000000 steps (31%)
04:08:49:WU02:FS01:0x18:Completed 1600000 out of 5000000 steps (32%)
04:10:12:WU02:FS01:0x18:Completed 1650000 out of 5000000 steps (33%)
04:11:31:WU02:FS01:0x18:Completed 1700000 out of 5000000 steps (34%)
04:12:50:WU02:FS01:0x18:Completed 1750000 out of 5000000 steps (35%)
04:14:14:WU02:FS01:0x18:Completed 1800000 out of 5000000 steps (36%)
04:15:33:WU02:FS01:0x18:Completed 1850000 out of 5000000 steps (37%)
04:16:56:WU02:FS01:0x18:Completed 1900000 out of 5000000 steps (38%)
04:18:16:WU02:FS01:0x18:Completed 1950000 out of 5000000 steps (39%)
04:19:35:WU02:FS01:0x18:Completed 2000000 out of 5000000 steps (40%)
04:20:58:WU02:FS01:0x18:Completed 2050000 out of 5000000 steps (41%)
04:22:18:WU02:FS01:0x18:Completed 2100000 out of 5000000 steps (42%)
04:23:41:WU02:FS01:0x18:Completed 2150000 out of 5000000 steps (43%)
04:25:00:WU02:FS01:0x18:Completed 2200000 out of 5000000 steps (44%)
04:26:20:WU02:FS01:0x18:Completed 2250000 out of 5000000 steps (45%)
04:27:43:WU02:FS01:0x18:Completed 2300000 out of 5000000 steps (46%)
04:29:02:WU02:FS01:0x18:Completed 2350000 out of 5000000 steps (47%)
04:30:26:WU02:FS01:0x18:Completed 2400000 out of 5000000 steps (48%)
04:31:45:WU02:FS01:0x18:Completed 2450000 out of 5000000 steps (49%)
04:33:05:WU02:FS01:0x18:Completed 2500000 out of 5000000 steps (50%)
04:34:28:WU02:FS01:0x18:Completed 2550000 out of 5000000 steps (51%)
04:35:47:WU02:FS01:0x18:Completed 2600000 out of 5000000 steps (52%)
04:37:11:WU02:FS01:0x18:Completed 2650000 out of 5000000 steps (53%)
04:38:30:WU02:FS01:0x18:Completed 2700000 out of 5000000 steps (54%)
04:39:49:WU02:FS01:0x18:Completed 2750000 out of 5000000 steps (55%)
04:41:12:WU02:FS01:0x18:Completed 2800000 out of 5000000 steps (56%)
04:42:32:WU02:FS01:0x18:Completed 2850000 out of 5000000 steps (57%)
04:43:55:WU02:FS01:0x18:Completed 2900000 out of 5000000 steps (58%)
04:45:14:WU02:FS01:0x18:Completed 2950000 out of 5000000 steps (59%)
04:46:34:WU02:FS01:0x18:Completed 3000000 out of 5000000 steps (60%)
04:47:57:WU02:FS01:0x18:Completed 3050000 out of 5000000 steps (61%)
04:49:16:WU02:FS01:0x18:Completed 3100000 out of 5000000 steps (62%)
04:50:39:WU02:FS01:0x18:Completed 3150000 out of 5000000 steps (63%)
04:51:59:WU02:FS01:0x18:Completed 3200000 out of 5000000 steps (64%)
04:53:18:WU02:FS01:0x18:Completed 3250000 out of 5000000 steps (65%)
04:54:41:WU02:FS01:0x18:Completed 3300000 out of 5000000 steps (66%)
04:56:00:WU02:FS01:0x18:Completed 3350000 out of 5000000 steps (67%)
04:57:23:WU02:FS01:0x18:Completed 3400000 out of 5000000 steps (68%)
04:58:43:WU02:FS01:0x18:Completed 3450000 out of 5000000 steps (69%)
05:00:02:WU02:FS01:0x18:Completed 3500000 out of 5000000 steps (70%)
05:01:25:WU02:FS01:0x18:Completed 3550000 out of 5000000 steps (71%)
05:02:44:WU02:FS01:0x18:Completed 3600000 out of 5000000 steps (72%)
05:04:07:WU02:FS01:0x18:Completed 3650000 out of 5000000 steps (73%)
05:05:27:WU02:FS01:0x18:Completed 3700000 out of 5000000 steps (74%)
05:06:46:WU02:FS01:0x18:Completed 3750000 out of 5000000 steps (75%)
05:08:09:WU02:FS01:0x18:Completed 3800000 out of 5000000 steps (76%)
05:09:29:WU02:FS01:0x18:Completed 3850000 out of 5000000 steps (77%)
05:10:52:WU02:FS01:0x18:Completed 3900000 out of 5000000 steps (78%)
05:12:11:WU02:FS01:0x18:Completed 3950000 out of 5000000 steps (79%)
05:13:30:WU02:FS01:0x18:Completed 4000000 out of 5000000 steps (80%)
05:14:53:WU02:FS01:0x18:Completed 4050000 out of 5000000 steps (81%)
05:16:13:WU02:FS01:0x18:Completed 4100000 out of 5000000 steps (82%)
05:17:36:WU02:FS01:0x18:Completed 4150000 out of 5000000 steps (83%)
05:18:55:WU02:FS01:0x18:Completed 4200000 out of 5000000 steps (84%)
05:20:14:WU02:FS01:0x18:Completed 4250000 out of 5000000 steps (85%)
05:21:37:WU02:FS01:0x18:Completed 4300000 out of 5000000 steps (86%)
05:22:57:WU02:FS01:0x18:Completed 4350000 out of 5000000 steps (87%)
05:24:20:WU02:FS01:0x18:Completed 4400000 out of 5000000 steps (88%)
05:25:39:WU02:FS01:0x18:Completed 4450000 out of 5000000 steps (89%)
05:26:59:WU02:FS01:0x18:Completed 4500000 out of 5000000 steps (90%)
05:28:22:WU02:FS01:0x18:Completed 4550000 out of 5000000 steps (91%)
05:29:41:WU02:FS01:0x18:Completed 4600000 out of 5000000 steps (92%)
05:31:04:WU02:FS01:0x18:Completed 4650000 out of 5000000 steps (93%)
05:32:23:WU02:FS01:0x18:Completed 4700000 out of 5000000 steps (94%)
05:33:43:WU02:FS01:0x18:Completed 4750000 out of 5000000 steps (95%)
05:35:06:WU02:FS01:0x18:Completed 4800000 out of 5000000 steps (96%)
05:36:25:WU02:FS01:0x18:Completed 4850000 out of 5000000 steps (97%)
05:37:48:WU02:FS01:0x18:Completed 4900000 out of 5000000 steps (98%)
05:39:0100%)
05:40:27:WU00:FS01:Connecting to 171.67.108.45:80
05:40:27:WU00:FS01:Assigned to work server 171.67.108.159
05:40:27:WU00:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GP104 [GeForce GTX 1080] from 171.67.108.159
05:40:27:WU00:FS01:Connecting to 171.67.108.159:8080
05:40:28:WU00:FS01:Downloading 23.56MiB
05:40:31:WU02:FS01:0x18:Saving result file logfile_01.txt
05:40:31:WU02:FS01:0x18:Saving result file checkpointState.xml
05:40:32:WU02:FS01:0x18:Saving result file checkpt.crc
05:40:32:WU02:FS01:0x18:Saving result file log.txt
05:40:32:WU02:FS01:0x18:Saving result file positions.xtc
05:40:32:WU02:FS01:0x18:Folding@home Core Shutdown: FINISHED_UNIT
05:40:33:WU02:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
05:40:33:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:10490 run:190 clone:0 gen:525 core:0x18 unit:0x000002768ca304f45537e8f9ef2fbc0a
05:40:33:WU02:FS01:Uploading 6.69MiB to 140.163.4.244
05:40:33:WU02:FS01:Connecting to 140.163.4.244:8080
05:40:34:WU00:FS01:Download 66.59%
05:40:39:WU02:FS01:Upload 17.75%
05:40:40:WU00:FS01:Download 95.51%
05:40:40:WU00:FS01:Download complete
05:40:40:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9179 run:25 clone:7 gen:169 core:0x21 unit:0x0000010cab436c9f57bdce044dfa3edc
05:40:40:WU00:FS01:Starting
05:40:40:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1097 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
05:40:40:WU00:FS01:Started FahCore on PID 4976
05:40:40:WU00:FS01:Core PID:4980
05:40:40:WU00:FS01:FahCore 0x21 started
05:40:41:WU00:FS01:0x21:*********************** Log Started 2017-02-01T05:40:40Z ***********************
05:40:41:WU00:FS01:0x21:Project: 9179 (Run 25, Clone 7, Gen 169)
05:40:41:WU00:FS01:0x21:Unit: 0x0000010cab436c9f57bdce044dfa3edc
05:40:41:WU00:FS01:0x21:CPU: 0x00000000000000000000000000000000
05:40:41:WU00:FS01:0x21:Machine: 1
05:40:41:WU00:FS01:0x21:Reading tar file core.xml
05:40:41:WU00:FS01:0x21:Reading tar file integrator.xml
05:40:41:WU00:FS01:0x21:Reading tar file state.xml
05:40:41:WU00:FS01:0x21:Reading tar file system.xml
05:40:41:WU00:FS01:0x21:Digital signatures verified
05:40:41:WU00:FS01:0x21:Folding@home GPU Core21 Folding@home Core
05:40:41:WU00:FS01:0x21:Version 0.0.18
05:40:45:WU02:FS01:Upload 31.77%
05:40:46:WU00:FS01:0x21:ERROR:Discrepancy: Forces are blowing up! 4218 0
05:40:46:WU00:FS01:0x21:Saving result file logfile_01.txt
05:40:46:WU00:FS01:0x21:Saving result file log.txt
05:40:46:WU00:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
05:40:46:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:40:46:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9179 run:25 clone:7 gen:169 core:0x21 unit:0x0000010cab436c9f57bdce044dfa3edc
05:40:46:WU00:FS01:Uploading 6.50KiB to 171.67.108.159
05:40:46:WU00:FS01:Connecting to 171.67.108.159:8080
05:40:47:WU01:FS01:Connecting to 171.67.108.45:80
05:40:48:WU00:FS01:Upload complete
05:40:48:WU00:FS01:Server responded WORK_ACK (400)
05:40:48:WU00:FS01:Cleaning up
05:40:49:WU01:FS01:Assigned to work server 140.163.4.245
05:40:49:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1080] from 140.163.4.245
05:40:49:WU01:FS01:Connecting to 140.163.4.245:8080
05:40:51:WU01:FS01:Downloading 14.50MiB
05:40:52:WU02:FS01:Upload 49.52%
05:40:57:WU01:FS01:Download 17.67%
05:40:58:WU02:FS01:Upload 64.47%
05:41:03:WU01:FS01:Download 57.77%
05:41:04:WU02:FS01:Upload 78.48%
05:41:08:WU01:FS01:Download complete
05:41:08:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:10496 run:115 clone:5 gen:11 core:0x21 unit:0x0000000d8ca304f556bbae3f953bf5ff
05:41:08:WU01:FS01:Starting
05:41:08:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 704 -lifeline 1097 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
05:41:08:WU01:FS01:Started FahCore on PID 4997
05:41:08:WU01:FS01:Core PID:5001
05:41:08:WU01:FS01:FahCore 0x21 started
05:41:09:WU01:FS01:0x21:*********************** Log Started 2017-02-01T05:41:08Z ***********************
05:41:09:WU01:FS01:0x21:Project: 10496 (Run 115, Clone 5, Gen 11)
05:41:09:WU01:FS01:0x21:Unit: 0x0000000d8ca304f556bbae3f953bf5ff
05:41:09:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
05:41:09:WU01:FS01:0x21:Machine: 1
05:41:09:WU01:FS01:0x21:Reading tar file core.xml
05:41:09:WU01:FS01:0x21:Reading tar file system.xml
05:41:09:WU01:FS01:0x21:Reading tar file integrator.xml
05:41:09:WU01:FS01:0x21:Reading tar file state.xml
05:41:10:WU02:FS01:Upload 91.56%
05:41:14:WU01:FS01:0x21:Digital signatures verified
05:41:14:WU01:FS01:0x21:Folding@home GPU Core21 Folding@home Core
05:41:14:WU01:FS01:0x21:Version 0.0.18
05:41:20:WU02:FS01:Upload complete
05:41:20:WU02:FS01:Server responded WORK_ACK (400)
05:41:20:WU02:FS01:Final credit estimate, 83191.00 points
05:41:20:WU02:FS01:Cleaning up
05:41:33:WU01:FS01:0x21:ERROR:Discrepancy: Forces are blowing up! 1 0
05:41:33:WU01:FS01:0x21:Saving result file logfile_01.txt
05:41:33:WU01:FS01:0x21:Saving result file log.txt
05:41:33:WU01:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
05:41:33:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:41:33:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:10496 run:115 clone:5 gen:11 core:0x21 unit:0x0000000d8ca304f556bbae3f953bf5ff
05:41:33:WU01:FS01:Uploading 2.41KiB to 140.163.4.245
05:41:33:WU01:FS01:Connecting to 140.163.4.245:8080
05:41:33:WU01:FS01:Upload complete
05:41:33:WU01:FS01:Server responded WORK_ACK (400)
05:41:33:WU01:FS01:Cleaning up
05:41:34:WU00:FS01:Connecting to 171.67.108.45:80
05:41:34:WU00:FS01:Assigned to work server 140.163.4.245
05:41:34:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1080] from 140.163.4.245
05:41:34:WU00:FS01:Connecting to 140.163.4.245:8080
05:41:35:WU00:FS01:Downloading 14.49MiB
05:41:40:WU00:FS01:Download complete
05:41:40:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:10496 run:156 clone:2 gen:9 core:0x21 unit:0x0000000d8ca304f556bbb1186f5d4b6d
05:41:40:WU00:FS01:Starting
05:41:40:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1097 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
05:41:40:WU00:FS01:Started FahCore on PID 5022
05:41:40:WU00:FS01:Core PID:5026
05:41:40:WU00:FS01:FahCore 0x21 started
05:41:41:WU00:FS01:0x21:*********************** Log Started 2017-02-01T05:41:40Z ***********************
05:41:41:WU00:FS01:0x21:Project: 10496 (Run 156, Clone 2, Gen 9)
05:41:41:WU00:FS01:0x21:Unit: 0x0000000d8ca304f556bbb1186f5d4b6d
05:41:41:WU00:FS01:0x21:CPU: 0x00000000000000000000000000000000
05:41:41:WU00:FS01:0x21:Machine: 1
05:41:41:WU00:FS01:0x21:Reading tar file core.xml
05:41:41:WU00:FS01:0x21:Reading tar file system.xml
05:41:41:WU00:FS01:0x21:Reading tar file integrator.xml
05:41:41:WU00:FS01:0x21:Reading tar file state.xml
05:41:43:WU00:FS01:0x21:Digital signatures verified
05:41:43:WU00:FS01:0x21:Folding@home GPU Core21 Folding@home Core
05:41:43:WU00:FS01:0x21:Version 0.0.18
05:42:02:WU00:FS01:0x21:ERROR:exception: Error downloading array interactionCount: clEnqueueReadBuffer (-5)
05:42:02:WU00:FS01:0x21:Saving result file logfile_01.txt
05:42:02:WU00:FS01:0x21:Saving result file log.txt
05:42:02:WU00:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
05:42:33:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:42:33:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:10496 run:156 clone:2 gen:9 core:0x21 unit:0x0000000d8ca304f556bbb1186f5d4b6d
05:42:33:WU00:FS01:Uploading 2.49KiB to 140.163.4.245
05:42:33:WU00:FS01:Connecting to 140.163.4.245:8080
05:42:33:WU00:FS01:Upload complete
05:42:33:WU00:FS01:Server responded WORK_ACK (400)
05:42:33:WU00:FS01:Cleaning up
05:42:33:WU01:FS01:Connecting to 171.67.108.45:80
05:42:34:WU01:FS01:Assigned to work server 140.163.4.245
05:42:34:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1080] from 140.163.4.245
05:42:34:WU01:FS01:Connecting to 140.163.4.245:8080
05:42:34:WU01:FS01:Downloading 14.48MiB
05:42:40:WU01:FS01:Download 89.75%
05:42:40:WU01:FS01:Download complete
05:42:40:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:10496 run:172 clone:10 gen:3 core:0x21 unit:0x000000048ca304f556bbb23ca662250a
05:42:40:WU01:FS01:Starting
05:42:40:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 704 -lifeline 1097 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
05:42:40:WU01:FS01:Started FahCore on PID 5105
05:42:40:WU01:FS01:Core PID:5109
05:42:40:WU01:FS01:FahCore 0x21 started
05:42:41:WU01:FS01:0x21:*********************** Log Started 2017-02-01T05:42:40Z ***********************
05:42:41:WU01:FS01:0x21:Project: 10496 (Run 172, Clone 10, Gen 3)
05:42:41:WU01:FS01:0x21:Unit: 0x000000048ca304f556bbb23ca662250a
05:42:41:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
05:42:41:WU01:FS01:0x21:Machine: 1
05:42:41:WU01:FS01:0x21:Reading tar file core.xml
05:42:41:WU01:FS01:0x21:Reading tar file system.xml
05:42:41:WU01:FS01:0x21:Reading tar file integrator.xml
05:42:41:WU01:FS01:0x21:Reading tar file state.xml
05:42:43:WU01:FS01:0x21:Digital signatures verified
05:42:43:WU01:FS01:0x21:Folding@home GPU Core21 Folding@home Core
05:42:43:WU01:FS01:0x21:Version 0.0.18
05:43:02:WU01:FS01:0x21:ERROR:exception: Error downloading array interactionCount: clEnqueueReadBuffer (-5)
05:43:02:WU01:FS01:0x21:Saving result file logfile_01.txt
05:43:02:WU01:FS01:0x21:Saving result file log.txt
05:43:02:WU01:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
05:43:32:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:43:32:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:10496 run:172 clone:10 gen:3 core:0x21 unit:0x000000048ca304f556bbb23ca662250a
05:43:32:WU01:FS01:Uploading 2.49KiB to 140.163.4.245
05:43:32:WU01:FS01:Connecting to 140.163.4.245:8080
05:43:33:WU01:FS01:Upload complete
05:43:33:WU00:FS01:Connecting to 171.67.108.45:80
05:43:33:WU01:FS01:Server responded WORK_ACK (400)
05:43:33:WU01:FS01:Cleaning up
05:43:33:WU00:FS01:Assigned to work server 171.67.108.159
05:43:33:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1080] from 171.67.108.159
05:43:33:WU00:FS01:Connecting to 171.67.108.159:8080
05:43:33:ERROR:WU00:FS01:Exception: Server did not assign work unit
05:43:34:WU00:FS01:Connecting to 171.67.108.45:80
05:43:34:WU00:FS01:Assigned to work server 171.67.108.157
05:43:34:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1080] from 171.67.108.157
05:43:34:WU00:FS01:Connecting to 171.67.108.157:8080
05:43:34:WU00:FS01:Downloading 5.18MiB
05:43:36:WU00:FS01:Download complete
05:43:36:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9414 run:178 clone:0 gen:3 core:0x21 unit:0x00000004ab436c9d585e0691212a8c4a
05:43:36:WU00:FS01:Starting
05:43:36:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1097 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
05:43:36:WU00:FS01:Started FahCore on PID 5148
05:43:36:WU00:FS01:Core PID:5152
05:43:36:WU00:FS01:FahCore 0x21 started
05:43:37:WU00:FS01:0x21:*********************** Log Started 2017-02-01T05:43:36Z ***********************
05:43:37:WU00:FS01:0x21:Project: 9414 (Run 178, Clone 0, Gen 3)
05:43:37:WU00:FS01:0x21:Unit: 0x00000004ab436c9d585e0691212a8c4a
05:43:37:WU00:FS01:0x21:CPU: 0x00000000000000000000000000000000
05:43:37:WU00:FS01:0x21:Machine: 1
05:43:37:WU00:FS01:0x21:Reading tar file core.xml
05:43:37:WU00:FS01:0x21:Reading tar file integrator.xml
05:43:37:WU00:FS01:0x21:Reading tar file state.xml
05:43:37:WU00:FS01:0x21:Reading tar file system.xml
05:43:37:WU00:FS01:0x21:Digital signatures verified
05:43:37:WU00:FS01:0x21:Folding@home GPU Core21 Folding@home Core
05:43:37:WU00:FS01:0x21:Version 0.0.18
05:43:39:WU00:FS01:0x21:ERROR:Discrepancy: Forces are blowing up! 0 0
05:43:39:WU00:FS01:0x21:Saving result file logfile_01.txt
05:43:39:WU00:FS01:0x21:Saving result file log.txt
05:43:39:WU00:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
05:43:39:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:43:39:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9414 run:178 clone:0 gen:3 core:0x21 unit:0x00000004ab436c9d585e0691212a8c4a
05:43:39:WU00:FS01:Uploading 6.50KiB to 171.67.108.157
05:43:39:WU00:FS01:Connecting to 171.67.108.157:8080
05:43:39:WU00:FS01:Upload complete
05:43:39:WU00:FS01:Server responded WORK_ACK (400)
05:43:39:WU00:FS01:Cleaning up
05:43:40:WU00:FS01:Connecting to 171.67.108.45:80
05:43:40:WU00:FS01:Assigned to work server 140.163.4.245
05:43:40:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1080] from 140.163.4.245
05:43:40:WU00:FS01:Connecting to 140.163.4.245:8080
05:43:41:WU00:FS01:Downloading 14.50MiB
05:43:47:WU00:FS01:Download 84.48%
05:43:47:WU00:FS01:Download complete
05:43:48:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:10496 run:182 clone:17 gen:2 core:0x21 unit:0x000000028ca304f556bbb2fcce770e0f
05:43:48:WU00:FS01:Starting
05:43:48:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1097 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
05:43:48:WU00:FS01:Started FahCore on PID 5167
05:43:48:WU00:FS01:Core PID:5171
05:43:48:WU00:FS01:FahCore 0x21 started
05:43:48:WU00:FS01:0x21:*********************** Log Started 2017-02-01T05:43:48Z ***********************
05:43:48:WU00:FS01:0x21:Project: 10496 (Run 182, Clone 17, Gen 2)
05:43:48:WU00:FS01:0x21:Unit: 0x000000028ca304f556bbb2fcce770e0f
05:43:48:WU00:FS01:0x21:CPU: 0x00000000000000000000000000000000
05:43:48:WU00:FS01:0x21:Machine: 1
05:43:48:WU00:FS01:0x21:Reading tar file core.xml
05:43:48:WU00:FS01:0x21:Reading tar file system.xml
05:43:49:WU00:FS01:0x21:Reading tar file integrator.xml
05:43:49:WU00:FS01:0x21:Reading tar file state.xml
05:43:51:WU00:FS01:0x21:Digital signatures verified
05:43:51:WU00:FS01:0x21:Folding@home GPU Core21 Folding@home Core
05:43:51:WU00:FS01:0x21:Version 0.0.18
05:44:09:WU00:FS01:0x21:ERROR:exception: Error downloading array interactionCount: clEnqueueReadBuffer (-5)
05:44:09:WU00:FS01:0x21:Saving result file logfile_01.txt
05:44:09:WU00:FS01:0x21:Saving result file log.txt
05:44:09:WU00:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
05:44:40:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:44:40:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:10496 run:182 clone:17 gen:2 core:0x21 unit:0x000000028ca304f556bbb2fcce770e0f
05:44:40:WU00:FS01:Uploading 2.49KiB to 140.163.4.245
05:44:40:WU00:FS01:Connecting to 140.163.4.245:8080
05:44:40:WU01:FS01:Connecting to 171.67.108.45:80
05:44:40:WU00:FS01:Upload complete
05:44:40:WU00:FS01:Server responded WORK_ACK (400)
05:44:40:WU00:FS01:Cleaning up
05:44:41:WU01:FS01:Assigned to work server 140.163.4.245
05:44:41:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1080] from 140.163.4.245
05:44:41:WU01:FS01:Connecting to 140.163.4.245:8080
05:44:41:WU01:FS01:Downloading 14.50MiB
05:44:47:WU01:FS01:Download 48.28%
05:44:52:WU01:FS01:Download complete
05:44:52:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:10496 run:91 clone:8 gen:15 core:0x21 unit:0x000000118ca304f556bbac921ab5d362
05:44:52:WU01:FS01:Starting
05:44:52:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 704 -lifeline 1097 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
05:44:52:WU01:FS01:Started FahCore on PID 5190
05:44:52:WU01:FS01:Core PID:5194
05:44:52:WU01:FS01:FahCore 0x21 started
05:44:53:WU01:FS01:0x21:*********************** Log Started 2017-02-01T05:44:52Z ***********************
05:44:53:WU01:FS01:0x21:Project: 10496 (Run 91, Clone 8, Gen 15)
05:44:53:WU01:FS01:0x21:Unit: 0x000000118ca304f556bbac921ab5d362
05:44:53:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
05:44:53:WU01:FS01:0x21:Machine: 1
05:44:53:WU01:FS01:0x21:Reading tar file core.xml
05:44:53:WU01:FS01:0x21:Reading tar file system.xml
05:44:53:WU01:FS01:0x21:Reading tar file integrator.xml
05:44:53:WU01:FS01:0x21:Reading tar file state.xml
05:44:55:WU01:FS01:0x21:Digital signatures verified
05:44:55:WU01:FS01:0x21:Folding@home GPU Core21 Folding@home Core
05:44:55:WU01:FS01:0x21:Version 0.0.18
05:45:14:WU01:FS01:0x21:ERROR:exception: Error downloading array interactionCount: clEnqueueReadBuffer (-5)
05:45:14:WU01:FS01:0x21:Saving result file logfile_01.txt
05:45:14:WU01:FS01:0x21:Saving result file log.txt
05:45:14:WU01:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
05:45:44:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:45:44:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:10496 run:91 clone:8 gen:15 core:0x21 unit:0x000000118ca304f556bbac921ab5d362
05:45:44:WU01:FS01:Uploading 2.49KiB to 140.163.4.245
05:45:44:WU01:FS01:Connecting to 140.163.4.245:8080
05:45:45:WU00:FS01:Connecting to 171.67.108.45:80
05:45:45:WU01:FS01:Upload complete
05:45:45:WU01:FS01:Server responded WORK_ACK (400)
05:45:45:WU01:FS01:Cleaning up
05:45:45:WU00:FS01:Assigned to work server 171.67.108.105
05:45:45:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1080] from 171.67.108.105
05:45:45:WU00:FS01:Connecting to 171.67.108.105:8080
05:45:46:WU00:FS01:Downloading 21.71MiB
05:45:52:WU00:FS01:Download 72.24%
05:45:54:WU00:FS01:Download complete
05:45:54:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9176 run:1 clone:9 gen:141 core:0x21 unit:0x000000e3ab436c6957b24c28209181bb
05:45:54:WU00:FS01:Starting
05:45:54:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1097 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
05:45:54:WU00:FS01:Started FahCore on PID 5216
05:45:54:WU00:FS01:Core PID:5220
05:45:54:WU00:FS01:FahCore 0x21 started
05:45:54:WU00:FS01:0x21:*********************** Log Started 2017-02-01T05:45:54Z ***********************
05:45:54:WU00:FS01:0x21:Project: 9176 (Run 1, Clone 9, Gen 141)
05:45:54:WU00:FS01:0x21:Unit: 0x000000e3ab436c6957b24c28209181bb
05:45:54:WU00:FS01:0x21:CPU: 0x00000000000000000000000000000000
05:45:54:WU00:FS01:0x21:Machine: 1
05:45:54:WU00:FS01:0x21:Reading tar file core.xml
05:45:54:WU00:FS01:0x21:Reading tar file integrator.xml
05:45:54:WU00:FS01:0x21:Reading tar file state.xml
05:45:54:WU00:FS01:0x21:Reading tar file system.xml
05:45:54:WU00:FS01:0x21:Digital signatures verified
05:45:54:WU00:FS01:0x21:Folding@home GPU Core21 Folding@home Core
05:45:54:WU00:FS01:0x21:Version 0.0.18
05:46:00:WU00:FS01:0x21:ERROR:Force RMSE error of 642.095 with threshold of 5
05:46:00:WU00:FS01:0x21:Saving result file logfile_01.txt
05:46:00:WU00:FS01:0x21:Saving result file log.txt
05:46:00:WU00:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
05:46:00:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:46:00:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9176 run:1 clone:9 gen:141 core:0x21 unit:0x000000e3ab436c6957b24c28209181bb
05:46:00:WU00:FS01:Uploading 6.50KiB to 171.67.108.105
05:46:00:WU00:FS01:Connecting to 171.67.108.105:8080
05:46:00:WU00:FS01:Upload complete
05:46:00:WU00:FS01:Server responded WORK_ACK (400)
05:46:01:WU00:FS01:Cleaning up
05:46:01:WU01:FS01:Connecting to 171.67.108.45:80
05:46:01:WU01:FS01:Assigned to work server 140.163.4.231
05:46:01:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1080] from 140.163.4.231
05:46:01:WU01:FS01:Connecting to 140.163.4.231:8080
05:46:02:WU01:FS01:Downloading 16.73MiB
05:46:08:WU01:FS01:Download 52.66%
05:46:12:WU01:FS01:Download complete
05:46:12:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11712 run:4 clone:54 gen:35 core:0x21 unit:0x000000318ca304e758332b52128a06c2
05:46:12:WU01:FS01:Starting
05:46:12:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 704 -lifeline 1097 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
05:46:12:WU01:FS01:Started FahCore on PID 5236
05:46:12:WU01:FS01:Core PID:5240
05:46:12:WU01:FS01:FahCore 0x21 started
05:46:12:WU01:FS01:0x21:*********************** Log Started 2017-02-01T05:46:12Z ***********************
05:46:12:WU01:FS01:0x21:Project: 11712 (Run 4, Clone 54, Gen 35)
05:46:12:WU01:FS01:0x21:Unit: 0x000000318ca304e758332b52128a06c2
05:46:12:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
05:46:12:WU01:FS01:0x21:Machine: 1
05:46:12:WU01:FS01:0x21:Reading tar file core.xml
05:46:12:WU01:FS01:0x21:Reading tar file integrator.xml
05:46:12:WU01:FS01:0x21:Reading tar file state.xml
05:46:12:WU01:FS01:0x21:Reading tar file system.xml
05:46:12:WU01:FS01:0x21:Digital signatures verified
05:46:12:WU01:FS01:0x21:Folding@home GPU Core21 Folding@home Core
05:46:12:WU01:FS01:0x21:Version 0.0.18
05:46:17:WU01:FS01:0x21:ERROR:Discrepancy: Forces are blowing up! 0 0
05:46:17:WU01:FS01:0x21:Saving result file logfile_01.txt
05:46:17:WU01:FS01:0x21:Saving result file log.txt
05:46:17:WU01:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
05:46:18:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:46:18:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:11712 run:4 clone:54 gen:35 core:0x21 unit:0x000000318ca304e758332b52128a06c2
05:46:18:WU01:FS01:Uploading 6.50KiB to 140.163.4.231
05:46:18:WU01:FS01:Connecting to 140.163.4.231:8080
05:46:18:WU01:FS01:Upload complete
05:46:18:WU00:FS01:Connecting to 171.67.108.45:80
05:46:18:WU01:FS01:Server responded WORK_ACK (400)
05:46:18:WU01:FS01:Cleaning up
05:46:19:WU00:FS01:Assigned to work server 171.67.108.105
05:46:19:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1080] from 171.67.108.105
05:46:19:WU00:FS01:Connecting to 171.67.108.105:8080
05:46:19:WU00:FS01:Downloading 21.72MiB
05:46:25:WU00:FS01:Download 65.59%
05:46:27:WU00:FS01:Download complete
05:46:27:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9176 run:27 clone:5 gen:131 core:0x21 unit:0x000000d4ab436c6957b24c291eae5cb0
05:46:27:WU00:FS01:Starting
05:46:27:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1097 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
05:46:27:WU00:FS01:Started FahCore on PID 5255
05:46:27:WU00:FS01:Core PID:5259
05:46:27:WU00:FS01:FahCore 0x21 started
05:46:28:WU00:FS01:0x21:*********************** Log Started 2017-02-01T05:46:27Z ***********************
05:46:28:WU00:FS01:0x21:Project: 9176 (Run 27, Clone 5, Gen 131)
05:46:28:WU00:FS01:0x21:Unit: 0x000000d4ab436c6957b24c291eae5cb0
05:46:28:WU00:FS01:0x21:CPU: 0x00000000000000000000000000000000
05:46:28:WU00:FS01:0x21:Machine: 1
05:46:28:WU00:FS01:0x21:Reading tar file core.xml
05:46:28:WU00:FS01:0x21:Reading tar file integrator.xml
05:46:28:WU00:FS01:0x21:Reading tar file state.xml
05:46:28:WU00:FS01:0x21:Reading tar file system.xml
05:46:28:WU00:FS01:0x21:Digital signatures verified
05:46:28:WU00:FS01:0x21:Folding@home GPU Core21 Folding@home Core
05:46:28:WU00:FS01:0x21:Version 0.0.18
05:46:33:WU00:FS01:0x21:ERROR:Discrepancy: Forces are blowing up! 0 0
05:46:33:WU00:FS01:0x21:Saving result file logfile_01.txt
05:46:33:WU00:FS01:0x21:Saving result file log.txt
05:46:33:WU00:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
05:46:33:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:46:33:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9176 run:27 clone:5 gen:131 core:0x21 unit:0x000000d4ab436c6957b24c291eae5cb0
05:46:33:WU00:FS01:Uploading 6.50KiB to 171.67.108.105
05:46:33:WU00:FS01:Connecting to 171.67.108.105:8080
05:46:33:WU00:FS01:Upload complete
05:46:33:WU00:FS01:Server responded WORK_ACK (400)
05:46:34:WU00:FS01:Cleaning up
******************************* Date: 2017-02-01 *******************************
Summary from log entries:

05:40:46:WU00:FS01:0x21:ERROR:Discrepancy: Forces are blowing up! 4218 0
05:40:46:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:40:46:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9179 run:25 clone:7 gen:169 core:0x21 unit:0x0000010cab436c9f57bdce044dfa3edc

05:41:33:WU01:FS01:0x21:ERROR:Discrepancy: Forces are blowing up! 1 0
05:41:33:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:41:33:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:10496 run:115 clone:5 gen:11 core:0x21 unit:0x0000000d8ca304f556bbae3f953bf5ff

05:42:02:WU00:FS01:0x21:ERROR:exception: Error downloading array interactionCount: clEnqueueReadBuffer (-5)
05:42:33:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:42:33:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:10496 run:156 clone:2 gen:9 core:0x21 unit:0x0000000d8ca304f556bbb1186f5d4b6d

05:43:02:WU01:FS01:0x21:ERROR:exception: Error downloading array interactionCount: clEnqueueReadBuffer (-5)
05:43:32:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:43:32:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:10496 run:172 clone:10 gen:3 core:0x21 unit:0x000000048ca304f556bbb23ca662250a

05:43:39:WU00:FS01:0x21:ERROR:Discrepancy: Forces are blowing up! 0 0
05:43:39:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:43:39:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9414 run:178 clone:0 gen:3 core:0x21 unit:0x00000004ab436c9d585e0691212a8c4a

05:44:09:WU00:FS01:0x21:ERROR:exception: Error downloading array interactionCount: clEnqueueReadBuffer (-5)
05:44:40:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:44:40:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:10496 run:182 clone:17 gen:2 core:0x21 unit:0x000000028ca304f556bbb2fcce770e0f

05:45:14:WU01:FS01:0x21:ERROR:exception: Error downloading array interactionCount: clEnqueueReadBuffer (-5)
05:45:44:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:45:44:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:10496 run:91 clone:8 gen:15 core:0x21 unit:0x000000118ca304f556bbac921ab5d362

05:46:00:WU00:FS01:0x21:ERROR:Force RMSE error of 642.095 with threshold of 5
05:46:00:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:46:00:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9176 run:1 clone:9 gen:141 core:0x21 unit:0x000000e3ab436c6957b24c28209181bb

05:46:17:WU01:FS01:0x21:ERROR:Discrepancy: Forces are blowing up! 0 0
05:46:18:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:46:18:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:11712 run:4 clone:54 gen:35 core:0x21 unit:0x000000318ca304e758332b52128a06c2

05:46:33:WU00:FS01:0x21:ERROR:Discrepancy: Forces are blowing up! 0 0
05:46:33:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:46:33:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9176 run:27 clone:5 gen:131 core:0x21 unit:0x000000d4ab436c6957b24c291eae5cb0
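
A summary like the one above can be rebuilt from any FAHClient log with a few lines of Python. This is only a minimal sketch: it assumes the line layout seen in this post (`time:WUxx:FSxx:0x21:ERROR:...` and `error:FAULTY project:...`), and the function name `summarize` and the rule for masking numbers are illustrative choices, not part of FAHClient.

```python
import re
from collections import Counter

# Core error lines, e.g.
# "05:44:09:WU00:FS01:0x21:ERROR:exception: Error downloading array ..."
ERROR_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:WU\d+:FS\d+:0x[0-9a-fA-F]+:ERROR:(?P<msg>.*)$"
)
# The "Sending unit results" line of a unit returned as faulty.
FAULTY_RE = re.compile(r"error:FAULTY project:(?P<project>\d+)")
# Free-standing numbers (step counts, RMSE values). Numbers glued to a
# sign, word or bracket, such as the OpenCL code in "(-5)", are kept.
NUMBER_RE = re.compile(r"(?<![-\w.(])\d+(?:\.\d+)?")


def summarize(lines):
    """Return (error count per message kind, faulty-unit count per project)."""
    kinds = Counter()
    projects = Counter()
    for line in lines:
        line = line.strip()
        match = ERROR_RE.match(line)
        if match:
            # Mask varying numbers so identical errors group together.
            kinds[NUMBER_RE.sub("N", match.group("msg"))] += 1
            continue
        match = FAULTY_RE.search(line)
        if match:
            projects[int(match.group("project"))] += 1
    return kinds, projects
```

Typical use would be `kinds, projects = summarize(open(path))`, where `path` points at the client's log.txt (on a Debian-style install that is probably under /var/lib/fahclient, but that location is an assumption). If the failures are spread across several unrelated projects, as they are here, the work units themselves are an unlikely cause.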

Stopping and restarting the client resulted in a couple of successful work units, but with more BAD_WORK_UNITs than usual, and eventually both slots went to FAILED again.

Any help is greatly appreciated!
- Pete
rwh202
Posts: 410
Joined: Mon Nov 15, 2010 8:51 pm
Hardware configuration: 8x GTX 1080
3x GTX 1080 Ti
3x GTX 1060
Various other bits and pieces
Location: South Coast, UK

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by rwh202 »

Looks like it could be a potential core 21 v18 problem - also reported here:
viewtopic.php?f=66&t=29618&start=15#p292707

However, I've got 4 Linux boxes with pairs of 1080s in them and, as far as I can tell, they are working OK with the new core. What driver version are you using?
Rel25917
Posts: 303
Joined: Wed Aug 15, 2012 2:31 am

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by Rel25917 »

I had this happen on a computer running 2 GTX 980 Ti cards. It had driver 368.something (22, I think). An upgrade to the latest driver got it working right.
Aurum
Posts: 292
Joined: Sat Oct 03, 2015 3:15 pm
Location: The Great Basin

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by Aurum »

I checked my headless folding rigs: they all reported FAH Core 21 as outdated, requested a new one, then stopped and did not restart. I rebooted each one to get them folding again.

If this was the server being overwhelmed it might be a good idea to trap this hanging error and make it restart.
In Science We Trust
petem
Posts: 14
Joined: Wed Nov 18, 2015 4:57 pm
Hardware configuration: i5-2400, Asus P8Z77V-Pro, 2x4GB mem, 80GB HD, 2xGTX 970 (EVGA SSC), GTX 660ti (PNY), Rosewill Quark 750W PSU, Costco grape tray case ; )

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by petem »

:D :D :D PROBLEM SOLVED (I Think)

OK, bit the bullet and updated the nvidia drivers - was running 36x.xx - don't remember the specifics.

Updated to 370.28, and still no luck - both cards experienced similar failures multiple times until they were both set to "FAILED" status.

To verify the driver update was successful, I checked the installed drivers in the Software app like so:
  • Open the Software app: click the icon in the left side menu (the briefcase with an "A" that has a progress bar for a crossbeam).
  • In the Software app, from the main menu, select Installed and, in the drop-down, proprietary GPU drivers.
  • It shows version: nvidia-370 370.28-0ubuntu0~gpu14.04.3
  • In the lower section (Optional add-ons) I noticed the OpenCL checkbox was NOT checked!
  • Check the box: NVIDIA OpenCL ICD (nvidia-opencl-icd-370)
  • Click the Apply button to implement the change.
Restarted the FAHClient (sudo /etc/init.d/FAHClient start) ... and ...

SUCCESS! They are both folding as I type.
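
For headless boxes without the Software app, the same check can be made from a script. The sketch below assumes the usual Linux convention that the OpenCL ICD loader reads vendor entries from /etc/OpenCL/vendors/*.icd; the function names are made up for illustration, and a missing NVIDIA entry would match the unchecked "NVIDIA OpenCL ICD" box described above.

```python
from pathlib import Path


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Return {icd file name: library it points to} for registered OpenCL ICDs."""
    icds = {}
    directory = Path(vendors_dir)
    if not directory.is_dir():
        # No vendor directory at all: no ICDs are registered.
        return icds
    for icd_file in sorted(directory.glob("*.icd")):
        # Each .icd file is a one-line text file naming a shared library.
        icds[icd_file.name] = icd_file.read_text().strip()
    return icds


def has_nvidia_icd(icds):
    """True if any registered ICD file or library name mentions nvidia."""
    return any(
        "nvidia" in name.lower() or "nvidia" in lib.lower()
        for name, lib in icds.items()
    )


if __name__ == "__main__":
    found = list_opencl_icds()
    for name, lib in found.items():
        print(f"{name}: {lib}")
    if not has_nvidia_icd(found):
        print("No NVIDIA OpenCL ICD registered; Core 21 needs OpenCL to run.")
```

This only confirms that the ICD is registered, not that the library behind it loads or works with a given core version.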

This still doesn't answer the main question though: if the system has been folding continually for the past 3 months, what changed to cause it to start erroring?

It is a dedicated system, not being used for anything else (even the keyboard and mouse are only plugged in for booting, etc., and the display input on my regular monitor is inactive).

Since it has been rock solid for months, I don't monitor it closely anymore, but I do have the unit on a watt meter, which has luckily remained in place in front of my regular monitor since I set it up. I just happened to notice that it was reading unusually low for an extended period of time. While this can happen for a brief period between projects, it wasn't ramping back up like normal. Pulling up FAHControl (which has remote access to the folding box) immediately showed the 2 cards were in "FAILED" status.

What changed?
rwh202
Posts: 410
Joined: Mon Nov 15, 2010 8:51 pm
Hardware configuration: 8x GTX 1080
3x GTX 1080 Ti
3x GTX 1060
Various other bits and pieces
Location: South Coast, UK

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by rwh202 »

petem wrote:What changed?
Core 21 was updated from v0.17 to v0.18 - I suspect it doesn't like the older drivers.
Aurum
Posts: 292
Joined: Sat Oct 03, 2015 3:15 pm
Location: The Great Basin

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by Aurum »

rwh202 wrote:
petem wrote:What changed?
Core 21 was updated from v0.17 to v0.18 - I suspect it doesn't like the older drivers.
It runs fine on Win7 with 369.30 after they actually download.
In Science We Trust
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by bruce »

@petem (and others): Do you have any way to figure out what driver version (or CUDA version) was part of the drivers you replaced?
According to NVIDIA, the oldest 64-bit Linux drivers for a 1080 are:
NVIDIA Certified 367.35 July 15, 2016
NVIDIA Certified 367.27 June 13, 2016
NVIDIA Certified 367.18 May 26, 2016
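
On Ubuntu, one way to recover the replaced driver version is the package manager's history. The sketch below is an assumption-laden illustration: it expects the log at /var/log/dpkg.log with lines of the form `date time action package old-version new-version`, and the function name `nvidia_package_history` is hypothetical. Older entries may have been rotated into dpkg.log.1 or compressed files, which this does not read.

```python
from pathlib import Path

# dpkg actions that change which version of a package is installed.
ACTIONS = {"install", "upgrade", "remove", "purge"}


def nvidia_package_history(lines):
    """Return (timestamp, action, package, old, new) for nvidia* packages."""
    history = []
    for line in lines:
        parts = line.split()
        # Expected layout: date time action package old-version new-version
        if len(parts) != 6 or parts[2] not in ACTIONS:
            continue
        date, time, action, package, old, new = parts
        if package.startswith("nvidia"):
            history.append((f"{date} {time}", action, package, old, new))
    return history


if __name__ == "__main__":
    log = Path("/var/log/dpkg.log")  # assumed location on Debian/Ubuntu
    if log.exists():
        with log.open(errors="replace") as handle:
            for entry in nvidia_package_history(handle):
                print(*entry)
    else:
        print("No dpkg log found at", log)
```

A `remove` or `upgrade` entry for an nvidia package shows the old version in the fifth column, which is the number being asked for here.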

I checked the status of one of your WUs. It was reassigned and completed by someone else. The WU, itself, is not the culprit -- but apparently you figured that out.

Hi pete_m (team 223518),
Your WU (P9179 R25 C7 G169) was added to the stats database on 2017-01-31 22:06:03 for 0 points of credit.

WU reassigned to another donor at: 2017-01-31 21:44:52
Hi xxx (team xxxx),
Your WU (P9179 R25 C7 G169) was added to the stats database on 2017-02-01 00:06:47 for 40263.6 points of credit.
Rel25917
Posts: 303
Joined: Wed Aug 15, 2012 2:31 am

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by Rel25917 »

368.22 on win7 with 2 gtx980ti
Nicolas_orleans
Posts: 117
Joined: Wed Aug 08, 2012 3:08 am

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by Nicolas_orleans »

367.57 works ok with 0.0.18 on my Kepler (GTX 770) / Maxwell (GTX 750 Ti) setup
MSI Z77A-GD55 - Core i5-3550 - EVGA GTX 980 Ti Hybrid @ 1366 MHz - Ubuntu 24.04 - 6.8 kernel
MSI MPG B550 - Ryzen 5 5600X - PNY RTX 4080 Super @ 2715 MHz - Ubuntu 24.04 - 6.8 kernel
jimerickson
Posts: 533
Joined: Tue May 27, 2008 11:56 pm
Hardware configuration: Parts:
Asus H370 Mining Master motherboard (X2)
Patriot Viper DDR4 memory 16gb stick (X4)
Nvidia GeForce GTX 1080 gpu (X16)
Intel Core i7 8700 cpu (X2)
Silverstone 1000 watt psu (X4)
Veddha 8 gpu miner case (X2)
Thermaltake hsf (X2)
Ubit riser card (X16)
Location: ames, iowa

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by jimerickson »

ubuntu 16.04 and nvidia-367.57 with v0.0.18 on gtx1080's running with no problems here.
HaloJones
Posts: 906
Joined: Thu Jul 24, 2008 10:16 am

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by HaloJones »

why do you auto-update machines when the machine has drivers that don't work? is that a tested strategy or are you trying to piss me (us) off?

a perfectly happy headless computer has lost 24 hours of folding effort and an hour of research and updating then hacking the drivers with coolbits etc because 0.18 is better than 0.17? really? by how much? 1%? how about you let me decide next time

:x :x :x :x :x :x
single 1070

Aurum
Posts: 292
Joined: Sat Oct 03, 2015 3:15 pm
Location: The Great Basin

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by Aurum »

HaloJones wrote:why do you auto-update machines when the machine has drivers that don't work? is that a tested strategy or are you trying to piss me (us) off?

a perfectly happy headless computer has lost 24 hours of folding effort and an hour of research and updating then hacking the drivers with coolbits etc because 0.18 is better than 0.17? really? by how much? 1%? how about you let me decide next time

:x :x :x :x :x :x
Ditto. 369.30 from the CUDA 8.0 Dev package works well. Still, each folding rig halted while trying to update FAH Core 21.
In Science We Trust
JohnChodera
Pande Group Member
Posts: 467
Joined: Fri Feb 22, 2013 9:59 pm

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by JohnChodera »

So sorry for the issues here. We're looking into this now. This was certainly not expected, and it did not come up in the 1-2 weeks of testing.

> why do you auto-update machines when the machine has drivers that don't work? is that a tested strategy or are you trying to piss me (us) off?

We're incredibly sorry for the lost folding time (and incredibly grateful that you have such a great rig folding on FAH!). But here's what we're up against:
(1) A very LARGE fraction of FAH donors have updated to the new driver series which causes core21 0.0.17 and earlier versions to not work at all. That caused a really large fraction of our GPU compute power to go away.
(2) The NVIDIA driver team finally figured out what was going on and issued a hotfix, but we are under intense pressure for them to remove it because it significantly degrades performance.
(3) As soon as a workaround was identified for OpenMM (a couple of weeks after the hotfix), we started testing core builds incorporating this hotfix.
(4) The 0.0.18 build was tested in several stages over the period of ~2 weeks, and we didn't see any problems like this.
(5) We don't have a way of telling what driver version you're using, so there's no way for us to limit the push to machines that had the driver failure.

Again, we're super sorry this caused lost folding time, but hopefully this gives you an idea of why we rolled this forward.
JohnChodera
Pande Group Member
Posts: 467
Joined: Fri Feb 22, 2013 9:59 pm

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

Post by JohnChodera »

The main bugfix change was this change in OpenMM (backported to the version we were using in the core):
https://github.com/pandegroup/openmm/pull/1717/files

It's possible this change caused issues with the specific driver you were using at the time that things started failing.

Is everything working OK with the recent drivers?