
171.64.65.124 SLOW upload/download and Dumping WU

Posted: Tue Dec 08, 2015 9:18 pm
by n1np
I saw this yesterday while looking through some logs: 171.64.65.124 is very slow on upload and download, and sometimes the work units are simply "dumped". Below is an example I saw again this afternoon. I have not gone through all of the logs (8 machines, all nearly identical) to get a list of the affected WUs, but this is the only IP address that I have seen the issue with.

Log header:

Code:

*********************** Log Started 2015-12-08T02:24:58Z ***********************
02:24:58:************************* Folding@home Client *************************
02:24:58:    Website: http://folding.stanford.edu/
02:24:58:  Copyright: (c) 2009-2014 Stanford University
02:24:58:     Author: Joseph Coffland <[email protected]>
02:24:58:       Args: --config core744.xml --client-type=advanced --cpus=8
02:24:58:             --command-allow=192.168.0.0/24 --command-port=36330
02:24:58:             --password=********** --next-unit-percentage=100
02:24:58:             --core-priority=low --cpu-affinity=true
02:24:58:     Config: /home/akula/fah/7-4-4/core744.xml
02:24:58:******************************** Build ********************************
02:24:58:    Version: 7.4.4
02:24:58:       Date: Mar 4 2014
02:24:58:       Time: 12:01:17
02:24:58:    SVN Rev: 4130
02:24:58:     Branch: fah/trunk/client
02:24:58:   Compiler: GNU 4.1.2 20080704 (Red Hat 4.1.2-46)
02:24:58:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
02:24:58:             -fno-unsafe-math-optimizations -msse2
02:24:58:   Platform: linux2 2.6.18-164.11.1.el5
02:24:58:       Bits: 64
02:24:58:       Mode: Release
02:24:58:******************************* System ********************************
02:24:58:        CPU: Intel(R) Xeon(R) CPU L5420 @ 2.50GHz
02:24:58:     CPU ID: GenuineIntel Family 6 Model 23 Stepping 6
02:24:58:       CPUs: 8
02:24:58:     Memory: 15.65GiB
02:24:58:Free Memory: 15.31GiB
02:24:58:    Threads: POSIX_THREADS
02:24:58: OS Version: 3.2
02:24:58:Has Battery: false
02:24:58: On Battery: false
02:24:58: UTC Offset: -5
02:24:58:        PID: 2025
02:24:58:        CWD: /home/akula/fah/7-4-4
02:24:58:         OS: Linux 3.2.29 x86_64
02:24:58:    OS Arch: AMD64
02:24:58:       GPUs: 0
02:24:58:       CUDA: Not detected
02:24:58:***********************************************************************
02:24:58:<config>
02:24:58:  <!-- Folding Slot Configuration -->
02:24:58:  <gpu v='false'/>
02:24:58:
02:24:58:  <!-- User Information -->
02:24:58:  <passkey v='********************************'/>
02:24:58:  <team v='12912'/>
02:24:58:  <user v='n1np'/>
02:24:58:
02:24:58:  <!-- Folding Slots -->
02:24:58:  <slot id='0' type='CPU'/>
02:24:58:</config>

Trouble spot:

Code:

<SNIP>

19:30:05:WU00:FS00:0xa4:Completed 242500 out of 250000 steps  (97%)
19:31:28:WU00:FS00:0xa4:Completed 245000 out of 250000 steps  (98%)
19:32:50:WU00:FS00:0xa4:Completed 247500 out of 250000 steps  (99%)
19:34:12:WU00:FS00:0xa4:Completed 250000 out of 250000 steps  (100%)
19:34:13:WU00:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
19:34:13:WU01:FS00:Connecting to 171.67.108.45:8080
19:34:13:WU01:FS00:Assigned to work server 171.64.65.124
19:34:13:WU01:FS00:Requesting new work unit for slot 00: RUNNING cpu:8 from 171.64.65.124
19:34:13:WU01:FS00:Connecting to 171.64.65.124:8080
19:34:18:WU01:FS00:Downloading 806.67KiB
19:34:23:WU00:FS00:0xa4:
19:34:23:WU00:FS00:0xa4:Finished Work Unit:
19:34:23:WU00:FS00:0xa4:- Reading up to 811512 from "00/wudata_01.trr": Read 811512
19:34:23:WU00:FS00:0xa4:trr file hash check passed.
19:34:23:WU00:FS00:0xa4:- Reading up to 746104 from "00/wudata_01.xtc": Read 746104
19:34:23:WU00:FS00:0xa4:xtc file hash check passed.
19:34:23:WU00:FS00:0xa4:edr file hash check passed.
19:34:23:WU00:FS00:0xa4:logfile size: 23010
19:34:23:WU00:FS00:0xa4:Leaving Run
19:34:23:WU00:FS00:0xa4:- Writing 1583114 bytes of core data to disk...
19:34:23:WU00:FS00:0xa4:Done: 1582602 -> 1537893 (compressed to 97.1 percent)
19:34:23:WU00:FS00:0xa4:  ... Done.
19:34:27:WU01:FS00:Download 7.93%
19:34:41:WU01:FS00:Download 15.87%
19:34:48:WU01:FS00:Download 31.74%
19:34:58:WU01:FS00:Download 47.60%
19:35:05:WU01:FS00:Download 79.34%
19:35:11:WU01:FS00:Download 95.21%
19:35:12:WU01:FS00:Download complete
19:35:12:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9022 run:144 clone:7 gen:106 core:0xa4 unit:0x0000007bab40417c55e8a37e171f3ca7
19:35:20:WU00:FS00:0xa4:- Shutting down core
19:35:20:WU00:FS00:0xa4:
19:35:20:WU00:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
19:35:27:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
19:35:27:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:9024 run:600 clone:2 gen:83 core:0xa4 unit:0x00000065ab40417c55f3a7471bdc533c
19:35:27:WU00:FS00:Uploading 1.47MiB to 171.64.65.124
19:35:27:WU01:FS00:Starting
19:35:27:WU00:FS00:Connecting to 171.64.65.124:8080
19:35:27:WU01:FS00:Running FahCore: /home/akula/fah/7-4-4/FAHCoreWrapper /home/akula/fah/7-4-4/cores/web.stanford.edu/~pande/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 01 -suffix 01 -version 704 -lifeline 2025 -checkpoint 15 -np 8
19:35:27:WU01:FS00:Started FahCore on PID 2237
19:35:27:WU01:FS00:Core PID:2241
19:35:27:WU01:FS00:FahCore 0xa4 started
19:35:27:WU01:FS00:0xa4:
19:35:27:WU01:FS00:0xa4:*------------------------------*
19:35:27:WU01:FS00:0xa4:Folding@Home Gromacs GB Core
19:35:27:WU01:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
19:35:27:WU01:FS00:0xa4:
19:35:27:WU01:FS00:0xa4:Preparing to commence simulation
19:35:27:WU01:FS00:0xa4:- Looking at optimizations...
19:35:27:WU01:FS00:0xa4:- Created dyn
19:35:27:WU01:FS00:0xa4:- Files status OK
19:35:27:WU01:FS00:0xa4:- Expanded 825516 -> 1398024 (decompressed 169.3 percent)
19:35:27:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=825516 data_size=1398024, decompressed_data_size=1398024 diff=0
19:35:27:WU01:FS00:0xa4:- Digital signature verified
19:35:27:WU01:FS00:0xa4:
19:35:27:WU01:FS00:0xa4:Project: 9022 (Run 144, Clone 7, Gen 106)
19:35:27:WU01:FS00:0xa4:
19:35:27:WU01:FS00:0xa4:Assembly optimizations on if available.
19:35:27:WU01:FS00:0xa4:Entering M.D.
19:35:33:WU00:FS00:Upload 25.56%
19:35:33:WU01:FS00:0xa4:Completed 0 out of 250000 steps  (0%)
19:35:39:WU00:FS00:Upload 46.86%
19:35:47:WU00:FS00:Upload 63.90%
19:35:59:WU00:FS00:Upload 89.46%
19:36:08:WU00:FS00:Upload 93.72%
19:36:14:WU00:FS00:Upload 100.00%
19:36:28:WU00:FS00:Upload complete
19:36:28:WU00:FS00:Server responded WORK_ACK (400)
19:36:28:WU00:FS00:Final credit estimate, 1647.00 points
19:36:28:WU00:FS00:Cleaning up
19:36:55:WU01:FS00:0xa4:Completed 2500 out of 250000 steps  (1%)
19:38:16:WU01:FS00:0xa4:Completed 5000 out of 250000 steps  (2%)
19:39:38:WU01:FS00:0xa4:Completed 7500 out of 250000 steps  (3%)
19:40:59:WU01:FS00:0xa4:Completed 10000 out of 250000 steps  (4%)
19:42:20:WU01:FS00:0xa4:Completed 12500 out of 250000 steps  (5%)
19:43:41:WU01:FS00:0xa4:Completed 15000 out of 250000 steps  (6%)
19:45:02:WU01:FS00:0xa4:Completed 17500 out of 250000 steps  (7%)
19:46:23:WU01:FS00:0xa4:Completed 20000 out of 250000 steps  (8%)
19:47:44:WU01:FS00:0xa4:Completed 22500 out of 250000 steps  (9%)
19:49:06:WU01:FS00:0xa4:Completed 25000 out of 250000 steps  (10%)
19:50:27:WU01:FS00:0xa4:Completed 27500 out of 250000 steps  (11%)
19:51:49:WU01:FS00:0xa4:Completed 30000 out of 250000 steps  (12%)
19:53:09:WU01:FS00:0xa4:Completed 32500 out of 250000 steps  (13%)
19:54:30:WU01:FS00:0xa4:Completed 35000 out of 250000 steps  (14%)
19:55:51:WU01:FS00:0xa4:Completed 37500 out of 250000 steps  (15%)
19:57:11:WU01:FS00:0xa4:Completed 40000 out of 250000 steps  (16%)
19:58:32:WU01:FS00:0xa4:Completed 42500 out of 250000 steps  (17%)
19:59:54:WU01:FS00:0xa4:Completed 45000 out of 250000 steps  (18%)
20:01:15:WU01:FS00:0xa4:Completed 47500 out of 250000 steps  (19%)
20:02:35:WU01:FS00:0xa4:Completed 50000 out of 250000 steps  (20%)
20:03:56:WU01:FS00:0xa4:Completed 52500 out of 250000 steps  (21%)
20:05:16:WU01:FS00:0xa4:Completed 55000 out of 250000 steps  (22%)
20:06:37:WU01:FS00:0xa4:Completed 57500 out of 250000 steps  (23%)
20:07:57:WU01:FS00:0xa4:Completed 60000 out of 250000 steps  (24%)
20:09:19:WU01:FS00:0xa4:Completed 62500 out of 250000 steps  (25%)
20:10:40:WU01:FS00:0xa4:Completed 65000 out of 250000 steps  (26%)
20:12:00:WU01:FS00:0xa4:Completed 67500 out of 250000 steps  (27%)
20:13:21:WU01:FS00:0xa4:Completed 70000 out of 250000 steps  (28%)
20:14:42:WU01:FS00:0xa4:Completed 72500 out of 250000 steps  (29%)
20:16:04:WU01:FS00:0xa4:Completed 75000 out of 250000 steps  (30%)
20:17:25:WU01:FS00:0xa4:Completed 77500 out of 250000 steps  (31%)
20:18:45:WU01:FS00:0xa4:Completed 80000 out of 250000 steps  (32%)
20:20:06:WU01:FS00:0xa4:Completed 82500 out of 250000 steps  (33%)
20:21:27:WU01:FS00:0xa4:Completed 85000 out of 250000 steps  (34%)
20:22:48:WU01:FS00:0xa4:Completed 87500 out of 250000 steps  (35%)
20:24:10:WU01:FS00:0xa4:Completed 90000 out of 250000 steps  (36%)
******************************* Date: 2015-12-08 *******************************
20:25:31:WU01:FS00:0xa4:Completed 92500 out of 250000 steps  (37%)
20:26:51:WU01:FS00:0xa4:Completed 95000 out of 250000 steps  (38%)
20:28:12:WU01:FS00:0xa4:Completed 97500 out of 250000 steps  (39%)
20:29:33:WU01:FS00:0xa4:Completed 100000 out of 250000 steps  (40%)
20:30:54:WU01:FS00:0xa4:Completed 102500 out of 250000 steps  (41%)
20:32:16:WU01:FS00:0xa4:Completed 105000 out of 250000 steps  (42%)
20:33:37:WU01:FS00:0xa4:Completed 107500 out of 250000 steps  (43%)
20:34:58:WU01:FS00:0xa4:Completed 110000 out of 250000 steps  (44%)
20:36:18:WU01:FS00:0xa4:Completed 112500 out of 250000 steps  (45%)
20:37:39:WU01:FS00:0xa4:Completed 115000 out of 250000 steps  (46%)
20:39:00:WU01:FS00:0xa4:Completed 117500 out of 250000 steps  (47%)
20:40:20:WU01:FS00:0xa4:Completed 120000 out of 250000 steps  (48%)
20:41:41:WU01:FS00:0xa4:Completed 122500 out of 250000 steps  (49%)
20:43:01:WU01:FS00:0xa4:Completed 125000 out of 250000 steps  (50%)
20:44:22:WU01:FS00:0xa4:Completed 127500 out of 250000 steps  (51%)
20:45:43:WU01:FS00:0xa4:Completed 130000 out of 250000 steps  (52%)
20:47:04:WU01:FS00:0xa4:Completed 132500 out of 250000 steps  (53%)
20:48:24:WU01:FS00:0xa4:Completed 135000 out of 250000 steps  (54%)
20:49:45:WU01:FS00:0xa4:Completed 137500 out of 250000 steps  (55%)
20:51:06:WU01:FS00:0xa4:Completed 140000 out of 250000 steps  (56%)
20:52:27:WU01:FS00:0xa4:Completed 142500 out of 250000 steps  (57%)
20:53:47:WU01:FS00:0xa4:Completed 145000 out of 250000 steps  (58%)
20:55:08:WU01:FS00:0xa4:Completed 147500 out of 250000 steps  (59%)
20:56:29:WU01:FS00:0xa4:Completed 150000 out of 250000 steps  (60%)
20:57:50:WU01:FS00:0xa4:Completed 152500 out of 250000 steps  (61%)
20:59:11:WU01:FS00:0xa4:Completed 155000 out of 250000 steps  (62%)
21:00:32:WU01:FS00:0xa4:Completed 157500 out of 250000 steps  (63%)
21:01:52:WU01:FS00:0xa4:Completed 160000 out of 250000 steps  (64%)
21:03:12:WU01:FS00:0xa4:Completed 162500 out of 250000 steps  (65%)
21:04:32:WU01:FS00:0xa4:Completed 165000 out of 250000 steps  (66%)
21:05:53:WU01:FS00:0xa4:Completed 167500 out of 250000 steps  (67%)
21:07:14:WU01:FS00:0xa4:Completed 170000 out of 250000 steps  (68%)
21:08:35:WU01:FS00:0xa4:Completed 172500 out of 250000 steps  (69%)
21:09:57:WU01:FS00:0xa4:Completed 175000 out of 250000 steps  (70%)
21:11:18:WU01:FS00:0xa4:Completed 177500 out of 250000 steps  (71%)
21:12:39:WU01:FS00:0xa4:Completed 180000 out of 250000 steps  (72%)
21:14:00:WU01:FS00:0xa4:Completed 182500 out of 250000 steps  (73%)
21:15:21:WU01:FS00:0xa4:Completed 185000 out of 250000 steps  (74%)
21:16:42:WU01:FS00:0xa4:Completed 187500 out of 250000 steps  (75%)
21:18:03:WU01:FS00:0xa4:Completed 190000 out of 250000 steps  (76%)
21:19:23:WU01:FS00:0xa4:Completed 192500 out of 250000 steps  (77%)
21:20:44:WU01:FS00:0xa4:Completed 195000 out of 250000 steps  (78%)
21:22:05:WU01:FS00:0xa4:Completed 197500 out of 250000 steps  (79%)
21:23:26:WU01:FS00:0xa4:Completed 200000 out of 250000 steps  (80%)
21:24:48:WU01:FS00:0xa4:Completed 202500 out of 250000 steps  (81%)
21:26:08:WU01:FS00:0xa4:Completed 205000 out of 250000 steps  (82%)
21:27:29:WU01:FS00:0xa4:Completed 207500 out of 250000 steps  (83%)
21:28:49:WU01:FS00:0xa4:Completed 210000 out of 250000 steps  (84%)
21:30:10:WU01:FS00:0xa4:Completed 212500 out of 250000 steps  (85%)
21:31:31:WU01:FS00:0xa4:Completed 215000 out of 250000 steps  (86%)
21:32:52:WU01:FS00:0xa4:Completed 217500 out of 250000 steps  (87%)
21:34:12:WU01:FS00:0xa4:Completed 220000 out of 250000 steps  (88%)
21:35:32:WU01:FS00:0xa4:Completed 222500 out of 250000 steps  (89%)
21:36:54:WU01:FS00:0xa4:Completed 225000 out of 250000 steps  (90%)
21:38:15:WU01:FS00:0xa4:Completed 227500 out of 250000 steps  (91%)
21:39:38:WU01:FS00:0xa4:Completed 230000 out of 250000 steps  (92%)
21:40:59:WU01:FS00:0xa4:Completed 232500 out of 250000 steps  (93%)
21:42:21:WU01:FS00:0xa4:Completed 235000 out of 250000 steps  (94%)
21:43:43:WU01:FS00:0xa4:Completed 237500 out of 250000 steps  (95%)
21:45:03:WU01:FS00:0xa4:Completed 240000 out of 250000 steps  (96%)
21:46:24:WU01:FS00:0xa4:Completed 242500 out of 250000 steps  (97%)
21:47:45:WU01:FS00:0xa4:Completed 245000 out of 250000 steps  (98%)
21:49:05:WU01:FS00:0xa4:Completed 247500 out of 250000 steps  (99%)
21:50:26:WU01:FS00:0xa4:Completed 250000 out of 250000 steps  (100%)
21:50:26:WU01:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
21:50:27:WU00:FS00:Connecting to 171.67.108.45:8080
21:50:27:WU00:FS00:Assigned to work server 171.64.65.124
21:50:27:WU00:FS00:Requesting new work unit for slot 00: RUNNING cpu:8 from 171.64.65.124
21:50:27:WU00:FS00:Connecting to 171.64.65.124:8080
21:50:36:WU01:FS00:0xa4:
21:50:36:WU01:FS00:0xa4:Finished Work Unit:
21:50:36:WU01:FS00:0xa4:- Reading up to 811536 from "01/wudata_01.trr": Read 811536
21:50:36:WU01:FS00:0xa4:trr file hash check passed.
21:50:36:WU01:FS00:0xa4:- Reading up to 745964 from "01/wudata_01.xtc": Read 745964
21:50:36:WU01:FS00:0xa4:xtc file hash check passed.
21:50:36:WU01:FS00:0xa4:edr file hash check passed.
21:50:36:WU01:FS00:0xa4:logfile size: 22945
21:50:36:WU01:FS00:0xa4:Leaving Run
21:50:39:WU01:FS00:0xa4:- Writing 1582933 bytes of core data to disk...
21:50:39:WU01:FS00:0xa4:Done: 1582421 -> 1537885 (compressed to 97.1 percent)
21:50:39:WU01:FS00:0xa4:  ... Done.
21:50:49:WU00:FS00:Downloading 806.40KiB
21:50:57:WU00:FS00:Download 7.94%
21:51:17:WU00:FS00:Download 23.81%
21:51:28:WU00:FS00:Download 39.68%
21:51:35:WU01:FS00:0xa4:- Shutting down core
21:51:35:WU01:FS00:0xa4:
21:51:35:WU01:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
21:51:36:WU00:FS00:Download 63.49%
21:51:42:WU00:FS00:Download 79.37%
21:51:44:WU01:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
21:51:44:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:9022 run:144 clone:7 gen:106 core:0xa4 unit:0x0000007bab40417c55e8a37e171f3ca7
21:51:44:WU01:FS00:Uploading 1.47MiB to 171.64.65.124
21:51:44:WU01:FS00:Connecting to 171.64.65.124:8080
21:51:50:WU00:FS00:Download 95.24%
21:51:50:WU00:FS00:Download complete
21:51:50:WU01:FS00:Upload 12.78%
21:51:50:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9029 run:921 clone:5 gen:10 core:0xa4 unit:0x0000000fab40417c55f38c5b02c474a2
21:51:50:WU00:FS00:Starting
21:51:50:WU00:FS00:Running FahCore: /home/akula/fah/7-4-4/FAHCoreWrapper /home/akula/fah/7-4-4/cores/web.stanford.edu/~pande/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 00 -suffix 01 -version 704 -lifeline 2025 -checkpoint 15 -np 8
21:51:50:WU00:FS00:Started FahCore on PID 2259
21:51:50:WU00:FS00:Core PID:2263
21:51:50:WU00:FS00:FahCore 0xa4 started
21:51:51:WU00:FS00:0xa4:
21:51:51:WU00:FS00:0xa4:*------------------------------*
21:51:51:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
21:51:51:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
21:51:51:WU00:FS00:0xa4:
21:51:51:WU00:FS00:0xa4:Preparing to commence simulation
21:51:51:WU00:FS00:0xa4:- Looking at optimizations...
21:51:51:WU00:FS00:0xa4:- Created dyn
21:51:51:WU00:FS00:0xa4:- Files status OK
21:51:51:WU00:FS00:0xa4:- Expanded 825238 -> 1398040 (decompressed 169.4 percent)
21:51:51:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=825238 data_size=1398040, decompressed_data_size=1398040 diff=0
21:51:51:WU00:FS00:0xa4:- Digital signature verified
21:51:51:WU00:FS00:0xa4:
21:51:51:WU00:FS00:0xa4:Project: 9029 (Run 921, Clone 5, Gen 10)
21:51:51:WU00:FS00:0xa4:
21:51:51:WU00:FS00:0xa4:Assembly optimizations on if available.
21:51:51:WU00:FS00:0xa4:Entering M.D.
21:51:57:WU00:FS00:0xa4:Completed 0 out of 250000 steps  (0%)
21:51:57:WU01:FS00:Upload 17.04%
21:52:08:WU01:FS00:Upload 25.56%
21:52:21:WU01:FS00:Upload 29.82%
21:53:20:WU00:FS00:0xa4:Completed 2500 out of 250000 steps  (1%)
21:54:29:WU01:FS00:Upload 34.08%
21:54:44:WU00:FS00:0xa4:Completed 5000 out of 250000 steps  (2%)
21:54:58:WU01:FS00:Upload 38.34%
[93m21:54:58:WARNING:WU01:FS00:Exception: Failed to send results to work server: Transfer failed[0m
21:54:59:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:9022 run:144 clone:7 gen:106 core:0xa4 unit:0x0000007bab40417c55e8a37e171f3ca7
21:54:59:WU01:FS00:Uploading 1.47MiB to 171.64.65.124
21:54:59:WU01:FS00:Connecting to 171.64.65.124:8080
[93m21:54:59:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80[0m
21:54:59:WU01:FS00:Connecting to 171.64.65.124:80
[93m21:54:59:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: Connection refused[0m
21:55:59:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:9022 run:144 clone:7 gen:106 core:0xa4 unit:0x0000007bab40417c55e8a37e171f3ca7
21:55:59:WU01:FS00:Uploading 1.47MiB to 171.64.65.124
21:55:59:WU01:FS00:Connecting to 171.64.65.124:8080
[93m21:55:59:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80[0m
21:55:59:WU01:FS00:Connecting to 171.64.65.124:80
[93m21:55:59:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: Connection refused[0m
21:56:08:WU00:FS00:0xa4:Completed 7500 out of 250000 steps  (3%)
21:57:31:WU00:FS00:0xa4:Completed 10000 out of 250000 steps  (4%)
21:57:36:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:9022 run:144 clone:7 gen:106 core:0xa4 unit:0x0000007bab40417c55e8a37e171f3ca7
21:57:36:WU01:FS00:Uploading 1.47MiB to 171.64.65.124
21:57:36:WU01:FS00:Connecting to 171.64.65.124:8080
[93m21:58:39:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80[0m
21:58:39:WU01:FS00:Connecting to 171.64.65.124:80
21:58:55:WU00:FS00:0xa4:Completed 12500 out of 250000 steps  (5%)
[93m21:59:42:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: Connection timed out[0m
22:00:13:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:9022 run:144 clone:7 gen:106 core:0xa4 unit:0x0000007bab40417c55e8a37e171f3ca7
22:00:13:WU01:FS00:Uploading 1.47MiB to 171.64.65.124
22:00:13:WU01:FS00:Connecting to 171.64.65.124:8080
22:00:19:WU00:FS00:0xa4:Completed 15000 out of 250000 steps  (6%)
[93m22:01:17:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80[0m
22:01:17:WU01:FS00:Connecting to 171.64.65.124:80
22:01:24:WU01:FS00:Upload 4.26%
22:01:44:WU00:FS00:0xa4:Completed 17500 out of 250000 steps  (7%)
22:01:44:WU01:FS00:Upload 12.78%
22:01:52:WU01:FS00:Upload 29.82%
22:02:04:WU01:FS00:Upload 34.08%
22:02:11:WU01:FS00:Upload 42.60%
22:02:21:WU01:FS00:Upload 55.38%
22:02:33:WU01:FS00:Upload 80.94%
22:03:08:WU00:FS00:0xa4:Completed 20000 out of 250000 steps  (8%)
22:04:32:WU00:FS00:0xa4:Completed 22500 out of 250000 steps  (9%)
22:04:53:WU01:FS00:Upload complete
22:04:53:WU01:FS00:Server responded WORK_QUIT (404)
[93m22:04:53:WARNING:WU01:FS00:Server did not like results, dumping[0m
22:04:53:WU01:FS00:Cleaning up
22:05:56:WU00:FS00:0xa4:Completed 25000 out of 250000 steps  (10%)
22:07:20:WU00:FS00:0xa4:Completed 27500 out of 250000 steps  (11%)

<SNIP>

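Going through the logs of several machines by hand is tedious, so here is a minimal sketch of how the dumped units could be listed automatically. It assumes the log line formats shown above ("Sending unit results: ... project:... run:... clone:... gen:..." followed later by "Server responded WORK_QUIT (404)"). The function name `dumped_units` is made up for this illustration and is not part of the FAH client.

```python
import re

# Captures the WU slot (e.g. "WU01") and the project/run/clone/gen numbers
# from a "Sending unit results" line.
SEND = re.compile(
    r"(WU\d+):FS\d+:Sending unit results: .*?"
    r"project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)"
)
# Captures the WU slot and the server reply, e.g. "WORK_QUIT (404)".
REPLY = re.compile(r"(WU\d+):FS\d+:Server responded (\w+) \((\d+)\)")


def dumped_units(lines):
    """Return (project, run, clone, gen) for each WU answered with WORK_QUIT."""
    pending = {}  # WU slot -> PRCG tuple from its latest "Sending unit results"
    dumped = []
    for line in lines:
        m = SEND.search(line)
        if m:
            pending[m.group(1)] = tuple(int(x) for x in m.group(2, 3, 4, 5))
            continue
        m = REPLY.search(line)
        if m and m.group(2) == "WORK_QUIT" and m.group(1) in pending:
            dumped.append(pending[m.group(1)])
    return dumped
```

Calling `dumped_units(open("log.txt", errors="replace"))` on each machine's log would give one tuple per dumped WU. The patterns use `search` rather than matching from the start of the line, so a color prefix in front of the timestamp does not get in the way.
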
Re: 171.64.65.124 SLOW upload/download and Dumping WU

Posted: Tue Dec 08, 2015 9:55 pm
by n1np
I just saw this on a different machine with the same configuration: "Received short response"

Code:

<SNIP>

22:46:27:WU00:FS00:0xa4:Completed 242500 out of 250000 steps  (97%)
22:47:50:WU00:FS00:0xa4:Completed 245000 out of 250000 steps  (98%)
22:49:14:WU00:FS00:0xa4:Completed 247500 out of 250000 steps  (99%)
22:50:38:WU00:FS00:0xa4:Completed 250000 out of 250000 steps  (100%)
22:50:38:WU00:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
22:50:39:WU01:FS00:Connecting to 171.67.108.45:8080
22:50:39:WU01:FS00:Assigned to work server 171.64.65.124
22:50:39:WU01:FS00:Requesting new work unit for slot 00: RUNNING cpu:8 from 171.64.65.124
22:50:39:WU01:FS00:Connecting to 171.64.65.124:8080
22:50:48:WU00:FS00:0xa4:
22:50:48:WU00:FS00:0xa4:Finished Work Unit:
22:50:48:WU00:FS00:0xa4:- Reading up to 811440 from "00/wudata_01.trr": Read 811440
22:50:48:WU00:FS00:0xa4:trr file hash check passed.
22:50:48:WU00:FS00:0xa4:- Reading up to 745984 from "00/wudata_01.xtc": Read 745984
22:50:48:WU00:FS00:0xa4:xtc file hash check passed.
22:50:48:WU00:FS00:0xa4:edr file hash check passed.
22:50:48:WU00:FS00:0xa4:logfile size: 22991
22:50:48:WU00:FS00:0xa4:Leaving Run
22:50:52:WU00:FS00:0xa4:- Writing 1582903 bytes of core data to disk...
22:50:52:WU00:FS00:0xa4:Done: 1582391 -> 1537560 (compressed to 97.1 percent)
22:50:52:WU00:FS00:0xa4:  ... Done.
[91m22:50:58:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0[0m
22:50:58:WU01:FS00:Connecting to 171.67.108.45:8080
22:50:59:WU01:FS00:Assigned to work server 171.64.65.124
22:50:59:WU01:FS00:Requesting new work unit for slot 00: RUNNING cpu:8 from 171.64.65.124
22:50:59:WU01:FS00:Connecting to 171.64.65.124:8080
[93m22:50:59:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80[0m
22:50:59:WU01:FS00:Connecting to 171.64.65.124:80
[91m22:50:59:ERROR:WU01:FS00:Exception: Failed to connect to 171.64.65.124:80: Connection refused[0m
22:51:55:WU00:FS00:0xa4:- Shutting down core
22:51:55:WU00:FS00:0xa4:

<SNIP>
And I seem to be seeing "special" characters in the logs, unicode? I don't remember seeing those before.

Re: 171.64.65.124 SLOW upload/download and Dumping WU

Posted: Tue Dec 08, 2015 10:24 pm
by davidcoton
n1np wrote:I just saw this on a different machine with the same configuration: "Received short response"
This indicates a missing reply from the server -- either it was never sent, or it got lost along the way. Sometimes it has been blocked by AV or port security software at the client PC.
n1np wrote: And I seem to be seeing "special" characters in the logs, unicode? I don't remember seeing those before.
I *think* these (at least the ones in your log snippet) are escape sequences, probably newlines, which come from the original error messages. It looks as though a file was prepared in Windows but used in a Linux based build (or possibly the other way round).
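
The logs earlier in this thread show two different failures on the fallback port: "Connection refused" and "Connection timed out". A rough sketch for telling them apart from the client side, independent of the FAH client, could look like this. The helper name `probe` and the 5-second default timeout are assumptions for illustration only.

```python
import errno
import socket


def probe(host, port, timeout=5.0):
    """Try a single TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # No answer at all within the timeout (packets dropped or host overloaded).
        return "timed out"
    except ConnectionRefusedError:
        # The host answered, but nothing is accepting connections on that port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. network unreachable.
        return "error: %s" % errno.errorcode.get(exc.errno, exc)
```

For example, `probe("171.64.65.124", 8080)` and `probe("171.64.65.124", 80)` would check the two ports the client tries in the logs. A result of "open" only shows that the TCP connection succeeds; it says nothing about whether the work server then accepts the upload.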

Re: 171.64.65.124 SLOW upload/download and Dumping WU

Posted: Tue Dec 08, 2015 10:39 pm
by n1np
davidcoton wrote: Sometimes it has been blocked by AV or port security software at the client PC.
These machines are still running naked as they always have :shock: , same networking setup.
davidcoton wrote: I *think* these (at least the ones in your log snippet) are escape sequences, probably newlines, which come from the original error messages. It looks as though a file was prepared in Windows but used in a Linux based build (or possibly the other way round).
I see them on the console (linux) as "ESC" in reversed text (white background). When I scp the logfile to my laptop (linux) and open it with Kate (K Advanced Text Editor) they show up as special characters. That part may not be important. The server and WU dumping issue still concerns me.

Re: 171.64.65.124 SLOW upload/download and Dumping WU

Posted: Wed Dec 09, 2015 12:47 am
by n1np
OK, the special characters are apparently for COLOR! When I cat the log either on the console or via ssh on my laptop, it is all monochrome. When I grep on the console, it is all monochrome. When I grep via ssh on my laptop, :ERROR is all red, and :WARNING is all yellow. There is a difference in the shell setup between the laptop and the rack machines.

That makes me feel a bit stupid if it has been there all this time and I just noticed it :oops: . That's one mystery solved, anyway.
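
For anyone who wants the log without the color codes: they are ANSI SGR sequences of the form ESC `[` number `m`, where 91 is bright red, 93 is bright yellow, and 0 resets the color. A minimal sketch for stripping them, assuming the log contains only such sequences (the function name `strip_colors` is made up here):

```python
import re

# SGR (color) sequences: ESC, "[", digits separated by ";", then "m".
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_colors(line):
    """Remove ANSI color sequences so the log line is plain text."""
    return ANSI_SGR.sub("", line)
```

Running each line of the log through `strip_colors` before opening it in an editor should leave only the plain timestamped text.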

Unable to send completed WU (PRCG 9026 [769, 6, 19])

Posted: Wed Dec 09, 2015 12:59 am
by Nantes
It's on its 5th attempt right now. Collection server says 0.0.0.0, and work server is 171.64.65.124. My internet connection is just fine, and I haven't changed anything with my firewall. If it manages to send I'll delete this thread.

Re: Unable to send completed WU (PRCG 9026 [769, 6, 19])

Posted: Wed Dec 09, 2015 1:18 am
by n1np
Sounds like the same problem I posted here.

Re: Unable to send completed WU (PRCG 9026 [769, 6, 19])

Posted: Wed Dec 09, 2015 1:28 am
by Joe_H
It looks like the WS may be having some problems; the Net Load has been running on the high side for the last couple of hours. I have contacted the person responsible for that server.

Edit: I have received a response back that this is being looked into.

Re: 171.64.65.124 SLOW upload/download and Dumping WU

Posted: Wed Dec 09, 2015 1:02 pm
by goodyca
I am also seeing failed uploads to this server. Here is one example.

Code:

06:37:12:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
06:37:57:WU01:FS01:Starting
06:37:57:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 01 -suffix 01 -version 704 -lifeline 1581 -checkpoint 15 -np 12
06:37:57:WU01:FS01:Started FahCore on PID 14118
06:37:57:WU01:FS01:Core PID:14122
06:37:57:WU01:FS01:FahCore 0xa4 started
06:37:58:WU01:FS01:0xa4:
06:37:58:WU01:FS01:0xa4:*------------------------------*
06:37:58:WU01:FS01:0xa4:Folding@Home Gromacs GB Core
06:37:58:WU01:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
06:37:58:WU01:FS01:0xa4:
06:37:58:WU01:FS01:0xa4:Preparing to commence simulation
06:37:58:WU01:FS01:0xa4:- Looking at optimizations...
06:37:58:WU01:FS01:0xa4:- Created dyn
06:37:58:WU01:FS01:0xa4:- Files status OK
06:37:58:WU01:FS01:0xa4:- Expanded 825184 -> 1398040 (decompressed 169.4 percent)
06:37:58:WU01:FS01:0xa4:Called DecompressByteArray: compressed_data_size=825184 data_size=1398040, decompressed_data_size=1398040 diff=0
06:37:58:WU01:FS01:0xa4:- Digital signature verified
06:37:58:WU01:FS01:0xa4:
06:37:58:WU01:FS01:0xa4:Project: 9029 (Run 232, Clone 4, Gen 16)
06:37:58:WU01:FS01:0xa4:
06:37:58:WU01:FS01:0xa4:Assembly optimizations on if available.
06:37:58:WU01:FS01:0xa4:Entering M.D.
06:38:04:WU01:FS01:0xa4:Completed 0 out of 250000 steps  (0%)
06:38:52:WU01:FS01:0xa4:Completed 2500 out of 250000 steps  (1%)
06:39:40:WU01:FS01:0xa4:Completed 5000 out of 250000 steps  (2%)
...
07:57:36:WU01:FS01:0xa4:Completed 247500 out of 250000 steps  (99%)
07:58:24:WU01:FS01:0xa4:Completed 250000 out of 250000 steps  (100%)
07:58:24:WU01:FS01:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
07:58:34:WU01:FS01:0xa4:
07:58:34:WU01:FS01:0xa4:Finished Work Unit:
07:58:34:WU01:FS01:0xa4:- Reading up to 811440 from "01/wudata_01.trr": Read 811440
07:58:34:WU01:FS01:0xa4:trr file hash check passed.
07:58:34:WU01:FS01:0xa4:- Reading up to 745532 from "01/wudata_01.xtc": Read 745532
07:58:34:WU01:FS01:0xa4:xtc file hash check passed.
07:58:34:WU01:FS01:0xa4:edr file hash check passed.
07:58:34:WU01:FS01:0xa4:logfile size: 22774
07:58:34:WU01:FS01:0xa4:Leaving Run
07:58:38:WU01:FS01:0xa4:- Writing 1582234 bytes of core data to disk...
07:58:38:WU01:FS01:0xa4:Done: 1581722 -> 1537413 (compressed to 97.1 percent)
07:58:38:WU01:FS01:0xa4:  ... Done.
07:59:39:WU01:FS01:0xa4:- Shutting down core
07:59:39:WU01:FS01:0xa4:
07:59:39:WU01:FS01:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
07:59:47:WU01:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
07:59:48:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
07:59:48:WU01:FS01:Uploading 1.47MiB to 171.64.65.124
07:59:48:WU01:FS01:Connecting to 171.64.65.124:8080
08:02:00:WU01:FS01:Upload 4.26%
08:02:00:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
08:02:01:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
08:02:01:WU01:FS01:Uploading 1.47MiB to 171.64.65.124
08:02:01:WU01:FS01:Connecting to 171.64.65.124:8080
08:02:01:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
08:02:01:WU01:FS01:Connecting to 171.64.65.124:80
08:02:01:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: Connection refused
08:03:01:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
08:03:01:WU01:FS01:Uploading 1.47MiB to 171.64.65.124
08:03:01:WU01:FS01:Connecting to 171.64.65.124:8080
08:05:08:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
08:05:08:WU01:FS01:Connecting to 171.64.65.124:80
08:07:15:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: Connection timed out
08:07:15:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
08:07:15:WU01:FS01:Uploading 1.47MiB to 171.64.65.124
08:07:15:WU01:FS01:Connecting to 171.64.65.124:8080
08:09:23:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
08:09:23:WU01:FS01:Connecting to 171.64.65.124:80
08:11:30:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: Connection timed out
08:11:30:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
08:11:30:WU01:FS01:Uploading 1.47MiB to 171.64.65.124
08:11:30:WU01:FS01:Connecting to 171.64.65.124:8080
08:13:38:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
08:13:38:WU01:FS01:Connecting to 171.64.65.124:80
08:15:45:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: Connection timed out
08:15:45:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
08:15:45:WU01:FS01:Uploading 1.47MiB to 171.64.65.124
08:15:45:WU01:FS01:Connecting to 171.64.65.124:8080
08:17:52:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
08:17:52:WU01:FS01:Connecting to 171.64.65.124:80
08:20:00:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: Connection timed out
08:22:37:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
08:22:37:WU01:FS01:Uploading 1.47MiB to 171.64.65.124
08:22:37:WU01:FS01:Connecting to 171.64.65.124:8080
08:24:44:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
08:24:44:WU01:FS01:Connecting to 171.64.65.124:80
08:28:56:WU01:FS01:Upload 4.26%
08:28:56:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
08:33:42:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
08:33:42:WU01:FS01:Uploading 1.47MiB to 171.64.65.124
08:33:42:WU01:FS01:Connecting to 171.64.65.124:8080
08:34:13:WU01:FS01:Upload 4.26%
08:39:16:WU01:FS01:Upload 17.05%
08:39:16:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
08:51:39:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
08:51:39:WU01:FS01:Uploading 1.47MiB to 171.64.65.124
08:51:39:WU01:FS01:Connecting to 171.64.65.124:8080
08:53:46:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
08:53:46:WU01:FS01:Connecting to 171.64.65.124:80
08:55:53:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: Connection timed out
09:20:41:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
09:20:41:WU01:FS01:Uploading 1.47MiB to 171.64.65.124
09:20:41:WU01:FS01:Connecting to 171.64.65.124:8080
09:22:48:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
09:22:48:WU01:FS01:Connecting to 171.64.65.124:80
09:24:56:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: Connection timed out
10:07:40:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
10:07:40:WU01:FS01:Uploading 1.47MiB to 171.64.65.124
10:07:40:WU01:FS01:Connecting to 171.64.65.124:8080
10:11:53:WU01:FS01:Upload 4.26%
10:11:53:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
11:23:41:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:9029 run:232 clone:4 gen:16 core:0xa4 unit:0x00000017ab40417c55f3901ae7fc62e8
11:23:41:WU01:FS01:Uploading 1.47MiB to 171.64.65.124
11:23:41:WU01:FS01:Connecting to 171.64.65.124:8080
11:24:21:WU01:FS01:Upload 4.26%
11:25:31:WU01:FS01:Upload 8.52%
11:26:16:WU01:FS01:Upload 12.78%
11:26:50:WU01:FS01:Upload 17.05%
11:27:05:WU01:FS01:Upload 21.31%
11:27:38:WU01:FS01:Upload 25.57%
11:28:26:WU01:FS01:Upload 29.83%
11:29:02:WU01:FS01:Upload 34.09%
11:29:26:WU01:FS01:Upload 38.35%
11:30:08:WU01:FS01:Upload 42.61%
11:30:49:WU01:FS01:Upload 46.87%
11:31:52:WU01:FS01:Upload 51.14%
11:32:23:WU01:FS01:Upload 55.40%
11:32:31:WU01:FS01:Upload 59.66%
11:32:31:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
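
For anyone who wants to see how the client spaces out its retries, here is a minimal Python sketch that measures the gap between successive "Sending unit results" lines for one work unit slot. It assumes the log line layout shown above (an `HH:MM:SS` timestamp, then the `WUxx:FSxx` tags). The function name `retry_gaps` is made up for illustration and is not part of the client.

```python
import re
from datetime import datetime, timedelta

# Matches the start of an upload attempt, e.g.
# "08:11:30:WU01:FS01:Sending unit results: id:01 ..."
ATTEMPT_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}):(WU\d+):FS\d+:Sending unit results:")

def retry_gaps(lines, wu="WU01"):
    """Return the seconds between consecutive upload attempts for one WU slot."""
    times = []
    for line in lines:
        m = ATTEMPT_RE.match(line)
        if m and m.group(2) == wu:
            times.append(datetime.strptime(m.group(1), "%H:%M:%S"))
    gaps = []
    for prev, cur in zip(times, times[1:]):
        delta = cur - prev
        if delta < timedelta(0):
            # Timestamps carry no date, so assume the log crossed midnight once.
            delta += timedelta(days=1)
        gaps.append(int(delta.total_seconds()))
    return gaps
```

Run over the lines of a log file (for example `retry_gaps(open("log.txt"))`, where the file name is a placeholder), a growing list of gaps would indicate that the client is backing off between attempts rather than hammering the server.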


Re: 171.64.65.124 SLOW upload/download and Dumping WU

Posted: Wed Dec 09, 2015 3:22 pm
by Nert
I'm experiencing the same problem. I've had a unit stuck for a while now:

Code: Select all

*********************** Log Started 2015-12-07T15:58:38Z ***********************
******************************* Date: 2015-12-07 *******************************
******************************* Date: 2015-12-08 *******************************
******************************* Date: 2015-12-08 *******************************
******************************* Date: 2015-12-08 *******************************
******************************* Date: 2015-12-08 *******************************
******************************* Date: 2015-12-09 *******************************
04:36:50:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
04:37:12:ERROR:WU01:FS00:Exception: Failed to connect to 171.64.65.124:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
04:37:33:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
04:37:55:ERROR:WU01:FS00:Exception: Failed to connect to 171.64.65.124:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
04:38:34:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
08:26:59:WARNING:WU01:FS00:Exception: Failed to send results to work server: Transfer failed
08:32:11:WARNING:WU01:FS00:Exception: Failed to send results to work server: Transfer failed
08:32:13:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
08:32:14:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: No connection could be made because the target machine actively refused it.
08:38:58:WARNING:WU01:FS00:Exception: Failed to send results to work server: Transfer failed
08:38:59:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
08:39:00:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: No connection could be made because the target machine actively refused it.
08:43:33:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
08:43:55:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
08:56:51:WARNING:WU01:FS00:Exception: Failed to send results to work server: Transfer failed
09:07:53:WARNING:WU01:FS00:Exception: Failed to send results to work server: Transfer failed
09:19:27:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
09:19:48:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
09:48:29:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
09:48:51:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
******************************* Date: 2015-12-09 *******************************
10:35:28:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
10:35:50:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
11:51:29:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
11:51:50:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
13:54:29:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
13:54:50:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 171.64.65.124:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
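
A log like this mixes several failure modes (timeouts, refused connections, and transfers that die partway). The following is a rough Python sketch for tallying them, assuming the `WARNING`/`ERROR` exception lines look like the ones above. The names `classify` and `summarize` and the category labels are my own, not anything the client defines.

```python
import re
from collections import Counter

# Captures the reason text after "Exception: ", with or without the
# "Failed to send results to work server: " prefix.
FAIL_RE = re.compile(
    r"(?:WARNING|ERROR):WU\d+:FS\d+:Exception: "
    r"(?:Failed to send results to work server: )?(.*)"
)

def classify(reason):
    """Map an exception message to a coarse, hypothetical category."""
    if reason.startswith("Transfer failed"):
        return "transfer failed"
    if "timed out" in reason or "did not properly respond" in reason:
        return "connect timeout"
    if "actively refused" in reason:
        return "connection refused"
    return "other"

def summarize(lines):
    """Count exception lines per category."""
    counts = Counter()
    for line in lines:
        m = FAIL_RE.search(line)
        if m:
            counts[classify(m.group(1))] += 1
    return counts
```

Comparing the counts from several machines should make it easier to tell whether the server is mostly unreachable or mostly dropping transfers in progress.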

Re: 171.64.65.124 SLOW upload/download and Dumping WU

Posted: Wed Dec 09, 2015 4:19 pm
by onDvine
I noticed the same problems here last night and they continue this morning. For example (from my log), "... WARNING:WU02:FS01:Server did not like results, dumping ..." At least one other member of my team also reported issues with this server overnight.

Re: 171.64.65.124 SLOW upload/download and Dumping WU

Posted: Wed Dec 09, 2015 6:26 pm
by Joe_H
Two topics on this have been merged. The person managing this WS is looking into the problem.

Re: 171.64.65.124 SLOW upload/download and Dumping WU

Posted: Wed Dec 09, 2015 8:34 pm
by sryckbos
I am looking into this. Thanks for the heads up.

Re: 171.64.65.124 SLOW upload/download and Dumping WU

Posted: Wed Dec 09, 2015 10:12 pm
by n1np
Thanks for checking on it.

It may not matter now, but here is a list of work units that my machines have dumped so far (may not be complete):

project:9029 run:848 clone:4 gen:126
project:9021 run:787 clone:9 gen:134
project:9029 run:5 clone:1 gen:31
project:9028 run:141 clone:4 gen:65
project:9022 run:144 clone:7 gen:106
project:9027 run:228 clone:0 gen:124
project:9027 run:465 clone:1 gen:71
project:9027 run:297 clone:5 gen:91
project:9024 run:855 clone:0 gen:116
project:9027 run:654 clone:2 gen:103
project:9022 run:197 clone:5 gen:19
project:9022 run:252 clone:5 gen:121
project:9022 run:186 clone:3 gen:119
project:9023 run:331 clone:6 gen:95
project:9025 run:462 clone:5 gen:147
project:9028 run:665 clone:7 gen:137
project:9029 run:880 clone:1 gen:12
project:9026 run:373 clone:2 gen:84
project:9028 run:850 clone:4 gen:114
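
A list like the one above could be pulled out of the logs with a short script rather than by hand. This is only a sketch under two assumptions: that the dump message appears as `WUxx:FSxx:Server did not like results, dumping` (the form quoted earlier in this thread), and that the most recent "Sending unit results" line for the same WU slot identifies the unit that was dumped. The function name `dumped_units` is hypothetical.

```python
import re

# "…:WU01:FS01:Sending unit results: … project:9029 run:232 clone:4 gen:16 …"
SEND_RE = re.compile(
    r"(WU\d+):FS\d+:Sending unit results:.*?"
    r"project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)"
)
# Assumed form of the dump warning, based on the line quoted in this thread.
DUMP_RE = re.compile(r"(WU\d+):FS\d+:Server did not like results, dumping")

def dumped_units(lines):
    """Return (project, run, clone, gen) tuples for units the server rejected."""
    last_sent = {}  # WU slot -> PRCG of the most recent upload attempt
    dumped = []
    for line in lines:
        m = SEND_RE.search(line)
        if m:
            last_sent[m.group(1)] = tuple(int(g) for g in m.groups()[1:])
            continue
        m = DUMP_RE.search(line)
        if m and m.group(1) in last_sent:
            prcg = last_sent[m.group(1)]
            if prcg not in dumped:  # report each unit once
                dumped.append(prcg)
    return dumped
```

Each tuple can then be printed in the `project:… run:… clone:… gen:…` form used above, one log file per machine.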

Re: 171.64.65.124 SLOW upload/download and Dumping WU

Posted: Wed Dec 09, 2015 10:58 pm
by toTOW
It would be better to turn off assignments for projects hosted on this server until the issue is solved. That way we don't waste CPU time on WUs that end up dumped, and can work on other available projects instead.