171.64.65.55

Posted: Mon Feb 06, 2012 10:50 am
by artoar_11
Is anyone else having a problem with the server 171.64.65.55? Ping is fine. After restarting the client, the result is the same. The server status page shows a Net Load of 954. What does that mean?

Code:

[08:54:14] Completed 980000 out of 1000000 steps  (98%)
[08:57:44] Completed 990000 out of 1000000 steps  (99%)
[09:01:14] Completed 1000000 out of 1000000 steps  (100%)
[09:01:14] DynamicWrapper: Finished Work Unit: sleep=10000
[09:01:24] 
[09:01:24] Finished Work Unit:
[09:01:24] - Reading up to 9338472 from "work/wudata_06.trr": Read 9338472
[09:01:24] trr file hash check passed.
[09:01:24] - Reading up to 1414816 from "work/wudata_06.xtc": Read 1414816
[09:01:24] xtc file hash check passed.
[09:01:24] edr file hash check passed.
[09:01:24] logfile size: 28680
[09:01:24] Leaving Run
[09:01:28] - Writing 10789480 bytes of core data to disk...
[09:01:29] Done: 10788968 -> 10301167 (compressed to 95.4 percent)
[09:01:29]   ... Done.
[09:01:31] - Shutting down core
[09:01:31] 
[09:01:31] Folding@home Core Shutdown: FINISHED_UNIT
[09:01:34] CoreStatus = 64 (100)
[09:01:34] Unit 6 finished with 99 percent of time to deadline remaining.
[09:01:34] Updated performance fraction: 0.983218
[09:01:34] Sending work to server
[09:01:34] Project: 11041 (Run 0, Clone 440, Gen 1)


[09:01:34] + Attempting to send results [February 6 09:01:34 UTC]
[09:01:34] - Reading file work/wuresults_06.dat from core
[09:01:34]   (Read 10301679 bytes from disk)
[09:01:34] Connecting to http://171.64.65.55:8080/
[09:10:35] Killing all core threads
[09:10:35] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown at user request.
[09:10:35] ***** Got a SIGTERM signal (2)
[09:10:35] Killing all core threads
[09:10:35] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown.

--- Opening Log file [February 6 09:11:00 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: E:\_SMP_FAH-v. 6.34
Executable: E:\_SMP_FAH-v. 6.34\FAH6.34-win32-SMP.exe
Arguments: -smp -advmethods -verbosity 9 

[09:11:00] - Ask before connecting: No
[09:11:00] - User name: artoar_home (Team 32435)
[09:11:00] - User ID: 7892D91A0BA0CA2D
[09:11:00] - Machine ID: 1
[09:11:00] 
[09:11:00] Loaded queue successfully.
[09:11:00] - Preparing to get new work unit...
[09:11:00] - Autosending finished units... [February 6 09:11:00 UTC]
[09:11:00] Cleaning up work directory
[09:11:00] Trying to send all finished work units
[09:11:00] Project: 11041 (Run 0, Clone 440, Gen 1)


[09:11:00] + Attempting to send results [February 6 09:11:00 UTC]
[09:11:00] - Reading file work/wuresults_06.dat from core
[09:11:00]   (Read 10301679 bytes from disk)
[09:11:00] Connecting to http://171.64.65.55:8080/
[09:11:00] + Attempting to get work packet
[09:11:00] Passkey found
[09:11:00] - Will indicate memory of 4072 MB
[09:11:00] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 10, Stepping: 7
[09:11:00] - Connecting to assignment server
[09:11:00] Connecting to http://assign.stanford.edu:8080/
[09:11:01] Posted data.
[09:11:01] Initial: 40AB; - Successful: assigned to (171.64.65.55).
[09:11:01] + News From Folding@Home: Welcome to Folding@Home
[09:11:01] Loaded queue successfully.
[09:11:01] Sent data
[09:11:01] Connecting to http://171.64.65.55:8080/
[09:13:01] Posted data.
[09:16:39] Killing all core threads
[09:16:39] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown at user request.
[09:16:39] ***** Got a SIGTERM signal (2)
[09:16:39] Killing all core threads
[09:16:39] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown.

--- Opening Log file [February 6 09:20:39 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: E:\_SMP_FAH-v. 6.34
Executable: E:\_SMP_FAH-v. 6.34\FAH6.34-win32-SMP.exe
Arguments: -smp -advmethods -verbosity 9 

[09:20:39] - Ask before connecting: No
[09:20:39] - User name: artoar_home (Team 32435)
[09:20:39] - User ID: 7892D91A0BA0CA2D
[09:20:39] - Machine ID: 1
[09:20:39] 
[09:20:39] Loaded queue successfully.
[09:20:39] - Preparing to get new work unit...
[09:20:39] - Autosending finished units... [February 6 09:20:39 UTC]
[09:20:39] Cleaning up work directory
[09:20:39] Trying to send all finished work units
[09:20:39] Project: 11041 (Run 0, Clone 440, Gen 1)
[09:20:39] + Attempting to get work packet
[09:20:39] Passkey found


[09:20:39] - Will indicate memory of 4072 MB
[09:20:39] + Attempting to send results [February 6 09:20:39 UTC]
[09:20:39] - Detect CPU.[09:20:39] - Reading file work/wuresults_06.dat from core
 Vendor: GenuineIntel, Family: 6, Model: 10, Stepping: 7
[09:20:39]   (Read 10301679 bytes from disk)
[09:20:39] Connecting to http://171.64.65.55:8080/
[09:20:39] - Connecting to assignment server
[09:20:39] Connecting to http://assign.stanford.edu:8080/
[09:20:40] Posted data.
[09:20:40] Initial: 40AB; - Successful: assigned to (171.64.65.55).
[09:20:40] + News From Folding@Home: Welcome to Folding@Home
[09:20:40] Loaded queue successfully.
[09:20:40] Sent data
[09:20:40] Connecting to http://171.64.65.55:8080/
[09:22:41] Posted data.
[09:42:41] Initial: 00DA; Killing all core threads
[09:48:51] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown at user request.
[09:48:51] ***** Got a SIGTERM signal (2)
[09:48:51] Killing all core threads
[09:48:51] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown.


--- Opening Log file [February 6 09:48:59 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: E:\_SMP_FAH-v. 6.34
Executable: E:\_SMP_FAH-v. 6.34\FAH6.34-win32-SMP.exe
Arguments: -smp -advmethods -verbosity 9 

[09:48:59] - Ask before connecting: No
[09:48:59] - User name: artoar_home (Team 32435)
[09:48:59] - User ID: 7892D91A0BA0CA2D
[09:48:59] - Machine ID: 1
[09:48:59] 
[09:48:59] Loaded queue successfully.
[09:48:59] - Preparing to get new work unit...
[09:48:59] - Autosending finished units... [February 6 09:48:59 UTC]
[09:48:59] Cleaning up work directory
[09:48:59] Trying to send all finished work units
[09:48:59] + Attempting to get work packet
[09:48:59] Project: 11041 (Run 0, Clone 440, Gen 1)
[09:48:59] Passkey found


[09:48:59] - Will indicate memory of 4072 MB
[09:48:59] + Attempting to send results [February 6 09:48:59 UTC]
[09:48:59] - Detect CPU.[09:48:59] - Reading file work/wuresults_06.dat from core
 Vendor: GenuineIntel, Family: 6, Model: 10, Stepping: 7
[09:48:59] - Connecting to assignment server
[09:48:59] Connecting to http://assign.stanford.edu:8080/
[09:48:59]   (Read 10301679 bytes from disk)
[09:48:59] Connecting to http://171.64.65.55:8080/
[09:49:00] Posted data.
[09:49:00] Initial: 40AB; - Successful: assigned to (171.64.65.55).
[09:49:00] + News From Folding@Home: Welcome to Folding@Home
[09:49:01] Loaded queue successfully.
[09:49:01] Sent data
[09:49:01] Connecting to http://171.64.65.55:8080/
[09:49:20] - Couldn't send HTTP request to server
[09:49:20] + Could not connect to Work Server (results)
[09:49:20]     (171.64.65.55:8080)
[09:49:20] + Retrying using alternative port
[09:49:20] Connecting to http://171.64.65.55:80/
[09:49:22] - Couldn't send HTTP request to server
[09:49:22] + Could not connect to Work Server
[09:49:22] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[09:49:38] + Attempting to get work packet
[09:49:38] Passkey found
[09:49:38] - Will indicate memory of 4072 MB
[09:49:38] - Connecting to assignment server
[09:49:38] Connecting to http://assign.stanford.edu:8080/
[09:49:41] - Couldn't send HTTP request to server
[09:49:41] + Could not connect to Work Server (results)
[09:49:41]     (171.64.65.55:80)
[09:49:41] - Error: Could not transmit unit 06 (completed February 6) to work server.
[09:49:41] - 1 failed uploads of this unit.
[09:49:41]   Keeping unit 06 in queue.
[09:49:41] + Sent 0 of 1 completed units to the server
[09:49:41] - Autosend completed
[09:49:59] - Couldn't send HTTP request to server
[09:49:59] + Could not connect to Assignment Server
[09:49:59] Connecting to http://assign2.stanford.edu:80/
[09:50:00] Posted data.
[09:50:00] Initial: 8F80; - Successful: assigned to (128.143.231.202).
[09:50:00] + News From Folding@Home: Welcome to Folding@Home
[09:50:00] Loaded queue successfully.
[09:50:00] Sent data
[09:50:00] Connecting to http://128.143.231.202:80/
[09:50:02] Posted data.
[09:50:02] Initial: 0000; - Receiving payload (expected size: 3806911)
[09:51:27] - Downloaded at ~43 kB/s
[09:51:27] - Averaged speed for that direction ~215 kB/s
[09:51:27] + Received work.
[09:51:27] + Closed connections
[09:51:27] 
[09:51:27] + Processing work unit
[09:51:27] Core required: FahCore_a3.exe
[09:51:27] Core found.
[09:51:27] Working on queue slot 07 [February 6 09:51:27 UTC]
[09:51:27] + Working ...
[09:51:27] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 07 -np 4 -nocpulock -checkpoint 9 -verbose -lifeline 1008 -version 634'

[09:51:27] 
[09:51:27] *------------------------------*
[09:51:27] Folding@Home Gromacs SMP Core
[09:51:27] Version 2.27 (Dec. 15, 2010)
[09:51:27] 
[09:51:27] Preparing to commence simulation
[09:51:27] - Looking at optimizations...
[09:51:27] - Created dyn
[09:51:27] - Files status OK
[09:51:28] - Expanded 3806399 -> 4136808 (decompressed 108.6 percent)
[09:51:28] Called DecompressByteArray: compressed_data_size=3806399 data_size=4136808, decompressed_data_size=4136808 diff=0
[09:51:28] - Digital signature verified
[09:51:28] 
[09:51:28] Project: 6098 (Run 0, Clone 66, Gen 53)
[09:51:28] 
[09:51:28] Assembly optimizations on if available.
[09:51:28] Entering M.D.
[09:51:34] Mapping NT from 4 to 4 
[09:51:34] Completed 0 out of 500000 steps  (0%)

PS: The server status now shows full/Reject.
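
A successful ping only shows that the host answers ICMP; it says nothing about whether the work server's HTTP ports accept TCP connections. A minimal sketch for checking that separately, assuming Python 3 is available. The helper names `port_accepts_connections` and `check_ports` and the 5-second timeout are illustrative choices, not part of the Folding@home client; the ports 8080 and 80 are the ones the client tries in the log above.

```python
import socket


def port_accepts_connections(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; not part of the FAH client.
    """
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (which covers timeouts and refused connections) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports=(8080, 80), timeout=5.0):
    """Check each port the client tries and return {port: reachable}."""
    return {port: port_accepts_connections(host, port, timeout) for port in ports}
```

Calling `check_ports("171.64.65.55")` would report each port separately. Note that this only tests the handshake: a server that accepts the connection and then stalls during the upload would still be reported as reachable.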

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 11:00 am
by CBT
Same here: the WS that I get from the AS is "http://171.64.65.55:8080/", but that server is not accepting connections.
    [10:57:01] + Attempting to get work packet
    [10:57:01] Passkey found
    [10:57:01] - Will indicate memory of 1024 MB
    [10:57:01] - Connecting to assignment server
    [10:57:01] Connecting to http://assign.stanford.edu:8080/
    [10:57:02] Posted data.
    [10:57:02] Initial: 40AB; - Successful: assigned to (171.64.65.55).
    [10:57:02] + News From Folding@Home: Welcome to Folding@Home
    [10:57:02] Loaded queue successfully.
    [10:57:02] Sent data
    [10:57:02] Connecting to http://171.64.65.55:8080/
    [10:57:23] - Couldn't send HTTP request to server
    [10:57:23] + Could not connect to Work Server
    [10:57:23] - Attempt #7 to get work failed, and no other work to do.
    Waiting before retry.
Can anybody help?

Corné

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 2:47 pm
by codysluder
At 1:00 PST the server was fine. Since 2:00 PST, it has been in trouble. It's currently 06:45 PST so the Stanford staff may not be around to fix it for a few more hours, but I'm sure they will be checking on it soon.

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 4:43 pm
by dvanatta
I'm looking into this now.

CBT, which project were you trying to return? Was it also 11041?

-Dan

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 4:53 pm
by dvanatta
Hi,

CBT and artoar_11, are both of you using the -advmethods flag? If so, I've solved this; if not, please let me know and I will continue looking.

Thanks,
Dan

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 5:17 pm
by starkreiten
dvanatta wrote: CBT and artoar_11, are both of you using the -advmethods flag? If so, I've solved this; if not, please let me know and I will continue looking.
I have an 11050 failing to upload. I'm running v7 with the beta flag.

Looks like this one tried to roll over to the collection server, but failed.

Code:

******************************** Date: 06/02/12 ********************************
17:10:37:WU03:FS02:Sending unit results: id:03 state:SEND error:OK project:11050 run:0 clone:92 gen:6 core:0xa3 unit:0x000000060a3b1e5b4db738b91bc4821d
17:10:37:WU03:FS02:Uploading 7.97MiB to 171.64.65.55
17:10:37:WU03:FS02:Connecting to 171.64.65.55:8080
17:11:19:WU03:FS02:Upload 0.78%
17:11:19:WARNING:WU03:FS02:Exception: Failed to send results to work server: Transfer failed
17:11:19:WU03:FS02:Trying to send results to collection server
17:11:19:WU03:FS02:Uploading 7.97MiB to 171.67.108.26
17:11:19:WU03:FS02:Connecting to 171.67.108.26:8080
17:11:21:ERROR:WU03:FS02:Exception: Transfer failed
The initial upload attempt:

Code:

******************************** Date: 06/02/12 ********************************
11:57:42:WU03:FS02:Sending unit results: id:03 state:SEND error:OK project:11050 run:0 clone:92 gen:6 core:0xa3 unit:0x000000060a3b1e5b4db738b91bc4821d
11:57:42:WU03:FS02:Uploading 7.97MiB to 171.64.65.55
11:57:42:WU03:FS02:Connecting to 171.64.65.55:8080
11:58:24:WU03:FS02:Upload 0.78%
11:58:24:WARNING:WU03:FS02:Exception: Failed to send results to work server: Transfer failed
Dana
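
The roll-over visible in these logs (work server on port 8080, then port 80, then the collection server on the same two ports) can be sketched as an ordered list of targets tried in turn. This is only a model of the order seen in this thread, not the client's actual implementation; `fallback_targets`, `upload_with_fallback`, and the caller-supplied `send` function are hypothetical names.

```python
from typing import Callable, List, Optional, Tuple

Target = Tuple[str, int]


def fallback_targets(work_server: str, collection_server: str) -> List[Target]:
    """Order of endpoints seen in the logs: WS:8080, WS:80, CS:8080, CS:80."""
    return [
        (work_server, 8080),
        (work_server, 80),
        (collection_server, 8080),
        (collection_server, 80),
    ]


def upload_with_fallback(
    targets: List[Target], send: Callable[[Target], bool]
) -> Optional[Target]:
    """Try each target in order; return the first that succeeds, else None.

    `send` is a caller-supplied function (hypothetical here) that returns
    True on a successful transfer and returns False or raises OSError on
    failure.
    """
    for target in targets:
        try:
            if send(target):
                return target
        except OSError:
            # Treat a connection error like a failed attempt and move on.
            continue
    # Nothing worked: the caller keeps the unit in its queue and retries later.
    return None
```

With both servers refusing connections, as in the logs above, every target fails and the function returns `None`, which corresponds to the unit staying in the queue.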

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 6:07 pm
by shad0wfax
The issue is not resolved for me...

It was an 11070 WU and I am using the -advmethods flag (and I do have the big file size flag set also, although that shouldn't matter).

After several stalled clients and several restarts, it's failing to autosend.

Code:

[08:40:32] Completed 990000 out of 1000000 steps  (99%)
[08:42:51] Completed 1000000 out of 1000000 steps  (100%)
[08:42:51] DynamicWrapper: Finished Work Unit: sleep=10000
[08:43:01] 
[08:43:01] Finished Work Unit:
[08:43:01] - Reading up to 6652008 from "work/wudata_08.trr": Read 6652008
[08:43:01] trr file hash check passed.
[08:43:01] - Reading up to 998416 from "work/wudata_08.xtc": Read 998416
[08:43:01] xtc file hash check passed.
[08:43:01] edr file hash check passed.
[08:43:01] logfile size: 26227
[08:43:01] Leaving Run
[08:43:01] - Writing 7684015 bytes of core data to disk...
[08:43:02] Done: 7683503 -> 7348831 (compressed to 95.6 percent)
[08:43:02]   ... Done.
[08:43:03] - Shutting down core
[08:43:03] 
[08:43:03] Folding@home Core Shutdown: FINISHED_UNIT
[08:43:05] CoreStatus = 64 (100)
[08:43:05] Unit 8 finished with 99 percent of time to deadline remaining.
[08:43:05] Updated performance fraction: 0.986629
[08:43:05] Sending work to server
[08:43:05] Project: 11070 (Run 498, Clone 0, Gen 23)


[08:43:05] + Attempting to send results [February 6 08:43:05 UTC]
[08:43:05] - Reading file work/wuresults_08.dat from core
[08:43:05]   (Read 7349343 bytes from disk)
[08:43:05] Connecting to http://171.64.65.55:8080/
[08:52:15] Killing all core threads
[08:52:15] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown at user request.
[08:52:15] ***** Got a SIGTERM signal (2)
[08:52:15] Killing all core threads
[08:52:15] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown.
As you can see, the process hung for a little over 9 minutes after "Connecting to", without logging any error message or any further connection attempt.
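
The length of a stall like this can be read off the timestamps directly. A small sketch, assuming v6-style "[HH:MM:SS]" prefixes and a log that does not cross midnight; `largest_gap` is a hypothetical helper name.

```python
import re
from datetime import datetime

# v6 client log lines start with a timestamp such as "[08:43:05]".
STAMP = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]")


def largest_gap(log_text):
    """Return (seconds, start, end) for the longest pause between stamped lines.

    Lines without a timestamp are ignored. Returns (0, None, None) when
    fewer than two stamped lines are present.
    """
    times = []
    for line in log_text.splitlines():
        match = STAMP.match(line.strip())
        if match:
            times.append(datetime.strptime(match.group(1), "%H:%M:%S"))
    best = (0, None, None)
    for earlier, later in zip(times, times[1:]):
        seconds = int((later - earlier).total_seconds())
        if seconds > best[0]:
            best = (seconds, earlier.strftime("%H:%M:%S"), later.strftime("%H:%M:%S"))
    return best
```

For the excerpt above, the longest pause is between 08:43:05 and 08:52:15, which is 550 seconds (9 minutes 10 seconds).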

So I restarted the client, several times, to attempt to fix the issue:

Code:

--- Opening Log file [February 6 08:52:22 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Users\bjorn\Documents\FAHSMP
Executable: C:\Users\bjorn\Documents\FAHSMP\FAHSMP.exe
Arguments: -smp 4 -verbosity 9 

[08:52:22] - Ask before connecting: No
[08:52:22] - User name: shad0wfax (Team 37726)
[08:52:22] - User ID: 747958C18364134
[08:52:22] - Machine ID: 1
[08:52:22] 
[08:52:22] Loaded queue successfully.
[08:52:22] - Preparing to get new work unit...
[08:52:22] - Autosending finished units... [February 6 08:52:22 UTC]
[08:52:22] Cleaning up work directory
[08:52:22] Trying to send all finished work units
[08:52:22] Project: 11070 (Run 498, Clone 0, Gen 23)


[08:52:22] + Attempting to send results [February 6 08:52:22 UTC]
[08:52:22] - Reading file work/wuresults_08.dat from core
[08:52:22]   (Read 7349343 bytes from disk)
[08:52:22] Connecting to http://171.64.65.55:8080/
[08:52:22] + Attempting to get work packet
[08:52:22] Passkey found
[08:52:22] - Will indicate memory of 8169 MB
[08:52:22] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 10, Stepping: 7
[08:52:22] - Connecting to assignment server
[08:52:22] Connecting to http://assign.stanford.edu:8080/
[08:52:23] Posted data.
[08:52:23] Initial: 40AB; - Successful: assigned to (171.64.65.55).
[08:52:23] + News From Folding@Home: Welcome to Folding@Home
[08:52:23] Loaded queue successfully.
[08:52:23] Sent data
[08:52:23] Connecting to http://171.64.65.55:8080/
[08:54:23] Posted data.
[09:03:39] Killing all core threads
[09:03:39] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown at user request.
[09:03:39] ***** Got a SIGTERM signal (2)
[09:03:39] Killing all core threads
[09:03:39] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown.


--- Opening Log file [February 6 17:48:54 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Users\bjorn\Documents\FAHSMP
Executable: C:\Users\bjorn\Documents\FAHSMP\FAHSMP.exe
Arguments: -smp 4 -verbosity 9 

[17:48:54] - Ask before connecting: No
[17:48:54] - User name: shad0wfax (Team 37726)
[17:48:54] - User ID: 747958C18364134
[17:48:54] - Machine ID: 1
[17:48:54] 
[17:48:55] Loaded queue successfully.
[17:48:55] - Preparing to get new work unit...
[17:48:55] - Autosending finished units... [February 6 17:48:55 UTC]
[17:48:55] Cleaning up work directory
[17:48:55] Trying to send all finished work units
[17:48:55] + Attempting to get work packet
[17:48:55] Project: 11070 (Run 498, Clone 0, Gen 23)
[17:48:55] Passkey found


[17:48:55] - Will indicate memory of 8166 MB
[17:48:55] + Attempting to send results [February 6 17:48:55 UTC]
[17:48:55] - Detect CPU.[17:48:55] - Reading file work/wuresults_08.dat from core
 Vendor: GenuineIntel, Family: 6, Model: 10, Stepping: 7
[17:48:55] - Connecting to assignment server
[17:48:55] Connecting to http://assign.stanford.edu:8080/
[17:48:55]   (Read 7349343 bytes from disk)
[17:48:55] Connecting to http://171.64.65.55:8080/
[17:48:55] Posted data.
[17:48:55] Initial: 8F80; - Successful: assigned to (128.143.231.202).
[17:48:55] + News From Folding@Home: Welcome to Folding@Home
[17:48:55] Loaded queue successfully.
[17:48:55] Sent data
[17:48:55] Connecting to http://128.143.231.202:8080/
[17:48:57] Posted data.
[17:48:57] Initial: 0000; - Receiving payload (expected size: 3812449)
[17:49:03] - Downloaded at ~620 kB/s
[17:49:03] - Averaged speed for that direction ~659 kB/s
[17:49:03] + Received work.
[17:49:03] + Closed connections
[17:49:03] 
[17:49:03] + Processing work unit
[17:49:03] Core required: FahCore_a3.exe
[17:49:03] Core found.
[17:49:03] Working on queue slot 09 [February 6 17:49:03 UTC]
[17:49:03] + Working ...
[17:49:03] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 09 -np 4 -nocpulock -checkpoint 30 -verbose -lifeline 2744 -version 634'

[17:49:05] 
[17:49:05] *------------------------------*
[17:49:05] Folding@Home Gromacs SMP Core
[17:49:05] Version 2.27 (Dec. 15, 2010)
[17:49:05] 
[17:49:05] Preparing to commence simulation
[17:49:05] - Looking at optimizations...
[17:49:05] - Created dyn
[17:49:05] - Files status OK
[17:49:05] - Expanded 3811937 -> 4169428 (decompressed 109.3 percent)
[17:49:05] Called DecompressByteArray: compressed_data_size=3811937 data_size=4169428, decompressed_data_size=4169428 diff=0
[17:49:05] - Digital signature verified
[17:49:05] 
[17:49:05] Project: 6097 (Run 0, Clone 71, Gen 86)
[17:49:05] 
[17:49:05] Assembly optimizations on if available.
[17:49:05] Entering M.D.
[17:49:11] Mapping NT from 4 to 4 
[17:49:12] Completed 0 out of 500000 steps  (0%)
[17:49:37] - Couldn't send HTTP request to server
[17:49:37] + Could not connect to Work Server (results)
[17:49:37]     (171.64.65.55:8080)
[17:49:37] + Retrying using alternative port
[17:49:37] Connecting to http://171.64.65.55:80/
[17:50:19] - Couldn't send HTTP request to server
[17:50:19] + Could not connect to Work Server (results)
[17:50:19]     (171.64.65.55:80)
[17:50:19] - Error: Could not transmit unit 08 (completed February 6) to work server.
[17:50:19] - 1 failed uploads of this unit.
[17:50:19]   Keeping unit 08 in queue.
[17:50:19] + Sent 0 of 1 completed units to the server
[17:50:19] - Autosend completed
[17:54:44] Killing all core threads
[17:54:44] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown at user request.
[17:54:44] ***** Got a SIGTERM signal (2)
[17:54:44] Killing all core threads
[17:54:44] Could not get process id information.  Please kill core process manually

Folding@Home Client Shutdown.


--- Opening Log file [February 6 17:56:04 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Users\bjorn\Documents\FAHSMP
Executable: C:\Users\bjorn\Documents\FAHSMP\FAHSMP.exe
Arguments: -smp 4 -verbosity 9 

[17:56:04] - Ask before connecting: No
[17:56:04] - User name: shad0wfax (Team 37726)
[17:56:04] - User ID: 747958C18364134
[17:56:04] - Machine ID: 1
[17:56:04] 
[17:56:04] Loaded queue successfully.
[17:56:04] 
[17:56:04] - Autosending finished units... [February 6 17:56:04 UTC]
[17:56:04] + Processing work unit
[17:56:04] Trying to send all finished work units
[17:56:04] Core required: FahCore_a3.exe
[17:56:04] Project: 11070 (Run 498, Clone 0, Gen 23)
[17:56:04] Core found.


[17:56:04] + Attempting to send results [February 6 17:56:04 UTC]
[17:56:04] - Reading file work/wuresults_08.dat from core
[17:56:04] Working on queue slot 09 [February 6 17:56:04 UTC]
[17:56:04]   (Read 7349343 bytes from disk)
[17:56:04] + Working ...
[17:56:04] Connecting to http://171.64.65.55:8080/
[17:56:04] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 09 -np 4 -nocpulock -checkpoint 30 -verbose -lifeline 3556 -version 634'

[17:56:04] 
[17:56:04] *------------------------------*
[17:56:04] Folding@Home Gromacs SMP Core
[17:56:04] Version 2.27 (Dec. 15, 2010)
[17:56:04] 
[17:56:04] Preparing to commence simulation
[17:56:04] - Ensuring status. Please wait.
[17:56:14] - Looking at optimizations...
[17:56:14] - Working with standard loops on this execution.
[17:56:14] - Previous termination of core was improper.
[17:56:14] - Files status OK
[17:56:14] - Expanded 3811937 -> 4169428 (decompressed 109.3 percent)
[17:56:14] Called DecompressByteArray: compressed_data_size=3811937 data_size=4169428, decompressed_data_size=4169428 diff=0
[17:56:14] - Digital signature verified
[17:56:14] 
[17:56:14] Project: 6097 (Run 0, Clone 71, Gen 86)
[17:56:14] 
[17:56:14] Entering M.D.
[17:56:20] Mapping NT from 4 to 4 
[17:56:20] Completed 0 out of 500000 steps  (0%)
[17:56:25] - Couldn't send HTTP request to server
[17:56:25] + Could not connect to Work Server (results)
[17:56:25]     (171.64.65.55:8080)
[17:56:25] + Retrying using alternative port
[17:56:25] Connecting to http://171.64.65.55:80/
[17:57:08] - Couldn't send HTTP request to server
[17:57:08] + Could not connect to Work Server (results)
[17:57:08]     (171.64.65.55:80)
[17:57:08] - Error: Could not transmit unit 08 (completed February 6) to work server.
[17:57:08] - 2 failed uploads of this unit.


[17:57:08] + Attempting to send results [February 6 17:57:08 UTC]
[17:57:08] - Reading file work/wuresults_08.dat from core
[17:57:08]   (Read 7349343 bytes from disk)
[17:57:08] Connecting to http://171.67.108.26:8080/
[17:57:10] - Couldn't send HTTP request to server
[17:57:10] + Could not connect to Work Server (results)
[17:57:10]     (171.67.108.26:8080)
[17:57:10] + Retrying using alternative port
[17:57:10] Connecting to http://171.67.108.26:80/
[17:57:13] - Couldn't send HTTP request to server
[17:57:13] + Could not connect to Work Server (results)
[17:57:13]     (171.67.108.26:80)
[17:57:13]   Could not transmit unit 08 to Collection server; keeping in queue.
[17:57:13] + Sent 0 of 1 completed units to the server
[17:57:13] - Autosend completed
[18:05:59] Completed 5000 out of 500000 steps  (1%)
Now it's simply hammering the collection server, which is also not allowing a transmission, but my client is finally folding on the next WU in the queue at least. (It wasn't even making progress on the 6097 WU I received until just now, and it's killing my TPF.)
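
To see at a glance which servers and ports the failed uploads were aimed at, the "(host:port)" lines that follow each "Could not connect to Work Server (results)" message can be tallied. A rough sketch, assuming the v6 log layout shown above; `count_failed_targets` is a hypothetical helper, not a Folding@home tool.

```python
import re
from collections import Counter

# Matches v6 lines such as "[17:49:37]     (171.64.65.55:8080)", which the
# client prints right after "Could not connect to Work Server (results)".
FAILED_TARGET = re.compile(
    r"^\[\d{2}:\d{2}:\d{2}\]\s+\((\d{1,3}(?:\.\d{1,3}){3}):(\d+)\)\s*$"
)


def count_failed_targets(log_text):
    """Count failed result uploads per (host, port) in a v6 client log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED_TARGET.match(line.strip())
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

Lines such as "assigned to (171.64.65.55)." are not counted, because they carry no port and do not consist of the bracketed address alone.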

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 6:16 pm
by 7im
P11050 R0 C98 G2 failing to upload to this server.

Code:

10:26:01:WU01:FS00:0xa3:Completed 980000 out of 1000000 steps  (98%)
10:34:42:WU01:FS00:0xa3:Completed 990000 out of 1000000 steps  (99%)
10:43:23:WU01:FS00:0xa3:Completed 1000000 out of 1000000 steps  (100%)
10:43:23:WU01:FS00:0xa3:DynamicWrapper: Finished Work Unit: sleep=10000
10:43:33:WU01:FS00:0xa3:
10:43:33:WU01:FS00:0xa3:Finished Work Unit:
10:43:33:WU01:FS00:0xa3:- Reading up to 7569936 from "01/wudata_01.trr": Read 7569936
10:43:33:WU01:FS00:0xa3:trr file hash check passed.
10:43:33:WU01:FS00:0xa3:- Reading up to 1134728 from "01/wudata_01.xtc": Read 1134728
10:43:33:WU01:FS00:0xa3:xtc file hash check passed.
10:43:33:WU01:FS00:0xa3:edr file hash check passed.
10:43:33:WU01:FS00:0xa3:logfile size: 27046
10:43:33:WU01:FS00:0xa3:Leaving Run
10:43:34:WU01:FS00:0xa3:- Writing 8739074 bytes of core data to disk...
10:43:36:WU01:FS00:0xa3:Done: 8738562 -> 8351877 (compressed to 95.5 percent)
10:43:36:WU01:FS00:0xa3:  ... Done.
10:43:37:WU01:FS00:0xa3:- Shutting down core
10:43:37:WU01:FS00:0xa3:
10:43:37:WU01:FS00:0xa3:Folding@home Core Shutdown: FINISHED_UNIT
10:43:38:WU01:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
10:43:38:WU01:FS00:Sending unit results: id:01 state:SEND error:OK project:11050 run:0 clone:98 gen:2 core:0xa3 unit:0x000000020a3b1e5b4db738bcf7afe879
10:43:38:WU01:FS00:Uploading 7.97MiB to 171.64.65.55
10:43:38:WU01:FS00:Connecting to 171.64.65.55:8080
10:46:41:WU01:FS00:Upload 0.78%
10:46:41:WARNING:WU01:FS00:Exception: Failed to send results to work server: Transfer failed
10:46:41:WU01:FS00:Trying to send results to collection server
10:46:41:WU01:FS00:Uploading 7.97MiB to 171.67.108.26
10:46:41:WU01:FS00:Connecting to 171.67.108.26:8080
10:46:42:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
10:46:42:WU01:FS00:Connecting to 171.67.108.26:80
10:46:43:ERROR:WU01:FS00:Exception: Failed to connect to 171.67.108.26:80: No connection could be made because the target machine actively refused it.
10:46:43:WU01:FS00:Sending unit results: id:01 state:SEND error:OK project:11050 run:0 clone:98 gen:2 core:0xa3 unit:0x000000020a3b1e5b4db738bcf7afe879
10:46:43:WU01:FS00:Uploading 7.97MiB to 171.64.65.55
10:46:43:WU01:FS00:Connecting to 171.64.65.55:8080
10:49:47:WU01:FS00:Upload 0.78%
10:49:47:WARNING:WU01:FS00:Exception: Failed to send results to work server: Transfer failed
10:49:47:WU01:FS00:Trying to send results to collection server

[-->8--snip--8<--]

16:10:41:WU01:FS00:Sending unit results: id:01 state:SEND error:OK project:11050 run:0 clone:98 gen:2 core:0xa3 unit:0x000000020a3b1e5b4db738bcf7afe879
16:10:41:WU01:FS00:Uploading 7.97MiB to 171.64.65.55
16:10:41:WU01:FS00:Connecting to 171.64.65.55:8080
16:11:03:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
16:11:03:WU01:FS00:Connecting to 171.64.65.55:80
16:11:24:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 171.64.65.55:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
16:11:24:WU01:FS00:Trying to send results to collection server
16:11:24:WU01:FS00:Uploading 7.97MiB to 171.67.108.26
16:11:24:WU01:FS00:Connecting to 171.67.108.26:8080
16:11:25:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
16:11:25:WU01:FS00:Connecting to 171.67.108.26:80
16:11:26:ERROR:WU01:FS00:Exception: Failed to connect to 171.67.108.26:80: No connection could be made because the target machine actively refused it.

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 6:25 pm
by dvanatta
OK, thanks for the additional information, I'll take another look.

-Dan

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 6:39 pm
by CBT
Sorry for the delayed reply.
My client finally turned to another WS to fetch a new workload, and it has been crunching away fine for a few hours already.

However, the upload still tries to contact the old server:

Code:

[16:49:28] - Autosending finished units... [February 6 16:49:28 UTC]
[16:49:28] Trying to send all finished work units
[16:49:28] Project: 11021 (Run 0, Clone 807, Gen 10)


[16:49:28] + Attempting to send results [February 6 16:49:28 UTC]
[16:49:28] - Reading file work/wuresults_03.dat from core
[16:49:28]   (Read 8706481 bytes from disk)
[16:49:28] Connecting to http://171.64.65.55:8080/
[16:52:28] - Couldn't send HTTP request to server
[16:52:28] + Could not connect to Work Server (results)
[16:52:28]     (171.64.65.55:8080)
[16:52:28] + Retrying using alternative port
[16:52:28] Connecting to http://171.64.65.55:80/
[16:52:49] - Couldn't send HTTP request to server
[16:52:49] + Could not connect to Work Server (results)
[16:52:49]     (171.64.65.55:80)
[16:52:49] - Error: Could not transmit unit 03 (completed February 6) to work server.
[16:52:49] - 4 failed uploads of this unit.


[16:52:49] + Attempting to send results [February 6 16:52:49 UTC]
[16:52:49] - Reading file work/wuresults_03.dat from core
[16:52:49]   (Read 8706481 bytes from disk)
[16:52:49] Connecting to http://171.67.108.26:8080/
[16:52:51] - Couldn't send HTTP request to server
[16:52:51] + Could not connect to Work Server (results)
[16:52:51]     (171.67.108.26:8080)
[16:52:51] + Retrying using alternative port
[16:52:51] Connecting to http://171.67.108.26:80/
[16:52:52] - Couldn't send HTTP request to server
[16:52:52] + Could not connect to Work Server (results)
[16:52:52]     (171.67.108.26:80)
[16:52:52]   Could not transmit unit 03 to Collection server; keeping in queue.
[16:52:52] + Sent 0 of 1 completed units to the server
[16:52:52] - Autosend completed
The second part is an attempt to upload this WU to another server (171.67.108.26), which also doesn't seem to work.
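
For anyone who wants to summarize which servers and ports a v6 client log reports as unreachable, here is a minimal Python sketch. It assumes the log layout shown above, where a "Could not connect to Work Server" line is followed directly by a line with the endpoint in parentheses; the `failed_endpoints` name is made up for this example.

```python
import re
from collections import Counter

# Matches lines such as "[16:52:28]     (171.64.65.55:8080)".
ENDPOINT = re.compile(
    r"^\[\d{2}:\d{2}:\d{2}\]\s+\((\d{1,3}(?:\.\d{1,3}){3}):(\d+)\)\s*$"
)


def failed_endpoints(log_text):
    """Count failed connection attempts per (host, port) in a v6 client log."""
    counts = Counter()
    previous = ""
    for line in log_text.splitlines():
        match = ENDPOINT.match(line)
        # Only count an endpoint line that directly follows the failure message.
        if match and "Could not connect to Work Server" in previous:
            counts[(match.group(1), int(match.group(2)))] += 1
        previous = line
    return counts
```

Running it over a saved log gives one counter entry per server and port, which makes it easy to see whether both the work server and the collection server are failing on 8080 and on 80.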

I've checked my settings (command-line parameters in the shortcut and client.cfg), but I don't seem to be using 'advmethods' on this particular PC. The client is "Windows SMP Console Edition", "Folding@Home Client Version 6.34".

I suppose I'm gonna miss my QRB on this one.

Corné

Mod Edit: Changed List Tags To Code Tags - PantherX

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 7:23 pm
by dvanatta
Hi,

I made some changes and restarted the server; it really, really should work now. Does anyone still have issues, or did that fix it?

Thanks,
Dan

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 7:34 pm
by CBT
I'm afraid I have to disappoint you. :(
I just restarted my client in order to force a new upload attempt and this is the result:

Code: Select all

[19:25:04] Loaded queue successfully.
[19:25:04]
[19:25:04] - Autosending finished units... [February 6 19:25:04 UTC]
[19:25:04] Trying to send all finished work units
[19:25:04] + Processing work unit
[19:25:04] Project: 11021 (Run 0, Clone 807, Gen 10)


[19:25:04] + Attempting to send results [February 6 19:25:04 UTC]
[19:25:04] - Reading file work/wuresults_03.dat from core
[19:25:04] Core required: FahCore_a3.exe
[19:25:04]   (Read 8706481 bytes from disk)
[19:25:04] Connecting to http://171.64.65.55:8080/
[19:25:25] - Couldn't send HTTP request to server
[19:25:25] + Could not connect to Work Server (results)
[19:25:25]     (171.64.65.55:8080)
[19:25:25] + Retrying using alternative port
[19:25:25] Connecting to http://171.64.65.55:80/
[19:25:46] - Couldn't send HTTP request to server
[19:25:46] + Could not connect to Work Server (results)
[19:25:46]     (171.64.65.55:80)
[19:25:46] - Error: Could not transmit unit 03 (completed February 6) to work server.
[19:25:46] - 5 failed uploads of this unit.


[19:25:46] + Attempting to send results [February 6 19:25:46 UTC]
[19:25:46] - Reading file work/wuresults_03.dat from core
[19:25:46]   (Read 8706481 bytes from disk)
[19:25:46] Connecting to http://171.67.108.26:8080/
[19:26:07] - Couldn't send HTTP request to server
[19:26:15] + Could not connect to Work Server (results)
[19:26:15]     (171.67.108.26:8080)
[19:26:15] + Retrying using alternative port
[19:26:15] Connecting to http://171.67.108.26:80/
[19:26:36] - Couldn't send HTTP request to server
[19:26:36] + Could not connect to Work Server (results)
[19:26:36]     (171.67.108.26:80)
[19:26:36]   Could not transmit unit 03 to Collection server; keeping in queue.
[19:26:36] + Sent 0 of 1 completed units to the server
[19:26:36] - Autosend completed
Mod Edit: Changed List Tags To Code Tags - PantherX

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 7:35 pm
by starkreiten
It still won't take my 11050 (0,92,6) either. :(

Code: Select all

19:27:42:WU03:FS02:Sending unit results: id:03 state:SEND error:OK project:11050 run:0 clone:92 gen:6 core:0xa3 unit:0x000000060a3b1e5b4db738b91bc4821d
19:27:42:WU03:FS02:Uploading 7.97MiB to 171.64.65.55
19:27:42:WU03:FS02:Connecting to 171.64.65.55:8080
19:28:24:WU03:FS02:Upload 0.78%
19:28:24:WARNING:WU03:FS02:Exception: Failed to send results to work server: Transfer failed
19:28:24:WU03:FS02:Trying to send results to collection server
19:28:24:WU03:FS02:Uploading 7.97MiB to 171.67.108.26
19:28:24:WU03:FS02:Connecting to 171.67.108.26:8080
19:28:27:ERROR:WU03:FS02:Exception: Transfer failed
19:29:19:WU03:FS02:Sending unit results: id:03 state:SEND error:OK project:11050 run:0 clone:92 gen:6 core:0xa3 unit:0x000000060a3b1e5b4db738b91bc4821d
19:29:19:WU03:FS02:Uploading 7.97MiB to 171.64.65.55
19:29:19:WU03:FS02:Connecting to 171.64.65.55:8080
Dana

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 7:39 pm
by artoar_11
dvanatta wrote: Hi,

CBT and artoar_11, are both of you using the adv methods flag? If you were, I've solved this, if not, please let me know and I will continue looking.

Thanks,
Dan
Sorry for the interruption in communication. Our region has had heavy rain for 24 hours, even though it is mid-winter here, and I had no internet for ~10 hours.

Yes, I use the -advmethods flag on all clients. The problem continues.

PS: Here is the log from another client, my home PC:

Code: Select all

# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: D:\_SMP2_FAH
Executable: D:\_SMP2_FAH\FAH6.34-win32-SMP.exe
Arguments: -smp -verbosity 9 -advmethods 

[19:48:53] - Ask before connecting: No
[19:48:53] - User name: artoar_home (Team 32435)
[19:48:53] - User ID: 714971F37559BC17
[19:48:53] - Machine ID: 1
[19:48:53] 
[19:48:53] Loaded queue successfully.
[19:48:53] 
[19:48:53] - Autosending finished units... [February 6 19:48:53 UTC]
[19:48:53] + Processing work unit
[19:48:53] Trying to send all finished work units
[19:48:53] Core required: FahCore_a3.exe
[19:48:53] Project: 11061 (Run 0, Clone 2105, Gen 14)
[19:48:53] Core found.


[19:48:53] + Attempting to send results [February 6 19:48:53 UTC]
[19:48:53] - Reading file work/wuresults_01.dat from core
[19:48:53]   (Read 783007 bytes from disk)
[19:48:53] Connecting to http://171.64.65.55:8080/
[19:48:53] Working on queue slot 02 [February 6 19:48:53 UTC]
[19:48:53] + Working ...
[19:48:53] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 02 -np 4 -checkpoint 9 -verbose -lifeline 3424 -version 634'

[19:48:55] 
[19:48:55] *------------------------------*
[19:48:55] Folding@Home Gromacs SMP Core
[19:48:55] Version 2.27 (Dec. 15, 2010)
[19:48:55] 
[19:48:55] Preparing to commence simulation
[19:48:55] - Ensuring status. Please wait.
[19:49:04] - Looking at optimizations...
[19:49:05] - Working with standard loops on this execution.
[19:49:05] - Previous termination of core was improper.
[19:49:05] - Going to use standard loops.
[19:49:05] - Files status OK
[19:49:12] - Expanded 3811182 -> 4169428 (decompressed 109.3 percent)
[19:49:13] Called DecompressByteArray: compressed_data_size=3811182 data_size=4169428, decompressed_data_size=4169428 diff=0
[19:49:14] - Digital signature verified
[19:49:14] 
[19:49:14] Project: 6097 (Run 0, Clone 82, Gen 88)
[19:49:14] 
[19:49:14] - Couldn't send HTTP request to server
[19:49:14] + Could not connect to Work Server (results)
[19:49:14]     (171.64.65.55:8080)
[19:49:14] + Retrying using alternative port
[19:49:14] Connecting to http://171.64.65.55:80/
[19:49:21] Entering M.D.
[19:49:27] Using Gromacs checkpoints
[19:49:31] Mapping NT from 4 to 4 
[19:49:35] - Couldn't send HTTP request to server
[19:49:35] + Could not connect to Work Server (results)
[19:49:35]     (171.64.65.55:80)
[19:49:35] - Error: Could not transmit unit 01 (completed February 6) to work server.
[19:49:35] - 6 failed uploads of this unit.


[19:49:35] + Attempting to send results [February 6 19:49:35 UTC]
[19:49:35] - Reading file work/wuresults_01.dat from core
[19:49:35]   (Read 783007 bytes from disk)
[19:49:35] Connecting to http://171.67.108.26:8080/
[19:49:39] - Couldn't send HTTP request to server
[19:49:39] + Could not connect to Work Server (results)
[19:49:39]     (171.67.108.26:8080)
[19:49:39] + Retrying using alternative port
[19:49:39] Connecting to http://171.67.108.26:80/
[19:49:40] - Couldn't send HTTP request to server
[19:49:40] + Could not connect to Work Server (results)
[19:49:40]     (171.67.108.26:80)
[19:49:40]   Could not transmit unit 01 to Collection server; keeping in queue.
[19:49:40] + Sent 0 of 1 completed units to the server
[19:49:40] - Autosend completed
[19:50:04] Resuming from checkpoint
[19:50:05] Verified work/wudata_02.log
[19:50:07] Verified work/wudata_02.trr
[19:50:08] Verified work/wudata_02.edr
[19:50:19] Killing all core threads
[19:50:19] Killing 3 cores
[19:50:19] Killing core 0
[19:50:19] Killing core 1
[19:50:19] Killing core 2

Folding@Home Client Shutdown at user request.
[19:50:19] ***** Got a SIGTERM signal (2)
[19:50:19] Killing all core threads
[19:50:19] Killing 3 cores
[19:50:19] Killing core 0
[19:50:19] Killing core 1
[19:50:19] Killing core 2

Folding@Home Client Shutdown.

Re: 171.64.65.55

Posted: Mon Feb 06, 2012 7:59 pm
by dvanatta
OK, sorry, I posted that before the server had finished restarting. Still having problems?