
Re: New AS testing

Posted: Mon Oct 05, 2015 7:50 pm
by billford
As per announcement: viewtopic.php?f=24&t=28164

Related?

Code:

19:44:22:WU00:FS01:0x18:Completed 16000000 out of 16000000 steps (100%)
19:44:23:WU01:FS01:Connecting to 171.67.108.200:80
19:44:24:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.200:80': Failed to connect to 171.67.108.200:80: Connection refused
19:44:24:WU01:FS01:Connecting to 171.67.108.204:80
19:44:25:WU01:FS01:Assigned to work server 171.67.108.155
19:44:25:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GM204 [GeForce GTX 980] from 171.67.108.155
19:44:25:WU01:FS01:Connecting to 171.67.108.155:8080
19:44:25:WU01:FS01:Downloading 50.37MiB
.
.
19:44:53:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9640 run:0 clone:5 gen:0 core:0x21 unit:0x00000000ab436c9b5609bee355f950ce
19:44:53:WU01:FS01:Starting
19:44:53:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 704 -lifeline 1279 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
19:44:53:WU01:FS01:Started FahCore on PID 24462
19:44:53:WU01:FS01:Core PID:24466
19:44:53:WU01:FS01:FahCore 0x21 started
19:44:53:WU01:FS01:0x21:*********************** Log Started 2015-10-05T19:44:53Z ***********************
19:44:53:WU01:FS01:0x21:Project: 9640 (Run 0, Clone 5, Gen 0)
19:44:53:WU01:FS01:0x21:Unit: 0x00000000ab436c9b5609bee355f950ce
19:44:53:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
19:44:53:WU01:FS01:0x21:Machine: 1
19:44:53:WU01:FS01:0x21:Reading tar file core.xml
19:44:53:WU01:FS01:0x21:Reading tar file integrator.xml
19:44:53:WU01:FS01:0x21:Reading tar file state.xml
19:44:53:WU01:FS01:0x21:Reading tar file system.xml
19:44:53:WU01:FS01:0x21:Digital signatures verified

Re: Re: New AS testing

Posted: Mon Oct 05, 2015 8:09 pm
by jcoffland
Unless the problem persists, it's not an issue. Your client probably just connected during an AS restart. This is why we have automatic failover.
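
For readers unfamiliar with failover, here is a minimal sketch of the idea, not the actual client code. The function name `first_reachable` and the `fake_connect` stand-in are hypothetical; the two addresses are the ones from billford's log above, where the first AS refused the connection and the second one answered.

```python
from typing import Callable, Iterable, Optional


def first_reachable(servers: Iterable[str],
                    try_connect: Callable[[str], bool]) -> Optional[str]:
    """Return the first server that accepts a connection, or None."""
    for server in servers:
        try:
            if try_connect(server):
                return server
        except OSError:
            # Covers ConnectionRefusedError and similar; move on to the next server.
            continue
    return None


def fake_connect(server: str) -> bool:
    """Hypothetical stand-in for a real connection attempt (assumption)."""
    if server == "171.67.108.200:80":
        raise ConnectionRefusedError("Connection refused")
    return True


servers = ["171.67.108.200:80", "171.67.108.204:80"]
print(first_reachable(servers, fake_connect))  # prints 171.67.108.204:80
```

A single refused connection followed by a successful assignment is the expected outcome of this kind of loop, which is why it only matters if the failures persist.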

Re: Re: New AS testing

Posted: Mon Oct 05, 2015 8:13 pm
by billford
OK, thanks.

Re: Re: New AS testing

Posted: Mon Oct 05, 2015 8:55 pm
by Nathan_P
A similar thing is happening here, I think:

Code:

[20:34:52] Folding@home Core Shutdown: FINISHED_UNIT
[20:34:52] CoreStatus = 64 (100)
[20:34:52] Unit 1 finished with 99 percent of time to deadline remaining.
[20:34:52] Updated performance fraction: 0.978783
[20:34:52] Sending work to server
[20:34:52] Project: 7527 (Run 13, Clone 5, Gen 142)


[20:34:52] + Attempting to send results [October 5 20:34:52 UTC]
[20:34:52] - Reading file work/wuresults_01.dat from core
[20:34:52]   (Read 9164614 bytes from disk)
[20:34:52] Connecting to http://128.143.199.97:8080/
[20:36:51] Posted data.
[20:36:51] Initial: 0000; - Uploaded at ~75 kB/s
[20:36:51] - Averaged speed for that direction ~70 kB/s
[20:36:51] + Results successfully sent
[20:36:51] Thank you for your contribution to Folding@Home.
[20:36:51] + Number of Units Completed: 1545

[20:36:51] Trying to send all finished work units
[20:36:51] + No unsent completed units remaining.
[20:36:51] - Preparing to get new work unit...
[20:36:51] Cleaning up work directory
[20:36:51] + Attempting to get work packet
[20:36:51] Passkey found
[20:36:51] - Will indicate memory of 15994 MB
[20:36:51] - Connecting to assignment server
[20:36:51] Connecting to http://assign.stanford.edu:8080/
[20:36:52] Posted data.
[20:36:52] Initial: 0000; + No appropriate work server was available; will try again in a bit.
[20:36:52] + Couldn't get work instructions.
[20:36:52] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[20:37:05] + Attempting to get work packet
[20:37:05] Passkey found
[20:37:05] - Will indicate memory of 15994 MB
[20:37:05] - Connecting to assignment server
[20:37:05] Connecting to http://assign.stanford.edu:8080/
[20:37:05] Posted data.
[20:37:05] Initial: 0000; + No appropriate work server was available; will try again in a bit.
[20:37:05] + Couldn't get work instructions.
[20:37:05] - Attempt #2  to get work failed, and no other work to do.
Waiting before retry.
[20:37:15] + Attempting to get work packet
[20:37:15] Passkey found
[20:37:15] - Will indicate memory of 15994 MB
[20:37:15] - Connecting to assignment server
[20:37:15] Connecting to http://assign.stanford.edu:8080/
[20:37:16] Posted data.
[20:37:16] Initial: 0000; + No appropriate work server was available; will try again in a bit.
[20:37:16] + Couldn't get work instructions.
[20:37:16] - Attempt #3  to get work failed, and no other work to do.
Waiting before retry.
[20:37:44] + Attempting to get work packet
[20:37:44] Passkey found
[20:37:44] - Will indicate memory of 15994 MB
[20:37:44] - Connecting to assignment server
[20:37:44] Connecting to http://assign.stanford.edu:8080/
[20:37:45] Posted data.
[20:37:45] Initial: 0000; + No appropriate work server was available; will try again in a bit.
[20:37:45] + Couldn't get work instructions.
[20:37:45] - Attempt #4  to get work failed, and no other work to do.
Waiting before retry.
[20:38:26] + Attempting to get work packet
[20:38:26] Passkey found
[20:38:26] - Will indicate memory of 15994 MB
[20:38:26] - Connecting to assignment server
[20:38:26] Connecting to http://assign.stanford.edu:8080/
[20:38:27] Posted data.
[20:38:27] Initial: 0000; + No appropriate work server was available; will try again in a bit.
[20:38:27] + Couldn't get work instructions.
[20:38:27] - Attempt #5  to get work failed, and no other work to do.
Waiting before retry.
[20:39:49] + Attempting to get work packet
[20:39:49] Passkey found
[20:39:49] - Will indicate memory of 15994 MB
[20:39:49] - Connecting to assignment server
[20:39:49] Connecting to http://assign.stanford.edu:8080/
[20:39:50] Posted data.
[20:39:50] Initial: 0000; + No appropriate work server was available; will try again in a bit.
[20:39:50] + Couldn't get work instructions.
[20:39:50] - Attempt #6  to get work failed, and no other work to do.
Waiting before retry.
[20:42:39] + Attempting to get work packet
[20:42:39] Passkey found
[20:42:39] - Will indicate memory of 15994 MB
[20:42:39] - Connecting to assignment server
[20:42:39] Connecting to http://assign.stanford.edu:8080/
[20:42:40] Posted data.
[20:42:40] Initial: 0000; + No appropriate work server was available; will try again in a bit.
[20:42:40] + Couldn't get work instructions.
[20:42:40] - Attempt #7  to get work failed, and no other work to do.
Waiting before retry.
[20:48:05] + Attempting to get work packet
[20:48:05] Passkey found
[20:48:05] - Will indicate memory of 15994 MB
[20:48:05] - Connecting to assignment server
[20:48:05] Connecting to http://assign.stanford.edu:8080/
[20:48:05] Posted data.
[20:48:05] Initial: 0000; + No appropriate work server was available; will try again in a bit.
[20:48:05] + Couldn't get work instructions.
[20:48:05] - Attempt #8  to get work failed, and no other work to do.
Waiting before retry.

Re: Re: New AS testing

Posted: Mon Oct 05, 2015 10:35 pm
by jcoffland
@Nathan_P, I don't see your client trying assign2. What version is it?

Re: Re: New AS testing

Posted: Mon Oct 05, 2015 10:50 pm
by 7im
Looks like a v6.xx client.

[20:36:51] + Number of Units Completed: 1545

Re: Re: New AS testing

Posted: Tue Oct 06, 2015 12:03 am
by Ricky
Is this related?

Code:

22:22:55:WU02:FS00:0xa4:Completed 76000 out of 80000 steps  (95%)
22:23:30:WU02:FS00:0xa4:Completed 76800 out of 80000 steps  (96%)
22:24:05:WU02:FS00:0xa4:Completed 77600 out of 80000 steps  (97%)
22:24:41:WU02:FS00:0xa4:Completed 78400 out of 80000 steps  (98%)
22:25:16:WU02:FS00:0xa4:Completed 79200 out of 80000 steps  (99%)
22:25:16:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
22:25:16:ERROR:WU01:FS00:Exception: Could not get an assignment
22:25:16:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
22:25:16:ERROR:WU01:FS00:Exception: Could not get an assignment
22:25:51:WU02:FS00:0xa4:Completed 80000 out of 80000 steps  (100%)
22:25:53:WU02:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
22:26:03:WU02:FS00:0xa4:
22:26:03:WU02:FS00:0xa4:Finished Work Unit:
22:26:03:WU02:FS00:0xa4:- Reading up to 4123296 from "02/wudata_01.trr": Read 4123296
22:26:03:WU02:FS00:0xa4:trr file hash check passed.
22:26:03:WU02:FS00:0xa4:- Reading up to 3193456 from "02/wudata_01.xtc": Read 3193456
22:26:03:WU02:FS00:0xa4:xtc file hash check passed.
22:26:03:WU02:FS00:0xa4:edr file hash check passed.
22:26:03:WU02:FS00:0xa4:logfile size: 19892
22:26:03:WU02:FS00:0xa4:Leaving Run
22:26:07:WU02:FS00:0xa4:- Writing 7339036 bytes of core data to disk...
22:26:08:WU02:FS00:0xa4:Done: 7338524 -> 7067344 (compressed to 96.3 percent)
22:26:08:WU02:FS00:0xa4:  ... Done.
22:26:10:WU02:FS00:0xa4:- Shutting down core
22:26:10:WU02:FS00:0xa4:
22:26:10:WU02:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
22:26:10:WU02:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
22:26:10:WU02:FS00:Sending unit results: id:02 state:SEND error:NO_ERROR project:9752 run:1550 clone:0 gen:431 core:0xa4 unit:0x00000223ab404163554173b4c5af60d6
22:26:10:WU02:FS00:Uploading 6.74MiB to 171.64.65.99
22:26:10:WU02:FS00:Connecting to 171.64.65.99:8080
22:26:16:WU02:FS00:Upload 47.29%
22:26:16:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
22:26:16:ERROR:WU01:FS00:Exception: Could not get an assignment
22:26:20:WU02:FS00:Upload complete
22:26:20:WU02:FS00:Server responded WORK_ACK (400)
22:26:20:WU02:FS00:Final credit estimate, 4222.00 points
22:26:20:WU02:FS00:Cleaning up
22:27:53:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
22:27:53:ERROR:WU01:FS00:Exception: Could not get an assignment
22:30:31:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
22:30:31:ERROR:WU01:FS00:Exception: Could not get an assignment
22:34:45:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
22:34:45:ERROR:WU01:FS00:Exception: Could not get an assignment
22:41:37:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
22:41:37:ERROR:WU01:FS00:Exception: Could not get an assignment
22:52:42:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
22:52:42:ERROR:WU01:FS00:Exception: Could not get an assignment
23:10:39:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
23:10:39:ERROR:WU01:FS00:Exception: Could not get an assignment
23:39:41:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
23:39:41:ERROR:WU01:FS00:Exception: Could not get an assignment

Note: I had a couple of power outages earlier today.

Re: Re: New AS testing

Posted: Tue Oct 06, 2015 12:20 am
by jcoffland
@Ricky, yes. It should be sorted out now.

Re: Re: New AS testing

Posted: Tue Oct 06, 2015 12:28 am
by Ricky
Still have the problem.

Code:

00:13:34:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
00:13:34:ERROR:WU01:FS00:Exception: Could not get an assignment
00:13:47:WU00:FS01:0x21:Completed 524800 out of 640000 steps (82%)
00:15:40:WU00:FS01:0x21:Completed 531200 out of 640000 steps (83%)
00:16:11:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
00:16:11:ERROR:WU01:FS00:Exception: Could not get an assignment
00:17:33:WU00:FS01:0x21:Completed 537600 out of 640000 steps (84%)
00:19:26:WU00:FS01:0x21:Completed 544000 out of 640000 steps (85%)
00:20:25:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
00:20:25:ERROR:WU01:FS00:Exception: Could not get an assignment
00:21:07:WU03:FS02:0x18:Completed 12640000 out of 16000000 steps (79%)
00:21:19:WU00:FS01:0x21:Completed 550400 out of 640000 steps (86%)
00:23:11:WU00:FS01:0x21:Completed 556800 out of 640000 steps (87%)
00:25:38:WU00:FS01:0x21:Completed 563200 out of 640000 steps (88%)
00:27:16:WARNING:WU01:FS00:Exception: Could not get IP address for assign3.stanford.edu: No such host is known. 
00:27:16:ERROR:WU01:FS00:Exception: Could not get an assignment
00:27:31:WU00:FS01:0x21:Completed 569600 out of 640000 steps (89%)

Re: Re: New AS testing

Posted: Tue Oct 06, 2015 1:02 am
by Ricky
I rebooted my router and computer. I am OK now. Thanks!

Re: Re: New AS testing

Posted: Tue Oct 06, 2015 2:09 am
by jcoffland
Yes, it's possible that your router, OS, or ISP could cache the wrong address for the domain name. It will clear up on its own when the cache times out.
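
If you want to check whether a hostname currently resolves on your machine, something like the following sketch may help. It only uses Python's standard `socket` module; the helper name `resolve` is made up for this example, and it tells you nothing about where along the chain (OS, router, ISP) a bad entry is cached.

```python
import socket
from typing import List


def resolve(host: str) -> List[str]:
    """Return the sorted IP addresses the local resolver gives for host.

    An empty list means the lookup failed, which corresponds to the
    "Could not get IP address" warning in the log above.
    """
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("assign3.stanford.edu")` and getting an empty list would suggest the bad entry is still cached somewhere. On Windows, `ipconfig /flushdns` clears the OS-level cache, but a router or ISP cache would still have to expire or be restarted.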

Re: Re: New AS testing

Posted: Tue Oct 06, 2015 4:22 pm
by Nathan_P
7im's correct, this is a v6.34 client.

Hmm, I rebooted the rig this afternoon and it still won't connect to the SMP server. It has, however, connected to the bigadv server and is now working on an 8108. I'll see if there are any issues with the SMP server.

Re: New AS testing

Posted: Wed Oct 07, 2015 3:04 pm
by Gary480six
I believe there is still a problem with the new assignment server and the V6.34 SMP client.

This is an i5-2500 stock clocked running Windows.

After the previously suggested restart of FAH, it spent the next 24 hours in an endless loop of:

Code:

[01:59:41] + Attempting to get work packet
[01:59:41] Passkey found
[01:59:41] - Connecting to assignment server
[01:59:41] + No appropriate work server was available; will try again in a bit.
[01:59:41] + Couldn't get work instructions.
[01:59:41] - Attempt #2  to get work failed, and no other work to do.
Waiting before retry.
[01:59:55] + Attempting to get work packet
[01:59:55] Passkey found
[01:59:55] - Connecting to assignment server
[01:59:55] + No appropriate work server was available; will try again in a bit.
[01:59:55] + Couldn't get work instructions.
[01:59:55] - Attempt #3  to get work failed, and no other work to do.
Waiting before retry.

Then, last night, it picked up one P7527 on server 128.143.199.97.

When it finished and returned that work unit successfully, we were back to several hours of:

Code:

[13:41:25] + Attempting to get work packet
[13:41:25] Passkey found
[13:41:25] - Connecting to assignment server
[13:41:25] + No appropriate work server was available; will try again in a bit.
[13:41:25] + Couldn't get work instructions.
[13:41:25] - Attempt #10  to get work failed, and no other work to do.
Waiting before retry.
[14:24:20] + Attempting to get work packet
[14:24:20] Passkey found
[14:24:20] - Connecting to assignment server
[14:24:21] + No appropriate work server was available; will try again in a bit.
[14:24:21] + Couldn't get work instructions.
[14:24:21] - Attempt #11  to get work failed, and no other work to do.
Waiting before retry.

Nothing would have changed with the system during the time it did actually get work - the owner was not even home.

What should I suggest the owner of this PC try next?

Re: Re: New AS testing

Posted: Wed Oct 07, 2015 6:34 pm
by bruce
That doesn't look like a problem with the new AS hardware.

You SHOULD get this message whenever there are no WUs that satisfy all the restrictions his system places on the WUs he can process. His choice of WUs may be too restrictive (maybe 128.143.199.97 is the only server he can use?), and that server may be short of WUs.

For a comment on his restrictions and what he might change, he will need to post the information requested below.
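
To illustrate the point about restrictions, here is a purely hypothetical model of the matching. It is not the real assignment server logic: the `WorkServer` fields, the thread limit, and the server names are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkServer:
    name: str
    max_threads: int     # hypothetical per-server limit on client thread count
    available_wus: int   # WUs currently ready to hand out


def eligible_servers(servers: List[WorkServer],
                     client_threads: int) -> List[WorkServer]:
    """Keep only servers that have work and accept this client's thread count."""
    return [s for s in servers
            if s.available_wus > 0 and client_threads <= s.max_threads]


servers = [
    WorkServer("server-a", max_threads=24, available_wus=500),
    WorkServer("server-b", max_threads=64, available_wus=0),
]

if not eligible_servers(servers, client_threads=48):
    print("No appropriate work server was available")
```

In this toy setup, one server has work but rejects the client's thread count, and the other accepts it but has nothing to hand out, so the client is told no server is available even though every server is up.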

Re: Re: New AS testing

Posted: Wed Oct 07, 2015 7:42 pm
by Nathan_P
128.143.199.97 is the only server that I have seen that will allocate normal SMP jobs to machines with more than 24 threads; that's the one I was trying to connect to the other day. According to server status, the server is OK and has 2720 WUs available. I haven't had time to sit down with the machine and see if it will connect yet. I'll also try my v7 machine tomorrow and see if that helps.

Both rigs are 48-thread, so the variety of non-BA WUs is limited.