Issue with 155.247.164.213

Posted: Wed Sep 16, 2020 6:07 pm
by gs60
Hello,

It appears that every time I pick up 155.247.164.213 as the assignment server, the CPU thread just hangs. I thought there was a timeout for connect failures, but it appears to be on the order of several hours? According to the server status page, this server is assigning around 100 tasks per minute, but it doesn't seem to like me. If I open the server in a browser from the status page, I do not see the basic Folding@home web page with the server type and version. What I get instead is a TCP/IP error, ERR_CONNECTION_RESET. I wonder if the client stays in an endless loop when this type of error happens? The only way I can clear it and get the CPU task thread running again is to pause and shut down the client (the GPU thread is active), use the Task Manager (Windows 10) to find and kill the hung fahclient process, and restart the client. This gets me a new assignment server for the CPU thread, the GPU thread picks up where it left off, and all is well again.

One thing to note on this: if you pause and quit the client, only the system tray icon exits. The fahclient process, which I suspect is still trying to access this server, does not respond to the termination request. If you try to restart the client without first killing that process from the Task Manager, it will just hang and not restart, leaving both the CPU and GPU threads dead.
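
For anyone who wants to repeat the browser check described above without waiting on the client, here is a minimal sketch in Python. It opens a TCP connection with an explicit timeout, sends a bare HTTP request, and reports whether the server answered, reset the connection, or timed out. The function name `probe` and the 5-second default are my own choices for illustration and are not taken from FAHClient.

```python
import socket


def probe(host, port, timeout=5.0):
    """Try one HTTP request against host:port and describe what happened.

    Hypothetical helper for illustration; the timeout applies to the
    connect, send, and receive steps alike.
    """
    request = b"GET / HTTP/1.0\r\nHost: " + host.encode("ascii") + b"\r\n\r\n"
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            data = sock.recv(1024)
            # An empty read means the peer closed without sending anything.
            return "responded" if data else "closed"
    except socket.timeout:
        return "timeout"
    except ConnectionResetError:
        return "reset"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        return "error: {}".format(exc)
```

Calling `probe("155.247.164.213", 8080)` and then the same address on port 80 mirrors the two ports the client tries. A result of "reset" would correspond to what the browser reports as ERR_CONNECTION_RESET.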

Thank you!

Re: Issue with 155.247.164.213

Posted: Wed Sep 16, 2020 9:37 pm
by JimF
I am starting to get assignments from "assign1.foldingathome.org" up to "assign4.foldingathome.org".
I have not had problems all day, so I think it is hopefully fixed.

Re: Issue with 155.247.164.213

Posted: Sat Sep 19, 2020 12:02 pm
by JimF
I am now not getting work on any of them. They are practicing social distancing.

Code:

11:54:52:WU00:FS00:0xa7:Completed 247500 out of 250000 steps (99%)
11:54:53:WU01:FS00:Connecting to assign1.foldingathome.org:80
11:54:53:WARNING:WU01:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
11:54:53:WU01:FS00:Connecting to assign2.foldingathome.org:80
11:54:53:WARNING:WU01:FS00:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
11:54:53:WU01:FS00:Connecting to assign3.foldingathome.org:80
11:54:54:WARNING:WU01:FS00:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
11:54:54:WU01:FS00:Connecting to assign4.foldingathome.org:80
11:54:54:WARNING:WU01:FS00:Failed to get assignment from 'assign4.foldingathome.org:80': No WUs available for this configuration
11:54:54:ERROR:WU01:FS00:Exception: Could not get an assignment
11:54:54:WU01:FS00:Connecting to assign1.foldingathome.org:80
11:54:54:WARNING:WU01:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
11:54:54:WU01:FS00:Connecting to assign2.foldingathome.org:80
11:54:55:WARNING:WU01:FS00:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
11:54:55:WU01:FS00:Connecting to assign3.foldingathome.org:80
11:54:55:WARNING:WU01:FS00:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
11:54:55:WU01:FS00:Connecting to assign4.foldingathome.org:80
11:54:55:WARNING:WU01:FS00:Failed to get assignment from 'assign4.foldingathome.org:80': No WUs available for this configuration
11:54:55:ERROR:WU01:FS00:Exception: Could not get an assignment
11:55:10:WU00:FS00:0xa7:Completed 250000 out of 250000 steps (100%)
11:55:11:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
11:55:11:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:14259 run:0 clone:27 gen:432 core:0xa7 unit:0x000001f4cedfaa925eac9e0cf54f515d
11:55:11:WU00:FS00:Uploading 2.85MiB to 206.223.170.146
11:55:11:WU00:FS00:Connecting to 206.223.170.146:8080
11:55:13:WU00:FS00:Upload complete
11:55:13:WU00:FS00:Server responded WORK_ACK (400)
11:55:13:WU00:FS00:Final credit estimate, 8608.00 points
11:55:13:WU00:FS00:Cleaning up
11:55:54:WU01:FS00:Connecting to assign1.foldingathome.org:80
11:55:54:WARNING:WU01:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
11:55:54:WU01:FS00:Connecting to assign2.foldingathome.org:80
11:55:54:WARNING:WU01:FS00:Failed to get assignment from 'assign2.foldingathome.org:80': No WUs available for this configuration
11:55:54:WU01:FS00:Connecting to assign3.foldingathome.org:80
11:55:55:WARNING:WU01:FS00:Failed to get assignment from 'assign3.foldingathome.org:80': No WUs available for this configuration
11:55:55:WU01:FS00:Connecting to assign4.foldingathome.org:80
11:55:55:WARNING:WU01:FS00:Failed to get assignment from 'assign4.foldingathome.org:80': No WUs available for this configuration
11:55:55:ERROR:WU01:FS00:Exception: Could not get an assignment
11:57:31:WU01:FS00:Connecting to assign1.foldingathome.org:80
11:57:32:WU01:FS00:Assigned to work server 155.247.164.213
11:57:32:WU01:FS00:Requesting new work unit for slot 00: READY cpu:24 from 155.247.164.213
11:57:32:WU01:FS00:Connecting to 155.247.164.213:8080
11:58:10:FS00:Paused
11:58:58:Removing old file 'configs/config-20200529-170314.xml'
11:58:58:Saving configuration to /etc/fahclient/config.xml
11:58:58:<config>
11:58:58:  <!-- Client Control -->
11:58:58:  <fold-anon v='true'/>
11:58:58:
11:58:58:  <!-- Folding Core -->
11:58:58:  <core-priority v='low'/>
11:58:58:
11:58:58:  <!-- Folding Slot Configuration -->
11:58:58:  <client-type v='advanced'/>
11:58:58:
11:58:58:  <!-- HTTP Server -->
11:58:58:  <allow v='0.0.0.0/0'/>
11:58:58:
11:58:58:  <!-- Network -->
11:58:58:  <proxy v=':8080'/>
11:58:58:
11:58:58:  <!-- Remote Command Server -->
11:58:58:  <command-allow-no-pass v='0.0.0.0/0'/>
11:58:58:
11:58:58:  <!-- Slot Control -->
11:58:58:  <power v='full'/>
11:58:58:
11:58:58:  <!-- User Information -->
11:58:58:  <passkey v='*****'/>
11:58:58:  <user v='Jim1348'/>
11:58:58:
11:58:58:  <!-- Folding Slots -->
11:58:58:  <slot id='0' type='CPU'>
11:58:58:    <paused v='true'/>
11:58:58:  </slot>
11:58:58:</config>
11:59:42:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
11:59:42:WU01:FS00:Connecting to 155.247.164.213:80
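
A log like the one above is easier to read once the repeated failures are tallied. The following is a rough sketch, assuming the log has been saved as plain text. It counts the "Failed to get assignment" warnings per server and reason, and it drops terminal color codes if any are present. The name `summarize` and the regular expressions are my own; they are not part of FAHClient.

```python
import re
from collections import Counter

# Matches terminal color codes, either as a real ESC byte or as the
# literal text "\x1b" that sometimes survives copy and paste.
ANSI = re.compile(r"(?:\x1b|\\x1b)\[[0-9;]*m")

# Captures the server and the reason from an assignment warning line.
FAILED = re.compile(r"Failed to get assignment from '([^']+)': (.+)$")


def summarize(log_text):
    """Count assignment failures per (server, reason) pair."""
    failures = Counter()
    for raw in log_text.splitlines():
        line = ANSI.sub("", raw).strip()
        match = FAILED.search(line)
        if match:
            failures[(match.group(1), match.group(2))] += 1
    return failures
```

Feeding it the text of log.txt gives one count per assignment server and reason, which makes it quick to see whether every server reports "No WUs available for this configuration" or only some of them.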

Re: Issue with 155.247.164.213

Posted: Sat Sep 19, 2020 2:49 pm
by Joe_H
Without knowing what your client is requesting work for, I can only guess that the WUs that would be assigned to those settings are currently out.

Re: Issue with 155.247.164.213

Posted: Sat Sep 19, 2020 9:08 pm
by JimF
It was CPU work. I would think there would be plenty of it.
I am pausing it for a while.

Code:

*********************** Log Started 2020-09-19T16:53:18Z ***********************
16:53:18:Trying to access database...
16:53:18:Successfully acquired database lock
16:53:18:Read GPUs.txt
16:53:18:Enabled folding slot 00: PAUSED cpu:24 (by user)
16:53:18:****************************** FAHClient ******************************
16:53:18:        Version: 7.6.13
16:53:18:         Author: Joseph Coffland <[email protected]>
16:53:18:      Copyright: 2020 foldingathome.org
16:53:18:       Homepage: https://foldingathome.org/
16:53:18:           Date: Apr 28 2020
16:53:18:           Time: 04:20:16
16:53:18:       Revision: 5a652817f46116b6e135503af97f18e094414e3b
16:53:18:         Branch: master
16:53:18:       Compiler: GNU 8.3.0
16:53:18:        Options: -std=c++11 -ffunction-sections -fdata-sections -O3
16:53:18:                 -funroll-loops -fno-pie
16:53:18:       Platform: linux2 4.19.0-5-amd64
16:53:18:           Bits: 64
16:53:18:           Mode: Release
16:53:18:           Args: --child /etc/fahclient/config.xml --run-as fahclient
16:53:18:                 --pid-file=/var/run/fahclient.pid --daemon
16:53:18:         Config: /etc/fahclient/config.xml
16:53:18:******************************** CBang ********************************
16:53:18:           Date: Apr 25 2020
16:53:18:           Time: 00:07:53
16:53:18:       Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
16:53:18:         Branch: master
16:53:18:       Compiler: GNU 8.3.0
16:53:18:        Options: -std=c++11 -ffunction-sections -fdata-sections -O3
16:53:18:                 -funroll-loops -fno-pie -fPIC
16:53:18:       Platform: linux2 4.19.0-5-amd64
16:53:18:           Bits: 64
16:53:18:           Mode: Release
16:53:18:******************************* System ********************************
16:53:18:            CPU: AMD Ryzen 9 3900X 12-Core Processor
16:53:18:         CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
16:53:18:           CPUs: 24
16:53:18:         Memory: 47.00GiB
16:53:18:    Free Memory: 46.03GiB
16:53:18:        Threads: POSIX_THREADS
16:53:18:     OS Version: 5.4
16:53:18:    Has Battery: false
16:53:18:     On Battery: false
16:53:18:     UTC Offset: -4
16:53:18:            PID: 1565
16:53:18:            CWD: /var/lib/fahclient
16:53:18:             OS: Linux 5.4.0-47-generic x86_64
16:53:18:        OS Arch: AMD64
16:53:18:           GPUs: 1
16:53:18:          GPU 0: Bus:11 Slot:0 Func:0 NVIDIA:7 GP104 [GeForce GTX 1070] 6463
16:53:18:  CUDA Device 0: Platform:0 Device:0 Bus:11 Slot:0 Compute:6.1 Driver:11.0
16:53:18:OpenCL Device 0: Platform:0 Device:0 Bus:11 Slot:0 Compute:1.2 Driver:450.66
16:53:18:******************************* libFAH ********************************
16:53:18:           Date: Apr 15 2020
16:53:18:           Time: 21:43:24
16:53:18:       Revision: 216968bc7025029c841ed6e36e81a03a316890d3
16:53:18:         Branch: master
16:53:18:       Compiler: GNU 8.3.0
16:53:18:        Options: -std=c++11 -ffunction-sections -fdata-sections -O3
16:53:18:                 -funroll-loops -fno-pie
16:53:18:       Platform: linux2 4.19.0-5-amd64
16:53:18:           Bits: 64
16:53:18:           Mode: Release
16:53:18:***********************************************************************
16:53:18:<config>

Re: Issue with 155.247.164.213

Posted: Sat Sep 19, 2020 9:58 pm
by gs60
JimF, I'm seeing the same thing on a Ryzen 5 3600. The GPU is being kept busy though! I figure they'll have us heating the room again in no time, lol. Hopefully with more A8 core projects.

Re: Issue with 155.247.164.213

Posted: Sat Sep 19, 2020 10:12 pm
by Joe_H
There is a batch of new servers at Temple that they are working on getting online. One was online, but there were some issues and it has been taken offline while they get those sorted out. It listed A8 as the project type when it was online, so once it is back that could alleviate the problem. Both of you have CPUs with larger core counts, so the shortage may be in WUs that can utilize them.

Re: Issue with 155.247.164.213

Posted: Sat Sep 19, 2020 10:43 pm
by JimF
OK, thanks for checking. I think I am a little ahead of the game with the large Ryzens. I will start smaller next time and work up.

Re: Issue with 155.247.164.213

Posted: Mon Sep 21, 2020 7:56 am
by PantherX
FYI, there seems to be a shortage of CPU WUs, which is being managed, so hopefully new CPU WUs will be coming very soon :)

Re: Issue with 155.247.164.213

Posted: Tue Sep 22, 2020 3:17 pm
by Jonazz
While I haven't encountered any shortages, I have been getting more non-covid projects lately.

Re: Issue with 155.247.164.213

Posted: Wed Sep 23, 2020 8:16 am
by PantherX
Please note that the researcher is aware of that and will be looking into it when possible to see if something needs to be done or not. Currently, the focus is on getting FahCore_22 version 0.0.13 (CUDA enabled) released to full to ensure that all supported Nvidia GPUs (Kepler or better) can take advantage of the significant speed-up. After that, hopefully, new Projects will use that and existing Projects can migrate to it. This all takes careful planning to ensure that nothing bad happens as we would like this to be a transparent process for the Donors :)