Re: Not getting any WU

Posted: Mon Apr 13, 2020 11:13 am
by Neil-B
If you post your log (including the top 200 lines or so with the system configuration), someone might be able to help you diagnose why your GPU isn't working - it may be that the servers are busy, but it may be something in your setup ... See viewtopic.php?f=24&t=26036 for help on posting logs ... Your previous crop of the log showed "GPUs: 0", which may be an issue (but I may be wrong), as mine shows this type of thing:

Code:

08:00:59:           GPUs: 1
08:00:59:          GPU 0: Bus:3 Slot:0 Func:0 NVIDIA:3 GK107 [Quadro K420]
08:00:59:  CUDA Device 0: Platform:0 Device:0 Bus:3 Slot:0 Compute:3.0 Driver:10.2
08:00:59:OpenCL Device 0: Platform:0 Device:0 Bus:3 Slot:0 Compute:1.2 Driver:442.74
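
If you'd rather not eyeball the header, here is a minimal sketch of checking a saved log for that "GPUs:" line. It assumes the header line looks like the one above; the function name `detected_gpu_count` and the `sample` text are made up for illustration and are not part of FAHClient.

```python
import re

# Matches the header line that reports how many GPUs the client detected,
# e.g. "08:00:59:           GPUs: 1" (format assumed from the excerpt above).
GPU_COUNT_RE = re.compile(r"^\d{2}:\d{2}:\d{2}:\s*GPUs:\s*(\d+)\s*$")


def detected_gpu_count(log_text):
    """Return the GPU count from the first 'GPUs:' header line, or None."""
    for line in log_text.splitlines():
        match = GPU_COUNT_RE.match(line)
        if match:
            return int(match.group(1))
    return None  # no header line found, e.g. the log was cropped


# Hypothetical sample taken from the two header lines quoted above.
sample = """\
08:00:59:           GPUs: 1
08:00:59:          GPU 0: Bus:3 Slot:0 Func:0 NVIDIA:3 GK107 [Quadro K420]
"""
print(detected_gpu_count(sample))
```

A result of 0 would match the "GPUs: 0" case mentioned above, and None would suggest the header was cropped out of the posted log.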

Re: Not getting any WU

Posted: Mon Apr 13, 2020 2:49 pm
by bloblobl0
Well this is my log, hope somebody can tell me why I can't utilize my GPU

Code:

*********************** Log Started 2020-04-13T10:50:32Z ***********************
10:50:32:WU00:FS02:0xa7:************************** Gromacs Folding@home Core ***************************
10:50:32:WU00:FS02:0xa7:       Type: 0xa7
10:50:32:WU00:FS02:0xa7:       Core: Gromacs
10:50:32:WU00:FS02:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 9396 -checkpoint 15 -np 1
10:50:32:WU00:FS02:0xa7:************************************ CBang *************************************
10:50:32:WU00:FS02:0xa7:       Date: Oct 26 2019
10:50:32:WU00:FS02:0xa7:       Time: 01:38:25
10:50:32:WU00:FS02:0xa7:   Revision: c46a1a011a24143739ac7218c5a435f66777f62f
10:50:32:WU00:FS02:0xa7:     Branch: master
10:50:32:WU00:FS02:0xa7:   Compiler: Visual C++ 2008
10:50:32:WU00:FS02:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
10:50:32:WU00:FS02:0xa7:   Platform: win32 10
10:50:33:WU00:FS02:0xa7:       Bits: 64
10:50:33:WU00:FS02:0xa7:       Mode: Release
10:50:33:WU00:FS02:0xa7:************************************ System ************************************
10:50:33:WU00:FS02:0xa7:        CPU: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
10:50:33:WU00:FS02:0xa7:     CPU ID: GenuineIntel Family 6 Model 61 Stepping 4
10:50:33:WU00:FS02:0xa7:       CPUs: 4
10:50:33:WU00:FS02:0xa7:     Memory: 7.91GiB
10:50:33:WU00:FS02:0xa7:Free Memory: 4.85GiB
10:50:33:WU00:FS02:0xa7:    Threads: WINDOWS_THREADS
10:50:33:WU00:FS02:0xa7: OS Version: 6.2
10:50:33:WU00:FS02:0xa7:Has Battery: true
10:50:33:WU00:FS02:0xa7: On Battery: false
10:50:33:WU00:FS02:0xa7: UTC Offset: -7
10:50:33:WU00:FS02:0xa7:        PID: 11192
10:50:33:WU00:FS02:0xa7:        CWD: C:\Users\Dick Thunderson\AppData\Roaming\FAHClient\work
10:50:33:WU00:FS02:0xa7:******************************** Build - libFAH ********************************
10:50:33:WU00:FS02:0xa7:    Version: 0.0.18
10:50:33:WU00:FS02:0xa7:     Author: Joseph Coffland <[email protected]>
10:50:33:WU00:FS02:0xa7:  Copyright: 2019 foldingathome.org
10:50:33:WU00:FS02:0xa7:   Homepage: https://foldingathome.org/
10:50:33:WU00:FS02:0xa7:       Date: Oct 26 2019
10:50:33:WU00:FS02:0xa7:       Time: 01:52:30
10:50:33:WU00:FS02:0xa7:   Revision: c1e3513b1bc0c16013668f2173ee969e5995b38e
10:50:33:WU00:FS02:0xa7:     Branch: master
10:50:33:WU00:FS02:0xa7:   Compiler: Visual C++ 2008
10:50:33:WU00:FS02:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
10:50:33:WU00:FS02:0xa7:   Platform: win32 10
10:50:33:WU00:FS02:0xa7:       Bits: 64
10:50:33:WU00:FS02:0xa7:       Mode: Release
10:50:33:WU00:FS02:0xa7:************************************ Build *************************************
10:50:33:WU00:FS02:0xa7:       SIMD: avx_256
10:50:33:WU00:FS02:0xa7:********************************************************************************
10:50:33:WU00:FS02:0xa7:Project: 16417 (Run 1258, Clone 4, Gen 14)
10:50:33:WU00:FS02:0xa7:Unit: 0x0000001096880e6e5e8a61a06950cbef
10:50:33:WU00:FS02:0xa7:Digital signatures verified
10:50:33:WU00:FS02:0xa7:Calling: mdrun -s frame14.tpr -o frame14.trr -x frame14.xtc -cpi state.cpt -cpt 15 -nt 1
10:50:33:WU00:FS02:0xa7:Steps: first=3500000 total=250000
10:50:34:WU00:FS02:0xa7:Completed 45422 out of 250000 steps (18%)
10:51:13:Removing old file 'configs/config-20200413-062451.xml'
10:51:13:Saving configuration to config.xml
10:51:13:<config>
10:51:13:  <!-- Folding Core -->
10:51:13:  <core-priority v='low'/>
10:51:13:
10:51:13:  <!-- Network -->
10:51:13:  <proxy v=':8080'/>
10:51:13:
10:51:13:  <!-- Slot Control -->
10:51:13:  <pause-on-battery v='false'/>
10:51:13:  <power v='FULL'/>
10:51:13:
10:51:13:  <!-- User Information -->
10:51:13:  <passkey v='********************************'/>
10:51:13:  <team v='223518'/>
10:51:13:  <user v='Anon'/>
10:51:13:
10:51:13:  <!-- Folding Slots -->
10:51:13:  <slot id='0' type='CPU'>
10:51:13:    <cpus v='2'/>
10:51:13:  </slot>
10:51:13:  <slot id='2' type='CPU'/>
10:51:13:  <slot id='1' type='GPU'/>
10:51:13:</config>
10:51:30:WU02:FS01:Connecting to 65.254.110.245:8080
10:51:31:WARNING:WU02:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
10:51:31:WU02:FS01:Connecting to 18.218.241.186:80
10:51:31:WARNING:WU02:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
10:51:31:ERROR:WU02:FS01:Exception: Could not get an assignment
10:53:08:WU02:FS01:Connecting to 65.254.110.245:8080
10:53:08:WARNING:WU02:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
10:53:08:WU02:FS01:Connecting to 18.218.241.186:80
10:53:09:WARNING:WU02:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
10:53:09:ERROR:WU02:FS01:Exception: Could not get an assignment
10:55:45:WU02:FS01:Connecting to 65.254.110.245:8080
10:55:45:WARNING:WU02:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
10:55:45:WU02:FS01:Connecting to 18.218.241.186:80
10:55:46:WARNING:WU02:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
10:55:46:ERROR:WU02:FS01:Exception: Could not get an assignment
10:57:14:WU00:FS02:0xa7:Completed 47500 out of 250000 steps (19%)
10:57:59:WU01:FS00:0xa7:Completed 55000 out of 250000 steps (22%)
10:59:59:WU02:FS01:Connecting to 65.254.110.245:8080
10:59:59:WARNING:WU02:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
10:59:59:WU02:FS01:Connecting to 18.218.241.186:80
11:00:00:WU02:FS01:Assigned to work server 128.252.203.10
11:00:00:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GK208 [GeForce 920M] from 128.252.203.10
11:00:00:WU02:FS01:Connecting to 128.252.203.10:8080
11:00:21:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
11:00:21:WU02:FS01:Connecting to 128.252.203.10:80
11:00:42:ERROR:WU02:FS01:Exception: Failed to connect to 128.252.203.10:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

Re: Not getting any WU

Posted: Mon Apr 13, 2020 3:07 pm
by BobWilliams757
bloblobl0 wrote: Well this is my log, hope somebody can tell me why I can't utilize my GPU

I'm a newbie at this, but it appears you now have a recognized GPU which is simply not getting an assignment. With the huge influx of new folders, this is not uncommon. Often people don't pick up a WU for days, though I've had waits no longer than 10-12 hours that I've noticed.

I'd just leave it running as is for now and hope that it picks up a WU. If I'm incorrect about any of the above, someone with more experience will hopefully come along and correct me.
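
For anyone who wants to check the same thing in their own log, this is a rough sketch that counts the failed assignment attempts and picks out the first work server the slot was assigned to. The message strings are copied from the log in this thread; the function name `summarize_assignments` is hypothetical, and the sketch assumes all lines fall on the same day.

```python
import re
from datetime import datetime

# Message text as it appears in the log posted in this thread.
FAILED = "Could not get an assignment"
ASSIGNED_RE = re.compile(r"Assigned to work server (\S+)")


def summarize_assignments(log_text):
    """Return (times of failed attempts, first assigned work server or None)."""
    failures = []
    server = None
    for line in log_text.splitlines():
        if FAILED in line:
            # The first 8 characters of each log line are the HH:MM:SS stamp.
            failures.append(datetime.strptime(line[:8], "%H:%M:%S"))
        elif server is None:
            match = ASSIGNED_RE.search(line)
            if match:
                server = match.group(1)
    return failures, server


log = """\
10:51:31:ERROR:WU02:FS01:Exception: Could not get an assignment
10:53:09:ERROR:WU02:FS01:Exception: Could not get an assignment
10:55:46:ERROR:WU02:FS01:Exception: Could not get an assignment
11:00:00:WU02:FS01:Assigned to work server 128.252.203.10
"""
failures, server = summarize_assignments(log)
# Seconds between consecutive failed attempts.
gaps = [int((b - a).total_seconds()) for a, b in zip(failures, failures[1:])]
print(len(failures), gaps, server)
```

On these four lines the gaps between failures grow (98 s, then 157 s), which is consistent with the client waiting longer between retries, and the last line shows the slot did eventually get assigned to a work server.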

Re: Not getting any WU

Posted: Mon Apr 13, 2020 3:47 pm
by Neil-B
To troubleshoot we really need the head of the log with the system details ... I guess that is effectively what you cropped into your previous post ... the bit I think is missing is lines like the 2nd and 3rd below (yours would obviously show your own GPU):

Code:

08:00:59:        OS Arch: AMD64
08:00:59:           GPUs: 1
08:00:59:          GPU 0: Bus:3 Slot:0 Func:0 NVIDIA:3 GK107 [Quadro K420]
08:00:59:  CUDA Device 0: Platform:0 Device:0 Bus:3 Slot:0 Compute:3.0 Driver:10.2
08:00:59:OpenCL Device 0: Platform:0 Device:0 Bus:3 Slot:0 Compute:1.2 Driver:442.74
08:00:59:  Win32 Service: false

Re: Not getting any WU

Posted: Mon Apr 13, 2020 4:04 pm
by HaloJones
The log shows there is no recognised GPU. What is the card and what drivers are installed?

Re: Not getting any WU

Posted: Mon Apr 13, 2020 4:47 pm
by BobWilliams757
It looked to me like a configuration change was saved after the initial header.

The end of the log showed:

10:59:59:WU02:FS01:Connecting to 65.254.110.245:8080
10:59:59:WARNING:WU02:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
10:59:59:WU02:FS01:Connecting to 18.218.241.186:80
11:00:00:WU02:FS01:Assigned to work server 128.252.203.10
11:00:00:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GK208 [GeForce 920M] from 128.252.203.10
11:00:00:WU02:FS01:Connecting to 128.252.203.10:8080
11:00:21:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
11:00:21:WU02:FS01:Connecting to 128.252.203.10:80
11:00:42:ERROR:WU02:FS01:Exception: Failed to connect to 128.252.203.10:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

(bolded by me)

As I said, I'm "new" to this again, so subject to correction. I thought it showed that a configuration change worked and there was now an available GPU.

Re: Not getting any WU

Posted: Mon Apr 13, 2020 5:52 pm
by Neil-B
Good catch … I scanned the latest but missed that … so it looks like all might be ok now and just possibly server load … maybe I was just jumping at shadows - have seen a lot of misconfigured GPUs recently

Re: Not getting any WU

Posted: Mon Apr 13, 2020 11:49 pm
by meltz511
On increasing the number of steps of WUs:

Going by the info from the admin, network traffic out of the servers would drop: the WUs stay the same physical size but take longer to compute, so they are sent out less often. There would also be fewer connections to the servers at any one time.

Fewer connections that each transmit larger results back are easier on server hard drives: the drive heads jump around less, so throughput to storage is quicker. More connections are less efficient and slow the server down (more time spent seeking than writing), unless the servers are all SSDs.

Many connections that each transmit small amounts of data, on the other hand, remind me of a denial-of-service attack: the server is overloaded and can't do anything.

my two cents anyway.. *shrug*

Re: Not getting any WU

Posted: Tue Apr 14, 2020 1:35 am
by Joe_H
Define "small". The returning WUs can be 10s of MB, to over 100 MB each. Double the number of steps, the file size grows as more intermediate data needs to be included. Right now that data gets buffered in the server RAM before getting transferred to drives.
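
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. Every figure in it is hypothetical (fleet size, hours per WU, result size), and it assumes that doubling the steps doubles both the compute time and the result size, which may not hold for real projects. It also counts only one result upload per finished WU.

```python
def server_load(clients, hours_per_wu, result_mb):
    """Return (result uploads per hour, MB uploaded per hour) for a steady fleet."""
    # Each client returns one result every `hours_per_wu` hours.
    uploads_per_hour = clients / hours_per_wu
    upload_mb_per_hour = uploads_per_hour * result_mb
    return uploads_per_hour, upload_mb_per_hour


# Hypothetical baseline: 10,000 clients, 4 hours per WU, 50 MB per result.
base = server_load(clients=10_000, hours_per_wu=4, result_mb=50)
# Doubled steps, ASSUMING twice the compute time and twice the result size.
doubled = server_load(clients=10_000, hours_per_wu=8, result_mb=100)
print(base)
print(doubled)
```

Under those assumptions the number of connections halves while the volume of data coming back per hour stays the same, so larger WUs ease connection handling but not the amount of data the server has to buffer and store.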

Re: Not getting any WU

Posted: Tue Apr 14, 2020 3:07 am
by PantherX
BobWilliams757 wrote:...11:00:00:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GK208 [GeForce 920M] from 128.252.203.10...
That GPU is supported as it has OpenCL 1.2 and Double Precision support: https://www.techpowerup.com/gpu-specs/g ... 920m.c2646

However, it is a mobile GPU so please ensure that it can fold the assigned WU before the Timeout date. This may mean that you might have to leave your system folding on the GPU 24/7 to meet the deadline. Feel free to experiment and report back :)
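
One rough way to judge this is to take two "Completed ... steps" lines from the log and extrapolate. The sketch below assumes the progress rate between the two lines holds for the rest of the WU and that both lines are from the same day; the function names are hypothetical, and the two sample lines are the CPU slot's progress lines from the log posted earlier, used only because a GPU slot would log progress in a similar form.

```python
import re
from datetime import datetime

# Captures the HH:MM:SS stamp, completed steps and total steps.
PROGRESS_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):.*Completed (\d+) out of (\d+) steps"
)


def parse_progress(line):
    """Return (time, completed_steps, total_steps) from a progress line."""
    match = PROGRESS_RE.match(line)
    if match is None:
        raise ValueError("not a progress line: " + line)
    when = datetime.strptime(match.group(1), "%H:%M:%S")
    return when, int(match.group(2)), int(match.group(3))


def hours_remaining(earlier_line, later_line):
    """Estimate hours left, assuming the step rate between the lines holds."""
    t0, done0, _ = parse_progress(earlier_line)
    t1, done1, total = parse_progress(later_line)
    seconds = (t1 - t0).total_seconds()
    rate = (done1 - done0) / seconds  # steps per second
    return (total - done1) / rate / 3600


a = "10:50:34:WU00:FS02:0xa7:Completed 45422 out of 250000 steps (18%)"
b = "10:57:14:WU00:FS02:0xa7:Completed 47500 out of 250000 steps (19%)"
estimate = hours_remaining(a, b)
print(f"about {estimate:.1f} hours of folding left")
```

If the estimate, plus any hours the machine will be switched off or paused, is comfortably less than the time left before the Timeout date, the WU should make it. Progress lines taken further apart give a steadier estimate than two taken right after a restart.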

Re: Not getting any WU

Posted: Tue Apr 14, 2020 7:07 am
by bloblobl0
PantherX wrote: However, it is a mobile GPU so please ensure that it can fold the assigned WU before the Timeout date.
How do I ensure that?

I have left my laptop folding on CPU for the last 48 hours. So far, no available WUs for my GPU.

Re: Not getting any WU

Posted: Tue Apr 14, 2020 7:31 am
by bruce
I would reboot my FAH computer about this time of night (depending on your timezone). A certain percentage of computers in the USA are switched off for the night, so I've been able to get WUs around this time when I couldn't for the previous 14 hours.

Also, rebooting occasionally does improve your chances. One of the enhancements we can expect in the soon-to-be-released new client is an improvement for this condition.

Re: Not getting any WU

Posted: Tue Apr 14, 2020 10:34 pm
by kaz011890
Sorry, I have another question.

Is there a way or a code to download a new WU (maybe more than one) before completing the WU that the GPU is currently working on?
I know the servers are overloaded right now and don't have much storage or bandwidth to upload and download all the data from computers around the world.

So is there a way that I can download multiple WUs at a time, complete them, and keep the completed data stored on my computer until the server is ready to download it all?
Maybe I'm 100% wrong, but this could take some of the strain off the servers right now.

Re: Not getting any WU

Posted: Tue Apr 14, 2020 10:39 pm
by tulanebarandgrill
So I have a 2080 Ti, but I notice my system is busy anywhere from 0 to 30% of the time. It also seems that sometimes I will not get any WU unless I reboot. This is new behavior since the WU shortage. Is there anything specific I can check? I've considered reinstalling the client. Previous daily averages were trending to 4,000,000+.

Re: Not getting any WU

Posted: Tue Apr 14, 2020 11:10 pm
by bruce
There's a setting in the client that allows you to TRY to download the next WU before the current WU finishes. The only help it can offer is to start trying sooner. If the servers are saturated, you get to find out about that just a little sooner.

No.

FAHClient is designed to prevent you from caching WUs. If you could, it would destroy your bonus because you'd be holding WUs longer than necessary and ultimately would delay the completion of active trajectories.