Re: Odd GPU behavior - terminal window + FAHClient

Posted: Thu Feb 11, 2021 6:51 pm
by bruce
FAHClient is responsible for uploading results and obtaining a new assignment, plus a couple of other little tasks. I wouldn't say that it's doing nothing. Other daemons reside quietly in the background until you need their services.

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Thu Feb 11, 2021 7:08 pm
by SJC_Steve
I was getting OpenCL problem messages, so I installed the OpenCL dev files as suggested by gunnarre with this command:
sudo apt install ocl-icd-opencl-dev

And then these commands:
sudo adduser fahclient video
sudo adduser fahclient render
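
One way to confirm that both additions took effect is to list the account's groups. This is only a sketch: the in_group helper is a made-up name, and the demo uses the current user so it runs on any machine. On the folding machine you would pass fahclient with video and render instead.

```shell
#!/bin/sh
# Hypothetical helper: succeed if user $1 is a member of group $2.
# id -nG prints the user's group names separated by spaces.
in_group() {
    id -nG "$1" | tr ' ' '\n' | grep -qxF -- "$2"
}

# Demo with the current user and its primary group, so it runs anywhere.
me=$(id -un)
if in_group "$me" "$(id -gn)"; then
    echo "$me is in group $(id -gn)"
fi

# On the folding machine the equivalent checks would be:
#   in_group fahclient video && in_group fahclient render
```

A service that is already running keeps the group list it started with, so the client most likely needs a restart before new memberships matter.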

I'm not getting any OpenCL error messages anymore, but I'm still only folding on the CPU, not the GPU.

There's one log message that stands out:
18:40:07: <!-- Folding Slot Configuration -->
18:40:07: <gpu v='false'/>

Is this the issue, or is there something else I need to do to begin folding on the GPU?
Here are the first lines of the log.

Thanks,
Steve

Code:

*********************** Log Started 2021-02-11T18:40:07Z ***********************
18:40:07:******************************* libFAH ********************************
18:40:07:           Date: Oct 20 2020
18:40:07:           Time: 20:36:39
18:40:07:       Revision: 5ca109d295a6245e2a2f590b3d0085ad5e567aeb
18:40:07:         Branch: master
18:40:07:       Compiler: GNU 8.3.0
18:40:07:        Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
18:40:07:                 -fdata-sections -O3 -funroll-loops -fno-pie
18:40:07:       Platform: linux2 5.8.0-1-amd64
18:40:07:           Bits: 64
18:40:07:           Mode: Release
18:40:07:****************************** FAHClient ******************************
18:40:07:        Version: 7.6.21
18:40:07:         Author: Joseph Coffland <[email protected]>
18:40:07:      Copyright: 2020 foldingathome.org
18:40:07:       Homepage: https://foldingathome.org/
18:40:07:           Date: Oct 20 2020
18:40:07:           Time: 20:39:00
18:40:07:       Revision: 6efbf0e138e22d3963e6a291f78dcb9c6422a278
18:40:07:         Branch: master
18:40:07:       Compiler: GNU 8.3.0
18:40:07:        Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
18:40:07:                 -fdata-sections -O3 -funroll-loops -fno-pie
18:40:07:       Platform: linux2 5.8.0-1-amd64
18:40:07:           Bits: 64
18:40:07:           Mode: Release
18:40:07:           Args: --child /etc/fahclient/config.xml --run-as fahclient
18:40:07:                 --pid-file=/var/run/fahclient.pid --daemon
18:40:07:         Config: /etc/fahclient/config.xml
18:40:07:******************************** CBang ********************************
18:40:07:           Date: Oct 20 2020
18:40:07:           Time: 18:37:59
18:40:07:       Revision: 7e4ce85225d7eaeb775e87c31740181ca603de60
18:40:07:         Branch: master
18:40:07:       Compiler: GNU 8.3.0
18:40:07:        Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
18:40:07:                 -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
18:40:07:       Platform: linux2 5.8.0-1-amd64
18:40:07:           Bits: 64
18:40:07:           Mode: Release
18:40:07:******************************* System ********************************
18:40:07:            CPU: AMD Ryzen 7 3700X 8-Core Processor
18:40:07:         CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
18:40:07:           CPUs: 8
18:40:07:         Memory: 15.61GiB
18:40:07:    Free Memory: 14.03GiB
18:40:07:        Threads: POSIX_THREADS
18:40:07:     OS Version: 5.8
18:40:07:    Has Battery: false
18:40:07:     On Battery: false
18:40:07:     UTC Offset: -7
18:40:07:            PID: 1529
18:40:07:            CWD: /var/lib/fahclient
18:40:07:             OS: Linux 5.8.0-43-generic x86_64
18:40:07:        OS Arch: AMD64
18:40:07:           GPUs: 1
18:40:07:          GPU 0: Bus:8 Slot:0 Func:0 NVIDIA:7 GP106 [GeForce GTX 1060 3GB] 3935
18:40:07:  CUDA Device 0: Platform:0 Device:0 Bus:8 Slot:0 Compute:6.1 Driver:11.2
18:40:07:OpenCL Device 0: Platform:0 Device:0 Bus:8 Slot:0 Compute:1.2 Driver:460.32
18:40:07:***********************************************************************
18:40:07:<config>
18:40:07:  <!-- Client Control -->
18:40:07:  <fold-anon v='true'/>
18:40:07:
18:40:07:  <!-- Folding Slot Configuration -->
18:40:07:  <gpu v='false'/>
18:40:07:
18:40:07:  <!-- Slot Control -->
18:40:07:  <power v='full'/>
18:40:07:
18:40:07:  <!-- User Information -->
18:40:07:  <user v='SJC_Steve'/>
18:40:07:
18:40:07:  <!-- Folding Slots -->
18:40:07:  <slot id='0' type='CPU'/>
18:40:07:</config>
18:40:07:Trying to access database...
18:40:07:Successfully acquired database lock
18:40:07:FS00:Initialized folding slot 00: cpu:8
18:40:07:WU00:FS00:Starting
18:40:07:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 706 -lifeline 1529 -checkpoint 15 -np 8
18:40:07:WU00:FS00:Started FahCore on PID 1540
18:40:07:WU00:FS00:Core PID:1544
18:40:07:WU00:FS00:FahCore 0xa7 started
18:40:08:WU00:FS00:0xa7:*********************** Log Started 2021-02-11T18:40:07Z ***********************
18:40:08:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
18:40:08:WU00:FS00:0xa7:       Type: 0xa7
18:40:08:WU00:FS00:0xa7:       Core: Gromacs
18:40:08:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 706 -lifeline 1540 -checkpoint 15 -np 8
18:40:08:WU00:FS00:0xa7:************************************ CBang *************************************
18:40:08:WU00:FS00:0xa7:       Date: Nov 27 2019
18:40:08:WU00:FS00:0xa7:       Time: 11:26:54
18:40:08:WU00:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
18:40:08:WU00:FS00:0xa7:     Branch: master
18:40:08:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
18:40:08:WU00:FS00:0xa7:    Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
18:40:08:WU00:FS00:0xa7:             -fno-pie -fPIC
18:40:08:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
18:40:08:WU00:FS00:0xa7:       Bits: 64
18:40:08:WU00:FS00:0xa7:       Mode: Release
18:40:08:WU00:FS00:0xa7:************************************ System ************************************
18:40:08:WU00:FS00:0xa7:        CPU: AMD Ryzen 7 3700X 8-Core Processor
18:40:08:WU00:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
18:40:08:WU00:FS00:0xa7:       CPUs: 8
18:40:08:WU00:FS00:0xa7:     Memory: 15.61GiB
18:40:08:WU00:FS00:0xa7:Free Memory: 14.02GiB
18:40:08:WU00:FS00:0xa7:    Threads: POSIX_THREADS
18:40:08:WU00:FS00:0xa7: OS Version: 5.8
18:40:08:WU00:FS00:0xa7:Has Battery: false
18:40:08:WU00:FS00:0xa7: On Battery: false
18:40:08:WU00:FS00:0xa7: UTC Offset: -7
18:40:08:WU00:FS00:0xa7:        PID: 1544
18:40:08:WU00:FS00:0xa7:        CWD: /var/lib/fahclient/work
18:40:08:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
18:40:08:WU00:FS00:0xa7:    Version: 0.0.19
18:40:08:WU00:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
18:40:08:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
18:40:08:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
18:40:08:WU00:FS00:0xa7:       Date: Nov 26 2019
18:40:08:WU00:FS00:0xa7:       Time: 00:41:42
18:40:08:WU00:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
18:40:08:WU00:FS00:0xa7:     Branch: master
18:40:08:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
18:40:08:WU00:FS00:0xa7:    Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
18:40:08:WU00:FS00:0xa7:             -fno-pie
18:40:08:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
18:40:08:WU00:FS00:0xa7:       Bits: 64
18:40:08:WU00:FS00:0xa7:       Mode: Release
18:40:08:WU00:FS00:0xa7:************************************ Build *************************************
18:40:08:WU00:FS00:0xa7:       SIMD: avx_256
18:40:08:WU00:FS00:0xa7:********************************************************************************
18:40:08:WU00:FS00:0xa7:Project: 16927 (Run 22, Clone 223, Gen 97)
18:40:08:WU00:FS00:0xa7:Unit: 0x00000000000000000000000000000000
18:40:08:WU00:FS00:0xa7:Digital signatures verified
18:40:08:WU00:FS00:0xa7:Calling: mdrun -s frame97.tpr -o frame97.trr -cpi state.cpt -cpt 15 -nt 8
18:40:08:WU00:FS00:0xa7:Steps: first=48500000 total=500000
18:40:09:WU00:FS00:0xa7:Completed 439282 out of 500000 steps (87%)
18:40:23:WU00:FS00:0xa7:Completed 440000 out of 500000 steps (88%)
18:41:48:WU00:FS00:0xa7:Completed 445000 out of 500000 steps (89%)

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Thu Feb 11, 2021 7:14 pm
by Joe_H
Yes, that is the issue. It appears the client set that flag when initially installed and no usable GPU was detected. Remove that or set the value to true and you should be able to get the GPU slot set up.
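
A read-only sketch for seeing what the config file currently says about gpu, assuming the `<gpu v='...'/>` form shown in the log. The show_gpu_option helper name is made up for illustration, and the demo runs on a scratch file rather than the live config.

```shell
#!/bin/sh
# Hypothetical helper: print the gpu option line (with its line number) from a
# FAHClient config file. Read-only; it never modifies the file.
show_gpu_option() {
    config="$1"
    if [ ! -r "$config" ]; then
        echo "cannot read $config" >&2
        return 1
    fi
    # Matches both <gpu v='false'/> and <gpu v="false"/>.
    grep -n "<gpu v=" "$config" || echo "no gpu option set"
}

# Demo on a scratch copy of the config fragment from the log.
demo=$(mktemp)
cat > "$demo" <<'EOF'
<config>
  <!-- Folding Slot Configuration -->
  <gpu v='false'/>
</config>
EOF
show_gpu_option "$demo"
rm -f "$demo"
```

To look at the live file, pass the path from the log's "Config:" line (/etc/fahclient/config.xml); reading it may require sudo.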

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Thu Feb 11, 2021 7:19 pm
by SJC_Steve
Joe_H wrote:Yes, that is the issue. It appears the client set that flag when initially installed and no usable GPU was detected. Remove that or set the value to true and you should be able to get the GPU slot set up.
Joe_H;

How do I do that? I've broken lots of stuff in the past by fumbling around with configuration files.

Thanks,
Steve

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Thu Feb 11, 2021 7:53 pm
by Joe_H
In FAHControl, use the Configure button and select the Expert tab. Under Extra client options, add the option gpu and set the value to true. Then click OK and save your way out. You may have to restart the FAHClient process; the easiest way can be to reboot.

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Thu Feb 11, 2021 8:04 pm
by SJC_Steve
Joe_H wrote:In FAHControl, use the Configure button and select the Expert tab. Under Extra client options, add the option gpu and set the value to true. Then click OK and save your way out. You may have to restart the FAHClient process; the easiest way can be to reboot.
I tried that, but gpu was already there with a value of "false". I changed the value from false to true, saved it and then rebooted. When I looked again, it was back to false. So I took the other route and edited the config.xml file manually, rebooted, and now it's folding on both the CPU and GPU.

Success, now I just need to write all this stuff into a procedure.

Thanks,
Steve

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Thu Feb 11, 2021 8:07 pm
by demorgan
Joe_H wrote:In FAHControl, use the Configure button and select the Expert tab. Under Extra client options, add the option gpu and set the value to true. Then click OK and save your way out. You may have to restart the FAHClient process; the easiest way can be to reboot.
To restart the FAHClient easily, you can do:

Code:

sudo service FAHClient restart
If that doesn't work, try:

Code:

sudo pkill -i fah
sudo service FAHClient restart
The first command kills every process with "fah" in its name, in either lowercase or uppercase, so it kills anything related to F@H on your machine. Note that it also kills anything else with the string "fah" in its name, so although there is only a slim chance this will affect anything other than F@H, keep it in mind. The second command restarts the FAHClient service and gets everything running fresh.
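
If the broad match is a worry, a dry run with pgrep shows what pkill would hit before anything is killed. This is a small sketch; the preview_fah_matches name is made up, and it assumes the procps pgrep that normally ships alongside pkill.

```shell
#!/bin/sh
# Dry run: list the PID and name of every process that "pkill -i fah" would
# signal. pgrep accepts the same matching options as pkill but only lists.
preview_fah_matches() {
    pgrep -il fah || echo "no processes match 'fah'"
}

preview_fah_matches
```

If the list contains only F@H processes, the pkill command above should be safe to run.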

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Thu Feb 11, 2021 8:48 pm
by Neil-B
SJC_Steve wrote:
Joe_H wrote:In FAHControl, use the Configure button and select the Expert tab. Under Extra client options, add the option gpu and set the value to true. Then click OK and save your way out. You may have to restart the FAHClient process; the easiest way can be to reboot.
I tried that, but gpu was already there with a value of "false". I changed the value from false to true, saved it and then rebooted. When I looked again, it was back to false. So I took the other route and edited the config.xml file manually, rebooted, and now it's folding on both the CPU and GPU.

Success, now I just need to write all this stuff into a procedure.

Thanks,
Steve
I believe you may have to remove the gpu false entry and then add gpu true rather than try to edit the value .. but you got it sorted another way .. great to see you work through the issues .. looking forward to having the guide to refer people to :)

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Thu Feb 11, 2021 9:41 pm
by bruce
Neil-B wrote:I believe you may have to remove the gpu false entry and then add gpu true rather than try to edit the value .. but you got it sorted another way .. great to see you work through the issues .. looking forward to having the guide to refer people to :)
That works.

So does replacing one value with another value except that you MUST also use the "enter" key. Almost everywhere else you type something, it's accepted when you just click the "OK" or whatever else seems sufficient to move on.

I don't know how many times this oddity has bitten me. :evil:

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Fri Feb 12, 2021 6:15 am
by Whompithian
SJC_Steve wrote:At least on Ubuntu, issuing the FAHClient command starts another client that burns power but doesn't actually get any work done.

Thanks,
Steve
The second running instance is getting work done. It just uses all of the default configuration values, including folding as Anonymous, and drops the work in an unexpected location, possibly the current working directory. You can specify the desired config file to use and a unique target work directory with command line arguments if you really want two separate clients running with different parameters. As bruce said, however, that mode of operation is neither supported nor necessary, since multiple slots can be configured for a single running client. To learn more about available options, just issue:

Code:

FAHClient --help | less

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Fri Feb 12, 2021 7:05 pm
by SJC_Steve
bruce wrote:
Neil-B wrote:I believe you may have to remove the gpu false entry and then add gpu true rather than try to edit the value .. but you got it sorted another way .. great to see you work through the issues .. looking forward to having the guide to refer people to :)
That works.

So does replacing one value with another value except that you MUST also use the "enter" key. Almost everywhere else you type something, it's accepted when you just click the "OK" or whatever else seems sufficient to move on.

I don't know how many times this oddity has bitten me. :evil:
Today's FAHControl behavior using Python3-FAHControl:

In 'Configure / Expert', I changed the gpu attribute from false to true, hit Enter and Save. The system reported a crash of FAHControl. I rebooted, and this time the client is folding on both the GPU and CPU. I tried to go in and change gpu = true; however, the gpu attribute was missing. I added gpu = true, saved and rebooted. No change; it will not accept any value for gpu. When a new client is loaded, it again comes up with gpu = false and no GPU folding.

I also tried to modify config.xml to change gpu from false to true, saved and rebooted, but again the value was deleted from the config file.

It seems the gpu attribute in a new client install defaults to false and must be removed before the client will use the GPU. Since this happens without FAHControl running, it seems the client is at fault here.

Any thoughts?

Thanks,
Steve

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Fri Feb 12, 2021 9:20 pm
by bruce
We probably confused you by giving an incomplete description of that option.

In FAHControl, you need to add the option gpu=true. Unless there is already an entry for gpu, you do that by clicking Add and entering gpu in the Name field of the popup and true or false in the Value field.

Editing an existing entry requires more care because of the need for the enter key.

In either case, it ends up looking like this in config.xml:
<gpu v="false"/>
We DO NOT recommend manually editing the config. If you happen to inadvertently create a syntax error, FAH will simply stop working and won't tell you why, making it very difficult to fix.
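
For anyone who edits the file by hand anyway, a cautious sketch: keep a backup and check that the result is still well-formed XML before restarting the client. It assumes xmllint is available (package libxml2-utils on Debian/Ubuntu); the check_config name is made up, and the check covers syntax only, not whether FAHClient accepts the options.

```shell
#!/bin/sh
# Hypothetical helper: report whether a file is well-formed XML.
# Returns 0 if well-formed, 1 on a syntax error, 2 if xmllint is missing.
check_config() {
    if ! command -v xmllint >/dev/null 2>&1; then
        echo "xmllint not found; cannot check $1" >&2
        return 2
    fi
    if xmllint --noout "$1" 2>/dev/null; then
        echo "well-formed"
    else
        echo "syntax error"
        return 1
    fi
}

# Demo on a scratch file, not the live config.
demo=$(mktemp)
printf "<config>\n  <gpu v='true'/>\n</config>\n" > "$demo"
cp -p "$demo" "$demo.bak"                                    # 1. back up first
printf "<config>\n  <gpu v='true'>\n</config>\n" > "$demo"   # 2. simulate a bad edit (missing "/")
check_config "$demo" || cp -p "$demo.bak" "$demo"            # 3. restore on failure
check_config "$demo"                                         # 4. check again after restoring
rm -f "$demo" "$demo.bak"
```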

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Fri Feb 12, 2021 11:48 pm
by SJC_Steve
bruce wrote:We probably confused you by giving an incomplete description of that option.

In FAHControl, you need to add the option gpu=true. Unless there is already an entry for gpu, you do that by clicking Add and entering gpu in the Name field of the popup and true or false in the Value field.

Editing an existing entry requires more care because of the need for the enter key.

In either case, it ends up looking like this in config.xml:
<gpu v="false"/>
We DO NOT recommend manually editing the config. If you happen to inadvertently create a syntax error, FAH will simply stop working and won't tell you why, making it very difficult to fix.
Yep, that's exactly what I did: Add button, Name = gpu and Value = true, then Save.

I then rebooted the PC and looked at Configure/Expert again: no gpu name or value. I've done it several times with the same result.

I also edited the config.xml file and rebooted. It did the same thing and deleted the <gpu v="true"/> line completely.

Here are some lines from the log file, including the config portion. As you can see, it matches what FAHControl shows for gpu: no name or value.

Code:

23:40:05:******************************* System ********************************
23:40:05:            CPU: AMD Ryzen 7 3700X 8-Core Processor
23:40:05:         CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
23:40:05:           CPUs: 8
23:40:05:         Memory: 15.61GiB
23:40:05:    Free Memory: 14.05GiB
23:40:05:        Threads: POSIX_THREADS
23:40:05:     OS Version: 5.8
23:40:05:    Has Battery: false
23:40:05:     On Battery: false
23:40:05:     UTC Offset: -7
23:40:05:            PID: 1520
23:40:05:            CWD: /var/lib/fahclient
23:40:05:             OS: Linux 5.8.0-43-generic x86_64
23:40:05:        OS Arch: AMD64
23:40:05:           GPUs: 1
23:40:05:          GPU 0: Bus:8 Slot:0 Func:0 NVIDIA:7 GP106 [GeForce GTX 1060 3GB] 3935
23:40:05:  CUDA Device 0: Platform:0 Device:0 Bus:8 Slot:0 Compute:6.1 Driver:11.2
23:40:05:OpenCL Device 0: Platform:0 Device:0 Bus:8 Slot:0 Compute:1.2 Driver:460.32
23:40:05:***********************************************************************
23:40:05:<config>
23:40:05:  <!-- Client Control -->
23:40:05:  <fold-anon v='true'/>
23:40:05:
23:40:05:  <!-- Network -->
23:40:05:  <proxy v=':8080'/>
23:40:05:
23:40:05:  <!-- Slot Control -->
23:40:05:  <power v='full'/>
23:40:05:
23:40:05:  <!-- User Information -->
23:40:05:  <user v='SJC_Steve'/>
23:40:05:
23:40:05:  <!-- Folding Slots -->
23:40:05:  <slot id='0' type='CPU'/>
23:40:05:  <slot id='1' type='GPU'>
23:40:05:    <pci-bus v='8'/>
23:40:05:    <pci-slot v='0'/>
23:40:05:  </slot>
23:40:05:</config>
Thanks,
Steve

Mod Edit: Added Code Tags - PantherX

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Sat Feb 13, 2021 1:40 am
by bruce
:?: :?:

Maybe the order matters. Mine is at the very top.

<?xml version="1.0"?>
<config>
<!-- Folding Slot Configuration -->
<gpu v="false"/>

<!-- HTTP Server -->
<allow v="127.0.0.1,192.168.0.0/24"/>

<!-- Network -->
...
-----------------------------------------------------------------------
Maybe it's just that FAH rewrites config and discards redundant statements. The default value for 'gpu' is 'true' so changing it to 'true' is logically the same as removing the 'false' modification from the default.

Re: Odd GPU behavior - terminal window + FAHClient

Posted: Sat Feb 13, 2021 6:35 am
by Whompithian
bruce wrote:Maybe it's just that FAH rewrites config and discards redundant statements. The default value for 'gpu' is 'true' so changing it to 'true' is logically the same as removing the 'false' modification from the default.
On Linux systems, this is the case. FAHClient regularly normalizes the config file. On first start and anytime a command is sent to the running client that changes the configurable state, such as the "--send-pause" argument, the client updates the config.xml with a normalized version which it also dumps to the log file. Part of the normalization is to remove any parameter that is explicitly set to its default value. The "--help" output from FAHClient lists the default value for most parameters.
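
To make the effect concrete, here is a rough imitation of that one normalization step. It is an illustration only, not FAHClient's actual code: the drop_default_gpu name is made up, and it handles only the gpu option, whose default is true.

```shell
#!/bin/sh
# Illustration only: drop any line that explicitly sets gpu to true, because
# true is the default for gpu. Accepts either quote style around the value.
drop_default_gpu() {
    sed "/<gpu v=['\"]true['\"] *\/>/d" "$1"
}

# Demo on a scratch file, not the live config.
demo=$(mktemp)
cat > "$demo" <<'EOF'
<config>
  <gpu v='true'/>
  <user v='SJC_Steve'/>
</config>
EOF
drop_default_gpu "$demo"   # the gpu line is gone; the user line stays
rm -f "$demo"
```

A line with gpu set to false passes through untouched, which matches what was reported earlier: a hand-added gpu=true entry disappears after a restart while GPU folding keeps working.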