The machine has two RX 470 cards and runs Ubuntu 20.04 with the 5.4 kernel and the amdgpu 20.20 OpenCL driver.
Latest config:
<config>
  <!-- Client Control -->
  <fold-anon v='true'/>

  <!-- Folding Slot Configuration -->
  <cause v='COVID_19'/>

  <!-- Network -->
  <proxy v=':8080'/>

  <!-- User Information -->
  <passkey v='redacted'/>
  <team v='234771'/>
  <user v='Yeroon'/>

  <!-- Folding Slots -->
  <slot id='1' type='GPU'>
    <gpu-index v='0'/>
    <opencl-index v='0'/>
  </slot>
  <slot id='0' type='GPU'>
    <gpu-index v='1'/>
    <opencl-index v='1'/>
  </slot>
  <slot id='2' type='CPU'>
    <cpus v='8'/>
  </slot>
</config>
Startup log:
*********************** Log Started 2020-07-06T23:25:03Z ***********************
23:25:03:Trying to access database...
23:25:03:Successfully acquired database lock
23:25:03:Read GPUs.txt
23:25:04:Enabled folding slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580/590]
23:25:04:****************************** FAHClient ******************************
23:25:04: Version: 7.6.13
23:25:04: Author: Joseph Coffland <[email protected]>
23:25:04: Copyright: 2020 foldingathome.org
23:25:04: Homepage: https://foldingathome.org/
23:25:04: Date: Apr 28 2020
23:25:04: Time: 04:20:16
23:25:04: Revision: 5a652817f46116b6e135503af97f18e094414e3b
23:25:04: Branch: master
23:25:04: Compiler: GNU 8.3.0
23:25:04: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
23:25:04: -funroll-loops -fno-pie
23:25:04: Platform: linux2 4.19.0-5-amd64
23:25:04: Bits: 64
23:25:04: Mode: Release
23:25:04: Args: --child /etc/fahclient/config.xml --run-as root
23:25:04: --pid-file=/var/run/fahclient.pid --daemon
23:25:04: Config: /etc/fahclient/config.xml
23:25:04:******************************** CBang ********************************
23:25:04: Date: Apr 25 2020
23:25:04: Time: 00:07:53
23:25:04: Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
23:25:04: Branch: master
23:25:04: Compiler: GNU 8.3.0
23:25:04: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
23:25:04: -funroll-loops -fno-pie -fPIC
23:25:04: Platform: linux2 4.19.0-5-amd64
23:25:04: Bits: 64
23:25:04: Mode: Release
23:25:04:******************************* System ********************************
23:25:04: CPU: AMD Ryzen 5 3600 6-Core Processor
23:25:04: CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
23:25:04: CPUs: 12
23:25:04: Memory: 31.30GiB
23:25:04: Free Memory: 29.93GiB
23:25:04: Threads: POSIX_THREADS
23:25:04: OS Version: 5.4
23:25:04: Has Battery: false
23:25:04: On Battery: false
23:25:04: UTC Offset: -4
23:25:04: PID: 2705
23:25:04: CWD: /var/lib/fahclient
23:25:04: OS: Linux 5.4.0-40-generic x86_64
23:25:04: OS Arch: AMD64
23:25:04: GPUs: 2
23:25:04: GPU 0: Bus:5 Slot:0 Func:0 AMD:5 Ellesmere XT [Radeon RX
23:25:04: 470/480/570/580/590]
23:25:04: GPU 1: Bus:6 Slot:0 Func:0 AMD:5 Ellesmere XT [Radeon RX
23:25:04: 470/480/570/580/590]
23:25:04: CUDA: Not detected: Failed to open dynamic library 'libcuda.so':
23:25:04: libcuda.so: cannot open shared object file: No such file or
23:25:04: directory
23:25:04:OpenCL Device 0: Platform:0 Device:0 Bus:5 Slot:0 Compute:1.2 Driver:3110.6
23:25:04:OpenCL Device 1: Platform:0 Device:1 Bus:6 Slot:0 Compute:1.2 Driver:3110.6
23:25:04:******************************* libFAH ********************************
23:25:04: Date: Apr 15 2020
23:25:04: Time: 21:43:24
23:25:04: Revision: 216968bc7025029c841ed6e36e81a03a316890d3
23:25:04: Branch: master
23:25:04: Compiler: GNU 8.3.0
23:25:04: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
23:25:04: -funroll-loops -fno-pie
23:25:04: Platform: linux2 4.19.0-5-amd64
23:25:04: Bits: 64
23:25:04: Mode: Release
23:25:04:***********************************************************************
23:25:04:<config>
23:25:04: <!-- Client Control -->
23:25:04: <fold-anon v='true'/>
23:25:04:
23:25:04: <!-- Network -->
23:25:04: <proxy v=':8080'/>
23:25:04:
23:25:04: <!-- User Information -->
23:25:04: <passkey v='*****'/>
23:25:04: <team v='234771'/>
23:25:04: <user v='Yeroon'/>
23:25:04:
23:25:04: <!-- Folding Slots -->
23:25:04: <slot id='1' type='GPU'>
23:25:04: <gpu-index v='0'/>
23:25:04: <opencl-index v='0'/>
23:25:04: </slot>
23:25:04:</config>
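As a sanity check on the `gpu-index` / `opencl-index` pairing, here is a rough Python sketch that matches each `GPU n` line to the `OpenCL Device n` line on the same PCI bus. It is not part of FAHClient; the function name `match_gpu_opencl` and the regular expressions are assumptions based only on the line format in the log above.

```python
import re

# Sample lines copied from the startup log above.
LOG = """\
23:25:04: GPU 0: Bus:5 Slot:0 Func:0 AMD:5 Ellesmere XT [Radeon RX
23:25:04: GPU 1: Bus:6 Slot:0 Func:0 AMD:5 Ellesmere XT [Radeon RX
23:25:04:OpenCL Device 0: Platform:0 Device:0 Bus:5 Slot:0 Compute:1.2 Driver:3110.6
23:25:04:OpenCL Device 1: Platform:0 Device:1 Bus:6 Slot:0 Compute:1.2 Driver:3110.6
"""

# Assumed patterns for the two kinds of device lines.
GPU_RE = re.compile(r"GPU (\d+): Bus:(\d+)")
OPENCL_RE = re.compile(r"OpenCL Device (\d+): .*?Bus:(\d+)")


def match_gpu_opencl(log_text):
    """Return {gpu_index: opencl_index} for devices that share a PCI bus."""
    gpu_bus = {}     # gpu index -> bus number
    opencl_bus = {}  # bus number -> opencl index
    for line in log_text.splitlines():
        m = OPENCL_RE.search(line)
        if m:
            opencl_bus[int(m.group(2))] = int(m.group(1))
            continue
        m = GPU_RE.search(line)
        if m:
            gpu_bus[int(m.group(1))] = int(m.group(2))
    # Keep only GPUs whose bus also shows up as an OpenCL device.
    return {g: opencl_bus[b] for g, b in gpu_bus.items() if b in opencl_bus}


print(match_gpu_opencl(LOG))
```

For the lines above this pairs GPU 0 with OpenCL device 0 and GPU 1 with OpenCL device 1, which agrees with the indices in the config.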
Errors from the log:
WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 0 0
WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13416 run:1108 clone:124 gen:0 core:0x22 unit:0x0000000012bc7d9a5f02af7c6e30c25c
WU01:FS00:0x22:ERROR:Discrepancy: Forces are blowing up! 0 0
WU01:FS00:Sending unit results: id:01 state:SEND error:FAULTY project:13416 run:1154 clone:124 gen:0 core:0x22 unit:0x0000000012bc7d9a5f02af79269ad718
WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 0 0
WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13416 run:1156 clone:124 gen:0 core:0x22 unit:0x0000000012bc7d9a5f02af798c4a5c5c
WU03:FS00:0x22:ERROR:Potential energy error of 246.816, threshold of 10
WU03:FS00:0x22:ERROR:Reference Potential Energy: -1.23702e+06 | Given Potential Energy: -1.23677e+06
WU03:FS00:Sending unit results: id:03 state:SEND error:FAULTY project:13416 run:1274 clone:124 gen:0 core:0x22 unit:0x0000000012bc7d9a5f02af716b44eddb
More errors:
WU00:FS00:0x22:ERROR:Discrepancy: Forces are blowing up! 0 0
WU00:FS00:Sending unit results: id:00 state:SEND error:FAULTY project:13416 run:298 clone:23 gen:3 core:0x22 unit:0x0000000312bc7d9a5f00a7ee24fd8ddb
WU00:FS01:0x22:An exception occurred at step 250: Particle coordinate is nan
WU00:FS01:0x22:Max number of attempts to resume from last checkpoint (2) reached. Aborting.
WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13416 run:757 clone:39 gen:0 core:0x22 unit:0x0000000112bc7d9a5f02af9af0e6828b
WU02:FS01:0x22:ERROR:Force RMSE error of 41.1941 with threshold of 5
WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13416 run:652 clone:8 gen:0 core:0x22 unit:0x0000000212bc7d9a5f02afa3f80e66a3
WU03:FS01:0x22:ERROR:Force RMSE error of 13.4286 with threshold of 5
WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13416 run:1122 clone:8 gen:0 core:0x22 unit:0x0000000212bc7d9a5f02af7c63ccc089
WU00:FS00:0x22:ERROR:Force RMSE error of 9.85244 with threshold of 5
WU00:FS00:Sending unit results: id:00 state:SEND error:FAULTY project:13416 run:1059 clone:102 gen:0 core:0x22 unit:0x0000000112bc7d9a5f02af809604a333
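To see whether the failures cluster on one card, a small Python sketch like the one below can tally `FAULTY` results per folding slot and project. This is only an illustration: the helper name `count_faulty` and the regular expression are assumptions derived from the `Sending unit results` lines above, not a FAHClient tool.

```python
import re
from collections import Counter

# "Sending unit results" lines copied from the second excerpt above.
LINES = [
    "WU00:FS00:Sending unit results: id:00 state:SEND error:FAULTY project:13416 run:298 clone:23 gen:3 core:0x22 unit:0x0000000312bc7d9a5f00a7ee24fd8ddb",
    "WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13416 run:757 clone:39 gen:0 core:0x22 unit:0x0000000112bc7d9a5f02af9af0e6828b",
    "WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13416 run:652 clone:8 gen:0 core:0x22 unit:0x0000000212bc7d9a5f02afa3f80e66a3",
    "WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13416 run:1122 clone:8 gen:0 core:0x22 unit:0x0000000212bc7d9a5f02af7c63ccc089",
    "WU00:FS00:Sending unit results: id:00 state:SEND error:FAULTY project:13416 run:1059 clone:102 gen:0 core:0x22 unit:0x0000000112bc7d9a5f02af809604a333",
]

# Assumed pattern: slot number, result status, and project number.
RESULT_RE = re.compile(
    r"^WU\d+:FS(\d+):Sending unit results: .*error:(\w+) project:(\d+)"
)


def count_faulty(lines):
    """Count FAULTY results keyed by (slot, project)."""
    counts = Counter()
    for line in lines:
        m = RESULT_RE.match(line)
        if m and m.group(2) == "FAULTY":
            counts[(int(m.group(1)), int(m.group(3)))] += 1
    return counts


for (slot, project), n in sorted(count_faulty(LINES).items()):
    print(f"slot {slot:02d} project {project}: {n} faulty")
```

On the five result lines above this reports two faulty units on slot 00 and three on slot 01, all from project 13416, so both cards are affected.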
I can post the full logs if that helps.