Seems I Missed A Step?
Posted: Mon Sep 07, 2020 1:37 am
Hi Team,
I've been trying for 4 days to get a "valid" AMD card working on Debian 10 Linux. My machine can take 3 GPUs; one is an NVidia card that is already running (I more or less threw it in to test, but it works). That took half a day.
I have two AMD HD7700s to fit in the other two slots. At the moment only one is inserted, until I get the drivers and/or Linux to play with F@H.
So far I have installed and uninstalled the mesa drivers (which I knew didn't work), AMD Pro (which at the time I didn't know didn't work), and the ROCm drivers (which seemed to work for others but gave me a kfd error on boot up).
I currently have the Debian OpenCL driver from their repo installed. I get no errors, but FAH still complains that I have incorrect drivers or that the "compute device doesn't match". I know the card does match by description, vendor and device according to the GPUs list, so I'm pretty much back where I started last Wednesday.
So... does that mean I have missed something or not checked something?
System log dump is below for anyone who might be able to ease my pain.
Thanks in advance.
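For anyone who wants to check the vendor/device match themselves: the error in the log reports the IDs in decimal (vendor 4098, device 26685), while `lspci -nn` prints them in hex, and as far as I can tell GPUs.txt does too. A minimal sketch of the conversion (the helper name `pci_id` is just something made up for illustration):

```python
# Decimal IDs copied from the "No compute devices matched GPU #1" error in the log.
vendor = 4098
device = 26685


def pci_id(value: int) -> str:
    """Format an integer as the 4-digit lowercase hex used in PCI ID listings."""
    return f"{value:04x}"


# Prints the pair in the vendor:device form that `lspci -nn` shows in brackets.
print(f"{pci_id(vendor)}:{pci_id(device)}")
```

If the pair printed here is the one `lspci -nn` shows for the card, the hardware identification side is fine and the mismatch is more likely on the driver/OpenCL side.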
*********************** Log Started 2020-09-06T23:41:16Z ***********************
23:41:16:Trying to access database...
23:41:16:Successfully acquired database lock
23:41:16:Read GPUs.txt
23:41:16:Enabled folding slot 00: READY cpu:22
23:41:17:Enabled folding slot 01: READY gpu:0:GF106 [Quadro 2000] 480
23:41:17:Enabled folding slot 02: READY gpu:1:R575A [Radeon R7 250X/HD 7700/8760]
23:41:17:ERROR:No compute devices matched GPU #1 {
23:41:17:ERROR: "vendor": 4098,
23:41:17:ERROR: "device": 26685,
23:41:17:ERROR: "type": 1,
23:41:17:ERROR: "species": 5,
23:41:17:ERROR: "description": "R575A [Radeon R7 250X/HD 7700/8760]"
23:41:17:ERROR:}. You may need to update your graphics drivers.
23:41:17:****************************** FAHClient ******************************
23:41:17: Version: 7.6.13
23:41:17: Author: Joseph Coffland <[email protected]>
23:41:17: Copyright: 2020 foldingathome.org
23:41:17: Homepage: https://foldingathome.org/
23:41:17: Date: Apr 28 2020
23:41:17: Time: 04:20:16
23:41:17: Revision: 5a652817f46116b6e135503af97f18e094414e3b
23:41:17: Branch: master
23:41:17: Compiler: GNU 8.3.0
23:41:17: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
23:41:17: -funroll-loops -fno-pie
23:41:17: Platform: linux2 4.19.0-5-amd64
23:41:17: Bits: 64
23:41:17: Mode: Release
23:41:17: Args: --child /etc/fahclient/config.xml --run-as fahclient
23:41:17: --pid-file=/var/run/fahclient.pid --daemon
23:41:17: Config: /etc/fahclient/config.xml
23:41:17:******************************** CBang ********************************
23:41:17: Date: Apr 25 2020
23:41:17: Time: 00:07:53
23:41:17: Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
23:41:17: Branch: master
23:41:17: Compiler: GNU 8.3.0
23:41:17: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
23:41:17: -funroll-loops -fno-pie -fPIC
23:41:17: Platform: linux2 4.19.0-5-amd64
23:41:17: Bits: 64
23:41:17: Mode: Release
23:41:17:******************************* System ********************************
23:41:17: CPU: Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
23:41:17: CPU ID: GenuineIntel Family 6 Model 45 Stepping 7
23:41:17: CPUs: 24
23:41:17: Memory: 188.87GiB
23:41:17: Free Memory: 186.21GiB
23:41:17: Threads: POSIX_THREADS
23:41:17: OS Version: 4.19
23:41:17: Has Battery: false
23:41:17: On Battery: false
23:41:17: UTC Offset: 10
23:41:17: PID: 3124
23:41:17: CWD: /var/lib/fahclient
23:41:17: OS: Linux 4.19.0-10-amd64 x86_64
23:41:17: OS Arch: AMD64
23:41:17: GPUs: 2
23:41:17: GPU 0: Bus:10 Slot:0 Func:0 NVIDIA:2 GF106 [Quadro 2000] 480
23:41:17: GPU 1: Bus:33 Slot:0 Func:0 AMD:5 R575A [Radeon R7 250X/HD 7700/8760]
23:41:17: CUDA Device 0: Platform:0 Device:0 Bus:10 Slot:0 Compute:2.1 Driver:9.1
23:41:17:OpenCL Device 0: Platform:0 Device:0 Bus:10 Slot:0 Compute:1.1 Driver:390.138
23:41:17:******************************* libFAH ********************************
23:41:17: Date: Apr 15 2020
23:41:17: Time: 21:43:24
23:41:17: Revision: 216968bc7025029c841ed6e36e81a03a316890d3
23:41:17: Branch: master
23:41:17: Compiler: GNU 8.3.0
23:41:17: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
23:41:17: -funroll-loops -fno-pie
23:41:17: Platform: linux2 4.19.0-5-amd64
23:41:17: Bits: 64
23:41:17: Mode: Release
23:41:17:***********************************************************************
23:41:17:<config>
23:41:17: <!-- Client Control -->
23:41:17: <fold-anon v='true'/>
23:41:17:
23:41:17: <!-- Folding Slot Configuration -->
23:41:17: <gpu v='True'/>
23:41:17:
23:41:17: <!-- Network -->
23:41:17: <proxy v=':8080'/>
23:41:17:
23:41:17: <!-- Slot Control -->
23:41:17: <power v='full'/>
23:41:17:
23:41:17: <!-- User Information -->
23:41:17: <passkey v='*****'/>
23:41:17: <team v='*****'/>
23:41:17: <user v='TheServerGeek'/>
23:41:17:
23:41:17: <!-- Folding Slots -->
23:41:17: <slot id='0' type='CPU'>
23:41:17: <cpus v='22'/>
23:41:17: </slot>
23:41:17: <slot id='1' type='GPU'/>
23:41:17: <slot id='2' type='GPU'/>
23:41:17:</config>
23:41:17:WU03:FS01:Starting
23:41:17:WU03:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit/22-0.0.11/Core_22.fah/FahCore_22 -dir 03 -suffix 01 -version 706 -lifeline 3124 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
23:41:17:WU03:FS01:Started FahCore on PID 3136
23:41:17:WU03:FS01:Core PID:3140
23:41:17:WU03:FS01:FahCore 0x22 started
23:41:17:WU04:FS00:Starting
23:41:17:WU04:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7 -dir 04 -suffix 01 -version 706 -lifeline 3124 -checkpoint 15 -np 22
23:41:17:WU04:FS00:Started FahCore on PID 3143
23:41:17:WU04:FS00:Core PID:3147
23:41:17:WU04:FS00:FahCore 0xa7 started
23:41:17:WU00:FS02:Connecting to assign1.foldingathome.org:80
23:41:17:WU03:FS01:0x22:*********************** Log Started 2020-09-06T23:41:17Z ***********************
23:41:17:WU03:FS01:0x22:*************************** Core22 Folding@home Core ***************************
23:41:17:WU03:FS01:0x22: Core: Core22
23:41:17:WU03:FS01:0x22: Type: 0x22
23:41:17:WU03:FS01:0x22: Version: 0.0.11
23:41:17:WU03:FS01:0x22: Author: Joseph Coffland <[email protected]>
23:41:17:WU03:FS01:0x22: Copyright: 2020 foldingathome.org
23:41:17:WU03:FS01:0x22: Homepage: https://foldingathome.org/
23:41:17:WU03:FS01:0x22: Date: Jun 27 2020
23:41:17:WU03:FS01:0x22: Time: 22:50:00
23:41:17:WU03:FS01:0x22: Revision: cfc2940c5dd1aa80f60daa6e28d4a2a417f74edb
23:41:17:WU03:FS01:0x22: Branch: core22-0.0.11
23:41:17:WU03:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
23:41:17:WU03:FS01:0x22: Options: -std=c++11 -fsigned-char -ffunction-sections -fdata-sections -O3
23:41:17:WU03:FS01:0x22: -funroll-loops
23:41:17:WU03:FS01:0x22: Platform: linux2 4.19.76-linuxkit
23:41:17:WU03:FS01:0x22: Bits: 64
23:41:17:WU03:FS01:0x22: Mode: Release
23:41:17:WU03:FS01:0x22:Maintainers: John Chodera <[email protected]> and Peter Eastman
23:41:17:WU03:FS01:0x22: <[email protected]>
23:41:17:WU03:FS01:0x22: Args: -dir 03 -suffix 01 -version 706 -lifeline 3136 -checkpoint 15
23:41:17:WU03:FS01:0x22: -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
23:41:17:WU03:FS01:0x22: 0 -gpu 0
23:41:17:WU03:FS01:0x22:************************************ libFAH ************************************
23:41:17:WU03:FS01:0x22: Date: Jun 27 2020
23:41:17:WU03:FS01:0x22: Time: 22:11:04
23:41:17:WU03:FS01:0x22: Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
23:41:17:WU03:FS01:0x22: Branch: HEAD
23:41:17:WU03:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
23:41:17:WU03:FS01:0x22: Options: -std=c++11 -fsigned-char -ffunction-sections -fdata-sections -O3
23:41:17:WU03:FS01:0x22: -funroll-loops
23:41:17:WU03:FS01:0x22: Platform: linux2 4.19.76-linuxkit
23:41:17:WU03:FS01:0x22: Bits: 64
23:41:17:WU03:FS01:0x22: Mode: Release
23:41:17:WU03:FS01:0x22:************************************ CBang *************************************
23:41:17:WU03:FS01:0x22: Date: Jun 27 2020
23:41:17:WU03:FS01:0x22: Time: 22:10:11
23:41:17:WU03:FS01:0x22: Revision: f8529962055b0e7bde23e429f5072ff758089dee
23:41:17:WU03:FS01:0x22: Branch: HEAD
23:41:17:WU03:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
23:41:17:WU03:FS01:0x22: Options: -std=c++11 -fsigned-char -ffunction-sections -fdata-sections -O3
23:41:17:WU03:FS01:0x22: -funroll-loops -fPIC
23:41:17:WU03:FS01:0x22: Platform: linux2 4.19.76-linuxkit
23:41:17:WU03:FS01:0x22: Bits: 64
23:41:17:WU03:FS01:0x22: Mode: Release
23:41:17:WU03:FS01:0x22:************************************ System ************************************
23:41:17:WU03:FS01:0x22: CPU: Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
23:41:17:WU03:FS01:0x22: CPU ID: GenuineIntel Family 6 Model 45 Stepping 7
23:41:17:WU03:FS01:0x22: CPUs: 24
23:41:17:WU03:FS01:0x22: Memory: 188.87GiB
23:41:17:WU03:FS01:0x22:Free Memory: 186.21GiB
23:41:17:WU03:FS01:0x22: Threads: POSIX_THREADS
23:41:17:WU03:FS01:0x22: OS Version: 4.19
23:41:17:WU03:FS01:0x22:Has Battery: false
23:41:17:WU03:FS01:0x22: On Battery: false
23:41:17:WU03:FS01:0x22: UTC Offset: 10
23:41:17:WU03:FS01:0x22: PID: 3140
23:41:17:WU03:FS01:0x22: CWD: /var/lib/fahclient/work
23:41:17:WU03:FS01:0x22:********************************************************************************
23:41:17:WU03:FS01:0x22:Project: 13425 (Run 2567, Clone 8, Gen 5)
23:41:17:WU03:FS01:0x22:Unit: 0x0000000812bc7d9a5f4ef9b4513f877c
23:41:17:WU03:FS01:0x22:Digital signatures verified
23:41:17:WU03:FS01:0x22:Folding@home GPU Core22 Folding@home Core
23:41:17:WU03:FS01:0x22:Version 0.0.11
23:41:17:WU03:FS01:0x22: Checkpoint write interval: 50000 steps (5%) [20 total]
23:41:17:WU03:FS01:0x22: JSON viewer frame write interval: 10000 steps (1%) [100 total]
23:41:17:WU03:FS01:0x22: XTC frame write interval: 250000 steps (25%) [4 total]
23:41:17:WU03:FS01:0x22: Global context and integrator variables write interval: 25000 steps (2.5%) [40 total]
23:41:18:WU04:FS00:0xa7:*********************** Log Started 2020-09-06T23:41:17Z ***********************
23:41:18:WU04:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
23:41:18:WU04:FS00:0xa7: Type: 0xa7
23:41:18:WU04:FS00:0xa7: Core: Gromacs
23:41:18:WU04:FS00:0xa7: Args: -dir 04 -suffix 01 -version 706 -lifeline 3143 -checkpoint 15 -np
23:41:18:WU04:FS00:0xa7: 22
23:41:18:WU04:FS00:0xa7:************************************ CBang *************************************
23:41:18:WU04:FS00:0xa7: Date: Nov 27 2019
23:41:18:WU04:FS00:0xa7: Time: 11:26:54
23:41:18:WU04:FS00:0xa7: Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
23:41:18:WU04:FS00:0xa7: Branch: master
23:41:18:WU04:FS00:0xa7: Compiler: GNU 8.3.0
23:41:18:WU04:FS00:0xa7: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
23:41:18:WU04:FS00:0xa7: -fno-pie -fPIC
23:41:18:WU04:FS00:0xa7: Platform: linux2 4.19.0-5-amd64
23:41:18:WU04:FS00:0xa7: Bits: 64
23:41:18:WU04:FS00:0xa7: Mode: Release
23:41:18:WU04:FS00:0xa7:************************************ System ************************************
23:41:18:WU04:FS00:0xa7: CPU: Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
23:41:18:WU04:FS00:0xa7: CPU ID: GenuineIntel Family 6 Model 45 Stepping 7
23:41:18:WU04:FS00:0xa7: CPUs: 24
23:41:18:WU04:FS00:0xa7: Memory: 188.87GiB
23:41:18:WU04:FS00:0xa7:Free Memory: 186.20GiB
23:41:18:WU04:FS00:0xa7: Threads: POSIX_THREADS
23:41:18:WU04:FS00:0xa7: OS Version: 4.19
23:41:18:WU04:FS00:0xa7:Has Battery: false
23:41:18:WU04:FS00:0xa7: On Battery: false
23:41:18:WU04:FS00:0xa7: UTC Offset: 10
23:41:18:WU04:FS00:0xa7: PID: 3147
23:41:18:WU04:FS00:0xa7: CWD: /var/lib/fahclient/work
23:41:18:WU04:FS00:0xa7:******************************** Build - libFAH ********************************
23:41:18:WU04:FS00:0xa7: Version: 0.0.19
23:41:18:WU04:FS00:0xa7: Author: Joseph Coffland <[email protected]>
23:41:18:WU04:FS00:0xa7: Copyright: 2019 foldingathome.org
23:41:18:WU04:FS00:0xa7: Homepage: https://foldingathome.org/
23:41:18:WU04:FS00:0xa7: Date: Nov 26 2019
23:41:18:WU04:FS00:0xa7: Time: 00:41:42
23:41:18:WU04:FS00:0xa7: Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
23:41:18:WU04:FS00:0xa7: Branch: master
23:41:18:WU04:FS00:0xa7: Compiler: GNU 8.3.0
23:41:18:WU04:FS00:0xa7: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
23:41:18:WU04:FS00:0xa7: -fno-pie
23:41:18:WU04:FS00:0xa7: Platform: linux2 4.19.0-5-amd64
23:41:18:WU04:FS00:0xa7: Bits: 64
23:41:18:WU04:FS00:0xa7: Mode: Release
23:41:18:WU04:FS00:0xa7:************************************ Build *************************************
23:41:18:WU04:FS00:0xa7: SIMD: avx_256
23:41:18:WU04:FS00:0xa7:********************************************************************************
23:41:18:WU04:FS00:0xa7:Project: 16431 (Run 978, Clone 1, Gen 248)
23:41:18:WU04:FS00:0xa7:Unit: 0x0000012196880e6e5e97db27d5381e25
23:41:18:WU04:FS00:0xa7:Digital signatures verified
23:41:18:WU04:FS00:0xa7:Reducing thread count from 22 to 21 to avoid domain decomposition with large prime factor 11
23:41:18:WU04:FS00:0xa7:Calling: mdrun -s frame248.tpr -o frame248.trr -x frame248.xtc -cpi state.cpt -cpt 15 -nt 21
23:41:18:WU04:FS00:0xa7:Steps: first=62000000 total=250000
23:41:19:WU00:FS02:Assigned to work server 192.0.2.1
23:41:19:WU00:FS02:Requesting new work unit for slot 02: READY gpu:1:R575A [Radeon R7 250X/HD 7700/8760] from 192.0.2.1
23:41:19:WU00:FS02:Connecting to 192.0.2.1:8080
23:41:20:WU04:FS00:0xa7:Completed 130502 out of 250000 steps (52%)
23:41:29:WU03:FS01:0x22:ERROR:Potential energy error of 11.7691, threshold of 10
23:41:29:WU03:FS01:0x22:ERROR:Reference Potential Energy: -55463.8 | Given Potential Energy: -55452
23:41:29:WU03:FS01:0x22:Saving result file ../logfile_01.txt
23:41:29:WU03:FS01:0x22:Saving result file science.log
23:41:29:WU03:FS01:0x22:Saving result file state.xml.bz2
23:41:29:WU03:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
23:41:29:WARNING:WU03:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
23:41:29:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13425 run:2567 clone:8 gen:5 core:0x22 unit:0x0000000812bc7d9a5f4ef9b4513f877c
23:41:29:WU03:FS01:Uploading 289.64KiB to 18.188.125.154
23:41:29:WU03:FS01:Connecting to 18.188.125.154:8080
23:41:30:WU01:FS01:Connecting to assign1.foldingathome.org:80
23:41:30:WU01:FS01:Assigned to work server 18.188.125.154
23:41:30:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GF106 [Quadro 2000] 480 from 18.188.125.154
23:41:30:WU01:FS01:Connecting to 18.188.125.154:8080
23:41:31:WU03:FS01:Upload complete
23:41:31:WU03:FS01:Server responded WORK_ACK (400)
23:41:31:WU03:FS01:Cleaning up
23:41:31:WU01:FS01:Downloading 350.23KiB
23:41:32:WU01:FS01:Download complete
23:41:32:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:13425 run:2454 clone:16 gen:1 core:0x22 unit:0x0000000112bc7d9a5f4ef9b816fdbca2
23:41:32:WU01:FS01:Starting
23:41:32:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit/22-0.0.11/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 3124 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
23:41:32:WU01:FS01:Started FahCore on PID 3229
23:41:32:WU01:FS01:Core PID:3233
23:41:32:WU01:FS01:FahCore 0x22 started
.
.
.
.
.
.
00:55:57:WU01:FS00:Download 32.25%
00:55:58:WU04:FS00:0xa7:Completed 250000 out of 250000 steps (100%)
00:56:03:WU04:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
00:56:03:WU04:FS00:Sending unit results: id:04 state:SEND error:NO_ERROR project:16431 run:978 clone:1 gen:248 core:0xa7 unit:0x0000012196880e6e5e97db27d5381e25
00:56:03:WU04:FS00:Uploading 4.22MiB to 150.136.14.110
00:56:03:WU04:FS00:Connecting to 150.136.14.110:8080
00:56:03:WU01:FS00:Download 35.32%
00:56:09:WU04:FS00:Upload 41.50%
00:56:09:WU01:FS00:Download 38.39%
00:56:15:WU01:FS00:Download 39.93%
00:56:15:WU04:FS00:Upload 80.04%
00:56:21:WU04:FS00:Upload complete
00:56:21:WU04:FS00:Server responded WORK_ACK (400)
00:56:21:WU04:FS00:Final credit estimate, 11876.00 points
00:56:21:WU04:FS00:Cleaning up
.
.
.
00:57:55:ERROR:WU00:FS02:Exception: Failed to connect to 192.0.2.1:80: Connection timed out
....... and so on
01:07:14:WU01:FS00:0xa7:Completed 11250 out of 125000 steps (9%)
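One thing that stands out in the System section of the log: both GPUs are listed, but the only CUDA and OpenCL device lines are on Bus:10, which is the Quadro. There is no OpenCL device on Bus:33 where the Radeon sits. The sketch below pulls that out of the log text. It assumes the client pairs GPUs with compute devices by bus and slot, which the log layout suggests but which I have not confirmed, and the function name `gpus_without_opencl` is made up for this example.

```python
import re

# Four lines copied verbatim from the System section of the log above.
SYSTEM_LINES = """\
23:41:17: GPU 0: Bus:10 Slot:0 Func:0 NVIDIA:2 GF106 [Quadro 2000] 480
23:41:17: GPU 1: Bus:33 Slot:0 Func:0 AMD:5 R575A [Radeon R7 250X/HD 7700/8760]
23:41:17: CUDA Device 0: Platform:0 Device:0 Bus:10 Slot:0 Compute:2.1 Driver:9.1
23:41:17:OpenCL Device 0: Platform:0 Device:0 Bus:10 Slot:0 Compute:1.1 Driver:390.138
"""

GPU_RE = re.compile(r"GPU (\d+): Bus:(\d+) Slot:(\d+)")
OPENCL_RE = re.compile(r"OpenCL Device \d+: .*?Bus:(\d+) Slot:(\d+)")


def gpus_without_opencl(log_text):
    """Return the indices of GPUs that have no OpenCL device on the same bus/slot."""
    gpus = {}        # GPU index -> (bus, slot)
    opencl = set()   # (bus, slot) pairs that have an OpenCL device
    for line in log_text.splitlines():
        match = OPENCL_RE.search(line)
        if match:
            opencl.add((match.group(1), match.group(2)))
            continue
        match = GPU_RE.search(line)
        if match:
            gpus[int(match.group(1))] = (match.group(2), match.group(3))
    return sorted(index for index, where in gpus.items() if where not in opencl)


print(gpus_without_opencl(SYSTEM_LINES))
```

If GPU 1 comes back in that list, the client is seeing the card on the PCI bus but the installed OpenCL driver is not exposing it as a compute device, which would fit the "No compute devices matched GPU #1" error.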