AMD 6700XT on Linux fails

If you think it might be a driver problem, see viewforum.php?f=79

zalgo
Posts: 3
Joined: Sat Jan 21, 2023 5:54 am

AMD 6700XT on Linux fails

Post by zalgo »

I'm a bit new to this, so I could be doing something totally wrong, but I've tried my best to debug it on my own and have failed. I've tried both the Mesa OpenCL drivers and the proprietary AMD ones, and the issue occurs with both. System info is as follows:

************************************ libFAH ************************************
       Date: Oct 20 2020
       Time: 20:36:39
   Revision: 5ca109d295a6245e2a2f590b3d0085ad5e567aeb
     Branch: master
   Compiler: GNU 8.3.0
    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
             -fdata-sections -O3 -funroll-loops -fno-pie
   Platform: linux2 5.8.0-1-amd64
       Bits: 64
       Mode: Release
********************************** FAHClient ***********************************
    Version: 7.6.21
     Author: Joseph Coffland <[email protected]>
  Copyright: 2020 foldingathome.org
   Homepage: https://foldingathome.org/
       Date: Oct 20 2020
       Time: 20:39:00
   Revision: 6efbf0e138e22d3963e6a291f78dcb9c6422a278
     Branch: master
   Compiler: GNU 8.3.0
    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
             -fdata-sections -O3 -funroll-loops -fno-pie
   Platform: linux2 5.8.0-1-amd64
       Bits: 64
       Mode: Release
       Args: --info
************************************ CBang *************************************
       Date: Oct 20 2020
       Time: 18:37:59
   Revision: 7e4ce85225d7eaeb775e87c31740181ca603de60
     Branch: master
   Compiler: GNU 8.3.0
    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
   Platform: linux2 5.8.0-1-amd64
       Bits: 64
       Mode: Release
************************************ System ************************************
        CPU: AMD Ryzen 5 3600X 6-Core Processor
     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
       CPUs: 12
     Memory: 31.27GiB
Free Memory: 10.70GiB
    Threads: POSIX_THREADS
 OS Version: 6.1
Has Battery: false
 On Battery: false
 UTC Offset: -6
        PID: 51245
        CWD: /home/zalgo/fold
********************************************************************************

Running FAHClient as my own user or as a systemd service does detect the GPU and attempt to give it work, but it repeatedly fails in the following manner:

06:02:22:WU01:FS01:Starting
06:02:22:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /home/zalgo/fold/cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 50509 -checkpoint 15 -opencl-platform 2 -opencl-device 0 -gpu-vendor amd -gpu 0 -gpu-usage 100
06:02:22:WU01:FS01:Started FahCore on PID 50858
06:02:22:WU01:FS01:Core PID:50862
06:02:22:WU01:FS01:FahCore 0x22 started
Xlib:  extension "MIT-SCREEN-SAVER" missing on display ":0".
06:02:23:WU01:FS01:0x22:*********************** Log Started 2023-01-21T06:02:22Z ***********************
06:02:23:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
06:02:23:WU01:FS01:0x22:       Core: Core22
06:02:23:WU01:FS01:0x22:       Type: 0x22
06:02:23:WU01:FS01:0x22:    Version: 0.0.20
06:02:23:WU01:FS01:0x22:     Author: Joseph Coffland <[email protected]>
06:02:23:WU01:FS01:0x22:  Copyright: 2020 foldingathome.org
06:02:23:WU01:FS01:0x22:   Homepage: https://foldingathome.org/
06:02:23:WU01:FS01:0x22:       Date: Jan 20 2022
06:02:23:WU01:FS01:0x22:       Time: 00:57:52
06:02:23:WU01:FS01:0x22:   Revision: 3f211b8a4346514edbff34e3cb1c0e0ec951373c
06:02:23:WU01:FS01:0x22:     Branch: HEAD
06:02:23:WU01:FS01:0x22:   Compiler: GNU 9.4.0
06:02:23:WU01:FS01:0x22:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
06:02:23:WU01:FS01:0x22:             -fdata-sections -O3 -funroll-loops -fno-pie
06:02:23:WU01:FS01:0x22:             -DOPENMM_VERSION="\"7.7.0\""
06:02:23:WU01:FS01:0x22:   Platform: linux 5.11.0-1025-azure
06:02:23:WU01:FS01:0x22:       Bits: 64
06:02:23:WU01:FS01:0x22:       Mode: Release
06:02:23:WU01:FS01:0x22:Maintainers: John Chodera <[email protected]> and Peter Eastman
06:02:23:WU01:FS01:0x22:             <[email protected]>
06:02:23:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 50858 -checkpoint 15
06:02:23:WU01:FS01:0x22:             -opencl-platform 2 -opencl-device 0 -gpu-vendor amd -gpu 0
06:02:23:WU01:FS01:0x22:             -gpu-usage 100
06:02:23:WU01:FS01:0x22:************************************ libFAH ************************************
06:02:23:WU01:FS01:0x22:       Date: Jan 20 2022
06:02:23:WU01:FS01:0x22:       Time: 00:57:22
06:02:23:WU01:FS01:0x22:   Revision: 9f4ad694e75c2350d4bb6b8b5b769ba27e483a2f
06:02:23:WU01:FS01:0x22:     Branch: HEAD
06:02:23:WU01:FS01:0x22:   Compiler: GNU 9.4.0
06:02:23:WU01:FS01:0x22:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
06:02:23:WU01:FS01:0x22:             -fdata-sections -O3 -funroll-loops -fno-pie
06:02:23:WU01:FS01:0x22:   Platform: linux 5.11.0-1025-azure
06:02:23:WU01:FS01:0x22:       Bits: 64
06:02:23:WU01:FS01:0x22:       Mode: Release
06:02:23:WU01:FS01:0x22:************************************ CBang *************************************
06:02:23:WU01:FS01:0x22:       Date: Jan 20 2022
06:02:23:WU01:FS01:0x22:       Time: 00:57:00
06:02:23:WU01:FS01:0x22:   Revision: ab023d155b446906d55b0f6c9a1eedeea04f7a1a
06:02:23:WU01:FS01:0x22:     Branch: HEAD
06:02:23:WU01:FS01:0x22:   Compiler: GNU 9.4.0
06:02:23:WU01:FS01:0x22:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
06:02:23:WU01:FS01:0x22:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
06:02:23:WU01:FS01:0x22:   Platform: linux 5.11.0-1025-azure
06:02:23:WU01:FS01:0x22:       Bits: 64
06:02:23:WU01:FS01:0x22:       Mode: Release
06:02:23:WU01:FS01:0x22:************************************ System ************************************
06:02:23:WU01:FS01:0x22:        CPU: AMD Ryzen 5 3600X 6-Core Processor
06:02:23:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
06:02:23:WU01:FS01:0x22:       CPUs: 12
06:02:23:WU01:FS01:0x22:     Memory: 31.27GiB
06:02:23:WU01:FS01:0x22:Free Memory: 10.41GiB
06:02:23:WU01:FS01:0x22:    Threads: POSIX_THREADS
06:02:23:WU01:FS01:0x22: OS Version: 6.1
06:02:23:WU01:FS01:0x22:Has Battery: false
06:02:23:WU01:FS01:0x22: On Battery: false
06:02:23:WU01:FS01:0x22: UTC Offset: -6
06:02:23:WU01:FS01:0x22:        PID: 50862
06:02:23:WU01:FS01:0x22:        CWD: /home/zalgo/fold/work
06:02:23:WU01:FS01:0x22:************************************ OpenMM ************************************
06:02:23:WU01:FS01:0x22:    Version: 7.7.0
06:02:23:WU01:FS01:0x22:********************************************************************************
06:02:23:WU01:FS01:0x22:Project: 18449 (Run 5, Clone 94, Gen 684)
06:02:23:WU01:FS01:0x22:Reading tar file core.xml
06:02:23:WU01:FS01:0x22:Reading tar file integrator.xml
06:02:23:WU01:FS01:0x22:Reading tar file state.xml
06:02:23:WU01:FS01:0x22:Reading tar file system.xml
06:02:23:WU01:FS01:0x22:Digital signatures verified
06:02:23:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
06:02:23:WU01:FS01:0x22:Version 0.0.20
06:02:23:WU01:FS01:0x22:  Checkpoint write interval: 50000 steps (2%) [50 total]
06:02:23:WU01:FS01:0x22:  JSON viewer frame write interval: 25000 steps (1%) [100 total]
06:02:23:WU01:FS01:0x22:  XTC frame write interval: 2500000 steps (1e+02%) [1 total]
06:02:23:WU01:FS01:0x22:  Global context and integrator variables write interval: disabled
06:02:23:WU01:FS01:0x22:There are 3 platforms available.
06:02:23:WU01:FS01:0x22:Platform 0: Reference
06:02:23:WU01:FS01:0x22:Platform 1: CPU
06:02:23:WU01:FS01:0x22:Platform 2: OpenCL
06:02:23:WU01:FS01:0x22:  opencl-device 0 specified
Xlib:  extension "MIT-SCREEN-SAVER" missing on display ":0".
Xlib:  extension "MIT-SCREEN-SAVER" missing on display ":0".
Xlib:  extension "MIT-SCREEN-SAVER" missing on display ":0".
06:02:26:WU01:FS01:0x22:Attempting to create OpenCL context:
06:02:26:WU01:FS01:0x22:  Configuring platform OpenCL
06:02:26:WU01:FS01:0x22:Failed to create OpenCL context:
06:02:26:WU01:FS01:0x22:Illegal value for OpenCLPlatformIndex: 2
06:02:26:WU01:FS01:0x22:ERROR:125: Failed to create a GPU-enabled OpenMM Context.
06:02:26:WU01:FS01:0x22:Saving result file ../logfile_01.txt
06:02:26:WU01:FS01:0x22:Saving result file science.log
06:02:26:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
06:02:26:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:02:26:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:18449 run:5 clone:94 gen:684 core:0x22 unit:0x0000005e000002ac0000481100000005
06:02:26:WU01:FS01:Uploading 11.00KiB to 129.32.209.202
06:02:26:WU01:FS01:Connecting to 129.32.209.202:8080
06:02:26:WU01:FS01:Upload complete
06:02:26:WU01:FS01:Server responded WORK_ACK (400)
06:02:26:WU01:FS01:Cleaning up
06:02:27:WU02:FS01:Connecting to assign1.foldingathome.org:80
Xlib:  extension "MIT-SCREEN-SAVER" missing on display ":0".
06:02:27:WU02:FS01:Assigned to work server 129.32.209.202
06:02:27:WU02:FS01:Requesting new work unit for slot 01: gpu:9:0 Navi 22 XT-XL [Radeon RX 6700/6700XT/6800M] from 129.32.209.202

Have I misconfigured something, or is this an error with OpenMM itself?
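
For anyone comparing their own log against this one, the two lines that matter can be pulled out mechanically. The following is a minimal sketch, not part of FAHClient; the function name `summarize_opencl_failure` and the `sample` string are illustrative, with the sample text copied from the log above. A rejected index equal to the requested one would be consistent with the OpenCL runtime exposing fewer platforms than the client expected, though the log alone does not prove that.

```python
import re

def summarize_opencl_failure(log_text):
    """Extract the requested OpenCL platform/device and any rejected platform index."""
    requested = re.search(r"-opencl-platform (\d+) -opencl-device (\d+)", log_text)
    rejected = re.search(r"Illegal value for OpenCLPlatformIndex: (\d+)", log_text)
    return {
        "requested_platform": int(requested.group(1)) if requested else None,
        "requested_device": int(requested.group(2)) if requested else None,
        "rejected_platform": int(rejected.group(1)) if rejected else None,
    }

# Lines copied from the log in this post.
sample = (
    "06:02:23:WU01:FS01:0x22:             -opencl-platform 2 -opencl-device 0 -gpu-vendor amd -gpu 0\n"
    "06:02:26:WU01:FS01:0x22:Failed to create OpenCL context:\n"
    "06:02:26:WU01:FS01:0x22:Illegal value for OpenCLPlatformIndex: 2\n"
)

if __name__ == "__main__":
    print(summarize_opencl_failure(sample))
```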
Joe_H
Site Admin
Posts: 7926
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: AMD 6700XT on Linux fails

Post by Joe_H »

Welcome to the folding support forum.

Which version of Linux are you using? The installer was created some years ago and has not been updated for the latest versions of the various distros. Depending on which one you use, you may need to add the fahclient user to one or more groups, such as the "video" group. There are some write-ups on that here.

As for the drivers, make certain the default drivers are unloaded. The Mesa open source drivers have never worked for folding. For an AMD Radeon RX 6700 XT, your options are the AMD proprietary drivers and the ROCm open source drivers. There are older posts on how to install the ROCm drivers.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
zalgo
Posts: 3
Joined: Sat Jan 21, 2023 5:54 am

Re: AMD 6700XT on Linux fails

Post by zalgo »

This is running under Arch Linux. I did notice a mention on the wiki that the AMD proprietary OpenCL drivers would be needed, so I uninstalled the Mesa ones and installed the proprietary ones. Would any configuration beyond that be needed? If not, then I tested with that correctly. I did also consider the video group, and my user was added to it before any of the above testing took place.
Joe_H
Site Admin
Posts: 7926
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: AMD 6700XT on Linux fails

Post by Joe_H »

You may also need to install the OpenCL developer package.

Which user did you add to the video group? Your own account, or the fahclient user that should have been created by the installer to run the FAHClient background process of the F@h software? Digging into some older posts, some distros also require the user to be added to the "render" group.

One of the troubleshooting methods used when this first came up with newer Linux distros was running FAHClient from the root account. If it works then, but not from the default account, it is a group membership issue.
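
A quick way to check the group membership discussed above is to query the system's user and group databases. This is a rough sketch using Python's standard `pwd` and `grp` modules; the helper names are made up for illustration, and the account names `fahclient` and `fah` are simply the ones mentioned in this thread, so substitute whatever account your install actually uses.

```python
import grp
import pwd

def groups_for(username):
    """Return the names of all groups the user belongs to (primary and supplementary)."""
    # Raises KeyError if the user or its primary group is unknown.
    primary_gid = pwd.getpwnam(username).pw_gid
    names = {g.gr_name for g in grp.getgrall() if username in g.gr_mem}
    names.add(grp.getgrgid(primary_gid).gr_name)
    return names

def missing_groups(have, wanted=("video", "render")):
    """Return the wanted groups that are absent from `have`, in the order given."""
    return [g for g in wanted if g not in have]

if __name__ == "__main__":
    # Account names taken from this thread; adjust for your system.
    for user in ("fahclient", "fah"):
        try:
            print(user, "is missing:", missing_groups(groups_for(user)))
        except KeyError:
            print(user, "could not be looked up on this system")
```

Note that this reads the group database, not the groups of an already running process, so a service started before the account was added to a group would still need a restart.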
zalgo
Posts: 3
Joined: Sat Jan 21, 2023 5:54 am

Re: AMD 6700XT on Linux fails

Post by zalgo »

The OpenCL dev package from AMD was installed for all of the above tests as well.

My own user was added to the video group, and judging by the service file included with the AUR package (attached below), the service runs as user 'fah', which is automatically added to the video group. Running it directly as the root user does not remedy the issue. As far as I can tell, the render group is unused on Arch Linux, but for good measure I added both my user and the fah user to that group. No change.

# Type=simple
# User=fah
# SupplementaryGroups=video
# DynamicUser=yes
# ConfigurationDirectory=foldingathome
# LogsDirectory=foldingathome
# StateDirectory=fah
# WorkingDirectory=/var/lib/fah
# ReadWritePaths=-/dev/dri
# ExecStartPre=!/usr/bin/chown -R fah:fah /etc/foldingathome
# ExecStart=/usr/bin/FAHClient --config /etc/foldingathome/config.xml \
#                              --log /var/log/foldingathome/log.txt \
#                              --log-rotate-dir /var/log/foldingathome
# CPUSchedulingPolicy=idle
# IOSchedulingClass=3
#
# # Nvidia
# ReadWritePaths=-/dev/nvidia0
# ReadWritePaths=-/dev/nvidiactl
# ReadWritePaths=-/dev/nvidia-uvm
# ReadWritePaths=-/dev/nvidia-uvm-tools
#
# [Install]
# WantedBy=default.target
tchiers
Posts: 23
Joined: Tue Oct 23, 2018 4:23 am

Re: AMD 6700XT on Linux fails

Post by tchiers »

With that card, you will want to use the latest ROCm OpenCL drivers from AMD.

Also check that /etc/OpenCL/vendors contains only the single .icd for the AMD driver; something in the FAH stack does not handle multiple .icd files correctly.
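
The .icd check can be done by eye with a directory listing, but here is a small sketch of the same idea in Python for anyone scripting their setup. The function name `list_icd_files` is hypothetical, and the script only reports what it finds; deciding which file belongs to the AMD driver is left to the reader, since the file names vary by package.

```python
from pathlib import Path

def list_icd_files(vendors_dir="/etc/OpenCL/vendors"):
    """Return the sorted names of all .icd files in the given directory."""
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return []  # no ICD directory at all
    return sorted(p.name for p in directory.glob("*.icd"))

if __name__ == "__main__":
    found = list_icd_files()
    print("ICD files:", found)
    if len(found) > 1:
        print("More than one .icd is registered; per the advice above, "
              "try leaving only the AMD one in place.")
```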

You will also need to use the workaround I describe here: viewtopic.php?t=38848