Error folding on GPU [OpenSUSE]

Post by **DJViking** » Wed Apr 20, 2016 6:04 pm

Well, only halfway there... I need to get FAHClient to run as the fahclient user. Perhaps I should mark this thread as solved, since it is no longer a Folding problem.

Post by **bruce** » Wed Apr 20, 2016 6:20 pm

I'm going to move this topic.

Stanford only supports 6 flavors of Unix:
Debian / Mint / Ubuntu
Redhat / Centos / Fedora
(plus you might also count MacOS)

There's nothing wrong with folks trying to help, but if it doesn't work, you may have to install a different distro.

Post by **DJViking** » Wed Apr 20, 2016 6:35 pm

RedHat and OpenSUSE are the same flavor, an RPM-based distribution. An RPM for RedHat will in most cases also work on OpenSUSE.

Post by **bruce** » Wed Apr 20, 2016 7:38 pm

DJViking wrote: RedHat and OpenSUSE are the same flavor, an RPM-based distribution. An RPM for RedHat will in most cases also work on OpenSUSE.

I agree ... except when they don't. All that's certain is that FAHClient, together with its installation procedure, was tested on RedHat and was NOT tested on OpenSUSE -- but both were long ago.

I have a test version of FAHClient which works on Ubuntu 15.xx but not on 14.xx. It turns out that I had to upgrade the runtime library in 14.xx.
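
A quick way to look for this kind of runtime-library mismatch is to list the shared libraries a binary needs and pick out any the loader cannot resolve. This is only a minimal sketch: the path /usr/bin/FAHClient is an assumption taken from the init script quoted later in this thread, `missing_libs` is a hypothetical helper name, and the filter only catches whole libraries reported as "not found", not symbol-version mismatches.

```shell
#!/bin/sh
# Print the names of shared libraries that the dynamic loader cannot resolve.
# Reads `ldd` output on stdin so the filter can be checked on sample text.
missing_libs() {
    # ldd marks an unresolved library as "libfoo.so.1 => not found"
    awk '/=> not found/ { print $1 }'
}

# Usage (the path is an assumption, see above):
#   ldd /usr/bin/FAHClient | missing_libs
```

An empty result only means that every library was found, not that the versions are new enough.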

Post by **bruce** » Wed Apr 20, 2016 7:53 pm

The standard Linux installation creates a new user "fahuser" which is configured to run the FAHClient daemon, which is started automatically by a script that was installed. (The script has to run as root, since it's issuing commands as fahuser.) Running FAHClient as root or as yourself is likely to run into a lot of the problems you're reporting.

Moreover, in your first post, you said you CHANGED a CPU slot to a GPU slot. You can't do that either since when FAHClient creates a slot, it validates the GPU and sets up specific things for it to run. Similarly, you can't replace, say, a Fermi with a Kepler without removing the slot and creating a new one.

Post by **DJViking** » Wed Apr 20, 2016 8:09 pm

It actually creates a new user "fahclient" which is configured to run the FAHClient daemon. It is root who starts the service, but the fahclient user who runs the application.

From the service script /etc/init.d/FAHClient

Code:

#!/bin/bash
# chkconfig: 2345 95 20
# description: Folding@home Client
# Starts FAHClient
# processname: FAHClient

### BEGIN INIT INFO
# Provides:          FAHClient
# Required-Start:    $remote_fs $syslog $network
# Required-Stop:     $remote_fs $syslog $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Folding@home Client
# Description:       Start and stop Folding@home Client daemon
### END INIT INFO

USER=fahclient
NAME=fahclient
CONFIG=/etc/$NAME/config.xml
DEFAULT=/etc/default/$NAME
HOME=/var/lib/$NAME
EXEC=/usr/bin/FAHClient
LOG=$HOME/log.txt
PID=/var/run/$NAME.pid
EXTRA_OPTS=
QUIET=true
ENABLE=true

The first thing I did after I installed FAHClient/FAHControl was change the configuration of the default CPU slot to GPU. I have since deleted it and added a new GPU slot.

Post by **DJViking** » Thu Apr 21, 2016 9:21 am

OpenCL information

Code:

fahclient@mintaka:~> clinfo
Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 1.2 CUDA 8.0.20
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts
  Platform Extensions function suffix             NV

  Platform Name                                   NVIDIA CUDA
Number of devices                                 1
  Device Name                                     GeForce GTX 650 Ti
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 1.2 CUDA
  Driver Version                                  361.42
  Device OpenCL C Version                         OpenCL C 1.2 
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Topology (NV)                            PCI-E, 01:00.0
  Max compute units                               4
  Max clock frequency                             954MHz
  Compute Capability (NV)                         3.0
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Preferred work group size multiple              32
  Warp size (NV)                                  32
  Preferred / native vector sizes                 
    char                                                 1 / 1       
    short                                                1 / 1       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 0 / 0        (n/a)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              1065734144 (1016MiB)
  Error Correction support                        No
  Max memory allocation                           266433536 (254.1MiB)
  Unified memory for Host and Device              No
  Integrated memory (NV)                          No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        65536
  Global Memory cache line                        128 bytes
  Image support                                   Yes
    Max number of samplers per kernel             32
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             4096x4096x4096 pixels
    Max number of read image args                 256
    Max number of write image args                16
  Local memory type                               Local
  Local memory size                               49152 (48KiB)
  Registers per block (NV)                        65536
  Max constant buffer size                        65536 (64KiB)
  Max number of constant args                     9
  Max size of kernel argument                     4352 (4.25KiB)
  Queue properties                                
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Kernel execution timeout (NV)                 Yes
  Concurrent copy and kernel execution (NV)       Yes
    Number of async copy engines                  1
  Prefer user sync for interop                    No
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [NV]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform

Found some documentation from Nvidia regarding CUDA development.
http://docs.nvidia.com/cuda/cuda-gettin ... stallation
It says to add the user to the video group. I have now done that, but it still can't fold on the GPU.
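
For reference, the group change and a way to verify it can be sketched as follows. The user and group names are the ones from this thread; `in_group_list` is a hypothetical helper, not part of any FAH tooling. Note that processes which are already running keep their old group list, so a service has to be restarted before the new membership can take effect.

```shell
#!/bin/sh
# Return success if the group name in $2 appears in the
# space-separated group list in $1 (the format `id -nG` prints).
in_group_list() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Usage, as root:
#   usermod -a -G video fahclient
#   in_group_list "$(id -nG fahclient)" video && echo "fahclient is in video"
```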

I tried setting logging to level 5 to see whether it would show why CUDA couldn't be detected, but there was nothing useful.

The problem seems to be running FAHClient as a service. If I execute /usr/bin/FAHClient on the command line, either as my own user (djviking) or as the fahclient user, it works perfectly.

Code:

*********************** Log Started 2016-04-21T14:17:02Z ***********************
14:17:02:************************* Folding@home Client *************************
14:17:02:    Website: http://folding.stanford.edu/
14:17:02:  Copyright: (c) 2009-2014 Stanford University
14:17:02:     Author: Joseph Coffland <[email protected]>
14:17:02:       Args: /etc/fahclient/config.xml
14:17:02:     Config: /etc/fahclient/config.xml
14:17:02:******************************** Build ********************************
14:17:02:    Version: 7.4.4
14:17:02:       Date: Mar 4 2014
14:17:02:       Time: 12:01:17
14:17:02:    SVN Rev: 4130
14:17:02:     Branch: fah/trunk/client
14:17:02:   Compiler: GNU 4.1.2 20080704 (Red Hat 4.1.2-46)
14:17:02:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
14:17:02:             -fno-unsafe-math-optimizations -msse2
14:17:02:   Platform: linux2 2.6.18-164.11.1.el5
14:17:02:       Bits: 64
14:17:02:       Mode: Release
14:17:02:******************************* System ********************************
14:17:02:        CPU: Intel(R) Core(TM)2 Duo CPU E7400 @ 2.80GHz
14:17:02:     CPU ID: GenuineIntel Family 6 Model 23 Stepping 10
14:17:02:       CPUs: 2
14:17:02:     Memory: 7.80GiB
14:17:02:Free Memory: 85.75MiB
14:17:02:    Threads: POSIX_THREADS
14:17:02: OS Version: 4.1
14:17:02:Has Battery: false
14:17:02: On Battery: false
14:17:02: UTC Offset: 2
14:17:02:        PID: 20877
14:17:02:        CWD: /var/lib/fahclient
14:17:02:         OS: Linux 4.1.20-11-default x86_64
14:17:02:    OS Arch: AMD64
14:17:02:       GPUs: 1
14:17:02:      GPU 0: NVIDIA:3 GK106 [GeForce GTX 650 Ti]
14:17:02:       CUDA: 3.0
14:17:02:CUDA Driver: 8000
14:17:02:***********************************************************************

Post by **bruce** » Thu Apr 21, 2016 3:24 pm

Windows acts a lot like what you're describing. (Microsoft prevents the GPU from being used by a service.) I've never heard of a similar limitation for Linux but I'm certainly no Linux expert.

Post by **DJViking** » Thu Apr 21, 2016 5:46 pm

Running as a service on Linux is usually not a problem, but in this case it looks like an access restriction on the GPU. Adding the user to the video group should fix that.
I think it has to do with access to the /dev/nvidia0 and /dev/nvidiactl device nodes, which on OpenSUSE are owned by root with group video.

Code:

djviking@mintaka:/> ll /dev/nvidia*
crw-rw----+ 1 root video 195,   0 21.04.2016 10:33:59 /dev/nvidia0
crw-rw----+ 1 root video 195, 255 21.04.2016 10:33:59 /dev/nvidiactl
crw-rw-rw-  1 root root  195, 254 21.04.2016 10:34:11 /dev/nvidia-modeset
crw-rw-rw-+ 1 root root  248,   0 21.04.2016 10:33:59 /dev/nvidia-uvm
crw-rw-rw-  1 root root  248,   1 21.04.2016 10:35:11 /dev/nvidia-uvm-tools
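
The listing above can be turned into a rough access check. This is a sketch under stated assumptions: `can_rw` is a hypothetical helper, it looks only at the group and "other" permission bits, and it ignores both the owner bits and any ACLs (the "+" after the mode in the listing), so a "no" here is a hint rather than proof.

```shell
#!/bin/sh
# Decide whether a user could open a device node read/write.
#   $1 = octal mode as printed by `stat -c %a` (e.g. 660)
#   $2 = owning group of the node
#   $3 = space-separated group list of the user (as printed by `id -nG`)
can_rw() {
    dev_mode=$1; dev_group=$2; user_groups=$3
    other=$(( $dev_mode % 10 ))
    grp=$(( ($dev_mode / 10) % 10 ))
    # rw in the "other" bits grants access to everyone
    [ $(( $other & 6 )) -eq 6 ] && return 0
    # rw in the group bits grants access only to members of the owning group
    if [ $(( $grp & 6 )) -eq 6 ]; then
        for g in $user_groups; do
            [ "$g" = "$dev_group" ] && return 0
        done
    fi
    return 1
}

# Usage:
#   can_rw "$(stat -c %a /dev/nvidia0)" "$(stat -c %G /dev/nvidia0)" \
#          "$(id -nG fahclient)" && echo "access looks OK"
```

With the values shown in the listing (mode 660, group video), the check passes only when video is in the user's group list.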

Post by **bruce** » Thu Apr 21, 2016 6:05 pm

[OFF TOPIC]
I wonder if Windows would somehow permit a user to join the video group. It sure would be nice if FAHClient.exe could actually run in the background.
[/OFF TOPIC]

Post by **DJViking** » Sat Apr 23, 2016 7:04 pm

I am now able to run FAHClient as a service with access to the GPU. I had to modify /etc/init.d/FAHClient and remove the --daemon option.
However, even though it can detect CUDA I still get "Bad platformId size".

Seems like the problem is when root is starting the FAHClient service. It works fine when I start the FAHClient as the fahclient user.

Does anyone know what the --daemon flag does in FAHClient? With this flag it starts two instances of FAHClient, and without it only one. According to FAHClient --help:

Code:

  daemon <boolean=false>
    Short for --pid --service --respawn --log='' --fork

Not able to detect CUDA:

Code:

@mintaka:~> ps -aux | grep FAH
fahclie+ 11312  0.0  0.0 21097356 6812 ?       Ssl  19:05   0:00 /usr/bin/FAHClient /etc/fahclient/config.xml --run-as fahclient --pid-file=/var/run/fahclient.pid --daemon
fahclie+ 11314  0.1  0.1 623440 12184 ?        Sl   19:05   0:00 /usr/bin/FAHClient --child --lifeline 11312 /etc/fahclient/config.xml --run-as fahclient --pid-file=/var/run/fahclient.pid --daemon

Able to detect CUDA:

Code:

mintaka:/home/sverrem # ps -axu | grep FAH
fahclie+  3072  0.1  0.2 21687552 18376 ?      Sl   20:58   0:00 /usr/bin/FAHClient /etc/fahclient/config.xml --run-as fahclient --pid-file=/var/run/fahclient.pid

Post by **calxalot** » Sun Apr 24, 2016 4:36 am

I suspect the issue is that the daemonizing client launches as root:wheel, does a fork(), then the child does a setuid() to "fahclient", and does not alter the group.
Thus, no group "video" when running as service.

Can you show us the user and group of the running FAHClients?
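
Besides the user and primary group, the supplementary groups of a running process can be read from the "Groups:" line of /proc/<pid>/status on Linux. A minimal sketch for extracting that line; `proc_groups` is a hypothetical helper name, and the numeric IDs it prints have to be compared against the GID of video by hand.

```shell
#!/bin/sh
# Print the supplementary group IDs found in the contents of
# /proc/<pid>/status (read from stdin), as a space-separated list.
proc_groups() {
    awk '/^Groups:/ { $1 = ""; sub(/^ +/, ""); print }'
}

# Usage:
#   for pid in $(pgrep -x FAHClient); do
#       proc_groups < "/proc/$pid/status"
#   done
#   getent group video    # shows the numeric GID of "video"
```

If the GID of video is missing from the list, the process was started without that group, whatever /etc/group says.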

Post by **DJViking** » Sun Apr 24, 2016 8:08 am

Both instances have root as their group: fahclient:root

Code:

mintaka:/> ps -C FAHClient -o user,group,pid,comm
USER     GROUP      PID COMMAND
fahclie+ root      5371 FAHClient
fahclie+ root      5373 FAHClient

Post by **DJViking** » Sun Apr 24, 2016 11:18 am

My modified /etc/init.d/FAHClient

Code:

    OPTS+="$EXTRA_OPTS "
    OPTS+="--run-as $USER "
    OPTS+="--pid "
    OPTS+="--pid-file=$PID "
    OPTS+="--service "
    #OPTS+="--daemon "

Considering that --daemon is short for --pid --service --respawn --log='' --fork
I could not use --fork, --respawn and --log=''. FAHClient would not work at all with them.

Prior to running the EXEC in this script I perform su - fahclient.
Running FAHClient does detect CUDA this way, but is still unable to fold on the GPU. It also has USER:GROUP fahclient:root.

When running FAHClient manually, the process user and group are now fahclient:users:

Code:

su
su - fahclient
/usr/bin/FAHClient /etc/fahclient/config.xml --run-as fahclient --pid --pid-file=/var/lib/fahclient/fahclient.pid --service
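
One possible explanation for the difference, offered as a guess rather than a diagnosis: inside a script, a line containing only su - fahclient starts a separate shell as fahclient, and once that shell exits the following lines still run as the script's original user (root). To run one command as another user from a script, the command has to be handed to su with -c. The sketch below only prints such a command line; the paths and user name are copied from the init script quoted earlier, `start_cmd` is a hypothetical helper, and -s /bin/sh is an assumption in case the fahclient account has no login shell.

```shell
#!/bin/sh
# Values copied from the init script quoted earlier in the thread.
USER_NAME=fahclient
EXEC=/usr/bin/FAHClient
CONFIG=/etc/fahclient/config.xml

# Build the command line that would start the client through `su -c`,
# so that the client process itself (not a throwaway shell) is started
# with the user's primary and supplementary groups.
start_cmd() {
    printf 'su -s /bin/sh -c "%s %s" %s\n' "$EXEC" "$CONFIG" "$USER_NAME"
}

start_cmd   # only prints the command; run it by hand as root to try it
```

Because su sets up the target user's group list before running the command, this is one way to test whether the missing video group is what blocks the GPU.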

Post by **calxalot** » Sun Apr 24, 2016 8:42 pm

Does OpenSUSE have the start-stop-daemon command?
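
Checking for the command is straightforward. start-stop-daemon comes from Debian's dpkg tooling, so it may well be absent on an RPM-based system; where it is available, its --chuid option accepts user:group, which would let the init script set the group explicitly. A minimal sketch, with `have_cmd` as a hypothetical helper:

```shell
#!/bin/sh
# Report whether a command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd start-stop-daemon; then
    echo "start-stop-daemon: found"
else
    echo "start-stop-daemon: not found"
fi
```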
