Error folding on GPU [OpenSUSE]
Moderators: Site Moderators, FAHC Science Team
Re: Error folding on GPU [OpenSUSE]
Well, only halfway there... I still need to get FAHClient to run as the fahclient user. Perhaps I should mark this thread as solved, as it is no longer a Folding problem.
Re: Error folding on GPU [OpenSUSE]
I'm going to move this topic.
Stanford only supports 6 flavors of Unix:
Debian / Mint / Ubuntu
Redhat / Centos / Fedora
(plus you might also count MacOS)
There's nothing wrong with folks trying to help, but if it doesn't work, you may have to install a different distro.
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Error folding on GPU [OpenSUSE]
RedHat and OpenSUSE are the same flavor: both are RPM-based distributions. An RPM for RedHat will in most cases also work on OpenSUSE.
Re: Error folding on GPU [OpenSUSE]
DJViking wrote: RedHat and OpenSUSE are same flavor, an RPM based distribution. An RPM for RedHat will in most cases also work on OpenSUSE.
I agree ... except when they don't. All that's certain is that FAHClient, together with its installation procedure, was tested on RedHat and was NOT tested on OpenSUSE -- and that was long ago.
I have a test version of FAHClient which works on Ubuntu 15.xx but not on 14.xx. It turns out that I had to upgrade the runtime library in 14.xx.
Re: Error folding on GPU [OpenSUSE]
The standard Linux installation creates a new user "fahuser" which is configured to run the FAHClient daemon, which is started automatically by a script that was installed. (The script has to run as root, since it's issuing commands as fahuser.) Running FAHClient as root or as yourself is likely to run into a lot of the problems you're reporting.
Moreover, in your first post, you said you CHANGED a CPU slot to a GPU slot. You can't do that either since when FAHClient creates a slot, it validates the GPU and sets up specific things for it to run. Similarly, you can't replace, say, a Fermi with a Kepler without removing the slot and creating a new one.
Re: Error folding on GPU [OpenSUSE]
It actually creates a new user "fahclient" which is configured to run the FAHClient daemon. It is root who starts the service, but the fahclient user who runs the application.
The first thing I did after installing FAHClient/FAHControl was to change the configuration of the default CPU slot to GPU. I have since deleted it and added a new GPU slot.
From the service script /etc/init.d/FAHClient
Code: Select all
#!/bin/bash
# chkconfig: 2345 95 20
# description: Folding@home Client
# Starts FAHClient
# processname: FAHClient
### BEGIN INIT INFO
# Provides: FAHClient
# Required-Start: $remote_fs $syslog $network
# Required-Stop: $remote_fs $syslog $network
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Folding@home Client
# Description: Start and stop Folding@home Client daemon
### END INIT INFO
USER=fahclient
NAME=fahclient
CONFIG=/etc/$NAME/config.xml
DEFAULT=/etc/default/$NAME
HOME=/var/lib/$NAME
EXEC=/usr/bin/FAHClient
LOG=$HOME/log.txt
PID=/var/run/$NAME.pid
EXTRA_OPTS=
QUIET=true
ENABLE=true
Re: Error folding on GPU [OpenSUSE]
Found some documentation from Nvidia regarding CUDA development.
http://docs.nvidia.com/cuda/cuda-gettin ... stallation
It says to add the user to the video group, which I have now done, but it still can't fold on the GPU.
I set logging to level 5 to see whether there was more detail on why it couldn't detect CUDA, but there was nothing useful.
The problem seems to be running FAHClient as a service. If I execute /usr/bin/FAHClient on the command line, either as my user djviking or as the fahclient user, it works perfectly.
OpenCL information:
Code: Select all
fahclient@mintaka:~> clinfo
Number of platforms 1
Platform Name NVIDIA CUDA
Platform Vendor NVIDIA Corporation
Platform Version OpenCL 1.2 CUDA 8.0.20
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts
Platform Extensions function suffix NV
Platform Name NVIDIA CUDA
Number of devices 1
Device Name GeForce GTX 650 Ti
Device Vendor NVIDIA Corporation
Device Vendor ID 0x10de
Device Version OpenCL 1.2 CUDA
Driver Version 361.42
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile FULL_PROFILE
Device Topology (NV) PCI-E, 01:00.0
Max compute units 4
Max clock frequency 954MHz
Compute Capability (NV) 3.0
Device Partition (core)
Max number of sub-devices 1
Supported partition types None
Max work item dimensions 3
Max work item sizes 1024x1024x64
Max work group size 1024
Preferred work group size multiple 32
Warp size (NV) 32
Preferred / native vector sizes
char 1 / 1
short 1 / 1
int 1 / 1
long 1 / 1
half 0 / 0 (n/a)
float 1 / 1
double 1 / 1 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Address bits 64, Little-Endian
Global memory size 1065734144 (1016MiB)
Error Correction support No
Max memory allocation 266433536 (254.1MiB)
Unified memory for Host and Device No
Integrated memory (NV) No
Minimum alignment for any data type 128 bytes
Alignment of base address 4096 bits (512 bytes)
Global Memory cache type Read/Write
Global Memory cache size 65536
Global Memory cache line 128 bytes
Image support Yes
Max number of samplers per kernel 32
Max size for 1D images from buffer 134217728 pixels
Max 1D or 2D image array size 2048 images
Max 2D image size 16384x16384 pixels
Max 3D image size 4096x4096x4096 pixels
Max number of read image args 256
Max number of write image args 16
Local memory type Local
Local memory size 49152 (48KiB)
Registers per block (NV) 65536
Max constant buffer size 65536 (64KiB)
Max number of constant args 9
Max size of kernel argument 4352 (4.25KiB)
Queue properties
Out-of-order execution Yes
Profiling Yes
Profiling timer resolution 1000ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Kernel execution timeout (NV) Yes
Concurrent copy and kernel execution (NV) Yes
Number of async copy engines 1
Prefer user sync for interop No
printf() buffer size 1048576 (1024KiB)
Built-in kernels
Device Available Yes
Compiler Available Yes
Linker Available Yes
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No platform
clCreateContext(NULL, ...) [default] No platform
clCreateContext(NULL, ...) [other] Success [NV]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) No platform
Code: Select all
*********************** Log Started 2016-04-21T14:17:02Z ***********************
14:17:02:************************* Folding@home Client *************************
14:17:02: Website: http://folding.stanford.edu/
14:17:02: Copyright: (c) 2009-2014 Stanford University
14:17:02: Author: Joseph Coffland <[email protected]>
14:17:02: Args: /etc/fahclient/config.xml
14:17:02: Config: /etc/fahclient/config.xml
14:17:02:******************************** Build ********************************
14:17:02: Version: 7.4.4
14:17:02: Date: Mar 4 2014
14:17:02: Time: 12:01:17
14:17:02: SVN Rev: 4130
14:17:02: Branch: fah/trunk/client
14:17:02: Compiler: GNU 4.1.2 20080704 (Red Hat 4.1.2-46)
14:17:02: Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
14:17:02: -fno-unsafe-math-optimizations -msse2
14:17:02: Platform: linux2 2.6.18-164.11.1.el5
14:17:02: Bits: 64
14:17:02: Mode: Release
14:17:02:******************************* System ********************************
14:17:02: CPU: Intel(R) Core(TM)2 Duo CPU E7400 @ 2.80GHz
14:17:02: CPU ID: GenuineIntel Family 6 Model 23 Stepping 10
14:17:02: CPUs: 2
14:17:02: Memory: 7.80GiB
14:17:02:Free Memory: 85.75MiB
14:17:02: Threads: POSIX_THREADS
14:17:02: OS Version: 4.1
14:17:02:Has Battery: false
14:17:02: On Battery: false
14:17:02: UTC Offset: 2
14:17:02: PID: 20877
14:17:02: CWD: /var/lib/fahclient
14:17:02: OS: Linux 4.1.20-11-default x86_64
14:17:02: OS Arch: AMD64
14:17:02: GPUs: 1
14:17:02: GPU 0: NVIDIA:3 GK106 [GeForce GTX 650 Ti]
14:17:02: CUDA: 3.0
14:17:02:CUDA Driver: 8000
14:17:02:***********************************************************************
Re: Error folding on GPU [OpenSUSE]
Windows acts a lot like what you're describing. (Microsoft prevents the GPU from being used by a service.) I've never heard of a similar limitation for Linux but I'm certainly no Linux expert.
Re: Error folding on GPU [OpenSUSE]
Running as a service on Linux is usually not a problem, but in this case it looks like an access restriction on the GPU. Adding the user to the video group should fix that.
I think it has to do with access to the /dev/nvidia* device nodes. On OpenSUSE, /dev/nvidia0 and /dev/nvidiactl are owned by root and have video as their group.
Code: Select all
djviking@mintaka:/> ll /dev/nvidia*
crw-rw----+ 1 root video 195, 0 21.04.2016 10:33:59 /dev/nvidia0
crw-rw----+ 1 root video 195, 255 21.04.2016 10:33:59 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 21.04.2016 10:34:11 /dev/nvidia-modeset
crw-rw-rw-+ 1 root root 248, 0 21.04.2016 10:33:59 /dev/nvidia-uvm
crw-rw-rw- 1 root root 248, 1 21.04.2016 10:35:11 /dev/nvidia-uvm-tools
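One way to check this theory is to compare the group that owns a device node with the groups the service account belongs to. The following is only a minimal sketch: the helper names in_list and check_device_group are made up for illustration, and stat -c %G assumes GNU coreutils.

```shell
#!/bin/sh
# Sketch: report whether a user is in the group that owns a device node.
# in_list and check_device_group are hypothetical helpers, not part of FAHClient.

# Return 0 if word $1 appears in the space-separated list $2.
in_list() {
    for item in $2; do
        [ "$item" = "$1" ] && return 0
    done
    return 1
}

# Print whether user $1 belongs to the owning group of device $2.
check_device_group() {
    user=$1
    dev=$2
    dev_group=$(stat -c %G "$dev") || return 2   # owning group of the node
    user_groups=$(id -nG "$user") || return 2    # all group names of the user
    if in_list "$dev_group" "$user_groups"; then
        echo "$user is in group $dev_group ($dev)"
    else
        echo "$user is NOT in group $dev_group ($dev)"
    fi
}

# Example call, using the user and device names from this thread:
# check_device_group fahclient /dev/nvidia0
```

Note that this only checks the account's configured groups. A process that is already running keeps whatever groups it was started with, so the service has to be restarted after the account is added to a group.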
Re: Error folding on GPU [OpenSUSE]
[OFF TOPIC]
I wonder if Windows would somehow permit a user to join the video group. It sure would be nice if FAHClient.exe could actually run in the background.
[/OFF TOPIC]
Re: Error folding on GPU [OpenSUSE]
I am now able to run FAHClient as a service and have access to the GPU. I had to modify /etc/init.d/FAHClient and remove the --daemon option.
However, even though it can detect CUDA, I still get "Bad platformId size".
The problem seems to occur when root starts the FAHClient service. It works fine when I start FAHClient as the fahclient user.
Does anyone know what the --daemon flag does in FAHClient? With this flag it starts two instances of FAHClient; without it, only one. According to FAHClient --help:
Code: Select all
daemon <boolean=false>
Short for --pid --service --respawn --log='' --fork
Not able to detect CUDA:
Code: Select all
@mintaka:~> ps -aux | grep FAH
fahclie+ 11312 0.0 0.0 21097356 6812 ? Ssl 19:05 0:00 /usr/bin/FAHClient /etc/fahclient/config.xml --run-as fahclient --pid-file=/var/run/fahclient.pid --daemon
fahclie+ 11314 0.1 0.1 623440 12184 ? Sl 19:05 0:00 /usr/bin/FAHClient --child --lifeline 11312 /etc/fahclient/config.xml --run-as fahclient --pid-file=/var/run/fahclient.pid --daemon
Able to detect CUDA:
Code: Select all
mintaka:/home/sverrem # ps -axu | grep FAH
fahclie+ 3072 0.1 0.2 21687552 18376 ? Sl 20:58 0:00 /usr/bin/FAHClient /etc/fahclient/config.xml --run-as fahclient --pid-file=/var/run/fahclient.pid
Re: Error folding on GPU [OpenSUSE]
I suspect the issue is that the daemonizing client launches as root:wheel, does a fork(), then the child does a setuid() to "fahclient", and does not alter the group.
Thus, no group "video" when running as service.
Can you show us the user and group of the running FAHClients?
Re: Error folding on GPU [OpenSUSE]
Both instances have root as their group: fahclient:root.
Code: Select all
mintaka:/> ps -C FAHClient -o user,group,pid,comm
USER GROUP PID COMMAND
fahclie+ root 5371 FAHClient
fahclie+ root 5373 FAHClient
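The group column in ps shows only the effective primary group of each process. Membership in video would be a supplementary group, and on Linux those are listed on the Groups: line of /proc/<pid>/status. A minimal sketch for reading that line follows; proc_groups is a hypothetical helper name, and the commented-out loop assumes pgrep is installed.

```shell
#!/bin/sh
# Sketch: list the supplementary group IDs of a running process.
# proc_groups is a hypothetical helper; it parses a /proc/<pid>/status file.

# Print the numeric supplementary GIDs from the status file given as $1.
proc_groups() {
    # Blank the "Groups:" label, strip the leading space, print the rest.
    awk '/^Groups:/ { $1 = ""; sub(/^ +/, ""); print }' "$1"
}

# Example: compare each running FAHClient's groups with the account's groups.
# for pid in $(pgrep -x FAHClient); do
#     echo "pid $pid groups: $(proc_groups "/proc/$pid/status")"
# done
# id -G fahclient    # numeric groups the fahclient account is configured with
```

If the GID of video shows up in the id -G output but not on the Groups: line of the service processes, that would support the idea that the service drops to fahclient without picking up its supplementary groups.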
Re: Error folding on GPU [OpenSUSE]
My modified /etc/init.d/FAHClient:
Code: Select all
OPTS+="$EXTRA_OPTS "
OPTS+="--run-as $USER "
OPTS+="--pid "
OPTS+="--pid-file=$PID "
OPTS+="--service "
#OPTS+="--daemon "
Since --daemon is short for --pid --service --respawn --log='' --fork, I tried adding those options individually.
I could not use --fork, --respawn or --log=''. FAHClient would not work at all with them.
Prior to running the EXEC in this script I perform su - fahclient.
Run this way, FAHClient does detect CUDA but is still unable to fold on the GPU. The process also has USER:GROUP fahclient:root.
When I run FAHClient manually, the process user and group are fahclient:users:
Code: Select all
su
su - fahclient
/usr/bin/FAHClient /etc/fahclient/config.xml --run-as fahclient --pid --pid-file=/var/lib/fahclient/fahclient.pid --service
Re: Error folding on GPU [OpenSUSE]
Does OpenSUSE have the start-stop-daemon command?
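A quick way to answer that on the machine itself is to ask the shell whether the command is on the PATH. This is just a sketch; have_cmd is a made-up helper name.

```shell
#!/bin/sh
# Sketch: check whether a command is available on this system.
# have_cmd is a hypothetical helper name.

# Return 0 if the command named in $1 can be found by the shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd start-stop-daemon; then
    echo "start-stop-daemon: available"
else
    echo "start-stop-daemon: not found"
fi
```

Run it as root as well as a normal user, since tools under /sbin or /usr/sbin may only be on root's PATH.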