
Error folding on GPU [OpenSUSE]

Posted: Tue Apr 19, 2016 2:01 pm
by DJViking
Right after I installed FAHClient and FAHControl I started folding with the CPU and it worked great. I then went into the configuration and changed the CPU slot to a GPU slot. Now it won't fold at all. The status moves between Ready and Running for a while, then settles on Error.

This keeps filling my log every time it switches between Ready and Running.

Code:

13:03:12:WU00:FS00:Assigned to work server 140.163.4.235
13:03:12:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:GK106 [GeForce GTX 650 Ti] from 140.163.4.235
13:03:12:WU00:FS00:Connecting to 140.163.4.235:8080
13:03:12:WU00:FS00:Downloading 3.78MiB
13:03:16:WU00:FS00:Download complete
13:03:16:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:10485 run:0 clone:53 gen:264 core:0x18 unit:0x0000014f538b3dbb54aeb39d80684dce
13:03:16:WU00:FS00:Starting
13:03:16:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_18.fah/FahCore_18 -dir 00 -suffix 01 -version 704 -lifeline 6462 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
13:03:16:WU00:FS00:Started FahCore on PID 6510
13:03:16:WU00:FS00:Core PID:6514
13:03:16:WU00:FS00:FahCore 0x18 started
13:03:16:WU00:FS00:0x18:*********************** Log Started 2016-04-19T13:03:16Z ***********************
13:03:16:WU00:FS00:0x18:Project: 10485 (Run 0, Clone 53, Gen 264)
13:03:16:WU00:FS00:0x18:Unit: 0x0000014f538b3dbb54aeb39d80684dce
13:03:16:WU00:FS00:0x18:CPU: 0x00000000000000000000000000000000
13:03:16:WU00:FS00:0x18:Machine: 0
13:03:16:WU00:FS00:0x18:Reading tar file state.xml
13:03:16:WU00:FS00:0x18:Reading tar file system.xml
13:03:17:WU00:FS00:0x18:Reading tar file integrator.xml
13:03:17:WU00:FS00:0x18:Reading tar file core.xml
13:03:17:WU00:FS00:0x18:Digital signatures verified
13:03:17:WU00:FS00:0x18:Folding@home GPU core18
13:03:17:WU00:FS00:0x18:Version 0.0.4
13:03:17:WU00:FS00:0x18:ERROR:exception: Bad platformId size.
13:03:17:WU00:FS00:0x18:Saving result file logfile_01.txt
13:03:17:WU00:FS00:0x18:Saving result file log.txt
13:03:17:WU00:FS00:0x18:Folding@home Core Shutdown: BAD_WORK_UNIT
13:03:17:WARNING:WU00:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
13:03:17:WU00:FS00:Sending unit results: id:00 state:SEND error:FAULTY project:10485 run:0 clone:53 gen:264 core:0x18 unit:0x0000014f538b3dbb54aeb39d80684dce
13:03:17:WU00:FS00:Uploading 1.92KiB to 140.163.4.235
13:03:17:WU00:FS00:Connecting to 140.163.4.235:8080
13:03:17:WU01:FS00:Connecting to 171.67.108.45:80
13:03:18:WU00:FS00:Upload complete
13:03:18:WU00:FS00:Server responded WORK_ACK (400)
13:03:18:WU00:FS00:Cleaning up
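
For anyone reading a long log like the one above, the two lines that matter are the core's `ERROR:` line and the `FahCore returned:` line. Here is a minimal Python sketch that pulls both out of a log excerpt. The function name `summarize_failures` and the regular expressions are illustrative assumptions based only on the line layout shown in this post, not part of FAHClient.

```python
import re

# Matches core error lines such as
# 13:03:17:WU00:FS00:0x18:ERROR:exception: Bad platformId size.
CORE_ERROR = re.compile(
    r"^\d{2}:\d{2}:\d{2}:WU\d+:FS\d+:0x[0-9a-fA-F]+:ERROR:(.*)$"
)

# Matches client warnings such as
# 13:03:17:WARNING:WU00:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
CORE_RETURN = re.compile(r"FahCore returned: (\w+) \((\d+) = 0x[0-9a-fA-F]+\)")


def summarize_failures(log_text):
    """Return (core_errors, return_codes) found in a FAHClient log excerpt.

    Hypothetical helper: the patterns are inferred from the log in this thread.
    """
    errors, codes = [], []
    for line in log_text.splitlines():
        line = line.strip()
        match = CORE_ERROR.match(line)
        if match:
            errors.append(match.group(1).strip())
        match = CORE_RETURN.search(line)
        if match:
            codes.append((match.group(1), int(match.group(2))))
    return errors, codes


SAMPLE = """\
13:03:17:WU00:FS00:0x18:Version 0.0.4
13:03:17:WU00:FS00:0x18:ERROR:exception: Bad platformId size.
13:03:17:WARNING:WU00:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
"""

if __name__ == "__main__":
    print(summarize_failures(SAMPLE))
```

Run against the full log, this reduces each failed attempt to the exception text and the return code, which is usually what a helper on the forum will ask for first.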

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 2:52 pm
by Joe_H
Welcome to the folding support forum.

The message "Bad platformId size" usually indicates that you are either using the generic Linux video drivers, or have not fully installed the proprietary nVidia drivers. GPU folding on Linux systems requires the nVidia drivers be installed along with the OpenCL support.
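
One quick way to see whether any OpenCL implementation is registered on a Linux system is to look for ICD files, which the OpenCL ICD loader conventionally reads from `/etc/OpenCL/vendors`. The Python sketch below lists them. The helper name `list_opencl_icds` is hypothetical, and the exact file name the NVIDIA package installs (for example `nvidia.icd`) is an assumption that may differ by distribution.

```python
from pathlib import Path


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each *.icd file name to the library it points to.

    Assumption: vendors_dir is the conventional ICD location on Linux.
    Returns an empty dict if the directory does not exist.
    """
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return {}
    found = {}
    for icd_file in sorted(directory.glob("*.icd")):
        # An ICD file is a small text file naming the vendor's OpenCL library.
        found[icd_file.name] = icd_file.read_text().strip()
    return found


if __name__ == "__main__":
    icds = list_opencl_icds()
    if not icds:
        print("No OpenCL ICD files found; OpenCL support may not be installed.")
    for name, library in icds.items():
        print(f"{name}: {library}")
```

An empty result would be consistent with the driver's OpenCL portion being missing, though a non-empty result does not by itself prove the library loads correctly.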

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 3:14 pm
by DJViking
I have actually installed the nvidia drivers 361.48 and I have blacklisted nouveau
modprobe.d/nvidia-default.conf:blacklist nouveau

I am using OpenSUSE Leap 42.1 and installed the nvidia driver from the nvidia repository:
ftp://download.nvidia.com/opensuse/leap/42.1/

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 3:45 pm
by 7im
The OpenCL portion of the driver is sometimes packaged separately. Check the repository.

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 4:11 pm
by DJViking
There are no OpenCL packages in the repository from Nvidia.
However, I do have the following package, which does seem to include support for OpenCL:

Information for package nvidia-computeG04:
------------------------------------------
Repository: Nvidia
Name: nvidia-computeG04
Version: 361.42-21.1
Arch: x86_64
Vendor: obs://build.suse.de/home:sndirsch:drivers
Installed: Yes
Status: up-to-date
Installed Size: 51.3 MiB
Summary: NVIDIA driver for computing with GPGPU
Description:
NVIDIA driver for computing with GPGPUs using CUDA or OpenCL

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 4:24 pm
by jimerickson
you may want to look at ocl-icd:

https://build.opensuse.org/package/show ... es/ocl-icd

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 4:33 pm
by DJViking
I cannot install libOpenCL as it conflicts with nvidia-computeG04

Code:

Retrieving package libOpenCL1-2.2.7-1.1.x86_64                                                                           (1/1),  30.1 KiB (111.8 KiB unpacked)
Retrieving: libOpenCL1-2.2.7-1.1.x86_64.rpm ............................................................................................................[done]
Checking for file conflicts: ..........................................................................................................................[error]
Detected 1 file conflict:

File /usr/lib64/libOpenCL.so.1.0.0
  from install of
     libOpenCL1-2.2.7-1.1.x86_64 (openSUSE-leap/42.1-Oss)
  conflicts with file from package
     nvidia-computeG04-361.42-21.1.x86_64 (@System)

File conflicts happen when two packages attempt to install files with the same name but different contents. If you continue, conflicting files will be replaced losing the previous content.
Continue? [yes/no] (no): no 

Problem occured during or after installation or removal of packages:
Installation aborted by user

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 4:47 pm
by jimerickson
personally i would at least try ocl-icd. this is what it takes to get folding@home working on fedora, which is also an rpm distribution. you can always reinstall the driver if it doesn't work. just my opinion though, do what you are comfortable with.

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 5:41 pm
by DJViking
Installing libOpenCL had no effect. I got the same problem when folding with the GPU.

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 6:06 pm
by jimerickson
after installing the NVidia driver did you do a "sudo halt" and power the machine back on after halting? rebooting after the NVidia install is a necessary step. also, exactly which gpu are you trying to fold with? is it supported? (to find out if it's supported, check GPUs.txt)

if that doesn't work try deleting the slot and setting it up properly with FAHControl. you may want to post your config.xml file.

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 6:26 pm
by DJViking
I have just reinstalled the nvidia driver and rebooted the machine.

I found my Nvidia graphics card in GPUs.txt
djviking@mintaka:/var/lib/fahclient> less GPUs.txt | grep "650 Ti"
0x10de:0x11c2:2:3:GK106 [GTX 650 Ti Boost]
0x10de:0x11c3:2:3:GK106 [GeForce GTX 650 Ti]
0x10de:0x11c6:2:3:GK106 [GeForce GTX 650 Ti]
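
The same lookup can be sketched in Python for anyone who wants the fields split out. The field names below (`gpu_type`, `species`) are an assumption about what the third and fourth colon-separated columns mean. Only the vendor ID, device ID, and description are evident from the grep output itself.

```python
from typing import NamedTuple, Optional


class GpuEntry(NamedTuple):
    vendor_id: int    # e.g. 0x10de
    device_id: int    # e.g. 0x11c3
    gpu_type: int     # assumed meaning of the third column
    species: int      # assumed meaning of the fourth column
    description: str


def parse_gpus_line(line: str) -> Optional[GpuEntry]:
    """Parse one 'vendor:device:n:n:description' line, or return None."""
    parts = line.strip().split(":", 4)
    if len(parts) != 5:
        return None
    vendor, device, gpu_type, species, description = parts
    try:
        return GpuEntry(int(vendor, 16), int(device, 16),
                        int(gpu_type), int(species), description)
    except ValueError:
        return None


def find_gpus(lines, needle):
    """Return entries whose description contains needle."""
    entries = (parse_gpus_line(line) for line in lines)
    return [e for e in entries if e is not None and needle in e.description]


SAMPLE = [
    "0x10de:0x11c2:2:3:GK106 [GTX 650 Ti Boost]",
    "0x10de:0x11c3:2:3:GK106 [GeForce GTX 650 Ti]",
    "0x10de:0x11c6:2:3:GK106 [GeForce GTX 650 Ti]",
]

if __name__ == "__main__":
    for entry in find_gpus(SAMPLE, "650 Ti"):
        print(hex(entry.vendor_id), hex(entry.device_id), entry.description)
```

To run it against the real file, pass `open("/var/lib/fahclient/GPUs.txt")` as `lines`.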

I have now also tried removing the old slot and creating a new gpu slot. Values for gpu core, opencl and cuda were set to -1. That didn't work. Still getting the problem.

Where is this config.xml located?

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 6:45 pm
by jimerickson
my config.xml is located in my home directory at ~/fahclient_7.4.4-64bit-release/config.xml, though it may be slightly different on opensuse. GPUs.txt should be in the same folder. though to be honest i really think you need ocl-icd. now that you have rebooted it may be a good time to try it.

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 7:05 pm
by DJViking
I have already tried installing libOpenCL (ocl-icd).

I have nothing of fahclient under my home directory.
However I did find a bunch of configs in /var/lib/fahclient/configs/

This one is config-20160419-183935.xml

Code:

<config>
  <!-- Network -->
  <proxy v=':8080'/>

  <!-- Remote Command Server -->
  <password v='******************'/>

  <!-- User Information -->
  <passkey v=''/>
  <team v='37651'/>
  <user v='DJViking'/>

  <!-- Folding Slots -->
  <slot id='0' type='GPU'>
    <paused v='true'/>
  </slot>
</config>
But I guess it is the /etc/fahcontrol/config.xml that is relevant here.

Code:

<config>
  <!-- Network -->
  <proxy v=':8080'/>

  <!-- Remote Command Server -->
  <password v='*************'/>

  <!-- User Information -->
  <passkey v=''/>
  <team v='37651'/>
  <user v='DJViking'/>

  <!-- Folding Slots -->
  <slot id='0' type='GPU'>
    <paused v='true'/>
  </slot>
</config>
Passkey removed. ~sortofageek
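
To check at a glance what each slot in a config like this is set to, a short Python sketch can list every slot with its options. The function name `describe_slots` is hypothetical. The element and attribute names (`slot`, `id`, `type`, `v`) are taken from the config shown above.

```python
import xml.etree.ElementTree as ET


def describe_slots(config_xml):
    """Return one dict per <slot>, with its id, type, and child options."""
    root = ET.fromstring(config_xml)
    slots = []
    for slot in root.findall("slot"):
        # Each child element such as <paused v='true'/> is one option.
        options = {child.tag: child.get("v") for child in slot}
        slots.append({
            "id": slot.get("id"),
            "type": slot.get("type"),
            "options": options,
        })
    return slots


SAMPLE_CONFIG = """<config>
  <!-- Folding Slots -->
  <slot id='0' type='GPU'>
    <paused v='true'/>
  </slot>
</config>"""

if __name__ == "__main__":
    for slot in describe_slots(SAMPLE_CONFIG):
        print(slot)
```

Any per-slot override would show up in `options`, so a slot listing only `paused` has no other settings stored in that file.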

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 7:31 pm
by 7im
DJViking wrote:...snip...

I have now also tried removing the old slot and creating a new gpu slot. Values for gpu core, opencl and cuda were set to -1. That didn't work. Still getting the problem.
Unless you know those new index values to be accurate, you will have made the troubleshooting much more difficult. The index settings are not the issue. Please return them to their default values.

Re: Error folding on GPU

Posted: Tue Apr 19, 2016 7:35 pm
by jimerickson
that was before you rebooted. it's fine with me if you don't want to, but i think you will continue to see "Bad platformId size" if you don't. have you installed the xorg-x11-drv-nvidia-*.rpm's? is the kernel module properly installed? how was the driver installed? i checked the page https://en.opensuse.org/SDB:NVIDIA_drivers and it says the following:

The packages contain the correct 'supplements:' so Zypper will find the correct modules for your card. Unfortunately on openSUSE Leap 42.1 these 'supplemements' are being ignored by default by YaST (boo#953522). Therefore you need to select 'Extras/Install All Matching Recommended Packages' in 'Software Management' for autoselection and installation of the appropriate NVIDIA driver packages. When using 'zypper inr' you're not affected by this issue on openSUSE Leap 42.1.

other than that i am out of ideas.