cuda failure - core 0.0.13 - both win10 and linux
Hi all -
i see the new 0.0.13 core trying to create a CUDA context on both linux and W10, and both failing.
on windows 10 home, the error is:
Failed to create CUDA context
Error loading CUDA module: CUDA_ERROR_FILE_NOT_FOUND (301)
on this machine i have VB2019 and cuda 11.1.0 installed. the NVIDIA control panel reports driver version 456.43 loaded on both my RTX2060 and GTX 1060. i think this must be the latest.
over on linux, i see a similar error:
Error launching CUDA compiler: 256
gcc: error trying to exec 'cc1plus': execvp: No such file or directory.
on this machine an older nvidia driver is loaded but g++ is definitely installed.
both errors maybe seem like some kind of a PATH error, but i'm not sure how to set up PATH for the FAH daemons on either platform. further, i'm not really even sure what directory is missing from PATH.
has anyone faced this particular problem on either platform?
thanks
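For the Linux error, one quick check is to ask gcc where it expects to find `cc1plus`. This is only a sketch: `find_cc1plus` is a made-up helper name, and it checks the environment of whoever runs the script, which may differ from the environment the FAH daemon gets. `gcc -print-prog-name=cc1plus` prints a full path when gcc finds its C++ backend in its own search directories, and just echoes `cc1plus` back when it does not.

```python
import shutil
import subprocess


def find_cc1plus(gcc="gcc"):
    """Return the path gcc reports for cc1plus, or None if gcc has none to offer."""
    if shutil.which(gcc) is None:
        return None  # the gcc driver itself is not on PATH
    result = subprocess.run(
        [gcc, "-print-prog-name=cc1plus"],
        capture_output=True, text=True, check=False,
    )
    path = result.stdout.strip()
    # gcc echoes the bare program name when it did not locate it
    if not path or path == "cc1plus":
        return None
    return path


if __name__ == "__main__":
    found = find_cc1plus()
    if found:
        print("gcc would use:", found)
    else:
        print("gcc could not locate cc1plus from this environment")
```

If this reports a path from your own shell but the core still fails, that would point toward the daemon's environment rather than a missing package.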
-
- Posts: 2040
- Joined: Sat Dec 01, 2012 3:43 pm
- Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441
Re: cuda failure - core 0.0.13 - both win10 and linux
On Windows, uninstalling the CUDA toolkit SDK would help. Or remove the CUDA_PATH... before launching FAH.
Re: cuda failure - core 0.0.13 - both win10 and linux
Running 441.66 win 10. 49d online. 2x2060s and beta is now 134xx
On win uninstall cuda toolkit sdk?
Re: cuda failure - core 0.0.13 - both win10 and linux
Only uninstall the CUDA toolkit SDK if FAH fails to run in CUDA mode and falls back to OpenCL. Or wait for a fixed FAHcore.
Re: cuda failure - core 0.0.13 - both win10 and linux
foldy wrote: On Windows uninstall cuda toolkit sdk would help. Or remove the CUDA_PATH... before launching FAH
how do you do this on windows? FAH is started by some kind of windows service daemon. is there a registry setting or something?
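One way to try the "remove CUDA_PATH before launching" idea without editing the registry is to start the client from a small wrapper that hands it a copy of the environment with those variables left out. A rough sketch, assuming the variables meant are the ones whose names begin with CUDA_PATH, and assuming you start the client yourself rather than through the service. `env_without_cuda_path` and `run_without_cuda_path` are made-up helper names, and nothing in this thread confirms that core 0.0.13 actually reads these variables.

```python
import os
import subprocess


def env_without_cuda_path(env=None):
    """Copy an environment mapping, dropping every variable named CUDA_PATH*."""
    source = os.environ if env is None else env
    return {
        name: value
        for name, value in source.items()
        # Windows variable names are case-insensitive, so compare in upper case
        if not name.upper().startswith("CUDA_PATH")
    }


def run_without_cuda_path(command):
    """Run `command` (a list of arguments) with the cleaned environment."""
    return subprocess.run(command, env=env_without_cuda_path(), check=False)


if __name__ == "__main__":
    hidden = sorted(n for n in os.environ if n.upper().startswith("CUDA_PATH"))
    print("variables that would be hidden from the child process:", hidden)
```

The system-wide variables are left alone, so other tools that need the SDK keep working; only the process started through `run_without_cuda_path` sees the reduced environment.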
Re: cuda failure - core 0.0.13 - both win10 and linux
ok well i couldn't figure out the CUDA_PATH thing on windows, but i found that i had cuda 10 and cuda 11 both installed simultaneously. removed all of cuda 10 (probably not necessary) and then removed the cuda 11 development stuff and now at least on my windows 10 box FAH is using CUDA to fold on both nvidia GPUs.
this leaves linux where i suppose the problem is of a similar nature? i need to remove the dev kit portion of the cuda installation?
Re: cuda failure - core 0.0.13 - both win10 and linux
I expect a later revision than 0.0.13 will resolve the issue for those people who need a different SDK toolkit version than the one FAH is delivering. For the rest of the world, removing an unused SDK works now.
Posting FAH's log:
How to provide enough info to get helpful support.
Re: cuda failure - core 0.0.13 - both win10 and linux
thanks - are there any instructions on how to do this on Ubuntu? i think i just installed nvidia's cuda package and i don't think there's any SDK version mismatch (though the installer probably did install the SDK)
Re: cuda failure - core 0.0.13 - both win10 and linux
The SDK includes a lot of things that are non-essential for FAH -- but would be used by a developer. The nVidia drivers (if you get them directly from nVidia) include OpenCL and everything that is needed to run it. FAHCore_22 delivers the parts of CUDA that it takes to use CUDA, but that alone doesn't equip you to develop for CUDA.
Re: cuda failure - core 0.0.13 - both win10 and linux
astrorob wrote: removed the cuda 11 development stuff and now at least on my windows 10 box FAH is using CUDA to fold on both nvidia GPUs.
That worked for me on Windows too, had the exact same issue.
Most of my other crunchers are running some flavour of Linux, though (primarily Ubuntu), and none of them are using CUDA. There isn't even an error, it's not showing up at all:
Code: Select all
14:55:54:WU00:FS01:Connecting to 65.254.110.245:80
14:55:55:WU00:FS01:Assigned to work server 66.170.111.50
14:55:55:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1070 Ti] 8186 from 66.170.111.50
14:55:55:WU00:FS01:Connecting to 66.170.111.50:8080
14:55:55:WU00:FS01:Downloading 11.17MiB
14:56:01:WU00:FS01:Download 47.58%
14:56:06:WU00:FS01:Download complete
14:56:06:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:14485 run:0 clone:1439 gen:81 core:0x22 unit:0x0000006e42aa6f325f45deaa14b9e36d
14:56:06:WU00:FS01:Starting
14:56:06:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /etc/init.d/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 704 -lifeline 2281 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
14:56:06:WU00:FS01:Started FahCore on PID 47834
14:56:06:WU00:FS01:Core PID:47838
14:56:06:WU00:FS01:FahCore 0x22 started
14:56:06:WU00:FS01:0x22:*********************** Log Started 2020-10-08T14:56:06Z ***********************
14:56:06:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
14:56:06:WU00:FS01:0x22: Core: Core22
14:56:06:WU00:FS01:0x22: Type: 0x22
14:56:06:WU00:FS01:0x22: Version: 0.0.13
14:56:06:WU00:FS01:0x22: Author: Joseph Coffland <[email protected]>
14:56:06:WU00:FS01:0x22: Copyright: 2020 foldingathome.org
14:56:06:WU00:FS01:0x22: Homepage: https://foldingathome.org/
14:56:06:WU00:FS01:0x22: Date: Sep 19 2020
14:56:06:WU00:FS01:0x22: Time: 01:10:35
14:56:06:WU00:FS01:0x22: Revision: 571cf95de6de2c592c7c3ed48fcfb2e33e9ea7d3
14:56:06:WU00:FS01:0x22: Branch: core22-0.0.13
14:56:06:WU00:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
14:56:06:WU00:FS01:0x22: Options: -std=c++11 -fsigned-char -ffunction-sections -fdata-sections -O3
14:56:06:WU00:FS01:0x22: -funroll-loops -DOPENMM_GIT_HASH="\"189320d0\""
14:56:06:WU00:FS01:0x22: Platform: linux2 4.19.76-linuxkit
14:56:06:WU00:FS01:0x22: Bits: 64
14:56:06:WU00:FS01:0x22: Mode: Release
14:56:06:WU00:FS01:0x22:Maintainers: John Chodera <[email protected]> and Peter Eastman
14:56:06:WU00:FS01:0x22: <[email protected]>
14:56:06:WU00:FS01:0x22: Args: -dir 00 -suffix 01 -version 704 -lifeline 47834 -checkpoint 15 -gpu
14:56:06:WU00:FS01:0x22: 0 -gpu-vendor nvidia
14:56:06:WU00:FS01:0x22:************************************ libFAH ************************************
14:56:06:WU00:FS01:0x22: Date: Sep 15 2020
14:56:06:WU00:FS01:0x22: Time: 05:14:43
14:56:06:WU00:FS01:0x22: Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
14:56:06:WU00:FS01:0x22: Branch: HEAD
14:56:06:WU00:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
14:56:06:WU00:FS01:0x22: Options: -std=c++11 -fsigned-char -ffunction-sections -fdata-sections -O3
14:56:06:WU00:FS01:0x22: -funroll-loops
14:56:06:WU00:FS01:0x22: Platform: linux2 4.19.76-linuxkit
14:56:06:WU00:FS01:0x22: Bits: 64
14:56:06:WU00:FS01:0x22: Mode: Release
14:56:06:WU00:FS01:0x22:************************************ CBang *************************************
14:56:06:WU00:FS01:0x22: Date: Sep 15 2020
14:56:06:WU00:FS01:0x22: Time: 05:11:04
14:56:06:WU00:FS01:0x22: Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
14:56:06:WU00:FS01:0x22: Branch: HEAD
14:56:06:WU00:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
14:56:06:WU00:FS01:0x22: Options: -std=c++11 -fsigned-char -ffunction-sections -fdata-sections -O3
14:56:06:WU00:FS01:0x22: -funroll-loops -fPIC
14:56:06:WU00:FS01:0x22: Platform: linux2 4.19.76-linuxkit
14:56:06:WU00:FS01:0x22: Bits: 64
14:56:06:WU00:FS01:0x22: Mode: Release
14:56:06:WU00:FS01:0x22:************************************ System ************************************
14:56:06:WU00:FS01:0x22: CPU: AMD Ryzen 7 3700X 8-Core Processor
14:56:06:WU00:FS01:0x22: CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
14:56:06:WU00:FS01:0x22: CPUs: 16
14:56:06:WU00:FS01:0x22: Memory: 23.49GiB
14:56:06:WU00:FS01:0x22:Free Memory: 16.82GiB
14:56:06:WU00:FS01:0x22: Threads: POSIX_THREADS
14:56:06:WU00:FS01:0x22: OS Version: 5.4
14:56:06:WU00:FS01:0x22:Has Battery: false
14:56:06:WU00:FS01:0x22: On Battery: false
14:56:06:WU00:FS01:0x22: UTC Offset: 2
14:56:06:WU00:FS01:0x22: PID: 47838
14:56:06:WU00:FS01:0x22: CWD: /etc/init.d/work
14:56:06:WU00:FS01:0x22:************************************ OpenMM ************************************
14:56:06:WU00:FS01:0x22: Revision: 189320d0
14:56:06:WU00:FS01:0x22:********************************************************************************
14:56:06:WU00:FS01:0x22:Project: 14485 (Run 0, Clone 1439, Gen 81)
14:56:06:WU00:FS01:0x22:Unit: 0x0000006e42aa6f325f45deaa14b9e36d
14:56:06:WU00:FS01:0x22:Reading tar file core.xml
14:56:06:WU00:FS01:0x22:Reading tar file integrator.xml.bz2
14:56:06:WU00:FS01:0x22:Reading tar file state.xml.bz2
14:56:06:WU00:FS01:0x22:Reading tar file system.xml.bz2
14:56:06:WU00:FS01:0x22:Digital signatures verified
14:56:06:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
14:56:06:WU00:FS01:0x22:Version 0.0.13
14:56:06:WU00:FS01:0x22: Checkpoint write interval: 25000 steps (2%) [50 total]
14:56:06:WU00:FS01:0x22: JSON viewer frame write interval: 12500 steps (1%) [100 total]
14:56:06:WU00:FS01:0x22: XTC frame write interval: 10000 steps (0.8%) [125 total]
14:56:06:WU00:FS01:0x22: Global context and integrator variables write interval: disabled
14:56:06:WU00:FS01:0x22:No -opencl-device specified; using deprecated -gpu argument as an alias for -opencl-device.
14:56:06:WU00:FS01:0x22:Please consider upgrading your client version.
14:56:06:WU00:FS01:0x22:There are 3 platforms available.
14:56:06:WU00:FS01:0x22:Platform 0: Reference
14:56:06:WU00:FS01:0x22:Platform 1: CPU
14:56:06:WU00:FS01:0x22:Platform 2: OpenCL
14:56:06:WU00:FS01:0x22: opencl-device 0 specified
14:56:14:WU00:FS01:0x22:Attempting to create OpenCL context:
14:56:14:WU00:FS01:0x22: Configuring platform OpenCL
14:56:20:WU00:FS01:0x22: Using OpenCL on platformId 0 and gpu 0
14:56:20:WU00:FS01:0x22:Completed 0 out of 1250000 steps (0%)
14:56:20:WU00:FS01:0x22:Checkpoint completed at step 0
14:57:11:WU00:FS01:0x22:Completed 12500 out of 1250000 steps (1%)
Last edited by Joe_H on Thu Oct 08, 2020 3:35 pm, edited 1 time in total.
Reason: change Quote tags to Code for log segments
Re: cuda failure - core 0.0.13 - both win10 and linux
The FAHCore downloads the necessary CUDA support code. The bug that you're dealing with is a conflict between the version of CUDA in your SDK and the code downloaded with FAHCore_22. You can probably uninstall the SDK and avoid the issue unless you're actually using it to develop CUDA code.
Re: cuda failure - core 0.0.13 - both win10 and linux
Upgrade your client version, had to do that on one of mine to get cuda recognized properly.
Re: cuda failure - core 0.0.13 - both win10 and linux
@Azmodes: FAHclient should be v7.6.13
Re: cuda failure - core 0.0.13 - both win10 and linux
foldy wrote: @Azmodes: FAHclient should be v7.6.13
That did it. Thank you!
-
- Posts: 16
- Joined: Fri Feb 12, 2010 12:16 am
- Location: California
- Contact:
Re: cuda failure - core 0.0.13 - both win10 and linux
I still have this problem with 7.6.13 and NVidia 456.71, Win10, TitanX Pascal.
07:28:37:WU00:FS01:0x22:There are 4 platforms available.
07:28:37:WU00:FS01:0x22:Platform 0: Reference
07:28:37:WU00:FS01:0x22:Platform 1: CPU
07:28:37:WU00:FS01:0x22:Platform 2: OpenCL
07:28:37:WU00:FS01:0x22: opencl-device 0 specified
07:28:37:WU00:FS01:0x22:Platform 3: CUDA
07:28:37:WU00:FS01:0x22: cuda-device 0 specified
07:28:47:WU00:FS01:0x22:Attempting to create CUDA context:
07:28:47:WU00:FS01:0x22: Configuring platform CUDA
07:28:48:WU00:FS01:0x22:Failed to create CUDA context:
07:28:48:WU00:FS01:0x22:Error loading CUDA module: CUDA_ERROR_FILE_NOT_FOUND (301)
07:28:48:WU00:FS01:0x22:Attempting to create OpenCL context:
07:28:48:WU00:FS01:0x22: Configuring platform OpenCL
07:29:02:WU00:FS01:0x22: Using OpenCL on platformId 0 and gpu 0
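To tell the two situations in this thread apart (CUDA offered but failing, versus CUDA never listed at all), a log excerpt can be scanned for the messages quoted above. A small sketch: `classify_gpu_platform` and its return labels are made up for illustration, and it matches only strings that appear in the logs posted here. The message printed on a successful CUDA start is not shown in this thread, so that case is reported as "unknown".

```python
import re

CUDA_LISTED = re.compile(r"Platform \d+: CUDA\s*$")


def classify_gpu_platform(log_lines):
    """Label a Core22 log excerpt using only messages seen in this thread."""
    cuda_listed = cuda_failed = opencl_used = False
    for line in log_lines:
        if CUDA_LISTED.search(line):
            cuda_listed = True
        if "Failed to create CUDA context" in line:
            cuda_failed = True
        if "Using OpenCL on platformId" in line:
            opencl_used = True
    if cuda_failed and opencl_used:
        return "cuda-failed-fell-back-to-opencl"
    if opencl_used and not cuda_listed:
        return "cuda-not-offered-using-opencl"
    return "unknown"


if __name__ == "__main__":
    excerpt = [
        "07:28:37:WU00:FS01:0x22:Platform 3: CUDA",
        "07:28:48:WU00:FS01:0x22:Failed to create CUDA context:",
        "07:29:02:WU00:FS01:0x22:  Using OpenCL on platformId 0 and gpu 0",
    ]
    print(classify_gpu_platform(excerpt))
```

On the Windows excerpt above this reports the fallback case; on the earlier Ubuntu log, where only Reference, CPU and OpenCL are listed, it reports that CUDA was never offered.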
Florin Andrei
http://florin.myip.org/