I'm on Ubuntu 22.04.5 LTS (GNU/Linux 5.15.0-134-generic x86_64) with a Quadro K2200, Driver Version: 570.124.06, CUDA Version: 12.8. I've tried older drivers too.
Some cores work, 0x22 in particular, but 0x24 seems to fail consistently. I've mostly seen this since I upgraded from Ubuntu 20.04 LTS to 22.04 LTS late last year, which of course also upgraded all the drivers and libraries. I don't know whether that's a coincidence and I'm just seeing more failures because more 0x24 WUs are being offered, whether the container versions no longer match the installed drivers and libraries, or whether it's something else.
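One rough way to test the library-mismatch guess is to ask the dynamic linker which shared libraries the 0x24 core binary cannot resolve. The sketch below wraps the standard `ldd` tool; the helper name `missing_libraries` is made up for illustration, and the binary path has to be supplied by hand (for example the `FahCore_24` path shown in the client log). It assumes the script is run inside the same container the client uses, since that is the environment the core actually sees. A clean result would not rule out other causes such as a driver or CUDA problem.

```python
import subprocess
import sys


def missing_libraries(ldd_output):
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "\tlibfoo.so.1 => not found"
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: check_core_libs.py /path/to/FahCore_24")
    # ldd prints one line per shared library the binary depends on.
    proc = subprocess.run(["ldd", sys.argv[1]], capture_output=True, text=True)
    if proc.returncode != 0:
        sys.exit("ldd failed: " + proc.stderr.strip())
    unresolved = missing_libraries(proc.stdout)
    if unresolved:
        print("Unresolved libraries:")
        for name in unresolved:
            print("  " + name)
    else:
        print("ldd resolved every library; the cause is probably elsewhere.")
```

Running the same check against the working `FahCore_22` binary gives a baseline to compare against.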
Working unit, 0x22:
12:13:00:WU02:FS01:Starting
12:13:00:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/FahCore_22 -dir 02 -suffix 01 -version 706 -lifeline 1 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
12:13:00:WU02:FS01:Started FahCore on PID 116
12:13:00:WU02:FS01:Core PID:120
12:13:00:WU02:FS01:FahCore 0x22 started
12:13:01:WU02:FS01:0x22:*********************** Log Started 2025-03-13T12:13:00Z ***********************
12:13:01:WU02:FS01:0x22:*************************** Core22 Folding@home Core ***************************
12:13:01:WU02:FS01:0x22: Core: Core22
12:13:01:WU02:FS01:0x22: Type: 0x22
12:13:01:WU02:FS01:0x22: Version: 0.0.20
12:13:01:WU02:FS01:0x22: Author: Joseph Coffland <[email protected]>
12:13:01:WU02:FS01:0x22: Copyright: 2020 foldingathome.org
12:13:01:WU02:FS01:0x22: Homepage: https://foldingathome.org/
12:13:01:WU02:FS01:0x22: Date: Jan 20 2022
12:13:01:WU02:FS01:0x22: Time: 00:57:52
12:13:01:WU02:FS01:0x22: Revision: 3f211b8a4346514edbff34e3cb1c0e0ec951373c
12:13:01:WU02:FS01:0x22: Branch: HEAD
12:13:01:WU02:FS01:0x22: Compiler: GNU 9.4.0
12:13:01:WU02:FS01:0x22: Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
12:13:01:WU02:FS01:0x22: -fdata-sections -O3 -funroll-loops -fno-pie
12:13:01:WU02:FS01:0x22: -DOPENMM_VERSION="\"7.7.0\""
12:13:01:WU02:FS01:0x22: Platform: linux 5.11.0-1025-azure
12:13:01:WU02:FS01:0x22: Bits: 64
12:13:01:WU02:FS01:0x22: Mode: Release
12:13:01:WU02:FS01:0x22:Maintainers: John Chodera <[email protected]> and Peter Eastman
12:13:01:WU02:FS01:0x22: <[email protected]>
12:13:01:WU02:FS01:0x22: Args: -dir 02 -suffix 01 -version 706 -lifeline 116 -checkpoint 15
12:13:01:WU02:FS01:0x22: -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor
12:13:01:WU02:FS01:0x22: nvidia -gpu 0 -gpu-usage 100
12:13:01:WU02:FS01:0x22:************************************ libFAH ************************************
12:13:01:WU02:FS01:0x22: Date: Jan 20 2022
12:13:01:WU02:FS01:0x22: Time: 00:57:22
12:13:01:WU02:FS01:0x22: Revision: 9f4ad694e75c2350d4bb6b8b5b769ba27e483a2f
12:13:01:WU02:FS01:0x22: Branch: HEAD
12:13:01:WU02:FS01:0x22: Compiler: GNU 9.4.0
12:13:01:WU02:FS01:0x22: Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
12:13:01:WU02:FS01:0x22: -fdata-sections -O3 -funroll-loops -fno-pie
12:13:01:WU02:FS01:0x22: Platform: linux 5.11.0-1025-azure
12:13:01:WU02:FS01:0x22: Bits: 64
12:13:01:WU02:FS01:0x22: Mode: Release
12:13:01:WU02:FS01:0x22:************************************ CBang *************************************
12:13:01:WU02:FS01:0x22: Date: Jan 20 2022
12:13:01:WU02:FS01:0x22: Time: 00:57:00
12:13:01:WU02:FS01:0x22: Revision: ab023d155b446906d55b0f6c9a1eedeea04f7a1a
12:13:01:WU02:FS01:0x22: Branch: HEAD
12:13:01:WU02:FS01:0x22: Compiler: GNU 9.4.0
12:13:01:WU02:FS01:0x22: Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
12:13:01:WU02:FS01:0x22: -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
12:13:01:WU02:FS01:0x22: Platform: linux 5.11.0-1025-azure
12:13:01:WU02:FS01:0x22: Bits: 64
12:13:01:WU02:FS01:0x22: Mode: Release
12:13:01:WU02:FS01:0x22:************************************ System ************************************
12:13:01:WU02:FS01:0x22: CPU: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
12:13:01:WU02:FS01:0x22: CPU ID: GenuineIntel Family 6 Model 44 Stepping 2
12:13:01:WU02:FS01:0x22: CPUs: 8
12:13:01:WU02:FS01:0x22: Memory: 35.22GiB
12:13:01:WU02:FS01:0x22:Free Memory: 30.34GiB
12:13:01:WU02:FS01:0x22: Threads: POSIX_THREADS
12:13:01:WU02:FS01:0x22: OS Version: 5.15
12:13:01:WU02:FS01:0x22:Has Battery: false
12:13:01:WU02:FS01:0x22: On Battery: false
12:13:01:WU02:FS01:0x22: UTC Offset: 0
12:13:01:WU02:FS01:0x22: PID: 120
12:13:01:WU02:FS01:0x22: CWD: /fah/work
12:13:01:WU02:FS01:0x22:************************************ OpenMM ************************************
12:13:01:WU02:FS01:0x22: Version: 7.7.0
12:13:01:WU02:FS01:0x22:********************************************************************************
12:13:01:WU02:FS01:0x22:Project: 18931 (Run 9, Clone 48, Gen 90)
12:13:01:WU02:FS01:0x22:Reading tar file core.xml
12:13:01:WU02:FS01:0x22:Reading tar file integrator.xml
12:13:01:WU02:FS01:0x22:Reading tar file state.xml
12:13:01:WU02:FS01:0x22:Reading tar file system.xml
12:13:02:WU02:FS01:0x22:Digital signatures verified
12:13:02:WU02:FS01:0x22:Folding@home GPU Core22 Folding@home Core
12:13:02:WU02:FS01:0x22:Version 0.0.20
12:13:02:WU02:FS01:0x22: Checkpoint write interval: 62500 steps (5%) [20 total]
12:13:02:WU02:FS01:0x22: JSON viewer frame write interval: 12500 steps (1%) [100 total]
12:13:02:WU02:FS01:0x22: XTC frame write interval: 25000 steps (2%) [50 total]
12:13:02:WU02:FS01:0x22: Global context and integrator variables write interval: disabled
12:13:02:WU02:FS01:0x22:There are 4 platforms available.
12:13:02:WU02:FS01:0x22:Platform 0: Reference
12:13:02:WU02:FS01:0x22:Platform 1: CPU
12:13:02:WU02:FS01:0x22:Platform 2: OpenCL
12:13:02:WU02:FS01:0x22: opencl-device 0 specified
12:13:02:WU02:FS01:0x22:Platform 3: CUDA
12:13:02:WU02:FS01:0x22: cuda-device 0 specified
12:13:14:WU02:FS01:0x22:Attempting to create CUDA context:
12:13:14:WU02:FS01:0x22: Configuring platform CUDA
12:13:28:WU02:FS01:0x22: Using CUDA and gpu 0
12:13:28:WU02:FS01:0x22:Completed 0 out of 1250000 steps (0%)
12:13:31:WU02:FS01:0x22:Checkpoint completed at step 0
12:18:49:WU02:FS01:0x22:Completed 12500 out of 1250000 steps (1%)
End of that unit, and the 0x24 unit that follows it:
20:59:16:WU02:FS01:0x22:Completed 1237500 out of 1250000 steps (99%)
20:59:17:WU00:FS01:Connecting to assign1.foldingathome.org:80
20:59:17:WU00:FS01:Assigned to work server 158.130.118.26
20:59:17:WU00:FS01:Requesting new work unit for slot 01: gpu:9:0 GM107GL [Quadro K2200] from 158.130.118.26
20:59:17:WU00:FS01:Connecting to 158.130.118.26:8080
20:59:18:WU00:FS01:Downloading 9.20MiB
20:59:18:WU00:FS01:Download complete
20:59:18:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:18235 run:48 clone:3 gen:32 core:0x24 unit:0x00000003000000200000473b00000030
21:04:34:WU02:FS01:0x22:Completed 1250000 out of 1250000 steps (100%)
21:04:34:WU02:FS01:0x22:Average performance: 13.5849 ns/day
21:04:37:WU02:FS01:0x22:Checkpoint completed at step 1250000
21:04:41:WU02:FS01:0x22:Saving result file ../logfile_01.txt
21:04:41:WU02:FS01:0x22:Saving result file checkpointIntegrator.xml
21:04:41:WU02:FS01:0x22:Saving result file checkpointState.xml
21:04:44:WU02:FS01:0x22:Saving result file positions.xtc
21:04:46:WU02:FS01:0x22:Saving result file science.log
21:04:46:WU02:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
21:04:47:WU02:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
21:04:47:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:18931 run:9 clone:48 gen:90 core:0x22 unit:0x5a0000003000000009000000f3490000
21:04:47:WU02:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
21:04:47:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:18931 run:9 clone:48 gen:90 core:0x22 unit:0x5a0000003000000009000000f3490000
21:04:47:WU02:FS01:Uploading 29.28MiB to 128.174.73.78
21:04:47:WU02:FS01:Connecting to 128.174.73.78:8080
21:04:47:WU00:FS01:Starting
21:04:47:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/openmm-core-24/centos-7.9.2009-64bit/release/0x24-8.1.4/Core_24.fah/FahCore_24 -dir 00 -suffix 01 -version 706 -lifeline 1 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
21:04:47:WU00:FS01:Started FahCore on PID 179
21:04:47:WU00:FS01:Core PID:183
21:04:47:WU00:FS01:FahCore 0x24 started
21:04:48:WARNING:WU00:FS01:FahCore returned: WU_STALLED (127 = 0x7f)
A few minutes later:
21:10:51:WARNING:WU00:FS01:Too many errors, failing
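To see whether 0x24 really fails every time, or whether it only looks that way because more 0x24 WUs are arriving, the client log can be tallied by core type and return status. This is a minimal sketch that assumes the line format seen in the excerpts above (`FahCore 0x24 started`, `FahCore returned: WU_STALLED (127 = 0x7f)`); the function name `tally_core_results` and the log file argument are made up for illustration. Each start is paired with at most one return, so a repeated `FahCore returned` line is not counted twice.

```python
import re
import sys
from collections import Counter, defaultdict

# e.g. "21:04:47:WU00:FS01:FahCore 0x24 started"
STARTED = re.compile(r":(WU\d+):(FS\d+):FahCore (0x[0-9a-fA-F]+) started")
# e.g. "21:04:48:WARNING:WU00:FS01:FahCore returned: WU_STALLED (127 = 0x7f)"
RETURNED = re.compile(
    r":(WU\d+):(FS\d+):FahCore returned: (\w+) \((\d+) = 0x[0-9a-fA-F]+\)"
)


def tally_core_results(lines):
    """Count return statuses per core type, e.g. {'0x24': {'WU_STALLED': 3}}."""
    running = {}                     # (WU, slot) -> core type currently running
    results = defaultdict(Counter)   # core type -> Counter of return statuses
    for line in lines:
        started = STARTED.search(line)
        if started:
            running[(started.group(1), started.group(2))] = started.group(3)
            continue
        returned = RETURNED.search(line)
        if returned:
            key = (returned.group(1), returned.group(2))
            # pop() pairs each start with one return and skips repeated lines.
            core = running.pop(key, None)
            if core is not None:
                results[core][returned.group(3)] += 1
    return results


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: tally_cores.py /path/to/log.txt")
    with open(sys.argv[1], errors="replace") as handle:
        for core, statuses in sorted(tally_core_results(handle).items()):
            summary = ", ".join(f"{name}: {n}" for name, n in sorted(statuses.items()))
            print(f"{core}  {summary}")
```

If 0x22 shows only `FINISHED_UNIT` while 0x24 shows only `WU_STALLED`, that points at the 0x24 core itself rather than at how often those units are assigned.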