16926 - Some sort of loop with this CPU WU
Posted: Mon Nov 30, 2020 3:16 am
I first noticed a problem at 23:52Z: the client keeps looping on a CPU WU. I paused and unpaused the slot and also tried rebooting, but I still get the log below:
*********************** Log Started 2020-11-30T03:10:47Z ***********************
03:10:47:Trying to access database...
03:10:48:Successfully acquired database lock
03:10:48:Downloading GPUs.txt from assign1.foldingathome.org:80
03:10:48:Connecting to assign1.foldingathome.org:80
03:10:48:Read GPUs.txt
03:10:48:Enabled folding slot 00: READY cpu:4
03:10:50:Enabled folding slot 01: PAUSED gpu:0:GV100GL [Tesla V100 PCIe 16GB] M 14028 (by user)
03:10:50:****************************** FAHClient ******************************
03:10:50: Version: 7.6.13
03:10:50: Author: Joseph Coffland <[email protected]>
03:10:50: Copyright: 2020 foldingathome.org
03:10:50: Homepage: https://foldingathome.org/
03:10:50: Date: Apr 28 2020
03:10:50: Time: 04:20:16
03:10:50: Revision: 5a652817f46116b6e135503af97f18e094414e3b
03:10:50: Branch: master
03:10:50: Compiler: GNU 8.3.0
03:10:50: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
03:10:50: -funroll-loops -fno-pie
03:10:50: Platform: linux2 4.19.0-5-amd64
03:10:50: Bits: 64
03:10:50: Mode: Release
03:10:50: Args: --child /etc/fahclient/config.xml --run-as fahclient
03:10:50: --pid-file=/var/run/fahclient.pid --daemon
03:10:50: Config: /etc/fahclient/config.xml
03:10:50:******************************** CBang ********************************
03:10:50: Date: Apr 25 2020
03:10:50: Time: 00:07:53
03:10:50: Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
03:10:50: Branch: master
03:10:50: Compiler: GNU 8.3.0
03:10:50: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
03:10:50: -funroll-loops -fno-pie -fPIC
03:10:50: Platform: linux2 4.19.0-5-amd64
03:10:50: Bits: 64
03:10:50: Mode: Release
03:10:50:******************************* System ********************************
03:10:50: CPU: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
03:10:50: CPU ID: GenuineIntel Family 6 Model 79 Stepping 1
03:10:50: CPUs: 6
03:10:50: Memory: 110.17GiB
03:10:50: Free Memory: 109.39GiB
03:10:50: Threads: POSIX_THREADS
03:10:50: OS Version: 4.19
03:10:50: Has Battery: false
03:10:50: On Battery: false
03:10:50: UTC Offset: 0
03:10:50: PID: 651
03:10:50: CWD: /var/lib/fahclient
03:10:50: OS: Linux 4.19.0-12-cloud-amd64 x86_64
03:10:50: OS Arch: AMD64
03:10:50: GPUs: 1
03:10:50: GPU 0: Bus:0 Slot:0 Func:0 NVIDIA:7 GV100GL [Tesla V100 PCIe 16GB] M
03:10:50: 14028
03:10:50: CUDA Device 0: Platform:0 Device:0 Bus:0 Slot:0 Compute:7.0 Driver:11.0
03:10:50:OpenCL Device 0: Platform:0 Device:0 Bus:0 Slot:0 Compute:1.2 Driver:450.80
03:10:50:******************************* libFAH ********************************
03:10:50: Date: Apr 15 2020
03:10:50: Time: 21:43:24
03:10:50: Revision: 216968bc7025029c841ed6e36e81a03a316890d3
03:10:50: Branch: master
03:10:50: Compiler: GNU 8.3.0
03:10:50: Options: -std=c++11 -ffunction-sections -fdata-sections -O3
03:10:50: -funroll-loops -fno-pie
03:10:50: Platform: linux2 4.19.0-5-amd64
03:10:50: Bits: 64
03:10:50: Mode: Release
03:10:50:***********************************************************************
03:10:50:<config>
03:10:50: <!-- Client Control -->
03:10:50: <fold-anon v='true'/>
03:10:50:
03:10:50: <!-- Folding Slot Configuration -->
03:10:50: <cpus v='4'/>
03:10:50:
03:10:50: <!-- HTTP Server -->redacted
03:10:50:
03:10:50: <!-- Folding Slots -->
03:10:50: <slot id='0' type='CPU'/>
03:10:50: <slot id='1' type='GPU'>
03:10:50: <paused v='true'/>
03:10:50: </slot>
03:10:50:</config>
03:10:50:WU01:FS00:Starting
03:10:50:WU01:FS00:Removing old file 'work/01/logfile_01-20201130-023821.txt'
03:10:50:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.9/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 651 -checkpoint 15 -np 4
03:10:50:WU01:FS00:Started FahCore on PID 768
03:10:50:WU01:FS00:Core PID:776
03:10:50:WU01:FS00:FahCore 0xa8 started
03:10:50:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
03:10:51:WU01:FS00:Starting
03:10:51:WU01:FS00:Removing old file 'work/01/logfile_01-20201130-023921.txt'
03:10:51:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.9/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 651 -checkpoint 15 -np 4
03:10:51:WU01:FS00:Started FahCore on PID 1107
03:10:51:WU01:FS00:Core PID:1111
03:10:51:WU01:FS00:FahCore 0xa8 started
03:10:51:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
03:11:51:WU01:FS00:Starting
03:11:51:WU01:FS00:Removing old file 'work/01/logfile_01-20201130-024021.txt'
03:11:51:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.9/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 651 -checkpoint 15 -np 4
03:11:51:WU01:FS00:Started FahCore on PID 1148
03:11:51:WU01:FS00:Core PID:1152
03:11:51:WU01:FS00:FahCore 0xa8 started
03:11:51:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
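
To make the restart loop in the log above easier to see at a glance, here is a rough Python sketch that counts the `FahCore returned:` lines per WU and slot. The regular expression is only inferred from the lines in this log, and `summarize_returns` and `sample` are made-up names for illustration, not part of FAHClient.

```python
import re
from collections import Counter

# Pattern inferred from lines in this log, for example:
#   03:10:50:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
#   03:10:51:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
RETURN_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):(?:WARNING:)?"
    r"(?P<wu>WU\d+):(?P<slot>FS\d+):FahCore returned: "
    r"(?P<status>\w+) \((?P<code>\d+) = 0x[0-9a-fA-F]+\)"
)


def summarize_returns(lines):
    """Count FahCore return statuses keyed by (WU, slot, status, code)."""
    counts = Counter()
    for line in lines:
        match = RETURN_RE.match(line.strip())
        if match:
            key = (match["wu"], match["slot"], match["status"], int(match["code"]))
            counts[key] += 1
    return counts


# Hypothetical sample: the three return lines from this post plus one unrelated line.
sample = [
    "03:10:50:WU01:FS00:FahCore 0xa8 started",
    "03:10:50:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)",
    "03:10:51:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)",
    "03:11:51:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)",
]

for (wu, slot, status, code), n in sorted(summarize_returns(sample).items()):
    print(f"{wu} {slot} {status} ({code}): {n}")
```

On a real system the same function could be fed the lines of the client log file instead of `sample`. The path of that file depends on the install, so it is left out here.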