13400 (0, 82, 7) Linux CORE_RESTART (98 = 0x62)

Posted: Mon Apr 27, 2020 12:08 pm
by ForbiddenBacon
Hi,

I picked up a WU for project 13400 today, but it fails to start processing and exits with CORE_RESTART (98 = 0x62). Other projects have been running fine on the GPU, and the temps are stable. The hardware is in the log below, but for clarity it's an AMD Radeon RX 5700 XT. I also can't find a work folder to purge to get rid of the WU. How can I clear this so the GPU can get on with other WUs? And is this a problem on my end or with the WU itself?

Code:

11:54:01:INFO(1):Read GPUs.txt
11:54:01:Removing old file 'logs/log-20200423-082722.txt'
11:54:01:****************************** FAHClient ******************************
11:54:01:        Version: 7.6.9
11:54:01:         Author: Joseph Coffland <[email protected]>
11:54:01:      Copyright: 2020 foldingathome.org
11:54:01:       Homepage: https://foldingathome.org/
11:54:01:           Date: Apr 17 2020
11:54:01:           Time: 18:11:26
11:54:01:       Revision: 398c2b17fa535e0cc6c9d10856b2154c32771646
11:54:01:         Branch: master
11:54:01:       Compiler: GNU 8.3.0
11:54:01:        Options: -std=c++11 -ffunction-sections -fdata-sections -O3
11:54:01:                 -funroll-loops -fno-pie
11:54:01:       Platform: linux2 4.19.0-5-amd64
11:54:01:           Bits: 64
11:54:01:           Mode: Release
11:54:01:           Args: --config /home/andy/config.xml
11:54:01:         Config: /home/andy/config.xml
11:54:01:******************************** CBang ********************************
11:54:01:           Date: Apr 17 2020
11:54:01:           Time: 18:10:13
11:54:01:       Revision: 2fb0be7809c5e45287a122ca5fbc15b5ae859a3b
11:54:01:         Branch: master
11:54:01:       Compiler: GNU 8.3.0
11:54:01:        Options: -std=c++11 -ffunction-sections -fdata-sections -O3
11:54:01:                 -funroll-loops -fno-pie -fPIC
11:54:01:       Platform: linux2 4.19.0-5-amd64
11:54:01:           Bits: 64
11:54:01:           Mode: Release
11:54:01:******************************* System ********************************
11:54:01:            CPU: AMD Ryzen 9 3950X 16-Core Processor
11:54:01:         CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
11:54:01:           CPUs: 32
11:54:01:         Memory: 62.79GiB
11:54:01:    Free Memory: 16.70GiB
11:54:01:        Threads: POSIX_THREADS
11:54:01:     OS Version: 5.4
11:54:01:    Has Battery: false
11:54:01:     On Battery: false
11:54:01:     UTC Offset: 1
11:54:01:            PID: 1556213
11:54:01:            CWD: /home/andy
11:54:01:             OS: Linux 5.4.33-3-MANJARO x86_64
11:54:01:        OS Arch: AMD64
11:54:01:           GPUs: 1
11:54:01:          GPU 0: Bus:12 Slot:0 Func:0 AMD:6 Navi 10 [Radeon RX 5600 OEM/5600 XT
11:54:01:                 / 5700/5700 XT]
11:54:01:           CUDA: Not detected: Failed to open dynamic library 'libcuda.so':
11:54:01:                 libcuda.so: cannot open shared object file: No such file or
11:54:01:                 directory
11:54:01:OpenCL Device 0: Platform:0 Device:0 Bus:12 Slot:0 Compute:2.0 Driver:3004.6
11:54:01:OpenCL Device 1: Platform:1 Device:0 Bus:NA Slot:NA Compute:1.1 Driver:20.0
11:54:01:******************************* libFAH ********************************
11:54:01:           Date: Apr 15 2020
11:54:01:           Time: 21:43:24
11:54:01:       Revision: 216968bc7025029c841ed6e36e81a03a316890d3
11:54:01:         Branch: master
11:54:01:       Compiler: GNU 8.3.0
11:54:01:        Options: -std=c++11 -ffunction-sections -fdata-sections -O3
11:54:01:                 -funroll-loops -fno-pie
11:54:01:       Platform: linux2 4.19.0-5-amd64
11:54:01:           Bits: 64
11:54:01:           Mode: Release
11:54:01:***********************************************************************
11:54:01:<config>
11:54:01:  <!-- Network -->
11:54:01:  <proxy v=':8080'/>
11:54:01:
11:54:01:  <!-- Slot Control -->
11:54:01:  <power v='full'/>
11:54:01:
11:54:01:  <!-- User Information -->
11:54:01:  <passkey v='*****'/>
11:54:01:  <team v='223518'/>
11:54:01:  <user v='ForbiddenBacon'/>
11:54:01:
11:54:01:  <!-- Folding Slots -->
11:54:01:  <slot id='0' type='CPU'/>
11:54:01:  <slot id='1' type='GPU'>
11:54:01:    <opencl-index v='0'/>
11:54:01:  </slot>
11:54:01:</config>
11:54:01:Trying to access database...
11:54:01:Successfully acquired database lock
11:54:01:Enabled folding slot 00: READY cpu:31
11:54:01:Enabled folding slot 01: READY gpu:0:Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT]
11:54:01:WU02:FS01:Starting
11:54:01:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /home/andy/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 -dir 02 -suffix 01 -version 706 -lifeline 1556213 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
11:54:01:WU02:FS01:Started FahCore on PID 1556293
11:54:01:WU02:FS01:Core PID:1556297
11:54:01:WU02:FS01:FahCore 0x22 started
11:54:16:WARNING:WU02:FS01:FahCore returned: CORE_RESTART (98 = 0x62)
11:54:16:WU02:FS01:Starting
11:54:16:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /home/andy/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 -dir 02 -suffix 01 -version 706 -lifeline 1556213 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
11:54:16:WU02:FS01:Started FahCore on PID 1556396
11:54:16:WU02:FS01:Core PID:1556400
11:54:16:WU02:FS01:FahCore 0x22 started
11:54:31:WARNING:WU02:FS01:FahCore returned: CORE_RESTART (98 = 0x62)
11:55:16:WU02:FS01:Starting
11:55:16:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /home/andy/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 -dir 02 -suffix 01 -version 706 -lifeline 1556213 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
11:55:16:WU02:FS01:Started FahCore on PID 1556588
11:55:16:WU02:FS01:Core PID:1556592
11:55:16:WU02:FS01:FahCore 0x22 started
11:55:31:WARNING:WU02:FS01:FahCore returned: CORE_RESTART (98 = 0x62)
Cheers,

Re: 13400 (0, 82, 7) Linux CORE_RESTART (98 = 0x62)

Posted: Mon Apr 27, 2020 4:20 pm
by Joe_H
Your CWD is listed as /home/andy, so there should be a work folder there. Inside it there should be a folder named '01' for Folding Slot 01 and a folder named '02' for the WU. Pause folding, then delete 02; when restarted, the client will detect the missing work files and dump the WU.
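
For anyone who wants to script the lookup, here is a minimal Python sketch of how the WU folder could be worked out from the log. It assumes the layout described above (a folder named `work` directly under the logged CWD) and that the `-dir` argument on the "Running FahCore" line names the WU folder. The helper names `find_wu_dir` and `remove_wu_dir` are made up for illustration and are not part of FAHClient. Pause folding before deleting anything.

```python
import re
import shutil
from pathlib import Path

# Patterns for the two log lines of interest (formats taken from the log above).
CWD_RE = re.compile(r"\bCWD:\s*(\S+)")
DIR_RE = re.compile(r"Running FahCore:.*\s-dir\s+(\S+)")


def find_wu_dir(log_lines):
    """Return the presumed folder of the last WU started in the log, or None.

    Assumption: the work folder is '<CWD>/work/<-dir value>'.
    """
    cwd = None
    wu_dir = None
    for line in log_lines:
        match = CWD_RE.search(line)
        if match:
            cwd = match.group(1)
        match = DIR_RE.search(line)
        if match:
            wu_dir = match.group(1)
    if cwd is None or wu_dir is None:
        return None
    return Path(cwd) / "work" / wu_dir


def remove_wu_dir(path, dry_run=True):
    """Delete the WU folder. With dry_run=True, only report whether it exists."""
    if not path.is_dir():
        return False
    if not dry_run:
        shutil.rmtree(path)
    return True


# Two lines from the log in the first post (the core arguments are cut short).
sample_log = [
    "11:54:01:            CWD: /home/andy",
    "11:54:01:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper "
    "/home/andy/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 "
    "-dir 02 -suffix 01",
]

target = find_wu_dir(sample_log)
print(target)
```

Calling `remove_wu_dir(target)` with the default `dry_run=True` only checks that the folder is there; pass `dry_run=False` once you are sure it is the right one.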

Re: 13400 (0, 82, 7) Linux CORE_RESTART (98 = 0x62)

Posted: Mon Apr 27, 2020 4:26 pm
by ForbiddenBacon
Thank you, I completely missed that I was running without --chdir, so the folders were not where I expected them. Thanks for spotting it (it's been one of those days).