Mixed AMD/nVidia system - FINISHED_UNIT == UNKNOWN_ENUM??
Posted: Fri Mar 27, 2020 7:03 pm
I've got a laptop with an Intel integrated GPU, an nVidia discrete GPU, and an AMD GPU over Thunderbolt 3. I spent a good chunk of time last night trying to get the nVidia and AMD GPUs to co-exist. The saga, logs, etc. are here: https://www.reddit.com/r/foldingathome/ ... _now_with/ (an old thread, revived due to Reddit missing its moderators).
After getting it working, I set off to bed. Woke up this morning to some VERY strange log messages. This is from the AMD slot:
This happened twice in the same way. This is the earlier one:
In particular, note this section:
For some reason it seems to take the FINISHED_UNIT return value and somehow, by some inexplicable miracle of nonsense, interpret it as UNKNOWN_ENUM and ask the core to start work again, which of course fails. But then it does nothing with the result. Nothing at all. It seems to just throw it all away, without even sending a result of failure or success to the server.
?Que?
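One possible reading of the numbers, offered as a sketch rather than a diagnosis: the "Core Shutdown: FINISHED_UNIT" line appears to be written by the core itself, while "FahCore returned" reports the process exit code the client saw afterwards. The logged value -1073740791 is the signed 32-bit form of 0xc0000409, which Windows defines as STATUS_STACK_BUFFER_OVERRUN. That would fit the core process aborting during exit, after it had already logged a finished unit. The helper name and the small lookup table below are illustrative assumptions, not part of FAHClient.

```python
from typing import Tuple

# Hypothetical helper, not part of FAHClient. The table holds a few
# well-known Windows NTSTATUS values; it is deliberately not exhaustive.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000013A: "STATUS_CONTROL_C_EXIT",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def decode_exit_code(code: int) -> Tuple[int, str]:
    """Convert a signed 32-bit exit code to unsigned form and look it up."""
    unsigned = code & 0xFFFFFFFF  # reinterpret as an unsigned 32-bit value
    return unsigned, KNOWN_NTSTATUS.get(unsigned, "not in table")


unsigned, name = decode_exit_code(-1073740791)
print(f"-1073740791 -> 0x{unsigned:08x} ({name})")
# prints: -1073740791 -> 0xc0000409 (STATUS_STACK_BUFFER_OVERRUN)
```

If this reading is right, UNKNOWN_ENUM only means the exit code was not one the client recognizes, not that FINISHED_UNIT was misread.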
Obligatory system info:
FWIW, it had previously been working with only the external AMD GPU, before I fought to get the nVidia and AMD cards working under the same umbrella. I reinstalled the nVidia driver first to get it up and running, then reinstalled the AMD driver once nVidia was working. The order matters because AMD and nVidia seem to share an "opencl.dll" file in \windows\system32, which causes a conflict over who "owns" the OpenCL platform. Now that both seem to be working (both show high activity and crunch WUs), the only problem seems to be in how F@H handles it.
Code:
13:20:06:WU00:FS01:0x22:Completed 1760000 out of 2000000 steps (88%)
13:22:20:WU00:FS01:0x22:Completed 1780000 out of 2000000 steps (89%)
13:24:35:WU00:FS01:0x22:Completed 1800000 out of 2000000 steps (90%)
13:26:57:WU00:FS01:0x22:Completed 1820000 out of 2000000 steps (91%)
13:29:13:WU00:FS01:0x22:Completed 1840000 out of 2000000 steps (92%)
13:31:43:WU00:FS01:0x22:Completed 1860000 out of 2000000 steps (93%)
13:33:58:WU00:FS01:0x22:Completed 1880000 out of 2000000 steps (94%)
13:36:13:WU00:FS01:0x22:Completed 1900000 out of 2000000 steps (95%)
13:38:37:WU00:FS01:0x22:Completed 1920000 out of 2000000 steps (96%)
13:40:53:WU00:FS01:0x22:Completed 1940000 out of 2000000 steps (97%)
13:43:14:WU00:FS01:0x22:Completed 1960000 out of 2000000 steps (98%)
13:45:32:WU00:FS01:0x22:Completed 1980000 out of 2000000 steps (99%)
13:45:32:WU01:FS01:Connecting to 65.254.110.245:8080
13:45:32:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
13:45:32:WU01:FS01:Connecting to 18.218.241.186:80
13:45:33:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
13:45:33:ERROR:WU01:FS01:Exception: Could not get an assignment
13:45:33:WU01:FS01:Connecting to 65.254.110.245:8080
13:45:33:WU01:FS01:Assigned to work server 40.114.52.201
13:45:33:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:1:Fiji XT [Radeon R9 Fury X] from 40.114.52.201
13:45:33:WU01:FS01:Connecting to 40.114.52.201:8080
13:45:55:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
13:45:55:WU01:FS01:Connecting to 40.114.52.201:80
13:47:18:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
13:47:18:WU01:FS01:Connecting to 65.254.110.245:8080
13:47:19:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
13:47:19:WU01:FS01:Connecting to 18.218.241.186:80
13:47:19:WU01:FS01:Assigned to work server 128.252.203.10
13:47:19:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:1:Fiji XT [Radeon R9 Fury X] from 128.252.203.10
13:47:19:WU01:FS01:Connecting to 128.252.203.10:8080
13:47:41:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
13:47:41:WU01:FS01:Connecting to 128.252.203.10:80
13:47:59:WU00:FS01:0x22:Completed 2000000 out of 2000000 steps (100%)
13:48:01:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
13:48:04:WU00:FS01:0x22:Saving result file ..\logfile_01.txt
13:48:04:WU00:FS01:0x22:Saving result file checkpointState.xml
13:48:07:WU00:FS01:0x22:Saving result file checkpt.crc
13:48:07:WU00:FS01:0x22:Saving result file positions.xtc
13:48:10:WU00:FS01:0x22:Saving result file science.log
13:48:10:WU00:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
13:48:13:WARNING:WU00:FS01:FahCore returned an unknown error code which probably indicates that it crashed
13:48:13:WARNING:WU00:FS01:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)
13:48:14:WU00:FS01:Starting
13:48:14:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" [...]\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 705 -lifeline 12988 -checkpoint 15 -gpu-vendor amd -opencl-platform 1 -opencl-device 0 -gpu 0
13:48:14:WU00:FS01:Started FahCore on PID 1260
13:48:14:WU00:FS01:Core PID:9952
13:48:14:WU00:FS01:FahCore 0x22 started
13:48:14:WU00:FS01:0x22:*********************** Log Started 2020-03-27T13:48:14Z ***********************
13:48:14:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
13:48:14:WU00:FS01:0x22: Type: 0x22
13:48:14:WU00:FS01:0x22: Core: Core22
13:48:14:WU00:FS01:0x22: Website: https://foldingathome.org/
13:48:14:WU00:FS01:0x22: Copyright: (c) 2009-2018 foldingathome.org
13:48:14:WU00:FS01:0x22: Author: John Chodera <[email protected]> and Rafal Wiewiora
13:48:14:WU00:FS01:0x22: <[email protected]>
13:48:14:WU00:FS01:0x22: Args: -dir 00 -suffix 01 -version 705 -lifeline 1260 -checkpoint 15
13:48:14:WU00:FS01:0x22: -gpu-vendor amd -opencl-platform 1 -opencl-device 0 -gpu 0
13:48:14:WU00:FS01:0x22: Config: <none>
13:48:14:WU00:FS01:0x22:************************************ Build *************************************
13:48:14:WU00:FS01:0x22: Version: 0.0.2
13:48:14:WU00:FS01:0x22: Date: Dec 6 2019
13:48:14:WU00:FS01:0x22: Time: 21:30:31
13:48:14:WU00:FS01:0x22: Repository: Git
13:48:14:WU00:FS01:0x22: Revision: abeb39247cc72df5af0f63723edafadb23d5dfbe
13:48:14:WU00:FS01:0x22: Branch: HEAD
13:48:14:WU00:FS01:0x22: Compiler: Visual C++ 2008
13:48:14:WU00:FS01:0x22: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
13:48:14:WU00:FS01:0x22: Platform: win32 10
13:48:14:WU00:FS01:0x22: Bits: 64
13:48:14:WU00:FS01:0x22: Mode: Release
13:48:14:WU00:FS01:0x22:************************************ System ************************************
13:48:14:WU00:FS01:0x22: CPU: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
13:48:14:WU00:FS01:0x22: CPU ID: GenuineIntel Family 6 Model 158 Stepping 9
13:48:14:WU00:FS01:0x22: CPUs: 8
13:48:14:WU00:FS01:0x22: Memory: 15.88GiB
13:48:14:WU00:FS01:0x22:Free Memory: 10.41GiB
13:48:14:WU00:FS01:0x22: Threads: WINDOWS_THREADS
13:48:14:WU00:FS01:0x22: OS Version: 6.2
13:48:14:WU00:FS01:0x22:Has Battery: true
13:48:14:WU00:FS01:0x22: On Battery: false
13:48:14:WU00:FS01:0x22: UTC Offset: -7
13:48:14:WU00:FS01:0x22: PID: 9952
13:48:14:WU00:FS01:0x22: CWD: [...]\FAHClient\work
13:48:14:WU00:FS01:0x22: OS: Windows 10 Pro
13:48:14:WU00:FS01:0x22: OS Arch: AMD64
13:48:14:WU00:FS01:0x22:********************************************************************************
13:48:14:WU00:FS01:0x22:Project: 11748 (Run 0, Clone 6851, Gen 3)
13:48:14:WU00:FS01:0x22:Unit: 0x000000058ca304e75e6bb11c3a3521df
13:48:14:WU00:FS01:0x22:Reading tar file core.xml
13:48:14:WU00:FS01:0x22:Reading tar file integrator.xml
13:48:14:WU00:FS01:0x22:Reading tar file state.xml
13:48:16:WU00:FS01:0x22:Reading tar file system.xml
13:48:16:WU00:FS01:0x22:Digital signatures verified
13:48:16:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
13:48:16:WU00:FS01:0x22:Version 0.0.2
13:48:56:WU01:FS01:Connecting to 65.254.110.245:8080
13:48:56:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
13:48:56:WU01:FS01:Connecting to 18.218.241.186:80
13:48:56:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
13:48:56:ERROR:WU01:FS01:Exception: Could not get an assignment
13:51:33:WU01:FS01:Connecting to 65.254.110.245:8080
13:51:33:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
13:51:33:WU01:FS01:Connecting to 18.218.241.186:80
13:51:33:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
13:51:33:ERROR:WU01:FS01:Exception: Could not get an assignment
13:55:47:WU01:FS01:Connecting to 65.254.110.245:8080
13:55:47:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
13:55:47:WU01:FS01:Connecting to 18.218.241.186:80
Code:
13:31:43:WU00:FS01:0x22:Completed 1860000 out of 2000000 steps (93%)
13:33:58:WU00:FS01:0x22:Completed 1880000 out of 2000000 steps (94%)
13:36:13:WU00:FS01:0x22:Completed 1900000 out of 2000000 steps (95%)
13:38:37:WU00:FS01:0x22:Completed 1920000 out of 2000000 steps (96%)
13:40:53:WU00:FS01:0x22:Completed 1940000 out of 2000000 steps (97%)
13:43:14:WU00:FS01:0x22:Completed 1960000 out of 2000000 steps (98%)
13:45:32:WU00:FS01:0x22:Completed 1980000 out of 2000000 steps (99%)
13:45:32:WU01:FS01:Connecting to 65.254.110.245:8080
13:45:32:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
13:45:32:WU01:FS01:Connecting to 18.218.241.186:80
13:45:33:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
13:45:33:ERROR:WU01:FS01:Exception: Could not get an assignment
13:45:33:WU01:FS01:Connecting to 65.254.110.245:8080
13:45:33:WU01:FS01:Assigned to work server 40.114.52.201
13:45:33:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:1:Fiji XT [Radeon R9 Fury X] from 40.114.52.201
13:45:33:WU01:FS01:Connecting to 40.114.52.201:8080
13:45:55:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
13:45:55:WU01:FS01:Connecting to 40.114.52.201:80
13:47:18:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
13:47:18:WU01:FS01:Connecting to 65.254.110.245:8080
13:47:19:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
13:47:19:WU01:FS01:Connecting to 18.218.241.186:80
13:47:19:WU01:FS01:Assigned to work server 128.252.203.10
13:47:19:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:1:Fiji XT [Radeon R9 Fury X] from 128.252.203.10
13:47:19:WU01:FS01:Connecting to 128.252.203.10:8080
13:47:41:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
13:47:41:WU01:FS01:Connecting to 128.252.203.10:80
13:47:59:WU00:FS01:0x22:Completed 2000000 out of 2000000 steps (100%)
13:48:01:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
13:48:04:WU00:FS01:0x22:Saving result file ..\logfile_01.txt
13:48:04:WU00:FS01:0x22:Saving result file checkpointState.xml
13:48:07:WU00:FS01:0x22:Saving result file checkpt.crc
13:48:07:WU00:FS01:0x22:Saving result file positions.xtc
13:48:10:WU00:FS01:0x22:Saving result file science.log
13:48:10:WU00:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
13:48:13:WARNING:WU00:FS01:FahCore returned an unknown error code which probably indicates that it crashed
13:48:13:WARNING:WU00:FS01:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)
13:48:14:WU00:FS01:Starting
13:48:14:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" [...]\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 705 -lifeline 12988 -checkpoint 15 -gpu-vendor amd -opencl-platform 1 -opencl-device 0 -gpu 0
13:48:14:WU00:FS01:Started FahCore on PID 1260
13:48:14:WU00:FS01:Core PID:9952
13:48:14:WU00:FS01:FahCore 0x22 started
09:06:31:WU00:FS01:Connecting to 65.254.110.245:8080
09:06:32:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
09:06:32:WU00:FS01:Connecting to 18.218.241.186:80
09:06:32:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
09:06:32:ERROR:WU00:FS01:Exception: Could not get an assignment
09:09:09:WU00:FS01:Connecting to 65.254.110.245:8080
09:09:09:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
09:09:09:WU00:FS01:Connecting to 18.218.241.186:80
09:09:09:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
09:09:09:ERROR:WU00:FS01:Exception: Could not get an assignment
09:13:23:WU00:FS01:Connecting to 65.254.110.245:8080
09:13:23:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
09:13:23:WU00:FS01:Connecting to 18.218.241.186:80
Code:
13:48:07:WU00:FS01:0x22:Saving result file positions.xtc
13:48:10:WU00:FS01:0x22:Saving result file science.log
13:48:10:WU00:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
13:48:13:WARNING:WU00:FS01:FahCore returned an unknown error code which probably indicates that it crashed
13:48:13:WARNING:WU00:FS01:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)
13:48:14:WU00:FS01:Starting
13:48:14:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" [...]\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 705 -lifeline 12988 -checkpoint 15 -gpu-vendor amd -opencl-platform 1 -opencl-device 0 -gpu 0
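To find every occurrence of this pattern in a long log, a small scanner can pair each "Core Shutdown: FINISHED_UNIT" line with the next "FahCore returned:" line for the same work unit and slot. This is a minimal sketch that assumes the line layout seen in the excerpts above (time, optional WARNING/ERROR tag, WU, FS, message). The function name is hypothetical, and the assumption that a clean run reports a value starting with FINISHED_UNIT is mine, not something these logs show.

```python
import re

# Layout assumed from the excerpts: HH:MM:SS:[WARNING:|ERROR:]WUnn:FSnn:message
LINE = re.compile(r"^\d\d:\d\d:\d\d:(?:WARNING:|ERROR:)?(WU\d+):(FS\d+):(.*)$")


def find_finished_then_failed(lines):
    """Return (wu, slot, returned) where the core logged FINISHED_UNIT but
    the client then reported a different return value."""
    finished = set()
    hits = []
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue
        wu, slot, msg = m.groups()
        key = (wu, slot)
        if "Core Shutdown: FINISHED_UNIT" in msg:
            finished.add(key)
        elif msg.startswith("FahCore returned:"):
            returned = msg.split(":", 1)[1].strip()
            if key in finished and not returned.startswith("FINISHED_UNIT"):
                hits.append((wu, slot, returned))
            finished.discard(key)
    return hits


sample = [
    "13:48:10:WU00:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT",
    "13:48:13:WARNING:WU00:FS01:FahCore returned an unknown error code which probably indicates that it crashed",
    "13:48:13:WARNING:WU00:FS01:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)",
]
print(find_finished_then_failed(sample))
```

Running it over the full log file (for example `find_finished_then_failed(open(path))`, with `path` pointing at your own log) should list both occurrences mentioned above, if they follow the same pattern.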
Code:
*********************** Log Started 2020-03-27T09:03:21Z ***********************
09:03:21:************************* Folding@home Client *************************
09:03:21: Website: https://foldingathome.org/
09:03:21: Copyright: (c) 2009-2018 foldingathome.org
09:03:21: Author: Joseph Coffland <[email protected]>
09:03:21: Args:
09:03:21: Config: [...]\FAHClient\config.xml
09:03:21:******************************** Build ********************************
09:03:21: Version: 7.5.1
09:03:21: Date: May 11 2018
09:03:21: Time: 13:06:32
09:03:21: Repository: Git
09:03:21: Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
09:03:21: Branch: master
09:03:21: Compiler: Visual C++ 2008
09:03:21: Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
09:03:21: Platform: win32 10
09:03:21: Bits: 32
09:03:21: Mode: Release
09:03:21:******************************* System ********************************
09:03:21: CPU: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
09:03:21: CPU ID: GenuineIntel Family 6 Model 158 Stepping 9
09:03:21: CPUs: 8
09:03:21: Memory: 15.88GiB
09:03:21: Free Memory: 8.13GiB
09:03:21: Threads: WINDOWS_THREADS
09:03:21: OS Version: 6.2
09:03:21: Has Battery: true
09:03:21: On Battery: false
09:03:21: UTC Offset: -7
09:03:21: PID: 12988
09:03:21: CWD: [...]\FAHClient
09:03:21: OS: Windows 10 Enterprise
09:03:21: OS Arch: AMD64
09:03:21: GPUs: 2
09:03:21: GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:4 GM108 [GeForce 940MX]
09:03:21: GPU 1: Bus:9 Slot:0 Func:0 AMD:5 Fiji XT [Radeon R9 Fury X]
09:03:21: CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:5.0 Driver:11.0
09:03:21:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:445.75
09:03:21:OpenCL Device 1: Platform:1 Device:0 Bus:9 Slot:0 Compute:1.2 Driver:2906.10
09:03:21:OpenCL Device 2: Platform:2 Device:0 Bus:NA Slot:NA Compute:2.1 Driver:24.20
09:03:21: Win32 Service: false
09:03:21:***********************************************************************
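Since the driver reinstall order affected which vendor "owns" OpenCL, it may be worth checking that the client's GPU list and OpenCL device list agree with the arguments passed to the core. The sketch below parses the "GPU n:" and "OpenCL Device n:" lines from the system info and joins them on the PCI bus number. Joining on bus alone is a simplification, and the function name is hypothetical.

```python
import re

GPU = re.compile(r"GPU (\d+): Bus:(\d+) Slot:(\d+) Func:\d+ (\w+):\d+ (.*)$")
OCL = re.compile(r"OpenCL Device (\d+): Platform:(\d+) Device:(\d+) Bus:(\w+) Slot:(\w+)")


def map_gpus_to_opencl(lines):
    """Map each GPU index to its vendor, name and (platform, device) pair."""
    gpus, ocl = {}, {}
    for line in lines:
        g = GPU.search(line)
        if g:
            gpus[int(g.group(2))] = (int(g.group(1)), g.group(4), g.group(5).strip())
            continue
        o = OCL.search(line)
        if o and o.group(4).isdigit():  # skip entries reported with Bus:NA
            ocl[int(o.group(4))] = (int(o.group(2)), int(o.group(3)))
    return {
        idx: {"vendor": vendor, "name": name, "opencl": ocl.get(bus)}
        for bus, (idx, vendor, name) in gpus.items()
    }


system_info = [
    "09:03:21: GPUs: 2",
    "09:03:21: GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:4 GM108 [GeForce 940MX]",
    "09:03:21: GPU 1: Bus:9 Slot:0 Func:0 AMD:5 Fiji XT [Radeon R9 Fury X]",
    "09:03:21:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:445.75",
    "09:03:21:OpenCL Device 1: Platform:1 Device:0 Bus:9 Slot:0 Compute:1.2 Driver:2906.10",
    "09:03:21:OpenCL Device 2: Platform:2 Device:0 Bus:NA Slot:NA Compute:2.1 Driver:24.20",
]
for idx, info in sorted(map_gpus_to_opencl(system_info).items()):
    print(idx, info)
```

For this system the Fury X (GPU 1, bus 9) maps to OpenCL platform 1, device 0, which matches the `-opencl-platform 1 -opencl-device 0` arguments in the core's command line. So the platform assignment itself looks consistent, at least on paper.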