Tesla P4 failing
Posted: Fri Aug 19, 2022 10:47 pm
I let this sit for 3 hours, and this is all I get on this device. I've tried more than a few driver sets. I also tried it in Linux, and it would not enable. Any advice?
20:59:14:WU00:FS05:0x22: Version: 7.7.0
20:59:14:WU00:FS05:0x22:********************************************************************************
20:59:14:WU00:FS05:0x22:Project: 17918 (Run 929, Clone 0, Gen 58)
20:59:14:WU00:FS05:0x22:Reading tar file core.xml
20:59:15:WU00:FS05:0x22:Reading tar file integrator.xml
20:59:15:WU00:FS05:0x22:Reading tar file state.xml
20:59:15:WU00:FS05:0x22:Reading tar file system.xml
20:59:18:WU00:FS05:0x22:Digital signatures verified
20:59:18:WU00:FS05:0x22:Folding@home GPU Core22 Folding@home Core
20:59:18:WU00:FS05:0x22:Version 0.0.20
20:59:18:WU00:FS05:0x22: Checkpoint write interval: 50000 steps (5%) [20 total]
20:59:18:WU00:FS05:0x22: JSON viewer frame write interval: 10000 steps (1%) [100 total]
20:59:18:WU00:FS05:0x22: XTC frame write interval: 25000 steps (2.5%) [40 total]
20:59:18:WU00:FS05:0x22: Global context and integrator variables write interval: disabled
20:59:18:WU00:FS05:0x22:There are 4 platforms available.
20:59:18:WU00:FS05:0x22:Platform 0: Reference
20:59:18:WU00:FS05:0x22:Platform 1: CPU
20:59:18:WU00:FS05:0x22:Platform 2: OpenCL
20:59:18:WU00:FS05:0x22: opencl-device 0 specified
20:59:18:WU00:FS05:0x22:Platform 3: CUDA
20:59:18:WU00:FS05:0x22: cuda-device 0 specified
21:00:25:WU00:FS05:0x22:Attempting to create CUDA context:
21:00:25:WU00:FS05:0x22: Configuring platform CUDA
21:00:42:WU00:FS05:0x22:ERROR:Discrepancy: Forces are blowing up! 683 0
21:00:42:WU00:FS05:0x22:Saving result file ..\logfile_01.txt
21:00:42:WU00:FS05:0x22:Saving result file science.log
21:00:42:WU00:FS05:0x22:Saving result file state.xml
21:00:47:WU00:FS05:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
*********************** Log Started 2022-08-19T19:12:18Z ***********************
20:05:32:WU00:FS05:0x22:WARNING:Console control signal 1 on PID 9796
20:06:33:WARNING:FS05:Killing WU00
20:59:08:WU00:FS05:0x22:ERROR:exception: Error loading CUDA module: CUDA_ERROR_ILLEGAL_ADDRESS (700)
20:59:13:WARNING:WU00:FS05:FahCore returned an unknown error code which probably indicates that it crashed
20:59:13:WARNING:WU00:FS05:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)
21:00:42:WU00:FS05:0x22:ERROR:Discrepancy: Forces are blowing up! 683 0