Failing all GPU Work Units
Posted: Sat Dec 20, 2014 10:44 pm
Hi there,
Both of my graphics cards are failing every GPU Work Unit they receive. The cards had not been folding for a week because I was on vacation. When I got back home and started them up again, they began getting BAD_WORK_UNIT errors on every unit, across several different projects. Before I left, the cards were working completely normally, and I have made no changes since then. There are no (manual) overclocks on either card.
I have restarted my computer, deleted and re-added the GPU slots, deleted FahCore_17 from /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_17.fah/, and uninstalled and reinstalled FAHClient, all with no success. Can anyone recommend any other steps? (I have paused the slots because I don't know whether all of these dumped units are hurting my Unit Return Ratio.)
System details:
Code: Select all
*********************** Log Started 2014-12-20T22:20:35Z ***********************
22:20:35:************************* Folding@home Client *************************
22:20:35: Website: http://folding.stanford.edu/
22:20:35: Copyright: (c) 2009-2014 Stanford University
22:20:35: Author: Joseph Coffland <[email protected]>
22:20:35: Args: --child --lifeline 3805 /etc/fahclient/config.xml --run-as
22:20:35: fahclient --pid-file=/var/run/fahclient.pid --daemon
22:20:35: Config: /etc/fahclient/config.xml
22:20:35:******************************** Build ********************************
22:20:35: Version: 7.4.4
22:20:35: Date: Mar 4 2014
22:20:35: Time: 12:02:38
22:20:35: SVN Rev: 4130
22:20:35: Branch: fah/trunk/client
22:20:35: Compiler: GNU 4.4.7
22:20:35: Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
22:20:35: -fno-unsafe-math-optimizations -msse2
22:20:35: Platform: linux2 3.2.0-1-amd64
22:20:35: Bits: 64
22:20:35: Mode: Release
22:20:35:******************************* System ********************************
22:20:35: CPU: AMD FX(tm)-8120 Eight-Core Processor
22:20:35: CPU ID: AuthenticAMD Family 21 Model 1 Stepping 2
22:20:35: CPUs: 8
22:20:35: Memory: 3.84GiB
22:20:35:Free Memory: 643.03MiB
22:20:35: Threads: POSIX_THREADS
22:20:35: OS Version: 3.13
22:20:35:Has Battery: false
22:20:35: On Battery: false
22:20:35: UTC Offset: -5
22:20:35: PID: 3807
22:20:35: CWD: /var/lib/fahclient
22:20:35: OS: Linux 3.13.0-24-generic x86_64
22:20:35: OS Arch: AMD64
22:20:35: GPUs: 2
22:20:35: GPU 0: NVIDIA:2 GF104 [GeForce GTX 460]
22:20:35: GPU 1: NVIDIA:3 GK106 [GeForce GTX 650 Ti]
22:20:35: CUDA: Not detected
22:20:35:***********************************************************************
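For anyone comparing their own startup block against the one above, here is a minimal Python sketch that pulls the `key: value` pairs out of a FAHClient log header and lists the fields reported as "Not detected". The function names (`parse_system_fields`, `undetected`) and the `SAMPLE` text are made up for illustration, and the line format is assumed from the excerpt in this post rather than from any official specification.

```python
import re

# Assumed line shape, taken from the excerpt above:
#   "HH:MM:SS: Key: value"  or  "HH:MM:SS:Key: value"
LINE_RE = re.compile(r"^\d{2}:\d{2}:\d{2}:\s*([^:]+?):\s*(.*)$")


def parse_system_fields(log_text):
    """Return a dict of header fields found in a client log (hypothetical helper)."""
    fields = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def undetected(fields):
    """Return the names of fields whose value is exactly 'Not detected'."""
    return [key for key, value in fields.items() if value == "Not detected"]


# A few lines copied from the header in this post.
SAMPLE = """\
22:20:35:******************************* System ********************************
22:20:35: OS: Linux 3.13.0-24-generic x86_64
22:20:35: GPUs: 2
22:20:35: GPU 0: NVIDIA:2 GF104 [GeForce GTX 460]
22:20:35: GPU 1: NVIDIA:3 GK106 [GeForce GTX 650 Ti]
22:20:35: CUDA: Not detected
"""

if __name__ == "__main__":
    parsed = parse_system_fields(SAMPLE)
    print("GPUs reported:", parsed.get("GPUs"))
    print("Fields not detected:", undetected(parsed))
```

Banner lines (the rows of asterisks) and continuation lines without a `key:` part are skipped, so only the labelled fields end up in the dictionary.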
Relevant log information:
Code: Select all
22:21:23:Adding folding slot 01: READY gpu:0:GF104 [GeForce GTX 460]
22:21:23:Saving configuration to /etc/fahclient/config.xml
22:21:23:<config>
22:21:23: <!-- Client Control -->
22:21:23: <fold-anon v='true'/>
22:21:23:
22:21:23: <!-- Folding Slot Configuration -->
22:21:23: <gpu v='false'/>
22:21:23:
22:21:23: <!-- Network -->
22:21:23: <proxy v=':8080'/>
22:21:23:
22:21:23: <!-- Folding Slots -->
22:21:23: <slot id='1' type='GPU'>
22:21:23: <cuda-index v='1'/>
22:21:23: <gpu-index v='0'/>
22:21:23: <opencl-index v='1'/>
22:21:23: </slot>
22:21:23:</config>
22:21:23:WARNING:WU00:Slot ID 0 no longer exists and there are no other matching slots, dumping
22:21:23:WU00:Sending unit results: id:00 state:SEND error:DUMPED project:9009 run:659 clone:1 gen:126 core:0xa4 unit:0x0000008d664f2de453868a0c32745789
22:21:23:WU00:Connecting to 171.64.65.124:8080
22:21:24:WU00:Server responded WORK_ACK (400)
22:21:24:WU00:Cleaning up
22:21:24:WU01:FS01:Connecting to 171.67.108.200:80
22:21:24:WU01:FS01:Assigned to work server 171.67.108.52
22:21:24:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GF104 [GeForce GTX 460] from 171.67.108.52
22:21:24:WU01:FS01:Connecting to 171.67.108.52:8080
22:21:24:WU01:FS01:Downloading 1.52MiB
22:21:27:WU01:FS01:Download complete
22:21:27:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9201 run:626 clone:3 gen:106 core:0x17 unit:0x000000a16652edc45399eea7e9ddf0c0
22:21:27:WU01:FS01:Downloading core from http://web.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_17.fah
22:21:27:WU01:FS01:Connecting to web.stanford.edu:80
22:21:27:WU01:FS01:FahCore 17: Downloading 3.01MiB
22:21:31:WU01:FS01:FahCore 17: Download complete
22:21:31:WU01:FS01:Valid core signature
22:21:31:WU01:FS01:Unpacked 8.16MiB to cores/web.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17
22:21:31:WU01:FS01:Starting
22:21:31:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17 -dir 01 -suffix 01 -version 704 -lifeline 3807 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
22:21:31:WU01:FS01:Started FahCore on PID 3850
22:21:31:WU01:FS01:Core PID:3854
22:21:31:WU01:FS01:FahCore 0x17 started
22:21:32:WU01:FS01:0x17:*********************** Log Started 2014-12-20T22:21:31Z ***********************
22:21:32:WU01:FS01:0x17:Project: 9201 (Run 626, Clone 3, Gen 106)
22:21:32:WU01:FS01:0x17:Unit: 0x000000a16652edc45399eea7e9ddf0c0
22:21:32:WU01:FS01:0x17:CPU: 0x00000000000000000000000000000000
22:21:32:WU01:FS01:0x17:Machine: 1
22:21:32:WU01:FS01:0x17:Reading tar file state.xml
22:21:32:WU01:FS01:0x17:Reading tar file system.xml
22:21:32:WU01:FS01:0x17:Reading tar file integrator.xml
22:21:32:WU01:FS01:0x17:Reading tar file core.xml
22:21:32:WU01:FS01:0x17:Digital signatures verified
22:21:32:WU01:FS01:0x17:ERROR:exception: Bad platformId size.
22:21:32:WU01:FS01:0x17:Saving result file logfile_01.txt
22:21:32:WU01:FS01:0x17:Saving result file log.txt
22:21:32:WU01:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
22:21:32:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
22:21:32:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:9201 run:626 clone:3 gen:106 core:0x17 unit:0x000000a16652edc45399eea7e9ddf0c0
22:21:32:WU01:FS01:Uploading 1.86KiB to 171.67.108.52
22:21:32:WU01:FS01:Connecting to 171.67.108.52:8080
22:21:32:WU00:FS01:Connecting to 171.67.108.200:80
22:21:32:WU01:FS01:Upload complete
22:21:32:WU01:FS01:Server responded WORK_ACK (400)
22:21:32:WU01:FS01:Cleaning up
22:21:33:WU00:FS01:Assigned to work server 171.67.108.52
22:21:33:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GF104 [GeForce GTX 460] from 171.67.108.52
22:21:33:WU00:FS01:Connecting to 171.67.108.52:8080
22:21:33:WU00:FS01:Downloading 1.53MiB
22:21:36:Saving configuration to /etc/fahclient/config.xml
22:21:36:<config>
22:21:36: <!-- Client Control -->
22:21:36: <fold-anon v='true'/>
22:21:36:
22:21:36: <!-- Folding Slot Configuration -->
22:21:36: <gpu v='false'/>
22:21:36:
22:21:36: <!-- Network -->
22:21:36: <proxy v=':8080'/>
22:21:36:
22:21:36: <!-- Folding Slots -->
22:21:36: <slot id='1' type='GPU'>
22:21:36: <cuda-index v='1'/>
22:21:36: <gpu-index v='0'/>
22:21:36: <opencl-index v='1'/>
22:21:36: </slot>
22:21:36:</config>
22:21:36:WU00:FS01:Download complete
22:21:36:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9201 run:265 clone:2 gen:78 core:0x17 unit:0x000000816652edc45399e06f480e4522
22:21:36:WU00:FS01:Starting
22:21:36:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17 -dir 00 -suffix 01 -version 704 -lifeline 3807 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
22:21:36:WU00:FS01:Started FahCore on PID 3857
22:21:36:WU00:FS01:Core PID:3861
22:21:36:WU00:FS01:FahCore 0x17 started
22:21:36:WU00:FS01:0x17:*********************** Log Started 2014-12-20T22:21:36Z ***********************
22:21:36:WU00:FS01:0x17:Project: 9201 (Run 265, Clone 2, Gen 78)
22:21:36:WU00:FS01:0x17:Unit: 0x000000816652edc45399e06f480e4522
22:21:36:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
22:21:36:WU00:FS01:0x17:Machine: 1
22:21:36:WU00:FS01:0x17:Reading tar file state.xml
22:21:36:WU00:FS01:0x17:Reading tar file system.xml
22:21:36:WU00:FS01:0x17:Reading tar file integrator.xml
22:21:36:WU00:FS01:0x17:Reading tar file core.xml
22:21:36:WU00:FS01:0x17:Digital signatures verified
22:21:36:WU00:FS01:0x17:ERROR:exception: Bad platformId size.
22:21:36:WU00:FS01:0x17:Saving result file logfile_01.txt
22:21:36:WU00:FS01:0x17:Saving result file log.txt
22:21:36:WU00:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
22:21:37:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
22:21:37:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9201 run:265 clone:2 gen:78 core:0x17 unit:0x000000816652edc45399e06f480e4522
22:21:37:WU00:FS01:Uploading 1.87KiB to 171.67.108.52
22:21:37:WU00:FS01:Connecting to 171.67.108.52:8080
22:21:37:WU00:FS01:Upload complete
22:21:37:WU00:FS01:Server responded WORK_ACK (400)
22:21:37:WU00:FS01:Cleaning up
22:21:37:WU01:FS01:Connecting to 171.67.108.200:80
22:21:38:WU01:FS01:Assigned to work server 171.67.108.52
22:21:38:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GF104 [GeForce GTX 460] from 171.67.108.52
22:21:38:WU01:FS01:Connecting to 171.67.108.52:8080
22:21:38:WU01:FS01:Downloading 1.52MiB
22:21:40:WU01:FS01:Download complete
22:21:41:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9201 run:466 clone:4 gen:76 core:0x17 unit:0x000000706652edc45399e861591d2356
22:21:41:WU01:FS01:Starting
22:21:41:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17 -dir 01 -suffix 01 -version 704 -lifeline 3807 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
22:21:41:WU01:FS01:Started FahCore on PID 3864
22:21:41:WU01:FS01:Core PID:3868
22:21:41:WU01:FS01:FahCore 0x17 started
22:21:41:WU01:FS01:0x17:*********************** Log Started 2014-12-20T22:21:41Z ***********************
22:21:41:WU01:FS01:0x17:Project: 9201 (Run 466, Clone 4, Gen 76)
22:21:41:WU01:FS01:0x17:Unit: 0x000000706652edc45399e861591d2356
22:21:41:WU01:FS01:0x17:CPU: 0x00000000000000000000000000000000
22:21:41:WU01:FS01:0x17:Machine: 1
22:21:41:WU01:FS01:0x17:Reading tar file state.xml
22:21:41:WU01:FS01:0x17:Reading tar file system.xml
22:21:41:WU01:FS01:0x17:Reading tar file integrator.xml
22:21:41:WU01:FS01:0x17:Reading tar file core.xml
22:21:41:WU01:FS01:0x17:Digital signatures verified
22:21:41:WU01:FS01:0x17:ERROR:exception: Bad platformId size.
22:21:41:WU01:FS01:0x17:Saving result file logfile_01.txt
22:21:41:WU01:FS01:0x17:Saving result file log.txt
22:21:41:WU01:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
22:21:41:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
22:21:41:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:9201 run:466 clone:4 gen:76 core:0x17 unit:0x000000706652edc45399e861591d2356
22:21:41:WU01:FS01:Uploading 1.86KiB to 171.67.108.52
22:21:41:WU01:FS01:Connecting to 171.67.108.52:8080
22:21:42:WU01:FS01:Upload complete
22:21:42:WU01:FS01:Server responded WORK_ACK (400)
22:21:42:WU01:FS01:Cleaning up
22:21:42:WU00:FS01:Connecting to 171.67.108.200:80
22:21:42:WU00:FS01:Assigned to work server 140.163.4.231
22:21:42:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GF104 [GeForce GTX 460] from 140.163.4.231
22:21:42:WU00:FS01:Connecting to 140.163.4.231:8080
22:21:42:WU00:FS01:Downloading 4.83MiB
22:21:45:WU00:FS01:Download complete
22:21:45:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13001 run:48 clone:7 gen:63 core:0x17 unit:0x0000007c538b3db753285d7aa08e7365
22:21:45:WU00:FS01:Starting
22:21:45:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17 -dir 00 -suffix 01 -version 704 -lifeline 3807 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
22:21:45:WU00:FS01:Started FahCore on PID 3871
22:21:45:WU00:FS01:Core PID:3875
22:21:45:WU00:FS01:FahCore 0x17 started
22:21:45:WU00:FS01:0x17:*********************** Log Started 2014-12-20T22:21:45Z ***********************
22:21:45:WU00:FS01:0x17:Project: 13001 (Run 48, Clone 7, Gen 63)
22:21:45:WU00:FS01:0x17:Unit: 0x0000007c538b3db753285d7aa08e7365
22:21:45:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
22:21:45:WU00:FS01:0x17:Machine: 1
22:21:45:WU00:FS01:0x17:Reading tar file state.xml
22:21:45:WU00:FS01:0x17:Reading tar file system.xml
22:21:46:WU00:FS01:0x17:Reading tar file integrator.xml
22:21:46:WU00:FS01:0x17:Reading tar file core.xml
22:21:46:WU00:FS01:0x17:Digital signatures verified
22:21:46:WU00:FS01:0x17:ERROR:exception: Bad platformId size.
22:21:46:WU00:FS01:0x17:Saving result file logfile_01.txt
22:21:46:WU00:FS01:0x17:Saving result file log.txt
22:21:46:WU00:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
22:21:46:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
22:21:46:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13001 run:48 clone:7 gen:63 core:0x17 unit:0x0000007c538b3db753285d7aa08e7365
22:21:46:WU00:FS01:Uploading 1.87KiB to 140.163.4.231
22:21:46:WU00:FS01:Connecting to 140.163.4.231:8080
22:21:46:WU00:FS01:Upload complete
22:21:46:WU00:FS01:Server responded WORK_ACK (400)
22:21:46:WU00:FS01:Cleaning up
22:21:46:WU01:FS01:Connecting to 171.67.108.200:80
22:21:47:WU01:FS01:Assigned to work server 140.163.4.231
22:21:47:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GF104 [GeForce GTX 460] from 140.163.4.231
22:21:47:WU01:FS01:Connecting to 140.163.4.231:8080
22:21:47:FS01:Finishing
22:21:47:WU01:FS01:Downloading 4.84MiB
22:21:49:WU01:FS01:Download complete
22:21:49:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:13000 run:275 clone:1 gen:62 core:0x17 unit:0x0000006c538b3db7530fe97a0fbb95ed
22:21:49:WU01:FS01:Starting
22:21:49:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17 -dir 01 -suffix 01 -version 704 -lifeline 3807 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
22:21:49:WU01:FS01:Started FahCore on PID 3878
22:21:49:WU01:FS01:Core PID:3882
22:21:49:WU01:FS01:FahCore 0x17 started
22:21:50:WU01:FS01:0x17:*********************** Log Started 2014-12-20T22:21:49Z ***********************
22:21:50:WU01:FS01:0x17:Project: 13000 (Run 275, Clone 1, Gen 62)
22:21:50:WU01:FS01:0x17:Unit: 0x0000006c538b3db7530fe97a0fbb95ed
22:21:50:WU01:FS01:0x17:CPU: 0x00000000000000000000000000000000
22:21:50:WU01:FS01:0x17:Machine: 1
22:21:50:WU01:FS01:0x17:Reading tar file state.xml
22:21:50:WU01:FS01:0x17:Reading tar file system.xml
22:21:51:WU01:FS01:0x17:Reading tar file integrator.xml
22:21:51:WU01:FS01:0x17:Reading tar file core.xml
22:21:51:WU01:FS01:0x17:Digital signatures verified
22:21:51:WU01:FS01:0x17:ERROR:exception: Bad platformId size.
22:21:51:WU01:FS01:0x17:Saving result file logfile_01.txt
22:21:51:WU01:FS01:0x17:Saving result file log.txt
22:21:51:WU01:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
22:21:51:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
22:21:51:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13000 run:275 clone:1 gen:62 core:0x17 unit:0x0000006c538b3db7530fe97a0fbb95ed
22:21:51:WU01:FS01:Uploading 1.86KiB to 140.163.4.231
22:21:51:WU01:FS01:Connecting to 140.163.4.231:8080
22:21:51:WU01:FS01:Upload complete
22:21:51:WU01:FS01:Server responded WORK_ACK (400)
22:21:51:WU01:FS01:Cleaning up
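Since the log above repeats the same failure for each unit, a short script can make the pattern easier to see. The following is a rough Python sketch that tallies FahCore return codes per project and counts the distinct exception messages. The name `summarize` and the regular expressions are my own assumptions based on the lines pasted in this post; they are not part of FAHClient.

```python
import re
from collections import Counter

# Patterns assumed from the log lines shown in this post.
PROJECT_RE = re.compile(r":(WU\d+):FS\d+:0x[0-9a-f]+:Project: (\d+) ")
RETURN_RE = re.compile(r":(WU\d+):FS\d+:FahCore returned: (\w+)")
EXCEPTION_RE = re.compile(r":ERROR:exception: (.*)$")


def summarize(log_text):
    """Count (project, return code) pairs and exception messages (hypothetical helper)."""
    current_project = {}   # work unit id (e.g. "WU01") -> project seen most recently
    results = Counter()    # (project, return code) -> number of occurrences
    exceptions = Counter() # exception message -> number of occurrences
    for line in log_text.splitlines():
        match = PROJECT_RE.search(line)
        if match:
            current_project[match.group(1)] = match.group(2)
            continue
        match = RETURN_RE.search(line)
        if match:
            project = current_project.get(match.group(1), "unknown")
            results[(project, match.group(2))] += 1
            continue
        match = EXCEPTION_RE.search(line)
        if match:
            exceptions[match.group(1).strip()] += 1
    return results, exceptions


# Lines copied from two of the failed units above.
SAMPLE = """\
22:21:32:WU01:FS01:0x17:Project: 9201 (Run 626, Clone 3, Gen 106)
22:21:32:WU01:FS01:0x17:ERROR:exception: Bad platformId size.
22:21:32:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
22:21:45:WU00:FS01:0x17:Project: 13001 (Run 48, Clone 7, Gen 63)
22:21:46:WU00:FS01:0x17:ERROR:exception: Bad platformId size.
22:21:46:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
"""

if __name__ == "__main__":
    results, exceptions = summarize(SAMPLE)
    for (project, code), count in sorted(results.items()):
        print(f"project {project}: {code} x{count}")
    for message, count in exceptions.items():
        print(f"exception '{message}' x{count}")
```

If every project shows the same return code and the same exception text, that suggests the problem is on the local machine rather than in any one project's work units, which is what the log in this post appears to show.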