Project: 9609 (R 0, C 46, G 211) - bad WU?

Posted: Tue Oct 13, 2015 11:04 pm
by folding_hoomer
Project 9609 (R 0, C 46, G 211) failed three times in a row at frame 75 (75%), each time with "Bad State detected", and was then aborted as BAD_WORK_UNIT.
GTX970@1450MHz, Ubuntu 14.04 LTS, Driver 346.72

Initial-Log:

*********************** Log Started 2015-09-29T22:45:29Z ***********************
22:45:29:************************* Folding@home Client *************************
22:45:29:    Website: http://folding.stanford.edu/
22:45:29:  Copyright: (c) 2009-2014 Stanford University
22:45:29:     Author: Joseph Coffland <[email protected]>
22:45:29:       Args: --child --lifeline 1090 /etc/fahclient/config.xml --run-as
22:45:29:             fahclient --pid-file=/var/run/fahclient.pid --daemon
22:45:29:     Config: /etc/fahclient/config.xml
22:45:29:******************************** Build ********************************
22:45:29:    Version: 7.4.4
22:45:29:       Date: Mar 4 2014
22:45:29:       Time: 12:02:38
22:45:29:    SVN Rev: 4130
22:45:29:     Branch: fah/trunk/client
22:45:29:   Compiler: GNU 4.4.7
22:45:29:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
22:45:29:             -fno-unsafe-math-optimizations -msse2
22:45:29:   Platform: linux2 3.2.0-1-amd64
22:45:29:       Bits: 64
22:45:29:       Mode: Release
22:45:29:******************************* System ********************************
22:45:29:        CPU: Intel(R) Core(TM) i7-3820 CPU @ 3.60GHz
22:45:29:     CPU ID: GenuineIntel Family 6 Model 45 Stepping 7
22:45:29:       CPUs: 8
22:45:29:     Memory: 7.71GiB
22:45:29:Free Memory: 7.28GiB
22:45:29:    Threads: POSIX_THREADS
22:45:29: OS Version: 3.13
22:45:29:Has Battery: false
22:45:29: On Battery: false
22:45:29: UTC Offset: 2
22:45:29:        PID: 1092
22:45:29:        CWD: /var/lib/fahclient
22:45:29:         OS: Linux 3.13.0-54-generic x86_64
22:45:29:    OS Arch: AMD64
22:45:29:       GPUs: 1
22:45:29:      GPU 0: NVIDIA:5 GM204 [GeForce GTX 970]
22:45:29:       CUDA: 5.2
22:45:29:CUDA Driver: 7000
22:45:29:***********************************************************************
22:45:29:<config>
22:45:29:  <!-- Client Control -->
22:45:29:  <fold-anon v='true'/>
22:45:29:
22:45:29:  <!-- Folding Slot Configuration -->
22:45:29:  <gpu v='false'/>
22:45:29:
22:45:29:  <!-- HTTP Server -->
22:45:29:  <allow v='127.0.0.1 192.168.2.100-192.168.2.110'/>
22:45:29:
22:45:29:  <!-- Logging -->
22:45:29:  <log-rotate-max v='100'/>
22:45:29:
22:45:29:  <!-- Network -->
22:45:29:  <proxy v=':8080'/>
22:45:29:
22:45:29:  <!-- Remote Command Server -->
22:45:29:  <command-allow-no-pass v='127.0.0.1 192.168.2.100-192.168.2.110'/>
22:45:29:
22:45:29:  <!-- Slot Control -->
22:45:29:  <pause-on-start v='true'/>
22:45:29:  <power v='full'/>
22:45:29:
22:45:29:  <!-- User Information -->
22:45:29:  <passkey v='********************************'/>
22:45:29:  <team v='xxxxx'/>
22:45:29:  <user v='xxxxxxxxxxx'/>
22:45:29:
22:45:29:  <!-- Folding Slots -->
22:45:29:  <slot id='1' type='GPU'>
22:45:29:    <client-type v='advanced'/>
22:45:29:    <next-unit-percentage v='100'/>
22:45:29:  </slot>
22:45:29:</config>
Error-Log:

13:52:28:WU00:FS01:0x18:Project: 9609 (Run 0, Clone 46, Gen 211)
13:52:28:WU00:FS01:0x18:Unit: 0x000001080a3b1e815546e89c40643c7a
13:52:28:WU00:FS01:0x18:CPU: 0x00000000000000000000000000000000
13:52:28:WU00:FS01:0x18:Machine: 1
13:52:28:WU00:FS01:0x18:Reading tar file core.xml
13:52:28:WU00:FS01:0x18:Reading tar file system.xml
13:52:28:WU00:FS01:0x18:Reading tar file integrator.xml
13:52:28:WU00:FS01:0x18:Reading tar file state.xml
13:52:28:WU00:FS01:0x18:Digital signatures verified
13:52:28:WU00:FS01:0x18:Folding@home GPU core18
13:52:28:WU00:FS01:0x18:Version 0.0.4
13:52:31:WU00:FS01:0x18:Completed 0 out of 2000000 steps (0%)
13:52:31:WU00:FS01:0x18:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
13:52:49:WU00:FS01:0x18:Completed 20000 out of 2000000 steps (1%)
13:53:06:WU00:FS01:0x18:Completed 40000 out of 2000000 steps (2%)

 . . .

14:11:58:WU00:FS01:0x18:Completed 1360000 out of 2000000 steps (68%)
14:12:15:WU00:FS01:0x18:Completed 1380000 out of 2000000 steps (69%)
14:12:32:WU00:FS01:0x18:Completed 1400000 out of 2000000 steps (70%)
14:12:50:WU00:FS01:0x18:Completed 1420000 out of 2000000 steps (71%)
14:13:07:WU00:FS01:0x18:Completed 1440000 out of 2000000 steps (72%)
14:13:24:WU00:FS01:0x18:Completed 1460000 out of 2000000 steps (73%)
14:13:41:WU00:FS01:0x18:Completed 1480000 out of 2000000 steps (74%)
14:13:58:WU00:FS01:0x18:Completed 1500000 out of 2000000 steps (75%)
14:13:58:WU00:FS01:0x18:Bad State detected... attempting to resume from last good checkpoint
14:14:15:WU00:FS01:0x18:Completed 1420000 out of 2000000 steps (71%)
14:14:32:WU00:FS01:0x18:Completed 1440000 out of 2000000 steps (72%)
14:14:49:WU00:FS01:0x18:Completed 1460000 out of 2000000 steps (73%)
14:15:06:WU00:FS01:0x18:Completed 1480000 out of 2000000 steps (74%)
14:15:23:WU00:FS01:0x18:Completed 1500000 out of 2000000 steps (75%)
14:15:23:WU00:FS01:0x18:Bad State detected... attempting to resume from last good checkpoint
14:15:40:WU00:FS01:0x18:Completed 1420000 out of 2000000 steps (71%)
14:15:57:WU00:FS01:0x18:Completed 1440000 out of 2000000 steps (72%)
14:16:14:WU00:FS01:0x18:Completed 1460000 out of 2000000 steps (73%)
14:16:31:WU00:FS01:0x18:Completed 1480000 out of 2000000 steps (74%)
14:16:48:WU00:FS01:0x18:Completed 1500000 out of 2000000 steps (75%)
14:16:48:WU00:FS01:0x18:Bad State detected... attempting to resume from last good checkpoint
14:16:48:WU00:FS01:0x18:Max number of retries reached. Aborting.
14:16:48:WU00:FS01:0x18:ERROR:exception: Max Retries Reached
14:16:48:WU00:FS01:0x18:Saving result file logfile_01.txt
14:16:48:WU00:FS01:0x18:Saving result file log.txt
14:16:48:WU00:FS01:0x18:Folding@home Core Shutdown: BAD_WORK_UNIT
14:16:49:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
14:16:49:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9609 run:0 clone:46 gen:211 core:0x18 unit:0x000001080a3b1e815546e89c40643c7a
14:16:49:WU00:FS01:Uploading 3.01KiB to 171.67.108.31
14:16:49:WU00:FS01:Connecting to 171.67.108.31:8080
14:16:50:WU00:FS01:Upload complete
14:16:55:WU00:FS01:Server responded WORK_ACK (400)
14:16:55:WU00:FS01:Cleaning up
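
For anyone who wants to check their own log for the same pattern, here is a minimal Python sketch. It assumes the log lines have the same "Completed N out of M steps" and "Bad State detected" wording as the excerpt above. The function name `bad_state_steps` and the `sample` list are only illustrative and are not part of the FAH client. It counts how many "Bad State" messages follow each reported step, so repeated failures at one step stand out.

```python
import re
from collections import Counter

# Matches core progress lines such as
# "14:13:58:WU00:FS01:0x18:Completed 1500000 out of 2000000 steps (75%)"
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps")
BAD_STATE = "Bad State detected"


def bad_state_steps(lines):
    """Count 'Bad State detected' messages per last reported step.

    Returns a Counter mapping the step number from the most recent
    progress line to the number of bad-state messages that followed it.
    A bad-state message seen before any progress line is counted under None.
    """
    failures = Counter()
    last_step = None
    for line in lines:
        match = PROGRESS.search(line)
        if match:
            last_step = int(match.group(1))
        elif BAD_STATE in line:
            failures[last_step] += 1
    return failures


# A few lines copied from the error log above (hypothetical test input).
sample = [
    "14:13:58:WU00:FS01:0x18:Completed 1500000 out of 2000000 steps (75%)",
    "14:13:58:WU00:FS01:0x18:Bad State detected... attempting to resume from last good checkpoint",
    "14:14:15:WU00:FS01:0x18:Completed 1420000 out of 2000000 steps (71%)",
    "14:15:23:WU00:FS01:0x18:Completed 1500000 out of 2000000 steps (75%)",
    "14:15:23:WU00:FS01:0x18:Bad State detected... attempting to resume from last good checkpoint",
]

failures = bad_state_steps(sample)
```

To scan a whole log, pass the open file object instead of `sample`. Seeing every failure land on the same step, as here at step 1500000, shows the error is reproducible from the checkpoint, but on its own it does not prove whether the WU or the hardware is at fault.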

Re: Project: 9609 (R 0, C 46, G 211) - bad WU?

Posted: Wed Oct 14, 2015 12:52 pm
by toTOW
The WU has been completed by someone else.