Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 12:22 pm
by widsss
I don't understand the idea that Maxwells aren't ready. 344.11 didn't change overnight. Maxwell folded core 17s just fine until changes were made at Stanford.

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 12:28 pm
by Breach
widsss wrote:I don't understand the idea that Maxwells aren't ready. 344.11 didn't change overnight. Maxwell folded core 17s just fine until changes were made at Stanford.
Yes, but I don't think the Projects/WUs which are being assigned now...

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 2:08 pm
by 7im
JimF wrote:I don't understand the need for the "advanced settings". I was getting Core 17 fine on all six of my GTX 750 Ti's (without the advanced tag) until a couple of days ago, when we were asked to use it. So I did, and have been getting only failures ever since.

My latest log is in this post.
viewtopic.php?f=80&t=25887&start=135
Your log shows the client using fahcore_17 v52. Maxwell cards need v55 to fold correctly.

Pande Group should have forced the new version to download and clearly that didn't happen. They may have posted a newer version, but if not forced to update, it will run on the current (older) version.

Pause the client and delete the fahcore to see if it will download the v55 core while using the advanced client type. The path to the file is listed in your log.
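
For anyone unsure where that path appears, here is a minimal sketch that pulls it out of a log, assuming the log contains a `Running FahCore:` line laid out like the ones in the v7 client logs posted in this thread. The function name `find_core_path` and the regular expression are illustrative only and are not part of FAHClient; a core path containing spaces would need a different pattern.

```python
import re
from typing import Optional

# Matches the core executable path on a "Running FahCore:" log line.
# Assumption: the wrapper path is quoted and the core path is the next
# whitespace-delimited token ending in FahCore_<type>.exe.
CORE_LINE = re.compile(r'Running FahCore: "[^"]+" (\S+FahCore_\w+\.exe)')


def find_core_path(log_text: str) -> Optional[str]:
    """Return the first FahCore executable path found in the log, or None."""
    match = CORE_LINE.search(log_text)
    return match.group(1) if match else None


sample = (
    '16:54:44:WU01:FS01:Running FahCore: '
    '"C:\\Program Files (x86)\\FAHClient/FAHCoreWrapper.exe" '
    'C:/Users/Andy/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/'
    'Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 01 -suffix 01'
)
print(find_core_path(sample))
```

The sketch only reports the path; pausing the client and deleting the file are still done by hand.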

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 2:41 pm
by SwingOp3
7im wrote:
JimF wrote:I don't understand the need for the "advanced settings". I was getting Core 17 fine on all six of my GTX 750 Ti's (without the advanced tag) until a couple of days ago, when we were asked to use it. So I did, and have been getting only failures ever since.

My latest log is in this post.
viewtopic.php?f=80&t=25887&start=135
Your log shows the client using fahcore_17 v52. Maxwell cards need v55 to fold correctly.

Pande Group should have forced the new version to download and clearly that didn't happen. They may have posted a newer version, but if not forced to update, it will run on the current (older) version.

Pause the client and delete the fahcore to see if it will download the v55 core while using the advanced client type. The path to the file is listed in your log.
I did as you suggested, 7im, but it still downloaded v52. The string of failed WUs continued.

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 2:56 pm
by Breach
7im wrote:
JimF wrote:I don't understand the need for the "advanced settings". I was getting Core 17 fine on all six of my GTX 750 Ti's (without the advanced tag) until a couple of days ago, when we were asked to use it. So I did, and have been getting only failures ever since.

My latest log is in this post.
viewtopic.php?f=80&t=25887&start=135
Your log shows the client using fahcore_17 v52. Maxwell cards need v55 to fold correctly.

Pande Group should have forced the new version to download and clearly that didn't happen. They may have posted a newer version, but if not forced to update, it will run on the current (older) version.

Pause the client and delete the fahcore to see if it will download the v55 core while using the advanced client type. The path to the file is listed in your log.
I don't think this is correct, as the situation on beta (core 0.0.55) is the same. Additionally, just as an experiment, I replaced the standard non-beta core 17 with the beta executable (0.0.55), and the result was the same.

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 3:00 pm
by Mstenholm
The advanced setting will only download version 52; the beta setting downloads 55. I tried both, but the end result is the same: a crash on 13001 and 9406. GTX 970.

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 4:59 pm
by kookykrazee
I turned off all flags before I went to bed last night, and it appears I received 1300x WUs, which failed while I was sleeping. Do we know when the reversion will be updated?

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 5:06 pm
by widsss
I just tried both v52 and v55 with the same results - bad work units.

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 5:24 pm
by RipD
Same for me. I'm not able to fold the 9406, 13000, or 13001 WUs I've been getting for the past day or so. I'm running a GTX 970 with the latest drivers and no o/c or other changes. It was folding just fine until about 24 hours ago (maybe 36). I can send previous logs if that's helpful. I have tried with and without the advanced setting, and have uninstalled and reinstalled the drivers from Nvidia. Nothing seems to help.

Interestingly, before the WU fails with a "FAULTY" message, FahCore_17 is using 25% CPU while my GPU shows 0% load.

Code:

*********************** Log Started 2014-10-05T16:54:41Z ***********************
16:54:41:************************* Folding@home Client *************************
16:54:41:      Website: http://folding.stanford.edu/
16:54:41:    Copyright: (c) 2009-2014 Stanford University
16:54:41:       Author: Joseph Coffland <[email protected]>
16:54:41:         Args: 
16:54:41:       Config: C:/Users/Andy/AppData/Roaming/FAHClient/config.xml
16:54:41:******************************** Build ********************************
16:54:41:      Version: 7.4.4
16:54:41:         Date: Mar 4 2014
16:54:41:         Time: 20:26:54
16:54:41:      SVN Rev: 4130
16:54:41:       Branch: fah/trunk/client
16:54:41:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
16:54:41:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
16:54:41:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
16:54:41:     Platform: win32 XP
16:54:41:         Bits: 32
16:54:41:         Mode: Release
16:54:41:******************************* System ********************************
16:54:41:          CPU: Intel(R) Core(TM)2 Quad CPU @ 2.40GHz
16:54:41:       CPU ID: GenuineIntel Family 6 Model 15 Stepping 7
16:54:41:         CPUs: 4
16:54:41:       Memory: 3.94GiB
16:54:41:  Free Memory: 2.25GiB
16:54:41:      Threads: WINDOWS_THREADS
16:54:41:   OS Version: 6.1
16:54:41:  Has Battery: false
16:54:41:   On Battery: false
16:54:41:   UTC Offset: -7
16:54:41:          PID: 4140
16:54:41:          CWD: C:/Users/Andy/AppData/Roaming/FAHClient
16:54:41:           OS: Windows 7 Ultimate N
16:54:41:      OS Arch: AMD64
16:54:41:         GPUs: 1
16:54:41:        GPU 0: NVIDIA:4 GM204 [GeForce GTX 970]
16:54:41:         CUDA: 5.2
16:54:41:  CUDA Driver: 6050
16:54:41:Win32 Service: false
16:54:41:***********************************************************************
16:54:41:<config>
16:54:41:  <service-description v='Folding@home Client'/>
16:54:41:  <service-restart v='true'/>
16:54:41:  <service-restart-delay v='5000'/>
16:54:41:
16:54:41:  <!-- Client Control -->
16:54:41:  <client-threads v='6'/>
16:54:41:  <cycle-rate v='4'/>
16:54:41:  <cycles v='-1'/>
16:54:41:  <data-directory v='.'/>
16:54:41:  <disable-sleep-when-active v='true'/>
16:54:41:  <exec-directory v='C:\Program Files (x86)\FAHClient'/>
16:54:41:  <exit-when-done v='false'/>
16:54:41:  <fold-anon v='false'/>
16:54:41:  <open-web-control v='false'/>
16:54:41:
16:54:41:  <!-- Configuration -->
16:54:41:  <config-rotate v='true'/>
16:54:41:  <config-rotate-dir v='configs'/>
16:54:41:  <config-rotate-max v='16'/>
16:54:41:
16:54:41:  <!-- Debugging -->
16:54:41:  <assignment-servers>
16:54:41:    assign3.stanford.edu:8080 assign4.stanford.edu:80
16:54:41:  </assignment-servers>
16:54:41:  <auth-as v='true'/>
16:54:41:  <capture-directory v='capture'/>
16:54:41:  <capture-on-error v='false'/>
16:54:41:  <capture-packets v='false'/>
16:54:41:  <capture-requests v='false'/>
16:54:41:  <capture-responses v='false'/>
16:54:41:  <capture-sockets v='false'/>
16:54:41:  <core-exec v='FahCore_$type'/>
16:54:41:  <core-wrapper-exec v='FAHCoreWrapper'/>
16:54:41:  <debug-sockets v='false'/>
16:54:41:  <exception-locations v='true'/>
16:54:41:  <gpu-assignment-servers>
16:54:41:    assign-GPU.stanford.edu:80 assign-GPU2.stanford.edu:80
16:54:41:  </gpu-assignment-servers>
16:54:41:  <stack-traces v='false'/>
16:54:41:
16:54:41:  <!-- Error Handling -->
16:54:41:  <max-slot-errors v='10'/>
16:54:41:  <max-unit-errors v='5'/>
16:54:41:
16:54:41:  <!-- Folding Core -->
16:54:41:  <checkpoint v='3'/>
16:54:41:  <core-dir v='cores'/>
16:54:41:  <core-priority v='low'/>
16:54:41:  <cpu-affinity v='false'/>
16:54:41:  <cpu-usage v='100'/>
16:54:41:  <gpu-usage v='100'/>
16:54:41:  <no-assembly v='false'/>
16:54:41:
16:54:41:  <!-- Folding Slot Configuration -->
16:54:41:  <cause v='ANY'/>
16:54:41:  <client-subtype v='STDCLI'/>
16:54:41:  <client-type v='normal'/>
16:54:41:  <cpu-species v='X86_PENTIUM_II'/>
16:54:41:  <cpu-type v='AMD64'/>
16:54:41:  <cpus v='-1'/>
16:54:41:  <gpu v='true'/>
16:54:41:  <max-packet-size v='normal'/>
16:54:41:  <os-species v='UNKNOWN'/>
16:54:41:  <os-type v='WIN32'/>
16:54:41:  <project-key v='0'/>
16:54:41:  <smp v='true'/>
16:54:41:
16:54:41:  <!-- GUI -->
16:54:41:  <gui-enabled v='true'/>
16:54:41:
16:54:41:  <!-- HTTP Server -->
16:54:41:  <allow v='127.0.0.1'/>
16:54:41:  <connection-timeout v='60'/>
16:54:41:  <deny v='0/0'/>
16:54:41:  <http-addresses v='0:7396'/>
16:54:41:  <https-addresses v=''/>
16:54:41:  <max-connect-time v='900'/>
16:54:41:  <max-connections v='800'/>
16:54:41:  <max-request-length v='52428800'/>
16:54:41:  <min-connect-time v='300'/>
16:54:41:  <threads v='4'/>
16:54:41:
16:54:41:  <!-- Logging -->
16:54:41:  <log v='log.txt'/>
16:54:41:  <log-color v='false'/>
16:54:41:  <log-crlf v='true'/>
16:54:41:  <log-date v='false'/>
16:54:41:  <log-date-periodically v='21600'/>
16:54:41:  <log-debug v='true'/>
16:54:41:  <log-domain v='false'/>
16:54:41:  <log-header v='true'/>
16:54:41:  <log-level v='true'/>
16:54:41:  <log-no-info-header v='true'/>
16:54:41:  <log-redirect v='false'/>
16:54:41:  <log-rotate v='true'/>
16:54:41:  <log-rotate-dir v='logs'/>
16:54:41:  <log-rotate-max v='16'/>
16:54:41:  <log-short-level v='false'/>
16:54:41:  <log-simple-domains v='true'/>
16:54:41:  <log-thread-id v='false'/>
16:54:41:  <log-thread-prefix v='true'/>
16:54:41:  <log-time v='true'/>
16:54:41:  <log-to-screen v='true'/>
16:54:41:  <log-truncate v='false'/>
16:54:41:  <verbosity v='5'/>
16:54:41:
16:54:41:  <!-- Network -->
16:54:41:  <proxy v=':8080'/>
16:54:41:  <proxy-enable v='false'/>
16:54:41:  <proxy-pass v=''/>
16:54:41:  <proxy-user v=''/>
16:54:41:
16:54:41:  <!-- Process Control -->
16:54:41:  <child v='false'/>
16:54:41:  <daemon v='false'/>
16:54:41:  <pid v='false'/>
16:54:41:  <pid-file v='Folding@home Client.pid'/>
16:54:41:  <respawn v='false'/>
16:54:41:  <service v='false'/>
16:54:41:
16:54:41:  <!-- Remote Command Server -->
16:54:41:  <command-address v='0.0.0.0'/>
16:54:41:  <command-allow-no-pass v='127.0.0.1'/>
16:54:41:  <command-deny-no-pass v='0/0'/>
16:54:41:  <command-enable v='true'/>
16:54:41:  <command-port v='36330'/>
16:54:41:
16:54:41:  <!-- Slot Control -->
16:54:41:  <idle v='false'/>
16:54:41:  <max-shutdown-wait v='60'/>
16:54:41:  <pause-on-battery v='true'/>
16:54:41:  <pause-on-start v='false'/>
16:54:41:  <paused v='false'/>
16:54:41:  <power v='FULL'/>
16:54:41:
16:54:41:  <!-- User Information -->
16:54:41:  <machine-id v='0'/>
16:54:41:  <passkey v='********************************'/>
16:54:41:  <team v='32'/>
16:54:41:  <user v='Andy_R'/>
16:54:41:
16:54:41:  <!-- Web Server -->
16:54:41:  <web-allow v='127.0.0.1'/>
16:54:41:  <web-deny v='0/0'/>
16:54:41:  <web-enable v='true'/>
16:54:41:
16:54:41:  <!-- Web Server Sessions -->
16:54:41:  <session-cookie v='sid'/>
16:54:41:  <session-lifetime v='86400'/>
16:54:41:  <session-timeout v='3600'/>
16:54:41:
16:54:41:  <!-- Work Unit Control -->
16:54:41:  <dump-after-deadline v='true'/>
16:54:41:  <max-queue v='16'/>
16:54:41:  <max-units v='0'/>
16:54:41:  <next-unit-percentage v='99'/>
16:54:41:  <stall-detection-enabled v='false'/>
16:54:41:  <stall-percent v='5'/>
16:54:41:  <stall-timeout v='1800'/>
16:54:41:
16:54:41:  <!-- Folding Slots -->
16:54:41:  <slot id='1' type='GPU'/>
16:54:41:</config>
16:54:41:Trying to access database...
16:54:41:Successfully acquired database lock
16:54:41:Enabled folding slot 01: READY gpu:0:GM204 [GeForce GTX 970]
16:54:43:Started thread 4 on PID 4140
16:54:43:Started thread 7 on PID 4140
16:54:43:Started thread 6 on PID 4140
16:54:43:Started thread 8 on PID 4140
16:54:43:Started thread 5 on PID 4140
16:54:43:Started thread 9 on PID 4140
16:54:44:WU01:FS01:Starting
16:54:44:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Andy/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 01 -suffix 01 -version 704 -lifeline 4140 -checkpoint 3 -gpu 0 -gpu-vendor nvidia
16:54:44:WU01:FS01:Started FahCore on PID 4988
16:54:44:Started thread 10 on PID 4140
16:54:46:WU01:FS01:Core PID:3288
16:54:46:WU01:FS01:FahCore 0x17 started
16:54:52:WU01:FS01:0x17:*********************** Log Started 2014-10-05T16:54:51Z ***********************
16:54:52:WU01:FS01:0x17:Project: 13001 (Run 407, Clone 4, Gen 73)
16:54:52:WU01:FS01:0x17:Unit: 0x00000081538b3db75328c32b8165b2dc
16:54:52:WU01:FS01:0x17:CPU: 0x00000000000000000000000000000000
16:54:52:WU01:FS01:0x17:Machine: 1
16:54:52:WU01:FS01:0x17:Reading tar file state.xml
16:54:57:WU01:FS01:0x17:Reading tar file system.xml
16:54:59:WU01:FS01:0x17:Reading tar file integrator.xml
16:54:59:WU01:FS01:0x17:Reading tar file core.xml
16:54:59:WU01:FS01:0x17:Digital signatures verified
16:54:59:WU01:FS01:0x17:Folding@home GPU core17
16:54:59:WU01:FS01:0x17:Version 0.0.52
17:01:26:WU01:FS01:0x17:ERROR:exception: Force RMSE error of 454.221 with threshold of 5
17:01:26:WU01:FS01:0x17:Saving result file logfile_01.txt
17:01:26:WU01:FS01:0x17:Saving result file log.txt
17:01:26:WU01:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
17:01:27:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
17:01:27:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13001 run:407 clone:4 gen:73 core:0x17 unit:0x00000081538b3db75328c32b8165b2dc
17:01:27:WU01:FS01:Uploading 2.72KiB to 140.163.4.231
17:01:27:WU01:FS01:Connecting to 140.163.4.231:8080
17:01:27:WU00:FS01:Connecting to 171.67.108.201:80
17:01:28:WU01:FS01:Upload complete
17:01:28:WU01:FS01:Server responded WORK_ACK (400)
17:01:28:WU01:FS01:Cleaning up
17:01:29:WU00:FS01:Assigned to work server 140.163.4.231
17:01:29:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GM204 [GeForce GTX 970] from 140.163.4.231
17:01:29:WU00:FS01:Connecting to 140.163.4.231:8080
17:01:30:WU00:FS01:Downloading 4.83MiB
17:01:33:WU00:FS01:Download complete
17:01:33:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13001 run:197 clone:5 gen:73 core:0x17 unit:0x00000086538b3db7532887a589d44919
17:01:33:WU00:FS01:Starting
17:01:33:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Andy/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 704 -lifeline 4140 -checkpoint 3 -gpu 0 -gpu-vendor nvidia
17:01:33:WU00:FS01:Started FahCore on PID 7192
17:01:33:Started thread 11 on PID 4140
17:01:33:WU00:FS01:Core PID:6060
17:01:33:WU00:FS01:FahCore 0x17 started
17:01:35:WU00:FS01:0x17:*********************** Log Started 2014-10-05T17:01:35Z ***********************
17:01:35:WU00:FS01:0x17:Project: 13001 (Run 197, Clone 5, Gen 73)
17:01:35:WU00:FS01:0x17:Unit: 0x00000086538b3db7532887a589d44919
17:01:35:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
17:01:35:WU00:FS01:0x17:Machine: 1
17:01:35:WU00:FS01:0x17:Reading tar file state.xml
17:01:36:WU00:FS01:0x17:Reading tar file system.xml
17:01:37:WU00:FS01:0x17:Reading tar file integrator.xml
17:01:37:WU00:FS01:0x17:Reading tar file core.xml
17:01:37:WU00:FS01:0x17:Digital signatures verified
17:01:37:WU00:FS01:0x17:Folding@home GPU core17
17:01:37:WU00:FS01:0x17:Version 0.0.52
17:05:24:Started thread 12 on PID 4140
17:08:03:WU00:FS01:0x17:ERROR:exception: Force RMSE error of 454.587 with threshold of 5
17:08:03:WU00:FS01:0x17:Saving result file logfile_01.txt
17:08:03:WU00:FS01:0x17:Saving result file log.txt
17:08:03:WU00:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
17:08:04:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
17:08:04:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13001 run:197 clone:5 gen:73 core:0x17 unit:0x00000086538b3db7532887a589d44919
17:08:04:WU00:FS01:Uploading 2.25KiB to 140.163.4.231
17:08:04:WU00:FS01:Connecting to 140.163.4.231:8080
17:08:04:WU00:FS01:Upload complete
17:08:04:WU01:FS01:Connecting to 171.67.108.201:80
17:08:04:WU00:FS01:Server responded WORK_ACK (400)
17:08:05:WU00:FS01:Cleaning up
17:08:06:WU01:FS01:Assigned to work server 140.163.4.231
17:08:06:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GM204 [GeForce GTX 970] from 140.163.4.231
17:08:06:WU01:FS01:Connecting to 140.163.4.231:8080
17:08:07:WU01:FS01:Downloading 4.84MiB
17:08:11:WU01:FS01:Download complete
17:08:11:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:13000 run:648 clone:7 gen:8 core:0x17 unit:0x0000000f538b3db7531052bc68faf4c2
17:08:11:WU01:FS01:Starting
17:08:11:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Andy/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 01 -suffix 01 -version 704 -lifeline 4140 -checkpoint 3 -gpu 0 -gpu-vendor nvidia
17:08:11:WU01:FS01:Started FahCore on PID 7016
17:08:11:Started thread 13 on PID 4140
17:08:11:WU01:FS01:Core PID:2732
17:08:11:WU01:FS01:FahCore 0x17 started
17:08:12:WU01:FS01:0x17:*********************** Log Started 2014-10-05T17:08:12Z ***********************
17:08:12:WU01:FS01:0x17:Project: 13000 (Run 648, Clone 7, Gen 8)
17:08:12:WU01:FS01:0x17:Unit: 0x0000000f538b3db7531052bc68faf4c2
17:08:12:WU01:FS01:0x17:CPU: 0x00000000000000000000000000000000
17:08:12:WU01:FS01:0x17:Machine: 1
17:08:12:WU01:FS01:0x17:Reading tar file state.xml
17:08:14:WU01:FS01:0x17:Reading tar file system.xml
17:08:15:WU01:FS01:0x17:Reading tar file integrator.xml
17:08:15:WU01:FS01:0x17:Reading tar file core.xml
17:08:15:WU01:FS01:0x17:Digital signatures verified
17:08:15:WU01:FS01:0x17:Folding@home GPU core17
17:08:15:WU01:FS01:0x17:Version 0.0.52
17:14:34:WU01:FS01:0x17:ERROR:exception: Force RMSE error of 450.616 with threshold of 5
17:14:34:WU01:FS01:0x17:Saving result file logfile_01.txt
17:14:34:WU01:FS01:0x17:Saving result file log.txt
17:14:34:WU01:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
17:14:35:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
17:14:35:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13000 run:648 clone:7 gen:8 core:0x17 unit:0x0000000f538b3db7531052bc68faf4c2
17:14:35:WU01:FS01:Uploading 2.25KiB to 140.163.4.231
17:14:35:WU01:FS01:Connecting to 140.163.4.231:8080
17:14:35:WU01:FS01:Upload complete
17:14:35:WU00:FS01:Connecting to 171.67.108.201:80
17:14:35:WU01:FS01:Server responded WORK_ACK (400)
17:14:35:WU01:FS01:Cleaning up
17:14:36:WU00:FS01:Assigned to work server 140.163.4.231
17:14:36:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GM204 [GeForce GTX 970] from 140.163.4.231
17:14:36:WU00:FS01:Connecting to 140.163.4.231:8080
17:14:36:WU00:FS01:Downloading 4.83MiB
17:14:41:WU00:FS01:Download complete
17:14:41:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13001 run:362 clone:1 gen:18 core:0x17 unit:0x00000024538b3db75328b64d2cffc4f3
17:14:41:WU00:FS01:Starting
17:14:41:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Andy/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 704 -lifeline 4140 -checkpoint 3 -gpu 0 -gpu-vendor nvidia
17:14:41:WU00:FS01:Started FahCore on PID 10160
17:14:41:Started thread 14 on PID 4140
17:14:41:WU00:FS01:Core PID:8452
17:14:41:WU00:FS01:FahCore 0x17 started
17:14:42:WU00:FS01:0x17:*********************** Log Started 2014-10-05T17:14:41Z ***********************
17:14:42:WU00:FS01:0x17:Project: 13001 (Run 362, Clone 1, Gen 18)
17:14:42:WU00:FS01:0x17:Unit: 0x00000024538b3db75328b64d2cffc4f3
17:14:42:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
17:14:42:WU00:FS01:0x17:Machine: 1
17:14:42:WU00:FS01:0x17:Reading tar file state.xml
17:14:43:WU00:FS01:0x17:Reading tar file system.xml
17:14:44:WU00:FS01:0x17:Reading tar file integrator.xml
17:14:44:WU00:FS01:0x17:Reading tar file core.xml
17:14:44:WU00:FS01:0x17:Digital signatures verified
17:14:44:WU00:FS01:0x17:Folding@home GPU core17
17:14:44:WU00:FS01:0x17:Version 0.0.52

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 6:18 pm
by johnim
Hi, I'm getting the same.

Code:

17:58:34:WU00:FS01:0x17:*********************** Log Started 2014-10-05T17:58:33Z ***********************
17:58:34:WU00:FS01:0x17:Project: 9406 (Run 36, Clone 2, Gen 1)
17:58:34:WU00:FS01:0x17:Unit: 0x000000020a3b1e5c533dd2c5b7af9dae
17:58:34:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
17:58:34:WU00:FS01:0x17:Machine: 1
17:58:34:WU00:FS01:0x17:Reading tar file state.xml
17:58:35:WU00:FS01:0x17:Reading tar file system.xml
17:58:35:WU00:FS01:0x17:Reading tar file integrator.xml
17:58:35:WU00:FS01:0x17:Reading tar file core.xml
17:58:35:WU00:FS01:0x17:Digital signatures verified
17:58:35:WU00:FS01:0x17:Folding@home GPU core17
17:58:35:WU00:FS01:0x17:Version 0.0.55
18:00:36:WU00:FS01:0x17:ERROR:exception: Force RMSE error of 609.35 with threshold of 5
18:00:36:WU00:FS01:0x17:Saving result file logfile_01.txt
18:00:36:WU00:FS01:0x17:Saving result file log.txt
18:00:37:WU00:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
18:00:37:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
18:00:37:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9406 run:36 clone:2 gen:1 core:0x17 unit:0x000000020a3b1e5c533dd2c5b7af9dae
18:00:37:WU00:FS01:Uploading 2.32KiB to 171.64.65.56
18:00:37:WU00:FS01:Connecting to 171.64.65.56:8080
18:00:37:WU01:FS01:Connecting to 171.67.108.201:80
18:00:37:WU00:FS01:Upload complete
18:00:38:WU00:FS01:Server responded WORK_ACK (400)
18:00:38:WU00:FS01:Cleaning up
18:00:38:WU01:FS01:Assigned to work server 171.64.65.56
18:00:38:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GM204 [GeForce GTX 970] from 171.64.65.56
18:00:38:WU01:FS01:Connecting to 171.64.65.56:8080
18:00:39:WU01:FS01:Downloading 5.44MiB
18:00:45:WU01:FS01:Download 98.88%
18:00:45:WU01:FS01:Download complete
18:00:45:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9406 run:767 clone:0 gen:47 core:0x17 unit:0x000000600a3b1e5c533e581ef151a0b6
18:00:45:WU01:FS01:Starting
18:00:45:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/MAIN/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/beta/Core_17.fah/FahCore_17.exe -dir 01 -suffix 01 -version 704 -lifeline 8588 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
18:00:45:WU01:FS01:Started FahCore on PID 5508
18:00:45:WU01:FS01:Core PID:8572
18:00:45:WU01:FS01:FahCore 0x17 started
18:00:45:WU01:FS01:0x17:*********************** Log Started 2014-10-05T18:00:45Z ***********************
18:00:45:WU01:FS01:0x17:Project: 9406 (Run 767, Clone 0, Gen 47)
18:00:45:WU01:FS01:0x17:Unit: 0x000000600a3b1e5c533e581ef151a0b6
18:00:45:WU01:FS01:0x17:CPU: 0x00000000000000000000000000000000
18:00:45:WU01:FS01:0x17:Machine: 1
18:00:45:WU01:FS01:0x17:Reading tar file state.xm
Mod edit: Please use Code tags around log postings

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 6:35 pm
by heikosch
GTX 750 Ti doesn't fold any core 17 WUs any more.

core 17 v55
Geforce 340.52

Code:

17:59:10:WU02:FS01:Starting
17:59:10:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 02 -suffix 01 -version 704 -lifeline 41260 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
17:59:10:WU02:FS01:Started FahCore on PID 44104
17:59:10:WU02:FS01:Core PID:40832
17:59:10:WU02:FS01:FahCore 0x17 started
17:59:11:WU02:FS01:0x17:*********************** Log Started 2014-10-05T17:59:10Z ***********************
17:59:11:WU02:FS01:0x17:Project: 13000 (Run 173, Clone 6, Gen 11)
17:59:11:WU02:FS01:0x17:Unit: 0x00000014538b3db7530fccf85345cf00
17:59:11:WU02:FS01:0x17:CPU: 0x00000000000000000000000000000000
17:59:11:WU02:FS01:0x17:Machine: 1
17:59:11:WU02:FS01:0x17:Reading tar file state.xml
17:59:12:WU02:FS01:0x17:Reading tar file system.xml
17:59:13:WU02:FS01:0x17:Reading tar file integrator.xml
17:59:13:WU02:FS01:0x17:Reading tar file core.xml
17:59:13:WU02:FS01:0x17:Digital signatures verified
17:59:13:WU02:FS01:0x17:Folding@home GPU core17
17:59:13:WU02:FS01:0x17:Version 0.0.55
18:01:40:WU02:FS01:0x17:ERROR:exception: Force RMSE error of 455.057 with threshold of 5
Heiko

PS: Same problem with P9406.
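
The failures posted so far all end on the same line, `Force RMSE error of N with threshold of 5`, on both v52 and v55. A small sketch for pulling those numbers out of a log, assuming the error line always has exactly the wording seen in the logs in this thread; the function name `rmse_errors` is made up for illustration.

```python
import re

# Assumption: the error line uses the exact wording seen in this thread's logs.
RMSE = re.compile(r'Force RMSE error of ([\d.]+) with threshold of ([\d.]+)')


def rmse_errors(log_text):
    """Return (error, threshold) float pairs for every Force RMSE exception."""
    return [(float(e), float(t)) for e, t in RMSE.findall(log_text)]


sample = """\
17:01:26:WU01:FS01:0x17:ERROR:exception: Force RMSE error of 454.221 with threshold of 5
18:00:36:WU00:FS01:0x17:ERROR:exception: Force RMSE error of 609.35 with threshold of 5
"""
for error, threshold in rmse_errors(sample):
    print(f"error {error} is {error / threshold:.1f}x the threshold")
```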

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 8:18 pm
by VijayPande
I've asked Yutong to see what's going on here. Yesterday, we reverted the Core17/Core18 adv setting. Being Sunday, we don't have a full team on this right now. I think we'll know more tomorrow morning.

We could use some info for debugging: are you getting *different* core17 projects, and it's those different projects that are causing problems, or are you having problems with the same projects that were working before? The more detail you can give us on what was working before (core version # and project #'s) and not working now would be a huge help. There are so many different variables here that I think the first thing we need to do is pin down whether this is an issue due to drivers, core version, or projects. Thanks in advance for your help with those details.
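
One way to collect those details from a client log is sketched below. It assumes the line layout of the FAHClient 7.4.4 / core 17 logs posted in this thread; the function name `summarize` and the dictionary keys are invented for illustration and are not an official tool.

```python
import re

# Patterns for the per-work-unit lines that appear in the client log.
# Assumption: the layout matches the FAHClient 7.4.4 logs quoted in this thread.
PREFIX = r'^\d\d:\d\d:\d\d:(?:WARNING:)?(WU\d+):FS\d+:'
PROJECT = re.compile(
    PREFIX + r'0x17:Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)'
)
VERSION = re.compile(PREFIX + r'0x17:Version (\S+)')
RESULT = re.compile(PREFIX + r'FahCore returned: (\w+)')


def summarize(log_text):
    """Return one dict per work unit: project, run/clone/gen, core, result."""
    units = []     # every unit seen, in log order
    current = {}   # queue slot such as "WU01" -> unit currently running there
    for line in log_text.splitlines():
        m = PROJECT.match(line)
        if m:
            unit = {
                "project": int(m.group(2)),
                "rcg": tuple(int(g) for g in m.group(3, 4, 5)),
                "core": None,
                "result": None,
            }
            current[m.group(1)] = unit
            units.append(unit)
            continue
        m = VERSION.match(line)
        if m and m.group(1) in current:
            current[m.group(1)]["core"] = m.group(2)
            continue
        m = RESULT.match(line)
        if m and m.group(1) in current:
            current[m.group(1)]["result"] = m.group(2)
    return units


sample = """\
16:54:52:WU01:FS01:0x17:Project: 13001 (Run 407, Clone 4, Gen 73)
16:54:59:WU01:FS01:0x17:Version 0.0.52
17:01:27:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
"""
for unit in summarize(sample):
    print(unit)
```

The driver version is not in the log lines shown here, so it still has to be reported separately.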

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 8:28 pm
by Breach
VijayPande wrote:I've asked Yutong to see what's going on here. Yesterday, we reverted the Core17/Core18 adv setting. Being Sunday, we don't have a full team on this right now. I think we'll know more tomorrow morning.

We could use some info for debugging: are you getting *different* core17 projects, and it's those different projects that are causing problems, or are you having problems with the same projects that were working before? The more detail you can give us on what was working before (core version # and project #'s) and not working now would be a huge help. There are so many different variables here that I think the first thing we need to do is pin down whether this is an issue due to drivers, core version, or projects. Thanks in advance for your help with those details.
Drivers: 344.16 (new card, haven't used anything else)
Cores: both 0.0.52 and 0.0.55 (no difference)
Problematic projects: 9406, 13000 and 13001. I had not seen them assigned before the problem started, but I had the exact same problem even before the change with *some* Core 18 0.0.2 WUs (projects 10473 and 10472).
Projects that used to work: 9201. I have not received it since the problem started, so I don't know whether it would work now, but I guess it would.

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 8:36 pm
by VijayPande
Do most people feel like it's an issue with specific projects (and that the new AS is leading to more assignments of them)? That would make sense. We can bring the faulty projects back to beta if there is enough evidence that they are faulty. We just shouldn't bring too many projects back to beta if that's not needed, since otherwise there won't be any WUs, so we'll have to tread carefully here.

Re: New Assignment Server feedback/problem

Posted: Sun Oct 05, 2014 8:40 pm
by Breach
The problem with answering this question is that I, for one, have not received anything else since your change:
VijayPande wrote (Sat Oct 04, 2014 8:15 pm):

Since it's the weekend, most of the staff is out, so I've made some changes myself just now to the AS. If that works, then we know the issue was an AS configuration issue (which I think I caught). If it doesn't work, then it may be an AS code bug that Joe will need to further investigate. Thanks for your understanding with this. Switching over to the new AS has had more growing pains than we expected.
Since then I have received only WUs from the problematic projects (9406, 13000 and 13001), which all fail, at least on Maxwell; perhaps they fare better on Keplers.

Others will chip in of course.