
WU 13416 low ppd long run time

Posted: Tue Jul 07, 2020 8:42 am
by rjccosta1
I have never posted here before. Never had problems before :D :D. I have been folding for 8 years now. For the last 3 months my average PPD has been 1.4M with an RTX 2070. I run a -100 mV undervolt on the GPU. Most WUs finish in less than 3 hours.

The problem is that this WU type is taking 6 hours to finish, which is very unusual. Also, the average PPD is 800k-900k, roughly 40% less than usual. I reset the graphics card to stock settings and the average was exactly the same. My CPU is running at stock, with PBO disabled. The temperatures and clock speeds of the GPU and CPU are exactly the same as usual.

Please find the first log lines here to start the discussion. I will post the rest when I finish a couple of WUs of this type. Please keep any replies polite and civilised; no need to bite. If I am wrong, just nudge me in the right direction :D.

Code:

*********************** Log Started 2020-07-07T07:51:41Z ***********************
07:51:41:Trying to access database...
07:51:41:Successfully acquired database lock
07:51:41:Read GPUs.txt
07:51:42:Enabled folding slot 01: PAUSED gpu:0:TU106 [GeForce RTX 2070] M 6497 (by user)
07:51:43:****************************** FAHClient ******************************
07:51:43:        Version: 7.6.13
07:51:43:         Author: Joseph Coffland <[email protected]>
07:51:43:      Copyright: 2020 foldingathome.org
07:51:43:       Homepage: https://foldingathome.org/
07:51:43:           Date: Apr 27 2020
07:51:43:           Time: 21:21:01
07:51:43:       Revision: 5a652817f46116b6e135503af97f18e094414e3b
07:51:43:         Branch: master
07:51:43:       Compiler: Visual C++ 2008
07:51:43:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
07:51:43:       Platform: win32 10
07:51:43:           Bits: 32
07:51:43:           Mode: Release
07:51:43:         Config: C:\Users\ricardo\AppData\Roaming\FAHClient\config.xml
07:51:43:******************************** CBang ********************************
07:51:43:           Date: Apr 24 2020
07:51:43:           Time: 17:07:55
07:51:43:       Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
07:51:43:         Branch: master
07:51:43:       Compiler: Visual C++ 2008
07:51:43:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
07:51:43:       Platform: win32 10
07:51:43:           Bits: 32
07:51:43:           Mode: Release
07:51:43:******************************* System ********************************
07:51:43:            CPU: AMD Ryzen 5 3600 6-Core Processor
07:51:43:         CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
07:51:43:           CPUs: 12
07:51:43:         Memory: 15.95GiB
07:51:43:    Free Memory: 13.63GiB
07:51:43:        Threads: WINDOWS_THREADS
07:51:43:     OS Version: 6.2
07:51:43:    Has Battery: false
07:51:43:     On Battery: false
07:51:43:     UTC Offset: 1
07:51:43:            PID: 11628
07:51:43:            CWD: C:\Users\ricardo\AppData\Roaming\FAHClient
07:51:43:  Win32 Service: false
07:51:43:             OS: Windows 10 Enterprise
07:51:43:        OS Arch: AMD64
07:51:43:           GPUs: 1
07:51:43:          GPU 0: Bus:38 Slot:0 Func:0 NVIDIA:7 TU106 [GeForce RTX 2070] M 6497
07:51:43:  CUDA Device 0: Platform:0 Device:0 Bus:38 Slot:0 Compute:7.5 Driver:11.0
07:51:43:OpenCL Device 0: Platform:0 Device:0 Bus:38 Slot:0 Compute:1.2 Driver:451.48
07:51:43:******************************* libFAH ********************************
07:51:43:           Date: Apr 15 2020
07:51:43:           Time: 14:53:14
07:51:43:       Revision: 216968bc7025029c841ed6e36e81a03a316890d3
07:51:43:         Branch: master
07:51:43:       Compiler: Visual C++ 2008
07:51:43:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
07:51:43:       Platform: win32 10
07:51:43:           Bits: 32
07:51:43:           Mode: Release
07:51:43:***********************************************************************
07:51:43:<config>
07:51:43:  <!-- Folding Core -->
07:51:43:  <checkpoint v='3'/>
07:51:43:
07:51:43:  <!-- Folding Slot Configuration -->
07:51:43:  <client-type v='advanced'/>
07:51:43:
07:51:43:  <!-- Network -->
07:51:43:  <proxy v=':8080'/>
07:51:43:
07:51:43:  <!-- Slot Control -->
07:51:43:  <pause-on-battery v='false'/>
07:51:43:  <power v='full'/>
07:51:43:
07:51:43:  <!-- User Information -->
07:51:43:  <passkey v='*****'/>
07:51:43:  <team v='35947'/>
07:51:43:  <user v='rjcman'/>
07:51:43:
07:51:43:  <!-- Folding Slots -->
07:51:43:  <slot id='1' type='GPU'>
07:51:43:    <paused v='true'/>
07:51:43:  </slot>
07:51:43:</config>
07:52:45:FS01:Unpaused
07:52:45:WU01:FS01:Starting
07:52:45:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\ricardo\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 706 -lifeline 11628 -checkpoint 3 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
07:52:45:WU01:FS01:Started FahCore on PID 15104
07:52:45:WU01:FS01:Core PID:9780
07:52:45:WU01:FS01:FahCore 0x22 started
07:52:46:WU01:FS01:0x22:*********************** Log Started 2020-07-07T07:52:45Z ***********************
07:52:46:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
07:52:46:WU01:FS01:0x22:       Core: Core22
07:52:46:WU01:FS01:0x22:       Type: 0x22
07:52:46:WU01:FS01:0x22:    Version: 0.0.11
07:52:46:WU01:FS01:0x22:     Author: Joseph Coffland <[email protected]>
07:52:46:WU01:FS01:0x22:  Copyright: 2020 foldingathome.org
07:52:46:WU01:FS01:0x22:   Homepage: https://foldingathome.org/
07:52:46:WU01:FS01:0x22:       Date: Jun 26 2020
07:52:46:WU01:FS01:0x22:       Time: 19:49:16
07:52:46:WU01:FS01:0x22:   Revision: 22010df8a4db48db1b35d33e666b64d8ce48689d
07:52:46:WU01:FS01:0x22:     Branch: core22-0.0.11
07:52:46:WU01:FS01:0x22:   Compiler: Visual C++ 2015
07:52:46:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
07:52:46:WU01:FS01:0x22:   Platform: win32 10
07:52:46:WU01:FS01:0x22:       Bits: 64
07:52:46:WU01:FS01:0x22:       Mode: Release
07:52:46:WU01:FS01:0x22:Maintainers: John Chodera <[email protected]> and Peter Eastman
07:52:46:WU01:FS01:0x22:             <[email protected]>
07:52:46:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 15104 -checkpoint 3
07:52:46:WU01:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
07:52:46:WU01:FS01:0x22:             0 -gpu 0
07:52:46:WU01:FS01:0x22:************************************ libFAH ************************************
07:52:46:WU01:FS01:0x22:       Date: Jun 26 2020
07:52:46:WU01:FS01:0x22:       Time: 19:47:12
07:52:46:WU01:FS01:0x22:   Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
07:52:46:WU01:FS01:0x22:     Branch: HEAD
07:52:46:WU01:FS01:0x22:   Compiler: Visual C++ 2015
07:52:46:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
07:52:46:WU01:FS01:0x22:   Platform: win32 10
07:52:46:WU01:FS01:0x22:       Bits: 64
07:52:46:WU01:FS01:0x22:       Mode: Release
07:52:46:WU01:FS01:0x22:************************************ CBang *************************************
07:52:46:WU01:FS01:0x22:       Date: Jun 26 2020
07:52:46:WU01:FS01:0x22:       Time: 19:46:11
07:52:46:WU01:FS01:0x22:   Revision: f8529962055b0e7bde23e429f5072ff758089dee
07:52:46:WU01:FS01:0x22:     Branch: master
07:52:46:WU01:FS01:0x22:   Compiler: Visual C++ 2015
07:52:46:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
07:52:46:WU01:FS01:0x22:   Platform: win32 10
07:52:46:WU01:FS01:0x22:       Bits: 64
07:52:46:WU01:FS01:0x22:       Mode: Release
07:52:46:WU01:FS01:0x22:************************************ System ************************************
07:52:46:WU01:FS01:0x22:        CPU: AMD Ryzen 5 3600 6-Core Processor
07:52:46:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
07:52:46:WU01:FS01:0x22:       CPUs: 12
07:52:46:WU01:FS01:0x22:     Memory: 15.95GiB
07:52:46:WU01:FS01:0x22:Free Memory: 12.27GiB
07:52:46:WU01:FS01:0x22:    Threads: WINDOWS_THREADS
07:52:46:WU01:FS01:0x22: OS Version: 6.2
07:52:46:WU01:FS01:0x22:Has Battery: false
07:52:46:WU01:FS01:0x22: On Battery: false
07:52:46:WU01:FS01:0x22: UTC Offset: 1
07:52:46:WU01:FS01:0x22:        PID: 9780
07:52:46:WU01:FS01:0x22:        CWD: C:\Users\ricardo\AppData\Roaming\FAHClient\work
07:52:46:WU01:FS01:0x22:********************************************************************************
07:52:46:WU01:FS01:0x22:Project: 13416 (Run 1053, Clone 177, Gen 0)
07:52:46:WU01:FS01:0x22:Unit: 0x0000000012bc7d9a5f02af804fd165a7
07:52:46:WU01:FS01:0x22:Digital signatures verified
07:52:46:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
07:52:46:WU01:FS01:0x22:Version 0.0.11
07:52:46:WU01:FS01:0x22:  Checkpoint write interval: 50000 steps (5%) [20 total]
07:52:46:WU01:FS01:0x22:  JSON viewer frame write interval: 10000 steps (1%) [100 total]
07:52:46:WU01:FS01:0x22:  XTC frame write interval: 250000 steps (25%) [4 total]
07:52:46:WU01:FS01:0x22:  Global context and integrator variables write interval: 2500 steps (0.25%) [400 total]
07:52:58:WU01:FS01:0x22:Completed 300000 out of 1000000 steps (30%)
07:53:43:Removing old file 'configs/config-20200701-081923.xml'

Re: WU 13416 low ppd long run time

Posted: Tue Jul 07, 2020 1:42 pm
by rjccosta1
I find all of this strange because I got another WU from project 13416 that was folding at the usual 1.4M PPD. This means that not all WUs in project 13416 are slow. The log below is from one of the 3 WUs that were very slow (800K PPD on an RTX 2070).

Here is the WU in question, Project: 13416 (Run 1053, Clone 177, Gen 0):

Code:

07:52:46:WU01:FS01:0x22:Project: 13416 (Run 1053, Clone 177, Gen 0)
07:52:46:WU01:FS01:0x22:Unit: 0x0000000012bc7d9a5f02af804fd165a7
07:52:46:WU01:FS01:0x22:Digital signatures verified
07:52:46:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
07:52:46:WU01:FS01:0x22:Version 0.0.11
07:52:46:WU01:FS01:0x22:  Checkpoint write interval: 50000 steps (5%) [20 total]
07:52:46:WU01:FS01:0x22:  JSON viewer frame write interval: 10000 steps (1%) [100 total]
07:52:46:WU01:FS01:0x22:  XTC frame write interval: 250000 steps (25%) [4 total]
07:52:46:WU01:FS01:0x22:  Global context and integrator variables write interval: 2500 steps (0.25%) [400 total]
07:52:58:WU01:FS01:0x22:Completed 300000 out of 1000000 steps (30%)
07:53:43:Removing old file 'configs/config-20200701-081923.xml'
07:53:43:Saving configuration to config.xml
07:53:43:<config>
07:53:43:  <!-- Folding Core -->
07:53:43:  <checkpoint v='3'/>
07:53:43:
07:53:43:  <!-- Folding Slot Configuration -->
07:53:43:  <client-type v='advanced'/>
07:53:43:
07:53:43:  <!-- Network -->
07:53:43:  <proxy v=':8080'/>
07:53:43:
07:53:43:  <!-- Slot Control -->
07:53:43:  <pause-on-battery v='false'/>
07:53:43:  <power v='full'/>
07:53:43:
07:53:43:  <!-- User Information -->
07:53:43:  <passkey v='*****'/>
07:53:43:  <team v='35947'/>
07:53:43:  <user v='rjcman'/>
07:53:43:
07:53:43:  <!-- Folding Slots -->
07:53:43:  <slot id='1' type='GPU'/>
07:53:43:</config>
07:55:26:WU01:FS01:0x22:Completed 310000 out of 1000000 steps (31%)
07:57:57:WU01:FS01:0x22:Completed 320000 out of 1000000 steps (32%)
08:00:28:WU01:FS01:0x22:Completed 330000 out of 1000000 steps (33%)
08:02:58:WU01:FS01:0x22:Completed 340000 out of 1000000 steps (34%)
08:05:28:WU01:FS01:0x22:Completed 350000 out of 1000000 steps (35%)

10:41:13:WU01:FS01:0x22:Completed 990000 out of 1000000 steps (99%)
10:41:14:WU00:FS01:Connecting to assign1.foldingathome.org:80
10:41:14:WU00:FS01:Assigned to work server 18.188.125.154
10:41:14:WU00:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:TU106 [GeForce RTX 2070] M 6497 from 18.188.125.154
10:41:14:WU00:FS01:Connecting to 18.188.125.154:8080
10:41:15:WU00:FS01:Downloading 7.03MiB
10:41:21:WU00:FS01:Download 84.48%
10:41:21:WU00:FS01:Download complete
10:41:22:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13416 run:399 clone:130 gen:1 core:0x22 unit:0x0000000312bc7d9a5f00a7e7ea520da4
10:43:43:WU01:FS01:0x22:Completed 1000000 out of 1000000 steps (100%)
10:43:43:WU01:FS01:0x22:Average performance: 116.129 ns/day
10:43:47:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
10:43:47:WU01:FS01:0x22:Saving result file checkpointState.xml.bz2
10:43:47:WU01:FS01:0x22:Saving result file globals.csv
10:43:47:WU01:FS01:0x22:Saving result file positions.xtc
10:43:47:WU01:FS01:0x22:Saving result file science.log
10:43:47:WU01:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
10:43:48:WU01:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
10:43:48:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13416 run:1053 clone:177 gen:0 core:0x22 unit:0x0000000012bc7d9a5f02af804fd165a7
10:43:48:WU01:FS01:Uploading 5.83MiB to 18.188.125.154
10:43:48:WU01:FS01:Connecting to 18.188.125.154:8080
10:43:48:WU00:FS01:Starting
10:43:48:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\ricardo\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 706 -lifeline 11628 -checkpoint 30 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
10:43:48:WU00:FS01:Started FahCore on PID 4832
10:43:48:WU00:FS01:Core PID:12212
10:43:48:WU00:FS01:FahCore 0x22 started
10:43:49:WU00:FS01:0x22:*********************** Log Started 2020-07-07T10:43:48Z ***********************
10:43:49:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
10:43:49:WU00:FS01:0x22:       Core: Core22
10:43:49:WU00:FS01:0x22:       Type: 0x22
10:43:49:WU00:FS01:0x22:    Version: 0.0.11
10:43:49:WU00:FS01:0x22:     Author: Joseph Coffland <[email protected]>
10:43:49:WU00:FS01:0x22:  Copyright: 2020 foldingathome.org
10:43:49:WU00:FS01:0x22:   Homepage: https://foldingathome.org/
10:43:49:WU00:FS01:0x22:       Date: Jun 26 2020
10:43:49:WU00:FS01:0x22:       Time: 19:49:16
10:43:49:WU00:FS01:0x22:   Revision: 22010df8a4db48db1b35d33e666b64d8ce48689d
10:43:49:WU00:FS01:0x22:     Branch: core22-0.0.11
10:43:49:WU00:FS01:0x22:   Compiler: Visual C++ 2015
10:43:49:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
10:43:49:WU00:FS01:0x22:   Platform: win32 10
10:43:49:WU00:FS01:0x22:       Bits: 64
10:43:49:WU00:FS01:0x22:       Mode: Release
10:43:49:WU00:FS01:0x22:Maintainers: John Chodera <[email protected]> and Peter Eastman
10:43:49:WU00:FS01:0x22:             <[email protected]>
10:43:49:WU00:FS01:0x22:       Args: -dir 00 -suffix 01 -version 706 -lifeline 4832 -checkpoint 30
10:43:49:WU00:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
10:43:49:WU00:FS01:0x22:             0 -gpu 0
10:43:49:WU00:FS01:0x22:************************************ libFAH ************************************
10:43:49:WU00:FS01:0x22:       Date: Jun 26 2020
10:43:49:WU00:FS01:0x22:       Time: 19:47:12
10:43:49:WU00:FS01:0x22:   Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
10:43:49:WU00:FS01:0x22:     Branch: HEAD
10:43:49:WU00:FS01:0x22:   Compiler: Visual C++ 2015
10:43:49:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
10:43:49:WU00:FS01:0x22:   Platform: win32 10
10:43:49:WU00:FS01:0x22:       Bits: 64
10:43:49:WU00:FS01:0x22:       Mode: Release
10:43:49:WU00:FS01:0x22:************************************ CBang *************************************
10:43:49:WU00:FS01:0x22:       Date: Jun 26 2020
10:43:49:WU00:FS01:0x22:       Time: 19:46:11
10:43:49:WU00:FS01:0x22:   Revision: f8529962055b0e7bde23e429f5072ff758089dee
10:43:49:WU00:FS01:0x22:     Branch: master
10:43:49:WU00:FS01:0x22:   Compiler: Visual C++ 2015
10:43:49:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
10:43:49:WU00:FS01:0x22:   Platform: win32 10
10:43:49:WU00:FS01:0x22:       Bits: 64
10:43:49:WU00:FS01:0x22:       Mode: Release
10:43:49:WU00:FS01:0x22:************************************ System ************************************
10:43:49:WU00:FS01:0x22:        CPU: AMD Ryzen 5 3600 6-Core Processor
10:43:49:WU00:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
10:43:49:WU00:FS01:0x22:       CPUs: 12
10:43:49:WU00:FS01:0x22:     Memory: 15.95GiB
10:43:49:WU00:FS01:0x22:Free Memory: 9.24GiB
10:43:49:WU00:FS01:0x22:    Threads: WINDOWS_THREADS
10:43:49:WU00:FS01:0x22: OS Version: 6.2
10:43:49:WU00:FS01:0x22:Has Battery: false
10:43:49:WU00:FS01:0x22: On Battery: false
10:43:49:WU00:FS01:0x22: UTC Offset: 1
10:43:49:WU00:FS01:0x22:        PID: 12212
10:43:49:WU00:FS01:0x22:        CWD: C:\Users\ricardo\AppData\Roaming\FAHClient\work
10:43:49:WU00:FS01:0x22:********************************************************************************
10:43:49:WU00:FS01:0x22:Project: 13416 (Run 399, Clone 130, Gen 1)
10:43:49:WU00:FS01:0x22:Unit: 0x0000000312bc7d9a5f00a7e7ea520da4
10:43:49:WU00:FS01:0x22:Reading tar file core.xml
10:43:49:WU00:FS01:0x22:Reading tar file integrator.xml
10:43:49:WU00:FS01:0x22:Reading tar file state.xml.bz2
10:43:49:WU00:FS01:0x22:Reading tar file system.xml.bz2
10:43:49:WU00:FS01:0x22:Digital signatures verified
10:43:49:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
10:43:49:WU00:FS01:0x22:Version 0.0.11
10:43:49:WU00:FS01:0x22:  Checkpoint write interval: 50000 steps (5%) [20 total]
10:43:49:WU00:FS01:0x22:  JSON viewer frame write interval: 10000 steps (1%) [100 total]
10:43:49:WU00:FS01:0x22:  XTC frame write interval: 250000 steps (25%) [4 total]
10:43:49:WU00:FS01:0x22:  Global context and integrator variables write interval: 2500 steps (0.25%) [400 total]
10:43:54:WU01:FS01:Upload 10.72%
10:44:00:WU01:FS01:Upload 20.37%
10:44:01:WU00:FS01:0x22:Completed 0 out of 1000000 steps (0%)
10:44:06:WU01:FS01:Upload 31.09%
10:44:12:WU01:FS01:Upload 40.74%
10:44:18:WU01:FS01:Upload 51.46%
10:44:24:WU01:FS01:Upload 62.18%
10:44:30:WU01:FS01:Upload 72.90%
10:44:36:WU01:FS01:Upload 83.63%
10:44:42:WU01:FS01:Upload 94.35%
10:44:47:WU01:FS01:Upload complete
10:44:47:WU01:FS01:Server responded WORK_ACK (400)
10:44:47:WU01:FS01:Final credit estimate, 153414.00 points
10:44:47:WU01:FS01:Cleaning up
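
For anyone who wants to check a slow WU against their own numbers, the time per frame (TPF) can be read straight off the "Completed ... steps" lines. The following is only a rough sketch: the regular expression and the helper names `parse_progress` and `mean_seconds_per_frame` are made up for illustration, and it assumes the WU was not paused and the log does not cross midnight.

```python
import re
from datetime import datetime

# Progress lines copied from the log above (WU01: Run 1053, Clone 177, Gen 0).
LOG_EXCERPT = """\
07:55:26:WU01:FS01:0x22:Completed 310000 out of 1000000 steps (31%)
07:57:57:WU01:FS01:0x22:Completed 320000 out of 1000000 steps (32%)
08:00:28:WU01:FS01:0x22:Completed 330000 out of 1000000 steps (33%)
08:02:58:WU01:FS01:0x22:Completed 340000 out of 1000000 steps (34%)
08:05:28:WU01:FS01:0x22:Completed 350000 out of 1000000 steps (35%)
10:41:13:WU01:FS01:0x22:Completed 990000 out of 1000000 steps (99%)
10:43:43:WU01:FS01:0x22:Completed 1000000 out of 1000000 steps (100%)
"""

# Hypothetical pattern for Core22 progress lines: captures timestamp and percent.
PROGRESS_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:FS\d+:0x22:Completed \d+ out of \d+ steps \((\d+)%\)$"
)


def parse_progress(log_text):
    """Return a list of (timestamp, percent) tuples, one per progress line."""
    points = []
    for line in log_text.splitlines():
        match = PROGRESS_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            points.append((stamp, int(match.group(2))))
    return points


def mean_seconds_per_frame(points):
    """Average seconds per 1% frame between the first and last sample.

    Assumes the samples do not cross midnight and the WU was never paused.
    """
    (start, start_pct), (end, end_pct) = points[0], points[-1]
    return (end - start).total_seconds() / (end_pct - start_pct)


points = parse_progress(LOG_EXCERPT)
tpf = mean_seconds_per_frame(points)
print(f"TPF: {tpf:.1f} s, full WU: {tpf * 100 / 3600:.2f} h")
```

On the lines shown in this excerpt that works out to about 146 s per frame, or roughly 4 hours for all 100 frames. It only covers the 31% to 100% stretch that appears in the log, so it says nothing about the part of the WU that ran before the client was restarted.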

Re: WU 13416 low ppd long run time

Posted: Tue Jul 07, 2020 1:59 pm
by Joe_H
Multiple posts, including some by the researcher running these projects, note that the 134nn projects are showing more variance than normal projects. WUs from some runs will be significantly slower; the data from these is being analyzed to see whether a cause can be determined, and to improve future project configuration and assignment.

Re: WU 13416 low ppd long run time

Posted: Tue Jul 07, 2020 2:51 pm
by Sparkly
Yeah, I see this too. Some of the 13416 WUs are insanely CPU hungry for some reason, even though they are running on the GPU, and each requires a full CPU core of its own to even be able to move forward, whereas other WUs in the same project have the normal expected CPU load for their atom count.

Re: WU 13416 low ppd long run time

Posted: Tue Jul 07, 2020 3:09 pm
by Curt3g
I also just had a long-running one that took 12.5 hours (they normally take 4 hours). It was run:1291 clone:121 gen:0.

Just wanted to confirm, did I see in one of the posts from @JohnChodera that as long as the job results get successfully returned, we don't need to post abnormally long run times on the forum? Or maybe that pertained to switching to the latest core. Can't remember now.

Cheers,

Curt

Re: WU 13416 low ppd long run time

Posted: Tue Jul 07, 2020 3:15 pm
by ajm
As a general rule, if the WUs are returned, the researchers will be able to see how they behaved and I think it's not necessary to list them here, at least not systematically. But it's okay too, I'd say.

Re: WU 13416 low ppd long run time

Posted: Tue Jul 07, 2020 3:31 pm
by rjccosta1
Thanks for the clarification. Conclusion: some WUs are very slow and give low PPD. Fair enough. I thought there was something really wrong with my CPU-GPU configuration. You have all been very helpful. We live in times of so much hate and negativity that it is surprising to see so many people rowing in the same direction. What a great community we have here. 1-0 for humanity.

Re: WU 13416 low ppd long run time

Posted: Tue Jul 07, 2020 5:07 pm
by Ichbin3
I doubt that the scientists can see whether some WUs take an abnormally long time.
There are so many reasons why a WU can take longer that have nothing to do with its structure: different GPUs, underclocking, power limiting, pausing, gaming on the GPU while folding, ...
I would be curious to hear a statement from the scientists.

Re: WU 13416 low ppd long run time

Posted: Tue Jul 07, 2020 5:17 pm
by bruce
Projects have a variety of goals and so do the researchers. As a general rule, John Chodera is very careful to check each WU and if it contained an error, he'll evaluate that information carefully. Other researchers may not be specifically analyzing each one so carefully, but they do pay attention.

As far as some WUs taking several hours, that's not surprising. Some projects assign WUs that take several days. A single project that takes a week is more efficient than a series of 14 projects that each take 12 hrs. That's why people talk about Points Per Day. The processing time for the Covid Moonshot projects may often be very short, but that's not their main objective.
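
To make the Points Per Day idea concrete, here is a minimal sketch of the normalization. The helper name `points_per_day` is made up for illustration. The 153414 credit is the "Final credit estimate" from the log posted earlier in this thread; the 4-hour run time is only an assumed round figure, not a measured value.

```python
SECONDS_PER_DAY = 86400.0


def points_per_day(credit, elapsed_seconds):
    """Scale the credit for one WU to a 24-hour rate."""
    return credit * SECONDS_PER_DAY / elapsed_seconds


# Credit taken from the log in this thread; run time is an assumption.
credit = 153414
assumed_runtime_s = 4 * 3600

print(f"{points_per_day(credit, assumed_runtime_s):,.0f} PPD")
```

With those inputs the result is about 920k PPD, which is in the same range as the 800k-900k reported at the top of the thread. The point of the normalization is that a WU that runs longer is not a problem in itself; what matters is credit divided by time.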

Re: WU 13416 low ppd long run time

Posted: Tue Jul 07, 2020 6:50 pm
by JohnChodera
Thanks for the reports, folks! We shifted from SARS-CoV-1 to SARS-CoV-2 Mpro retrospective benchmarks for the new batch of runs we just loaded into 13416-7, and it seems that this has unexpectedly caused an increase in WU compute time. We've adjusted the base credit upwards to compensate while we investigate what is going on.

> As a general rule, if the WUs are returned, the researchers will be able to see how they behaved and I think it's not necessary to list them here, at least not systematically.

I can confirm this is the case. Anything that's uploaded is available for us to analyze, and we periodically check to identify the major sources of issues and pinpoint specific WUs that are problematic, but it's good to hear about systematic issues (like this) or issues that do NOT get uploaded!

We're still experiencing a much greater RUN-to-RUN variation than expected. Each RUN here is a different ligand for Mpro (either SARS-CoV-1 or 2). The systems are nearly identical in size (number of atoms), so it's the composition of the workload that is causing variation. We're not sure why yet, but we hope to improve this for the next batch of projects (13418-9).

These projects are helping us support the COVID Moonshot (http://covid.postera.ai/covid), so huge thanks again for helping out as we keep progressing toward more potent inhibitors, and with some luck, a molecule we can put into clinical trials in the next few months.

~ John Chodera // MSKCC

Re: WU 13416 low ppd long run time

Posted: Tue Jul 07, 2020 8:16 pm
by Ichbin3
Well then.
Thanks for the clarification.
Normal TPF here is 00:01:04, as in 13416 (452, 12, 0).
On 13416 (669, 132, 0) the TPF on my 2080 Ti is 00:01:25.
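
As a quick way to put those two TPF figures side by side, here is a small sketch. The helper name `tpf_to_seconds` is made up for illustration. It only compares frame times; the actual PPD difference may not match, since credit can also depend on how quickly a WU is returned.

```python
def tpf_to_seconds(tpf):
    """Convert an HH:MM:SS time-per-frame string to seconds."""
    hours, minutes, seconds = (int(part) for part in tpf.split(":"))
    return hours * 3600 + minutes * 60 + seconds


normal = tpf_to_seconds("00:01:04")  # 13416 (452, 12, 0)
slow = tpf_to_seconds("00:01:25")    # 13416 (669, 132, 0)

extra_time = slow / normal - 1       # extra wall-clock time per frame
lost_throughput = 1 - normal / slow  # fewer frames finished per day

print(f"{extra_time:.1%} longer per frame, {lost_throughput:.1%} fewer frames per day")
```

For these two WUs that comes to about 33% more time per frame, or about 25% fewer frames per day. The two percentages describe the same slowdown from different directions, which is worth keeping in mind when comparing "takes X% longer" reports with "PPD is X% lower" reports.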

Re: WU 13416 low ppd long run time

Posted: Wed Jul 08, 2020 6:21 am
by BobHehmann
Just experienced this (project:13416 run:1051 clone:146 gen:0 core:0x22 unit:0x0000000112bc7d9a5f02af81adfa8240). I noticed on my GPU stats monitor that the GPU usage was hovering around 62-64% for this WU, whereas I normally see GPU usage > 90%. GPU power draw was commensurately low, while all other GPU stats looked nominal. The GPU is a 2070 Super, running alongside an AMD 3900X CPU, which is also folding away. CPU utilization looked entirely normal for folding while 13416 was running slowly on the GPU. Combined CPU & GPU "Total Estimated Points Per Day" was sitting around 1.5M while this WU ran (jogged?) - I'm usually showing 2.3-2.4M PPD CPU/GPU this week. Sitting at 2.5M PPD right now as I type.

I also found that another 13416 instance from earlier today crashed and restarted several times before finally giving up. The "slow" instance was the next WU I received, and it also crashed and restarted once during its extended run, but it did eventually complete successfully. I seem to recall several other 13416-related crashes over the last couple of days. Normally my GPU folding is rock solid. My OS is Win10 Pro 1909, all patch levels current. I still have the relevant log files; let me know if anything from the logs could be of service.

Cheers, Bob

Re: WU 13416 low ppd long run time

Posted: Wed Jul 08, 2020 7:09 am
by Shirty
I too have noticed this behaviour across a mix of high-end Nvidia cards (species 7 & 8). I seem to be getting these WUs on the majority of my cards, and it's shaved over 4 million points off my daily average. As long as science is getting done I can cope with that for a while, but it'd be nice to get more WUs suited to the hardware I'm donating to the cause.

Re: WU 13416 low ppd long run time

Posted: Wed Jul 08, 2020 7:46 am
by psaam0001
My only questions are: 1) Will the expiration times be less than 3.5 days for a Moonshot WU? And 2) Will dumped units for this be reassigned?

Paul

Re: WU 13416 low ppd long run time

Posted: Wed Jul 08, 2020 7:04 pm
by Breach
Same observations, 13416 (1297, 133, 1) for example - PPD is lower than usual by about 25%.

I've noticed that my GPU is loaded at 75%. If I stop my CPU slot (using 6 out of 8 threads) it goes up to 82-85%. Could it be that my CPU bottlenecks the GPU slot here?

[Edit: Would have helped to read previous posts - it seems that's right, some WUs are just too CPU hungry]