
Errors and varying/very low PPD on 16945 (12, 3, 1)

Posted: Wed Dec 30, 2020 5:30 pm
by Hopfgeist
I am noticing a PPD value for this WU of between 1/5 and 1/10 of my usual, and there are a couple of errors in the log:

01:48:11:WU01:FS00:Connecting to assign1.foldingathome.org:80
[...]
01:48:14:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:16945 run:12 clone:3 gen:1 core:0xa8 unit:0x0000000300000001000042310000000c
01:48:50:WU01:FS00:Starting
01:48:50:WU01:FS00:Running FahCore: /emul/linux/usr/bin/FAHCoreWrapper /home/bernd/FAH/cores/cores.foldingathome.org/lin/64bit-sse2/a8-0.0.9/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 21724 -checkpoint 15 -np 24
01:48:50:WU01:FS00:Started FahCore on PID 3660
01:48:50:WU01:FS00:Core PID:9191
01:48:50:WU01:FS00:FahCore 0xa8 started
01:48:51:WU01:FS00:0xa8:*********************** Log Started 2020-12-30T01:48:50Z ***********************
01:48:51:WU01:FS00:0xa8:************************** Gromacs Folding@home Core ***************************
01:48:51:WU01:FS00:0xa8:       Core: Gromacs
01:48:51:WU01:FS00:0xa8:       Type: 0xa8
01:48:51:WU01:FS00:0xa8:    Version: 0.0.9
01:48:51:WU01:FS00:0xa8:     Author: Joseph Coffland <[email protected]>
01:48:51:WU01:FS00:0xa8:  Copyright: 2020 foldingathome.org
01:48:51:WU01:FS00:0xa8:   Homepage: https://foldingathome.org/
01:48:51:WU01:FS00:0xa8:       Date: Oct 28 2020
01:48:51:WU01:FS00:0xa8:       Time: 22:12:28
01:48:51:WU01:FS00:0xa8:   Compiler: GNU 8.3.0
01:48:51:WU01:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
01:48:51:WU01:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie
01:48:51:WU01:FS00:0xa8:   Platform: linux2 4.15.0-108-generic
01:48:51:WU01:FS00:0xa8:       Bits: 64
01:48:51:WU01:FS00:0xa8:       Mode: Release
01:48:51:WU01:FS00:0xa8:       SIMD: sse2
01:48:51:WU01:FS00:0xa8:     OpenMP: ON
01:48:51:WU01:FS00:0xa8:       CUDA: OFF
01:48:51:WU01:FS00:0xa8:       Args: -dir 01 -suffix 01 -version 706 -lifeline 3660 -checkpoint 15 -np
01:48:51:WU01:FS00:0xa8:             24
01:48:51:WU01:FS00:0xa8:************************************ libFAH ************************************
01:48:51:WU01:FS00:0xa8:       Date: Oct 28 2020
01:48:51:WU01:FS00:0xa8:       Time: 22:12:00
01:48:51:WU01:FS00:0xa8:   Compiler: GNU 8.3.0
01:48:51:WU01:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
01:48:51:WU01:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie
01:48:51:WU01:FS00:0xa8:   Platform: linux2 4.15.0-108-generic
01:48:51:WU01:FS00:0xa8:       Bits: 64
01:48:51:WU01:FS00:0xa8:       Mode: Release
01:48:51:WU01:FS00:0xa8:************************************ CBang *************************************
01:48:51:WU01:FS00:0xa8:       Date: Oct 28 2020
01:48:51:WU01:FS00:0xa8:       Time: 22:11:46
01:48:51:WU01:FS00:0xa8:   Compiler: GNU 8.3.0
01:48:51:WU01:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
01:48:51:WU01:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
01:48:51:WU01:FS00:0xa8:   Platform: linux2 4.15.0-108-generic
01:48:51:WU01:FS00:0xa8:       Bits: 64
01:48:51:WU01:FS00:0xa8:       Mode: Release
01:48:51:WU01:FS00:0xa8:************************************ System ************************************
01:48:51:WU01:FS00:0xa8:        CPU: Intel(R) Xeon(R) CPU X5675 @ 3.07GHz
01:48:51:WU01:FS00:0xa8:     CPU ID: GenuineIntel Family 6 Model 44 Stepping 2
01:48:51:WU01:FS00:0xa8:       CPUs: 24
01:48:51:WU01:FS00:0xa8:     Memory: 39.99GiB
01:48:51:WU01:FS00:0xa8:Free Memory: 681.61MiB
01:48:51:WU01:FS00:0xa8:    Threads: POSIX_THREADS
01:48:51:WU01:FS00:0xa8: OS Version: 3.11
01:48:51:WU01:FS00:0xa8:Has Battery: false
01:48:51:WU01:FS00:0xa8: On Battery: false
01:48:51:WU01:FS00:0xa8: UTC Offset: 1
01:48:51:WU01:FS00:0xa8:        PID: 9191
01:48:51:WU01:FS00:0xa8:        CWD: /home/bernd/FAH/work
01:48:51:WU01:FS00:0xa8:********************************************************************************
01:48:51:WU01:FS00:0xa8:Project: 16945 (Run 12, Clone 3, Gen 1)
01:48:51:WU01:FS00:0xa8:Unit: 0x00000000000000000000000000000000
01:48:51:WU01:FS00:0xa8:Reading tar file core.xml
01:48:51:WU01:FS00:0xa8:Reading tar file frame1.tpr
01:48:51:WU01:FS00:0xa8:Digital signatures verified
01:48:51:WU01:FS00:0xa8:Calling: mdrun -c frame1.gro -s frame1.tpr -x frame1.xtc -cpt 15 -nt 24 -ntmpi 1
01:48:51:WU01:FS00:0xa8:Steps: first=5000000 total=10000000
01:48:54:WU01:FS00:0xa8:Completed 1 out of 5000000 steps (0%)
02:19:55:WU01:FS00:0xa8:Completed 50000 out of 5000000 steps (1%)
02:32:32:WU01:FS00:0xa8:Completed 100000 out of 5000000 steps (2%)
02:43:24:WU01:FS00:0xa8:Completed 150000 out of 5000000 steps (3%)
02:53:55:WU01:FS00:0xa8:Completed 200000 out of 5000000 steps (4%)
03:04:58:WU01:FS00:0xa8:Completed 250000 out of 5000000 steps (5%)
03:17:13:WU01:FS00:0xa8:Completed 300000 out of 5000000 steps (6%)
03:33:22:WU01:FS00:0xa8:Completed 350000 out of 5000000 steps (7%)
03:54:05:WU01:FS00:0xa8:Completed 400000 out of 5000000 steps (8%)
04:20:39:WU01:FS00:0xa8:Completed 450000 out of 5000000 steps (9%)
04:31:44:WU01:FS00:0xa8:Completed 500000 out of 5000000 steps (10%)
04:41:56:WU01:FS00:0xa8:Completed 550000 out of 5000000 steps (11%)
04:56:13:WU01:FS00:0xa8:Completed 600000 out of 5000000 steps (12%)
05:21:57:WU01:FS00:0xa8:Completed 650000 out of 5000000 steps (13%)
05:44:51:WU01:FS00:0xa8:Completed 700000 out of 5000000 steps (14%)
06:09:23:WU01:FS00:0xa8:Completed 750000 out of 5000000 steps (15%)
06:35:52:WU01:FS00:0xa8:Completed 800000 out of 5000000 steps (16%)
******************************* Date: 2020-12-30 *******************************
06:54:43:WU01:FS00:0xa8:Completed 850000 out of 5000000 steps (17%)
07:14:31:WU01:FS00:0xa8:Completed 900000 out of 5000000 steps (18%)
07:37:18:WU01:FS00:0xa8:Completed 950000 out of 5000000 steps (19%)
07:54:49:WU01:FS00:0xa8:Completed 1000000 out of 5000000 steps (20%)
08:08:09:WU01:FS00:0xa8:Completed 1050000 out of 5000000 steps (21%)
08:27:12:WU01:FS00:0xa8:Completed 1100000 out of 5000000 steps (22%)
08:44:20:WU01:FS00:0xa8:Completed 1150000 out of 5000000 steps (23%)
09:14:03:WU01:FS00:0xa8:Completed 1200000 out of 5000000 steps (24%)
09:34:32:WU01:FS00:0xa8:Completed 1250000 out of 5000000 steps (25%)
09:56:57:WU01:FS00:0xa8:Completed 1300000 out of 5000000 steps (26%)
10:00:25:ERROR:Send error: 32: Broken pipe
10:26:31:WU01:FS00:0xa8:Completed 1350000 out of 5000000 steps (27%)
11:01:43:WU01:FS00:0xa8:Completed 1400000 out of 5000000 steps (28%)
11:17:27:WU01:FS00:0xa8:Completed 1450000 out of 5000000 steps (29%)
11:30:28:WU01:FS00:0xa8:Completed 1500000 out of 5000000 steps (30%)
11:52:26:WU01:FS00:0xa8:Completed 1550000 out of 5000000 steps (31%)
12:17:51:ERROR:Send error: 32: Broken pipe
12:17:51:ERROR:Send error: 32: Broken pipe
12:21:49:WU01:FS00:0xa8:Completed 1600000 out of 5000000 steps (32%)
12:44:26:WU01:FS00:0xa8:Completed 1650000 out of 5000000 steps (33%)
******************************* Date: 2020-12-30 *******************************
13:16:44:WU01:FS00:0xa8:Completed 1700000 out of 5000000 steps (34%)
13:28:00:ERROR:Send error: 32: Broken pipe
13:40:40:WU01:FS00:0xa8:Completed 1750000 out of 5000000 steps (35%)
13:44:53:ERROR:Send error: 32: Broken pipe
13:59:21:WU01:FS00:0xa8:Completed 1800000 out of 5000000 steps (36%)
14:16:59:WU01:FS00:0xa8:Completed 1850000 out of 5000000 steps (37%)
14:40:36:WU01:FS00:0xa8:Completed 1900000 out of 5000000 steps (38%)
14:56:57:WU01:FS00:0xa8:Completed 1950000 out of 5000000 steps (39%)
15:11:09:WU01:FS00:0xa8:Completed 2000000 out of 5000000 steps (40%)
15:38:07:WU01:FS00:0xa8:Completed 2050000 out of 5000000 steps (41%)
16:00:12:WU01:FS00:0xa8:Completed 2100000 out of 5000000 steps (42%)
16:20:20:WU01:FS00:0xa8:Completed 2150000 out of 5000000 steps (43%)
16:35:48:WU01:FS00:0xa8:Completed 2200000 out of 5000000 steps (44%)
16:52:13:WU01:FS00:0xa8:Completed 2250000 out of 5000000 steps (45%)
I don't know what the "Broken pipe" error refers to, but something is wrong beyond the wildly varying TPF and the overall very slow progress.
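
To put numbers on the varying TPF, the time between consecutive "Completed" lines can be computed from the log timestamps. This is only a minimal sketch, assuming the progress-line format shown in the log above; `PROGRESS`, `frame_times` and `sample` are names made up for this example and are not part of FAHClient.

```python
import re

# Matches FAHClient progress lines such as
# "02:19:55:WU01:FS00:0xa8:Completed 50000 out of 5000000 steps (1%)"
PROGRESS = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}):WU\d+:FS\d+:0x\w+:"
    r"Completed (\d+) out of (\d+) steps \((\d+)%\)"
)

def frame_times(lines):
    """Return (percent, seconds since the previous progress line) pairs."""
    result = []
    prev = None
    for line in lines:
        m = PROGRESS.match(line.strip())
        if not m:
            continue  # skip ERROR lines, date banners, etc.
        h, mi, s, _done, _total, pct = map(int, m.groups())
        now = h * 3600 + mi * 60 + s
        if prev is not None:
            # Modulo tolerates a single wrap past midnight (assumption:
            # no frame takes longer than 24 hours).
            result.append((pct, (now - prev) % 86400))
        prev = now
    return result

# Lines copied from the log above, plus one ERROR line that is ignored.
sample = [
    "02:19:55:WU01:FS00:0xa8:Completed 50000 out of 5000000 steps (1%)",
    "02:32:32:WU01:FS00:0xa8:Completed 100000 out of 5000000 steps (2%)",
    "10:00:25:ERROR:Send error: 32: Broken pipe",
    "02:43:24:WU01:FS00:0xa8:Completed 150000 out of 5000000 steps (3%)",
    "02:53:55:WU01:FS00:0xa8:Completed 200000 out of 5000000 steps (4%)",
]

for pct, seconds in frame_times(sample):
    print(f"{pct:3d}%  {seconds // 60:2d} min {seconds % 60:02d} s")
```

Run over the full log, this gives frame times from roughly 10 minutes (04:31:44 to 04:41:56) up to about 35 minutes (10:26:31 to 11:01:43).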

Also a unit ID of 0 seems fishy:

01:48:51:WU01:FS00:0xa8:Project: 16945 (Run 12, Clone 3, Gen 1)
01:48:51:WU01:FS00:0xa8:Unit: 0x00000000000000000000000000000000
Especially since earlier it lists a different (non-zero) ID:

01:48:14:WU01:FS00:Download complete
01:48:14:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:16945 run:12 clone:3 gen:1 core:0xa8 unit:0x0000000300000001000042310000000c
Has anyone else seen something similar in this project or others?

Cheers,
HG.

Re: Errors and varying/very low PPD on 16945 (12, 3, 1)

Posted: Wed Dec 30, 2020 8:06 pm
by PantherX
The Unit ID of 0 is a red herring. It's correct that there was previously a valid Unit ID, but that feature has been deprecated and future versions of FahCore will not have it. Consider this a temporary cosmetic issue.

I think that "Send error: 32: Broken pipe" is related to FAHControl, but I don't know the cause of it or a fix for it.
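
For background, error 32 on Linux is EPIPE: a write to a pipe or socket whose other end has already been closed. The following is only a hedged illustration of that general condition, not FAHClient's code, and it says nothing about which peer actually went away here; `send_to_closed_peer` is a made-up name for this example.

```python
import errno
import socket

def send_to_closed_peer():
    """Send on a socket whose peer has closed; return the errno raised, or None."""
    a, b = socket.socketpair()
    b.close()  # the receiving side disappears (e.g. a monitoring tool exits)
    try:
        a.send(b"status update")
    except OSError as exc:
        # Python ignores SIGPIPE by default, so the failure surfaces
        # as an exception (BrokenPipeError) instead of killing the process.
        return exc.errno
    finally:
        a.close()
    return None

code = send_to_closed_peer()
print(code, errno.errorcode.get(code))
```

On Linux this should report EPIPE, which matches the "32: Broken pipe" text in the log. If that is what happens here, the error would be harmless to the running WU itself, but that is an assumption rather than something the log confirms.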