Project 17236 - Early Unit End

Posted: Tue May 25, 2021 2:25 pm
by satcat16609
Over the last week, I've noticed five WUs so far failing with an "Early_Unit_End" warning. All five WUs are from Project 17236. Below are the relevant WU entries from HFM.Net. I've also included a snippet of the log file for PRCG 17236 (265, 117, 4). The message is the same for all of the WUs.

Code:

Project  Run  Clone  Gen  Core  FC  Result          Assigned         Finished         WU_Name  Core_Name
17236    26   186    0    0.19  0   EARLY_UNIT_END  5/16/2021 3:20   5/16/2021 3:21   p17236   GRO_A7
17236    159  162    6    0.19  0   EARLY_UNIT_END  5/24/2021 13:50  5/24/2021 13:50  p17236   GRO_A7
17236    253  82     11   0.19  0   EARLY_UNIT_END  5/24/2021 13:54  5/24/2021 13:54  p17236   GRO_A7
17236    313  46     5    0.19  0   EARLY_UNIT_END  5/25/2021 4:25   5/25/2021 4:25   p17236   GRO_A7
17236    265  117    4    0.19  0   EARLY_UNIT_END  5/25/2021 4:29   5/25/2021 4:29   p17236   GRO_A7

Code:

04:29:28:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:29:28:WU01:FS00:0xa7:   Platform: win32 10
04:29:28:WU01:FS00:0xa7:       Bits: 64
04:29:28:WU01:FS00:0xa7:       Mode: Release
04:29:28:WU01:FS00:0xa7:************************************ System ************************************
04:29:28:WU01:FS00:0xa7:        CPU: AMD Ryzen 5 3600X 6-Core Processor
04:29:28:WU01:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
04:29:28:WU01:FS00:0xa7:       CPUs: 12
04:29:28:WU01:FS00:0xa7:     Memory: 31.93GiB
04:29:28:WU01:FS00:0xa7:Free Memory: 17.89GiB
04:29:28:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
04:29:28:WU01:FS00:0xa7: OS Version: 6.2
04:29:28:WU01:FS00:0xa7:Has Battery: true
04:29:28:WU01:FS00:0xa7: On Battery: false
04:29:28:WU01:FS00:0xa7: UTC Offset: -5
04:29:28:WU01:FS00:0xa7:        PID: 34960
04:29:28:WU01:FS00:0xa7:        CWD: C:\ProgramData\FAHClient\work
04:29:28:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
04:29:28:WU01:FS00:0xa7:    Version: 0.0.19
04:29:28:WU01:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
04:29:28:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
04:29:28:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
04:29:28:WU01:FS00:0xa7:       Date: Nov 25 2019
04:29:28:WU01:FS00:0xa7:       Time: 17:12:41
04:29:28:WU01:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
04:29:28:WU01:FS00:0xa7:     Branch: master
04:29:28:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:29:28:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:29:28:WU01:FS00:0xa7:   Platform: win32 10
04:29:28:WU01:FS00:0xa7:       Bits: 64
04:29:28:WU01:FS00:0xa7:       Mode: Release
04:29:28:WU01:FS00:0xa7:************************************ Build *************************************
04:29:28:WU01:FS00:0xa7:       SIMD: avx_256
04:29:28:WU01:FS00:0xa7:********************************************************************************
04:29:28:WU01:FS00:0xa7:Project: 17236 (Run 265, Clone 117, Gen 4)
04:29:28:WU01:FS00:0xa7:Unit: 0x0000000f80fccb02609dbb0c3f813525
04:29:28:WU01:FS00:0xa7:Reading tar file core.xml
04:29:28:WU01:FS00:0xa7:Reading tar file frame4.tpr
04:29:28:WU01:FS00:0xa7:Digital signatures verified
04:29:28:WU01:FS00:0xa7:Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10
04:29:28:WU01:FS00:0xa7:Steps: first=2000000 total=500000
04:29:28:WU01:FS00:0xa7:ERROR:
04:29:28:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:29:28:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
04:29:28:WU01:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\domdec.c, line: 6902
04:29:28:WU01:FS00:0xa7:ERROR:
04:29:28:WU01:FS00:0xa7:ERROR:Fatal error:
04:29:28:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.40508 nm
04:29:28:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
04:29:28:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
04:29:28:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:29:28:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:29:28:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:29:33:WU01:FS00:0xa7:WARNING:Unexpected exit
04:29:33:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
04:29:33:WU01:FS00:Starting
04:29:33:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7.exe -dir 01 -suffix 01 -version 706 -lifeline 11620 -checkpoint 30 -np 10
04:29:33:WU01:FS00:Started FahCore on PID 36712
04:29:33:WU01:FS00:Core PID:27384
04:29:33:WU01:FS00:FahCore 0xa7 started
04:29:34:WU01:FS00:0xa7:*********************** Log Started 2021-05-25T04:29:34Z ***********************
04:29:34:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
04:29:34:WU01:FS00:0xa7:       Type: 0xa7
04:29:34:WU01:FS00:0xa7:       Core: Gromacs
04:29:34:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 706 -lifeline 36712 -checkpoint 30 -np
04:29:34:WU01:FS00:0xa7:             10
04:29:34:WU01:FS00:0xa7:************************************ CBang *************************************
04:29:34:WU01:FS00:0xa7:       Date: Nov 27 2019
04:29:34:WU01:FS00:0xa7:       Time: 03:40:09
04:29:34:WU01:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
04:29:34:WU01:FS00:0xa7:     Branch: master
04:29:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:29:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:29:34:WU01:FS00:0xa7:   Platform: win32 10
04:29:34:WU01:FS00:0xa7:       Bits: 64
04:29:34:WU01:FS00:0xa7:       Mode: Release
04:29:34:WU01:FS00:0xa7:************************************ System ************************************
04:29:34:WU01:FS00:0xa7:        CPU: AMD Ryzen 5 3600X 6-Core Processor
04:29:34:WU01:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
04:29:34:WU01:FS00:0xa7:       CPUs: 12
04:29:34:WU01:FS00:0xa7:     Memory: 31.93GiB
04:29:34:WU01:FS00:0xa7:Free Memory: 17.89GiB
04:29:34:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
04:29:34:WU01:FS00:0xa7: OS Version: 6.2
04:29:34:WU01:FS00:0xa7:Has Battery: true
04:29:34:WU01:FS00:0xa7: On Battery: false
04:29:34:WU01:FS00:0xa7: UTC Offset: -5
04:29:34:WU01:FS00:0xa7:        PID: 27384
04:29:34:WU01:FS00:0xa7:        CWD: C:\ProgramData\FAHClient\work
04:29:34:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
04:29:34:WU01:FS00:0xa7:    Version: 0.0.19
04:29:34:WU01:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
04:29:34:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
04:29:34:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
04:29:34:WU01:FS00:0xa7:       Date: Nov 25 2019
04:29:34:WU01:FS00:0xa7:       Time: 17:12:41
04:29:34:WU01:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
04:29:34:WU01:FS00:0xa7:     Branch: master
04:29:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:29:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:29:34:WU01:FS00:0xa7:   Platform: win32 10
04:29:34:WU01:FS00:0xa7:       Bits: 64
04:29:34:WU01:FS00:0xa7:       Mode: Release
04:29:34:WU01:FS00:0xa7:************************************ Build *************************************
04:29:34:WU01:FS00:0xa7:       SIMD: avx_256
04:29:34:WU01:FS00:0xa7:********************************************************************************
04:29:34:WU01:FS00:0xa7:Project: 17236 (Run 265, Clone 117, Gen 4)
04:29:34:WU01:FS00:0xa7:Unit: 0x0000000f80fccb02609dbb0c3f813525
04:29:34:WU01:FS00:0xa7:Reading tar file core.xml
04:29:34:WU01:FS00:0xa7:Reading tar file frame4.tpr
04:29:34:WU01:FS00:0xa7:Digital signatures verified
04:29:34:WU01:FS00:0xa7:Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10
04:29:34:WU01:FS00:0xa7:Steps: first=2000000 total=500000
04:29:34:WU01:FS00:0xa7:ERROR:
04:29:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:29:34:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
04:29:34:WU01:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\domdec.c, line: 6902
04:29:34:WU01:FS00:0xa7:ERROR:
04:29:34:WU01:FS00:0xa7:ERROR:Fatal error:
04:29:34:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.40508 nm
04:29:34:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
04:29:34:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
04:29:34:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:29:34:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:29:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:29:39:WU01:FS00:0xa7:WARNING:Unexpected exit
04:29:39:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
04:30:34:WU01:FS00:Starting
04:30:34:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7.exe -dir 01 -suffix 01 -version 706 -lifeline 11620 -checkpoint 30 -np 10
04:30:34:WU01:FS00:Started FahCore on PID 38176
04:30:34:WU01:FS00:Core PID:37540
04:30:34:WU01:FS00:FahCore 0xa7 started
04:30:34:WU01:FS00:0xa7:*********************** Log Started 2021-05-25T04:30:34Z ***********************
04:30:34:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
04:30:34:WU01:FS00:0xa7:       Type: 0xa7
04:30:34:WU01:FS00:0xa7:       Core: Gromacs
04:30:34:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 706 -lifeline 38176 -checkpoint 30 -np
04:30:34:WU01:FS00:0xa7:             10
04:30:34:WU01:FS00:0xa7:************************************ CBang *************************************
04:30:34:WU01:FS00:0xa7:       Date: Nov 27 2019
04:30:34:WU01:FS00:0xa7:       Time: 03:40:09
04:30:34:WU01:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
04:30:34:WU01:FS00:0xa7:     Branch: master
04:30:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:30:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:30:34:WU01:FS00:0xa7:   Platform: win32 10
04:30:34:WU01:FS00:0xa7:       Bits: 64
04:30:34:WU01:FS00:0xa7:       Mode: Release
04:30:34:WU01:FS00:0xa7:************************************ System ************************************
04:30:34:WU01:FS00:0xa7:        CPU: AMD Ryzen 5 3600X 6-Core Processor
04:30:34:WU01:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
04:30:34:WU01:FS00:0xa7:       CPUs: 12
04:30:34:WU01:FS00:0xa7:     Memory: 31.93GiB
04:30:34:WU01:FS00:0xa7:Free Memory: 17.85GiB
04:30:34:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
04:30:34:WU01:FS00:0xa7: OS Version: 6.2
04:30:34:WU01:FS00:0xa7:Has Battery: true
04:30:34:WU01:FS00:0xa7: On Battery: false
04:30:34:WU01:FS00:0xa7: UTC Offset: -5
04:30:34:WU01:FS00:0xa7:        PID: 37540
04:30:34:WU01:FS00:0xa7:        CWD: C:\ProgramData\FAHClient\work
04:30:34:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
04:30:34:WU01:FS00:0xa7:    Version: 0.0.19
04:30:34:WU01:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
04:30:34:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
04:30:34:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
04:30:34:WU01:FS00:0xa7:       Date: Nov 25 2019
04:30:34:WU01:FS00:0xa7:       Time: 17:12:41
04:30:34:WU01:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
04:30:34:WU01:FS00:0xa7:     Branch: master
04:30:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:30:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:30:34:WU01:FS00:0xa7:   Platform: win32 10
04:30:34:WU01:FS00:0xa7:       Bits: 64
04:30:34:WU01:FS00:0xa7:       Mode: Release
04:30:34:WU01:FS00:0xa7:************************************ Build *************************************
04:30:34:WU01:FS00:0xa7:       SIMD: avx_256
04:30:34:WU01:FS00:0xa7:********************************************************************************
04:30:34:WU01:FS00:0xa7:Project: 17236 (Run 265, Clone 117, Gen 4)
04:30:34:WU01:FS00:0xa7:Unit: 0x0000000f80fccb02609dbb0c3f813525
04:30:34:WU01:FS00:0xa7:Reading tar file core.xml
04:30:34:WU01:FS00:0xa7:Reading tar file frame4.tpr
04:30:34:WU01:FS00:0xa7:Digital signatures verified
04:30:34:WU01:FS00:0xa7:Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10
04:30:34:WU01:FS00:0xa7:Steps: first=2000000 total=500000
04:30:34:WU01:FS00:0xa7:ERROR:
04:30:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:30:34:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
04:30:34:WU01:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\domdec.c, line: 6902
04:30:34:WU01:FS00:0xa7:ERROR:
04:30:34:WU01:FS00:0xa7:ERROR:Fatal error:
04:30:34:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.40508 nm
04:30:34:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
04:30:34:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
04:30:34:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:30:34:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:30:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:30:39:WU01:FS00:0xa7:WARNING:Unexpected exit
04:30:39:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
04:31:34:WU01:FS00:Starting
04:31:34:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7.exe -dir 01 -suffix 01 -version 706 -lifeline 11620 -checkpoint 30 -np 10
04:31:34:WU01:FS00:Started FahCore on PID 27116
04:31:34:WU01:FS00:Core PID:4516
04:31:34:WU01:FS00:FahCore 0xa7 started
04:31:34:WU01:FS00:0xa7:*********************** Log Started 2021-05-25T04:31:34Z ***********************
04:31:34:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
04:31:34:WU01:FS00:0xa7:       Type: 0xa7
04:31:34:WU01:FS00:0xa7:       Core: Gromacs
04:31:34:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 706 -lifeline 27116 -checkpoint 30 -np
04:31:34:WU01:FS00:0xa7:             10
04:31:34:WU01:FS00:0xa7:************************************ CBang *************************************
04:31:34:WU01:FS00:0xa7:       Date: Nov 27 2019
04:31:34:WU01:FS00:0xa7:       Time: 03:40:09
04:31:34:WU01:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
04:31:34:WU01:FS00:0xa7:     Branch: master
04:31:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:31:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:31:34:WU01:FS00:0xa7:   Platform: win32 10
04:31:34:WU01:FS00:0xa7:       Bits: 64
04:31:34:WU01:FS00:0xa7:       Mode: Release
04:31:34:WU01:FS00:0xa7:************************************ System ************************************
04:31:34:WU01:FS00:0xa7:        CPU: AMD Ryzen 5 3600X 6-Core Processor
04:31:34:WU01:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
04:31:34:WU01:FS00:0xa7:       CPUs: 12
04:31:34:WU01:FS00:0xa7:     Memory: 31.93GiB
04:31:34:WU01:FS00:0xa7:Free Memory: 17.85GiB
04:31:34:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
04:31:34:WU01:FS00:0xa7: OS Version: 6.2
04:31:34:WU01:FS00:0xa7:Has Battery: true
04:31:34:WU01:FS00:0xa7: On Battery: false
04:31:34:WU01:FS00:0xa7: UTC Offset: -5
04:31:34:WU01:FS00:0xa7:        PID: 4516
04:31:34:WU01:FS00:0xa7:        CWD: C:\ProgramData\FAHClient\work
04:31:34:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
04:31:34:WU01:FS00:0xa7:    Version: 0.0.19
04:31:34:WU01:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
04:31:34:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
04:31:34:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
04:31:34:WU01:FS00:0xa7:       Date: Nov 25 2019
04:31:34:WU01:FS00:0xa7:       Time: 17:12:41
04:31:34:WU01:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
04:31:34:WU01:FS00:0xa7:     Branch: master
04:31:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:31:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:31:34:WU01:FS00:0xa7:   Platform: win32 10
04:31:34:WU01:FS00:0xa7:       Bits: 64
04:31:34:WU01:FS00:0xa7:       Mode: Release
04:31:34:WU01:FS00:0xa7:************************************ Build *************************************
04:31:34:WU01:FS00:0xa7:       SIMD: avx_256
04:31:34:WU01:FS00:0xa7:********************************************************************************
04:31:34:WU01:FS00:0xa7:Project: 17236 (Run 265, Clone 117, Gen 4)
04:31:34:WU01:FS00:0xa7:Unit: 0x0000000f80fccb02609dbb0c3f813525
04:31:34:WU01:FS00:0xa7:Reading tar file core.xml
04:31:34:WU01:FS00:0xa7:Reading tar file frame4.tpr
04:31:34:WU01:FS00:0xa7:Digital signatures verified
04:31:34:WU01:FS00:0xa7:Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10
04:31:34:WU01:FS00:0xa7:Steps: first=2000000 total=500000
04:31:34:WU01:FS00:0xa7:ERROR:
04:31:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:31:34:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
04:31:34:WU01:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\domdec.c, line: 6902
04:31:34:WU01:FS00:0xa7:ERROR:
04:31:34:WU01:FS00:0xa7:ERROR:Fatal error:
04:31:34:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.40508 nm
04:31:34:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
04:31:34:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
04:31:34:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:31:34:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:31:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:31:39:WU01:FS00:0xa7:WARNING:Unexpected exit
04:31:40:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
04:32:34:WU01:FS00:Starting
04:32:34:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7.exe -dir 01 -suffix 01 -version 706 -lifeline 11620 -checkpoint 30 -np 10
04:32:34:WU01:FS00:Started FahCore on PID 10320
04:32:34:WU01:FS00:Core PID:38496
04:32:34:WU01:FS00:FahCore 0xa7 started
04:32:34:WU01:FS00:0xa7:*********************** Log Started 2021-05-25T04:32:34Z ***********************
04:32:34:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
04:32:34:WU01:FS00:0xa7:       Type: 0xa7
04:32:34:WU01:FS00:0xa7:       Core: Gromacs
04:32:34:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 706 -lifeline 10320 -checkpoint 30 -np
04:32:34:WU01:FS00:0xa7:             10
04:32:34:WU01:FS00:0xa7:************************************ CBang *************************************
04:32:34:WU01:FS00:0xa7:       Date: Nov 27 2019
04:32:34:WU01:FS00:0xa7:       Time: 03:40:09
04:32:34:WU01:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
04:32:34:WU01:FS00:0xa7:     Branch: master
04:32:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:32:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:32:34:WU01:FS00:0xa7:   Platform: win32 10
04:32:34:WU01:FS00:0xa7:       Bits: 64
04:32:34:WU01:FS00:0xa7:       Mode: Release
04:32:34:WU01:FS00:0xa7:************************************ System ************************************
04:32:34:WU01:FS00:0xa7:        CPU: AMD Ryzen 5 3600X 6-Core Processor
04:32:34:WU01:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
04:32:34:WU01:FS00:0xa7:       CPUs: 12
04:32:34:WU01:FS00:0xa7:     Memory: 31.93GiB
04:32:34:WU01:FS00:0xa7:Free Memory: 17.87GiB
04:32:34:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
04:32:34:WU01:FS00:0xa7: OS Version: 6.2
04:32:34:WU01:FS00:0xa7:Has Battery: true
04:32:34:WU01:FS00:0xa7: On Battery: false
04:32:34:WU01:FS00:0xa7: UTC Offset: -5
04:32:34:WU01:FS00:0xa7:        PID: 38496
04:32:34:WU01:FS00:0xa7:        CWD: C:\ProgramData\FAHClient\work
04:32:34:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
04:32:34:WU01:FS00:0xa7:    Version: 0.0.19
04:32:34:WU01:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
04:32:34:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
04:32:34:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
04:32:34:WU01:FS00:0xa7:       Date: Nov 25 2019
04:32:34:WU01:FS00:0xa7:       Time: 17:12:41
04:32:34:WU01:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
04:32:34:WU01:FS00:0xa7:     Branch: master
04:32:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:32:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:32:34:WU01:FS00:0xa7:   Platform: win32 10
04:32:34:WU01:FS00:0xa7:       Bits: 64
04:32:34:WU01:FS00:0xa7:       Mode: Release
04:32:34:WU01:FS00:0xa7:************************************ Build *************************************
04:32:34:WU01:FS00:0xa7:       SIMD: avx_256
04:32:34:WU01:FS00:0xa7:********************************************************************************
04:32:34:WU01:FS00:0xa7:Project: 17236 (Run 265, Clone 117, Gen 4)
04:32:34:WU01:FS00:0xa7:Unit: 0x0000000f80fccb02609dbb0c3f813525
04:32:34:WU01:FS00:0xa7:Reading tar file core.xml
04:32:34:WU01:FS00:0xa7:Reading tar file frame4.tpr
04:32:34:WU01:FS00:0xa7:Digital signatures verified
04:32:34:WU01:FS00:0xa7:Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10
04:32:34:WU01:FS00:0xa7:Steps: first=2000000 total=500000
04:32:34:WU01:FS00:0xa7:ERROR:
04:32:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:32:34:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
04:32:34:WU01:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\domdec.c, line: 6902
04:32:34:WU01:FS00:0xa7:ERROR:
04:32:34:WU01:FS00:0xa7:ERROR:Fatal error:
04:32:34:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.40508 nm
04:32:34:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
04:32:34:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
04:32:34:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:32:34:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:32:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:32:39:WU01:FS00:0xa7:WARNING:Unexpected exit
04:32:40:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
04:32:40:WARNING:WU01:FS00:Too many errors, failing
04:32:40:WU01:FS00:Sending unit results: id:01 state:SEND error:FAILED project:17236 run:265 clone:117 gen:4 core:0xa7 unit:0x0000000f80fccb02609dbb0c3f813525
04:32:40:WU01:FS00:Connecting to 128.252.203.2:8080
04:32:40:WU01:FS00:Server responded WORK_ACK (400)
04:32:40:WU01:FS00:Cleaning up 

Re: Project 17236 - Early Unit End

Posted: Tue May 25, 2021 3:25 pm
by PaulTV
It's a wild guess; maybe someone can confirm or deny it. I think you're running 10 threads (see "Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10"). I read somewhere that jobs from older projects sometimes can't really deal with a core count that is a multiple of 5 (no, I didn't understand why either). This project is still an a7 core project, and that definitely falls under "older". It may be related if you read the explanation of the error: http://www.gromacs.org/Documentation/Er ... ze_of_x_nm

So I'd suggest reducing the number of cores to 8 or 9, and see if the problem persists.
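
For anyone who wants to check their own log for the same pattern, here is a minimal Python sketch. It assumes the log lines look like the ones quoted in the first post; the `scan` helper and the two regular expressions are hypothetical names made up for this illustration, not part of FAHClient or GROMACS.

```python
import re

# Sample lines copied from the log quoted in the first post.
LOG = """\
04:29:28:WU01:FS00:0xa7:Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10
04:29:28:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.40508 nm
04:29:33:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
"""

# Thread count passed to mdrun via -nt.
NT_RE = re.compile(r"Calling: mdrun .*-nt (\d+)")
# Rank count named in the GROMACS domain decomposition error.
DD_RE = re.compile(r"no domain decomposition for (\d+) ranks")


def scan(log_text):
    """Return (thread counts requested, rank counts that failed decomposition)."""
    threads = [int(m.group(1)) for m in NT_RE.finditer(log_text)]
    failed = [int(m.group(1)) for m in DD_RE.finditer(log_text)]
    return threads, failed


threads, failed = scan(LOG)
print("requested threads:", threads)
print("failed rank counts:", failed)
```

If the failed rank count matches the requested thread count, as it does here, changing the thread count is a reasonable first thing to try.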

Re: Project 17236 - Early Unit End

Posted: Tue May 25, 2021 5:27 pm
by satcat16609
I've been using the CPU on Medium power, which sets the system to use ten cores, and this is the first project where I've had any issues. I had read last year when I started folding that there are certain numbers of cores that don't work well, but have always assumed the software would default to using a proper number of cores. Perhaps this particular project doesn't like ten cores. Thanks for the info.

Re: Project 17236 - Early Unit End

Posted: Tue May 25, 2021 5:42 pm
by bruce
FahCore_a8 supposedly fixes this type of error, but it's still present in _a7 and older cores. Projects started with an older core will continue to use that core (including follow-on clones of the same study) until the study is finished.

Requesting "large prime" numbers of cores (including multiples of primes, like 5 cores) does sometimes fail, but it's a somewhat random failure. You may have completed some WUs from that project. Setting your slot to 9 cores or 12 cores will probably "solve" the problem for you. The project owner can also exclude assignments to clients that request work for 10 cores.
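
As a rough illustration of that rule of thumb, the Python sketch below lists core counts whose largest prime factor is small. The cutoff of 3 is an assumption chosen only because it matches the 9 and 12 suggested above; it is not a documented limit of FahCore_a7 or GROMACS, and `friendly_counts` is a hypothetical helper name.

```python
def largest_prime_factor(n):
    """Largest prime factor of n (n >= 2), found by trial division."""
    largest = 1
    d = 2
    while d * d <= n:
        while n % d == 0:
            largest = d
            n //= d
        d += 1
    if n > 1:
        # Whatever remains after dividing out small factors is itself prime.
        largest = n
    return largest


def friendly_counts(max_threads, limit=3):
    """Counts in 2..max_threads whose largest prime factor is <= limit.

    limit=3 is an assumed cutoff for illustration only.
    """
    return [n for n in range(2, max_threads + 1)
            if largest_prime_factor(n) <= limit]


print(largest_prime_factor(10))  # 5, so 10 would be flagged under this cutoff
print(friendly_counts(12))
```

Since the failures are described as somewhat random, a count outside this list may still work on some WUs, and a count inside it is not guaranteed to.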

Re: Project 17236 - Early Unit End

Posted: Tue May 25, 2021 8:53 pm
by satcat16609
Well, I adjusted the cores to 9, and then got back-to-back WUs from Project 17236, and they both completed successfully. Thanks for the info.

Re: Project 17236 - Early Unit End

Posted: Fri May 28, 2021 7:34 pm
by DeeGee
Yeah, it seems that project didn't like my 5950X, which has been set to 28 threads instead of the default 31. I got almost no CPU WUs, and the few I received crashed...

Re: Project 17236 - Early Unit End

Posted: Fri May 28, 2021 7:59 pm
by JimboPalmer
28 is 7 by 2 by 2, and a7 'hates' large primes like 7; 27 would be 3 by 3 by 3, much smaller primes.
One of the improvements in a8 is coping better with primes.
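
The arithmetic above can be checked with a few lines of Python. This is only a worked example of the factorization; `prime_factors` is a hypothetical helper, and nothing here reflects how the core itself chooses a decomposition.

```python
def prime_factors(n):
    """Prime factors of n (n >= 2) in ascending order, with repeats."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


for count in (28, 27):
    print(count, "=", " x ".join(map(str, prime_factors(count))))
```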