Project 17236 - Early Unit End

Moderators: Site Moderators, FAHC Science Team

satcat16609
Posts: 8
Joined: Wed Dec 30, 2020 3:02 pm
Hardware configuration: i5-4590, GTX 1060 3GB, Windows 10 Pro
Ryzen 5 3600X, GTX 1050 2GB, Windows 10 Pro
i3-3470, Manjaro Linux

Project 17236 - Early Unit End

Post by satcat16609 »

Over the last week, I've noticed five WUs, so far, failing with an "Early_Unit_End" warning. All five WUs are from Project 17236. Below are the relevant WU entries from HFM.Net. I've also included a snippet from the log file for PRCG 17236 (265,117,4). The message is the same for all of the WUs.

Code:

Project  Run  Clone  Gen  Core  FC  Result          Assigned         Finished         WU_Name  Core_Name
17236    26   186    0    0.19  0   EARLY_UNIT_END  5/16/2021 3:20   5/16/2021 3:21   p17236   GRO_A7
17236    159  162    6    0.19  0   EARLY_UNIT_END  5/24/2021 13:50  5/24/2021 13:50  p17236   GRO_A7
17236    253  82     11   0.19  0   EARLY_UNIT_END  5/24/2021 13:54  5/24/2021 13:54  p17236   GRO_A7
17236    313  46     5    0.19  0   EARLY_UNIT_END  5/25/2021 4:25   5/25/2021 4:25   p17236   GRO_A7
17236    265  117    4    0.19  0   EARLY_UNIT_END  5/25/2021 4:29   5/25/2021 4:29   p17236   GRO_A7

Code:

04:29:28:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:29:28:WU01:FS00:0xa7:   Platform: win32 10
04:29:28:WU01:FS00:0xa7:       Bits: 64
04:29:28:WU01:FS00:0xa7:       Mode: Release
04:29:28:WU01:FS00:0xa7:************************************ System ************************************
04:29:28:WU01:FS00:0xa7:        CPU: AMD Ryzen 5 3600X 6-Core Processor
04:29:28:WU01:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
04:29:28:WU01:FS00:0xa7:       CPUs: 12
04:29:28:WU01:FS00:0xa7:     Memory: 31.93GiB
04:29:28:WU01:FS00:0xa7:Free Memory: 17.89GiB
04:29:28:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
04:29:28:WU01:FS00:0xa7: OS Version: 6.2
04:29:28:WU01:FS00:0xa7:Has Battery: true
04:29:28:WU01:FS00:0xa7: On Battery: false
04:29:28:WU01:FS00:0xa7: UTC Offset: -5
04:29:28:WU01:FS00:0xa7:        PID: 34960
04:29:28:WU01:FS00:0xa7:        CWD: C:\ProgramData\FAHClient\work
04:29:28:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
04:29:28:WU01:FS00:0xa7:    Version: 0.0.19
04:29:28:WU01:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
04:29:28:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
04:29:28:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
04:29:28:WU01:FS00:0xa7:       Date: Nov 25 2019
04:29:28:WU01:FS00:0xa7:       Time: 17:12:41
04:29:28:WU01:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
04:29:28:WU01:FS00:0xa7:     Branch: master
04:29:28:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:29:28:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:29:28:WU01:FS00:0xa7:   Platform: win32 10
04:29:28:WU01:FS00:0xa7:       Bits: 64
04:29:28:WU01:FS00:0xa7:       Mode: Release
04:29:28:WU01:FS00:0xa7:************************************ Build *************************************
04:29:28:WU01:FS00:0xa7:       SIMD: avx_256
04:29:28:WU01:FS00:0xa7:********************************************************************************
04:29:28:WU01:FS00:0xa7:Project: 17236 (Run 265, Clone 117, Gen 4)
04:29:28:WU01:FS00:0xa7:Unit: 0x0000000f80fccb02609dbb0c3f813525
04:29:28:WU01:FS00:0xa7:Reading tar file core.xml
04:29:28:WU01:FS00:0xa7:Reading tar file frame4.tpr
04:29:28:WU01:FS00:0xa7:Digital signatures verified
04:29:28:WU01:FS00:0xa7:Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10
04:29:28:WU01:FS00:0xa7:Steps: first=2000000 total=500000
04:29:28:WU01:FS00:0xa7:ERROR:
04:29:28:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:29:28:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
04:29:28:WU01:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\domdec.c, line: 6902
04:29:28:WU01:FS00:0xa7:ERROR:
04:29:28:WU01:FS00:0xa7:ERROR:Fatal error:
04:29:28:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.40508 nm
04:29:28:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
04:29:28:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
04:29:28:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:29:28:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:29:28:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:29:33:WU01:FS00:0xa7:WARNING:Unexpected exit
04:29:33:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
04:29:33:WU01:FS00:Starting
04:29:33:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7.exe -dir 01 -suffix 01 -version 706 -lifeline 11620 -checkpoint 30 -np 10
04:29:33:WU01:FS00:Started FahCore on PID 36712
04:29:33:WU01:FS00:Core PID:27384
04:29:33:WU01:FS00:FahCore 0xa7 started
04:29:34:WU01:FS00:0xa7:*********************** Log Started 2021-05-25T04:29:34Z ***********************
04:29:34:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
04:29:34:WU01:FS00:0xa7:       Type: 0xa7
04:29:34:WU01:FS00:0xa7:       Core: Gromacs
04:29:34:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 706 -lifeline 36712 -checkpoint 30 -np
04:29:34:WU01:FS00:0xa7:             10
04:29:34:WU01:FS00:0xa7:************************************ CBang *************************************
04:29:34:WU01:FS00:0xa7:       Date: Nov 27 2019
04:29:34:WU01:FS00:0xa7:       Time: 03:40:09
04:29:34:WU01:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
04:29:34:WU01:FS00:0xa7:     Branch: master
04:29:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:29:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:29:34:WU01:FS00:0xa7:   Platform: win32 10
04:29:34:WU01:FS00:0xa7:       Bits: 64
04:29:34:WU01:FS00:0xa7:       Mode: Release
04:29:34:WU01:FS00:0xa7:************************************ System ************************************
04:29:34:WU01:FS00:0xa7:        CPU: AMD Ryzen 5 3600X 6-Core Processor
04:29:34:WU01:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
04:29:34:WU01:FS00:0xa7:       CPUs: 12
04:29:34:WU01:FS00:0xa7:     Memory: 31.93GiB
04:29:34:WU01:FS00:0xa7:Free Memory: 17.89GiB
04:29:34:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
04:29:34:WU01:FS00:0xa7: OS Version: 6.2
04:29:34:WU01:FS00:0xa7:Has Battery: true
04:29:34:WU01:FS00:0xa7: On Battery: false
04:29:34:WU01:FS00:0xa7: UTC Offset: -5
04:29:34:WU01:FS00:0xa7:        PID: 27384
04:29:34:WU01:FS00:0xa7:        CWD: C:\ProgramData\FAHClient\work
04:29:34:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
04:29:34:WU01:FS00:0xa7:    Version: 0.0.19
04:29:34:WU01:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
04:29:34:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
04:29:34:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
04:29:34:WU01:FS00:0xa7:       Date: Nov 25 2019
04:29:34:WU01:FS00:0xa7:       Time: 17:12:41
04:29:34:WU01:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
04:29:34:WU01:FS00:0xa7:     Branch: master
04:29:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:29:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:29:34:WU01:FS00:0xa7:   Platform: win32 10
04:29:34:WU01:FS00:0xa7:       Bits: 64
04:29:34:WU01:FS00:0xa7:       Mode: Release
04:29:34:WU01:FS00:0xa7:************************************ Build *************************************
04:29:34:WU01:FS00:0xa7:       SIMD: avx_256
04:29:34:WU01:FS00:0xa7:********************************************************************************
04:29:34:WU01:FS00:0xa7:Project: 17236 (Run 265, Clone 117, Gen 4)
04:29:34:WU01:FS00:0xa7:Unit: 0x0000000f80fccb02609dbb0c3f813525
04:29:34:WU01:FS00:0xa7:Reading tar file core.xml
04:29:34:WU01:FS00:0xa7:Reading tar file frame4.tpr
04:29:34:WU01:FS00:0xa7:Digital signatures verified
04:29:34:WU01:FS00:0xa7:Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10
04:29:34:WU01:FS00:0xa7:Steps: first=2000000 total=500000
04:29:34:WU01:FS00:0xa7:ERROR:
04:29:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:29:34:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
04:29:34:WU01:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\domdec.c, line: 6902
04:29:34:WU01:FS00:0xa7:ERROR:
04:29:34:WU01:FS00:0xa7:ERROR:Fatal error:
04:29:34:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.40508 nm
04:29:34:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
04:29:34:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
04:29:34:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:29:34:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:29:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:29:39:WU01:FS00:0xa7:WARNING:Unexpected exit
04:29:39:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
04:30:34:WU01:FS00:Starting
04:30:34:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7.exe -dir 01 -suffix 01 -version 706 -lifeline 11620 -checkpoint 30 -np 10
04:30:34:WU01:FS00:Started FahCore on PID 38176
04:30:34:WU01:FS00:Core PID:37540
04:30:34:WU01:FS00:FahCore 0xa7 started
04:30:34:WU01:FS00:0xa7:*********************** Log Started 2021-05-25T04:30:34Z ***********************
04:30:34:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
04:30:34:WU01:FS00:0xa7:       Type: 0xa7
04:30:34:WU01:FS00:0xa7:       Core: Gromacs
04:30:34:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 706 -lifeline 38176 -checkpoint 30 -np
04:30:34:WU01:FS00:0xa7:             10
04:30:34:WU01:FS00:0xa7:************************************ CBang *************************************
04:30:34:WU01:FS00:0xa7:       Date: Nov 27 2019
04:30:34:WU01:FS00:0xa7:       Time: 03:40:09
04:30:34:WU01:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
04:30:34:WU01:FS00:0xa7:     Branch: master
04:30:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:30:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:30:34:WU01:FS00:0xa7:   Platform: win32 10
04:30:34:WU01:FS00:0xa7:       Bits: 64
04:30:34:WU01:FS00:0xa7:       Mode: Release
04:30:34:WU01:FS00:0xa7:************************************ System ************************************
04:30:34:WU01:FS00:0xa7:        CPU: AMD Ryzen 5 3600X 6-Core Processor
04:30:34:WU01:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
04:30:34:WU01:FS00:0xa7:       CPUs: 12
04:30:34:WU01:FS00:0xa7:     Memory: 31.93GiB
04:30:34:WU01:FS00:0xa7:Free Memory: 17.85GiB
04:30:34:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
04:30:34:WU01:FS00:0xa7: OS Version: 6.2
04:30:34:WU01:FS00:0xa7:Has Battery: true
04:30:34:WU01:FS00:0xa7: On Battery: false
04:30:34:WU01:FS00:0xa7: UTC Offset: -5
04:30:34:WU01:FS00:0xa7:        PID: 37540
04:30:34:WU01:FS00:0xa7:        CWD: C:\ProgramData\FAHClient\work
04:30:34:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
04:30:34:WU01:FS00:0xa7:    Version: 0.0.19
04:30:34:WU01:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
04:30:34:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
04:30:34:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
04:30:34:WU01:FS00:0xa7:       Date: Nov 25 2019
04:30:34:WU01:FS00:0xa7:       Time: 17:12:41
04:30:34:WU01:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
04:30:34:WU01:FS00:0xa7:     Branch: master
04:30:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:30:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:30:34:WU01:FS00:0xa7:   Platform: win32 10
04:30:34:WU01:FS00:0xa7:       Bits: 64
04:30:34:WU01:FS00:0xa7:       Mode: Release
04:30:34:WU01:FS00:0xa7:************************************ Build *************************************
04:30:34:WU01:FS00:0xa7:       SIMD: avx_256
04:30:34:WU01:FS00:0xa7:********************************************************************************
04:30:34:WU01:FS00:0xa7:Project: 17236 (Run 265, Clone 117, Gen 4)
04:30:34:WU01:FS00:0xa7:Unit: 0x0000000f80fccb02609dbb0c3f813525
04:30:34:WU01:FS00:0xa7:Reading tar file core.xml
04:30:34:WU01:FS00:0xa7:Reading tar file frame4.tpr
04:30:34:WU01:FS00:0xa7:Digital signatures verified
04:30:34:WU01:FS00:0xa7:Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10
04:30:34:WU01:FS00:0xa7:Steps: first=2000000 total=500000
04:30:34:WU01:FS00:0xa7:ERROR:
04:30:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:30:34:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
04:30:34:WU01:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\domdec.c, line: 6902
04:30:34:WU01:FS00:0xa7:ERROR:
04:30:34:WU01:FS00:0xa7:ERROR:Fatal error:
04:30:34:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.40508 nm
04:30:34:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
04:30:34:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
04:30:34:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:30:34:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:30:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:30:39:WU01:FS00:0xa7:WARNING:Unexpected exit
04:30:39:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
04:31:34:WU01:FS00:Starting
04:31:34:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7.exe -dir 01 -suffix 01 -version 706 -lifeline 11620 -checkpoint 30 -np 10
04:31:34:WU01:FS00:Started FahCore on PID 27116
04:31:34:WU01:FS00:Core PID:4516
04:31:34:WU01:FS00:FahCore 0xa7 started
04:31:34:WU01:FS00:0xa7:*********************** Log Started 2021-05-25T04:31:34Z ***********************
04:31:34:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
04:31:34:WU01:FS00:0xa7:       Type: 0xa7
04:31:34:WU01:FS00:0xa7:       Core: Gromacs
04:31:34:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 706 -lifeline 27116 -checkpoint 30 -np
04:31:34:WU01:FS00:0xa7:             10
04:31:34:WU01:FS00:0xa7:************************************ CBang *************************************
04:31:34:WU01:FS00:0xa7:       Date: Nov 27 2019
04:31:34:WU01:FS00:0xa7:       Time: 03:40:09
04:31:34:WU01:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
04:31:34:WU01:FS00:0xa7:     Branch: master
04:31:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:31:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:31:34:WU01:FS00:0xa7:   Platform: win32 10
04:31:34:WU01:FS00:0xa7:       Bits: 64
04:31:34:WU01:FS00:0xa7:       Mode: Release
04:31:34:WU01:FS00:0xa7:************************************ System ************************************
04:31:34:WU01:FS00:0xa7:        CPU: AMD Ryzen 5 3600X 6-Core Processor
04:31:34:WU01:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
04:31:34:WU01:FS00:0xa7:       CPUs: 12
04:31:34:WU01:FS00:0xa7:     Memory: 31.93GiB
04:31:34:WU01:FS00:0xa7:Free Memory: 17.85GiB
04:31:34:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
04:31:34:WU01:FS00:0xa7: OS Version: 6.2
04:31:34:WU01:FS00:0xa7:Has Battery: true
04:31:34:WU01:FS00:0xa7: On Battery: false
04:31:34:WU01:FS00:0xa7: UTC Offset: -5
04:31:34:WU01:FS00:0xa7:        PID: 4516
04:31:34:WU01:FS00:0xa7:        CWD: C:\ProgramData\FAHClient\work
04:31:34:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
04:31:34:WU01:FS00:0xa7:    Version: 0.0.19
04:31:34:WU01:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
04:31:34:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
04:31:34:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
04:31:34:WU01:FS00:0xa7:       Date: Nov 25 2019
04:31:34:WU01:FS00:0xa7:       Time: 17:12:41
04:31:34:WU01:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
04:31:34:WU01:FS00:0xa7:     Branch: master
04:31:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:31:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:31:34:WU01:FS00:0xa7:   Platform: win32 10
04:31:34:WU01:FS00:0xa7:       Bits: 64
04:31:34:WU01:FS00:0xa7:       Mode: Release
04:31:34:WU01:FS00:0xa7:************************************ Build *************************************
04:31:34:WU01:FS00:0xa7:       SIMD: avx_256
04:31:34:WU01:FS00:0xa7:********************************************************************************
04:31:34:WU01:FS00:0xa7:Project: 17236 (Run 265, Clone 117, Gen 4)
04:31:34:WU01:FS00:0xa7:Unit: 0x0000000f80fccb02609dbb0c3f813525
04:31:34:WU01:FS00:0xa7:Reading tar file core.xml
04:31:34:WU01:FS00:0xa7:Reading tar file frame4.tpr
04:31:34:WU01:FS00:0xa7:Digital signatures verified
04:31:34:WU01:FS00:0xa7:Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10
04:31:34:WU01:FS00:0xa7:Steps: first=2000000 total=500000
04:31:34:WU01:FS00:0xa7:ERROR:
04:31:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:31:34:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
04:31:34:WU01:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\domdec.c, line: 6902
04:31:34:WU01:FS00:0xa7:ERROR:
04:31:34:WU01:FS00:0xa7:ERROR:Fatal error:
04:31:34:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.40508 nm
04:31:34:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
04:31:34:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
04:31:34:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:31:34:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:31:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:31:39:WU01:FS00:0xa7:WARNING:Unexpected exit
04:31:40:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
04:32:34:WU01:FS00:Starting
04:32:34:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7.exe -dir 01 -suffix 01 -version 706 -lifeline 11620 -checkpoint 30 -np 10
04:32:34:WU01:FS00:Started FahCore on PID 10320
04:32:34:WU01:FS00:Core PID:38496
04:32:34:WU01:FS00:FahCore 0xa7 started
04:32:34:WU01:FS00:0xa7:*********************** Log Started 2021-05-25T04:32:34Z ***********************
04:32:34:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
04:32:34:WU01:FS00:0xa7:       Type: 0xa7
04:32:34:WU01:FS00:0xa7:       Core: Gromacs
04:32:34:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 706 -lifeline 10320 -checkpoint 30 -np
04:32:34:WU01:FS00:0xa7:             10
04:32:34:WU01:FS00:0xa7:************************************ CBang *************************************
04:32:34:WU01:FS00:0xa7:       Date: Nov 27 2019
04:32:34:WU01:FS00:0xa7:       Time: 03:40:09
04:32:34:WU01:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
04:32:34:WU01:FS00:0xa7:     Branch: master
04:32:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:32:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:32:34:WU01:FS00:0xa7:   Platform: win32 10
04:32:34:WU01:FS00:0xa7:       Bits: 64
04:32:34:WU01:FS00:0xa7:       Mode: Release
04:32:34:WU01:FS00:0xa7:************************************ System ************************************
04:32:34:WU01:FS00:0xa7:        CPU: AMD Ryzen 5 3600X 6-Core Processor
04:32:34:WU01:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
04:32:34:WU01:FS00:0xa7:       CPUs: 12
04:32:34:WU01:FS00:0xa7:     Memory: 31.93GiB
04:32:34:WU01:FS00:0xa7:Free Memory: 17.87GiB
04:32:34:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
04:32:34:WU01:FS00:0xa7: OS Version: 6.2
04:32:34:WU01:FS00:0xa7:Has Battery: true
04:32:34:WU01:FS00:0xa7: On Battery: false
04:32:34:WU01:FS00:0xa7: UTC Offset: -5
04:32:34:WU01:FS00:0xa7:        PID: 38496
04:32:34:WU01:FS00:0xa7:        CWD: C:\ProgramData\FAHClient\work
04:32:34:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
04:32:34:WU01:FS00:0xa7:    Version: 0.0.19
04:32:34:WU01:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
04:32:34:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
04:32:34:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
04:32:34:WU01:FS00:0xa7:       Date: Nov 25 2019
04:32:34:WU01:FS00:0xa7:       Time: 17:12:41
04:32:34:WU01:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
04:32:34:WU01:FS00:0xa7:     Branch: master
04:32:34:WU01:FS00:0xa7:   Compiler: Visual C++ 2008
04:32:34:WU01:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
04:32:34:WU01:FS00:0xa7:   Platform: win32 10
04:32:34:WU01:FS00:0xa7:       Bits: 64
04:32:34:WU01:FS00:0xa7:       Mode: Release
04:32:34:WU01:FS00:0xa7:************************************ Build *************************************
04:32:34:WU01:FS00:0xa7:       SIMD: avx_256
04:32:34:WU01:FS00:0xa7:********************************************************************************
04:32:34:WU01:FS00:0xa7:Project: 17236 (Run 265, Clone 117, Gen 4)
04:32:34:WU01:FS00:0xa7:Unit: 0x0000000f80fccb02609dbb0c3f813525
04:32:34:WU01:FS00:0xa7:Reading tar file core.xml
04:32:34:WU01:FS00:0xa7:Reading tar file frame4.tpr
04:32:34:WU01:FS00:0xa7:Digital signatures verified
04:32:34:WU01:FS00:0xa7:Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10
04:32:34:WU01:FS00:0xa7:Steps: first=2000000 total=500000
04:32:34:WU01:FS00:0xa7:ERROR:
04:32:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:32:34:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
04:32:34:WU01:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\domdec.c, line: 6902
04:32:34:WU01:FS00:0xa7:ERROR:
04:32:34:WU01:FS00:0xa7:ERROR:Fatal error:
04:32:34:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.40508 nm
04:32:34:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
04:32:34:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
04:32:34:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:32:34:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:32:34:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:32:39:WU01:FS00:0xa7:WARNING:Unexpected exit
04:32:40:WARNING:WU01:FS00:FahCore returned: EARLY_UNIT_END (123 = 0x7b)
04:32:40:WARNING:WU01:FS00:Too many errors, failing
04:32:40:WU01:FS00:Sending unit results: id:01 state:SEND error:FAILED project:17236 run:265 clone:117 gen:4 core:0xa7 unit:0x0000000f80fccb02609dbb0c3f813525
04:32:40:WU01:FS00:Connecting to 128.252.203.2:8080
04:32:40:WU01:FS00:Server responded WORK_ACK (400)
04:32:40:WU01:FS00:Cleaning up 
1. Ryzen 5 3600X, GTX 1050 2GB, Windows 10 Pro
2. i5-4590, GTX 1060 3GB, Windows 10 Pro
3. i3-3470, Manjaro Linux
PaulTV
Posts: 211
Joined: Mon Jan 25, 2021 4:53 pm
Location: Netherlands

Re: Project 17236 - Early Unit End

Post by PaulTV »

It's a wild guess, maybe someone can confirm/deny. I think that you run 10 threads (see Calling: mdrun -s frame4.tpr -o frame4.trr -x frame4.xtc -cpt 30 -nt 10). I read somewhere that jobs from older projects sometimes can't really deal with a multiple of 5 cores (nope, didn't understand either). This project is still an a7 core project, and that definitely falls under 'older'. It may be related if you read the explanation of the error: http://www.gromacs.org/Documentation/Er ... ze_of_x_nm

So I'd suggest reducing the number of cores to 8 or 9, and see if the problem persists.
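
To make the error message a bit more concrete, here is a rough Python sketch of the constraint it describes: the box has to be cut into nx x ny x nz cells, one per rank, and no cell may be narrower than the minimum cell size. Only the 1.40508 nm figure comes from the log. The 6 nm cubic box is purely an assumption for illustration (the real box size isn't in the log), and the function name is made up. The real GROMACS check is more involved, so treat this as a simplification.

```python
from itertools import product

MIN_CELL_NM = 1.40508  # minimum cell size quoted in the error message


def feasible_grids(ranks, box_nm, min_cell_nm=MIN_CELL_NM):
    """Return every (nx, ny, nz) with nx*ny*nz == ranks whose cells are
    at least min_cell_nm wide along each box edge (simplified model)."""
    # Largest number of cells that fits along each box edge.
    limits = [int(edge // min_cell_nm) for edge in box_nm]
    grids = []
    for nx, ny in product(range(1, limits[0] + 1), range(1, limits[1] + 1)):
        if ranks % (nx * ny) == 0:
            nz = ranks // (nx * ny)
            if nz <= limits[2]:
                grids.append((nx, ny, nz))
    return grids


# Assumed box size; the actual value is not shown in the posted log.
HYPOTHETICAL_BOX_NM = (6.0, 6.0, 6.0)

for ranks in (8, 9, 10, 12):
    grids = feasible_grids(ranks, HYPOTHETICAL_BOX_NM)
    print(ranks, "ranks:", grids if grids else "no compatible grid")
```

With this assumed box, at most 4 cells fit along any edge, so 10 ranks would need a 5 or a 10 somewhere and nothing fits, while 8, 9 and 12 all have at least one valid grid. A different box size would change which counts work.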

Ryzen 9800X3D / RTX 4090 / Windows 11
Ryzen 5600X / RTX 3070 Ti / Ubuntu 22.04
Ryzen 5600 / RTX 3060 Ti / Windows 11
satcat16609
Posts: 8
Joined: Wed Dec 30, 2020 3:02 pm
Hardware configuration: i5-4590, GTX 1060 3GB, Windows 10 Pro
Ryzen 5 3600X, GTX 1050 2GB, Windows 10 Pro
i3-3470, Manjaro Linux

Re: Project 17236 - Early Unit End

Post by satcat16609 »

I've been using the CPU on Medium power, which sets the system to use ten cores, and this is the first project where I've had any issues. I had read last year when I started folding that there are certain numbers of cores that don't work well, but have always assumed the software would default to using a proper number of cores. Perhaps this particular project doesn't like ten cores. Thanks for the info.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project 17236 - Early Unit End

Post by bruce »

FahCore_a8 supposedly fixes this type of error, but it's still present in _a7 and older cores. Projects started with older cores will continue to use that core (including follow-on clones of the same study) until the study is finished.

Requesting "large prime" numbers of cores (including multiples of primes, such as 5) does sometimes fail, but it's a somewhat random failure. You may have completed some WUs from that project. Setting your slot to 9 cores or 12 cores will probably "solve" the problem for you. The project owner can also exclude assignments to clients that request 10 cores.
satcat16609
Posts: 8
Joined: Wed Dec 30, 2020 3:02 pm
Hardware configuration: i5-4590, GTX 1060 3GB, Windows 10 Pro
Ryzen 5 3600X, GTX 1050 2GB, Windows 10 Pro
i3-3470, Manjaro Linux

Re: Project 17236 - Early Unit End

Post by satcat16609 »

Well, I adjusted the cores to 9, and then got back-to-back WUs from Project 17236, and they both completed successfully. Thanks for the info.
DeeGee
Posts: 61
Joined: Thu Oct 02, 2008 1:15 pm
Hardware configuration: Asus Crosshair Hero VIII, AMD Ryzen 3950x, 2x8GB 3600MHz DDR4, Radeon VII, Win10
Asus Crosshair Hero VII, Amd Ryzen 3900x, 2x16GB 3200MHz DDR4, GeForce 980 TI, Kubuntu 19.10
Location: Finland

Re: Project 17236 - Early Unit End

Post by DeeGee »

Yeah, seems like that project didn't like my 5950x, which has been set to 28 threads instead of the default 31. I got almost no CPU WUs, and the few I received crashed...
JimboPalmer
Posts: 2522
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: Project 17236 - Early Unit End

Post by JimboPalmer »

28 is 7 by 2 by 2, and a7 'hates' large primes like 7; 27 would be 3 by 3 by 3, much smaller primes.
One of the improvements in a8 is coping better with primes.
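
If you want to check a thread count before setting it, the arithmetic above is easy to script. This is just a sketch of the factoring, not anything taken from the client. Where exactly a7 draws the line between a "small" and a "large" prime isn't stated in this thread, so the code only reports the largest prime factor and leaves the judgment to you.

```python
def prime_factors(n):
    """Return the prime factors of n (n >= 2) in ascending order, with repeats."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        # Whatever is left after trial division is itself prime.
        factors.append(n)
    return factors


# Thread counts mentioned in this thread.
for threads in (9, 10, 12, 27, 28):
    factors = prime_factors(threads)
    print(threads, "=", " x ".join(map(str, factors)),
          "| largest prime:", max(factors))
```

The counts that failed in this thread (10 and 28) have 5 and 7 as their largest prime factor, while the suggested replacements (9, 12, 27) only factor into 2s and 3s.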
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends