Fatal Error with WU
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 941
- Joined: Sun Dec 16, 2007 6:22 pm
- Hardware configuration: 7950x3D, 5950x, 5800x3D, 3900x
7900xtx, Radeon 7, 5700xt, 6900xt, RX 550 640SP - Location: London
- Contact:
Re: Fatal Error with WU
It has been observed recently that the sweet spot for the current crop of projects is 10 or 12 threads. Going beyond that still reduces TPF, but not by a big margin. Still, the way bonuses work, every second counts.
Maybe with AMD's push towards many-core architectures and Zen's popularity, we might see the comeback of more optimized, more parallel-capable GROMACS code.
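To put a number on "every second counts", here is a minimal sketch of the quick-return bonus as I understand the published formula: final = base * max(1, sqrt(k * deadline / elapsed)). The function name and the sample values for base points, k and deadline are made up for illustration; the real values are set per project.

```python
import math

def estimated_points(base_points, k, deadline_days, elapsed_days):
    """Estimate credit for one WU under the quick-return bonus formula.

    Assumption: final = base * max(1, sqrt(k * deadline / elapsed)).
    All inputs used below are hypothetical, not taken from a real project.
    """
    bonus = math.sqrt(k * deadline_days / elapsed_days)
    return base_points * max(1.0, bonus)

if __name__ == "__main__":
    # Same hypothetical WU returned after 1.0 day and after 0.9 days.
    for elapsed in (1.0, 0.9):
        print(elapsed, round(estimated_points(1000, 0.75, 4.0, elapsed)))
```

Because the bonus grows with the square root of 1/elapsed, a shorter TPF raises the credit per WU as well as the number of WUs finished per day.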
FAH Omega tester
Re: Fatal Error with WU
With four slots pinned to one socket each, system time is now below 5% when all slots are working. Before, it was ~20% using two slots spanning two sockets each. So it's definitely worth pinning each slot to one socket. Is there a way to do it within the client, or do I have to run four clients?
-
- Posts: 941
- Joined: Sun Dec 16, 2007 6:22 pm
- Hardware configuration: 7950x3D, 5950x, 5800x3D, 3900x
7900xtx, Radeon 7, 5700xt, 6900xt, RX 550 640SP - Location: London
- Contact:
Re: Fatal Error with WU
FAHClient can have as many slots as you want. I can't help with how to set them up in your environment, as I'm not familiar with it.
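For the pinning itself, outside of anything FAHClient offers, here is a rough Linux-only sketch using Python's os.sched_setaffinity. The helper names are hypothetical, and the assumption that each socket owns a contiguous block of logical CPU numbers is just that, an assumption; check the real layout with lscpu before relying on it.

```python
import os

def socket_cpu_sets(n_sockets, cpus_per_socket):
    """Build one CPU set per socket.

    Assumption: logical CPUs are numbered contiguously per socket,
    which is NOT guaranteed (SMT siblings are often numbered separately).
    """
    return [
        set(range(s * cpus_per_socket, (s + 1) * cpus_per_socket))
        for s in range(n_sockets)
    ]

def pin_process(pid, cpus):
    """Restrict an existing process to the given logical CPUs (Linux only)."""
    os.sched_setaffinity(pid, cpus)

if __name__ == "__main__":
    # Hypothetical machine: 4 sockets with 8 logical CPUs each.
    for socket_id, cpus in enumerate(socket_cpu_sets(4, 8)):
        print(socket_id, sorted(cpus))
```

On Linux a child process inherits the affinity mask of its parent, so pinning a client before it starts its cores should carry over to them. Tools such as taskset or numactl do the same job from the shell.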
FAH Omega tester
-
- Posts: 7
- Joined: Tue Jun 02, 2020 4:28 pm
- Hardware configuration: gnujaos Team 236734
GNU/Linux, AMD 3900X on X590, RX5700XT
Re: Fatal Error with WU
Had a bad WU that required me to drop my cpu core count:
Code:
16:07:50:WU01:FS01:FahCore 0xa7 started
16:07:50:WU01:FS01:0xa7:*********************** Log Started 2020-06-02T16:07:50Z ***********************
16:07:50:WU01:FS01:0xa7:************************** Gromacs Folding@home Core ***************************
16:07:50:WU01:FS01:0xa7: Type: 0xa7
16:07:50:WU01:FS01:0xa7: Core: Gromacs
16:07:50:WU01:FS01:0xa7: Args: -dir 01 -suffix 01 -version 706 -lifeline 24132 -checkpoint 15 -np
16:07:50:WU01:FS01:0xa7: 23
16:07:50:WU01:FS01:0xa7:************************************ CBang *************************************
16:07:50:WU01:FS01:0xa7: Date: Nov 5 2019
16:07:50:WU01:FS01:0xa7: Time: 06:06:57
16:07:50:WU01:FS01:0xa7: Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
16:07:50:WU01:FS01:0xa7: Branch: master
16:07:50:WU01:FS01:0xa7: Compiler: GNU 8.3.0
16:07:50:WU01:FS01:0xa7: Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
16:07:50:WU01:FS01:0xa7: Platform: linux2 4.19.0-5-amd64
16:07:50:WU01:FS01:0xa7: Bits: 64
16:07:50:WU01:FS01:0xa7: Mode: Release
16:07:50:WU01:FS01:0xa7:************************************ System ************************************
16:07:50:WU01:FS01:0xa7: CPU: AMD Ryzen 9 3900X 12-Core Processor
16:07:50:WU01:FS01:0xa7: CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
16:07:50:WU01:FS01:0xa7: CPUs: 24
16:07:50:WU01:FS01:0xa7: Memory: 31.37GiB
16:07:50:WU01:FS01:0xa7:Free Memory: 1.56GiB
16:07:50:WU01:FS01:0xa7: Threads: POSIX_THREADS
16:07:50:WU01:FS01:0xa7: OS Version: 5.7
16:07:50:WU01:FS01:0xa7:Has Battery: false
16:07:50:WU01:FS01:0xa7: On Battery: false
16:07:50:WU01:FS01:0xa7: UTC Offset: -4
16:07:50:WU01:FS01:0xa7: PID: 24136
16:07:50:WU01:FS01:0xa7: CWD: ...
16:07:50:WU01:FS01:0xa7:******************************** Build - libFAH ********************************
16:07:50:WU01:FS01:0xa7: Version: 0.0.18
16:07:50:WU01:FS01:0xa7: Author: Joseph Coffland <[email protected]>
16:07:50:WU01:FS01:0xa7: Copyright: 2019 foldingathome.org
16:07:50:WU01:FS01:0xa7: Homepage: https://foldingathome.org/
16:07:50:WU01:FS01:0xa7: Date: Nov 5 2019
16:07:50:WU01:FS01:0xa7: Time: 06:13:26
16:07:50:WU01:FS01:0xa7: Revision: 490c9aa2957b725af319379424d5c5cb36efb656
16:07:50:WU01:FS01:0xa7: Branch: master
16:07:50:WU01:FS01:0xa7: Compiler: GNU 8.3.0
16:07:50:WU01:FS01:0xa7: Options: -std=c++11 -O3 -funroll-loops -fno-pie
16:07:50:WU01:FS01:0xa7: Platform: linux2 4.19.0-5-amd64
16:07:50:WU01:FS01:0xa7: Bits: 64
16:07:50:WU01:FS01:0xa7: Mode: Release
16:07:50:WU01:FS01:0xa7:************************************ Build *************************************
16:07:50:WU01:FS01:0xa7: SIMD: avx_256
16:07:50:WU01:FS01:0xa7:********************************************************************************
16:07:50:WU01:FS01:0xa7:Project: 14246 (Run 0, Clone 8, Gen 258)
16:07:50:WU01:FS01:0xa7:Unit: 0x000001c580fccb0a5d6fe21c31576508
16:07:50:WU01:FS01:0xa7:Digital signatures verified
16:07:50:WU01:FS01:0xa7:Reducing thread count from 23 to 22 to avoid domain decomposition by a prime number > 3
16:07:50:WU01:FS01:0xa7:Reducing thread count from 22 to 21 to avoid domain decomposition with large prime factor 11
16:07:50:WU01:FS01:0xa7:Calling: mdrun -s frame258.tpr -o frame258.trr -x frame258.xtc -cpi state.cpt -cpt 15 -nt 21
16:07:50:WU01:FS01:0xa7:Steps: first=64500000 total=250000
16:07:50:WU01:FS01:0xa7:ERROR:
16:07:50:WU01:FS01:0xa7:ERROR:-------------------------------------------------------
16:07:50:WU01:FS01:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
16:07:50:WU01:FS01:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
16:07:50:WU01:FS01:0xa7:ERROR:
16:07:50:WU01:FS01:0xa7:ERROR:Fatal error:
16:07:50:WU01:FS01:0xa7:ERROR:There is no domain decomposition for 16 ranks that is compatible with the given box and a minimum cell size of 1.45733 nm
16:07:50:WU01:FS01:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
16:07:50:WU01:FS01:0xa7:ERROR:Look in the log file for details on the domain decomposition
16:07:50:WU01:FS01:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
16:07:50:WU01:FS01:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
16:07:50:WU01:FS01:0xa7:ERROR:-------------------------------------------------------
16:07:55:WU01:FS01:0xa7:WARNING:Unexpected exit() call
16:07:55:WU01:FS01:0xa7:WARNING:Unexpected exit from science code
U: gnujaos T: 236734
GNU/Linux, AMD 3900X on X509, RX5700XT
-
- Posts: 941
- Joined: Sun Dec 16, 2007 6:22 pm
- Hardware configuration: 7950x3D, 5950x, 5800x3D, 3900x
7900xtx, Radeon 7, 5700xt, 6900xt, RX 550 640SP - Location: London
- Contact:
Re: Fatal Error with WU
jaos wrote: Had a bad WU that required me to drop my cpu core count:
[log for Project 14246 (Run 0, Clone 8, Gen 258), as posted above]
Thanks for that, project owner has been informed.
Also, reduce your slot thread count to 21. No project will ever fold on all 23 threads you have free. Best would be one slot with CPU:21 and maybe a second CPU slot with CPU:2. I take it you have a GPU folding as well.
FAH Omega tester
Re: Fatal Error with WU
Microsoft's licensing policy makes you pay more for a license that supports >64 threads ... and since a lot of @home Donors run plain-vanilla Windows, a lot of people run into that limit. (I'm not familiar enough with the Docker image to comment on it yet.)
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 7
- Joined: Tue Jun 02, 2020 4:28 pm
- Hardware configuration: gnujaos Team 236734
GNU/Linux, AMD 3900X on X590, RX5700XT
Re: Fatal Error with WU
muziqaz wrote: Thanks for that, project owner has been informed.
Also, reduce your slot thread count to 21. There is no projects which will ever fold on 23 threads you have free. Best would be to have one slot with CPU:21 and maybe second CPU slot with CPU:2 set up. I take it you have a GPU folding as well.
Yes, I have a GPU as well. Is it better to split my cores up by 7s in different slots?
U: gnujaos T: 236734
GNU/Linux, AMD 3900X on X509, RX5700XT
-
- Posts: 2522
- Joined: Mon Feb 16, 2009 4:12 am
- Location: Greenwood MS USA
Re: Fatal Error with WU
jaos wrote: Yes, I have a GPU as well. Is it better to split my cores up by 7s in different slots?
No.
F@H has difficulty when the number of CPUs is a large prime or a multiple of one.
7 is always large, 5 is sometimes large, and 3 is never large. Try to choose a number that is a multiple of 2 and/or 3.
1, 2, 3, 4, 6, 8, 9, 12, 16, 18, 20, 21, 24, 27, 30, 32 are known good numbers of CPUs to choose. (_r2w_ben has advised me of more good numbers)
5, 10, 15, 20, 25, 28 may work most of the time.
Other numbers will bite you.
From a Science and Points perspective, the largest number is best, as it completes the WU quickest. So two slots of 21/2 make more Points/Science than three slots of 9/8/6, even though both use all 23 threads available.
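The prime-factor rule can be sketched in a few lines. This is only an illustration pieced together from the two "Reducing thread count" messages in the log above, not the FahCore's actual code; the function names and the max_factor threshold of 7 are assumptions.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (n >= 2) by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    largest = 1
    f = 2
    while f * f <= n:
        while n % f == 0:
            largest = f
            n //= f
        f += 1
    if n > 1:  # whatever is left over is itself prime
        largest = n
    return largest

def usable_threads(requested, max_factor=7):
    """Step the thread count down until it looks safe to decompose.

    Assumed rules, guessed from the log messages:
      - a count that is itself a prime > 3 is reduced
      - a count whose largest prime factor exceeds max_factor is reduced
    """
    n = requested
    while n > 1:
        lpf = largest_prime_factor(n)
        if (lpf == n and n > 3) or lpf > max_factor:
            n -= 1
        else:
            break
    return n

if __name__ == "__main__":
    for count in (23, 16, 12, 7):
        print(count, "->", usable_threads(count))
```

With these assumed rules 23 steps down through 22 to 21, as in the log. Passing max_factor=3 gives a stricter check that keeps only counts built from 2s and 3s.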
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
Re: Fatal Error with WU
No. Don't use 7. But the idea is often sound.
Many people split up their cores by some smaller numbers containing only small factors ... like 24 or 16 or 12.
FAH will always reduce 7 to 6, leaving one thread unused. If you choose 64 and it works, fine ... until the next project is assigned and it's more challenging to find a workable number, leaving too much of the machine idle.
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 941
- Joined: Sun Dec 16, 2007 6:22 pm
- Hardware configuration: 7950x3D, 5950x, 5800x3D, 3900x
7900xtx, Radeon 7, 5700xt, 6900xt, RX 550 640SP - Location: London
- Contact:
Re: Fatal Error with WU
12 and 9 might be the safest, with less chance of running into scaling issues.
FAH Omega tester
-
- Posts: 7
- Joined: Tue Jun 02, 2020 4:28 pm
- Hardware configuration: gnujaos Team 236734
GNU/Linux, AMD 3900X on X590, RX5700XT
Re: Fatal Error with WU
muziqaz wrote: 12 and 9 might be safest and less chance to get into scaling issues
Thanks all for the information and suggestions!
U: gnujaos T: 236734
GNU/Linux, AMD 3900X on X509, RX5700XT
-
- Site Moderator
- Posts: 6986
- Joined: Wed Dec 23, 2009 9:33 am
- Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB
Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400 - Location: Land Of The Long White Cloud
- Contact:
Re: Fatal Error with WU
Not sure if you are aware or not but there's an official Docker image from F@H: https://github.com/FoldingAtHome/containers
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time
Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
-
- Posts: 941
- Joined: Sun Dec 16, 2007 6:22 pm
- Hardware configuration: 7950x3D, 5950x, 5800x3D, 3900x
7900xtx, Radeon 7, 5700xt, 6900xt, RX 550 640SP - Location: London
- Contact:
Re: Fatal Error with WU
PantherX wrote: Not sure if you are aware or not but there's an official Docker image from F@H: https://github.com/FoldingAtHome/containers
Let's not muddy the waters further with extra layers of complexity. There is absolutely no need to use a middleman to adjust simple slot settings.
Last edited by muziqaz on Fri Jun 12, 2020 8:19 pm, edited 1 time in total.
FAH Omega tester
-
- Posts: 7
- Joined: Tue Jun 02, 2020 4:28 pm
- Hardware configuration: gnujaos Team 236734
GNU/Linux, AMD 3900X on X590, RX5700XT
Re: Fatal Error with WU
Another bad WU
Code:
13:47:55:WU01:FS01:0xa7:*********************** Log Started 2020-06-12T13:47:54Z ***********************
13:47:55:WU01:FS01:0xa7:************************** Gromacs Folding@home Core ***************************
13:47:55:WU01:FS01:0xa7: Type: 0xa7
13:47:55:WU01:FS01:0xa7: Core: Gromacs
13:47:55:WU01:FS01:0xa7: Args: -dir 01 -suffix 01 -version 706 -lifeline 21173 -checkpoint 15 -np
13:47:55:WU01:FS01:0xa7: 21
13:47:55:WU01:FS01:0xa7:************************************ CBang *************************************
13:47:55:WU01:FS01:0xa7: Date: Nov 5 2019
13:47:55:WU01:FS01:0xa7: Time: 06:06:57
13:47:55:WU01:FS01:0xa7: Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
13:47:55:WU01:FS01:0xa7: Branch: master
13:47:55:WU01:FS01:0xa7: Compiler: GNU 8.3.0
13:47:55:WU01:FS01:0xa7: Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
13:47:55:WU01:FS01:0xa7: Platform: linux2 4.19.0-5-amd64
13:47:55:WU01:FS01:0xa7: Bits: 64
13:47:55:WU01:FS01:0xa7: Mode: Release
13:47:55:WU01:FS01:0xa7:************************************ System ************************************
13:47:55:WU01:FS01:0xa7: CPU: AMD Ryzen 9 3900X 12-Core Processor
13:47:55:WU01:FS01:0xa7: CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
13:47:55:WU01:FS01:0xa7: CPUs: 24
13:47:55:WU01:FS01:0xa7: Memory: 31.37GiB
13:47:55:WU01:FS01:0xa7:Free Memory: 8.25GiB
13:47:55:WU01:FS01:0xa7: Threads: POSIX_THREADS
13:47:55:WU01:FS01:0xa7: OS Version: 5.7
13:47:55:WU01:FS01:0xa7:Has Battery: false
13:47:55:WU01:FS01:0xa7: On Battery: false
13:47:55:WU01:FS01:0xa7: UTC Offset: -4
13:47:55:WU01:FS01:0xa7: PID: 21177
13:47:55:WU01:FS01:0xa7: CWD: /home/jason/projects/fah/work
13:47:55:WU01:FS01:0xa7:******************************** Build - libFAH ********************************
13:47:55:WU01:FS01:0xa7: Version: 0.0.18
13:47:55:WU01:FS01:0xa7: Author: Joseph Coffland <[email protected]>
13:47:55:WU01:FS01:0xa7: Copyright: 2019 foldingathome.org
13:47:55:WU01:FS01:0xa7: Homepage: https://foldingathome.org/
13:47:55:WU01:FS01:0xa7: Date: Nov 5 2019
13:47:55:WU01:FS01:0xa7: Time: 06:13:26
13:47:55:WU01:FS01:0xa7: Revision: 490c9aa2957b725af319379424d5c5cb36efb656
13:47:55:WU01:FS01:0xa7: Branch: master
13:47:55:WU01:FS01:0xa7: Compiler: GNU 8.3.0
13:47:55:WU01:FS01:0xa7: Options: -std=c++11 -O3 -funroll-loops -fno-pie
13:47:55:WU01:FS01:0xa7: Platform: linux2 4.19.0-5-amd64
13:47:55:WU01:FS01:0xa7: Bits: 64
13:47:55:WU01:FS01:0xa7: Mode: Release
13:47:55:WU01:FS01:0xa7:************************************ Build *************************************
13:47:55:WU01:FS01:0xa7: SIMD: avx_256
13:47:55:WU01:FS01:0xa7:********************************************************************************
13:47:55:WU01:FS01:0xa7:Project: 14524 (Run 482, Clone 2, Gen 33)
13:47:55:WU01:FS01:0xa7:Unit: 0x0000003580fccb0a5e459ba4615a0d2d
13:47:55:WU01:FS01:0xa7:Reading tar file core.xml
13:47:55:WU01:FS01:0xa7:Reading tar file frame33.tpr
13:47:55:WU01:FS01:0xa7:Digital signatures verified
13:47:55:WU01:FS01:0xa7:Calling: mdrun -s frame33.tpr -o frame33.trr -x frame33.xtc -cpt 15 -nt 21
13:47:55:WU01:FS01:0xa7:Steps: first=8250000 total=250000
13:47:55:WU01:FS01:0xa7:ERROR:
13:47:55:WU01:FS01:0xa7:ERROR:-------------------------------------------------------
13:47:55:WU01:FS01:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
13:47:55:WU01:FS01:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
13:47:55:WU01:FS01:0xa7:ERROR:
13:47:55:WU01:FS01:0xa7:ERROR:Fatal error:
13:47:55:WU01:FS01:0xa7:ERROR:There is no domain decomposition for 16 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
13:47:55:WU01:FS01:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
13:47:55:WU01:FS01:0xa7:ERROR:Look in the log file for details on the domain decomposition
13:47:55:WU01:FS01:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
13:47:55:WU01:FS01:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
U: gnujaos T: 236734
GNU/Linux, AMD 3900X on X509, RX5700XT
Re: Fatal Error with WU
I would reconfigure the CPU slot(s) with FAHClient. The software tried to avoid this problem by reducing the thread count several times, but 21 still didn't work. Start with 16 threads and maybe add another slot with 8. The slider isn't well designed for wider CPUs. (Help is on the way, with the potential for a new FAHCore "soon".)
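For anyone wondering what the "no domain decomposition for 16 ranks" error in the logs above means, here is a simplified sketch. It assumes a rectangular box split into an nx * ny * nz grid where every cell must be at least the minimum cell size wide. The box dimensions are made up, since the logs don't show them, and real GROMACS takes more into account (triclinic boxes, -dds, -rcon), so treat this as an illustration only.

```python
def valid_grids(ranks, box_nm, min_cell_nm):
    """List every (nx, ny, nz) with nx * ny * nz == ranks that fits the box.

    A grid fits when each box side, divided by the number of cells along
    it, is still at least min_cell_nm. box_nm is a hypothetical (x, y, z).
    """
    # Most cells that fit along each axis without going below the minimum.
    max_cells = [int(side // min_cell_nm) for side in box_nm]
    grids = []
    for nx in range(1, max_cells[0] + 1):
        for ny in range(1, max_cells[1] + 1):
            if ranks % (nx * ny) != 0:
                continue
            nz = ranks // (nx * ny)
            if nz <= max_cells[2]:
                grids.append((nx, ny, nz))
    return grids

if __name__ == "__main__":
    min_cell = 1.45733           # from the error message in the first log
    small_box = (4.0, 4.0, 4.0)  # hypothetical box, in nm
    for ranks in (16, 8):
        print(ranks, valid_grids(ranks, small_box, min_cell))
```

In this made-up 4 nm box only two cells fit along each side, so at most 2 * 2 * 2 = 8 ranks can be placed and 16 has no solution. That is the same kind of failure as in the logs, and it is why lowering the thread count of the slot gets the WU going.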
Posting FAH's log:
How to provide enough info to get helpful support.