Performance anomaly for 19227

Post by arisu

I woke up to find that Project 19227 (Run 5659, Clone 7, Gen 2) was running at under 0.07 ns/day, with CPU use negligible. Unfortunately I restarted it before trying to diagnose the issue. Restarting the WU brought it back to its expected performance of nearly 3 ns/day.

There were no logged errors or warnings. Non-FAH system load was nominal. Kernel scheduler settings were typical for my system (all cores running nice 19 with SCHED_BATCH). The WU was not stalling; it was just running abnormally slowly, advancing only about 100 steps between checkpoints, which were written every 4 to 5 minutes:

Writing checkpoint, step 1000200 at Sat Mar 29 05:59:40 2025
Writing checkpoint, step 1000300 at Sat Mar 29 06:03:56 2025
Writing checkpoint, step 1000400 at Sat Mar 29 06:08:12 2025
Writing checkpoint, step 1000500 at Sat Mar 29 06:12:29 2025
Writing checkpoint, step 1000600 at Sat Mar 29 06:16:45 2025
Writing checkpoint, step 1000800 at Sat Mar 29 06:25:17 2025
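
The checkpoint lines above are enough to estimate the throughput directly. The following is a minimal Python sketch, assuming the time step of 0.002 ps (dt = 0.002) reported in md.log. The names `parse_checkpoints` and `ns_per_day` are hypothetical helpers made up for illustration and are not part of GROMACS or the FAH client:

```python
import re
from datetime import datetime

# Matches lines such as:
#   Writing checkpoint, step 1000200 at Sat Mar 29 05:59:40 2025
CHECKPOINT_RE = re.compile(r"Writing checkpoint, step (\d+) at (.+)$")
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"  # assumes English day/month names


def parse_checkpoints(log_text):
    """Return a list of (step, datetime) pairs, one per checkpoint line."""
    points = []
    for line in log_text.splitlines():
        match = CHECKPOINT_RE.search(line.strip())
        if match:
            step = int(match.group(1))
            when = datetime.strptime(match.group(2), TIME_FORMAT)
            points.append((step, when))
    return points


def ns_per_day(points, dt_ps=0.002):
    """Average simulated ns per wall-clock day between first and last checkpoint.

    dt_ps is the time step in picoseconds (assumed from 'dt = 0.002' in md.log).
    """
    (first_step, first_time), (last_step, last_time) = points[0], points[-1]
    wall_seconds = (last_time - first_time).total_seconds()
    simulated_ns = (last_step - first_step) * dt_ps / 1000.0
    return simulated_ns * 86400.0 / wall_seconds


SAMPLE = """\
Writing checkpoint, step 1000200 at Sat Mar 29 05:59:40 2025
Writing checkpoint, step 1000300 at Sat Mar 29 06:03:56 2025
Writing checkpoint, step 1000400 at Sat Mar 29 06:08:12 2025
Writing checkpoint, step 1000500 at Sat Mar 29 06:12:29 2025
Writing checkpoint, step 1000600 at Sat Mar 29 06:16:45 2025
Writing checkpoint, step 1000800 at Sat Mar 29 06:25:17 2025
"""

points = parse_checkpoints(SAMPLE)
rate = ns_per_day(points)
print(f"{len(points)} checkpoints, about {rate:.3f} ns/day")
```

For these six lines the estimate comes out at roughly 0.067 ns/day (600 steps, or 1.2 ps, in 1537 seconds), which agrees with the Performance figure GROMACS prints in md.log for the slow run.
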

Here is the current md.log file. (logfile_01.txt contains nothing useful, since the WU didn't even reach the first 1% before I restarted it.) The log shows the WU starting and running extremely slowly, despite reporting that it was using all 7 threads. It then shows me interrupting and restarting the WU, after which performance went back to normal:

         :-) GROMACS - GROMACS, 2020.5-dev-20210116-ddc6077-unknown (-:

                            GROMACS is written by:
     Emile Apol      Rossen Apostolov      Paul Bauer     Herman J.C. Berendsen
    Par Bjelkmar      Christian Blau   Viacheslav Bolnykh     Kevin Boyd    
 Aldert van Buuren   Rudi van Drunen     Anton Feenstra       Alan Gray     
  Gerrit Groenhof     Anca Hamuraru    Vincent Hindriksen  M. Eric Irrgang  
  Aleksei Iupinov   Christoph Junghans     Joe Jordan     Dimitrios Karkoulis
    Peter Kasson        Jiri Kraus      Carsten Kutzner      Per Larsson    
  Justin A. Lemkul    Viveca Lindahl    Magnus Lundborg     Erik Marklund   
    Pascal Merz     Pieter Meulenhoff    Teemu Murtola       Szilard Pall   
    Sander Pronk      Roland Schulz      Michael Shirts    Alexey Shvetsov  
   Alfons Sijbers     Peter Tieleman      Jon Vincent      Teemu Virolainen 
 Christian Wennberg    Maarten Wolf      Artem Zhmurov   
                           and the project leaders:
        Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel

Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2019, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.

GROMACS:      GROMACS, version 2020.5-dev-20210116-ddc6077-unknown
Working dir:  /var/lib/fah-client/work/K1FPyoybdRldicNw6V7jsSBZoG7kU9WZE8a-7WHeCLg/01
Process ID:   1171779

GROMACS version:    2020.5-dev-20210116-ddc6077-unknown
GIT SHA1 hash:      ddc6077cfc91185b44cf253801548ae5d6c5e673
Branched from:      unknown
Precision:          single
Memory model:       64 bit
MPI library:        thread_mpi
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support:        disabled
SIMD instructions:  AVX2_256
FFT library:        fftw-3.3.8-sse2-avx
RDTSCP usage:       disabled
TNG support:        enabled
Hwloc support:      disabled
Tracing support:    disabled
C compiler:         /usr/bin/x86_64-linux-gnu-gcc GNU 8.3.0
C compiler flags:   -mavx2 -mfma -Wall -Wno-unused -Wunused-value -Wunused-parameter -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wundef -Werror=stringop-truncation -fexcess-precision=fast -funroll-all-loops -Wno-array-bounds -O3 -DNDEBUG
C++ compiler:       /usr/bin/x86_64-linux-gnu-g++ GNU 8.3.0
C++ compiler flags: -mavx2 -mfma -Wall -Wextra -Wno-missing-field-initializers -Wpointer-arith -Wmissing-declarations -Wundef -Wstringop-truncation -fexcess-precision=fast -funroll-all-loops -Wno-array-bounds -fopenmp -O3 -DNDEBUG

Note: 16 CPUs configured, but only 8 were detected to be online.
      X86 Hyperthreading is likely disabled; enable it for better performance.

Running on 1 node with total 8 cores, 8 logical cores
Hardware detected:
  CPU info:
    Vendor: AMD
    Brand:  AMD Ryzen 7 7840U w/ Radeon 780M Graphics      
    Family: 25   Model: 116   Stepping: 1
    Features: aes amd apic avx avx2 avx512f avx512cd avx512bw avx512vl clfsh cmov cx8 cx16 f16c fma htt lahf misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdrnd rdtscp sha sse2 sse3 sse4a sse4.1 sse4.2 ssse3 x2apic
    Number of AVX-512 FMA units: 1 (AVX2 is faster w/o 2 AVX-512 FMA units)
  Hardware topology: Basic
    Sockets, cores, and logical processors:
      Socket  0: [   0] [   1] [   2] [   3] [   4] [   5] [   6] [   7]


The current CPU can measure timings more accurately than the code in
GROMACS was configured to use. This might affect your simulation
speed as accurate timings are needed for load-balancing.
Please consider rebuilding GROMACS with the GMX_USE_RDTSCP=ON CMake option.

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E.
Lindahl
GROMACS: High performance molecular simulations through multi-level
parallelism from laptops to supercomputers
SoftwareX 1 (2015) pp. 19-25
-------- -------- --- Thank You --- -------- --------


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Páll, M. J. Abraham, C. Kutzner, B. Hess, E. Lindahl
Tackling Exascale Software Challenges in Molecular Dynamics Simulations with
GROMACS
In S. Markidis & E. Laure (Eds.), Solving Software Challenges for Exascale 8759 (2015) pp. 3-27
-------- -------- --- Thank You --- -------- --------


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Pronk, S. Páll, R. Schulz, P. Larsson, P. Bjelkmar, R. Apostolov, M. R.
Shirts, J. C. Smith, P. M. Kasson, D. van der Spoel, B. Hess, and E. Lindahl
GROMACS 4.5: a high-throughput and highly parallel open source molecular
simulation toolkit
Bioinformatics 29 (2013) pp. 845-54
-------- -------- --- Thank You --- -------- --------


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 435-447
-------- -------- --- Thank You --- -------- --------


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C.
Berendsen
GROMACS: Fast, Flexible and Free
J. Comp. Chem. 26 (2005) pp. 1701-1719
-------- -------- --- Thank You --- -------- --------


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
E. Lindahl and B. Hess and D. van der Spoel
GROMACS 3.0: A package for molecular simulation and trajectory analysis
J. Mol. Mod. 7 (2001) pp. 306-317
-------- -------- --- Thank You --- -------- --------


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, D. van der Spoel and R. van Drunen
GROMACS: A message-passing parallel molecular dynamics implementation
Comp. Phys. Comm. 91 (1995) pp. 43-56
-------- -------- --- Thank You --- -------- --------


++++ PLEASE CITE THE DOI FOR THIS VERSION OF GROMACS ++++
https://doi.org/10.5281/zenodo.4420785
-------- -------- --- Thank You --- -------- --------


This run will default to '-update gpu' as requested by the GMX_FORCE_UPDATE_DEFAULT_GPU environment variable. GPU update with domain decomposition lacks substantial testing and should be used with caution.
Input Parameters:
   integrator                     = sd
   tinit                          = 0
   dt                             = 0.002
   nsteps                         = 500000
   init-step                      = 1000000
   simulation-part                = 1
   comm-mode                      = Linear
   nstcomm                        = 100
   bd-fric                        = 0
   ld-seed                        = 1074724025
   emtol                          = 10
   emstep                         = 0.01
   niter                          = 20
   fcstep                         = 0
   nstcgsteep                     = 1000
   nbfgscorr                      = 10
   rtpi                           = 0.05
   nstxout                        = 5000
   nstvout                        = 5000
   nstfout                        = 0
   nstlog                         = 5000
   nstcalcenergy                  = 100
   nstenergy                      = 500
   nstxout-compressed             = 0
   compressed-x-precision         = 1000
   cutoff-scheme                  = Verlet
   nstlist                        = 10
   pbc                            = xyz
   periodic-molecules             = false
   verlet-buffer-tolerance        = 0.005
   rlist                          = 1
   coulombtype                    = PME
   coulomb-modifier               = Potential-shift
   rcoulomb-switch                = 0
   rcoulomb                       = 1
   epsilon-r                      = 1
   epsilon-rf                     = inf
   vdw-type                       = PME
   vdw-modifier                   = Potential-shift
   rvdw-switch                    = 0
   rvdw                           = 1
   DispCorr                       = EnerPres
   table-extension                = 1
   fourierspacing                 = 0.1
   fourier-nx                     = 112
   fourier-ny                     = 112
   fourier-nz                     = 112
   pme-order                      = 6
   ewald-rtol                     = 1e-06
   ewald-rtol-lj                  = 0.001
   lj-pme-comb-rule               = Geometric
   ewald-geometry                 = 0
   epsilon-surface                = 0
   tcoupl                         = No
   nsttcouple                     = -1
   nh-chain-length                = 0
   print-nose-hoover-chain-variables = false
   pcoupl                         = Parrinello-Rahman
   pcoupltype                     = Isotropic
   nstpcouple                     = 10
   tau-p                          = 0.5
   compressibility (3x3):
      compressibility[    0]={ 4.50000e-05,  0.00000e+00,  0.00000e+00}
      compressibility[    1]={ 0.00000e+00,  4.50000e-05,  0.00000e+00}
      compressibility[    2]={ 0.00000e+00,  0.00000e+00,  4.50000e-05}
   ref-p (3x3):
      ref-p[    0]={ 1.00000e+00,  0.00000e+00,  0.00000e+00}
      ref-p[    1]={ 0.00000e+00,  1.00000e+00,  0.00000e+00}
      ref-p[    2]={ 0.00000e+00,  0.00000e+00,  1.00000e+00}
   refcoord-scaling               = All
   posres-com (3):
      posres-com[0]= 0.00000e+00
      posres-com[1]= 0.00000e+00
      posres-com[2]= 0.00000e+00
   posres-comB (3):
      posres-comB[0]= 0.00000e+00
      posres-comB[1]= 0.00000e+00
      posres-comB[2]= 0.00000e+00
   QMMM                           = false
   QMconstraints                  = 0
   QMMMscheme                     = 0
   MMChargeScaleFactor            = 1
qm-opts:
   ngQM                           = 0
   constraint-algorithm           = Lincs
   continuation                   = false
   Shake-SOR                      = false
   shake-tol                      = 0.0001
   lincs-order                    = 4
   lincs-iter                     = 1
   lincs-warnangle                = 30
   nwall                          = 0
   wall-type                      = 9-3
   wall-r-linpot                  = -1
   wall-atomtype[0]               = -1
   wall-atomtype[1]               = -1
   wall-density[0]                = 0
   wall-density[1]                = 0
   wall-ewald-zfac                = 3
   pull                           = false
   awh                            = false
   rotation                       = false
   interactiveMD                  = false
   disre                          = No
   disre-weighting                = Conservative
   disre-mixed                    = false
   dr-fc                          = 1000
   dr-tau                         = 0
   nstdisreout                    = 100
   orire-fc                       = 0
   orire-tau                      = 0
   nstorireout                    = 100
   free-energy                    = yes
   init-lambda                    = -1
   init-lambda-state              = 7
   delta-lambda                   = 0
   nstdhdl                        = 500
   n-lambdas                      = 21
   separate-dvdl:
       fep-lambdas =   TRUE
      mass-lambdas =   FALSE
      coul-lambdas =   FALSE
       vdw-lambdas =   FALSE
    bonded-lambdas =   FALSE
 restraint-lambdas =   FALSE
temperature-lambdas =   FALSE
all-lambdas:
       fep-lambdas =            0        0.05         0.1        0.15         0.2        0.25         0.3        0.35         0.4        0.45         0.5        0.55         0.6        0.65         0.7        0.75         0.8        0.85         0.9        0.95           1
      mass-lambdas =            0        0.05         0.1        0.15         0.2        0.25         0.3        0.35         0.4        0.45         0.5        0.55         0.6        0.65         0.7        0.75         0.8        0.85         0.9        0.95           1
      coul-lambdas =            0        0.05         0.1        0.15         0.2        0.25         0.3        0.35         0.4        0.45         0.5        0.55         0.6        0.65         0.7        0.75         0.8        0.85         0.9        0.95           1
       vdw-lambdas =            0        0.05         0.1        0.15         0.2        0.25         0.3        0.35         0.4        0.45         0.5        0.55         0.6        0.65         0.7        0.75         0.8        0.85         0.9        0.95           1
    bonded-lambdas =            0        0.05         0.1        0.15         0.2        0.25         0.3        0.35         0.4        0.45         0.5        0.55         0.6        0.65         0.7        0.75         0.8        0.85         0.9        0.95           1
 restraint-lambdas =            0        0.05         0.1        0.15         0.2        0.25         0.3        0.35         0.4        0.45         0.5        0.55         0.6        0.65         0.7        0.75         0.8        0.85         0.9        0.95           1
temperature-lambdas =            0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
   calc-lambda-neighbors          = -1
   dhdl-print-energy              = potential
   sc-alpha                       = 0.5
   sc-power                       = 1
   sc-r-power                     = 6
   sc-sigma                       = 0.3
   sc-sigma-min                   = 0.3
   sc-coul                        = true
   dh-hist-size                   = 0
   dh-hist-spacing                = 0.1
   separate-dhdl-file             = yes
   dhdl-derivatives               = yes
   cos-acceleration               = 0
   deform (3x3):
      deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
   simulated-tempering            = false
   swapcoords                     = no
   userint1                       = 0
   userint2                       = 0
   userint3                       = 0
   userint4                       = 0
   userreal1                      = 0
   userreal2                      = 0
   userreal3                      = 0
   userreal4                      = 0
   applied-forces:
     electric-field:
       x:
         E0                       = 0
         omega                    = 0
         t0                       = 0
         sigma                    = 0
       y:
         E0                       = 0
         omega                    = 0
         t0                       = 0
         sigma                    = 0
       z:
         E0                       = 0
         omega                    = 0
         t0                       = 0
         sigma                    = 0
     density-guided-simulation:
       active                     = false
       group                      = protein
       similarity-measure         = inner-product
       atom-spreading-weight      = unity
       force-constant             = 1e+09
       gaussian-transform-spreading-width = 0.2
       gaussian-transform-spreading-range-in-multiples-of-width = 4
       reference-density-filename = reference.mrc
       nst                        = 1
       normalize-densities        = true
       adaptive-force-scaling     = false
       adaptive-force-scaling-time-constant = 4
grpopts:
   nrdf:      194266
   ref-t:         300
   tau-t:           1
annealing:          No
annealing-npoints:           0
   acc:	           0           0           0
   nfreeze:           N           N           N
   energygrp-flags[  0]: 0

Changing nstlist from 10 to 100, rlist from 1 to 1.005


Update task on the GPU was required, by the GMX_FORCE_UPDATE_DEFAULT_GPU environment variable, but the following condition(s) were not satisfied:

Either PME or short-ranged non-bonded interaction tasks must run on the GPU.
Compatible GPUs must have been found.
Only a CUDA build is supported.
Only the md integrator is supported.
Free energy perturbations are not supported.
The number of coupled constraints is higher than supported in the CUDA LINCS code.

Will use CPU version of update.

Using 1 MPI thread

Non-default thread affinity set, disabling internal thread affinity

Using 7 OpenMP threads 

System total charge, top. A: 0.003 top. B: 0.003
Will do PME sum in reciprocal space for LJ dispersion interactions.

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen 
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------

Using a Gaussian width (1/beta) of 0.298423 nm for LJ Ewald
Will do PME sum in reciprocal space for electrostatic interactions.

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen 
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------

Using a Gaussian width (1/beta) of 0.289108 nm for Ewald
Potential shift: LJ r^-12: -1.000e+00 r^-6: -1.000e-03, Ewald -1.000e-06
Initialized non-bonded Ewald tables, spacing: 8.87e-04 size: 1129

Using shifted Lennard-Jones, switch between 0 and 1 nm
Generated table with 1002 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1002 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1002 data points for 1-4 LJ12.
Tabscale = 500 points/nm

Using SIMD 4x8 nonbonded short-range kernels

Using a 4x8 pair-list setup:
  updated every 100 steps, buffer 0.005 nm, rlist 1.005 nm
At tolerance 0.005 kJ/mol/ps per atom, equivalent classical 1x1 list would be:
  updated every 100 steps, buffer 0.182 nm, rlist 1.182 nm

Long Range LJ corr.: <C6> 1.1539e-06

There are 30 atoms and 30 charges for free energy perturbation
Removing pbc first time

Initializing LINear Constraint Solver

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and H. Bekker and H. J. C. Berendsen and J. G. E. M. Fraaije
LINCS: A Linear Constraint Solver for molecular simulations
J. Comp. Chem. 18 (1997) pp. 1463-1472
-------- -------- --- Thank You --- -------- --------

The number of constraints is 8441

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
-------- -------- --- Thank You --- -------- --------

Setting the maximum number of constraint warnings to 2147483647
Initial vector of lambda components:[     0.3500     0.3500     0.3500     0.3500     0.3500     0.3500     0.0000 ]

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
N. Goga and A. J. Rzepiela and A. H. de Vries and S. J. Marrink and H. J. C.
Berendsen
Efficient Algorithms for Langevin and DPD Dynamics
J. Chem. Theory Comput. 8 (2012) pp. 3637--3649
-------- -------- --- Thank You --- -------- --------

There are: 97101 Atoms

Constraining the starting coordinates (step 1000000)

Constraining the coordinates at t0-dt (step 1000000)
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
  0:  rest
RMS relative constraint deviation after constraining: 1.18e-05
Initial temperature: 300.691 K

Started mdrun on rank 0 Sat Mar 29 05:51:03 2025

           Step           Time
        1000000     2000.00000

   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    1.76447e+04    2.13414e+04    1.09885e+03    9.23870e+03    7.56705e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      LJ recip.
    2.72786e+05   -4.66277e+01   -1.62509e+06    1.40961e+04   -1.13945e+05
 Position Rest.      Potential    Kinetic En.   Total Energy    Temperature
    5.84771e+02   -1.32662e+06    2.42798e+05   -1.08382e+06    3.00638e+02
 Pres. DC (bar) Pressure (bar)    dVremain/dl   Constr. rmsd
   -7.93814e-01    1.33210e+01   -2.64108e+01    1.24773e-05

Writing checkpoint, step 1000200 at Sat Mar 29 05:59:40 2025


Writing checkpoint, step 1000300 at Sat Mar 29 06:03:56 2025


Writing checkpoint, step 1000400 at Sat Mar 29 06:08:12 2025


Writing checkpoint, step 1000500 at Sat Mar 29 06:12:29 2025


Writing checkpoint, step 1000600 at Sat Mar 29 06:16:45 2025


Writing checkpoint, step 1000800 at Sat Mar 29 06:25:17 2025


Writing checkpoint, step 1000900 at Sat Mar 29 06:29:35 2025


Writing checkpoint, step 1001000 at Sat Mar 29 06:33:52 2025


Writing checkpoint, step 1001100 at Sat Mar 29 06:38:08 2025


Writing checkpoint, step 1001200 at Sat Mar 29 06:42:25 2025


Writing checkpoint, step 1001300 at Sat Mar 29 06:46:42 2025


Writing checkpoint, step 1001500 at Sat Mar 29 06:55:14 2025




Received the remote second INT/TERM signal, stopping within 11 steps

Writing checkpoint, step 1001600 at Sat Mar 29 06:59:35 2025


           Step           Time
        1001601     2003.20200

Writing checkpoint, step 1001601 at Sat Mar 29 06:59:35 2025


   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    1.75208e+04    2.14735e+04    1.04005e+03    9.16521e+03    7.56769e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      LJ recip.
    2.72395e+05   -4.66524e+01   -1.62555e+06    1.41485e+04   -1.14005e+05
 Position Rest.      Potential    Kinetic En.   Total Energy    Temperature
    5.94462e+02   -1.32759e+06    2.42321e+05   -1.08527e+06    3.00047e+02
 Pres. DC (bar) Pressure (bar)    dVremain/dl   Constr. rmsd
   -7.94653e-01   -6.78620e+01   -2.45375e+02    1.26114e-05


	<======  ###############  ==>
	<====  A V E R A G E S  ====>
	<==  ###############  ======>

	Statistics over 1602 steps using 17 frames

   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    1.76182e+04    2.12718e+04    1.06178e+03    9.22057e+03    7.57611e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      LJ recip.
    2.72480e+05   -4.66669e+01   -1.62518e+06    1.40836e+04   -1.14061e+05
 Position Rest.      Potential    Kinetic En.   Total Energy    Temperature
    5.72261e+02   -1.32722e+06    2.42985e+05   -1.08424e+06    3.00870e+02
 Pres. DC (bar) Pressure (bar)    dVremain/dl   Constr. rmsd
   -7.95149e-01   -7.61378e+00   -1.97357e+02    0.00000e+00

          Box-X          Box-Y          Box-Z
    1.11331e+01    1.11331e+01    7.87228e+00

   Total Virial (kJ/mol)
    8.12369e+04    2.18138e+02    1.71726e+02
    2.02617e+02    8.12812e+04   -1.02466e+03
    1.45648e+02   -1.03797e+03    8.11494e+04

   Pressure (bar)
   -5.34530e+00   -7.05055e+00   -7.67331e+00
   -6.52253e+00   -1.05447e+01    3.72356e+01
   -6.78555e+00    3.76885e+01   -6.95129e+00


	M E G A - F L O P S   A C C O U N T I N G

 NB=Group-cutoff nonbonded kernels    NxN=N-by-N cluster Verlet kernels
 RF=Reaction-Field  VdW=Van der Waals  QSTab=quadratic-spline table
 W3=SPC/TIP3p  W4=TIP4p (single or pairs)
 V&F=Potential and force  V=Potential only  F=Force only

 Computing:                               M-Number         M-Flops  % Flops
-----------------------------------------------------------------------------
 NB Free energy kernel                 6940.595124        6940.595     0.1
 Pair Search distance check             590.133036        5311.197     0.1
 NxN Ewald Elec. + LJ [F]             38401.295328     3916932.123    39.1
 NxN Ewald Elec. + LJ [V&F]             436.338336       61087.367     0.6
 NxN Ewald Elec. [F]                  31024.658016     1892504.139    18.9
 NxN Ewald Elec. [V&F]                  352.533600       29612.822     0.3
 1,4 nonbonded interactions              34.987080        3148.837     0.0
 Calc Weights                           466.667406       16800.027     0.2
 Spread Q Bspline                    134400.212928      268800.426     2.7
 Gather F Bspline                    134400.212928      806401.278     8.1
 3D-FFT                              367710.661296     2941685.290    29.4
 Solve PME                               80.381952        5144.445     0.1
 Shift-X                                  1.650717           9.904     0.0
 Angles                                  24.302340        4082.793     0.0
 Propers                                 37.789980        8653.905     0.1
 Impropers                                2.899514         603.099     0.0
 Pos. Restr.                              0.828234          41.412     0.0
 Virial                                  15.737652         283.278     0.0
 Update                                 155.555802        4822.230     0.0
 Stop-CM                                  1.747818          17.478     0.0
 Calc-Ekin                               31.363623         846.818     0.0
 Lincs                                   27.061846        1623.711     0.0
 Lincs-Mat                              583.697184        2334.789     0.0
 Constraint-V                           338.047375        2704.379     0.0
 Constraint-Vir                          15.719508         377.268     0.0
 Settle                                  94.676386       30580.473     0.3
-----------------------------------------------------------------------------
 Total                                                10011350.083   100.0
-----------------------------------------------------------------------------


     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

On 1 MPI rank, each using 7 OpenMP threads

 Computing:          Num   Num      Call    Wall time         Giga-Cycles
                     Ranks Threads  Count      (s)         total sum    %
-----------------------------------------------------------------------------
 Neighbor search        1    7         17       7.075        163.126   0.2
 Force                  1    7       1602     513.627      11842.542  12.5
 PME mesh               1    7       1602    2355.503      54310.088  57.3
 NB X/F buffer ops.     1    7       3187     156.439       3606.976   3.8
 Write traj.            1    7         15      15.038        346.728   0.4
 Update                 1    7       3204     155.958       3595.869   3.8
 Constraints            1    7       3206     885.797      20423.541  21.5
 Rest                                          22.828        526.327   0.6
-----------------------------------------------------------------------------
 Total                                       4112.265      94815.196 100.0
-----------------------------------------------------------------------------
 Breakdown of PME mesh computation
-----------------------------------------------------------------------------
 PME spread             1    7       6408     737.530      17004.994  17.9
 PME gather             1    7       6408     977.503      22537.982  23.8
 PME 3D-FFT             1    7      12816     511.478      11792.995  12.4
 PME solve LJ           1    7       3204      21.356        492.392   0.5
 PME solve Elec         1    7       3204       2.848         65.668   0.1
-----------------------------------------------------------------------------

               Core t (s)   Wall t (s)        (%)
       Time:    28785.854     4112.265      700.0
                         1h08:32
                 (ns/day)    (hour/ns)
Performance:        0.067      356.522
Finished mdrun on rank 0 Sat Mar 29 06:59:35 2025


Reading checkpoint file state.cpt
  file generated by:     
  file generated at:     Sat Mar 29 06:59:35 2025

  GROMACS double prec.:  0
  simulation part #:     1
  step:                  1001601
  time:                  2003.202000



-----------------------------------------------------------
Restarting from checkpoint, appending to previous log file.

         :-) GROMACS - GROMACS, 2020.5-dev-20210116-ddc6077-unknown (-:

Working dir:  /var/lib/fah-client/work/K1FPyoybdRldicNw6V7jsSBZoG7kU9WZE8a-7WHeCLg/01
Process ID:   1173903

GROMACS version:    2020.5-dev-20210116-ddc6077-unknown
GIT SHA1 hash:      ddc6077cfc91185b44cf253801548ae5d6c5e673
Branched from:      unknown
Precision:          single
Memory model:       64 bit
MPI library:        thread_mpi
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support:        disabled
SIMD instructions:  AVX2_256
FFT library:        fftw-3.3.8-sse2-avx
RDTSCP usage:       disabled
TNG support:        enabled
Hwloc support:      disabled
Tracing support:    disabled
C compiler:         /usr/bin/x86_64-linux-gnu-gcc GNU 8.3.0
C compiler flags:   -mavx2 -mfma -Wall -Wno-unused -Wunused-value -Wunused-parameter -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wundef -Werror=stringop-truncation -fexcess-precision=fast -funroll-all-loops -Wno-array-bounds -O3 -DNDEBUG
C++ compiler:       /usr/bin/x86_64-linux-gnu-g++ GNU 8.3.0
C++ compiler flags: -mavx2 -mfma -Wall -Wextra -Wno-missing-field-initializers -Wpointer-arith -Wmissing-declarations -Wundef -Wstringop-truncation -fexcess-precision=fast -funroll-all-loops -Wno-array-bounds -fopenmp -O3 -DNDEBUG

Changing nstlist from 10 to 100, rlist from 1 to 1.005

Update task on the GPU was required, by the GMX_FORCE_UPDATE_DEFAULT_GPU environment variable, but the following condition(s) were not satisfied:

Either PME or short-ranged non-bonded interaction tasks must run on the GPU.
Compatible GPUs must have been found.
Only a CUDA build is supported.
Only the md integrator is supported.
Free energy perturbations are not supported.
The number of coupled constraints is higher than supported in the CUDA LINCS code.

Will use CPU version of update.

Using 1 MPI thread

Non-default thread affinity set, disabling internal thread affinity

Using 7 OpenMP threads 

System total charge, top. A: 0.003 top. B: 0.003
Will do PME sum in reciprocal space for LJ dispersion interactions.

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen 
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------

Using a Gaussian width (1/beta) of 0.298423 nm for LJ Ewald
Will do PME sum in reciprocal space for electrostatic interactions.

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen 
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------

Using a Gaussian width (1/beta) of 0.289108 nm for Ewald
Potential shift: LJ r^-12: -1.000e+00 r^-6: -1.000e-03, Ewald -1.000e-06
Initialized non-bonded Ewald tables, spacing: 8.87e-04 size: 1129

Using shifted Lennard-Jones, switch between 0 and 1 nm
Generated table with 1002 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1002 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1002 data points for 1-4 LJ12.
Tabscale = 500 points/nm

Using SIMD 4x8 nonbonded short-range kernels

Using a 4x8 pair-list setup:
  updated every 100 steps, buffer 0.005 nm, rlist 1.005 nm
At tolerance 0.005 kJ/mol/ps per atom, equivalent classical 1x1 list would be:
  updated every 100 steps, buffer 0.182 nm, rlist 1.182 nm

Long Range LJ corr.: <C6> 1.1539e-06

There are 30 atoms and 30 charges for free energy perturbation

Initializing LINear Constraint Solver

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and H. Bekker and H. J. C. Berendsen and J. G. E. M. Fraaije
LINCS: A Linear Constraint Solver for molecular simulations
J. Comp. Chem. 18 (1997) pp. 1463-1472
-------- -------- --- Thank You --- -------- --------

The number of constraints is 8441

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
-------- -------- --- Thank You --- -------- --------

Setting the maximum number of constraint warnings to 2147483647
Initial vector of lambda components:[     0.3500     0.3500     0.3500     0.3500     0.3500     0.3500     0.0000 ]
There are: 97101 Atoms
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
  0:  rest

Started mdrun on rank 0 Sat Mar 29 06:59:43 2025

           Step           Time
        1005000     2010.00000

   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    1.78943e+04    2.13561e+04    1.06903e+03    9.14452e+03    7.58080e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      LJ recip.
    2.73847e+05   -4.66621e+01   -1.62760e+06    1.42622e+04   -1.14060e+05
 Position Rest.      Potential    Kinetic En.   Total Energy    Temperature
    5.80356e+02   -1.32774e+06    2.42607e+05   -1.08514e+06    3.00402e+02
 Pres. DC (bar) Pressure (bar)    dVremain/dl   Constr. rmsd
   -7.94983e-01    1.29168e+01   -2.95358e+01    1.25638e-05

Writing checkpoint, step 1006700 at Sat Mar 29 07:04:47 2025


           Step           Time
        1010000     2020.00000

   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    1.75239e+04    2.12665e+04    1.06937e+03    9.07527e+03    7.58257e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      LJ recip.
    2.71210e+05   -4.65080e+01   -1.62330e+06    1.40239e+04   -1.13690e+05
 Position Rest.      Potential    Kinetic En.   Total Energy    Temperature
    5.85316e+02   -1.32645e+06    2.43347e+05   -1.08311e+06    3.01318e+02
 Pres. DC (bar) Pressure (bar)    dVremain/dl   Constr. rmsd
   -7.89749e-01   -1.92865e+02    2.34520e+01    1.30552e-05

Writing checkpoint, step 1011700 at Sat Mar 29 07:09:45 2025


           Step           Time
        1015000     2030.00000

   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    1.74927e+04    2.11825e+04    1.02925e+03    9.23814e+03    7.55764e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      LJ recip.
    2.73925e+05   -4.67688e+01   -1.62540e+06    1.41663e+04   -1.14268e+05
 Position Rest.      Potential    Kinetic En.   Total Energy    Temperature
    5.06469e+02   -1.32659e+06    2.42885e+05   -1.08371e+06    3.00745e+02
 Pres. DC (bar) Pressure (bar)    dVremain/dl   Constr. rmsd
   -7.98619e-01    2.29081e+01    5.73640e+01    1.26620e-05

Writing checkpoint, step 1016800 at Sat Mar 29 07:14:49 2025


           Step           Time
        1020000     2040.00000

   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    1.76625e+04    2.14868e+04    1.11678e+03    9.26524e+03    7.58738e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      LJ recip.
    2.72403e+05   -4.68009e+01   -1.62767e+06    1.39312e+04   -1.14380e+05
 Position Rest.      Potential    Kinetic En.   Total Energy    Temperature
    6.27902e+02   -1.32973e+06    2.42966e+05   -1.08676e+06    3.00846e+02
 Pres. DC (bar) Pressure (bar)    dVremain/dl   Constr. rmsd
   -7.99715e-01    4.11403e+01    2.03969e+01    1.24796e-05

Writing checkpoint, step 1021800 at Sat Mar 29 07:19:47 2025




Received the remote second INT/TERM signal, stopping within 11 steps

           Step           Time
        1023151     2046.30200

Writing checkpoint, step 1023151 at Sat Mar 29 07:21:08 2025


   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    1.78573e+04    2.11744e+04    1.02424e+03    9.19195e+03    7.56376e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      LJ recip.
    2.75515e+05   -4.65865e+01   -1.62846e+06    1.41853e+04   -1.13877e+05
 Position Rest.      Potential    Kinetic En.   Total Energy    Temperature
    5.35584e+02   -1.32727e+06    2.43462e+05   -1.08380e+06    3.01460e+02
 Pres. DC (bar) Pressure (bar)    dVremain/dl   Constr. rmsd
   -7.92412e-01    5.98015e+01   -2.00189e+02    1.27571e-05


	<======  ###############  ==>
	<====  A V E R A G E S  ====>
	<==  ###############  ======>

	Statistics over 23152 steps using 232 frames

   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    1.77403e+04    2.12862e+04    1.06869e+03    9.20593e+03    7.57173e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      LJ recip.
    2.72918e+05   -4.66662e+01   -1.62577e+06    1.41121e+04   -1.14047e+05
 Position Rest.      Potential    Kinetic En.   Total Energy    Temperature
    5.62726e+02   -1.32725e+06    2.42778e+05   -1.08447e+06    3.00613e+02
 Pres. DC (bar) Pressure (bar)    dVremain/dl   Constr. rmsd
   -7.95124e-01    5.53275e+00   -2.00526e+02    0.00000e+00

          Box-X          Box-Y          Box-Z
    1.11332e+01    1.11332e+01    7.87233e+00

   Total Virial (kJ/mol)
    8.11498e+04   -8.82006e+01    1.42992e+02
   -8.71236e+01    8.07133e+04   -5.00858e+02
    1.42483e+02   -5.02929e+02    8.04355e+04

   Pressure (bar)
   -5.90240e+00    3.44922e+00   -5.40251e+00
    3.41217e+00    5.68858e+00    1.71956e+01
   -5.38547e+00    1.72660e+01    1.68121e+01


	M E G A - F L O P S   A C C O U N T I N G

 NB=Group-cutoff nonbonded kernels    NxN=N-by-N cluster Verlet kernels
 RF=Reaction-Field  VdW=Van der Waals  QSTab=quadratic-spline table
 W3=SPC/TIP3p  W4=TIP4p (single or pairs)
 V&F=Potential and force  V=Potential only  F=Force only

 Computing:                               M-Number         M-Flops  % Flops
-----------------------------------------------------------------------------
 NB Free energy kernel                93131.058450       93131.058     0.1
 Pair Search distance check            7496.126708       67465.140     0.1
 NxN Ewald Elec. + LJ [F]            516921.322320    52725974.877    39.2
 NxN Ewald Elec. + LJ [V&F]            5233.273568      732658.300     0.5
 NxN Ewald Elec. [F]                 417787.166544    25485017.159    18.9
 NxN Ewald Elec. [V&F]                 4229.592576      355285.776     0.3
 1,4 nonbonded interactions             470.650072       42358.506     0.0
 Calc Weights                          6277.870953      226003.354     0.2
 Spread Q Bspline                   1808026.834464     3616053.669     2.7
 Gather F Bspline                   1808026.834464    10848161.007     8.1
 3D-FFT                             4946649.476648    39573195.813    29.4
 Solve PME                             1081.342976       69205.950     0.1
 Shift-X                                 20.973816         125.843     0.0
 Angles                                 326.928670       54924.017     0.0
 Propers                                508.340554      116409.987     0.1
 Impropers                               39.001841        8112.383     0.0
 Pos. Restr.                             11.141867         557.093     0.0
 Virial                                 209.446776        3770.042     0.0
 Update                                2092.623651       64871.333     0.0
 Stop-CM                                 20.876715         208.767     0.0
 Calc-Ekin                              418.602411       11302.265     0.0
 Lincs                                  363.823982       21829.439     0.0
 Lincs-Mat                             7847.322528       31389.290     0.0
 Constraint-V                          4546.183450       36369.468     0.0
 Constraint-Vir                         209.205304        5020.927     0.0
 Settle                                1272.845162      411128.987     0.3
-----------------------------------------------------------------------------
 Total                                               134600530.452   100.0
-----------------------------------------------------------------------------


     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

On 1 MPI rank, each using 7 OpenMP threads

 Computing:          Num   Num      Call    Wall time         Giga-Cycles
                     Ranks Threads  Count      (s)         total sum    %
-----------------------------------------------------------------------------
 Neighbor search        1    7        216       5.322        122.717   0.4
 Force                  1    7      21551     287.860       6637.087  22.4
 PME mesh               1    7      21551     953.597      21986.779  74.2
 NB X/F buffer ops.     1    7      42886       5.642        130.095   0.4
 Write traj.            1    7          9       3.947         91.002   0.3
 Update                 1    7      43102      11.782        271.654   0.9
 Constraints            1    7      43102      14.943        344.533   1.2
 Rest                                           1.772         40.858   0.1
-----------------------------------------------------------------------------
 Total                                       1284.865      29624.724 100.0
-----------------------------------------------------------------------------
 Breakdown of PME mesh computation
-----------------------------------------------------------------------------
 PME spread             1    7      86204     368.119       8487.602  28.7
 PME gather             1    7      86204     316.352       7294.022  24.6
 PME 3D-FFT             1    7     172408     225.141       5190.999  17.5
 PME solve LJ           1    7      43102      34.150        787.390   2.7
 PME solve Elec         1    7      43102       9.200        212.126   0.7
-----------------------------------------------------------------------------

               Core t (s)   Wall t (s)        (%)
       Time:     8994.054     1284.865      700.0
                 (ns/day)    (hour/ns)
Performance:        2.898        8.281
Finished mdrun on rank 0 Sat Mar 29 07:21:08 2025
Since the restart fixed it, I don't know whether this is worth reporting, or whether it's even specific to this WU. It was very strange, though, so I'm posting it here on the off chance that it helps the researcher(s) involved.
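For anyone who wants to check the numbers, here is a rough Python sketch that estimates ns/day from consecutive "Writing checkpoint" lines. It is not part of FAH or GROMACS. The function names are made up for illustration, and the 2 fs timestep is an assumption inferred from the Step/Time columns in the log (step 1005000 at 2010 ps).

```python
import re
from datetime import datetime

# Assumed timestep: the log's Step/Time columns show step 1005000 at 2010 ps,
# which implies 0.002 ps (2 fs) per step. Change this for other projects.
DT_PS = 0.002

# Matches lines such as:
#   Writing checkpoint, step 1000200 at Sat Mar 29 05:59:40 2025
CHECKPOINT_RE = re.compile(r"Writing checkpoint, step (\d+) at (.+)$")


def parse_checkpoints(lines):
    """Return (step, timestamp) pairs for each checkpoint line found."""
    found = []
    for line in lines:
        match = CHECKPOINT_RE.search(line.strip())
        if match:
            step = int(match.group(1))
            # Day and month names are parsed with the current locale;
            # this assumes an English/C locale, as in the log above.
            when = datetime.strptime(match.group(2), "%a %b %d %H:%M:%S %Y")
            found.append((step, when))
    return found


def ns_per_day(checkpoints, dt_ps=DT_PS):
    """Estimate ns/day between each pair of consecutive checkpoints."""
    rates = []
    for (step0, t0), (step1, t1) in zip(checkpoints, checkpoints[1:]):
        seconds = (t1 - t0).total_seconds()
        if seconds <= 0:
            continue  # skip restarts or duplicate timestamps
        simulated_ns = (step1 - step0) * dt_ps / 1000.0
        rates.append(simulated_ns * 86400.0 / seconds)
    return rates


if __name__ == "__main__":
    slow = parse_checkpoints([
        "Writing checkpoint, step 1000200 at Sat Mar 29 05:59:40 2025",
        "Writing checkpoint, step 1000300 at Sat Mar 29 06:03:56 2025",
    ])
    fast = parse_checkpoints([
        "Writing checkpoint, step 1006700 at Sat Mar 29 07:04:47 2025",
        "Writing checkpoint, step 1011700 at Sat Mar 29 07:09:45 2025",
    ])
    print("before restart:", ns_per_day(slow))
    print("after restart: ", ns_per_day(fast))
```

With the checkpoint lines quoted in this post, that works out to roughly 0.07 ns/day before the restart and roughly 2.9 ns/day after it, which agrees with the Performance line at the end of md.log.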