What are "bad WUs"?
Posted: Wed Jul 16, 2008 5:29 am
Has anybody done a study of the so-called "bad WUs" to categorize them and determine how many different types of failures are truly involved (as opposed to the number of symptoms)? I'd think that a beneficial advance in the field of MD might "solve" some of these problem cases. Surely some refinements to the equations of motion or to the force equations might enable Gromacs to bypass the condition that is causing the failure without aborting that trajectory.
Some repeatable EUEs can be avoided by stopping and restarting the simulation, so my guess is that some EUEs could also be bypassed by a re-characterization of the random motions around the time of the EUE.
nwkelley calls it "just one bad data point" here, and it's pretty obvious to me that if a reasonably long trajectory has already been calculated, it's best not to abort it as long as a reasonable solution can be obtained by restarting at some earlier time.
On the other hand, if the trajectory leads to a very early EUE, it may not be worth worrying about it. Just start a new one.
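To make the restart idea concrete, here is a toy sketch in plain Python. It is not Gromacs and not how the FAH cores actually work; every name in it (run_segment, run_with_restart, the "kick too large" failure test) is made up for illustration. The trajectory is a 1-D random walk run in segments, the state at the start of each segment is the checkpoint, and a failed segment is retried from that checkpoint with a different random seed instead of throwing away the whole trajectory.

```python
import random


def run_segment(x, steps, rng, blowup=4.0):
    """Advance a toy 1-D random walk; raise if one random kick is too large."""
    for _ in range(steps):
        kick = rng.gauss(0.0, 1.0)
        if abs(kick) > blowup:
            # Stand-in for an EUE: the step would produce an unphysical state.
            raise FloatingPointError("toy EUE: kick too large")
        x += 0.01 * kick
    return x


def run_with_restart(segments, steps, seed, max_retries=3, segment_fn=run_segment):
    """Run the walk in segments, retrying a failed segment from its checkpoint.

    Each retry uses a different seed, i.e. the random motions around the
    failure are re-drawn. Returns (final_x, retries_used).
    """
    x = 0.0
    retries_used = 0
    for seg in range(segments):
        checkpoint = x  # state saved before the segment starts
        for attempt in range(max_retries + 1):
            # Deterministic seed per (seed, segment, attempt).
            rng = random.Random(seed * 1_000_003 + seg * 1_009 + attempt)
            try:
                x = segment_fn(checkpoint, steps, rng)
                break  # segment finished, move on to the next one
            except FloatingPointError:
                retries_used += 1
        else:
            # Every attempt failed: give up on this trajectory.
            raise RuntimeError(f"segment {seg} failed after {max_retries} retries")
    return x, retries_used


if __name__ == "__main__":
    final_x, retries = run_with_restart(segments=20, steps=1000, seed=2665)
    print(f"final x = {final_x:.4f}, retries used = {retries}")
```

Retrying with a new seed only makes sense if the failure is a rare fluctuation. If every retry fails at the same point, that suggests the starting state itself is bad, which is the case where giving up on the trajectory is the cheaper option.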
Subject: Project: 2665 (Run 0, Clone 479, Gen 20)
nwkelley wrote: hmm, well proj 2665 has had a couple work units that were simply bad imo, but since there could be many reasons on this one, we hate to stop an entire run/clone series for just one bad data point. If you end up with it again or if anyone else has noticed this one causing trouble could you let us know??
thank you!
nick
(and try NOT to delete your log