[23:36:42] *------------------------------*
[23:36:42] Folding@Home Gromacs SMP Core
[23:36:59] - Starting from initial work packet
[23:36:59] Project: 3062 (Run 2, Clone 71, Gen 7)
[23:37:06] Protein: p3062_lambda5_99sbExtra SSE boost OK.
[23:37:06] Extra SSE boost OK.
[23:37:06] Writing local files
[23:37:06] Completed 0 out of 5000000 steps (0 percent)
[08:57:18] Writing local files
[08:57:18] Completed 2800000 out of 5000000 steps (56 percent)
[09:07:17] Warning: long 1-4 interactions
[09:07:21] CoreStatus = 0 (0)
[09:07:21] Client-core communications error: ERROR 0x0
[09:07:21] Deleting current work unit & continuing...
[09:11:55] *------------------------------*
[09:11:55] Folding@Home Gromacs SMP Core
[09:11:55] Version 1.74 (November 27, 2006)
[09:12:12] Project: 3062 (Run 2, Clone 71, Gen 7)
[09:12:19] Protein: p3062_lambda5_99sbExtra SSE boost OK.
[09:12:19] Extra SSE boost OK.
[09:12:19] Completed 0 out of 5000000 steps (0 percent)
[14:43:14] Completed 1650000 out of 5000000 steps (33 percent)
I also do not have time to baby-sit the second (or third) assignment of the same WU.
Last week, while I was unavailable, one SMP WU was assigned three times in a row and failed with the 0x0 error at the same point each time.
This machine, also a Q6600, then completed three WUs successfully, after which the same offending WU was assigned again.
The error trapping in the SMP clients is quite substandard, and perhaps another approach to this issue would be appropriate until the code is improved.
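
For anyone else who cannot watch the log by hand, here is a rough sketch of how repeat failures could be spotted automatically. It is not part of the client; it only assumes the log line formats shown in the excerpt above ("Project: ... (Run ..., Clone ..., Gen ...)" and "Client-core communications error"). The function names `count_failures` and `repeat_offenders` and the threshold of 2 are my own choices for illustration.

```python
import re
from collections import Counter

# Patterns taken from the log excerpt above; other client versions may differ.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
ERROR_RE = re.compile(r"Client-core communications error")


def count_failures(log_lines):
    """Count communications errors per (project, run, clone, gen)."""
    failures = Counter()
    current = None  # the WU most recently announced in the log
    for line in log_lines:
        match = PROJECT_RE.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
        elif ERROR_RE.search(line) and current is not None:
            failures[current] += 1
    return failures


def repeat_offenders(log_lines, threshold=2):
    """Return the WUs that failed at least `threshold` times (hypothetical cutoff)."""
    return {wu: n for wu, n in count_failures(log_lines).items() if n >= threshold}
```

To use it, pass in the lines of the client log, for example `repeat_offenders(open("FAHlog.txt"))`, where the file name is an assumption about where your client writes its log. Any WU it returns has failed more than once and is probably worth reporting rather than retrying.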