Project- 2605 (Run 11, Clone 561, Gen 84)-CLIENT STALL(a1)

Moderators: Site Moderators, FAHC Science Team

314159
Posts: 232
Joined: Sun Dec 02, 2007 2:46 am
Location: http://www.teammacosx.org/

Post by 314159 »

Core 2 Duo, Linux client, stock clock, single instance, etc.

Note: While it might appear that I am posting an inordinate number of problem WUs, please be advised that I have 22 machines running the SMP client and that the failure/problem frequency is about 1 per 30 or 40 WUs. I do not consider this to be unreasonable.

My concern is that if I go out of town for a week or so, I may return to a situation that I would rather not have to address.
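
For a rough sense of scale, the arithmetic behind that concern can be sketched in Python. This is only an estimate under loud assumptions: every machine is taken to finish a WU in the same 3 h 29 min that the log below shows for this one (13:49:46 to 17:18:55), all 22 machines run around the clock, and the function name `expected_problem_wus` is made up for illustration.

```python
def expected_problem_wus(machines, wu_seconds, wus_per_problem, days=1.0):
    """Estimate how many problem WUs a fleet sees over a number of days."""
    wus_per_machine = days * 86400 / wu_seconds  # WUs one machine completes
    return machines * wus_per_machine / wus_per_problem


# ASSUMPTION: every WU takes as long as the one logged in this post.
WU_SECONDS = 3 * 3600 + 29 * 60 + 9  # 13:49:46 -> 17:18:55

for rate in (30, 40):  # "about 1 per 30 or 40 WUs"
    per_day = expected_problem_wus(22, WU_SECONDS, rate)
    per_week = expected_problem_wus(22, WU_SECONDS, rate, days=7)
    print(f"1 per {rate} WUs: about {per_day:.1f} per day, {per_week:.0f} per week")
```

Under those assumptions the fleet would see a handful of problem WUs per day, so a week away could leave a few dozen to deal with. Real WU run times vary by project, so treat the numbers as an order of magnitude only.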

[13:49:46] Core required: FahCore_a1.exe
[13:49:46] Core found.
[13:49:46] Working on Unit 02 [August 31 13:49:46]
[13:49:46] + Working ...
[13:49:46] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 02 -checkpoint 15 -forceasm -verbose -lifeline 16137 -version 602'

[13:49:47] 
[13:49:47] *------------------------------*
[13:49:47] Folding@Home Gromacs SMP Core
[13:49:47] Version 1.74 (November 27, 2006)
[13:49:47] 
[13:49:47] Preparing to commence simulation
[13:49:47] - Ensuring status. Please wait.
[13:50:04] - Assembly optimizations manually forced on.
[13:50:04] - Not checking prior termination.
[13:50:05] - Expanded 2422545 -> 12896633 (decompressed 532.3 percent)
[13:50:05] - Starting from initial work packet
[13:50:05] 
[13:50:05] Project: 2605 (Run 11, Clone 561, Gen 84)
[13:50:05] 
[13:50:05] Assembly optimizations on if available.
[13:50:05] Entering M.D.
[13:50:11] Rejecting checkpoint
[13:50:12] Protein: Protein in POPCExtra SSE boost OK.
[13:50:12] 
[13:50:12] Extra SSE boost OK.
[13:50:12] Writing local files
[13:50:12] Completed 0 out of 500000 steps  (0 percent)
[14:05:13] Timered checkpoint triggered.

[17:17:51] Completed 500000 out of 500000 steps  (100 percent)
[17:17:52] Writing final coordinates.
[17:17:52] Past main M.D. loop
[17:17:52] Will end MPI now
[17:18:52] 
[17:18:52] Finished Work Unit:
[17:18:52] - Reading up to 3723552 from "work/wudata_02.arc": Read 3723552
[17:18:52] - Reading up to 1780676 from "work/wudata_02.xtc": Read 1780676
[17:18:52] goefile size: 0
[17:18:52] logfile size: 21735
[17:18:52] Leaving Run
[17:18:54] - Writing 5530363 bytes of core data to disk...
[17:18:54]   ... Done.
[17:18:55] - Shutting down core
[17:18:55] 
[17:18:55] Folding@home Core Shutdown: FINISHED_UNIT

Note: the client hung at this point.

[17:44:17] ***** Got an Activate signal (2)<----User generated
[17:44:17] Killing all core threads

Folding@Home Client Shutdown.
Note that the results file had been written.
I WAS able to recover and send this WU at around 18:00 UTC. :)
(deleted slot two/ran qfix/used -send 2/acknowledged by work server)
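
A stall like this one can be spotted from the log itself. Below is a minimal Python sketch, not part of the client: it assumes the log keeps the `[HH:MM:SS]` prefix shown above, and the helper names (`gap_after_finished_unit`, `looks_stalled`) and the 5-minute limit are illustrative choices, not anything the client defines.

```python
import re

TIMESTAMP = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]")
FINISHED = "Folding@home Core Shutdown: FINISHED_UNIT"


def seconds_of_day(line):
    """Return the line's [HH:MM:SS] prefix as seconds since midnight, or None."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    hours, minutes, seconds = (int(group) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def gap_after_finished_unit(lines):
    """Seconds between the first FINISHED_UNIT line and the next timestamped line.

    Returns None if there is no FINISHED_UNIT line or nothing was logged after it.
    """
    finished_at = None
    for line in lines:
        stamp = seconds_of_day(line)
        if stamp is None:
            continue  # skip blank lines and untimestamped notes
        if finished_at is not None:
            return (stamp - finished_at) % 86400  # modulo handles midnight rollover
        if FINISHED in line:
            finished_at = stamp
    return None


def looks_stalled(lines, limit_seconds=300):
    """True if the client stayed silent longer than the (assumed) limit."""
    gap = gap_after_finished_unit(lines)
    return gap is not None and gap > limit_seconds


log = [
    "[17:18:55] - Shutting down core",
    "[17:18:55] ",
    "[17:18:55] Folding@home Core Shutdown: FINISHED_UNIT",
    "[17:44:17] ***** Got an Activate signal (2)",
    "[17:44:17] Killing all core threads",
]
print(gap_after_finished_unit(log))  # 1522 seconds, i.e. 25 min 22 s
print(looks_stalled(log))            # True with the 5-minute limit
```

A hung client writes nothing further, so a live watchdog would have to compare the FINISHED_UNIT timestamp against the current clock rather than wait for the next log line. The sketch above only measures the gap after the fact.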

I would appreciate a "lookup" in a few hours to confirm that all went well.
Thank you in advance!
John (from the central part of the Commonwealth of Virginia, U.S.A.)

A friendly visitor to what hopefully will remain a friendly Forum.
With thanks to all of the dedicated volunteers on the staff here!!
toTOW
Site Moderator
Posts: 6395
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Post by toTOW »

314159 wrote: (deleted slot two/ran qfix/used -send 2/acknowledged by work server)
I was going to suggest this trick ;)

Thanks for your report.

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
314159

Post by 314159 »

Would you please be so kind as to do a lookup for me to see if this WU was properly received and credited?

No big rush.

Thanks in advance, my friend! :)

P.S. Unfortunately, I have had to use that "trick" on MANY occasions. :e(
toTOW

Post by toTOW »

314159 wrote: Would you please be so kind as to do a lookup for me to see if this WU was properly received and credited
Of course. The WU went back home successfully ;) :

Hi 314159 (team 1971),
Your WU (P2605 R11 C561 G84) was added to the stats database on 2008-09-01 12:33:10 for 1760 points of credit.

314159

Post by 314159 »

Whew!

And thank you so much for the lookup. 8-)