6013 (Run 0, Clone 53, Gen 157)

Moderators: Site Moderators, FAHC Science Team

Bigstan
Posts: 4
Joined: Tue Jan 08, 2008 6:09 pm

6013 (Run 0, Clone 53, Gen 157)

Post by Bigstan »

Had this die on me twice over the last 12 hours on one machine, both times with the error "CoreStatus = FF (255)". No problems with other projects.

[19:39:52] + Processing work unit
[19:39:52] Core required: FahCore_a3.exe
[19:39:52] Core found.
[19:39:52] Working on queue slot 00 [May 30 19:39:52 UTC]
[19:39:52] + Working ...
[19:39:52] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 00 -np 8 -checkpoint 5 -verbose -lifeline 4700 -version 629'

[19:39:52] 
[19:39:52] *------------------------------*
[19:39:52] Folding@Home Gromacs SMP Core
[19:39:52] Version 2.19 (Mar 12, 2010)
[19:39:52] 
[19:39:52] Preparing to commence simulation
[19:39:52] - Looking at optimizations...
[19:39:52] - Created dyn
[19:39:52] - Files status OK
[19:39:53] - Expanded 979415 -> 10427873 (decompressed 1064.7 percent)
[19:39:53] Called DecompressByteArray: compressed_data_size=979415 data_size=10427873, decompressed_data_size=10427873 diff=0
[19:39:53] - Digital signature verified
[19:39:53] 
[19:39:53] Project: 6013 (Run 0, Clone 53, Gen 157)
[19:39:53] 
[19:39:53] Assembly optimizations on if available.
[19:39:53] Entering M.D.
[19:42:08] Completed 0 out of 250000 steps  (0%)
[19:51:52] CoreStatus = FF (255)
[19:51:52] Sending work to server
[19:51:52] Project: 6013 (Run 0, Clone 53, Gen 157)
[19:51:52] - Error: Could not get length of results file work/wuresults_00.dat
[19:51:52] - Error: Could not read unit 00 file. Removing from queue.
[19:51:52] Trying to send all finished work units
[19:51:52] + No unsent completed units remaining.
[19:51:52] - Preparing to get new work unit...
[19:51:52] Cleaning up work directory

[02:20:34] + Processing work unit
[02:20:34] Core required: FahCore_a3.exe
[02:20:34] Core found.
[02:20:34] Working on queue slot 02 [May 31 02:20:34 UTC]
[02:20:34] + Working ...
[02:20:34] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 02 -np 8 -checkpoint 5 -verbose -lifeline 4700 -version 629'

[02:20:34] 
[02:20:34] *------------------------------*
[02:20:34] Folding@Home Gromacs SMP Core
[02:20:34] Version 2.19 (Mar 12, 2010)
[02:20:34] 
[02:20:34] Preparing to commence simulation
[02:20:34] - Looking at optimizations...
[02:20:34] - Created dyn
[02:20:34] - Files status OK
[02:20:34] - Expanded 979415 -> 10427873 (decompressed 1064.7 percent)
[02:20:34] Called DecompressByteArray: compressed_data_size=979415 data_size=10427873, decompressed_data_size=10427873 diff=0
[02:20:34] - Digital signature verified
[02:20:34] 
[02:20:34] Project: 6013 (Run 0, Clone 53, Gen 157)
[02:20:34] 
[02:20:34] Assembly optimizations on if available.
[02:20:34] Entering M.D.
[02:24:23] Completed 0 out of 250000 steps  (0%)
[04:36:12] CoreStatus = FF (255)
[04:36:12] Sending work to server
[04:36:12] Project: 6013 (Run 0, Clone 53, Gen 157)
[04:36:12] - Error: Could not get length of results file work/wuresults_02.dat
[04:36:12] - Error: Could not read unit 02 file. Removing from queue.
[04:36:12] Trying to send all finished work units
[04:36:12] + No unsent completed units remaining.
[04:36:12] - Preparing to get new work unit...
[04:36:12] Cleaning up work directory
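
The client prints each core status as a hexadecimal value followed by its decimal equivalent in parentheses. The following is a minimal Python sketch, not part of the client, for pulling that field out of a log line and checking that the two representations agree. The function name `core_status` and the regular expression are illustrative assumptions; the sketch says nothing about what a given code means.

```python
import re

# Hypothetical helper: matches a field such as "CoreStatus = FF (255)".
STATUS = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")

def core_status(line: str) -> int:
    """Return the decimal core status found in a client log line."""
    match = STATUS.search(line)
    if match is None:
        raise ValueError("no CoreStatus field in line")
    hex_value = int(match.group(1), 16)   # e.g. "FF" -> 255
    dec_value = int(match.group(2))       # e.g. "255" -> 255
    if hex_value != dec_value:
        raise ValueError("hex and decimal values disagree")
    return dec_value

print(core_status("[19:51:52] CoreStatus = FF (255)"))
print(core_status("[23:08:26] CoreStatus = 66 (102)"))
```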
AlanH
Posts: 57
Joined: Mon Dec 03, 2007 9:54 pm

Re: 6013 (Run 0, Clone 53, Gen 157)

Post by AlanH »

I received this unit last night. When I checked F@H today I found that it was taking 90 minutes per %. According to the bonus calculator, it was forecast to complete in six days, against a final deadline of three days. Not good.
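
The six-day forecast is consistent with simple arithmetic on the 90-minutes-per-% figure. As a rough Python sketch (the function name `projected_days` is made up for illustration, and a constant rate over the whole unit is assumed):

```python
def projected_days(minutes_per_percent: float, percent_done: float = 0.0) -> float:
    """Days needed to finish the remaining work at a constant rate."""
    remaining_percent = 100.0 - percent_done
    return remaining_percent * minutes_per_percent / (60 * 24)

days = projected_days(90)          # 90 minutes per 1%, as reported in the post
print(f"{days:.2f} days")          # 6.25 days
print(days > 3)                    # True: longer than the three-day final deadline
```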

I tried restarting F@H, and restarting my Mac, but performance remained pathetic. So I stopped F@H, discarded the work folder and queue, and restarted it. I now have a P6025 unit which is folding at a reasonable rate. I don't cherry pick, but this seemed to be faulty behaviour.
Folding for TeamCFC
- Mac Pro Dual 2.66GHz Xeon, 4 GBytes running Mac SMP2 client
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 6013 (Run 0, Clone 53, Gen 157)

Post by bruce »

I'm not sure what's wrong, but there are strong indications this is a bad WU, so I'm reporting it. (A third person got a very small credit for an EUE report so that makes three entirely different symptoms.)
ra40
Posts: 13
Joined: Wed Aug 27, 2008 7:26 pm

Re: 6013 (Run 0, Clone 53, Gen 157)

Post by ra40 »

I've got it now... My time per 1% step is 1:07 (about 67 minutes). So far it is at 40%, but it won't complete before the 3-day deadline.
ra40
Posts: 13
Joined: Wed Aug 27, 2008 7:26 pm

Re: 6013 (Run 0, Clone 53, Gen 157)

Post by ra40 »

As suspected, it didn't make it anywhere close to the deadline. The last portion of the log file:
[15:17:05] Completed 142500 out of 250000 steps (57%)
[16:24:15] Completed 145000 out of 250000 steps (58%)
[17:31:23] Completed 147500 out of 250000 steps (59%)
[18:38:25] Completed 150000 out of 250000 steps (60%)
[19:45:29] Completed 152500 out of 250000 steps (61%)
[20:53:01] Completed 155000 out of 250000 steps (62%)
[22:00:37] Completed 157500 out of 250000 steps (63%)
[23:08:06] Completed 160000 out of 250000 steps (64%)
[23:08:06] Unit 1's deadline (June 7 22:50) has passed.
[23:08:06] Going to interrupt core and move on to next unit...
[23:08:23] mdrun returned 2
[23:08:23] Gromacs was interrupted
[23:08:23] Folding@home Core Shutdown: INTERRUPTED
[23:08:26] CoreStatus = 66 (102)
[23:08:30] - Preparing to get new work unit...
[23:08:30] Cleaning up work directory
[23:08:30] + Attempting to get work packet
[23:08:30] Passkey found
[23:08:30] - Connecting to assignment server
[23:08:31] - Successful: assigned to (171.64.65.54).
[23:08:31] + News From Folding@Home: Welcome to Folding@Home
[23:08:31] Loaded queue successfully.
[23:08:34] + Closed connections
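
The progress lines in the log above are enough to estimate the time per 1% and the time the unit would still have needed. The following is a minimal Python sketch, assuming the timestamps are wall-clock times less than a day apart; the names `PROGRESS`, `parse_progress`, and `seconds_per_percent` are hypothetical and not part of the client.

```python
import re

# Matches lines such as "[15:17:05] Completed 142500 out of 250000 steps (57%)".
PROGRESS = re.compile(r"\[(\d{2}):(\d{2}):(\d{2})\] Completed (\d+) out of (\d+) steps")

LOG = """\
[15:17:05] Completed 142500 out of 250000 steps (57%)
[16:24:15] Completed 145000 out of 250000 steps (58%)
[17:31:23] Completed 147500 out of 250000 steps (59%)
[18:38:25] Completed 150000 out of 250000 steps (60%)
[19:45:29] Completed 152500 out of 250000 steps (61%)
[20:53:01] Completed 155000 out of 250000 steps (62%)
[22:00:37] Completed 157500 out of 250000 steps (63%)
[23:08:06] Completed 160000 out of 250000 steps (64%)
"""

def parse_progress(log):
    """Return (seconds_since_midnight, steps_done, steps_total) per progress line."""
    points = []
    for match in PROGRESS.finditer(log):
        hours, minutes, seconds, done, total = map(int, match.groups())
        points.append((hours * 3600 + minutes * 60 + seconds, done, total))
    return points

def seconds_per_percent(points):
    """Average seconds per 1% between the first and last progress lines."""
    # The modulo keeps each interval positive if the clock passes midnight.
    elapsed = sum((b[0] - a[0]) % 86400 for a, b in zip(points, points[1:]))
    percent = (points[-1][1] - points[0][1]) * 100 / points[0][2]
    return elapsed / percent

points = parse_progress(LOG)
rate = seconds_per_percent(points)
remaining_percent = 100 - points[-1][1] * 100 / points[-1][2]
print(f"{rate / 60:.1f} minutes per 1%")
print(f"{rate * remaining_percent / 3600:.1f} hours still needed")
```

On this excerpt the average works out to roughly 67 minutes per 1%, so the remaining 36% would have taken well over a day past the point where the client gave up.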