Project: 3064 (Run 4, Clone 137, Gen 18) failed twice at 11%

Al_Bert
Posts: 2
Joined: Fri May 30, 2008 12:35 am
Hardware configuration: SMP1: intel [email protected] Arctic Freezer, 2x1 GB Corsair 533 RAM @568, Asus P5n32SLI-DeLuxe, Club3d 1950XT (OC'ed 680/850) w Arctic Accelero, XP Pro SP 2, Tagan 530, CM Centurion 532
SMP2: intel C2D [email protected] Arctic Freezer, 2x1 GB OCZ 800 RAM@888, Gigabyte P965DS3L, XFX 7600GT XXX factory OC, Win2K SP 4, Corsair HX620, A+ case
SMP3: intel C2D 6550 stock, XP Pro SP 2
SMP4+GPU: [email protected] Thermalright 12 XTRM w. Typhoon2000 , 2x1GB Corsair 800 RAM C5@800, P5K-E Wifi-AP, Gigabyte 8800GT factory OC'ed 700/1760/950 Custom cooler (Zalman), Vista32 SP1, Corsair 650W, Akasa Eclipse
Location: Bolton, England

Post by Al_Bert »

Hi there,

As you can see, I have received the above unit twice in a row, and both times it ended with an EUE (EARLY_UNIT_END) at 11%. Has anybody else had the same problem? On the second attempt, at around 10%, I copied the work folder into a subfolder, and I will try it on another machine once I have read up on how to do that (I think I can manage at least the setup part).

Code:


--- Opening Log file [May 27 10:22:04] 

[10:22:04] - Ask before connecting: No
[10:22:04] - User name: Al_Bert (Team 35947)
[10:22:31] Project: 2665 (Run 3, Clone 764, Gen 1)
[23:41:04] Completed 167500 out of 250000 steps  (67 percent)
[00:04:59] Writing local files
[00:05:00] Completed 170000 out of 250000 steps  (68 percent)
[00:29:01] Writing local files
[12:48:59] Completed 250000 out of 250000 steps  (100 percent)
[12:48:59] Writing final coordinates.
[12:49:00] Past main M.D. loop
[12:49:00] Will end MPI now
[12:50:00] 
[12:50:00] Finished Work Unit:
[12:50:00] - Reading up to 21310704 from "work/wudata_02.arc": Read 21310704
[12:50:00] - Reading up to 552788 from "work/wudata_02.xtc": Read 552788
[12:50:00] goefile size: 0
[12:50:00] logfile size: 212432
[12:50:00] Leaving Run
[12:50:04] - Writing 22082296 bytes of core data to disk...
[12:50:05]   ... Done.
[12:50:06] - Failed to delete work/wudata_02.sas
[12:50:06] - Failed to delete work/wudata_02.goe
[12:50:06] Warning:  check for stray files
[12:50:06] - Shutting down core
[12:52:11] 
[12:52:11] Folding@home Core Shutdown: FINISHED_UNIT
[12:52:11] 
[12:52:11] Folding@home Core Shutdown: FINISHED_UNIT
[12:56:22] CoreStatus = 64 (100)
[12:56:22] Sending work to server


[12:56:22] + Attempting to send results
[13:09:09] + Results successfully sent
[13:09:09] Thank you for your contribution to Folding@Home.
[13:09:09] + Number of Units Completed: 130

[13:11:14] - Preparing to get new work unit...
[13:11:14] + Attempting to get work packet
[13:11:14] - Connecting to assignment server
[13:11:15] - Successful: assigned to (171.64.65.63).
[13:11:15] + News From Folding@Home: Welcome to Folding@Home
[13:11:15] Loaded queue successfully.
[13:11:24] + Closed connections
[13:11:24] 
[13:11:24] + Processing work unit
[13:11:24] Core required: FahCore_a1.exe
[13:11:24] Core found.
[13:11:24] Working on Unit 03 [May 28 13:11:24]
[13:11:24] + Working ...
[13:11:25] 
[13:11:25] *------------------------------*
[13:11:25] Folding@Home Gromacs SMP Core
[13:11:25] Version 1.76 (February 23, 2008)
[13:11:25] 
[13:11:25] Preparing to commence simulation
[13:11:25] - Ensuring status. Please wait.
[13:11:26] - Starting from initial work packet
[13:11:26] 
[13:11:26] Project: 3064 (Run 4, Clone 137, Gen 18)
[13:11:26] 
[13:11:26] Assembly optimizations on if available.
[13:11:26] Entering M.D.
[13:11:43]  percent)
[13:11:43] - Starting from initial work packet
[13:11:43] 
[13:11:43] Project: 3064 (Run 4, Clone 137, Gen 18)
[13:11:43] 
[13:11:43] Entering M.D.
[13:11:50] 3064_lambda5_2003
[13:11:50] Writing local files
[13:11:50] Extra SSE Extra SSE Writing local files
[13:11:50] Completed 0 out of 5000000 steps  (0 percent)
[13:29:33] Writing local files
[13:29:33] Completed 50000 out of 5000000 steps  (1 percent)
[13:47:16] Writing local files
[13:47:16] Completed 100000 out of 5000000 steps  (2 percent)
[14:05:00] Writing local files
[14:05:00] Completed 150000 out of 5000000 steps  (3 percent)
[14:22:42] Writing local files
[14:22:42] Completed 200000 out of 5000000 steps  (4 percent)
[14:40:26] Writing local files
[14:40:26] Completed 250000 out of 5000000 steps  (5 percent)
[14:58:08] Writing local files
[14:58:08] Completed 300000 out of 5000000 steps  (6 percent)
[15:15:51] Writing local files
[15:15:51] Completed 350000 out of 5000000 steps  (7 percent)
[15:33:33] Writing local files
[15:33:33] Completed 400000 out of 5000000 steps  (8 percent)
[15:51:16] Writing local files
[15:51:16] Completed 450000 out of 5000000 steps  (9 percent)
[16:09:09] Writing local files
[16:09:09] Completed 500000 out of 5000000 steps  (10 percent)
[16:26:59] Writing local files
[16:26:59] Completed 550000 out of 5000000 steps  (11 percent)
[16:32:41] Warning:  long 1-4 interactions
[16:32:41] Gromacs cannot continue further.
[16:32:41] Going to send back what have done.
[16:32:41] logfile size: 22423
[16:32:41] - Writing 22959 bytes of core data to disk...
[16:32:41]   ... Done.
[16:32:41] - Failed to delete work/wudata_03.goe
[16:32:41] - Failed to delete work/wudata_03.pdo
[16:32:41] Warning:  check for stray files
[16:34:41] 
[16:34:41] Folding@home Core Shutdown: EARLY_UNIT_END
[16:34:41] 
[16:34:41] Folding@home Core Shutdown: EARLY_UNIT_END
[16:34:45] CoreStatus = 63 (99)
[16:34:45] + Error starting Folding@Home core or unexpected system termination of core.
[16:34:50] 
[16:34:50] + Processing work unit
[16:34:50] Core required: FahCore_a1.exe
[16:34:50] Core found.
[16:34:50] Working on Unit 03 [May 28 16:34:50]
[16:34:50] + Working ...
[16:34:51] 
[16:34:51] *------------------------------*
[16:34:51] Folding@Home Gromacs SMP Core
[16:34:51] Version 1.76 (February 23, 2008)
[16:34:51] 
[16:34:51] Preparing to commence simulation
[16:34:51] - Looking at optimizations...
[16:34:51] .
[16:34:51] Finalizing output
[16:35:08] ation of core was improper.
[16:35:08] - Going to use standard loops.
[16:35:08] - Files status OK
[16:37:08] 
[16:37:08] Folding@home Core Shutdown: MISSING_WORK_FILES
[16:37:08] Finalizing output
[16:37:12] CoreStatus = 1 (1)
[16:37:12] Client-core communications error: ERROR 0x1
[16:37:12] Deleting current work unit & continuing...
[16:39:34] - Preparing to get new work unit...
[16:39:34] + Attempting to get work packet
[16:39:34] - Connecting to assignment server
[16:39:35] - Successful: assigned to (171.64.65.63).
[16:39:35] + News From Folding@Home: Welcome to Folding@Home
[16:39:35] Loaded queue successfully.
[16:39:45] + Closed connections
[16:39:50] 
[16:39:50] + Processing work unit
[16:39:50] Core required: FahCore_a1.exe
[16:39:50] Core found.
[16:39:50] Working on Unit 04 [May 28 16:39:50]
[16:39:50] + Working ...
[16:39:51] 
[16:39:51] *------------------------------*
[16:39:51] Folding@Home Gromacs SMP Core
[16:39:51] Version 1.76 (February 23, 2008)
[16:39:51] 
[16:39:51] Preparing to commence simulation
[16:39:51] - Ensuring status. Please wait.
[16:40:08] - Looking at optimizations...
[16:40:08] - Working with standard loops on this execution.
[16:40:08] - Previous termination of core was improper.
[16:40:08] - Files status OK
[16:40:08] ndard loops.
[16:40:08] - Files status OK
[16:40:08] - Expanded 607733 -> 3255941 (decompressed 535.7 percent)
[16:40:08] - Starting from initial work packet
[16:40:08] 
[16:40:08] Project: 3064 (Run 4, Clone 137, Gen 18)
[16:40:08] 
[16:40:09] Entering M.D.
[16:40:15] ting local files
[16:40:15] int
[16:40:16] a SSE boost OK.
[16:40:16] ocal files
[16:40:16] Extra SSE boost OK.
[16:40:16] 
[16:40:16] Extra SSE boost OK.
[16:40:16] Writing local files
[16:40:16] Completed 0 out of 5000000 steps  (0 percent)
[16:58:25] Writing local files
[16:58:25] Completed 50000 out of 5000000 steps  (1 percent)
[17:16:33] Writing local files
[17:16:33] Completed 100000 out of 5000000 steps  (2 percent)
[17:34:33] Writing local files
[17:34:33] Completed 150000 out of 5000000 steps  (3 percent)
[17:52:32] Writing local files
[17:52:32] Completed 200000 out of 5000000 steps  (4 percent)
[18:10:29] Writing local files
[18:10:29] Completed 250000 out of 5000000 steps  (5 percent)
[18:28:27] Writing local files
[18:28:27] Completed 300000 out of 5000000 steps  (6 percent)
[18:46:25] Writing local files
[18:46:25] Completed 350000 out of 5000000 steps  (7 percent)
[19:04:21] Writing local files
[19:04:21] Completed 400000 out of 5000000 steps  (8 percent)
[19:22:18] Writing local files
[19:22:18] Completed 450000 out of 5000000 steps  (9 percent)
[19:40:23] Writing local files
[19:40:23] Completed 500000 out of 5000000 steps  (10 percent)
[19:58:27] Writing local files
[19:58:27] Completed 550000 out of 5000000 steps  (11 percent)
[20:04:12] Warning:  long 1-4 interactions
[20:04:12] Gromacs cannot continue further.
[20:04:12] Going to send back what have done.
[20:04:12] logfile size: 22423
[20:04:12] - Writing 22959 bytes of core data to disk...
[20:04:12]   ... Done.
[20:06:13] 
[20:06:13] Folding@home Core Shutdown: EARLY_UNIT_END
[20:06:13] 
[20:06:13] Folding@home Core Shutdown: EARLY_UNIT_END
[20:06:16] CoreStatus = 63 (99)
[20:06:16] + Error starting Folding@Home core or unexpected system termination of core.
[20:06:16] - Attempting to download new core...
[20:06:16] + Downloading new core: FahCore_a1.exe
[20:06:17] + 10240 bytes downloaded
[20:06:18] + 20480 bytes downloaded
[20:06:18] + 30720 bytes downloaded
[20:06:18] + 40960 bytes downloaded
[20:06:18] + 51200 bytes downloaded
[20:06:18] + 61440 bytes downloaded
[20:06:18] + 71680 bytes downloaded
[20:06:18] + 81920 bytes downloaded
[20:06:18] + 92160 bytes downloaded
[20:06:19] + 102400 bytes downloaded
[20:06:19] + 112640 bytes downloaded
[20:06:19] + 122880 bytes downloaded
[20:06:19] + 133120 bytes downloaded
[20:06:19] + 143360 bytes downloaded
[20:06:19] + 153600 bytes downloaded
[20:06:19] + 163840 bytes downloaded
[20:06:19] + 174080 bytes downloaded
[20:06:20] + 184320 bytes downloaded
[20:06:20] + 194560 bytes downloaded
[20:06:20] + 204800 bytes downloaded
[20:06:20] + 215040 bytes downloaded
[20:06:20] + 225280 bytes downloaded
[20:06:20] + 235520 bytes downloaded
[20:06:20] + 245760 bytes downloaded
[20:06:20] + 256000 bytes downloaded
[20:06:21] + 266240 bytes downloaded
[20:06:21] + 276480 bytes downloaded
[20:06:21] + 286720 bytes downloaded
[20:06:21] + 296960 bytes downloaded
[20:06:21] + 307200 bytes downloaded
[20:06:21] + 317440 bytes downloaded
[20:06:21] + 327680 bytes downloaded
[20:06:21] + 337920 bytes downloaded
[20:06:22] + 348160 bytes downloaded
[20:06:22] + 358400 bytes downloaded
[20:06:22] + 368640 bytes downloaded
[20:06:22] + 378880 bytes downloaded
[20:06:22] + 389120 bytes downloaded
[20:06:22] + 399360 bytes downloaded
[20:06:22] + 409600 bytes downloaded
[20:06:22] + 419840 bytes downloaded
[20:06:23] + 430080 bytes downloaded
[20:06:23] + 440320 bytes downloaded
[20:06:23] + 450560 bytes downloaded
[20:06:23] + 460800 bytes downloaded
[20:06:23] + 471040 bytes downloaded
[20:06:23] + 481280 bytes downloaded
[20:06:23] + 491520 bytes downloaded
[20:06:23] + 501760 bytes downloaded
[20:06:24] + 512000 bytes downloaded
[20:06:24] + 522240 bytes downloaded
[20:06:24] + 532480 bytes downloaded
[20:06:24] + 542720 bytes downloaded
[20:06:24] + 552960 bytes downloaded
[20:06:24] + 563200 bytes downloaded
[20:06:24] + 573440 bytes downloaded
[20:06:24] + 583680 bytes downloaded
[20:06:25] + 593920 bytes downloaded
[20:06:25] + 604160 bytes downloaded
[20:06:25] + 614400 bytes downloaded
[20:06:25] + 624640 bytes downloaded
[20:06:25] + 634880 bytes downloaded
[20:06:25] + 645120 bytes downloaded
[20:06:25] + 655360 bytes downloaded
[20:06:25] + 665600 bytes downloaded
[20:06:26] + 675840 bytes downloaded
[20:06:26] + 686080 bytes downloaded
[20:06:26] + 696320 bytes downloaded
[20:06:26] + 706560 bytes downloaded
[20:06:26] + 716800 bytes downloaded
[20:06:26] + 727040 bytes downloaded
[20:06:26] + 737280 bytes downloaded
[20:06:26] + 747520 bytes downloaded
[20:06:27] + 757760 bytes downloaded
[20:06:27] + 768000 bytes downloaded
[20:06:27] + 778240 bytes downloaded
[20:06:27] + 788480 bytes downloaded
[20:06:27] + 795847 bytes downloaded
[20:06:27] Verifying core Core_a1.fah...
[20:06:27] Signature is VALID
[20:06:27] 
[20:06:27] Trying to unzip core FahCore_a1.exe
[20:06:27] Decompressed FahCore_a1.exe (2117632 bytes) successfully
[20:06:27] + Core successfully engaged
[20:06:32] 
[20:06:32] + Processing work unit
[20:06:32] Core required: FahCore_a1.exe
[20:06:32] Core found.
[20:06:32] Working on Unit 04 [May 28 20:06:32]
[20:06:32] + Working ...
[20:06:33] 
[20:06:33] *------------------------------*
[20:06:33] Folding@Home Gromacs SMP Core
[20:06:33] Version 1.76 (February 23, 2008)
[20:06:33] 
[20:06:33] Preparing to commence simulation
[20:06:33] - Looking at optimizations...
[20:06:33] - Working with standard loops on this execution.
[20:06:34] - Previous termination of core was improper.
[20:06:34] - Files status OK
[20:08:34] 
[20:08:34] Folding@home Core Shutdown: MISSING_WORK_FILES
[20:08:34] Finalizing output
[20:08:37] CoreStatus = 1 (1)
[20:08:37] Client-core communications error: ERROR 0x1
[20:08:37] Deleting current work unit & continuing...
[20:10:59] - Preparing to get new work unit...
[20:10:59] + Attempting to get work packet
[20:10:59] - Connecting to assignment server
[20:11:00] - Successful: assigned to (171.64.65.64).
[20:11:00] + News From Folding@Home: Welcome to Folding@Home
[20:11:00] Loaded queue successfully.
[20:12:12] + Closed connections
[20:12:17] 
[20:12:17] + Processing work unit
[20:12:17] Core required: FahCore_a1.exe
[20:12:17] Core found.
[20:12:17] Working on Unit 05 [May 28 20:12:17]
[20:12:17] + Working ...
[20:12:18] 
[20:12:18] *------------------------------*
[20:12:18] Folding@Home Gromacs SMP Core
[20:12:18] Version 1.76 (February 23, 2008)
[20:12:18] 
[20:12:18] Preparing to commence simulation
[20:12:18] - Ensuring status. Please wait.
[20:12:35] - Looking at optimizations...
[20:12:35] - Working with standard loops on this execution.
[20:12:35] - Previous termination of core was improper.
[20:12:35] - Going to use standard loops.
[20:12:35] - Files status OK
[20:12:44] Starting from initial work pa- Starting from initial work packet
[20:12:44] 
[20:12:44] Project: 2665 (Run 1, Clone 758, Gen 3)
[20:12:44] 
[20:12:54] 665 (Run 1, Clone 758, Gen 3)
[20:12:54] 
[20:12:55] Entering M.D.
[20:13:01] Rejecting checkpoint
[20:13:03] PWriting local files
[20:13:03] 
[20:13:03] Writing local files
[20:13:11] Extra SSE boost OK.
[20:13:12] Writing local files
[20:13:13] Completed 0 out of 250000 steps  (0 percent)
[20:37:32] Writing local files
[20:37:32] Completed 2500 out of 250000 steps  (1 percent)
[21:01:41] Writing local files
PS: The log is abbreviated at the beginning; I just wanted to make the dates and times available. The time zone is GMT; the times shown are GMT-1, as usual in summer.
Regards,
Alex
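
For anyone who wants to check their own log for the same pattern, here is a minimal Python sketch that lists each EARLY_UNIT_END together with the work unit and the last completed percentage before it. It is not part of the Folding@home client. The function name `find_early_ends` is made up for illustration, and the patterns only assume the line shapes visible in the log above; other client or core versions may print these lines differently.

```python
import re

# Patterns follow the line shapes seen in the log above (an assumption;
# other client versions may format these lines differently).
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
PERCENT_RE = re.compile(r"Completed \d+ out of \d+ steps\s+\((\d+) percent\)")
EUE_MARKER = "Core Shutdown: EARLY_UNIT_END"


def find_early_ends(lines):
    """Return one (project, run, clone, gen, last_percent) tuple per EUE.

    `find_early_ends` is a hypothetical helper, not a client feature.
    """
    failures = []
    current = None        # work unit most recently announced in the log
    last_percent = None   # last progress value seen for that work unit
    reported = False      # the shutdown line is printed twice per failure
    for line in lines:
        match = PROJECT_RE.search(line)
        if match:
            # A new "Project:" line starts (or restarts) a work unit.
            current = tuple(int(g) for g in match.groups())
            last_percent = None
            reported = False
            continue
        match = PERCENT_RE.search(line)
        if match:
            last_percent = int(match.group(1))
            continue
        if EUE_MARKER in line and current is not None and not reported:
            failures.append(current + (last_percent,))
            reported = True  # skip the duplicated shutdown line
    return failures


if __name__ == "__main__":
    import sys

    # Usage (the log path is whatever your client writes, e.g. FAHlog.txt):
    #   python find_eue.py FAHlog.txt
    with open(sys.argv[1], errors="replace") as handle:
        for project, run, clone, gen, percent in find_early_ends(handle):
            print(f"Project: {project} (Run {run}, Clone {clone}, Gen {gen}) "
                  f"ended early after {percent} percent")
```

Garbled lines produced by the interleaved core output (for example a "Project:" line with its beginning cut off) simply fail to match and are ignored, so the sketch reports only what it can read cleanly.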
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: Project: 3064 (Run 4, Clone 137, Gen 18) failed twice at 11%

Post by 7im »

If you want, stop the client when it reaches 9 or 10%. Wait a minute, then restart it. If it keeps going, great; if not, the client should move on after the 3rd attempt. If it doesn't move on, delete the work unit and move on manually.

No other results for this WU have been returned.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
tear
Posts: 254
Joined: Sun Dec 02, 2007 4:08 am
Hardware configuration: None
Location: Rocky Mountains

Re: Project: 3064 (Run 4, Clone 137, Gen 18) failed twice at 11%

Post by tear »

Hey Al_Bert,

A bit from me: restarting (at 9 or 10% in your case), even on the same machine, usually
yields good results. Good as in "simulation continues past the 'crash' mark".

You may wish to give it a shot.


tear
One man's ceiling is another man's floor.
anandhanju
Posts: 522
Joined: Mon Dec 03, 2007 4:33 am
Location: Australia

Re: Project: 3064 (Run 4, Clone 137, Gen 18) failed twice at 11%

Post by anandhanju »

Hello Al_Bert, welcome to the forum :wink:

Oh, and yes, what 7im and tear said.
rbrandman
Pande Group Member
Posts: 22
Joined: Wed May 14, 2008 4:11 pm

Re: Project: 3064 (Run 4, Clone 137, Gen 18) failed twice at 11%

Post by rbrandman »

Thanks for your post. I've alerted the researcher in charge of this project, Dan Ensign.

Relly