p2665 (Run 0, Clone 587, Gen 34) CoreStatus = 63 (99)

Moderators: Site Moderators, FAHC Science Team

Foxery
Posts: 118
Joined: Mon Mar 03, 2008 3:11 am
Hardware configuration: Intel Core2 Quad Q9300 (Intel P35 chipset)
Radeon 3850, 512MB model (Catalyst 8.10)
Windows XP, SP2
Location: Syracuse, NY

p2665 (Run 0, Clone 587, Gen 34) CoreStatus = 63 (99)

Post by Foxery »

I've seen this WU stop and restart several times with the message CoreStatus = 63 (99), but each time it picks right back up and keeps going. The machine hasn't been touched beyond checking my email this morning, and there was one brief network interruption. It's currently at 68% and still chugging, but I thought the trouble might be worth reporting nonetheless.

Long log snippets below, though the CoreStatus message is the only noteworthy part I could find.
(WinSMP 5.92.)


[05:40:10] Core required: FahCore_a1.exe
[05:40:10] Core found.
[05:40:10] Working on Unit 05 [July 24 05:40:10]
[05:40:10] + Working ...
[05:40:10] - Calling 'mpiexec -channel shm -env MPICH_USE_SMP_OPTIMIZATIONS 1 -np 4 FahCore_a1.exe -dir work/ -suffix 05 -checkpoint 15 -verbose -lifeline 2788 -version 592'

[05:40:10] 
[05:40:10] *------------------------------*
[05:40:10] Folding@Home Gromacs SMP Core
[05:40:10] Version 1.76 (February 23, 2008)
[05:40:10] 
[05:40:10] Preparing to commence simulation
[05:40:10] - Looking at optimizations...
[05:40:10] - Created dyn
[05:40:10] - Files status OK
[05:40:10]  this execution.
[05:40:10] - Created dyn
[05:40:10] - Files status OK
[05:40:20] 2 percent)
[05:40:20] - Starting from initial work packet
[05:40:20] 
[05:40:20] Project: 2665 (Run 0, Clone 587, Gen 34)
[05:40:20] 
[05:40:23] Assembly optimizations on if available.
[05:40:23] Entering M.D.
[05:40:24] ing M.D.
[05:40:30] Rejecting checkpoint
[05:40:31] Protein: HGG in water
[05:40:31] Writing local files
[05:40:38] Extra SSE boost OK.
[05:40:39] Writing local files
[05:40:39] Completed 0 out of 250000 steps  (0 percent)
[05:54:30] Writing local files
[05:54:30] Completed 2500 out of 250000 steps  (1 percent)
[06:08:22] Writing local files
[06:08:22] Completed 5000 out of 250000 steps  (2 percent)
[06:22:14] Writing local files
[06:22:14] Completed 7500 out of 250000 steps  (3 percent)
[06:35:58] Writing local files
[06:35:59] Completed 10000 out of 250000 steps  (4 percent)
[06:49:50] Writing local files
[06:49:50] Completed 12500 out of 250000 steps  (5 percent)
[07:03:42] Writing local files
[07:03:42] Completed 15000 out of 250000 steps  (6 percent)
[07:17:33] Writing local files
[07:17:34] Completed 17500 out of 250000 steps  (7 percent)
[07:31:25] Writing local files
[07:31:25] Completed 20000 out of 250000 steps  (8 percent)
[07:45:17] Writing local files
[07:45:17] Completed 22500 out of 250000 steps  (9 percent)
[07:59:09] Writing local files
[07:59:09] Completed 25000 out of 250000 steps  (10 percent)
[08:13:00] Writing local files
[08:13:00] Completed 27500 out of 250000 steps  (11 percent)
[08:26:52] Writing local files
[08:26:52] Completed 30000 out of 250000 steps  (12 percent)
[08:40:43] Writing local files
[08:40:43] Completed 32500 out of 250000 steps  (13 percent)
[08:54:28] Writing local files
[08:54:29] Completed 35000 out of 250000 steps  (14 percent)
[09:08:22] Writing local files
[09:08:22] Completed 37500 out of 250000 steps  (15 percent)
[09:22:15] Writing local files
[09:22:16] Completed 40000 out of 250000 steps  (16 percent)
[09:36:07] Writing local files
[09:36:08] Completed 42500 out of 250000 steps  (17 percent)
[09:50:00] Writing local files
[09:50:00] Completed 45000 out of 250000 steps  (18 percent)
[10:03:54] Writing local files
[10:03:54] Completed 47500 out of 250000 steps  (19 percent)
[10:17:46] Writing local files
[10:17:46] Completed 50000 out of 250000 steps  (20 percent)
[10:31:39] Writing local files
[10:31:39] Completed 52500 out of 250000 steps  (21 percent)
[10:45:30] Writing local files
[10:45:31] Completed 55000 out of 250000 steps  (22 percent)
[10:59:24] Writing local files
[10:59:24] Completed 57500 out of 250000 steps  (23 percent)
[11:12:14] CoreStatus = 63 (99)
[11:12:14] + Error starting Folding@Home core or unexpected system termination of core.
[11:12:19] 
[11:12:19] + Processing work unit
[11:12:19] Core required: FahCore_a1.exe
[11:12:19] Core found.
[11:12:19] Working on Unit 05 [July 24 11:12:19]
[11:12:19] + Working ...
[11:12:19] - Calling 'mpiexec -channel shm -env MPICH_USE_SMP_OPTIMIZATIONS 1 -np 4 FahCore_a1.exe -dir work/ -suffix 05 -checkpoint 15 -verbose -lifeline 2788 -version 592'

[11:12:20] 
[11:12:20] *------------------------------*
[11:12:20] Folding@Home Gromacs SMP Core
[11:12:20] Version 1.76 (February 23, 2008)
[11:12:20] 
[11:12:20] Preparing to commence simulation
[11:12:20] - Ensuring status. Please wait.
[11:12:37] - Looking at optimizations...
[11:12:37] - Working with standard loops on this execution.
[11:12:37] - Previous termination of core was improper.
[11:12:37] - Going to use standard loops.
[11:12:37] - Files status OK
[11:12:48] (decompressed 516.2 percent)
[11:12:48] 
[11:12:48] Project: 2665 (Run 0, Clone 587, Gen 34)
[11:12:48] 
[11:12:49] 65 (Run 0, Clone 587, Gen 34)
[11:12:49] 
[11:12:50] Entering M.D.
[11:12:58] ocal files
[11:12:58] Completed 57500 out of 250000 steps  (23 percent)
[11:12:58]  of 250000 steps  (23 peCompleted 57500 out of 250000 steps  (23 percent)
[11:13:05] Extra SSE boost OK.
[11:16:24] CoreStatus = 63 (99)
[11:16:24] + Error starting Folding@Home core or unexpected system termination of core.
[11:16:29] 
[11:16:29] + Processing work unit
[11:16:29] Core required: FahCore_a1.exe
[11:16:29] Core found.
[11:16:29] Working on Unit 05 [July 24 11:16:29]
[11:16:29] + Working ...
[11:16:29] - Calling 'mpiexec -channel shm -env MPICH_USE_SMP_OPTIMIZATIONS 1 -np 4 FahCore_a1.exe -dir work/ -suffix 05 -checkpoint 15 -verbose -lifeline 2788 -version 592'

[11:16:29] 
[11:16:29] *------------------------------*
[11:16:29] Folding@Home Gromacs SMP Core
[11:16:29] Version 1.76 (February 23, 2008)
[11:16:29] 
[11:16:29] Preparing to commence simulation
[11:16:29] - Ensuring status. Please wait.
[11:16:46] - Looking at optimizations...
[11:16:46] - Working with standard loops on this execution.
[11:16:46] - Previous termination of core was improper.
[11:16:46] - Going to use standard loops.
[11:16:46] - Files status OK
[11:16:58] - Expanded 4731718 -> 24426905 (decompressed 516.2 percent)
[11:16:58] 
[11:16:58] Project: 2665 (Run 0, Clone 587, Gen 34)
[11:16:58] 
[11:17:00] Entering M.D.
[11:17:06] Calling FAH init
[11:17:08] ater
[11:17:08] Writing local files
[11:17:08] Completed 57500 out of 250000 steps  (23 percent)
[11:17:08] ter
[11:17:08] Writing local files
[11:17:08] Completed 57500 out of 250000 steps  (23 percent)
[11:17:15] Extra SSE boost OK.
[11:25:40] - Autosending finished units...
[11:25:40] Trying to send all finished work units
[11:25:40] + No unsent completed units remaining.
[11:25:40] - Autosend completed
[11:31:47] Writing local files
[11:31:47] Completed 60000 out of 250000 steps  (24 percent)
[11:46:22] Writing local files
[11:46:22] Completed 62500 out of 250000 steps  (25 percent)
[11:53:29] CoreStatus = 63 (99)
[11:53:29] + Error starting Folding@Home core or unexpected system termination of core.
[11:53:29] - Attempting to download new core...
[11:53:29] + Downloading new core: FahCore_a1.exe
[11:53:29] Downloading core (/~pande/Win32/x86_Deino/Core_a1.fah from www.stanford.edu)
[11:53:29] Could not download 'www.stanford.edu'
[11:53:29] + Error: Could not download core
[11:53:29] + Core download error (#2), waiting before retry...

[11:53:38] + Downloading new core: FahCore_a1.exe
[11:53:38] Downloading core (/~pande/Win32/x86_Deino/Core_a1.fah from www.stanford.edu)
[11:53:38] Could not download 'www.stanford.edu'
[11:53:38] + Error: Could not download core
[11:53:38] + Core download error (#3), waiting before retry...

[11:53:47] Killing all core threads

Folding@Home Client Shutdown at user request.
[11:53:47] ***** Got a SIGTERM signal (2)
[11:53:47] Killing all core threads

Folding@Home Client Shutdown.
At this point, my cable modem & software firewall froze while getting a new IP address, so I restarted the client. More CoreStatus messages below, though no trouble from the interruption itself.


--- Opening Log file [July 24 12:10:20] 


# SMP Client ##################################################################
###############################################################################

                       Folding@Home Client Version 5.92beta

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Program Files\Folding@Home Windows SMP Client V1.01
Executable: C:\Program Files\Folding@Home Windows SMP Client V1.01\fah.exe
Arguments: -local -verbosity 9 

[12:10:20] - Ask before connecting: No
[12:10:20] - User name: Foxery (Team 198)
[12:10:20] - User ID: 2102BD015E3C6C0F
[12:10:20] - Machine ID: 1
[12:10:20] 
[12:10:20] Loaded queue successfully.
[12:10:20] - Autosending finished units...
[12:10:20] 
[12:10:20] Trying to send all finished work units
[12:10:20] + Processing work unit
[12:10:20] + No unsent completed units remaining.
[12:10:20] Core required: FahCore_a1.exe
[12:10:20] - Autosend completed
[12:10:20] Core found.
[12:10:20] Working on Unit 05 [July 24 12:10:20]
[12:10:20] + Working ...
[12:10:20] - Calling 'mpiexec -channel shm -env MPICH_USE_SMP_OPTIMIZATIONS 1 -np 4 FahCore_a1.exe -dir work/ -suffix 05 -checkpoint 15 -verbose -lifeline 560 -version 592'

[12:10:21] 
[12:10:21] *------------------------------*
[12:10:21] Folding@Home Gromacs SMP Core
[12:10:21] Version 1.76 (February 23, 2008)
[12:10:21] 
[12:10:21] Preparing to commence simulation
[12:10:21] - Ensuring status. Please wait.
[12:10:38] - Looking at optimizations...
[12:10:38] - Working with standard loops on this execution.
[12:10:38] Examination of work files indicates 8 consecutive improper terminations of core.
[12:10:48] - Expanded 4731718 -> 24426905 (decompressed 516.2 percent)
[12:10:49] 
[12:10:49] Project: 2665 (Run 0, Clone 587, Gen 34)
[12:10:49] 
[12:10:50] Entering M.D.
[12:10:58] Calling FAH init
[12:10:59] ater
[12:10:59] Writing local files
[12:10:59] Completed 62500 out of 250000 steps  (25 percent)
[12:10:59] ter
[12:10:59] Writing local files
[12:10:59] Completed 62500 out of 250000 steps  (25 percent)
[12:11:06] Extra SSE boost OK.
[12:24:23] CoreStatus = 63 (99)
[12:24:23] + Error starting Folding@Home core or unexpected system termination of core.
[12:24:28] 
[12:24:28] + Processing work unit
[12:24:28] Core required: FahCore_a1.exe
[12:24:28] Core found.
[12:24:28] Working on Unit 05 [July 24 12:24:28]
[12:24:28] + Working ...
[12:24:28] - Calling 'mpiexec -channel shm -env MPICH_USE_SMP_OPTIMIZATIONS 1 -np 4 FahCore_a1.exe -dir work/ -suffix 05 -checkpoint 15 -verbose -lifeline 560 -version 592'

[12:24:28] 
[12:24:28] *------------------------------*
[12:24:28] Folding@Home Gromacs SMP Core
[12:24:28] Version 1.76 (February 23, 2008)
[12:24:28] 
[12:24:28] Preparing to commence simulation
[12:24:28] - Ensuring status. Please wait.
[12:24:45] - Looking at optimizations...
[12:24:45] - Working with standard loops on this execution.
[12:24:45] Examination of work files indicates 8 consecutive improper terminations of core.
[12:24:57] - Expanded 4731718 -> 24426905 (decompressed 516.2 percent)
[12:24:57] 
[12:24:57] Project: 2665 (Run 0, Clone 587, Gen 34)
[12:24:57] 
[12:24:59] Entering M.D.
[12:25:05] Calling FAH init
[12:25:07] ater
[12:25:07] Writing local files
[12:25:07] Completed 62500 out of 250000 steps  (25 percent)
[12:25:07] ter
[12:25:07] Writing local files
[12:25:07] Completed 62500 out of 250000 steps  (25 percent)
[12:25:14] Extra SSE boost OK.
[12:39:47] Writing local files
[12:39:47] Completed 65000 out of 250000 steps  (26 percent)
[12:54:20] Writing local files
[12:54:20] Completed 67500 out of 250000 steps  (27 percent)
[13:08:51] Writing local files
[13:08:51] Completed 70000 out of 250000 steps  (28 percent)
[13:23:10] Writing local files
[13:23:11] Completed 72500 out of 250000 steps  (29 percent)
[13:37:42] Writing local files
[13:37:43] Completed 75000 out of 250000 steps  (30 percent)
[13:52:16] Writing local files
[13:52:16] Completed 77500 out of 250000 steps  (31 percent)
[14:06:47] Writing local files
[14:06:48] Completed 80000 out of 250000 steps  (32 percent)
[14:21:19] Writing local files
[14:21:19] Completed 82500 out of 250000 steps  (33 percent)
[14:35:50] Writing local files
[14:35:50] Completed 85000 out of 250000 steps  (34 percent)
[14:50:22] Writing local files
[14:50:22] Completed 87500 out of 250000 steps  (35 percent)
[15:04:52] Writing local files
[15:04:53] Completed 90000 out of 250000 steps  (36 percent)
[15:19:23] Writing local files
[15:19:24] Completed 92500 out of 250000 steps  (37 percent)
[15:33:55] Writing local files
[15:33:55] Completed 95000 out of 250000 steps  (38 percent)
[15:48:56] Timered checkpoint triggered.
[15:49:08] Writing local files
[15:49:08] Completed 97500 out of 250000 steps  (39 percent)
[16:04:09] Timered checkpoint triggered.
[16:04:22] Writing local files
[16:04:22] Completed 100000 out of 250000 steps  (40 percent)
[16:19:23] Timered checkpoint triggered.
[16:19:35] Writing local files
[16:19:35] Completed 102500 out of 250000 steps  (41 percent)
[16:34:36] Timered checkpoint triggered.
[16:34:47] Writing local files
[16:34:47] Completed 105000 out of 250000 steps  (42 percent)
[16:49:48] Timered checkpoint triggered.
[16:50:01] Writing local files
[16:50:01] Completed 107500 out of 250000 steps  (43 percent)
[17:05:02] Timered checkpoint triggered.
[17:05:13] Writing local files
[17:05:13] Completed 110000 out of 250000 steps  (44 percent)
[17:20:14] Timered checkpoint triggered.
[17:20:25] Writing local files
[17:20:26] Completed 112500 out of 250000 steps  (45 percent)
[17:27:44] CoreStatus = 63 (99)
[17:27:44] + Error starting Folding@Home core or unexpected system termination of core.
[17:27:49] 
[17:27:49] + Processing work unit
[17:27:49] Core required: FahCore_a1.exe
[17:27:49] Core found.
[17:27:49] Working on Unit 05 [July 24 17:27:49]
[17:27:49] + Working ...
[17:27:49] - Calling 'mpiexec -channel shm -env MPICH_USE_SMP_OPTIMIZATIONS 1 -np 4 FahCore_a1.exe -dir work/ -suffix 05 -checkpoint 15 -verbose -lifeline 560 -version 592'

[17:27:49] 
[17:27:49] *------------------------------*
[17:27:49] Folding@Home Gromacs SMP Core
[17:27:49] Version 1.76 (February 23, 2008)
[17:27:49] 
[17:27:49] Preparing to commence simulation
[17:27:49] - Ensuring status. Please wait.
[17:28:06] - Looking at optimizations...
[17:28:06] - Working with standard loops on this execution.
[17:28:06] Examination of work files indicates 8 consecutive improper terminations of core.
[17:28:16] - Expanded 4731718 -> 24426905 (decompressed 516.2 percent)
[17:28:16] 
[17:28:16] Project: 2665 (Run 0, Clone 587, Gen 34)
[17:28:16] 
[17:28:19] Entering M.D.
[17:28:26] Calling FAH init
[17:28:27] ater
[17:28:27] Writing local files
[17:28:27] rom checkpoint)
[17:28:27] Read checkpoint
[17:28:27] Protein: HGG in water
[17:28:27] Writing local files
[17:28:28] Completed 112500 out of 250000 steps  (45 percent)
[17:28:34] Extra SSE boost OK.
[17:42:24] Writing local files
[17:42:24] Completed 115000 out of 250000 steps  (46 percent)
[17:56:08] Writing local files
[17:56:08] Completed 117500 out of 250000 steps  (47 percent)
[18:09:58] Writing local files
Core2 Quad/Q9300, Radeon 3850/512MB (WinXP SP2)
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: p2665 (Run 0, Clone 587, Gen 34) CoreStatus = 63 (99)

Post by bruce »

Known issue: Changes to the network interface will crash FAH-SMP with either version of MPI.
If you're running 5.92, work will resume from the last checkpoint.
If you're running 5.91, the WU will be discarded and a new assignment will be downloaded.
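
To check the "resumes from the last checkpoint" behaviour against a log like the one above, here is a rough Python sketch. It pairs the last step count logged before each CoreStatus error with the first one logged after the restart. The function name `resume_points` is made up for illustration, and it assumes the "Completed N out of M steps" progress lines are a fair stand-in for checkpoints, which may not be exactly true.

```python
import re

# Matches progress lines such as "Completed 57500 out of 250000 steps  (23 percent)".
STEP_RE = re.compile(r"Completed (\d+) out of (\d+) steps")
CRASH = "CoreStatus = 63 (99)"

def resume_points(lines):
    """Return (steps_before_crash, steps_after_restart) for each crash.

    Hypothetical helper: steps_before_crash is None if the log shows
    no progress line before the crash.
    """
    results = []
    last_step = None
    pending = []  # crashes still waiting for the first progress line after restart
    for line in lines:
        if CRASH in line:
            pending.append(last_step)
            continue
        m = STEP_RE.search(line)
        if m:
            step = int(m.group(1))
            for before in pending:
                results.append((before, step))
            pending = []
            last_step = step
    return results
```

Feed it the log lines, for example `resume_points(open("FAHlog.txt"))` if that is where your client writes its log. Equal numbers in a pair mean no logged progress was lost across that restart.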
Foxery

Re: p2665 (Run 0, Clone 587, Gen 34) CoreStatus = 63 (99)

Post by Foxery »

Yes, but there was only 1 network interruption, and 4-5 instances of the error message, spaced hours apart from each other.
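
For anyone wanting to check the spacing in their own log, a small Python sketch that pulls out the timestamp of each CoreStatus line and the gap to the next one. The helper names are made up, and since the log timestamps carry no date, a negative gap is assumed to mean the log rolled past midnight.

```python
import re
from datetime import timedelta

# Log lines look like "[11:12:14] CoreStatus = 63 (99)".
LINE_RE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s?(.*)$")

def core_status_times(lines, marker="CoreStatus = 63 (99)"):
    """Return the time of day (as a timedelta) of each line containing marker."""
    times = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and marker in m.group(4):
            h, mi, s = (int(m.group(i)) for i in (1, 2, 3))
            times.append(timedelta(hours=h, minutes=mi, seconds=s))
    return times

def gaps(times):
    """Return the time between consecutive entries, assuming at most one day apart."""
    out = []
    for earlier, later in zip(times, times[1:]):
        delta = later - earlier
        if delta < timedelta(0):
            delta += timedelta(days=1)  # assumption: the log crossed midnight
        out.append(delta)
    return out
```

Something like `gaps(core_status_times(open("FAHlog.txt")))` then gives the spacing between errors, with the log path being whatever your client uses.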
Core2 Quad/Q9300, Radeon 3850/512MB (WinXP SP2)
toTOW
Site Moderator
Posts: 6395
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: p2665 (Run 0, Clone 587, Gen 34) CoreStatus = 63 (99)

Post by toTOW »

Tell us if you manage to finish the WU ...

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
Foxery

Re: p2665 (Run 0, Clone 587, Gen 34) CoreStatus = 63 (99)

Post by Foxery »

Update: It did finish last night, and uploaded successfully. There were no further troubles all the way from 45%-100%. Weird, eh? I guess I won't worry about it...
Core2 Quad/Q9300, Radeon 3850/512MB (WinXP SP2)