Project: 2662 checkpoint failure -> ERROR 0xff

Moderators: Site Moderators, FAHC Science Team

DaveGS4
Posts: 1
Joined: Thu Jan 03, 2008 2:11 am

Project: 2662 checkpoint failure -> ERROR 0xff

Post by DaveGS4 »

Kasson et al,

I'm trying to help one of my team members debug an issue with this same work unit (but different project). I'm not a Mac guy, so I'm not sure whether it's a case of a bad WU still being served up or something else. Any thoughts (details below)?

Edit: sorry, I was a little slow to realize this is a different run and clone. I was focused on searching for the Mac client, project, and error code in recent posts.
ebil wrote: I'm running the SMP client on a Mac Pro.

In system preferences, it says f@h is running, but looking at my stats it doesn't seem to be.

FAHlog:

[06:46:14] - Ask before connecting: No
[06:46:14] - User name: ebil (Team 45435)
[06:46:14] - User ID: 2E2D275A18254D99
[06:46:14] - Machine ID: 1
[06:46:14]
[06:46:15] Loaded queue successfully.
[06:46:15]
[06:46:15] + Processing work unit
[06:46:15] Core required: FahCore_a2.exe
[06:46:15] Core found.
[06:46:15] Working on queue slot 08 [September 5 06:46:15 UTC]
[06:46:15] + Working ...
[06:46:15]
[06:46:15] *------------------------------*
[06:46:15] Folding@Home Gromacs SMP Core
[06:46:15] Version 2.01 (Wed Jul 16 08:26:53 PDT 2008)
[06:46:15]
[06:46:15] Preparing to commence simulation
[06:46:15] - Ensuring status. Please wait.
[06:46:24] - Looking at optimizations...
[06:46:24] - Working with standard loops on this execution.
[06:46:24] - Files status OK
[06:46:26] - Expanded 4854119 -> 24045785 (decompressed 495.3 percent)
[06:46:26] Called DecompressByteArray: compressed_data_size=4854119 data_size=24045785, decompressed_data_size=24045785 diff=0
[06:46:26] - Digital signature verified
[06:46:26]
[06:46:26] Project: 2662 (Run 0, Clone 258, Gen 29)
[06:46:26]
[06:46:27] Entering M.D.
[06:46:33] Will resume from checkpoint file
[06:46:33] Node 1 initialized
[06:46:35] Resuming from checkpoint
[06:46:35] File work/wudata_08.log has changed since last checkpoint
[06:46:40] CoreStatus = FF (255)
[06:46:40] Client-core communications error: ERROR 0xff
[06:46:40] This is a sign of more serious problems, shutting down.
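
For anyone scanning a long FAHlog for this kind of failure, here is a minimal Python sketch that pulls out the `CoreStatus` lines. It is not part of the client. The function name `find_core_status` is made up for illustration, and the line format is assumed only from the log excerpt above.

```python
import re

# Matches lines such as "[06:46:40] CoreStatus = FF (255)".
# The format is inferred from the excerpt above, not from client documentation.
CORE_STATUS = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\]\s+CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)"
)


def find_core_status(log_text):
    """Return (timestamp, status_code) pairs for every CoreStatus line."""
    results = []
    for line in log_text.splitlines():
        match = CORE_STATUS.match(line.strip())
        if match:
            # The status is printed in hex, so convert it to an integer.
            results.append((match.group(1), int(match.group(2), 16)))
    return results
```

Run against the excerpt above, this should report a single entry at 06:46:40 with status 255 (0xff).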
kasson
Pande Group Member
Posts: 1459
Joined: Thu Nov 29, 2007 9:37 pm

Re: Project: 2662 checkpoint failure -> ERROR 0xff

Post by kasson »

Hard to say. It's more of a system issue than a work unit issue--the work files failed some of our consistency checks. The user should clean out the work directory, remove queue.dat, and restart the client. There's not much that can rescue this work unit.
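
The cleanup steps above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes the client has already been stopped, that `work/` and `queue.dat` sit directly inside the client's install directory (as the `work/wudata_08.log` path in the log suggests), and the function name `reset_client_state` is hypothetical. It deletes files permanently and discards the work unit in progress, so check the path before running it.

```python
import shutil
from pathlib import Path


def reset_client_state(client_dir):
    """Empty <client_dir>/work and delete <client_dir>/queue.dat.

    Assumes the client is not running. Returns the removed entries, sorted.
    """
    client_dir = Path(client_dir)
    removed = []

    # Clean out the work directory but keep the directory itself.
    work_dir = client_dir / "work"
    if work_dir.is_dir():
        for entry in work_dir.iterdir():
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(f"work/{entry.name}")

    # Remove the queue file so the client starts with a fresh queue.
    queue_file = client_dir / "queue.dat"
    if queue_file.exists():
        queue_file.unlink()
        removed.append("queue.dat")

    return sorted(removed)
```

After it finishes, restart the client by hand. The script does not try to stop or start the client itself.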