171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 5:41 am
by Tobit
Several users on my team are reporting that their GPU clients are trying to download FahCore_65.exe, which doesn't exist. Correct me if I am wrong, but Core 65 used to be the Tinker core, no?

[05:25:13] - Preparing to get new work unit...
[05:25:13] + Attempting to get work packet
[05:25:13] - Connecting to assignment server
[05:25:14] - Successful: assigned to (171.64.65.71).
[05:25:14] + News From Folding@Home: Welcome to Folding@Home
[05:25:14] Loaded queue successfully.
[05:25:14] - Deadline time not received.
[05:25:15] + Closed connections
[05:25:15]
[05:25:15] + Processing work unit
[05:25:15] Core required: FahCore_65.exe
[05:25:15] Core not found.
[05:25:15] - Core is not present or corrupted.
[05:25:15] - Attempting to download new core...
[05:25:15] + Downloading new core: FahCore_65.exe
[05:25:15] - Error: HTTP GET returned error code 404
[05:25:15] + Error: Could not download core
[05:25:15] + Core download error (#2), waiting before retry...

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 5:49 am
by weedacres
I have the same problem.
I currently have 5 GPU2 clients down.

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 5:52 am
by pwnchu
Deleting the WU/queue seems to be solving the problem for now.

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 6:03 am
by weedacres
pwnchu wrote:Deleting the WU/queue seems to be solving the problem for now.
Good call, that got them back to Core 11... for now.

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 6:06 am
by oneran
This just happened to me as well. Luckily, I caught it as soon as it happened.

[05:46:59] Folding@home Core Shutdown: FINISHED_UNIT
[05:47:02] CoreStatus = 64 (100)
[05:47:02] Sending work to server
[05:47:02] Project: 10111 (Run 666, Clone 4, Gen 90)


[05:47:02] + Attempting to send results [November 22 05:47:02 UTC]
[05:47:04] + Results successfully sent
[05:47:04] Thank you for your contribution to Folding@Home.
[05:47:04] + Number of Units Completed: 268

[05:47:08] - Preparing to get new work unit...
[05:47:08] + Attempting to get work packet
[05:47:08] - Connecting to assignment server
[05:47:09] - Successful: assigned to (171.64.65.71).
[05:47:09] + News From Folding@Home: Welcome to Folding@Home
[05:47:10] Loaded queue successfully.
[05:47:10] - Deadline time not received.
[05:47:10] + Closed connections
[05:47:10] 
[05:47:10] + Processing work unit
[05:47:10] Core required: FahCore_65.exe
[05:47:10] Core not found.
[05:47:10] - Core is not present or corrupted.
[05:47:10] - Attempting to download new core...
[05:47:10] + Downloading new core: FahCore_65.exe
[05:47:11] - Error: HTTP GET returned error code 404
[05:47:11] + Error: Could not download core
[05:47:11] + Core download error (#2), waiting before retry...

[05:47:16] + Downloading new core: FahCore_65.exe
[05:47:17] - Error: HTTP GET returned error code 404
[05:47:17] + Error: Could not download core
[05:47:17] + Core download error (#3), waiting before retry...

[05:47:34] + Downloading new core: FahCore_65.exe
[05:47:34] - Error: HTTP GET returned error code 404
[05:47:34] + Error: Could not download core
[05:47:34] + Core download error (#4), waiting before retry...

[05:48:00] + Downloading new core: FahCore_65.exe
[05:48:00] - Error: HTTP GET returned error code 404
[05:48:00] + Error: Could not download core
[05:48:00] + Core download error (#5), waiting before retry...
Deleting the WU does seem to fix the problem, although I may not catch it the next time it happens :?
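
For anyone worried about missing the next occurrence, the retry loop in the log above can be spotted automatically. What follows is only a minimal sketch, not part of the client: the function name `find_stuck_core` and the `max_retries` threshold are made up for illustration, and the patterns are taken from the log lines quoted in this thread.

```python
import re

# Patterns copied from the log excerpts in this thread.
CORE_REQUIRED = re.compile(r"Core required: (\S+)")
RETRY = re.compile(r"Core download error \(#(\d+)\)")


def find_stuck_core(log_lines, max_retries=3):
    """Return (core_name, retries) if the most recent work unit shows a core
    download failing at least max_retries times, otherwise None.

    find_stuck_core and max_retries are hypothetical names for this sketch.
    """
    core = None
    retries = 0
    for line in log_lines:
        match = CORE_REQUIRED.search(line)
        if match:
            # A new work unit started: remember its core and reset the count.
            core = match.group(1)
            retries = 0
            continue
        match = RETRY.search(line)
        if match:
            # The client numbers its own retries, so keep the highest seen.
            retries = max(retries, int(match.group(1)))
    if core is not None and retries >= max_retries:
        return core, retries
    return None
```

Feeding it the lines of the client log (for example `open(path).read().splitlines()`, where `path` is wherever your client writes its log) gives a non-None result once the download has failed `max_retries` times, which could then trigger an alert or the WU deletion described above.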

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 6:19 am
by bruce
The Pande Group is aware of the problem so something will probably change soon.

Additional "me too" reports won't change their investigation.

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 7:27 am
by toTOW
By the way, it made me laugh a lot to see the reference to the good old Tinker core :lol:

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 7:55 am
by HaloJones
bruce wrote:The Pande Group is aware of the problem so something will probably change soon.

Additional "me too" reports won't change their investigation.
No, I guess it won't but does it hurt to be able to moan about it? Or just maybe for PG to post what went wrong once they know why it happened and what they're going to do to stop the same thing happening again?

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 11:51 am
by Mactin
I just woke up. Horror!
Same problem since 5:19 (folding log time); it is now 11:51.

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 1:41 pm
by VijayPande
Thanks for the report. We'll take this machine off of the AS until this is worked out.

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 1:42 pm
by VijayPande
PS I see that Dr. Greg Bowman (project manager for this server) took this machine off of the AS last night when we got the first reports. We're looking to see why this happened. The v6 WS code has been developing rapidly, adding new features, and this looks to be a WS bug.

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 3:00 pm
by George144
I have the same problem being assigned core 65 and tried deleting the wu queue but was not successful in getting back to normal. I am in my second day with this error. Any other suggestions?
George144

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 4:57 pm
by AtwaterFS
Same problem here. Only one out of 3 GPU clients seems to be affected, FWIW.

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 5:19 pm
by JimF
George144 wrote:I have the same problem being assigned core 65 and tried deleting the wu queue but was not successful in getting back to normal.
Deleting everything in the "work" folder, the Core 65 file itself and the queue.dat file (along with the log files) worked for me on a couple of cards on two different PCs.
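
The cleanup described above could be scripted roughly as follows. This is a sketch under assumptions, not an official tool: `reset_client_dir` is a made-up name, the core file name `FahCore_65.exe` is taken from the logs in this thread, and the `FAHlog*.txt` log pattern is an assumption about how the client names its log files. Stop the client first, and use the default dry run to see what would be removed.

```python
import shutil
from pathlib import Path


def reset_client_dir(client_dir, core_name="FahCore_65.exe",
                     log_glob="FAHlog*.txt", dry_run=True):
    """List (and, unless dry_run, delete) the files named in the post above.

    Hypothetical helper: the defaults for core_name and log_glob are
    assumptions and may need adjusting for a given install.
    """
    root = Path(client_dir)
    targets = []

    # Everything inside the "work" folder (the folder itself is kept).
    work = root / "work"
    if work.is_dir():
        targets.extend(sorted(work.iterdir()))

    # The queue file and the core file, if present.
    for name in ("queue.dat", core_name):
        path = root / name
        if path.exists():
            targets.append(path)

    # The log files.
    targets.extend(sorted(root.glob(log_glob)))

    if not dry_run:
        for path in targets:
            if path.is_dir():
                shutil.rmtree(path)
            else:
                path.unlink()
    return targets
```

Calling `reset_client_dir(r"C:\path\to\gpu-client")` (a placeholder path) only returns the list of files it would delete; passing `dry_run=False` actually removes them. Other files in the folder, such as the client configuration, are left untouched.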

Re: 171.64.65.71 assigning Core 65 to GPU clients

Posted: Mon Nov 22, 2010 6:15 pm
by tofuwombat
HaloJones wrote:
bruce wrote:The Pande Group is aware of the problem so something will probably change soon.

Additional "me too" reports won't change their investigation.
No, I guess it won't but does it hurt to be able to moan about it? Or just maybe for PG to post what went wrong once they know why it happened and what they're going to do to stop the same thing happening again?
Has a "me-too" button been considered? It might help to highlight the "popularity" of an issue. :)

(I had seven GPUs choking on this core last night. WU deletion wasn't helping with six of them...)

It is very helpful to me to know that my problems might not be fixable by me just yet.

It made it easier to step away, knowing that others were suffering the same pain.

Regards,
tofuwombat.

P.S.: Looks better now. Thanks guys, for the quick fix.