
Unfinished p4620 lurking in queue?

Posted: Thu Sep 25, 2008 9:08 pm
by John Deas
User name: John_Deas (Team 0)
User ID: 7FE55C6067C16CA5
Client: Windows CPU Console Edition
Client Version: 6.20
Operating System: Windows XP Home v5.1.2600 SP2
CPU: AMD Sempron 3400+ x86 Family 15 Model 47 Stepping 2
Overclocked: No


I recently went on holiday and before going ran FaH with the -oneunit parameter so as not to start a new unit. On 11 Sept, it duly finished p4619 R/C/G 8/22/5 and closed down. At that point, as far as I knew, there was no unfinished business.

Returning today I restarted the service but to my surprise it did not fetch a new WU; it said "Loaded queue successfully.. processing work unit... Working on queue slot 04... Project: 4620 (Run 2, Clone 15, Gen 4)... Completed 1650000 out of 2500000 steps (66%)" (fuller log file extracts attached).

What puzzles me is, where did this p4620 come from? It's not mentioned in my log file, nor in FAHlog-Prev.txt which goes back to 5 September. I keep a record of the WUs my system processes and have not logged this; but I suppose it must have been started some time in the past, stopped for some reason, and sat unfinished in the queue until now; but how could that have happened and why didn't it get completed sooner?
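One way to check which work units a log mentions is to scan it for "Project:" lines. The following is only a minimal sketch: it assumes the line format seen in the extracts in this thread, and the function name `find_work_units` and the sample text are made up for illustration. The time stamps carry no date, so the "Opening Log file" lines would still be needed to place an entry on a particular day.

```python
import re

# Matches lines such as "[07:05:49] Project: 4619 (Run 8, Clone 22, Gen 5)".
PROJECT_LINE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\] Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)"
)


def find_work_units(log_text):
    """Return (time, project, run, clone, gen) for every Project line found."""
    found = []
    for line in log_text.splitlines():
        match = PROJECT_LINE.match(line.strip())
        if match:
            time, *numbers = match.groups()
            found.append((time, *map(int, numbers)))
    return found


# Illustrative sample built from lines quoted in this thread.
sample = """\
[07:05:49] Sending work to server
[07:05:49] Project: 4619 (Run 8, Clone 22, Gen 5)
[17:51:36]
[17:51:36] Project: 4620 (Run 2, Clone 15, Gen 4)
"""

for entry in find_work_units(sample):
    print(entry)
```

To try it on real logs, read the contents of FAHlog.txt and FAHlog-Prev.txt and pass each to `find_work_units`.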

I will let it complete, but I guess it will be out of time by now.

Is it possible to tell whether and when p4620 R/C/G 2/15/4 was issued to me?

Log file extract:

Code:

(11 September 2008)
[07:04:43] Completed 2500000 out of 2500000 steps  (100%)
[07:04:43] Writing checkpoint files
[07:05:43] 
[07:05:43] Finished Work Unit:
[07:05:43] Leaving Run
[07:05:45] - Writing 368430 bytes of core data to disk...
[07:05:45]   ... Done.
[07:05:45] - Shutting down core
[07:05:45] 
[07:05:45] Folding@home Core Shutdown: FINISHED_UNIT
[07:05:49] CoreStatus = 64 (100)
[07:05:49] Unit 3 finished with 98 percent of time to deadline remaining.
[07:05:49] Updated performance fraction: 0.954535
[07:05:49] Sending work to server
[07:05:49] Project: 4619 (Run 8, Clone 22, Gen 5)


[07:05:49] + Attempting to send results [September 11 07:05:49 UTC]
[07:05:49] - Reading file work/wuresults_03.dat from core
[07:05:49]   (Read 368430 bytes from disk)
[07:05:49] Connecting to http://169.230.26.30:8080/
[07:05:59] Posted data.
[07:05:59] Initial: 0000; - Uploaded at ~36 kB/s
[07:05:59] - Averaged speed for that direction ~30 kB/s
[07:05:59] + Results successfully sent
[07:05:59] Thank you for your contribution to Folding@Home.
[07:05:59] + Number of Units Completed: 791

[07:06:03] Trying to send all finished work units
[07:06:03] + No unsent completed units remaining.
[07:06:03] + -oneunit flag given and have now finished a unit. Exiting.***** Got a SIGTERM signal (2)
[07:06:04] Killing all core threads

Folding@Home Client Shutdown.

(system off for two weeks: on restarting... )

--- Opening Log file [September 25 17:51:34 UTC] 


# Windows CPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.20

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Program Files\Fah
Service: C:\Program Files\Fah\[email protected]
Arguments: -svcstart -d C:\Program Files\Fah -verbosity 9 

Launched as a service.
Entered C:\Program Files\Fah to do work.

[17:51:34] - Ask before connecting: No
[17:51:34] - User name: John_Deas (Team 0)
[17:51:34] - User ID: 7FE55C6067C16CA5
[17:51:34] - Machine ID: 1
[17:51:34] 
[17:51:34] Loaded queue successfully.
[17:51:34] 
[17:51:34] + Processing work unit
[17:51:34] Core required: FahCore_82.exe
[17:51:34] - Autosending finished units... [September 25 17:51:34 UTC]
[17:51:34] Trying to send all finished work units
[17:51:34] + No unsent completed units remaining.
[17:51:34] - Autosend completed
[17:51:34] Core found.
[17:51:34] Working on queue slot 04 [September 25 17:51:34 UTC]
[17:51:34] + Working ...
[17:51:34] - Calling '.\FahCore_82.exe -dir work/ -suffix 04 -checkpoint 15 -service -verbose -lifeline 3388 -version 620'

[17:51:35] 
[17:51:35] *------------------------------*
[17:51:35] Folding@Home PMD Core
[17:51:35] Version 1.03 (September 7, 2005)
[17:51:35] 
[17:51:35] Preparing to commence simulation
[17:51:35] - Looking at optimizations...
[17:51:35] - Files status OK
[17:51:36] - Expanded 16554 -> 106504 (decompressed 643.3 percent)
[17:51:36] 
[17:51:36] Project: 4620 (Run 2, Clone 15, Gen 4)
[17:51:36] 
[17:51:36] Assembly optimizations on if available.
[17:51:36] Entering M.D.
[17:56:32] Protein: p4620_T0_NTL9-12_minout
[17:56:32] 
[17:56:32] Completed 1650000 out of 2500000 steps  (66%)
[18:04:11] Writing local files
[18:04:11] Completed 1675000 out of 2500000 steps  (67%)

Regards,
John Deas

Re: Unfinished p4620 lurking in queue?

Posted: Thu Sep 25, 2008 11:42 pm
by toTOW
Here is what I have:

WU assigned to donor at: 2008-09-11 00:06:14
Entered into logs at: 2008-09-25 16:00:04
Days taken to complete WU: 14.66
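
The "days taken" figure can be reproduced from the two time stamps above. This is a small sketch, assuming both stamps come from the same server clock (the thread does not say which time zone that is):

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"

# Time stamps copied from the server record quoted above.
assigned = datetime.strptime("2008-09-11 00:06:14", FORMAT)
entered = datetime.strptime("2008-09-25 16:00:04", FORMAT)

# Elapsed time expressed in fractional days.
elapsed_days = (entered - assigned).total_seconds() / 86400
print(f"{elapsed_days:.2f}")
```

Rounded to two decimals, this matches the 14.66 days reported.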

The WU was completed successfully and you got full credit for it:

Hi John_Deas (team 0),
Your WU (P4620 R2 C15 G4) was added to the stats database on 2008-09-25 16:11:41 for 84.69 points of credit.

Re: Unfinished p4620 lurking in queue?

Posted: Fri Sep 26, 2008 5:01 pm
by John Deas
Thanks. That's extremely odd. It looks as though the -oneunit parameter doesn't do what it says on the tin. Preparatory to going away, I started the service with it set, and in due course the log-file read:

Code:

[07:05:59] + Results successfully sent
[07:05:59] Thank you for your contribution to Folding@Home.
[07:05:59] + Number of Units Completed: 791

[07:06:03] Trying to send all finished work units
[07:06:03] + No unsent completed units remaining.
[07:06:03] + -oneunit flag given and have now finished a unit. Exiting.***** Got a SIGTERM signal (2)
[07:06:04] Killing all core threads

Folding@Home Client Shutdown.

The system remained switched on a few hours more, but there were no more entries in FAHlog.txt. However, from what you say, p4620 R/C/G 2/15/4 was assigned to me (allowing for a seven-hour time difference) almost immediately, at 07:06:14, and (judging from the amount of work which had been done) was evidently being processed, without making any log file entries, until the system was switched off six hours or so later.
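
The time comparison above can be sketched as follows. It rests on two assumptions: that the server stamp is seven hours behind UTC (the offset guessed in this post, not confirmed by the server record), and that the client log times are UTC, as the bracketed stamps in the log indicate.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the server clock runs seven hours behind UTC.
SERVER_TZ = timezone(timedelta(hours=-7))

# "WU assigned to donor at: 2008-09-11 00:06:14" (server time).
assigned_server = datetime(2008, 9, 11, 0, 6, 14, tzinfo=SERVER_TZ)
assigned_utc = assigned_server.astimezone(timezone.utc)

# Last client log entry before shutdown: "[07:06:04] Killing all core threads".
shutdown_utc = datetime(2008, 9, 11, 7, 6, 4, tzinfo=timezone.utc)

gap_seconds = (assigned_utc - shutdown_utc).total_seconds()
print(assigned_utc.strftime("%Y-%m-%d %H:%M:%S UTC"))
print(gap_seconds)
```

Under those assumptions the new WU was assigned only seconds after the logged shutdown.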

So it appears that the effect of -oneunit in this case was not "after this unit, stop work" but "after this unit, say you have shut down, but actually get another WU and carry on working without writing anything to the log." Do you know if anyone else has encountered this? Should I report it in the "Windows v6.20 Uniprocessor Client" forum?

Regards, John Deas

Re: Unfinished p4620 lurking in queue?

Posted: Fri Sep 26, 2008 8:26 pm
by toTOW
If you were running in service mode, the service manager restarted the service ...

To use -oneunit, you have to disable the service, and run from a shortcut.
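
As a rough illustration of stopping and disabling a Windows service from the command line rather than through the Services console, the sketch below builds the two `sc` commands involved. The service name is hypothetical (the real name of the FAH service is not given in this thread), and the commands are only printed, not executed.

```python
def oneunit_prep_commands(service_name):
    """Build the sc commands that stop a service and disable its startup.

    service_name is a placeholder; look up the actual name in the
    Services console before using these commands.
    """
    return [
        ["sc", "stop", service_name],
        # "start=" and "disabled" are separate arguments, which yields the
        # "start= disabled" spelling (with a space) that sc expects.
        ["sc", "config", service_name, "start=", "disabled"],
    ]


# Hypothetical service name, for illustration only.
for command in oneunit_prep_commands("ExampleFahService"):
    print(" ".join(command))
```

On a Windows machine, each list could be passed to `subprocess.run` from an administrator account; after that, the client can be started from a shortcut with -oneunit.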

Re: Unfinished p4620 lurking in queue?

Posted: Fri Sep 26, 2008 8:58 pm
by John Deas
Ah, that's probably it. I did set the service to "manual" startup rather than "automatic", but perhaps that only applies at system start time. But I wonder why, after restarting, it didn't write anything in FAHlog.txt?

Regards, John Deas

Re: Unfinished p4620 lurking in queue?

Posted: Sat Sep 27, 2008 1:08 am
by toTOW
I don't know what happened ... but luckily, the uniprocessor client has long deadlines, so the WU hasn't been lost :)

Re: Unfinished p4620 lurking in queue?

Posted: Sat Sep 27, 2008 5:36 am
by anko1
Not sure how helpful this will be, but occasionally in the past, I accidentally ran a second instance of the client when the first instance was in service mode and stopped (I think). Anyway, if I looked at the console output, it said something to the effect that the FAHlog was open, so this was running in FAHlog2. However, even though there's a FAHlog2 file, it was empty. [Which is why I can't give you the exact "error" message.]

Edit: this doesn't explain why the unit finished in your client. Usually I had to let it finish and then restart the original. However, I usually had an active WU in the original, too, so maybe your client was able to pick it up since it was empty?

Re: Unfinished p4620 lurking in queue?

Posted: Sat Sep 27, 2008 9:47 pm
by John Deas
Ha! On reading your post I checked and yes, I have a FAHlog2.txt file, and mine is empty too. It seems the sequence is something like (a) I set the service startup to "Manual", wrongly thinking that would prevent it restarting, (b) I ran it as a service with -oneunit set, (c) it finished the WU and closed down, (d) Service Manager thought there ought to be one running and started up another instance, which began working and, noticing that the first was somehow still around, opened FAHlog2.txt but (e) for some reason never wrote anything to it. Funny. If I can think of a simpler way to describe it, maybe I will put a note in the Wiki about it. Anyway, I know what (not) to do next time.

Regards, John Deas

Re: Unfinished p4620 lurking in queue?

Posted: Sat Sep 27, 2008 10:26 pm
by anko1
Yes, as I recall, the only way to avoid the second log was to kill the service totally by disabling it in the Services console (services.msc) before running the client manually. I presume it's a bug [or a feature? ;-)] that nothing shows up in FAHlog2.

Angela