Problem with P3852

Posted: Tue Jan 08, 2008 8:41 pm
by Baowoulf
I have two computers running F@H 5.03.

Each computer has done 2 or so P3852's. Each had one of them not counted on my stats page while the other finished without a problem. I know it's been a while, but when it first happened I was advised to just wait, and it was around the time of the power surge, so I probably got a bit lazy. Anyway,

On mine it was on December 17, 2007: Project 3852 (Run 1009, Clone 0, Gen 26). I could tell that it didn't go through because 4+ hours after it had finished on the 17th, the stats page still said my last WU was finished on the 16th. I can only access the part of the log that I posted about in another problem, since it seems that because I had to restart, the log for that day doesn't appear anymore. Is there any way to get that info back? And how do you know when a restart will reset your log? Is it if your last WU finished on the day you restarted?

On my second computer it was on December 26, 2007: Project 3852 (Run 4691, Clone 0, Gen 22). I haven't restarted this computer, so I can still access its log file.

I also had an EUE, but just once. It hasn't repeated itself, and neither has the ERROR0x79, which was for a possible bad WU. Should I not worry about these since they haven't happened a second time?

Re: Problem with P3852

Posted: Tue Jan 08, 2008 9:16 pm
by 7im
You only get credit the first time you submit the work unit.

You need to change the client option for USE IE SETTINGS to NO (default setting), and then restart the client. With the incorrect YES setting, the work unit will upload, but the fah client is blocked from receiving the upload acknowledgment from the fah server. As such, the client keeps trying to upload the same work unit over and over, even though the work unit has been uploaded and credited.

Hi Baowoulf (team 48759),
Your WU (P3852 R4691 C0 G22) was added to the stats database on 2007-12-26 18:56:45 for 343 points of credit.

Hi Baowoulf (team 48759),
Your WU (P3852 R4691 C0 G22) was added to the stats database on 2007-12-26 23:32:49 for 0 points of credit.

Hi Baowoulf (team 48759),
Your WU (P3852 R4691 C0 G22) was added to the stats database on 2007-12-27 03:29:49 for 0 points of credit.

Hi Baowoulf (team 48759),
Your WU (P3852 R4691 C0 G22) was added to the stats database on 2007-12-27 07:32:30 for 0 points of credit.
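
The four stats messages above can be tallied with a short script. This is only a sketch: the regular expression is written against the message wording quoted in this post, and the names `LINE` and `summarize` are made up for illustration. It shows that the same WU was recorded four times but only the first submission carried points.

```python
import re
from collections import defaultdict

# Pattern written against the message wording quoted in this thread.
LINE = re.compile(
    r"Your WU \(P(\d+) R(\d+) C(\d+) G(\d+)\) was added to the "
    r"stats database on (\S+ \S+) for (\d+) points of credit\."
)

def summarize(messages):
    """Map (project, run, clone, gen) -> (submission count, total points)."""
    totals = defaultdict(lambda: [0, 0])
    for message in messages:
        match = LINE.search(message)
        if match is None:
            continue  # not a stats message
        project, run, clone, gen, _timestamp, points = match.groups()
        entry = totals[(int(project), int(run), int(clone), int(gen))]
        entry[0] += 1
        entry[1] += int(points)
    return {wu: tuple(counts) for wu, counts in totals.items()}

messages = [
    "Your WU (P3852 R4691 C0 G22) was added to the stats database on 2007-12-26 18:56:45 for 343 points of credit.",
    "Your WU (P3852 R4691 C0 G22) was added to the stats database on 2007-12-26 23:32:49 for 0 points of credit.",
    "Your WU (P3852 R4691 C0 G22) was added to the stats database on 2007-12-27 03:29:49 for 0 points of credit.",
    "Your WU (P3852 R4691 C0 G22) was added to the stats database on 2007-12-27 07:32:30 for 0 points of credit.",
]
summary = summarize(messages)
print(summary)  # {(3852, 4691, 0, 22): (4, 343)}
```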

Re: Problem with P3852

Posted: Tue Jan 08, 2008 10:20 pm
by Baowoulf
IE Settings has always been set to No. I've never messed with them at all.

They were four separate P3852 WUs, which is why I gave the full info for the ones that didn't get counted.

I made a mistake: Run 4691 was on my computer, not my second one, and Run 1009 was on my second computer. If that matters at all.

The problem, at least in the log I can see on my second computer, is that for Run 1009 the first attempt failed and the second one worked. Not sure about Run 4691, unless there is some way to get info on past WUs. My computer's log starts on January the 3rd.

As far as I can tell from looking at my stats page, I'm missing two WUs and whatever points they gave for P3852. No, I'm not counting any one twice: all four, the two that worked and the two that didn't, are separate WUs of P3852.


Also what about the EUE should I worry about that at all since it only happened once?

Re: Problem with P3852

Posted: Wed Jan 09, 2008 12:24 am
by 7im
Baowoulf wrote:IE Settings has always been set to No. I've never messed with them at all.

Also what about the EUE should I worry about that at all since it only happened once?
That's odd. Do you have a firewall or proxy that might do the same thing as IE=Yes? Flaky ISP?

I checked that other WU, Project: 3852 (Run 1009, Clone 0, Gen 26), and it does not appear in the database. Looks like a bad WU, because Gen 25 was returned weeks ago, and Gen 26 would likely have been completed by now. And Gen 27 - 32 were completed over a few days' time in the last week or so.

Individual EUEs are expected as part of folding. Only multiples start to raise a concern. I would point you to the FAH WIKI entry on EUEs, but it appears to be down at the moment.

Re: Problem with P3852

Posted: Wed Jan 09, 2008 12:58 am
by Baowoulf
I have Norton Internet Security and a router that for some reason likes to lose its connection once every other month or so, making me do a hard reset. Possibly because it's getting old. Maybe that's why it tried to resend that one WU.

This is from my second computer. Is there any way to pull up old log files on my main computer that don't show up anymore after a restart?
[19:08:52] *------------------------------*
[19:08:52] Folding@Home Double Gromacs Core B
[19:08:52] Version 1.04 (Fri Aug 10 16:46:39 PDT 2007)
[19:08:52]
[19:08:52] Preparing to commence simulation
[19:08:52] - Files status OK
[19:08:53] - Expanded 151516 -> 543405 (decompressed 358.6 percent)
[19:08:53]
[19:08:53] Project: 3852 (Run 1009, Clone 0, Gen 26)
[19:08:53]
[19:08:54] Assembly optimizations on if available.
[19:08:54] Entering M.D.
[19:09:00] Will resume from checkpoint file
[19:09:04] Working on p3850_fkbprelative_ligand
[19:09:04] Completed 0 out of 1250000 steps (0)
[19:09:05] Extra SSE2 boost OK
[19:09:05] Resuming from checkpoint
[19:09:06] Verified work/wudata_06.log
[19:09:06] Verified work/wudata_06.edr
[19:09:07] Verified work/wudata_06.xvg
[19:09:07] Verified work/wudata_06.trr
[19:09:07] Verified work/wudata_06.xtc
[19:09:07] Completed 1234055 out of 1250000 steps (98)
[19:15:08] Completed 1237500 out of 1250000 steps (99)
[19:30:09] Timer requesting checkpoint
[19:45:10] Timer requesting checkpoint
[19:50:00] Completed 1250000 out of 1250000 steps (100)
[19:51:04]
[19:51:04] Finished Work Unit:
[19:51:04] - Reading up to 130100 from "work/wudata_06.trr": Read 130100
[19:51:04] - Reading up to 27004 from "work/wudata_06.xtc": Read 27004
[19:51:05] logfile size: 18431
[19:51:05] Leaving Run
[19:51:10] - Writing 453347 bytes of core data to disk...
[19:51:10] ... Done.
[19:51:10] - Shutting down core
[19:51:10]
[19:51:10] Folding@home Core Shutdown: FINISHED_UNIT
[19:51:14] CoreStatus = 64 (100)
[19:51:14] Sending work to server


[19:51:14] + Attempting to send results
[19:51:39] - Couldn't send HTTP request to server
[19:51:39] + Could not connect to Work Server (results)
[19:51:39] (128.59.74.4:8080)
[19:51:39] - Error: Could not transmit unit 06 (completed December 17) to work server.
[19:51:39] Keeping unit 06 in queue.


[19:51:39] + Attempting to send results
[19:51:47] + Results successfully sent
[19:51:47] Thank you for your contribution to Folding@Home.
[19:51:47] + Number of Units Completed: 5

[19:51:47] - Preparing to get new work unit...
[19:51:47] + Attempting to get work packet
[19:51:47] - Connecting to assignment server
[19:51:48] - Successful: assigned to (171.64.65.58).
[19:51:48] + News From Folding@Home: Welcome to Folding@Home
[19:51:48] Loaded queue successfully.
[19:51:50] + Closed connections
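
For anyone who wants to check a log like the one above without reading it line by line, here is a rough sketch that pairs each send attempt with its outcome. It assumes the message wording stays exactly as in this log; `upload_attempts` is a made-up helper name, not part of the client.

```python
import re

# Matches "[HH:MM:SS] message" lines as they appear in FAHlog.txt above.
ENTRY = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s?(.*)$")

def upload_attempts(log_text):
    """Return a list of (start_time, outcome) for each send attempt."""
    attempts = []
    for line in log_text.splitlines():
        match = ENTRY.match(line.strip())
        if match is None:
            continue  # blank or unstamped line
        stamp, message = match.groups()
        if message.startswith("+ Attempting to send results"):
            attempts.append([stamp, "unknown"])
        elif attempts and message.startswith("+ Results successfully sent"):
            attempts[-1][1] = "sent"
        elif attempts and message.startswith("- Error: Could not transmit"):
            attempts[-1][1] = "failed"
    return [tuple(attempt) for attempt in attempts]

sample = """\
[19:51:14] + Attempting to send results
[19:51:39] - Couldn't send HTTP request to server
[19:51:39] + Could not connect to Work Server (results)
[19:51:39] - Error: Could not transmit unit 06 (completed December 17) to work server.
[19:51:39] + Attempting to send results
[19:51:47] + Results successfully sent
"""
print(upload_attempts(sample))
```

On the excerpt above this reports one failed attempt at 19:51:14 followed by a successful one at 19:51:39. It only reflects what the client wrote to its own log, not what the stats system recorded.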

So Run 1009 is probably bad? And you're sure the other one got counted on the first submission? For some reason I thought it didn't, but I may have miscounted.

Re: Problem with P3852

Posted: Wed Jan 09, 2008 5:51 am
by 7im
7im wrote:...
Hi Baowoulf (team 48759),
Your WU (P3852 R4691 C0 G22) was added to the stats database on 2007-12-26 18:56:45 for 343 points of credit.

Re: Problem with P3852

Posted: Wed Jan 09, 2008 7:31 am
by bruce
Baowoulf wrote:I can only access the part of the log that I posted about in another problem, since it seems that because I had to restart, the log for that day doesn't appear anymore. Is there any way to get that info back? And how do you know when a restart will reset your log? Is it if your last WU finished on the day you restarted?
Whenever you restart the client, it checks the size of FAHlog.txt. If it's under 50K, it proceeds to append data to it. If it's over 50K, it is renamed to be FAHlog-Prev.txt and the client starts with a new file FAHlog.txt. You are able to get that old info for as long as it takes you to generate another file over 50K. (Of course if you NEVER restart your client, the file can grow larger than 50K)
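
The restart rule described above can be sketched in a few lines. This is an illustration, not the client's actual code: reading "50K" as 50 * 1024 bytes is an assumption, and `rotate_log_on_restart` is a made-up name.

```python
import os

ROTATE_LIMIT_BYTES = 50 * 1024  # assumption: "50K" read as 50 * 1024 bytes

def rotate_log_on_restart(directory, limit=ROTATE_LIMIT_BYTES):
    """Mimic the restart-time check described above; return True if rotated."""
    current = os.path.join(directory, "FAHlog.txt")
    previous = os.path.join(directory, "FAHlog-Prev.txt")
    rotated = False
    if os.path.exists(current) and os.path.getsize(current) > limit:
        # os.replace overwrites any older FAHlog-Prev.txt, which is why the
        # old info only survives until the next rotation.
        os.replace(current, previous)
        rotated = True
    # Append mode creates a fresh FAHlog.txt after a rotation and leaves an
    # existing small log untouched so new entries are added to it.
    with open(current, "a"):
        pass
    return rotated
```

So a restart with a small log keeps everything in FAHlog.txt, while a restart with a large log moves the history to FAHlog-Prev.txt, which is the file to look in for the missing day.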

Re: Problem with P3852

Posted: Wed Jan 09, 2008 10:14 pm
by Baowoulf
Can a WU be sent successfully, or at least seem to be on the folder's end, and then be found to be a bad WU when it reaches your servers?

Re: Problem with P3852

Posted: Wed Jan 09, 2008 10:21 pm
by toTOW
There can still be corruption during network transfers ... if you have a poor line for example.

Re: Problem with P3852

Posted: Wed Jan 09, 2008 10:46 pm
by 7im
toTOW wrote:There can still be corruption during network transfers ... if you have a poor line for example.
That's true, but doesn't answer the question. We all know data corruption is possible. :roll:

Baowoulf wrote:Can a WU be sent successfully, or at least seem to be on the folder's end, and then be found to be a bad WU when it reaches your servers?
If the WU fails to upload, the fah server does not send back an upload acknowledgement. Without a positive ack, the fah client will keep trying to upload that WU every 6 hours.
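
As a rough illustration of that retry rule (not the client's real code), the loop below keeps calling a hypothetical `send` function until it reports an acknowledgement, waiting 6 hours between tries. The names `send`, `sleep`, and `max_attempts` are assumptions made for the sketch.

```python
RETRY_INTERVAL_SECONDS = 6 * 60 * 60  # "every 6 hours", per the reply above

def upload_until_acknowledged(send, sleep, max_attempts=None):
    """Retry an upload until the server acknowledges it.

    send  -- hypothetical callable: uploads the WU, returns True on an ack
    sleep -- callable taking seconds (time.sleep in real use)
    Returns (acknowledged, attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        if send():
            return True, attempts
        if max_attempts is not None and attempts >= max_attempts:
            return False, attempts
        sleep(RETRY_INTERVAL_SECONDS)

# Usage with stand-ins: two lost acks, then a positive one.
results = iter([False, False, True])
waits = []
acked, tries = upload_until_acknowledged(lambda: next(results), waits.append)
```

This also shows why a blocked acknowledgement leads to repeat uploads: from the client's side, an upload whose ack never arrives looks the same as an upload that failed.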

Re: Problem with P3852

Posted: Wed Jan 09, 2008 10:58 pm
by Baowoulf
Well everything seems fine for 1009 on my end if I'm reading the log correctly. That's why I asked. And I have cable internet and it's always been great for me. I did get this


[19:51:39] + Attempting to send results
[19:51:47] + Results successfully sent
[19:51:47] Thank you for your contribution to Folding@Home.
[19:51:47] + Number of Units Completed: 5


But it never showed up on my stats page. Is that the acknowledgement you were talking about?

Re: Problem with P3852

Posted: Wed Jan 09, 2008 11:23 pm
by 7im
Baowoulf wrote:...

But it never showed up on my stats page. Is that the acknowledgement you were talking about?
No, and Yes. We don't see any of the client <-> server communications or acknowledgments, other than what is written into that fahlog.txt file.

Crediting is not instantaneous. It can take anywhere from 1 hour to several hours for a WU to get from the work server to the stats system, and then be published on the stats page.

Re: Problem with P3852

Posted: Wed Jan 09, 2008 11:37 pm
by Baowoulf
Even after I had waited 4+ hours on the 17th, though, the stats page still said the last WU completed was on the 16th, and Run 1009 was completed on the 17th. Could the power surge and other server problems have stopped it from being added after I received the acknowledgement from the F@H server on my end? That's the only thing I could think of.

Re: Problem with P3852

Posted: Thu Jan 10, 2008 12:37 am
by 7im
Baowoulf wrote:Even after I had waited 4+ hours on the 17th, though, the stats page still said the last WU completed was on the 16th, and Run 1009 was completed on the 17th. Could the power surge and other server problems have stopped it from being added after I received the acknowledgement from the F@H server on my end? That's the only thing I could think of.
Yes, it is very rare, but Stanford servers or networks can have problems, just like any other server or network. Disk drives fill up, etc.

There were some new projects not connected to the stats server for a while, but from what I read, those WUs have all been recredited. Sometimes those WUs are searchable, sometimes not. So it is possible to have uploaded on the 17th, and not get credit until today. But I don't track that sort of thing very closely any longer. After 1000 WUs, I stopped counting. ;)

Re: Problem with P3852

Posted: Thu Jan 10, 2008 12:40 am
by Baowoulf
Well I guess all I can do is wait and see. At least the problem isn't on my end. Only one or two bad ones out of 182 so far isn't that bad :) Thanks for the help.