0 completed, 0 failed?

Posted: Fri Jun 17, 2022 3:27 am
by Peter_Hucker
I use HFM to monitor several computers running Folding@Home. One of them is reporting "0 completed, 0 failed", although the log in FAH Client doesn't show problems. The other computers all show things like "61 completed, 0 failed" or "24 completed, 2 failed".

Is there any way I can see whether the tasks from one specific computer are working OK? The log says yes, but I assume the server is saying no?

Re: 0 completed, 0 failed?

Posted: Sun Jun 19, 2022 12:00 am
by Peter_Hucker
Anyone? The computer is called "Black" locally, on IP 192.168.1.64, and I found "PID 7956" in FAH Control. My account is here: https://stats.foldingathome.org/donor/id/597859326

Re: 0 completed, 0 failed?

Posted: Sun Jun 19, 2022 1:27 am
by bollix47
Do the counts change if you press F10 while HFM is in focus or select View and Toggle Completed/Failed Count Style?

Re: 0 completed, 0 failed?

Posted: Sun Jun 19, 2022 1:30 am
by Peter_Hucker
Only on the "good" computers. What did I just do? It was showing 106 completed for the best one, that changes to only 7. The "bad" computer remains at 0.

Re: 0 completed, 0 failed?

Posted: Sun Jun 19, 2022 9:04 am
by bollix47
Toggling the counts switches from a current count (since that client was last rebooted) to a to-date count. You can toggle back & forth as many times as you like ... it's just a different view & no data has changed.

Do you have HFM installed on just one computer or are you installing it on more than one computer?

On the 'bad' client does all the other data show correctly?

If you go to tools > WU history and type in client=Black, if that's what it says under Client (it may be called something like Black Slot 00), do you get a list of WUs completed by that computer, and are they listed as finished?

Re: 0 completed, 0 failed?

Posted: Sun Jun 19, 2022 9:27 am
by Peter_Hucker
HFM is only on my main computer. The other 6 computers just run the folding client.

Everything shows correctly apart from the total tasks completed/failed.

It used to show those correctly on the "bad" machine as well, before something changed. I've moved GPUs around, changed graphics drivers, and done Windows updates. But I can think of nothing else that would cause a problem.

Where does HFM get these totals? Are they held on the clients already and it just fetches them? Or is it calculated by HFM? Or does it get it from the folding server?

Re: 0 completed, 0 failed?

Posted: Sun Jun 19, 2022 5:02 pm
by bollix47
Go to https://apps.foldingathome.org/cpu and type in your folding name, which will show you the latest return from each of your computers. You may be able to figure out which is Black.

Another source of good info for knowing what each client & HFM are communicating about:

View > Show/Hide Messages Window

There you should see Black's convo & there might even be a clue as to what's affecting your counts on the 'bad' one.

Re: 0 completed, 0 failed?

Posted: Mon Jun 20, 2022 7:52 am
by Peter_Hucker
The latest one there is from another computer, Glass. The points and upload time match my computer's logs.

Black returned a task about 16 hours later and it's not showing up there.

I guess I'll stop running Folding on Black. It's odd that it's showing fine in the logs while the server isn't acknowledging the tasks. I guess there's nothing I can do, since nothing tells me where to look to sort out the problem.

Re: 0 completed, 0 failed?

Posted: Tue Jun 21, 2022 10:32 pm
by TheWolf
Have you checked "Black's" F@H configuration, to see if your user name, team # and bonus code are still there on the rig in question? I have seen times when this info has disappeared and needed to be replaced. Just a thought, I'm sure you have already done this. If this has happened, it will be turning in work to team zero "0" under the default name.

Re: 0 completed, 0 failed?

Posted: Wed Jun 22, 2022 11:43 am
by Peter_Hucker
I went to web control on Black and a good machine, and both have the same name, team, and passkey.

Hopefully the work I wasn't credited for was actually acknowledged and is useful.

Until I know what's going on I'm afraid Black will have to do something else. It's on Einstein and Milkyway at the moment, and will do World Community Grid once they get running, which looks imminent. Not sure how long their GPU work will last though. Black has 5 GPUs so it's a lot of power I wanted to be on here.

Re: 0 completed, 0 failed?

Posted: Wed Jun 22, 2022 4:00 pm
by Joe_H
Have you checked the WUs completed by Black both in HFM.net and on the app link provided by bollix47? If a WU has not shown up on the servers, post the log sections showing the beginning of processing from the download, the end of processing, and the completed upload. That can be used to ask the maintainers of the servers and the stats system to look into the problem.

Otherwise it sounds like a communication problem with HFM.net, and posting either on its support channel or the existing topic here may get you some suggestions on fixing that.
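
For anyone collecting the log sections described above, here is a minimal sketch that filters a client log down to the lines for one work unit on one slot. It assumes log lines of the form `HH:MM:SS:WUnn:FSnn:message`; the function name `wu_section` and the sample text are purely illustrative and not part of any FAH or HFM tool.

```python
import re

# Assumed line prefix: HH:MM:SS:WUnn:FSnn:message
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}):WU(\d{2}):FS(\d{2}):(.*)$")

def wu_section(log_text, wu, slot):
    """Return the log lines tagged with the given WU id and folding slot."""
    picked = []
    for line in log_text.splitlines():
        line = line.strip()
        m = LINE_RE.match(line)
        if m and m.group(2) == wu and m.group(3) == slot:
            picked.append(line)
    return picked

# Hypothetical sample text in the assumed format
sample = """\
21:18:30:WU03:FS02:FahCore returned: FINISHED_UNIT (100 = 0x64)
21:18:30:WU02:FS02:Starting
21:18:30:WU03:FS02:Connecting to 128.174.73.78:8080
21:19:17:WU03:FS02:Upload complete
21:19:17:WU03:FS02:Server responded WORK_ACK (400)
"""

section = wu_section(sample, wu="03", slot="02")
print("\n".join(section))
```

Identifiers such as WU03 may be reused for later work units, so on a long log the result would probably still need trimming to the time window of the unit in question.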

Re: 0 completed, 0 failed?

Posted: Wed Jun 22, 2022 4:10 pm
by Peter_Hucker
Perhaps that link doesn't show things immediately, since I can see one here:

https://apps.foldingathome.org/wu?p=179 ... e=0&gen=30

And the log for that day:

21:18:06:WU03:FS02:0x22:Checkpoint completed at step 1000000
21:18:18:WU03:FS02:0x22:Saving result file ..\logfile_01.txt
21:18:18:WU03:FS02:0x22:Saving result file checkpointIntegrator.xml
21:18:18:WU03:FS02:0x22:Saving result file checkpointState.xml
21:18:23:WU03:FS02:0x22:Saving result file positions.xtc
21:18:24:WU03:FS02:0x22:Saving result file science.log
21:18:24:WU03:FS02:0x22:Saving result file xtcAtoms.csv.bz2
21:18:24:WU03:FS02:0x22:Folding@home Core Shutdown: FINISHED_UNIT
21:18:30:WU03:FS02:FahCore returned: FINISHED_UNIT (100 = 0x64)
21:18:30:WU03:FS02:Sending unit results: id:03 state:SEND error:NO_ERROR project:17917 run:517 clone:0 gen:30 core:0x22 unit:0x000000000000001e000045fd00000205
21:18:30:WU03:FS02:Uploading 23.67MiB to 128.174.73.78
21:18:30:WU02:FS02:Starting
21:18:30:WU03:FS02:Connecting to 128.174.73.78:8080
----snip----
21:18:37:WU02:FS02:0x22:Platform 0: Reference
21:18:37:WU02:FS02:0x22:Platform 1: CPU
21:18:37:WU02:FS02:0x22:Platform 2: OpenCL
21:18:37:WU02:FS02:0x22: opencl-device 3 specified
21:18:42:WU03:FS02:Upload 24.82%
21:18:48:WU03:FS02:Upload 37.76%
21:18:54:WU03:FS02:Upload 50.17%
21:19:00:WU03:FS02:Upload 63.63%
21:19:06:WU03:FS02:Upload 77.10%
21:19:12:WU03:FS02:Upload 90.57%
21:19:17:WU03:FS02:Upload complete
21:19:17:WU03:FS02:Server responded WORK_ACK (400)
21:19:17:WU03:FS02:Final credit estimate, 78559.00 points
21:19:17:WU03:FS02:Cleaning up

So where to from here? I'm not comfortable running it if I see no numbers on HFM; I may be running 5 GPUs to do nothing useful.

Does anyone know how HFM gets its numbers? I can see nothing in FAHControl to tell me how many have been completed. If it has no mechanism to count them, is HFM contacting the server?
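
Independently of how HFM arrives at its totals (which this thread does not establish), completed and acknowledged units can be tallied straight from the client log. The sketch below assumes the log line format shown in the excerpt above and counts, per folding slot, the `FahCore returned: FINISHED_UNIT` lines and the `Server responded WORK_ACK` lines. The function name `tally` is illustrative, and the sample text is copied from that excerpt rather than read from a real log file.

```python
import re
from collections import Counter

# Patterns taken from the log excerpt in this thread
FINISHED_RE = re.compile(r":WU\d+:FS(\d+):FahCore returned: FINISHED_UNIT")
ACK_RE = re.compile(r":WU\d+:FS(\d+):Server responded WORK_ACK")

def tally(log_text):
    """Count finished and server-acknowledged units per folding slot."""
    finished, acked = Counter(), Counter()
    for line in log_text.splitlines():
        m = FINISHED_RE.search(line)
        if m:
            finished[m.group(1)] += 1
            continue
        m = ACK_RE.search(line)
        if m:
            acked[m.group(1)] += 1
    return finished, acked

sample = """\
21:18:24:WU03:FS02:0x22:Folding@home Core Shutdown: FINISHED_UNIT
21:18:30:WU03:FS02:FahCore returned: FINISHED_UNIT (100 = 0x64)
21:18:30:WU02:FS02:Starting
21:19:17:WU03:FS02:Upload complete
21:19:17:WU03:FS02:Server responded WORK_ACK (400)
"""

finished, acked = tally(sample)
for slot in sorted(finished):
    print(f"FS{slot}: {finished[slot]} finished, {acked[slot]} acknowledged")
```

The core's own "Core Shutdown: FINISHED_UNIT" line is deliberately not matched, so each unit is counted once. A slot with more finished units than acknowledged ones would be the thing to look into.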

Re: 0 completed, 0 failed?

Posted: Wed Jun 22, 2022 4:16 pm
by Peter_Hucker
The last three completed tasks in Black's log are found in that link, so I'll assume HFM is broken. Black will resume Folding.

Re: 0 completed, 0 failed?

Posted: Wed Jun 22, 2022 4:19 pm
by Peter_Hucker
As for HFM support, nobody ever replies in there, I've tried GitHub and Google Groups.

Re: 0 completed, 0 failed?

Posted: Wed Jun 22, 2022 4:45 pm
by Joe_H
Peter_Hucker wrote: Wed Jun 22, 2022 4:19 pm
As for HFM support, nobody ever replies in there, I've tried GitHub and Google Groups.

The topic here is viewtopic.php?t=9903. Harlam usually responds in a day or two. There are other posts there about various issues people have run into using HFM.net, and their solutions.