Only 60% of WUs are Credited

markhl
Posts: 7
Joined: Sun Mar 17, 2024 3:02 am
Location: California

Only 60% of WUs are Credited

Post by markhl »

I have been running FAH since 2022 on a Dell desktop bought in 2018. In January, I started running version 8.4.9. FAH uses two CPUs and a GPU. I shut the PC down each night and sometimes pause FAH to reduce surges in CPU fan noise. Is this setup OK for FAH?

I just checked my Work Units at https://v8-4.foldingathome.org/wus.
In the last 33 days, my system has attempted 161 WUs.
Only 94 WUs (about 60%) reached 100% completion and were Credited!
43 WUs were lost to Shutting Down at an average of 50% progress.
22 WUs were Dumped at an average of 30% progress; most of these WUs were then Credited on other people's systems so they could have been Credited on my system.
One WU Failed; it has now Failed 278 times on other people's systems!

So, FAH did not complete more than one-third of all WUs assigned to my machine. Issues seem to affect CPU and GPU WUs equally. That is not a great use of my compute. If that is also affecting other volunteers, it could be a problem. Other people might want to check their Work Units.
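
As a quick arithmetic check on the figures above, here is a minimal sketch that recomputes the shares from the counts given in this post. The variable names are illustrative only. Note that the listed categories add up to 160, so one of the 161 attempts does not appear in the breakdown.

```python
# Counts reported in the post above (33-day window).
attempted = 161
outcomes = {"credited": 94, "shutting_down": 43, "dumped": 22, "failed": 1}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


credited_pct = share(outcomes["credited"], attempted)
not_credited = attempted - outcomes["credited"]
not_credited_pct = share(not_credited, attempted)

# Attempts that are not covered by any of the four listed categories.
unaccounted = attempted - sum(outcomes.values())

print(f"credited: {credited_pct}%")
print(f"not credited: {not_credited} WUs ({not_credited_pct}%)")
print(f"not in the breakdown: {unaccounted}")
```

The credited share works out slightly below the rounded 60%, and the uncredited share is above one-third, consistent with the summary above.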

I have seen the good advice to pause FAH and wait a minute before shutting down Windows, and then to remember to resume FAH after I reboot or start the PC. I will try to do so, but it is easy to forget and should not be necessary. Does FAH reliably resume from the last checkpoint?

I have run many volunteer computing projects that do not lose their WUs after reboot. For example, my BOINC projects only lose a WU every few months. Ideas welcome, thanks!
calxalot
Site Moderator
Posts: 1438
Joined: Sat Dec 08, 2007 1:33 am
Location: San Francisco, CA

Re: Only 60% of WUs are Credited

Post by calxalot »

Another workaround is to set folding to Finish after starting it. Then it might already be paused if you forget to manually pause before a shutdown.
Joe_H
Site Admin
Posts: 8087
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Only 60% of WUs are Credited

Post by Joe_H »

On Windows, pause folding a minute or two before shutting down your PC. Windows is supposed to wait for the folding process to exit, but it often does not wait long enough. This is a known issue with Windows. There is code in the setup of the client to make Windows wait, but from many reports Windows is ignoring it.

In my experience this is a long-running Windows bug. I have dealt with the results of Windows not waiting for processes to exit for over 20 years. It happens on regular shutdowns, and also on the shutdowns and reboots that are part of software updates.
calxalot
Site Moderator
Posts: 1438
Joined: Sat Dec 08, 2007 1:33 am
Location: San Francisco, CA

Re: Only 60% of WUs are Credited

Post by calxalot »

I think logout will also do it.

Code in the client is not sufficient.
Possibly the dev thinks that Windows kills the cores before the client can stop them normally, so the client then assumes the cores crashed and dumps the work.
arisu
Posts: 248
Joined: Mon Feb 24, 2025 11:11 pm

Re: Only 60% of WUs are Credited

Post by arisu »

calxalot wrote: Wed Mar 05, 2025 5:55 am I think logout will also do it.

Code in the client is not sufficient.
Possibly the dev thinks that Windows kills the cores before the client can stop them normally, so the client then assumes the cores crashed and dumps the work.
That probably is it. The current code is pretty liberal about dumping the core for many exit reasons and does not always make optimal decisions: https://github.com/FoldingAtHome/fah-cl ... ExitCode.h. It would probably be CLIENT_DIED which is an overloaded status:

Code:

 *   DEFAULT  - v322-v600: DUMP and ERROR.
 *              v623:      If SMP then DUMP and ERROR else EXIT.
 *              CLIENT_DIED, BAD_WORK_CHECKSUM, MALLOC_ERROR, UNKNOWN_ERROR
Non-SMP clients just exit without dumping the WU for some reason.

It can be triggered by mistake. For example (on Linux at least), system-wide resource limits can send SIGXCPU and SIGKILL to the core but not to the client. But one can't just make CLIENT_DIED more relaxed, because that is probably the same error that would occur if a bug in the core caused it to receive SIGSEGV (just a guess). Windows probably has something similar. I don't think the client distinguishes between different types of exits caused by signals (or their Windows equivalents); it treats them all as an unknown error internally.
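
To illustrate the distinction in general terms, here is a minimal POSIX-only sketch of how a parent process can tell an external kill from an ordinary error exit. It is not taken from the FAH client; `describe_exit` and `run_child` are hypothetical helpers. It relies on the fact that, on POSIX, Python's `subprocess` module reports a child terminated by signal N as return code -N.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Classify a child's return code as reported by subprocess on POSIX."""
    if returncode == 0:
        return "clean exit"
    if returncode < 0:
        # A negative value means the child was terminated by signal -returncode.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number without a named constant on this platform.
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with code {returncode}"


def run_child(code):
    """Run a short Python snippet in a child process and return its return code."""
    return subprocess.run([sys.executable, "-c", code]).returncode


# One child that is killed by a signal, one that exits with an error code.
killed = run_child("import os, signal; os.kill(os.getpid(), signal.SIGKILL)")
errored = run_child("import sys; sys.exit(3)")

print(describe_exit(killed))
print(describe_exit(errored))
```

A supervisor that makes this distinction could, in principle, treat a SIGKILL from outside differently from a SIGSEGV caused by a bug, though whether that is safe for a given core is a separate question.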
calxalot
Site Moderator
Posts: 1438
Joined: Sat Dec 08, 2007 1:33 am
Location: San Francisco, CA

Re: Only 60% of WUs are Credited

Post by calxalot »

There is an unreleased commit that changes something about the terminate order.
I don’t know if anyone else has tested it.

https://github.com/FoldingAtHome/fah-cl ... 429c9444cc
muziqaz
Posts: 1531
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: Only 60% of WUs are Credited

Post by muziqaz »

Ideally, fahclient would probe the Works folder and fold everything in there before downloading new WUs. If an existing WU has expired, dump it; if not, continue folding.
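
A rough sketch of that startup logic, under stated assumptions: `StoredWU`, its fields, and `plan_startup` are hypothetical names for illustration, and nothing here reflects how the client actually stores or tracks work.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class StoredWU:
    """Hypothetical record for a WU found on disk at startup."""
    wu_id: str
    deadline: datetime


def plan_startup(stored, now):
    """Decide what to do with leftover WUs before requesting new work."""
    # Unexpired WUs are resumed; expired ones are dumped.
    resume = [wu.wu_id for wu in stored if wu.deadline > now]
    dump = [wu.wu_id for wu in stored if wu.deadline <= now]
    # Only ask for new work when nothing on disk can be resumed.
    download_new = not resume
    return resume, dump, download_new


now = datetime(2025, 3, 5, 20, 0, tzinfo=timezone.utc)
stored = [
    StoredWU("wu-a", datetime(2025, 3, 7, tzinfo=timezone.utc)),  # still valid
    StoredWU("wu-b", datetime(2025, 3, 1, tzinfo=timezone.utc)),  # expired
]
resume, dump, download_new = plan_startup(stored, now)
print(resume, dump, download_new)
```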
FAH Omega tester
calxalot
Site Moderator
Posts: 1438
Joined: Sat Dec 08, 2007 1:33 am
Location: San Francisco, CA

Re: Only 60% of WUs are Credited

Post by calxalot »

Sounds like a great enhancement request.
muziqaz
Posts: 1531
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: Only 60% of WUs are Credited

Post by muziqaz »

calxalot wrote: Wed Mar 05, 2025 8:02 pm Sounds like a great enhancement request.
I think I have been asking Joe for this ever since Windows started forgetting WUs upon restart. The Works folder would have many forgotten WUs still sitting there. Later that developed into killing WUs.
FAH Omega tester
calxalot
Site Moderator
Posts: 1438
Joined: Sat Dec 08, 2007 1:33 am
Location: San Francisco, CA

Re: Only 60% of WUs are Credited

Post by calxalot »

Don’t ask him. Just create a ticket.
muziqaz
Posts: 1531
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: Only 60% of WUs are Credited

Post by muziqaz »

calxalot wrote: Wed Mar 05, 2025 8:19 pm Don’t ask him. Just create a ticket.
I think I did
FAH Omega tester
markhl
Posts: 7
Joined: Sun Mar 17, 2024 3:02 am
Location: California

Re: Only 60% of WUs are Credited

Post by markhl »

Thanks for the discussion! I will continue to Pause and then to wait a minute before shutdown. I see fewer lost WUs when I do that. How many other users or devices does this issue affect? What percentage of all FAH WUs are being lost at shutdown?
muziqaz
Posts: 1531
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: Only 60% of WUs are Credited

Post by muziqaz »

Everyone on Windows who does not pause before rebooting loses WUs.
FAH Omega tester
arisu
Posts: 248
Joined: Mon Feb 24, 2025 11:11 pm

Re: Only 60% of WUs are Credited

Post by arisu »

muziqaz wrote: Mon Mar 17, 2025 5:54 am Everyone on Windows who does not pause before rebooting loses WUs.
That seems like an extremely serious problem for the project. What percentage of Windows folders are using the v8 client?
Joe_H
Site Admin
Posts: 8087
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Only 60% of WUs are Credited

Post by Joe_H »

This stats page - https://stats.foldingathome.org/os - gives numbers for current folders and the OS used. It doesn't include Intel GPU stats, and any CPU folding on Raspberry Pi and similar systems is probably included under Linux. But Windows and Linux are almost even, at around 50% for Windows and 45% for Linux. I do not know why there are separate Windows and Win64 categories.

It would take someone scanning the server logs to get an idea of which clients are using v7 versus v8. I haven't heard of that being done recently; the last time I heard of it was years ago. Some were still using very early versions of v7.