Dramatically Lower Points

Posted: Mon Feb 22, 2021 7:56 pm
by oliverjdent
I had been averaging around one million points per day until 2/19/21. Even with the same number of CPUs and GPUs working, my points went down to around 500,000 per day, bottoming out at 270,362 points on 2/20/21.

https://folding.extremeoverclocking.com ... =&u=186507

I have been folding since June 2006. This is the first time the number of points my machines have generated has dropped off so dramatically without either a hardware failure or a FAH client failure.

If anyone can explain why my points have dropped by so much I would appreciate the assistance.

If there is something I need to do to bring my points back up to the levels I have expected please let me know.

Thank you.

Oliver.

Re: Dramatically Lower Points

Posted: Mon Feb 22, 2021 11:15 pm
by JimboPalmer
I am (rarely) getting a WU that refuses to upload. I think at least one server has no disk space. The client tries multiple servers and ports, and eventually the WU is uploaded.

I am hearing that others, with different workloads and resources, have even more upload failures in their logs. Some are still not uploaded once they pass the deadlines.

I would think that would be in your log.

Re: Dramatically Lower Points

Posted: Mon Feb 22, 2021 11:18 pm
by Neil-B
EOC wasn't showing a big reduction in WU throughput. It might be useful to check whether there has been a shift in the projects your kit has been folding. Also check that the GPUs are using CUDA rather than OpenCL if they are Nvidia ones.

Re: Dramatically Lower Points

Posted: Tue Feb 23, 2021 2:40 am
by v00d00
Admittedly there have been a few work units knocking about that give less PPD than average. I've noticed it, others on beta have too, and comments have been made. I don't think those work units were ever adjusted (assuming they needed to be). I can't remember which projects they were from, but I had some diabetes project WUs from the High Priority work units that finished in about 40 minutes on an RTX 2080 and gave less PPD than most other work units. Not that I care per se; they get done all the same regardless of points. But obviously other people do.

Re: Dramatically Lower Points

Posted: Tue Feb 23, 2021 1:51 pm
by oliverjdent
@JimboPalmer, thank you for the suggestion. I will check my logs today for upload failures.

@Neil-B, I have seen multiple "Completed" entries in the FAH Control viewer that didn't seem to upload and clear. This might be the cause.

@v00d00, thank you for the feedback.

This morning, I show 930,927 points on 41 work units. I typically complete 20 to 30 WU per day. So, it looks like things are returning to normal.

Thank you all for your assistance.

Oliver.

Re: Dramatically Lower Points

Posted: Tue Feb 23, 2021 2:29 pm
by Neil-B
For the completed WUs that don't upload, do your logs show multiple upload attempts with short response errors? There are issues at the moment with at least one AV/firewall product stopping uploads from working properly.

Re: Dramatically Lower Points

Posted: Tue Feb 23, 2021 2:31 pm
by Neil-B
Sometimes AV/firewall issues resolve themselves as the vendors update their products' security definitions.

Re: Dramatically Lower Points

Posted: Tue Feb 23, 2021 11:44 pm
by oliverjdent
I went through my logs on all three of my machines. I didn't find any issues with uploading finished WUs.

I did find this error several times in one of my machine's GPU output (GeForce GTX 660). It is now running without errors.
01:22:08:WU01:FS01:0x22:An exception occurred at step 39908: Particle coordinate is nan
01:22:08:WU01:FS01:0x22:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
01:22:08:WU01:FS01:0x22:Folding@home Core Shutdown: CORE_RESTART
01:22:09:WARNING:WU01:FS01:FahCore returned: CORE_RESTART (98 = 0x62)

On my main machine I found these two errors in the last 24 hours:
******************************* Date: 2021-02-22 *******************************
10:18:02:ERROR:WU02:FS01:Exception: Server did not assign work unit
10:18:03:ERROR:WU02:FS01:Exception: Server did not assign work unit
10:19:03:ERROR:WU02:FS01:Exception: Server did not assign work unit
10:18:02:ERROR:WU02:FS01:Exception: Server did not assign work unit
10:18:03:ERROR:WU02:FS01:Exception: Server did not assign work unit
10:19:03:ERROR:WU02:FS01:Exception: Server did not assign work unit

16:49:55:WARNING:WU02:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration
16:49:55:WARNING:WU02:FS00:Failed to get assignment from 'assign1.foldingathome.org:80': No WUs available for this configuration

Now, both the CPU and GPU are working without issue.
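
For anyone wanting to check their own logs for the same messages, here is a minimal sketch in Python. It only looks for the message texts quoted above; the category names and the `summarize_log` helper are made up for this example, and it is not part of the FAH client.

```python
import re
from collections import Counter

# Message texts copied from the log lines quoted in this thread.
# The category names on the left are hypothetical labels for this example.
PATTERNS = {
    "nan_restart": re.compile(r"Particle coordinate is nan"),
    "core_restart": re.compile(r"FahCore returned: CORE_RESTART"),
    "no_assignment": re.compile(r"Server did not assign work unit"),
    "no_wus_available": re.compile(r"No WUs available for this configuration"),
}


def summarize_log(lines):
    """Count how many log lines contain each known message."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


sample = [
    "01:22:08:WU01:FS01:0x22:An exception occurred at step 39908: Particle coordinate is nan",
    "01:22:09:WARNING:WU01:FS01:FahCore returned: CORE_RESTART (98 = 0x62)",
    "10:18:02:ERROR:WU02:FS01:Exception: Server did not assign work unit",
    "16:49:55:WARNING:WU02:FS00:Failed to get assignment from "
    "'assign1.foldingathome.org:80': No WUs available for this configuration",
]
print(dict(summarize_log(sample)))
```

Since `summarize_log` accepts any iterable of lines, an open log file can be passed to it directly.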

Re: Dramatically Lower Points

Posted: Wed Feb 24, 2021 10:58 am
by Neil-B
Firstly, for the main machine, the two errors may well just be down to the availability of WUs for that kit's configuration. It happens from time to time, and there will be a rash of posts on the forums along the lines of "run out of WUs" or "are we low on WUs". Basically, the researchers are sometimes too busy handling all the science already processed and haven't got more projects in the queue. It is a relatively rare occurrence, but when it happens it can leave slots or machines idle, sometimes just every now and then but at other times totally idle for days. The researchers do try to avoid this, obviously.

The GTX 660 error is usually related to overclocking (or running at a clock where the GPU is unstable; sometimes even normal clocks can suffer this). If it only happens occasionally and WUs still complete, it may be a case of just monitoring it. If WUs start failing because of it, then action needs to be taken. Even factory overclocks can suffer from this, especially with CUDA-enabled a8 core WUs. Sometimes age, dust and dried TIM (thermal paste) all play into it. A search of the forums for "particle is nan" will give you more of a picture than I can, including some of the other edge cases that might cause this error.

Re: Dramatically Lower Points

Posted: Wed Feb 24, 2021 11:44 pm
by v00d00
Generally, if you get one NaN error, report it; it may be that the work unit itself is faulty. If you get loads of them, it's a hardware issue: drop the overclock, or maybe underclock a little, until it goes away.

Beyond that, do the standard maintenance tasks that are required. Vacuum the card out from time to time if it gets dusty or you see high temps on the card. If underclocking the card doesn't fix the problem, it may be a PSU issue, so test with a different PSU. If the card is oldish, you could strip it, apply some new thermal paste to the GPU core and reassemble it. Also, if any exposed chips on the card are getting hot, you could try adding a heatsink to them.
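
The "one NaN versus loads of them" rule of thumb can be sketched as a small Python check that counts NaN errors per folding slot. The threshold of 3 and the function names are assumptions picked for illustration; nothing in the client defines where "loads" begins.

```python
import re
from collections import Counter

# Matches the slot-tagged NaN line quoted earlier in the thread, e.g.
# "01:22:08:WU01:FS01:0x22:An exception occurred at step 39908: Particle coordinate is nan"
NAN_LINE = re.compile(r":(WU\d+):(FS\d+):.*Particle coordinate is nan")


def nan_counts_by_slot(lines):
    """Return a Counter mapping folding slot (e.g. 'FS01') to NaN error count."""
    counts = Counter()
    for line in lines:
        match = NAN_LINE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


def classify(count, threshold=3):
    """Rough triage. The threshold is an arbitrary assumption, not an official value."""
    if count == 0:
        return "none"
    if count < threshold:
        return "occasional"  # report it and keep an eye on the slot
    return "recurring"       # suspect clocks, cooling or PSU


nan_line = ("01:22:08:WU01:FS01:0x22:An exception occurred at step 39908: "
            "Particle coordinate is nan")
log = [nan_line] * 4 + ["10:18:02:ERROR:WU02:FS01:Exception: Server did not assign work unit"]

for slot, count in nan_counts_by_slot(log).items():
    print(slot, count, classify(count))
```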

Re: Dramatically Lower Points

Posted: Fri Feb 26, 2021 1:43 pm
by oliverjdent
@Neil-B and v00d00,

Thank you for the responses and the good information. I will check my logs more regularly.

I hope both of you have a good weekend.

Oliver.

Re: Dramatically Lower Points

Posted: Fri Feb 26, 2021 2:48 pm
by Neil-B
... and yourself ... fold on :)