one more time, gpu disabled,gpu wu computing lost
Posted: Tue Dec 14, 2021 5:48 pm
by promeneur
After a big update, I restarted my PC.
I open a kde session.
I launch fah-control.
1. first problem
Nvidia GPU slot is disabled.
Progress was 85 % before the restart.
I stop then start FAHClient.
All is ok. Now the gpu slot is enabled and computing.
2. second problem
There is a side effect: computing does not resume with the previous WU at 85 % but starts with a new WU.
Re: one more time, gpu disabled,gpu wu computing lost
Posted: Tue Dec 14, 2021 8:02 pm
by aetch
When I ran Ubuntu I found I had to re-install the GeForce drivers after updating the operating system, every time.
With the GPU slot disabled there is no longer a slot for core 22 to run on, so it just dumps the work unit. The client takes only a few seconds to do this.
By the time you re-enabled your GPU folding slot the previous work unit is long gone, so it requests a new one.
Personally, regardless of operating system I always make sure the current work units are cleared before taking my system off-line for maintenance. I never know what problems I will encounter and/or how long my system will actually be down for.
Re: one more time, gpu disabled,gpu wu computing lost
Posted: Tue Dec 14, 2021 8:12 pm
by Joe_H
aetch wrote: Personally, regardless of operating system I always make sure the current work units are cleared before taking my system off-line for maintenance. I never know what problems I will encounter and/or how long my system will actually be down for.
For the same reasons I turn off automatic updates and only leave on notifications that updates are available. I have been bitten way too many times by issues connected with letting an OS update things on its schedule instead of when I have most things that might be negatively affected taken care of first.
Re: one more time, gpu disabled,gpu wu computing lost
Posted: Wed Dec 15, 2021 1:44 pm
by promeneur
It is not the first time, and the other times it occurred simply when starting my PC in the morning.
Losing part of the computing of a WU is a loss, no?
I have never lost a CPU slot or had a disabled CPU slot. So I ask for the same thing for a GPU slot.
Re: one more time, gpu disabled,gpu wu computing lost
Posted: Wed Dec 15, 2021 3:10 pm
by aetch
I have to ask, is this a 24/7 folding machine or is it only folding a few hours a day?
If it's only folding for a few hours a day it's possible the work unit is hitting the timeout and expiry triggers.
Timeout - this is the time by which the researchers would like the work unit returned, to ensure the science progresses at a brisk pace. If this is triggered, a copy of your work unit is assigned to another folder to ensure the work is carried out and is not lost.
Expiry - the researchers won't wait any longer for you to return your work unit. The client dumps the work unit and all the work put into it is wasted.
Project summary detailing periods of timeout and expiry for the work units of each project ->
https://apps.foldingathome.org/psummary
The exact date/time at which your work units will time out/expire depends upon the date/time you were assigned them.
This is separate from the ETA, which is an indicator of when your computer expects to finish the work unit.
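As a rough illustration of how those two dates follow from the assignment time, here is a minimal Python sketch. It assumes the timeout and expiry periods are given in days, as I believe the project summary lists them; the function names and the sample values (1 and 2 days) are made up for the example and are not taken from any real project.

```python
from datetime import datetime, timedelta, timezone


def wu_deadlines(assigned, timeout_days, expiry_days):
    """Return (timeout, expiry) datetimes counted from the assignment time."""
    return (assigned + timedelta(days=timeout_days),
            assigned + timedelta(days=expiry_days))


def wu_status(now, timeout, expiry):
    """Classify a work unit relative to its two deadlines."""
    if now >= expiry:
        return "expired"      # the client dumps the work unit
    if now >= timeout:
        return "timed out"    # a copy may be assigned to another folder
    return "on time"


# Hypothetical work unit: assigned at 08:00 UTC, timeout 1 day, expiry 2 days.
assigned = datetime(2021, 12, 14, 8, 0, tzinfo=timezone.utc)
timeout, expiry = wu_deadlines(assigned, timeout_days=1.0, expiry_days=2.0)
now = datetime(2021, 12, 15, 13, 44, tzinfo=timezone.utc)
print(timeout.isoformat(), expiry.isoformat(), wu_status(now, timeout, expiry))
```

With these made-up numbers the work unit is past its timeout but not yet expired. A machine that folds only part of the day loses the remaining hours against both deadlines, which keep counting from the assignment time whether or not the client is running.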
Re: one more time, gpu disabled,gpu wu computing lost
Posted: Wed Dec 15, 2021 3:41 pm
by promeneur
FAHClient runs from morning to evening.
The lost WU computed by gpu was at progress = 85 %.
I clearly saw in the log the WU was dumped.
It's a pity because this was the first WU expected to be completed in time, after 3 WUs that were not completed in time.
Re: one more time, gpu disabled,gpu wu computing lost
Posted: Wed Dec 15, 2021 4:14 pm
by Joe_H
promeneur wrote: It is not the first time, and the other times it occurred simply when starting my PC in the morning.
Losing part of the computing of a WU is a loss, no?
I have never lost a CPU slot or had a disabled CPU slot. So I ask for the same thing for a GPU slot.
Your system will always have a CPU, so the CPU folding slot will never go away on its own. For the client, however, there will either be a detectable and usable GPU at startup or there will not be one. If some change, such as a software update to the drivers or a hardware change, causes the GPU to not be detected as usable, then the client as designed will remove the slot. There is nothing that will change that, and I do not expect that to change in the next version.
Re: one more time, gpu disabled,gpu wu computing lost
Posted: Sat Dec 18, 2021 5:38 pm
by toTOW
You have to make sure that FAHClient has access to the Internet when it starts. If it doesn't, it will fail to update the GPUs.txt file, and when that happens it disables GPU slots (as they are marked as unsupported). They will only be usable again when the client is able to access the Internet to update the file.
Another option is that when the client starts as a service, it might not have access to the GPU at system startup ... so you have to find a way to delay service startup, or you'll have to start the client manually ...
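One way to delay startup could be a small wrapper that waits for the network before the client is started. The Python sketch below is only an illustration: `can_connect` and `wait_until` are names made up for this example, and which host to probe is an assumption (any host that shows the Internet is reachable would do). How the client is then started depends on how it was installed, so that step is left out.

```python
import socket
import time


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until(check, attempts=30, delay=2.0, sleep=time.sleep):
    """Call check() up to `attempts` times, pausing `delay` seconds between tries.

    Returns True as soon as check() succeeds, False if every attempt fails.
    """
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A wrapper script could call `wait_until(lambda: can_connect("apps.foldingathome.org", 443))` and start the client only once it returns True. The same `wait_until` helper could be reused with a check for the GPU driver being ready, if you have a reliable way to test that on your system.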