Please allow smaller WUs on FAHClient

Posted: Sun May 19, 2019 11:04 pm
by ifolder
Today both CPU and GPU WUs take a lot of time to complete.

With summer coming I will need to switch off my folding computers for several months because the daytime temperature is far too high.

It would therefore be great to have a --wu-size small/medium/normal option to be able to ask for smaller WUs that could be completed during the night, when the temperature is lower. (Btw the max-packet-size option isn't OK as the WU completion time is not at all related to the packet size).

And I believe it won't be too difficult to implement as the base_credit parameter is closely related to the computing time required to complete the WU.

Moreover the NaCl client already gets smaller WUs so generating smaller WUs is possible.

Re: Please allow smaller WUs on FAHClient

Posted: Mon May 20, 2019 12:49 am
by Theodore
I've been thinking of another possibility:
when a WU is dumped as the client is closed, upload the results up to the last saved state.

Re: Please allow smaller WUs on FAHClient

Posted: Mon May 20, 2019 3:26 pm
by bruce
Technically speaking, there are already "smaller" WUs but they're assigned randomly ... and increasing the number of projects containing them would help, too.

Yes, WUs can be constructed with durations of small/medium/normal/(any). I'll pass your suggestion on (again) but I'm not too hopeful anything will happen soon -- if at all.

Generally the scientist running the project chooses a size that's a compromise based on the time spent computing across the expected range of hardware and the time wasted downloading/uploading the data. (With shorter WUs, a bigger percentage of the time is wasted NOT processing.)

There is no option in the client to specify wu-size, and adding a new option like that requires a new FAHClient, a new FAHControl, changes to the servers, etc. Packet-size is currently unused and in some cases could be substituted to obtain the same functionality -- but that's an extra step for the project owners and a donor re-education issue so it's not ideal either -- and also requires the support of all project owners.

The new feature will also require support from all project owners since the default will continue to be "any".

NaCl does have small WUs, but from the server's perspective, the efficiency of those projects is inferior to other projects.

https://github.com/FoldingAtHome/fah-issues/issues/1282

Re: Please allow smaller WUs on FAHClient

Posted: Mon May 20, 2019 8:46 pm
by ifolder
bruce wrote:Yes, WUs can be constructed with durations of small/medium/normal/(any). I'll pass your suggestion on (again) but I'm not too hopeful anything will happen soon -- if at all.
Thanks for submitting the issue. Unfortunately, I'm not too hopeful either... F@H unfortunately has a reputation for not caring much about the donors' needs, despite the fact that we are the ones who spend money on hardware and electricity and we are the ones annoyed by the hardware noise and heat, while they get paid every month for doing their research, get all the glory from published papers and eventually the money from any patents that result from them...
bruce wrote:Generally the scientist running the project chooses a size that's a compromise based on the time spent computing across the expected range of hardware and the time wasted downloading/uploading the data. (With shorter WUs, a bigger percentage of the time is wasted NOT processing.)
On a 12-hour CPU WU the networking part takes less than 0.1% of the time. Moreover it is invisible thanks to the prefetch (download at 99%).
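
To put rough numbers on this, here is a minimal sketch of the arithmetic. The 40-second transfer time is a hypothetical figure, not a measurement, and the sketch assumes the transfer time stays the same when the WU is made 10 times shorter, which may not hold for real projects. It only counts time on the donor's side, not any delay on the server.

```python
def overhead_fraction(compute_seconds, transfer_seconds):
    """Share of total wall time spent transferring rather than computing."""
    return transfer_seconds / (compute_seconds + transfer_seconds)

TRANSFER_SECONDS = 40.0                   # hypothetical upload + download time
LONG_WU_SECONDS = 12 * 3600.0             # the 12-hour CPU WU mentioned above
SHORT_WU_SECONDS = LONG_WU_SECONDS / 10   # assumed WU that is 10x shorter

long_overhead = overhead_fraction(LONG_WU_SECONDS, TRANSFER_SECONDS)
short_overhead = overhead_fraction(SHORT_WU_SECONDS, TRANSFER_SECONDS)

print(f"12-hour WU:  {long_overhead:.3%} of the time is networking")
print(f"1.2-hour WU: {short_overhead:.3%} of the time is networking")
```

With these assumed numbers the networking share grows roughly tenfold for the shorter WU but stays below 1% in both cases.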

I would add that shorter WUs in general may reduce the dumping rate, as it is easier to accept waiting a few minutes for a WU to complete than waiting for hours...

Re: Please allow smaller WUs on FAHClient

Posted: Mon May 20, 2019 9:55 pm
by MeeLee
@Bruce, likewise I wouldn't mind if the option for longer WUs was available if that is more beneficial to FAH.
I have one PC running 24/7, and whether WUs are 1 hour or 24 hours makes little difference to me.
If large WUs get better scores, I might actually prefer them.

However, the option to send results from a partially processed WU would be interesting as well!

Re: Please allow smaller WUs on FAHClient

Posted: Mon May 20, 2019 10:06 pm
by JimboPalmer
{I am just a donor like you, not in any way affiliated with the project. Some time in early 2009 I decided my computers needed a hobby, and my Director liked F@H, so I had 55 old PCs crunching away. I think I am down to 9 now. (11 slots) So over 150,000 WUs and it is still just a hobby for my PCs}

Priority, as I see it.
1. Science, if it does not make more science, meh.
2. Security, if it is not trustworthy, it is useless.
3. Research: publish or perish.
...
837. Donors. If it takes little time and effort, why not?

This is as it should be, if we are not making repeatable results, why are we here? F@H exists because the researchers had no money and needed an infinite amount of computing resources.
They want donors who run 24/7 with the latest hardware; we do our best with what we have. F@H is very dependent on 24/7 results. One slow WU will slow the entire generation of research.

For projects without this restriction, the Android and NaCl clients exist. You could write a script to run NaCl during your off hours. Just be aware, it will reduce your value to the project.

As near as I can tell, they hire one 'structural programmer' who writes the server, control and client code you would need to have in place. Once all that was done, researchers would have to sort WUs into 'bins' and write WUs that look into those bins.

(The ability to specify what kind of research you wish to support existed long before the servers would direct your resources to that kind of research, it is not that it never gets done, just do not hold your breath waiting for it. It was a Donor request and it eventually got done.)

Besides the one IT guy and biochemists trying to program, there exists a third method. Both AMD and Nvidia have donated programming skill to F@H. Sony wrote a PS/3 and Android client. If you fund programming on topics that interest you, everyone benefits. That is beyond my resources, but I do not speak for you.

You misunderstood the issue of breaking WUs into pieces: that is going to cost time for the Researcher, for their computers, and for their internet connection. That it works well at your end is not the limit. It just shows their one programmer is really good.

If you are dumping any WUs, F@H is probably better off without you. You are slowing down the work of everyone.

{There, I ranted}

Re: Please allow smaller WUs on FAHClient

Posted: Tue May 21, 2019 12:06 am
by bruce
It's unlikely that variable-length WUs will be adopted -- such as capturing a fraction of an incomplete WU. There are a number of convenience-based reasons why all Gens of a P/R/C that are collected are equal lengths.
ifolder wrote:F@H unfortunately has a reputation for not caring much about the donors' needs, despite the fact that we are the ones who spend money on hardware and electricity and we are the ones annoyed by the hardware noise and heat, while they get paid every month for doing their research, get all the glory from published papers and eventually the money from any patents that result from them...
That's a common misconception when you look at FAH from the Donor's perspective. In fact, the costs of running projects are quite significant when you look at FAH from the University's perspective. A lot of the support costs are provided by volunteers and the permanent staff have to write proposals to obtain grants to buy servers and hire the remaining support staff. Most of the peer-reviewed papers that are published are part of the research required for students to obtain advanced degrees and the resulting information isn't sold to drug companies or others but placed in the public domain.
ifolder wrote:On a 12-hour CPU WU the networking part takes less than 0.1% of the time. Moreover it is invisible thanks to the prefetch (download at 99%).
That's true, from our perspective, but from the server's perspective, there is additional delay while your results are prepared to be issued as Gen (N+1) and then it waits on the server for somebody to ask to download it. The server needs to have enough "extra" WUs so that it never runs out, so the outgoing queue must be kept at positive length even with the variability of the number of active donors as well as accommodating any WUs that are dumped or lost/reissued or otherwise delayed.

Re: Please allow smaller WUs on FAHClient

Posted: Tue May 21, 2019 2:27 am
by MeeLee
JimboPalmer wrote: If you are dumping any WUs, F@H is probably better off without you. You are slowing down the work of everyone.
I disagree, and think this comment is very short sighted.

There are times when it doesn't matter if a WU is manually dumped or automatically dumped due to timeout:
- On machines that are turned off for longer than a day or two, WUs get dumped automatically.
- On machines where a system upgrade, or new OS is installed, WUs get lost as well.

It would make sense, in a way, that it would benefit FAH to know when a WU is dumped, so it can be assigned to someone else; rather than having to wait until the server determines that the WU is lost or timed out.

In my case, I occasionally dump WUs manually because I have limited time.
I set my WUs to finish at a certain time.
If they aren't finished by this time, I'm forced to pull the plug, and whatever has been processed is wasted.
This is the disadvantage of a portable unit. I don't have the luxury to wait for WUs to finish.
Yet this unit also is responsible for ~4M Points per week.
I think FAH would rather see the 4M points this machine does per week, than nothing.

As far as making science wait; I regularly receive WUs that have been done before by others, sometimes 6 months before it was assigned to me.
So I don't particularly think I'm making someone wait; when the WU can get reassigned the very next day if need be!

Re: Please allow smaller WUs on FAHClient

Posted: Tue May 21, 2019 4:15 am
by bruce
Dumping WUs should be avoided, if possible, but it isn't always possible. Your example of a WU that cannot be completed by the timeout is a good one. When you do dump a WU (in an approved way) the client will notify the server that it has been dumped and it will be made eligible for reassignment immediately -- as opposed to waiting until the WU expires.

This debate has no single answer, but neither is the comment as short-sighted as you suggest. It is partly short-sighted -- certainly not "extremely".
MeeLee wrote:
JimboPalmer wrote: If you are dumping any WUs, F@H is probably better off without you. You are slowing down the work of everyone.
I disagree, and think this comment is very short sighted.
In the case where you're certain a WU will be dumped, it's best to set FINISH as early as possible so no new WUs will be assigned and then dump the doomed WU as early as possible.

FAH does not reissue WUs unless the WU expires or it has been reported as failed. Unless the eventual analysis of the WUs demonstrates that there was a massive error in a FAHCore and it has been fixed, repeating a project is NOT done. I'm not sure what happened 6 months ago. It seems more likely that a project could determine that the statistics of the dynamic conformational landscape of the protein showed that parts were inadequately sampled and additional Runs/Clones were needed to validate the statistics. Runs/clones would again start from 0 even though they would be different.
MeeLee wrote:As far as making science wait; I regularly receive WUs that have been done before by others, sometimes 6 months before it was assigned to me.
So I don't particularly think I'm making someone wait; when the WU can get reassigned the very next day if need be!

Re: Please allow smaller WUs on FAHClient

Posted: Tue May 21, 2019 6:45 pm
by ifolder
JimboPalmer wrote:Priority, as I see it.
1. Science, if it does not make more science, meh.
2. Security, if it is not trustworthy, it is useless.
3. Research: publish or perish.
...
837. Donors. If it takes little time and effort, why not?
This is your opinion. If you're happy with the present situation good for you.
In my opinion donors put a lot of effort into providing computing power to F@H, so their needs should be considered a little bit more by PG.
JimboPalmer wrote:If you are dumping any WUs, F@H is probably better off without you. You are slowing down the work of everyone.
MeeLee wrote:I disagree, and think this comment is very short sighted.
It is not only short sighted but also a dumb comment!
FYI Mr JimboPalmer, who is ranked 1558, I am in the top 150 and have earned 10x more points than you.
So no, I don't think F@H would be better off without me and I don't think that I am slowing down the work of everyone because sometimes I need to dump a WU.

Anyway, to return to the original topic: if F@H implemented my suggestion, more science would be done thanks to the higher flexibility.
It would also help reduce the dumping rate, since dumping is sometimes unavoidable, as explained by MeeLee.

Re: Please allow smaller WUs on FAHClient

Posted: Tue May 21, 2019 7:36 pm
by ifolder
BTW, since shorter WUs would mean shorter completion times, this would allow PG to set shorter WU expiry times which would lessen the impact of residual dumping.

Re: Please allow smaller WUs on FAHClient

Posted: Wed May 22, 2019 4:24 pm
by bruce
There's a problem with shorter WUs and the way the QRB works which probably doesn't have a solution.

The NaCl WUs run on the CPU doing a similar analysis to FAHCore_a4 BUT THEY INTENTIONALLY DO NOT RECEIVE A BONUS.

Suppose I create a "normal" WU for GPUs that takes between 0.5 days and 5.0 days (depending on my GPU speed) and I set a deadline of 5 days. [Fast GPUs complete WUs in, say, 10% of the deadline, slow GPUs complete WUs just in time.] Now suppose I create a "short" WU that is 10% as long and I set the deadline to 0.5 days. What's going to happen to the bonus?

What I predict is going to happen immediately is that everyone who has a GPU that can complete the first WU in 0.5 days will immediately do everything they can to get assignments from the new WUs because they will be completing the new WU in 0.05 days (1% of the new deadline) giving them a greatly exaggerated bonus and draining all available WUs from the servers.

FAH wants to be fair to everyone, but what can be done to limit the greatly exaggerated bonus and avoid WU-hogging?
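
A minimal sketch for experimenting with this scenario. It assumes the bonus takes the form given in the F@H points FAQ as I understand it, final_points = base_points * max(1, sqrt(k * deadline / elapsed)). The base credit and k values below are hypothetical, the function names are made up for illustration, and real projects may be configured differently.

```python
import math

def qrb_points(base_points, k, deadline_days, elapsed_days):
    """Assumed QRB form: base * max(1, sqrt(k * deadline / elapsed))."""
    bonus = math.sqrt(k * deadline_days / elapsed_days)
    return base_points * max(1.0, bonus)

def points_per_day(base_points, k, deadline_days, elapsed_days):
    """Points earned per day if the GPU folds this kind of WU back to back."""
    return qrb_points(base_points, k, deadline_days, elapsed_days) / elapsed_days

BASE = 10_000.0  # hypothetical base credit of the "normal" WU
K = 10.0         # hypothetical bonus factor

# Fast GPU on the "normal" WU: 0.5 days of work, 5-day deadline.
normal = points_per_day(BASE, K, deadline_days=5.0, elapsed_days=0.5)

# "Short" WU, 10% of the work and base credit, deadline scaled to 0.5 days.
short_scaled_deadline = points_per_day(BASE / 10, K, deadline_days=0.5, elapsed_days=0.05)

# Same "short" WU, but with the deadline left at 5 days.
short_same_deadline = points_per_day(BASE / 10, K, deadline_days=5.0, elapsed_days=0.05)

for label, ppd in [("normal", normal),
                   ("short, 0.5-day deadline", short_scaled_deadline),
                   ("short, 5-day deadline", short_same_deadline)]:
    print(f"{label:>24}: {ppd:,.0f} points per day")
```

Comparing the two "short" results with the "normal" one shows how much the outcome depends on which deadline the short WU is given; changing `K`, the deadlines, or the elapsed times lets you try slower GPUs as well.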

Re: Please allow smaller WUs on FAHClient

Posted: Wed May 22, 2019 8:22 pm
by ifolder
There is something I don't get in your explanation.
Let's say that instead of issuing one big WU, you issue 10 small WUs which are a tenth of the big WU and which together correspond to the same amount of computation as the big WU.
What prevents you from simply adapting the basecredit and the bonus values of the small WUs so that folding the big WU or folding the 10 small WUs brings the same number of points on the same hardware?

And to be more extreme, why not make all WUs 10 times smaller, as it would bring more flexibility, more folding, less dumping, shorter deadlines, quicker reassignment, etc.?

Re: Please allow smaller WUs on FAHClient

Posted: Wed May 22, 2019 9:30 pm
by MeeLee
There already are short, medium and long WUs, which are assigned randomly.
Currently, shorter WUs have credits linear with their length.
For instance,
I notice that a 1 hour WU gets me ~60k points.
A 2 hour WU on the same system, gets me ~120k points.
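
As a quick check of that arithmetic, using the approximate figures quoted above (the function name is just for illustration):

```python
def points_per_day(points, hours):
    """Convert the credit for one WU into points per day of continuous folding."""
    return points / hours * 24

ppd_short = points_per_day(60_000, 1)    # ~60k points for a 1-hour WU
ppd_long = points_per_day(120_000, 2)    # ~120k points for a 2-hour WU

print(ppd_short, ppd_long)
```

Both come out at about 1.44M points per day, which is what "linear with their length" means here.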


I was thinking of automatically assigning short WUs to systems that fold only occasionally.
And on systems that have folded 24 hours continuously (with faster GPUs), assign large WUs that take a day or two to complete.
I suggest this because there are fewer large work units than medium or small ones.
That would get rid of half the problem.

Re: Please allow smaller WUs on FAHClient

Posted: Wed May 22, 2019 10:39 pm
by bruce
Good suggestions. Keep them coming.

If the base credit is set at 10%, that only works if you ignore the bonus. Whenever you adjust the deadline, the non-linearity of the bonus curve changes so even if you manage to get the same PPD for one particular GPU performance value, it won't work for GPUs that are either faster or slower.