Suggestion [or how to eliminate slack time between WUs?]

RMAC9.5 · Post by **RMAC9.5** » Wed Oct 08, 2008 11:13 pm

I have a slightly bigger issue with this "one work unit at a time" policy and would like to suggest that those of us on dial up be allowed to download a second work unit before the first work unit is finished. I downloaded my first WU to my home PC last night and it was scheduled to finish around 11:00 CST this morning. Since I will be at work until about 19:00 CST, my home PC will sit idle for 8+ hours because of my inability to return the completed WU over my dial up connection while I am at work.

Post by **bruce** » Wed Nov 12, 2008 6:15 am

I have a dial-up machine in a temporary location. By going back to the V5 client, I was able to select Use IE Settings and configure it for auto-dial so it runs unattended. It is running an unpatched version of Windows, so I didn't have to contend with whichever patch it was that broke Use IE Settings.

Panzergranate · Post by **Panzergranate** » Wed Nov 12, 2008 6:23 am

I really hope that FAH will be capable of caching work the way BOINC does.

Post by **bruce** » Wed Nov 12, 2008 6:39 am

FAH will not be designed to cache WUs . . . or . . . FAH cannot run within the BOINC middleware structure BECAUSE BOINC does not have a scheduling option consistent with the needs of FAH.

For FAH, it is more important to minimize the total time a WU is assigned to you (including any cache time) than it is to keep the CPU busy. The WUs are often serialized rather than parallel, and that leads to a policy of no caching of WUs.

BOINC is a good set of projects, and it's up to you to decide which suits your style better. Naturally we hope it will be FAH, but if it's not, don't stress yourself by trying to force them both into a single style.

This has nothing to do with the GPU client, so I'm moving it to the General FAH forum.

Panzergranate · Post by **Panzergranate** » Wed Nov 12, 2008 8:40 am

I'm currently participating in both BOINC and FAH.
There will always be some time when powerful GPU computation power is wasted if no cache of WUs exists. I like to keep my CPU and GPU busy all the time, otherwise they're just idling around and not doing anything good or productive.

MtM · Post by **MtM** » Wed Nov 12, 2008 9:02 am

7im wrote:Yes, but how much does delaying the return of a completed work unit to download a new WU hurt the performance of the project, and does the few minutes of idle time you avoid more than overcome that delay?

With such a large focus on getting WUs completed and returned to Stanford faster and faster from the likes of SMP and GPU clients, I would offer a guess that because of the serial nature of work units, getting the WU back sooner is more helpful to the project than reducing the slack time between WUs.

Again, I'm not here to debate, just to say that in almost 7 years of the project, if ending this slack had been significantly helpful, Stanford would have changed it.

True, except in those 7 years the way the project works has changed from 'many datapoints' to 'quick return times'. So it's getting increasingly important to lessen the delay between finishing a WU and uploading it, so maybe at least the six-hour delay should be halved (and the staff maintaining the servers doubled)?

MtM · Post by **MtM** » Wed Nov 12, 2008 9:03 am

Panzergranate wrote:I'm currently participating in both BOINC and FAH.
There will always be some time when powerful GPU computation power is wasted if no cache of WUs exists. I like to keep my CPU and GPU busy all the time, otherwise they're just idling around and not doing anything good or productive.

Which part of serial do you not understand? Just so I can try and explain it to you

Panzergranate · Post by **Panzergranate** » Wed Nov 12, 2008 10:51 am

MtM wrote:
Panzergranate wrote:I'm currently participating in both BOINC and FAH.
There will always be some time when powerful GPU computation power is wasted if no cache of WUs exists. I like to keep my CPU and GPU busy all the time, otherwise they're just idling around and not doing anything good or productive.
Which part of serial do you not understand? Just so I can try and explain it to you

Thank you. What I'd like to know is:
Is it true that FAH is indeed so serial in nature that a new WU just can't be sent to a client as cache or something?
The time wasted by such a no-cache design on an individual client may be insignificant, but the total of that time across all the users participating in FAH could potentially be really productive.

MtM · Post by **MtM** » Wed Nov 12, 2008 12:53 pm

Panzergranate wrote:
MtM wrote:
Panzergranate wrote:I'm currently participating in both BOINC and FAH.
There will always be some time when powerful GPU computation power is wasted if no cache of WUs exists. I like to keep my CPU and GPU busy all the time, otherwise they're just idling around and not doing anything good or productive.
Which part of serial do you not understand? Just so I can try and explain it to you
Thank you. What I'd like to know is:
Is it true that FAH is indeed so serial in nature that a new WU just can't be sent to a client as cache or something?
The time wasted by such a no-cache design on an individual client may be insignificant, but the total of that time across all the users participating in FAH could potentially be really productive.

It depends. Some projects that run on the new hardware (SMP and GPU) are indeed serial in nature. I'm working on an article in which I will try to answer your question, among others, so I will quote you something that has already been answered.

The newer way of thinking about things is essentially a test of whether the formula I used to calculate the percent chance of folding is accurate. If everything is simple, that simple exponential formula applies, but experimental evidence, let alone Murphy's law, would seem to indicate that things aren't so simple. For example, there could be two exponentials, not just one, and using the strategy above you'd really only see the first (faster) exponential. The contemporary approach relies on really fast machines (SMP, GPU, PS3, etc.) to make trajectories longer, instead of doing more trajectories: running a new Gen for the same Run/Clone rather than starting a new Gen 0 for a new Run/Clone. This way we can see if there is more complicated behavior. And the way to get this done is to use fast hardware with fast turnaround (deadline) times, because we need the Gen you're working on back before we can start to work on the next Gen.

Hope that helps you.
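
To make the serial argument concrete, here is a rough sketch in Python. It is not how the FAH assignment servers actually work; the function name `trajectory_wall_clock_hours`, its parameters, and all the numbers are assumptions chosen for illustration. The model assumes each Gen of one Run/Clone can only be issued after the previous Gen has been returned, and that a cached WU sits untouched until the WUs downloaded ahead of it are finished.

```python
def trajectory_wall_clock_hours(gens, compute_hours, cached_ahead, transfer_hours):
    """Hours until the last Gen of one Run/Clone is back at the server.

    Hypothetical model: Gens are strictly serial, and every Gen waits behind
    `cached_ahead` WUs that the client downloaded earlier.
    """
    wait_in_cache = cached_ahead * compute_hours
    per_gen = transfer_hours + wait_in_cache + compute_hours
    return gens * per_gen


# Illustrative numbers only: 100 Gens, 8 h of compute each, 15 min of transfer.
no_cache = trajectory_wall_clock_hours(
    gens=100, compute_hours=8.0, cached_ahead=0, transfer_hours=0.25
)
one_cached = trajectory_wall_clock_hours(
    gens=100, compute_hours=8.0, cached_ahead=1, transfer_hours=0.25
)

print(f"no cache:   {no_cache:.0f} h")
print(f"one cached: {one_cached:.0f} h")
```

Under these assumptions the donor's hardware is never idle with a cache, yet the trajectory as a whole takes roughly twice as long to finish, which is the trade-off described earlier in the thread.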

Panzergranate · Post by **Panzergranate** » Wed Nov 12, 2008 3:30 pm

MtM wrote:The newer way of thinking about things is essentially a test of whether the formula I used to calculate the percent chance of folding is accurate. If everything is simple, that simple exponential formula applies, but experimental evidence, let alone Murphy's law, would seem to indicate that things aren't so simple. For example, there could be two exponentials, not just one, and using the strategy above you'd really only see the first (faster) exponential. The contemporary approach relies on really fast machines (SMP, GPU, PS3, etc.) to make trajectories longer, instead of doing more trajectories: running a new Gen for the same Run/Clone rather than starting a new Gen 0 for a new Run/Clone. This way we can see if there is more complicated behavior. And the way to get this done is to use fast hardware with fast turnaround (deadline) times, because we need the Gen you're working on back before we can start to work on the next Gen.

So FAH GPU takes fewer approaches at a time, but investigates them more deeply to see if they can generate more interesting results.
While BOINC analyzes more possibilities at a time, so it may take longer to check each of them as thoroughly as FAH does?

Rattledagger · Post by **Rattledagger** » Thu Nov 13, 2008 2:32 am

bruce wrote:FAH will not be designed to cache WUs . . . or . . . FAH cannot run within the BOINC middleware structure BECAUSE BOINC does not have a scheduling option consistent with the needs of FAH.

Well, just my guess, but I think BOINC with <max_wus_in_progress>1</max_wus_in_progress> would on average get faster turnaround-times on SMP-wu's than the FAH-client, since many quad-core users run 2 FAH-SMP-clients.

Short deadlines give fast turnaround-times...

For FAH, it is more important to minimize the total time a WU is assigned to you (including any cache time) than it is to keep the CPU busy. The WUs are often serialized rather than parallel, and that leads to a policy of no caching of WUs.

Well, for a really serial wu, the same computer can download generation #1 and continue crunching all the way to generation #100 or whatever, without any extra download after generation #1. To keep track of the progress, it can trickle once per hour or something, and to not risk losing everything, it can use intermediate uploads, for example uploading once per day or something... Oh, and in case a particular computer is "too slow", it can just be ordered to abort the wu, and server-side can generate another wu that someone else gets, which for example starts at generation #13 or wherever the "too slow" computer had progressed to...

7im · Post by **7im** » Thu Nov 13, 2008 6:02 am

I think we have had too much speculation about BOINC vs. FAH already. Bringing it up again doesn't change the previous answers Rattledagger.

BOINC would not be faster. Its SMP support is to download 4 work units, one for each of the 4 processors, NOT to process a single work unit 4 times faster like FAH does. I don't see how Max WUs = 1 changes that or makes BOINC faster.

And we've already explained how FAH is better doing round robin WU assignments instead of doing 1 through 100 pure serial assignments. Round robin doesn't need to trickle, it doesn't need to do intermediate uploads, and it doesn't need to abort WUs. How is aborting WUs less wasteful? Nevermind, don't answer that.

And since some of the BOINC projects I've seen have to process a single work unit as many as 2 or 3 times to verify correct computations, it is 2 or 3 times more wasteful than FAH, because WUs only need to be processed once for FAH.

I don't know why you keep trying to shovel that crap around here.

(NO, I'm not calling BOINC crap. It's very good for what it does.)

RMAC9.5 · Post by **RMAC9.5** » Thu Nov 13, 2008 8:17 am

Bruce and 7im,
I am new to Folding but I am also a DC oldtimer, and some of what you say makes no sense to me. I am a dial-up user, and the fastest way for me to finish and return a WU is to allow me to cache ONE extra WU. For example, I have two ATI Radeon 3850 video cards which I recently bought for Folding. They take anywhere from 8 to 16 hours to complete a GPU WU, depending on how many other DC processes are running in parallel. If I could cache ONE extra GPU WU per PC, I would configure these two PCs so that the video cards would run flat out, and each would finish 2.5 to 3 GPU WUs per day. Currently, I can't/won't do this because I am not willing to let both the video card and the CPU sit idle for multiple hours per day. Instead of finishing 5 - 6 GPU WUs per day, these two PCs finish 2 - 3 GPU WUs per day.

I also have 4 other PCs with empty PCI-E video card slots that could be used for Folding, but the management effort needed to make sure that each GPU folding run finishes in the morning before I go to work or in the evening after I come home from work is simply too great.
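
The donor-side arithmetic here can be sketched in a few lines of Python. This is only an illustration: the helper `wus_per_day` is a hypothetical name, the 8-hour compute time is the low end of the range quoted above, and the 8-hour idle gap is an assumed figure borrowed from the work-day example in the first post.

```python
def wus_per_day(compute_hours, idle_hours, machines=1):
    """WUs finished per day when each WU costs compute time plus idle time."""
    return machines * 24.0 / (compute_hours + idle_hours)


# Two GPUs running flat out (a cached WU removes the idle gap).
flat_out = wus_per_day(compute_hours=8.0, idle_hours=0.0, machines=2)

# Same GPUs, but each finished WU waits 8 h before it can be returned.
with_idle = wus_per_day(compute_hours=8.0, idle_hours=8.0, machines=2)

print(f"flat out:  {flat_out:.1f} WUs/day")
print(f"with idle: {with_idle:.1f} WUs/day")
```

With these assumed figures the two cards drop from 6 to 3 WUs per day, which is in line with the "5 - 6" versus "2 - 3" estimate in the post.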

Rattledagger · Post by **Rattledagger** » Thu Nov 13, 2008 10:58 am

7im wrote:I think we have had too much speculation about BOINC vs. FAH already. Bringing it up again doesn't change the previous answers Rattledagger.

BOINC would not be faster. Its SMP support is to download 4 work units, one for each of the 4 processors, NOT to process a single work unit 4 times faster like FAH does. I don't see how Max WUs = 1 changes that or makes BOINC faster.

BOINC v6.3.xx supports both SMP and GPU-crunching, meaning there's no problem running an application that, for example, uses 5 CPUs. Projects can even specify that they're using a fractional CPU, for example "use 1 GPU + 0.5 CPU".

This, together with "max_wus_in_progress", which can be used to disable caching of wu's in a particular project, is a set of fairly new features that were not present during the "beta"-test back in 2005.

The only info posted by Pandegroup is "some issues", with no specification of what these "issues" are...

And since some of the BOINC projects I've seen have to process a single work unit as many as 2 or 3 times to verify correct computations, it is 2 or 3 times more wasteful than FAH, because WUs only need to be processed once for FAH.

Hmm, what has this to do with whatever scheduling-needs FAH would have?

Various projects choose the replication they need to make sure the results are scientifically usable. This includes running some projects with 1 result/wu, just like FAH does, but AFAIK the max is 5 results/wu issued with 3 validated results/wu needed, as LHC@home uses since they need the results back ASAP.

Projects can even choose to use another fairly new feature called "adaptive replication", a method that should give around 1.1 results/wu instead of the 2 results/wu that is common if replication is needed. At least one project uses this method, while SETI@home is actually thinking about it and is currently testing it on their beta-test.

MtM · Post by **MtM** » Thu Nov 13, 2008 4:56 pm

Panzergranate wrote:
MtM wrote:The newer way of thinking about things is essentially a test of whether the formula I used to calculate the percent chance of folding is accurate. If everything is simple, that simple exponential formula applies, but experimental evidence, let alone Murphy's law, would seem to indicate that things aren't so simple. For example, there could be two exponentials, not just one, and using the strategy above you'd really only see the first (faster) exponential. The contemporary approach relies on really fast machines (SMP, GPU, PS3, etc.) to make trajectories longer, instead of doing more trajectories: running a new Gen for the same Run/Clone rather than starting a new Gen 0 for a new Run/Clone. This way we can see if there is more complicated behavior. And the way to get this done is to use fast hardware with fast turnaround (deadline) times, because we need the Gen you're working on back before we can start to work on the next Gen.
So FAH GPU takes fewer approaches at a time, but investigates them more deeply to see if they can generate more interesting results.
While BOINC analyzes more possibilities at a time, so it may take longer to check each of them as thoroughly as FAH does?

What the quote is meant to illustrate is the different approaches. I don't like to get into a BOINC vs FAH argument, but I see it will happen anyway, so maybe I should just offer my view, even if it's not really a meaningful one, I'm afraid.

What boinc does is what fah did in the past, or still does with the uniprocessor clients.

Back in the day, we couldn't get long trajectories. By long I mean, we couldn't even come close to getting simulations that were comparable to the time it typically takes for a protein to fold in experiments. For example, the villin headpiece molecules fold in about 50 microseconds in a test tube. Simulations were originally limited to maybe 50 or 100 nanoseconds, about 1,000 times less. The idea is that the probability of a 50-nanosecond trajectory resulting in a folding event, for a protein that folds in 50 microseconds on average, is

p(folding) = 1 - exp[ -(50 ns)/(50,000 ns) ] ≈ 0.1 %

It's kind of like flipping a biased coin: say 99.9 % of the time you get heads, but if you're lucky, on one flip you *might* get tails. Of course, if you want to see tails, then you could always try to flip a lot, and occasionally (like 1 in every 1,000 flips) you'll observe tails. Likewise, if there's a ~0.1% chance of observing folding in a 50-ns trajectory, then if we run 1,000 simulations, on average one of them should fold; if we run 10,000 simulations, then ~10 should fold, etc. This is the reason for F@H in the first place: thousands of client machines mean thousands of "coin flips" or attempts to fold the protein, which increases our chances of actually seeing tails or a protein fold.
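
The numbers in this coin-flip picture can be checked with a short Python sketch. It assumes the simple single-exponential model from the formula above; the function names `p_fold` and `p_at_least_one` are hypothetical helpers, not anything from the FAH software.

```python
import math


def p_fold(traj_ns, mean_fold_ns):
    """Chance that one trajectory of length traj_ns shows a folding event,
    assuming single-exponential folding kinetics."""
    return 1.0 - math.exp(-traj_ns / mean_fold_ns)


def p_at_least_one(p, n_trajectories):
    """Chance that at least one of n independent trajectories folds."""
    return 1.0 - (1.0 - p) ** n_trajectories


# 50 ns trajectories for a protein that folds in 50 microseconds on average.
p = p_fold(traj_ns=50.0, mean_fold_ns=50_000.0)
print(f"p(folding) per trajectory: {p:.4%}")

for n in (1_000, 10_000):
    expected = n * p
    print(f"{n} trajectories: about {expected:.1f} folding events expected, "
          f"chance of at least one = {p_at_least_one(p, n):.1%}")
```

Note that "one expected event in 1,000 trajectories" is an average: under this model the chance of seeing at least one fold in 1,000 runs is about 63%, which is one more reason to run many trajectories.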

As said, that's still the way BOINC works at present. If you compare that with the current approach of Folding@home, you will, I hope, see that there is no way for BOINC to be faster: they might allow you to cache WUs and not grind to a halt when a technical difficulty strikes, but they need more completed work units (and the multiplier on this is HUGE) to get any sensible results.

Of course both SMP and GPU are still having troubles, but supporting them now will only mean they get worked out quicker! They need you, yes, you specifically, because if you weren't interested and committed to trying to make a difference, you wouldn't bring this up here. So I kind of expect that if I can convince you, you will continue with that interest and commitment in a way supportive of the project: by running the clients, and by reporting the problems you run into in a way that helps them fix them.

That being said, my gf's PC runs HCC, a BOINC project. It's a bit more appealing with its screensaver and she likes having it on. Both she and I have lost relatives to cancer, and reading on your screen that you're participating in a project called Help Conquer Cancer is a valuable asset to the project. I never ran it under my own name; I swapped my CPU for a GPU from someone else who did BOINC as a team commitment. He just told me yesterday he was quitting and I could take the CPU off the project as well; it's just not that high a priority for me atm.

So, imho, BOINC is not bad, but F@H is better, with the side note that BOINC's advantages lie primarily in its interface. And I would love for FAH to get to a point where it could also boost itself by having a single interface to control all its clients, and maybe a way to be more selective about which project ranges you prefer to participate in. F@H is aimed at many diseases, and only from the project description can one get some summary information about a particular work unit's purpose. I would like to see that changed for the better, as I know people like to feel more involved, and you accomplish this by telling them more about what you are doing and how it works. I already think the project is making a lot of progress in communicating with the donors, but it can always be improved.

Does that satisfy you for an answer?
