Re: Change in BA requirements

Posted: Mon Jan 13, 2014 2:50 am
by bruce
Early on, Kasson said that core-count is an unreliable control variable to achieve X% of the machines (or X% of the work, whichever they have in mind) but it's the only number he has to work with. Actually, he also has the deadline to work with.

Building on mdk777's proposal above, let me restate the same problem in different terms.

Suppose we have reliable information about what percent of the work (or cores) is in BA compared to the desired amount of work (or cores), and the Pande Group wants to exclude assignments to the machines that fall between those two numbers. New question: how much should they shorten the deadlines to achieve the desired result? [Note: I don't trust anybody's current estimates of the first two numbers, but somebody official can figure them out and publish them.]

If, after the deadlines are adjusted, the desired result is achieved and it's a stable number - GREAT. If it is not, or it drifts too much, how often should those deadlines be readjusted in search of the desired goal? Would that degree of unpredictability be acceptable to the Donor community?

It reminds me a lot of the politicians (and the Fed) trying to control "the economy" but aside from being off-topic in this topic, a discussion of that anywhere on this forum would most certainly be censored.

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 2:51 am
by 7im
Since the Uniprocessor fahcore_78 went away in August 2013, all CPU clients are now technically SMP clients, even if they just run on one core. So there aren't that many single core machines any more.

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 2:58 am
by bruce
troy8d wrote: I believe the difficulty you are having reconciling these numbers may be based on the assumption that all windows/os x/linux clients are running SMP and neglecting to account for the fact that a large portion of these clients are uniprocessor. *** Correcting for this provides a much better estimate. This estimate still requires assumptions about the cores per machine for both bigadv and smp clients and I trust that PG has much more accurate measurements of these statistics than we can estimate from the client statistics.

*** and you're neglecting GPU.

The numbers 1% or 5% may sound precise but they are generalities / estimates. Even if they were exact, we don't have a definite answer to "Percent of what?"

Let's celebrate what we DO know, not what we DON'T know yet. We don't know enough to solve everything precisely but we do know a lot more than we did a day or so ago. Most of what we're talking about is an indication of direction and therefore still generalities, not final details.

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 3:07 am
by mdk777
bruce wrote: If, after the deadlines are adjusted, the desired result is achieved and it's a stable number - GREAT. If it is not, or it drifts too much, how often should those deadlines be readjusted in search of the desired goal? Would that degree of unpredictability be acceptable to the Donor community?

It is self-regulating once you agree on an aim and control limits.

Example: today we have crossed the upper control limit of 5% of SMP WUs being processed by BA.
Reset to the aim: as of today the deadline is set so that 2.5% of SMP WUs are processed by BA ... people find that their 24-core machines no longer make the deadline but their 36-core machines do easily.

watching the change in percent of BA lets you know when the next adjustment will have to occur.

If many people build 48-core machines, it will happen faster. If many people run regular SMP on 16-thread machines, it will take longer ... but the donor has a chart he can track and make inferences from. :mrgreen:

But you could also chart and set date certain for adjustments every six months. :wink:

Today I look at the chart and it shows BA is out of control at 8% and the next adjustment will occur in 3 months.
Well, I know it will be a big one and will cull many BA rigs running now. :ewink:
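
The aim-and-control-limit idea above can be sketched roughly as follows. This is only an illustration: the 5% and 2.5% figures are the ones used in the example in this post, the function names are hypothetical, and nothing here reflects an actual Pande Group policy or tool. It also assumes "percent of BA" means BA work units as a fraction of all SMP-class work units.

```python
# Hypothetical sketch of the control-limit check described in this post.
# The limits are the example figures from the post, not official numbers.
UPPER_CONTROL_LIMIT = 0.05  # BA share that triggers a deadline adjustment
AIM = 0.025                 # BA share the adjusted deadline should restore


def ba_share(ba_wus: int, total_smp_wus: int) -> float:
    """Fraction of all SMP-class work units that were processed as BA."""
    if total_smp_wus <= 0:
        raise ValueError("total_smp_wus must be positive")
    return ba_wus / total_smp_wus


def needs_adjustment(share: float, upper: float = UPPER_CONTROL_LIMIT) -> bool:
    """True once the BA share has crossed the upper control limit."""
    return share > upper


def excess_over_aim(share: float, aim: float = AIM) -> float:
    """How far the current share sits above the aim (0.0 if at or below it)."""
    return max(0.0, share - aim)


# The "chart shows 8%" case from the post: out of control, with a big
# gap down to the aim, so the next adjustment will be a large one.
share_today = ba_share(8, 100)
print(needs_adjustment(share_today))  # True
```

A donor tracking the published share over time could use the gap returned by excess_over_aim as a rough hint of how severe the next adjustment is likely to be.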

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 3:19 am
by Grandpa_01
The only way to limit machines is by deadlines; anything that is client-related can / will be spoofed. I would suggest biannual reviews. I doubt that is actually necessary, but it is more of an insurance policy if a sudden need arises. The relatively short history of bigadv shows that it is not that frequently needed, but that may change in the future. It does not really matter as long as there is a road map and it is stuck to and updated with each review. PG will have to make sure they get their deadlines as close as possible and then live with them, otherwise the road map becomes a worthless piece of paper.
The point is that something is needed. Whether it is quarterly, biannual, etc. does not matter; what does matter is that it exists and that it is well publicized, easy to find and kept updated.

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 3:25 am
by mdk777
Grandpa_01 wrote: I doubt that is actually necessary but is more of an insurance policy if a sudden need arises.

I agree. I'm not getting into the why or rationale for predetermining what percentage BA should be of total SMP...
Just that it isn't difficult to provide feedback to donors on how well (or how excessively) BA is achieving that goal. :wink:

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 3:29 am
by bruce
I've seen people ask for advice on how to build a new BA system. They rarely get much help on this forum but there's always help to be found on the enthusiast forums.

I could easily have missed it, but I don't remember ever seeing something like "Don't waste your money because with that budget, it's a poor choice because...." followed by information and predictions about obsolescence. Whose responsibility is that? Sure, some of you knew there would be periodic changes to requirements and some didn't, but shouldn't that knowledge still have translated into some kind of warning?

Granted, FAH only made one adjustment so maybe it was too easy to put out of your mind, but the information was there and who did anything about it except the FAQ and the "Pande Group apologists"?

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 3:45 am
by mdk777
BIGADV WU have indeed been the ppd/ppw leaders for the last few years.

however, to someone new reading this thread:

be warned that there is no assurance these WU will continue indefinitely, or even for the long term. :wink:

I would investigate the number of remaining WU before investing the time/effort and considerable capital involved in building a 4P folding rig at this time. :mrgreen:
viewtopic.php?f=38&t=23944&start=45#p253107

I have also consistently told people that folding on your gaming GPU is without a doubt the lowest cost/ lowest risk option.

Compared to total TFLOPS including GPU, we are debating about less than a tenth of a percent, and not 5%.

Again, I don't see the pressing need to alienate donors over less than a tenth of a percent, but we have already crossed that Rubicon.

I'm just suggesting methods of control if that control is deemed necessary.

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 4:31 am
by Bill1024
Until we know what the deadline is it's all up in the air.
Today the slowest AMD 4P 24 core will not make deadline on the 8101 WU.
We do not know if the slowest AMD 32-core 4P will make the deadline. An Intel 2P 16/32 with turbo and some OC may make it.

We know that switching 24-, 48- or 64-core servers over to SMP does not get them working on the WUs where there are 180,000 sitting there.
Something to think over too. The power is in the vast numbers of cores working on one single WU.
When you divide a 24 core up, say into 4 equal parts, 6 cores at 2.1ghz - 2.5ghz is SLOW compared to a new Intel quad core at 4.5ghz. AMD has a 5ghz chip out.
When I divided mine up, PPD went from 175k PPD on BA to half that doing -SMP 24, then down to 37k PPD total doing SMP broken down into 4 clients running.
Loss of QRB and return time is the cause? I guess. I forgot to mention that before.
Still, PG has not said there is a backlog, and while nudging donors off 24-core BA to SMP it has not asked them to divide their machines up further. Sad to say, not many will be willing to do that.

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 5:06 am
by k1wi
If there can be a more defined roadmap I'm happy for that, so long as it doesn't hamstring PG. At the end of the day I want science to advance as fast as practical* (*where by practical I mean that the costs borne by the community as a result of such progress are not excessive).

What follows is some thinking out loud, so I hope I don't cause offense to anyone.

In light of the changes we've requested of PG, looking ahead perhaps we can make some improvements of our own. Particularly in how we advise potential BA folders and ourselves evaluate whether or not to go BA. Because in the end I'd hate for BA to get pulled because it causes too much strife. I'm thinking that in exchange for a more defined roadmap, perhaps we as donors and participants of the 'community' need to advise donors thinking of going into BA that their system won't be earning BA points for an unlimited(perhaps indefinite?) period.

Perhaps as part of the decision as to whether to go BA (or not) we need to encourage donors to ask themselves whether they would be prepared to fold on the rig if it reverted back to SMP after the more defined period? The economist in me sees the BA bonus as needing to cover not just the higher set up costs and maybe running costs, but also the risk associated with when the next adjustment will happen, and whether the period of bonus relative to the expected life of the server is worth it. If there is uncertainty about when the next adjustment is going to happen that should increase the risk and therefore cost...

Prof. Pande gave us a tentative road map early on in the piece, with the next revision toward the end of the year and I'm pretty confident he wishes he could give a more definitive timeline without the progress of science being unduly constrained. A more conservative approach by us donors in evaluating the risk effectively increases the cost of BA folding, and would likely result in a slow down in the BA race. That is, those on the margin decide the BA bonus isn't worth the higher cost associated with the more explicitly stated risk.

But maybe that is a worthwhile outcome? Particularly given the sums of money involved here.

---
Also, Bill - regarding what happens to BA folders when they drop off from BA to SMP - maybe that's something we can work on with PG. As I understand it, ideally they want CPU cores split across as few clients as possible - that is a single client with as many cores as possible - but not all SMP WUs play nice with some/most higher core counts? Certainly 1 client will significantly outperform 4x 1/4 clients...

Maybe there is something we can do about improving things there - initially with existing projects and in the future during the beta testing process?

Coming BA Changes

Posted: Mon Jan 13, 2014 2:22 pm
by emmanon
I'm not sure if this is the right forum for this post. Does anyone have any new information about the February and April BA changes :?:
Thanks

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 4:06 pm
by PantherX
Welcome to the F@H Forum emmanon,

Please note that so far, Dr. Vijay has answered some questions raised in this thread. However, there hasn't been any new announcement about the February/April Bigadv changes. If any changes were to be announced, they would surely be posted on the blog (http://folding.stanford.edu/home/blog).

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 4:46 pm
by mdk777

BA has always been intended to be used for a few calculations/Projects in FAH that requires the most processing power; larger RAM, more cores, and more bandwidth than typical FAH calculations. However, as time goes on, the BA requirements are met by more and more donor machines due to technological advancements. If we wait too long, a large fraction of donor machines will become BA capable (and in terms of the computing power of FAH, a very large fraction of FAH would be in the BA class in that case). In that situation, in order to get any useful work done in FAH, we'd have to make all WUs in the same category as BA WUs (since most donor would not want to run any other type of WU given the point difference as many choose to optimize for highest PPD/Watt ratio). However, that change would just lead to a big inflation of FAH points and also wouldn't give donors with the most powerful machines any benefit for being part of BA. Something that we wanted to avoid.
So, going back to the a priori premise:

The competition is neither a zero sum game nor an exclusive comparison set.

So if 30% of the work being done were BA, would that mean that:

1. A huge number of people elected to invest 10K in building 4p rigs?
or
2. A huge number of people stopped running smp on their desktop?

You have no way of knowing based only on a comparison of the relative ratio between the two.

SMP could see a 10 x increase in participation, but BA at the same time could see a thousand X increase.
Or both could decrease in absolute numbers, but the participation in regular smp could decrease faster.
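
To make the point concrete, here is a small sketch with made-up numbers (arbitrary work units, purely hypothetical, not real FAH statistics). It shows two very different situations that produce exactly the same 30% BA share, so the ratio alone cannot tell you which one happened.

```python
# Hypothetical illustration: the same BA share can come from opposite causes.
# All quantities are invented, in arbitrary units of work.

def ba_share(ba_work: float, smp_work: float) -> float:
    """BA work as a fraction of combined BA + regular SMP work."""
    total = ba_work + smp_work
    if total <= 0:
        raise ValueError("total work must be positive")
    return ba_work / total


baseline = ba_share(ba_work=30, smp_work=700)   # roughly 4% BA

# Case 1: lots of people build 4P rigs; regular SMP is unchanged.
ba_grew = ba_share(ba_work=300, smp_work=700)

# Case 2: nobody builds anything; regular SMP participation collapses.
smp_shrank = ba_share(ba_work=30, smp_work=70)

print(f"baseline   {baseline:.1%}")
print(f"BA grew    {ba_grew:.1%}")
print(f"SMP shrank {smp_shrank:.1%}")
```

Both cases land on the same share even though total work rose in the first and fell by a factor of more than seven in the second.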

The combinations and permutations of factors that could affect the ratio are infinite.
The question remains, does it make any sense to tie them together in the first place?

As many donors have pointed out, the real competition is GPU from a ppd/ppw point of view.
If I am going to burn watts, it is going to be running GPU.
Say, for the sake of argument, that everyone agrees. Will PG tie running SMP to GPU?

"we have too many people running gpu and not enough running smp. We know that any donor that can run GPU, has to have a computer, and therefore can run smp. Consequently, starting at x date, donors will be required to run smp concurrently with GPU to be allowed to continue to participate."

You say that would be insane? Well, it has just as much logic as tying the participation in BA to the success(or failure) of participation in regular smp.

Like most donors, I see no reason to tie fundamentally different classes of processors.

Say regular smp is able to run on smartphones, tablets, android devices. Will these devices ever approach the processing power of what we consider a 4 core desktop today? Not for a long time. Say 10 million people are willing to run a client on an android device...should that subsequent work ratio be considered in determining what is a BA multi-socket workstation?

Why should it? like GPU, it could some day be a significant source of TFLOPS...but that absolute number of TFLOPS has nothing to do with the work being done by BA workstations.

My solution:

Let nature take its course. If desktops continue to increase in power so that 16 core (32 thread) machines are the norm and can easily accomplish BA WU...great :!:

regular smp will die a natural death (no surprise) and yes, over a very long period of time, BA will cease to have any exclusivity or special bonus...Again, something that has been happening gradually over time anyway.

Will my smartphone ever compete with a 4P server in processing power? Will it ever compete with a 300 watt GPU? I don't think so.
Why would I expect that a point system could possibly accommodate these different classes of machines not only accurately, but also allow handicapping so we could "race" :?:

If we are going to "race"...set up special bonus for the top winners ...it has to be within a class.

comparing/tying different classes is just destructive and non-productive...

Is QRB insanely productive WITHIN GPU, WITHIN BA, WITHIN SMP? Yes. :!: :!: It is insanely productive and accurate in showing the advantage to PG in getting work returned quickly and without interruption.

Can the curve(s) be always perfectly fitted and compared ACROSS these classes? No :oops:

I think most people(donors) understand most of the above...and so still question the first premise that BA needs to be some percent of smp . :mrgreen:

As noted by Bill1024 and others, logistically, the two classes do not even run the same WU with any degree of efficient interchangeability. :!:
Having spent some time and good faith energy in testing, they wonder why continue to force a round peg in a square hole. :ewink:

So, beyond being a faulty premise...It does not even work as an immediate remedy for the supposed problem. :!: :!: :!:

Yes, ever increasing power and efficiency will make some classes look less attractive over time. This is natural. You don't stop development of new products because it will make existing products obsolete. (unless you want to die as a corporation)

Yes, if everyone is only willing to run GPU because it is 100x more ppd and ppw, this will certainly cause a disruption in projects that were dependent on smp. However, dealing with that disruption really is 100% a concern for PG administration. It should not ever be any concern for donors. Just like the responsibility of dealing with overstock and obsolete inventory is never the responsibility of a customer. (Well, I will sell you this gallon of gasoline that you want, but you also have to buy a gallon of whale oil that I already have in stock.) As I have mentioned in the past, trying to tie products together in this manner was one of the original things outlawed in the first anti-trust laws. Economists and government saw it as just a bad thing all around.

PS

No, I don't think I am going to change anyone's mind. I was just reading what the majority were posting on other forums and thought I would make one last stab at explaining the donor's perspective on why this just doesn't make any sense.

Good Luck ALL. :mrgreen:

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 7:08 pm
by Nathan_P
First off, many thanks to Vijay for responding to us and providing clarification. I'm disappointed that this is the first mention of a percentage figure for how much they want running BA units, but at least now we know.

People know my views on this subject; however, as a trial I have been running SMP on my slowest (soon to be obsolete) BA box. PPD is not bad at 77-82k, but that is ~30-90k less than is possible. The decision on whether it will be shut down is now on hold, but there will be no upgrades or new machines for now. If/when I do upgrade, the machine has to be capable of BA for 3 years, otherwise it will not happen.

What is more important is PPD/W. 178k PPD @ 297 W is 599 PPD/W; however, on SMP @ 82k PPD it's 276 PPD/W - that makes a big difference and is ultimately the way that most will decide whether or not to keep machines running.
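
For anyone wanting to run the same comparison on their own box, the arithmetic above can be sketched like this. The function name is made up for illustration, and it assumes the machine draws the same 297 W under SMP as under BA, which is what the figures in this post imply.

```python
# Points-per-day per watt, using the figures quoted in this post.
# Assumption: power draw is the same 297 W for both BA and SMP work.

def ppd_per_watt(ppd: float, watts: float) -> float:
    """Efficiency in points per day per watt of wall power."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return ppd / watts


ba_efficiency = ppd_per_watt(178_000, 297)
smp_efficiency = ppd_per_watt(82_000, 297)

print(round(ba_efficiency))   # 599
print(round(smp_efficiency))  # 276

# Fraction of BA efficiency that survives the move to SMP (a bit under half).
print(f"{smp_efficiency / ba_efficiency:.0%}")
```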

Oh and I have recommended a couple of times in the last 3 months not to invest in a 2p machine until this is sorted out.

Thought for PG: rather than do 2 core-count jumps, why not just do 24 cores but with tighter deadlines?

Re: Change in BA requirements

Posted: Mon Jan 13, 2014 8:28 pm
by HaloJones
The problem with tighter deadlines is machines that download units and then don't complete in time. This would actually slow down the work for the whole BA program.

Others have suggested control units - large atom count but only one or two iterations, timed and returned for Stanford to determine suitability - but that brings another whole raft of technical challenges that are not related to the science.

I can't see an easier solution than a core count and a deadline reduction. For a while there will be pain for some folders who will waste significant electricity failing to complete on time, trying to optimise and finally accepting defeat.