Suggested Change to the PPD System
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 1579
- Joined: Fri Jun 27, 2008 2:20 pm
- Hardware configuration: Q6600 - 8gb - p5q deluxe - gtx275 - hd4350 ( not folding ) win7 x64 - smp:4 - gpu slot
E6600 - 4gb - p5wdh deluxe - 9600gt - 9600gso - win7 x64 - smp:2 - 2 gpu slots
E2160 - 2gb - ?? - onboard gpu - win7 x32 - 2 uniprocessor slots
T5450 - 4gb - ?? - 8600M GT 512 ( DDR2 ) - win7 x64 - smp:2 - gpu slot
- Location: The Netherlands
Re: PPD Bonus Scheme
You don't need to convince me that factoring ppd/credit is appealing. I agree with your two posts above, and I had already given a method of avoiding it while still retaining a tie to scientific throughput, which is important to me.
Let's continue with methods of avoiding it, preferably in relation to the suggestion I made earlier, but if you have another suggestion that's fine with me.
Edit: what you're still not replying to is the fact that for lots of people the points system is NOT primarily a means of competition. I use the points system to value my contributed work relative to others, but I don't see that as a competition at all. I appreciate the fact that Grandpa makes as much ppd as I make in several months, if not more, since I care more about the science behind the project than about the competitive aspect of donating computational resources.
This is why I want to keep a visible binding between ppd and science, a binding which can only be normalized if the normalizing factor corrects for something like computational performance and the difference is published every time it's used to set a work unit's base credit.
If done that way, as I said before, you can control ppd so it won't hockey stick and still allow viewing the actual uncorrected computational/scientific performance (if done on a per-submission basis; you can't do it on just an assigned amount of credit, as different projects would have different normalization values).
If you tried to avoid per-project normalization values by giving all projects the same credit rating, you would have to ensure that everyone runs an equal mix of work units; if they don't, the normalisation won't keep in line with actual scientific throughput. I would like to hear suggestions which would avoid this and still allow normalizing the worth of projects across the board, but I feel it's near impossible, at the very least.
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
Re: PPD Bonus Scheme
You can't seriously think a 1-core benchmark would be normalized against a 16-core benchmark? We don't do that now, and never would. Please excuse yourself from the discussion if you cannot keep it reality based.
Grandpa_01 wrote: Just curious, but where would the incentive be for people to buy the machinery and run the bigadv WUs if it were normalised across the board? The purpose of the QRB was to encourage quick returns. Guess what, it works: just look at all the 4P systems out there right now, and there are quite a few being planned and built. Why would I or anyone else run a bigadv WU if we could make the same PPD off of smp with far less risk?
ChasR said to use the same 16-core machine for CPU, SMP, and BigAdv, but only use and bench 1 core for CPU, 4 or 8 cores for SMP, and 16 cores for BigAdv (or whatever PG finds appropriate). They'd all have a different base PPD, as one would realistically expect. Maybe base points for the CPU client, and base x 16 for BA, and then add the QRB on top of that. Still PLENTY of incentive for BigAdv with the QRB.
This also seems like it would make the benchmarks more accurate. Tough to argue against that...
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
-
- Posts: 1579
- Joined: Fri Jun 27, 2008 2:20 pm
- Hardware configuration: Q6600 - 8gb - p5q deluxe - gtx275 - hd4350 ( not folding ) win7 x64 - smp:4 - gpu slot
E6600 - 4gb - p5wdh deluxe - 9600gt - 9600gso - win7 x64 - smp:2 - 2 gpu slots
E2160 - 2gb - ?? - onboard gpu - win7 x32 - 2 uniprocessor slots
T5450 - 4gb - ?? - 8600M GT 512 ( DDR2 ) - win7 x64 - smp:2 - gpu slot
- Location: The Netherlands
Re: PPD Bonus Scheme
I don't think he was arguing against it; rather, it was a miscommunication. I for one won't argue against the concept, but I doubt it is needed.
7im wrote: You can't seriously think a 1-core benchmark would be normalized against a 16-core benchmark? We don't do that now, and never would. Please excuse yourself from the discussion if you cannot keep it reality based.
Grandpa_01 wrote: Just curious, but where would the incentive be for people to buy the machinery and run the bigadv WUs if it were normalised across the board? The purpose of the QRB was to encourage quick returns. Guess what, it works: just look at all the 4P systems out there right now, and there are quite a few being planned and built. Why would I or anyone else run a bigadv WU if we could make the same PPD off of smp with far less risk?
ChasR said to use the same 16-core machine for CPU, SMP, and BigAdv, but only use and bench 1 core for CPU, 4 or 8 cores for SMP, and 16 cores for BigAdv (or whatever PG finds appropriate). They'd all have a different base PPD, as one would realistically expect. Maybe base points for the CPU client, and base x 16 for BA, and then add the QRB on top of that. Still PLENTY of incentive for BigAdv with the QRB.
This also seems like it would make the benchmarks more accurate. Tough to argue against that...
What would it change? Do you think PG isn't capable of extrapolating base credit between an i5 and a bigadv16 system without needing to benchmark them on the same system?
Who is saying that base credit is now not in line with science? And what proof is there to support that?
Does this mean GPU work units need to be benched on each variation as well? Some cards have 48 cores, others have over a thousand. Can PG not predict the performance spread accurately enough to set the kfactor and deadline in a manner which ensures systems won't be too near the steep end of the hockey stick? And if they can't, wouldn't the first and simplest solution be to improve how F@H predicts this spread?
Also, the x16 is arbitrary; there is no fixed number for comparing one client type to another. A 16-core machine is capable of handling bigger simulations with more atoms, and that's a big part of why it's worth more to run them, so a multiplier value would need to be referenced against this difference. I know you didn't mention it as a fixed number, but I think no fixed number would be accurate in every situation.
Re: PPD Bonus Scheme
Can we please focus on normalisation first? If we can sort out the underlying issue of exponential PPD growth due to continuing improvement of computers THEN we can address the QRB, as my following post will outline.
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
Re: PPD Bonus Scheme
I agree, just trying to answer the QRB part of the discussion and get that out of the way so we can get back on topic. Normalization does not end the QRB.
@MTM, I'm not going to answer those specific questions because I was speaking in generalities of concept, and the specific numbers were only there to help communicate a general concept, not to provide exact answers. PG can put exact numbers on it later. But to prove the need for a change: yes, a one-socket i5 benchmark cannot be expected to produce as accurate a result for a multi-socket computer as a new multi-socket benchmark computer could produce. The i5 isn't bad; an MP PC would be better. Common sense is proof enough.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Re: PPD Bonus Scheme
If I could figure out how to split this topic into two parts, hardware normalization and hockey-stick QRB, I would, but the way the two discussions have proceeded, they are terribly intertwined and not conducive to separation.
On the issue of hardware normalization, I'm not convinced it would make that much difference EXCEPT that those who keep track of milestones (like their first 1000 points or whatever ... and yes, that used to be important when FAH was exclusively uniprocessor projects) would object that the Pande Group is devaluing their contribution. We can argue that it's not true, since points are relative, but the emotional reaction from many Donors to getting less points this month than they got last month would be severe ... and not worth making a big change like you are suggesting. That doesn't mean that your proposal doesn't have a good factual foundation, only that it will be misinterpreted.
The Pande Group doesn't change the points system without a really, really strong reason, and the overall benefit would be less significant than the negative emotional reaction that it would cause.
Posting FAH's log:
How to provide enough info to get helpful support.
Re: PPD Bonus Scheme
The way I see it, if we normalise the points we have two issues to overcome. Two issues that need to be addressed very separately.
1. How to determine how much to normalise points by (and by what frequency).
2. How to apply that normalisation.
In the first instance, I believe that PG has enough tools at their disposal to measure the rate of points growth driven by improving computer hardware, both from within the FAH system and from publicly available data on the relative performance improvements of computers over time. As a user of the forums I don't have detailed knowledge of exactly what information they have to play with, which makes creating a precise formula for measuring improvement over time difficult. The accuracy of the measured rate of improvement matters only to how one period in time compares to another, not to how different computers compare within a given period, as will become clear in my next point. One possibility is to measure the performance of the top 1% of computers: if the ppd of the top 1% of clients increases 5% between periods, then the adjustment accounts for this technological improvement and the top 1% continues to earn the same amount as before. Because this 5% applies across the entire range, the normalisation applies equally to all users. There are issues such as what happens if the top end improves faster than other areas of computing (i.e. there is stratification), but this would only affect the relative 'difficulty' at different points in time, not the comparison between computers in a given time period (because all users experience the same normalisation). Different methods of determining technological improvement will produce different relative points from time period to time period, but not within a time period. For example, if we were to take the median value (whatever that is) and the top end improves relatively faster (say because mainstream users all shift to mobile devices), the proportions at a given time stay the same, but the top end may be different relative to previous time periods.
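To make the idea concrete, here is a rough Python sketch of how such a factor could be derived, assuming we had per-client ppd samples for two periods. The function names and sample numbers are made up for illustration; nothing here reflects data PG actually publishes.
Code:
# Purely illustrative: derive a normalisation factor from the growth in the
# mean PPD of the top 1% of clients between two measurement periods.
def top_percentile_mean(ppd_samples, fraction=0.01):
    """Mean PPD of the top `fraction` of clients."""
    ranked = sorted(ppd_samples, reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    return sum(ranked[:cutoff]) / cutoff

def normalisation_factor(previous_period, current_period, fraction=0.01):
    """How much the top end grew; raw PPD would be divided by this."""
    return (top_percentile_mean(current_period, fraction)
            / top_percentile_mean(previous_period, fraction))

# Made-up samples where the top end improved by ~5% between periods:
q1 = [120_000, 95_000, 15_000, 8_000, 3_000] * 20
q2 = [126_000, 99_750, 15_750, 8_400, 3_150] * 20
print(round(normalisation_factor(q1, q2), 3))  # ~1.05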
2. This one is interesting. I would say the easiest way to 'normalise' would be to add another component onto the formula. If at present we have:
ppd = [current or native ppd]
The mathematically simplest way to implement the change would be to modify this formula to:
ppd = [current or native ppd]/y
where y = the difference in computer performance relative to the baseline at time n1.
Over time y would increase, and if need be we could periodically reset y back to 1; for example, if y reached 100,000 we would divide the current or native ppd by 100,000 and start again with y at 1. My thinking is that this reduces the frequency at which you have to make wide changes to the entire system, that is, to all the different projects.
Unfortunately, I believe this would require a pretty substantial server code rewrite, and that's probably not ideal. The complicating factor at this point is, of course, the QRB. But as I stated, the QRB is 'the next level of analysis', and I don't think we can start on that step of the process until there is general agreement that the above process is sound and reasonable, particularly because it represents a HUGE shift in how we consider ppd over time.
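As a rough sketch of one way that periodic reset could work (the project names, credit values and the 100,000 threshold below are all placeholders, not anything PG uses):
Code:
# Hypothetical sketch: once the running divisor y grows past a threshold,
# fold it into each project's native base credit once and restart y at 1.
REBASE_THRESHOLD = 100_000

def maybe_rebase(y, native_base_credit):
    if y < REBASE_THRESHOLD:
        return y, native_base_credit
    rebased = {proj: credit / y for proj, credit in native_base_credit.items()}
    return 1.0, rebased

y, projects = 120_000.0, {"project_a": 9.6e8, "project_b": 2.4e7}
y, projects = maybe_rebase(y, projects)
print(y, projects)  # 1.0 {'project_a': 8000.0, 'project_b': 200.0}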
With the QRB, the formula for calculating PPD is:
PPD = base_ppd * speed_ratio * max(1,sqrt(x*speed_ratio))
I believe that the formula above becomes, in essence:
PPD = base_ppd * speed_ratio * max(1,sqrt(x*speed_ratio)) / Y
I don't know the Stanford server code, so I cannot tell you how this could be implemented; I think that becomes a Stanford issue. One option would be to divide each component prior to running the equation?
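For what it's worth, here's a minimal Python sketch of the combined formula exactly as written above, just to make the algebra concrete. The input values are arbitrary and this is not PG's actual code.
Code:
import math

def normalised_ppd(base_ppd, speed_ratio, x, y):
    """PPD = base_ppd * speed_ratio * max(1, sqrt(x * speed_ratio)) / y"""
    bonus = max(1.0, math.sqrt(x * speed_ratio))
    return base_ppd * speed_ratio * bonus / y

# Because y divides the whole expression, the ratio between any two machines'
# PPD is unchanged; only the absolute scale is held steady over time.
print(normalised_ppd(base_ppd=1000, speed_ratio=4.0, x=2.0, y=1.0))   # ~11313.7
print(normalised_ppd(base_ppd=1000, speed_ratio=4.0, x=2.0, y=1.05))  # ~10775.0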
Finally, I would like to add that the normalising has NO effect on the relationship between scientific value and ppd, except that it accounts for the ever-increasing gains that result from technological improvement. PPD becomes, in effect, the amount of work done at a given point in time, adjusted for the technological improvements made by the computer industry. I think this actually has a positive outcome, because I could compare my ppd in 2005 with today's and be able to say, "My PPD was higher in 2005 than it is today; therefore, in real terms, I have actually slacked off a bit."
In essence, it takes the curve out of the blue line in the graph on page 1 and converts it into a horizontal line.
Looking towards editing the QRB, I would like to put something out there, but not actually investigate it yet. By 'put it out there' I mean let people think it over and hypothesise, and IF we can resolve the underlying exponential issue, perhaps we can apply the same learnings to the QRB...
If the QRB is a ^4 representation of the same issue, then perhaps there is a method by which we can apply the normalisation to the speed ratio and then again to the whole equation?
As I said - this is perhaps something for us to put to one side, and if we can get to a consensus or something, THEN we can bring it to the fore?
ADD: Bruce: I am looking at this from a very long-term point of view - I think we can improve the point system so that the same methodology can be used in the long term. I don't think that measuring points in the trillions per day is particularly useful. The issue is that making the point system sustainable requires a clear understanding of what the point system means and what it actually represents in the long term.
Re: PPD Bonus Scheme
I got busy and couldn't reply, so this is a bit behind the discussion. I'll catch up in a bit.
MtM wrote: ChasR, so what if there's an incentive to go big on cores? Aren't there enough people who will not be able to do so and would just have to accept donating what they can? What you're saying is that no charity should ever accept donations over 5 dollars, as that would be the average most could afford without there being a chance of discouraging them from donating in the first place.
Also, again, someone is inflating the numbers to sway opinions. Which donor has doubled his WU's worth by beating the deadline by one second?
You're claiming to know the scientific value better than PG? I know you were one of the people who used to run two instances of smp on one system, and one who has argued against the QRB from the start; that holds no weight unless you can prove PG is wrong and that time isn't as important in a serial flow of work units.
You keep saying people double their points if they are just a tad quicker, but you never prove it. As noted in previous posts, a kfactor and deadline that are set correctly for the spread in speed of the machines a project will be assigned to will not, or at least should not, allow machines to sit too close to the hockey stick.
If that's your problem, you should ask for better methods of predicting this spread in speed for any particular project, so they can set the kfactor and deadlines in a way which prevents the above from happening.
Or prove you know better and that the QRB was wrong from the start.
@Kiwi, I asked you a bunch of questions in previous posts and you never answered one of them. If you don't/can't, your proposal already falls apart.
Certainly there are enough people who can't or won't adopt MP systems. If they view their contribution as undervalued they quit, or, after going to a team website and finding nothing but posts about 4P systems, never start.
I'd like to see Grandpa's numbers on regular smp work on his 4P. I believe PG intended BA work to have a 20% bonus over regular SMP, which is a rational incentive. It likely does on the cache-bound i5 benchmark machine. It becomes irrational on MP machines.
As for the 2x bonus for completion just prior to the preferred deadline, on many WUs it's much larger than that. Go to the smp bonus calculator and plug in a frame time of 80:38 for p6904 and calculate the ppd. Now enter 80:39. That difference of 1 second/frame, 100 seconds total, makes a difference of 301,147 in the credit for the WU. If you make the preferred deadline by 1 second instead of 100, you get virtually the same result. I don't see how one can rationalize that one second is worth that many points and a 10.6x boost in ppd (or penalty, looking at it the other way). The QRB is seriously flawed, and the points don't reasonably represent the value of the science.
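To illustrate the cliff being described, here is a small Python sketch of a bonus of the same max(1, sqrt(x * speed_ratio)) form discussed above, with speed_ratio = deadline / elapsed. The base credit, k-factor and deadline are placeholders only, NOT the real p6904 parameters.
Code:
import math

# The point is the shape: the bonus applies only while the preferred deadline
# is beaten, so a one-second difference straddling it swings the credit hugely.
def wu_credit(base, k, preferred_deadline_days, elapsed_days):
    if elapsed_days > preferred_deadline_days:
        return base  # missed the preferred deadline: no bonus at all
    speed_ratio = preferred_deadline_days / elapsed_days
    return base * max(1.0, math.sqrt(k * speed_ratio))

base, k, deadline = 8_000, 25.0, 4.0
one_second = 1 / 86_400
print(round(wu_credit(base, k, deadline, deadline - one_second)))  # ~40000
print(round(wu_credit(base, k, deadline, deadline + one_second)))  # 8000
With the real project constants, that same step is what produces the 10.6x swing described above.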
Re: Suggested Change to the PPD System
I have changed the title of the thread to better reflect its purpose.
Re: PPD Bonus Scheme
@Grandpa,
2.5 x the ppd of normal smp on p6904 is still absurd.
By normalizing the value of the work unit to an i5, I mean the value of a smp WU run on 4 cores of a new MP benchmark machine would be set so that the ppd produced on an i5 wouldn't change. That in no way removes the incentive to fold on faster hardware or on more cores. It merely sets the point where the old benchmark scale and the new benchmark scale coincide.
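A minimal sketch of that anchoring idea, with purely hypothetical numbers (none of these are real benchmark figures): scale the new benchmark machine's raw scores so that an i5's ppd for the same SMP work unit comes out unchanged.
Code:
# Hypothetical numbers: anchor the new benchmark scale to the old i5 scale.
old_i5_ppd = 12_000.0           # ppd the i5 earns for this WU under the old scale
new_bench_raw_4core = 30_000.0  # raw score for the same WU on 4 cores of the new machine
i5_raw = 9_000.0                # raw score the i5 gets under the new benchmark

scale = old_i5_ppd / i5_raw     # factor chosen so the i5 is unchanged
print(i5_raw * scale)               # 12000.0 (i5 ppd is preserved)
print(new_bench_raw_4core * scale)  # 40000.0 (faster hardware still earns more)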
-
- Posts: 1579
- Joined: Fri Jun 27, 2008 2:20 pm
- Hardware configuration: Q6600 - 8gb - p5q deluxe - gtx275 - hd4350 ( not folding ) win7 x64 - smp:4 - gpu slot
E6600 - 4gb - p5wdh deluxe - 9600gt - 9600gso - win7 x64 - smp:2 - 2 gpu slots
E2160 - 2gb - ?? - onboard gpu - win7 x32 - 2 uniprocessor slots
T5450 - 4gb - ?? - 8600M GT 512 ( DDR2 ) - win7 x64 - smp:2 - gpu slot
- Location: The Netherlands
Re: PPD Bonus Scheme
Most of my posts are about normalisation, not the QRB. I've given examples for both already.
k1wi wrote: Can we please focus on normalisation first? If we can sort out the underlying issue of exponential PPD growth due to continuing improvement of computers THEN we can address the QRB, as my following post will outline.
1. Where am I only talking about the QRB?
7im wrote: I agree, just trying to answer the QRB part of the discussion and get that out of the way so we can get back on topic. Normalization does not end the QRB.
@MTM, I'm not going to answer those specific questions because I was speaking in generalities of concept, and the specific numbers were only there to help communicate a general concept, not to provide exact answers. PG can put exact numbers on it later. But to prove the need for a change: yes, a one-socket i5 benchmark cannot be expected to produce as accurate a result for a multi-socket computer as a new multi-socket benchmark computer could produce. The i5 isn't bad; an MP PC would be better. Common sense is proof enough.
2. So you are saying we should bench GPU units on 48-core cards as well as 448-core cards (nvidia specific)? If you answer yes, you have a problem because that would make the system too complicated to be considered; if you answer no, the question becomes how the difference between a 16-core system and a 4-core system is any different from that between a quad-core/octo-core and a 4P 16-core. Common sense I think I have enough of; it's just that I might be using it wrong, and that doesn't change without being given enough incentive.
Maybe the reward is big because of the base credit/kfactor, but you're forgetting that making the preferred deadline should bring a substantial reward, as it prevents sending out a new work unit (slowing down scientific progress by doing double work). To diminish the reward so that it still gives a bonus for making the preferred deadline, but not in the range you're describing, I posted something below Kiwi's quote.
ChasR wrote: I got busy and couldn't reply, so this is a bit behind the discussion. I'll catch up in a bit.
MtM wrote: ChasR, so what if there's an incentive to go big on cores? Aren't there enough people who will not be able to do so and would just have to accept donating what they can? What you're saying is that no charity should ever accept donations over 5 dollars, as that would be the average most could afford without there being a chance of discouraging them from donating in the first place.
Also, again, someone is inflating the numbers to sway opinions. Which donor has doubled his WU's worth by beating the deadline by one second?
You're claiming to know the scientific value better than PG? I know you were one of the people who used to run two instances of smp on one system, and one who has argued against the QRB from the start; that holds no weight unless you can prove PG is wrong and that time isn't as important in a serial flow of work units.
You keep saying people double their points if they are just a tad quicker, but you never prove it. As noted in previous posts, a kfactor and deadline that are set correctly for the spread in speed of the machines a project will be assigned to will not, or at least should not, allow machines to sit too close to the hockey stick.
If that's your problem, you should ask for better methods of predicting this spread in speed for any particular project, so they can set the kfactor and deadlines in a way which prevents the above from happening.
Or prove you know better and that the QRB was wrong from the start.
@Kiwi, I asked you a bunch of questions in previous posts and you never answered one of them. If you don't/can't, your proposal already falls apart.
Certainly there are enough people who can't or won't adopt MP systems. If they view their contribution as undervalued they quit, or, after going to a team website and finding nothing but posts about 4P systems, never start.
I'd like to see Grandpa's numbers on regular smp work on his 4P. I believe PG intended BA work to have a 20% bonus over regular SMP, which is a rational incentive. It likely does on the cache-bound i5 benchmark machine. It becomes irrational on MP machines.
As for the 2x bonus for completion just prior to the preferred deadline, on many WUs it's much larger than that. Go to the smp bonus calculator and plug in a frame time of 80:38 for p6904 and calculate the ppd. Now enter 80:39. That difference of 1 second/frame, 100 seconds total, makes a difference of 301,147 in the credit for the WU. If you make the preferred deadline by 1 second instead of 100, you get virtually the same result. I don't see how one can rationalize that one second is worth that many points and a 10.6x boost in ppd (or penalty, looking at it the other way). The QRB is seriously flawed, and the points don't reasonably represent the value of the science.
Ppd normalisation: yes, that's what I proposed earlier as an alternative to your previously mentioned dollar-value suggestion.
k1wi wrote: ppd = [current or native ppd]
The mathematically simplest way to implement the change would be to modify this formula to:
ppd = [current or native ppd]/y
where y = the difference in computer performance relative to the baseline at time n1.
Over time y would increase, and if need be we could periodically reset y back to 1; for example, if y reached 100,000 we would divide the current or native ppd by 100,000 and start again with y at 1. My thinking is that this reduces the frequency at which you have to make wide changes to the entire system, that is, to all the different projects.
Unfortunately, I believe this would require a pretty substantial server code rewrite, and that's probably not ideal. The complicating factor at this point is, of course, the QRB. But as I stated, the QRB is 'the next level of analysis', and I don't think we can start on that step of the process until there is general agreement that the above process is sound and reasonable, particularly because it represents a HUGE shift in how we consider ppd over time.
With the QRB, the formula for calculating PPD is:
PPD = base_ppd * speed_ratio * max(1,sqrt(x*speed_ratio))
I believe that the formula above becomes, in essence:
PPD = base_ppd * speed_ratio * max(1,sqrt(x*speed_ratio)) / Y
I don't know the Stanford server code, so I cannot tell you how this could be implemented; I think that becomes a Stanford issue. One option would be to divide each component prior to running the equation?
Finally, I would like to add that the normalising has NO effect on the relationship between scientific value and ppd, except that it accounts for the ever-increasing gains that result from technological improvement. PPD becomes, in effect, the amount of work done at a given point in time, adjusted for the technological improvements made by the computer industry. I think this actually has a positive outcome, because I could compare my ppd in 2005 with today's and be able to say, "My PPD was higher in 2005 than it is today; therefore, in real terms, I have actually slacked off a bit."
In essence, it takes the curve out of the blue line in the graph on page 1 and converts it into a horizontal line.
Looking towards editing the QRB, I would like to put something out there, but not actually investigate it yet. By 'put it out there' I mean let people think it over and hypothesise, and IF we can resolve the underlying exponential issue, perhaps we can apply the same learnings to the QRB...
If the QRB is a ^4 representation of the same issue, then perhaps there is a method by which we can apply the normalisation to the speed ratio and then again to the whole equation?
As I said - this is perhaps something for us to put to one side, and if we can get to a consensus or something, THEN we can bring it to the fore?
ADD: Bruce: I am looking at this from a very long-term point of view - I think we can improve the point system so that the same methodology can be used in the long term. I don't think that measuring points in the trillions per day is particularly useful. The issue is that making the point system sustainable requires a clear understanding of what the point system means and what it actually represents in the long term.
QRB:
My math is not at a sufficient level to put this in equation form ->
PPD = base_ppd * speed_ratio * max(1,sqrt(x*speed_ratio)) / Y
The problem is at the start and end, or rather in the fact that you can't accurately predict where systems will end up, so you don't have control over which part of the slope is utilized.
It should be possible to add a normalization factor based on the speed_ratio so you won't get the problems ChasR pointed out (I say based on speed_ratio as that indicates which part of the slope we are on). The only requirement is that the point increase should remain exponential going backwards from the deadline.
If you could formulate how to get Y (benchmark a fixed project on the different hardware?), convince PG to publish Y alongside ppd, and make it possible for donors to check their individual submissions (not indefinitely, but offer donors a chance to download their data in some form..), I'm going to support this idea, because it does fix the problem of exponentially increasing computational power while still allowing me to look up the actual computational effort/scientific value (not human/donor effort) needed to earn it.
If you could convince PG to add a normalisation from speed_ratio 0 to speed_ratio 0.X, I'm going to support you, as I do think it's strange that such a work unit's rewarded credit increases so much just by making the deadline.
I don't think one can fix the hockey-stick effect on its own, as you would end up losing the exponential increase if you do, and then it might result in people running two smp:24 clients on 48-core systems. I think it has to be achieved by controlling the spread of assigned machines, so that the region of the slope where machines end up is controlled.
Last edited by MtM on Fri Mar 16, 2012 10:30 pm, edited 1 time in total.
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
Re: Suggested Change to the PPD System
I cannot support an ongoing normalization process. I can only support a normalization based on updating the benchmark system from an i5 to a 16 or 32 core system.
We'll never be able to sell the weekly/monthly normalization idea. I'll never know if my PPW or PPM was up or down because of a Project change, a hardware problem, or because of a normalization change.
However, when PG can eventually move to a client side benchmark, that will resolve much of this normalization problem as well.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
-
- Posts: 1579
- Joined: Fri Jun 27, 2008 2:20 pm
- Hardware configuration: Q6600 - 8gb - p5q deluxe - gtx275 - hd4350 ( not folding ) win7 x64 - smp:4 - gpu slot
E6600 - 4gb - p5wdh deluxe - 9600gt - 9600gso - win7 x64 - smp:2 - 2 gpu slots
E2160 - 2gb - ?? - onboard gpu - win7 x32 - 2 uniprocessor slots
T5450 - 4gb - ?? - 8600M GT 512 ( DDR2 ) - win7 x64 - smp:2 - gpu slot
- Location: The Netherlands
Re: Suggested Change to the PPD System
7im, you could if you were able to look it up. I only support ongoing normalisation if it is logged and I'm able to look it up. It doesn't matter at what interval this is done, yearly or monthly, as long as I'm capable of calculating the results I would have gotten without the normalisation.
Explain why the client-side benchmark can fix the problem (not saying it can't).
Last edited by MtM on Fri Mar 16, 2012 10:36 pm, edited 1 time in total.
Re: Suggested Change to the PPD System
MtM, if you proposed ppd normalisation earlier, then you were proposing what I proposed from the very first post...
Unless I am mistaken, it does not encourage people to fold 2x24 instead of 1x48, because we are not altering the relative distribution of points at a given time: if you normalise all the points by 10% and computer A had 15x the ppd of computer B, once the new normalisation is applied it will still have 15x the ppd! All that happens is that the 'maximum' value remains at a relatively stable level over the long run instead of ever increasing.
Furthermore, I suggest that you can accurately predict where systems will end up - at least over the short to medium term.
7im - I would never advocate a weekly schedule. Perhaps monthly is too frequent, but quarterly would be feasible, perhaps even half-yearly. Put it on a schedule, just like the Fed Reserve, and it becomes a known part of the folding calendar. Even annually would be fine, though I think there is a danger in doing it too infrequently. It will still be possible to distinguish a project change from a normalisation change, because all projects would be affected equally and at the same time, so you can compare against other current projects.
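To put illustrative numbers on that: if computer A earns 150,000 ppd and computer B earns 10,000 ppd (15x apart), a 10% normalisation gives A about 136,364 and B about 9,091, still exactly 15x apart.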
-
- Posts: 1579
- Joined: Fri Jun 27, 2008 2:20 pm
- Hardware configuration: Q6600 - 8gb - p5q deluxe - gtx275 - hd4350 ( not folding ) win7 x64 - smp:4 - gpu slot
E6600 - 4gb - p5wdh deluxe - 9600gt - 9600gso - win7 x64 - smp:2 - 2 gpu slots
E2160 - 2gb - ?? - onboard gpu - win7 x32 - 2 uniprocessor slots
T5450 - 4gb - ?? - 8600M GT 512 ( DDR2 ) - win7 x64 - smp:2 - gpu slot
- Location: The Netherlands
Re: Suggested Change to the PPD System
I made a suggestion here -> viewtopic.php?p=210435#p210435 after you made some suggestions about dollar value. I brought up computational effort, not human effort. So I would say you've come over to my side, and I haven't come to yours. That aside, I'm actually happy we ended up on the same side.
And I asked 7im about the client-side benchmark because I agree: with that you can predict where a machine will end up, and it would enable setting the kfactor and deadline so machines do not end up on the wrong area of the slope. I mentioned client-server logic before; that referred to this.
Edit: this isn't the first time I've had this discussion.