Blog post: "Unified GPU/SMP benchmarking scheme ..."

mdk777 · Post by **mdk777** » Thu Nov 08, 2012 2:34 pm

well, two things.

1. Mass participation is really what is needed for a successful Distributed Compute project.
Regardless of how many points one were to award...how many people are there in the world willing to build and run dedicated 4P servers?

I'm aware that many dedicated FOLDERS here are willing to make that kind of investment....But we are still talking thousands...not hundreds of thousands, and not millions.

I am sure these 4P systems have made, and will continue to make, very important and very significant contributions. However, to really make an insane 100x, 1000x, or 10,000x increase in the pace of science.....you need mass participation.

2. While I think the PPD should, and will, be determined on 'equal points for equal work', I am not so sure the investment and wattage comparisons are so far off.

I have looked again at building a dedicated / watt efficient platform for a GPU dedicated system. It is easy to spend $700 without the card...so $1000 for the entire system.

$244 I7 CPU
$100 Z77 MB
$100 Gold or PLAT PSU
$100 OS
$80 MEMORY
$50 BOX
SSD DRIVE, etc. etc.

$250-$300 GPU and you are around a grand.

Now, you are going to be at 200-300 watts and $1000 for the expected 100K to 150K PPD.

This compares with the roughly 600 watts and $2000-$3000 that people building 4P systems were spending to generate 200K to 300K PPD. (OC systems were going as high as 400K, if I'm not mistaken.) So yeah, the GPU will win in comparison, but it is not anything like a 10x loss, or even a 5x loss, compared to the BIG-ADV points that people were generating.
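To make the comparison concrete, here is a minimal sketch that turns the figures above into PPD per dollar and PPD per watt. It takes the midpoint of each quoted range, which is an assumption made only for illustration; the `Rig` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rig:
    """A folding system described by cost, power draw, and output."""
    name: str
    cost_usd: float
    watts: float
    ppd: float

    def ppd_per_dollar(self) -> float:
        return self.ppd / self.cost_usd

    def ppd_per_watt(self) -> float:
        return self.ppd / self.watts


# Midpoints of the ranges quoted in the post (an assumption, not measured data).
gpu = Rig("GPU box", cost_usd=1000, watts=250, ppd=125_000)
quad = Rig("4P server", cost_usd=2500, watts=600, ppd=250_000)

for rig in (gpu, quad):
    print(f"{rig.name}: {rig.ppd_per_dollar():.0f} PPD/$, "
          f"{rig.ppd_per_watt():.0f} PPD/W")

# How far ahead the GPU box is on each metric.
dollar_ratio = gpu.ppd_per_dollar() / quad.ppd_per_dollar()
watt_ratio = gpu.ppd_per_watt() / quad.ppd_per_watt()
print(f"GPU advantage: {dollar_ratio:.2f}x per dollar, {watt_ratio:.2f}x per watt")
```

With these midpoints the GPU box comes out ahead by a factor of roughly 1.2 on both metrics, nowhere near 5x or 10x. Moving to either end of the quoted ranges shifts the ratios, so treat the result as a rough bracket rather than a benchmark.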

Obviously, YMMV, people will build SLI systems to reduce overhead costs etc. etc. etc.

I'm just pointing out that using the entire system for GPU has costs beyond the GPU itself...and if GROMACS uses Heterogeneous compute, the entire system might be used even more than in the past.

Ultimately, the Science should and will determine the best system/ points. I'm just not sure it will be immediately a lopsided loss for the 4P systems in comparison.

mihapiha · Post by **mihapiha** » Thu Nov 08, 2012 2:42 pm

I'm not particularly worried about the PPD I make now, but the folding farm was purchased just a month ago. I wouldn't mind getting a couple of P8102 or P6903 WUs though....
So I just would like to know, whether a GPU based folding farm would have been a better investment.
A friend of mine said that my Opteron build can do 528 GFLOPS while a GTX680 can do more than 3000 GFLOPS.
Are GPUs really that much faster and better suited for these types of calculations compared to CPUs?

What kind of hardware is ideal for F@H purposes? I could still sell my folding farm and get something better suited. Considering how new it is, I should have no problem finding a buyer at a reasonable price.

mdk777 · Post by **mdk777** » Thu Nov 08, 2012 3:12 pm

mihapiha wrote: Are GPUs really that much faster and better suited for these types of calculations compared to CPUs?

Yes and NO.

Up until now, the problem with GPU compute was that it was very, very powerful, but only for specific calculations. Often, the time required to communicate the results over the latency of the PCIe bus to the CPU diminished that superiority. That narrow specialization and limited "general" compute capability severely limited the GPU.

Improvements in programming, better GPU design for compute, and the integration of better software may now be changing that rapidly.

What I see in TITAN, and other HPC systems is BOTH CPU, and GPU being important for max. efficiency and total compute.

The question you have is really one of timing. You have a mature technology that is producing results now.

When will Heterogeneous compute fully mature? Will FOLDING be able to rapidly integrate OpenCL and Gromacs into cores that are out of BETA? 3 months from now, a year from now?

Really hard to say. I have a "graveyard" of graphics cards that showed promise on paper from a FLOPS point of view...but were never successful FOLDERS.

Trying to "anticipate" the FOLDING optimum hardware is just about impossible. Your best bet is to wait for a core to prove itself, and respond. In 3 to 6 months, new generations of cards will be out, and even if GPU dominates the efficiency/ppd/utilization derby at that time....knowing today which software and which specific card combination will be optimal is impossible.

Napoleon · Post by **Napoleon** » Thu Nov 08, 2012 4:02 pm

P5-133XL wrote:I do not know if an SMP-GPU client is currently even possible given the current state of GPGPU programming capability. It is a question of how to keep SMP-GPU's synchronized like the CPU-SMP threads get synchronized. I suspect it will be years before such is even possible.

The CUDA 5 toolkit apparently tries to overcome that limitation:

Though there's not currently a huge usage potential for GPUDirect in gaming, this is by far CUDA 5's most impressive feature. GPUDirect enables GPUs to perform direct memory transfers, not only to other GPUs sharing the same PCI-E bus but, also to other devices. Place a DMA-capable network card on the bus and suddenly you have direct memory transfers from one GPU to any other PCI-E device on a network, without any CPU intervention.

7im · Post by **7im** » Thu Nov 08, 2012 4:54 pm

mihapiha wrote: What kind of hardware is ideal for F@H purposes.

There is no easy answer to this question. The GPU QRB PPD is not yet known, although there are some general indicators. And the reasons that you purchased a 48 core box have not gone away. You also have to ask whether the loss you will take on the 48 core box will be more than made up by the switch to GPUs.

Alternately, you have to consider how often you plan to upgrade. The ATI 4870 I bought 3 years ago can no longer fold because the hardware is no longer supported. However, the dual core computer I bought 3 years ago will still be folding for a long time. The 48 core box you built will be folding for a long time. Any GPUs you pick up today will not last as long. If you are willing to jump on the upgrade Merry-Go-Round, and update hardware every few years, then you may end up switching to GPUs not long from now.

Rolo · Post by **Rolo** » Thu Nov 08, 2012 5:11 pm

mdk777 wrote:well, two things.
While I think the PPD should, and will be determined on the 'equal points for equal work.' : I am not so sure the investment and Wattage comparisons are so far off.

I agree that the results (WUs produced over time) are all that matter; how those results were produced is immaterial as far as "payment" (points) goes.

mdk777 wrote:well, two things.
I have looked again at building a dedicated / watt efficient platform for a GPU dedicated system. It is easy to spend $700 without the card...so $1000 for the entire system.

$244 I7 CPU
$100 Z77 MB
$100 Gold or PLAT PSU
$100 OS
$80 MEMORY
$50 BOX
SSD DRIVE, etc. etc.

$250-$300 GPU and you are around a grand.

Now, you are going to be at 200-300 watts and $1000 for the expected 100K to 150K PPD.

Unless you're referring to used prices, I would estimate a little higher and not buy unproven brands.

My i5/Z68/GTX580 box burns 130W at idle, 430W folding (not counting monitor), so I would estimate higher on that also. Even more if you need to actively cool the room.

mdk777 · Post by **mdk777** » Thu Nov 08, 2012 7:31 pm

Rolo wrote: Unless you're referring to used prices, I would estimate a little higher and not buy unproven brands.

I follow sale prices obsessively....but yeah, I was talking best case (someone building a bare-bones FOLDING rig, not including any fat or bling).

Rolo wrote: My i5/Z68/GTX580 box burns 130W at idle, 430W folding (not counting monitor), so I would estimate higher on that also. Even more if you need to actively cool the room.

Yeah, again, I was looking at the best case. I have a 910 watt PSU that is sitting in storage because it didn't make sense to burn the watts on older GPU Cards.
So where the sweet spot on the current Kepler or AMD next gen cards will be remains to be seen. However, there are some pretty good values in the 125 to 150 watt cards as opposed to the 250 watt cards now.

But again, I was just looking at a very efficient best case scenario for comparison to the best case for BIG-SMP...

As you mention, with less than best case, the Big-SMP continues to look even better.

mihapiha · Post by **mihapiha** » Thu Nov 08, 2012 8:25 pm

You guys have taken this a bit off topic. But maybe it's just me who doesn't follow.

But just to be clear. I got a Supermicro H8QGi+-F motherboard with 4x Opteron 6180 SE CPUs which are overclocked to 2.75 GHz. So I get about 430k PPD with P8101 WUs. (According to what I've read, 600k PPD should be a standard if I'd get at least a couple of P8102 WUs)
The folding farm wasn't cheap, so I'd love to know whether two computers with 3x 660Ti would have been the better investment. I think I could have gotten that type of a system for the money I spent on this thing...
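The 528 GFLOPS and 3000+ GFLOPS figures quoted earlier in the thread are theoretical peaks, and they can be reproduced with simple arithmetic. The sketch below assumes 4 double-precision FLOPs per cycle per Opteron core, and for the GTX 680 assumes 1536 shader cores at a 1.006 GHz base clock with 2 single-precision FLOPs per cycle. None of those per-cycle or GPU numbers come from this thread, and the function name is hypothetical.

```python
def peak_gflops(units: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak: execution units x clock (GHz) x FLOPs per cycle."""
    return units * clock_ghz * flops_per_cycle


# 4x 12-core Opterons overclocked to 2.75 GHz (clock taken from the post above).
# Assumption: 4 double-precision FLOPs per cycle per core.
opteron_4p = peak_gflops(units=48, clock_ghz=2.75, flops_per_cycle=4)

# Assumption: GTX 680 with 1536 cores at 1.006 GHz, 2 single-precision
# FLOPs per cycle (one fused multiply-add).
gtx680 = peak_gflops(units=1536, clock_ghz=1.006, flops_per_cycle=2)

print(f"4P Opteron: {opteron_4p:.0f} GFLOPS peak")
print(f"GTX 680:    {gtx680:.0f} GFLOPS peak")
print(f"Ratio:      {gtx680 / opteron_4p:.1f}x")
```

Under these assumptions the two numbers are not measured on the same footing: one is a double-precision peak and the other a single-precision peak. Peak FLOPS also say nothing about how well a given folding core keeps the hardware busy, which is the point made above about cards that looked good on paper.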

k1wi · Post by **k1wi** » Thu Nov 08, 2012 8:45 pm

mihapiha wrote:But just to be clear. I got a Supermicro H8QGi+-F motherboard with 4x Opteron 6180 SE CPUs which are overclocked to 2.75 GHz. So I get about 430k PPD with P8101 WUs. (According to what I've read, 600k PPD should be a standard if I'd get at least a couple of P8102 WUs)
The folding farm wasn't cheap, so I'd love to know whether two computers with 3x 660Ti would have been the better investment. I think I could have gotten that type of a system for the money I spent on this thing...

When you bought it, it was the best investment you could have made for your dollar. At this point in time, it is still the best investment you could have made, but some who 'invest' now might choose to wait until things become clearer. What will it be in the near or more distant future? You could speculate either way. Hardware and software development are fast-moving things.

mdk777 · Post by **mdk777** » Thu Nov 08, 2012 10:39 pm

mihapiha wrote: The folding farm wasn't cheap, so I'd love to know whether two computers with 3x 660Ti would have been the better investment. I think I could have gotten that type of a system for the money I spent on this thing...

Can't know it today. Short ANSWER.

Long ANSWER = Many variables that we were discussing.

1. Software. How fast will it develop? How completely? Fermi vs. Kepler (the 660Ti is Kepler, and its compute is different from the older Fermi), CUDA vs. OpenCL vs. Gromacs, and the timeline for optimization on current and future graphics cards and FOLDING CORES.
2. Power consumption/heat/space/noise associated with GPU. Are you willing to deal with these inconveniences? Ongoing costs, compared to initial costs, predominate over the long term.

My best guess.

A. Short term = you will still be the most efficient.
B. Medium term = 3 to 6 months = closer to parity.
C. Long term = FAST CPU AND FAST GPU working together will be the winning combination (in the past, with all the work done on the GPU, you could get by with a weak CPU system).

What that long term configuration will look like is a great deal of speculation at this point.

What you have now is the sure thing. The future is just WAG based on our experience and following developments in the industry. Obviously you have to take my opinion with a great deal of SALT.

Finally, another way to look at it.

Will your system be optimal in 2 years?
No, but neither will the Laptop I bought my daughter for college 3 months ago.

The question you ask yourself is not if it will be optimal in 2 years, (you assume that new technology will eclipse it by then), but to what degree will it be obsolete?

That was the original question that drew me into the thread. I don't think in 2 years your system will be so obsolete as to be worthless...I expect that it will be eclipsed, but in a gradual manner.

bruce · Post by **bruce** » Thu Nov 08, 2012 11:51 pm

Do you buy a new car every year ... or every few years ... or every 10 years? Determining when a computer is "worthless" is based partly on the numbers but also partly on emotion. Your investment is not going to be a choice that lasts forever because at some point you'll decide that the cost of power per point will pass some threshold compared to the cost of an upgrade or the cost of a new system -- or if you're less analytical, you'll feel the competitive pressure from those who are earning more PPD and you'll upgrade anyway ... or you'll find a super-neat game that will play better on a new system ... or.... Nobody can make that decision for you.

Rolo · Post by **Rolo** » Fri Nov 09, 2012 2:23 am

^^ Right on.

- Basing decisions on guesswork is bad policy; just go with the facts as they are presently, which is what you did

- Know that 2 years in tech is like 40 people years (or 280 dog-years heh); therefore, going with information available at the time is all you have

- I kept my first vehicle for 13 years (just before it imploded--never a Chevy again) and I'm on my second vehicle that just turned 9 and I still like it now as much as I did then, so I'm not too quick to buy new

screen317 · Post by **screen317** » Sat Nov 10, 2012 4:29 am

KWSN_PToT wrote:Now, there is an added dis-incentive: Early Return Bonus (for GPUs).

I see an incentive for people with GPUs. I do not see a penalty for people not running GPUs.

7im · Post by **7im** » Sat Nov 10, 2012 5:32 am

Punchy wrote:It would be so easy to end the emotions and general speculation with just a small amount of information from PG. A short explanation of the specifics (atoms, simulation length, other relevant details) of the GPU and SMP projects that were being compared, along with the benchmark numbers from which the new formulas were derived, would go a long way. Is that too much to ask?

Nope, not too much, but not exactly easy either. When has discussing points with you ever been easy?

I'm sure all that info will be explained in great detail, eventually. But until ALL the details are settled, commenting on only parts of that info now would only set off more worthless speculations by all the usual suspects.

derrickmcc · Post by **derrickmcc** » Sat Nov 10, 2012 1:48 pm

Nathan_P wrote: I have no problem with a faster machine doing equal work getting more points.

However, I do have an issue with a WU with only 900 atoms being folded in the same amount of time and getting 3 x PPD as the 77,113 atom WU that I am currently folding on a dual L5640 machine.

"Equal work means equal points" was the blog post - not on this 8057 WU it's not.

derrickmcc wrote: It's not just the number of atoms, but also the length of the simulation (i.e. the number of steps) that determines how much work is being done.

Nathan_P wrote: Fair enough, how many steps in an 8057 WU? My 77,113 atom WU - 6947 - has 500,000 steps.

There are 50,000,000 steps in an 8057 WU,
so atoms * steps = 45 billion for p8057 and about 38.6 billion for p6947.

p8057 has 2549 base points and a k-factor of 0.75, whereas p6947 has 552 base points and a k-factor of 3.30.

p8057 TPF is 2min28sec on a GTX 460 (825/1600/1650)

p8057 is a beta WU which is also testing the application of QRB to GPU; as such, the base points and k-factor may be adjusted in light of the results of this beta test.
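The arithmetic above can be checked with a short sketch. It multiplies atoms by steps for both projects and then estimates base PPD for p8057 from the quoted TPF. The 100 frames per WU is an assumption that is not stated in this thread, the function names are hypothetical, and the result ignores any QRB bonus because the deadlines needed for that are not given here.

```python
SECONDS_PER_DAY = 86_400


def atom_steps(atoms: int, steps: int) -> int:
    """Crude work measure used in the post: atoms multiplied by steps."""
    return atoms * steps


def base_ppd(base_points: float, tpf_seconds: float,
             frames_per_wu: int = 100) -> float:
    """Base points per day from time per frame.

    frames_per_wu=100 is an assumption, not a figure from the thread.
    """
    wu_seconds = tpf_seconds * frames_per_wu
    return base_points * SECONDS_PER_DAY / wu_seconds


p8057_work = atom_steps(atoms=900, steps=50_000_000)
p6947_work = atom_steps(atoms=77_113, steps=500_000)
print(f"p8057: {p8057_work / 1e9:.2f} billion atom-steps")
print(f"p6947: {p6947_work / 1e9:.2f} billion atom-steps")

# TPF of 2 min 28 sec on the GTX 460 quoted above.
tpf = 2 * 60 + 28
p8057_base_ppd = base_ppd(base_points=2549, tpf_seconds=tpf)
print(f"p8057 base PPD on the GTX 460: {p8057_base_ppd:.0f} (before any bonus)")
```

Atoms times steps is only a first-order proxy for work. It says nothing about how the cost per step differs between projects or cores, so it supports the point that atom count alone is misleading without settling what "equal work" should mean.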
