why GPU dependency?
Posted: Tue Jun 07, 2022 6:08 am
by promeneur
Can you explain why FAH depends on the GPU when it uses OpenCL or CUDA?
Thanks
Re: why GPU dependency?
Posted: Tue Jun 07, 2022 8:12 am
by aetch
These are only my opinions and understanding.
promeneur wrote: ↑Tue Jun 07, 2022 6:08 am
why does FAH depend on the GPU when it uses OpenCL or CUDA?
Because it's the GPUs carrying out the calculations.
I'm sure this raises more questions than it answers.
What benefit do GPUs give us? - They are capable of doing a massive number of calculations at the same time and vastly outstrip CPUs in this regard. This is also why GPU work units are scored considerably higher than CPU work units. In my Ryzen system the GPU scores roughly 9-10 times higher than the CPU (the Ryzen itself is running a 20-thread slot).
Why not exclusively use GPUs? - GPUs can do a vast number of calculations, but they're largely simpler calculations at lower precision. CPUs are capable of doing complex calculations and, I think, at a higher level of accuracy. Which one the researcher uses will depend upon what they are trying to achieve.
Why use GPUs at all? - They are capable of doing a significantly higher number of calculations, and their output is "good enough" for the researchers who choose to use it.
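To make the parallelism concrete, here is a minimal OpenCL sketch (my own illustration, not FAH code): one work-item per array element, so a million simple FP32 additions are launched in one go rather than looped over by a CPU core. Error checking is trimmed for brevity; build with something like gcc vecadd.c -lOpenCL.

#include <stdio.h>
#include <CL/cl.h>

#define N 1048576  /* one million elements, one work-item each */

/* The GPU-side program: every work-item adds one pair of floats. */
static const char *src =
    "__kernel void vadd(__global const float *a,"
    "                   __global const float *b,"
    "                   __global float *c) {"
    "    size_t i = get_global_id(0);"
    "    c[i] = a[i] + b[i];"
    "}";

int main(void) {
    static float a[N], b[N], c[N];
    for (int i = 0; i < N; i++) { a[i] = (float)i; b[i] = 2.0f; }

    cl_platform_id plat; cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);
    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

    /* The CPU stages the input data into GPU memory... */
    cl_mem A = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof a, a, NULL);
    cl_mem B = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof b, b, NULL);
    cl_mem C = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, sizeof c, NULL, NULL);

    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "vadd", NULL);
    clSetKernelArg(k, 0, sizeof A, &A);
    clSetKernelArg(k, 1, sizeof B, &B);
    clSetKernelArg(k, 2, sizeof C, &C);

    /* ...launches N work-items at once... */
    size_t global = N;
    clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL);

    /* ...and reads the results back when the GPU is done. */
    clEnqueueReadBuffer(q, C, CL_TRUE, 0, sizeof c, c, 0, NULL, NULL);
    printf("c[123] = %f (expect 125.0)\n", c[123]);
    return 0;
}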
A few other notes:-
CUDA is NVidia-exclusive and will only run on NVidia cards, assuming the GPU supports it and the drivers are new enough.
FAH only started leveraging the increased performance of CUDA about a year ago. This is why the NVidia cards score roughly 25-30% higher than the equivalent AMD cards.
OpenCL will run just about anywhere, including CPUs, but runs quickest on GPUs.
OpenCL is the fallback for NVidia cards which have failed to initialize CUDA.
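As a small illustration of how widely OpenCL reaches, this sketch of mine lists every OpenCL platform and device a machine exposes. On an NVidia box the GPU typically shows up under an "NVIDIA CUDA" platform, and a CPU appears too if a CPU runtime is installed. Build with something like gcc listcl.c -lOpenCL.

#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id platforms[8];
    cl_uint num_platforms = 0;
    clGetPlatformIDs(8, platforms, &num_platforms);

    for (cl_uint p = 0; p < num_platforms; p++) {
        char pname[256];
        clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME, sizeof pname, pname, NULL);
        printf("Platform: %s\n", pname);

        cl_device_id devices[8];
        cl_uint num_devices = 0;
        /* CL_DEVICE_TYPE_ALL picks up GPUs and CPUs alike. */
        if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 8, devices,
                           &num_devices) != CL_SUCCESS)
            continue;  /* platform has no usable devices */

        for (cl_uint d = 0; d < num_devices; d++) {
            char dname[256];
            clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof dname, dname, NULL);
            printf("  Device: %s\n", dname);
        }
    }
    return 0;
}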
Re: why GPU dependency?
Posted: Tue Jun 07, 2022 9:16 am
by JimboPalmer
Welcome to Folding@Home!
Again, just my opinion, not necessarily the complete facts.
If Folding@Home wrote to the hardware level, it would need intimate knowledge, which the vendors do not publish, of every single graphics card in the world. Instead, F@H writes to publicly defined abstraction layers. OpenCL 1.2 defines the minimum set of instructions needed to execute the folding work.
Nvidia's OpenCL is written on top of a proprietary abstraction layer called CUDA, so if F@H can write to CUDA directly, they avoid abstracting an abstraction. This gains them execution speed at the cost of twice as much programming.
F@H could write a separate program for every single GPU, but I doubt they would yet be done programming the cards released a decade ago, let alone anything we still use.
Re: why GPU dependency?
Posted: Tue Jun 07, 2022 9:37 am
by promeneur
My question wasn't accurate, so here is another one.
Why are some GPUs compliant (whitelisted) and others not, if FAH uses an abstraction layer such as OpenCL or CUDA?
Re: why GPU dependency?
Posted: Tue Jun 07, 2022 12:03 pm
by aetch
Again, my opinion and (mis)understanding.
I think there are a number of factors affecting this:-
GPU compatibility
Driver support
GPU capability
GPU performance
Compatibility - the first thing to realise is that FAH wants to leverage as wide a range of hardware as possible but needs to draw the line somewhere as to what is useful. This means leveraging the most common features across the available hardware.
Just like a computer game, FAH has minimum requirements.
They don't require the latest drivers but can't draw the line so far back that it excludes the latest cards. There was some discussion about this problem while support for CUDA was being added.
I believe it needs OpenCL 1.2, and for CUDA it's something like 11.2 (I think that translated to the GeForce 47x series of drivers).
viewtopic.php?t=37391
viewtopic.php?t=37545
Driver support - this will automatically fail a number of GPUs simply because the manufacturer has stopped updating drivers for the card and FAH requirements have increased to a point where the gap can no longer be bridged.
GPU capability - even where driver support is still active, or recent enough, the GPU may not meet FAH requirements. Typically FAH looks for FP32 (32-bit floating point) and FP64 (64-bit floating point) support on a GPU, and typically FP32 performance is significantly higher than FP64. FAH uses a mix of the two; I think they would use FP64 exclusively if they could, but right now it's too big a performance hit. Missing, or extremely low-performing, FP64 would blacklist a card.
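For what it's worth, the kind of check involved might look like this sketch (my illustration, not FAH's actual detection code): ask the first GPU's driver which OpenCL version it reports and whether it advertises the cl_khr_fp64 double-precision extension. Build with something like gcc caps.c -lOpenCL.

#include <stdio.h>
#include <string.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id plat; cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    if (clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL) != CL_SUCCESS) {
        puts("no GPU visible to OpenCL at all");
        return 1;
    }

    char version[128];  /* e.g. "OpenCL 1.2 CUDA" on an NVidia card */
    clGetDeviceInfo(dev, CL_DEVICE_VERSION, sizeof version, version, NULL);

    char extensions[8192];  /* space-separated list; a robust version would query the size first */
    clGetDeviceInfo(dev, CL_DEVICE_EXTENSIONS, sizeof extensions, extensions, NULL);

    printf("driver reports: %s\n", version);
    printf("FP64 (cl_khr_fp64): %s\n",
           strstr(extensions, "cl_khr_fp64") ? "yes" : "no");
    return 0;
}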
GPU performance - to be honest, I think this is more likely to result in a GPU being deranked to a lower species than actual blacklisting.
I think the best indicator of some of this is the recent/current testing with the Intel GPUs. Some GPUs simply don't have the capability, some are proving unreliable and others are throttled to the point of being useless. I think the most telling thing is that it's still in testing and has not been released to the full FAH client yet.
It's worth bearing in mind the GPU.txt file is more than just a white/black list. It's a very rough breakdown of GPU capability and performance to allow a slightly more focussed matching of projects to GPUs. This is to, hopefully, prevent large powerful GPUs being underused by work units with small proteins and small slower GPUs being overwhelmed by large work units (a large work unit could be a large protein and/or a long simulation time).
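As a toy illustration of that idea, a whitelist-plus-species table is essentially a lookup from PCI vendor/device IDs to a capability rank rather than a simple yes/no. The format, IDs and numbers below are entirely made up; the real GPU.txt differs.

#include <stdio.h>
#include <stddef.h>

struct gpu_entry {
    unsigned vendor, device;  /* PCI vendor/device IDs */
    int species;              /* 0 = unsupported, higher = more capable */
    const char *name;
};

/* Hypothetical entries for illustration only. */
static const struct gpu_entry table[] = {
    { 0x10de, 0x2204, 8, "hypothetical big NVidia card"  },
    { 0x1002, 0x731f, 7, "hypothetical mid AMD card"     },
    { 0x8086, 0x9bc4, 0, "hypothetical unsupported iGPU" },
};

/* Return the species rank for a detected GPU, or 0 if it is not listed. */
static int lookup(unsigned vendor, unsigned device) {
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (table[i].vendor == vendor && table[i].device == device)
            return table[i].species;
    return 0;
}

int main(void) {
    printf("species for 10de:2204 -> %d\n", lookup(0x10de, 0x2204));
    printf("species for dead:beef -> %d\n", lookup(0xdead, 0xbeef));
    return 0;
}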
At the end of the day FAH have limited resources and they need to make the most of them. This may mean investigating possible avenues and deciding to not follow them if the ongoing support is not worth the return.
Re: why GPU dependency?
Posted: Wed Jun 08, 2022 7:56 am
by MeeLee
My understanding is that certain instructions can only be done on the CPU.
There's CPU folding (high accuracy, slow speed), and GPU folding (high performance, lower accuracy).
And by lower accuracy, I mean 16- to 32-bit calculations.
A modern GPU has thousands of 16-32 bit shaders that can calculate smaller math problems very quickly, because there are so many shaders all working on their individual math problems.
It's easy to see how a thousand slower (1.4 GHz to 2.4 GHz) shaders can, overall, be much faster than only 4, 8, 12, or 16 CPU cores running at almost double the speed (3.5 GHz to 5.5 GHz).
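A quick back-of-envelope with made-up but plausible numbers shows why the shader count wins even at lower clocks:

#include <stdio.h>

int main(void) {
    /* Hypothetical GPU: 2048 shaders at 1.8 GHz, ~2 FP32 ops each per clock (FMA). */
    double gpu = 2048.0 * 1.8e9 * 2.0;
    /* Hypothetical CPU: 16 cores at 4.5 GHz, ~16 FP32 ops each per clock (AVX2 FMA). */
    double cpu = 16.0 * 4.5e9 * 16.0;
    printf("GPU ~%.1f TFLOPS, CPU ~%.1f TFLOPS, ratio ~%.1fx\n",
           gpu / 1e12, cpu / 1e12, gpu / cpu);  /* roughly 7.4 vs 1.2, ~6x */
    return 0;
}

Real cards and real workloads vary a lot, but the shape of the comparison holds.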
For GPU folding, the CPU and GPU work in unison, usually requiring about 3 to 4 GHz of CPU per modern GPU.
The CPU compresses and decompresses data, writes logs, handles background downloads and uploads, runs the folding client software and the ETA prediction, manages what could be considered the "containers" of calculated data by sending their contents to and from the GPU's VRAM over the PCIe bus, and handles save states (perhaps some more background tasks too).
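That VRAM traffic boils down to OpenCL calls like clEnqueueWriteBuffer and clEnqueueReadBuffer. Here is a rough sketch of mine that times a round trip over the PCIe bus; the 64 MB payload is an arbitrary stand-in for a work unit's data, and error checking is trimmed. Build with something like gcc roundtrip.c -lOpenCL.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <CL/cl.h>

#define SZ (64 * 1024 * 1024)  /* 64 MB test payload */

int main(void) {
    cl_platform_id plat; cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);
    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

    char *host = calloc(1, SZ);
    cl_mem gpu = clCreateBuffer(ctx, CL_MEM_READ_WRITE, SZ, NULL, NULL);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    /* CPU -> GPU VRAM, then GPU VRAM -> CPU; CL_TRUE blocks until done. */
    clEnqueueWriteBuffer(q, gpu, CL_TRUE, 0, SZ, host, 0, NULL, NULL);
    clEnqueueReadBuffer(q, gpu, CL_TRUE, 0, SZ, host, 0, NULL, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("moved %d MB each way in %.3f s (~%.1f GB/s round trip)\n",
           SZ >> 20, s, 2.0 * SZ / s / 1e9);
    free(host);
    return 0;
}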
I believe, but am not sure, that each "container" (WU) contains mostly 32-bit data, but some have 64-bit precision (CPU precision) data to be calculated.
This is either done on the GPU, or on the CPU.
At least, on BOINC (a very similar program) they use the GPU cores for that (not to be confused with CUDA cores, which are actually shaders).
But not all projects or programs work like this.
The CPU can also be responsible for the advanced math problems (like 64-bit calculations), with the GPU doing the standard 32-bit math.