PantherX wrote:MeeLee wrote:...When a CPU has hyperthreading/SMT enabled, the Nvidia driver will always balance 1 GPU between both threads, but doesn't fully utilize them.
It doesn't mean it needs more than 1 CPU (unless some of the very latest projects have a much higher CPU usage than with core 22)...
Yep, using CPU affinity I can lock the process to a single CPU, but what I noticed is that for the newer, more complex WUs, 2 CPUs gave a slight reduction in the TPF. However, most current GPU WUs would be okay with a single CPU.
MeeLee wrote:...About a year ago, Foldy confirmed my results that a GTX 1050/1060 needs only about a 1.8 GHz CPU before bottlenecking, and an RTX 2060 needs only 2 GHz.
And that a 4 GHz CPU may show 100% core utilization per thread in Windows Task Manager, while the actual processing load is below 50%.
Meaning, pairing 2 RTX 2080 Tis on one CPU core of 4+ GHz is perfectly possible.
I am keen to know how it was tested, assuming you know the procedure. I wonder if it can be repeated once the next-gen GPUs from AMD/Nvidia arrive!
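On the CPU-affinity point above: for anyone who wants to try the same thing on Linux, here is a minimal sketch of pinning the folding process to a single core with psutil. The "FahCore" process-name prefix and the core number are assumptions, so adjust them to your own setup (taskset -cp 0 <PID> does the same from the shell).
[code]
import psutil

# Pin any running FahCore process to logical CPU 0 (name prefix is an assumption).
for proc in psutil.process_iter(["name"]):
    name = proc.info["name"] or ""
    if name.startswith("FahCore"):
        proc.cpu_affinity([0])  # lock the process to a single logical CPU
        print(f"Pinned PID {proc.pid} ({name}) to CPU 0")
[/code]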
The procedure isn't that difficult.
While I assume Foldy used Windows and just manually reduced the CPU frequency on the fly with some Windows CPU utility, lowering it until the GPU showed a significant drop in PPD,
I used Linux and htop (similar to Task Manager), which shows both the active (kernel) data and the passive data (those null bytes Nvidia uses to keep the GPU locked to a certain CPU core).
If htop showed 60% kernel usage on a 4 GHz CPU, I would then reboot into the BIOS, lower the CPU frequency there to 2.5 GHz (62.5%), and run consecutive tests lowering it further from there.
Once I hit that 60% spot (the 4 GHz CPU running at 2.4 GHz) on that particular setup, htop would show 100% kernel usage (no idle data), and the GPU's output would drop more noticeably.
The difference between running at the full 100% (4 GHz) and at 60% (2.4 GHz) in my test setup was not very significant (<10%).
Once I went below that point, the difference was >10%, and PPD dropped off more drastically.
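For anyone who wants to repeat this without rebooting into the BIOS for every step, here is a rough sketch of the same measurement loop using the Linux cpufreq sysfs interface and psutil. The frequency steps, the core number, and the settle/measure times are assumptions, it needs root, and not every board/driver honours scaling_max_freq the same way.
[code]
import time
import psutil

CPU = 0  # logical CPU the GPU's feeder thread is pinned to (assumption)
FREQ_STEPS_KHZ = [4_000_000, 3_000_000, 2_500_000, 2_400_000, 2_000_000]

def set_max_freq(cpu: int, khz: int) -> None:
    # Cap the core's maximum frequency via the cpufreq sysfs interface (needs root).
    with open(f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_max_freq", "w") as f:
        f.write(str(khz))

for khz in FREQ_STEPS_KHZ:
    set_max_freq(CPU, khz)
    time.sleep(30)  # let the work unit settle at the new frequency
    t = psutil.cpu_times_percent(interval=10, percpu=True)[CPU]
    # When idle hits ~0% and "system" (kernel) time saturates, the GPU starts to starve.
    print(f"{khz // 1000} MHz: system={t.system:.1f}%  idle={t.idle:.1f}%")
[/code]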
I then used the same procedure for the GPUs I owned (1030, 1050, 1060, 2060, 2070, 2080, 2080Ti).
The GPUs Foldy and I had in common showed about the same results.
Initially I did the test to see if I could lower the CPU's TDP and run at lower wattage, and I just stumbled upon those GPU results by accident.
Until then, the thought hadn't occurred to me that the GPU needs a minimum CPU frequency.
But it turned out that running an Intel CPU below its base frequency wasn't really worth the loss of PPD on the GPU.
Setting the CPU from baseline (3.8 GHz) down to whatever the GPU needed (1.8-2.5 GHz) resulted in only a ~5-7 W difference on an i5-9400F.
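For reference, that package-power difference can be read without a wall meter on most recent Intel systems via the RAPL counters in sysfs. A sketch below, assuming the standard intel-rapl:0 package domain (the path can differ per machine, and the counter eventually wraps around).
[code]
import time

RAPL = "/sys/class/powercap/intel-rapl:0/energy_uj"  # package 0 energy counter (microjoules)

def read_uj() -> int:
    with open(RAPL) as f:
        return int(f.read())

e0, t0 = read_uj(), time.time()
time.sleep(60)                      # sample over one minute while folding
e1, t1 = read_uj(), time.time()
print(f"Average package power: {(e1 - e0) / 1e6 / (t1 - t0):.1f} W")
[/code]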
Since Intel didn't make many significant architectural upgrades between 2nd gen and 9th gen CPUs, I presume the results are similar across all those CPUs.
We haven't tested any AMD CPUs, nor the latest Ryzen or Intel 10th gen CPUs, which do have a significant technological jump compared to the older CPUs.
What did make a difference in the test was disabling the turbo frequency and setting the CPU to a fixed value just above its base frequency.
This resulted in a higher power saving than with turbo enabled (5-10 W), at a small power cost versus running at the base frequency (+2 to +3 W).
On the GPU side, the PPD was a few percent (~+2 to +3%) higher compared to baseline, but about ~1% lower compared to having turbo on.
I found that in my scenario (6th to 8th gen Intel i5), a very mild overclock above baseline was the sweet spot.
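On Linux with the intel_pstate driver, turbo can also be toggled at runtime instead of in the BIOS; a small sketch, assuming intel_pstate is in use (the no_turbo knob doesn't exist with other drivers) and root rights. The fixed-frequency/mild-overclock part still has to be done in the BIOS.
[code]
def set_turbo(enabled: bool) -> None:
    # intel_pstate exposes an inverted flag: 1 = turbo disabled, 0 = turbo enabled.
    with open("/sys/devices/system/cpu/intel_pstate/no_turbo", "w") as f:
        f.write("0" if enabled else "1")

set_turbo(False)  # keep the CPU at or below its configured maximum, no boost
[/code]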
I'd be interested to see how GPUs would do with the newest 10th gen CPUs in this test, as the 10th gens have a very low base frequency and very high boost frequencies.
They should (at least in theory) have a much more flexible power consumption range than older CPUs, in the sense that running them stock with turbo on could make the CPU use about twice the wattage of running it at a fixed value between baseline and turbo.
Side note: to save even more power, you could disable CPU cores that aren't needed, as well as HT/SMT (my CPUs don't have HT/SMT).
For example, if you only use a PC with a 6-core/12-thread (HT/SMT) CPU for GPU folding and have only 2 or 3 GPUs, you could disable HT/SMT in the BIOS (down to 6 cores) and also disable 2 cores (or 3 cores for Linux), so that Linux runs with the same number of cores as you have GPUs, and Windows with one core more than that; on Linux this can also be done at runtime, as in the sketch below.
It would save a bit more power than just leaving the cores idle, without affecting PPD.
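Here is a sketch of doing that from a running Linux system instead of the BIOS, using the standard CPU-hotplug and SMT sysfs knobs (needs root and a reasonably recent kernel; which cores to offline depends on your topology, check lscpu — the core numbers below are just an example).
[code]
def disable_smt() -> None:
    # Turn off hyperthreading/SMT system-wide.
    with open("/sys/devices/system/cpu/smt/control", "w") as f:
        f.write("off")

def offline_cpu(cpu: int) -> None:
    # Take a logical CPU offline (cpu0 usually cannot be offlined).
    with open(f"/sys/devices/system/cpu/cpu{cpu}/online", "w") as f:
        f.write("0")

disable_smt()
for cpu in (4, 5):  # example: keep cores 0-3 of a 6-core part for 3 GPUs + 1 spare core
    offline_cpu(cpu)
[/code]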
I haven't tested RAM speeds, because at the time XMP was still a rarity, but Ryzen CPUs are more affected by slower RAM.
I do know that running RAM with its XMP profile loaded (3,200 to 3,600 MHz) uses slightly more power than running it with XMP off (2,133 MHz).
It would be nice if someone could test RAM speeds in a similar manner (use the XMP profile to run it from its stock XMP speed down to as low as the GPU can handle).
Sadly I don't have the time for it anymore.
Generally, Ryzens aren't very well suited for GPU folding, as they have too many threads.
Though a Ryzen 3 3100X with SMT disabled is a good alternative to Intel CPUs, as it's built on 7 nm and is very efficient.