PantherX wrote:MeeLee wrote:...When a CPU has hyperthreading/SMT enabled, the Nvidia driver will always balance 1 GPU between both threads, but doesn't fully utilize them.
It doesn't mean it needs more than 1 CPU (unless some of the very latest projects have a much higher CPU usage than with core 22)...
Yep, using CPU affinity I can lock the process to a single CPU, but what I noticed is that for the newer, more complex WUs, 2 CPUs gave a slight reduction in the TPF. However, most current GPU WUs would be okay with a single CPU.
MeeLee wrote:...About a year ago, Foldy confirmed my results that a GTX 1050/1060 needs only about a 1.8 GHz CPU before bottlenecking, and an RTX 2060 needs only 2 GHz.
And that a 4 GHz CPU may show 100% core utilization per thread in Windows Task Manager, while the actual processing load is below 50%.
Meaning, pairing 2 RTX 2080 Tis on one CPU core of 4+ GHz is perfectly possible.
I am keen to know how it was tested, assuming you know the procedure. I wonder if it can be repeated once the next-gen GPUs from AMD/Nvidia arrive!
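On the CPU-affinity point above: for anyone who wants to try the same thing on Linux, here is a minimal sketch of pinning the folding process to a single core with psutil. The "FahCore" process-name prefix and the core number are assumptions, so adjust them to your own setup (taskset -cp 0 <PID> does the same from the shell).
[code]
import psutil

# Pin any running FahCore process to logical CPU 0 (name prefix is an assumption).
for proc in psutil.process_iter(["name"]):
    name = proc.info["name"] or ""
    if name.startswith("FahCore"):
        proc.cpu_affinity([0])  # lock the process to a single logical CPU
        print(f"Pinned PID {proc.pid} ({name}) to CPU 0")
[/code]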
The procedure isn't that difficult.
While I assume Foldy used Windows and just manually reduced the CPU frequency on the fly with some Windows CPU utility, lowering it until the GPU showed a significant drop in PPD,
I used Linux and htop (similar to Task Manager), which shows both the active (kernel) data and the passive data (those null bytes Nvidia uses to keep the GPU locked to a certain CPU core).
If htop showed 60% kernel usage on a 4 GHz CPU, I would then reboot into the BIOS, lower the CPU frequency there to 2.5 GHz (62.5%), and run consecutive tests lowering it further from there.
Once I hit that 60% spot (the 4 GHz CPU running at 2.4 GHz) on that particular setup, htop would show 100% kernel usage (no idle data), and the GPU's output would drop more noticeably.
The difference between running at the full 100% (4 GHz) and at 60% (2.4 GHz) in my test setup was not very significant (<10%).
Once I went below that point, the difference was >10%, and PPD dropped off more drastically.
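For anyone who wants to repeat this without rebooting into the BIOS for every step, here is a rough sketch of the same measurement loop using the Linux cpufreq sysfs interface and psutil. The frequency steps, the core number, and the settle/measure times are assumptions, it needs root, and not every board/driver honours scaling_max_freq the same way.
[code]
import time
import psutil

CPU = 0  # logical CPU the GPU's feeder thread is pinned to (assumption)
FREQ_STEPS_KHZ = [4_000_000, 3_000_000, 2_500_000, 2_400_000, 2_000_000]

def set_max_freq(cpu: int, khz: int) -> None:
    # Cap the core's maximum frequency via the cpufreq sysfs interface (needs root).
    with open(f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_max_freq", "w") as f:
        f.write(str(khz))

for khz in FREQ_STEPS_KHZ:
    set_max_freq(CPU, khz)
    time.sleep(30)  # let the work unit settle at the new frequency
    t = psutil.cpu_times_percent(interval=10, percpu=True)[CPU]
    # When idle hits ~0% and "system" (kernel) time saturates, the GPU starts to starve.
    print(f"{khz // 1000} MHz: system={t.system:.1f}%  idle={t.idle:.1f}%")
[/code]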
I then used the same procedure for the GPUs I owned (1030, 1050, 1060, 2060, 2070, 2080, 2080Ti).
The GPUs Foldy and I had in common showed about the same results.
Initially I did the test to see if I could lower the CPU's TDP and run at lower wattage, and I just stumbled upon those GPU results by accident.
Until then, the thought hadn't occurred to me that the GPU needs a minimum CPU frequency.
But it turned out that running an Intel CPU below its base frequency wasn't really worth the loss of PPD on the GPU.
Setting the CPU from baseline (3.8 GHz) down to whatever the GPU needed (1.8-2.5 GHz) resulted in only a ~5-7 W difference on an i5-9400F.
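For reference, that package-power difference can be read without a wall meter on most recent Intel systems via the RAPL counters in sysfs. A sketch below, assuming the standard intel-rapl:0 package domain (the path can differ per machine, and the counter eventually wraps around).
[code]
import time

RAPL = "/sys/class/powercap/intel-rapl:0/energy_uj"  # package 0 energy counter (microjoules)

def read_uj() -> int:
    with open(RAPL) as f:
        return int(f.read())

e0, t0 = read_uj(), time.time()
time.sleep(60)                      # sample over one minute while folding
e1, t1 = read_uj(), time.time()
print(f"Average package power: {(e1 - e0) / 1e6 / (t1 - t0):.1f} W")
[/code]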
Since Intel didn't make many significant architectural upgrades between 2nd gen and 9th gen CPUs, I presume the results are similar across all those CPUs.
We haven't tested any AMD CPUs, nor the latest Ryzen or Intel 10th gen CPUs, which do have a significant technological jump compared to the older CPUs.
What did make a difference in the test was disabling the turbo frequency and setting the CPU to a fixed value just above its base frequency.
This resulted in a higher power saving than with turbo enabled (5-10 W), at a small power cost versus running at the base frequency (+2 to +3 W).
On the GPU side, the PPD was a few percent (~+2 to +3%) higher compared to baseline, but about ~1% lower compared to having turbo on.
I found that in my scenario (6th to 8th gen Intel i5), a very mild overclock above baseline was the sweet spot.
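On Linux with the intel_pstate driver, turbo can also be toggled at runtime instead of in the BIOS; a small sketch, assuming intel_pstate is in use (the no_turbo knob doesn't exist with other drivers) and root rights. The fixed-frequency/mild-overclock part still has to be done in the BIOS.
[code]
def set_turbo(enabled: bool) -> None:
    # intel_pstate exposes an inverted flag: 1 = turbo disabled, 0 = turbo enabled.
    with open("/sys/devices/system/cpu/intel_pstate/no_turbo", "w") as f:
        f.write("0" if enabled else "1")

set_turbo(False)  # keep the CPU at or below its configured maximum, no boost
[/code]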
I'd be interested to see how GPUs would do with the newest 10th gen CPUs in this test, as the 10th gens have a very low base frequency and very high boost frequencies.
They should (at least in theory) have a much more flexible power consumption range than older CPUs, in the sense that running them stock with turbo on could make the CPU use about twice the wattage of running it at a fixed value between baseline and turbo.
Side note: to save even more power, you could disable CPU cores that aren't needed, as well as HT/SMT (my CPUs don't have HT/SMT).
For example, if you only use a PC with a 6-core/12-thread (HT/SMT) CPU for GPU folding and have only 2 or 3 GPUs, you could disable HT/SMT in the BIOS (down to 6 cores) and also disable 2 cores (or 3 cores for Linux), so that Linux runs with the same number of cores as you have GPUs, and Windows with one core more than that; on Linux this can also be done at runtime, as in the sketch below.
It would save a bit more power than just leaving the cores idle, without affecting PPD.
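Here is a sketch of doing that from a running Linux system instead of the BIOS, using the standard CPU-hotplug and SMT sysfs knobs (needs root and a reasonably recent kernel; which cores to offline depends on your topology, check lscpu — the core numbers below are just an example).
[code]
def disable_smt() -> None:
    # Turn off hyperthreading/SMT system-wide.
    with open("/sys/devices/system/cpu/smt/control", "w") as f:
        f.write("off")

def offline_cpu(cpu: int) -> None:
    # Take a logical CPU offline (cpu0 usually cannot be offlined).
    with open(f"/sys/devices/system/cpu/cpu{cpu}/online", "w") as f:
        f.write("0")

disable_smt()
for cpu in (4, 5):  # example: keep cores 0-3 of a 6-core part for 3 GPUs + 1 spare core
    offline_cpu(cpu)
[/code]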
I haven't tested RAM speeds, because at the time XMP was still a rarity, but Ryzen CPUs are more affected by slower RAM.
I do know that running RAM with its XMP profile loaded (3,200 to 3,600 MHz) uses slightly more power than running it with XMP off (2,133 MHz).
It would be nice if someone could test RAM speeds in a similar manner (use the XMP profile to run it from its stock XMP speed down to as low as the GPU can handle).
Sadly I don't have the time for it anymore.
Generally, Ryzens aren't very well suited for GPU folding, as they have too many threads.
Though a Ryzen 3 3100X with SMT disabled is a good alternative to Intel CPUs, as it's built on 7 nm and is very efficient.