I found this, which claims I can "Drag and drop columns to change their position and visibility. Note, only the default columns are displayed on small screens."
But although I can drag them to a different order, how do I turn them on and off? I've actually got 5 columns displayed, not the 25 shown, which makes me think this is for something else.
Fixed it myself; please leave this here for anyone else who hits the same problem. And BTW, this is not very intuitive.
The thin strip next to the spinning arrow is apparently where you put them; since it was empty, I never thought of using it. If you leave it blank, the four default columns are used. You drag columns from the bottom box into the top box. Also, Actions isn't listed, but it's always shown on the right.
P.S. Why use a spinning arrow here to mean "reset to default", when in the main display the same arrows mean a job is running?
Make two resource groups, one with just CPUs and one with just the GPU. That will separate the jobs with their resources the way you are asking to see them.
bobdarnall wrote: ↑Fri Apr 04, 2025 1:48 am
Make two resource groups, one with just CPUs and one with just the GPU. That will separate the jobs with their resources the way you are asking to see them.
Be careful doing that. If you select 16/16 threads and enable a GPU, the client automatically folds on only 15/16 threads, but only when the CPU and GPU are in the same resource group. If they are in separate resource groups, it folds on all 16/16, and that harms performance severely.
Are you sure? Oddly, CPU folding doesn't harm performance when the GPU task is a BOINC one, but a BOINC CPU task does harm it! Folding's CPU tasks seem to get out of the way / run at lower priority.
BOINC may be sending long-lived kernels to the GPU, letting the GPU keep everything in VRAM and run without the CPU's help until the WU is completed. This probably depends on the project; something like Einstein@home runs entirely on the GPU.
FAH is different and sends short-lived kernels that may last mere milliseconds, or less. So the CPU has to constantly feed the GPU with new data and process GPU returns. If you're folding on all CPU threads as well as the GPU, two things happen: 1) the CPU thread feeding the GPU slows down, so the GPU spends more time idle waiting to be fed; 2) one of the CPU folding threads now runs slower (CPU folding is only as fast as its slowest thread).
Nvidia GPUs are particularly sensitive to this because their drivers demand 100% CPU use on one thread for each GPU in use.
Actually, Einstein is very sensitive; it needs constant CPU help.
And I was talking about how Folding's CPU tasks are very polite: they get out of the way of Einstein, but BOINC ones don't.
It doesn't really make sense for any combination. I have 24 cores and run 25 tasks; each task (including the CPU part of the GPU one) should get almost one core each.
I think Windows is just rubbish at multitasking. I remember when preemptive multitasking arrived on single-core Windows machines (the Windows 95/NT days; Windows 3.x was still cooperative), and I don't think they've developed it much further since. Surely it's not rocket science to give 25 tasks an equal share of 24 cores?!
You'll still be hurting performance by running 25 tasks on 24 cores. Folding GPU tasks will suffer unless they get a core completely to themselves, with nothing heavy scheduled on it. Especially with Nvidia GPUs.
The reason isn't bad scheduling or multitasking. Even if you pin the threads to the CPU cores the same thing will happen.
If multitasking were perfect, the CPU part of the GPU task would get precisely 24/25 = 0.96 cores, which is plentiful. But Windows does not play fair, and tasks do not get an even share. I monitor CPU and GPU usage with MSI Afterburner (which also lets you tweak GPU clock/temperature/power) and try to balance CPU and GPU usage to a similar percentage.

I have also made a Nano GPU run a lot faster by raising the power limit to 150%; on some tasks it used more electricity and throttled, but I have found that with good cooling it's fine to go to 150%. I did melt the power connector, though! The wires are now soldered on directly.
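The fair-share arithmetic above can be sketched in a couple of lines (the 24-core / 25-task figures come from the posts; everything else is just division):

```python
# Fair-share sketch: with perfect load balancing, each of 25 runnable
# threads on 24 cores would get 24/25 of a core of CPU time.
cores, tasks = 24, 25
share = cores / tasks
print(f"{share:.2f} cores per task")  # -> 0.96 cores per task
```

The point of the later replies is that this average being "plentiful" doesn't matter for the GPU feeder thread, which is hurt by latency rather than by its average share.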
Even if multitasking had perfect load balancing, you would still need an entire, otherwise-unused logical core for the thread that feeds the GPU, because that thread is latency-sensitive. If the scheduler switches from the GPU-feeding thread to a CPU folding thread for just 20 ms, the GPU is starved of maybe dozens or even hundreds of kernels. The GPU feeder thread uses a spin-wait loop, polling the GPU's status constantly; even the briefest interruption will decrease GPU usage and leave the GPU's shader cores idling for longer.
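To put a number on "dozens or even hundreds of kernels": here is a minimal sketch of that arithmetic. The 20 ms preemption comes from the post; the 200 µs kernel duration is an illustrative assumption, not a measurement of the real FAH core:

```python
def kernels_missed(preemption_us: int, kernel_us: int) -> int:
    """How many short kernels the GPU could have run while its
    feeder thread was scheduled off the core (illustrative only)."""
    return preemption_us // kernel_us

# A 20 ms (20,000 us) scheduler quantum vs. ~200 us kernels:
print(kernels_missed(20_000, 200))  # -> 100
```

With sub-millisecond kernels, a single scheduler quantum of interruption really does translate into a hundred-ish kernels' worth of idle GPU time.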
If a scheduler could interleave two running tasks at per-instruction granularity, then it would be possible, since two threads sharing a 4 GHz CPU core would each behave identically to a thread running alone on a 2 GHz core. But that's not possible for hardware reasons (context switches will always have non-zero latency). It's not the scheduler's fault.
You will find that you always get better PPD by setting aside a CPU core for each GPU that you are folding on.
Anyway, how many instructions does a CPU core get through in a second? At 4 GHz, let's say 4 clock cycles per instruction, so 1 billion instructions per second. So 20 ms would be 20 million instructions run on the CPU task before the GPU is polled again. I can't believe the granularity is anywhere near that coarse. I guess one of us could look up how fine it really is and why it can't be made finer (efficiency?), but I'm not sure where to start.
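For what it's worth, the back-of-envelope numbers above check out. All inputs here are the post's assumptions (4 GHz clock, ~4 cycles per instruction, a 20 ms quantum), not measured values:

```python
# Back-of-envelope check of the figures in the post above.
clock_hz = 4e9               # assumed 4 GHz core
cycles_per_instr = 4         # assumed average CPI
instr_per_sec = clock_hz / cycles_per_instr   # 1 billion instr/s
quantum_s = 0.020            # assumed 20 ms scheduler quantum
instr_per_quantum = instr_per_sec * quantum_s
print(int(instr_per_quantum))  # -> 20000000
```

So yes: if the feeder thread were descheduled for a full quantum, on the order of tens of millions of instructions would run on the folding task before the GPU gets polled again.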