
Re: Dual 20 Core xeons only one being utilized.

Posted: Fri Dec 28, 2018 4:53 pm
by bruce
There are potential problems with dividing the atoms of a protein into groups that can run in parallel (on separate threads). One such problem is that the protein might be too small, making each group too small. A slot with a large number of threads only works well on proteins with a large number of atoms. Another potential problem is that GROMACS doesn't like to use thread counts with large prime factors. In both cases, FAH should assign a project with an acceptable number of threads, leaving the other threads idle.

While this has nothing directly to do with the 40 vs 32-thread limitation, it would limit your total productivity somewhat, even if the Windows 40 / 32 thread problem were to be fixed.
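
To illustrate the prime-factor point, here is a minimal sketch that finds the largest thread count at or below what a slot has available whose prime factors are all small. The cutoff `max_prime=5` and both function names are assumptions made for the example; the actual rule GROMACS or FAH applies is not stated in this thread.

```python
def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n (expects n >= 2)."""
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        # Whatever remains after trial division is itself prime.
        largest = n
    return largest


def usable_thread_count(available: int, max_prime: int = 5) -> int:
    """Largest count <= available whose prime factors are all <= max_prime.

    max_prime is an illustrative cutoff, not a documented GROMACS/FAH value.
    """
    for count in range(available, 1, -1):
        if largest_prime_factor(count) <= max_prime:
            return count
    return 1
```

Under that assumed cutoff, a 40-thread slot could use all 40 threads (40 = 2 x 2 x 2 x 5), while a 39-thread slot (39 = 3 x 13) would fall back to 36 and leave three threads idle.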

Re: Dual 20 Core xeons only one being utilized.

Posted: Fri Dec 28, 2018 5:05 pm
by jplacava
^^ Thanks - am I to understand that it would be better to create several slots with say 8 threads each? Productivity-wise?

Re: Dual 20 Core xeons only one being utilized.

Posted: Fri Dec 28, 2018 5:24 pm
by foldy
No, 2 slots is enough to get all cores used. It is better to fold only 2 work units quickly with all threads than to have 4 work units finish slowly.

Re: Dual 20 Core xeons only one being utilized.

Posted: Sat Dec 29, 2018 7:20 am
by ProDigit
Depends.
On my Xeon, I can set 10 out of 20 cores, and it appears that Windows uses cores 0, 2, 4, 6, 8, ... (all the master cores), and not the hyperthreading cores, by default.
Remember that you have only so much L-cache assigned to each core (or core set), so it makes more sense to run half the threads on a hyperthreading CPU.
And run all the cores (minus the ones assigned to the graphics card) on a non-hyperthreading CPU.

Surprisingly, the difference between running my Xeon at 10 cores, or running them at 20 cores is less than 3 Watts on the wall!
CPU PPD on the other hand went up by 33+%.

Re: Dual 20 Core xeons only one being utilized.

Posted: Sat Dec 29, 2018 6:37 pm
by foldy
ProDigit wrote:
"Surprisingly, the difference between running my Xeon at 10 cores, or running them at 20 cores is less than 3 Watts on the wall!
CPU PPD on the other hand went up by 33+%."

Does anyone else see the same result, that using only real cores and not virtual cores (hyperthreading) improves PPD?

Re: Dual 20 Core xeons only one being utilized.

Posted: Sat Dec 29, 2018 8:53 pm
by toTOW
I wouldn't be surprised with an AVX core and/or a WU with very few atoms ...

Re: Dual 20 Core xeons only one being utilized.

Posted: Sun Dec 30, 2018 12:15 am
by bruce
With SSE2, using an odd-even pair of cores runs slower than using only even-numbered or only odd-numbered cores, since the odd-even pair shares the SSE2 hardware. I've never tested AVX, so I'm not sure how it behaves in similar situations. Of course, for most people it's the OS that decides which cores to leave idle when they're not all in use (i.e., unless you intentionally assign affinity). For the most part, using most of your cores gets more work done than intentionally leaving some idle.

If you do decide to test AVX, please report the details: which projects you tested, how many concurrent frames you tested, and how the TPF changed.
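
For anyone who wants to try the affinity experiment, here is a rough sketch of picking one logical CPU per physical core. It assumes hyperthread siblings are numbered as adjacent pairs (0/1, 2/3, ...), as described above for Windows; other systems may number them differently, so check your own topology first. The function names are hypothetical, and the pinning helper uses `os.sched_setaffinity`, which exists only on Linux. On Windows you would need a different mechanism.

```python
import os
from typing import List


def one_thread_per_core(logical_cpus: int, threads_per_core: int = 2) -> List[int]:
    """Pick the first logical CPU of each core.

    Assumes siblings are numbered adjacently (0/1, 2/3, ...), which is an
    assumption about the machine, not something this code can verify.
    """
    return list(range(0, logical_cpus, threads_per_core))


def pin_current_process(cpus: List[int]) -> bool:
    """Pin the calling process to the given CPUs where the OS supports it.

    Returns False on platforms without os.sched_setaffinity (e.g. Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, set(cpus))  # pid 0 means the calling process
    return True
```

On a 20-thread CPU, `one_thread_per_core(20)` gives the ten even-numbered logical CPUs, which matches the "cores 0, 2, 4, ..." pattern mentioned earlier in the thread. Comparing TPF with and without pinning on the same project would show whether it helps.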