If you check the log for the run command used for the CPU WU being processed, you will probably find the core using 16 threads: 17 is a large prime and will not be used directly for computing. The log entry will look something like this:
Code:
08:35:44:WU00:FS00:0xa7: SIMD: avx_256
08:35:44:WU00:FS00:0xa7:********************************************************************************
08:35:44:WU00:FS00:0xa7:Project: 16927 (Run 7, Clone 6, Gen 50)
08:35:44:WU00:FS00:0xa7:Unit: 0x000000328120d1c95f930c26c013ad6d
08:35:44:WU00:FS00:0xa7:Digital signatures verified
08:35:44:WU00:FS00:0xa7:Calling: mdrun -s frame50.tpr -o frame50.trr -cpi state.cpt -cpt 15 -nt 2
The '-nt 2' indicates that 2 threads will be used. Depending on the OS and libraries in use, one or two additional threads may be present but mostly idle as far as the core executable's main code is concerned. In the example here, the 2 threads slice the region being simulated into 2 sections. A higher thread count results in more slices, in up to 3 dimensions. This is why 17 and similar "large" primes, and their multiples, aren't used: as factors they would produce slices that are too thin in some dimension.
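As a rough illustration of the point (this is not the actual GROMACS decomposition algorithm, just a sketch of the arithmetic), a thread count decomposes well when it factors into up to three reasonably balanced dimensions. A large prime like 17 only factors as 1×1×17, which forces very thin slabs:

```python
def factorizations_3d(n):
    """Return all ways to write n as a*b*c with a <= b <= c."""
    out = []
    for a in range(1, int(round(n ** (1 / 3))) + 1):
        if n % a:
            continue
        m = n // a
        for b in range(a, int(m ** 0.5) + 1):
            if m % b == 0:
                out.append((a, b, m // b))
    return out

def most_balanced(n):
    """Pick the factorization with the smallest largest/smallest ratio."""
    return min(factorizations_3d(n), key=lambda f: f[2] / f[0])

print(most_balanced(16))  # (2, 2, 4): reasonably box-like slices
print(most_balanced(17))  # (1, 1, 17): a prime forces 17 thin slabs
```

With 16 threads the best split is 2×2×4; with 17 the only option is 1×1×17, and each slab would be too thin in one dimension to hold a sensible chunk of the simulated region.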
Beyond about 16-18 threads, the Gromacs code used in A7 and A8 can reserve some threads for doing PME calculations separately from the threads assigned to each section. Details on this breakdown are in the science.log or md.log files among the work files for the running WU.
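Gromacs decides the PP/PME split with its own tuning heuristics and logs the choice in md.log; the sketch below only illustrates the general shape of the idea. The threshold and fraction values are made up for the example, not real Gromacs tuning numbers:

```python
def split_pme(total_threads, threshold=18, pme_fraction=0.25):
    """Illustrative split of threads into direct-space (PP) and
    dedicated PME groups. threshold and pme_fraction are placeholder
    values for the sketch, not Gromacs's actual heuristics."""
    if total_threads < threshold:
        # below the threshold every thread handles both kinds of work
        return total_threads, 0
    pme = max(1, round(total_threads * pme_fraction))
    return total_threads - pme, pme

print(split_pme(16))  # (16, 0): no separate PME group
print(split_pme(32))  # (24, 8): some threads reserved for PME
```

The real split also depends on the system size and box shape, which is why md.log for the specific WU is the place to see what was actually chosen.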
The current version of the A8 core was built with its '--ntmpi' parameter set to 1. That has some implications for thread usage on larger systems, especially multi-processor systems. It also allows some thread counts that the A7 core would not use; the full implications of that are still being worked out. One known effect is that, depending on the WU's size in atoms and space, each project has a thread count beyond which there is little or no improvement in processing time. But the core will still use that many threads.
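The diminishing returns at high thread counts follow the usual strong-scaling pattern: once each thread's slice of the system gets small, fixed overheads dominate. A simple Amdahl's-law sketch shows the shape of the curve; the 5% serial fraction here is an arbitrary illustration, not a measured number for any FAH project:

```python
def amdahl_speedup(threads, serial_fraction=0.05):
    """Ideal speedup with a fixed serial fraction (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)

# Speedup plateaus well below the thread count as threads grow.
for n in (2, 8, 16, 32, 64):
    print(n, round(amdahl_speedup(n), 1))
```

With a 5% serial fraction, 64 threads deliver only about 15x, barely better than 32 threads, which matches the observation that adding threads past a project-dependent point buys little while the core still occupies them.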