BTW BrgHW, since you have an Intel quad and use 2 cores for the SMP client, you can refine your profiles so that the SMP client always runs on two cores that share an L2 cache, and the GPU clients run on the other PAIR of cores (optionally balancing the GPU clients by memory as well).
None of this will have a huge effect, mind you. But for the SMP client it keeps synchronization and data sharing from having to go over the north bridge.
PAIR0 always contains physical core 0 and the other core that happens to share the L2 cache with core 0. So you can place your GPU clients on PAIR1. Just replace CPU3 with PAIR1::CPU1 and CPU2 with PAIR1::CPU0.
Optionally, if your GPU clients have some common path prefix, you can write a rule to balance them across the cores based on memory use. I do not expect any big effect on PPD here.
Code:
CommonPathPrefixToGPUClients*\FahCore_*.exe := PAIR1 [assign=1,resource=memuse,policy=pseudobalanced]
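If it helps to picture what "balance by memory" means, here is a rough Python sketch of the general idea: take the biggest process first and put it on whichever core is carrying the least memory so far. This is only an illustration, not the tool's actual algorithm (I am merely assuming pseudobalanced does something in this spirit), and the client names and memory figures are made up.

```python
import heapq

def balance_by_memory(mem_by_process, cores):
    """Greedy balance: largest process first, onto the currently lightest core."""
    # Min-heap of (memory assigned so far, core name).
    heap = [(0, core) for core in cores]
    heapq.heapify(heap)
    assignment = {}
    # Walk the processes from largest to smallest memory use.
    for name, mem in sorted(mem_by_process.items(), key=lambda kv: kv[1], reverse=True):
        total, core = heapq.heappop(heap)   # core with the least memory so far
        assignment[name] = core
        heapq.heappush(heap, (total + mem, core))
    return assignment

# Hypothetical memory use in MB for three GPU client cores.
clients = {"gpu_client_a": 120, "gpu_client_b": 80, "gpu_client_c": 60}
print(balance_by_memory(clients, ["PAIR1::CPU0", "PAIR1::CPU1"]))
```

With these made-up numbers the largest client gets one core of PAIR1 to itself and the two smaller ones share the other.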
And it might help a little for the SMP client to be on a PAIR of cores, because the four SMP processes exchange data at every time step, and a unit typically has 250K or 500K simulation steps. In your profile for the SMP client, replace CPU1 with PAIR0::CPU1 and CPU0 with PAIR0::CPU0.