HaloJones wrote:
wasted capacity effectively. they become no more efficient than a card with half the cores and end up with only maybe 50% utilisation. give them a really big protein with >100000 atoms and the card can get up to >90% utilisation that then allows them to get comparatively low TPF, to return far more quickly than a lower spec card and get an exponentially large quick return bonus

Ah, because some things have to be done in order before the simulation can continue, the card winds up spending clock cycles with compute units sitting idle. Thanks!
It would be cool if we could give cards like that additional WUs when they have compute units available (like extra "logical" GPUs in FAHClient). For now though, it makes me think that an "optimal" system might be one with several GPUs running in parallel, each able to efficiently handle a protein of about the average size plus one standard deviation, rather than one big GPU with the same total number of compute units.
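
To put rough numbers on that, here is a toy model in Python. It is only a sketch: the core counts and the `atoms_per_core` saturation figure are made up so that a 50,000 atom protein half-fills the big card, matching the "maybe 50% utilisation" figure quoted above. None of it is measured from real hardware or taken from FAHClient.

```python
# Toy saturation model: a card is "full" once the protein supplies
# atoms_per_core atoms for every core. All numbers are hypothetical.

def utilisation(atoms, cores, atoms_per_core=20):
    """Fraction of a card's cores kept busy by one work unit."""
    return min(1.0, atoms / (cores * atoms_per_core))

def busy_cores(atoms, cores, atoms_per_core=20):
    """Number of cores doing useful work on one work unit."""
    return cores * utilisation(atoms, cores, atoms_per_core)

BIG_CORES = 5000     # one large card (hypothetical)
SMALL_CORES = 2500   # each of two smaller cards (hypothetical)

for atoms in (50_000, 100_000):
    one_big = busy_cores(atoms, BIG_CORES)
    # Two small cards each run their own work unit of the same size.
    two_small = 2 * busy_cores(atoms, SMALL_CORES)
    print(f"{atoms} atoms: big card {utilisation(atoms, BIG_CORES):.0%} busy, "
          f"busy cores: one big = {one_big:.0f}, two small = {two_small:.0f}")
```

In this model the two smaller cards keep twice as many cores busy on the small protein, and the totals come out equal on the big one. What it leaves out is the point made in the quote: on a big protein the single large card finishes each WU sooner, which matters for the quick return bonus.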