Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Moderators: Site Moderators, FAHC Science Team

tear
Posts: 254
Joined: Sun Dec 02, 2007 4:08 am
Hardware configuration: None
Location: Rocky Mountains

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by tear »

HaloJones wrote:Runs Win-SMP client with MPIEXEC 24/7.
That's a valid point (assuming it's true, no Win-SMP here).
Believe me, it's almost linear with [otherwise idle] Linux-SMP.

Anyway, I'm afraid we've put too much into the equation by now :)


Cheers,
tear

EDIT: ok, it may need some extra tinkering (no pun intended) but the conclusion (better double clock than twice the cores) IMNSHO remains intact
One man's ceiling is another man's floor.
bollix47
Posts: 2963
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by bollix47 »

HaloJones wrote:
Scenario 1.

3.2GHz Quad-core machine running Windows. Runs Win-SMP client with MPIEXEC 24/7. gets A2 unit and takes a whopping 26hours to fold it.

When did the Win-SMP start getting A2 core WUs?

If and when it does, I doubt that it would take 26 hours to process a unit (unless it's one of the 3840-point units), as native Linux on a Q6600 @ stock only takes 12-13 hours (on a 1920-point unit), and although Windows is considerably slower, it's not twice as slow.
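For reference, the points-per-day arithmetic behind these figures can be sketched as below. This is only an illustration: the function name `points_per_day` is made up here, and it assumes identical units are folded back to back with no idle time between them.

```python
def points_per_day(points_per_unit: float, hours_per_unit: float) -> float:
    """Points per day if identical work units are folded back to back."""
    if hours_per_unit <= 0:
        raise ValueError("hours_per_unit must be positive")
    return points_per_unit * 24.0 / hours_per_unit


# Figures quoted in this thread
linux_q6600 = points_per_day(1920, 12.5)  # 12-13 hours per 1920-point unit
slow_1920 = points_per_day(1920, 26)      # 26 hours, if it were a 1920-point unit
slow_3840 = points_per_day(3840, 26)      # 26 hours, if it were a 3840-point unit
print(f"{linux_q6600:.0f} {slow_1920:.0f} {slow_3840:.0f}")
```

Under those assumptions the three cases come out at roughly 3686, 1772 and 3545 points per day.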

oops ... sorry for the double post ... hit the quote button instead of the edit button. :oops:
Last edited by bollix47 on Sun Nov 23, 2008 12:21 pm, edited 2 times in total.
HaloJones
Posts: 906
Joined: Thu Jul 24, 2008 10:16 am

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by HaloJones »

Good point, but the WinSMP still raises four threads and fully occupies the four cores and achieves 1700ppd.

The two VMs running two Linux SMP clients with A2s are getting 5200ppd and returning units in the same time as the Win-SMP. Until the Win-SMP client is better, I'll keep doing what I do, which is return twice the work in the same time.

Two SMP clients under Linux are better than a single one under Windows *even when the Linux clients get A1s*.
single 1070

shatteredsilicon
Posts: 87
Joined: Tue Jul 08, 2008 2:27 pm
Hardware configuration: 1x Q6600 @ 3.2GHz, 4GB DDR3-1333
1x Phenom X4 9950 @ 2.6GHz, 4GB DDR2-1066
3x GeForce 9800GX2
1x GeForce 8800GT
CentOS 5 x86-64, WINE 1.x with CUDA wrappers

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by shatteredsilicon »

WangFeiHong wrote:I thought we had established that 1x4Ghz processors were at least, faster than 2x2ghz processors due to inter-core bottlenecks.
That's what I thought I said - 1x4GHz is better than 2x2GHz.
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by 7im »

shatteredsilicon wrote:
WangFeiHong wrote:I thought we had established that 1x4Ghz processors were at least, faster than 2x2ghz processors due to inter-core bottlenecks.
That's what I thought I said - 1x4GHz is better than 2x2GHz.
Did I miss the post where you show SMP folding numbers to back up this statement?


@ HaloJones - This thread isn't about comparing the Windows SMP client (known to be slower) to the Linux SMP client. This was about 2x2 vs. 1x4 cores. If you can use VMs to process 2 WUs in the same time as 1 WU, then do it. Well done. However, it's rarely the case that 2 take the same time as 1. Even almost the same time is acceptable and helpful. But again, the curve drops off quickly as to when that difference in time becomes less helpful. Do what you think is best.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
tear
Posts: 254
Joined: Sun Dec 02, 2007 4:08 am
Hardware configuration: None
Location: Rocky Mountains

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by tear »

Finally we're getting *somewhere* :mrgreen:

TBH, coming up with numbers won't be very hard [isn't thread.excitement() unusually high already? ;-)].
Okay, I can't come up with *exactly* those but I can try doing a comparison of QC 1GHz vs. DC 2GHz
(as soon as I figure out how to turn on this bloody Speedstep in Xeons) [sorry, I'm a Lo-clock guy].

Would that work?


Cheers,
tear
One man's ceiling is another man's floor.
Jeff_Grant
Posts: 13
Joined: Mon Dec 03, 2007 5:05 am

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by Jeff_Grant »

Even if you could run it on a P4 Northwood, it would take so much longer it will have no value when it gets back. Again, you should read that quote again.
Why do you guys have to do this to me? Now I have to dig out a P4 Northwood and LN2, and push to 8GHz. Let's see here. . .

http://forums.hardwarezone.com.sg/showt ... ?t=2009804
Jeff_Grant
Posts: 13
Joined: Mon Dec 03, 2007 5:05 am

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by Jeff_Grant »

ahh maybe it was a prescott.
HaloJones
Posts: 906
Joined: Thu Jul 24, 2008 10:16 am

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by HaloJones »

Suppose for a moment, that the SMP client could run serially instead of parallellellelly. Each thread would take exactly as long as necessary and the total time spent on the cpu would be exactly as long as the threads added together. Now suppose you have a 9.6GHz cpu. It would take no longer than the sum of each thread's time and there would be no wasted cpu time.

Now look at what actually happens. You have a 4x2.4GHz quad. No core can start each frame until the slowest thread has completed. Any time spent by any other process on any core will result in an overrun of that thread compared to the rest, with the inevitable result that some of the total 9.6GHz is wasted as cores wait at the end of their frame for the other cores to catch up.

Of course, there isn't an option to run the four threads serially so does it still hold true if they're run in parallel? Arguably yes.

When the multi-core cpu is waiting for a frame to "catch up" with the others, it can only dedicate one quarter of its power to the frame. But a single-core 9.6GHz cpu could dedicate all of its power to that catch-up and finish it quicker. Now if only you could get a 9.6GHz single-core chip!

So is a really fast dual better than a slower quad? What about a 4GHz dual against a quad-core 2.4GHz? It would all depend on how unbalanced the threads are. When I set "verbosity = 9", it is interesting to see messages in the logs referring to "58.7% time waiting" or somesuch. That indicates to me that the thread balancing is very poor and that a fast dual is probably better than a quad with the same combined speed.

But I can't prove it. Just my suspicion.
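The argument above can be put into a toy model. The sketch below is not a measurement and not how the FAH client works internally; it only assumes that a frame's work is split evenly across cores, that every thread waits at a barrier for the slowest one, and that another process steadily takes 10% of one core (that figure, and the function names, are assumptions made for illustration).

```python
def parallel_frame_time(total_work, core_speeds, background):
    """Frame time on a multi-core CPU with a barrier at the end of each frame.

    total_work is split evenly; core i loses background[i] (0..1) of its
    time to other processes. The frame ends when the slowest thread finishes.
    """
    share = total_work / len(core_speeds)
    return max(share / (speed * (1.0 - busy))
               for speed, busy in zip(core_speeds, background))


def serial_frame_time(total_work, core_speeds, background):
    """Frame time on one hypothetical core as fast as all cores combined,
    carrying the same absolute amount of background work."""
    combined = sum(core_speeds)
    stolen = sum(speed * busy for speed, busy in zip(core_speeds, background))
    return total_work / (combined - stolen)


speeds = [2.4, 2.4, 2.4, 2.4]  # GHz, the 4x2.4GHz quad from the post
busy = [0.10, 0.0, 0.0, 0.0]   # assumed: 10% of one core taken by another process
work = 9.6                     # arbitrary units: 1 second of work on an idle machine

quad = parallel_frame_time(work, speeds, busy)
single = serial_frame_time(work, speeds, busy)
print(f"quad with barrier: {quad:.3f} s, single 9.6GHz core: {single:.3f} s")
```

In this model the quad needs about 1.111 s per frame because three cores sit idle at the barrier, while the imaginary 9.6GHz single core needs about 1.026 s for the same work and the same background load.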
single 1070

tear
Posts: 254
Joined: Sun Dec 02, 2007 4:08 am
Hardware configuration: None
Location: Rocky Mountains

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by tear »

HaloJones wrote:(...) it is interesting to see in logs, messages referring to "58.7% time waiting" or somesuch, it indicates to me that the thread balancing is very poor
Yup, that's the thing [well, it *seems* it is].

Code:

Average load imbalance: 0.6 %
Part of the total run time spent waiting due to load imbalance: 0.4 %
Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0 % Y 0 %  0 %
Anyway, MONEY TALKS!!^W^W numbers speak!

[I'll try to conduct actual experiment sometime today]
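For anyone who wants to compare these figures across several runs, the two percentages can be pulled out of a log with a few lines of Python. This is a rough sketch that assumes the lines look exactly like the ones quoted above; the dictionary keys and the function name are invented for the example.

```python
import re

# Sample text copied from the log excerpt quoted in this post
LOG = """\
Average load imbalance: 0.6 %
Part of the total run time spent waiting due to load imbalance: 0.4 %
"""

# Hypothetical key names; the patterns match the quoted wording only
PATTERNS = {
    "imbalance_pct": re.compile(r"Average load imbalance:\s*([\d.]+)\s*%"),
    "waiting_pct": re.compile(r"spent waiting due to load imbalance:\s*([\d.]+)\s*%"),
}


def parse_load_balance(text):
    """Return the percentages found in the text, keyed by PATTERNS names."""
    result = {}
    for key, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            result[key] = float(match.group(1))
    return result


stats = parse_load_balance(LOG)
print(stats)
```

Lines that are missing from a log are simply left out of the returned dictionary, so the caller can tell "not reported" apart from "0 %".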


tear
One man's ceiling is another man's floor.
Jeff_Grant
Posts: 13
Joined: Mon Dec 03, 2007 5:05 am

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by Jeff_Grant »

So is a really fast dual better than a slower quad? What about a 4GHz dual against a quad-core 2.4GHz?
With all other things equal, like cache? That could make a difference too.
Jeff_Grant
Posts: 13
Joined: Mon Dec 03, 2007 5:05 am

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by Jeff_Grant »

Memory the same speed?
shatteredsilicon
Posts: 87
Joined: Tue Jul 08, 2008 2:27 pm
Hardware configuration: 1x Q6600 @ 3.2GHz, 4GB DDR3-1333
1x Phenom X4 9950 @ 2.6GHz, 4GB DDR2-1066
3x GeForce 9800GX2
1x GeForce 8800GT
CentOS 5 x86-64, WINE 1.x with CUDA wrappers

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by shatteredsilicon »

7im wrote:
shatteredsilicon wrote:
WangFeiHong wrote:I thought we had established that 1x4Ghz processors were at least, faster than 2x2ghz processors due to inter-core bottlenecks.
That's what I thought I said - 1x4GHz is better than 2x2GHz.
Did I miss the post where you show SMP folding numbers to back up this statement?
I don't have a setup directly equivalent to this to demonstrate with, but there is pretty strong evidence of it: running 4x clients on a quad, each with its affinity bound to one core, yields massively more PPD than running 1 client on all four cores (1 FahCore per CPU core). If the scaling were perfectly balanced under real-world conditions, the performance would favour the setup with the fewest total processes. Running 4x clients would then be slower, because there is more process switching taking place (which adds overhead and slows things down), cache effectiveness is significantly reduced since there is 4x the amount of data being processed, and memory bandwidth contention increases 4-fold. The fact that this setup still yields much higher throughput despite the extra process switching overheads and the extra cache and memory bandwidth contention (at the expense of a comparatively smaller increase in latency) means that the MPI FAH scaling is actually pretty dire.

The problem is exactly as Halo describes it - the performance of the whole operation is limited by the performance of the slowest core, which means that the effect of other processes competing for CPU time reflects 4-fold on the folding performance, i.e. another process using up 10% of one core should slow one thread down by 10%, but because the folding speed is limited by the slowest thread, it will actually slow all four threads down by 10%.
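The 10% example can be written out as a small calculation. The sketch below is a simplification: it measures throughput in "idle core equivalents", assumes one SMP client advances at the pace of its slowest core, and ignores the cache, memory bandwidth and process switching costs mentioned above. The function names are made up for illustration.

```python
def smp_throughput(busy):
    """One SMP client across all cores: every thread advances at the
    pace of the most heavily loaded core."""
    return len(busy) * (1.0 - max(busy))


def independent_throughput(busy):
    """One single-core client pinned to each core: each client advances
    at the pace of its own core only."""
    return sum(1.0 - b for b in busy)


# Assumed load: another process uses 10% of one core on a quad
busy = [0.10, 0.0, 0.0, 0.0]

smp = smp_throughput(busy)
independent = independent_throughput(busy)
print(f"SMP client: {smp:.1f} cores' worth, four pinned clients: {independent:.1f}")
```

With these assumptions the single SMP client delivers 3.6 cores' worth of work and the four pinned clients deliver 3.9, so the 10% disturbance costs the SMP client four times as much.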
shatteredsilicon
Posts: 87
Joined: Tue Jul 08, 2008 2:27 pm
Hardware configuration: 1x Q6600 @ 3.2GHz, 4GB DDR3-1333
1x Phenom X4 9950 @ 2.6GHz, 4GB DDR2-1066
3x GeForce 9800GX2
1x GeForce 8800GT
CentOS 5 x86-64, WINE 1.x with CUDA wrappers

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by shatteredsilicon »

tear wrote:
HaloJones wrote:(...) it is interesting to see in logs, messages referring to "58.7% time waiting" or somesuch, it indicates to me that the thread balancing is very poor
Yup, that's the thing [well, it *seems* it is].

Code:

Average load imbalance: 0.6 %
Part of the total run time spent waiting due to load imbalance: 0.4 %
Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0 % Y 0 %  0 %
Anyway, MONEY TALKS!!^W^W numbers speak!

[I'll try to conduct actual experiment sometime today]
This measurement must have come from a dedicated machine that has practically no other load on it. As discussed, dedicated folding machines scale reasonably well with the A2 core. Run it on a machine that is used 24/7 (e.g. a server, or even a dedicated folding machine that also runs some GPU clients) and you'll start seeing massive imbalances. At anything over 50% of one core being used on a quad, it'll actually be more efficient to run the folding SMP client on just 2 cores, and leave 1.5 cores completely unused. In fact, this seems to be exactly what the process scheduler under Linux tends to do under such conditions, leading to 30-40% idle time on a machine that should in theory have all its free cycles saturated by the CPU-hungry folding processes.
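The 50% figure follows from a simple break-even, sketched below under an idealised barrier model: an SMP client on all cores runs at the pace of the one loaded core, while a client on a subset of undisturbed cores runs at full pace. The function name is hypothetical, and the comparison is limited to the four-core and two-core cases named in the post.

```python
def break_even_busy_fraction(all_cores: int, idle_subset: int) -> float:
    """Busy fraction on the loaded core above which an SMP client on
    `idle_subset` undisturbed cores outpaces one spread over `all_cores`.

    Barrier model: all_cores * (1 - busy) versus idle_subset.
    """
    if not 0 < idle_subset < all_cores:
        raise ValueError("idle_subset must be between 1 and all_cores - 1")
    return 1.0 - idle_subset / all_cores


threshold = break_even_busy_fraction(4, 2)
print(threshold)  # 0.5
```

Below the threshold the four-core client still wins in this model; above it, the two-core client does, even though one and a half cores are left unused.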
tear
Posts: 254
Joined: Sun Dec 02, 2007 4:08 am
Hardware configuration: None
Location: Rocky Mountains

Re: Quad-core 2Ghz vs Dual-core 4Ghz - Which faster?

Post by tear »

Relax, I heard you the first time ;-) [all my folders are dedicated].

Stay tuned,
tear
One man's ceiling is another man's floor.