Dual X5650 giving me 20k PPD, why so slow?

Nilem
Posts: 9
Joined: Wed Oct 07, 2020 7:47 pm

Dual X5650 giving me 20k PPD, why so slow?

Post by Nilem »

I've just found an old HP server with dual Xeon X5650s and started folding on it. I know it's old, but since each CPU has a CPU Mark of 5863, I thought the results would be okay.

But it's been disappointing. I'm currently folding a FAHCore_a8 WU from project 16810 and only making about 20k PPD with all 24 threads on it. Meanwhile, I also have a single 6-core i5-8500B (CPU Mark of 9538) that got the same WU, and it crushed it at a speed that would end up at over 100k PPD.

Is this normal? Are my dual X5650 just that slow? Or is there something I should check to unleash their power?

So far, I've tried splitting the threads into slots like 12/12, 8/8/6, etc., and couldn't get much better results. But even when all 24 threads are in one slot, the monitor tells me most cores are pretty much in full use.

Also, sometimes it suddenly starts to fold very fast and gets up to 150k PPD, but this is always just a glitch: when this happens, progress stops being printed in the log tab, and after a few minutes the progress bar just goes back down to where it was before the speed-up. (Example: in a few minutes I get from 55% to 60%, but none of the progress is logged and it suddenly goes back down to 55.40%.) Not sure if this has anything to do with my main question.
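
To put the gap described in this post into numbers, here is a minimal sketch in Python. It uses only the figures quoted above; adding the two per-CPU scores into one combined score is a simplifying assumption, and `ppd_per_mark` is a made-up helper name, not anything from the FAH client.

```python
# Figures quoted in the post above; nothing here is measured independently.
# Assumption: the dual-socket score is approximated as 2 x the single-CPU score.
systems = {
    "2x Xeon X5650": {"cpu_mark": 2 * 5863, "ppd": 20_000},
    "i5-8500B": {"cpu_mark": 9538, "ppd": 100_000},
}


def ppd_per_mark(cpu_mark, ppd):
    """Return points per day earned per CPU Mark point."""
    return ppd / cpu_mark


for name, figures in systems.items():
    print(f"{name}: {ppd_per_mark(**figures):.2f} PPD per CPU Mark point")
```

On these numbers the i5 earns roughly six times as many points per benchmark point, so the benchmark score alone does not predict folding output.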
Joe_H
Site Admin
Posts: 7937
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by Joe_H »

To start, yes, the X5650 is that old. It is a 10-year-old Westmere design that supports SSE instructions (up to SSE4.2) but not AVX. Newer chips support AVX or AVX2, which give a significant speedup over the SSE2 instructions used on your processors.

You may find, depending on the WU, that using just the 12 main CPU threads will give nearly the highest PPD.

The PPD and ETA figures right after a start are not reliable estimates. The client needs to see at least 1-2% progress before they will settle down to accurate numbers.
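
For anyone who wants to check which instruction sets a CPU reports, here is a small sketch in Python. It assumes a Linux system on x86, where `/proc/cpuinfo` contains a `flags` line; `simd_support` is a hypothetical helper name, and the script says nothing about which code path the FAH core actually picks.

```python
from pathlib import Path


def simd_support(cpuinfo_text):
    """Report which SIMD extensions appear in the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            return {name: name in flags for name in ("sse2", "sse4_2", "avx", "avx2")}
    return {}  # no 'flags' line found (e.g. not an x86 Linux system)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # Linux only
    if cpuinfo.exists():
        print(simd_support(cpuinfo.read_text()))
```

On a Westmere-era chip the `avx` and `avx2` entries would be expected to come back `False`.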

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Nilem
Posts: 9
Joined: Wed Oct 07, 2020 7:47 pm

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by Nilem »

Joe_H wrote: Newer chips support AVX or AVX2, which give a significant speedup over the SSE2 instructions used on your processors.
Well, that is an important detail I totally missed! Thanks for the info.
Joe_H wrote: You may find, depending on the WU, that using just the 12 main CPU threads will give nearly the highest PPD.
You mean that running just 1 slot with 12 threads might get higher PPD than 1 slot with 24 threads? What would be the reason for not using all the HT threads?
_r2w_ben
Posts: 285
Joined: Wed Apr 23, 2008 3:11 pm

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by _r2w_ben »

Nilem wrote:You mean that running just 1 slot with 12 threads might get higher PPD than 1 slot with 24 threads? What would be the reason for not using all the HT threads?
FAHCore_a8 does a better job of keeping execution units busy than FAHCore_a7. Spawning and syncing the extra threads that use HT might be slowing things down more than it's helping.

The current build of FAHCore_a8 is de-optimized for multi-socket environments (referred to as -ntmpi 1 here). This results in extra syncing between sockets, so fewer threads might be faster.
Joe_H
Site Admin
Posts: 7937
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by Joe_H »

Yes, 1 slot with 12 threads.

When using the 2 threads per core provided through HT, the processing threads are in constant contention for the single FPU available on each core. Each thread needs to be synchronized with the others, so overall processing is limited by the speed of the slowest threads. In addition, while some WUs will process on all 24 threads, depending on their size in atoms, they may not process much faster beyond a lower thread count.

Depending on what else is running on the system, including system processes for the OS, leaving at least 1 or 2 threads unused by F@h can result in less interruption of the threads that are being used.

Finally, one other thing I did not mention before is the NUMA-related settings in your BIOS. They can affect how threads on one physical CPU communicate with those on the other. Those settings may speed up or slow down folding, or have no discernible effect.
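
The "limited by the slowest thread" point can be illustrated with a toy model in Python. This is only a sketch: the work size, per-thread speeds and sync cost below are invented numbers, not measurements of FAHCore_a8, and `step_time` is a hypothetical helper.

```python
def step_time(work, thread_speeds, sync_cost):
    """Time for one synchronized step.

    The work is split evenly; the step finishes when the slowest thread
    finishes, plus a sync overhead that grows with the thread count.
    """
    share = work / len(thread_speeds)
    slowest = max(share / speed for speed in thread_speeds)
    return slowest + sync_cost * len(thread_speeds)


# Invented scenario values, for illustration only.
WORK = 100.0
SYNC = 0.05

one_per_core = step_time(WORK, [1.0] * 12, SYNC)  # 12 threads, one per core
with_ht = step_time(WORK, [0.6] * 24, SYNC)  # 24 threads sharing 12 FPUs
ht_one_interrupted = step_time(WORK, [0.6] * 23 + [0.3], SYNC)  # one thread delayed

print(f"12 threads: {one_per_core:.2f}")
print(f"24 threads: {with_ht:.2f}")
print(f"24 threads, one interrupted: {ht_one_interrupted:.2f}")
```

In this toy setup a single delayed thread is enough to make the 24-thread run slower than the 12-thread run, which is the reason for leaving a thread or two free for the OS.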
Neil-B
Posts: 1996
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by Neil-B »

With Xeons (at least the more recent ones), contention affects this scenario less than on other multi-threaded CPUs, or at least that is my experience. The throughput from my 14-core Xeons increases significantly (admittedly not quite double, but fairly close) when running 28 threads. Still, it is worth running a few tests using an offline core/WU to see the real difference.
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

(Green/Bold = Active)
sptn.
Posts: 51
Joined: Wed Sep 09, 2020 10:05 am

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by sptn. »

Nilem wrote:[...]I also have a single 6-core i5-8500B (CPU Mark of 9538) that got the same WU, and it crushed it at a speed that would end up at over 100k PPD.[...]
I can't help you with your problem, but I would like to know how you squeeze 100k PPD out of an i5.
I have an i7 (I guess an i7-8550U or something) and am getting 36k PPD at most (more likely 30k).
JimboPalmer
Posts: 2522
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by JimboPalmer »

sptn. wrote:I have an i7 (I guess an i7-8550U or something) and am getting 36k PPD at most (more likely 30k).
Have you entered a passkey for your Folding?

https://foldingathome.org/support/faq/points/passkey/
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
Nilem
Posts: 9
Joined: Wed Oct 07, 2020 7:47 pm

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by Nilem »

sptn. wrote:I can't help you with your problem, but I would like to know how you squeeze 100k PPD out of an i5.
I have an i7 (I guess an i7-8550U or something) and am getting 36k PPD at most (more likely 30k).
The "i7" doesn't mean much. An i7-8550U is a notebook CPU and it is nearly half as powerful as an i5-8500B, but it consumes only 15 W compared to 65 W. The "i7" that would compare better is the i7-8700B, which has 6 cores with HT (the i5-8500B has no HT) and delivers 30% more:

https://www.cpubenchmark.net/compare/In ... 3064vs3388
Nilem
Posts: 9
Joined: Wed Oct 07, 2020 7:47 pm

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by Nilem »

So I got my best PPD with two slots split 12/10. I could then achieve around 25k PPD.

But I see that the lack of AVX makes this machine very inefficient under FAH. I've tried Rosetta on it, and the results were more proportional to the raw power of the dual X5650s, so I will probably keep it there for now and run FAH on my more recent machine.
sptn.
Posts: 51
Joined: Wed Sep 09, 2020 10:05 am

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by sptn. »

JimboPalmer wrote:
sptn. wrote:I have an i7 (I guess an i7-8550U or something) and am getting 36k PPD at most (more likely 30k).
Have you entered a passkey for your Folding?

https://foldingathome.org/support/faq/points/passkey/
Yes I did.
Nilem wrote:
sptn. wrote:I can't help you with your problem, but I would like to know how you squeeze 100k PPD out of an i5.
I have an i7 (I guess an i7-8550U or something) and am getting 36k PPD at most (more likely 30k).
The "i7" doesn't mean much. An i7-8550U is a notebook CPU and it is nearly half as powerful as an i5-8500B, but it consumes only 15 W compared to 65 W. The "i7" that would compare better is the i7-8700B, which has 6 cores with HT (the i5-8500B has no HT) and delivers 30% more:

https://www.cpubenchmark.net/compare/In ... 3064vs3388
Ah this makes sense.
Nathan_P
Posts: 1164
Joined: Wed Apr 01, 2009 9:22 pm
Hardware configuration: Asus Z8NA D6C, 2 [email protected] Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)

Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS

Not currently folding
Asus Z9PE- D8 WS, 2 [email protected] Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
Location: Jersey, Channel islands

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by Nathan_P »

If you haven't already, try Linux as the OS. That 20k is low for a pair of X5650s.
Nilem
Posts: 9
Joined: Wed Oct 07, 2020 7:47 pm

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by Nilem »

Nathan_P wrote:If you haven't already, try Linux as the OS. That 20k is low for a pair of X5650s.
I didn't try; I actually thought Windows would be a bit faster. Do you have an estimate of how many more points I should get under Linux?
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by PantherX »

Nilem wrote:...I didn't try; I actually thought Windows would be a bit faster. Do you have an estimate of how many more points I should get under Linux?
Currently, folding on Linux does generate more points, though each project would see a different increase. Do note that while this is the current situation, it might change in the future. Here's a Google Sheet with some information to help you make an informed decision:
Same set of hardware for each test: https://docs.google.com/spreadsheets/d/ ... edit#gid=0
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
MeeLee
Posts: 1339
Joined: Tue Feb 19, 2019 10:16 pm

Re: Dual X5650 giving me 20k PPD, why so slow?

Post by MeeLee »

The idea of running one slot per CPU is probably interesting.
I don't know how you'd do that in the slots setting.
But if you run 2 CPUs and force a WU on them that utilizes both CPUs, the connection between the two CPUs is most definitely going to slow each unit down.
Disabling HT is also a good way to save power.
If you have PCIe slots available, it may be better to plug in a few GPUs, as a single GT 1030 will get a higher score than both of those CPUs.
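
As a first step toward the one-slot-per-CPU idea, it helps to know which logical CPUs sit on which socket. The sketch below, in Python, assumes a Linux system that exposes NUMA nodes under `/sys/devices/system/node`; `parse_cpulist` and `numa_nodes` are hypothetical helper names. How that mapping could then be applied to FAH slots is left open, as noted above.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-5,12-17' into a list of CPU ids."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_nodes(root="/sys/devices/system/node"):
    """Map each NUMA node name to its logical CPU ids (Linux only)."""
    return {
        node.name: parse_cpulist((node / "cpulist").read_text())
        for node in sorted(Path(root).glob("node[0-9]*"))
    }


if __name__ == "__main__":
    for name, cpus in numa_nodes().items():
        print(name, cpus)
```

On a dual-socket machine with NUMA enabled in the BIOS, this would typically list two nodes, each with its own set of logical CPUs.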

Or, if you have some money to spare, get a Ryzen 3900X. It's not only more than twice as fast at 3.8-4 GHz, it also gets extra bonus PPD (due to its many threads and faster-finishing WUs) AND it uses less power (around 150 W stock, up to 200 W with PBO, measured at the wall for a system doing CPU folding only).