Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

FalconFour
Posts: 29
Joined: Fri Sep 05, 2008 11:57 am

Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by FalconFour »

I've got an ancient (mid-2006, first-gen) MacPro1,1 that's got a good GPU and is useful for converting electricity into desirable heat, while also producing ~200k PPD.

It sports two Xeon X5355 CPUs, each with 4 cores (so, 8 total). All 8 cores are accounted for in /proc/cpuinfo - interlaced between physical IDs 0 and 1 almost randomly (it goes "physical ID" 0, 0, 1, 0, 1, 0, 1, 1 in order of core IDs 0...7).
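
One way to see the socket layout at a glance is to group the logical CPUs by their "physical id" field. This is a minimal sketch, assuming the usual "key : value" layout of /proc/cpuinfo; the function name cpus_by_socket is made up for illustration, and the sample text is built from the 0, 0, 1, 0, 1, 0, 1, 1 order described above rather than copied from the machine.

```python
from collections import defaultdict


def cpus_by_socket(cpuinfo_text):
    """Group logical CPU numbers by their "physical id" (socket)."""
    sockets = defaultdict(list)
    current_cpu = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            current_cpu = int(value)
        elif key == "physical id" and current_cpu is not None:
            sockets[int(value)].append(current_cpu)
    return dict(sockets)


# Assumed sample: processors 0..7 with the socket order quoted in the post.
SAMPLE = "\n\n".join(
    f"processor\t: {cpu}\nphysical id\t: {sock}"
    for cpu, sock in enumerate([0, 0, 1, 0, 1, 0, 1, 1])
)

print(cpus_by_socket(SAMPLE))  # {0: [0, 1, 3, 5], 1: [2, 4, 6, 7]}
```

On the real machine, pass the contents of /proc/cpuinfo instead of SAMPLE.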

F@H client reports all of them:

Code: Select all

09:03:41:WU00:FS02:0xa8:************************************ System ************************************
09:03:41:WU00:FS02:0xa8:        CPU: Intel(R) Xeon(R) CPU X5355 @ 2.66GHz
09:03:41:WU00:FS02:0xa8:     CPU ID: GenuineIntel Family 6 Model 15 Stepping 7
09:03:41:WU00:FS02:0xa8:       CPUs: 8
09:03:41:WU00:FS02:0xa8:     Memory: 31.36GiB
09:03:41:WU00:FS02:0xa8:Free Memory: 29.42GiB
09:03:41:WU00:FS02:0xa8:    Threads: POSIX_THREADS
09:03:41:WU00:FS02:0xa8: OS Version: 5.4
09:03:41:WU00:FS02:0xa8:Has Battery: false
09:03:41:WU00:FS02:0xa8: On Battery: false
09:03:41:WU00:FS02:0xa8: UTC Offset: -8
09:03:41:WU00:FS02:0xa8:        PID: 1515
09:03:41:WU00:FS02:0xa8:        CWD: /home/falcon/fahclient/work
09:03:41:WU00:FS02:0xa8:********************************************************************************

I have two CPU slots for WUs, so that each slot can hopefully take one socket and it doesn't waste a lot of time crossing sockets. But now as I see more detail about it, it seems it's trying to outsmart me by keeping work ALL on one socket -- both WUs share 4 cores, not 8.

I have 3 slots - one CPU (3 threads, leaving one for the GPU), one GPU, then another CPU (4 threads). The main star of the show is an R9 280x, which performs admirably once my "house of cards balancing on the head of a toothpick" style software setup actually works (I boot with Alt, select "Windows" to boot Linux, then once I get a black screen, I Ctrl+Alt+F2 over to the console, log in, cd to /fahclient, and run FAHClient. If I try setting it up as a service, it fails to run GPU workunits with the dreaded "Error initializing context: clCreateCommandQueue (-6)" error every time - but not if I run it from the console).

The CPU workunits seem to be... how do you say... underperforming, even for a CPU that scores only about 1900 on Passmark these days (compared to what, 33000 for a new Ryzen?). But they underperform in a strange way - the slots intermittently start reporting that they have "6 days" ETA, then crash down to mere hours while the other slot gets the "6 days" curse.

I looked into top, and I found the culprit:

Code: Select all

top - 01:12:49 up 28 min,  2 users,  load average: 4.40, 4.20, 3.57
Tasks: 197 total,   4 running, 193 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.9 sy, 52.3 ni, 46.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  32111.5 total,  29984.9 free,   1355.3 used,    771.3 buff/cache
MiB Swap:   1383.8 total,   1383.8 free,      0.0 used.  30359.6 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   1427 falcon    39  19  382780 138436  11992 R 299.3   0.4  32:48.80 FahCore_a7
   1515 falcon    39  19  529356 147708  12980 R 112.0   0.4  10:18.62 FahCore_a8
   1261 falcon    39  19 1662988 765912  67688 R  12.6   2.3   5:44.44 FahCore_22
    520 root      -2   0       0      0      0 S   0.7   0.0   0:06.11 comp_1.0.0
     61 root      20   0       0      0      0 I   0.3   0.0   0:00.50 kworker/1:1-events
    313 root      19  -1   59900  24932  23652 S   0.3   0.1   0:00.64 systemd-journal
Yeah, that's the 3-thread FahCore_a7 running fine, but the 4-thread FahCore_a8 is only getting the remaining unused core, while the GPU thread presumably gets the scrappy leftovers of the single utilized CPU socket?

Wut?

Where do I even begin to diagnose this?
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by PantherX »

Can you please post the first ~100 lines from your log file which will have the system configuration and your client setup so we can see how the client is configured and match it to how you wanted it to be configured?

While we wait for the log file, the optimum setup would be 1 GPU Slot with 1 CPU Slot having 7 CPUs (or 6 CPUs if you want to leave 1 CPU for the OS). However, the GPU Slot would only work in Windows/Linux while the CPU Slot would work in Windows/Linux/macOS.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
Neil-B
Posts: 1996
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by Neil-B »

A single 6- or 7-thread CPU slot may also outperform your two-slot approach, both in total PPD and in actual benefit to science - and possibly process more WUs per month than the two slots ... science benefits from quick turnaround of WUs, so even if overall you processed slightly fewer per month with one slot, they would all be returned quicker, which is a good thing.

... and apologies, I can't help troubleshoot the issue, as Macs (both the hardware and the OSs) are alien to me :(
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

(Green/Bold = Active)
FalconFour
Posts: 29
Joined: Fri Sep 05, 2008 11:57 am

Re: Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by FalconFour »

PantherX wrote:Can you please post the first ~100 lines from your log file which will have the system configuration and your client setup so we can see how the client is configured and match it to how you wanted it to be configured?

While we wait for the log file, the optimum setup would be 1 GPU Slot with 1 CPU Slot having 7 CPUs (or 6 CPUs if you want to leave 1 CPU for the OS). However, the GPU Slot would only work in Windows/Linux while the CPU Slot would work in Windows/Linux/macOS.
Sure! Here's that. I'm using remote FAHControl from a main Windows laptop, so the log tab is a little muddy (it doesn't seem to include all the detail from the System Info tab), but I think this segment has all the detail:

Code: Select all

09:01:50:FS00:Unpaused
09:01:50:WU01:FS00:Starting
09:01:50:WARNING:WU01:FS00:Changed SMP threads from 3 to 8 this can cause some work units to fail
09:01:50:WARNING:WU01:FS00:AS lowered CPUs from 8 to 3
09:01:50:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /home/falcon/fahclient/cores/cores.foldingathome.org/lin/64bit-sse2/a7-0.0.19/Core_a7.fah/FahCore_a7 -dir 01 -suffix 01 -version 706 -lifeline 1225 -checkpoint 15 -np 3
09:01:50:WU01:FS00:Started FahCore on PID 1423
09:01:50:WU01:FS00:Core PID:1427
09:01:50:WU01:FS00:FahCore 0xa7 started
09:01:51:WU01:FS00:0xa7:*********************** Log Started 2021-01-11T09:01:50Z ***********************
09:01:51:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
09:01:51:WU01:FS00:0xa7:       Type: 0xa7
09:01:51:WU01:FS00:0xa7:       Core: Gromacs
09:01:51:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 706 -lifeline 1423 -checkpoint 15 -np 3
09:01:51:WU01:FS00:0xa7:************************************ CBang *************************************
09:01:51:WU01:FS00:0xa7:       Date: Nov 27 2019
09:01:51:WU01:FS00:0xa7:       Time: 11:26:54
09:01:51:WU01:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
09:01:51:WU01:FS00:0xa7:     Branch: master
09:01:51:WU01:FS00:0xa7:   Compiler: GNU 8.3.0
09:01:51:WU01:FS00:0xa7:    Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
09:01:51:WU01:FS00:0xa7:             -fno-pie -fPIC
09:01:51:WU01:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
09:01:51:WU01:FS00:0xa7:       Bits: 64
09:01:51:WU01:FS00:0xa7:       Mode: Release
09:01:51:WU01:FS00:0xa7:************************************ System ************************************
09:01:51:WU01:FS00:0xa7:        CPU: Intel(R) Xeon(R) CPU X5355 @ 2.66GHz
09:01:51:WU01:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 15 Stepping 7
09:01:51:WU01:FS00:0xa7:       CPUs: 8
09:01:51:WU01:FS00:0xa7:     Memory: 31.36GiB
09:01:51:WU01:FS00:0xa7:Free Memory: 29.56GiB
09:01:51:WU01:FS00:0xa7:    Threads: POSIX_THREADS
09:01:51:WU01:FS00:0xa7: OS Version: 5.4
09:01:51:WU01:FS00:0xa7:Has Battery: false
09:01:51:WU01:FS00:0xa7: On Battery: false
09:01:51:WU01:FS00:0xa7: UTC Offset: -8
09:01:51:WU01:FS00:0xa7:        PID: 1427
09:01:51:WU01:FS00:0xa7:        CWD: /home/falcon/fahclient/work
09:01:51:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
09:01:51:WU01:FS00:0xa7:    Version: 0.0.19
09:01:51:WU01:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
09:01:51:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
09:01:51:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
09:01:51:WU01:FS00:0xa7:       Date: Nov 26 2019
09:01:51:WU01:FS00:0xa7:       Time: 00:41:43
09:01:51:WU01:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
09:01:51:WU01:FS00:0xa7:     Branch: master
09:01:51:WU01:FS00:0xa7:   Compiler: GNU 8.3.0
09:01:51:WU01:FS00:0xa7:    Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
09:01:51:WU01:FS00:0xa7:             -fno-pie
09:01:51:WU01:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
09:01:51:WU01:FS00:0xa7:       Bits: 64
09:01:51:WU01:FS00:0xa7:       Mode: Release
09:01:51:WU01:FS00:0xa7:************************************ Build *************************************
09:01:51:WU01:FS00:0xa7:       SIMD: sse2
09:01:51:WU01:FS00:0xa7:********************************************************************************
09:01:51:WU01:FS00:0xa7:Project: 16927 (Run 0, Clone 819, Gen 54)
09:01:51:WU01:FS00:0xa7:Unit: 0x00000000000000000000000000000000
09:01:51:WU01:FS00:0xa7:Digital signatures verified
09:01:51:WU01:FS00:0xa7:Calling: mdrun -s frame54.tpr -o frame54.trr -cpi state.cpt -cpt 15 -nt 3
09:01:52:WU01:FS00:0xa7:Steps: first=27000000 total=500000
09:01:53:WU01:FS00:0xa7:Completed 463127 out of 500000 steps (92%)
09:02:18:Saving configuration to config.xml
09:02:18:<config>
09:02:18:  <!-- Folding Core -->
09:02:18:  <core-priority v='low'/>
09:02:18:
09:02:18:  <!-- HTTP Server -->
09:02:18:  <allow v='127.0.0.1 192.168.86.0/24 10.0.0.0/8'/>
09:02:18:
09:02:18:  <!-- Network -->
09:02:18:  <proxy v=':8080'/>
09:02:18:
09:02:18:  <!-- Remote Command Server -->
09:02:18:  <command-allow-no-pass v='127.0.0.1 192.168.86.0/24 10.0.0.0/8'/>
09:02:18:
09:02:18:  <!-- Slot Control -->
09:02:18:  <pause-on-battery v='false'/>
09:02:18:  <power v='full'/>
09:02:18:
09:02:18:  <!-- User Information -->
09:02:18:  <passkey v='*****'/>
09:02:18:  <team v='245782'/>
09:02:18:  <user v='FalconFour'/>
09:02:18:
09:02:18:  <!-- Folding Slots -->
09:02:18:  <slot id='0' type='CPU'>
09:02:18:    <cpus v='8'/>
09:02:18:  </slot>
09:02:18:  <slot id='1' type='GPU'>
09:02:18:    <pci-bus v='8'/>
09:02:18:    <pci-slot v='0'/>
09:02:18:  </slot>
09:02:18:  <slot id='2' type='CPU'>
09:02:18:    <cpus v='4'/>
09:02:18:    <paused v='true'/>
09:02:18:  </slot>
09:02:18:</config>
09:03:40:FS02:Unpaused
09:03:40:WU00:FS02:Starting
09:03:40:WU00:FS02:Running FahCore: /usr/bin/FAHCoreWrapper /home/falcon/fahclient/cores/cores.foldingathome.org/lin/64bit-sse2/a8-0.0.9/Core_a8.fah/FahCore_a8 -dir 00 -suffix 01 -version 706 -lifeline 1225 -checkpoint 15 -np 4
09:03:40:WU00:FS02:Started FahCore on PID 1511
09:03:40:WU00:FS02:Core PID:1515
09:03:40:WU00:FS02:FahCore 0xa8 started
In the above, I tweaked the 3-core slot to 8-core, but found that it still only used 3 cores (probably because the WU was downloaded/started that way). Strange warning at the top, though.

I'm thinking I might just try Windows 10 on this thing next and see if it performs more smoothly. I use Linux on my dedicated F@H PCs because of the efficiency improvement - I'd seen a 10-20% boost under Linux compared with the same hardware in Windows, so it was worth the hassle.

I know newer hardware would of course be faster, but old hardware doesn't just despawn from existence when it's replaced ;) I also have an nVidia RTX 3060ti running in the living room PC crunching out over 10x the PPD of this thing. It's not about points, it's about converting electricity to heat while also doing useful work! Better this way than a room heater doing no work...

Other PCs I've got going are a ThinkPad P52 with Quadro P2000 (Linux), a gaming laptop with GF 960m (Linux), another laptop with an R9 Fury X Thunderbolt eGPU (Windows), a board-sitting-on-a-bench with an RX 580 (Linux), another room heater with a GTX 760 (Linux), and the VR PC with the RTX 3060ti (Windows). I try to heat the house with science :lol:
Joe_H
Site Admin
Posts: 7926
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by Joe_H »

To me it looks like your Linux kernel does not have scheduler optimizations for the hardware architecture of the Mac Pro logic board and its processors. Someone may or may not have done such an optimization.

In trying to avoid NUMA issues, you may have overthought the situation. I would just set up a slot for 6 or 7 CPU threads and be done with it. Then the scheduler will "know" it has that many threads that should run on different cores from a single process instead of two independent ones.

As for the CPU thread count on current WUs, once downloaded at a certain number of threads they will not use more than that number, but can use less. The messages in the log are partly a holdover from older code where the thread count change could be a problem. The ones with "AS" identified as the reason are where that download thread limit is being expressed.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by PantherX »

FYI, when using remote FAHControl, make sure that you untick "Follow" and then click "Refresh" which will pull the entire log from the remote system which will include the first 100 lines.

Joe_H has correctly stated the events but if you wanted a breakdown, here it is:
This means that you have modified the CPU Slot from 3 CPUs to 8 CPUs. Some decommissioned FahCores would fail but in this case, it won't fail:
09:01:50:WARNING:WU01:FS00:Changed SMP threads from 3 to 8 this can cause some work units to fail

This means that since the WU was downloaded with 3 CPUs, it will not use 8 CPUs, instead it will continue to use 3 CPUs and the new WU would use 8 CPUs:
09:01:50:WARNING:WU01:FS00:AS lowered CPUs from 8 to 3
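
To spot these thread-count warnings in a long log without reading every line, a small filter can help. This is a rough sketch, assuming the warning wording shown in the two lines above stays the same across client versions; the pattern and the function name thread_changes are illustrative and not part of the client.

```python
import re

# Matches the two warning forms quoted in this thread (assumed stable wording).
THREAD_CHANGE = re.compile(
    r"WARNING:WU(?P<wu>\d+):FS(?P<slot>\d+):"
    r"(?P<kind>Changed SMP threads|AS lowered CPUs) from (?P<old>\d+) to (?P<new>\d+)"
)


def thread_changes(log_text):
    """Return (slot, kind, old, new) for each thread-count warning in the log."""
    return [
        (int(m["slot"]), m["kind"], int(m["old"]), int(m["new"]))
        for m in THREAD_CHANGE.finditer(log_text)
    ]


LOG = """\
09:01:50:WARNING:WU01:FS00:Changed SMP threads from 3 to 8 this can cause some work units to fail
09:01:50:WARNING:WU01:FS00:AS lowered CPUs from 8 to 3
"""

for slot, kind, old, new in thread_changes(LOG):
    print(f"slot {slot}: {kind}: {old} -> {new}")
```

An "AS lowered CPUs" entry whose new value is below the slot setting means the running WU is still capped at the thread count it was downloaded with.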

BTW, since you have a GPU Slot, it would be ideal to have 1 CPU Slot and 1 GPU Slot, so this is what your new configuration would look like:

Code: Select all

<config>
  <!-- Folding Core -->
  <core-priority v='low'/>

  <!-- HTTP Server -->
  <allow v='127.0.0.1 192.168.86.0/24 10.0.0.0/8'/>

  <!-- Network -->
  <proxy v=':8080'/>

  <!-- Remote Command Server -->
  <command-allow-no-pass v='127.0.0.1 192.168.86.0/24 10.0.0.0/8'/>

  <!-- Slot Control -->
  <pause-on-battery v='false'/>
  <power v='full'/>

  <!-- User Information -->
  <passkey v='*****'/>
  <team v='245782'/>
  <user v='FalconFour'/>

  <!-- Folding Slots -->
  <slot id='0' type='CPU'>
    <cpus v='7'/>
  </slot>
  <slot id='1' type='GPU'>
    <pci-bus v='8'/>
    <pci-slot v='0'/>
  </slot>
</config>

1 CPU is required for the GPU regardless of OS. Linux would net you more points than Windows due to optimizations and lack of overhead. Plus, since this is a dedicated folding system, it would be a set-and-forget approach.

BTW, to get even more points, you may want to use CPU affinity, where you set the affinity of FahCore_a7/FahCore_a8 to CPUs 0-6 and of FahCore_21/FahCore_22 to CPU 7. On Windows, you can use Process Lasso, but I'm not sure what the equivalent would be on Linux (assuming that's possible).
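
On Linux, affinity can be set with the taskset utility (for example, taskset -a -cp 0-6 followed by a PID) or from Python with os.sched_setaffinity. The sketch below is one possible approach, not something the client provides: the helper names parse_cpu_list, pids_named and pin are hypothetical, and it assumes the core processes show up under /proc with the command names seen in the top output above.

```python
import os


def parse_cpu_list(spec):
    """Expand a CPU list such as "0-6" or "0-2,7" into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        first, _, last = part.strip().partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def pids_named(prefix):
    """Find PIDs whose command name (/proc/<pid>/comm) starts with prefix."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as fh:
                if fh.read().strip().startswith(prefix):
                    pids.append(int(entry))
        except OSError:
            continue  # process exited while we were scanning
    return pids


def pin(prefix, spec):
    """Pin every thread of each matching process to the given CPUs (Linux only)."""
    cpus = parse_cpu_list(spec)
    for pid in pids_named(prefix):
        try:
            # Affinity is per thread on Linux, so walk all threads of the process.
            for tid in os.listdir(f"/proc/{pid}/task"):
                os.sched_setaffinity(int(tid), cpus)
        except OSError:
            continue  # process or thread went away, or permission was denied


# Example (not run here): pin("FahCore_a", "0-6"); pin("FahCore_2", "7")
```

The pinning would have to be repeated whenever a new core process starts. On this particular machine the logical CPU numbers are interleaved between the two sockets, so a range like 0-6 does not line up with socket boundaries.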
FalconFour
Posts: 29
Joined: Fri Sep 05, 2008 11:57 am

Re: Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by FalconFour »

Yup, sure enough, the problem was me overthinking it. Know too much (on-die L2, etc)... assume mistakes will be made (lots of cross-socket interaction that's unnecessary)... make a mistake trying to prevent a mistake (one WU per socket, easy right?). Oof.

I let both slots finish their WUs today then pause, wait for me to get home and reconfigure it. Deleted the 4-thread one and left the 8-core one. It downloaded an 8-core WU and all 8 cores are firing constantly now.

Now I just need to keep an eye on if it's stepping on the GPU WU's toes, and maybe back it off to 7-thread, but it seems like the GPU WU is taking about 17% of a core on average and behaving nicely. I'll keep an eye on PPD for signs of problems.

Thanks for the tip, that got me on track!

Oh, and as to the log, I didn't even see those controls in the bottom-right corner until I looked for the follow/refresh controls. Nice! That'll be handy for the future.
Neil-B
Posts: 1996
Joined: Sun Mar 22, 2020 5:52 pm

Re: Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by Neil-B »

Even if it looks as if you have headroom on the CPU, I really would suggest running it for a while with the CPU slot set at each thread count and monitoring the GPU and overall PPD ... most people find that without a full thread available to service the GPU, the GPU slot underperforms - and the significance of that can vary from WU to WU, dependent upon how each project uses the GPU ... traditional wisdom would say 7 threads max (leaving one for the GPU) - maybe 6 to avoid certain issues with old CPU folding cores and to leave extra headroom for the system (the GPU can have spiky CPU load at checkpoints, and maxing out the CPU at that time can slow throughput) ... but if 8 works and produces the best throughput for you, hey, great :) - and you are running Linux with, I believe, an AMD GPU, so you are maximising your chances of this
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by bruce »

FalconFour wrote: I let both slots finish their WUs today then pause, wait for me to get home and reconfigure it. Deleted the 4-thread one and left the 8-core one. It downloaded an 8-core WU and all 8 cores are firing constantly now.

Now I just need to keep an eye on if it's stepping on the GPU WU's toes, and maybe back it off to 7-thread, but it seems like the GPU WU is taking about 17% of a core on average and behaving nicely. I'll keep an eye on PPD for signs of problems.
Another tip: Assignments for FAHCore_a7 and for FAHCore_a8 often will come with different optimums and keeping both types straight (or straightening them out) would drive me crazy. Managing everything manually is a hassle you don't need. Over the long-haul, you probably can't improve the default choices by very much.
FalconFour
Posts: 29
Joined: Fri Sep 05, 2008 11:57 am

Re: Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by FalconFour »

Update: I found that the 8-thread CPU WU was so dramatically under-performing, it wasn't even moving. It made 3% progress overnight on a WU with 1360 base credit. Um... yikes?

The GPU slot worked admirably as expected, same performance as usual, but that 8-thread WU (on a7 core) was just not moving - had a 9-day ETA after it ran overnight (2 hour 12 min TPF). Typical WUs on this thing take 16-odd hours (though I've only been running it a few days, enough to get a feel for it). Something is seriously wrong, still, but I don't know that it can be fixed.

I bumped that WU down to 4 threads for the night, to see if things improve (counterintuitively). I'll report back in the morning if I remember/have time before work.

edit: yep, reduced to 4 threads and now it's about 9 minutes TPF instead of 2h12m.
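
The two TPF figures can be turned into whole-WU times with a little arithmetic. This sketch assumes a WU is split into 100 frames of 1% each, which is how TPF is normally read; the function name full_wu_time is just for illustration.

```python
from datetime import timedelta

FRAMES_PER_WU = 100  # assumption: one frame = 1% of the work unit


def full_wu_time(tpf):
    """Time to fold a whole WU at a constant time-per-frame."""
    return tpf * FRAMES_PER_WU


slow = full_wu_time(timedelta(hours=2, minutes=12))  # 8 threads
fast = full_wu_time(timedelta(minutes=9))            # 4 threads

print(slow)                    # 9 days, 4:00:00
print(fast)                    # 15:00:00
print(f"{slow / fast:.1f}x")   # 14.7x
```

That lines up with the 9-day ETA seen at 8 threads and the 16-odd hours that are typical for this machine.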
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by PantherX »

FalconFour wrote:Update: I found that the 8-thread CPU WU was so dramatically under-performing, it wasn't even moving. It made 3% progress overnight on a WU with 1360 base credit. Um... yikes?...
Actually, I am not surprised by this behavior since this is the expected result when you have over-subscribed your system. This is how you have allocated your resources:
CPU Slot: 8 CPUs
GPU Slot: 1 CPU
Total: 9 CPUs, but you only have 8 CPUs; thus, you have over-subscribed your system.
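
The same tally can be written as a quick check for any slot layout. This is a sketch only: it assumes each GPU slot needs one full CPU thread, as stated in this thread, and the function name oversubscribed is hypothetical.

```python
def oversubscribed(cpu_slot_threads, gpu_slots, logical_cpus, threads_per_gpu=1):
    """Return (threads demanded, True if demand exceeds the logical CPUs)."""
    demand = sum(cpu_slot_threads) + gpu_slots * threads_per_gpu
    return demand, demand > logical_cpus


for threads in (8, 7, 6):
    demand, over = oversubscribed([threads], gpu_slots=1, logical_cpus=8)
    state = "over-subscribed" if over else "fits"
    print(f"CPU slot with {threads} threads + 1 GPU slot: {demand} of 8 -> {state}")
```

With the original two-slot layout (3 and 4 threads plus the GPU), the demand comes to exactly 8, so that setup was not over-subscribed by this count.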
FalconFour wrote:...The GPU slot worked admirably as expected, same performance as usual, but that 8-thread WU (on a7 core) was just not moving - had a 9-day ETA after it ran overnight (2 hour 12 min TPF). Typical WUs on this thing take 16-odd hours (though I've only been running it a few days, enough to get a feel for it). Something is seriously wrong, still, but I don't know that it can be fixed...
Good to know that the GPU Slot wasn't impacted. My reasoning would be that the OS (via the drivers) gives the GPU a higher priority than the CPU work; hence, no impact. The solution is to reduce the CPU Slot from 8 to 7 (or 6) CPUs.
FalconFour wrote:...edit: yep, reduced to 4 threads and now it's about 9 minutes TPF instead of 2h12m.
Since you have the TPF with 4 CPUs, I would suggest that you pause the WU, bump up the CPU count to 6 or 7, and then resume folding. Wait for at least 4% progress before reading the TPF. Go with whichever setting gives the lowest TPF :)
Neil-B
Posts: 1996
Joined: Sun Mar 22, 2020 5:52 pm

Re: Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by Neil-B »

... and that last bit will only work immediately for the WU that was downloaded at 8 threads and reduced to 4 - if another WU has been downloaded at 4 since then, it won't utilise the increased core count until the next WU is downloaded :)
FalconFour
Posts: 29
Joined: Fri Sep 05, 2008 11:57 am

Re: Issue with multi-socket MacPro1,1 in Ubuntu 20.04.1

Post by FalconFour »

Well, see... OK, I'll try with 7, but I'm finding really erratic behavior with this thing. If I were to plot its progress % over time on a chart while processing the same WU with nothing else on the system but that running, it'd look like a mountain range. Other PCs seem to be pretty linear. Hard to explain it, except to think that it's a NUMA issue (though this being an LGA771 CPU with offboard memory controller, doesn't the chipset host both sockets with uniform memory access?), as the system drifts from work on a single socket (fast) to splitting between sockets (slow to a near halt). I'd imagine it's L2 cache related in that case.
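
Whether the kernel actually treats this box as NUMA can be checked from sysfs. The sketch below assumes a kernel that exposes /sys/devices/system/node (NUMA-enabled builds do); the function name numa_nodes is made up for illustration. If only node0 shows up, the kernel sees uniform memory, which would point away from NUMA proper and more toward cache or bus effects between the sockets.

```python
from pathlib import Path


def numa_nodes(base="/sys/devices/system/node"):
    """Map NUMA node name -> CPU list string, e.g. {"node0": "0-7"}."""
    root = Path(base)
    if not root.is_dir():
        return {}  # no NUMA information exposed (or not Linux)
    nodes = {}
    for cpulist in sorted(root.glob("node[0-9]*/cpulist")):
        nodes[cpulist.parent.name] = cpulist.read_text().strip()
    return nodes


nodes = numa_nodes()
if len(nodes) > 1:
    print("Multiple NUMA nodes:", nodes)
else:
    print("Single node or no NUMA info:", nodes)
```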

This previous observation of erratic performance was seen with just a single 4-thread WU + GPU WU, not the 8-thread experiment.

Also to note: during a period where it's having a slowed-to-a-crawl mind freeze, it also seems that the WU has completely hung - if I try exiting F@H either with a pause command or a Ctrl+C, it actually takes >30sec to stop, and FAHClient seems to do a hard "kill" on it. Strange, but maybe a clue there. Other times, when it's not frozen, it'll just pause/resume/reconfigure normally. It's totally unpredictable when it's stuck or not.

I'll give it a go with bumping it to 7 threads for the next WU, which I suppose is the optimal default, and see how it goes.

edit: I bumped my existing 4-thread WU from 4 to 7, and it reconfigured itself with 6 threads without redownloading a WU. It's flying along at 2309 PPD and 8:29 TPF. I also heard the fans crank a little higher. We'll see how that holds up overnight, I guess? Weather has warmed up recently, so I'm only running this PC when it's quite cold at night, so it's a bit slow to see results.

update: OK, it's really ripping along now with 7 threads (configured as "-1" for auto). I guess this is all just a long way to say to just let it do auto. 62520 PPD total and 11856 PPD on CPU alone. I'll take it, though GPU sure seems to be underperforming at the moment with its current WU (currently 21hr ETA). Oh well. Happy with it for now!