Project 18251 very low PPD on RTX 2060s
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 66
- Joined: Wed Mar 18, 2020 2:55 pm
- Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3)
ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3)
Dell GTX 1080
- Location: Sydney Australia
Re: Project 18251 very low PPD on RTX 2060s
Further to the above, just now when doing my rounds I found four of the five devices running this weekend to be dealing with 18251s, and the fifth (Z805/GTX 1080) is protected from them by a deliberate shortage of RAM. After an invigorating climb to the attic I found Z803/GTX 1080 repeating its 1.5M PPD performance with a TPF of about 4 minutes, so I fired up the HP Performance Advisor and went to the "Manage CPU affinity" option. All current FAH cores are allowed to run on all 24 logical processors, and overall usage was under 10% on both CPUs. I waited for the next 5% checkpoint to arrive, and all 24 spun up to 100% for maybe 10 seconds at most.
The other Donor box running up there was Z443/RTX 2060(TU106), which differs from Z441 and Z442 in having an E5-1620 V3 with 4C 8T rather than an E5-1650 V3 with 6C 12T. In the usual 2060(TU106) fashion with 18251s it was battling towards 0.6M PPD with a TPF of just under 8 minutes, and its 8 logical processors were averaging about 16% utilisation. When the next 5% checkpoint finally arrived the 8 logical processors spun up to 100% for longer - maybe 20-30 seconds.
Clearly, 20 repetitions of high CPU use lasting well under a minute each mean that differing CPU speed is unlikely to be a major factor. And by the way, with all those high-end GPUs out there, how come four of my five elderly devices get 18251s when there's nothing much Green between a 4090 and a 2060 among the NVIDIA GPUs at LAR Systems for this project?
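To put a rough number on that, here is a back-of-envelope sketch in Python. The figures are just the observations above, plus the usual assumption of 100 frames per WU:
Code: Select all
# Back-of-envelope: how much of a WU's runtime is the CPU-bound
# checkpoint work? Figures taken from the observations above.
checkpoints = 20            # one every 5%
cpu_burst_s = 30            # worst case seen on the E5-1620 V3
tpf_s = 8 * 60              # ~8 min TPF on the 2060
wu_runtime_s = 100 * tpf_s  # assuming 100 frames per WU

cpu_share = checkpoints * cpu_burst_s / wu_runtime_s
print(f"CPU-bound share of runtime: {cpu_share:.1%}")  # ~1.2%
So even the worst case is around 1% of the WU's runtime, which is why CPU speed can't explain an 8-minute TPF.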
-
- Posts: 66
- Joined: Wed Mar 18, 2020 2:55 pm
- Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3)
ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3)
Dell GTX 1080
- Location: Sydney Australia
Re: Project 18251 very low PPD on RTX 2060s
I have finally invented a protocol for keeping 18251 jobs from wasting the time of my 2060s. Part of it is balanced by allocating Z803 and its GTX 1080 to prefer Alzheimer's Disease projects, since the GTX 1080 in Z803 folds 18251s at its normal rate (4 min TPF, about 1.5M PPD), whereas the 2060s run around TPF 8 min and 0.6M PPD or less on 18251s, i.e. one-third or less of their usual performance of ~2M PPD.
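For anyone wondering how TPF maps to those PPD figures, here is a rough estimator built on the standard quick-return-bonus formula, credit = base × max(1, sqrt(k × deadline / elapsed)). The base credit of 118,000 comes from the LAR listing further down; the k-factor and deadline used here are placeholders, not the project's real values:
Code: Select all
import math

def est_ppd(tpf_min, base_credit, k, deadline_days, frames=100):
    """Estimate PPD from TPF via the F@h quick-return bonus."""
    wu_days = tpf_min * frames / (60 * 24)      # days per WU
    bonus = max(1.0, math.sqrt(k * deadline_days / wu_days))
    return base_credit * bonus / wu_days        # credit x WUs/day

# k=0.75 and a 2-day deadline are illustrative assumptions only.
print(f"{est_ppd(4, 118_000, 0.75, 2):,.0f} PPD at TPF 4 min")
print(f"{est_ppd(8, 118_000, 0.75, 2):,.0f} PPD at TPF 8 min")
The point is the nonlinearity: doubling TPF cuts PPD to well under half, which is roughly the 1.5M-vs-0.6M pattern above.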
Since I am using the V7 client, and I had noticed that the 18251 downloads are large, I remembered a previous discussion at viewtopic.php?t=37313 and set the "expert" parameter max-packet-size to SMALL for the 2060 GPU slots, which seems to work just fine in excluding 18251s, though I am advised that this is a parameter from ye olde days of slow modems and only accidentally useful here. It also (unfortunately) biases my selection against other jobs with large downloads that would run normally, so I expect to remove it after 18251 goes away.
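For anyone wanting to try the same thing: in V7 the option sits per slot in config.xml (or in the slot's Extra Options in FAHControl). A sketch of what the slot entry might look like - the slot id is illustrative, and the valid values are small/normal/big:
Code: Select all
<slot id='1' type='GPU'>
  <max-packet-size v='small'/>
</slot>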
In one case where this failed, the "Nuke" option is to pause the job, delete the slot, and restart. This dumps the job quickly so it can be reassigned to a more compatible device before too much of anyone's time is wasted. I note that this protocol was developed the other evening when all three 2060s started their overnight runs with 18251 jobs that they could not complete within the 7 hours of "off peak" and 2 hours of "shoulder" electricity I allocate to them overnight, so they would need to be paused for 15 hours and end up taking more than 24 hours for little return.
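The "Nuke" can also be scripted against the V7 client's telnet interface on port 36330 instead of clicking through FAHControl. A rough sketch, assuming a local client with no remote password and the standard remote commands (pause, slot-delete):
Code: Select all
import socket

def fah_nuke(slot_id="01", host="127.0.0.1", port=36330):
    """Pause a slot, then delete it via the V7 telnet interface,
    so the current WU is dumped and can be reassigned."""
    with socket.create_connection((host, port), timeout=5) as s:
        for cmd in (f"pause {slot_id}", f"slot-delete {slot_id}"):
            s.sendall((cmd + "\n").encode())
        print(s.recv(4096).decode(errors="replace"))

# fah_nuke("01")  # afterwards, re-add the slot and unpause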
PS: Just added the GTX 1070 in Z602 to the Alzheimer's-preferred team. It immediately picked up an 18251 job and will run about 7 hours at TPF ~8 min for 1.2M PPD. So, normal behaviour on 18251s for a 1070.
-
- Posts: 1534
- Joined: Sun Dec 16, 2007 6:22 pm
- Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
- Location: London
Re: Project 18251 very low PPD on RTX 2060s
I asked the researcher to disable 18251 for NVIDIA species 7, which covers the 2060 and 2070 Mobile. Something in the 2060 was cut down too much by NVIDIA, so it simply gets choked by this project.
-
- Site Moderator
- Posts: 6421
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
Re: Project 18251 very low PPD on RTX 2060s
This project uses buggy code in OpenMM that prevents it from working well on newer GPUs ... this bug is fixed in Core26, but that core has other issues that must be fixed before moving to it.
If we disable NV species 7, it will also take down high-end GTX 1xxx GPUs that are doing fine on this project, and it might leave only very few systems eligible to run it ...

-
- Posts: 546
- Joined: Fri Apr 03, 2020 2:22 pm
- Hardware configuration: ASRock X370M PRO4
Ryzen 2400G APU
16 GB DDR4-3200
MSI GTX 1660 Super Gaming X
Re: Project 18251 very low PPD on RTX 2060s
But a reminder for the internal testers....
All the first work units ran at normal PPD estimates for my 1660 Super. Later work units increased TPF by almost 100%. I tend to think either something changed in the later runs of the project, or possibly something changed in the Windows 10 OS that caused the TPF increase.
Looking at LAR, I tend to think that the majority of Turing and later GPUs are impacted.
Fold them if you get them!
-
- Posts: 66
- Joined: Wed Mar 18, 2020 2:55 pm
- Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3)
ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3)
Dell GTX 1080
- Location: Sydney Australia
Re: Project 18251 very low PPD on RTX 2060s
After checking at https://apps.foldingathome.org/GPUs.txt I was surprised at the wide range of "Species 7", which includes 10xx, 20xx, and even 3050s and lower-class 3060s. I don't mind donating 1080 and 1070 time to 18251, though the Z600 and Z800 boxes and their GPUs are less energy-efficient than the Z440/2060 combinations, and I prefer to run them only now and then. So I agree that it might be difficult for the researchers if they had to exclude Species 7.
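A quick way to list everything in Species 7 for yourself - a sketch that assumes GPUs.txt keeps a colon-separated vendor:device:type:species:description layout (worth eyeballing a line or two first, since the field positions are my assumption):
Code: Select all
import urllib.request

URL = "https://apps.foldingathome.org/GPUs.txt"

# Assumed field layout: 0xVEND:0xDEV:TYPE:SPECIES:DESCRIPTION
for line in urllib.request.urlopen(URL).read().decode().splitlines():
    parts = line.split(":", 4)
    if len(parts) == 5 and parts[3].strip() == "7":
        print(parts[4].strip())  # species 7 of any vendor type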
On the Core 24 / Core 26 issue I note that 18251 has a doppelganger called 18252, with the same number of atoms and base credit, but Core 26 rather than Core 24, which shows up at https://folding.lar.systems/projects/ as:
18252  alzheimers      University of Pennsylvania  0x26  1,224,788  118,000
18251  alzheimers NEW  University of Pennsylvania  0x24  1,224,788  118,000
When we look for GPU performance on 18252 we find only 3 records, namely:
1  GeForce RTX 4060 Ti           AD106    9,425,296 PPD average
2  Radeon RX 6800/6800XT/6900XT  Navi 21  4,471,442 PPD average
3  GeForce RTX 2060              TU104    3,190,526 PPD average
I note that this is a TU104 2060 (cut down 2070/2080) rather than the regular TU106 versions that have problems, but does this mean that the Core26 version is already running?
-
- Posts: 1534
- Joined: Sun Dec 16, 2007 6:22 pm
- Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
- Location: London
Re: Project 18251 very low PPD on RTX 2060s
They are identical projects; one is just built for Core24 and the other for Core26.
The Core26 version runs great on all hardware. Unfortunately Core26 has a few snags, plus HIP is being added to it right now.
-
- Posts: 123
- Joined: Fri Apr 16, 2010 11:43 pm
- Hardware configuration: AMD 5800X3D Asus ROG Strix X570-E Gaming WiFi II bios 5031 G-Skill TridentZ Neo 3600mhz Asrock Tachi RX 7900XTX Corsair rm850x psu Asus PG32UQXR EK Elite 360 D-rgb aio Win 11pro/Kubuntu 2404.2 LTS UPS BX1500G
- Location: Galifrey
Re: Project 18251 very low PPD on RTX 2060s
I am seeing consistently long completion times with projects 18251 and 18238.
They abnormally take over 5 hours to complete. All other Core24-based projects have a normal average TPF of around 1m 08s and complete in around 2 hours (see the quick arithmetic after the tables below).
Could this be a regression or bug introduced in the ROCm 6.3.4 stack? What is different in the way these two projects' work units are configured that so severely impacts performance?
Code: Select all
Machine    Project  Core  Progress  TPF     PPD        Assign Time
Tardis-2L  18251    0x24  100.0%    2m 55s  2,506,039  10h 30m ago
Tardis-2L  18251    0x24  100.0%    3m 01s  2,374,681  1d 00h ago
Tardis-2L  18251    0x24  100.0%    3m 08s  2,256,210  3d 11h ago
Tardis-2L  18251    0x24  100.0%    3m 14s  2,145,675  3d 21h ago
Tardis-2L  18251    0x24  100.0%    3m 08s  2,239,834  4d 09h ago
Code: Select all
Machine    Project  Core  Progress  TPF     PPD        Assign Time
Tardis-2L  18238    0x24  87.3%     3m 56s  2,564,722  5h 44m ago
Tardis-2L  18238    0x24  100.0%    3m 43s  2,806,531  16h 48m ago
Tardis-2L  18238    0x24  100.0%    3m 48s  2,685,195  2d 03h ago
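The arithmetic behind those completion times, for anyone checking (assuming the usual 100 frames per WU):
Code: Select all
# Completion time from TPF at 100 frames per WU.
for label, tpf_s in (("18251/18238", 3 * 60), ("other 0x24", 68)):
    print(f"{label}: {100 * tpf_s / 3600:.1f} h per WU")
# -> 18251/18238: 5.0 h per WU; other 0x24: 1.9 h per WU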
-
- Posts: 1534
- Joined: Sun Dec 16, 2007 6:22 pm
- Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
- Location: London
Re: Project 18251 very low PPD on RTX 2060s
There are no regression bugs in 6.3.4. There have been many issues with OpenCL performance on AMD under Linux from the beginning of time, unfortunately.
On top of that, OpenCL is not getting any attention from AMD anymore. There is maybe one person tweaking a thing or two, if even that.
HIP is coming soon, so hopefully we can leave OpenCL behind.
-
- Posts: 123
- Joined: Fri Apr 16, 2010 11:43 pm
- Hardware configuration: AMD 5800X3D Asus ROG Strix X570-E Gaming WiFi II bios 5031 G-Skill TridentZ Neo 3600mhz Asrock Tachi RX 7900XTX Corsair rm850x psu Asus PG32UQXR EK Elite 360 D-rgb aio Win 11pro/Kubuntu 2404.2 LTS UPS BX1500G
- Location: Galifrey
Re: Project 18251 very low PPD on RTX 2060s
Can they be blacklisted on Linux for the affected AMD cards? Why slow those projects down by 3 hours when they can be processed more efficiently by the cards that are unaffected? A needless waste of electricity, imo.
-
- Posts: 1534
- Joined: Sun Dec 16, 2007 6:22 pm
- Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
- Location: London
Re: Project 18251 very low PPD on RTX 2060s
Constraints per GPU generation per OS are extremely tricky. At this point in time, this does not really matter: HIP is around the corner, and that WU runs badly on certain NVIDIA GPUs too, yet it still makes the deadline. As long as a WU makes the deadline, there are no wasted resources.

-
- Posts: 123
- Joined: Fri Apr 16, 2010 11:43 pm
- Hardware configuration: AMD 5800X3D Asus ROG Strix X570-E Gaming WiFi II bios 5031 G-Skill TridentZ Neo 3600mhz Asrock Tachi RX 7900XTX Corsair rm850x psu Asus PG32UQXR EK Elite 360 D-rgb aio Win 11pro/Kubuntu 2404.2 LTS UPS BX1500G
- Location: Galifrey
Re: Project 18251 very low PPD on RTX 2060s
Yeah, it's about the science.
Still, it's unfortunate when you get 3 in a row and at the end of 16 hours you've only processed 3 WUs when you know they should have taken only 6 hours.
HIP can't come soon enough.
The card is capable of so much more, so much more. Could go back to Windows, but I don't want to go!
Wait, that's Ten..... Fold on.