Lack of power after one WU

If you think it might be a driver problem, see viewforum.php?f=79

Moderators: Site Moderators, FAHC Science Team

Medora
Posts: 8
Joined: Sun Nov 24, 2024 9:37 am
Hardware configuration: Intel Core i5 11600K
16 GB DDR4 RAM
ASRock H510 Pro BTC+
AMD RX Vega 64
AMD RX Vega 64
Nvidia RTX2080 Ti
Nvidia Tesla P100
2*650W bequiet! PurePower
Windows 10
Location: France

Lack of power after one WU

Post by Medora »

Hello,

I'm suddenly seeing something weird with my rig.
It's composed of 2x AMD RX Vega 64, 1x Nvidia RTX 2080 Ti and 1x Nvidia Tesla P100.
When I start FAH everything works well: I get WUs, PPD is consistent, and the cards' power consumption and temperatures are OK.
But the problem starts after the first WU, when they begin the second one.
The 2 AMD cards continue to work, but only at about 5-10% (I can see it directly on the external tachometer and the temperature meter).
They get WUs and work on them, but take days to finish what they normally do in hours, so the resulting estimated PPD is really bad.
On the 2 Nvidia cards it's harder to say whether they have the same issue. It seems not, because temperature, power consumption and computing speed are OK, but PPD seems to be impacted too.
The only way I have found so far to get back to full power is to click "Pause" then "Fold", and then everything works well again for one WU.


I'm not exactly sure which event triggered this malfunction, but I think it's related to a Windows 10 update followed by a reboot some weeks or months ago.
So I updated Windows again, updated the AMD drivers too, and installed FAH 8 instead of 7, but it's still the same.
I looked through the FAH logs without finding anything. Maybe I'm not searching for the right keyword.

Have you seen something like this?
Do you know if Windows has deployed some kind of anti-mining detection or something like that?

What do you think I should look at or try?



PS: I searched the forum without finding similar issues; sorry if I missed one.
calxalot
Site Moderator
Posts: 1273
Joined: Sat Dec 08, 2007 1:33 am
Location: San Francisco, CA

Re: Lack of power after one WU

Post by calxalot »

Windows abruptly kills the client and cores.

For now, you should quit folding from the sys tray before a logout or reboot.

A future v8 beta will fix this. Maybe 8.4.10.
calxalot

Re: Lack of power after one WU

Post by calxalot »

There may be other issues, like video driver resets.
I'm not a Windows or Linux expert.
Medora

Re: Lack of power after one WU

Post by Medora »

I paused folding, then quit FAH using the system tray and rebooted the system. We will see :)
But I'm pretty sure I already did that before.

I thought about the drivers at first too, because in the past I had some incompatibility after a Windows update, but this time updating them didn't solve the issue.
Medora

Re: Lack of power after one WU

Post by Medora »

It's still the same.
I tried selecting only the AMD cards in the options, but with no more success.

I will try keeping only the Nvidia cards to see whether it's better or whether they have the same problem.
Medora

Re: Lack of power after one WU

Post by Medora »

Hello,

I did some tests this week, but with no more success.

Do you know how I could write a small script to pause folding and restart it?
It could be a workaround.

I looked at the documentation, but I don't really understand how the API or the software works.
When I look at the config file, for example, it doesn't match what I'm actually running.
calxalot

Re: Lack of power after one WU

Post by calxalot »

Version 8.4.4+ includes a Python script, fahctl, but I don't see it on my Win 11 VM.
It is here:
https://github.com/FoldingAtHome/fah-cl ... pts/fahctl

You can try my Python utility, lufah, in PowerShell.
https://pypi.org/project/lufah/

Code:

python --version
pip install lufah
lufah --help

I don't know how to nicely script quitting a program on Windows.
The latest client should respond to a standard Win32 quit message.
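
A minimal sketch of the pause-then-fold workaround asked about above, assuming lufah is installed with the pip command shown and that its `pause` and `fold` subcommands act like the Pause and Fold buttons (confirm both with `lufah --help` before relying on this). The function name `pause_then_fold`, the 10-second wait, and the injectable `run` and `sleep` parameters are illustrative choices, not part of lufah or the FAH client.

```python
import subprocess
import time

# Assumption: "pause" and "fold" are lufah subcommands (check `lufah --help`).
LUFAH_COMMANDS = (["lufah", "pause"], ["lufah", "fold"])


def pause_then_fold(wait_seconds=10, run=subprocess.run, sleep=time.sleep):
    """Pause folding, wait, then resume; return the commands that were run."""
    executed = []
    for position, command in enumerate(LUFAH_COMMANDS):
        # check=True raises CalledProcessError if lufah exits with an error.
        run(command, check=True)
        executed.append(list(command))
        if position == 0:
            # Give the cores time to stop before asking them to fold again.
            sleep(wait_seconds)
    return executed


# Usage (hypothetical): call pause_then_fold() from a scheduled task.
```

This only automates the manual Pause/Fold click; it does not address whatever makes the cards slow down after the first WU.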
toTOW
Site Moderator
Posts: 6394
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Lack of power after one WU

Post by toTOW »

Cohabitation between nVidia and AMD GPUs in the same system has always been quite unpredictable ...

How do you check GPU loads? Don't use Windows Task Manager; use GPU-Z instead.

Can you post some logs taken when you're seeing the issue?
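
For the two Nvidia cards, load can also be logged from a script rather than read off a GUI. The sketch below is one possible approach, assuming `nvidia-smi` (installed with the Nvidia driver) is on the PATH; it does not cover the AMD cards. The function names and the dictionary keys are hypothetical, while the query fields are standard `nvidia-smi --query-gpu` properties.

```python
import csv
import io
import subprocess

# Standard nvidia-smi query fields; this covers the Nvidia cards only.
QUERY_FIELDS = "index,name,utilization.gpu,power.draw,temperature.gpu"


def to_float(cell):
    """Convert a CSV cell to float; return None for values such as [N/A]."""
    try:
        return float(cell)
    except ValueError:
        return None


def parse_gpu_csv(text):
    """Turn nvidia-smi CSV rows (no header, no units) into dictionaries."""
    readings = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) != 5:
            continue  # skip blank or unexpected lines
        index, name, util, power, temp = (cell.strip() for cell in row)
        readings.append({
            "index": int(index),
            "name": name,
            "utilization_pct": to_float(util),
            "power_w": to_float(power),
            "temperature_c": to_float(temp),
        })
    return readings


def read_nvidia_gpus(run=subprocess.run):
    """Query nvidia-smi once and return one reading per GPU."""
    result = run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)
```

Calling `read_nvidia_gpus()` at intervals and saving the readings with timestamps would show whether utilization drops when the second WU starts.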

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
Medora

Re: Lack of power after one WU

Post by Medora »

I will give v8.4.4+ a try when my current WUs are finished; it could be interesting to test :)

I check the GPU load with OCCT, where I can see power draw, temperature and GPU utilization.
The AMD cards have external LEDs showing the current GPU workload, and they are cooled by a blower, so I can hear it too ^^
I don't get meaningful data from GPU-Z for these AMD cards, only for the Nvidia ones.

I put the log file here.
muziqaz
Posts: 1056
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: Lack of power after one WU

Post by muziqaz »

While we are checking a few things, it is important to know that nVidia and AMD drivers on the same system at the same time have never worked too well, if at all. One overrides the other, or they try to neutralize each other. It is best to put the AMD GPUs in their own system and the nVidia GPUs in their own.
Also, it is important to ask: what PSU is powering this Frankenstein?
2x Vega 64s are 600W.
Roughly the same for the 2 nVidia cards.
But this is a driver issue for sure. Leave only the 2 AMD cards in and try folding to see how they do, then take out the AMD cards, put in only the nVidia cards, and see how those do. It is possible that when using AMD cards only you will need to remove the nVidia drivers, and when using nVidia cards only you will need to remove the AMD drivers and install the nVidia drivers.
Linux might give better results with this combination.
FAH Omega tester
Medora

Re: Lack of power after one WU

Post by Medora »

I installed the v8.4.9 beta. I will give it the night to see how it goes.

Yes, that is the last test I had in mind: removing the Nvidia cards to see what happens.
I haven't done it yet because I didn't want to stop and open the case, but I will do it if it's still not working.

It's powered by 2 PSUs of 650W each.
If I open the case I could try to balance the load better by changing which motherboard slots the cards are in.
But it was working well before a Windows update, so ...
muziqaz

Re: Lack of power after one WU

Post by muziqaz »

Ehm, I think your PSUs are not enough, especially for FAH, and especially in this funky setup. Is it some sort of server, though? Can you post full system specs?
FAH Omega tester
Medora

Re: Lack of power after one WU

Post by Medora »

It's in my description :)

Intel Core i5 11600K
16 GB DDR4 RAM
ASRock H510 Pro BTC+
AMD RX Vega 64
AMD RX Vega 64
Nvidia RTX2080 Ti
Nvidia Tesla P100
2*650W bequiet! PurePower 12M 80+ Gold
Windows 10

Why?
muziqaz wrote: Sun Dec 01, 2024 10:29 pm ... especially for fah. And especially in this funky set up ...
When I look at the max power consumption of the components I get RTX 2080 Ti (260W), P100 (200W), RX Vega 64 (150W each) and processor (100W), so a power budget of 860W out of 1300W available.

It's not a server, it's more of a custom heater.
My first goal is to replace the heater in my apartment :)
I only fold during cold periods.

muziqaz

Re: Lack of power after one WU

Post by muziqaz »

Vega 64 is 300W, and you have 2 of them.
Your CPU is around 200W, give or take, unless you underclocked it to oblivion or are not folding on it.
Besides, GPUs also take power from the PCIe slots, so one PSU is loaded more than the other.
While I admire the imagination, this setup is a ball ache to troubleshoot when things don't work out. There are so many things that might go wrong.
Like the PCIe slots for the bottom GPUs: what link speed are they running at?
Best bet: remove the AMD GPUs and fold on Nvidia only. More PPD, and it should work as it is supposed to.
And for the future, dual PSU is not ideal for multi-GPU setups. There are so many things to consider with this setup, I don't even know where to start :D You basically need to be a certified electrician to work it out ;)
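
The two power estimates in this thread can be compared with a short calculation. This is only a worked sum of the figures quoted by the posters, not a measurement. The second scenario is an illustrative mix (an assumption for comparison): it uses the 300W per Vega 64 and 200W CPU figures while keeping the earlier Nvidia figures unchanged. The names `budget`, `medora_estimate_w` and `muziqaz_estimate_w` are made up for this sketch.

```python
PSU_CAPACITY_W = 2 * 650  # two 650W supplies, as stated in the thread

# Per-device maximums as quoted in Medora's post.
medora_estimate_w = {
    "RTX 2080 Ti": 260,
    "Tesla P100": 200,
    "RX Vega 64 #1": 150,
    "RX Vega 64 #2": 150,
    "CPU": 100,
}

# Assumption for comparison: muziqaz's Vega 64 and CPU figures,
# with the Nvidia figures left as in Medora's list.
muziqaz_estimate_w = {
    **medora_estimate_w,
    "RX Vega 64 #1": 300,
    "RX Vega 64 #2": 300,
    "CPU": 200,
}


def budget(loads_w, capacity_w=PSU_CAPACITY_W):
    """Return (total draw, headroom, percent of capacity used)."""
    total = sum(loads_w.values())
    return total, capacity_w - total, round(100 * total / capacity_w, 1)


for label, loads in (("Medora", medora_estimate_w),
                     ("muziqaz", muziqaz_estimate_w)):
    total, headroom, percent = budget(loads)
    print(f"{label}: {total}W of {PSU_CAPACITY_W}W ({percent}%), {headroom}W headroom")
```

The first set of figures totals 860W and leaves 440W of headroom; the second totals 1260W and leaves 40W. Neither total says how the load is split between the two PSUs, which is a separate concern.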
FAH Omega tester
Medora

Re: Lack of power after one WU

Post by Medora »

Maybe, but that's not what is measured while folding.
They never go over 150W.

And I'm pretty sure about this point for 2 reasons:
- It works well when I stop and start again, but only for 1 WU
- I have a power sensor on my EDF Linky meter sending data in real time to my Home Assistant, and when all cards run we hit 850-900VA of consumption.

I'm not using the CPU to fold, only the GPUs.

All PCIe slots are set to x1.

No need to be an electrician, you just need to be able to read the motherboard documentation ;)
It's a motherboard designed for bitcoin mining, with 6 GPU slots and support for 3 PSUs.

You can check the documentation here if you want to :)
https://www.asrock.com/MB/Intel/H510%20 ... p#Overview
