Search found 27 matches
- Sat Mar 01, 2025 8:56 am
- Forum: v8.4.xx Open Beta
- Topic: Confirmation before automatic dumping
- Replies: 6
- Views: 1991
Re: Confirmation before automatic dumping
Linux amdgpu can also reset and recover itself: [533240.381281] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:8 pasid:32770, for process FahCore_26 pid 479221 thread FahCore_26 pid 479221) [533240.381345] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000760000...
- Sat Mar 01, 2025 8:17 am
- Forum: GPU Projects and FahCores
- Topic: Core 0x22 is unable to detect OpenCL device (bad work unit)
- Replies: 15
- Views: 17233
Re: Core 0x22 is unable to detect OpenCL device (bad work unit)
Oops, I didn't check out cbang, just OpenMM (and only briefly). I assumed it might have been in libFAH, whatever that is.
- Sat Mar 01, 2025 8:14 am
- Forum: v8.4.xx Open Beta
- Topic: Confirmation before automatic dumping
- Replies: 6
- Views: 1991
Re: Confirmation before automatic dumping
It's not always possible to know what will reset the GPU. If it resets though then a reboot isn't needed, but yeah there are some cases where it crashes and is unable to reset and in those cases the WU would end up getting dumped pretty soon. Maybe I'll write and test a patch and if it works well, I...
- Sat Mar 01, 2025 7:17 am
- Forum: v8.4.xx Open Beta
- Topic: Confirmation before automatic dumping
- Replies: 6
- Views: 1991
Confirmation before automatic dumping
If I switch virtual terminals while folding and an OpenGL context is open, the GPU driver resets itself. When folding is subsequently paused, the client sends SIGINT to the running cores. The GPU core does not respond in time (since the GPU was ripped out under it causing it to become unresponsive),...
- Sat Mar 01, 2025 5:35 am
- Forum: GPU Projects and FahCores
- Topic: Core 0x22 is unable to detect OpenCL device (bad work unit)
- Replies: 15
- Views: 17233
Re: Core 0x22 is unable to detect OpenCL device (bad work unit)
Thank you for the WU! I can reproduce the issue with a chroot into a CentOS 7.9 image (CentOS-7-x86_64-Minimal-2009.iso which uses glibc 2.17). I copied the files to the chroot as well as the needed libraries (like ROCm libraries) and regenerated the ld.so.cache. I ran core22 under strace and checke...
- Sat Mar 01, 2025 3:27 am
- Forum: v8.4.9 Public Release for Windows / Linux / macOS
- Topic: Estimated PPD inflation after crash
- Replies: 1
- Views: 244
Estimated PPD inflation after crash
Shortly after changing my name and team affiliation, I accidentally put the system into an almost-unrecoverable state (doing things unrelated to FAH) and had to use sysrq-E to kill a non-responsive Xorg that wasn't letting me change VTs. To my surprise, fah-client was still running but at 100% CPU (...
- Fri Feb 28, 2025 8:42 am
- Forum: 3rd party contributed software
- Topic: lufah - Little Utility for FAH v8
- Replies: 15
- Views: 56464
Re: lufah - Little Utility for FAH v8
I only have one group at this time so I'm not sure but I think that having a total for all groups when there are multiple units would be nice. Btw the same could be done with showing totals for CPU and GPU.
- Fri Feb 28, 2025 8:16 am
- Forum: GPU Projects and FahCores
- Topic: Core 0x22 is unable to detect OpenCL device (bad work unit)
- Replies: 15
- Views: 17233
Re: Core 0x22 is unable to detect OpenCL device (bad work unit)
Any WU, even CPU WUs?
I just assumed that a WU typically run on a 4090 might require more VRAM than the 480M has available to it (though I hear FAH uses very little VRAM).
- Fri Feb 28, 2025 6:53 am
- Forum: 3rd party contributed software
- Topic: lufah - Little Utility for FAH v8
- Replies: 15
- Views: 56464
Re: lufah - Little Utility for FAH v8
It would be nice if it could display the total estimated PPD for a group. I didn't test this in many configurations and I'm not the best Python programmer but this patch works for me: diff --git a/src/lufah/commands/core/units.py b/src/lufah/commands/core/units.py index 6de1415..a01743f 100644 --- a...
- Fri Feb 28, 2025 6:26 am
- Forum: GPU Projects and FahCores
- Topic: Core 0x22 is unable to detect OpenCL device (bad work unit)
- Replies: 15
- Views: 17233
Re: Core 0x22 is unable to detect OpenCL device (bad work unit)
Ok! Running the core from command line is as simple as having the right current working directory with the right permissions and flags, with LD_LIBRARY_PATH set appropriately, and with a dummy process for its lifeline PID, right? I know how to extract the wudata_01.dat if the core doesn't extract it ...
- Fri Feb 28, 2025 4:49 am
- Forum: Discussions of General-FAH topics
- Topic: Beta WUs and bug reports
- Replies: 1
- Views: 2939
Beta WUs and bug reports
I'm aware that support questions for beta WUs are only allowed for folders who are part of the beta team, but that non beta team members are allowed to run beta WUs anyway but are "on their own". If a beta WU crashes (and I know that the crash is not obvious PEBKAC), would it be allowed or...
- Fri Feb 28, 2025 3:44 am
- Forum: Q&A about unsupported distros of Linux
- Topic: How to use cpulimit [SOLVED]
- Replies: 4
- Views: 15821
Re: How to use cpulimit [SOLVED]
If you don't want to fold on fewer cores, then it's better to use cgroups with cpu.max instead of cpulimit.
https://www.kernel.org/doc/html/latest/ ... up-v2.html
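As a rough illustration of the cpu.max suggestion: in cgroup v2, cpu.max holds two values, a quota and a period in microseconds (the default is "max 100000"), and the group may use at most quota/period of CPU time across all of its cores. The helper below is only a sketch; the function name and the cgroup path in the closing comment are hypothetical, not anything shipped with FAH.

```python
def cpu_max_value(cpu_fraction: float, cores: int, period_us: int = 100_000) -> str:
    """Build the "<quota> <period>" string for a cgroup v2 cpu.max file.

    cpu_fraction is the share of each core the group may use (0 < f <= 1),
    cores is how many cores the job keeps running on.
    """
    if not 0 < cpu_fraction <= 1:
        raise ValueError("cpu_fraction must be in (0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # The quota is total CPU time per period, summed over all cores.
    quota_us = round(cpu_fraction * cores * period_us)
    return f"{quota_us} {period_us}"


# Example: keep folding on 8 cores but cap the group at 50% of each.
print(cpu_max_value(0.5, 8))

# Applying it would mean writing the string to the group's cpu.max file,
# for example (hypothetical path, needs suitable permissions):
#   /sys/fs/cgroup/fah/cpu.max
```

Unlike cpulimit, which repeatedly stops and resumes a process, the kernel scheduler enforces this limit for every process in the group.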
- Fri Feb 28, 2025 3:38 am
- Forum: GPU Projects and FahCores
- Topic: Core 0x22 is unable to detect OpenCL device (bad work unit)
- Replies: 15
- Views: 17233
Re: Core 0x22 is unable to detect OpenCL device (bad work unit)
I would like to test this to find some workarounds (maybe it's as simple as putting glibc version 2.17 as libc.so.6 in the core directory so LD_LIBRARY_PATH picks it up). Then I don't have to keep failing WUs unnecessarily. Where can I get an expired core22 WU to play with? I'll run the core manuall...
- Wed Feb 26, 2025 6:12 am
- Forum: GPU Projects and FahCores
- Topic: Core 0x22 is unable to detect OpenCL device (bad work unit)
- Replies: 15
- Views: 17233
Re: Core 0x22 is unable to detect OpenCL device (bad work unit)
This is Debian stable (bookworm) with glibc 2.36 (2.36-9+deb12u9) and kernel linux-image-6.1.0-29-amd64 (6.1.123-1). Far from the latest.
What versions are incompatible? If Debian stable is too recent, that would make most Linux distros too recent as well.
- Wed Feb 26, 2025 4:33 am
- Forum: GPU Projects and FahCores
- Topic: Core 0x22 is unable to detect OpenCL device (bad work unit)
- Replies: 15
- Views: 17233
Re: Core 0x22 is unable to detect OpenCL device (bad work unit)
It looks like this might be related to https://github.com/FoldingAtHome/fah-client-bastet/issues/245 which concludes that it is not a v8 issue but a core 0x22 issue. The last followup on that issue says: Just the follow up. core22 is folding on v7 and v8 on kubuntu. So it is safe to say Mint Linux m...