Search found 61 matches
- Thu Nov 07, 2024 8:34 pm
- Forum: Issues with a specific WU
- Topic: Corrupted / bad job 18237/1069/0/71 (failing for all users)
- Replies: 10
- Views: 142
Re: Corrupted / bad job 18237/1069/0/71 (failing for all users)
Hi Paul, It's a brand new card and, again, the problem only occurs with this particular project. I would suspect the hardware more if it also happened with the 16 other projects I am currently being assigned, but it only happens with this one. Regarding Core24, it runs without any error on all P18230 WUs received so far, b...
- Thu Nov 07, 2024 6:46 pm
- Forum: Issues with a specific WU
- Topic: Corrupted / bad job 18237/1069/0/71 (failing for all users)
- Replies: 10
- Views: 142
Re: Corrupted / bad job 18237/1069/0/71 (failing for all users)
Hi Paul I have browsed my logs and here is one sample of the first Force RMSE error I saw mid-October. I have dozens like this, only for this particular project. 17:41:08:I3:WU40:Started FahCore on PID 8605 17:41:09:I1:WU40:*********************** Log Started 2024-10-18T17:41:09Z *******************...
- Thu Nov 07, 2024 8:27 am
- Forum: Issues with a specific WU
- Topic: Corrupted / bad job 18237/1069/0/71 (failing for all users)
- Replies: 10
- Views: 142
Re: Corrupted / bad job 18237/1069/0/71 (failing for all users)
Hi Paul,
Out of the 17 different GPU projects assigned to my system since October, P18237 is the only one failing regularly (though not for 100% of WUs) with Force RMSE errors.
The issue may be wider than your particular WU?
Best regards
Nicolas
- Thu Oct 17, 2024 7:15 am
- Forum: GPU Projects and FahCores
- Topic: FAH Core 24 Fails Xubuntu 18.04 extended
- Replies: 8
- Views: 16687
Re: FAH Core 24 Fails Xubuntu 18.04 extended
Hello, Here are the steps taken and investigations carried out: 1/ Ubuntu 20.04 LTS: make OpenCL work just in case it would help the core to work on OpenCL. My GPU was identified by the client as OpenCL = unsupported. Installed opencl-headers, ocl-icd-libopencl1 and nvidia-cuda-toolkit. One of them ...
- Thu Oct 17, 2024 6:17 am
- Forum: GPU Projects and FahCores
- Topic: FAH Core 24 Fails Xubuntu 18.04 extended
- Replies: 8
- Views: 16687
Re: FAH Core 24 Fails Xubuntu 18.04 extended
Hello, I have resumed folding recently, good to see familiar names like toTOW and bollix47 are still around. After installing a v8.3.18 client on an Ubuntu 20.04 LTS machine (upgraded from an old 18.04 install) equipped with a GTX 980 Ti, I faced the same issue as 58Enfield and Gary480six, i.e. an immed...
- Thu Feb 02, 2017 9:40 pm
- Forum: GPU Projects and FahCores
- Topic: Core 21 Projects spamming BAD_WORK_UNIT failures
- Replies: 16
- Views: 8493
Re: Core 21 Projects spamming BAD_WORK_UNIT failures
Driver 367.57 works OK with core 0.0.18 on my Kepler (GTX 770) / Maxwell (GTX 750 Ti) setup.
- Wed Feb 01, 2017 7:48 pm
- Forum: New Donors start here
- Topic: beta wu core 0.0.18 first wu with that core getting error
- Replies: 2
- Views: 809
Re: beta wu core 0.0.18 first wu with that core getting erro
This is a known issue with 0.0.18: the message is displayed in the log, but it is not harmful to the science.
The bug should be fixed in an update of the core.
- Sat Dec 17, 2016 7:19 am
- Forum: Issues with a specific WU
- Topic: Project: 10493 (Run 5, Clone 38, Gen 285) - UNKNOWN_ENUM
- Replies: 2
- Views: 1519
Re: Project: 10493 (Run 5, Clone 38, Gen 285) - UNKNOWN_ENUM
I think the cause is a hardware failure. I reinstalled the system, and the drivers still appear to reset randomly, though I folded for months 24/7 with these. My best guess is that one of the cards is failing and resets the driver for all cards. I will need to run each card separately to check this assumption.
- Mon Dec 12, 2016 7:10 pm
- Forum: Issues with a specific WU
- Topic: Project: 10493 (Run 5, Clone 38, Gen 285) - UNKNOWN_ENUM
- Replies: 2
- Views: 1519
Project: 10493 (Run 5, Clone 38, Gen 285) - UNKNOWN_ENUM
Hello, Strange error on this WU: I got an UNKNOWN_ENUM (127 = 0x7f), then FAHClient restarted with a TPF of 15 minutes on a 980 Ti. After the computer rebooted, FAHClient fails to restart, even from the command line. I am in the process of finding a way to reinstall (GDEBI uninstall / reinstall does not work) 13:20:35:WU0...
- Sun Oct 23, 2016 6:53 am
- Forum: Issues with a specific WU
- Topic: Project: 13204 (Run 20, Clone 0, Gen 118)
- Replies: 2
- Views: 1140
Re: Project: 13204 (Run 20, Clone 0, Gen 118)
Thanks for the information, bruce
- Sat Oct 22, 2016 7:21 am
- Forum: Issues with a specific WU
- Topic: Project: 13204 (Run 20, Clone 0, Gen 118)
- Replies: 2
- Views: 1140
Project: 13204 (Run 20, Clone 0, Gen 118)
This one appears to consistently fail at the same percentage. GTX 750 Ti @ 1306 MHz, Ubuntu 14.04, 367.44 00:42:11:WU03:FS02:0x21:Project: 13204 (Run 20, Clone 0, Gen 118) 00:42:11:WU03:FS02:0x21:Unit: 0x00000043ab436c6657894f0c62f3d847 00:42:11:WU03:FS02:0x21:CPU: 0x00000000000000000000000000000000...
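For anyone who wants to tally failures per project from their own logs, here is a minimal sketch in Python. It assumes only the `Project: N (Run R, Clone C, Gen G)` line form visible in the snippet above; the names `PRCG_RE` and `parse_prcg` are hypothetical helpers, not part of FAHClient.

```python
import re

# Matches the "Project: 13204 (Run 20, Clone 0, Gen 118)" form seen in the log
# snippet above. Any other log layout is outside what this sketch assumes.
PRCG_RE = re.compile(
    r"Project:\s*(\d+)\s*\(Run\s+(\d+),\s*Clone\s+(\d+),\s*Gen\s+(\d+)\)"
)

def parse_prcg(line):
    """Return (project, run, clone, gen) as ints, or None if the line has no PRCG."""
    match = PRCG_RE.search(line)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())

# Example log line copied from the post above.
line = "00:42:11:WU03:FS02:0x21:Project: 13204 (Run 20, Clone 0, Gen 118)"
print(parse_prcg(line))
```

Feeding each log line through `parse_prcg` and counting the results per project number would give the kind of per-project comparison mentioned elsewhere in these threads.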
- Sun Aug 28, 2016 6:15 am
- Forum: Problems with NVidia drivers
- Topic: Linux 367.44 driver - performance boost over 346.96
- Replies: 0
- Views: 1654
Linux 367.44 driver - performance boost over 346.96
Hello, If, like me, you were still on 346.96 because 361.x and 364.x from the NVIDIA website did not play well with your kernel/X server, or yielded a performance loss on core 21 (2-7% in TPF), you could consider upgrading to 367.44: - Install went flawlessly on Ubuntu 14.04 / kernel 3.13.93, exc...
- Sat Aug 20, 2016 6:00 am
- Forum: FAH Hardware
- Topic: Tesla P100-SXM2
- Replies: 11
- Views: 4901
Re: Tesla P100-SXM2
Also, isn't the P100 supposed to use 3840 Pascal cores, when the Titan XP only has 3584 active cores?
- Fri Jun 24, 2016 7:24 pm
- Forum: Issues with a specific WU
- Topic: P9160 (890,0,0) Force RMSE error at startup
- Replies: 3
- Views: 1386
Re: P9160 (890,0,0) Force RMSE error at startup
Thanks for the explanation, bruce
- Thu Jun 23, 2016 10:00 pm
- Forum: Issues with a specific WU
- Topic: P9160 (890,0,0) Force RMSE error at startup
- Replies: 3
- Views: 1386
P9160 (890,0,0) Force RMSE error at startup
Hello, This is the first time I have seen this type of error. GTX 750 Ti, Ubuntu 14.04 / driver 346.96 19:28:48:WU00:FS02:0x18:Project: 9160 (Run 890, Clone 0, Gen 0) 19:28:48:WU00:FS02:0x18:Unit: 0x00000000ab40415c567465dc5c555885 19:28:48:WU00:FS02:0x18:CPU: 0x00000000000000000000000000000000 19:28:48:WU00:FS02:0x1...