Search found 61 matches

by Nicolas_orleans
Thu Nov 07, 2024 8:34 pm
Forum: Issues with a specific WU
Topic: Corrupted / bad job 18237/1069/0/71 (failing for all users)
Replies: 10
Views: 142

Re: Corrupted / bad job 18237/1069/0/71 (failing for all users)

Hi Paul, It's a brand-new card and, again, the errors occur only with this particular project. I would suspect the hardware more if they also happened with the 16 other projects I am currently being assigned. It only happens with this one. Regarding Core24, it runs without any error on all P18230 WUs received so far, b...
by Nicolas_orleans
Thu Nov 07, 2024 6:46 pm
Forum: Issues with a specific WU
Topic: Corrupted / bad job 18237/1069/0/71 (failing for all users)
Replies: 10
Views: 142

Re: Corrupted / bad job 18237/1069/0/71 (failing for all users)

Hi Paul, I have browsed my logs, and here is one sample of the first Force RMSE error I saw in mid-October. I have dozens like this, only for this particular project.
17:41:08:I3:WU40:Started FahCore on PID 8605
17:41:09:I1:WU40:*********************** Log Started 2024-10-18T17:41:09Z *******************...
by Nicolas_orleans
Thu Nov 07, 2024 8:27 am
Forum: Issues with a specific WU
Topic: Corrupted / bad job 18237/1069/0/71 (failing for all users)
Replies: 10
Views: 142

Re: Corrupted / bad job 18237/1069/0/71 (failing for all users)

Hi Paul,
Out of the 17 different GPU projects assigned to my system since October, P18237 is the only one failing regularly (though not on 100% of WUs) with Force RMSE errors.
Could the issue be wider than your particular WU?
Best regards
Nicolas
by Nicolas_orleans
Thu Oct 17, 2024 7:15 am
Forum: GPU Projects and FahCores
Topic: FAH Core 24 Fails Xubuntu 18.04 extended
Replies: 8
Views: 16687

Re: FAH Core 24 Fails Xubuntu 18.04 extended

Hello, Here are the steps taken and investigations carried out: 1/ Ubuntu 20.04 LTS: make OpenCL work just in case it would help the core to work on OpenCL. My GPU was identified by the client as OpenCL = unsupported. Installed opencl-headers, ocl-icd-libopencl1 and nvidia-cuda-toolkit. One of them ...
by Nicolas_orleans
Thu Oct 17, 2024 6:17 am
Forum: GPU Projects and FahCores
Topic: FAH Core 24 Fails Xubuntu 18.04 extended
Replies: 8
Views: 16687

Re: FAH Core 24 Fails Xubuntu 18.04 extended

Hello, I have resumed folding recently, and it is good to see that familiar names like toTOW and bollix47 are still around. After installing a v8.3.18 client on an Ubuntu 20.04 LTS machine (upgraded from an old 18.04 install) equipped with a GTX 980 Ti, I faced the same issue as 58Enfield and Gary480six, e.g. an immed...
by Nicolas_orleans
Thu Feb 02, 2017 9:40 pm
Forum: GPU Projects and FahCores
Topic: Core 21 Projects spamming BAD_WORK_UNIT failures
Replies: 16
Views: 8493

Re: Core 21 Projects spamming BAD_WORK_UNIT failures

367.57 works ok with 0.0.18 on my Kepler (GTX 770) / Maxwell (GTX 750 Ti) setup
by Nicolas_orleans
Wed Feb 01, 2017 7:48 pm
Forum: New Donors start here
Topic: beta wu core 0.0.18 first wu with that core getting error
Replies: 2
Views: 809

Re: beta wu core 0.0.18 first wu with that core getting erro

Displaying this in the log is a known issue with 0.0.18, but it is not harmful to the science.
This bug should be fixed in an update of the core.
by Nicolas_orleans
Sat Dec 17, 2016 7:19 am
Forum: Issues with a specific WU
Topic: Project: 10493 (Run 5, Clone 38, Gen 285) - UNKNOWN_ENUM
Replies: 2
Views: 1519

Re: Project: 10493 (Run 5, Clone 38, Gen 285) - UNKNOWN_ENUM

I think the cause is a hardware failure. I reinstalled the system, and the drivers appear to reset randomly, though I folded 24/7 for months with these. My best guess is that one of the cards is failing and resets the driver for all cards. I will need to run each card separately to check this assumption.
by Nicolas_orleans
Mon Dec 12, 2016 7:10 pm
Forum: Issues with a specific WU
Topic: Project: 10493 (Run 5, Clone 38, Gen 285) - UNKNOWN_ENUM
Replies: 2
Views: 1519

Project: 10493 (Run 5, Clone 38, Gen 285) - UNKNOWN_ENUM

Hello, Strange error on this WU: I got an UNKNOWN_ENUM (127 = 0x7f), then FAHClient restarted with a TPF of 15 minutes on a 980 Ti. After the computer rebooted, FAHClient failed to restart, even from the command line. I am looking for a way to reinstall (GDEBI uninstall / reinstall does not work).
13:20:35:WU0...
by Nicolas_orleans
Sun Oct 23, 2016 6:53 am
Forum: Issues with a specific WU
Topic: Project: 13204 (Run 20, Clone 0, Gen 118)
Replies: 2
Views: 1140

Re: Project: 13204 (Run 20, Clone 0, Gen 118)

Thanks for the information, bruce.
by Nicolas_orleans
Sat Oct 22, 2016 7:21 am
Forum: Issues with a specific WU
Topic: Project: 13204 (Run 20, Clone 0, Gen 118)
Replies: 2
Views: 1140

Project: 13204 (Run 20, Clone 0, Gen 118)

This one appears to consistently fail at the same percentage. GTX 750 Ti @ 1306 MHz, Ubuntu 14.04, 367.44
00:42:11:WU03:FS02:0x21:Project: 13204 (Run 20, Clone 0, Gen 118)
00:42:11:WU03:FS02:0x21:Unit: 0x00000043ab436c6657894f0c62f3d847
00:42:11:WU03:FS02:0x21:CPU: 0x00000000000000000000000000000000...
by Nicolas_orleans
Sun Aug 28, 2016 6:15 am
Forum: Problems with NVidia drivers
Topic: Linux 367.44 driver - performance boost over 346.96
Replies: 0
Views: 1654

Linux 367.44 driver - performance boost over 346.96

Hello, If, like me, you were still on 346.96 because 361.x and 364.x from the NVIDIA website did not play well with your kernel/X server, or caused a performance loss on core 21 (2-7% in TPF), you could consider upgrading to 367.44:
- Install went flawlessly on Ubuntu 14.04 / kernel 3.13.93, exc...
by Nicolas_orleans
Sat Aug 20, 2016 6:00 am
Forum: FAH Hardware
Topic: Tesla P100-SXM2
Replies: 11
Views: 4901

Re: Tesla P100-SXM2

Also, isn't the P100 supposed to use 3840 Pascal cores, whereas the Titan XP only has 3584 active cores?
by Nicolas_orleans
Fri Jun 24, 2016 7:24 pm
Forum: Issues with a specific WU
Topic: P9160 (890,0,0) Force RMSE error at startup
Replies: 3
Views: 1386

Re: P9160 (890,0,0) Force RMSE error at startup

Thanks for the explanation, bruce
by Nicolas_orleans
Thu Jun 23, 2016 10:00 pm
Forum: Issues with a specific WU
Topic: P9160 (890,0,0) Force RMSE error at startup
Replies: 3
Views: 1386

P9160 (890,0,0) Force RMSE error at startup

Hello, This is the first time I have seen this type of error. GTX 750 Ti, Ubuntu 14.04 / driver 346.96
19:28:48:WU00:FS02:0x18:Project: 9160 (Run 890, Clone 0, Gen 0)
19:28:48:WU00:FS02:0x18:Unit: 0x00000000ab40415c567465dc5c555885
19:28:48:WU00:FS02:0x18:CPU: 0x00000000000000000000000000000000
19:28:48:WU00:FS02:0x1...