Problem with running a GTX770

It seems that a lot of GPU problems revolve around specific versions of drivers. Though NVidia has their own support structure, you can often learn from information reported by others who fold.

Moderators: Site Moderators, FAHC Science Team

Le cutter
Posts: 5
Joined: Tue Dec 17, 2024 5:27 pm

Problem with running a GTX770

Post by Le cutter »

Hi everyone,

I was previously running the latest FAH client on my i5 3570K and GTX1080 on Windows 10. I wanted to build another computer to fold and add a bit of heat during the winter (one 1080 isn't enough), so I bought a GTX770 cheap on eBay.
It works nicely, but I get some errors running Folding@home.
Apparently older cards may have issues running CUDA, but it's a Kepler, so I think it's "not too old".

I tried running an older version of FAH (7.6.21) and it still gets some errors; then, after multiple tries, the folding slot goes to "failed" and stops working, while the CPU slot keeps running fine.
Can the temps be the cause of this? I've not changed the thermal paste yet, and my temps can reach up to 80°C with the fans on full blast.
The card runs fine for gaming.

I have no idea if my card is running CUDA or OpenCL.

Here's what the errors look like:
17:17:50:WU02:FS01:0x22:There are 4 platforms available.
17:17:50:WU02:FS01:0x22:Platform 0: Reference
17:17:50:WU02:FS01:0x22:Platform 1: CPU
17:17:50:WU02:FS01:0x22:Platform 2: OpenCL
17:17:50:WU02:FS01:0x22: opencl-device 0 specified
17:17:50:WU02:FS01:0x22:Platform 3: CUDA
17:17:50:WU02:FS01:0x22: cuda-device 0 specified
17:17:51:WU02:FS01:0x22:Attempting to create CUDA context:
17:17:51:WU02:FS01:0x22: Configuring platform CUDA
17:17:52:WU01:FS01:Upload 14.94%
17:17:52:WU02:FS01:0x22:Failed to create CUDA context:
17:17:52:WU02:FS01:0x22:Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)
17:17:52:WU02:FS01:0x22:Attempting to create OpenCL context:
17:17:52:WU02:FS01:0x22: Configuring platform OpenCL
17:17:56:WU02:FS01:0x22: Using OpenCL on platformId 0 and gpu 0
17:14:42:WU02:FS01:0x22:An exception occurred at step 186241: Particle coordinate is nan
17:14:42:WU02:FS01:0x22:Max number of attempts to resume from last checkpoint (2) reached. Aborting.
17:14:42:WU02:FS01:0x22:ERROR:114: Max number of attempts to resume from last checkpoint reached.
17:14:42:WU02:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
17:14:42:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
If you have any ideas, thanks for your help.
Joe_H
Site Admin
Posts: 7990
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Problem with running a GTX770

Post by Joe_H »

Not sure why the WUs are failing; someone else may have some ideas. But failing to run under CUDA is normal for a Kepler-based GPU. The CUDA support libraries used to build the GPU folding cores can support a range of Nvidia cards, but to support the current generation that range only goes as far back as Maxwell. To keep supporting Kepler-based GPUs, they would need to create and maintain two or more versions of the cores.
Le cutter
Posts: 5
Joined: Tue Dec 17, 2024 5:27 pm

Re: Problem with running a GTX770

Post by Le cutter »

Thanks for your response,

So my card is necessarily running on OpenCL? Or do I have to do something to activate it?
Joe_H
Site Admin
Posts: 7990
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Problem with running a GTX770

Post by Joe_H »

Yes, your GTX 770 will be running on OpenCL. You can see that is actually happening with these lines in the log:

17:17:51:WU02:FS01:0x22:Attempting to create CUDA context:
17:17:51:WU02:FS01:0x22: Configuring platform CUDA
17:17:52:WU02:FS01:0x22:Failed to create CUDA context:
17:17:52:WU02:FS01:0x22:Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)
17:17:52:WU02:FS01:0x22:Attempting to create OpenCL context:
17:17:52:WU02:FS01:0x22: Configuring platform OpenCL
17:17:56:WU02:FS01:0x22: Using OpenCL on platformId 0 and gpu 0
Basically, the GPU folding core tried to use CUDA, which failed because the 770 is not included in the run-time library, and it then started up using OpenCL.
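
The same fallback can be checked mechanically. Below is a minimal Python sketch, not part of the F@h client, that scans core log lines for the two messages quoted in this thread. The name `platform_summary` is made up for illustration, and the patterns are based only on the wording of the lines above; other core versions may word these messages differently.

```python
import re

# Patterns based only on the log lines quoted in this thread (assumption:
# other core versions may word these messages differently).
FAILED_RE = re.compile(r"Failed to create (\w+) context")
USING_RE = re.compile(r"Using (\w+) on platformId")


def platform_summary(log_lines):
    """Return (platform in use or None, list of platforms that failed)."""
    in_use = None
    failed = []
    for line in log_lines:
        match = FAILED_RE.search(line)
        if match:
            failed.append(match.group(1))
            continue
        match = USING_RE.search(line)
        if match:
            in_use = match.group(1)
    return in_use, failed


# Lines copied from the log in this thread.
SAMPLE = [
    "17:17:51:WU02:FS01:0x22:Attempting to create CUDA context:",
    "17:17:51:WU02:FS01:0x22: Configuring platform CUDA",
    "17:17:52:WU02:FS01:0x22:Failed to create CUDA context:",
    "17:17:52:WU02:FS01:0x22:Attempting to create OpenCL context:",
    "17:17:52:WU02:FS01:0x22: Configuring platform OpenCL",
    "17:17:56:WU02:FS01:0x22: Using OpenCL on platformId 0 and gpu 0",
]

in_use, failed = platform_summary(SAMPLE)
print(f"in use: {in_use}, failed first: {failed}")
```

For the lines above this reports OpenCL as the platform in use and CUDA as the one that failed first, which matches the reading of the log given here.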

The other segment of the log shows that it errored at a step partway through processing a WU. The rest of the messages indicate that it had restarted several times from the previous checkpoint but still ran into an error. The retries are there in case it was just a transient error. Normally that could indicate a single bad WU, but since the GPU slot was disabled, multiple WUs in a row must have been processed and ended in errors.

Often you can just pause and restart folding and that will reenable the slot. Sometimes you have to restart the client process or reboot. But if the card is causing the errors, the same failures may recur and eventually disable the slot again.
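
To see whether failures are piling up on one slot, the client's result lines can be tallied. The following is a rough Python sketch under the assumption that every result line looks like the `FahCore returned:` line quoted earlier in this thread; `results_by_slot` is a hypothetical helper name, not something the client provides.

```python
import re
from collections import Counter

# Pattern based on the client line quoted in this thread:
# "...:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)"
RESULT_RE = re.compile(r"WU\d+:(FS\d+):FahCore returned: (\w+)")


def results_by_slot(log_lines):
    """Count core results per (slot, result name) pair."""
    counts = Counter()
    for line in log_lines:
        match = RESULT_RE.search(line)
        if match:
            # match.groups() is a (slot, result) tuple such as ("FS01", "BAD_WORK_UNIT")
            counts[match.groups()] += 1
    return counts


# Lines copied from the log in this thread.
FAILURE_SAMPLE = [
    "17:14:42:WU02:FS01:0x22:An exception occurred at step 186241: Particle coordinate is nan",
    "17:14:42:WU02:FS01:0x22:Max number of attempts to resume from last checkpoint (2) reached. Aborting.",
    "17:14:42:WU02:FS01:0x22:ERROR:114: Max number of attempts to resume from last checkpoint reached.",
    "17:14:42:WU02:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT",
    "17:14:42:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)",
]

for (slot, result), count in results_by_slot(FAILURE_SAMPLE).items():
    print(slot, result, count)
```

Run over a full log file instead of the short excerpt, several `BAD_WORK_UNIT` results in a row on the same slot would point to the card or driver rather than to one bad WU.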

It could be localized overheating of some component on the video card. Having looked up the operating specs for the GTX 770, 80 C is the normal temperature that it will reach before throttling. It can operate at up to 95+ C safely, and will throttle clocks at that point. But some component could be reaching a higher temperature than the sensor you checked. That could be from dried out thermal pads or paste on the GPU, VRM, or RAM on the card. Blocked fins on the heat sink from dust and lint can also be a cause.
Le cutter
Posts: 5
Joined: Tue Dec 17, 2024 5:27 pm

Re: Problem with running a GTX770

Post by Le cutter »

I took out the GTX770, ran DDU and installed the drivers for the 1080. Apart from the drivers, nothing has changed, and the 1080 is running fine.

I took a look at the 770. The fins are very clean (it's a Gainward Phantom, so you can clearly see the fins), but I noticed that one of the warranty stickers is intact, so I guess the thermal paste has never been changed. The same goes for the thermal pads, but I don't know if I'm going to try to change them (unless they're really, really dry) because I've heard some pretty bad stories about what happens if you don't use the correct pad thickness.
Some graphics cards require different thermal pad thicknesses for the VRAM and the VRM, and if you get it wrong the VRM or VRAM might not make proper contact.
toTOW
Site Moderator
Posts: 6394
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Problem with running a GTX770

Post by toTOW »

Never buy anything older than Pascal (GTX 1000 series) for FAH!

Kepler (GTX 700 series) is already end of life at Nvidia and no longer gets driver updates, and Maxwell (GTX 900 series) might soon be next.

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
Le cutter
Posts: 5
Joined: Tue Dec 17, 2024 5:27 pm

Re: Problem with running a GTX770

Post by Le cutter »

It's weird that it works for a bit and then fails instead of failing right away.
Why can't FAH say "your card is too old, we can't use it"?
Joe_H
Site Admin
Posts: 7990
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Studio M1 Max 32 GB smp6
Mac Hack i7-7700K 48 GB smp4
Location: W. MA

Re: Problem with running a GTX770

Post by Joe_H »

toTOW's opinion on purchasing video cards overstates the issue. The card is supported by F@h; it is just from the oldest architecture still usable. That support comes with the caveat that CUDA cannot be used with that series, which results in less throughput when using OpenCL. F@h is very good at identifying cards that are too old; the forum regularly gets questions about getting older-generation cards to work, such as this topic: viewtopic.php?t=42239. In those cases the log will show the GPU as unsupported and the client control will not allow it to be configured for folding.

However, tracking down problems with older cards can get involved. Unlike games, which use full power only briefly, F@h uses a large fraction of the GPU's processing capacity continuously. That can lead to heat issues. Sometimes using a utility to power limit the card will work around the issue. Another issue is drivers. The newest drivers are sometimes not tested as thoroughly on older cards before release and have on occasion been incompatible. Relatedly, there can also be issues if two or more cards from very different architectures are installed in one system. That your system can process partway through a WU indicates basic usability, but it can be hard to determine whether the failure after that point is hardware or software related.
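
As one illustration of power limiting, Nvidia's `nvidia-smi` tool has a `-pl` (power limit, in watts) option and can report the current and allowed limits. The Python sketch below only builds the command lines; the helper names and the 150 W figure are hypothetical, changing the limit normally needs administrator rights, and whether the driver permits it on a GTX 770 is an assumption that has not been checked here.

```python
import subprocess


def power_limit_query_cmd(gpu_index=0):
    """Build an nvidia-smi command that reports the current and allowed power limits."""
    return [
        "nvidia-smi", "-i", str(gpu_index),
        "--query-gpu=name,power.limit,power.min_limit,power.max_limit",
        "--format=csv",
    ]


def power_limit_set_cmd(watts, gpu_index=0):
    """Build an nvidia-smi command that sets the power limit in watts."""
    if watts <= 0:
        raise ValueError("power limit must be a positive number of watts")
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


def run(cmd):
    """Run a command and return its text output (raises if the command fails)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


# Hypothetical value: pick one between the min and max the query reports.
print(" ".join(power_limit_set_cmd(150)))
```

Calling `run(power_limit_query_cmd())` first shows the range the card accepts; if the card reports the limits as not supported, a tool such as MSI Afterburner is the alternative.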
Le cutter
Posts: 5
Joined: Tue Dec 17, 2024 5:27 pm

Re: Problem with running a GTX770

Post by Le cutter »

I just changed the thermal paste; I'll see if I get better results.
I used to undervolt and underclock my AMD cards, but in MSI Afterburner the voltage slider seems to be all the way to the left at 0. I'll try reducing the power limit instead.

Thanks for your answers