Pausing mitigates TDR bug?
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 147
- Joined: Mon May 21, 2012 10:28 am
Pausing mitigates TDR bug?
I fold on a GTX770, using the 319.76 Linux drivers on a Fedora 20 box.
While I was previously rebooting every 36 hours to avoid the TDR bug, I have noticed that pausing the folding seems to have the same effect. Letting the card rest for 5-10 minutes between each WU, I am now approaching 72 hours of folding without rebooting.
Can anyone confirm if this is expected behavior?
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Pausing mitigates TDR bug?
The TDR bug was purely time-related. It didn't matter whether you were folding, gaming, or doing nothing at all, so pausing would have no effect.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
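Since the bug is described above as purely time-related, one way to know when a reboot is due is to compare system uptime against the 36-hour interval mentioned in the first post. The following is a minimal Python sketch under that assumption. It parses text in the format of Linux's /proc/uptime, whose first field is the number of seconds since boot. The function names and the REBOOT_WINDOW_HOURS constant are illustrative and are not part of any FAH or NVIDIA tool.

```python
# Assumption: the ~36-hour reboot interval quoted in this thread.
REBOOT_WINDOW_HOURS = 36.0


def hours_since_boot(proc_uptime_text: str) -> float:
    """Return hours since boot, given the contents of /proc/uptime."""
    # The first whitespace-separated field is uptime in seconds.
    seconds = float(proc_uptime_text.split()[0])
    return seconds / 3600.0


def past_reboot_window(proc_uptime_text: str,
                       window_hours: float = REBOOT_WINDOW_HOURS) -> bool:
    """True once uptime has reached the assumed reboot window."""
    return hours_since_boot(proc_uptime_text) >= window_hours


# Placeholder input in /proc/uptime format (259200 seconds of uptime).
sample = "259200.00 1000000.00"
print(hours_since_boot(sample), past_reboot_window(sample))
```

On a live Linux system you would pass the contents of /proc/uptime instead of the sample string.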
-
- Posts: 147
- Joined: Mon May 21, 2012 10:28 am
Re: Pausing mitigates TDR bug?
Understood. Could it have something to do with my OS, then? I am far past the 36-hour cutoff, and there have been no broken WUs, no crashes, or any other symptoms of the bug at all.
-
- Posts: 1576
- Joined: Tue May 28, 2013 12:14 pm
- Location: Tokyo
Re: Pausing mitigates TDR bug?
I experienced the TDR bug mainly on the GTX 780 at that time, which uses the GK110 chip (as do the Titan and the 780 Ti). The 770 uses GK104.
A newer driver fixed the TDR bug, but GK104-based cards (like my 660 Ti) got slower with it. I split my GPUs across different systems and gave each a matching driver. I have not seen the TDR bug for nine months or so.
Please contribute your logs to http://ppd.fahmm.net
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Pausing mitigates TDR bug?
csvanefalk wrote: Understood, could it have something to do with my OS then? I am far past the 36-hour cutoff, and there have been no broken WU:s, no crashes, or any other symptoms of the bug at all.
Two options: either you are on a pre-TDR-bug driver version, or the GPU did a reset. Check the FAH logs to see whether there were any folding interruptions in the last 2 days other than your pausing the client.
Optionally, there is a v55 fahcore that has no folding slowdown, so you could upgrade past the TDR-bug driver version and just use the latest NV driver.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
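The log check suggested above can be partly automated. This is a rough Python sketch, not an official FAH utility: it flags every line that contains one of a set of marker strings. The markers listed here are assumptions chosen for illustration, so replace them with whatever your own FAH log actually prints when a work unit fails or the core restarts. The sample lines are placeholder text, not real log output.

```python
import re
from typing import Iterable, List, Sequence, Tuple

# Assumed marker strings; adjust to match your own log.
ASSUMED_MARKERS = ("ERROR", "WARNING", "BAD_WORK_UNIT")


def find_interruptions(lines: Iterable[str],
                       markers: Sequence[str] = ASSUMED_MARKERS
                       ) -> List[Tuple[int, str]]:
    """Return (line_number, text) for each line containing a marker."""
    pattern = re.compile("|".join(re.escape(m) for m in markers))
    hits = []
    for number, line in enumerate(lines, start=1):
        if pattern.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


# Placeholder lines standing in for a real log file.
sample_log = [
    "normal progress line\n",
    "ERROR: placeholder failure message\n",
    "another normal line\n",
]
for number, text in find_interruptions(sample_log):
    print(number, text)
```

To scan a real log, open the file and pass the file object as `lines`. Pauses you triggered yourself still need to be ruled out by hand.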
Re: Pausing mitigates TDR bug?
7im wrote: Optionally, there is a v55 fahcore that has no folding slow down, so you could upgrade past the TDR bug driver version, and just use the latest NV driver.
AFAIK that version of the core is Windows only at this time.
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Pausing mitigates TDR bug?
7im wrote: Optionally, there is a v55 fahcore that has no folding slow down, so you could upgrade past the TDR bug driver version, and just use the latest NV driver.
bollix47 wrote: AFAIK that version of the core is Windows only at this time.
Yep. Time to poke Prot again.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
-
- Posts: 110
- Joined: Thu Apr 30, 2009 7:31 pm
- Hardware configuration: [email protected]
[email protected]
[email protected]
GTX460@800MHz
- Location: Essen, Germany
Re: Pausing mitigates TDR bug?
7im wrote: Optionally, there is a v55 fahcore that has no folding slow down, so you could upgrade past the TDR bug driver version, and just use the latest NV driver.
bollix47 wrote: AFAIK that version of the core is Windows only at this time.
7im wrote: Yep. Time to poke Prot again.
In my opinion v55 is still beta.
Heiko
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Pausing mitigates TDR bug?
Operationally, yes (simply because no one has moved it to public yet).
Is there some functional reason you think they should not release it as public?
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
-
- Posts: 110
- Joined: Thu Apr 30, 2009 7:31 pm
- Hardware configuration: [email protected]
[email protected]
[email protected]
GTX460@800MHz
- Location: Essen, Germany
Re: Pausing mitigates TDR bug?
7im wrote: Operationally, yes (simply because no one has moved it to public yet). Is there some functional reason you think they should not release it as public?
No, but I have no idea who decides on the public release of a fahcore, or why it takes so long to release an obviously working fahcore version.
Heiko
-
- Posts: 147
- Joined: Mon May 21, 2012 10:28 am
Re: Pausing mitigates TDR bug?
7im - Neither of the cases you mentioned seems to match my situation. The driver version is 319.76, and I have had the TDR issue with it earlier:
Code:
[christopher@chrisdesktop ~]$ nvidia-smi
Fri Jul 18 07:44:01 2014
+------------------------------------------------------+
| NVIDIA-SMI 5.319.76   Driver Version: 319.76         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 770     Off  | 0000:03:00.0     N/A |                  N/A |
| 50%   66C  N/A     N/A /  N/A |    688MB /  2047MB   |     N/A      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+
I also cannot find any evidence in the log of the GPU resetting, apart from my own pauses (the log is too large to post here):
http://hastebin.com/zomevafedu.coffee
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Pausing mitigates TDR bug?
The bug, as reported in the NV forum, was time based. You are welcome to look it up.
Also keep trying your pause trick. Does it work consistently, or just once in a while? Let us know.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
-
- Posts: 147
- Joined: Mon May 21, 2012 10:28 am
Re: Pausing mitigates TDR bug?
I have not used the pause trick for at least 48 hours, and the folding process continues without error. There appear to be no traces of the bug at all. I wish I could determine exactly how I got to this stage for the benefit of other Linux GPU folders, but the only major change I can recall making was recompiling the driver after updating to kernel 3.15.
Complete log is here: http://hastebin.com/bahegomewu.coffee