I am folding on both GPU and CPU, albeit slower to keep my temperatures down. However, my display shows that both items have been stuck at 99.99% for a while. The log for one of the slots shows a different percentage, which hasn't updated for about 6 hours. The other log shows nothing except that it found a checkpoint.
I have already restarted the application as well as the program, but I'm not sure where to go from here. It has taken a while to get this far on these two units, so I'm hoping that I don't have to scrap them and start over.
Does your computer enter a sleep or hibernate state while folding?
This has not been confirmed yet, but there are reports suggesting that if a computer hibernates/sleeps while processing a GPU assignment, the progress may be disrupted and the WU cannot be completed. Could this be what's happening to you?
Here's what we believe to be true: A sleep/hibernate will not disrupt a CPU-based WU. Pausing a GPU WU before hibernating/sleeping will allow processing to be resumed later.
Any additional information that you can provide will help us isolate this problem.
-Edit- Currently the CPU has backtracked to 85.70% and has updated itself in the log. The GPU still is stuck at 99.99% and in the log shows the same as what was posted previously.
I do not believe that the computer went into sleep or hibernation mode at all during these WUs. I set up this computer, as well as my wife's, to fold full time while plugged in. The folding does pause when the computer is taken off the charger, since that is an option within the program, but the computers have not gone to sleep.
They have restarted for updates, but I highly doubt this to be the issue.
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengeance (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicone fan gaskets and silicone mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
The 99% display on the GPU core is a cosmetic error. It will show 0% or 99% when it doesn't have an actual frame number to display. This is common at the start of a work unit, and at the end, or when the FAHClient is starting up and hasn't updated yet.
Also, please note there is a delay at the start of FAHCore_17 work units while the work unit environment is being set up. It can take anywhere from 2-7 minutes, depending on the speed of the CPU and what CPU resources are available. It takes longer if all the CPU cores are pegged while folding an SMP work unit.
Also, is this a normal frame time for your system? That seems like a really long time. Is there something else using up CPU resources, like a defrag or virus scan?
22:45:54:WU00:FS01:0xa4:Completed 1255610 out of 1500000 steps (83%)
01:56:46:WU00:FS01:0xa4:Completed 1260000 out of 1500000 steps (84%)
What's missing from this discussion is the configuration of your client. Please paste the first page of the log into your next post. I'm interested in the hardware that the client detects and the configuration of the slots, which are only partially described by the screenshot of FAHClient. There are two ways to get it: open the log from the data directory and copy the part before the first WU starts, or, in the log panel, uncheck "Follow", click Refresh, and scroll all the way to the top.
It may be a cosmetic error, but the log still shows no progress on the GPU side of things. It has simply been running for almost 48 hours now with nothing in the log besides it ticking over to a new day. And yes, that frame time is pretty normal, as I am only folding on one core out of four, and at 25% usage. This is simply to keep temperatures down since I am also folding on the GPU.
I set the CPU usage to 100%, and restarted my machine. It is back to stating that it found a checkpoint file, but that is all.
I'll give it until tomorrow morning or so I suppose, and if it still is stuck in the log, get rid of it and grab a new WU. Shame I miss out on 6000 points though.
F@h is now the top computing platform on the planet and nothing unites people like a dedicated fight against a common enemy. This virus affects all of us. Let's end it together.
There is also a bug that seems to show up with too much of an overclock. The log will not update for a GPU slot; the %GPU usage will stay at 0% forever; the slot will update in the advanced/web client until 99% and then stay there forever. Upon pausing and unpausing, or restarting the client, it will start at the last known checkpoint for the slot (typically 0%) and work fine. The higher the OC, the more often this will happen; the lower the OC, the less frequent it is.
7im wrote: Also, is this a normal frame time for your system? That seems like a really long time. Is there something else using up CPU resources, like a defrag or virus scan?
22:45:54:WU00:FS01:0xa4:Completed 1255610 out of 1500000 steps (83%)
01:56:46:WU00:FS01:0xa4:Completed 1260000 out of 1500000 steps (84%)
It should be noted that the CPU slot was set to 25% of one of the four cores (and perhaps not 24x7), which will very likely not make the deadline. With that fraction of your processing power, progress will necessarily be very slow. 0.3 frames in 3h11m works out to about 44 days to complete that WU if you fold continuously at that rate. That's a pretty short period to extrapolate from to a complete WU, so there's a huge uncertainty factor.
Project 8703 has a timeout of 26 days, at which time it will be assumed to be lost and will be reissued to someone else even though you will still be working on it.
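For anyone who wants to repeat that extrapolation, here is a rough Python sketch that reads two "Completed ... steps" lines in the format quoted above and projects the time for a whole WU at that rate. The function names (parse_progress, estimate_days) are made up for this post and are not part of FAHClient, and the sketch assumes the two lines are less than 24 hours apart, since the log lines carry no date. Using the exact step counts rather than the rounded 0.3 frame, it comes out at roughly 45 days for a full WU, in line with the estimate above.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# 22:45:54:WU00:FS01:0xa4:Completed 1255610 out of 1500000 steps (83%)
PROGRESS_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:FS\d+:0x\w+:Completed (\d+) out of (\d+) steps"
)

def parse_progress(line):
    """Return (time of day, steps done, total steps) for one progress line."""
    match = PROGRESS_RE.match(line.strip())
    if match is None:
        raise ValueError("not a progress line: " + line)
    stamp = datetime.strptime(match.group(1), "%H:%M:%S")
    return stamp, int(match.group(2)), int(match.group(3))

def estimate_days(first_line, second_line):
    """Project (days for a full WU, days still remaining) at the observed rate.

    Assumption: the two lines are less than 24 hours apart, so a negative
    time difference means the log ticked over to a new day.
    """
    t0, steps0, total = parse_progress(first_line)
    t1, steps1, _ = parse_progress(second_line)
    elapsed = t1 - t0
    if elapsed <= timedelta(0):
        elapsed += timedelta(days=1)
    steps_per_second = (steps1 - steps0) / elapsed.total_seconds()
    if steps_per_second <= 0:
        raise ValueError("no progress between the two lines")
    full_wu_days = total / steps_per_second / 86400
    remaining_days = (total - steps1) / steps_per_second / 86400
    return full_wu_days, remaining_days

if __name__ == "__main__":
    full, left = estimate_days(
        "22:45:54:WU00:FS01:0xa4:Completed 1255610 out of 1500000 steps (83%)",
        "01:56:46:WU00:FS01:0xa4:Completed 1260000 out of 1500000 steps (84%)",
    )
    print(f"full WU: {full:.1f} days, remaining: {left:.1f} days")
```

Whether the WU beats the 26-day timeout depends on when it was assigned, which these two lines do not show, so compare the projection against the assignment time in your own log.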
Well, setting it to 100% seemed to work! Sadly, bruce, you are right. It is too slow for me to actually complete the unit before it times out. Because of this, I will make some adjustments to my settings so that the next WU stays on track. Either way, changing the usage setting has resulted in a 4% change overnight, so it is helpful to know that it worked.
How would I go about deleting this WU so that I can retrieve a different one? I'd rather not work on this WU until it times out, as that is simply a waste of electricity.
-Edit-
I will simply stop folding on the GPU in order to combat this issue, as well as temperature issues. Back to 100% for all cores on the CPU! Thank you all for the help on this somewhat daunting issue.
FAH runs on a very tight schedule. The bonus points increase quite rapidly the quicker that you return any WU, indicating their strong desire for prompt returns of results. Also, when a WU expires, it gets no credit. These characteristics are quite unlike BOINC (at least for the projects I was familiar with a few years ago.)
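To put a rough number on how quickly the bonus grows, here is a small Python sketch based on the quick-return bonus formula as I remember it from the Folding@home points FAQ: final points = base points * max(1, sqrt(k * deadline / elapsed)). Treat the formula, the timeout/deadline handling, and every number in the example as assumptions to check against the official FAQ; the function name estimated_credit is made up for this post, and the sketch ignores the other eligibility requirements for the bonus (such as having a passkey).

```python
import math

def estimated_credit(base_points, k, deadline_days, timeout_days, elapsed_days):
    """Estimate the credit for a WU returned after elapsed_days.

    Assumptions (from memory of the points FAQ, not verified here):
    - past the final deadline the WU has expired and earns nothing;
    - between the timeout and the deadline it earns base points only;
    - before the timeout the bonus is max(1, sqrt(k * deadline / elapsed)).
    """
    if elapsed_days <= 0:
        raise ValueError("elapsed_days must be positive")
    if elapsed_days > deadline_days:
        return 0.0
    if elapsed_days > timeout_days:
        return float(base_points)
    bonus = max(1.0, math.sqrt(k * deadline_days / elapsed_days))
    return base_points * bonus

if __name__ == "__main__":
    # Hypothetical project values, chosen only to show the shape of the curve.
    for days in (1, 2, 5, 8, 11):
        credit = estimated_credit(1000, 2.0, 10, 5, days)
        print(f"returned after {days:>2} days: {credit:.0f} points")
```

Because of the square root, returning a WU four times faster roughly doubles the credit, which is why finishing one WU promptly tends to beat running several slowly.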
Some people do like to split their resources between FAH and BOINC. When they do, I recommend that instead of attempting to run both concurrently, they dedicate, say, 30 days to FAH and then 30 days to BOINC. That way FAH WUs are checked out and returned promptly. (You can use FAH's "Finish" function to complete the current assignment without downloading a new assignment.)