Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Sat Mar 14, 2015 4:25 pm
by bruce
So perhaps the best interpretation of this problem is: if you're still on V6, upgrade to V7. It's really quite easy, and you'll certainly get better support by using the improved analytics.
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Sun Mar 15, 2015 5:19 pm
by Nathan_P
Sorry, the best interpretation is to fix the issue and let people who want to use 6.34 continue to do so.
I just tried to install v7 on one of my Linux boxes and it was a hopeless waste of 4 hours and a WU. Suffice to say that it has been deleted and I have gone back to v6.
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Mon Mar 16, 2015 2:38 am
by 7im
I followed the Linux install guide in Ubuntu and it took me 4 minutes, after I installed gdebi. What flavor are you running?
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Mon Mar 16, 2015 3:44 pm
by Nathan_P
Ubuntu 12.10 - which I didn't know at the time, and now realise is part of the issue.
Client installed OK. FAHControl was a whole different matter: it didn't install properly, so I reinstalled and it complained about dependencies. I followed the wiki steps and it complained about the second set of commands needed, so I reinstalled a 3rd time, and this time it wanted to download some updates for GNOME, which it couldn't do as it's 12.10, not the 12.04 LTS release that I thought it was running.
So I moved on to at least monitor and salvage the current WU. Web control was only giving 100k PPD as opposed to the 170k I usually get on A3 work. I left it to settle for an hour, came back, and progress had moved on 25% but PPD was down to 80k. I left it another couple of hours and progress was only at 60%, so something was slowing everything down. Then web control crashed and wouldn't connect to the client, even after rebooting the machine. At this point I vaped everything and went back to v6 and BA units - which I want to move away from, but couldn't, because v6 won't connect to a server to get a WU. ARGH!
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Mon Mar 16, 2015 4:17 pm
by 7im
Being more of a *nix newb, I tend to stick with the LTS versions. Even so, I haven't seen any reports like this before. The client version may have changed, but the FAHCore hasn't, and the core does all the folding. It *should* run at the same speed.
If you feel like experimenting again later, we should look at a top screen to see what, if anything, is stealing cycles.
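For anyone who would rather paste a short list than a screenshot of top, here is a minimal sketch of the same check in Python. It assumes a Linux box with the standard procps `ps`; the function names `parse_ps` and `busiest_processes` are made up for this example. Note that `ps` reports CPU use averaged over each process's lifetime, not the live figure top shows, so treat it as a rough first look only.

```python
import subprocess


def parse_ps(output, limit=5):
    """Parse `ps -eo pcpu,pid,comm` text into (cpu, pid, command) tuples, busiest first."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # ignore anything that is not a full cpu/pid/command row
        cpu, pid, comm = parts
        rows.append((float(cpu), int(pid), comm))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]


def busiest_processes(limit=5):
    """Run ps once and return the processes using the most CPU."""
    result = subprocess.run(
        ["ps", "-eo", "pcpu,pid,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps(result.stdout, limit)
```

Calling `busiest_processes()` while a WU is running should show the FahCore process at the top of the list; anything else with a large share is a candidate for whatever is stealing cycles.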
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Mon Mar 16, 2015 5:42 pm
by Nathan_P
I'm going to install 12.04 LTS Thursday on my day off, then I can try again and also get the various tweaks that tear and co created installed.
Still would be nice for v6 to work though.......
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Mon Mar 16, 2015 6:49 pm
by billford
Nathan_P wrote:
Client installed OK, FAHcontrol was a whole different matter
I had that trouble with 12.04 and (I think) 14.04. I thought it was just my unfamiliarity with Linux, so I didn't raise it here.
In both, FAHControl installed without a hitch if I just right-clicked the file and let the package installer look after it.
I'm using Mint 17.1 now and don't bother with the installation instructions at all, I just double-click the .deb files.
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Tue Mar 17, 2015 7:36 am
by toTOW
Project: 7521 (Run 0, Clone 43, Gen 263)
[07:30:55] *------------------------------*
[07:30:55] Folding@Home Gromacs GB Core
[07:30:55] Version 2.27 (Dec. 15, 2010)
[07:30:55]
[07:30:55] Preparing to commence simulation
[07:30:55] - Ensuring status. Please wait.
[07:31:05] - Looking at optimizations...
[07:31:05] - Working with standard loops on this execution.
[07:31:05] - Created dyn
[07:31:05] - Files status OK
[07:31:05] - Expanded 2523688 -> 3157328 (decompressed 125.1 percent)
[07:31:05] Called DecompressByteArray: compressed_data_size=2523688 data_size=3157328, decompressed_data_size=3157328 diff=0
[07:31:05] - Digital signature verified
[07:31:05]
[07:31:05] Project: 7521 (Run 0, Clone 43, Gen 263)
[07:31:05]
[07:31:05] Entering M.D.
[07:31:11] CoreStatus = 0 (0)
[07:31:11] Sending work to server
[07:31:11] Project: 7521 (Run 0, Clone 43, Gen 263)
[07:31:11] - Error: Could not get length of results file work/wuresults_00.dat
[07:31:11] - Error: Could not read unit 00 file. Removing from queue.
[07:31:11] Trying to send all finished work units
[07:31:11] + No unsent completed units remaining.
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Tue Mar 17, 2015 11:16 am
by autogrog
This WU is still failing immediately after downloading:
7520 (Run 64, Clone 3, Gen 254)
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Tue Mar 17, 2015 2:19 pm
by 7im
Most of these failures started as Run 0 problems. But now we're getting higher up into the R C G numbers and still having the same issues.
This looks more and more like inherently unstable projects than some malformed WUs from a RAID crash.
7520 (Run 32, Clone 3, Gen 465)
Posted: Wed Mar 18, 2015 8:11 am
by ThunderRd
OK, here's another one:
Reading file work/wudata_00.tpr, VERSION 4.5.5-dev-20120703-fc032f9-dirty (single precision)
[06:14:53] CoreStatus = 0 (0)
[06:14:53] Sending work to server
[06:14:53] Project: 7520 (Run 32, Clone 3, Gen 465)
[06:14:53] - Error: Could not get length of results file work/wuresults_00.dat
[06:14:53] - Error: Could not read unit 00 file. Removing from queue.
[06:14:53] Trying to send all finished work units
[06:14:53] + No unsent completed units remaining.
My client downloaded from .97, and attempted to run this one about a dozen times before switching over to .96 and getting an 8828, which is running now.
Is there a way to block the client from going to .97 and using .96 instead, until this all sorts out?
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Wed Mar 18, 2015 3:05 pm
by 7im
No.
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Wed Mar 18, 2015 4:02 pm
by sick willie
7im wrote: You've been running v6 too long to remember to change the Machine ID value in the config after deleting the WU info to force a new WU in Windows. Or to delete the ID .dat file in Linux to effect the same change.
I'm about to have to start back on ID one from having done this with these 7520 WUs.
I'm going to start posting the faulty units in this thread. When I started with this problem, there wasn't any info I could find here.
These are bad:
7520 (R 47, C 4, G 282)
7520 (R 59, C 4, G 246)
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Wed Mar 18, 2015 6:55 pm
by toTOW
Two more that are alternately crashing on the same client:
Project: 7516 (Run 0, Clone 181, Gen 0)
Project: 7521 (Run 0, Clone 43, Gen 263)
Re: 75XX Project issues (crashes, too many steps etc.)
Posted: Thu Mar 19, 2015 12:21 pm
by autogrog
This WU STILL gets downloaded and the client wastes a lot of time retrying before getting a functional WU:
[08:29:26] Project: 7520 (Run 64, Clone 3, Gen 254)
[08:29:26] - Error: Could not get length of results file work/wuresults_06.dat
[08:29:26] - Error: Could not read unit 06 file. Removing from queue.
It needs to be fixed/deleted.