Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Mon Mar 03, 2008 3:48 am
by Flathead74
Looks like this one is making a return run... :x
This makes the fourth time that we've received it, with the same results each time.

This time though, I noticed the "Quit with Segmentation fault" notation in the console.
This does not show up in the Fahlog.


[03:01:55] Project: 2605 (Run 9, Clone 571, Gen 5)
-snip-
[03:02:03] Completed 0 out of 500000 steps (0 percent)
[03:02:03] Folding@home Core Shutdown: INTERRUPTED
[0]0:Return code = 0, signaled with Segmentation fault
[0]1:Return code = 0, signaled with Segmentation fault
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Segmentation fault

[03:02:07] CoreStatus = 0 (0)
[03:02:07] Client-core communications error: ERROR 0x0
[03:02:07] Deleting current work unit & continuing...
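Since the segmentation-fault lines only appear in the console and not in the FAHlog, one way to catch them is to scan the captured console output. The following is a minimal sketch, not part of the client: the function name `summarize` is made up for illustration, and reading the number after the bracket as the process rank is an assumption based on the four lines numbered 0 to 3.

```python
import re

# Console-only line, e.g. "[0]1:Return code = 0, signaled with Segmentation fault".
# Assumption: the number after the bracket is the rank of the crashed process.
SIGNAL_LINE = re.compile(r"^\[\d+\](\d+):Return code = (\d+), signaled with (.+)$")

# Timestamped client line, e.g. "[03:02:07] Client-core communications error: ERROR 0x0".
CLIENT_ERROR = re.compile(r"^\[(\d\d:\d\d:\d\d)\] Client-core communications error: (.+)$")


def summarize(lines):
    """Return (crashed, client_errors) found in console output lines."""
    crashed = []        # list of (rank, signal name)
    client_errors = []  # list of (timestamp, error text)
    for raw in lines:
        line = raw.strip()
        match = SIGNAL_LINE.match(line)
        if match:
            crashed.append((int(match.group(1)), match.group(3)))
            continue
        match = CLIENT_ERROR.match(line)
        if match:
            client_errors.append((match.group(1), match.group(2)))
    return crashed, client_errors


sample = """\
[03:02:03] Completed 0 out of 500000 steps (0 percent)
[03:02:03] Folding@home Core Shutdown: INTERRUPTED
[0]0:Return code = 0, signaled with Segmentation fault
[0]1:Return code = 0, signaled with Segmentation fault
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Segmentation fault
[03:02:07] CoreStatus = 0 (0)
[03:02:07] Client-core communications error: ERROR 0x0
[03:02:07] Deleting current work unit & continuing...
"""

crashed, client_errors = summarize(sample.splitlines())
print("crashed ranks:", [rank for rank, _ in crashed])
print("client errors:", client_errors)
```

Running it over the console capture rather than the FAHlog matters here, because only the console carries the "signaled with" lines.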

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Sun Mar 09, 2008 9:55 pm
by Tigerbiten
Flathead74 wrote:Looks like this one is making a return run... :x
This makes the fourth time that we've received it, with the same results each time.
.............
Try five times now ............. :cry:
Just got it again ...... :evil:
No change in status here.
But I just hosed my FAH install removing it and had to re-download the whole client, so there are no log files of this attempt to run it.

Has anyone got past the first frame with this protein?
Any idea how many times it's been sent out?
I first got this protein over 75 days ago.
At around 3 days lost for every re-issue, I reckon it must be up to 20-25 times by now........... :shock:
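The estimate above is simple division. As a rough sketch, assuming the unit was re-issued back to back with no gaps (which makes the result an upper bound), and using a made-up function name:

```python
def estimated_reissues(days_since_first_seen, days_lost_per_reissue):
    """Upper bound on re-issues if each one costs a fixed number of days."""
    return days_since_first_seen // days_lost_per_reissue


# 75 days since first receiving the unit, about 3 days lost per re-issue.
print(estimated_reissues(75, 3))
```

Any idle time between re-issues lowers the real count, which fits the 20-25 range quoted.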

Luck ............... :D

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Sun Mar 09, 2008 10:23 pm
by bruce
Tigerbiten wrote:Any idea how many times its been sent out ?
No idea.

All I can tell you is that nobody has returned it yet.

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Wed Mar 26, 2008 1:10 pm
by saab
Got 2605 (Run 9, Clone 571, Gen 5) last night and it exited with the following message on a dual core. I took a copy of the unit from right after it was downloaded and ran it on my quad core, and it got the same results.
Is it my problem or yours?

[12:32:05] Project: 2605 (Run 9, Clone 571, Gen 5)
[12:32:05]
[12:32:06] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=edit
NNODES=4, MYRANK=1, HOSTNAME=tdit
NNODES=4, MYRANK=2, HOSTNAME=edit
NNODES=4, MYRANK=3, HOSTNAME=edit
NODEID=0 argc=15
NODEID=3 argc=15
NODEID=1 argc=15
NODEID=2 argc=15
      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

        This inclusion of Gromacs code in the Folding@Home Core is under
        a special license (see http://folding.stanford.edu/gromacs.html)
         specially granted to Stanford by the copyright holders. If you
          are interested in using Gromacs, visit www.gromacs.org where
                you can download a free version of Gromacs under
         the terms of the GNU General Public License (GPL) as published
       by the Free Software Foundation; either version 2 of the License,
                     or (at your option) any later version.

[12:32:14] Protein: Protein in POPC
[12:32:14] Writing local files
starting mdrun 'Protein in POPC'
500000 steps,   1000.0 ps.

[12:32:15] Extra SSE boost OK.
[12:32:16] 0000 steps  (0 percent)
[0]0:Return code = 0, signaled with Segmentation fault
[0]1:Return code = 0, signaled with Segmentation fault
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Segmentation fault
[12:32:20] CoreStatus = 0 (0)
[12:32:20] Client-core communications error: ERROR 0x0
[12:32:20] Deleting current work unit & continuing...

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Wed Mar 26, 2008 5:17 pm
by Tigerbiten
saab wrote:Got 2605 (Run 9, Clone 571, Gen 5) last night and it exited with the following message on a dual core so I took unit from right after it was downloaded and ran it on my quad core and it got the same results.
Is it my problem or yours?
It's theirs.
I've tried to crunch this protein 5x now, downloading it 3x each try, and never got past the first frame on any attempt.
I call it a bad work-unit.
Just keep deleting it until you get something else.

Luck ............... :D

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Wed Mar 26, 2008 5:21 pm
by 7im
If you post about a WU problem, please include system info as well.

For instance, we were having a problem with another client, and we found the user had the total memory in the Virtual Machine set too low. Increasing the memory allocation fixed the problem. Not saying that is the cause here, but it could be and we'd never know without more system details.

Thanks.

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Wed Mar 26, 2008 5:25 pm
by codysluder
7im wrote:If you post about a WU problem, please include system info as well.
System info also needs to include the OS. Some problems are different on Windows/Linux and some are not.

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Thu Mar 27, 2008 8:30 pm
by MoneyGuyBK
7im wrote:If you post about a WU problem, please include system info as well.
For instance, we were having a problem with another client, and we found the user had the total memory in the Virtual Machine set too low. Increasing the memory allocation fixed the problem. Not saying that is the cause here, but it could be and we'd never know without more system details.
Thanks.
7im, that brings up a question for me, as I don't recall it being discussed before.
* What is the "minimum" amount of memory to allocate in a VM, or any setup for that matter... and of course, is there a "maximum"?
I have mine set at 512, tried it in the past with 1024 and saw no difference, and switched back to 512 again. TIA

Peace

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Thu Mar 27, 2008 8:39 pm
by él Mero
Maybe this can clarify things a little, MoneyGuyBK:

Memory Resource Management

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Thu Mar 27, 2008 8:47 pm
by Ren02
MoneyGuyBK wrote:
7im wrote:If you post about a WU problem, please include system info as well.
For instance, we were having a problem with another client, and we found the user had the total memory in the Virtual Machine set too low. Increasing the memory allocation fixed the problem. Not saying that is the cause here, but it could be and we'd never know without more system details.
Thanks.
7im, that brings up a question for me, as I don't recall it being discussed before.
* What is the "minimum" amount of memory to allocate in a VM, or any setup for that matter... and of course, is there a "maximum"?
I have mine set at 512, tried it in the past with 1024 and saw no difference, and switched back to 512 again. TIA

Peace
Getting off-topic, but the new 2619 project that runs on FahCore_a2 is very demanding when it comes to memory. My VM that runs Ubuntu+FaH takes 900 MB of RAM with p2619 and just 400 MB with p2605. I had my VM's memory allocation set to 512 MB as well. Swap was 240 MB. 512+240 wasn't enough, so 2 out of 3 VMs crashed. ;) Bought another 2 GB of RAM today. :D

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Thu Mar 27, 2008 10:24 pm
by 7im
MoneyGuyBK wrote:
7im wrote:If you post about a WU problem, please include system info as well.
For instance, we were having a problem with another client, and we found the user had the total memory in the Virtual Machine set too low. Increasing the memory allocation fixed the problem. Not saying that is the cause here, but it could be and we'd never know without more system details.
Thanks.
7im, that brings up a question for me, as I don't recall it being discussed before.
* What is the "minimum" amount of memory to allocate in a VM, or any setup for that matter... and of course, is there a "maximum"?
I have mine set at 512, tried it in the past with 1024 and saw no difference, and switched back to 512 again. TIA

Peace
The minimum memory to allocate to the VM is the largest amount of memory the OS will need, plus the needs of the swap file, plus the needs of the largest work unit that you intend to run with the FAH client.

If you run a single CPU client, the largest is about 200 MB if you use the BigWU setting. If you do not use the BigWU setting, you can probably get by with only 128 MB for FAH.

If you run the SMP client, the largest memory user is now the A2 fahcore with the 2619 project, at 200 MB per fahcore (about 800 MB in total across four fahcores). And that could easily grow in the future.

So if you need 256 MB for the OS, plus 256 for the swap, plus 800 for FAH, that's 1312 MB. If you are running 2 VMs on a quad core, double it.

There are two things that can be configured to prevent an out-of-memory problem. First, Pande Group needs to configure the Work Server correctly to check the Minimum Memory setting for the FAH client. With p2619, that should be 800 MB.

The second thing is the Memory setting in the fah client setup. You need to set that number correctly as well. If you allocate 1024 MB to your VM image, and your OS needs 128 MB and your swap needs 256 MB, then the memory setting in the fah client should be 1024 - (128+256) = 640 MB.

If the fah client shows 640 MB, and the Work Server needs 800 MB for 2619s, then you shouldn't get 2619s, and you will be assigned to a different work server for a different project.
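The arithmetic above can be sketched in a few lines. This is only an illustration of the sums in this post: the function names are made up, and the figure of four fahcores at 200 MB each is an assumption taken from the 800 MB total quoted for p2619.

```python
def vm_memory_needed(os_mb, swap_mb, fah_mb):
    """Minimum VM allocation: OS + swap + largest work unit."""
    return os_mb + swap_mb + fah_mb


def client_memory_setting(vm_mb, os_mb, swap_mb):
    """Memory value to enter in the FAH client: what is left after OS and swap."""
    return vm_mb - (os_mb + swap_mb)


def would_be_assigned(client_mb, project_min_mb):
    """True if the client reports enough memory for the project's minimum."""
    return client_mb >= project_min_mb


# Assumption: four fahcores at 200 MB each for the SMP client.
smp_fah_mb = 4 * 200

print(vm_memory_needed(256, 256, smp_fah_mb))    # 1312
setting = client_memory_setting(1024, 128, 256)
print(setting)                                   # 640
print(would_be_assigned(setting, 800))           # False
```

With a 1024 MB VM image the client setting comes out at 640 MB, below the 800 MB minimum, so a correctly configured Work Server would not hand that client a 2619.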

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Sat Mar 29, 2008 8:11 pm
by saab
Follow up to 2605 (Run 9, Clone 571, Gen 5) failure.
System info was:
Dual core running Linux 2.6.15-51-amd64-generic x86_64 GNU/Linux
e4400 @2.0GHz / 2Gb PC-6400
with FAH setting:
memory=1024
type=3

Quad core running Linux 2.6.22-14-generic x86_64 GNU/Linux
q6600 @3.0GHz / 2Gb PC-6400
memory=1024
type=3

I should have saved the unit to try on a virtual machine, but I suspect it would have ended the same.
These two machines ran 2619 fine.

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Thu Apr 03, 2008 5:01 am
by jima13
[04:26:37] Project: 2605 (Run 9, Clone 571, Gen 5)
[04:26:37]
[04:26:37] Assembly optimizations on if available.
[04:26:37] Entering M.D.
[04:26:43] Rejecting checkpoint
[04:26:44] Protein: Protein in POPC
[04:26:44] Writing local files
[04:26:45] Extra SSE boost OK.
[04:26:45] Writing local files
[04:26:45] Completed 0 out of 500000 steps (0 percent)
[04:26:49] CoreStatus = 0 (0)
[04:26:49] Client-core communications error: ERROR 0x0
[04:26:49] Deleting current work unit & continuing...

I just watched this happen 3x, then it downloaded a new core and a different project: [04:43:17] Project: 2605 (Run 5, Clone 82, Gen 37), which is running fine...

Re: Project: 2605 (Run 9, Clone 571, Gen 5)

Posted: Mon Apr 14, 2008 8:06 am
by Tigerbiten
Just got this BAD work-unit again.
Still no change.
Attempted runs 3x =

[07:20:47] Folding@Home Gromacs SMP Core
[07:20:47] Version 1.74 (November 27, 2006)
[07:20:47] 
[07:20:47] Preparing to commence simulation
[07:20:47] - Ensuring status. Please wait.
[07:20:48] - Starting from initial work packet
[07:20:48] 
[07:20:48] Project: 2605 (Run 9, Clone 571, Gen 5)
[07:20:48] 
[07:20:48] Assembly optimizations on if available.
[07:20:48] Entering M.D.
[07:21:05] 0 percent)
[07:21:05] - Starting from initial work packet
[07:21:05] 
[07:21:05] Project: 2605 (Run 9, Clone 571, Gen 5)
[07:21:05] 
[07:21:05] Entering M.D.
[07:21:12] Protein: Protein in POPC
[07:21:12] Writing local files
[07:21:13] t)
[07:21:13] ra SSE boost OK.
[07:21:17] CoreStatus = 0 (0)
[07:21:17] Client-core communications error: ERROR 0x0
[07:21:17] Deleting current work unit & continuing...
[07:25:48] - Warning: Could not delete all work unit files (8): Core returned invalid code
Luck ............ :D