Project: 3064 (Run 5, Clone 4, Gen 77)

Posted: Sun May 18, 2008 4:53 pm
by toaster8

Code:

--- Opening Log file [May 18 15:13:48] 


# SMP Client ##################################################################
###############################################################################

                       Folding@Home Client Version 6.10beta2

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /Users/rtoso/Library/FAH
Executable: /Applications/[email protected]/fah6
Arguments: -local -advmethods -forceasm -verbosity 9 -smp 

Warning:
 By using the -forceasm flag, you are overriding
 safeguards in the program. If you did not intend to
 do this, please restart the program without -forceasm.
 If work units are not completing fully (and particularly
 if your machine is overclocked), then please discontinue
 use of the flag.

[15:13:48] - Ask before connecting: No
[15:13:48] - User name: toaster8 (Team 1971)
[15:13:48] - User ID: 21C26B23348749C6
[15:13:48] - Machine ID: 1
[15:13:48] 
[15:13:48] Loaded queue successfully.
[15:13:48] 
[15:13:48] + Processing work unit
[15:13:48] Core required: FahCore_a1.exe
[15:13:48] Core found.
[15:13:48] - Using generic /Applications/[email protected]/mpiexec
[15:13:48] - Autosending finished units...
[15:13:48] Trying to send all finished work units
[15:13:48] + No unsent completed units remaining.
[15:13:48] - Autosend completed
[15:13:48] Working on Unit 01 [May 18 15:13:48]
[15:13:48] + Working ...
[15:13:48] - Calling '/Applications/[email protected]/mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 01 -priority 96 -checkpoint 30 -forceasm -verbose -lifeline 16635 -version 610'

[15:13:48] 
[15:13:48] *------------------------------*
[15:13:48] Folding@Home Gromacs SMP Core
[15:13:48] Version 1.74 (September 24, 2007)
[15:13:48] 
[15:13:48] Preparing to commence simulation
[15:13:48] - Ensuring status. Please wait.
[15:13:49] 
[15:13:49] Project: 3064 (Run 5, Clone 4, Gen 77)
[15:13:49] 
[15:13:49] Assembly optimizations on if available.
[15:13:49] Entering M.D.
[15:14:06]  on if available.
[15:14:06] Entering M.D.
[15:14:12] mdrunneRead topology
[15:14:12] (Protein: p3064_lambda5_2003
[15:14:12] Writing local files
[15:14:12] Completed 4750000 out of 5000000 steps  (95 percent)
[15:14:13] Extra SSE boost OK.
[15:14:13] 0 steps  (95 percent)
[15:14:13] Extra SSE boost OK.
[15:44:12] Timered checkpoint triggered.
[15:46:49] Writing local files
[15:46:49] Completed 4800000 out of 5000000 steps  (96 percent)
[15:50:40] CoreStatus = 1 (1)
[15:50:40] Client-core communications error: ERROR 0x1
[15:50:40] Deleting current work unit & continuing...
[15:50:40] - Using generic /Applications/[email protected]/mpiexec
[15:55:08] - Warning: Could not delete all work unit files (1): Core returned invalid code
[15:55:08] Trying to send all finished work units
[15:55:08] + No unsent completed units remaining.
[15:55:08] - Preparing to get new work unit...
[15:55:08] + Attempting to get work packet
[15:55:08] - Will indicate memory of 4096 MB
[15:55:08] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 7, Stepping: 6
[15:55:08] - Connecting to assignment server
[15:55:08] Connecting to http://assign.stanford.edu:8080/
[15:55:08] Posted data.
[15:55:08] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[15:55:08] + News From Folding@Home: Welcome to Folding@Home
[15:55:08] Loaded queue successfully.
[15:55:08] Connecting to http://171.64.65.56:8080/
[15:55:11] Posted data.
[15:55:11] Initial: 0000; - Receiving payload (expected size: 2430104)
[15:55:19] - Downloaded at ~296 kB/s
[15:55:19] - Averaged speed for that direction ~150 kB/s
[15:55:19] + Received work.
[15:55:19] + Closed connections
[15:55:24] 
[15:55:24] + Processing work unit
[15:55:24] Core required: FahCore_a1.exe
[15:55:24] Core found.
[15:55:24] - Using generic /Applications/[email protected]/mpiexec
[15:55:24] Working on Unit 02 [May 18 15:55:24]
[15:55:24] + Working ...
[15:55:24] - Calling '/Applications/[email protected]/mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 02 -priority 96 -checkpoint 30 -forceasm -verbose -lifeline 16635 -version 610'

I have another unit that behaved the same way and died at 56%. It is Project: 3065 (Run 2, Clone 2, Gen 50).

Re: Project: 3064 (Run 5, Clone 4, Gen 77)

Posted: Sun May 18, 2008 7:17 pm
by ChelseaOilman
Look in your work folder for wuresults_0x.dat files. If you see any, you may be able to get credit for them by using qfix.
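
A quick way to check is to list any matching files. This is only a minimal sketch: it assumes the work folder is the `work/` directory under the launch directory shown in the log, and that results files are named `wuresults_NN.dat`. The function name `find_result_files` is hypothetical and not part of the client or qfix.

```python
from pathlib import Path


def find_result_files(work_dir):
    """Return the wuresults_*.dat files in work_dir, sorted by name.

    Returns an empty list if the directory does not exist.
    """
    work = Path(work_dir)
    if not work.is_dir():
        return []
    return sorted(p for p in work.glob("wuresults_*.dat") if p.is_file())


if __name__ == "__main__":
    # Assumed location, based on "Launch directory" and "-dir work/" in the log.
    assumed_work_dir = Path.home() / "Library" / "FAH" / "work"
    for path in find_result_files(assumed_work_dir):
        print(path.name, path.stat().st_size, "bytes")
```

If the script prints nothing, there are no leftover results files for qfix to recover.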

Re: Project: 3064 (Run 5, Clone 4, Gen 77)

Posted: Sun May 18, 2008 9:52 pm
by nwkelley
I have alerted the researcher in charge to your post. Thank you.
Nick

Re: Project: 3064 (Run 5, Clone 4, Gen 77)

Posted: Mon May 19, 2008 1:08 am
by toaster8
Thanks, Chelsea, but no such luck.

Thanks, Nick - how about the other project too? If more information is needed just let me know.

Re: Project: 3064 (Run 5, Clone 4, Gen 77)

Posted: Mon May 19, 2008 4:29 pm
by DanEnsign
ChelseaOilman wrote: Look in your work folder for wuresults_0x.dat files. If you see any, you may be able to get credit for them by using qfix.

Hi Toaster, it's not clear from your post what the problem is.

These errors happen on SMP work units -- business as usual.

It's a different story if you didn't receive (partial) credit, however.

Dan

Re: Project: 3064 (Run 5, Clone 4, Gen 77)

Posted: Mon May 19, 2008 4:56 pm
by ChelseaOilman
DanEnsign wrote: It's a different story if you didn't receive (partial) credit, however.
[15:46:49] Completed 4800000 out of 5000000 steps (96 percent)
[15:50:40] CoreStatus = 1 (1)
[15:50:40] Client-core communications error: ERROR 0x1
[15:50:40] Deleting current work unit & continuing...
Someone received full credit for finishing it. toaster8 received no credit because the WU was deleted.
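
For anyone comparing several of these failures, the pattern in the quoted lines can be pulled out of a log automatically. The sketch below is illustrative only: it relies on the message wording seen in this thread ("Completed N out of M steps (P percent)" and "Client-core communications error: ERROR 0x..."), and the helper name `summarize_failure` is made up for this example.

```python
import re

# Patterns match the message wording seen in the log in this thread.
PROGRESS_RE = re.compile(r"Completed (\d+) out of (\d+) steps\s+\((\d+) percent\)")
ERROR_RE = re.compile(r"Client-core communications error: ERROR (0x[0-9A-Fa-f]+)")


def summarize_failure(log_lines):
    """Return (last_percent, error_code) for the first core error found.

    last_percent is the last progress value logged before the error,
    or None if no progress line came first. Returns None if the log
    contains no client-core communications error.
    """
    last_percent = None
    for line in log_lines:
        progress = PROGRESS_RE.search(line)
        if progress:
            last_percent = int(progress.group(3))
            continue
        error = ERROR_RE.search(line)
        if error:
            return last_percent, error.group(1)
    return None


if __name__ == "__main__":
    sample = [
        "[15:46:49] Completed 4800000 out of 5000000 steps  (96 percent)",
        "[15:50:40] CoreStatus = 1 (1)",
        "[15:50:40] Client-core communications error: ERROR 0x1",
    ]
    print(summarize_failure(sample))
```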

Re: Project: 3064 (Run 5, Clone 4, Gen 77)

Posted: Thu May 22, 2008 2:53 am
by toaster8
I didn't delete it; the system must have. :x

Re: Project: 3064 (Run 5, Clone 4, Gen 77)

Posted: Thu May 22, 2008 12:51 pm
by anandhanju
I know I sound like a broken record here, but if you get these WUs again, can you stop the client about 5% before the failure point (at ~90% and ~50% for these two), then restart it and see if it progresses? This seems to work sometimes. I tried correlating this to an error type (x00, x01, x7B..) but so far do not have anything concrete.
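
As a rough illustration of the arithmetic, the stop point can be worked out from the last progress line before the failure. This is a sketch only: the 5% margin comes from the suggestion above, the 96% and 5000000 steps come from the log for Project 3064, and the function name `stop_point` is hypothetical.

```python
def stop_point(failure_percent, total_steps, margin=5):
    """Return (percent, step) at which to stop the client.

    The stop point is `margin` percentage points before the last
    progress value logged prior to the failure, never below 0.
    """
    stop_percent = max(failure_percent - margin, 0)
    # Integer arithmetic keeps the step count exact.
    return stop_percent, total_steps * stop_percent // 100


if __name__ == "__main__":
    # Project 3064 last reported 96 percent of 5000000 steps before failing.
    print(stop_point(96, 5000000))
```

For the unit in this thread that gives 91%, in line with the ~90% mentioned above.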