Project: 7500

Posted: Sun Jan 29, 2012 3:46 pm
by iancook221188
This is one of the projects that keeps failing; I have no idea how many have failed by now. All of these work units are from project 7500, and they seem to stop before they start. All the other work units I get fold without problems. I'm running SMP 10, but I don't think that is the problem. On my other rigs, 7500 seems to fold fine, but they're on SMP 4 and SMP 8.

Code:

[03:28:25] *------------------------------*
[03:28:25] Folding@Home Gromacs SMP Core
[03:28:25] Version 2.27 (Dec. 15, 2010)
[03:28:25] 
[03:28:25] Preparing to commence simulation
[03:28:25] - Looking at optimizations...
[03:28:25] - Created dyn
[03:28:25] - Files status OK
[03:28:25] - Expanded 1247723 -> 2077012 (decompressed 166.4 percent)
[03:28:25] Called DecompressByteArray: compressed_data_size=1247723 data_size=2077012, decompressed_data_size=2077012 diff=0
[03:28:25] - Digital signature verified
[03:28:25] 
[03:28:25] Project: 7500 (Run 0, Clone 230, Gen 126)
[03:28:25] 
[03:28:25] Assembly optimizations on if available.
[03:28:25] Entering M.D.
[03:28:31] Mapping NT from 10 to 10 
[03:28:32] mdrun returned 255
[03:28:32] Going to send back what have done -- stepsTotalG=500000
[03:28:32] Work fraction=0.0000 steps=500000.
[03:28:36] logfile size=0 infoLength=0 edr=0 trr=25
[03:28:36] logfile size: 0 info=0 bed=0 hdr=25
[03:28:36] - Writing 642 bytes of core data to disk...
[03:28:36] Done: 130 -> 147 (compressed to 113.0 percent)
[03:28:36]   ... Done.
[03:28:36] 
[03:28:36] Folding@home Core Shutdown: EARLY_UNIT_END
[03:28:39] CoreStatus = 72 (114)
[03:28:39] Sending work to server
[03:28:39] Project: 7500 (Run 0, Clone 230, Gen 126)


[03:28:39] + Attempting to send results [January 29 03:28:39 UTC]
[03:28:39] - Reading file work/wuresults_07.dat from core
[03:28:39]   (Read 659 bytes from disk)
[03:28:39] Connecting to http://128.143.199.97:8080/
[03:28:40] Posted data.
[03:28:40] Initial: 0000; - Uploaded at ~1 kB/s
[03:28:40] - Averaged speed for that direction ~55 kB/s
[03:28:40] + Results successfully sent
[03:28:40] Thank you for your contribution to Folding@Home.
[03:28:44] Trying to send all finished work units
[03:28:44] + No unsent completed units remaining.
[03:28:44] - Preparing to get new work unit...
[03:28:44] Cleaning up work directory
[03:28:44] + Attempting to get work packet
[03:28:44] Passkey found
[03:28:44] - Will indicate memory of 4096 MB
[03:28:44] - Connecting to assignment server
[03:28:44] Connecting to http://assign.stanford.edu:8080/
[03:28:45] Posted data.
[03:28:45] Initial: 8F80; - Successful: assigned to (128.143.199.97).
[03:28:45] + News From Folding@Home: Welcome to Folding@Home
[03:28:45] Loaded queue successfully.
[03:28:45] Sent data
[03:28:45] Connecting to http://128.143.199.97:8080/
[03:28:46] Posted data.
[03:28:46] Initial: 0000; - Receiving payload (expected size: 1248586)
[03:28:49] - Downloaded at ~406 kB/s
[03:28:49] - Averaged speed for that direction ~498 kB/s
[03:28:49] + Received work.
[03:28:49] Trying to send all finished work units
[03:28:49] + No unsent completed units remaining.
[03:28:49] + Closed connections
[03:28:54] 
[03:28:54] + Processing work unit
[03:28:54] Core required: FahCore_a3.exe
[03:28:54] Core found.
[03:28:54] Working on queue slot 08 [January 29 03:28:54 UTC]
[03:28:54] + Working ...
[03:28:54] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 08 -np 10 -checkpoint 5 -verbose -lifeline 7040 -version 634'

[03:28:54] 
[03:28:54] *------------------------------*
Mod Edit: Added Code Tags - PantherX

Re: Project: 7500

Posted: Sun Jan 29, 2012 7:18 pm
by PantherX
Yours is the only error report in the WU Database, so I have marked it for a follow-up.

Re: Project: 7500

Posted: Sun Jan 29, 2012 7:57 pm
by toTOW
It might not like the decomposition of the WU because of the -smp 10 flag ...

Re: Project: 7500

Posted: Sun Jan 29, 2012 8:58 pm
by bruce
toTOW wrote: It might not like the decomposition of the WU because of the -smp 10 flag ...
That would be interesting to know. How about reconfiguring -smp 10 to be -smp 8 for long enough to prove that the machine can run P7500? Extrapolating from what Kasson said in a recent post, WUs are harder to decompose at 10=5x2x1 than when running at 8=2x2x2. Then with 7 or maybe even 5 slices of the protein, it's going to be more difficult than when all of the factors are 2 or 3. There is no known way to determine whether a protein can be sliced into N pieces (where N is prime).

viewtopic.php?f=19&t=20611&p=206054#p206054
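
To make the 10=5x2 versus 8=2x2x2 comparison concrete, here is a minimal Python sketch that just lists the prime factors of a thread count. It is plain arithmetic, not anything taken from GROMACS or the FAH client, and the function name prime_factors is made up for illustration.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order, with repeats."""
    factors = []
    d = 2
    while d * d <= n:
        # Divide out each factor d as many times as it occurs.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever remains is itself a prime factor.
        factors.append(n)
    return factors


# Thread counts mentioned in this thread (illustrative only).
for threads in (8, 10, 12, 46):
    print(threads, "=", " x ".join(str(f) for f in prime_factors(threads)))
```

A count whose list contains only 2s and 3s is the easy case described above; a 5 or 7 in the list is where decomposition may get harder.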

Re: Project: 7500

Posted: Sun Jan 29, 2012 9:50 pm
by Jonazz
So more cores =/= always better for SMP, even if the numbers are always even?

Re: Project: 7500

Posted: Sun Jan 29, 2012 10:04 pm
by bruce
The limitation in GROMACS SMP has been described as a problem when the number of cores contains a "large" prime factor. Of course "large" is a relative term, so nobody can guarantee that a certain value will succeed or fail. Numbers like 11, 13, 17, 19 are very likely to fail so they've been excluded. A value like 23 will (almost) certainly fail, so 46 will, too, even though it's an even number. Numbers like 5 are much more likely to succeed.

It should be noted that the authors of GROMACS have targeted computers owned by research organizations. You can find computers with 8, 12, 16, 24, 36, 48 threads (and maybe some others) but nobody owns a computer with 11 or 19 or 23 threads.
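
As a rough illustration of the "large prime factor" idea, the sketch below lists the thread counts up to a given maximum whose largest prime factor stays under a chosen limit. The limit (max_factor) is purely an assumption, since nobody can say exactly where "large" begins, and the function names are hypothetical. This does not check which -smp values the client actually accepts.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (1 if n < 2)."""
    largest = 1
    d = 2
    while d * d <= n:
        while n % d == 0:
            largest = d
            n //= d
        d += 1
    # Any remainder greater than 1 is a prime larger than all found so far.
    return max(largest, n) if n > 1 else largest


def candidate_thread_counts(max_threads, max_factor=3):
    """Thread counts in 2..max_threads with no prime factor above max_factor.

    max_factor is an assumed cutoff for what counts as a "large" prime.
    """
    return [n for n in range(2, max_threads + 1)
            if largest_prime_factor(n) <= max_factor]


# For a 10-thread machine, with only factors of 2 and 3 allowed:
print(candidate_thread_counts(10))
# Relaxing the assumed cutoff to 5 lets 5 and 10 back in:
print(candidate_thread_counts(10, max_factor=5))
```

With the stricter cutoff, 8 is the highest power of two on the list for a 10-thread machine, which matches the suggestion earlier in the thread to test with -smp 8.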

Re: Project: 7500

Posted: Sun Jan 29, 2012 10:42 pm
by iancook221188
I'll try -smp 8 and -smp 12 to see if they can successfully decompose this work unit.

Re: Project: 7500

Posted: Tue Feb 07, 2012 6:58 pm
by sortofageek
I will now close this report as another folder has completed it successfully.

Feedback for the successful folder:
Your WU (P7500 R0 C230 G126) was added to the stats database on 2012-01-29 12:10:21 for 2297.98 points of credit.