Project: 6701 (Run 24, Clone 15, Gen 34) ???

Moderators: Site Moderators, FAHC Science Team

KneeDeep
Posts: 8
Joined: Tue Oct 13, 2009 3:20 pm

Project: 6701 (Run 24, Clone 15, Gen 34) ???

Post by KneeDeep »

(This is my first trouble report: apologies if I've failed miserably!)

System: Win 7 x64, AMD [email protected], 4GB RAM@800MHz, (R5750 & R4350 folding as well); SMP running as a service, GPU's as consoles.

SLOOOooowwwwwww..... far slower than anything I've noticed in my previous 1600 WUs. Nothing is truly hanging, but 1-2 hrs/% isn't right for a 921 credit WU. This has dropped my 6cpu 1055t from 1200PPD to 80PPD.

63%-85% of CPU cycles are going to the A3 task, judging from the Task Manager display (most of the rest are going to the two GPU folders). Rebooting, and stopping and restarting the service, had no effect on the run times.

The deadline is in 1d14hr, but the remaining 83% of the task will take 9d at the current rate.

What action should I take at this point?


Code:

 -- Log since last reboot --
--- Opening Log file [August 18 18:39:50 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.30

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Folding@home\SMP
Service: C:\Folding@home\SMP\[email protected]
Arguments: -svcstart -d C:\Folding@home\SMP -smp -verbosity 9 

Launched as a service.
Entered C:\Folding@home\SMP to do work.

[18:39:50] - Ask before connecting: No
[18:39:50] - User name: MudHole (Team 0)
[18:39:50] - User ID: 2979EE7E723C1EF7
[18:39:50] - Machine ID: 13
[18:39:50] 
[18:39:50] Loaded queue successfully.
[18:39:50] 
[18:39:50] + Processing work unit
[18:39:50] Core required: FahCore_a3.exe
[18:39:50] Core found.
[18:39:50] - Autosending finished units... [August 18 18:39:50 UTC]
[18:39:50] Trying to send all finished work units
[18:39:50] + No unsent completed units remaining.
[18:39:50] - Autosend completed
[18:39:50] Working on queue slot 08 [August 18 18:39:50 UTC]
[18:39:50] + Working ...
[18:39:50] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 08 -np 6 -checkpoint 15 -service -verbose -lifeline 4620 -version 630

[18:39:51] 
[18:39:51] *------------------------------*
[18:39:51] Folding@Home Gromacs SMP Core
[18:39:51] Version 2.22 (Mar 12, 2010)
[18:39:51] 
[18:39:51] Preparing to commence simulation
[18:39:51] - Looking at optimizations...
[18:39:51] - Files status OK
[18:39:51] - Expanded 764006 -> 1404481 (decompressed 183.8 percent)
[18:39:51] Called DecompressByteArray: compressed_data_size=764006 data_size=1404481, decompressed_data_size=1404481 diff=0
[18:39:51] - Digital signature verified
[18:39:51] 
[18:39:51] Project: 6701 (Run 24, Clone 15, Gen 34)
[18:39:51] 
[18:39:51] Assembly optimizations on if available.
[18:39:51] Entering M.D.
[18:39:57] Using Gromacs checkpoints
[18:40:57] Resuming from checkpoint
[18:40:57] Verified work/wudata_08.log
[18:40:57] Verified work/wudata_08.trr
[18:40:57] Verified work/wudata_08.xtc
[18:40:57] Verified work/wudata_08.edr
[18:40:58] Completed 157700 out of 2000000 steps  (7%)
[18:57:39] Completed 160000 out of 2000000 steps  (8%)
[20:30:59] Completed 180000 out of 2000000 steps  (9%)
[22:41:56] Completed 200000 out of 2000000 steps  (10%)
[00:39:50] - Autosending finished units... [August 19 00:39:50 UTC]
[00:39:50] Trying to send all finished work units
[00:39:50] + No unsent completed units remaining.
[00:39:50] - Autosend completed
[00:49:12] Completed 220000 out of 2000000 steps  (11%)
[01:39:49] Completed 240000 out of 2000000 steps  (12%)
[03:42:04] Completed 260000 out of 2000000 steps  (13%)
[06:39:50] - Autosending finished units... [August 19 06:39:50 UTC]
[06:39:50] Trying to send all finished work units
[06:39:50] + No unsent completed units remaining.
[06:39:50] - Autosend completed
[07:10:06] Completed 280000 out of 2000000 steps  (14%)
[10:03:25] Completed 300000 out of 2000000 steps  (15%)
[12:39:50] - Autosending finished units... [August 19 12:39:50 UTC]
[12:39:50] Trying to send all finished work units
[12:39:50] + No unsent completed units remaining.
[12:39:50] - Autosend completed
[13:49:21] Completed 320000 out of 2000000 steps  (16%)
[15:29:06] Completed 340000 out of 2000000 steps  (17%)
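
For anyone who wants to check the "9d at the current rate" figure, here is a minimal Python sketch that estimates the rate from the progress lines in the log above. The function names are made up for illustration, and it assumes the log timestamps only wrap past midnight (UTC) and never skip a whole day.

```python
import re

# Progress lines copied from the log above, from the 8% mark to the 17% mark.
LOG = """\
[18:57:39] Completed 160000 out of 2000000 steps  (8%)
[20:30:59] Completed 180000 out of 2000000 steps  (9%)
[22:41:56] Completed 200000 out of 2000000 steps  (10%)
[00:49:12] Completed 220000 out of 2000000 steps  (11%)
[01:39:49] Completed 240000 out of 2000000 steps  (12%)
[03:42:04] Completed 260000 out of 2000000 steps  (13%)
[07:10:06] Completed 280000 out of 2000000 steps  (14%)
[10:03:25] Completed 300000 out of 2000000 steps  (15%)
[13:49:21] Completed 320000 out of 2000000 steps  (16%)
[15:29:06] Completed 340000 out of 2000000 steps  (17%)
"""

PROGRESS = re.compile(
    r"\[(\d\d):(\d\d):(\d\d)\] Completed (\d+) out of (\d+) steps"
)


def parse_progress(log_text):
    """Return (elapsed_seconds, steps_done, total_steps) for each progress line."""
    points = []
    day_offset = 0
    previous = None
    for line in log_text.splitlines():
        match = PROGRESS.search(line)
        if not match:
            continue
        hours, minutes, seconds, done, total = map(int, match.groups())
        clock = hours * 3600 + minutes * 60 + seconds
        # The log prints time of day only, so a smaller value means a new day.
        if previous is not None and clock < previous:
            day_offset += 86400
        previous = clock
        points.append((clock + day_offset, done, total))
    return points


def hours_per_percent(points):
    """Average wall-clock hours per percent between the first and last point."""
    (t0, d0, total), (t1, d1, _) = points[0], points[-1]
    percent_done = 100.0 * (d1 - d0) / total
    return (t1 - t0) / 3600.0 / percent_done


def remaining_days(points):
    """Days needed to finish the WU if the average rate does not change."""
    _, done, total = points[-1]
    remaining_percent = 100.0 * (total - done) / total
    return hours_per_percent(points) * remaining_percent / 24.0
```

Calling `hours_per_percent(parse_progress(LOG))` on these lines gives roughly 2.3 hours per percent, and `remaining_days(parse_progress(LOG))` gives roughly 8 days for the remaining 83%, so either way the WU was not going to make a 1d14hr deadline at that rate.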
Arnette
Posts: 25
Joined: Wed Jan 27, 2010 1:30 pm
Location: Ontario, Canada

Re: Project: 6701 (Run 24, Clone 15, Gen 34) ???

Post by Arnette »

Try disabling your SMP service and see what % cpu your GPU's are using while folding.

I've found the ATI cards can use a ton of CPU while folding - though not enough to yield those low results...

Are you sure all of the cores are being utilized for folding? It might be worth trying -smp 5 in your client instead of -smp.
This will leave you 1 free core for your GPU clients and/or Windows to use.
Our Folding@Home Teampage --> http://www.lbsfolding.info
John_Weatherman
Posts: 289
Joined: Sun Dec 02, 2007 4:31 am
Location: Carrizo Plain National Monument, California

Re: Project: 6701 (Run 24, Clone 15, Gen 34) ???

Post by John_Weatherman »

"4GB RAM@800MHz" - I assume that's wrong?
KneeDeep
Posts: 8
Joined: Tue Oct 13, 2009 3:20 pm

Re: Project: 6701 (Run 24, Clone 15, Gen 34) ???

Post by KneeDeep »

Arnette wrote: Try disabling your SMP service and see what % cpu your GPU's are using while folding.

I've found the ATI cards can use a ton of CPU while folding - though not enough to yield those low results...

Are you sure all of the cores are being utilized for folding? It might be worth trying -smp 5 in your client instead of -smp
This will leave you 1 free core for your GPU clients and/or windows to use.
* - Disabling SMP showed each GPU folder consuming about 17% CPU -- as before the shutdown.
* - Note: I set the GPU's to nocpulock so they wouldn't each contend with a single locked CPU-folder, but would distribute their CPU competition over all CPU-folders.
*** Edit: I just noticed that the R5750 has nocpulock set while the R4350 doesn't! Oh well...
* - The CPU monitor has always shown complete CPU saturation of all 6 cores whenever SMP's been running.
* - A (second) system reboot and a switch to using "-smp 5" is running: I'll post the results in a couple of hours -- or less if it makes a significant improvement. But... 921 points suggests a normal WU time of ~16 hours; that's 10 min/%, and in 30 minutes since the restart with "-smp 5" there hasn't been a % increase, so I continue to think there's something hosed with this WU. And I expect it ain't gonna finish it by the deadline. And I remain open to suggestions.
John_Weatherman wrote: "4GB RAM@800MHz" - I assume that's wrong?
? - Why assume anything? That's the DRAM Frequency CPU-Z shows, although I suppose the memory is doubling that to 1600 -- I'm tired of these variations on the ways of stating memory speeds: DDR3-1600 vs PC3-12800 vs DRAM freq = 800. It's DDR3 on a 240MHz bus with an FSB:DRAM ratio of 3:10. Have I sinned? Probably!
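
The three ways of stating the same memory speed are related by simple arithmetic. A small Python sketch, using the 240MHz bus and 3:10 FSB:DRAM ratio quoted above and assuming a standard 64-bit (8-byte) DDR3 module; the function names are hypothetical:

```python
def ddr3_figures(bus_mhz, fsb_dram_ratio=(3, 10), bus_width_bytes=8):
    """Return (DRAM clock in MHz, transfers in MT/s, peak bandwidth in MB/s).

    bus_width_bytes=8 assumes a standard 64-bit DDR3 module.
    """
    fsb_part, dram_part = fsb_dram_ratio
    # FSB:DRAM of 3:10 means the DRAM clock is 10/3 of the bus clock.
    dram_clock_mhz = bus_mhz * dram_part / fsb_part
    # DDR = double data rate: two transfers per clock cycle.
    transfers_mt_s = dram_clock_mhz * 2
    # Each transfer moves one bus width of data.
    bandwidth_mb_s = transfers_mt_s * bus_width_bytes
    return dram_clock_mhz, transfers_mt_s, bandwidth_mb_s


def ddr3_labels(bus_mhz, fsb_dram_ratio=(3, 10)):
    """Return the marketing names that correspond to the figures above."""
    _, transfers, bandwidth = ddr3_figures(bus_mhz, fsb_dram_ratio)
    return f"DDR3-{int(transfers)}", f"PC3-{int(bandwidth)}"
```

With a 240MHz bus, `ddr3_figures(240)` gives an 800MHz DRAM clock, 1600 MT/s and 12800 MB/s, which is why CPU-Z's "DRAM Frequency = 800", DDR3-1600 and PC3-12800 all describe the same memory.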
John_Weatherman
Posts: 289
Joined: Sun Dec 02, 2007 4:31 am
Location: Carrizo Plain National Monument, California

Re: Project: 6701 (Run 24, Clone 15, Gen 34) ???

Post by John_Weatherman »

I would say just write DDR3-1600 and that's enough (unless you've bought something cheap on eBay that was made by a Chinese prisoner with a hammer and screwdriver in a labor camp - "Re-education through work!").
KneeDeep
Posts: 8
Joined: Tue Oct 13, 2009 3:20 pm

Re: Project: 6701 (Run 24, Clone 15, Gen 34) ???

Post by KneeDeep »

... I ran a dozen tests, and got fed up editing this post to explain each in detail.

In summary: if I shut down the GPU's, the SMP clicked off a % of the WU every 11 minutes.
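
As a sanity check on that 11 minutes per %, here is a rough Python sketch that converts a per-percent time into points per day. It assumes the 921-point base credit mentioned in the first post and ignores any bonus; the function name is just for illustration.

```python
def points_per_day(credit, minutes_per_percent):
    """Estimate PPD for one WU from its credit and the time taken per percent."""
    # 100 percent steps per WU, 1440 minutes per day.
    wu_days = minutes_per_percent * 100 / (60 * 24)
    return credit / wu_days
```

`points_per_day(921, 11)` comes out at about 1206, in line with the 1200PPD this machine normally produces. At around 2 hours per percent (`points_per_day(921, 120)`) it drops to about 110, the same ballpark as the 80PPD reported at the start of the thread.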

If I ran the GPU's with "nocpulock=0" [the default setting] the SMP slowed 30-40% -- which would be expected as the GPU's each seemed to take over a CPU.

It appears that SMP threads are NOT CPU locked: running "-smp 4" with NO GPUs should have shown two nearly idle CPUs and four fully active CPUs if threads were CPU locked. Instead, all six CPUs were running around 2/3 utilized.
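
The reasoning behind that test can be put as a toy model in Python. This is an idealized sketch only: it assumes each SMP thread is fully busy and that, without locking, the OS scheduler spreads the threads evenly over all cores. The function name is hypothetical.

```python
def expected_core_load(busy_threads, cores, pinned):
    """Idealized per-core utilization (0.0 to 1.0) for fully busy threads."""
    active = min(busy_threads, cores)
    if pinned:
        # CPU-locked threads: each one saturates its own core, the rest idle.
        return [1.0] * active + [0.0] * (cores - active)
    # Unlocked threads: the scheduler migrates them, so the load averages out.
    return [active / cores] * cores
```

For four threads on six cores, the pinned case gives four cores at 100% and two idle, while the unpinned case gives all six at about 67%. The second pattern is the one actually observed, which is what points to the SMP threads not being CPU locked.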

It appears that GPU threads are NOT CPU locked whether "nocpulock" is zero or one: without SMP running, there should have only been two CPU's showing heavy activity; instead, each CPU was showing partial activity.

There is some SMP-cramping impact of setting a GPU to "nocpulock=1", but... I can't guess at the mechanism. The SMP slowdown was reduced by using "-smp 5" and "-smp 4".

In the GPU forum, I've started a thread asking what nocpulock is supposed to do.
http://foldingforum.org/viewtopic.php?f ... 43#p156038

o&o
sortofageek
Site Admin
Posts: 3110
Joined: Fri Nov 30, 2007 8:06 pm
Location: Team Helix

Re: Project: 6701 (Run 24, Clone 15, Gen 34) ???

Post by sortofageek »

I know this doesn't help you, but since you reported this WU in this forum, I'm just confirming it was not a bad WU. It was successfully completed and returned by another folder.
KneeDeep
Posts: 8
Joined: Tue Oct 13, 2009 3:20 pm

Re: Project: 6701 (Run 24, Clone 15, Gen 34) ???

Post by KneeDeep »

sortofageek wrote: I know this doesn't help you, but since you reported this WU in this forum, I'm just confirming it was not a bad WU. It was successfully completed and returned by another folder.
... -I- am probably that other folder. Once I found it was the GPU setting that was retarding the SMP work, I set the SMP back on track.

Or, do they distribute a WU to multiple folders?! [I beat the deadline.]
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 6701 (Run 24, Clone 15, Gen 34) ???

Post by bruce »

"I beat the deadline" might mean the Preferred Deadline or the Final Deadline. When a WU passes the Preferred Deadline, it is reissued. When certain types of error results are uploaded, a WU is reissued. Occasionally there may be other reasons to reissue a WU, but I think they're rather rare. (Stanford doesn't want to waste processing resources any more than you do.)

Yes, you did return the WU for full credit plus bonus.
sortofageek
Site Admin
Posts: 3110
Joined: Fri Nov 30, 2007 8:06 pm
Location: Team Helix

Re: Project: 6701 (Run 24, Clone 15, Gen 34) ???

Post by sortofageek »

Sorry, KneeDeep. I didn't look at the log, so didn't realize you are actually:

[18:39:50] - User name: MudHole (Team 0)

So, I thought this was a different folder:

Hi MudHole (team 0),
Your WU (P6701 R24 C15 G34) was added to the stats database on 2010-08-20 08:10:54 for 2848.65 points of credit.