Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Moderators: Site Moderators, FAHC Science Team

Rolo
Posts: 33
Joined: Fri Oct 26, 2012 11:49 pm
Hardware configuration: Zaphod: [email protected], 8GB DDR3-2133, GTX580@882/1765/2037
Trillian: [email protected], 8GB DDR3-2133, GTX580@795/1590/2004
Beeblebrox: Conroe [email protected], 4x1GB Corsair Dominator PC8500 DDR2@900, Gigabyte GA-965P-DQ6v3.3, MSI N260GTX-T2D896-OC@621/1296/1080/1.125v (65nm-192core)
Hotblack: Wolfdale [email protected], 2x2GB Crucial Ballistix PC8500 DDR2@1000, Gigabyte GA-965P-DQ6v3.3, MSI N260GTX-T2D896-OC@621/1296/1080/1.125v (65nm-192core)
Location: Pike's Peak

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Post by Rolo »

I seem to have done it all.

To not hijack this thread, I posted the particulars here: viewtopic.php?f=80&t=22874

To get back on topic, does anyone know if there is a log parser that can list all project runs and their outcomes? I'd like to get a list of all WUs I've run and their outcomes.
Napoleon
Posts: 887
Joined: Wed May 26, 2010 2:31 pm
Hardware configuration: Atom330 (overclocked):
Windows 7 Ultimate 64bit
Intel Atom330 dualcore (4 HyperThreads)
NVidia GT430, core_15 work
2x2GB Kingston KVR1333D3N9K2/4G 1333MHz memory kit
Asus AT3IONT-I Deluxe motherboard
Location: Finland

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Post by Napoleon »

Try FahWatch, viewtopic.php?f=14&t=20391&p=205051&hilit=fahwatch#p205051.
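
If you only need a quick list and don't want to install anything, a rough sketch like the one below can pull the Project/Run/Clone/Gen lines out of a v7 client log. The "Project:" line format matches the logs posted in this thread; the "FahCore returned:" line format is an assumption on my part and is not shown here, and `summarize_log` is just a hypothetical helper name.

```python
import re

# Matches e.g. "17:54:08:WU01:FS00:0x11:Project: 5769 (Run 13, Clone 297, Gen 20)"
PROJECT_RE = re.compile(
    r"(WU\d+):(FS\d+):0x[0-9a-fA-F]+:Project: (\d+) "
    r"\(Run (\d+), Clone (\d+), Gen (\d+)\)"
)

# ASSUMED format, not taken from this thread:
# "WU01:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)"
RESULT_RE = re.compile(r"(WU\d+):(FS\d+):FahCore returned: (\w+)")


def summarize_log(lines):
    """Return a list of ((project, run, clone, gen), outcome) tuples."""
    current = {}   # (work unit, slot) -> (project, run, clone, gen)
    results = []
    for line in lines:
        m = PROJECT_RE.search(line)
        if m:
            wu, fs, p, r, c, g = m.groups()
            # A resumed WU prints its Project line again; just overwrite.
            current[(wu, fs)] = (int(p), int(r), int(c), int(g))
            continue
        m = RESULT_RE.search(line)
        if m:
            wu, fs, outcome = m.groups()
            results.append((current.pop((wu, fs), None), outcome))
    # Anything with no reported result yet is still being worked on.
    for prcg in current.values():
        results.append((prcg, "IN_PROGRESS"))
    return results
```

To try it, read the client's log file line by line (I believe it is log.txt in the data directory, but check your install) and pass the lines to `summarize_log`. Purpose-built tools will handle more corner cases than this does.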
Win7 64bit, FAH v7, OC'd
2C/4T Atom330 3x667MHz - GT430 2x832.5MHz - ION iGPU 3x466.7MHz
NaCl - Core_15 - display
aoeu
Posts: 87
Joined: Thu Dec 31, 2009 9:07 pm

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Post by aoeu »

I have also experienced a problem with a 5769 WU (13, 297, 20). It sits at 0% after several days. I tried to 'finish' the unit hoping that it would go away. It didn't. I don't think it will either as it reports knowing that it expired a week ago. I just updated FAH to 7.2.9 hoping that it would go away and it's still hanging. How do I delete this unit?

Peace?
aoeu
Joe_H
Site Admin
Posts: 7938
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Post by Joe_H »

Do you happen to have a Fermi-based GPU? If so, this topic describes the problem and the fix. Last week non-Fermi WUs were assigned to Fermi clients and would not fold. If that is not your problem, please post the beginning of your log showing the System information, plus the part of the log showing the beginning of the WU being processed and any errors reported. Information on how to post from the log is here.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
aoeu
Posts: 87
Joined: Thu Dec 31, 2009 9:07 pm

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Post by aoeu »

Does this help?

*********************** Log Started 2012-11-30T17:53:58Z ***********************
17:53:58:************************* Folding@home Client *************************
17:53:58:      Website: http://folding.stanford.edu/
17:53:58:    Copyright: (c) 2009-2012 Stanford University
17:53:58:       Author: Joseph Coffland <[email protected]>
17:53:58:         Args: --lifeline 4052 --command-port=36330
17:53:58:       Config: C:/Users/aoeu/AppData/Roaming/FAHClient/config.xml
17:53:58:******************************** Build ********************************
17:53:58:      Version: 7.2.9
17:53:58:         Date: Oct 3 2012
17:53:58:         Time: 18:05:48
17:53:58:      SVN Rev: 3578
17:53:58:       Branch: fah/trunk/client
17:53:58:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
17:53:58:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
17:53:58:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
17:53:58:     Platform: win32 XP
17:53:58:         Bits: 32
17:53:58:         Mode: Release
17:53:58:******************************* System ********************************
17:53:58:          CPU: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz
17:53:58:       CPU ID: GenuineIntel Family 6 Model 30 Stepping 5
17:53:58:         CPUs: 4
17:53:58:       Memory: 7.99GiB
17:53:58:  Free Memory: 6.00GiB
17:53:58:      Threads: WINDOWS_THREADS
17:53:58:   On Battery: false
17:53:58:   UTC offset: -5
17:53:58:          PID: 33108
17:53:58:          CWD: C:/Users/aoeu/AppData/Roaming/FAHClient
17:53:58:           OS: Windows 7 Professional
17:53:58:      OS Arch: AMD64
17:53:58:         GPUs: 2
17:53:58:        GPU 0: NVIDIA:2 GF116 [GeForce GTS 450]
17:53:58:        GPU 1: NVIDIA:2 GF116 [GeForce GTS 450]
17:53:58:         CUDA: 2.1
17:53:58:  CUDA Driver: 5000
17:53:58:Win32 Service: false
17:53:58:***********************************************************************
17:53:58:<config>
17:53:58:  <!-- Folding Slot Configuration -->
17:53:58:  <gpu v='true'/>
17:53:58:
17:53:58:  <!-- Network -->
17:53:58:  <proxy v=':8080'/>
17:53:58:
17:53:58:  <!-- User Information -->
17:53:58:  <passkey v='********************************'/>
17:53:58:  <team v='48083'/>
17:53:58:  <user v='aoeu'/>
17:53:58:
17:53:58:  <!-- Folding Slots -->
17:53:58:  <slot id='0' type='GPU'/>
17:53:58:  <slot id='1' type='GPU'/>
17:53:58:  <slot id='2' type='SMP'/>
17:53:58:</config>
17:53:58:Connecting to assign-GPU.stanford.edu:80
17:53:58:Connecting to assign-GPU.stanford.edu:8080
17:53:58:Read GPUs.txt
17:53:58:Trying to access database...
17:53:58:Successfully acquired database lock
17:53:58:Enabled folding slot 00: READY gpu:0:"GF116 [GeForce GTS 450]"
17:53:58:Enabled folding slot 01: READY gpu:1:"GF116 [GeForce GTS 450]"
17:53:58:Enabled folding slot 02: READY smp:4
17:53:58:WU01:FS00:Starting
17:53:58:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/aoeu/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_11.fah/FahCore_11.exe -dir 01 -suffix 01 -version 702 -lifeline 33108 -checkpoint 15 -gpu 0
17:53:59:WU01:FS00:Started FahCore on PID 14296
17:53:59:WU01:FS00:Core PID:3632
17:53:59:WU01:FS00:FahCore 0x11 started
17:53:59:WU00:FS01:Starting
17:53:59:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/aoeu/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 00 -suffix 01 -version 702 -lifeline 33108 -checkpoint 15 -gpu 1
17:53:59:WU00:FS01:Started FahCore on PID 3908
17:53:59:WU00:FS01:Core PID:32828
17:53:59:WU00:FS01:FahCore 0x15 started
17:53:59:WU02:FS02:Starting
17:53:59:WU02:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/aoeu/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 02 -suffix 01 -version 702 -lifeline 33108 -checkpoint 15 -np 4
17:53:59:WU02:FS02:Started FahCore on PID 33544
17:53:59:WU02:FS02:Core PID:33800
17:53:59:WU02:FS02:FahCore 0xa4 started
17:53:59:WU01:FS00:0x11:
17:53:59:WU01:FS00:0x11:*------------------------------*
17:53:59:WU01:FS00:0x11:Folding@Home GPU Core
17:53:59:WU01:FS00:0x11:Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
17:53:59:WU01:FS00:0x11:
17:53:59:WU01:FS00:0x11:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
17:53:59:WU01:FS00:0x11:Build host: amoeba
17:53:59:WU01:FS00:0x11:Board Type: Nvidia
17:53:59:WU01:FS00:0x11:Core      : 
17:53:59:WU01:FS00:0x11:Preparing to commence simulation
17:53:59:WU01:FS00:0x11:- Ensuring status. Please wait.
17:53:59:WU00:FS01:0x15:
17:53:59:WU00:FS01:0x15:*------------------------------*
17:53:59:WU00:FS01:0x15:Folding@Home GPU Core
17:53:59:WU00:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
17:53:59:WU00:FS01:0x15:Build host             AmoebaRemote
17:53:59:WU00:FS01:0x15:Board Type             NVIDIA/CUDA
17:53:59:WU00:FS01:0x15:Core                   15
17:53:59:WU00:FS01:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
17:53:59:WU00:FS01:0x15:
17:53:59:WU00:FS01:0x15:Window's signal control handler registered.
17:53:59:WU00:FS01:0x15:Preparing to commence simulation
17:53:59:WU00:FS01:0x15:- Looking at optimizations...
17:53:59:WU00:FS01:0x15:- Files status OK
17:53:59:WU00:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
17:53:59:WU00:FS01:0x15:- Expanded 60165 -> 264278 (decompressed 439.2 percent)
17:53:59:WU00:FS01:0x15:Called DecompressByteArray: compressed_data_size=60165 data_size=264278, decompressed_data_size=264278 diff=0
17:53:59:WU00:FS01:0x15:- Digital signature verified
17:53:59:WU00:FS01:0x15:
17:53:59:WU00:FS01:0x15:Project: 8054 (Run 0, Clone 3101, Gen 12)
17:53:59:WU00:FS01:0x15:
17:53:59:WU00:FS01:0x15:Assembly optimizations on if available.
17:53:59:WU00:FS01:0x15:Entering M.D.
17:53:59:WU02:FS02:0xa4:
17:53:59:WU02:FS02:0xa4:*------------------------------*
17:53:59:WU02:FS02:0xa4:Folding@Home Gromacs GB Core
17:53:59:WU02:FS02:0xa4:Version 2.27 (Dec. 15, 2010)
17:53:59:WU02:FS02:0xa4:
17:53:59:WU02:FS02:0xa4:Preparing to commence simulation
17:53:59:WU02:FS02:0xa4:- Looking at optimizations...
17:53:59:WU02:FS02:0xa4:- Files status OK
17:53:59:WU02:FS02:0xa4:- Expanded 2079568 -> 5386224 (decompressed 259.0 percent)
17:53:59:WU02:FS02:0xa4:Called DecompressByteArray: compressed_data_size=2079568 data_size=5386224, decompressed_data_size=5386224 diff=0
17:53:59:WU02:FS02:0xa4:- Digital signature verified
17:53:59:WU02:FS02:0xa4:
17:53:59:WU02:FS02:0xa4:Project: 7809 (Run 10, Clone 243, Gen 18)
17:53:59:WU02:FS02:0xa4:
17:53:59:WU02:FS02:0xa4:Assembly optimizations on if available.
17:53:59:WU02:FS02:0xa4:Entering M.D.
17:54:01:WU00:FS01:0x15:Will resume from checkpoint file 00/wudata_01.ckp
17:54:01:WU00:FS01:0x15:Tpr hash 00/wudata_01.tpr:  590112850 2274287033 1458076086 840345429 968028606
17:54:01:WU00:FS01:0x15:GPU device id=1
17:54:01:Server connection id=1 on 0.0.0.0:36330 from 127.0.0.1
17:54:01:WU00:FS01:0x15:Working on Good ROcking Metal Altar for Chronical Sinners
17:54:01:WU00:FS01:0x15:Client config unavailable.
17:54:01:WU00:FS01:0x15:Starting GUI Server
17:54:05:WU02:FS02:0xa4:Using Gromacs checkpoints
17:54:05:WU02:FS02:0xa4:Mapping NT from 4 to 4 
17:54:06:WU02:FS02:0xa4:Resuming from checkpoint
17:54:06:WU02:FS02:0xa4:Verified 02/wudata_01.log
17:54:06:WU02:FS02:0xa4:Verified 02/wudata_01.trr
17:54:06:WU02:FS02:0xa4:Verified 02/wudata_01.xtc
17:54:06:WU02:FS02:0xa4:Verified 02/wudata_01.edr
17:54:06:WU02:FS02:0xa4:Completed 840570 out of 1500000 steps  (56%)
17:54:08:WU01:FS00:0x11:- Looking at optimizations...
17:54:08:WU01:FS00:0x11:- Working with standard loops on this execution.
17:54:08:WU01:FS00:0x11:- Previous termination of core was improper.
17:54:08:WU01:FS00:0x11:- Going to use standard loops.
17:54:08:WU01:FS00:0x11:- Files status OK
17:54:08:WU01:FS00:0x11:- Expanded 45386 -> 251112 (decompressed 553.2 percent)
17:54:08:WU01:FS00:0x11:Called DecompressByteArray: compressed_data_size=45386 data_size=251112, decompressed_data_size=251112 diff=0
17:54:08:WU01:FS00:0x11:- Digital signature verified
17:54:08:WU01:FS00:0x11:
17:54:08:WU01:FS00:0x11:Project: 5769 (Run 13, Clone 297, Gen 20)
17:54:08:WU01:FS00:0x11:
17:54:08:WU01:FS00:0x11:Entering M.D.
17:54:14:WU01:FS00:0x11:Tpr hash 01/wudata_01.tpr:  1490663758 55188980 3700858087 2492274234 503400090
17:54:14:WU01:FS00:0x11:
17:54:14:WU01:FS00:0x11:Calling fah_main args: 14 usage=100
17:54:14:WU01:FS00:0x11:
17:55:07:WU00:FS01:0x15:Resuming from checkpoint
17:55:07:WU00:FS01:0x15:fcCheckPointResume: retreived and current tpr file hash:
17:55:07:WU00:FS01:0x15:   0    590112850    590112850
17:55:07:WU00:FS01:0x15:   1   2274287033   2274287033
17:55:07:WU00:FS01:0x15:   2   1458076086   1458076086
17:55:07:WU00:FS01:0x15:   3    840345429    840345429
17:55:07:WU00:FS01:0x15:   4    968028606    968028606
17:55:07:WU00:FS01:0x15:fcCheckPointResume: file hashes same.
17:55:07:WU00:FS01:0x15:fcCheckPointResume: state restored.
17:55:07:WU00:FS01:0x15:fcCheckPointResume: name 00/wudata_01.log Verified 00/wudata_01.log
17:55:07:WU00:FS01:0x15:fcCheckPointResume: name 00/wudata_01.trr Verified 00/wudata_01.trr
17:55:07:WU00:FS01:0x15:fcCheckPointResume: name 00/wudata_01.xtc Verified 00/wudata_01.xtc
17:55:07:WU00:FS01:0x15:fcCheckPointResume: name 00/wudata_01.edr Verified 00/wudata_01.edr
17:55:07:WU00:FS01:0x15:fcCheckPointResume: state restored 2
17:55:07:WU00:FS01:0x15:Resumed from checkpoint
17:55:07:WU00:FS01:0x15:Setting checkpoint frequency: 500000
17:55:07:WU00:FS01:0x15:Completed  16500001 out of 50000000 steps (33%).
17:55:08:WARNING:WU00:FS01:Detected clock skew (1 mins 09 secs), adjusting time estimates
18:06:48:WU00:FS01:0x15:Completed  17000000 out of 50000000 steps (34%).
18:11:08:WU02:FS02:0xa4:Completed 855000 out of 1500000 steps  (57%)
18:18:27:WU00:FS01:0x15:Completed  17500000 out of 50000000 steps (35%).
18:28:33:WU02:FS02:0xa4:Completed 870000 out of 1500000 steps  (58%)
18:30:08:WU00:FS01:0x15:Completed  18000000 out of 50000000 steps (36%).
18:41:47:WU00:FS01:0x15:Completed  18500000 out of 50000000 steps (37%).
18:45:51:WU02:FS02:0xa4:Completed 885000 out of 1500000 steps  (59%)
bollix47
Posts: 2963
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Post by bollix47 »

Yes thanks, it does.

In FAHControl click on the Pause button and wait a few seconds until all slots pause.

Click on Start>All Programs>FAHClient>Data Directory>work

Right click on 01 and select Delete.

Click on the Fold button.
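
If you would rather script the delete step, here is a minimal sketch of the same thing. It assumes the client has already been paused as described above, and that the data directory is the CWD shown in the log (C:/Users/aoeu/AppData/Roaming/FAHClient in this case). `remove_work_unit` is a hypothetical helper, not part of FAHClient, and it defaults to a dry run so nothing is deleted by accident.

```python
import shutil
from pathlib import Path


def remove_work_unit(data_dir, wu_dir_name, dry_run=True):
    """Delete <data_dir>/work/<wu_dir_name>. Returns True only if deleted."""
    work_root = (Path(data_dir) / "work").resolve()
    target = (work_root / wu_dir_name).resolve()
    # Refuse anything that is not a direct child of the work folder.
    if target.parent != work_root:
        raise ValueError(f"{wu_dir_name!r} is not a folder directly under {work_root}")
    if not target.is_dir():
        return False
    if dry_run:
        print(f"Would delete {target}")
        return False
    shutil.rmtree(target)
    return True
```

For the stuck unit in this thread the call would be `remove_work_unit("C:/Users/aoeu/AppData/Roaming/FAHClient", "01", dry_run=False)`, followed by clicking Fold again.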
aoeu
Posts: 87
Joined: Thu Dec 31, 2009 9:07 pm

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Post by aoeu »

Thank you.

If you don't hear back from me within an hour, assume you were right.

I'm willing to take the points hit for being slow in asking about this.
aoeu
Posts: 87
Joined: Thu Dec 31, 2009 9:07 pm

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Post by aoeu »

I see a new WU. THX

Peace?
aoeu
codysluder
Posts: 1024
Joined: Sun Dec 02, 2007 12:43 pm

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Post by codysluder »

The magic is not just that you got a new WU, but that you are running a different FahCore. The WUs that used FahCore_11 seem to have been the problem.
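
To check which core each slot is running, you can look at the "Running FahCore:" lines in the log. The sketch below does that; the line format is taken from the log posted earlier in this thread, and `cores_by_slot` is a hypothetical helper name. Treating FahCore_11 as the suspect is only what this thread suggests, not a general rule.

```python
import re

# Matches e.g. 'WU01:FS00:Running FahCore: "...FAHCoreWrapper.exe" .../FahCore_11.exe -dir 01 ...'
CORE_RE = re.compile(r"(WU\d+):(FS\d+):Running FahCore: .*?/(FahCore_\w+)\.exe")


def cores_by_slot(lines):
    """Map each folding slot (e.g. 'FS00') to the last FahCore started on it."""
    found = {}
    for line in lines:
        m = CORE_RE.search(line)
        if m:
            _wu, slot, core = m.groups()
            found[slot] = core
    return found


def slots_running(lines, core_name="FahCore_11"):
    """List the slots whose most recent core matches core_name."""
    return sorted(s for s, c in cores_by_slot(lines).items() if c == core_name)
```

Run against the log above, this should report FS00 as the slot using FahCore_11, which is the one that was stuck at 0%.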