Work unit exceeding deadline

Moderators: Site Moderators, FAHC Science Team

hglag
Posts: 2
Joined: Thu Dec 26, 2024 8:26 am
Location: Australia

Post by hglag »

Hi,

Is there a way to extend the deadline of a work unit? I have a task that has been running for a couple of days, and it looks like it is going to exceed its deadline by a few hours. Are there any tricks to prevent this from happening in the future?

Thanks
-Harry


Work Unit Details:

Run Time:    2d 06h
ETA:         1d 10h
Assign Time: 2d 06h ago
Deadline:    19h 39m
Timeout:     17h 15m
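
For reference, the gap between the ETA and the deadline can be checked directly from the figures above. This is a minimal Python sketch: `parse_duration` is a hypothetical helper written for this illustration (not part of the Folding@home client), and it assumes the durations are written in the `1d 10h` style shown here.

```python
import re
from datetime import timedelta

# Map the unit suffixes used in the client display to timedelta keywords.
UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}

def parse_duration(text: str) -> timedelta:
    """Parse strings such as '1d 10h' or '19h 39m' into a timedelta."""
    parts = re.findall(r"(\d+)\s*([dhms])", text)
    return timedelta(**{UNITS[unit]: int(value) for value, unit in parts})

eta = parse_duration("1d 10h")        # estimated time left to finish
deadline = parse_duration("19h 39m")  # time left until the deadline

overshoot = eta - deadline
print(overshoot)  # 14:21:00
```

On these numbers the ETA runs about 14 hours past the deadline, assuming the client's ETA is accurate.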


Logs:

21:27:21:I1:WU3:Requesting WU assignment for user hglag team 186
21:27:22:I1:WU3:Received WU assignment oTS4QX9pcXzZhZCNi7WFpd8VouOSkUPYgt7vV_E4THQ
21:27:22:I1:WU3:Downloading WU
21:27:38:I1:WU3:Received WU P18806 R29 C11 G332
21:27:38:I3:WU3:Started FahCore on PID 10996
21:27:38:I1:WU3:*********************** Log Started 2024-12-27T21:27:38Z ***********************
21:27:38:I1:WU3:************************** Gromacs Folding@home Core ***************************
21:27:38:I1:WU3: Core: Gromacs
21:27:38:I1:WU3: Type: 0xa9
21:27:38:I1:WU3: Version: 0.0.12
21:27:38:I1:WU3: Author: Joseph Coffland <[email protected]>
21:27:38:I1:WU3: Copyright: 2022 foldingathome.org
21:27:38:I1:WU3: Homepage: https://foldingathome.org/
21:27:38:I1:WU3: Date: Nov 15 2022
21:27:38:I1:WU3: Time: 13:31:08
21:27:38:I1:WU3: Compiler: Visual C++
21:27:38:I1:WU3: Options: /TP /std:c++17 /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
21:27:38:I1:WU3: Platform: win32 10
21:27:38:I1:WU3: Bits: 64
21:27:38:I1:WU3: Mode: Release
21:27:38:I1:WU3: SIMD: avx2_256
21:27:38:I1:WU3: OpenMP: ON
21:27:38:I1:WU3: CUDA: OFF
21:27:38:I1:WU3: OpenCL: OFF
21:27:38:I1:WU3: Args: -dir oTS4QX9pcXzZhZCNi7WFpd8VouOSkUPYgt7vV_E4THQ -suffix 01
21:27:38:I1:WU3: -version 8.4.9 -lifeline 16008 -np 1
21:27:38:I1:WU3:************************************ libFAH ************************************
21:27:38:I1:WU3: Date: Nov 15 2022
21:27:38:I1:WU3: Time: 13:30:33
21:27:38:I1:WU3: Compiler: Visual C++
21:27:38:I1:WU3: Options: /TP /std:c++14 /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
21:27:38:I1:WU3: Platform: win32 10
21:27:38:I1:WU3: Bits: 64
21:27:38:I1:WU3: Mode: Release
21:27:38:I1:WU3:************************************ CBang *************************************
21:27:38:I1:WU3: Date: Nov 15 2022
21:27:38:I1:WU3: Time: 13:29:57
21:27:38:I1:WU3: Compiler: Visual C++
21:27:38:I1:WU3: Options: /TP /std:c++14 /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
21:27:38:I1:WU3: Platform: win32 10
21:27:38:I1:WU3: Bits: 64
21:27:38:I1:WU3: Mode: Release
21:27:38:I1:WU3:************************************ System ************************************
21:27:38:I1:WU3: CPU: 13th Gen Intel(R) Core(TM) i7-13700
21:27:38:I1:WU3: CPU ID: GenuineIntel Family 6 Model 183 Stepping 1
21:27:38:I1:WU3: CPUs: 24
21:27:38:I1:WU3: Memory: 15.70GiB
21:27:38:I1:WU3:Free Memory: 6.70GiB
21:27:38:I1:WU3: Threads: WINDOWS_THREADS
21:27:38:I1:WU3: OS Version: 6.2
21:27:38:I1:WU3:Has Battery: false
21:27:38:I1:WU3: On Battery: false
21:27:38:I1:WU3: UTC Offset: 8
21:27:38:I1:WU3: PID: 10996
21:27:38:I1:WU3: CWD: C:\ProgramData\FAHClient\work
21:27:38:I1:WU3: Exec: C:\ProgramData\FAHClient\cores\gromacs-core-a9\windows-10-64bit\cpu-avx2_256-release\fahcore-a9-windows-10-64bit-cpu-avx2_256-release-0.0.12\FahCore_a9.exe
21:27:38:I1:WU3:********************************************************************************
21:27:38:I1:WU3:Project: 18806 (Run 29, Clone 11, Gen 332)
21:27:38:I1:WU3:Reading tar file core.xml
21:27:38:I1:WU3:Reading tar file frame332.tpr
21:27:38:I1:WU3:Digital signatures verified
21:27:38:I1:WU3:Calling: mdrun -c frame332.gro -s frame332.tpr -x frame332.xtc -cpt 5 -nt 1 -ntmpi 1 -update cpu -nb cpu -bonded cpu -pme cpu -pmefft cpu
21:27:39:I1:WU3:Steps: first=83000000 total=83250000
21:27:44:I1:WU3:Completed 1 out of 250000 steps (0%)
22:20:41:I1:WU3:Completed 2500 out of 250000 steps (1%)
23:13:37:I1:WU3:Completed 5000 out of 250000 steps (2%)
...
00:23:10:I1:WU3:Completed 142500 out of 250000 steps (57%)
01:16:31:I1:WU3:Completed 145000 out of 250000 steps (58%)
02:09:53:I1:WU3:Completed 147500 out of 250000 steps (59%)
03:03:07:I1:WU3:Completed 150000 out of 250000 steps (60%)
03:56:28:I1:WU3:Completed 152500 out of 250000 steps (61%)
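
The progress lines above are enough to estimate the remaining run time independently of the client's ETA. The following is a rough Python sketch, assuming the log format shown here and a constant step rate; `parse_progress` and `estimate_remaining` are hypothetical names used only for this illustration.

```python
import re
from datetime import timedelta

# The last five progress lines from the log above.
LOG = """\
00:23:10:I1:WU3:Completed 142500 out of 250000 steps (57%)
01:16:31:I1:WU3:Completed 145000 out of 250000 steps (58%)
02:09:53:I1:WU3:Completed 147500 out of 250000 steps (59%)
03:03:07:I1:WU3:Completed 150000 out of 250000 steps (60%)
03:56:28:I1:WU3:Completed 152500 out of 250000 steps (61%)
"""

PATTERN = re.compile(r"^(\d\d):(\d\d):(\d\d):.*Completed (\d+) out of (\d+) steps")

def parse_progress(log: str):
    """Return (seconds_since_midnight, steps_done, steps_total) per progress line."""
    points = []
    for line in log.splitlines():
        match = PATTERN.match(line)
        if match:
            hours, minutes, seconds, done, total = map(int, match.groups())
            points.append((hours * 3600 + minutes * 60 + seconds, done, total))
    return points

def estimate_remaining(points) -> timedelta:
    """Extrapolate the remaining time from the average seconds per step."""
    elapsed = 0
    for (t0, _, _), (t1, _, _) in zip(points, points[1:]):
        # Log timestamps carry no date, so wrap around midnight.
        elapsed += (t1 - t0) % 86400
    steps_done = points[-1][1] - points[0][1]
    steps_left = points[-1][2] - points[-1][1]
    return timedelta(seconds=round(steps_left * elapsed / steps_done))

print(estimate_remaining(parse_progress(LOG)))
```

At roughly 53 minutes per 1% of the work unit, the remaining 39% comes to a little over 1 day 10 hours, which is in line with the ETA reported in the work unit details.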
calxalot
Site Moderator
Posts: 1273
Joined: Sat Dec 08, 2007 1:33 am
Location: San Francisco, CA

Re: Work unit exceeding deadline

Post by calxalot »

No, you can’t extend the deadline.

You can run faster by using more than one CPU core and avoiding the efficiency cores. To do that, you will need to set cpus appropriately and use a utility such as Process Lasso.
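
As a rough illustration of how much faster the work unit would need to run, the sketch below uses the figures from the first post (ETA 1d 10h, deadline in 19h 39m, one thread per the `-np 1` in the log). It assumes ideal linear scaling with thread count, which real work units will not achieve, so the result is a lower bound only. `min_threads` is a hypothetical helper, not a client setting.

```python
import math
from datetime import timedelta

def min_threads(eta: timedelta, time_left: timedelta, current_threads: int = 1) -> int:
    """Lower bound on threads needed to finish in time.

    ASSUMPTION: throughput scales linearly with thread count, which is
    optimistic; real scaling is usually worse.
    """
    required_speedup = eta / time_left  # timedelta / timedelta gives a float
    return math.ceil(current_threads * required_speedup)

eta = timedelta(days=1, hours=10)
time_left = timedelta(hours=19, minutes=39)

print(round(eta / time_left, 2))    # 1.73
print(min_threads(eta, time_left))  # 2
```

So even under the most optimistic assumption, this work unit needed at least two threads from this point on to meet its deadline.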
muziqaz
Posts: 1060
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: Work unit exceeding deadline

Post by muziqaz »

Projects aren't supposed to be assigned to CPUs with only 1 thread assigned to fold.
FAH Omega tester

Re: Work unit exceeding deadline

Post by calxalot »

I mostly see 1 to 64 CPUs allowed in the assignment data when using a debug build.

Re: Work unit exceeding deadline

Post by muziqaz »

We need to remember to remind researchers to set a lower limit on CPU threads for their new projects. One core/thread, however powerful it might be, will not be enough for today's work units.
FAH Omega tester

Re: Work unit exceeding deadline

Post by calxalot »

And maybe the default minimum could be 2 on the servers.