More work units that will exceed expiry time

Posted: Sat Aug 10, 2024 10:31 am
by Blue_Bubble
I have noticed this on a few occasions in the last couple of weeks. Also, my daily points score has dropped sharply in that time, despite nothing having changed at my end. Is this a general issue?

Example from today:

Code:

12:08:12:WU00:FS00:Connecting to 13.59.134.176:8080
12:08:13:WU00:FS00:Assigned to work server 129.32.209.203
12:08:13:WU00:FS00:Requesting new work unit for slot 00: RUNNING cpu:4 from 129.32.209.203
12:08:13:WU00:FS00:Connecting to 129.32.209.203:8080
12:08:13:WU01:FS00:0xa8:Saving result file ../logfile_01.txt
12:08:13:WU01:FS00:0xa8:Saving result file dhdl.xvg
12:08:13:WU01:FS00:0xa8:Saving result file md.log
12:08:13:WU01:FS00:0xa8:Saving result file md4.gro
12:08:13:WU01:FS00:0xa8:Saving result file science.log
12:08:13:WU01:FS00:0xa8:Saving result file state.cpt
12:08:13:WU00:FS00:Downloading 3.52MiB
12:08:13:WU01:FS00:0xa8:Folding@home Core Shutdown: FINISHED_UNIT
12:08:13:WU01:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
12:08:13:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:19229 run:5485 clone:2 gen:4 core:0xa8 unit:0x04000000020000006d1500001d4b0000
12:08:13:WU01:FS00:Uploading 9.73MiB to 128.174.73.74
12:08:13:WU01:FS00:Connecting to 128.174.73.74:8080
12:08:14:WU00:FS00:Download complete
12:08:14:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:12422 run:39 clone:8 gen:1 core:0xa8 unit:0x01000000080000002700000086300000
12:08:14:WU00:FS00:Starting
12:08:14:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 00 -suffix 01 -version 705 -lifeline 2542 -checkpoint 30 -np 4
12:08:14:WU00:FS00:Started FahCore on PID 14071
12:08:14:WU00:FS00:Core PID:14075
12:08:14:WU00:FS00:FahCore 0xa8 started
12:08:14:WU00:FS00:0xa8:*********************** Log Started 2024-08-09T12:08:14Z ***********************
12:08:14:WU00:FS00:0xa8:************************** Gromacs Folding@home Core ***************************
12:08:14:WU00:FS00:0xa8:       Core: Gromacs
12:08:14:WU00:FS00:0xa8:       Type: 0xa8
12:08:14:WU00:FS00:0xa8:    Version: 0.0.12
12:08:14:WU00:FS00:0xa8:     Author: Joseph Coffland <[email protected]>
12:08:14:WU00:FS00:0xa8:  Copyright: 2020 foldingathome.org
12:08:14:WU00:FS00:0xa8:   Homepage: https://foldingathome.org/
12:08:14:WU00:FS00:0xa8:       Date: Jan 16 2021
12:08:14:WU00:FS00:0xa8:       Time: 19:24:44
12:08:14:WU00:FS00:0xa8:   Compiler: GNU 8.3.0
12:08:14:WU00:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
12:08:14:WU00:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie
12:08:14:WU00:FS00:0xa8:   Platform: linux2 4.15.0-128-generic
12:08:14:WU00:FS00:0xa8:       Bits: 64
12:08:14:WU00:FS00:0xa8:       Mode: Release
12:08:14:WU00:FS00:0xa8:       SIMD: avx2_256
12:08:14:WU00:FS00:0xa8:     OpenMP: ON
12:08:14:WU00:FS00:0xa8:       CUDA: OFF
12:08:14:WU00:FS00:0xa8:       Args: -dir 00 -suffix 01 -version 705 -lifeline 14071 -checkpoint 30 -np
12:08:14:WU00:FS00:0xa8:             4
12:08:14:WU00:FS00:0xa8:************************************ libFAH ************************************
12:08:14:WU00:FS00:0xa8:       Date: Jan 16 2021
12:08:14:WU00:FS00:0xa8:       Time: 19:21:38
12:08:14:WU00:FS00:0xa8:   Compiler: GNU 8.3.0
12:08:14:WU00:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
12:08:14:WU00:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie
12:08:14:WU00:FS00:0xa8:   Platform: linux2 4.15.0-128-generic
12:08:14:WU00:FS00:0xa8:       Bits: 64
12:08:14:WU00:FS00:0xa8:       Mode: Release
12:08:14:WU00:FS00:0xa8:************************************ CBang *************************************
12:08:14:WU00:FS00:0xa8:       Date: Jan 16 2021
12:08:14:WU00:FS00:0xa8:       Time: 19:21:24
12:08:14:WU00:FS00:0xa8:   Compiler: GNU 8.3.0
12:08:14:WU00:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
12:08:14:WU00:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
12:08:14:WU00:FS00:0xa8:   Platform: linux2 4.15.0-128-generic
12:08:14:WU00:FS00:0xa8:       Bits: 64
12:08:14:WU00:FS00:0xa8:       Mode: Release
12:08:14:WU00:FS00:0xa8:************************************ System ************************************
12:08:14:WU00:FS00:0xa8:        CPU: Intel(R) Core(TM) i7-4500U CPU @ 1.80GHz
12:08:14:WU00:FS00:0xa8:     CPU ID: GenuineIntel Family 6 Model 69 Stepping 1
12:08:14:WU00:FS00:0xa8:       CPUs: 4
12:08:14:WU00:FS00:0xa8:     Memory: 7.71GiB
12:08:14:WU00:FS00:0xa8:Free Memory: 3.17GiB
12:08:14:WU00:FS00:0xa8:    Threads: POSIX_THREADS
12:08:14:WU00:FS00:0xa8: OS Version: 4.15
12:08:14:WU00:FS00:0xa8:Has Battery: false
12:08:14:WU00:FS00:0xa8: On Battery: false
12:08:14:WU00:FS00:0xa8: UTC Offset: 1
12:08:14:WU00:FS00:0xa8:        PID: 14075
12:08:14:WU00:FS00:0xa8:        CWD: /var/lib/fahclient/work
12:08:14:WU00:FS00:0xa8:********************************************************************************
12:08:14:WU00:FS00:0xa8:Project: 12422 (Run 39, Clone 8, Gen 1)
12:08:14:WU00:FS00:0xa8:Unit: 0x00000000000000000000000000000000
12:08:14:WU00:FS00:0xa8:Reading tar file core.xml
12:08:14:WU00:FS00:0xa8:Reading tar file frame1.tpr
12:08:14:WU00:FS00:0xa8:Digital signatures verified
12:08:14:WU00:FS00:0xa8:Calling: mdrun -c frame1.gro -s frame1.tpr -x frame1.xtc -cpt 30 -nt 4 -ntmpi 1
12:08:14:WU00:FS00:0xa8:Steps: first=5000000 total=10000000
12:08:19:WU00:FS00:0xa8:Completed 1 out of 5000000 steps (0%)
12:08:21:WU01:FS00:Upload complete
12:08:21:WU01:FS00:Server responded WORK_ACK (400)
12:08:21:WU01:FS00:Final credit estimate, 25073.00 points
12:08:21:WU01:FS00:Cleaning up
13:57:46:WU00:FS00:0xa8:Completed 50000 out of 5000000 steps (1%)
******************************* Date: 2024-08-09 *******************************
15:43:44:WU00:FS00:0xa8:Completed 100000 out of 5000000 steps (2%)
17:28:56:WU00:FS00:0xa8:Completed 150000 out of 5000000 steps (3%)
19:13:10:WU00:FS00:0xa8:Completed 200000 out of 5000000 steps (4%)
20:56:08:WU00:FS00:0xa8:Completed 250000 out of 5000000 steps (5%)
******************************* Date: 2024-08-09 *******************************
22:39:33:WU00:FS00:0xa8:Completed 300000 out of 5000000 steps (6%)
00:24:26:WU00:FS00:0xa8:Completed 350000 out of 5000000 steps (7%)
02:05:39:WU00:FS00:0xa8:Completed 400000 out of 5000000 steps (8%)
03:47:17:WU00:FS00:0xa8:Completed 450000 out of 5000000 steps (9%)
******************************* Date: 2024-08-10 *******************************
05:29:26:WU00:FS00:0xa8:Completed 500000 out of 5000000 steps (10%)
07:16:16:WU00:FS00:0xa8:Completed 550000 out of 5000000 steps (11%)
09:09:48:WU00:FS00:0xa8:Completed 600000 out of 5000000 steps (12%)
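As a rough check on the log above, the frame timestamps can be turned into an average time per frame and a projected total run time. The sketch below is only illustrative: it assumes 100 equal frames per work unit, a machine that folds continuously, and less than 24 hours between consecutive frame lines (the log lines carry only a time of day). The helper names are made up for this example. On these numbers it comes out at roughly 1 h 45 min per frame, or a little over 7 days for the whole unit.

```python
import re
from datetime import timedelta

# Frame-completion lines copied from the log above (1% through 12%).
LOG = """\
13:57:46:WU00:FS00:0xa8:Completed 50000 out of 5000000 steps (1%)
15:43:44:WU00:FS00:0xa8:Completed 100000 out of 5000000 steps (2%)
17:28:56:WU00:FS00:0xa8:Completed 150000 out of 5000000 steps (3%)
19:13:10:WU00:FS00:0xa8:Completed 200000 out of 5000000 steps (4%)
20:56:08:WU00:FS00:0xa8:Completed 250000 out of 5000000 steps (5%)
22:39:33:WU00:FS00:0xa8:Completed 300000 out of 5000000 steps (6%)
00:24:26:WU00:FS00:0xa8:Completed 350000 out of 5000000 steps (7%)
02:05:39:WU00:FS00:0xa8:Completed 400000 out of 5000000 steps (8%)
03:47:17:WU00:FS00:0xa8:Completed 450000 out of 5000000 steps (9%)
05:29:26:WU00:FS00:0xa8:Completed 500000 out of 5000000 steps (10%)
07:16:16:WU00:FS00:0xa8:Completed 550000 out of 5000000 steps (11%)
09:09:48:WU00:FS00:0xa8:Completed 600000 out of 5000000 steps (12%)
"""

FRAME_RE = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}):WU\d+:FS\d+:0x[0-9a-f]+:"
    r"Completed \d+ out of \d+ steps \((\d+)%\)$"
)


def frame_points(log_text):
    """Return (percent, elapsed) pairs, adding a day each time the clock wraps."""
    points = []
    days = 0
    last_clock = None
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a frame-completion line
        hours, minutes, seconds, percent = map(int, match.groups())
        clock = timedelta(hours=hours, minutes=minutes, seconds=seconds)
        if last_clock is not None and clock < last_clock:
            days += 1  # time of day went backwards, so midnight has passed
        last_clock = clock
        points.append((percent, clock + timedelta(days=days)))
    return points


def average_tpf(points):
    """Average time per 1% frame between the first and last logged frame."""
    (first_pct, first_t), (last_pct, last_t) = points[0], points[-1]
    return (last_t - first_t) / (last_pct - first_pct)


points = frame_points(LOG)
tpf = average_tpf(points)
projected_total = tpf * 100  # assumption: 100 frames of equal length

print("average time per frame:", tpf)
print("projected full run:", projected_total)
```

Comparing the projected total against the timeout and deadline shown by the client for the work unit tells you early on whether it is worth continuing.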

Re: More work units that will exceed expiry time

Posted: Sat Aug 10, 2024 3:31 pm
by toTOW
These projects shouldn't be assigned to CPUs with fewer than 8 cores ... we sent a reminder to the researcher to check the project settings.

Re: More work units that will exceed expiry time

Posted: Sat Aug 10, 2024 4:18 pm
by Blue_Bubble
Thanks toTOW ... I've aborted that work unit and received a more suitable one.

Re: More work units that will exceed expiry time

Posted: Sat Aug 10, 2024 10:01 pm
by Marcos FRM
I have seen this here too. WUs are missing the deadline on my i7-4770:

Project: 12420 (Run 176, Clone 7, Gen 0) -- with 2 CPUs
Project: 12423 (Run 18, Clone 1, Gen 51) -- with 3 CPUs
Project: 12421 (Run 151, Clone 6, Gen 2) -- with 4 CPUs

Re: More work units that will exceed expiry time

Posted: Mon Aug 12, 2024 1:37 pm
by ETA_2025
toTOW wrote: Sat Aug 10, 2024 3:31 pm These projects shouldn't be assigned to CPUs with fewer than 8 cores ... we sent a reminder to the researcher to check the project settings.
Well, I keep getting these work units on my Raspberry Pi 4B's (4 core ARM64). Applying the correct constraints seems to be a game of whack-a-mole.

Project: 12420 (Run 39, Clone 4, Gen 5) took just under 6 hours to complete a frame!
Project: 12421 (Run 50, Clone 5, Gen 2) dumped before completing a frame.
Project: 12422 (Run 55, Clone 5, Gen 6) dumped before completing a frame.
Project: 12423 (Run 92, Clone 3, Gen 29) took just under 6 hours to complete a frame!

Other work units recently dumped:
Project: 12420 (Run 88, Clone 7, Gen 3)
Project: 12420 (Run 100, Clone 5, Gen 3)
Project: 12421 (Run 44, Clone 8, Gen 1)
Project: 12421 (Run 123, Clone 9, Gen 3)
Project: 12422 (Run 55, Clone 5, Gen 6)

Whenever I see a work unit with a base credit of 150,542 I dump it, because I know my hardware cannot complete it in time. So, there's clearly a problem with the constraints applied to projects in the range of 12420-12423.

Also toTOW, muziqaz suggests I shouldn't be folding using Raspberry Pi 4B's. Is this correct?

And, Project: 18800 (Run 0, Clone 161, Gen 545) and Project: 18457 (Run 0, Clone 94, Gen 25) were dumped, because they caused the Raspberry Pi 4B's to continuously reboot.

Re: More work units that will exceed expiry time

Posted: Mon Aug 12, 2024 2:24 pm
by toTOW
ETA_2025 wrote: Mon Aug 12, 2024 1:37 pm Also toTOW, muziqaz suggests I shouldn't be folding using Raspberry Pi 4B's. Is this correct?
Yes, it's too slow to make the deadlines on 95% of projects. I gave up folding on these a long time ago.

An RPi 5 might be able to make the deadlines, but I won't bother to try ...

Re: More work units that will exceed expiry time

Posted: Mon Aug 12, 2024 3:25 pm
by ETA_2025
All dumped work units in archived logs:
Project: 12420 (Run 2, Clone 9, Gen 2)
Project: 12420 (Run 24, Clone 4, Gen 5)
Project: 12420 (Run 32, Clone 2, Gen 23)
Project: 12420 (Run 73, Clone 3, Gen 13)
Project: 12420 (Run 90, Clone 8, Gen 0)
Project: 12420 (Run 97, Clone 3, Gen 15)
Project: 12420 (Run 137, Clone 4, Gen 3)
Project: 12420 (Run 149, Clone 9, Gen 0) (received twice)
Project: 12420 (Run 153, Clone 4, Gen 1)
Project: 12420 (Run 161, Clone 3, Gen 13)
Project: 12420 (Run 188, Clone 1, Gen 22)
Project: 12421 (Run 39, Clone 2, Gen 16)
Project: 12421 (Run 52, Clone 6, Gen 4)
Project: 12421 (Run 65, Clone 1, Gen 36)
Project: 12421 (Run 87, Clone 6, Gen 0)
Project: 12421 (Run 89, Clone 6, Gen 0)
Project: 12421 (Run 92, Clone 4, Gen 11)
Project: 12421 (Run 98, Clone 8, Gen 2)
Project: 12421 (Run 119, Clone 5, Gen 0)
Project: 12421 (Run 124, Clone 9, Gen 1)
Project: 12421 (Run 129, Clone 9, Gen 1)
Project: 12421 (Run 132, Clone 9, Gen 0)
Project: 12422 (Run 5, Clone 5, Gen 1)
Project: 12422 (Run 19, Clone 3, Gen 18)
Project: 12422 (Run 39, Clone 7, Gen 0)
Project: 12422 (Run 69, Clone 2, Gen 26)
Project: 12422 (Run 97, Clone 8, Gen 1)
Project: 12422 (Run 107, Clone 5, Gen 0)
Project: 12422 (Run 117, Clone 5, Gen 1)
Project: 12423 (Run 19, Clone 5, Gen 15)
Project: 12423 (Run 24, Clone 7, Gen 6)
Project: 12423 (Run 29, Clone 3, Gen 23)
Project: 12423 (Run 30, Clone 3, Gen 25)
Project: 12423 (Run 34, Clone 8, Gen 3)
Project: 12423 (Run 66, Clone 0, Gen 101)
Project: 12423 (Run 82, Clone 8, Gen 1)

There is clearly a problem if ARM64 hardware is getting these.

Re: More work units that will exceed expiry time

Posted: Mon Aug 12, 2024 3:39 pm
by ETA_2025
toTOW wrote: Mon Aug 12, 2024 2:24 pm
ETA_2025 wrote: Mon Aug 12, 2024 1:37 pm Also toTOW, muziqaz suggests I shouldn't be folding using Raspberry Pi 4B's. Is this correct?
Yes, it's too slow to make the deadlines on 95% of projects. I gave up folding on these a long time ago.

An RPi 5 might be able to make the deadlines, but I won't bother to try ...
There's no chance an RPi 5 could do these work units unless it's at least 6 times faster than an RPi 4B, as the RPi 4B would take around 24 days to complete one of these work units, which have a base credit of 150,542.0. That base credit makes it obvious that these work units are definitely not intended for ARM64 hardware.

And, with over a million 0xa8 work units, I expect it will be a while before there are no longer any work units available for RPi 4B's.
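
The arithmetic behind the "6 times faster" figure can be sketched as below. Note the assumptions: 100 frames per work unit (taken from the 1% progress steps in the client log earlier in the thread), and a deadline of about 4 days, which is not stated anywhere in this thread and is only what the 24 days and 6x figures imply when combined. The function names are hypothetical.

```python
FRAMES_PER_WU = 100  # assumption: one frame per 1% of progress


def days_to_complete(hours_per_frame, frames=FRAMES_PER_WU):
    """Total run time in days for a unit at a steady time per frame."""
    return hours_per_frame * frames / 24


def required_speedup(days_needed, deadline_days):
    """How many times faster the hardware must be to just meet the deadline."""
    return days_needed / deadline_days


# "Just under 6 hours" per frame puts the upper bound at 25 days,
# consistent with the "around 24 days" quoted in the post.
upper_bound_days = days_to_complete(6.0)

# Assumption: the deadline implied by "24 days" and "6 times faster".
implied_deadline_days = 24 / 6

print("upper bound for an RPi 4B (days):", upper_bound_days)
print("implied deadline (days):", implied_deadline_days)
print("speedup needed:", required_speedup(24, implied_deadline_days))
```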

Re: More work units that will exceed expiry time

Posted: Mon Aug 12, 2024 10:44 pm
by calxalot
Aside: my M1 Mac mini is able to do these, but is only slightly beating the timeout.

Re: More work units that will exceed expiry time

Posted: Wed Aug 14, 2024 3:07 pm
by toTOW
Constraints on p12420-23 have been fixed. These projects shouldn't assign any more to CPU <= 9 ...

Re: More work units that will exceed expiry time

Posted: Thu Aug 15, 2024 10:39 am
by ETA_2025
toTOW wrote: Wed Aug 14, 2024 3:07 pm Constraints on p12420-23 have been fixed. These projects shouldn't assign any more to CPU <= 9 ...
Thanks toTOW.

Re: More work units that will exceed expiry time

Posted: Thu Sep 12, 2024 12:57 am
by Kebast
I was surprised when I grabbed 12423 on my 5900X. I've allocated 12 threads to folding, TPF of 13m12s :).
Came here to check on it.
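
For comparison, a TPF (time per frame) figure like the one quoted here converts to a full-unit run time as follows. This is a minimal sketch assuming 100 frames per work unit and a TPF written in the `13m12s` style; `parse_tpf` is a hypothetical helper, not part of any Folding@home tool.

```python
import re
from datetime import timedelta


def parse_tpf(text):
    """Parse a TPF string such as '13m12s' or '1h44m' into a timedelta."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised TPF: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


tpf = parse_tpf("13m12s")
total = tpf * 100  # assumption: 100 frames per work unit

print("time per frame:", tpf)
print("full work unit:", total)
```

At 13m12s per frame the whole unit takes 22 hours.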