Early Unit End on 128.143.48.226
- Posts: 37
- Joined: Fri Dec 14, 2007 3:53 pm
- Hardware configuration: 1 Dell server running Quad Xeon 2.4 Deino, SMP client. 2 Vista GPU client and 2 6.23 BetaR1. 8 other XP clients running 6.23.
- Location: Portsmouth England
Early Unit End on 128.143.48.226
I have a 6.23 Windows client that is hitting EARLY_UNIT_END (EUE) over and over again. It downloads a unit, works on it for a few seconds and then fails again. It had done this about five times before I killed it. Is there a way of knowing whether the work units are faulty? This machine is not overclocked at all, by the way. Thanks.
Re: Early Unit End on 128.143.48.226
There is no dependable way to know if a WU is faulty. The general guideline is that if an occasional WU fails (no matter how many times it is reassigned), ignore it; if a variety of WUs fail, it's probably your hardware.
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Early Unit End on 128.143.48.226
My problem is that I cannot just ignore it, as it is constantly being re-downloaded. I am on a mobile broadband connection, so the quantity of data transferred matters to my total. Here are the relevant parts of my log:
Code:
[11:36:07] Preparing to commence simulation
[11:36:07] - Files status OK
[11:36:08] - Expanded 245273 -> 653388 (decompressed 266.3 percent)
[11:36:08]
[11:36:08] Project: 3861 (Run 147, Clone 3, Gen 6)
[11:36:08]
[11:36:08] Assembly optimizations on if available.
[11:36:08] Entering M.D.
[11:36:14] Gromacs cannot continue further.
[11:36:14] Going to send back what have done.
[11:36:14] logfile size: 0
[11:36:14] Warning: Core could not open logfile.
[11:36:14] - Writing 536 bytes of core data to disk...
[11:36:14] Done: 24 -> 69 (compressed to 287.5 percent)
[11:36:14] ... Done.
[11:36:14]
[11:36:14] Folding@home Core Shutdown: EARLY_UNIT_END
[11:36:17] CoreStatus = 72 (114)
[11:36:17] Sending work to server
[11:36:17] Project: 3861 (Run 147, Clone 3, Gen 6)
[11:36:17] + Attempting to send results [June 18 11:36:17 UTC]
[11:36:17] - Reading file work/wuresults_05.dat from core
[11:36:17] (Read 581 bytes from disk)
[11:36:17] Connecting to http://128.143.48.226:8080/
[11:36:18] Posted data.
[11:36:18] Initial: 0000; - Uploaded at ~1 kB/s
[11:36:18] - Averaged speed for that direction ~1 kB/s
[11:36:18] + Results successfully sent
[11:36:18] Thank you for your contribution to Folding@Home.
[11:36:22] + Attempting to get work packet
[11:36:22] - Will indicate memory of 383 MB
[11:36:22] - Connecting to assignment server
[11:36:22] Connecting to http://assign.stanford.edu:8080/
[11:36:24] Posted data.
[11:36:24] Initial: 8F80; - Successful: assigned to (128.143.48.226).
[11:36:24] + News From Folding@Home: Welcome to Folding@Home
[11:36:24] Loaded queue successfully.
[11:36:24] Connecting to http://128.143.48.226:8080/
[11:36:25] Posted data.
[11:36:31] Initial: 0000; - Receiving payload (expected size: 245785)
[11:36:31] Conversation time very short, giving reduced weight in bandwidth avg
[11:36:31] - Downloaded at ~480 kB/s
[11:36:31] - Averaged speed for that direction ~227 kB/s
[11:36:31] + Received work.
[11:36:31] Trying to send all finished work units
[11:36:31] + No unsent completed units remaining.
[11:36:31] + Closed connections
[11:36:36]
[11:36:36] + Processing work unit
[11:36:36] Core required: FahCore_7c.exe
[11:36:36] Core found.
[11:36:36] Working on queue slot 06 [June 18 11:36:36 UTC]
[11:36:36] + Working ...
[11:36:36] - Calling '.\FahCore_7c.exe -dir work/ -suffix 06 -checkpoint 15 -verbose -lifeline 3184 -version 623'
[11:36:37] *------------------------------*
[11:36:37] Folding@Home Double Gromacs Core C
[11:36:37] Version 1.00 (Thu Apr 24 19:12:09 PDT 2008)
[11:36:37]
[11:36:37] Preparing to commence simulation
[11:36:37] - Files status OK
[11:36:37] - Expanded 245273 -> 653388 (decompressed 266.3 percent)
[11:36:37]
[11:36:37] Project: 3861 (Run 147, Clone 3, Gen 6)
[11:36:37]
[11:36:37] Assembly optimizations on if available.
[11:36:37] Entering M.D.
[11:36:43] Gromacs cannot continue further.
[11:36:43] Going to send back what have done.
[11:36:43] logfile size: 0
[11:36:43] Warning: Core could not open logfile.
[11:36:43] - Writing 536 bytes of core data to disk...
[11:36:43] Done: 24 -> 69 (compressed to 287.5 percent)
[11:36:43] ... Done.
[11:36:43]
[11:36:43] Folding@home Core Shutdown: EARLY_UNIT_END
[11:36:47] CoreStatus = 72 (114)
[11:36:47] Sending work to server
[11:36:47] Project: 3861 (Run 147, Clone 3, Gen 6)
Re: Early Unit End on 128.143.48.226
pompeyrodney wrote:
My problem is that I cannot just ignore it, as it is constantly being re-downloaded. I am on a mobile broadband connection, so the quantity of data transferred matters to my total.

For people on metered connections (including mobile connections and the like), the quickest way to stop repeated downloads of the same defective WU is to change your MachineID. Do not use this method if you have previous results which still need to be uploaded, though. It's essentially a variation on sneakernetting, except that only one computer is involved.
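On the v6 console client the machine ID can usually be changed either by re-running the client's configuration prompts or by editing client.cfg while the client is stopped. The snippet below is only a sketch: the section and key names are an assumption and may differ between client versions, and the commented line is just an annotation, so check your own client.cfg rather than copying it verbatim.

Code:
[settings]
username=your_name
team=0
; assumption: the 6.x console client stores the machine ID under this key
machineid=2

Stop the client before editing and, as noted above, only do this once any finished results have already been uploaded.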
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Early Unit End on 128.143.48.226
I've had a few EUEs at 0% with gen 0 p3864 WUs.
p3864, r383, c19, g0 on June 11th
p3864, r385, c19, g0 on June 11th
p3864, r388, c17, g0 on June 27th (today)
p3864, r398, c4, g0 on Feb 24th
Notice the range of Run numbers, all nicely close together.
This might be a coincidence, but then I have not had any other p3864 WUs within this range of Runs.
Could it be a bad run of Runs?
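For anyone wanting to compile a list like the one above without reading the whole log by hand, a short script along these lines can pull out the Project (Run, Clone, Gen) line that precedes each EARLY_UNIT_END shutdown. It is only a sketch: it assumes a v6-style FAHlog.txt with the same line formats shown earlier in this thread, and the file name is a placeholder for your own log path.

Code:
import re

# Sketch: list the WUs in a v6-style FAHlog.txt that ended in EARLY_UNIT_END.
# Assumes the "Project: N (Run R, Clone C, Gen G)" line format shown earlier
# in this thread; adjust the path to point at your own log file.
project_re = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")

last_wu = None
failures = []
with open("FAHlog.txt") as log:
    for line in log:
        match = project_re.search(line)
        if match:
            last_wu = match.groups()
        elif "EARLY_UNIT_END" in line and last_wu:
            failures.append(last_wu)
            last_wu = None  # avoid counting the same shutdown twice

for proj, run, clone, gen in failures:
    print("p%s, r%s, c%s, g%s" % (proj, run, clone, gen))

Each WU is reported once per failure, so a unit that EUEs repeatedly (like the p3861 unit earlier in this thread) shows up once for every shutdown.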