171.67.108.25
Posted: Sat Apr 09, 2011 7:29 am
by Monterone
Hi,
Two days ago, project 6513 (run 17, clone 274, gen 51) failed (Bad WU), and it won't upload. I'm using the v7 client.
Here's the log data.
Code:
07:12:32:Sending unit results: id:01 state:SEND project:6513 run:17 clone:274 gen:51 core:0x78 unit:0x225dd2554d9d926b0033011200111971
07:12:32:Unit 01: Uploading 4.40KiB
07:12:32:Connecting to 171.64.65.62:8080
07:12:32:WARNING: Exception: Failed to send results to work server: Failed to read response packet
07:12:32:Trying to send results to collection server
07:12:32:Unit 01: Uploading 4.40KiB
07:12:32:Connecting to 171.67.108.25:8080
07:12:34:WARNING: WorkServer connection failed on port 8080 trying 80
07:12:34:Connecting to 171.67.108.25:80
07:12:35:ERROR: Exception: Failed to connect to 171.67.108.25:80: Es konnte keine Verbindung hergestellt werden, da der Zielcomputer die Verbindung verweigerte.
The last line means: No connection could be established, because the target computer denied the connection.
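For anyone who wants to pull the project/run/clone/gen numbers out of a v7 log instead of copying them by hand, here is a minimal Python sketch. It assumes the "Sending unit results" line always has the space-separated key:value layout shown in the log above; the function name `parse_send_line` is made up for this example and is not part of the client.

```python
import re

# Matches one "key:value" field such as "project:6513" or "core:0x78".
FIELD = re.compile(r"(\w+):(\S+)")

def parse_send_line(line: str) -> dict:
    """Return the key:value fields of a v7 'Sending unit results' log line.

    Hypothetical helper; the field layout is assumed from the log above.
    """
    marker = "Sending unit results:"
    _, found, rest = line.partition(marker)
    if not found:
        raise ValueError("not a 'Sending unit results' line")
    return dict(FIELD.findall(rest))

line = ("07:12:32:Sending unit results: id:01 state:SEND project:6513 "
        "run:17 clone:274 gen:51 core:0x78 "
        "unit:0x225dd2554d9d926b0033011200111971")
fields = parse_send_line(line)

# Project, run, clone and gen as integers, in the usual order.
prcg = tuple(int(fields[k]) for k in ("project", "run", "clone", "gen"))
print(prcg)  # (6513, 17, 274, 51)
```

The resulting tuple is the same (6513, 17, 274, 51) quoted at the top of this post, which is what people usually ask for when checking a WU.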
Re: 171.67.108.25
Posted: Sat Apr 09, 2011 3:30 pm
by gwildperson
I think this might be the same problem reported here:
https://fah-web.stanford.edu/projects/F ... ticket/615. Two days ago did you get BAD_WORK_UNIT (114) or was it some other error?
Re: 171.67.108.25
Posted: Sat Apr 09, 2011 4:45 pm
by Monterone
Oh, it was an EUE and not a bad WU. Excuse me!
Here's the log:
Code:
Writing local files
Completed 37500 out of 250000 steps (15%)
Gromacs cannot continue further.
Going to send back what have done.
logfile size: 12592
- Writing 13128 bytes of core data to disk...
Done: 12616 -> 3989 (compressed to 31.6 percent)
... Done.
Folding@home Core Shutdown: EARLY_UNIT_END
Re: 171.67.108.25
Posted: Sat Apr 09, 2011 5:04 pm
by toTOW
The 171.67.108.25 server is a collection server ... your WU is coming from 171.64.65.62.
Re: 171.67.108.25
Posted: Sun Apr 10, 2011 8:21 am
by Monterone
Sorry, I don't know the difference between a work server and a collection server. "Collection" sounds to me as if it collects the WUs, and "work" as if it distributes the work to the folders.
But the fact is that the WU doesn't upload...
Re: 171.67.108.25
Posted: Sun Apr 10, 2011 1:36 pm
by toTOW
The collection server is supposed to collect WUs that can't be returned to their associated work server (because it's down or overloaded), but it doesn't always work as expected.
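To make the fallback easier to picture, here is a rough Python sketch of the order of attempts as it appears in the v7 logs in this thread: work server on port 8080, then collection server on 8080, then collection server on 80. This is only a model of what the logs show, not the client's actual code; `upload_order`, `try_upload` and the `send` callback are hypothetical names.

```python
from typing import Callable, List, Optional, Tuple

Endpoint = Tuple[str, int]  # (host, port)

def upload_order(work_server: str, collection_server: str) -> List[Endpoint]:
    """Order of attempts as observed in this thread's v7 logs (an assumption)."""
    return [
        (work_server, 8080),        # the WU's own work server first
        (collection_server, 8080),  # then the collection server
        (collection_server, 80),    # then the collection server on port 80
    ]

def try_upload(endpoints: List[Endpoint],
               send: Callable[[str, int], None]) -> Optional[Endpoint]:
    """Try each endpoint in turn; return the first that accepts, else None."""
    for host, port in endpoints:
        try:
            send(host, port)
            return (host, port)
        except OSError:
            # Refused or failed connection: fall through to the next endpoint.
            continue
    return None  # nothing accepted, so the WU stays in the queue

def refuse_everything(host: str, port: int) -> None:
    """Stand-in for a real upload: every server refuses, as in the logs."""
    raise ConnectionRefusedError(f"{host}:{port} refused the connection")

order = upload_order("171.64.65.62", "171.67.108.25")
print(try_upload(order, refuse_everything))  # None
```

When every endpoint fails, as here, the function returns `None`, which corresponds to the WU staying in the queue and being retried later.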
Re: 171.67.108.25
Posted: Sun Apr 10, 2011 11:27 pm
by PantherX
If you want some more information regarding the servers, you might want to read this thread -> viewtopic.php?f=18&t=17794
Re: 171.67.108.25
Posted: Mon Apr 11, 2011 12:44 pm
by Monterone
Thank you! But my problem is still there, and I don't know if it's a problem with the server or the client. Is there a way to delete a WU from the queue with v7?
Re: 171.67.108.25
Posted: Mon Apr 11, 2011 9:34 pm
by PantherX
There is a way to delete the WU from the queue: follow bruce's post to the letter (viewtopic.php?p=182589#p182589).
Re: 171.67.108.25
Posted: Mon Apr 11, 2011 9:58 pm
by gwildperson
How do I do it in OS X?
Re: 171.67.108.25
Posted: Tue Apr 12, 2011 12:49 am
by k1wi
Code:
[23:59:21] Project: 6504 (Run 1, Clone 199, Gen 95)
[23:59:21] - Read packet limit of 540015616... Set to 524286976.
[23:59:21] + Attempting to send results [April 11 23:59:21 UTC]
[00:00:02] - Unknown packet returned from server, expected ACK for results
[00:00:02] - Error: Could not transmit unit 02 (completed April 11) to work server.
[00:00:02] - Read packet limit of 540015616... Set to 524286976.
[00:00:02] + Attempting to send results [April 12 00:00:02 UTC]
[00:00:02] - Couldn't send HTTP request to server
[00:00:02] (Got status 502)
[00:00:02] + Could not connect to Work Server (results)
[00:00:02] (171.67.108.25:8080)
[00:00:02] + Retrying using alternative port
[00:00:02] - Couldn't send HTTP request to server
[00:00:02] (Got status 503)
[00:00:02] + Could not connect to Work Server (results)
[00:00:02] (171.67.108.25:80)
[00:00:02] Could not transmit unit 02 to Collection server; keeping in queue.
The server appears to be in NOT ACCEPT mode...
It's possible it's my connection/proxy, as another user here is experiencing the same issue with an SMP unit.
Re: 171.67.108.25
Posted: Tue Apr 12, 2011 5:35 am
by Blacksmith1
WU 5797 completed. It is not uploading because the server it is trying to upload to (171.67.108.25) is in NOT ACCEPT mode. 11 tries as of this post.
I would post the full log, but it is too large and too cluttered, since v7 puts all the client info into one log.
I restarted FAHControl and got this from the log:
Code:
05:43:57:Sending unit results: id:02 state:SEND project:5797 run:21 clone:996 gen:7 core:0x11 unit:0x5bb422fb4da3754f000703e4001516a5
05:43:57:Unit 02: Uploading 5.42KiB
05:43:57:Connecting to 171.64.65.106:8080
05:43:57:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
05:43:58:Trying to send results to collection server
05:43:58:Unit 02: Uploading 5.42KiB
05:43:58:Connecting to 171.67.108.25:8080
05:44:00:WARNING: WorkServer connection failed on port 8080 trying 80
05:44:00:Connecting to 171.67.108.25:80
05:44:01:ERROR: Exception: Failed to connect to 171.67.108.25:80: No connection could be made because the target machine actively refused it.
05:45:34:Sending unit results: id:02 state:SEND project:5797 run:21 clone:996 gen:7 core:0x11 unit:0x5bb422fb4da3754f000703e4001516a5
05:45:34:Unit 02: Uploading 5.42KiB
05:45:34:Connecting to 171.64.65.106:8080
05:45:34:WARNING: Exception: Failed to send results to work server: Failed to read response packet: HTTP_OK
05:45:34:Trying to send results to collection server
05:45:34:Unit 02: Uploading 5.42KiB
05:45:34:Connecting to 171.67.108.25:8080
05:45:36:WARNING: WorkServer connection failed on port 8080 trying 80
05:45:36:Connecting to 171.67.108.25:80
05:45:37:ERROR: Exception: Failed to connect to 171.67.108.25:80: No connection could be made because the target machine actively refused it.
The work server says it is accepting, but I can't connect to it for some reason.
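Since the v7 log mixes everything together, a small script can count the refused connections per server instead of scrolling through it. This is just a sketch in Python, assuming the "Failed to connect to host:port" wording shown in the log above; `count_refusals` is a made-up helper name.

```python
import re
from collections import Counter

# Captures host and port from lines like
# "ERROR: Exception: Failed to connect to 171.67.108.25:80: ..."
FAILED = re.compile(r"Failed to connect to ([\d.]+):(\d+)")

def count_refusals(log_text: str) -> Counter:
    """Count 'Failed to connect' errors per (host, port) in a log excerpt."""
    return Counter((host, int(port)) for host, port in FAILED.findall(log_text))

# Three lines copied from the log above.
excerpt = """\
05:44:01:ERROR: Exception: Failed to connect to 171.67.108.25:80: No connection could be made because the target machine actively refused it.
05:45:34:Connecting to 171.64.65.106:8080
05:45:37:ERROR: Exception: Failed to connect to 171.67.108.25:80: No connection could be made because the target machine actively refused it.
"""

for (host, port), n in count_refusals(excerpt).items():
    print(f"{host}:{port} refused {n} time(s)")
```

On the excerpt above this reports two refusals for 171.67.108.25:80. Running it over the whole log file would give the total number of failed tries per server.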
Re: 171.67.108.25
Posted: Tue Apr 12, 2011 8:36 pm
by k1wi
Can someone check whether the following work units have been updated?
Project: 6504 (Run 1, Clone 199, Gen 95)
Project: 6516 (Run 5, Clone 212, Gen 36)
I'm still getting the same send errors as in my previous post on these two, while continuing to download and upload new work units successfully.
Re: 171.67.108.25
Posted: Tue Apr 12, 2011 10:08 pm
by PantherX
No data in the WU Database yet for both WUs:
No data back from query
So I have marked them for a follow-up.
Re: 171.67.108.25
Posted: Tue Apr 12, 2011 10:24 pm
by k1wi
Thanks Panther, it seems the client is still trying to upload them to the server that is in NOT ACCEPT...
I thought they might have gone through without that being relayed back to the client, but it seems they simply can't be returned.
I don't think it's my connection, because other work units have been able to be uploaded.