Server 171.67.108.20 is not reachable

Posted: Sat Jan 24, 2009 1:50 pm
by bdo
When I put the IP address of this server in my browser, I don't receive an answer, but on the server status page it is in Accept mode. I have no problem with the other servers.
The problem is that I am trying to send a WU to this server.
Can you see where the problem is?
Thanks

Code:

[13:09:42] - Autosending finished units... [January 24 13:09:42 UTC]
[13:09:42] Trying to send all finished work units
[13:09:42] Project: 4436 (Run 30, Clone 4, Gen 7)
[13:09:42] - Read packet limit of 540015616... Set to 524286976.


[13:09:42] + Attempting to send results [January 24 13:09:42 UTC]
[13:09:42] - Reading file work/wuresults_06.dat from core
[13:09:42] Working on queue slot 07 [January 24 13:09:42 UTC]
[13:09:42] + Working ...
[13:09:42] - Calling '.\FahCore_7c.exe -dir work/ -suffix 07 -checkpoint 15 -service -verbose -lifeline 4448 -version 623'

[13:09:42]   (Read 2349770 bytes from disk)
[13:09:42] Connecting to http://171.67.108.20:8080/
[13:09:42] 
[13:09:42] *------------------------------*
[13:09:42] Folding@Home Double Gromacs Core C
[13:09:42] Version 1.00 (Thu Apr 24 19:12:09 PDT 2008)
[13:09:42] 
[13:09:42] Preparing to commence simulation
[13:09:42] - Files status OK
[13:09:43] - Expanded 220976 -> 620709 (decompressed 280.8 percent)
[13:09:43] 
[13:09:43] Project: 3859 (Run 2272, Clone 0, Gen 9)
[13:09:43] 
[13:09:43] Assembly optimizations on if available.
[13:09:43] Entering M.D.
[13:09:49] Will resume from checkpoint file
[13:09:49] Working on p3850_fkbprelative_ligand
[13:09:49] Completed 0 out of 1000000 steps  (0%)
[13:09:49] Extra SSE2 boost OK
[13:09:49] Resuming from checkpoint
[13:09:50] Verified work/wudata_07.log
[13:09:50] Verified work/wudata_07.edr
[13:09:50] Verified work/wudata_07.xvg
[13:09:50] Verified work/wudata_07.trr
[13:09:50] Verified work/wudata_07.xtc
[13:09:50] Completed 66191 out of 1000000 steps  (6%)
[13:15:39] - Couldn't send HTTP request to server
[13:15:39] + Could not connect to Work Server (results)
[13:15:39]     (171.67.108.20:8080)
[13:15:39] + Retrying using alternative port
[13:15:39] Connecting to http://171.67.108.20:80/
[13:19:00] Completed 70000 out of 1000000 steps  (7%)
[13:19:12] - Couldn't send HTTP request to server
[13:19:12] + Could not connect to Work Server (results)
[13:19:12]     (171.67.108.20:80)
[13:19:12] - Error: Could not transmit unit 06 (completed January 24) to work server.
[13:19:12] - 4 failed uploads of this unit.
[13:19:12]   Keeping unit 06 in queue.
[13:19:12] + Sent 0 of 1 completed units to the server
[13:19:12] - Autosend completed
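
For anyone comparing notes, here is a rough sketch of how the failed upload attempts could be pulled out of a log like the one above. It assumes the log lines look exactly like those in this post (a timestamp in brackets, then the message); the function name `summarize_upload_failures` and the `SAMPLE` text are made up for illustration and are not part of the client.

```python
import re

# Matches the "(host:port)" line the client prints after a failed connection,
# e.g. "[13:15:39]     (171.67.108.20:8080)". Format assumed from the log above.
ENDPOINT = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s+\((\d+\.\d+\.\d+\.\d+):(\d+)\)\s*$")
# Matches "[13:19:12] - 4 failed uploads of this unit."
FAILED = re.compile(r"^\[\d{2}:\d{2}:\d{2}\] - (\d+) failed uploads of this unit\.")

def summarize_upload_failures(log_text):
    """Return (failed endpoints as (time, host, port), last reported failure count)."""
    endpoints = []
    failed_count = None
    for line in log_text.splitlines():
        m = ENDPOINT.match(line)
        if m:
            endpoints.append((m.group(1), m.group(2), int(m.group(3))))
            continue
        m = FAILED.match(line)
        if m:
            failed_count = int(m.group(1))
    return endpoints, failed_count

# A few lines copied from the log above, used as sample input.
SAMPLE = """\
[13:15:39] - Couldn't send HTTP request to server
[13:15:39] + Could not connect to Work Server (results)
[13:15:39]     (171.67.108.20:8080)
[13:15:39] + Retrying using alternative port
[13:19:12]     (171.67.108.20:80)
[13:19:12] - 4 failed uploads of this unit.
"""
```

Calling `summarize_upload_failures(SAMPLE)` lists the two endpoints the client tried (port 8080, then the alternative port 80) together with the failure count it reported.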

Re: Server 171.67.108.20 is not reachable

Posted: Sat Jan 31, 2009 5:50 pm
by new08
You are not alone. One week later, things are still the same.
Before, it was 108.17 that was down, and I lost quite a few units. I used 108.20 to test in the browser and it was OK then.
Now I'm on the same tack with this one.
With no name against the server, maybe it's just pot luck whether it gets checked or booted?
VJ said in reply to my earlier query that some new code was going in on 108.17 during Jan '09. Maybe it's affecting this one now as well?

Re: Server 171.67.108.20 is not reachable

Posted: Sun Feb 01, 2009 10:37 am
by new08
Sunday 1 Feb: Though this server is showing as Accepting, it appears to be more in reject mode, with no uploads occurring, still not responding, and only one connection live at present.
One has to ask: how difficult is it to get and keep servers online?
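
For those who want to test from their own machine rather than relying on the status page, a minimal sketch of a reachability check follows. It only tries to open a TCP connection on the two ports the client log mentions (8080 and 80); the helper names `port_is_open` and `check_server` are invented here, and an open port does not prove the work server would actually accept an upload.

```python
import socket

def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name-resolution failures.
        return False

def check_server(host, ports=(8080, 80), timeout=5.0):
    """Map each port to whether a TCP connection could be opened."""
    return {port: port_is_open(host, port, timeout) for port in ports}

# Example call (needs network access):
#   check_server("171.67.108.20")
```

If both ports come back `False` while the status page still says Accepting, that matches what is being reported in this thread.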

Re: Server 171.67.108.20 is not reachable

Posted: Mon Feb 02, 2009 10:47 pm
by old_fool
I am having problems with 171.67.108.20, too!

Is anyone reading these fora at all?

And yes, does it take such great effort to keep a server running? I mean, thousands of people are doing a lot of voluntary work for Stanford. It would be really nice if the project team at Stanford could put some care into keeping the servers up.

Bear with me guys, I'm trying to be polite. But I'm fuming.

Re: Server 171.67.108.20 is not reachable

Posted: Tue Feb 03, 2009 5:35 pm
by old_fool
Can we contact an admin?

Re: Server 171.67.108.20 is not reachable

Posted: Wed Feb 04, 2009 6:17 pm
by alpha
The stats page says it is ACTIVE, yet my WU is still trying to upload to Stanford.

Are we being ignored, or do they not have time to look into this problem that's been going on since Jan. 24, 2009?

Re: Server 171.67.108.20 is not reachable

Posted: Thu Feb 05, 2009 6:26 am
by new08
I see that 108.20 is the only server in that range that does not have a collection server (CS column) allocated.
Is this pertinent?
I have only one day's grace to upload this unit on the 22nd Feb when the unit finishes.
If that fails [after three unit reruns], then I will not use that client again, as I've wasted a lot of time on this part of the project.

Re: Server 171.67.108.20 is not reachable

Posted: Thu Feb 05, 2009 12:45 pm
by old_fool
Hey new08,

What do you mean by "not use this client"? I didn't think it was the client that determines which server you get your work units from, is it? If it is, should one re-install the client, and will that assign a different server?

I stopped this computer 2 days ago. I only used it for checking my e-mails in the morning, and was keeping it up mostly for FAH. If these guys are unwilling to do the most basic maintenance work, then I won't bother either. I'll save the electricity, thanks.

What infuriates me most, however, is that I have this ideal that with these WUs I am helping, albeit very little, in research that may one day save the lives of people with leukemia, Alzheimer's or CJD... or such. I'm not doing it for Stanford per se. There IS a bigger picture there!

Re: Server 171.67.108.20 is not reachable

Posted: Thu Feb 05, 2009 1:22 pm
by alpha
I have 4 WUs that need this server and will die in a couple of days. I may just stop F@H too; it gets a little saddening for all that work to just die.

Re: Server 171.67.108.20 is not reachable

Posted: Thu Feb 05, 2009 2:19 pm
by new08
Quote: "If it is, should one re-install the client"
I have upgraded the client to 6.23, but this is the last shot.
Certain OSes and clients do seem to need certain servers, so my move away was on this basis.
I do have other PCs running, and all of them get this problem sometimes, but not quite so consistently as with this .108.? ident!

Re: Server 171.67.108.20 is not reachable

Posted: Thu Feb 05, 2009 2:28 pm
by old_fool
new08 wrote: I have upgraded the client to 6.23, but this is the last shot.
Certain OSes and clients do seem to need certain servers, so my move away was on this basis.
I do have other PCs running, and all of them get this problem sometimes, but not quite so consistently as with this .108.? ident!

Actually, that will probably not do the trick with your existing processed WUs. Just check this thread:
viewtopic.php?f=18&t=6302

The problem is that FAH has a broken-ass architecture, and the collection server is hard-coded in each WU. This wouldn't be a problem if each admin did his/her work conscientiously, but they don't. How long has this comedy been going on? A week? Almost two weeks? And a server that is UNABLE to collect WUs is still distributing them, basically wasting our resources for nothing, just because a responsible person couldn't be bothered to go and unplug the damn server. Until that server is up, it's going to spread WUs that will not be collected.

Re: Server 171.67.108.20 is not reachable

Posted: Thu Feb 05, 2009 3:25 pm
by toTOW
I've sent a mail to Vijay to get more details about this server.

Re: Server 171.67.108.20 is not reachable

Posted: Thu Feb 05, 2009 4:43 pm
by VijayPande
There was a strange issue with this one that was keeping the problem off the serverstat radar. I've fixed that, and we should be OK from here on out.

Re: Server 171.67.108.20 is not reachable

Posted: Thu Feb 05, 2009 11:11 pm
by alpha
Thank you, all my WUs have uploaded. I appreciate that someone looked at this thread and contacted the correct person to fix it.
Have a great day all.

Re: Server 171.67.108.20 is not reachable

Posted: Fri Feb 06, 2009 2:10 pm
by old_fool
alpha wrote: Thank you, all my WUs have uploaded. I appreciate that someone looked at this thread and contacted the correct person to fix it.
Have a great day all.

I concur. Thanks for fixing the issue.