171.64.122.70 in Reject Status
Now that I have a WU that finished and appears to be okay, the server is in REJECT status.
Re: 171.64.122.70 in Reject Status
It'll be back up shortly (within the day). It's down at the moment because it's setting up a new batch of work units.
Re: 171.64.122.70 in Reject Status
I have completed this Project: 5906 (Run 12, Clone 775, Gen 1) and see that 171.64.122.70 is in reject mode now (which is no biggie). Thank you ihaque for your quick information.
What gets me is that, after a few tries to the work server, it is sent to a collection server that has been down since Sat Feb 6 15:30:10 PST 2010 (171.65.103.100 - VSPMF33 - CS 1 DOWN)!
Was this collection server working when this batch of GPU WUs was loaded?
[21:40:52] + No unsent completed units remaining.
[21:40:52] - Preparing to get new work unit...
[21:40:52] + Attempting to get work packet
[21:40:52] - Will indicate memory of 3071 MB
[21:40:52] - Connecting to assignment server
[21:40:52] Connecting to http://assign-GPU.stanford.edu:8080/
[21:40:53] Posted data.
[21:40:53] Initial: 40AB; - Successful: assigned to (171.64.122.70).
[21:40:53] + News From Folding@Home: Welcome to Folding@Home
[21:40:53] Loaded queue successfully.
[21:40:53] Connecting to http://171.64.122.70:8080/
[21:40:54] Posted data.
[21:40:54] Initial: 0000; - Receiving payload (expected size: 70669)
[21:40:54] Conversation time very short, giving reduced weight in bandwidth avg
[21:40:54] - Downloaded at ~138 kB/s
[21:40:54] - Averaged speed for that direction ~62 kB/s
[21:40:54] + Received work.
[21:40:54] Trying to send all finished work units
[21:40:54] + No unsent completed units remaining.
[21:40:54] + Closed connections
[21:40:54]
[21:40:54] + Processing work unit
[21:40:54] Core required: FahCore_14.exe
[21:40:54] Core found.
[21:40:54] Working on queue slot 04 [February 19 21:40:54 UTC]
[21:40:54] + Working ...
[21:40:54] - Calling '.\FahCore_14.exe -dir work/ -suffix 04 -priority 96 -checkpoint 15 -verbose -lifeline 5396 -version 623'
[21:40:54]
[21:40:54] *------------------------------*
[21:40:54] Folding@Home GPU Core - Beta
[21:40:54] Version 1.26 (Wed Oct 14 13:09:26 PDT 2009)
[21:40:54]
[21:40:54] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[21:40:54] Build host: vspm46
[21:40:54] Board Type: Nvidia
[21:40:54] Core :
[21:40:54] Preparing to commence simulation
[21:40:54] - Looking at optimizations...
[21:40:54] - Created dyn
[21:40:54] - Files status OK
[21:40:55] - Expanded 70157 -> 360060 (decompressed 513.2 percent)
[21:40:55] Called DecompressByteArray: compressed_data_size=70157 data_size=360060, decompressed_data_size=360060 diff=0
[21:40:55] - Digital signature verified
[21:40:55]
[21:40:55] Project: 5906 (Run 12, Clone 775, Gen 1)
[21:40:55]
[21:40:55] Assembly optimizations on if available.
[21:40:55] Entering M.D.
[21:41:01] Tpr hash work/wudata_04.tpr: 1233736580 1650430303 1222906360 568095948 536277868
[21:41:01] Working on Protein
[21:41:02] Client config found, loading data.
[21:41:02] Starting GUI Server
[21:42:00] Completed 1%
Skip
[00:02:13] Completed 100%
[00:02:13] Successful run
[00:02:13] DynamicWrapper: Finished Work Unit: sleep=10000
[00:02:23] Reserved 11312 bytes for xtc file; Cosm status=0
[00:02:23] Allocated 11312 bytes for xtc file
[00:02:23] - Reading up to 11312 from "work/wudata_04.xtc": Read 11312
[00:02:23] Read 11312 bytes from xtc file; available packet space=786419152
[00:02:23] xtc file hash check passed.
[00:02:23] Reserved 23472 23472 786419152 bytes for arc file=<work/wudata_04.trr> Cosm status=0
[00:02:23] Allocated 23472 bytes for arc file
[00:02:23] - Reading up to 23472 from "work/wudata_04.trr": Read 23472
[00:02:23] Read 23472 bytes from arc file; available packet space=786395680
[00:02:23] trr file hash check passed.
[00:02:23] Allocated 560 bytes for edr file
[00:02:23] Read bedfile
[00:02:23] edr file hash check passed.
[00:02:23] Allocated 61571 bytes for logfile
[00:02:23] Read logfile
[00:02:23] GuardedRun: success in DynamicWrapper
[00:02:23] GuardedRun: done
[00:02:23] Run: GuardedRun completed.
[00:02:25] - Writing 97427 bytes of core data to disk...
[00:02:25] Done: 96915 -> 43356 (compressed to 44.7 percent)
[00:02:25] ... Done.
[00:02:25] - Shutting down core
[00:02:25]
[00:02:25] Folding@home Core Shutdown: FINISHED_UNIT
[00:02:28] CoreStatus = 64 (100)
[00:02:28] Unit 4 finished with 97 percent of time to deadline remaining.
[00:02:28] Updated performance fraction: 0.983769
[00:02:28] Sending work to server
[00:02:28] Project: 5906 (Run 12, Clone 775, Gen 1)
[00:02:28] - Read packet limit of 540015616... Set to 524286976.
[00:02:28] + Attempting to send results [February 20 00:02:28 UTC]
[00:02:28] - Reading file work/wuresults_04.dat from core
[00:02:28] (Read 43868 bytes from disk)
[00:02:28] Connecting to http://171.64.122.70:8080/
[00:02:30] - Couldn't send HTTP request to server
[00:02:30] + Could not connect to Work Server (results)
[00:02:30] (171.64.122.70:8080)
[00:02:30] + Retrying using alternative port
[00:02:30] Connecting to http://171.64.122.70:80/
[00:02:31] - Couldn't send HTTP request to server
[00:02:31] + Could not connect to Work Server (results)
[00:02:31] (171.64.122.70:80)
[00:02:31] - Error: Could not transmit unit 04 (completed February 20) to work server.
[00:02:31] - 1 failed uploads of this unit.
[00:02:31] Keeping unit 04 in queue.
[00:02:31] Trying to send all finished work units
[00:02:31] Project: 5906 (Run 12, Clone 775, Gen 1)
[00:02:31] - Read packet limit of 540015616... Set to 524286976.
[00:02:31] + Attempting to send results [February 20 00:02:31 UTC]
[00:02:31] - Reading file work/wuresults_04.dat from core
[00:02:31] (Read 43868 bytes from disk)
[00:02:31] Connecting to http://171.64.122.70:8080/
[00:02:32] - Couldn't send HTTP request to server
[00:02:32] + Could not connect to Work Server (results)
[00:02:32] (171.64.122.70:8080)
[00:02:32] + Retrying using alternative port
[00:02:32] Connecting to http://171.64.122.70:80/
[00:02:33] - Couldn't send HTTP request to server
[00:02:33] + Could not connect to Work Server (results)
[00:02:33] (171.64.122.70:80)
[00:02:33] - Error: Could not transmit unit 04 (completed February 20) to work server.
[00:02:33] - 2 failed uploads of this unit.
[00:02:33] - Read packet limit of 540015616... Set to 524286976.
[00:02:33] + Attempting to send results [February 20 00:02:33 UTC]
[00:02:33] - Reading file work/wuresults_04.dat from core
[00:02:33] (Read 43868 bytes from disk)
[00:02:33] Connecting to http://171.65.103.100:8080/
[00:02:54] - Couldn't send HTTP request to server
[00:02:54] + Could not connect to Work Server (results)
[00:02:54] (171.65.103.100:8080)
[00:02:54] + Retrying using alternative port
[00:02:54] Connecting to http://171.65.103.100:80/
[00:03:15] - Couldn't send HTTP request to server
[00:03:15] + Could not connect to Work Server (results)
[00:03:15] (171.65.103.100:80)
[00:03:15] Could not transmit unit 04 to Collection server; keeping in queue.
[00:03:15] + Sent 0 of 1 completed units to the server
[00:03:15] - Preparing to get new work unit...
[00:03:15] + Attempting to get work packet
[00:03:15] - Will indicate memory of 3071 MB
[00:03:15] - Connecting to assignment server
[00:03:15] Connecting to http://assign-GPU.stanford.edu:8080/
[00:03:16] Posted data.
[00:03:16] Initial: 40AB; - Successful: assigned to (171.64.65.20).
[00:03:16] + News From Folding@Home: Welcome to Folding@Home
[00:03:16] Loaded queue successfully.
[00:03:16] Connecting to http://171.64.65.20:8080/
[00:03:16] Posted data.
[00:03:16] Initial: 0000; - Receiving payload (expected size: 70689)
[00:03:17] - Downloaded at ~69 kB/s
[00:03:17] - Averaged speed for that direction ~63 kB/s
[00:03:17] + Received work.
[00:03:17] Trying to send all finished work units
[00:03:17] Project: 5906 (Run 12, Clone 775, Gen 1)
[00:03:17] - Read packet limit of 540015616... Set to 524286976.
[00:03:17] + Attempting to send results [February 20 00:03:17 UTC]
[00:03:17] - Reading file work/wuresults_04.dat from core
[00:03:17] (Read 43868 bytes from disk)
[00:03:17] Connecting to http://171.64.122.70:8080/
[00:03:18] - Couldn't send HTTP request to server
[00:03:18] + Could not connect to Work Server (results)
[00:03:18] (171.64.122.70:8080)
[00:03:18] + Retrying using alternative port
[00:03:18] Connecting to http://171.64.122.70:80/
[00:03:19] - Couldn't send HTTP request to server
[00:03:19] + Could not connect to Work Server (results)
[00:03:19] (171.64.122.70:80)
[00:03:19] - Error: Could not transmit unit 04 (completed February 20) to work server.
[00:03:19] - 3 failed uploads of this unit.
[00:03:19] - Read packet limit of 540015616... Set to 524286976.
[00:03:19] + Attempting to send results [February 20 00:03:19 UTC]
[00:03:19] - Reading file work/wuresults_04.dat from core
[00:03:19] (Read 43868 bytes from disk)
[00:03:19] Connecting to http://171.65.103.100:8080/
[00:03:40] - Couldn't send HTTP request to server
[00:03:40] + Could not connect to Work Server (results)
[00:03:40] (171.65.103.100:8080)
[00:03:40] + Retrying using alternative port
[00:03:40] Connecting to http://171.65.103.100:80/
[00:04:01] - Couldn't send HTTP request to server
[00:04:01] + Could not connect to Work Server (results)
[00:04:01] (171.65.103.100:80)
[00:04:01] Could not transmit unit 04 to Collection server; keeping in queue.
[00:04:01] + Sent 0 of 1 completed units to the server
[00:04:01] + Closed connections
[00:04:01]
[00:04:01] + Processing work unit
[00:04:01] Core required: FahCore_14.exe
[00:04:01] Core found.
[00:04:01] Working on queue slot 05 [February 20 00:04:01 UTC]
[00:04:01] + Working ...
[00:04:01] - Calling '.\FahCore_14.exe -dir work/ -suffix 05 -priority 96 -checkpoint 15 -verbose -lifeline 5396 -version 623'
[00:04:02]
[00:04:02] *------------------------------*
[00:04:02] Folding@Home GPU Core - Beta
[00:04:02] Version 1.26 (Wed Oct 14 13:09:26 PDT 2009)
[00:04:02]
[00:04:02] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[00:04:02] Build host: vspm46
[00:04:02] Board Type: Nvidia
[00:04:02] Core :
[00:04:02] Preparing to commence simulation
[00:04:02] - Looking at optimizations...
[00:04:02] - Created dyn
[00:04:02] - Files status OK
[00:04:02] - Expanded 70177 -> 360060 (decompressed 513.0 percent)
[00:04:02] Called DecompressByteArray: compressed_data_size=70177 data_size=360060, decompressed_data_size=360060 diff=0
[00:04:02] - Digital signature verified
[00:04:02]
[00:04:02] Project: 5910 (Run 6, Clone 71, Gen 7)
[00:04:02]
[00:04:02] Assembly optimizations on if available.
[00:04:02] Entering M.D.
[00:04:04] - Autosending finished units... [February 20 00:04:04 UTC]
[00:04:04] Trying to send all finished work units
[00:04:04] Project: 5906 (Run 12, Clone 775, Gen 1)
[00:04:04] - Read packet limit of 540015616... Set to 524286976.
[00:04:04] + Attempting to send results [February 20 00:04:04 UTC]
[00:04:04] - Reading file work/wuresults_04.dat from core
[00:04:04] (Read 43868 bytes from disk)
[00:04:04] Connecting to http://171.64.122.70:8080/
[00:04:05] - Couldn't send HTTP request to server
[00:04:05] + Could not connect to Work Server (results)
[00:04:05] (171.64.122.70:8080)
[00:04:05] + Retrying using alternative port
[00:04:05] Connecting to http://171.64.122.70:80/
[00:04:07] - Couldn't send HTTP request to server
[00:04:07] + Could not connect to Work Server (results)
[00:04:07] (171.64.122.70:80)
[00:04:07] - Error: Could not transmit unit 04 (completed February 20) to work server.
[00:04:07] - 4 failed uploads of this unit.
[00:04:07] - Read packet limit of 540015616... Set to 524286976.
[00:04:07] + Attempting to send results [February 20 00:04:07 UTC]
[00:04:07] - Reading file work/wuresults_04.dat from core
[00:04:07] (Read 43868 bytes from disk)
[00:04:07] Connecting to http://171.65.103.100:8080/
[00:04:08] Tpr hash work/wudata_05.tpr: 3784723264 3781642382 1067008939 317210289 4179130916
[00:04:08] Working on Protein
[00:04:09] Client config found, loading data.
[00:04:09] Starting GUI Server
[00:04:28] - Couldn't send HTTP request to server
[00:04:28] + Could not connect to Work Server (results)
[00:04:28] (171.65.103.100:8080)
[00:04:28] + Retrying using alternative port
[00:04:28] Connecting to http://171.65.103.100:80/
[00:04:49] - Couldn't send HTTP request to server
[00:04:49] + Could not connect to Work Server (results)
[00:04:49] (171.65.103.100:80)
[00:04:49] Could not transmit unit 04 to Collection server; keeping in queue.
[00:04:49] + Sent 0 of 1 completed units to the server
[00:04:49] - Autosend completed
[00:05:12] Completed 1%
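For anyone skimming a long log like the one above, a small script can tally which servers the client failed to reach. This is only a sketch: the regular expression is inferred from the "(host:port)" lines in this particular log, and the function name `count_failed_targets` is made up for illustration, not part of the FAH client.

```python
import re
from collections import Counter

# After a failed connection the client prints a line holding only the target,
# e.g. "[00:02:30] (171.64.122.70:8080)". Pattern inferred from this log only.
FAILED_TARGET = re.compile(
    r"^\[\d{2}:\d{2}:\d{2}\]\s+\((\d{1,3}(?:\.\d{1,3}){3}):(\d+)\)\s*$"
)

def count_failed_targets(log_text):
    """Count failed connection attempts per server IP (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED_TARGET.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts

# A few lines copied from the log above.
sample = "\n".join([
    "[00:02:28] (Read 43868 bytes from disk)",
    "[00:02:28] Connecting to http://171.64.122.70:8080/",
    "[00:02:30] - Couldn't send HTTP request to server",
    "[00:02:30] (171.64.122.70:8080)",
    "[00:02:31] (171.64.122.70:80)",
    "[00:02:54] (171.65.103.100:8080)",
    "[00:03:15] (171.65.103.100:80)",
    "[00:03:16] Initial: 40AB; - Successful: assigned to (171.64.65.20).",
])

for host, failures in count_failed_targets(sample).items():
    print(host, failures)
```

Lines such as "(Read 43868 bytes from disk)" and "assigned to (171.64.65.20)." are skipped because they do not consist of a timestamp followed by a bare host:port pair.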
Re: 171.64.122.70 in Reject Status
Looks like it's back up now.
Re: 171.64.122.70 in Reject Status
Cajun_Don wrote:
What gets me, is that it is sent to an collection server, that had been down since Sat Feb 6 15:30:10 PST 2010 171.65.103.100 - VSPMF33 - CS 1 DOWN, after a few tries to the work server!!!
Was this collection server working when this batch of GPU WUs was loaded?
The purpose of the Collection Servers is specifically to accept uploads when a main Work Server is down.
I don't know any more about the specific WUs that you're reporting or what new WUs were being loaded. It's certainly possible that you were assigned your WU before the WS went down, you completed it while the WS was down, you uploaded it to the CS, and went on to process something else. Then when the WS came back up, it was transferred from the CS to the WS. Like I said, that's the way CSs are designed to work.
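The upload order visible in the log (Work Server on port 8080, then port 80, then the Collection Server on the same two ports, then keep the unit in the queue) can be sketched roughly as follows. This is not the client's actual code; the function `upload_with_fallback` and the `try_send` callback are hypothetical names used only to illustrate the order of attempts.

```python
def upload_with_fallback(work_server, collection_server, try_send, ports=(8080, 80)):
    """Try the WS first, then the CS, each on every port in order.

    try_send(host, port) is a caller-supplied function returning True when
    the upload was accepted. Returns the (host, port) that accepted the
    result, or None if every attempt failed and the unit stays queued.
    """
    for host in (work_server, collection_server):
        if host is None:  # no collection server configured for this unit
            continue
        for port in ports:
            if try_send(host, port):
                return (host, port)
    return None

# Simulate the situation in this thread: both servers unreachable.
attempts = []

def both_down(host, port):
    attempts.append((host, port))
    return False

result = upload_with_fallback("171.64.122.70", "171.65.103.100", both_down)
if result is None:
    print("Keeping unit in queue after", len(attempts), "failed attempts")
```

When the Collection Server is up, the third attempt would succeed and the result would later be forwarded to the Work Server, which is the behaviour described above.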
Posting FAH's log:
How to provide enough info to get helpful support.
Re: 171.64.122.70 in Reject Status
Can you tell me if the WU Project: 5906 (Run 12, Clone 775, Gen 1) has uploaded since the WS was put back online?
Re: 171.64.122.70 in Reject Status
Hi Cajun_Don (team 15),
Your WU (P5906 R12 C775 G1) was added to the stats database on 2010-02-19 19:11:59 for 472 points of credit.