Project: 1735 (Run 6, Clone 95, Gen 2)

Moderators: Site Moderators, FAHC Science Team

Oriolus
Posts: 31
Joined: Mon Jul 14, 2008 1:42 pm

Project: 1735 (Run 6, Clone 95, Gen 2)

Post by Oriolus »

I have a problem with two WUs whose results cannot be uploaded.
The first is the one named in the subject line: Project: 1735 (Run 6, Clone 95, Gen 2).
The second is Project: 1734 (Run 8, Clone 83, Gen 3).
The following lines keep recurring in my log files (this excerpt is from July 10th):


[22:06:59] + Attempting to send results
[22:07:37] Writing local files
[22:07:37] Completed 2425000 out of 2500000 steps (97)
[22:07:52] - Couldn't send HTTP request to server
[22:07:52] + Could not connect to Work Server (results)
[22:07:52] (134.139.127.32:8080)
[22:07:52] - Error: Could not transmit unit 00 (completed July 9) to work server.


[22:07:52] + Attempting to send results
[22:08:45] - Couldn't send HTTP request to server
[22:08:45] + Could not connect to Work Server (results)
[22:08:45] (134.139.127.33:8080)
[22:08:45] Could not transmit unit 00 to Collection server; keeping in queue.


[22:08:45] + Attempting to send results
[22:09:02] Writing checkpoint files
[22:09:38] - Couldn't send HTTP request to server
[22:09:38] + Could not connect to Work Server (results)
[22:09:38] (134.139.127.32:8080)
[22:09:38] - Error: Could not transmit unit 08 (completed July 8) to work server.


[22:09:38] + Attempting to send results
[22:10:33] - Couldn't send HTTP request to server
[22:10:33] + Could not connect to Work Server (results)
[22:10:33] (134.139.127.33:8080)
[22:10:33] Could not transmit unit 08 to Collection server; keeping in queue.
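A repeating pattern like the one in the excerpt above can be summarized per unit with a short script. This is only a minimal sketch, assuming the log lines follow the format shown in this thread; the function name `summarize_failures` and the regular expressions are illustrative and not part of the Folding@home client.

```python
import re
from collections import defaultdict

# Matches a log line that only carries a server address, e.g.
# "[22:07:52] (134.139.127.32:8080)"
ADDR_RE = re.compile(r"^\[(\d\d:\d\d:\d\d)\]\s+\((\d+\.\d+\.\d+\.\d+:\d+)\)")

# Matches both failure messages seen in the thread:
#   "- Error: Could not transmit unit 00 (completed July 9) to work server."
#   "Could not transmit unit 00 to Collection server; keeping in queue."
FAIL_RE = re.compile(
    r"^\[(\d\d:\d\d:\d\d)\].*Could not transmit unit (\d+)\b.*?to (work|Collection) server",
    re.IGNORECASE,
)

def summarize_failures(log_text):
    """Group failed upload attempts by unit number.

    Returns {unit: [(time, server_kind, address_or_None), ...]}.
    The address is the most recent "(ip:port)" line seen before the failure.
    """
    failures = defaultdict(list)
    last_addr = None
    for line in log_text.splitlines():
        line = line.strip()
        m = ADDR_RE.match(line)
        if m:
            last_addr = m.group(2)
            continue
        m = FAIL_RE.match(line)
        if m:
            time, unit, kind = m.groups()
            failures[unit].append((time, kind.lower(), last_addr))
            last_addr = None  # each address line belongs to one failure
    return dict(failures)
```

Feeding it the contents of the log file (for example `summarize_failures(open("FAHlog.txt").read())`, where the file name is an assumption about the local setup) gives one list of failed attempts per queue unit, which makes it easier to see which servers each unit was tried against.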

How can I get these results uploaded successfully?
If more information is needed, I would be glad to provide it :)

TIA, Oriolus
Oriolus
Posts: 31
Joined: Mon Jul 14, 2008 1:42 pm

Re: Project: 1735 (Run 6, Clone 95, Gen 2)

Post by Oriolus »

Since July 14th another unit cannot be sent to the server:
Project: 1737 (Run 8, Clone 82, Gen 2)

[14:21:52] Completed 2450000 out of 2500000 steps (98)
[14:30:52] Writing local files
[14:30:52] Completed 2475000 out of 2500000 steps (99)


[14:39:16] + Attempting to send results
[14:39:33] Writing local files
[14:39:33] Completed 2500000 out of 2500000 steps (100)
[14:39:33] Writing final coordinates.
[14:39:33] Past main M.D. loop
[14:40:09] - Couldn't send HTTP request to server
[14:40:09] + Could not connect to Work Server (results)
[14:40:09] (134.139.127.32:8080)
[14:40:09] - Error: Could not transmit unit 00 (completed July 9) to work server.


[14:40:09] + Attempting to send results
[14:40:34]
[14:40:34] Finished Work Unit:
[14:40:34] - Reading up to 153336 from "work/wudata_04.arc": Read 153336
[14:40:34] - Reading up to 1150192 from "work/wudata_04.xtc": Read 1150192
[14:40:34] goefile size: 0
[14:40:34] Leaving Run
[14:40:34] - Writing 1344712 bytes of core data to disk...
[14:40:34] ... Done.
[14:40:34] - Shutting down core
[14:40:34]
[14:40:34] Folding@home Core Shutdown: FINISHED_UNIT
[14:40:36] CoreStatus = 64 (100)
[14:40:36] Sending work to server
[14:40:36] - Preparing to get new work unit...
[14:40:36] + Attempting to get work packet
[14:40:36] - Connecting to assignment server
[14:40:37] - Successful: assigned to (171.64.122.72).
[14:40:37] + News From Folding@Home: Welcome to Folding@Home
[14:40:37] Loaded queue successfully.
[14:40:41] + Closed connections
[14:40:41]
[14:40:41] + Processing work unit
[14:40:41] Core required: FahCore_81.exe
[14:40:41] Core found.
[14:40:41] Working on Unit 05 [July 14 14:40:41]
[14:40:41] + Working ...
[14:40:41]
[14:40:41] *------------------------------*
[14:40:41] Folding@Home Gromacs Simulated Tempering Core
[14:40:41] Version 1.10 (Oct 4, 2007)
[14:40:41]
[14:40:41] Preparing to commence simulation
[14:40:41] - Looking at optimizations...
[14:40:41] - Created dyn
[14:40:41] - Files status OK
[14:40:41] - Expanded 239529 -> 1167683 (decompressed 487.4 percent)
[14:40:41] - Starting from initial work packet
[14:40:41]
[14:40:41] Project: 4418 (Run 53, Clone 12, Gen 20)
[14:40:41]
[14:40:41] Assembly optimizations on if available.
[14:40:41] Entering M.D.
[14:40:47] Protein: p4418_Seq_50_nat_AMBER_Native
[14:40:47]
[14:40:47] Writing local files
[14:41:02] - Couldn't send HTTP request to server
[14:41:02] + Could not connect to Work Server (results)
[14:41:02] (134.139.127.33:8080)
[14:41:02] Could not transmit unit 00 to Collection server; keeping in queue.


[14:41:02] + Attempting to send results
[14:41:56] - Couldn't send HTTP request to server
[14:41:56] + Could not connect to Work Server (results)
[14:41:56] (134.139.127.31:8080)
[14:41:56] - Error: Could not transmit unit 04 (completed July 14) to work server.
[14:41:56] Keeping unit 04 in queue.


[14:41:56] + Attempting to send results
[14:42:49] - Couldn't send HTTP request to server
[14:42:49] + Could not connect to Work Server (results)
[14:42:49] (134.139.127.32:8080)
[14:42:49] - Error: Could not transmit unit 08 (completed July 8) to work server.


[14:42:49] + Attempting to send results
[14:43:28] Extra SSE boost OK.
[14:43:28] Writing local files
[14:43:28] Completed 0 out of 150000 steps (0)
[14:43:43] - Couldn't send HTTP request to server
[14:43:43] + Could not connect to Work Server (results)
[14:43:43] (134.139.127.33:8080)
[14:43:43] Could not transmit unit 08 to Collection server; keeping in queue.
[14:45:34] Writing local files
[14:45:34] Completed 1500 out of 150000 steps (1)
[14:47:46] Writing local files
[14:47:46] Completed 3000 out of 150000 steps (2)
[14:49:56] Writing local files
[14:49:56] Completed 4500 out of 150000 steps (3)

I hope someone can help me out!

TIA, Oriolus
codysluder
Posts: 1024
Joined: Sun Dec 02, 2007 12:43 pm

Re: Project: 1735 (Run 6, Clone 95, Gen 2)

Post by codysluder »

Note to Moderator:
I've seen other reports of problems uploading to servers (134.139.127.3x:8080). Perhaps the title should be changed and the discussion moved to the topic about troubles with a specific server.
ppetrone
Pande Group Member
Posts: 115
Joined: Wed Dec 12, 2007 6:20 pm
Location: Stanford

Re: Project: 1735 (Run 6, Clone 95, Gen 2)

Post by ppetrone »

Thank you for this report. I have alerted Eric, who is in charge of this project, about the potential uploading problem.
Oriolus
Posts: 31
Joined: Mon Jul 14, 2008 1:42 pm

Re: Project: 1735 (Run 6, Clone 95, Gen 2)

Post by Oriolus »

It looks as if WU 08 was uploaded successfully, but WU 00 and WU 04 are still awaiting their moment:

Launch directory: C:\Program Files\Folding@Home
Executable: C:\Program Files\Folding@Home\FAH504-Console.exe


[08:21:28] - Ask before connecting: No
[08:21:29] - User name: Oriolus (Team 99)
[08:21:35] - User ID: D0DAAF14AAAE80C
[08:21:36] - Machine ID: 1
[08:21:38]
[08:21:39] Loaded queue successfully.
[08:21:39] + Benchmarking ...
[08:21:44]


[08:21:46] + Processing work unit
[08:21:47] + Attempting to send results
[08:21:48] Core required: FahCore_82.exe
[08:21:54] Core found.
[08:22:03] Working on Unit 07 [July 17 08:22:03]
[08:22:03] + Working ...
[08:22:14]
[08:22:14] *------------------------------*
[08:22:14] Folding@Home PMD Core
[08:22:14] Version 1.03 (September 7, 2005)
[08:22:14]
[08:22:14] Preparing to commence simulation
[08:22:14] - Looking at optimizations...
[08:22:14] - Files status OK
[08:22:14] - Expanded 93016 -> 599777 (decompressed 644.8 percent)
[08:22:14]
[08:22:14] Project: 2170 (Run 36, Clone 599, Gen 3)
[08:22:14]
[08:22:14] Assembly optimizations on if available.
[08:22:14] Entering M.D.
[08:22:58] - Couldn't send HTTP request to server
[08:22:58] + Could not connect to Work Server (results)
[08:22:58] (134.139.127.32:8080)
[08:22:58] - Error: Could not transmit unit 00 (completed July 9) to work server.


[08:22:58] + Attempting to send results
[08:23:51] - Couldn't send HTTP request to server
[08:23:51] + Could not connect to Work Server (results)
[08:23:51] (134.139.127.33:8080)
[08:23:51] Could not transmit unit 00 to Collection server; keeping in queue.


[08:23:51] + Attempting to send results
[08:24:45] - Couldn't send HTTP request to server
[08:24:45] + Could not connect to Work Server (results)
[08:24:45] (134.139.127.31:8080)
[08:24:45] - Error: Could not transmit unit 04 (completed July 14) to work server.


[08:24:45] + Attempting to send results
[08:24:46] - Couldn't send HTTP request to server
[08:24:46] + Could not connect to Work Server (results)
[08:24:46] (134.139.127.34:8080)
[08:24:46] Could not transmit unit 04 to Collection server; keeping in queue.


[08:24:46] + Attempting to send results
[08:25:09] + Results successfully sent
[08:25:09] Thank you for your contribution to Folding@Home.
[08:25:09] + Number of Units Completed: 176


[08:25:30] Protein: p2170_lambda_obc_300K
[08:25:30]
[08:25:30] Completed 182075 out of 500000 steps (36)
[08:25:45] Writing checkpoint files
[08:28:49] Writing checkpoint files
[08:31:53] Writing checkpoint files
[08:34:59] Writing checkpoint files
[08:38:04] Writing checkpoint files

Still wondering why it takes so long (from July 8th until 17th) before there is a success...

Oriolus
codysluder
Posts: 1024
Joined: Sun Dec 02, 2007 12:43 pm

Re: Project: 1735 (Run 6, Clone 95, Gen 2)

Post by codysluder »

Oriolus wrote: Still wondering why it takes so long (from July 8th until 17th) before there is a success...
Based on my testing, the Collection Servers are not working.
http://134.139.127.32:8080
http://134.139.127.33:8080
http://134.139.127.31:8080
http://134.139.127.34:8080

Looking at NETLOAD on servers .31 and .32, they're both considered "heavily loaded". They're uploading a WU every 3 seconds or so and they're out of new work, so your WUs are probably part of a backlog so large that the server can't handle the uploads. Somebody needs to fix the collection servers.
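For anyone who wants to repeat this kind of reachability test, here is a minimal sketch. It only checks whether a TCP connection to each address opens; that is an assumption about what "testing" means here, and it does not prove that an upload would be accepted. The names `can_connect` and `report` are made up for this example.

```python
import socket

# Addresses copied from the log excerpts in this thread.
SERVERS = [
    ("134.139.127.31", 8080),
    ("134.139.127.32", 8080),
    ("134.139.127.33", 8080),
    ("134.139.127.34", 8080),
]

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

def report(servers=SERVERS, timeout=5.0):
    """Print one line per server saying whether the port accepted a connection."""
    for host, port in servers:
        status = "reachable" if can_connect(host, port, timeout) else "no connection"
        print(f"{host}:{port} {status}")
```

Calling `report()` prints one status line per server. A server that accepts the connection can still be too loaded to take an upload, so a "reachable" result is only a first hint.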
Oriolus
Posts: 31
Joined: Mon Jul 14, 2008 1:42 pm

Re: Project: 1735 (Run 6, Clone 95, Gen 2)

Post by Oriolus »

At last, units 08, 00 and 04 were uploaded successfully on July 17th at about 20:30!
Thanks, whoever worked on it!

Oriolus