CSs 171.67.108.17 - 171.65.103.100 - 171.67.108.25

Moderators: Site Moderators, FAHC Science Team

tobor
Posts: 56
Joined: Tue Jul 15, 2008 11:15 pm
Hardware configuration: ASUS M3N-HT deluxe,AMD6400 duel 3.2gig, GeForce9800 GTX C-760 M-1140 S-1900,4 gig OCZ ddr
Location: Missouri,USA

Re: 171.64.65.71 accepting... but

Post by tobor »

Got another one that won't send ?????

Code: Select all

[20:20:46] Completed 90%
[20:21:43] Completed 91%
[20:22:44] Completed 92%
[20:23:39] Completed 93%
[20:24:34] Completed 94%
[20:25:28] Completed 95%
[20:26:23] Completed 96%
[20:27:18] Completed 97%
[20:28:13] Completed 98%
[20:29:09] Completed 99%
[20:30:03] Completed 100%
[20:30:03] Successful run
[20:30:03] DynamicWrapper: Finished Work Unit: sleep=10000
[20:30:13] Reserved 101164 bytes for xtc file; Cosm status=0
[20:30:13] Allocated 101164 bytes for xtc file
[20:30:13] - Reading up to 101164 from "work/wudata_01.xtc": Read 101164
[20:30:13] Read 101164 bytes from xtc file; available packet space=786329300
[20:30:13] xtc file hash check passed.
[20:30:13] Reserved 30216 30216 786329300 bytes for arc file=<work/wudata_01.trr> Cosm status=0
[20:30:13] Allocated 30216 bytes for arc file
[20:30:13] - Reading up to 30216 from "work/wudata_01.trr": Read 30216
[20:30:13] Read 30216 bytes from arc file; available packet space=786299084
[20:30:13] trr file hash check passed.
[20:30:13] Allocated 560 bytes for edr file
[20:30:13] Read bedfile
[20:30:13] edr file hash check passed.
[20:30:13] Logfile not read.
[20:30:13] GuardedRun: success in DynamicWrapper
[20:30:13] GuardedRun: done
[20:30:13] Run: GuardedRun completed.
[20:30:15] + Opened results file
[20:30:15] - Writing 132452 bytes of core data to disk...
[20:30:15] Done: 131940 -> 131502 (compressed to 99.6 percent)
[20:30:15]   ... Done.
[20:30:15] DeleteFrameFiles: successfully deleted file=work/wudata_01.ckp
[20:30:16] Shutting down core 
[20:30:16] 
[20:30:16] Folding@home Core Shutdown: FINISHED_UNIT
[20:30:19] CoreStatus = 64 (100)
[20:30:19] Sending work to server
[20:30:19] Project: 10102 (Run 783, Clone 3, Gen 6)


[20:30:19] + Attempting to send results [February 9 20:30:19 UTC]

Folding@Home Client Shutdown.


--- Opening Log file [February 9 20:36:29 UTC] 


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\steve\Application Data\Folding@home-gpu
Arguments: -gpu 0 

[20:36:29] - Ask before connecting: No
[20:36:29] - User name: stv911 (Team 4)
[20:36:29] - User ID: BE4069840F022A4
[20:36:29] - Machine ID: 2
[20:36:29] 
[20:36:29] Loaded queue successfully.
[20:36:29] Initialization complete
[20:36:29] - Preparing to get new work unit...
[20:36:29] + Attempting to get work packet
[20:36:29] Project: 10102 (Run 783, Clone 3, Gen 6)


[20:36:29] + Attempting to send results [February 9 20:36:29 UTC]
[20:36:29] - Connecting to assignment server
[20:36:29] - Successful: assigned to (171.67.108.21).
[20:36:29] + News From Folding@Home: Welcome to Folding@Home
[20:36:30] Loaded queue successfully.
[20:36:31] + Closed connections
[20:36:31] 
[20:36:31] + Processing work unit
[20:36:31] Core required: FahCore_11.exe
[20:36:31] Core found.
[20:36:31] Working on queue slot 02 [February 9 20:36:31 UTC]
[20:36:31] + Working ...
[20:36:31] 
[20:36:31] *------------------------------*
[20:36:31] Folding@Home GPU Core
[20:36:31] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[20:36:31] 
[20:36:31] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[20:36:31] Build host: amoeba
[20:36:31] Board Type: Nvidia
[20:36:31] Core      : 
[20:36:31] Preparing to commence simulation
[20:36:31] - Looking at optimizations...
[20:36:31] DeleteFrameFiles: successfully deleted file=work/wudata_02.ckp
[20:36:31] - Created dyn
[20:36:31] - Files status OK
[20:36:31] - Expanded 65016 -> 343723 (decompressed 528.6 percent)
[20:36:31] Called DecompressByteArray: compressed_data_size=65016 data_size=343723, decompressed_data_size=343723 diff=0
[20:36:31] - Digital signature verified
[20:36:31] 
[20:36:31] Project: 5785 (Run 6, Clone 52, Gen 14)
[20:36:31] 
[20:36:31] Assembly optimizations on if available.
[20:36:31] Entering M.D.
[20:36:34] - Couldn't send HTTP request to server
[20:36:34] + Could not connect to Work Server (results)
[20:36:34]     (171.64.65.71:8080)
[20:36:34] + Retrying using alternative port
[20:36:37] Tpr hash work/wudata_02.tpr:  3143327561 1006880533 222085423 1916458197 1539356921
[20:36:37] 
[20:36:37] Calling fah_main args: 14 usage=100
[20:36:37] 
[20:36:38] Working on Gromacs Runs One Microsecond At Cannonball Speeds
[20:36:40] Client config found, loading data.
[20:36:40] Starting GUI Server
[20:36:40] - Couldn't send HTTP request to server
[20:36:40] + Could not connect to Work Server (results)
[20:36:40]     (171.64.65.71:80)
[20:36:40] - Error: Could not transmit unit 01 (completed February 9) to work server.
[20:36:40]   Keeping unit 01 in queue.
[20:37:51] Completed 1%
[20:39:09] Completed 2%
[20:40:23] Completed 3%
[20:41:51] Completed 4%
[20:43:04] Completed 5%
[20:44:15] Completed 6%
[20:45:26] Completed 7%
[20:46:37] Completed 8%
[20:47:48] Completed 9%
lambdapro
Posts: 16
Joined: Tue Dec 29, 2009 6:20 pm

Re: 171.64.65.71 accepting... but

Post by lambdapro »

How long do they recommend before we delete the WU from the directory? My system is pausing about 45 minutes between each work package right now while it is trying to upload this.
David
ikerekes
Posts: 94
Joined: Thu Nov 13, 2008 4:18 pm
Hardware configuration: q6600 @ 3.3Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Athlon x2 6000+ @ 3.0Ghz ubuntu 8.04 smp + asus 9600GSO gpu2 in wine wrapper
5600X2 @ 3.19Ghz ubuntu 8.04 smp + asus 9600GSO gpu2 in wine wrapper
E5200 @ 3.7Ghz ubuntu 8.04 smp2 + asus 9600GT silent gpu2 in wine wrapper
E5200 @ 3.65Ghz ubuntu 8.04 smp2 + asus 9600GSO gpu2 in wine wrapper
E6550 vmware ubuntu 8.4.1
q8400 @ 3.3Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Athlon II 620 @ 2.6 Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Location: Calgary, Canada

Re: 171.64.65.71 accepting... but

Post by ikerekes »

lambdapro wrote:How long do they recommend before we delete the WU from the directory? My system is pausing about 45 minutes between each work package right now while it is trying to upload this.
David
Same here!
I got tired of watching the failed count going up and up, and after the 15th failed upload, I cleaned out the whole queue. :evil:
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 171.64.65.71 accepting... but

Post by bruce »

I see two issues. First, from the data that I can see, it doesn't look like the server has a problem, but that doesn't always mean it's problem free, and certainly if a number of people are reporting more or less the same problem, it warrants further investigation.

Second, I don't understand why folding stops during the attempted uploads. If the WU that you just finished needs to be returned to that server, then yes, it will try to upload it before it tries to get a new assignment, but once it's past that point, it should not detract from folding the next WU. The work-around for that problem is to stop the client and restart it. Look carefully at the log posted by tobor immediately above.

Project: 10102 (Run 783, Clone 3, Gen 6) finishes at [20:30:16] and starts to upload. He stops / restarts the client.
Within 6 minutes, Project: 5785 (Run 6, Clone 52, Gen 14) has been downloaded and starts folding while the client simultaneously attempts to upload Project: 10102 (Run 783, Clone 3, Gen 6). (I realize this process is far from ideal because it requires you to monitor the client and stop/start it to minimize the impact of whatever the problem is, but you're free to use it if it helps.)

Tobor's first post also shows a client restart, and folding resumes without waiting for the upload to either succeed or fail.

lambdapro's statement that his system is pausing about 45 minutes between each work package right now is not supported by information in a posted copy of FAHlog.txt. jevans64's log shows just the opposite. While it is true that there are abnormally long gaps between messages associated with the uploads, the "Completed XX%" messages are showing progress the whole time.

Code: Select all

[12:17:38] Completed 6%
[12:18:47] Completed 7%
[12:18:57] Posted data.
[12:19:57] Completed 8%
[12:21:06] Completed 9%

And the Initial: xxx line 20 minutes later...

[12:37:18] Completed 23%
[12:38:27] Completed 24%
[12:38:57] Initial: 00FA; Completed 25%
The problem of not getting new work may or may not be related, but it's not precisely the same problem.
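
One way to check a "pausing between work packages" claim against a FAHlog.txt excerpt is to measure the time between consecutive "Completed XX%" lines. The following is only a rough sketch: the function name `progress_gaps` is made up for illustration, and it assumes the `[HH:MM:SS] Completed N%` line format seen in the logs posted in this thread.

```python
import re

# Matches client log lines such as "[12:17:38] Completed 6%".
PROGRESS = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\] Completed (\d+)%")


def progress_gaps(lines):
    """Return (percent, seconds since the previous progress line) pairs."""
    gaps = []
    prev = None
    for line in lines:
        m = PROGRESS.match(line.strip())
        if not m:
            continue  # upload messages and other lines are ignored
        h, mi, s, pct = map(int, m.groups())
        now = h * 3600 + mi * 60 + s
        if prev is not None:
            # Timestamps carry no date, so assume at most one midnight rollover.
            gaps.append((pct, (now - prev) % 86400))
        prev = now
    return gaps


# Excerpt copied from the log quoted above.
sample = """\
[12:17:38] Completed 6%
[12:18:47] Completed 7%
[12:18:57] Posted data.
[12:19:57] Completed 8%
[12:21:06] Completed 9%
""".splitlines()

for pct, gap in progress_gaps(sample):
    print(f"{pct}%: {gap}s since previous frame")
```

A gap of roughly 45 minutes between two progress lines would support the pause report; steady gaps of about a minute, as in this excerpt, would not.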

Would somebody who has more than one computer and has one that is experiencing these long gaps between upload messages go to another computer and ping that server? Does it respond normally?
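
If ping is blocked or inconclusive, a plain TCP connection test to the two ports the client tries (8080 and then 80, per the logs above) is a possible substitute. This is a minimal sketch using Python's standard `socket` module; the helper names `can_connect` and `check_server` are hypothetical, and a successful connection only shows the port is reachable, not that an upload will be acknowledged.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_server(host, ports=(8080, 80), timeout=5.0):
    """Map each port to whether a TCP connection could be opened."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

Running `check_server("171.64.65.71")` from a machine that shows the long gaps, and again from one that does not, would give a comparable data point for both.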
Bobby-Uschi
Posts: 70
Joined: Thu Jul 31, 2008 3:26 pm
Hardware configuration: PC1//C2Q-Q9450,GA-X48-DS5-NinjaMini,GTX285,2x160GB Western Sata2,2x1GB Geil800,Tagan 800W;XP Pro SP3-32Bit;
PC2//C2Q-Q2600k.GB-P67UD4-Freezer 7Pro,GTX285Leadtek,260 GB Western Sata2,4x2GB GeilPC3,OCZ600W;Win7-64Bit;Siemens 22"
Location: Deutschland

Re: 171.64.65.71 accepting... but

Post by Bobby-Uschi »

171.64.65.71 OK!

Code: Select all

[22:53:43] Completed 98%
[22:54:17] Completed 99%
[22:54:50] Completed 100%
[22:54:50] Successful run
[22:54:50] DynamicWrapper: Finished Work Unit: sleep=10000
[22:55:00] Reserved 75820 bytes for xtc file; Cosm status=0
[22:55:00] Allocated 75820 bytes for xtc file
[22:55:00] - Reading up to 75820 from "work/wudata_03.xtc": Read 75820
[22:55:00] Read 75820 bytes from xtc file; available packet space=786354644
[22:55:00] xtc file hash check passed.
[22:55:00] Reserved 15168 15168 786354644 bytes for arc file=<work/wudata_03.trr> Cosm status=0
[22:55:00] Allocated 15168 bytes for arc file
[22:55:00] - Reading up to 15168 from "work/wudata_03.trr": Read 15168
[22:55:00] Read 15168 bytes from arc file; available packet space=786339476
[22:55:00] trr file hash check passed.
[22:55:00] Allocated 560 bytes for edr file
[22:55:00] Read bedfile
[22:55:00] edr file hash check passed.
[22:55:00] Allocated 33335 bytes for logfile
[22:55:00] Read logfile
[22:55:00] GuardedRun: success in DynamicWrapper
[22:55:00] GuardedRun: done
[22:55:00] Run: GuardedRun completed.
[22:55:01] + Opened results file
[22:55:01] - Writing 125395 bytes of core data to disk...
[22:55:01] Done: 124883 -> 99289 (compressed to 79.5 percent)
[22:55:01]   ... Done.
[22:55:01] DeleteFrameFiles: successfully deleted file=work/wudata_03.ckp
[22:55:01] Shutting down core 
[22:55:01] 
[22:55:01] Folding@home Core Shutdown: FINISHED_UNIT
[22:55:05] CoreStatus = 64 (100)
[22:55:05] Sending work to server
[22:55:05] Project: 5768 (Run 1, Clone 65, Gen 1943)
[22:55:05] - Read packet limit of 540015616... Set to 524286976.


[22:55:05] + Attempting to send results [February 9 22:55:05 UTC]
[22:55:08] + Results successfully sent
[22:55:08] Thank you for your contribution to Folding@Home.
[22:55:08] + Number of Units Completed: 886

[22:55:12] Project: 10103 (Run 835, Clone 4, Gen 2)
[22:55:12] - Read packet limit of 540015616... Set to 524286976.


[22:55:12] + Attempting to send results [February 9 22:55:12 UTC]
[23:10:13] + Could not connect to Work Server (results)
[23:10:13]     (171.64.65.71:8080)
[23:10:13] + Retrying using alternative port
[23:10:25] - Couldn't send HTTP request to server
[23:10:25] + Could not connect to Work Server (results)
[23:10:25]     (171.64.65.71:80)
[23:10:25] - Error: Could not transmit unit 02 (completed February 9) to work server.
[23:10:25] - Read packet limit of 540015616... Set to 524286976.


[23:10:25] + Attempting to send results [February 9 23:10:25 UTC]
[23:34:47] - Unknown packet returned from server, expected ACK for results
[23:34:47]   Could not transmit unit 02 to Collection server; keeping in queue.
[23:34:47] - Preparing to get new work unit...
[23:34:47] + Attempting to get work packet
[23:34:47] - Connecting to assignment server
[23:34:48] - Successful: assigned to (171.67.108.21).
[23:34:48] + News From Folding@Home: Welcome to Folding@Home
[23:34:48] Loaded queue successfully.
[23:34:50] Project: 10103 (Run 835, Clone 4, Gen 2)
[23:34:50] - Read packet limit of 540015616... Set to 524286976.


[23:34:50] + Attempting to send results [February 9 23:34:50 UTC]
[23:35:23] - Couldn't send HTTP request to server
[23:35:23] + Could not connect to Work Server (results)
[23:35:23]     (171.64.65.71:8080)
[23:35:23] + Retrying using alternative port
[23:35:31] - Couldn't send HTTP request to server
[23:35:31] + Could not connect to Work Server (results)
[23:35:31]     (171.64.65.71:80)
[23:35:31] - Error: Could not transmit unit 02 (completed February 9) to work server.
[23:35:31] - Read packet limit of 540015616... Set to 524286976.


[23:35:31] + Attempting to send results [February 9 23:35:31 UTC]
[23:35:52] - Server does not have record of this unit. Will try again later.
[23:35:52]   Could not transmit unit 02 to Collection server; keeping in queue.
[23:35:52] + Closed connections
[23:35:52] 
[23:35:52] + Processing work unit
[23:35:52] Core required: FahCore_11.exe
[23:35:52] Core found.
[23:35:52] Working on queue slot 04 [February 9 23:35:52 UTC]
[23:35:52] + Working ...
[23:35:52] 
[23:35:52] *------------------------------*
[23:35:52] Folding@Home GPU Core
[23:35:52] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[23:35:52] 
[23:35:52] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[23:35:52] Build host: amoeba
[23:35:52] Board Type: Nvidia
[23:35:52] Core      : 
[23:35:52] Preparing to commence simulation
[23:35:52] - Looking at optimizations...
[23:35:52] DeleteFrameFiles: successfully deleted file=work/wudata_04.ckp
[23:35:52] - Created dyn
[23:35:52] - Files status OK
[23:35:52] - Expanded 65022 -> 343707 (decompressed 528.6 percent)
[23:35:52] Called DecompressByteArray: compressed_data_size=65022 data_size=343707, decompressed_data_size=343707 diff=0
[23:35:52] - Digital signature verified
[23:35:52] 
[23:35:52] Project: 5783 (Run 3, Clone 99, Gen 29)
[23:35:52] 
[23:35:52] Assembly optimizations on if available.
[23:35:52] Entering M.D.
[23:35:58] Tpr hash work/wudata_04.tpr:  749506165 3788249871 1549560040 3354628632 3250258505
[23:35:58] 
[23:35:58] Calling fah_main args: 14 usage=100
[23:35:58] 
[23:35:59] Working on GROwing Monsters And Cloning Shrimps
[23:36:00] Client config found, loading data.
[23:36:00] Starting GUI Server
[23:36:27] Project: 10103 (Run 835, Clone 4, Gen 2)
[23:36:27] - Read packet limit of 540015616... Set to 524286976.


[23:36:27] + Attempting to send results [February 9 23:36:27 UTC]
[23:36:37] - Couldn't send HTTP request to server
[23:36:37] + Could not connect to Work Server (results)
[23:36:37]     (171.64.65.71:8080)
[23:36:37] + Retrying using alternative port
[23:36:44] - Couldn't send HTTP request to server
[23:36:44] + Could not connect to Work Server (results)
[23:36:44]     (171.64.65.71:80)
[23:36:44] - Error: Could not transmit unit 02 (completed February 9) to work server.
[23:36:44] - Read packet limit of 540015616... Set to 524286976.


[23:36:44] + Attempting to send results [February 9 23:36:44 UTC]
[23:36:47] - Server does not have record of this unit. Will try again later.
[23:36:47]   Could not transmit unit 02 to Collection server; keeping in queue.
[23:36:47] + Working...
[23:37:12] Completed 1%
[23:38:25] Completed 2%
And again a 45-minute pause.
Going to sleep.
Regards,
Bob
PC1//C2Q-Q9450,GA-X48-DS5-,2xGTX285,2x160GB Western Sata2,2x1GB Geil800,Tagan 800W;XP Pro SP3-32Bit
PC2//C2Q-Q2600k.GB-P67UD4-Freezer 7Pro,GTX285Leadtek,260 GB Western Sata2,4x2GB GeilPC3,OCZ600W;Win7-64Bit;Siemens 22"
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 171.64.65.71 accepting... but

Post by bruce »

Bobby-Uschi wrote:171.64.65.71 OK!
That's not what your log says.

You successfully uploaded Project: 5768 to server 171.67.108.11 and then failed to upload some older WUs to server 171.64.65.71. I don't see that anything has changed.
Ragnar Dan
Posts: 52
Joined: Fri Dec 07, 2007 3:21 am
Location: U.S. (TechReport.com's Team 2630)

Re: 171.64.65.71 accepting... but

Post by Ragnar Dan »

The server does have a problem, and it causes the client software to miscount successfully completed WU's by not incrementing the counter after it uploads the first time because of the following error condition:

Code: Select all

[23:25:14] Folding@home Core Shutdown: FINISHED_UNIT
[23:25:18] CoreStatus = 64 (100)
[23:25:18] Unit 1 finished with 97 percent of time to deadline remaining.
[23:25:18] Updated performance fraction: 0.986047
[23:25:18] Sending work to server
[23:25:18] Project: 10104 (Run 79, Clone 4, Gen 0)
[23:25:18] - Read packet limit of 540015616... Set to 524286976.


[23:25:18] + Attempting to send results [February 9 23:25:18 UTC]
[23:25:18] - Reading file work/wuresults_01.dat from core
[23:25:18]   (Read 132585 bytes from disk)
[23:25:18] Connecting to http://171.64.65.71:8080/
[23:33:45] Posted data.
[23:35:14] Initial: 481A; - Uploaded at ~0 kB/s
[23:35:14] - Averaged speed for that direction ~95 kB/s
[23:35:14] - Unknown packet returned from server, expected ACK for results
[23:35:14] - Error: Could not transmit unit 01 (completed February 9) to work server.
[23:35:14] - 1 failed uploads of this unit.
[23:35:14]   Keeping unit 01 in queue.
[23:35:14] Trying to send all finished work units
[23:35:14] Project: 10104 (Run 79, Clone 4, Gen 0)
[23:35:14] - Read packet limit of 540015616... Set to 524286976.


[23:35:14] + Attempting to send results [February 9 23:35:14 UTC]
[23:35:14] - Reading file work/wuresults_01.dat from core
[23:35:14]   (Read 132585 bytes from disk)
[23:35:14] Connecting to http://171.64.65.71:8080/
[23:35:17] Posted data.
[23:35:17] Initial: 0000; - Uploaded at ~43 kB/s
[23:35:17] - Averaged speed for that direction ~84 kB/s
[23:35:17] - Server has already received unit.
[23:35:17] + Sent 0 of 1 completed units to the server
[23:35:17] - Preparing to get new work unit...
[23:35:17] + Attempting to get work packet
[23:35:17] - Will indicate memory of 896 MB
[23:35:17] - Connecting to assignment server
[23:35:17] Connecting to http://assign-GPU.stanford.edu:8080/
[23:35:17] Posted data.
[23:35:17] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[23:35:17] + News From Folding@Home: Welcome to Folding@Home
[23:35:17] Loaded queue successfully.
[23:35:17] Connecting to http://171.67.108.11:8080/
[23:35:19] Posted data.
The WU was uploaded, but not counted. The machine has no other problems with connections, so, considering all 3 of my GPU's have had the same situation occur recently, I blame the server.
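
The stall in a log like this one can be put into numbers by timing each "Connecting to" line against the "Posted data." line that follows it. The sketch below is illustrative only: `post_delays` is a made-up helper, and it assumes the `-verbosity 9` line format shown in the log above.

```python
import re

# Splits "[HH:MM:SS] message" into its time fields and the message text.
STAMP = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\] (.*)$")
PREFIX = "Connecting to "


def post_delays(lines):
    """Return (url, seconds) for each 'Connecting to' / 'Posted data.' pair."""
    delays = []
    pending = None  # (url, start time in seconds) of the open connection
    for line in lines:
        m = STAMP.match(line.strip())
        if not m:
            continue
        h, mi, s, text = m.groups()
        now = int(h) * 3600 + int(mi) * 60 + int(s)
        if text.startswith(PREFIX):
            pending = (text[len(PREFIX):], now)
        elif text.startswith("Posted data.") and pending is not None:
            url, start = pending
            # No date in the timestamps, so allow for one midnight rollover.
            delays.append((url, (now - start) % 86400))
            pending = None
    return delays


# Lines copied from the log above.
sample = [
    "[23:25:18] Connecting to http://171.64.65.71:8080/",
    "[23:33:45] Posted data.",
    "[23:35:14] Connecting to http://171.64.65.71:8080/",
    "[23:35:17] Posted data.",
    "[23:35:17] Connecting to http://assign-GPU.stanford.edu:8080/",
    "[23:35:17] Posted data.",
]

for url, seconds in post_delays(sample):
    print(f"{seconds:4d}s  {url}")
```

On these lines the first post to 171.64.65.71 takes 507 seconds, while the retry takes 3 seconds and the assignment server answers within the same second.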
Tigerbiten
Posts: 62
Joined: Sun Dec 02, 2007 6:02 am

Re: 171.64.65.71 accepting... but

Post by Tigerbiten »

I'm getting the same error.

Code: Select all

[04:04:22] Project: 10102 (Run 356, Clone 3, Gen 7)
[04:04:22] - Read packet limit of 540015616... Set to 524286976.


[04:04:22] + Attempting to send results [February 11 04:04:22 UTC]
[04:04:22] - Reading file work/wuresults_03.dat from core
[04:04:22]   (Read 132158 bytes from disk)
[04:04:22] Connecting to http://171.64.65.71:8080/
[04:24:22] Posted data.
[04:29:27] Initial: 484A; - Uploaded at ~0 kB/s
[04:29:27] - Averaged speed for that direction ~18 kB/s
[04:29:27] - Unknown packet returned from server, expected ACK for results
[04:29:27] - Error: Could not transmit unit 03 (completed February 11) to work server.
[04:29:27] - 1 failed uploads of this unit.
[04:29:27]   Keeping unit 03 in queue.
[04:29:27] Trying to send all finished work units
[04:29:27] Project: 10102 (Run 356, Clone 3, Gen 7)
[04:29:27] - Read packet limit of 540015616... Set to 524286976.


[04:29:27] + Attempting to send results [February 11 04:29:27 UTC]
[04:29:27] - Reading file work/wuresults_03.dat from core
[04:29:27]   (Read 132158 bytes from disk)
[04:29:27] Connecting to http://171.64.65.71:8080/
[04:29:30] Posted data.
[04:29:30] Initial: 0000; - Uploaded at ~43 kB/s
[04:29:30] - Averaged speed for that direction ~23 kB/s
[04:29:30] - Server has already received unit.
[04:29:30] + Sent 0 of 1 completed units to the server
[04:29:30] - Preparing to get new work unit...
........... snip ...........
[06:38:25] - Autosending finished units... [February 11 06:38:25 UTC]
[06:38:25] Trying to send all finished work units
[06:38:25] + No unsent completed units remaining.
[06:38:25] - Autosend completed
[06:38:25] + Working...
Did Project: 10102 (Run 356, Clone 3, Gen 7) get uploaded, or did the server reject it?
I see the line "1 failed uploads of this unit." and then the line "Server has already received unit."
Which of these two lines is correct? The results file is still in my work folder.
Has the server recorded the work unit as received without any data being sent?
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: 171.64.65.71 accepting... but

Post by 7im »

[04:29:27] - Unknown packet returned from server, expected ACK for results

A few years ago, the ACK message with a v5 client indicated the WU was actually uploaded, but the client didn't get back an acknowledgement the server received it. The client would then try to send it again, and again. With the new server code, it appears the server is smarter now, and realizes the WU was already uploaded.

Verify the points if you can, and then you can delete the WU if it doesn't delete itself. If you can't verify the points, don't delete it just yet, on the outside chance there was a server hiccup. Even so, you won't be able to upload this WU if the server continues to say it already has it.
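
The pattern described here (a missing ACK followed by "Server has already received unit.") can be spotted in a log mechanically. What follows is a hedged sketch, not anything the client itself does: the outcome labels and the function names `upload_outcomes` and `probably_received` are invented for illustration, and the message fragments are copied from logs posted in this thread. It does not confirm that points were credited; that still has to be checked separately.

```python
# Message fragments copied from the client logs in this thread,
# paired with hypothetical labels used only in this sketch.
OUTCOMES = [
    ("Results successfully sent", "sent"),
    ("Server has already received unit", "already_received"),
    ("Unknown packet returned from server", "no_ack"),
    ("Server does not have record of this unit", "no_record"),
    ("Could not connect to Work Server", "connect_failed"),
]


def upload_outcomes(lines):
    """Return one outcome label per '+ Attempting to send results' block."""
    outcomes = []
    in_attempt = False
    for line in lines:
        if "+ Attempting to send results" in line:
            if in_attempt:
                outcomes.append("unknown")  # previous attempt never resolved
            in_attempt = True
            continue
        if not in_attempt:
            continue
        for fragment, label in OUTCOMES:
            if fragment in line:
                outcomes.append(label)
                in_attempt = False  # only the first result line counts
                break
    if in_attempt:
        outcomes.append("unknown")
    return outcomes


def probably_received(outcomes):
    """True if a missing ACK is directly followed by 'already received'."""
    return any(
        a == "no_ack" and b == "already_received"
        for a, b in zip(outcomes, outcomes[1:])
    )


# Lines copied from Tigerbiten's log earlier in the thread.
sample = [
    "[04:04:22] + Attempting to send results [February 11 04:04:22 UTC]",
    "[04:04:22] Connecting to http://171.64.65.71:8080/",
    "[04:24:22] Posted data.",
    "[04:29:27] - Unknown packet returned from server, expected ACK for results",
    "[04:29:27] - Error: Could not transmit unit 03 (completed February 11) to work server.",
    "[04:29:27] + Attempting to send results [February 11 04:29:27 UTC]",
    "[04:29:30] Posted data.",
    "[04:29:30] - Server has already received unit.",
]

outcomes = upload_outcomes(sample)
print(outcomes, probably_received(outcomes))
```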
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
tobor
Posts: 56
Joined: Tue Jul 15, 2008 11:15 pm
Hardware configuration: ASUS M3N-HT deluxe,AMD6400 duel 3.2gig, GeForce9800 GTX C-760 M-1140 S-1900,4 gig OCZ ddr
Location: Missouri,USA

Re: 171.64.65.71 accepting... but

Post by tobor »

I don't think it was uploaded (in my case anyway).
Until I deleted the queue.dat file, my points were way down.
It was hanging for 45 minutes or more after each client was done trying to upload.
ikerekes
Posts: 94
Joined: Thu Nov 13, 2008 4:18 pm
Hardware configuration: q6600 @ 3.3Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Athlon x2 6000+ @ 3.0Ghz ubuntu 8.04 smp + asus 9600GSO gpu2 in wine wrapper
5600X2 @ 3.19Ghz ubuntu 8.04 smp + asus 9600GSO gpu2 in wine wrapper
E5200 @ 3.7Ghz ubuntu 8.04 smp2 + asus 9600GT silent gpu2 in wine wrapper
E5200 @ 3.65Ghz ubuntu 8.04 smp2 + asus 9600GSO gpu2 in wine wrapper
E6550 vmware ubuntu 8.4.1
q8400 @ 3.3Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Athlon II 620 @ 2.6 Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Location: Calgary, Canada

Re: 171.64.65.71 accepting... but

Post by ikerekes »

On two of my GPUs, after failing to return the WU to 171.64.65.71, I shut down the client, cleaned the work directory, and restarted the client.
The assignment server insists on assigning this faulty server to me, and my client just hangs there, unable to get any new work:

Code: Select all

--- Opening Log file [February 11 19:17:43 UTC] 


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: Z:\home\kerekei\fahgpu2
Executable: [email protected]
Arguments: -verbosity 9 -forcegpu nvidia_g80 

[19:17:43] - Ask before connecting: No
[19:17:43] - User name: ikerekes (Team 50619)
[19:17:43] - User ID: 94F4BD501CF3AB2
[19:17:43] - Machine ID: 2
[19:17:43] 
[19:17:43] Work directory not found. Creating...
[19:17:43] Could not open work queue, generating new queue...
[19:17:43] - Autosending finished units... [February 11 19:17:43 UTC]
[19:17:43] Trying to send all finished work units
[19:17:43] + No unsent completed units remaining.
[19:17:43] - Autosend completed
[19:17:43] - Preparing to get new work unit...
[19:17:43] + Attempting to get work packet
[19:17:43] - Will indicate memory of 2013 MB
[19:17:43] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 7, Stepping: 10
[19:17:43] - Connecting to assignment server
[19:17:43] Connecting to http://assign-GPU.stanford.edu:8080/
[19:17:43] Posted data.
[19:17:43] Initial: 40AB; - Successful: assigned to (171.64.65.71).
[19:17:43] + News From Folding@Home: Welcome to Folding@Home
[19:17:43] Loaded queue successfully.
[19:17:43] Connecting to http://171.64.65.71:8080/
Could PG take this server out of circulation until it is fixed?
tobor
Posts: 56
Joined: Tue Jul 15, 2008 11:15 pm
Hardware configuration: ASUS M3N-HT deluxe,AMD6400 duel 3.2gig, GeForce9800 GTX C-760 M-1140 S-1900,4 gig OCZ ddr
Location: Missouri,USA

Re: 171.64.65.71 accepting... but

Post by tobor »

@!^@#!@$%$*)*!# GOT ANOTHER ONE !!!!!!

Code: Select all

[23:23:55] Loaded queue successfully.
[23:23:55] Initialization complete
[23:23:55] 
[23:23:55] + Processing work unit
[23:23:55] Project: 10102 (Run 862, Clone 3, Gen 7)


[23:23:55] + Attempting to send results [February 11 23:23:55 UTC]
[23:23:55] Core required: FahCore_11.exe
[23:23:55] Core found.
[23:23:55] Working on queue slot 06 [February 11 23:23:55 UTC]
[23:23:55] + Working ...
[23:23:55] 
[23:23:55] *------------------------------*
[23:23:55] Folding@Home GPU Core
[23:23:55] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[23:23:55] 
[23:23:55] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[23:23:55] Build host: amoeba
[23:23:55] Board Type: Nvidia
[23:23:55] Core      : 
[23:23:55] Preparing to commence simulation
[23:23:55] - Looking at optimizations...
[23:23:55] - Files status OK
[23:23:55] - Expanded 46723 -> 252912 (decompressed 541.3 percent)
[23:23:55] Called DecompressByteArray: compressed_data_size=46723 data_size=252912, decompressed_data_size=252912 diff=0
[23:23:55] - Digital signature verified
[23:23:55] 
[23:23:55] Project: 5766 (Run 9, Clone 274, Gen 1062)
[23:23:55] 
[23:23:55] Assembly optimizations on if available.
[23:23:55] Entering M.D.
[23:24:01] Will resume from checkpoint file
[23:24:01] Tpr hash work/wudata_06.tpr:  2945747848 1455680995 1206627321 2123427442 2872100080
[23:24:01] 
[23:24:01] Calling fah_main args: 14 usage=100
[23:24:01] 
[23:24:01] Working on Protein
[23:24:02] Client config found, loading data.
[23:24:02] Resuming from checkpoint
[23:24:02] fcCheckPointResume: retreived and current tpr file hash:
[23:24:02]    0   2945747848   2945747848
[23:24:02]    1   1455680995   1455680995
[23:24:02]    2   1206627321   1206627321
[23:24:02]    3   2123427442   2123427442
[23:24:02]    4   2872100080   2872100080
[23:24:02] fcCheckPointResume: file hashes same.
[23:24:02] fcCheckPointResume: state restored.
[23:24:02] Verified work/wudata_06.log
[23:24:02] Verified work/wudata_06.edr
[23:24:02] Verified work/wudata_06.xtc
[23:24:02] Completed 13%
[23:24:02] Starting GUI Server
[23:24:34] Completed 14%
[23:25:06] Completed 15%
[23:25:39] Completed 16%
[23:26:11] Completed 17%
[23:26:43] Completed 18%
[23:27:16] Completed 19%
[23:27:48] Completed 20%
[23:28:20] Completed 21%
[23:28:24] - Couldn't send HTTP request to server
[23:28:24] + Could not connect to Work Server (results)
[23:28:24]     (171.64.65.71:8080)
[23:28:24] + Retrying using alternative port
[23:28:32] - Couldn't send HTTP request to server
[23:28:32] + Could not connect to Work Server (results)
[23:28:32]     (171.64.65.71:80)
[23:28:32] - Error: Could not transmit unit 05 (completed February 11) to work server.


[23:28:32] + Attempting to send results [February 11 23:28:32 UTC]
[23:28:53] Completed 22%
[23:29:25] Completed 23%
[23:29:36] - Server does not have record of this unit. Will try again later.
[23:29:36]   Could not transmit unit 05 to Collection server; keeping in queue.
[23:29:57] Completed 24%
[23:30:30] Completed 25%
[23:31:03] Completed 26%
Flathead74
Posts: 266
Joined: Sun Dec 02, 2007 6:08 pm
Location: Central New York

Re: 171.64.65.71 accepting... but

Post by Flathead74 »

Code: Select all

[18:22:17] Completed 100%
[18:22:17] Successful run
[18:22:17] DynamicWrapper: Finished Work Unit: sleep=10000
[18:22:27] Reserved 101028 bytes for xtc file; Cosm status=0
[18:22:27] Allocated 101028 bytes for xtc file
[18:22:27] - Reading up to 101028 from "work/wudata_08.xtc": Read 101028
[18:22:27] Read 101028 bytes from xtc file; available packet space=786329436
[18:22:27] xtc file hash check passed.
[18:22:27] Reserved 30216 30216 786329436 bytes for arc file=<work/wudata_08.trr> Cosm status=0
[18:22:27] Allocated 30216 bytes for arc file
[18:22:27] - Reading up to 30216 from "work/wudata_08.trr": Read 30216
[18:22:27] Read 30216 bytes from arc file; available packet space=786299220
[18:22:27] trr file hash check passed.
[18:22:27] Allocated 560 bytes for edr file
[18:22:27] Read bedfile
[18:22:27] edr file hash check passed.
[18:22:27] Logfile not read.
[18:22:27] GuardedRun: success in DynamicWrapper
[18:22:27] GuardedRun: done
[18:22:27] Run: GuardedRun completed.
[18:22:31] + Opened results file
[18:22:31] - Writing 132316 bytes of core data to disk...
[18:22:31] Done: 131804 -> 131365 (compressed to 99.6 percent)
[18:22:31]   ... Done.
[18:22:31] DeleteFrameFiles: successfully deleted file=work/wudata_08.ckp
[18:22:31] Shutting down core 
[18:22:31] 
[18:22:31] Folding@home Core Shutdown: FINISHED_UNIT
[18:22:35] CoreStatus = 64 (100)
[18:22:35] Unit 8 finished with 98 percent of time to deadline remaining.
[18:22:35] Updated performance fraction: 0.979071
[18:22:35] Sending work to server
[18:22:35] Project: 10103 (Run 988, Clone 3, Gen 4)
[18:22:35] - Read packet limit of 540015616... Set to 524286976.


[18:22:35] + Attempting to send results [February 12 18:22:35 UTC]
[18:22:35] - Reading file work/wuresults_08.dat from core
[18:22:35]   (Read 131877 bytes from disk)
[18:22:35] Connecting to http://171.64.65.71:8080/
[18:31:02] Posted data.
[18:34:46] Initial: 48FA; - Uploaded at ~0 kB/s
[18:34:46] - Averaged speed for that direction ~55 kB/s
[18:34:46] - Unknown packet returned from server, expected ACK for results
[18:34:46] - Error: Could not transmit unit 08 (completed February 12) to work server.
[18:34:46] - 1 failed uploads of this unit.
[18:34:46]   Keeping unit 08 in queue.
[18:34:46] Trying to send all finished work units
[18:34:46] Project: 10103 (Run 988, Clone 3, Gen 4)
[18:34:46] - Read packet limit of 540015616... Set to 524286976.


[18:34:46] + Attempting to send results [February 12 18:34:46 UTC]
[18:34:46] - Reading file work/wuresults_08.dat from core
[18:34:46]   (Read 131877 bytes from disk)
[18:34:46] Connecting to http://171.64.65.71:8080/
[18:34:53] Posted data.
[18:34:53] Initial: 0000; - Uploaded at ~18 kB/s
[18:34:53] - Averaged speed for that direction ~47 kB/s
[18:34:53] - Server has already received unit.
[18:34:53] + Sent 0 of 1 completed units to the server
[18:34:53] - Preparing to get new work unit...
[18:34:53] + Attempting to get work packet
Here's another that looks very similar to the WU reported by Ragnar Dan.
statesidecoma
Posts: 20
Joined: Thu Jan 15, 2009 3:51 am
Location: Grove, Oklahoma

Re: 171.64.65.71 accepting... but

Post by statesidecoma »

I have this same problem on 6 computers. It just started days ago for no reason at all. I don't have a proxy to go through. I have even deleted the whole client and set up a whole new one. It does the exact same thing. There is something going on here. I'm not saying there is something wrong with the server, but it can't be our routers. Something has changed in the last 7 days. And it doesn't matter which server you get: you get the same old message. This isn't happening on SMP clients, just GPU. Are there any solutions or suggestions?
statesidecoma
Posts: 20
Joined: Thu Jan 15, 2009 3:51 am
Location: Grove, Oklahoma

Re: 171.64.65.71 accepting... but

Post by statesidecoma »

My bad, just read some other threads. Looks like a server problem and you guys are aware of it. That's good to know. :biggrin: