GPU server status 171.67.108.21, 171.64.65.71, 171.67.108.26

Moderators: Site Moderators, FAHC Science Team

ikerekes
Posts: 94
Joined: Thu Nov 13, 2008 4:18 pm
Hardware configuration: q6600 @ 3.3Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Athlon x2 6000+ @ 3.0Ghz ubuntu 8.04 smp + asus 9600GSO gpu2 in wine wrapper
5600X2 @ 3.19Ghz ubuntu 8.04 smp + asus 9600GSO gpu2 in wine wrapper
E5200 @ 3.7Ghz ubuntu 8.04 smp2 + asus 9600GT silent gpu2 in wine wrapper
E5200 @ 3.65Ghz ubuntu 8.04 smp2 + asus 9600GSO gpu2 in wine wrapper
E6550 vmware ubuntu 8.04.1
q8400 @ 3.3Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Athlon II 620 @ 2.6 Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Location: Calgary, Canada

GPU server status 171.67.108.21, 171.64.65.71, 171.67.108.26

Post by ikerekes »

I am folding with 7 GPU clients and cleaning the queues regularly, because for a few days now I have had serious problems uploading results to the servers.
This morning, across 6 of my 7 cards, I have 9 WUs "ready for upload" (poor science :twisted: ). My major problem is that whenever the server cannot accept a WU, the client hangs for about 45 minutes trying to resend it :mrgreen:
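(To make the stall concrete: the behaviour looks like a blocking retry loop, roughly sketched below in Python. This is only my reading of the log, not the actual client source; the function name and the 45-minute figure are observations, not documented values.)

Code: Select all

import time

RETRY_WAIT = 45 * 60  # the ~45 minute stall described above (observed, not documented)

def send_results(wu_id):
    """Stand-in for the HTTP POST to the work server; pretend it keeps failing."""
    return False

def upload_blocking(wu_id, max_tries=3):
    # The client appears to block like this instead of folding the next WU,
    # so one unreachable server idles the whole GPU for the duration.
    # (Sketch only: running it really does sleep for 45 minutes per retry.)
    for _ in range(max_tries):
        if send_results(wu_id):
            return True
        time.sleep(RETRY_WAIT)
    return False  # the unit stays "ready for upload" in the queue
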
Asus 9800GT/512M
Index 1: ready for upload 164 X min speed
server: 171.67.108.21:8080; project: 3470
Folding: run 10, clone 62, generation 0; benchmark 0; misc: 500, 200
issue: Fri Feb 12 20:27:41 2010; begin: Fri Feb 12 20:27:36 2010
end: Fri Feb 12 21:46:40 2010; due: Sun Feb 21 20:27:36 2010 (9 days)
--
Index 2: ready for upload 164 X min speed
server: 171.67.108.21:8080; project: 3470
Folding: run 10, clone 175, generation 0; benchmark 0; misc: 500, 200
issue: Fri Feb 12 21:47:52 2010; begin: Fri Feb 12 21:47:46 2010
end: Fri Feb 12 23:06:39 2010; due: Sun Feb 21 21:47:46 2010 (9 days)
--
Index 3: ready for upload 24.7 X min speed
server: 171.64.65.71:8080; project: 10105
Folding: run 109, clone 6, generation 2; benchmark 0; misc: 500, 200
issue: Fri Feb 12 23:07:54 2010; begin: Fri Feb 12 23:07:49 2010
end: Sat Feb 13 02:02:26 2010; due: Mon Feb 15 23:07:49 2010 (3 days)


Asus 9600GSO/384M
Index 1: ready for upload 19.6 X min speed
server: 171.64.65.71:8080; project: 10105
Folding: run 228, clone 3, generation 0; benchmark 0; misc: 500, 200
issue: Thu Feb 11 12:22:38 2010; begin: Thu Feb 11 12:27:25 2010
end: Thu Feb 11 16:07:40 2010; due: Sun Feb 14 12:27:25 2010 (3 days)
--
Index 2: ready for upload 783.00 pts (176.628 pt/hr) 135 X min speed
server: 171.67.108.21:8080; project: 5781
Folding: run 13, clone 973, generation 3; benchmark 0; misc: 500, 200
issue: Fri Feb 12 21:21:15 2010; begin: Fri Feb 12 21:21:16 2010
end: Sat Feb 13 01:47:15 2010; due: Tue Mar 9 21:21:16 2010 (25 days)


Asus 9600GSO/384M
Index 1: ready for upload 783.00 pts (154.633 pt/hr) 118 X min speed
server: 171.67.108.21:8080; project: 5781
Folding: run 7, clone 16, generation 1; benchmark 0; misc: 500, 200
issue: Fri Feb 12 18:02:43 2010; begin: Fri Feb 12 18:02:44 2010
end: Fri Feb 12 23:06:33 2010; due: Tue Mar 9 18:02:44 2010 (25 days)


Asus 9600GT Silent
Index 2: ready for upload 104 X min speed
server: 171.67.108.21:8080; project: 3469
Folding: run 13, clone 184, generation 0; benchmark 0; misc: 500, 200
issue: Fri Feb 12 21:46:40 2010; begin: Fri Feb 12 21:47:27 2010
end: Fri Feb 12 23:52:31 2010; due: Sun Feb 21 21:47:27 2010 (9 days)


XFX 9600GSO/768M


Zotac 9800GT/512 low power
Index 7: ready for upload 783.00 pts (199.972 pt/hr) 153 X min speed
server: 171.67.108.21:8080; project: 5781
Folding: run 29, clone 386, generation 1; benchmark 0; misc: 500, 200
issue: Fri Feb 12 19:43:12 2010; begin: Fri Feb 12 19:43:48 2010
end: Fri Feb 12 23:38:44 2010; due: Tue Mar 9 19:43:48 2010 (25 days)


Asus 9600GSO/512M
Index 3: ready for upload 91.2 X min speed
server: 171.67.108.21:8080; project: 3470
Folding: run 4, clone 153, generation 0; benchmark 0; misc: 500, 200
issue: Fri Feb 12 20:24:25 2010; begin: Fri Feb 12 20:24:25 2010
end: Fri Feb 12 22:46:34 2010; due: Sun Feb 21 20:24:25 2010 (9 days)
There are numerous threads on this forum about these and some other servers (all of them running server code v5) exhibiting the same problems.
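(For anyone wondering where figures like "164 X min speed" come from: they appear to be the deadline window divided by the time the card actually took, i.e. how many times faster than the minimum required pace the WU was finished. A quick check in Python against the first 9800GT entry above; the parsing is mine, not the client's:)

Code: Select all

from datetime import datetime

FMT = "%a %b %d %H:%M:%S %Y"

# First entry of the Asus 9800GT above (project 3470)
begin = datetime.strptime("Fri Feb 12 20:27:36 2010", FMT)
end   = datetime.strptime("Fri Feb 12 21:46:40 2010", FMT)
due   = datetime.strptime("Sun Feb 21 20:27:36 2010", FMT)

actual  = (end - begin).total_seconds()  # ~1 h 19 min of folding
allowed = (due - begin).total_seconds()  # 9-day deadline window
print(round(allowed / actual))           # -> 164, matching "164 X min speed"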
Last edited by ikerekes on Sat Feb 13, 2010 2:59 pm, edited 1 time in total.
Pette Broad
Posts: 128
Joined: Mon Dec 03, 2007 9:38 pm
Hardware configuration: CPU folding on only one machine a laptop

GPU Hardware..
3 x 460
1 X 260
4 X 250

+ 1 X 9800GT (3 days a week)
Location: Chester U.K

Re: GPU server status

Post by Pette Broad »

Yeah, it's starting to look like there's big trouble. I have 14 out of 14 GPU clients waiting to send units back, and that's without going into the CPU nightmare position I'm in right now.

Pete
Tobit
Posts: 342
Joined: Thu Apr 17, 2008 2:35 pm
Location: Manchester, NH USA

Re: GPU server status

Post by Tobit »

171.67.108.21 is up but in Reject mode.

I am sure Stanford is aware of the server problems and is working on them. Let's all just try to be patient a bit longer. Thankfully there are some other GPU servers still up and assigning new work.
ikerekes
Posts: 94
Joined: Thu Nov 13, 2008 4:18 pm
Hardware configuration: q6600 @ 3.3Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Athlon x2 6000+ @ 3.0Ghz ubuntu 8.04 smp + asus 9600GSO gpu2 in wine wrapper
5600X2 @ 3.19Ghz ubuntu 8.04 smp + asus 9600GSO gpu2 in wine wrapper
E5200 @ 3.7Ghz ubuntu 8.04 smp2 + asus 9600GT silent gpu2 in wine wrapper
E5200 @ 3.65Ghz ubuntu 8.04 smp2 + asus 9600GSO gpu2 in wine wrapper
E6550 vmware ubuntu 8.04.1
q8400 @ 3.3Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Athlon II 620 @ 2.6 Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Location: Calgary, Canada

Re: GPU server status

Post by ikerekes »

Tobit wrote:
I am sure Stanford is aware of the server problems and is working on them. Let's all just try to be patient a bit longer. Thankfully there are some other GPU servers still up and assigning new work.
Yes Mr. Tobit, I am aware that Vijay is aware of the problem: http://foldingforum.org/viewtopic.php?f ... ed#p129884
Maybe you are aware of the fact that this was first reported on Jan 22? And today is Feb 13 (at least it's not a Friday :mrgreen: )
Tobit
Posts: 342
Joined: Thu Apr 17, 2008 2:35 pm
Location: Manchester, NH USA

Re: GPU server status 171.67.108.21, 171.64.65.71, 171.67.108.26

Post by Tobit »

Looks like 108.21 is working just fine again, as I've just uploaded my backlog of work to this server.
shunter
Posts: 84
Joined: Sun Apr 06, 2008 8:22 am
Location: Hertfordshire, United Kingdom

171.67.108.21 Accepting .... BUT

Post by shunter »

This looks to be the same issue as viewtopic.php?f=18&t=13297, but on a different server. There were long delays yesterday in uploading units due to server problems, which appeared to have been fixed. But earlier this morning (approx. 4:00 am GMT) my completed-but-failed-to-upload GPU units started to disappear after the failed uploads, as the server recorded them as already completed. So far I have lost 7 units, which is about 15% of today's production.

I have read the earlier comments, but my system's settings have not changed in ages, so I have to assume it is a server issue. I hope this is resolved soon, although today is a Sunday.

Good luck
Shunter


[08:26:06] + Attempting to send results [February 14 08:26:06 UTC]
[08:26:06] - Reading file work/wuresults_07.dat from core
[08:26:06] (Read 65476 bytes from disk)
[08:26:06] Connecting to http://171.67.108.21:8080/
[08:26:07] - Couldn't send HTTP request to server
[08:26:07] + Could not connect to Work Server (results)
[08:26:07] (c)
[08:26:07] + Retrying using alternative port
[08:26:07] Connecting to http://171.67.108.21:80/
[08:26:29] - Couldn't send HTTP request to server
[08:26:29] + Could not connect to Work Server (results)
[08:26:29] (171.67.108.21:80)
[08:26:29] - Error: Could not transmit unit 07 (completed February 14) to work server.
[08:26:29] - 1 failed uploads of this unit.
[08:26:29] Keeping unit 07 in queue.
[08:26:29] Trying to send all finished work units
[08:26:29] Project: 3470 (Run 10, Clone 182, Gen 2)
[08:26:29] - Read packet limit of 540015616... Set to 524286976.


[08:26:29] + Attempting to send results [February 14 08:26:29 UTC]
[08:26:29] - Reading file work/wuresults_07.dat from core
[08:26:29] (Read 65476 bytes from disk)
[08:26:29] Connecting to http://171.67.108.21:8080/
[08:26:30] Posted data.
[08:26:30] Initial: 0000; - Uploaded at ~64 kB/s
[08:26:30] - Averaged speed for that direction ~40 kB/s
[08:26:30] - Server has already received unit.
[08:26:30] + Sent 0 of 1 completed units to the server
Tobit
Posts: 342
Joined: Thu Apr 17, 2008 2:35 pm
Location: Manchester, NH USA

Re: GPU server status 171.67.108.21, 171.64.65.71, 171.67.108.26

Post by Tobit »

I can confirm the "Server has already received unit" problem that shunter is also experiencing on 171.67.108.21. This has been going on since approximately 0600 UTC on Sunday.

Code: Select all

[08:46:23] Unit 9 finished with 99 percent of time to deadline remaining.
[08:46:23] Updated performance fraction: 0.990288
[08:46:23] Sending work to server
[08:46:23] Project: 5781 (Run 15, Clone 291, Gen 3)
[08:46:23] - Read packet limit of 540015616... Set to 524286976.

[08:46:23] + Attempting to send results [February 14 08:46:23 UTC]
[08:46:23] - Reading file work/wuresults_09.dat from core
[08:46:23]   (Read 168629 bytes from disk)
[08:46:23] Connecting to http://171.67.108.21:8080/
[08:46:25] - Couldn't send HTTP request to server
[08:46:25] + Could not connect to Work Server (results)
[08:46:25]     (171.67.108.21:8080)
[08:46:25] + Retrying using alternative port
[08:46:25] Connecting to http://171.67.108.21:80/
[08:46:46] - Couldn't send HTTP request to server
[08:46:46] + Could not connect to Work Server (results)
[08:46:46]     (171.67.108.21:80)
[08:46:46] - Error: Could not transmit unit 09 (completed February 14) to work server.
[08:46:46] - 1 failed uploads of this unit.
[08:46:46]   Keeping unit 09 in queue.
[08:46:46] Trying to send all finished work units
[08:46:46] Project: 5781 (Run 15, Clone 291, Gen 3)
[08:46:46] - Read packet limit of 540015616... Set to 524286976.

[08:46:46] + Attempting to send results [February 14 08:46:46 UTC]
[08:46:46] - Reading file work/wuresults_09.dat from core
[08:46:46]   (Read 168629 bytes from disk)
[08:46:46] Connecting to http://171.67.108.21:8080/
[08:46:47] Posted data.
[08:46:47] Initial: 0000; - Uploaded at ~165 kB/s
[08:46:47] - Averaged speed for that direction ~89 kB/s
[08:46:47] - Server has already received unit.
[08:46:47] + Sent 0 of 1 completed units to the server
[08:46:47] - Preparing to get new work unit...
Pette Broad
Posts: 128
Joined: Mon Dec 03, 2007 9:38 pm
Hardware configuration: CPU folding on only one machine a laptop

GPU Hardware..
3 x 460
1 X 260
4 X 250

+ 1 X 9800GT (3 days a week)
Location: Chester U.K

Re: GPU server status 171.67.108.21, 171.64.65.71, 171.67.108.26

Post by Pette Broad »

This needs to be sorted very quickly. I'm completing units and getting no credit because the server has already received the unit. I've seen this project go through some difficult times, but for me at least this is a disaster. With the umpteen CPU units waiting to be returned, some for over 2 weeks, and the complete lack of information from the Pande Group, I'm probably going to switch projects soon.

Pete
derrickmcc
Posts: 221
Joined: Fri Jul 24, 2009 12:30 am
Hardware configuration: 2 x GTX 460 (825/1600/1650)
AMD Athlon II X2 250 3.0Ghz
Kingston 2Gb DDR2 1066 Mhz
MSI K9A2 Platinum
Western Digital 500Gb Sata II
LiteOn DVD
Coolermaster 900W UCP
Antec 902
Windows XP SP3
Location: Malvern, UK

Re: GPU server status 171.67.108.21, 171.64.65.71, 171.67.108.26

Post by derrickmcc »

I am also having the same problem across 4 GPUs:

Code: Select all

[10:42:53] + Attempting to send results [February 14 10:42:53 UTC]
[10:42:55] - Couldn't send HTTP request to server
[10:42:55] + Could not connect to Work Server (results)
[10:42:55]     (171.67.108.21:8080)
[10:42:55] + Retrying using alternative port
[10:43:16] - Couldn't send HTTP request to server
[10:43:16] + Could not connect to Work Server (results)
[10:43:16]     (171.67.108.21:80)
[10:43:16] - Error: Could not transmit unit 09 (completed February 14) to work server.
[10:43:16]   Keeping unit 09 in queue.
[10:43:16] Project: 3470 (Run 1, Clone 24, Gen 2)
[10:43:16] - Read packet limit of 540015616... Set to 524286976.

[10:43:16] + Attempting to send results [February 14 10:43:16 UTC]
[10:43:18] - Server has already received unit.

...
[12:40:48] + Attempting to send results [February 14 12:40:48 UTC]
[12:40:53] - Couldn't send HTTP request to server
[12:40:53] + Could not connect to Work Server (results)
[12:40:53]     (171.67.108.21:8080)
[12:40:53] + Retrying using alternative port
[12:41:14] - Couldn't send HTTP request to server
[12:41:14] + Could not connect to Work Server (results)
[12:41:14]     (171.67.108.21:80)
[12:41:14] - Error: Could not transmit unit 03 (completed February 14) to work server.
[12:41:14]   Keeping unit 03 in queue.
[12:41:14] Project: 5781 (Run 14, Clone 955, Gen 3)
[12:41:14] - Read packet limit of 540015616... Set to 524286976.

[12:41:14] + Attempting to send results [February 14 12:41:14 UTC]
[12:41:18] - Server has already received unit.

...
[11:44:37] + Attempting to send results [February 14 11:44:37 UTC]
[11:44:39] - Couldn't send HTTP request to server
[11:44:39] + Could not connect to Work Server (results)
[11:44:39]     (171.67.108.21:8080)
[11:44:39] + Retrying using alternative port
[11:45:00] - Couldn't send HTTP request to server
[11:45:00] + Could not connect to Work Server (results)
[11:45:00]     (171.67.108.21:80)
[11:45:00] - Error: Could not transmit unit 00 (completed February 14) to work server.
[11:45:00]   Keeping unit 00 in queue.
[11:45:00] Project: 3470 (Run 3, Clone 25, Gen 2)
[11:45:00] - Read packet limit of 540015616... Set to 524286976.

[11:45:00] + Attempting to send results [February 14 11:45:00 UTC]
[11:45:02] - Server has already received unit.

...
[10:50:27] + Attempting to send results [February 14 10:50:27 UTC]
[10:50:31] - Couldn't send HTTP request to server
[10:50:31] + Could not connect to Work Server (results)
[10:50:31]     (171.67.108.21:8080)
[10:50:31] + Retrying using alternative port
[10:50:52] - Couldn't send HTTP request to server
[10:50:52] + Could not connect to Work Server (results)
[10:50:52]     (171.67.108.21:80)
[10:50:52] - Error: Could not transmit unit 00 (completed February 14) to work server.
[10:50:52]   Keeping unit 00 in queue.
[10:50:52] Project: 5781 (Run 15, Clone 150, Gen 3)
[10:50:52] - Read packet limit of 540015616... Set to 524286976.

[10:50:52] + Attempting to send results [February 14 10:50:52 UTC]
[10:50:56] - Server has already received unit.

Tobit
Posts: 342
Joined: Thu Apr 17, 2008 2:35 pm
Location: Manchester, NH USA

Re: GPU server status 171.67.108.21, 171.64.65.71, 171.67.108.26

Post by Tobit »

05:08:39 is the last time 171.67.108.21 reported a successful upload from one of my GPUs. After that, a WU finishes, attempts to upload, reports that it can't connect, keeps the unit in the queue, prepares to start a new WU, tries to send all unsent work, uploads fine to 171.67.108.21, reports that the server already has the WU, deletes the WU, downloads new work, and the cycle starts again. All GPUs here keep getting assigned to 171.67.108.21 and all are affected. These are dedicated machines, and no configuration has changed in weeks.

Here is a perfect example of the last successful WU upload followed by the failures that started around 0600 UTC:

Code: Select all

[04:58:10] Unit 8 finished with 99 percent of time to deadline remaining.
[04:58:10] Updated performance fraction: 0.989444
[04:58:10] Sending work to server
[04:58:10] Project: 3469 (Run 22, Clone 180, Gen 1)
[04:58:10] - Read packet limit of 540015616... Set to 524286976.

[04:58:10] + Attempting to send results [February 14 04:58:10 UTC]
[04:58:10] - Reading file work/wuresults_08.dat from core
[04:58:10]   (Read 70020 bytes from disk)
[04:58:10] Connecting to http://171.67.108.21:8080/
[04:58:11] Posted data.
[04:58:11] Initial: 0000; - Uploaded at ~69 kB/s
[04:58:11] - Averaged speed for that direction ~70 kB/s
[04:58:11] + Results successfully sent
[04:58:11] Thank you for your contribution to Folding@Home.
[04:58:11] + Number of Units Completed: 142

[04:58:15] Trying to send all finished work units
[04:58:15] + No unsent completed units remaining.
[04:58:15] - Preparing to get new work unit...
[04:58:15] + Attempting to get work packet
[04:58:15] - Will indicate memory of 1873 MB
[04:58:15] - Connecting to assignment server
[04:58:15] Connecting to http://assign-GPU.stanford.edu:8080/
[04:58:16] Posted data.
[04:58:16] Initial: 43AB; - Successful: assigned to (171.67.108.21).
[04:58:16] + News From Folding@Home: Welcome to Folding@Home
[04:58:16] Loaded queue successfully.
[04:58:16] Connecting to http://171.67.108.21:8080/
[04:58:17] Posted data.
[04:58:17] Initial: 0000; - Receiving payload (expected size: 65451)
[04:58:17] Conversation time very short, giving reduced weight in bandwidth avg
[04:58:17] - Downloaded at ~127 kB/s
[04:58:17] - Averaged speed for that direction ~61 kB/s
[04:58:17] + Received work.
[04:58:17] Trying to send all finished work units
[04:58:17] + No unsent completed units remaining.
[04:58:17] + Closed connections
[04:58:17] 
[04:58:17] + Processing work unit
[04:58:17] Core required: FahCore_11.exe
[04:58:17] Core found.
[04:58:17] Working on queue slot 09 [February 14 04:58:17 UTC]
[04:58:17] + Working ...
[04:58:17] - Calling '.\FahCore_11.exe -dir work/ -suffix 09 -checkpoint 15 -verbose -lifeline 741 -version 623'

[04:58:17] 
[04:58:17] *------------------------------*
[04:58:17] Folding@Home GPU Core
[04:58:17] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[04:58:17] 
[04:58:17] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[04:58:17] Build host: amoeba
[04:58:17] Board Type: Nvidia
[04:58:17] Core      : 
[04:58:17] Preparing to commence simulation
[04:58:17] - Looking at optimizations...
[04:58:17] DeleteFrameFiles: successfully deleted file=work/wudata_09.ckp
[04:58:17] - Created dyn
[04:58:17] - Files status OK
[04:58:17] - Expanded 64939 -> 344387 (decompressed 530.3 percent)
[04:58:17] Called DecompressByteArray: compressed_data_size=64939 data_size=344387, decompressed_data_size=344387 diff=0
[04:58:17] - Digital signature verified
[04:58:17] 
[04:58:17] Project: 5781 (Run 15, Clone 291, Gen 3)
[04:58:17] 
[04:58:17] Assembly optimizations on if available.
[04:58:17] Entering M.D.
[04:58:23] Tpr hash work/wudata_09.tpr:  1197948273 1061539618 412233024 2699779062 2693220776
[04:58:23] 
[04:58:23] Calling fah_main args: 14 usage=100
[04:58:23] 
[04:58:24] Working on Great Red Owns Many ACres of Sand
[04:58:25] Client config found, loading data.
[04:58:25] Starting GUI Server
[05:00:42] Completed 1%

// intentionally removed //

[08:46:07] Completed 100%
[08:46:07] Successful run
[08:46:07] DynamicWrapper: Finished Work Unit: sleep=10000
[08:46:17] Reserved 146756 bytes for xtc file; Cosm status=0
[08:46:17] Allocated 146756 bytes for xtc file
[08:46:17] - Reading up to 146756 from "work/wudata_09.xtc": Read 146756
[08:46:17] Read 146756 bytes from xtc file; available packet space=786283708
[08:46:17] xtc file hash check passed.
[08:46:17] Reserved 22248 22248 786283708 bytes for arc file=<work/wudata_09.trr> Cosm status=0
[08:46:17] Allocated 22248 bytes for arc file
[08:46:17] - Reading up to 22248 from "work/wudata_09.trr": Read 22248
[08:46:17] Read 22248 bytes from arc file; available packet space=786261460
[08:46:17] trr file hash check passed.
[08:46:17] Allocated 560 bytes for edr file
[08:46:17] Read bedfile
[08:46:17] edr file hash check passed.
[08:46:17] Logfile not read.
[08:46:17] GuardedRun: success in DynamicWrapper
[08:46:17] GuardedRun: done
[08:46:17] Run: GuardedRun completed.
[08:46:21] + Opened results file
[08:46:21] - Writing 170076 bytes of core data to disk...
[08:46:21] Done: 169564 -> 168117 (compressed to 99.1 percent)
[08:46:21]   ... Done.
[08:46:21] DeleteFrameFiles: successfully deleted file=work/wudata_09.ckp
[08:46:21] Shutting down core 
[08:46:21] 
[08:46:21] Folding@home Core Shutdown: FINISHED_UNIT
[08:46:23] CoreStatus = 64 (100)
[08:46:23] Unit 9 finished with 99 percent of time to deadline remaining.
[08:46:23] Updated performance fraction: 0.990288
[08:46:23] Sending work to server
[08:46:23] Project: 5781 (Run 15, Clone 291, Gen 3)
[08:46:23] - Read packet limit of 540015616... Set to 524286976.


[08:46:23] + Attempting to send results [February 14 08:46:23 UTC]
[08:46:23] - Reading file work/wuresults_09.dat from core
[08:46:23]   (Read 168629 bytes from disk)
[08:46:23] Connecting to http://171.67.108.21:8080/
[08:46:25] - Couldn't send HTTP request to server
[08:46:25] + Could not connect to Work Server (results)
[08:46:25]     (171.67.108.21:8080)
[08:46:25] + Retrying using alternative port
[08:46:25] Connecting to http://171.67.108.21:80/
[08:46:46] - Couldn't send HTTP request to server
[08:46:46] + Could not connect to Work Server (results)
[08:46:46]     (171.67.108.21:80)
[08:46:46] - Error: Could not transmit unit 09 (completed February 14) to work server.
[08:46:46] - 1 failed uploads of this unit.
[08:46:46]   Keeping unit 09 in queue.
[08:46:46] Trying to send all finished work units
[08:46:46] Project: 5781 (Run 15, Clone 291, Gen 3)
[08:46:46] - Read packet limit of 540015616... Set to 524286976.


[08:46:46] + Attempting to send results [February 14 08:46:46 UTC]
[08:46:46] - Reading file work/wuresults_09.dat from core
[08:46:46]   (Read 168629 bytes from disk)
[08:46:46] Connecting to http://171.67.108.21:8080/
[08:46:47] Posted data.
[08:46:47] Initial: 0000; - Uploaded at ~165 kB/s
[08:46:47] - Averaged speed for that direction ~89 kB/s
[08:46:47] - Server has already received unit.
[08:46:47] + Sent 0 of 1 completed units to the server
[08:46:47] - Preparing to get new work unit...
[08:46:47] + Attempting to get work packet
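Reading that log closely, the first POST on port 8080 apparently does reach the server (which records the result) even though the client sees a failed request; the immediate retry then gets "Server has already received unit." and the client deletes the WU without ever receiving credit. A toy simulation of that sequence is below; it is purely illustrative (the class and replies are invented, not the real work-server protocol):

Code: Select all

import random

class ToyWorkServer:
    """Invented stand-in for a work server: records every result it sees."""
    def __init__(self):
        self.received = set()

    def upload(self, wu_id):
        duplicate = wu_id in self.received
        self.received.add(wu_id)          # recorded even if the reply is lost
        reply = "ALREADY_RECEIVED" if duplicate else "OK"
        return reply if random.random() > 0.5 else None  # None = reply lost in transit

def client_upload(server, wu_id, max_tries=2):
    # Mirrors the log: send fails -> keep in queue -> retry -> duplicate -> delete
    for _ in range(max_tries):
        reply = server.upload(wu_id)
        if reply == "OK":
            return "credited"
        if reply == "ALREADY_RECEIVED":
            return "deleted without credit"  # the failure mode in this thread
    return "kept in queue"

print(client_upload(ToyWorkServer(), "wuresults_09"))
# Whenever the first reply is lost, this prints "deleted without credit".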
Nathan_P
Posts: 1164
Joined: Wed Apr 01, 2009 9:22 pm
Hardware configuration: Asus Z8NA D6C, 2 [email protected] Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)

Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS

Not currently folding
Asus Z9PE- D8 WS, 2 [email protected] Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
Location: Jersey, Channel islands

Re: GPU server status 171.67.108.21, 171.64.65.71, 171.67.108.26

Post by Nathan_P »

I've got the same problem on my 4 GPU clients; until it's fixed I've pulled the plug and am just SMP folding for now. A shame really, as I've just spent £250 on another graphics card to help out more :cry:
glussier
Posts: 9
Joined: Wed Nov 18, 2009 3:57 am

Re: GPU server status 171.67.108.21, 171.64.65.71, 171.67.108.26

Post by glussier »

I also have the same "Server has already received unit." problem on my 4 GPUs. I don't care much about the points, but I do care a lot about throwing money down the drain.
Nathan_P
Posts: 1164
Joined: Wed Apr 01, 2009 9:22 pm
Hardware configuration: Asus Z8NA D6C, 2 [email protected] Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)

Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS

Not currently folding
Asus Z9PE- D8 WS, 2 [email protected] Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
Location: Jersey, Channel islands

Re: GPU server status 171.67.108.21, 171.64.65.71, 171.67.108.26

Post by Nathan_P »

Pette Broad wrote:This needs to be sorted very quickly. I'm completing units and getting no credit because the server has already received the unit. I've seen this project go through some difficult times, but for me at least this is a disaster. With the umpteen CPU units waiting to be returned, some for over 2 weeks, and the complete lack of information from the Pande Group, I'm probably going to switch projects soon.

Pete
I'm not going to switch, but I am shutting down for now. Not getting points is one thing - I can live with that, as the science is still getting done - but to have the server reject the WU as already done is something else.
bingo-dog
Posts: 8
Joined: Tue Dec 29, 2009 3:41 pm

Re: GPU server status 171.67.108.21, 171.64.65.71, 171.67.108.26

Post by bingo-dog »

Same problem here - last 3 completed WUs were discarded as "already completed".
Sahkuhnder
Posts: 43
Joined: Sun Dec 02, 2007 5:28 am
Location: Vegas Baby! Yeah!

Re: GPU server status 171.67.108.21, 171.64.65.71, 171.67.108.26

Post by Sahkuhnder »

Tobit wrote:...a WU finishes, attempts to upload, reports that it can't connect, keeps the unit in the queue, prepares to start a new WU, tries to send all unsent work, uploads fine to 171.67.108.21, reports that the server already has the WU, deletes the WU, downloads new work, and the cycle starts again. All GPUs here keep getting assigned to 171.67.108.21 and all are affected. These are dedicated machines, and no configuration has changed in weeks.
The exact same thing is happening to me too.

Code: Select all

[06:00:50] Folding@home Core Shutdown: FINISHED_UNIT
[06:00:53] CoreStatus = 64 (100)
[06:00:53] Sending work to server
[06:00:53] Project: 5781 (Run 14, Clone 296, Gen 3)
[06:00:53] - Read packet limit of 540015616... Set to 524286976.


[06:00:53] + Attempting to send results [February 14 06:00:53 UTC]
[06:00:58] + Results successfully sent
[06:00:58] Thank you for your contribution to Folding@Home.
[06:00:58] + Number of Units Completed: 547

[06:01:02] - Preparing to get new work unit...
[06:01:02] + Attempting to get work packet
[06:01:03] - Connecting to assignment server
[06:01:03] - Successful: assigned to (171.67.108.21).
[06:01:03] + News From Folding@Home: Welcome to Folding@Home
[06:01:03] Loaded queue successfully.
[06:01:04] + Closed connections
[06:01:04] 
[06:01:04] + Processing work unit
[06:01:04] Core required: FahCore_11.exe
[06:01:04] Core found.
[06:01:04] Working on queue slot 01 [February 14 06:01:04 UTC]
[06:01:04] + Working ...
[06:01:04] 
[06:01:04] *------------------------------*
[06:01:04] Folding@Home GPU Core
[06:01:04] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[06:01:04] 
[06:01:04] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[06:01:04] Build host: amoeba
[06:01:04] Board Type: Nvidia
[06:01:04] Core      : 
[06:01:04] Preparing to commence simulation
[06:01:04] - Looking at optimizations...
[06:01:04] DeleteFrameFiles: successfully deleted file=work/wudata_01.ckp
[06:01:04] - Created dyn
[06:01:04] - Files status OK
[06:01:04] - Expanded 22262 -> 146631 (decompressed 658.6 percent)
[06:01:04] Called DecompressByteArray: compressed_data_size=22262 data_size=146631, decompressed_data_size=146631 diff=0
[06:01:04] - Digital signature verified
[06:01:04] 
[06:01:04] Project: 5799 (Run 2, Clone 80, Gen 4)
[06:01:04] 
[06:01:04] Assembly optimizations on if available.
[06:01:04] Entering M.D.
[06:01:10] Tpr hash work/wudata_01.tpr:  2196091279 2246815699 3189961818 3960565494 2598115266
[06:01:10] 
[06:01:10] Calling fah_main args: 14 usage=100
[06:01:10] 
[06:01:11] Working on Protein
[06:01:11] Client config found, loading data.
[06:01:11] Starting GUI Server
[06:03:00] Completed 1%
[06:04:48] Completed 2%
[06:06:37] Completed 3%
[06:08:26] Completed 4%

<snip>

[08:55:15] Completed 96%
[08:57:04] Completed 97%
[08:58:53] Completed 98%
[09:00:41] Completed 99%
[09:02:30] Completed 100%
[09:02:30] Successful run
[09:02:30] DynamicWrapper: Finished Work Unit: sleep=10000
[09:02:40] Reserved 58912 bytes for xtc file; Cosm status=0
[09:02:40] Allocated 58912 bytes for xtc file
[09:02:40] - Reading up to 58912 from "work/wudata_01.xtc": Read 58912
[09:02:40] Read 58912 bytes from xtc file; available packet space=786371552
[09:02:40] xtc file hash check passed.
[09:02:40] Reserved 6936 6936 786371552 bytes for arc file=<work/wudata_01.trr> Cosm status=0
[09:02:40] Allocated 6936 bytes for arc file
[09:02:40] - Reading up to 6936 from "work/wudata_01.trr": Read 6936
[09:02:40] Read 6936 bytes from arc file; available packet space=786364616
[09:02:40] trr file hash check passed.
[09:02:40] Allocated 560 bytes for edr file
[09:02:40] Read bedfile
[09:02:40] edr file hash check passed.
[09:02:40] Logfile not read.
[09:02:40] GuardedRun: success in DynamicWrapper
[09:02:40] GuardedRun: done
[09:02:40] Run: GuardedRun completed.
[09:02:44] + Opened results file
[09:02:44] - Writing 66920 bytes of core data to disk...
[09:02:44] Done: 66408 -> 63709 (compressed to 95.9 percent)
[09:02:44]   ... Done.
[09:02:44] DeleteFrameFiles: successfully deleted file=work/wudata_01.ckp
[09:02:44] Shutting down core 
[09:02:44] 
[09:02:44] Folding@home Core Shutdown: FINISHED_UNIT
[09:02:48] CoreStatus = 64 (100)
[09:02:48] Sending work to server
[09:02:48] Project: 5799 (Run 2, Clone 80, Gen 4)
[09:02:48] - Read packet limit of 540015616... Set to 524286976.


[09:02:48] + Attempting to send results [February 14 09:02:48 UTC]
[09:02:50] - Couldn't send HTTP request to server
[09:02:50] + Could not connect to Work Server (results)
[09:02:50]     (171.67.108.21:8080)
[09:02:50] + Retrying using alternative port
[09:03:11] - Couldn't send HTTP request to server
[09:03:11] + Could not connect to Work Server (results)
[09:03:11]     (171.67.108.21:80)
[09:03:11] - Error: Could not transmit unit 01 (completed February 14) to work server.
[09:03:11]   Keeping unit 01 in queue.
[09:03:11] Project: 5799 (Run 2, Clone 80, Gen 4)
[09:03:11] - Read packet limit of 540015616... Set to 524286976.


[09:03:11] + Attempting to send results [February 14 09:03:11 UTC]
[09:03:13] - Server has already received unit.
[09:03:13] - Preparing to get new work unit...
[09:03:13] + Attempting to get work packet
[09:03:13] - Connecting to assignment server
[09:03:14] - Successful: assigned to (171.67.108.21).
[09:03:14] + News From Folding@Home: Welcome to Folding@Home
[09:03:14] Loaded queue successfully.
[09:03:16] + Closed connections
[09:03:16] 
[09:03:16] + Processing work unit
[09:03:16] Core required: FahCore_11.exe
[09:03:16] Core found.
[09:03:16] Working on queue slot 02 [February 14 09:03:16 UTC]
[09:03:16] + Working ...
[09:03:16] 
[09:03:16] *------------------------------*
[09:03:16] Folding@Home GPU Core
[09:03:16] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[09:03:16] 
[09:03:16] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[09:03:16] Build host: amoeba
[09:03:16] Board Type: Nvidia
[09:03:16] Core      : 
[09:03:16] Preparing to commence simulation
[09:03:16] - Looking at optimizations...
[09:03:16] DeleteFrameFiles: successfully deleted file=work/wudata_02.ckp
[09:03:16] - Created dyn
[09:03:16] - Files status OK
[09:03:16] - Expanded 64949 -> 344387 (decompressed 530.2 percent)
[09:03:16] Called DecompressByteArray: compressed_data_size=64949 data_size=344387, decompressed_data_size=344387 diff=0
[09:03:16] - Digital signature verified
[09:03:16] 
[09:03:16] Project: 5781 (Run 23, Clone 474, Gen 3)
[09:03:16] 
[09:03:16] Assembly optimizations on if available.
[09:03:16] Entering M.D.
[09:03:22] Tpr hash work/wudata_02.tpr:  285685404 1778957796 1384613819 819990744 2700173846
[09:03:22] 
[09:03:22] Calling fah_main args: 14 usage=100
[09:03:22] 
[09:03:23] Working on Great Red Owns Many ACres of Sand
[09:03:24] Client config found, loading data.
[09:03:25] Starting GUI Server
[09:11:26] Completed 1%
[09:19:27] Completed 2%
[09:27:28] Completed 3%
[09:35:29] Completed 4%