Azure servers down 40.121.152.108 / 52.224.109.74
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 1996
- Joined: Sun Mar 22, 2020 5:52 pm
- Hardware configuration: 1: 2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
- Location: UK
Re: Can not upload to 40.121.152.108 nor 52.224.109.74
No ... The server is down. You simply have to be patient and let the client do what it is designed to do: keep retrying the upload at intervals until either the server is back up and can receive it, or the WU passes its expiration deadline and gets dumped by the client. In the meantime the client should continue to get WUs from other servers and process them without interruption.
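The retry behaviour described above can be sketched roughly as follows. This is only an illustration, not the FAH client's actual code: the function names, the doubling delay, and the 60-second and 6-hour values are all assumptions made for the example. The clock and sleep functions are passed in so the loop can be simulated without waiting.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterator


def retry_delays(base: float = 60.0, cap: float = 6 * 3600.0) -> Iterator[float]:
    """Yield retry delays in seconds, doubling each time up to a cap (assumed values)."""
    delay = base
    while True:
        yield delay
        delay = min(delay * 2, cap)


def upload_until_deadline(
    try_upload: Callable[[], bool],
    deadline: datetime,
    now: Callable[[], datetime],
    sleep: Callable[[float], None],
) -> str:
    """Retry try_upload() until it succeeds or the deadline passes.

    Returns "sent" on success, or "dumped" once the deadline is reached.
    """
    delays = retry_delays()
    while now() < deadline:
        if try_upload():
            return "sent"
        sleep(next(delays))
    return "dumped"


# Simulated run: the server stays down and the WU expires in one hour.
clock = [datetime(2020, 7, 23, 19, 0, tzinfo=timezone.utc)]


def fake_now() -> datetime:
    return clock[0]


def fake_sleep(seconds: float) -> None:
    # Advance the simulated clock instead of really sleeping.
    clock[0] += timedelta(seconds=seconds)


result = upload_until_deadline(
    lambda: False, clock[0] + timedelta(hours=1), fake_now, fake_sleep
)
print(result)  # dumped
```

If the server comes back before the deadline, the same loop returns "sent" on the first attempt that succeeds, which is what happened for the WUs reported later in this thread.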
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070
(Green/Bold = Active)
Re: Can not upload to 40.121.152.108 nor 52.224.109.74
Discussions are ongoing between the FAH server admins and the Azure server admins. Apparently a critical boot device is offline.
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 1
- Joined: Fri Apr 17, 2020 7:15 am
Re: Can not upload to 40.121.152.108 nor 52.224.109.74
The two work units I have awaiting upload both expire tomorrow; hopefully the servers will be restored by then.
-
- Posts: 94
- Joined: Wed Dec 05, 2007 10:23 pm
- Hardware configuration: Apple Mac Pro 1,1 2x2.66 GHz Dual-Core Xeon w/10 GB RAM | EVGA GTX 960, Zotac GTX 750 Ti | Ubuntu 14.04 LTS
Dell Precision T7400 2x3.0 GHz Quad-Core Xeon w/16 GB RAM | Zotac GTX 970 | Ubuntu 14.04 LTS
Apple iMac Retina 5K 4.00 GHz Core i7 w/8 GB RAM | OS X 10.11.3 (El Capitan)
- Location: Michiana, USA
Re: Azure servers down 40.121.152.108 / 52.224.109.74
I heard reports of a CloudFlare outage July 17th. https://blog.cloudflare.com/cloudflare- ... y-17-2020/
Outage was 20:25 UTC through 22:10. The last contact with four of the Azure cloud servers was about two hours before this. Depending on how often the sanity check is performed, is this a possible cause of the Azure cloud outage?
-
- Posts: 14
- Joined: Wed Apr 15, 2020 3:15 pm
Re: Connection to Client broken after Win 10 2004 update
Hi, not sure where to post this, but I have a completed work unit that was never uploaded.
I had to switch my computer off for a few weeks.
See log pasted below:
Mod Edit: Added Code Tags - PantherX
Code:
04:48:07:WU01:FS00:Upload 34.80%
04:48:13:WU01:FS00:Upload 46.70%
04:48:19:WU01:FS00:Upload 56.77%
04:48:25:WU01:FS00:Upload 65.93%
04:48:31:WU01:FS00:Upload 76.01%
04:48:37:WU01:FS00:Upload 86.08%
04:48:43:WU01:FS00:Upload 95.24%
04:48:48:ERROR:WU01:FS00:Exception: 10001: Server responded: HTTP_BAD_GATEWAY
04:54:57:WU02:FS00:0xa7:Completed 92500 out of 125000 steps (74%)
05:03:11:WU02:FS00:0xa7:Completed 93750 out of 125000 steps (75%)
05:11:21:WU02:FS00:0xa7:Completed 95000 out of 125000 steps (76%)
05:19:31:WU02:FS00:0xa7:Completed 96250 out of 125000 steps (77%)
05:27:43:WU02:FS00:0xa7:Completed 97500 out of 125000 steps (78%)
05:35:55:WU02:FS00:0xa7:Completed 98750 out of 125000 steps (79%)
05:44:07:WU02:FS00:0xa7:Completed 100000 out of 125000 steps (80%)
05:46:37:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:14818 run:641 clone:0 gen:241 core:0xa7 unit:0x000001192879986c5ea8de43810a1dc4
05:46:37:WU01:FS00:Uploading 6.83MiB to 40.121.152.108
05:46:37:WU01:FS00:Connecting to 40.121.152.108:8080
05:46:41:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
05:46:41:WU01:FS00:Connecting to 40.121.152.108:80
05:46:43:WU01:FS00:Upload 4.58%
05:46:49:WU01:FS00:Upload 15.57%
05:46:55:WU01:FS00:Upload 24.72%
05:47:01:WU01:FS00:Upload 33.88%
05:47:07:WU01:FS00:Upload 45.79%
05:47:13:WU01:FS00:Upload 55.86%
05:47:19:WU01:FS00:Upload 65.02%
05:47:25:WU01:FS00:Upload 75.09%
05:47:31:WU01:FS00:Upload 84.25%
05:47:37:WU01:FS00:Upload 94.32%
05:47:43:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10001: Server responded: HTTP_BAD_GATEWAY
05:47:43:WU01:FS00:Trying to send results to collection server
05:47:43:WU01:FS00:Uploading 6.83MiB to 52.224.109.74
05:47:43:WU01:FS00:Connecting to 52.224.109.74:8080
05:47:45:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
05:47:45:WU01:FS00:Connecting to 52.224.109.74:80
05:47:49:WU01:FS00:Upload 6.41%
05:47:55:WU01:FS00:Upload 16.48%
05:48:01:WU01:FS00:Upload 26.56%
05:48:07:WU01:FS00:Upload 37.54%
05:48:13:WU01:FS00:Upload 47.62%
05:48:19:WU01:FS00:Upload 56.77%
05:48:25:WU01:FS00:Upload 66.85%
05:48:31:WU01:FS00:Upload 76.01%
05:48:37:WU01:FS00:Upload 86.08%
05:48:43:WU01:FS00:Upload 96.15%
05:48:48:ERROR:WU01:FS00:Exception: 10001: Server responded: HTTP_BAD_GATEWAY
05:52:23:WU02:FS00:0xa7:Completed 101250 out of 125000 steps (81%)
06:00:36:WU02:FS00:0xa7:Completed 102500 out of 125000 steps (82%)
06:08:46:WU02:FS00:0xa7:Completed 103750 out of 125000 steps (83%)
Re: Connection to Client broken after Win 10 2004 update
HTTP_BAD_GATEWAY indicates that the server is overloaded or cannot accept the upload for some other reason.
In this case, both servers have been down. They may be struggling to handle the surge load.
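For anyone who wants to check their own log for the same pattern, here is a rough Python sketch that counts failed uploads per server. It is not an official FAH tool: the function name and the regular expressions are assumptions based only on the line formats visible in the log above, and other client versions may log differently.

```python
import re
from collections import Counter

# Patterns inferred from the log excerpt in this thread (assumed, not official).
UPLOADING = re.compile(r"^\d\d:\d\d:\d\d:(WU\d+):FS\d+:Uploading \S+ to (\S+)$")
FAILURE = re.compile(
    r"^\d\d:\d\d:\d\d:(?:WARNING|ERROR):(WU\d+):FS\d+:"
    r"Exception: .*Server responded: (\w+)$"
)


def failed_uploads(log_lines):
    """Count (server, status) pairs for uploads that ended in a server error.

    Failures with no preceding "Uploading ... to <server>" line for the same
    work unit are skipped, because the target server is unknown.
    """
    target = {}  # work unit id -> server currently being uploaded to
    failures = Counter()
    for raw in log_lines:
        line = raw.strip()
        match = UPLOADING.match(line)
        if match:
            target[match.group(1)] = match.group(2)
            continue
        match = FAILURE.match(line)
        if match and match.group(1) in target:
            failures[(target[match.group(1)], match.group(2))] += 1
    return failures


# Four lines copied from the log posted above.
sample = [
    "05:46:37:WU01:FS00:Uploading 6.83MiB to 40.121.152.108",
    "05:47:43:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10001: Server responded: HTTP_BAD_GATEWAY",
    "05:47:43:WU01:FS00:Uploading 6.83MiB to 52.224.109.74",
    "05:48:48:ERROR:WU01:FS00:Exception: 10001: Server responded: HTTP_BAD_GATEWAY",
]

for (server, status), count in sorted(failed_uploads(sample).items()):
    print(server, status, count)
# 40.121.152.108 HTTP_BAD_GATEWAY 1
# 52.224.109.74 HTTP_BAD_GATEWAY 1
```

In the posted log the work server fails first and the collection server fails about a minute later, so both addresses show up once each.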
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Azure servers down 40.121.152.108 / 52.224.109.74
I have the same problem. I guess the WU was folded for nothing, because it will be deleted soon.
Re: Azure servers down 40.121.152.108 / 52.224.109.74
Still (23 July 7pm) unable to upload a completed work unit from Work Queue 00 to 52.224.109.74...
-
- Posts: 1996
- Joined: Sun Mar 22, 2020 5:52 pm
- Hardware configuration: 1: 2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
- Location: UK
Re: Azure servers down 40.121.152.108 / 52.224.109.74
... and the server is still down, as shown on the server status link at the top of this page.
When the issue is resolved (if indeed it is), the server will be restarted. Until then the client will not be able to connect to it and upload, because it is down ... if the expiration deadline is reached before the server is up, then the WU will be dumped by the client.
Looks like one of the four servers that went down on the 17th was up at least for a bit yesterday (22nd) but is now down again, so my guess is they are still troubleshooting the issue(s).
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070
(Green/Bold = Active)
-
- Posts: 3
- Joined: Sat Jul 18, 2020 8:03 pm
- Location: Germany
Re: Azure servers down 40.121.152.108 / 52.224.109.74
The 52.... was started about half an hour ago and has meanwhile collected my "orphan" WU. So that problem is solved.
The 40... is still down at the moment.
-
- Posts: 85
- Joined: Wed Apr 08, 2020 9:57 pm
- Location: Pacific Northwest
Re: Azure servers down 40.121.152.108 / 52.224.109.74
My work unit finally uploaded successfully to 52.224.109.74, less than 24 hours before the expiration.
Re: Azure servers down 40.121.152.108 / 52.224.109.74
Sorry for the troubles, and thanks for your patience! We're working with the Azure folks to get these machines up to accept work units.
Re: Azure servers down 40.121.152.108 / 52.224.109.74
hutleytj wrote: Still (23 July 7pm) unable to upload a completed work unit from Work Queue 00 to 52.224.109.74...
Friday 24 July
My work unit has now uploaded to the collection server (at least, it is not in a queue to upload anymore).