New Assignment Server feedback/problem

Moderators: Site Moderators, FAHC Science Team

Breach
Posts: 204
Joined: Sat Mar 09, 2013 8:07 pm
Location: Brussels, Belgium

Re: New Assignment Server feedback/problem

Post by Breach »

I have just received a core 17:

16:05:49:WU01:FS01:Connecting to 171.67.108.200:80
16:05:50:WU01:FS01:Assigned to work server 171.67.108.52
16:05:50:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GM204 [GeForce GTX 970] from 171.67.108.52
16:05:50:WU01:FS01:Connecting to 171.67.108.52:8080
16:05:52:WU01:FS01:Downloading 1.53MiB
16:05:53:WU01:FS01:Download complete
16:05:53:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9201 run:562 clone:3 gen:2 core:0x17 unit:0x0000000b6652edc45399ec2237cfa30d
As the log shows, it came from a different WS: 171.67.108.52. So I'm not sure whether this is a problem with the AS, or whether 171.64.65.105 is simply out of Core 17 WUs.
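For anyone who wants to check the same thing in their own log, here is a minimal sketch that pulls the assigned work server and the project/run/clone/gen/core fields out of lines like the ones above. The function name `summarize` and the regular expressions are illustrative assumptions based only on the log format shown in this post, not part of the FAH client.

```python
import re

# Patterns written against the log lines quoted above (assumed format).
ASSIGNED_WS = re.compile(r"Assigned to work server (?P<ws>\d+\.\d+\.\d+\.\d+)")
RECEIVED_UNIT = re.compile(
    r"project:(?P<project>\d+)\s+run:(?P<run>\d+)\s+clone:(?P<clone>\d+)\s+"
    r"gen:(?P<gen>\d+)\s+core:(?P<core>0x[0-9a-fA-F]+)"
)

def summarize(log_lines):
    """Collect the assigned work server and the received unit's fields."""
    summary = {}
    for line in log_lines:
        match = ASSIGNED_WS.search(line)
        if match:
            summary["work_server"] = match.group("ws")
        match = RECEIVED_UNIT.search(line)
        if match:
            summary.update(match.groupdict())
    return summary

log = [
    "16:05:50:WU01:FS01:Assigned to work server 171.67.108.52",
    "16:05:53:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR "
    "project:9201 run:562 clone:3 gen:2 core:0x17 "
    "unit:0x0000000b6652edc45399ec2237cfa30d",
]
print(summarize(log))
```

All values come back as strings, so the core can be compared directly against "0x17" or "0x15".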
Windows 11 x64 / 5800X@5Ghz / 32GB DDR4 3800 CL14 / 4090 FE / Creative Titanium HD / Sennheiser 650 / PSU Corsair AX1200i
Joe_H
Site Admin
Posts: 7929
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: New Assignment Server feedback/problem

Post by Joe_H »

According to the Project Summaries, you should not be expecting to get Core_17 WU's from 171.64.65.105. Just Core_15 projects show up as being from that WS.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Breach
Posts: 204
Joined: Sat Mar 09, 2013 8:07 pm
Location: Brussels, Belgium

Re: New Assignment Server feedback/problem

Post by Breach »

Thanks. So the next logical question is whether the AS assigning us to this WS is considered expected behaviour.
Windows 11 x64 / 5800X@5Ghz / 32GB DDR4 3800 CL14 / 4090 FE / Creative Titanium HD / Sennheiser 650 / PSU Corsair AX1200i
Joe_H
Site Admin
Posts: 7929
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: New Assignment Server feedback/problem

Post by Joe_H »

What is expected is that if the AS can not connect a Windows computer with a GPU requesting a new WU to a WS with Core_17 work, then it will get assigned to 171.64.65.105 and be assigned a Core_15 WU. Linux GPU WU requests should get an "Empty work server" message if the AS can not connect to a WS with available Core_17 WU's.
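The rule described above can be written out as a small sketch. This is only an illustration of the behaviour as stated in this post, not the actual Assignment Server code: the function `assign`, the `core17_servers` mapping, and the idea of modelling "can connect to a WS with Core_17 work" as an available-WU count above zero are all assumptions.

```python
CORE15_FALLBACK_WS = "171.64.65.105"  # Core_15 work server named in this thread

def assign(os_name, core17_servers):
    """Pick a work server for a GPU slot.

    os_name: "windows" or "linux".
    core17_servers: hypothetical mapping of WS address -> available Core_17 WUs.
    Returns (work_server, result), where work_server is None if nothing is assigned.
    """
    # Prefer any Core_17 work server that still has work to hand out.
    for ws, available in core17_servers.items():
        if available > 0:
            return (ws, "Core_17")
    # No Core_17 work: Windows GPUs fall back to Core_15.
    if os_name == "windows":
        return (CORE15_FALLBACK_WS, "Core_15")
    # Linux GPUs have no fallback in this description.
    return (None, "Empty work server")

print(assign("windows", {"171.67.108.52": 0}))
print(assign("linux", {"171.67.108.52": 0}))
print(assign("windows", {"171.67.108.52": 5}))
```

The three calls cover the cases discussed here: a Windows GPU falling back to Core_15, a Linux GPU getting the empty-server message, and a normal Core_17 assignment.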

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
JimF
Posts: 651
Joined: Thu Jan 21, 2010 2:03 pm

Re: New Assignment Server feedback/problem

Post by JimF »

Joe_H wrote:What is expected is that if the AS can not connect a Windows computer with a GPU requesting a new WU to a WS with Core_17 work, then it will get assigned to 171.64.65.105 and be assigned a Core_15 WU.
I just finished my sole Core_17, and was assigned another Core_15, so I guess it can not connect to a WS with Core_17 work.
Breach
Posts: 204
Joined: Sat Mar 09, 2013 8:07 pm
Location: Brussels, Belgium

Re: New Assignment Server feedback/problem

Post by Breach »

JimF wrote:
Joe_H wrote:What is expected is that if the AS can not connect a Windows computer with a GPU requesting a new WU to a WS with Core_17 work, then it will get assigned to 171.64.65.105 and be assigned a Core_15 WU.
I just finished my sole Core_17, and was assigned another Core_15, so I guess it can not connect to a WS with Core_17 work.
After so much time on FAH I have just discovered this page ;-) :

http://fah-web.stanford.edu/pybeta/serverstat.html

According to that, 171.67.108.52 is 'full' (in full operation, should be giving out WUs), but is then marked as 'Blue' ("Blue - if the AS has decided not to assign to that machine, eg. the AS thinks it is down or out of jobs (blue means iced)"). The WU stats for this WS are null, so I guess it's either out of work or there's another reason the AS considers it unavailable...
Windows 11 x64 / 5800X@5Ghz / 32GB DDR4 3800 CL14 / 4090 FE / Creative Titanium HD / Sennheiser 650 / PSU Corsair AX1200i
JimF
Posts: 651
Joined: Thu Jan 21, 2010 2:03 pm

Re: New Assignment Server feedback/problem

Post by JimF »

Breach wrote:According to that, 171.67.108.52 is 'full' (in full operation, should be giving out WUs), but is then marked as 'Blue' ("Blue - if the AS has decided not to assign to that machine, eg. the AS thinks it is down or out of jobs (blue means iced)"). The WU stats for this WS are null, so I guess it's either out of work or there's another reason the AS considers it unavailable...
Good find. But I now have two core_17s from that work server, even though it still shows as "blue". I will let PG figure it all out. It seems to me possible though that as they transition to core_18, they may have shortages of 17s and have to fill in with the 15s.
heikosch
Posts: 110
Joined: Thu Apr 30, 2009 7:31 pm
Hardware configuration: [email protected]
[email protected]

[email protected]
GTX460@800MHz
Location: Essen, Germany

Re: New Assignment Server feedback/problem

Post by heikosch »

To my mind the problem is that the available WU count is 0 for 171.67.108.52. According to the documentation, the color changes to blue when a server runs low on available WUs. So there's always a chance to get a WU.

Heiko
Gary480six
Posts: 93
Joined: Mon Jan 21, 2008 6:42 pm

Re: New Assignment Server feedback/problem

Post by Gary480six »

To my mind, what has never been explained, is why a month ago the Maxwell cards were being assigned the P13000 and P13001 work units - and completing them just fine.

Then, changes were made to the Assignment Server.... and suddenly, the Maxwell cards could not complete the P13000 work.

Is somebody going to address That issue?
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: New Assignment Server feedback/problem

Post by 7im »

Gary480six wrote:snip

Is somebody going to address That issue?
Two issues actually. AS updates and Maxwell support. On the AS updates, no. No one is going to explain it in any more detail than already given. On Maxwell support, new GPU devices and new chip architectures are highly dependent on functional (for computing, not gaming) drivers from the manufacturers, as stated in the install guides. It takes time for both the OEMs and for fah to work out the kinks on new GPUs, especially when that may not be their focus right now.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
kimben777
Posts: 23
Joined: Mon Apr 07, 2014 1:21 pm
Hardware configuration: 1) 1090t @3.6ghz, corsair 850w psu, 16gb gskill, asus m4a89td pro mb, asus 780 ti
2) fx-8350 @4ghz, cooler master 1000w psu, 8gb gskill, asus m5a99fx pro mb, two asus 780 ti's
3) am3 x4 @3.4ghz, rosewill 650w psu, asus M5A78L-M/USB3 mb, 8gb gskill, asus 780 ti

Re: New Assignment Server feedback/problem

Post by kimben777 »

What is the deal with the 171.67.108.52 server showing no or very low wu's since Thursday morning? How long does it take to fill it back up with wu's?
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: New Assignment Server feedback/problem

Post by bruce »

kimben777 wrote:What is the deal with the 171.67.108.52 server showing no or very low wu's since Thursday morning? How long does it take to fill it back up with wu's?
Every time somebody successfully completes a WU, a new WU is generated, so anybody who returns a WU for Core_17 and is assigned a WU for Core_15 is helping to fill the server until the point that it turns non-blue and starts assigning again. Thus a server can alternate between having enough WUs to assign them and not having enough. That process is automatic (i.e., it works unattended).
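A toy model may make the alternation easier to picture. This is purely a sketch under assumptions: the `threshold` below which the server counts as "blue", the one-assignment-per-step rule, and the function name `simulate` are all made up for illustration and say nothing about how the real servers are configured.

```python
def simulate(available, returns_per_step, threshold):
    """Toy model of a work server filling up and handing out WUs.

    available: WUs ready to assign at the start.
    returns_per_step: number of completed WUs returned in each step.
    threshold: hypothetical minimum of available WUs before the AS assigns.
    Returns a list of (available_after_step, is_blue) tuples.
    """
    history = []
    for returned in returns_per_step:
        available += returned            # each returned WU generates a new one
        is_blue = available < threshold  # too few WUs: the AS skips this server
        if not is_blue:
            available -= 1               # one WU is assigned this step
        history.append((available, is_blue))
    return history

for step, (count, blue) in enumerate(simulate(0, [1, 1, 1, 0, 0], threshold=2), 1):
    print(step, count, "blue" if blue else "assigning")
```

With returns arriving in the first three steps and then stopping, the server starts blue, assigns for two steps once enough WUs have accumulated, and drops back to blue when the returns dry up.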

A separate issue is whether science NEEDS more WUs. FAH does not assign "busy work" but insists that assignments are actually needed by science, so no answer can be given that doesn't consider the science.

At some point every project reaches the stage where it has "enough" completed WUs to draw the necessary scientific conclusions. The project is then ended, and from that point no new WUs will be added. [You and I have no way of knowing when that's about to happen.]

On the other hand, a project may need a lot more WUs to be completed and they can be added by the PI -- after digesting the completed WUs (and perhaps moving data off-line) to make room for newly generated WUs. [In that case, your question is a good one!] I don't have a good answer, but I do know it does take a fair amount of processing time and a certain amount of manual work.
Gary480six
Posts: 93
Joined: Mon Jan 21, 2008 6:42 pm

Re: New Assignment Server feedback/problem

Post by Gary480six »

7im wrote: snip

On Maxwell support, new GPU devices and new chip architectures are highly dependent on functional (for computing, not gaming) drivers from the manufacturers, as stated in the install guides. It takes time for both the OEMs and for fah to work out the kinks on new GPUs, especially when that may not be their focus right now.
7im - I would understand this issue better if the new Maxwell cards Never worked for Folding on Windows systems. But they were stable and producing work on the P13000 and P13001 work units for several Weeks before everything blew up.
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: New Assignment Server feedback/problem

Post by 7im »

Do you know about some Kepler GPUs needing to use the older 327.xx driver to fold at full speed? The newer drivers fold just fine also, just slower.

The ability to fold or not fold a single project is not a good indicator. Neither is a newer driver an indicator of a better driver. Fah is very dependent on third party hardware and software that is out of their control.
I don't know if the AS changes were related or not, but as I said, that won't be explained either. But an AS only routes a connection to a WS, and has no effect on the fahcore or work unit data. So unlikely to be the cause.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
heikosch
Posts: 110
Joined: Thu Apr 30, 2009 7:31 pm
Hardware configuration: [email protected]
[email protected]

[email protected]
GTX460@800MHz
Location: Essen, Germany

Re: New Assignment Server feedback/problem

Post by heikosch »

7im wrote:Do you know about some Kepler GPUs needing to use the older 327.xx driver to fold at full speed? The newer drivers fold just fine also, just slower.

The ability to fold or not fold a single project is not a good indicator. Neither is a newer driver an indicator of a better driver. Fah is very dependent on third party hardware and software that is out of their control.
I don't know if the AS changes were related or not, but as I said, that won't be explained either. But an AS only routes a connection to a WS, and has no effect on the fahcore or work unit data. So unlikely to be the cause.
When the new AS was activated (literally overnight for me!) my GTX 750Ti began to throw errors with P1300x. The Core 0x17 version didn't change, and I neither changed the nVidia driver nor installed other software or updates.
Maybe they changed not only the AS but also, independently, the content of the P1300x WUs. Shortly after that they stopped assigning P1300x to Maxwell GPUs.

Heiko

PS: You need nVidia 344.11 for GTX970/980 (better 344.16 or 344.48), but 337.88 is OK for the GTX 750Ti. I don't remember whether 327.23 works with the GTX 750Ti.