out of work?????
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 130
- Joined: Wed Feb 06, 2013 4:46 pm
out of work?????
Yeah, for the past few days the GPU slot was waiting on work; now both the CPU and GPU slots are waiting for work. It also took about an hour for the work to upload. Any idea what is going on? Do we know when work will be available again?
Thanks
-
- Posts: 130
- Joined: Wed Feb 06, 2013 4:46 pm
Re: out of work?????
Sorry, I missed the forum above, which mentions server outages along with running out of work. I will wait patiently for work.
-
- Posts: 130
- Joined: Wed Feb 06, 2013 4:46 pm
Re: out of work?????
Wow, it's unbelievable that we have so many new donors that, between the servers, the workload, and my computer, work gets processed faster than we can get more of it. Just incredible!! I wish all these people folded like this full time.
-
- Posts: 1164
- Joined: Wed Apr 01, 2009 9:22 pm
- Hardware configuration: Asus Z8NA D6C, 2 [email protected] Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)
Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS
Not currently folding
Asus Z9PE- D8 WS, 2 [email protected] Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only) - Location: Jersey, Channel islands
Re: out of work?????
Soon. The servers are issuing work but, as you said, there are too many donors for the infrastructure to keep up. New servers are on the way, as well as other enhancements, so please bear with us.
Re: out of work?????
Has anyone thought of asking someone like Google, Microsoft, or similar to help by providing more servers? Surely someone would be interested.
-
- Posts: 130
- Joined: Wed Feb 06, 2013 4:46 pm
Re: out of work?????
That's a great idea, I am sure some of those companies would be willing to donate to a really good cause.
-
- Posts: 1164
- Joined: Wed Apr 01, 2009 9:22 pm
- Hardware configuration: Asus Z8NA D6C, 2 [email protected] Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)
Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS
Not currently folding
Asus Z9PE- D8 WS, 2 [email protected] Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only) - Location: Jersey, Channel islands
Re: out of work?????
Discussions are being had. The requirements are steep, though: fast I/O, fast network connections, and tons of storage, like 100TB+ per work server. Let's see what happens.
Re: out of work?????
Actually, I think they're not that steep. I have experience with AWS, and I'm sure Google / Azure could manage the same. I've posted in the forums recently about asking to help with this stuff.
I worked with AWS recently, so take what follows with a grain of salt: it is specific to one cloud provider, which I'm using here as an example.
Requirements mentioned from @Nathan_P:
Fast I/O - check - uses SSD drives (don't need 100TB here, see last item)
Fast network connection - check - depends on server size, and also they have specific instances for high network workloads such as this one. Take for example the "c5n" EC2 instance
Tons of fast, scalable and highly available storage - check - can mount an EFS drive using NFS 4.1 protocol, and have it do the encryption at rest, and TLS for in transit. Also, can apply a policy to help with storage to save on some cost based on when it was last used.
Admittedly, I don't know too much about the specifics of folding on GPUs, but what I really nerd out on is server infrastructures and high availability.
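As a rough illustration of the "policy to help with storage" point above, here is a minimal Python sketch of an EFS lifecycle policy that moves files to the cheaper Infrequent Access class after they have gone unused for a while. It assumes the `boto3` library and AWS credentials are available; the helper names and the file system ID in the usage note are hypothetical, and the list of allowed day counts is an assumption that should be checked against the current AWS documentation. Nothing here reflects how the F@h servers are actually set up.

```python
from typing import Dict, List

# Day counts assumed to be accepted by EFS for TransitionToIA
# (an assumption; verify against the current AWS documentation).
ALLOWED_IA_TRANSITIONS = {
    7: "AFTER_7_DAYS",
    14: "AFTER_14_DAYS",
    30: "AFTER_30_DAYS",
    60: "AFTER_60_DAYS",
    90: "AFTER_90_DAYS",
}


def build_lifecycle_policies(idle_days: int) -> List[Dict[str, str]]:
    """Build the LifecyclePolicies list for files unused for `idle_days` days."""
    if idle_days not in ALLOWED_IA_TRANSITIONS:
        raise ValueError(
            f"idle_days must be one of {sorted(ALLOWED_IA_TRANSITIONS)}, got {idle_days}"
        )
    return [{"TransitionToIA": ALLOWED_IA_TRANSITIONS[idle_days]}]


def apply_lifecycle_policies(file_system_id: str, policies: List[Dict[str, str]]) -> dict:
    """Send the policies to EFS. Needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without AWS tooling

    client = boto3.client("efs")
    return client.put_lifecycle_configuration(
        FileSystemId=file_system_id,
        LifecyclePolicies=policies,
    )
```

Usage would look something like `apply_lifecycle_policies("fs-EXAMPLE", build_lifecycle_policies(30))`, where `fs-EXAMPLE` stands in for a real file system ID.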
-
- Posts: 390
- Joined: Sun Dec 02, 2007 4:53 am
- Hardware configuration: FX8320e (6 cores enabled) @ stock,
- 16GB DDR3,
- Zotac GTX 1050Ti @ Stock.
- Gigabyte GTX 970 @ Stock
Debian 9.
Running GPU since it came out, CPU since client version 3.
Folding since Folding began (~2000) and ran Genome@Home for a while too.
Ran Seti@Home prior to that. - Location: UK
- Contact:
Re: out of work?????
The problem isn't just the servers. Researchers have to generate new workunits, and I don't think it's an automatic system. Also, there are a finite number of hours in the day. I haven't had work for 24 hours. Admittedly, it's been a long time since we've had shortages on this level, and I've been folding since the project started. It will get fixed, but the timetable on it goes along the lines of "how long is a piece of string?".
-
- Posts: 1
- Joined: Thu Mar 12, 2020 3:38 am
Re: out of work?????
"I haven't had work for 24 hours"
It's been at least 48 hours for me without work. Appreciate the herculean efforts the folding at home folks are making to get the work units out. Waiting eagerly in anticipation.
Re: out of work?????
For perspective, one team ALONE has added about 18,000 new users in the last week (PC Master Race).
There is no way to tell how many new "anonymous because of the default client setup" folders have been added to the default team - but I'd guess HUNDREDS OF THOUSANDS and perhaps as high as a MILLION over the last week.
For perspective, Coreweave is contributing "more than 6000 Tesla V100" GPUs out of their 45,000 GPU render farm - and they're WAY down the scale in comparison to the default team or anonymous user in work done.
The primary issue appears to be lack of work units - the "60,000" that was added to one of the servers represented less than 6 HOURS of work at the current F@H participation level.
My personal guess is that participation in number of people has multiplied by at least 10 times, more likely over 100 times, and possibly by A THOUSAND times in the last week. The number of clients hasn't gone up as much, since many of us longer-term folders have been "heavy hitters" with multiple machines/clients.
This would put serious strain on ANY organization.
Might be worth talking to Bill Gates about the infrastructure issue - he's not part of Microsoft any more, but with his personal/foundation STOCK ownership I'm sure he's still got a lot of PULL there, and his foundation IS focused on medical-related issues.
This one seems to be right up the alley.
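To put the "60,000 work units in less than 6 hours" figure above into a rate, here is a small Python sketch of the arithmetic. It assumes the batch drained at a steady pace, which is a simplification, and the function name is just for illustration. Because the batch lasted less than 6 hours, the results are lower bounds.

```python
def min_hourly_demand(work_units: int, hours_to_drain: float) -> float:
    """Lower bound on work units consumed per hour, assuming steady demand."""
    return work_units / hours_to_drain


# Figures quoted in the post: 60,000 work units gone in under 6 hours.
rate = min_hourly_demand(60_000, 6)
print(f"more than {rate:,.0f} work units per hour")
print(f"more than {rate * 24:,.0f} work units per day")
```

So the servers would need to hand out well over ten thousand work units every hour just to keep pace with demand at that level.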
Re: out of work?????
If they had needed more servers to handle the work, they probably would have gotten them before Covid-19 came along. It is more likely a lack of work. They have only so many scientists, who were busy enough even before this virus came along.
That is all separate from the temporary server outage they have been facing, which is being solved, if it has not been already.
-
- Posts: 36
- Joined: Thu Nov 16, 2017 2:57 pm
Re: out of work?????
Contacting Bill Gates is a very good idea. It's possible he would assign someone and it would just get done. FWIW, I am a Principal Systems Architect with a rather large service provider, so if any volunteering along those lines is needed, I'm here. I don't know what servers you're using, but I have a lot of experience with Cisco UCS (I do NOT work for Cisco). I don't know if it has been considered, but a more distributed approach, with larger numbers of smaller systems, will probably scale better. Doing aggregation right becomes key at that point. Still, it sounds like WU generation WILL be the biggest challenge. If it were me, I would try to see how much of that process can be automated or reused to maximize the use of researchers' time.
- TulaneBaG
-
- Site Moderator
- Posts: 2850
- Joined: Mon Jul 18, 2011 4:44 am
- Hardware configuration: OS: Windows 10, Kubuntu 19.04
CPU: i7-6700k
GPU: GTX 970, GTX 1080 TI
RAM: 24 GB DDR4 - Location: Western Washington
Re: out of work?????
The high levels of growth have eaten up all the workunits in the queue, and they are scrambling to deploy new servers to meet the demand. That's a good problem to have, I suppose. I'm very happy to see the outpouring of support and collaborative efforts here!
F@h is now the top computing platform on the planet, and nothing unites people like a dedicated fight against a common enemy. This virus affects all of us. Let's end it together.
Re: out of work?????
I don't start FAH anymore; there are just no WUs.