How exactly are new WUs generated?

Moderators: Site Moderators, FAHC Science Team

mikeestacio
Posts: 22
Joined: Tue Mar 17, 2020 3:34 pm

How exactly are new WUs generated?

Post by mikeestacio »

Is it an algorithm that automatically takes completed WUs and modifies parameters to create a new PRCG? Is there a program that researchers manually load new parameters into?
Joe_H
Site Admin
Posts: 7939
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: How exactly are new WUs generated?

Post by Joe_H »

Partly. First they need to create a model of the protein system being simulated. That includes defining the initial locations of the atoms, the forces between them, and their velocity vectors as well. The solvent, mostly water, around the proteins also needs to be added. Some of the data can be imported from other experimental data, but other data may need to be set manually, from what I understand. There are some heuristics to aid in doing these steps as well.

That creates the basic Project (P).

Several of these parameters are varied to create a number of different starting points; these are the different Runs (R) and Clones (C). Again, there are some heuristics that have been created for this.

The first WU of each PRC is then Generation (G) 0. It is processed, and the results become the starting point for the next WU, Gen 1. This cycle is then repeated, Gen n becoming the input for Gen n+1, until enough have been done or a Gen reaches a stopping point.

I have simplified this a bit, and my disclaimer is that this is to my best understanding of the process.
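
To make the PRCG naming concrete, here is a minimal Python sketch of the scheme described above. It is only an illustration, not the actual Folding@home tooling: the `WorkUnit` class, the function names, and the project number and counts are all hypothetical, and a real WU carries simulation state (atom positions, velocities and so on) that this sketch leaves out.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class WorkUnit:
    """Hypothetical PRCG label; a real WU also carries simulation data."""
    project: int
    run: int
    clone: int
    gen: int

    def next_gen(self) -> "WorkUnit":
        # Gen n+1 continues the same Project/Run/Clone trajectory from Gen n.
        return WorkUnit(self.project, self.run, self.clone, self.gen + 1)


def initial_work_units(project: int, n_runs: int, n_clones: int) -> list[WorkUnit]:
    """Every Run/Clone combination starts its own trajectory at Gen 0."""
    return [
        WorkUnit(project, run, clone, 0)
        for run, clone in product(range(n_runs), range(n_clones))
    ]


def trajectory(start: WorkUnit, max_gens: int) -> list[WorkUnit]:
    """Labels of the WUs in one trajectory, in the order they must be processed."""
    chain = [start]
    while chain[-1].gen + 1 < max_gens:
        chain.append(chain[-1].next_gen())
    return chain


if __name__ == "__main__":
    # Assumed numbers for illustration only.
    starts = initial_work_units(project=12345, n_runs=2, n_clones=3)
    print(len(starts), "trajectories start at Gen 0")
    for wu in trajectory(starts[0], max_gens=3):
        print(wu)
```

The point of the sketch is that each Run/Clone pair is a separate chain, and within a chain the WUs are strictly sequential because each Gen needs the previous Gen's results.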

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
mikeestacio
Posts: 22
Joined: Tue Mar 17, 2020 3:34 pm

Re: How exactly are new WUs generated?

Post by mikeestacio »

Joe_H wrote: Partly. First they need to create a model of the protein system being simulated. That includes defining the initial locations of the atoms, the forces between them, and their velocity vectors as well. The solvent, mostly water, around the proteins also needs to be added. Some of the data can be imported from other experimental data, but other data may need to be set manually, from what I understand. There are some heuristics to aid in doing these steps as well.

That creates the basic Project (P).

Several of these parameters are varied to create a number of different starting points; these are the different Runs (R) and Clones (C). Again, there are some heuristics that have been created for this.

The first WU of each PRC is then Generation (G) 0. It is processed, and the results become the starting point for the next WU, Gen 1. This cycle is then repeated, Gen n becoming the input for Gen n+1, until enough have been done or a Gen reaches a stopping point.

I have simplified this a bit, and my disclaimer is that this is to my best understanding of the process.
I'm more curious about the specifics of how WUs are encoded to be uploaded onto the servers and distributed to us. Is there some sort of software tool used to create them? I'm trying to understand where the bottleneck is when WUs are not available.
Asgaroth
Posts: 29
Joined: Sun Dec 09, 2018 12:06 am

Re: How exactly are new WUs generated?

Post by Asgaroth »

Thanks @Joe_H, I didn't know how these are generated. Now I see where the PRCG acronym comes from.
There are two major products that came out of Berkeley: LSD and UNIX. We don't believe this to be a coincidence.
-- Jeremy S. Anderson
Joe_H
Site Admin
Posts: 7939
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: How exactly are new WUs generated?

Post by Joe_H »

mikeestacio wrote: I'm more curious about the specifics of how WUs are encoded to be uploaded onto the servers and distributed to us. Is there some sort of software tool used to create them? I'm trying to understand where the bottleneck is when WUs are not available.
Once the researcher sets up the basic project, there are tools to get it onto a server. Some options need to be selected based on the project's needs. Some test runs need to be done before releasing it to everyone else, both to see if there are any problems and to benchmark for points awards.

The server code takes care of the rest: distributing WUs, receiving the returns, and creating the next generation of each run as each WU is returned.

The basic bottlenecks at this point are project setup, especially for a new protein system, and having enough servers to handle all the demand. The servers can only handle so many connections for downloads and uploads, and are currently at or near those limits. As they add more servers, more can be sent out.
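
As a rough illustration of the server side, the toy Python model below shows why WUs can run out even when a project is far from finished. It is a sketch under stated assumptions, not the real server code: `ToyWorkServer`, its method names, and the `max_outstanding` cap (a crude stand-in for the connection limits mentioned above) are all hypothetical.

```python
from collections import deque


class ToyWorkServer:
    """Toy model: one WU per Run/Clone trajectory can exist at a time."""

    def __init__(self, trajectories, max_gens, max_outstanding):
        self.max_gens = max_gens
        # Assumed cap on WUs handed out at once; stands in for connection limits.
        self.max_outstanding = max_outstanding
        # Each trajectory starts with its Gen 0 WU waiting in the queue.
        self.queue = deque((run, clone, 0) for run, clone in trajectories)
        self.outstanding = set()

    def assign(self):
        """Return a (run, clone, gen) WU, or None if nothing can be handed out."""
        if not self.queue or len(self.outstanding) >= self.max_outstanding:
            return None
        wu = self.queue.popleft()
        self.outstanding.add(wu)
        return wu

    def return_result(self, wu):
        """Accept a finished WU and queue the next generation of its trajectory."""
        self.outstanding.remove(wu)
        run, clone, gen = wu
        if gen + 1 < self.max_gens:
            # Gen n+1 only becomes available once Gen n has come back.
            self.queue.append((run, clone, gen + 1))


if __name__ == "__main__":
    server = ToyWorkServer([(0, 0), (0, 1)], max_gens=2, max_outstanding=1)
    first = server.assign()
    print("assigned:", first)
    print("second request while at the cap:", server.assign())
    server.return_result(first)
    print("after the return:", server.assign())
```

In this model a request comes back empty in two cases: every trajectory's current WU is already out with some donor, or the server is at its cap. Both match the description above in spirit only; the real limits and queueing rules are not something this sketch claims to reproduce.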

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3