Q: providing additional info like "ns" by project
Posted: Wed Aug 06, 2014 6:44 pm
by ChristianVirtual
I wonder if it would be possible to show, for each WU, the number of nanoseconds it represents.
As it depends highly on the hardware and the assigned project/WU, it might be required to put such a value (in case it can be easily determined) into the log file, e.g. at the end of a WU as a summary. Or ideally in each completed line.
Or is there some neat trick to generate/approximate that value based on actual PPD and psummary information?
Why? It is an impressive/educational number to help donors understand how complex folding is, and that *this* one WU they are working on for a whole day "just" represents a few ns in real life.
Re: Q: adding "ns/day" to client output
Posted: Wed Aug 06, 2014 7:06 pm
by billford
Yes, I've often thought that would be a nice parameter to see, just the number of nS that a WU represents would be enough afaic.
Possibly not on the NaCl client though… pico-seconds might be depressing
Re: Q: adding "ns/day" to client output
Posted: Wed Aug 06, 2014 7:21 pm
by bruce
Actually, ns/day was reported by the V6 client but it was eliminated from V7 because it's so deceptive. One ns of simulation of a protein with a large number of atoms requires a lot more processing than one ns of simulation of a protein with fewer atoms. The complexity of the protein (the number of atoms and the bonds between them) is much more significant than any hardware considerations. Hardware is, in fact, important, but it turns out to be less important than the size and structure of the protein.
From the donor's perspective, what's important is how much science you're completing, and PPD (either baseline PPD or total PPD, depending on your perspective) is a much more uniform measurement of what your system is accomplishing.
Gromacs does report nanoseconds in its log file, so for CPU projects you can still find it fairly easily. I'm not sure about the GPU cores.
A report of NS/WU is probably more straightforward because of the uncertainties of estimating rates of progress, but it's still not a measurement of work accomplished that can be compared to work accomplished on a different protein.
Re: Q: adding "ns/day" to client output
Posted: Wed Aug 06, 2014 7:54 pm
by billford
That sounds reasonable, but for me (I can't speak for ChristianVirtual obviously) it would simply be "a quantity that is of interest"; out of plain curiosity, not for any purpose of comparison or measurement.
I'm not even bothered about it being part of the client, I'd be quite happy with another column in psummary (nS/WU) that I could look up when the urge took me. The logs tell me how long a WU (or even a single frame) takes, getting nS/day from that is a trivial exercise.
(I'm aware of many of the potential sources of error by doing it that way, but it would be accurate enough for me)
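To illustrate the "trivial exercise" mentioned above, here is a minimal sketch of the arithmetic. It assumes a hypothetical ns/WU figure (psummary does not publish one) and assumes the WU is logged in 100 equal frames; both the function name and the sample numbers are invented for illustration only.

```python
SECONDS_PER_DAY = 86_400

def ns_per_day(ns_per_wu: float, seconds_per_frame: float, frames_per_wu: int = 100) -> float:
    """Estimate simulated nanoseconds per day of wall-clock time.

    ns_per_wu         -- simulated time covered by one WU (hypothetical input)
    seconds_per_frame -- average time per frame, read from the client log
    frames_per_wu     -- frames per WU (assumed 100, i.e. one frame per 1%)
    """
    seconds_per_wu = seconds_per_frame * frames_per_wu
    wus_per_day = SECONDS_PER_DAY / seconds_per_wu
    return ns_per_wu * wus_per_day

# Illustrative numbers only: a 2 ns WU at 432 s per frame takes half a day,
# so two such WUs fit into one day.
print(ns_per_day(ns_per_wu=2.0, seconds_per_frame=432.0))
```

As noted in the posts above, the result only makes sense when compared against WUs from the same project.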
Re: Q: adding "ns/day" to client output
Posted: Wed Aug 06, 2014 7:58 pm
by ChristianVirtual
bruce wrote:From the donor's perspective, what's important is how much science you're completing, and PPD (either baseline PPD or total PPD, depending on your perspective) is a much more uniform measurement of what your system is accomplishing.
Gromacs does report nanoseconds in its log file, so for CPU projects you can still find it fairly easily. I'm not sure about the GPU cores.
A report of NS/WU is probably more straightforward because of the uncertainties of estimating rates of progress, but it's still not a measurement of work accomplished that can be compared to work accomplished on a different protein.
Call me Joe Donor, but what does "science done" mean? What is the measurement system? PPD alone is no longer sufficient for me as a donor who is growing up. PPD is just a projection, an attempt to describe science value. Like the hourly wage for nurses and car dealers: while the car dealer might get more ¥€$ per day, the work of a nurse might be perceived as more valuable to the public (no offense intended to any car dealer here in the forum, we need you guys too).
If GROMACS has it in its files, that's great, but it's a no-go for me, as I can't access the files. I have only the log files provided by the remote API and its runtime messages. Same as FAHControl.
Re: Q: adding "ns/day" to client output
Posted: Wed Aug 06, 2014 8:07 pm
by ChristianVirtual
billford wrote: I'd be quite happy with another column in psummary (nS/WU) that I could look up when the urge took me. The logs tell me how long a WU (or even a single frame) takes, getting nS/day from that is a trivial exercise.
(I'm aware of many of the potential sources of error by doing it that way, but it would be accurate enough for me)
That would be a good start with psummary and ns/WU and wouldn't need to change the cores or clients
Re: Q: adding "ns/day" to client output
Posted: Wed Aug 06, 2014 8:21 pm
by billford
I'd prefer the "raw" data of nS/WU anyway, then I can calculate what I want, not what someone else thinks I want.
(I've probably been using Macs too long)
Re: Q: adding "ns/day" to client output
Posted: Wed Aug 06, 2014 9:55 pm
by 7im
Except if the ns/wu is non-comparable because of differences in size, complexity, etc., what good does any calculation do for you?
You might want to look at what that v6 data is that bruce mentioned before assuming it's something you can actually use productively.
Re: Q: adding "ns/day" to client output
Posted: Wed Aug 06, 2014 10:04 pm
by billford
Who said anything about productively? As I said, it's just out of interest (and of more interest than the number of atoms involved, and that's in psummary).
Your mileage may vary of course… in fact I'd expect it to.
Re: Q: adding "ns/day" to client output
Posted: Wed Aug 06, 2014 11:14 pm
by NookieBandit
Having the ns/wu captured would give 3rd party monitoring apps like HFM the opportunity to summarize, by work unit, the total number of ns a donor has contributed to the trajectory or segment under investigation. ChristianVirtual makes a good point: as donors become more sophisticated by learning more about how simulations actually work, having access to data that allows them to characterize their contribution by something other than PPD is a very interesting concept. Seeing a summary of the total ns put against a number of work units in a project over the course of time (say, a year or more) would be motivating, along with showing how daunting the task actually is.
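A rough sketch of the kind of per-project summary a monitoring tool could produce, if ns/WU were ever published. Everything here is an assumption: the `ns_per_wu` lookup table is hypothetical (no such column exists in psummary today), the project numbers and values are invented, and this is not how HFM works.

```python
from collections import defaultdict

def total_ns_by_project(completed_projects, ns_per_wu):
    """Sum simulated nanoseconds per project.

    completed_projects -- iterable of project numbers, one entry per finished WU
    ns_per_wu          -- dict mapping project number to ns per WU (hypothetical data)

    Totals are kept per project on purpose: ns from different proteins
    are not comparable, so they are never added together.
    Raises KeyError if a project has no ns/WU entry.
    """
    totals = defaultdict(float)
    for project in completed_projects:
        totals[project] += ns_per_wu[project]
    return dict(totals)

# Invented example: three finished WUs across two projects.
completed = [13000, 13000, 13001]
ns_table = {13000: 2.0, 13001: 0.5}
print(total_ns_by_project(completed, ns_table))
```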
Re: Q: adding "ns/day" to client output
Posted: Wed Aug 06, 2014 11:28 pm
by bruce
Curiosity about the science is a good thing and the sense of how many generations are required to create a meaningful solution is important. All we can do is ask, and then see if PG decides it's as important as spending the same amount on something else.
The typical project justification is based on "How much more science will be accomplished?" and unfortunately that's zero. Nevertheless, Stanford is an educational institution and educating donors about the science involved in FAH is certainly worth something.
Re: Q: adding "ns/day" to client output
Posted: Thu Aug 07, 2014 6:51 am
by billford
bruce wrote:
The typical project justification is based on "How much more science will be accomplished?" and unfortunately that's zero.
In any direct sense, yes, but I'm sure PG would acknowledge that suitable rewards can lead to more science being done. And those rewards don't have to be tangible (though it doesn't do any harm if they are, ask curecoin)
To be honest, at the moment I have essentially zero feel for how much actual science I am contributing; total points, PPD etc only give a comparative indication with respect to my own past performance, that of others and between clients processing WUs from the same project.
If I could look through my own (local) stats, see that I've completed (eg) "N" P13000 WU's and, even though they are not all on the same trajectory or even processed by the same client, calculate from that that I have contributed to evaluating the first "X" nano- (hopefully micro- or maybe milli- in due course) seconds in a protein molecule's life then that would give me much more satisfaction than simply looking at a points total.
(Which, as far as "scientific work performed" is concerned, is anyway distorted by QRB)
If HFM could do it automatically then so much the better, but that's not in PG's purview.
It may be that I'm unusual in seeing this sort of information as a reward in itself, but I very much doubt it.
bruce wrote:All we can do is ask
And that's all I'm doing
Re: Q: adding "ns/day" to client output
Posted: Thu Aug 07, 2014 7:19 pm
by bruce
If you want to measure science, nanoseconds is an extremely poor measure for two reasons.
1) As I said earlier, 1 ns of a large protein is scientifically a lot more valuable than 1 ns of a small protein. Also, "large" is a relative term, since many proteins that need to be studied are significantly larger than the ones that can be studied today. Moreover, studying the interaction between two or more proteins can only lead to larger models.
2) I do not accept the statement "distorted by QRB". The scientific value of proteins which are returned promptly far exceeds the value of the same protein which is delayed, sitting waiting for computational resources. The justification for QRB being an improved measure of scientific value comes straight from Dr. Pande himself, and has been repeatedly debated here on the forum.
Let's not turn this into a debate about QRB. That's off-topic and we're not going to go there.
Yes, there is value in documenting the number of nanoseconds in a WU (and I support that) but it cannot be used to compare the simulation of one protein to that of another.
WUs which are uploaded as soon as possible after downloading are scientifically much more valuable
Re: Q: adding "ns/day" to client output
Posted: Thu Aug 07, 2014 7:30 pm
by billford
bruce wrote:but it cannot be used to compare the simulation of one protein to that of another.
I accept that, as I said in an earlier post in this topic:
billford wrote:... not for any purpose of comparison or measurement.
It wasn't my intention to start another debate about QRB, apologies if it came across that way
I was referring to the distortion introduced between clients on different hardware processing WUs from the same project: adding up the nanoseconds gives a more realistic measure (for this specific scenario) than adding up the points. I should have made that clearer.
Re: Q: adding "ns/day" to client output
Posted: Thu Aug 07, 2014 9:50 pm
by ChristianVirtual
I think it is clear that an intra-project comparison of ns/day only works within a group of similar projects (e.g. same protein, different environmental parameters).
I was also thinking it would help in understanding the differences in protein complexity if the number of atoms in a project were split into the number of atoms in the protein plus the number of atoms in the environment/solution. If that is difficult to get, the number of peptides could be provided as an alternative. That would give a better context for comparing ns/day between projects, if one would like to do so.