Why isn't TPF averaged (or is it)?
Posted: Sat May 04, 2013 8:53 pm
by Breach
Background: I have started maintaining a per-WU Excel table with various details. One of the items I am calculating is a WU average points-per-hour metric, which I calculate as: Total Estimated points (with no interruptions) / (24 x TPF x 100). That's fine, but I am doing this on the assumption that the TPF stabilizes at some point (more or less). However, that's a very wrong assumption as it turns out. For example, PRCG 8089,1852,3,33 - for every frame the TPF estimate jumps between 1 min 02 secs and 1 min 33 secs (with the ETA being 1 h 15 mins and 1 h 52 mins respectively). In this particular case that's not a problem, as there's a pattern and I can still calculate the average TPF, but my more general question is: why isn't the reported TPF an average calculated from the previous TPFs? It seems to me the TPF reported is the TPF of just the last frame, which can lead to confusing ETAs.
Thanks.
PS
No, there's not extra load influencing this - I have dedicated 6 out of 8 threads and plenty of system idle headroom.
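A minimal sketch of the points-per-hour calculation described in the post above. It assumes a WU consists of 100 frames and that the factor 24 in the spreadsheet formula only converts an Excel time value (a fraction of a day) into hours. The function name and the 1000-point WU are hypothetical.

```python
def points_per_hour(total_points: float, tpf_seconds: float, frames: int = 100) -> float:
    """Estimated points per hour for a WU that runs without interruptions."""
    wu_hours = tpf_seconds * frames / 3600.0  # total run time in hours
    return total_points / wu_hours


# Hypothetical WU worth 1000 points, using the two TPF values reported above.
for tpf in (62, 93):
    print(f"TPF {tpf}s -> {points_per_hour(1000, tpf):.1f} points/hour")
```

The two results differ by about a third, which is why a TPF that alternates between frames makes the metric hard to pin down from a single reading.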
Re: Why isn't TPF averaged (or is it)?
Posted: Sat May 04, 2013 9:24 pm
by PantherX
Is there any reason to manually maintain the Excel sheet? If not, then I would recommend that you use HFM.NET (http://code.google.com/p/hfm-net/), which does work with V7. However, it may not report the Slots type correctly. Am unsure of this since I use V7.2.6 with HFM.NET and it works just fine for my purpose. Do note that HFM.NET does calculate the TPF as an average over the last 3 complete frames and you can change it to other methods also.
FAHControl does display the average TPF and not the last TPF. Am not sure how many frames it uses to calculate the average TPF.
FahCores report the actual TPF. One reason for the variation is the different forces that need to be calculated. They may change with each frame, so that might contribute to the TPF fluctuation.
Do note that you can't rule out interruptions by the system unless you have set up core affinity (by using 3rd party tools). With core affinity set, those processes are locked to those cores and the OS scheduler will not be able to use the locked cores for anything else.
Re: Why isn't TPF averaged (or is it)?
Posted: Sat May 04, 2013 9:25 pm
by bruce
I have wondered about many of those same questions and I have not been able to get any good answers. I have a couple of suggestions that might or might not be true but you can bounce them against the data and see if anything useful turns out.
First, every frame is NOT identical. There are periodic interruptions (like writing a checkpoint) which don't happen at the same frequency as the TPF.
Second, even if it seems like calculating a step with N atoms should take the same amount of time, that is simply NOT true. Search the forum for "folding event".
Third, if an average is used, and the protein shape changes appreciably or another task in your computer changes, the delay time before the number stabilizes is highly dependent on how many samples are included. Donors want a TPF that converges quickly (say they added two more CPU-cores or overclocked their GPU) yet they want a stable prediction -- and those two concepts contradict each other -- so no prediction of future results will always be satisfactory.
During the development of the 3rd party tool HFM, the pros and cons of several methods were evaluated and in the end, a choice of three methods was left to the user. V7 does not give you a choice, and apparently is using a fourth method.
You'll notice that there have been several V7 tickets opened on this topic, but since they have no influence on science, only on ("worthless"?) points, and since it's called "estimated" anyway, things like this do not get a lot of attention from the Developers, who have plenty of more important things to be worrying about.
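The trade-off in the third point can be sketched with a plain rolling mean. This is only an illustration with made-up numbers (a machine that speeds up from 90 s to 60 s per frame), not the method used by V7 or HFM; the function name and parameters are hypothetical.

```python
from collections import deque


def frames_to_converge(window, old=90.0, new=60.0, tolerance=1.0):
    """Count frames at the new speed before a rolling mean is within tolerance of it."""
    recent = deque([old] * window, maxlen=window)  # history filled at the old speed
    frames = 0
    while abs(sum(recent) / window - new) > tolerance:
        recent.append(new)
        frames += 1
    return frames


for window in (1, 3, 10):
    print(window, frames_to_converge(window))
```

With these numbers the estimate only settles once every old sample has left the window, so a larger window gives a steadier figure at the cost of a proportionally longer delay.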
Re: Why isn't TPF averaged (or is it)?
Posted: Sat May 04, 2013 9:41 pm
by Breach
Thanks. I am unfamiliar with HFM and will check it out now.
Are you sure that FAHControl displays TPF as an average? I'm seeing 1% 1:02 2% 1:33 3% 1:02 4% 1:33 etc. I am at 64% now (1:35), 65% (1:03) - if it was a true average it would have stabilized by now, though I guess it's still possible...
I have locked affinity for FAH to the first 3 cores and set it with realtime priority - no changes as far as TPF behavior is concerned. I don't understand why that would be needed though? I have no other programs with affinity set - if there's an available thread on 8 why would the scheduler interrupt the FAH core running on 1 to run it there...?
[Update:]
So:
1. Not sure whether HFM calculates average TPF or uses the delta from the log file - still it matches my manual calculations:
Code:
20:16:18:WU02:FS01:0xa4:Completed 2500 out of 250000 steps (1%) - N/A
20:17:33:WU02:FS01:0xa4:Completed 5000 out of 250000 steps (2%) - 1m 15 sec
20:18:49:WU02:FS01:0xa4:Completed 7500 out of 250000 steps (3%) - 1m 16 sec
20:20:04:WU02:FS01:0xa4:Completed 10000 out of 250000 steps (4%) - 1m 15 sec
20:21:20:WU02:FS01:0xa4:Completed 12500 out of 250000 steps (5%) - 1m 16 sec
20:22:35:WU02:FS01:0xa4:Completed 15000 out of 250000 steps (6%) - 1m 15 sec
20:23:51:WU02:FS01:0xa4:Completed 17500 out of 250000 steps (7%) - 1m 16 sec
20:25:07:WU02:FS01:0xa4:Completed 20000 out of 250000 steps (8%) - 1m 16 sec
20:26:23:WU02:FS01:0xa4:Completed 22500 out of 250000 steps (9%) - 1m 16 sec
20:27:38:WU02:FS01:0xa4:Completed 25000 out of 250000 steps (10%) - 1m 15 sec
20:28:54:WU02:FS01:0xa4:Completed 27500 out of 250000 steps (11%) - 1m 16 sec
There's an option on how to calculate the PPD in HFM - I don't see an option on how to calculate TPF?
2. This gets even stranger - given these figures in the log file, I can't understand what kind of logic the FAHClient would use to come up with these 1m 02 sec / 1m 33 sec TPF figures...? Sure, the average of these is about 1m 18 sec, but...
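The manual per-frame figures above can be reproduced from the log timestamps. The following is a sketch only: the regular expression is written against the line format shown in this post and is an assumption about the log layout in general, and the function name is hypothetical.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as "20:16:18:WU02:FS01:0xa4:Completed 2500 out of 250000 steps (1%)"
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}):WU\d+:FS\d+:0x\w+:Completed \d+ out of \d+ steps")


def frame_times(log_lines):
    """Return the seconds elapsed between consecutive 'Completed' lines."""
    stamps = []
    for line in log_lines:
        m = LINE_RE.match(line)
        if m:
            stamps.append(datetime.strptime(m.group(1), "%H:%M:%S"))
    deltas = []
    for prev, cur in zip(stamps, stamps[1:]):
        delta = cur - prev
        if delta < timedelta(0):  # the timestamps rolled over midnight
            delta += timedelta(days=1)
        deltas.append(delta.total_seconds())
    return deltas


log = [
    "20:16:18:WU02:FS01:0xa4:Completed 2500 out of 250000 steps (1%)",
    "20:17:33:WU02:FS01:0xa4:Completed 5000 out of 250000 steps (2%)",
    "20:18:49:WU02:FS01:0xa4:Completed 7500 out of 250000 steps (3%)",
    "20:20:04:WU02:FS01:0xa4:Completed 10000 out of 250000 steps (4%)",
]
times = frame_times(log)
print(times, sum(times) / len(times))
```

For these four lines the deltas are 75, 76 and 75 seconds, matching the 1m 15 sec / 1m 16 sec annotations in the log excerpt.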
Re: Why isn't TPF averaged (or is it)?
Posted: Sat May 04, 2013 10:17 pm
by Breach
bruce wrote:
Third, if an average is used, and the protein shape changes appreciably or another task in your computer changes, the delay time before the number stabilizes is highly dependent on how many samples are included. Donors want a TPF that converges quickly (say they added two more CPU-cores or overclocked their GPU) yet they want a stable prediction -- and those two concepts contradict each other -- so no prediction of future results will always be satisfactory.
Thanks. I suppose it comes down to that. Current speed vs. average speed per km/mile. Still, a TPF/ETA which jumps up and down every frame in the FAHClient 7 is not helping either case... though I guess it's only an issue in specific projects.
[Update]
Just to update before I move on with my life
On another WU:
From the logs:
0-1%: 15:26
1-2%: 15:23
2-3%: 15:31
3-4%: 15:28
Average: 15:27
HFM reports the same figure so it seems it is indeed calculating a TPF average. FAHControl reports a TPF of 15:30... at least it's not jumping -/+ 30% between frames on this WU.
Re: Why isn't TPF averaged (or is it)?
Posted: Sat May 04, 2013 11:56 pm
by PantherX
Breach wrote:...Are you sure that FAHControl displays TPF as an average? I'm seeing 1% 1:02 2% 1:33 3% 1:02 4% 1:33 etc. I am at 64% now (1:35), 65% (1:03) - if it was a true average it would have stabilized by now, though I guess it's still possible...
Technically, a mathematical model is being used and I am unaware of what it is. For the sake of simplicity, I used the term "Average" since it is easy to understand and, eventually, FAHControl gets it correct (or within a few seconds of it). It does maintain a history of the Project, so the next time you download a WU from a previously folded Project, it will display the TPF automatically and update it from the actual WU if needed.
Breach wrote:...I have locked affinity for FAH to the first 3 cores and set it with realtime priority - no changes as far as TPF behavior is concerned. I don't understand why that would be needed though? I have no other programs with affinity set - if there's an available thread on 8 why would the scheduler interrupt the FAH core running on 1 to run it there...?...
The scheduler has its own logic that I am unaware of. Quite some time ago, I read that if you use a 3rd party affinity program, you can lock all non-FAH programs to particular CPU(s) and lock FAH to the free ones. One may see a difference in the TPF but, generally, this is not that widely used.
Breach wrote:...There's an option on how to calculate the PPD in HFM - I don't see an option on how to calculate TPF?...
Edit -> Preference -> Options:
Calculate PPD Based on: 4 Options available from the drop down menu
Calculate Bonus Based on: 3 Options available from the drop down menu
Re: Why isn't TPF averaged (or is it)?
Posted: Sun May 05, 2013 12:22 am
by Jesse_V
According to
https://fah-web.stanford.edu/projects/F ... ticket/395
As of v7.1.44 the client tries to measure frames by watching the frame number change. If the core is shutdown or the machine is hibernated the core will detect this and adjust it's estimate accordingly. It saves the last three frame measures and uses the median value for it's predictions. This both allows the estimates to adjust quickly (on a per frame basis) and provides some smoothing for abnormal values.
To be clear the client is not taking the mean of the last three. It is taking the median value.
https://fah-web.stanford.edu/projects/F ... ticket/581 is an enhancement request to choose the PPD calculation algorithm.
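To illustrate the difference between the median and the mean of the last three frames mentioned in the ticket, here is a small sketch. It is not the client's code; the helper name and the frame times are made up, and only the "last three, median" rule comes from the ticket text quoted above.

```python
from collections import deque
from statistics import mean, median


def rolling(values, window, reducer):
    """Apply reducer to the most recent `window` values after each new frame."""
    recent = deque(maxlen=window)
    out = []
    for v in values:
        recent.append(v)
        out.append(reducer(recent))
    return out


# Frame times in seconds: one abnormal frame, then a strictly alternating pattern.
one_outlier = [75, 75, 200, 75, 75]
alternating = [62, 93, 62, 93, 62, 93]

print(rolling(one_outlier, 3, median))
print(rolling(alternating, 3, median))
print(rolling(alternating, 3, mean))
```

The median hides the single 200 s frame entirely, which is the smoothing the ticket describes. If the measured frame times strictly alternate, though, the median of the last three just returns each of the two values in turn, while the mean oscillates within a narrower band. Whether that is what produces the 1:02 / 1:33 readings reported earlier in the thread is not established here.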
Re: Why isn't TPF averaged (or is it)?
Posted: Sun May 05, 2013 2:31 am
by EXT64
On certain projects, FAHControl does not agree with the log (and thus core) on WU progress. It instead oscillates around the true TPF (low-high-low-...). This has been mentioned as a bug several times but as it was not deemed a high priority it has never been fixed or explained.
See viewtopic.php?f=88&t=23960
Re: Why isn't TPF averaged (or is it)?
Posted: Sun May 05, 2013 2:28 pm
by Flaschie
PantherX wrote:
Breach wrote:...There's an option on how to calculate the PPD in HFM - I don't see an option on how to calculate TPF?...
Edit -> Preference -> Options:
Calculate PPD Based on: 4 Options available from the drop down menu
Calculate Bonus Based on: 3 Options available from the drop down menu
If this is something you check often, then it may be nice to know the shortcuts for toggling: "Alt+P" for PPD and "Alt+O" for bonus.
Re: Why isn't TPF averaged (or is it)?
Posted: Sun May 05, 2013 7:53 pm
by Qinsp
With my machines, once I get them running sweet, the actual TPF variation is small.
The 7.3.6 TPF estimation code is broken. Bad. Think train wreck. Ignore it. Look at the log file.
Re: Why isn't TPF averaged (or is it)?
Posted: Mon May 06, 2013 5:05 am
by bruce
The FAH Client design specifies that it does NOT look at the log ... so you know more about the TPF than it does.