Project: 7085 (Run 0, Clone 695, Gen 16)
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 887
- Joined: Wed May 26, 2010 2:31 pm
- Hardware configuration: Atom330 (overclocked):
Windows 7 Ultimate 64bit
Intel Atom330 dualcore (4 HyperThreads)
NVidia GT430, core_15 work
2x2GB Kingston KVR1333D3N9K2/4G 1333MHz memory kit
Asus AT3IONT-I Deluxe motherboard - Location: Finland
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
I started a separate discussion ( viewtopic.php?f=16&t=25464 ) about updating the main FAQ.
Win7 64bit, FAH v7, OC'd
2C/4T Atom330 3x667MHz - GT430 2x832.5MHz - ION iGPU 3x466.7MHz
NaCl - Core_15 - display
-
- Pande Group Member
- Posts: 50
- Joined: Wed Sep 16, 2009 4:14 pm
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
@netblazer I've changed the project mode so you shouldn't be assigned WUs from 7085 (and also prevent others from running into the same issue).
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
abdulwahidc wrote: @netblazer I've changed the project mode so you shouldn't be assigned WUs from 7085 (and also prevent others from running into the same issue).
Thank you.
I don't mind working on that project. Just give me something that I have a chance in hell of finishing, so that we don't waste our collective resources.
I can spare the $1 per week in added electricity costs coming my way.
P.S. Just to throw an idea into the melting pot: would it be possible to run the same project in two "classes"/versions?
You could have one version with, say, 10M steps like this one has, and another class with 1M steps, giving the 1M-step WUs to older, lower-end machines while the i5s and i7s take care of the bigger ones. Then, when you see a generation falling too far behind, assign it to a faster machine so it catches up a bit (see the sketch just below). This logic is very simple to code and requires very little testing (compared to the amount of work required to change your FAHcores).
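Something like this, purely as an illustration (Python pseudocode with invented names and thresholds, not anything from the actual assignment servers):

SMALL_STEPS = 1_000_000
LARGE_STEPS = 10_000_000
FAST_CORE_THRESHOLD = 4   # invented cutoff: 4+ cores counts as "fast"
CATCH_UP_LAG = 3          # invented: how far behind a generation triggers a catch-up

def assign(client_cores, generations_behind):
    """Return (eligible, wu_steps) for this client and a trajectory that is
    `generations_behind` the newest generation."""
    fast = client_cores >= FAST_CORE_THRESHOLD
    if generations_behind >= CATCH_UP_LAG and not fast:
        # a lagging trajectory is reserved for fast machines so it can catch up
        return (False, 0)
    return (True, LARGE_STEPS if fast else SMALL_STEPS)

print(assign(client_cores=2, generations_behind=0))   # (True, 1000000)
print(assign(client_cores=8, generations_behind=5))   # (True, 10000000)
print(assign(client_cores=2, generations_behind=5))   # (False, 0)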
-
- Site Moderator
- Posts: 6986
- Joined: Wed Dec 23, 2009 9:33 am
- Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB
Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400 - Location: Land Of The Long White Cloud
- Contact:
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
Dynamically changing the number of steps within the same Project isn't supported, AFAIK. Not sure why that is. Thus, all WUs within the same Project have the same number of steps.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time
Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
PantherX wrote: Dynamically changing the number of steps within the same Project isn't supported, AFAIK. Not sure why that is. Thus, all WUs within the same Project have the same number of steps.
Hence a new solution that may be interesting to consider.
My crappy old laptop has absolutely no problem or bottleneck running this 1221-point WU; it just takes too darn long. I can manage maybe 1M steps per day. It's stupidly simple, but completing 10 WUs for 1221 points feels a lot better than completing only 1 for the same number of points (10 days without any gratification for finishing and achieving something). It would also keep me more interested in staying an active participant, since I'd feel like I'm really contributing something (the psychology of numbers is really interesting).
An even simpler solution is to take the same project and cut everything down to 2M-step runs for everybody (so the i7s don't lose out). That doesn't require any coding or testing and could work unchanged for the next 20 years, if not forever.
There's just no reason to throw out those older ACTIVE systems...
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
There has never been a way to dynamically subdivide a WU. If the WU consists of 10M steps, finishing 1M steps means you've only done 10% of it, and it won't be returned until it's finished. The PI associated with the project could create an independent project representing 1M steps, but then every WU assigned from that project would be 1M steps, loading up the communications links with 10x as many uploads and downloads. Obviously some policy settings could mitigate such issues. One way or another, there need to be projects that you can complete, and the assignment logic needs to take your hardware into account. At least temporarily, restricting machines like yours from this particular project is probably the best way to handle it.
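To put rough numbers on the transfer side of that trade-off (an illustrative count only; it assumes nothing about actual file sizes):

# Back-of-the-envelope count of transfers per trajectory; purely illustrative.
trajectory_steps = 10_000_000

for wu_steps in (10_000_000, 1_000_000):
    wus = trajectory_steps // wu_steps
    transfers = 2 * wus          # one download (assignment) + one upload (result) per WU
    print(f"{wu_steps:>10,} steps/WU -> {wus:>2} WUs and {transfers} transfers per trajectory")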
By the way, 1 step of one protein is not the same as 1 step of another protein. As the number of atoms increases, each step needs a lot more computing.
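As a rough, generic illustration of that scaling (this assumes textbook N·log N behaviour for the long-range electrostatics, which may not match any particular FAHcore or project):

import math

# Very rough, generic illustration of how per-step cost grows with atom count.
# Assumes N*log(N) scaling (typical of PME-style electrostatics); real cores differ.
def relative_step_cost(n_atoms, baseline_atoms=10_000):
    return (n_atoms * math.log(n_atoms)) / (baseline_atoms * math.log(baseline_atoms))

for n in (10_000, 50_000, 200_000):
    print(f"{n:>7,} atoms -> ~{relative_step_cost(n):.1f}x the per-step work of a 10k-atom system")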
Are your power-saving settings set for maximum performance, including Never Sleep?
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
Yes, performance mode is on. Only the screen turns off, and the FAHcores keep working... that's actually when I get the best TPF; I check it multiple times daily as well.
I've gone to viper's site to turn off all the unwanted services, and I monitor Task Manager regularly to kill anything I don't absolutely need. I'm stuck working with this machine right now because I lost my own. It's as tuned as it's ever going to get short of overclocking. I'm an SQL DBA and I specialize in performance tuning (hardware included). I'm not saying I'm perfect, but I don't think this laptop will go much faster unless I flat-out stop using it and start overclocking it... and it was worth maybe $300 five years ago, so you get what you pay for!
What I meant is that, server-side, you offer two lengths of the same WU (same everything, just a different step count). Some sequences go in steps of 10M, others in steps of 1M, and each is assigned to the appropriate hardware (obviously dividing the points per WU by 10 as well).
In the generation script, all it takes is a second counter with an if MOD(@CounterVar) = 0 then (10 small units) else (1 big unit) to split the work according to the distribution the folders need. That part is rather easy to code (just call the same function and change one parameter, the step count). I don't know the impact on the rest of the pipeline, but it could be negligible or even nonexistent depending on how things are coded. It clearly has no impact on the FAHcores or the client, and the upload process stays the same. It requires a new field in the assignment table to route WUs to the correct folder, but that's negligible in DB size and coding effort. Processing the zipped result file should also work just fine since the client hasn't been touched, and you still have the same fields linking the preceding PRCG to the next one, and so on. This sounds really feasible without rebuilding the entire system, but that's as far as I can go without looking at the code, which is usually when all hell breaks loose.
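Roughly what I have in mind, sketched in Python rather than SQL (I obviously haven't seen the real generation script, so create_wu is just a stand-in for whatever call already builds a WU):

# Illustration only -- create_wu stands in for the existing WU-generation call,
# with only the step-count parameter changed.
SMALL_STEPS = 1_000_000
LARGE_STEPS = 10_000_000
SMALL_CLASS_RATIO = 10   # e.g. every 10th trajectory is generated as short units

def queue_next_generation(trajectory_counter, create_wu):
    if trajectory_counter % SMALL_CLASS_RATIO == 0:
        # low-end class: issue the next chunk as a 1M-step unit (ten of them, one after
        # another as results come back, cover the same ground as one 10M-step generation)
        create_wu(steps=SMALL_STEPS)
    else:
        # fast class: one full 10M-step unit, exactly as today
        create_wu(steps=LARGE_STEPS)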
I've thought about the bandwidth issue, but the fact is that your service has to be download-bound rather than upload-bound, unlike "most" web servers. So more uploads wouldn't be much of an issue (it does have to be considered, since it will indeed increase load, but only for the slow folders, so it may again be a non-issue). However, since only 10% of the work would have been completed per unit, each result should be roughly 9-10 times smaller, and processing it roughly 9-10 times faster. The cycles would be shorter and could be sent back out sooner, so the extra work there would be very small, if noticeable at all.
This might also relieve some pressure and leave you free not to change the bigadv requirements (again).
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
netblazer wrote: ...and can work unchanged for the next 20 years, not to say forever. There's just no reason to throw out those older ACTIVE systems...
And in just 10 years we've gone from a single-threaded 500 MHz Celeron to as many as 128 threads at 2.6 GHz in a 4-processor server. That's over 660x the speed, and with SSE throughput doubling in the same time frame, call it 1330x, which works out to more than doubling every year, and that's just the hardware. FAH software has added roughly another 10x during that time. And we've gone from non-existent GPUs to GPUs rivaling even the fastest CPUs in folding performance.
Yes, there is good reason to throw out older active systems. They become boat anchors in 3-5 years, slowing down the overall speed of results, slowing down the search for cures for diseases that people die from each and every day. One of those diseases killed my father last year. Another I survived, with extreme treatments, just this year. In the US, 1 in 4 people die from cancer. If you have a sibling and both parents, which one will it be? Both grandparents? You get 2 guesses...
PG has discussed adaptive sampling, but not for slower systems. They want to do it with the fastest systems to get longer trajectories very quickly. It's like scouting the trail ahead to see if they are heading in the right direction to find the best answers. And if not, they can change their direction sooner, making it a shorter path, or at least a more direct path to the final answers. The faster the better.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
And would that adaptive sampling be bigadv-only, or open to anybody with, say, an i7 and up?
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
As with any potential change such as adaptive sampling (if its value for improving scientific throughput can be demonstrated), don't expect an announcement until it is ready for roll-out. Until that time, if it ever happens, details about which projects would be included are unknown, as that's likely to depend on the preliminary test data they'd have at the time.
At the present time, the servers know very little about your hardware, basing their decisions primarily on the core count (threads or CPUs) reported by your OS. That is admittedly a rather poor predictor of speed, except perhaps for uniprocessor systems.
See Upcoming changes to bigadv threshold
kasson wrote: We also recognize that core count is not the most robust metric of machine capability, but given our current infrastructure it is the most straightforward surrogate to evaluate.
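To illustrate what "core count as the surrogate" looks like in practice (a made-up sketch with invented project names and thresholds, not the real assignment logic):

# Invented sketch of a core-count gate; not actual F@H assignment-server code.
def eligible_projects(reported_cores, catalog):
    """catalog: list of (project_name, min_cores); core count is the only client info used."""
    return [name for name, min_cores in catalog if reported_cores >= min_cores]

# Made-up project requirements: a 2C/4T Atom reporting 4 threads qualifies for regular SMP
# on paper, even though its actual per-core speed is far below an i7's.
catalog = [("small-cpu", 1), ("regular-smp", 4), ("bigadv-style", 16)]
print(eligible_projects(reported_cores=4, catalog=catalog))   # ['small-cpu', 'regular-smp']
print(eligible_projects(reported_cores=1, catalog=catalog))   # ['small-cpu']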
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Site Moderator
- Posts: 6986
- Joined: Wed Dec 23, 2009 9:33 am
- Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB
Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400 - Location: Land Of The Long White Cloud
- Contact:
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
netblazer wrote: ...There's just no reason to throw out those older ACTIVE systems...
You are correct. We tend to support old hardware for as long as possible, and there are two ways. The first is your suggestion of breaking up the larger projects into smaller ones. The second is to have projects targeted at these older/smaller systems from the beginning. The second method is what PG is currently using, and so far it has worked very well. You can see that your CPU is still being assigned valid WUs which you can finish before the Preferred Deadline. Occasionally there can be a server glitch or an incorrectly configured server, which may result in your system getting WUs not designed to run on it. That has happened and has been rectified, as it was here. Moreover, the benchmark machine is a dedicated system, so its results will never match those of a non-dedicated system. AFAIK, CPUs without SSE2 support are no longer getting WUs, since FahCore_78 hasn't been assigned since roughly August. We're still waiting to hear whether it will be fixed or officially deprecated.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time
Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
Correct. PG does try to support clients and hardware for as long as possible. Active systems can run as long as they can... but not for 20 years, not past their usable life. Performance and efficiency increase too quickly. At some point, even with active systems, it's cheaper to replace them with more efficient hardware just for the energy savings.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
abdulwahidc wrote: @netblazer I've changed the project mode so you shouldn't be assigned WUs from 7085 (and also prevent others from running into the same issue).
Same problem, different project:
project:7083 run:0 clone:494 gen:26 core:0xa4 unit:0x000000e60001329c4fe0eac412bf4aae
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
netblazer wrote:
abdulwahidc wrote: @netblazer I've changed the project mode so you shouldn't be assigned WUs from 7085 (and also prevent others from running into the same issue).
Same problem, different project:
project:7083 run:0 clone:494 gen:26 core:0xa4 unit:0x000000e60001329c4fe0eac412bf4aae
And 7084 too while you're at it. They seem to have the exact same demand on CPU power.
Re: Project: 7085 (Run 0, Clone 695, Gen 16)
You reported CeleronM CPU 560 @ 2.13GHz. If you exclude whatever downtime you may typically experience (including SLEEP/HIBERNATE time) and any time when you're running some other program which puts heavy demands on the CPU such as video encoding, other DC projects, etc., over the course of a week, how many hours can FAH expect to find your computer essentially idle except for FAH? Does your computer run a pretty screensaver or something like "dark screen"?
Posting FAH's log:
How to provide enough info to get helpful support.