FahMon (multi-platform app to monitor various F@h clients)
Moderator: Site Moderators
-
- Site Admin
- Posts: 1288
- Joined: Fri Nov 30, 2007 9:37 am
- Location: Oxfordshire, UK
Re: FahMon (multi-platform app to monitor various F@h clients)
Gah. I'll check the rounding logic again then.
-
- Posts: 8
- Joined: Sun Dec 02, 2007 7:25 pm
- Location: Bordeaux, France
Re: FahMon (multi-platform app to monitor various F@h clients)
Farmer 4 Monster Folding since 2008...
Reviewer 4 FAH-@ddict
-
- Posts: 704
- Joined: Tue Dec 04, 2007 6:56 am
- Hardware configuration: Ryzen 7 5700G, 22.40.46 VGA driver; 32GB G-Skill Trident DDR4-3200; Samsung 860EVO 1TB Boot SSD; VelociRaptor 1TB; MSI GTX 1050ti, 551.23 studio driver; BeQuiet FM 550 PSU; Lian Li PC-9F; Win11Pro-64, F@H 8.3.5.
[Suspended] Ryzen 7 3700X, MSI X570MPG, 32GB G-Skill Trident Z DDR4-3600; Corsair MP600 M.2 PCIe Gen4 Boot, Samsung 840EVO-250 SSDs; VelociRaptor 1TB, Raptor 150; MSI GTX 1050ti, 526.98 driver; Kingwin Stryker 500 PSU; Lian Li PC-K7B. Win10Pro-64, F@H 8.3.5.
- Location: @Home
Re: FahMon (multi-platform app to monitor various F@h clients)
Revert to v2.3.2
Ryzen 7 5700G, 22.40.46 VGA driver; MSI GTX 1050ti, 551.23 studio driver
Ryzen 7 3700X; MSI GTX 1050ti, 551.23 studio driver [Suspended]
Re: FahMon (multi-platform app to monitor various F@h clients)
uncle_fungus wrote: For those of you seeing massive CPU load while reloading, can you give me a ballpark figure for the size of your FAHlogs please. I don't think this is the cause of the load but it might be contributing if the files are large.
I did a little more testing on the 99% CPU problem on 2.3.4 under Windows, and I can confirm that the logs are definitely the source of the problem for me. Long CPU spikes only happen when FahMon tries to reload a client with a large log file. Logs under 100K are loaded quickly with no long CPU spike, but every time it reloads a client with a large FAHlog.txt file, it freezes my single-core Pentium M laptop for about 5-10 seconds with 99% CPU usage.
The large log problem is most noticeable on my GPU clients, since they can create large log files relatively quickly. I just rebooted two of them, thereby creating new FAHlog.txt files, and FahMon reloaded them right away with no long CPU spike. The third, with a FAHlog.txt file of 500K, still causes the problem.
So for now, the solution seems to be stopping and restarting my clients if their logs get too big.
-
- Posts: 2948
- Joined: Sun Dec 02, 2007 4:36 am
- Hardware configuration: Machine #1:
Intel Q9450; 2x2GB=8GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460; Windows Server 2008 X64 (SP1).
Machine #2:
Intel Q6600; 2x2GB=4GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460 video card; Windows 7 X64.
Machine 3:
Dell Dimension 8400, 3.2GHz P4 4x512GB Ram, Video card GTX 460, Windows 7 X32
I am currently folding just on the 5x GTX 460's for approx. 70K PPD
- Location: Salem. OR USA
Re: FahMon (multi-platform app to monitor various F@h clients)
My solution to the problem has been to only load FahMon when I want to look at it, rather than keep it running continuously. The result is that whatever inefficiency occurs will only be there for a few minutes rather than 24x7.
-
- Posts: 1024
- Joined: Sun Dec 02, 2007 12:43 pm
Re: FahMon (multi-platform app to monitor various F@h clients)
Maybe FahMon could keep track of the number of records in each FAHlog. It's probably less CPU intensive to skip to record N than to read them all sequentially (even over a LAN connection). There would need to be logic to read the whole file if this method fails (like when the file is replaced or if scrolling backwards).
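A minimal sketch of that idea, in Python rather than FahMon's own code. It swaps the record count for a byte offset, since seeking to a byte position avoids re-reading the earlier lines at all. The `LogTail` class and its method name are made up for illustration, and the only "file was replaced" check here is that the file got smaller, so a replacement log that is already larger than the old one would be missed.

```python
import os


class LogTail:
    """Hypothetical helper: remember how far a log was read, return only new lines."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # byte position just past the last complete line already read

    def read_new_lines(self):
        # If the file is now smaller than our saved offset, assume it was
        # replaced or truncated and fall back to reading it from the start.
        if os.path.getsize(self.path) < self.offset:
            self.offset = 0

        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()

        # Only consume complete lines; a partially written last line is
        # left in place and picked up on the next call.
        end = data.rfind(b"\n") + 1
        self.offset += end
        return data[:end].decode("utf-8", errors="replace").splitlines()
```

Calling `read_new_lines()` on each refresh would then cost roughly the size of what was appended since the last refresh, not the size of the whole FAHlog.txt.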
-
- Site Admin
- Posts: 1288
- Joined: Fri Nov 30, 2007 9:37 am
- Location: Oxfordshire, UK
Re: FahMon (multi-platform app to monitor various F@h clients)
bollix47 wrote: Sorry to say, but there was no difference in the projections. Still as described above, except the current WU has 250001 steps.
I've just altered the logic again and it now works for me.
Re: FahMon (multi-platform app to monitor various F@h clients)
I don't currently have a WU with 250001 steps but do have one with 249999 steps and that one now appears to be working correctly.
The only message(s) that have an X beside them are:
Code: Select all
[24/12/08 - 15:54:03] ! The progress value in file \\ATLANTISVM01\folding\unitinfo.txt could not be found/parsed
[24/12/08 - 15:54:03] X Error while reading \\ATLANTISVM01\folding\unitinfo.txt!
but I get that on all my clients.
Thanks for the Christmas gift and for all the work that you do.
Merry Christmas
-
- Site Admin
- Posts: 1288
- Joined: Fri Nov 30, 2007 9:37 am
- Location: Oxfordshire, UK
Re: FahMon (multi-platform app to monitor various F@h clients)
I've just uploaded another minor change that will also account for the 249999 case as well as the 250001 case (the calculated frame count would end up as 66 instead of 100 without this fix).
I'm working on the unitinfo error message above, as I can replicate that here too.
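For anyone wondering why step counts like 249999 and 250001 are awkward, here is a rough Python sketch. It is not FahMon's actual logic, and it does not reproduce the 66-frame result mentioned above, whose cause isn't described in this thread. It assumes the frame count is estimated from the total step count and the step gap between two progress lines (2500 is an assumed value), and the function names are made up. Rounding to the nearest whole frame handles both cases, while truncating or rounding up each gets one of them wrong.

```python
def frames_truncated(total_steps, steps_per_frame):
    # Plain integer division: drops any remainder.
    return total_steps // steps_per_frame


def frames_ceiling(total_steps, steps_per_frame):
    # Always rounds up when there is a remainder.
    return -(-total_steps // steps_per_frame)


def frames_nearest(total_steps, steps_per_frame):
    # Rounds to the nearest whole frame.
    return (total_steps + steps_per_frame // 2) // steps_per_frame


for total in (249999, 250000, 250001):
    print(total,
          frames_truncated(total, 2500),
          frames_ceiling(total, 2500),
          frames_nearest(total, 2500))
```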
Re: FahMon (multi-platform app to monitor various F@h clients)
Compiled program again for 'another minor change' and can see that the projection for the WU with 249999 steps is now correct. The previous one was off by about an hour (was at 90%) but now appears to be spot on.
OT a bit but what do I need in Windows to do the svn? I've been doing it on a Linux setup and copying the files over to my windows box.
-
- Site Admin
- Posts: 1288
- Joined: Fri Nov 30, 2007 9:37 am
- Location: Oxfordshire, UK
Re: FahMon (multi-platform app to monitor various F@h clients)
TortoiseSVN http://tortoisesvn.tigris.org/ (integrates with Explorer) or Slik SVN http://www.sliksvn.com/en/download (plain console version)
-
- Posts: 53
- Joined: Fri Feb 08, 2008 4:24 pm
- Hardware configuration: 2 x X5550 Xeons - SuperMicro MBD-X8DAi-O
Server 2008 R2 x64 - 12GB Crucial DDR3 ECC Ram
PCP&C 910 Silencer - 1 x HIS 4850 ICEQ Turbo Edition
6 x E5530 Xeons (3 Systems) - SUPERMICRO MBD-X8DTL-i-O
Server 2008 RS x64 - 8GB DDR3 GSkill Non-ECC Ram
Seasonic 80+ Bronze 380w PSU
2 x E5504 - SUPERMICRO MBD-X8DTL-i-O
Server 2008 R2 x64 - 6GB DDR3 GSkill Non-ECC Ram
2.3 TB Raid 5 Array - Corsair 520 Power Supply
E5504 - EVGA X58 ATX Motherboard
Windows 7 x64 - 6GB DDR3 GSkill Non-ECC Ram
Seasonic 300 Power Supply
Intel X5550 CPU - EVGA X58 Micro ATX Motherboard
Windows 7 x64 - 3GB Corsair DDR3-1600
Corsair 550 Power Supply - ATI 4350
Dell Vostro 1500 Laptop - Intel T9300 C2D CPU
Windows 7 x64 - 4 GB DDR2-6400 - nVidia 8400m GS
Xeon 3075 C2D - Intel P35 Motherboard - 4GB DDR2 Non-ECC Ram
Server 2008 R2 x64- Seasonic 300 Power Supply
- Location: Columbia, Tennessee
Re: FahMon (multi-platform app to monitor various F@h clients)
uncle_fungus wrote: For those of you seeing massive CPU load while reloading, can you give me a ballpark figure for the size of your FAHlogs please. I don't think this is the cause of the load but it might be contributing if the files are large.
Hyperlife wrote: I did a little more testing on the 99% CPU problem on 2.3.4 under Windows, and I can confirm that the logs are definitely the source of the problem for me. Long CPU spikes only happen when FahMon tries to reload a client with a large log file. Logs under 100K are loaded quickly with no long CPU spike, but every time it reloads a client with a large FAHlog.txt file, it freezes my single-core Pentium M laptop for about 5-10 seconds with 99% CPU usage.
I was checking my F@H logs from my 10 x ATI 4850 and many were 300K to over 600K.
It does take FahMon a good while to load each client.
Some are saying go back to 2.3.3 or 2.3.2, but if the logs are the problem, shouldn't it show up there as well?
-
- Posts: 94
- Joined: Thu Nov 13, 2008 4:18 pm
- Hardware configuration: q6600 @ 3.3Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Athlon x2 6000+ @ 3.0Ghz ubuntu 8.04 smp + asus 9600GSO gpu2 in wine wrapper
5600X2 @ 3.19Ghz ubuntu 8.04 smp + asus 9600GSO gpu2 in wine wrapper
E5200 @ 3.7Ghz ubuntu 8.04 smp2 + asus 9600GT silent gpu2 in wine wrapper
E5200 @ 3.65Ghz ubuntu 8.04 smp2 + asus 9600GSO gpu2 in wine wrapper
E6550 vmware ubuntu 8.4.1
q8400 @ 3.3Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
Athlon II 620 @ 2.6 Ghz windows xp-sp3 one SMP2 (2.15 core) + 1 9800GT native GPU2
- Location: Calgary, Canada
Re: FahMon (multi-platform app to monitor various F@h clients)
This morning I started on a WU which caused FahMon to bring my [email protected] to its knees.
The size of the unitinfo is a humongous 165M:
-rw-r--r-- 1 kerekei kerekei 165M 2008-12-30 10:38 unitinfo.txt
the WU is:
Code: Select all
[13:47:29] + News From Folding@Home: Welcome to Folding@Home
[13:47:30] Loaded queue successfully.
[13:47:30] Connecting to http://171.64.65.56:8080/
[13:47:35] Posted data.
[13:47:35] Initial: 0000; - Receiving payload (expected size: 4834396)
[13:47:45] - Downloaded at ~472 kB/s
[13:47:45] - Averaged speed for that direction ~409 kB/s
[13:47:45] + Received work.
[13:47:45] Trying to send all finished work units
[13:47:45] + No unsent completed units remaining.
[13:47:45] + Closed connections
[13:47:45]
[13:47:45] + Processing work unit
[13:47:45] At least 4 processors must be requested.Core required: FahCore_a2.exe
[13:47:45] Core found.
[13:47:45] Working on queue slot 04 [December 30 13:47:45 UTC]
[13:47:45] + Working ...
[13:47:45] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 04 -checkpoint 15 -forceasm -verbose -lifeline 6668 -version 623'
[13:47:45]
[13:47:45] *------------------------------*
[13:47:45] Folding@Home Gromacs SMP Core
[13:47:45] Version 2.01 (Wed Aug 13 13:11:25 PDT 2008)
[13:47:45]
[13:47:45] Preparing to commence simulation
[13:47:45] - Ensuring status. Please wait.
[13:47:54] - Assembly optimizations manually forced on.
[13:47:54] - Not checking prior termination.
[13:47:55] - Expanded 4833884 -> 23977801 (decompressed 496.0 percent)
[13:47:55] Called DecompressByteArray: compressed_data_size=4833884 data_size=23977801, decompressed_data_size=23977801 diff=0
[13:47:56] - Digital signature verified
[13:47:56]
[13:47:56] Project: 2669 (Run 7, Clone 17, Gen 41)
[13:47:56]
[13:47:56] Assembly optimizations on if available.
[13:47:56] Entering M.D.
[13:56:15] Completed 2509 out of 249999 steps (1%)
[14:04:35] Completed 5009 out of 249999 steps (2%)
[14:12:54] Completed 7509 out of 249999 steps (3%)
[14:20:50] Completed 10009 out of 249999 steps (4%)
[14:28:45] Completed 12509 out of 249999 steps (5%)
[14:36:41] Completed 15009 out of 249999 steps (6%)
[14:44:36] Completed 17509 out of 249999 steps (7%)
[14:52:31] Completed 20009 out of 249999 steps (8%)
[15:00:28] Completed 22509 out of 249999 steps (9%)
[15:08:23] Completed 25009 out of 249999 steps (10%)
[15:16:18] Completed 27509 out of 249999 steps (11%)
[15:24:14] Completed 30009 out of 249999 steps (12%)
[15:32:09] Completed 32509 out of 249999 steps (13%)
-
- Site Admin
- Posts: 1288
- Joined: Fri Nov 30, 2007 9:37 am
- Location: Oxfordshire, UK
Re: FahMon (multi-platform app to monitor various F@h clients)
Yes, I know. You'll need to build FahMon from svn at the moment as the current release versions will try and parse the entire unitinfo file (which is bad for obvious reasons). SVN now only loads the first 512 bytes (if possible) of the file which fixes this problem.
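As a rough illustration of the "first 512 bytes" approach (a Python sketch, not the actual SVN change): read a bounded chunk from the start of the file and look for the progress value only in that chunk. The function names are made up, and the `Progress: NN%` pattern is an assumption about the unitinfo.txt layout, so adjust it if your files differ.

```python
import re


def read_head(path, limit=512):
    """Return at most `limit` bytes from the start of the file, decoded as text."""
    with open(path, "rb") as f:
        return f.read(limit).decode("utf-8", errors="replace")


def parse_progress(head):
    """Return the progress percentage found in `head`, or None if absent.

    Assumes a line shaped like 'Progress: 13%' (hypothetical format).
    """
    match = re.search(r"Progress:\s*(\d+)\s*%", head)
    return int(match.group(1)) if match else None
```

With this, a 165M unitinfo.txt costs the same to check as a normal one, because only the first 512 bytes are ever read.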
-
- Posts: 8
- Joined: Sun Dec 02, 2007 7:25 pm
- Location: Bordeaux, France
Re: FahMon (multi-platform app to monitor various F@h clients)
The SVN build now solves all my problems. Thx for your work uncle_fungus
Farmer 4 Monster Folding since 2008...
Reviewer 4 FAH-@ddict