unstable machine
Moderators: Site Moderators, FAHC Science Team
unstable machine
I have been on FAH for about a month. I estimate that about 50% of my WUs have failed with the log reporting "unstable machine". The failure usually comes after 2-3 days of running. I am not running any other programs, just surfing the net, and I avoid sites where problems might be expected. I do run adaware. Windows 10, Intel Pentium, 3 GHz, FAH build 7.4.4. Any help would be appreciated.
Re: unstable machine
Any OC?
Info about your hardware?
-
- Site Moderator
- Posts: 6359
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
Re: unstable machine
Please post more detail from your log file (How to ...).
We also need more details about your hardware: laptop? desktop? cooling?
Re: unstable machine
I am not sure how to check the temperature, but dropping from full to medium a few weeks ago seemed to help. I am running a desktop machine. Not sure what an "OC" is. I had cut info from my log file around the "unstable machine" crash, but a thunderstorm a few minutes ago caused a shutdown and now that info is gone. Is there a way to back up or scroll up in the log file?
23:59:13: CPU: Intel(R) Pentium(R) D CPU 3.00GHz
23:59:13: CPU ID: GenuineIntel Family 15 Model 6 Stepping 5
23:59:13: CPUs: 2
23:59:13: Memory: 2.99GiB
23:59:13: Free Memory: 2.22GiB
23:59:13: Threads: WINDOWS_THREADS
23:59:13: OS Version: 6.2
23:59:13: Has Battery: false
23:59:13: On Battery: false
23:59:13: UTC Offset: -5
23:59:13: PID: 5128
23:59:13: CWD: C:/Users/Greg/AppData/Roaming/FAHClient
23:59:13: OS: Windows 10 Pro
23:59:13: OS Arch: X86
23:59:13: GPUs: 0
23:59:13: CUDA: Not detected
23:59:13:Win32 Service: false
-
- Site Admin
- Posts: 7937
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
- Location: W. MA
Re: unstable machine
By default the folding client keeps the last 16 log files in a folder inside the location shown as the CWD in the section of log file you posted. Each archived file is named like log.txt, with a time stamp added to the file name.
OC is shorthand for overclock.
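For anyone who wants to search those archived logs rather than scroll through them, here is a minimal Python sketch. It assumes the archived files sit somewhere under the CWD folder and match the pattern log*.txt; that pattern and the function name find_unstable_lines are my own assumptions, so adjust them to what you actually see on disk.

```python
from pathlib import Path


def find_unstable_lines(data_dir, marker="UNSTABLE_MACHINE"):
    """Return (file name, line) pairs for every log line containing marker."""
    hits = []
    # Look in the data directory and all of its subfolders.
    # The "log*.txt" pattern is an assumption about how the files are named.
    for path in sorted(Path(data_dir).rglob("log*.txt")):
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if marker in line:
                    hits.append((path.name, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Path copied from the CWD line in the posted log; change it for your machine.
    for name, line in find_unstable_lines("C:/Users/Greg/AppData/Roaming/FAHClient"):
        print(name, line)
```

Each hit is printed with the name of the file it came from, so you can open that file and copy the lines around the failure.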
iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Re: unstable machine
This may be more info than you need.
Mod edit: added Code tags to log file
00:36:54:WU00:FS00:FahCore 0xa4 started
00:36:54:WU00:FS00:0xa4:
00:36:54:WU00:FS00:0xa4:*------------------------------*
00:36:54:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
00:36:54:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
00:36:54:WU00:FS00:0xa4:
00:36:54:WU00:FS00:0xa4:Preparing to commence simulation
00:36:54:WU00:FS00:0xa4:- Looking at optimizations...
00:36:54:WU00:FS00:0xa4:- Created dyn
00:36:54:WU00:FS00:0xa4:- Files status OK
00:36:54:WU00:FS00:0xa4:- Expanded 826023 -> 1402860 (decompressed 169.8 percent)
00:36:54:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=826023 data_size=1402860, decompressed_data_size=1402860 diff=0
00:36:54:WU00:FS00:0xa4:- Digital signature verified
00:36:54:WU00:FS00:0xa4:
00:36:54:WU00:FS00:0xa4:Project: 9038 (Run 200, Clone 0, Gen 151)
00:36:54:WU00:FS00:0xa4:
00:36:54:WU00:FS00:0xa4:Assembly optimizations on if available.
00:36:55:WU00:FS00:0xa4:Entering M.D.
00:37:00:WU00:FS00:0xa4:Mapping NT from 1 to 1
00:37:01:WU00:FS00:0xa4:Completed 0 out of 250000 steps (0%)
00:55:15:WU00:FS00:0xa4:Completed 2500 out of 250000 steps (1%)
01:12:57:WU00:FS00:0xa4:Completed 5000 out of 250000 steps (2%)
01:30:37:WU00:FS00:0xa4:Completed 7500 out of 250000 steps (3%)
01:48:18:WU00:FS00:0xa4:Completed 10000 out of 250000 steps (4%)
02:06:04:WU00:FS00:0xa4:Completed 12500 out of 250000 steps (5%)
02:23:44:WU00:FS00:0xa4:Completed 15000 out of 250000 steps (6%)
02:41:27:WU00:FS00:0xa4:Completed 17500 out of 250000 steps (7%)
02:59:09:WU00:FS00:0xa4:Completed 20000 out of 250000 steps (8%)
03:00:31:WU00:FS00:0xa4:mdrun returned 255
03:00:31:WU00:FS00:0xa4:Going to send back what have done -- stepsTotalG=250000
03:00:31:WU00:FS00:0xa4:Work fraction=0.0808 steps=250000.
03:00:35:WU00:FS00:0xa4:logfile size=10718 infoLength=10718 edr=0 trr=25
03:00:35:WU00:FS00:0xa4:logfile size: 10718 info=10718 bed=0 hdr=25
03:00:35:WU00:FS00:0xa4:- Writing 11256 bytes of core data to disk...
03:00:35:WU00:FS00:0xa4:Done: 10744 -> 3778 (compressed to 35.1 percent)
03:00:35:WU00:FS00:0xa4: ... Done.
03:00:35:WU00:FS00:0xa4:
03:00:35:WU00:FS00:0xa4:Folding@home Core Shutdown: UNSTABLE_MACHINE
03:00:36:WARNING:WU00:FS00:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)
03:00:36:WU00:FS00:Sending unit results: id:00 state:SEND error:FAULTY project:9038 run:200 clone:0 gen:151 core:0xa4 unit:0x000000b4ab436c9e56982a13e24dde1d
03:00:36:WU00:FS00:Uploading 4.19KiB to 171.67.108.158
03:00:36:WU00:FS00:Connecting to 171.67.108.158:8080
03:00:36:WU00:FS00:Upload complete
03:00:36:WU00:FS00:Server responded WORK_ACK (400)
03:00:36:WU00:FS00:Cleaning up
03:00:36:WU01:FS00:Connecting to 171.67.108.45:8080
03:00:37:WU01:FS00:Assigned to work server 171.67.108.158
03:00:37:WU01:FS00:Requesting new work unit for slot 00: READY cpu:1 from 171.67.108.158
03:00:37:WU01:FS00:Connecting to 171.67.108.158:8080
03:00:38:WU01:FS00:Downloading 806.85KiB
03:00:44:WU01:FS00:Download 79.32%
03:00:45:WU01:FS00:Download complete
03:00:45:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9034 run:73 clone:0 gen:152 core:0xa4 unit:0x000000b5ab436c9e5698307fdc3fa5c6
03:00:46:WU01:FS00:Starting
03:00:46:WU01:FS00:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:/Users/Greg/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/x86/Core_a4.fah/FahCore_a4.exe -dir 01 -suffix 01 -version 704 -lifeline 5128 -checkpoint 15
03:00:46:WU01:FS00:Started FahCore on PID 3780
03:00:46:WU01:FS00:Core PID:4392
03:00:46:WU01:FS00:FahCore 0xa4 started
03:00:46:WU01:FS00:0xa4:
03:00:46:WU01:FS00:0xa4:*------------------------------*
03:00:46:WU01:FS00:0xa4:Folding@Home Gromacs GB Core
03:00:46:WU01:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
03:00:46:WU01:FS00:0xa4:
03:00:46:WU01:FS00:0xa4:Preparing to commence simulation
03:00:46:WU01:FS00:0xa4:- Looking at optimizations...
03:00:46:WU01:FS00:0xa4:- Created dyn
03:00:46:WU01:FS00:0xa4:- Files status OK
03:00:46:WU01:FS00:0xa4:- Expanded 825698 -> 1401112 (decompressed 169.6 percent)
03:00:46:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=825698 data_size=1401112, decompressed_data_size=1401112 diff=0
03:00:46:WU01:FS00:0xa4:- Digital signature verified
03:00:46:WU01:FS00:0xa4:
03:00:46:WU01:FS00:0xa4:Project: 9034 (Run 73, Clone 0, Gen 152)
03:00:47:WU01:FS00:0xa4:
03:00:47:WU01:FS00:0xa4:Assembly optimizations on if available.
03:00:47:WU01:FS00:0xa4:Entering M.D.
03:00:52:WU01:FS00:0xa4:Mapping NT from 1 to 1
03:00:53:WU01:FS00:0xa4:Completed 0 out of 250000 steps (0%)
03:18:36:WU01:FS00:0xa4:Completed 2500 out of 250000 steps (1%)
03:36:19:WU01:FS00:0xa4:Completed 5000 out of 250000 steps (2%)
03:54:02:WU01:FS00:0xa4:Completed 7500 out of 250000 steps (3%)
04:11:43:WU01:FS00:0xa4:Completed 10000 out of 250000 steps (4%)
04:29:27:WU01:FS00:0xa4:Completed 12500 out of 250000 steps (5%)
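As a rough way to read the progress lines in the log above, the Python sketch below averages the time between the first and last "Completed ... steps" line of one work unit. The function name seconds_per_percent and the regular expression are illustrative assumptions, not part of the FAH client. The log time stamps carry no date, so the sketch only tolerates a single rollover past midnight.

```python
import re

# Matches lines such as:
# 00:55:15:WU00:FS00:0xa4:Completed 2500 out of 250000 steps (1%)
PROGRESS = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}):WU(\d+):FS\d+:0x\w+:"
    r"Completed \d+ out of \d+ steps \((\d+)%\)"
)


def seconds_per_percent(lines, wu="00"):
    """Average seconds per 1% of progress for one work unit, or None."""
    points = []
    for line in lines:
        m = PROGRESS.match(line)
        if m and m.group(4) == wu:
            hours, minutes, seconds, _, percent = m.groups()
            clock = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
            points.append((clock, int(percent)))
    if len(points) < 2:
        return None
    (t0, p0), (t1, p1) = points[0], points[-1]
    if p1 == p0:
        return None
    # Time stamps have no date, so wrap once if the run crossed midnight.
    elapsed = (t1 - t0) % 86400
    return elapsed / (p1 - p0)


if __name__ == "__main__":
    sample = [
        "00:37:01:WU00:FS00:0xa4:Completed 0 out of 250000 steps (0%)",
        "02:59:09:WU00:FS00:0xa4:Completed 20000 out of 250000 steps (8%)",
    ]
    rate = seconds_per_percent(sample)
    print(f"{rate:.0f} s per 1%, about {rate * 100 / 3600:.1f} h per work unit")
```

For WU00 in the log above this works out to 1066 seconds per 1%, or roughly 29.6 hours for a full work unit at that pace.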
-
- Posts: 1164
- Joined: Wed Apr 01, 2009 9:22 pm
- Hardware configuration: Asus Z8NA D6C, 2 [email protected] Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)
Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS
Not currently folding
Asus Z9PE- D8 WS, 2 [email protected] Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
- Location: Jersey, Channel Islands
Re: unstable machine
I think I've spotted the problem: all units these days are designed to run on 2 or more CPU cores, but your WUs are trying to run on just 1 core, and I think that is the issue. The only way to solve it is to set the slider back to full and make sure it's using 2 cores. Since my rigs are currently off, I can't check whether moving the slider back to full is enough.
If you are having problems running on 2 cores, I would suggest either memory that is starting to fail or a heat issue; it may be worth opening the case and cleaning out the heatsink on the CPU.
As for OC, it means overclock: setting your CPU, graphics card or RAM to run at a faster speed than it was designed for.
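To check the core count from the log rather than by eye, a small Python sketch like the one below compares the "CPUs:" line in the system info with the "READY cpu:N" part of the "Requesting new work unit" lines. Reading cpu:N as the number of cores the slot asks for is my interpretation of the log, and the function name cores_detected_and_requested is made up for illustration.

```python
import re


def cores_detected_and_requested(lines):
    """Return (detected CPU count, list of cpu:N values from work requests)."""
    detected = None
    requested = []
    for line in lines:
        # System info line, e.g. "23:59:13: CPUs: 2"
        m = re.search(r"\bCPUs:\s*(\d+)", line)
        if m:
            detected = int(m.group(1))
        # Work request line, e.g. "... for slot 00: READY cpu:1 from ..."
        m = re.search(r"READY cpu:(\d+)", line)
        if m:
            requested.append(int(m.group(1)))
    return detected, requested


if __name__ == "__main__":
    sample = [
        "23:59:13: CPUs: 2",
        "03:00:37:WU01:FS00:Requesting new work unit for slot 00: "
        "READY cpu:1 from 171.67.108.158",
    ]
    detected, requested = cores_detected_and_requested(sample)
    print("detected:", detected, "requested:", requested)
```

With the two lines from the posted log, this reports 2 detected CPUs and a request for 1, which matches the "medium" setting mentioned earlier in the thread.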
Re: unstable machine
Also, you have a Pentium D, which is known to generate a lot of heat, so make sure you've cleaned the dust from your fans and heatsinks.