Crashes with project 13850

Posted: Tue May 26, 2020 3:04 am
by tomc001
Just noticed this crash on a machine that has been running with no problems for a long time. It crashed several times in a row, all with project 13850.

Now it has started on project 14216 and seems to be running OK.

Code:

02:50:42:WU00:FS00:0xa7:*********************** Log Started 2020-05-26T02:50:42Z ***********************
02:50:42:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
02:50:42:WU00:FS00:0xa7:       Type: 0xa7
02:50:42:WU00:FS00:0xa7:       Core: Gromacs
02:50:42:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 10136 -checkpoint 15 -np
02:50:42:WU00:FS00:0xa7:             11
02:50:42:WU00:FS00:0xa7:************************************ CBang *************************************
02:50:42:WU00:FS00:0xa7:       Date: Oct 26 2019
02:50:42:WU00:FS00:0xa7:       Time: 01:38:25
02:50:42:WU00:FS00:0xa7:   Revision: c46a1a011a24143739ac7218c5a435f66777f62f
02:50:42:WU00:FS00:0xa7:     Branch: master
02:50:42:WU00:FS00:0xa7:   Compiler: Visual C++ 2008
02:50:42:WU00:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
02:50:42:WU00:FS00:0xa7:   Platform: win32 10
02:50:42:WU00:FS00:0xa7:       Bits: 64
02:50:42:WU00:FS00:0xa7:       Mode: Release
02:50:42:WU00:FS00:0xa7:************************************ System ************************************
02:50:42:WU00:FS00:0xa7:        CPU: Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz
02:50:42:WU00:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
02:50:42:WU00:FS00:0xa7:       CPUs: 12
02:50:42:WU00:FS00:0xa7:     Memory: 31.90GiB
02:50:42:WU00:FS00:0xa7:Free Memory: 22.77GiB
02:50:42:WU00:FS00:0xa7:    Threads: WINDOWS_THREADS
02:50:42:WU00:FS00:0xa7: OS Version: 6.1
02:50:42:WU00:FS00:0xa7:Has Battery: false
02:50:42:WU00:FS00:0xa7: On Battery: false
02:50:42:WU00:FS00:0xa7: UTC Offset: -7
02:50:42:WU00:FS00:0xa7:        PID: 8100
02:50:42:WU00:FS00:0xa7:        CWD: C:\Users\tomc\AppData\Roaming\FAHClient\work
02:50:42:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
02:50:42:WU00:FS00:0xa7:    Version: 0.0.18
02:50:42:WU00:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
02:50:42:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
02:50:42:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
02:50:42:WU00:FS00:0xa7:       Date: Oct 26 2019
02:50:42:WU00:FS00:0xa7:       Time: 01:52:30
02:50:42:WU00:FS00:0xa7:   Revision: c1e3513b1bc0c16013668f2173ee969e5995b38e
02:50:42:WU00:FS00:0xa7:     Branch: master
02:50:42:WU00:FS00:0xa7:   Compiler: Visual C++ 2008
02:50:42:WU00:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
02:50:42:WU00:FS00:0xa7:   Platform: win32 10
02:50:42:WU00:FS00:0xa7:       Bits: 64
02:50:42:WU00:FS00:0xa7:       Mode: Release
02:50:42:WU00:FS00:0xa7:************************************ Build *************************************
02:50:42:WU00:FS00:0xa7:       SIMD: avx_256
02:50:42:WU00:FS00:0xa7:********************************************************************************
02:50:42:WU00:FS00:0xa7:Project: 13850 (Run 0, Clone 30493, Gen 120)
02:50:42:WU00:FS00:0xa7:Unit: 0x00000094287234c95e788cb884e70c88
02:50:42:WU00:FS00:0xa7:Digital signatures verified
02:50:42:WU00:FS00:0xa7:Reducing thread count from 11 to 10 to avoid domain decomposition by a prime number > 3
02:50:42:WU00:FS00:0xa7:Calling: mdrun -s frame120.tpr -o frame120.trr -x frame120.xtc -e frame120.edr -cpi state.cpt -cpt 15 -nt 10
02:50:42:WU00:FS00:0xa7:ERROR:Guru Meditation #49221ba0906028ef.25df40a339ab1a3d (1268040.1349128) '00/01/frame120.xtc'
02:50:42:WU00:FS00:0xa7:WARNING:Unexpected exit() call
02:50:42:WU00:FS00:0xa7:WARNING:Unexpected exit from science code
02:50:42:WU00:FS00:0xa7:Saving result file ..\logfile_01.txt
02:50:42:WU00:FS00:0xa7:Saving result file frame120.edr
02:50:42:WU00:FS00:0xa7:Saving result file frame120.trr
02:50:42:WU00:FS00:0xa7:Saving result file frame120.xtc
02:50:42:WU00:FS00:0xa7:ERROR:Guru Meditation #49221ba0906028ef.25df40a339ab1a3d (1268040.1349128) '00/01/frame120.xtc'
02:50:45:WARNING:WU00:FS00:FahCore returned an unknown error code which probably indicates that it crashed
02:50:45:WARNING:WU00:FS00:FahCore returned: UNKNOWN_ENUM (-1073740777 = 0xc0000417)
02:50:45:WARNING:WU00:FS00:Too many errors, failing
02:50:45:WU00:FS00:Sending unit results: id:00 state:SEND error:FAILED project:13850 run:0 clone:30493 gen:120 core:0xa7 unit:0x00000094287234c95e788cb884e70c88
02:50:45:WU00:FS00:Uploading 4.50KiB to 40.114.52.201
02:50:45:WU00:FS00:Connecting to 40.114.52.201:8080
02:50:45:WU01:FS00:Connecting to 65.254.110.245:8080
02:50:45:WU00:FS00:Upload complete
02:50:46:WU00:FS00:Server responded WORK_QUIT (404)
02:50:46:WARNING:WU00:FS00:Server did not like results, dumping
02:50:46:WU00:FS00:Cleaning up
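
The client prints the core's exit status both as a signed decimal and as hex (-1073740777 = 0xc0000417). The snippet below is a minimal sketch of that conversion, assuming the value is a 32-bit Windows status code. The function name `to_status_hex` is made up for illustration and is not part of FAHClient.

```python
def to_status_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex."""
    # Masking with 0xFFFFFFFF maps negative values onto their
    # two's-complement 32-bit representation.
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


# Value taken from the "FahCore returned" line in the log above.
print(to_status_hex(-1073740777))
```

As far as I know, the Windows headers name this value STATUS_INVALID_CRUNTIME_PARAMETER, but treat that as a pointer for further searching rather than a diagnosis.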

Re: Crashes with project 13850

Posted: Tue May 26, 2020 3:56 am
by bruce
I see a single WU, Project: 13850 (Run 0, Clone 30493, Gen 120), which crashed immediately with a Guru Meditation error. The log you posted doesn't support the words "several times in a row".

A Guru Meditation error is the result of a corrupt checkpoint file, usually caused by a system shutdown that was not orderly. That may or may not have been your fault, but the WU could not be recovered, so the client moved on to a new assignment.
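
One way to check whether a log shows repeated failures is to tally every unit that was sent back as FAILED, keyed by project, run, clone and gen. This is a rough sketch, assuming the "Sending unit results" line format seen in the log above. The function name `count_failed_units` and the regex are illustrative only and are not part of FAHClient.

```python
import re
from collections import Counter

# Matches the fields of a failed "Sending unit results" line (format assumed
# from the log posted in this thread).
FAILED_RE = re.compile(
    r"error:FAILED project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)"
)


def count_failed_units(log_text: str) -> Counter:
    """Count FAILED reports per (project, run, clone, gen) tuple."""
    counts = Counter()
    for match in FAILED_RE.finditer(log_text):
        counts[tuple(int(group) for group in match.groups())] += 1
    return counts


sample = (
    "02:50:45:WU00:FS00:Sending unit results: id:00 state:SEND error:FAILED "
    "project:13850 run:0 clone:30493 gen:120 core:0xa7 "
    "unit:0x00000094287234c95e788cb884e70c88\n"
)
print(count_failed_units(sample))
```

Running it over the full log.txt instead of an excerpt would show how many separate units failed and whether they all belong to project 13850.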

Re: Crashes with project 13850

Posted: Tue May 26, 2020 12:38 pm
by tomc001
I only included one of the crashes, thinking that there was no point in showing the same thing several times.

The computer's uptime is 9 days, so it wasn't a shutdown. It is back to normal operation, so I guess it can be ignored now.