Re: GPU client: "Error starting Folding@Home core" then slee
Posted: Tue Mar 29, 2011 4:39 am
But you're not running as a service, so it doesn't matter.
Community driven support forum for Folding@home
https://foldingforum.org/
[16:32:39]
[16:32:39] Successful run
[16:32:39] DynamicWrapper: Finished Work Unit: sleep=10000
[16:32:49] Reserved 2474436 bytes for xtc file; Cosm status=0
[16:32:49] Allocated 2474436 bytes for xtc file
[16:32:49] - Reading up to 2474436 from "work/wudata_03.xtc": Read 2474436
[16:32:49] Read 2474436 bytes from xtc file; available packet space=783956028
[16:32:49] xtc file hash check passed.
[16:32:49] Reserved 76680 76680 783956028 bytes for arc file=<work/wudata_03.trr> Cosm status=0
[16:32:49] Allocated 76680 bytes for arc file
[16:32:49] - Reading up to 76680 from "work/wudata_03.trr": Read 76680
[16:32:49] Read 76680 bytes from arc file; available packet space=783879348
[16:32:49] trr file hash check passed.
[16:32:49] Allocated 544 bytes for edr file
[16:32:49] Read bedfile
[16:32:49] edr file hash check passed.
[16:32:49] Allocated 120122 bytes for logfile
[16:32:49] Read logfile
[16:32:49] GuardedRun: success in DynamicWrapper
[16:32:49] GuardedRun: done
[16:32:49] Run: GuardedRun completed.
[16:32:53] + Opened results file
[16:32:53] - Writing 2672294 bytes of core data to disk...
[16:32:54] Done: 2671782 -> 2514050 (compressed to 94.0 percent)
[16:32:54] ... Done.
[16:32:54] DeleteFrameFiles: successfully deleted file=work/wudata_03.ckp
[16:32:54] Shutting down core
[16:32:54]
[16:32:54] Folding@home Core Shutdown: FINISHED_UNIT
[16:32:57] CoreStatus = 64 (100)
[16:32:57] Sending work to server
[16:32:57] Project: 6801 (Run 8362, Clone 2, Gen 9)
[16:32:57] - Read packet limit of 540015616... Set to 524286976.
[16:32:57] + Attempting to send results [May 3 16:32:57 UTC]
[16:32:57] Gpu type=3 species=0.
[16:33:01] + Results successfully sent
[16:33:01] Thank you for your contribution to Folding@Home.
[16:33:01] + Number of Units Completed: 362
[16:33:05] - Preparing to get new work unit...
[16:33:05] Cleaning up work directory
[16:33:05] + Attempting to get work packet
[16:33:05] Passkey found
[16:33:05] Gpu type=3 species=0.
[16:33:05] - Connecting to assignment server
[16:33:06] - Successful: assigned to (171.64.65.64).
[16:33:06] + News From Folding@Home: Welcome to Folding@Home
[16:33:06] Loaded queue successfully.
[16:33:06] Gpu type=3 species=0.
[16:33:07] + Closed connections
[16:33:07]
[16:33:07] + Processing work unit
[16:33:07] Core required: FahCore_15.exe
[16:33:07] Core found.
[16:33:07] Working on queue slot 04 [May 3 16:33:07 UTC]
[16:33:07] + Working ...
[16:33:07]
[16:33:07] *------------------------------*
[16:33:07] Folding@Home GPU Core
[16:33:07] Version 2.15 (Tue Nov 16 09:05:18 PST 2010)
[16:33:07]
[16:33:07] Build host: SimbiosNvdWin7
[16:33:07] Board Type: NVIDIA/CUDA
[16:33:07] Core : x=15
[16:33:07] Window's signal control handler registered.
[16:33:07] Preparing to commence simulation
[16:33:07] - Looking at optimizations...
[16:33:07] DeleteFrameFiles: successfully deleted file=work/wudata_04.ckp
[16:33:07] - Created dyn
[16:33:07] - Files status OK
[16:33:07] sizeof(CORE_PACKET_HDR) = 512 file=<>
[16:33:07] - Expanded 43680 -> 171827 (decompressed 393.3 percent)
[16:33:07] Called DecompressByteArray: compressed_data_size=43680 data_size=171827, decompressed_data_size=171827 diff=0
[16:33:07] - Digital signature verified
[16:33:07]
[16:33:07] Project: 6801 (Run 9748, Clone 2, Gen 9)
[16:33:07]
[16:33:07] Assembly optimizations on if available.
[16:33:07] Entering M.D.
[16:33:09] Tpr hash work/wudata_04.tpr: 1433012669 2811342351 2985677414 1924824265 3721637216
[16:33:09] Working on ALZHEIMER'S DISEASE AMYLOID
[16:33:09] Client config found, loading data.
[16:33:09] Starting GUI Server
[16:33:09] Setting checkpoint frequency: 500000
[16:33:09] Setting checkpoint frequency: 500000
[16:35:28] Completed 500000 out of 50000000 steps (1%).
[16:37:48] Completed 1000000 out of 50000000 steps (2%).
[16:40:07] Completed 1500000 out of 50000000 steps (3%).
[16:42:27] Completed 2000000 out of 50000000 steps (4%).
[16:44:46] Completed 2500000 out of 50000000 steps (5%).
[16:47:05] Completed 3000000 out of 50000000 steps (6%).
[16:49:25] Completed 3500000 out of 50000000 steps (7%).
[16:51:44] Completed 4000000 out of 50000000 steps (8%).
[16:54:04] Completed 4500000 out of 50000000 steps (9%).
[16:56:23] Completed 5000000 out of 50000000 steps (10%).
[16:58:42] Completed 5500000 out of 50000000 steps (11%).
[17:01:02] Completed 6000000 out of 50000000 steps (12%).
[17:03:21] Completed 6500000 out of 50000000 steps (13%).
[17:05:41] Completed 7000000 out of 50000000 steps (14%).
[17:08:00] Completed 7500000 out of 50000000 steps (15%).
[17:10:20] Completed 8000000 out of 50000000 steps (16%).
[17:12:39] Completed 8500000 out of 50000000 steps (17%).
[17:14:59] Completed 9000000 out of 50000000 steps (18%).
[17:17:18] Completed 9500000 out of 50000000 steps (19%).
[17:19:38] Completed 10000000 out of 50000000 steps (20%).
[17:21:57] Completed 10500000 out of 50000000 steps (21%).
[17:24:16] Completed 11000000 out of 50000000 steps (22%).
[17:26:36] Completed 11500000 out of 50000000 steps (23%).
[17:28:55] Completed 12000000 out of 50000000 steps (24%).
[17:31:15] Completed 12500000 out of 50000000 steps (25%).
[17:33:34] Completed 13000000 out of 50000000 steps (26%).
[17:35:54] Completed 13500000 out of 50000000 steps (27%).
[17:38:13] Completed 14000000 out of 50000000 steps (28%).
[17:40:32] Completed 14500000 out of 50000000 steps (29%).
[17:42:52] Completed 15000000 out of 50000000 steps (30%).
[17:45:11] Completed 15500000 out of 50000000 steps (31%).
[17:47:31] Completed 16000000 out of 50000000 steps (32%).
[17:49:50] Completed 16500000 out of 50000000 steps (33%).
[17:52:10] Completed 17000000 out of 50000000 steps (34%).
[17:54:29] Completed 17500000 out of 50000000 steps (35%).
[17:56:48] Completed 18000000 out of 50000000 steps (36%).
[17:59:08] Completed 18500000 out of 50000000 steps (37%).
[18:01:27] Completed 19000000 out of 50000000 steps (38%).
[18:03:47] Completed 19500000 out of 50000000 steps (39%).
[18:06:06] Completed 20000000 out of 50000000 steps (40%).
[18:08:25] Completed 20500000 out of 50000000 steps (41%).
[18:10:45] Completed 21000000 out of 50000000 steps (42%).
[18:13:04] Completed 21500000 out of 50000000 steps (43%).
[18:15:24] Completed 22000000 out of 50000000 steps (44%).
[18:17:43] Completed 22500000 out of 50000000 steps (45%).
[18:20:02] Completed 23000000 out of 50000000 steps (46%).
[18:22:22] Completed 23500000 out of 50000000 steps (47%).
[18:24:41] Completed 24000000 out of 50000000 steps (48%).
[18:27:01] Completed 24500000 out of 50000000 steps (49%).
[18:29:20] Completed 25000000 out of 50000000 steps (50%).
[18:31:40] Completed 25500000 out of 50000000 steps (51%).
[18:33:59] Completed 26000000 out of 50000000 steps (52%).
[18:36:18] Completed 26500000 out of 50000000 steps (53%).
[18:38:38] Completed 27000000 out of 50000000 steps (54%).
[18:40:57] Completed 27500000 out of 50000000 steps (55%).
[18:43:17] Completed 28000000 out of 50000000 steps (56%).
[18:45:36] Completed 28500000 out of 50000000 steps (57%).
[18:47:55] Completed 29000000 out of 50000000 steps (58%).
[18:50:15] Completed 29500000 out of 50000000 steps (59%).
[18:52:34] Completed 30000000 out of 50000000 steps (60%).
[18:54:54] Completed 30500000 out of 50000000 steps (61%).
[18:57:13] Completed 31000000 out of 50000000 steps (62%).
[18:59:32] Completed 31500000 out of 50000000 steps (63%).
[19:01:52] Completed 32000000 out of 50000000 steps (64%).
[19:04:11] Completed 32499999 out of 50000000 steps (65%).
[19:06:31] Completed 32999999 out of 50000000 steps (66%).
[19:08:50] Completed 33499999 out of 50000000 steps (67%).
[19:11:10] Completed 33999999 out of 50000000 steps (68%).
[19:13:29] Completed 34499999 out of 50000000 steps (69%).
[19:15:48] Completed 34999999 out of 50000000 steps (70%).
[19:18:08] Completed 35499999 out of 50000000 steps (71%).
[19:20:27] Completed 35999999 out of 50000000 steps (72%).
[19:22:47] Completed 36499999 out of 50000000 steps (73%).
[19:25:07] Completed 36999999 out of 50000000 steps (74%).
[19:27:26] Completed 37499999 out of 50000000 steps (75%).
[19:29:45] Completed 37999999 out of 50000000 steps (76%).
[19:32:05] Completed 38499999 out of 50000000 steps (77%).
[19:34:24] Completed 38999999 out of 50000000 steps (78%).
[19:36:44] Completed 39499999 out of 50000000 steps (79%).
[19:39:03] Completed 39999999 out of 50000000 steps (80%).
[19:41:22] Completed 40499999 out of 50000000 steps (81%).
[19:43:42] Completed 40999999 out of 50000000 steps (82%).
[19:46:01] Completed 41499999 out of 50000000 steps (83%).
[19:48:21] Completed 41999999 out of 50000000 steps (84%).
[19:50:40] Completed 42499999 out of 50000000 steps (85%).
[19:53:00] Completed 42999999 out of 50000000 steps (86%).
[19:55:19] Completed 43499999 out of 50000000 steps (87%).
[19:57:38] Completed 43999999 out of 50000000 steps (88%).
[19:59:58] Completed 44499999 out of 50000000 steps (89%).
[20:02:17] Completed 44999999 out of 50000000 steps (90%).
[20:04:37] Completed 45499999 out of 50000000 steps (91%).
[20:06:56] Completed 45999999 out of 50000000 steps (92%).
[20:09:15] Completed 46499999 out of 50000000 steps (93%).
[20:11:35] Completed 46999999 out of 50000000 steps (94%).
[20:13:54] Completed 47499999 out of 50000000 steps (95%).
[20:16:14] Completed 47999999 out of 50000000 steps (96%).
[20:18:33] Completed 48499999 out of 50000000 steps (97%).
[20:20:53] Completed 48999999 out of 50000000 steps (98%).
[20:23:15] Completed 49499999 out of 50000000 steps (99%).
[20:25:36] Completed 49999999 out of 50000000 steps (100%).
[20:25:38] Finished fah_main
[20:25:38]
[20:25:38] Successful run
[20:25:38] DynamicWrapper: Finished Work Unit: sleep=10000
[20:25:48] Reserved 2476332 bytes for xtc file; Cosm status=0
[20:25:48] Allocated 2476332 bytes for xtc file
[20:25:48] - Reading up to 2476332 from "work/wudata_04.xtc": Read 2476332
[20:25:48] Read 2476332 bytes from xtc file; available packet space=783954132
[20:25:48] xtc file hash check passed.
[20:25:48] Reserved 76680 76680 783954132 bytes for arc file=<work/wudata_04.trr> Cosm status=0
[20:25:48] Allocated 76680 bytes for arc file
[20:25:48] - Reading up to 76680 from "work/wudata_04.trr": Read 76680
[20:25:48] Read 76680 bytes from arc file; available packet space=783877452
[20:25:48] trr file hash check passed.
[20:25:48] Allocated 544 bytes for edr file
[20:25:48] Read bedfile
[20:25:48] edr file hash check passed.
[20:25:48] Allocated 120122 bytes for logfile
[20:25:48] Read logfile
[20:25:48] GuardedRun: success in DynamicWrapper
[20:25:48] GuardedRun: done
[20:25:48] Run: GuardedRun completed.
[20:25:53] + Opened results file
[20:25:53] - Writing 2674190 bytes of core data to disk...
[20:25:54] Done: 2673678 -> 2516326 (compressed to 94.1 percent)
[20:25:54] ... Done.
[20:25:54] DeleteFrameFiles: successfully deleted file=work/wudata_04.ckp
[20:25:54] Shutting down core
[20:25:54]
[20:25:54] Folding@home Core Shutdown: FINISHED_UNIT
[20:25:57] CoreStatus = 64 (100)
[20:25:57] Sending work to server
[20:25:57] Project: 6801 (Run 9748, Clone 2, Gen 9)
[20:25:57] - Read packet limit of 540015616... Set to 524286976.
[20:25:57] + Attempting to send results [May 3 20:25:57 UTC]
[20:25:57] Gpu type=3 species=0.
[20:26:03] + Results successfully sent
[20:26:03] Thank you for your contribution to Folding@Home.
[20:26:03] + Number of Units Completed: 363
[20:26:07] - Preparing to get new work unit...
[20:26:07] Cleaning up work directory
[20:26:07] + Attempting to get work packet
[20:26:07] Passkey found
[20:26:07] Gpu type=3 species=0.
[20:26:07] - Connecting to assignment server
[20:26:07] - Successful: assigned to (171.64.65.64).
[20:26:07] + News From Folding@Home: Welcome to Folding@Home
[20:26:08] Loaded queue successfully.
[20:26:08] Gpu type=3 species=0.
[20:26:09] + Closed connections
[20:26:09]
[20:26:09] + Processing work unit
[20:26:09] Core required: FahCore_15.exe
[20:26:09] Core found.
[20:26:09] Working on queue slot 05 [May 3 20:26:09 UTC]
[20:26:09] + Working ...
[20:26:09]
[20:26:09] *------------------------------*
[20:26:09] Folding@Home GPU Core
[20:26:09] Version 2.15 (Tue Nov 16 09:05:18 PST 2010)
[20:26:09]
[20:26:09] Build host: SimbiosNvdWin7
[20:26:09] Board Type: NVIDIA/CUDA
[20:26:09] Core : x=15
[20:26:09] Window's signal control handler registered.
[20:26:09] Preparing to commence simulation
[20:26:09] - Looking at optimizations...
[20:26:09] DeleteFrameFiles: successfully deleted file=work/wudata_05.ckp
[20:26:09] - Created dyn
[20:26:09] - Files status OK
[20:26:09] sizeof(CORE_PACKET_HDR) = 512 file=<>
[20:26:09] - Expanded 43661 -> 171827 (decompressed 393.5 percent)
[20:26:09] Called DecompressByteArray: compressed_data_size=43661 data_size=171827, decompressed_data_size=171827 diff=0
[20:26:09] - Digital signature verified
[20:26:09]
[20:26:09] Project: 6801 (Run 6018, Clone 3, Gen 9)
[20:26:09]
[20:26:09] Assembly optimizations on if available.
[20:26:09] Entering M.D.
[20:26:11] Tpr hash work/wudata_05.tpr: 1500319053 2112754324 4111607144 1091663483 2920748839
[20:26:11] Working on ALZHEIMER'S DISEASE AMYLOID
[20:26:11] Client config found, loading data.
[20:26:15] CoreStatus = 63 (99)
[20:26:15] + Error starting Folding@home core.
[20:26:20]
[20:26:20] + Processing work unit
[20:26:20] Core required: FahCore_15.exe
[20:26:20] Core found.
[20:26:20] Working on queue slot 05 [May 3 20:26:20 UTC]
[20:26:20] + Working ...
[20:26:20]
[20:26:20] *------------------------------*
[20:26:20] Folding@Home GPU Core
[20:26:20] Version 2.15 (Tue Nov 16 09:05:18 PST 2010)
[20:26:20]
[20:26:20] Build host: SimbiosNvdWin7
[20:26:20] Board Type: NVIDIA/CUDA
[20:26:20] Core : x=15
[20:26:20] Window's signal control handler registered.
[20:26:20] Preparing to commence simulation
[20:26:20] - Ensuring status. Please wait.
[20:26:30] - Looking at optimizations...
[20:26:30] - Working with standard loops on this execution.
[20:26:30] - Previous termination of core was improper.
[20:26:30] - Files status OK
[20:26:30] sizeof(CORE_PACKET_HDR) = 512 file=<>
[20:26:30] - Expanded 43661 -> 171827 (decompressed 393.5 percent)
[20:26:30] Called DecompressByteArray: compressed_data_size=43661 data_size=171827, decompressed_data_size=171827 diff=0
[20:26:30] - Digital signature verified
[20:26:30]
[20:26:30] Project: 6801 (Run 6018, Clone 3, Gen 9)
[20:26:30]
[20:26:30] Entering M.D.
[20:26:32] Tpr hash work/wudata_05.tpr: 1500319053 2112754324 4111607144 1091663483 2920748839
[20:26:32] Working on ALZHEIMER'S DISEASE AMYLOID
[20:26:32] Client config found, loading data.
[20:26:34] CoreStatus = 63 (99)
[20:26:34] + Error starting Folding@home core.
[20:26:39]
[20:26:39] + Processing work unit
[20:26:39] Core required: FahCore_15.exe
[20:26:39] Core found.
[20:26:39] Working on queue slot 05 [May 3 20:26:39 UTC]
[20:26:39] + Working ...
[20:26:39]
[20:26:39] *------------------------------*
[20:26:39] Folding@Home GPU Core
[20:26:39] Version 2.15 (Tue Nov 16 09:05:18 PST 2010)
[20:26:39]
[20:26:39] Build host: SimbiosNvdWin7
[20:26:39] Board Type: NVIDIA/CUDA
[20:26:39] Core : x=15
[20:26:39] Window's signal control handler registered.
[20:26:39] Preparing to commence simulation
[20:26:39] - Ensuring status. Please wait.
[20:26:49] - Looking at optimizations...
[20:26:49] - Working with standard loops on this execution.
[20:26:49] - Previous termination of core was improper.
[20:26:49] - Going to use standard loops.
[20:26:49] - Files status OK
[20:26:49] sizeof(CORE_PACKET_HDR) = 512 file=<>
[20:26:49] - Expanded 43661 -> 171827 (decompressed 393.5 percent)
[20:26:49] Called DecompressByteArray: compressed_data_size=43661 data_size=171827, decompressed_data_size=171827 diff=0
[20:26:49] - Digital signature verified
[20:26:49]
[20:26:49] Project: 6801 (Run 6018, Clone 3, Gen 9)
[20:26:49]
[20:26:49] Entering M.D.
[20:26:51] Tpr hash work/wudata_05.tpr: 1500319053 2112754324 4111607144 1091663483 2920748839
[20:26:51] Working on ALZHEIMER'S DISEASE AMYLOID
[20:26:51] Client config found, loading data.
[20:26:54] CoreStatus = 63 (99)
[20:26:54] + Error starting Folding@home core.
[20:26:54] - Attempting to download new core...
[20:26:54] + Downloading new core: FahCore_15.exe
[20:26:54] + 10240 bytes downloaded
[20:26:54] + 20480 bytes downloaded
[20:26:54] + 30720 bytes downloaded
[20:26:54] + 40960 bytes downloaded
Atom wrote:
I actually don't think it's permission-related. I have another three-way machine with 450s in it that has been running like a top for a while. Each GPU has completed more than 100 WUs (as many as 124) and yesterday entered an EUE pause. Today each card started throwing "CoreStatus = 63 (99)" errors. I can't see how permissions were fine yesterday and not fine today, when nobody has touched the machine at all.
Drivers didn't change overnight. Permissions didn't change overnight. The only thing that changed was the work units. I was actually watching the machine as two GPUs completed work units, uploaded their results, then downloaded new work units. THAT'S when they threw the error. After completing 124 work units, I don't think it's a generic driver issue. It seems like the core either wasn't shut down correctly, or wasn't started correctly.

Actually, you just answered your own question there at the end.
jtktam wrote:
I had just run into this problem last night.
I will get the offending WU IDs. It was causing all sorts of headaches for me until I found this post. I thought it was my setup.
-joe

Same exact WU?
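
If it helps with collecting the offending WU IDs, here is a rough sketch (not an official tool) that scans client log lines for a "Project: ... (Run, Clone, Gen)" line and counts any later "CoreStatus" line that is not 64 against that work unit. The function name `failing_work_units` and the sample lines are made up for illustration; the two patterns are based only on the log format pasted above, and treating 64 as the only "good" status is an assumption drawn from that same log.

```python
import re

# Patterns modelled on the log lines quoted in this thread (assumed format).
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def failing_work_units(log_lines, ok_status="64"):
    """Count non-OK CoreStatus lines per (project, run, clone, gen).

    Each CoreStatus line is attributed to the most recent Project line
    seen before it. ok_status is the value printed right after
    "CoreStatus = " for a finished unit (64 in the log above).
    """
    failures = {}
    current = None
    for line in log_lines:
        project = PROJECT_RE.search(line)
        if project:
            current = tuple(int(g) for g in project.groups())
            continue
        status = STATUS_RE.search(line)
        if status and current is not None and status.group(1) != ok_status:
            failures[current] = failures.get(current, 0) + 1
    return failures


# Hypothetical excerpt in the same shape as the log above.
sample = [
    "[16:33:07] Project: 6801 (Run 9748, Clone 2, Gen 9)",
    "[20:25:57] CoreStatus = 64 (100)",
    "[20:26:09] Project: 6801 (Run 6018, Clone 3, Gen 9)",
    "[20:26:15] CoreStatus = 63 (99)",
    "[20:26:30] Project: 6801 (Run 6018, Clone 3, Gen 9)",
    "[20:26:34] CoreStatus = 63 (99)",
]

for wu, count in failing_work_units(sample).items():
    print("P%d (Run %d, Clone %d, Gen %d): %d failed start(s)" % (wu + (count,)))
```

Running the same function over the lines of each machine's log and comparing the keys would show whether the machines are failing on the same project, run, clone and gen, or on different ones.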