Project: 2669 (Run 2, Clone 52, Gen 143)

parkut
Posts: 365
Joined: Tue Feb 12, 2008 7:33 am
Hardware configuration: Running exclusively Linux headless blades. All are dedicated crunching machines.
Location: SE Michigan, USA

Project: 2669 (Run 2, Clone 52, Gen 143) Hangs @ 0%

Post by parkut »

This WU does not process. It hangs on starting or restarting, and the
processor load goes to zero. One further oddity: the unitinfo.txt
file shows a very strange progress value
...

Code:

model name	: Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz
cpu MHz		: 2400.063
cache size	: 4096 KB
Memory: 1.96 GB physical, 1023.99 MB virtual
...
Current Work Unit
-----------------
Name: Gromacs
Tag: P2669R2C52G143
Download time: October 5 11:04:20
Due time: October 8 11:04:20
Progress: 6871947% 
...
[12:08:24] *------------------------------*
[12:08:24] Folding@Home Gromacs SMP Core
[12:08:24] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[12:08:24] 
[12:08:24] Preparing to commence simulation
[12:08:24] - Ensuring status. Please wait.
[12:08:33] - Looking at optimizations...
[12:08:33] - Working with standard loops on this execution.
[12:08:33] - Files status OK
[12:08:35] - Expanded 4835603 -> 23977273 (decompressed 495.8 percent)
[12:08:35] Called DecompressByteArray: compressed_data_size=4835603 data_size=23977273, decompressed_data_size=23977273 diff=0
[12:08:35] - Digital signature verified
[12:08:35] 
[12:08:35] Project: 2669 (Run 2, Clone 52, Gen 143)
[12:08:35] 
[12:08:35] Entering M.D.
[12:08:45] Completed 0 out of 250000 steps  (0%)
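
A value like this could be caught automatically. The following is only a minimal Python sketch, not part of the client: the function name `parse_progress` is hypothetical, and the only assumption about unitinfo.txt is that it contains a line of the form `Progress: N%`, as in the excerpt above.

```python
import re

# Matches a line such as "Progress: 6871947% " from unitinfo.txt.
PROGRESS_RE = re.compile(r"^Progress:\s*(-?\d+)\s*%", re.MULTILINE)

def parse_progress(text):
    """Return (value, is_plausible) for the first Progress line, or None."""
    match = PROGRESS_RE.search(text)
    if match is None:
        return None
    value = int(match.group(1))
    # A percentage outside 0..100 suggests a corrupt or uninitialised field.
    return value, 0 <= value <= 100

sample = """Name: Gromacs
Tag: P2669R2C52G143
Progress: 6871947%
"""
print(parse_progress(sample))
```

For the excerpt above this reports the value 6871947 as implausible, which is a hint that the work unit state is bad before the core even starts stepping.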
BrokenWolf
Posts: 126
Joined: Sat Aug 02, 2008 3:08 am

Project: 2669 (Run 2, Clone 52, Gen 143)

Post by BrokenWolf »

From the terminal window:

Code:

[22:18:51] Called DecompressByteArray: compressed_data_size=4835603 data_size=23977273, decompressed_data_size=23977273 diff=0
[22:18:51] - Digital signature verified
[22:18:51]
[22:18:51] Project: 2669 (Run 2, Clone 52, Gen 143)
[22:18:51]
[22:18:51] Assembly optimizations on if available.
[22:18:51] Entering M.D.
[22:19:00] Run 2, Clone 52, Gen 143)
[22:19:00]
[22:19:01] Entering M.D.
NNODES=4, MYRANK=1, HOSTNAME=RHEL4BW2.lab1.com
NNODES=4, MYRANK=2, HOSTNAME=RHEL4BW2.lab1.com
NNODES=4, MYRANK=0, HOSTNAME=RHEL4BW2.lab1.com
NODEID=0 argc=20
NNODES=4, MYRANK=3, HOSTNAME=RHEL4BW2.lab1.com
NODEID=1 argc=20
Reading file work/wudata_08.tpr, VERSION 3.3.99_development_20070618 (single precision)
NODEID=2 argc=20
NODEID=3 argc=20
Note: tpx file_version 48, software version 67

NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp

Making 1D domain decomposition 1 x 1 x 4
starting mdrun '22869 system'
36000004 steps,  72000.0 ps (continuing from step 35750004,  71500.0 ps).

Step 35750084, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.005165, max 0.326180 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.1090   0.1446      0.1090

Step 35750086, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.001810, max 0.114418 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.1087   0.1215      0.1090

Step 35750088, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.004200, max 0.265630 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.1089   0.1380      0.1090

Step 35750090, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.002489, max 0.157058 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.1088   0.1261      0.1090

Step 35750091, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.001149, max 0.068081 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length

Step 35750092, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.055714, max 3.047543 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7097   7098   90.0    0.1010   0.2781      0.1010

Step 35750093, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.000511, max 0.025604 (between atoms 7097 and 7099)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.4412   0.1093      0.1090

Step 35750094, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.053622, max 2.982553 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.1093   0.4341      0.1090

Step 35750096, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.052467, max 2.986027 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7097   7098   90.0    0.1004   0.2382      0.1010
   7097   7099   90.0    0.1016   0.1463      0.1010

Step 35750098, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.047277, max 2.816680 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.1090   0.4160      0.1090
   7097   7098   90.0    0.1005   0.1881      0.1010
   7097   7099   90.0    0.1020   0.1488      0.1010

Step 35750099, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.001198, max 0.045022 (between atoms 7097 and 7100)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length

Step 35750100, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.048803, max 2.958677 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.1131   0.4315      0.1090
   7097   7098   90.0    0.1006   0.1565      0.1010
   7097   7099   90.0    0.1034   0.1766      0.1010

Step 35750101, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.001487, max 0.090276 (between atoms 7097 and 7099)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7097   7099   90.0    0.1766   0.1101      0.1010

Step 35750102, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.048318, max 2.966585 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7097   7099   90.0    0.1101   0.1783      0.1010

Step 35750103, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.305995, max 19.427565 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.4324   2.2266      0.1090
   7097   7099   90.0    0.1783   0.1127      0.1010
Warning: 1-4 interaction between 7096 and 7098 at distance 2.246 which is larger than the 1-4 table size 2.200 nm
These are ignored for the rest of the simulation
This usually means your system is exploding,
if not, you should increase table-extension in your mdp file
or with user tables increase the table size

Step 35750104, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 2965175033778.090332, max 162725338546176.000000 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
  22295  22297   90.0    0.1090   0.7214      0.1090
   7097   7099   90.0    0.1127   0.1907      0.1010
   7169   7171   90.0    0.1090 869260032.0000      0.1090
  22274  22276   90.0    0.1090   0.1690      0.1090
  22277  22279   90.0    0.1090 453803072.0000      0.1090
  22283  22285   90.0    0.1010 21919.0332      0.1010
   7107   7108   90.2    0.1090 2432.7432      0.1090
   7110   7111   90.0    0.1090 305205051392.0000      0.1090
   7110   7112   90.0    0.1090 6415524036608.0000      0.1090
   7137   7138   90.0    0.1010   0.1431      0.1010
   7165   7166   90.0    0.1010 2193711104.0000      0.1010
   6949   6951   90.0    0.1090 8157860864.0000      0.1090
   6952   6953   90.0    0.1090   7.0687      0.1090
   6952   6954   90.0    0.1090   5.5495      0.1090
   6881   6882   90.2    0.1080 5498.4014      0.1080

t = 71500.211 ps: Water molecule starting at atom 89206 can not be settled.
Check for bad contacts and/or reduce the timestep.
[22:19:56] lding@home Core Shutdown: INTERRUPTED
application called MPI_Abort(MPI_COMM_WORLD, 102) - process 0
[03:54:28] - Autosending finished units... [November 11 03:54:28 UTC]
[03:54:28] Trying to send all finished work units
[03:54:28] + No unsent completed units remaining.
[03:54:28] - Autosend completed
[09:54:28] - Autosending finished units... [November 11 09:54:28 UTC]
[09:54:28] Trying to send all finished work units
[09:54:28] + No unsent completed units remaining.
[09:54:28] - Autosend completed
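
For anyone comparing logs, the blow-up is easier to see when the `max` deviation is pulled out per step. The following is a minimal Python sketch under that assumption: `lincs_max_by_step` is a hypothetical helper, and the regular expressions are fitted only to the warning format shown in the log above.

```python
import re

# "Step 35750084, time 71500.2 (ps)  LINCS WARNING"
STEP_RE = re.compile(r"^Step (\d+), time ([\d.]+) \(ps\)\s+LINCS WARNING")
# "rms 0.005165, max 0.326180 (between atoms 7094 and 7096)"
MAX_RE = re.compile(r"^rms ([\d.]+), max ([\d.]+) \(between atoms (\d+) and (\d+)\)")

def lincs_max_by_step(lines):
    """Return a list of (step, max_deviation) pairs in log order."""
    results = []
    current_step = None
    for line in lines:
        step_match = STEP_RE.match(line)
        if step_match:
            current_step = int(step_match.group(1))
            continue
        max_match = MAX_RE.match(line)
        if max_match and current_step is not None:
            results.append((current_step, float(max_match.group(2))))
            current_step = None
    return results

log = """Step 35750084, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.005165, max 0.326180 (between atoms 7094 and 7096)

Step 35750103, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.305995, max 19.427565 (between atoms 7094 and 7096)
""".splitlines()

for step, deviation in lincs_max_by_step(log):
    print(step, deviation)
```

In the full log above, the maximum deviation is below 1 at step 35750084 and reaches about 1.6e14 at step 35750104, which is only 100 steps after the restart point at step 35750004.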


lammert
Posts: 6
Joined: Sun Feb 03, 2008 1:38 am

Project: 2669 (Run 2, Clone 52, Gen 143)

Post by lammert »

I get the following errors directly after starting this work unit:

Code:

 $ ./fah6 -advmethods -smp 2

Note: Please read the license agreement (fah6 -license). Further 
use of this software requires that you have read and accepted this agreement.

2 cores detected


--- Opening Log file [December 22 19:21:07 UTC] 


# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.24beta

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/folding/folding6.24
Executable: ./fah6
Arguments: -advmethods -smp 2 

[19:21:07] - Ask before connecting: No
[19:21:07] - User name: lammert (Team 100344)
[19:21:07] - User ID: 3F8F18F05D21C391
[19:21:07] - Machine ID: 1
[19:21:07] 
[19:21:07] Could not open work queue, generating new queue...
[19:21:07] - Preparing to get new work unit...
[19:21:07] + Attempting to get work packet
[19:21:07] - Connecting to assignment server
[19:21:08] - Successful: assigned to (171.64.65.56).
[19:21:08] + News From Folding@Home: Welcome to Folding@Home
[19:21:08] Loaded queue successfully.
[19:21:41] + Closed connections
[19:21:41] 
[19:21:41] + Processing work unit
[19:21:41] At least 4 processors must be requested.Core required: FahCore_a2.exe
[19:21:41] Core found.
[19:21:42] Working on queue slot 01 [December 22 19:21:42 UTC]
[19:21:42] + Working ...
[19:21:42] 
[19:21:42] *------------------------------*
[19:21:42] Folding@Home Gromacs SMP Core
[19:21:42] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[19:21:42] 
[19:21:42] Preparing to commence simulation
[19:21:42] - Ensuring status. Please wait.
[19:21:43] Called DecompressByteArray: compressed_data_size=4835603 data_size=23977273, decompressed_data_size=23977273 diff=0
[19:21:43] - Digital signature verified
[19:21:43] 
[19:21:43] Project: 2669 (Run 2, Clone 52, Gen 143)
[19:21:43] 
[19:21:43] Assembly optimizations on if available.
[19:21:43] Entering M.D.
[19:21:53] Run 2, Clone 52, Gen 143)
[19:21:53] 
[19:21:55] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=obelix.linocomm.net
NNODES=4, MYRANK=3, HOSTNAME=obelix.linocomm.net
NODEID=0 argc=20
NNODES=4, MYRANK=2, HOSTNAME=obelix.linocomm.net
NODEID=2 argc=20
NODEID=3 argc=20
Reading file work/wudata_01.tpr, VERSION 3.3.99_development_20070618 (single precision)
NNODES=4, MYRANK=1, HOSTNAME=obelix.linocomm.net
NODEID=1 argc=20
Note: tpx file_version 48, software version 68

NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp

Making 1D domain decomposition 1 x 1 x 4
starting mdrun '22869 system'
36000004 steps,  72000.0 ps (continuing from step 35750004,  71500.0 ps).

Step 35750084, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.009141, max 0.578010 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.1090   0.1720      0.1090

Step 35750085, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.000665, max 0.035553 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.1720   0.1129      0.1090

Step 35750086, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.009061, max 0.573239 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.1129   0.1715      0.1090

Step 35750087, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.877406, max 49.349945 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7097   7098   90.0    0.1010   0.4528      0.1010
   7097   7099   90.0    0.1008   2.6310      0.1010
Warning: 1-4 interaction between 7088 and 7096 at distance 5.705 which is larger than the 1-4 table size 2.200 nm
These are ignored for the rest of the simulation
This usually means your system is exploding,
if not, you should increase table-extension in your mdp file
or with user tables increase the table size

Step 35750088, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 117119840290.386398, max 6680161550336.000000 (between atoms 7097 and 7099)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7178   7179   90.0    0.1090   0.9710      0.1090
   7094   7095   90.0    0.2943  19.3555      0.1090
   7094   7096   90.0    5.4881  18.6381      0.1090
   7097   7098   90.0    0.4528 230673399808.0000      0.1010
   7097   7099   90.0    2.6310 674696331264.0000      0.1010
   7097   7100   90.0    0.2861 227233923072.0000      0.1010
   7122   7123   90.0    0.1090 2948334.2500      0.1090
   7131   7133   93.0    0.1090  52.4591      0.1090
   7172   7173   90.0    0.1336 31509.7148      0.1336
  22755  22757   90.0    0.1090 6266.2402      0.1090
  22710  22711   90.0    0.1090  12.4786      0.1090
  22710  22712   90.0    0.1090   3.7750      0.1090
  22710  22713   90.0    0.1090  53.4988      0.1090
  22221  22224  118.3    0.1090  21.8209      0.1090
  22761  22762   90.0    0.1080 607.2614      0.1080
  22764  22765   90.5    0.0960 1421.7566      0.0960
  22243  22244   93.9    0.1010 251.9043      0.1010
  22699  22700   90.3    0.1090 2288.7644      0.1090
  22697  22698  115.1    0.1010   5.2659      0.1010
   6949   6950   90.0    0.1090   0.1418      0.1090
   6952   6954   90.0    0.1090   1.2559      0.1090
  22152  22153   90.0    0.1090 8634.4053      0.1090
  22225  22226   90.0    0.1090 158486.9375      0.1090
  22225  22227   90.0    0.1090 2393510.5000      0.1090
  22225  22228   90.0    0.1090 2805230.5000      0.1090
   6961   6963   90.0    0.1010 755.6068      0.1010

t = 71500.179 ps: Water molecule starting at atom 37207 can not be settled.
Check for bad contacts and/or reduce the timestep.
The system is running fah6 version 6.24 in SMP mode on 64-bit CentOS 5.4. It is a stock HP ML110/G5 server with 8 GB RAM and a Xeon 3065 processor, and it is not overclocked.

An attempt to delete the work unit gives the following error:

Code:

$ ./fah6 -delete 0

Note: Please read the license agreement (fah6 -license). Further 
use of this software requires that you have read and accepted this agreement.



--- Opening Log file [December 22 19:34:56 UTC] 


# Linux Console Edition #######################################################
###############################################################################

                       Folding@Home Client Version 6.24beta

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/folding/folding6.24
Executable: ./fah6
Arguments: -delete 0 

[19:34:56] - Ask before connecting: No
[19:34:56] - User name: lammert (Team 100344)
[19:34:56] - User ID: 3F8F18F05D21C391
[19:34:56] - Machine ID: 1
[19:34:56] 
[19:34:56] Loaded queue successfully.
[19:34:56] Deleting work unit #0 from work queue...
[19:34:56] - Failed to delete the requested work unit

Folding@Home Client Shutdown.
Please advise what to do.

Lammert
Flathead74
Posts: 266
Joined: Sun Dec 02, 2007 6:08 pm
Location: Central New York

Re: Project: 2669 (Run 2, Clone 52, Gen 143)

Post by Flathead74 »

Lammert,
From your Fahlog.txt:

[19:21:42] Working on queue slot 01 [December 22 19:21:42 UTC]

So, to delete this WU from the queue, try: -delete 01
lammert
Posts: 6
Joined: Sun Feb 03, 2008 1:38 am

Re: Project: 2669 (Run 2, Clone 52, Gen 143)

Post by lammert »

The first copy of the project was in queue slot 0 and deletion failed. The current copy is in queue slot 1, but deletion also fails:

Code:

Launch directory: /home/folding/folding6.24
Executable: ./fah6
Arguments: -delete 1 

[01:25:03] - Ask before connecting: No
[01:25:03] - User name: lammert (Team 100344)
[01:25:03] - User ID: 3F8F18F05D21C391
[01:25:03] - Machine ID: 1
[01:25:03] 
[01:25:04] Loaded queue successfully.
[01:25:04] Deleting work unit #1 from work queue...
[01:26:16] - Failed to delete the requested work unit

Folding@Home Client Shutdown.
I can delete the queue.dat file and the wu... files in the work directory, but when starting fah6 I receive the same WU from the server again. Hopefully this WU can be marked as bad on the server side.
Flathead74
Posts: 266
Joined: Sun Dec 02, 2007 6:08 pm
Location: Central New York

Re: Project: 2669 (Run 2, Clone 52, Gen 143)

Post by Flathead74 »

The address is 00, or 01, not the single digit 0 or 1.

It would be: -delete 00, or -delete 01.

Not -delete 0, or -delete 1.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 2669 (Run 2, Clone 52, Gen 143)

Post by bruce »

Please run fah6 -queueinfo to confirm exactly what you're trying to delete, followed by fah6 -delete 0N for the proper value of N. If it still fails, post both segments of FAHlog.txt.
lammert
Posts: 6
Joined: Sun Feb 03, 2008 1:38 am

Re: Project: 2669 (Run 2, Clone 52, Gen 143)

Post by lammert »

There is no difference between -delete 1 and -delete 01. This is the queue info:

Code:

--- Opening Log file [December 23 10:17:43 UTC] 

                       
# Linux Console Edition #######################################################
###############################################################################

                       Folding@Home Client Version 6.24beta

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/folding/folding6.24
Executable: ./fah6
Arguments: -queueinfo 

[10:17:43] - Ask before connecting: No
[10:17:43] - User name: lammert (Team 100344)
[10:17:43] - User ID: 3F8F18F05D21C391
[10:17:43] - Machine ID: 1
[10:17:43]
[10:17:43] Loaded queue successfully.
[10:17:43] Printing Queue Information
Current Queue:
Slot 02  Empty/Deleted

Slot 03  Empty/Deleted

Slot 04  Empty/Deleted

Slot 05  Empty/Deleted 

Slot 06  Empty/Deleted    

Slot 07  Empty/Deleted

Slot 08  Empty/Deleted

Slot 09  Empty/Deleted

Slot 00  Empty/Deleted

Slot 01 *Ready    
Project: 2669 (Run 2, Clone 52, Gen 143), Core: a2
Work server: 171.64.65.56:8080
Collection server: 171.67.108.25
Download date: December 23 01:32:03
Deadline date: December 26 01:32:03

PF: 0.000000 based on last 0 slot(s)

Folding@Home Client Shutdown
This is the delete result:

Code:

--- Opening Log file [December 23 10:19:21 UTC]


# Linux Console Edition #######################################################
###############################################################################

                       Folding@Home Client Version 6.24beta

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/folding/folding6.24
Executable: ./fah6
Arguments: -delete 01

[10:19:21] - Ask before connecting: No
[10:19:21] - User name: lammert (Team 100344)
[10:19:21] - User ID: 3F8F18F05D21C391
[10:19:21] - Machine ID: 1
[10:19:21]
[10:19:21] Loaded queue successfully.
[10:19:21] Deleting work unit #1 from work queue...
[10:20:33] - Failed to delete the requested work unit

Folding@Home Client Shutdown.
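
For anyone scripting this check, the -queueinfo listing above can be scanned for occupied slots before calling -delete. This is a minimal Python sketch, not a client feature: `occupied_slots` is a hypothetical helper, the pattern is fitted only to the `Slot NN ...` lines shown above, and the meaning of the leading `*` is not stated in this thread, so it is reported as a plain flag.

```python
import re

# "Slot 01 *Ready    " or "Slot 02  Empty/Deleted"
SLOT_RE = re.compile(r"^Slot (\d{2})\s+(\*?)(\S.*?)\s*$")

def occupied_slots(queueinfo_text):
    """Return [(slot, starred, status)] for every slot that is not empty."""
    slots = []
    for line in queueinfo_text.splitlines():
        match = SLOT_RE.match(line)
        if not match:
            continue
        slot, star, status = match.groups()
        if status.startswith("Empty"):
            continue
        # Keep the slot as a two-digit string so it can be passed on unchanged.
        slots.append((slot, star == "*", status))
    return slots

sample = """Slot 00  Empty/Deleted

Slot 01 *Ready
Project: 2669 (Run 2, Clone 52, Gen 143), Core: a2
"""
for slot, starred, status in occupied_slots(sample):
    print(f"slot {slot}: {status} (starred={starred}) -> ./fah6 -delete {slot}")
```

The slot is kept as a two-digit string on purpose, so the same sketch works whichever form -delete expects; the logs above suggest the client treats `1` and `01` the same.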
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 2669 (Run 2, Clone 52, Gen 143)

Post by bruce »

I've never seen that sort of problem before.

Check the permissions of the work directory and queue.dat. If nothing else works, I'd try renaming them both to something else and then just restarting ./fah6.
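
Both steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `describe_access` and `move_aside` are hypothetical, the client is assumed to be stopped, and the script is assumed to run from the launch directory so that `work` and `queue.dat` resolve to the client's own files.

```python
import os
import time

def describe_access(path):
    """Report whether `path` exists and whether this user can read and write it."""
    return {
        "path": path,
        "exists": os.path.exists(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }

def move_aside(path, suffix=None):
    """Rename `path` to `path.<suffix>`; return the new name, or None if absent."""
    if not os.path.exists(path):
        return None
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    new_path = f"{path}.{suffix}"
    os.rename(path, new_path)
    return new_path

# Inspect first; nothing is renamed unless move_aside() is called explicitly.
for name in ("work", "queue.dat"):
    print(describe_access(name))
```

Renaming rather than deleting keeps the old queue and work files available in case they are needed for a later report.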
Amaruk
Posts: 254
Joined: Fri Jun 20, 2008 3:57 am
Location: Watching from the Woods

Re: Project: 2669 (Run 2, Clone 52, Gen 143)

Post by Amaruk »

One of my SMP folders had problems with this WU also. Here is the log:

Code:

[06:02:31] Thank you for your contribution to Folding@Home.
[06:02:31] + Number of Units Completed: 855

[06:02:37] - Warning: Could not delete all work unit files (0): Core file absent
[06:02:37] Trying to send all finished work units
[06:02:37] + No unsent completed units remaining.
[06:02:37] - Preparing to get new work unit...
[06:02:37] + Attempting to get work packet
[06:02:37] - Will indicate memory of 3707 MB
[06:02:37] - Connecting to assignment server
[06:02:37] Connecting to http://assign.stanford.edu:8080/
[06:02:37] Posted data.
[06:02:37] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[06:02:37] + News From Folding@Home: Welcome to Folding@Home
[06:02:37] Loaded queue successfully.
[06:02:37] Connecting to http://171.64.65.56:8080/
[06:02:43] Posted data.
[06:02:43] Initial: 0000; - Receiving payload (expected size: 4836115)
[06:02:48] - Downloaded at ~944 kB/s
[06:02:48] - Averaged speed for that direction ~956 kB/s
[06:02:48] + Received work.
[06:02:48] Trying to send all finished work units
[06:02:48] + No unsent completed units remaining.
[06:02:48] + Closed connections
[06:02:48] 
[06:02:48] + Processing work unit
[06:02:48] Core required: FahCore_a2.exe
[06:02:48] Core found.
[06:02:48] Working on Unit 01 [December 15 06:02:48]
[06:02:48] + Working ...
[06:02:48] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 6025 -version 602'

[06:02:48] 
[06:02:48] *------------------------------*
[06:02:48] Folding@Home Gromacs SMP Core
[06:02:48] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[06:02:48] 
[06:02:48] Preparing to commence simulation
[06:02:48] - Ensuring status. Please wait.
[06:02:49] Called DecompressByteArray: compressed_data_size=4835603 data_size=23977273, decompressed_data_size=23977273 diff=0
[06:02:49] - Digital signature verified
[06:02:49] 
[06:02:49] Project: 2669 (Run 2, Clone 52, Gen 143)
[06:02:49] 
[06:02:49] Assembly optimizations on if available.
[06:02:49] Entering M.D.
[06:02:58] Run 2, Clone 52, Gen 143)
[06:02:58] 
[06:02:58] Entering M.D.
[06:03:25] CoreStatus = 0 (0)
[06:03:25] Client-core communications error: ERROR 0x0
[06:03:25] Deleting current work unit & continuing...
[06:03:39] - Warning: Could not delete all work unit files (1): Core file absent
[06:03:39] Trying to send all finished work units
[06:03:39] + No unsent completed units remaining.
[06:03:39] - Preparing to get new work unit...
[06:03:39] + Attempting to get work packet
[06:03:39] - Will indicate memory of 3707 MB
[06:03:39] - Connecting to assignment server
[06:03:39] Connecting to http://assign.stanford.edu:8080/
[06:03:39] Posted data.
[06:03:39] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[06:03:39] + News From Folding@Home: Welcome to Folding@Home
[06:03:39] Loaded queue successfully.
[06:03:39] Connecting to http://171.64.65.56:8080/
[06:03:44] Posted data.
[06:03:44] Initial: 0000; - Receiving payload (expected size: 4836115)
[06:03:48] - Downloaded at ~1180 kB/s
[06:03:48] - Averaged speed for that direction ~1001 kB/s
[06:03:48] + Received work.
[06:03:48] + Closed connections
[06:03:53] 
[06:03:53] + Processing work unit
[06:03:53] Core required: FahCore_a2.exe
[06:03:53] Core found.
[06:03:53] Working on Unit 02 [December 15 06:03:53]
[06:03:53] + Working ...
[06:03:53] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 02 -checkpoint 15 -verbose -lifeline 6025 -version 602'

[06:03:53] 
[06:03:53] *------------------------------*
[06:03:53] Folding@Home Gromacs SMP Core
[06:03:53] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[06:03:53] 
[06:03:53] Preparing to commence simulation
[06:03:53] - Ensuring status. Please wait.
[06:04:03] - Looking at optimizations...
[06:04:03] - Working with standard loops on this execution.
[06:04:03] - Files status OK
[06:04:04] - Expanded 4835603 -> 23977273 (decompressed 495.8 percent)
[06:04:04] Called DecompressByteArray: compressed_data_size=4835603 data_size=23977273, decompressed_data_size=23977273 diff=0
[06:04:04] - Digital signature verified
[06:04:04] 
[06:04:04] Project: 2669 (Run 2, Clone 52, Gen 143)
[06:04:04] 
[06:04:04] Entering M.D.
[06:04:12] Completed 0 out of 250000 steps  (0%)
[06:04:32] CoreStatus = 0 (0)
[06:04:32] Client-core communications error: ERROR 0x0
[06:04:32] Deleting current work unit & continuing...
[06:04:46] - Warning: Could not delete all work unit files (2): Core file absent
[06:04:46] Trying to send all finished work units
[06:04:46] + No unsent completed units remaining.
[06:04:46] - Preparing to get new work unit...
[06:04:46] + Attempting to get work packet
[06:04:46] - Will indicate memory of 3707 MB
[06:04:46] - Connecting to assignment server
[06:04:46] Connecting to http://assign.stanford.edu:8080/
[06:04:46] Posted data.
[06:04:46] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[06:04:46] + News From Folding@Home: Welcome to Folding@Home
[06:04:46] Loaded queue successfully.
[06:04:46] Connecting to http://171.64.65.56:8080/
[06:04:53] Posted data.
[06:04:53] Initial: 0000; - Receiving payload (expected size: 4836115)
[06:04:58] - Downloaded at ~944 kB/s
[06:04:58] - Averaged speed for that direction ~989 kB/s
[06:04:58] + Received work.
[06:04:58] + Closed connections
[06:05:03] 
[06:05:03] + Processing work unit
[06:05:03] Core required: FahCore_a2.exe
[06:05:03] Core found.
[06:05:03] Working on Unit 03 [December 15 06:05:03]
[06:05:03] + Working ...
[06:05:03] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 03 -checkpoint 15 -verbose -lifeline 6025 -version 602'

[06:05:03] 
[06:05:03] *------------------------------*
[06:05:03] Folding@Home Gromacs SMP Core
[06:05:03] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[06:05:03] 
[06:05:03] Preparing to commence simulation
[06:05:03] - Ensuring status. Please wait.
[06:05:13] - Looking at optimizations...
[06:05:13] - Working with standard loops on this execution.
[06:05:13] - Files status OK
[06:05:14] - Expanded 4835603 -> 23977273 (decompressed 495.8 percent)
[06:05:14] Called DecompressByteArray: compressed_data_size=4835603 data_size=23977273, decompressed_data_size=23977273 diff=0
[06:05:14] - Digital signature verified
[06:05:14] 
[06:05:14] Project: 2669 (Run 2, Clone 52, Gen 143)
[06:05:14] 
[06:05:14] Entering M.D.
[06:05:22] Completed 0 out of 250000 steps  (0%)
[06:19:05] - Autosending finished units...
[06:19:05] Trying to send all finished work units
[06:19:05] + No unsent completed units remaining.
[06:19:05] - Autosend completed
[12:19:05] - Autosending finished units...
[12:19:05] Trying to send all finished work units
[12:19:05] + No unsent completed units remaining.
[12:19:05] - Autosend completed
[18:19:05] - Autosending finished units...
[18:19:05] Trying to send all finished work units
[18:19:05] + No unsent completed units remaining.
[18:19:05] - Autosend completed
[00:19:05] - Autosending finished units...
[00:19:05] Trying to send all finished work units
[00:19:05] + No unsent completed units remaining.
[00:19:05] - Autosend completed
[06:19:05] - Autosending finished units...
[06:19:05] Trying to send all finished work units
[06:19:05] + No unsent completed units remaining.
[06:19:05] - Autosend completed
[12:19:05] - Autosending finished units...
[12:19:05] Trying to send all finished work units
[12:19:05] + No unsent completed units remaining.
[12:19:05] - Autosend completed
[18:19:05] - Autosending finished units...
[18:19:05] Trying to send all finished work units
[18:19:05] + No unsent completed units remaining.
[18:19:05] - Autosend completed
[00:19:05] - Autosending finished units...
[00:19:05] Trying to send all finished work units
[00:19:05] + No unsent completed units remaining.
[00:19:05] - Autosend completed
[06:19:05] - Autosending finished units...
[06:19:05] Trying to send all finished work units
[06:19:05] + No unsent completed units remaining.
[06:19:05] - Autosend completed
[12:19:05] - Autosending finished units...
[12:19:05] Trying to send all finished work units
[12:19:05] + No unsent completed units remaining.
[12:19:05] - Autosend completed
[18:19:05] - Autosending finished units...
[18:19:05] Trying to send all finished work units
[18:19:05] + No unsent completed units remaining.
[18:19:05] - Autosend completed
[00:19:05] - Autosending finished units...
[00:19:05] Trying to send all finished work units
[00:19:05] + No unsent completed units remaining.
[00:19:05] - Autosend completed
[06:19:05] - Autosending finished units...
[06:19:05] Trying to send all finished work units
[06:19:05] + No unsent completed units remaining.
[06:19:05] - Autosend completed
[12:19:05] - Autosending finished units...
[12:19:05] Trying to send all finished work units
[12:19:05] + No unsent completed units remaining.
[12:19:05] - Autosend completed
[18:19:05] - Autosending finished units...
[18:19:05] Trying to send all finished work units
[18:19:05] + No unsent completed units remaining.
[18:19:05] - Autosend completed
[00:19:05] - Autosending finished units...
[00:19:05] Trying to send all finished work units
[00:19:05] + No unsent completed units remaining.
[00:19:05] - Autosend completed
[02:35:47] ***** Got an Activate signal (2)
[02:35:47] Killing all core threads

Folding@Home Client Shutdown.

--- Opening Log file [December 19 03:43:14] 


# SMP Client ##################################################################
###############################################################################

                       Folding@Home Client Version 6.02

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/hope/folding
Executable: ./fah6
Arguments: -smp -verbosity 9 

[03:43:14] - Ask before connecting: No
[03:43:14] - User name: Amaruk (Team 50625)
[03:43:14] - User ID: 967AB22xxxxxxxx
[03:43:14] - Machine ID: 1
[03:43:14] 
[03:43:15] Loaded queue successfully.
[03:43:15] Unit 3's deadline (December 18 06:04) has passed.
[03:44:31] ***** Got an Activate signal (2)
[03:44:31] Killing all core threads

Folding@Home Client Shutdown.


--- Opening Log file [December 19 03:45:46] 


# SMP Client ##################################################################
###############################################################################

                       Folding@Home Client Version 6.02

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/hope/folding
Executable: ./fah6
Arguments: -smp -verbosity 9 

[03:45:46] - Ask before connecting: No
[03:45:46] - User name: Amaruk (Team 50625)
[03:45:46] - User ID: 967AB22xxxxxxxx
[03:45:46] - Machine ID: 1
[03:45:46] 
[03:45:46] Work directory not found. Creating...
[03:45:46] Could not open work queue, generating new queue...
[03:45:46] - Autosending finished units...
[03:45:46] Trying to send all finished work units
[03:45:46] + No unsent completed units remaining.
[03:45:46] - Autosend completed
[03:45:46] - Preparing to get new work unit...
[03:45:46] + Attempting to get work packet
[03:45:46] - Will indicate memory of 3707 MB
[03:45:46] - Detect CPU. Vendor: AuthenticAMD, Family: 15, Model: 2, Stepping: 3
[03:45:46] - Connecting to assignment server
[03:45:46] Connecting to http://assign.stanford.edu:8080/
[03:45:46] Posted data.
[03:45:46] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[03:45:46] + News From Folding@Home: Welcome to Folding@Home
[03:45:46] Loaded queue successfully.
[03:45:46] Connecting to http://171.64.65.56:8080/
[03:45:53] Posted data.
[03:45:53] Initial: 0000; - Receiving payload (expected size: 4836115)
[03:45:57] - Downloaded at ~1180 kB/s
[03:45:57] - Averaged speed for that direction ~1180 kB/s
[03:45:57] + Received work.
[03:45:57] + Closed connections
[03:45:57] 
[03:45:57] + Processing work unit
[03:45:57] Core required: FahCore_a2.exe
[03:45:57] Core found.
[03:45:57] Working on Unit 01 [December 19 03:45:57]
[03:45:57] + Working ...
[03:45:57] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 6018 -version 602'

[03:45:57] 
[03:45:57] *------------------------------*
[03:45:57] Folding@Home Gromacs SMP Core
[03:45:57] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[03:45:57] 
[03:45:57] Preparing to commence simulation
[03:45:57] - Ensuring status. Please wait.
[03:45:57] Files status OK
[03:45:58] - Expanded 4835603 -> 23977273 (decompressed 495.8 percent)
[03:45:58] Called DecompressByteArray: compressed_data_size=4835603 data_size=23977273, decompressed_data_size=23977273 diff=0
[03:45:58] - Digital signature verified
[03:45:58] 
[03:45:58] Project: 2669 (Run 2, Clone 52, Gen 143)
[03:45:58] 
[03:45:58] Assembly optimizations on if available.
[03:45:58] Entering M.D.
[03:46:08] Run 2, Clone 52, Gen 143)
[03:46:08] 
[03:46:08] Entering M.D.
[03:46:36] CoreStatus = FF (255)
[03:46:36] Client-core communications error: ERROR 0xff
[03:46:36] Deleting current work unit & continuing...
[03:46:49] - Warning: Could not delete all work unit files (1): Core file absent
[03:46:49] Trying to send all finished work units
[03:46:49] + No unsent completed units remaining.
[03:46:49] - Preparing to get new work unit...
[03:46:49] + Attempting to get work packet
[03:46:49] - Will indicate memory of 3707 MB
[03:46:49] - Connecting to assignment server
[03:46:49] Connecting to http://assign.stanford.edu:8080/
[03:46:49] Posted data.
[03:46:49] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[03:46:49] + News From Folding@Home: Welcome to Folding@Home
[03:46:49] Loaded queue successfully.
[03:46:49] Connecting to http://171.64.65.56:8080/
[03:46:55] Posted data.
[03:46:55] Initial: 0000; - Receiving payload (expected size: 4836115)
[03:46:58] - Downloaded at ~1574 kB/s
[03:46:58] - Averaged speed for that direction ~1377 kB/s
[03:46:58] + Received work.
[03:46:58] + Closed connections
[03:47:03] 
[03:47:03] + Processing work unit
[03:47:03] Core required: FahCore_a2.exe
[03:47:03] Core found.
[03:47:03] Working on Unit 02 [December 19 03:47:03]
[03:47:03] + Working ...
[03:47:03] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 02 -checkpoint 15 -verbose -lifeline 6018 -version 602'

[03:47:03] 
[03:47:03] *------------------------------*
[03:47:03] Folding@Home Gromacs SMP Core
[03:47:03] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[03:47:03] 
[03:47:03] Preparing to commence simulation
[03:47:03] - Ensuring status. Please wait.
[03:47:13] - Looking at optimizations...
[03:47:13] - Working with standard loops on this execution.
[03:47:13] - Files status OK
[03:47:14] - Expanded 4835603 -> 23977273 (decompressed 495.8 percent)
[03:47:14] Called DecompressByteArray: compressed_data_size=4835603 data_size=23977273, decompressed_data_size=23977273 diff=0
[03:47:14] - Digital signature verified
[03:47:14] 
[03:47:14] Project: 2669 (Run 2, Clone 52, Gen 143)
[03:47:14] 
[03:47:14] Entering M.D.
[03:47:22] Completed 0 out of 250000 steps  (0%)
[07:29:29] ***** Got an Activate signal (2)
[07:29:29] Killing all core threads

Folding@Home Client Shutdown.


--- Opening Log file [December 19 08:41:09] 


# SMP Client ##################################################################
###############################################################################

                       Folding@Home Client Version 6.02

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/hope/folding
Executable: ./fah6
Arguments: -smp -verbosity 9 

[08:41:09] - Ask before connecting: No
[08:41:09] - User name: Amaruk (Team 50625)
[08:41:09] - User ID not found locally
[08:41:09] + Requesting User ID from server
[08:41:09] - Getting ID from AS: 
[08:41:09] Connecting to http://assign.stanford.edu:8080/
[08:41:09] Posted data.
[08:41:09] Initial: DF10; - Received User ID = 10DF386Cxxxxxxxx
[08:41:09] - Machine ID: 1
[08:41:09] 
[08:41:09] Work directory not found. Creating...
[08:41:09] Could not open work queue, generating new queue...
[08:41:09] - Autosending finished units...
[08:41:09] Trying to send all finished work units
[08:41:09] + No unsent completed units remaining.
[08:41:09] - Autosend completed
[08:41:09] - Preparing to get new work unit...
[08:41:09] + Attempting to get work packet
[08:41:09] - Will indicate memory of 3707 MB
[08:41:09] - Detect CPU. Vendor: AuthenticAMD, Family: 15, Model: 2, Stepping: 3
[08:41:09] - Connecting to assignment server
[08:41:09] Connecting to http://assign.stanford.edu:8080/
[08:41:09] Posted data.
[08:41:09] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[08:41:09] + News From Folding@Home: Welcome to Folding@Home
[08:41:09] Loaded queue successfully.
[08:41:09] Connecting to http://171.64.65.56:8080/
[08:41:23] Posted data.
[08:41:23] Initial: 0000; - Receiving payload (expected size: 4856176)
[08:41:27] - Downloaded at ~1185 kB/s
[08:41:27] - Averaged speed for that direction ~1185 kB/s
[08:41:27] + Received work.
[08:41:27] + Closed connections
[08:41:27] 
[08:41:27] + Processing work unit
[08:41:27] Core required: FahCore_a2.exe
[08:41:27] Core found.
[08:41:27] Working on Unit 01 [December 19 08:41:27]
[08:41:27] + Working ...
[08:41:27] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 5932 -version 602'

[08:41:28] 
[08:41:28] *------------------------------*
[08:41:28] Folding@Home Gromacs SMP Core
[08:41:28] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[08:41:28] 
[08:41:28] Preparing to commence simulation
[08:41:28] - Ensuring status. Please wait.
[08:41:28] Files status OK
[08:41:29] - Expanded 4855664 -> 24045785 (decompressed 495.2 percent)
[08:41:29] Called DecompressByteArray: compressed_data_size=4855664 data_size=24045785, decompressed_data_size=24045785 diff=0
[08:41:29] - Digital signature verified
[08:41:29] 
[08:41:29] Project: 2662 (Run 0, Clone 143, Gen 64)
[08:41:29] 
[08:41:29] Assembly optimizations on if available.
[08:41:29] Entering M.D.
[08:41:38] Run 0, Clone 143, Gen 64)
[08:41:38] 
[08:41:38] Entering M.D.
[08:41:47] Completed 0 out of 250000 steps  (0%)
[08:48:25] Completed 2500 out of 250000 steps  (1%)
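The hang shows up in the log above as a progress line followed by hours of silence: "Completed 0 out of 250000 steps" at 03:47:22, then nothing until the shutdown signal at 07:29:29. By contrast, the later 2662 unit reaches 1% in under seven minutes. Below is a rough sketch of how one might flag such gaps in an FAHlog-style file. The function name `find_stalls` and the one-hour threshold are my own assumptions, not anything provided by the client, and because the log timestamps carry no date, a gap of 24 hours or more cannot be measured this way.

```python
import re
from datetime import timedelta

# "[HH:MM:SS] message" as it appears in the client log
LINE_RE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s*(.*)$")
# "Completed 2500 out of 250000 steps  (1%)"
PROGRESS_RE = re.compile(r"Completed (\d+) out of (\d+) steps")


def find_stalls(log_lines, threshold=timedelta(hours=1)):
    """Return (timestamp, steps_done, gap) for every progress line that is
    followed by at least `threshold` of silence before the next log line.

    Timestamps have no date, so the gap is computed modulo 24 hours.
    """
    stalls = []
    prev = None  # (seconds since midnight, timestamp text, steps done)
    for line in log_lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # banners, blank lines, "Opening Log file" markers
        h, mi, s, msg = m.groups()
        now = int(h) * 3600 + int(mi) * 60 + int(s)
        if prev is not None:
            gap = timedelta(seconds=(now - prev[0]) % 86400)
            if gap >= threshold:
                stalls.append((prev[1], prev[2], gap))
        p = PROGRESS_RE.search(msg)
        # Only remember this line if it reported simulation progress
        prev = (now, f"{h}:{mi}:{s}", int(p.group(1))) if p else None
    return stalls


if __name__ == "__main__":
    sample = [
        "[03:47:14] Entering M.D.",
        "[03:47:22] Completed 0 out of 250000 steps  (0%)",
        "[07:29:29] ***** Got an Activate signal (2)",
        "[08:41:47] Completed 0 out of 250000 steps  (0%)",
        "[08:48:25] Completed 2500 out of 250000 steps  (1%)",
    ]
    for stamp, steps, gap in find_stalls(sample):
        print(f"possible hang after {stamp} at step {steps}, silent for {gap}")
```

On the sample lines, only the 03:47:22 entry is reported, since the 2662 unit moves on within minutes.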
And a snippet from the terminal:

Code: Select all

bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.1334   0.4055      0.1090

Step 35750101, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.033269, max 2.064907 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7094   7096   90.0    0.4055   0.3341      0.1090

Step 35750102, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.328024, max 18.621902 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7088   7090   90.0    0.1090   1.0036      0.1090
   7097   7098   90.0    0.0996   0.1921      0.1010
Warning: 1-4 interaction between 7088 and 7096 at distance 2.375 which is larger than the 1-4 table size 2.200 nm
These are ignored for the rest of the simulation
This usually means your system is exploding,
if not, you should increase table-extension in your mdp file
or with user tables increase the table size

Step 35750103, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 4354159284.624430, max 225309523968.000000 (between atoms 7094 and 7096)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   7083   7084   90.0    0.1090   0.2634      0.1090
   7094   7096   90.0    2.1388 24558737408.0000      0.1090
   7097   7098   90.0    0.1921   0.1023      0.1010
   7097   7099   90.0    0.1031   4.9392      0.1010
   7097   7100   90.0    0.1042   0.4260      0.1010
   7127   7128   90.0    0.1090   6.4780      0.1090
   7127   7129   97.2    0.1090  35.8049      0.1090
   7127   7130   90.0    0.1090   6.3442      0.1090
   7015   7016   90.0    0.1090   1.5267      0.1090
   7015   7018   90.0    0.1090   1.5241      0.1090
  22271  22273   90.0    0.1090 37075.3867      0.1090
  22214  22215   90.0    0.1090 1205.4769      0.1090
  22221  22222   90.0    0.1090   2.7076      0.1090
  22221  22223   90.0    0.1090   7.5109      0.1090
  22286  22288   90.0    0.1010 733765376.0000      0.1010
  22764  22765   90.0    0.0960 1309.7543      0.0960
  22157  22158  100.1    0.1090   3.7405      0.1090
  22157  22159   90.0    0.1090 210.9942      0.1090

t = 71500.209 ps: Water molecule starting at atom 112288 can not be settled.
Check for bad contacts and/or reduce the timestep.
I like the bit about 'This usually means your system is exploding' :lol:
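For what it's worth, the blow-up is easy to follow in the numbers: the max relative constraint deviation between atoms 7094 and 7096 goes from about 2 to about 18 to about 2.25e11 over three consecutive steps. Here is a minimal sketch for pulling those values out of terminal output like the above. It assumes the warning layout is exactly as pasted here (a "Step ... LINCS WARNING" line followed shortly by an "rms ..., max ..." line); the function name `parse_lincs_warnings` is made up for illustration and is not part of GROMACS or the FAH client.

```python
import re

# "Step 35750101, time 71500.2 (ps)  LINCS WARNING"
STEP_RE = re.compile(r"Step (\d+), time ([\d.]+) \(ps\)\s+LINCS WARNING")
# "rms 0.033269, max 2.064907 (between atoms 7094 and 7096)"
DEV_RE = re.compile(
    r"rms ([\d.]+), max ([\d.]+) \(between atoms (\d+) and (\d+)\)"
)


def parse_lincs_warnings(text):
    """Collect one record per LINCS warning found in `text`."""
    warnings = []
    step = None
    for line in text.splitlines():
        m = STEP_RE.search(line)
        if m:
            step = int(m.group(1))
            continue
        m = DEV_RE.search(line)
        if m and step is not None:
            warnings.append({
                "step": step,
                "rms": float(m.group(1)),
                "max": float(m.group(2)),
                "atoms": (int(m.group(3)), int(m.group(4))),
            })
            step = None  # wait for the next "Step ..." header
    return warnings


if __name__ == "__main__":
    sample = """\
Step 35750101, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.033269, max 2.064907 (between atoms 7094 and 7096)

Step 35750102, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.328024, max 18.621902 (between atoms 7094 and 7096)

Step 35750103, time 71500.2 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 4354159284.624430, max 225309523968.000000 (between atoms 7094 and 7096)
"""
    for w in parse_lincs_warnings(sample):
        print(w["step"], w["atoms"], w["max"])
```

Feeding it the saved terminal output gives one record per warning, which makes it simple to see at which step the deviation takes off.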
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 2669 (Run 2, Clone 52, Gen 143)

Post by bruce »

I put it on the list to be stopped.