which is the most important file in the work unit results?

whocrazy

Hi.
Just out of curiosity: I remember that the first version of FAH I tried, back in the old Windows 9x and XP days, would create a big text .gro file whenever a work unit finished, then somehow compress it into a binary file and send the results. When work units are submitted now, however, the results include all the files: the .EDR and .XTC files, all the log files, and the .gro file.
Which one is the most important file, and why are all the files included?
Thanks.
muziqaz

whocrazy wrote: Sat Mar 01, 2025 10:08 am Hi.
Just out of curiosity: I remember that the first version of FAH I tried, back in the old Windows 9x and XP days, would create a big text .gro file whenever a work unit finished, then somehow compress it into a binary file and send the results. When work units are submitted now, however, the results include all the files: the .EDR and .XTC files, all the log files, and the .gro file.
Which one is the most important file, and why are all the files included?
Thanks.
Can I ask why you want to know? Which of your children is the most important to you?
I know that while you are folding, wudata_01.dat is your WU. It is possible that this file is sent back alongside science.log and the other files. However, it is also possible that wudata_01.dat is extracted into other files. This process happens quickly, and since no one who folds really cares about it as long as the results reach the server, it is not well documented.
whocrazy

I only ask because I am interested and like to know how it all works. There is no ulterior motive: I am not trying to hack anything or gain an unfair advantage. I am autistic and just very curious. I would also like to know what all the other files do, like the .XTC and .EDR files.
PS: I have no children, and I am a little confused by your query.
muziqaz

To be fair, I don't know exactly what each of the files does.
checkpt.crc is probably an error-checking file which checks that the simulation is not looking for aliens
core.xml contains the fahcore arguments
dhdl.xvg is, I think, a simulation type file
ener.edr has something to do with the energy field (probably)
frame22 files probably hold a snapshot of the frame (1%)
md.log is the molecular dynamics log from Gromacs/OpenMM (the software used for the simulations)
science.log has some extra simulation-related logs which are not included in fahlog
state files are again snapshots of the current state, or something like that.
I'm sure someone will correct me on most of them.
The point is that a lot of the files are just proprietary files needed for the simulation, and none of them is more or less important than the others. We cannot say which file is the most important one in a game or an operating system either. All of them are, because if you delete one random file your program might not start or work correctly :)
Joe_H
Site Admin

And some of the files may be used during the validation process on WUs received back by the WS (work server) before it creates the next Gen WU to send out. With something missing, the WU may fail validation, not earn credit, and be dumped by the server.
arisu

The answer is: All files have a use except the .gro file, which contains duplicate information already contained in another file.

Note and disclaimer: None of the information below was obtained through reverse engineering the proprietary FAH cores. All of it comes from public sources pertaining to the open source version of GROMACS or from simple observation. There may be subtle differences between the public GROMACS and the version that FAH uses, but the below should be substantially correct.

The WU that you receive is a single file, wudata_01.dat. The FAH core processes it, creating some files along the way, and then packages some of those files into wuresults_01.dat, which is sent to the collection server. There it is automatically converted into a new wudata_01.dat, which is sent to the next person to continue the simulation from where you left off. The .dat files are .tar files (a file archive format similar to zip that is not unique to GROMACS) with a special header prepended that contains some metadata like project details, checksums to detect corruption, and a signature to ensure authenticity. The N in frameN refers to the Generation of the WU, so a project with Run 10, Clone 14, Gen 26 would have frameN files named frame26.

Files received from the work server:
wudata_01.dat which contains core.xml and frameN.tpr

Files created and sent to the collection server:
wuresults_01.dat which contains dhdl.xvg, frameN.gro, frameN.xtc, state.cpt, logfile_01.txt, science.log, and md.log

Files created but used only temporarily:
checkpt.crc, ener.edr, state_prev.cpt, and state_stepN.cpt
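
The container layout described above can be sketched in a few lines of Python. This is only an illustration: the 512-byte header size is an assumption taken from the "tail -c+513" commands used later in this post, and list_wu_members is a made-up helper name, not part of any FAH tool.

```python
import io
import tarfile

# Assumed header size, inferred from the "tail -c+513" commands in this post.
HEADER_SIZE = 512


def list_wu_members(dat_bytes, header_size=HEADER_SIZE):
    """Skip the prepended header and list the files in the tar archive that follows."""
    payload = io.BytesIO(dat_bytes[header_size:])
    # "r:*" lets tarfile detect by itself whether the archive is compressed.
    with tarfile.open(fileobj=payload, mode="r:*") as archive:
        return archive.getnames()
```

Calling it as list_wu_members(open("wudata_01.dat", "rb").read()) on a downloaded WU should then print names such as core.xml and frameN.tpr, if the header really is 512 bytes for that project.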




Here is a very BRIEF description of the files:

- core.xml contains simulation settings and specifies which files should be returned to the researcher.
- frameN.gro contains the positions of each particle and their velocities at the end of the simulation.
- frameN.xtc contains the positions of only some particles (it usually excludes surrounding water molecules) at low precision.
- state.cpt contains a checkpoint of the entire simulation state. The previous checkpoint is state_prev.cpt.
- checkpt.crc probably contains checksums to detect data corruption. It's probably a proprietary FAH file.
- frameN.tpr contains the starting state of the simulation. It's sort of like an initial checkpoint that you "resume" from.
- logfile_01.txt, science.log, and md.log contain logs about the simulation progress and any errors that occur.
- dhdl.xvg is a plot file that records dH/dλ over time: how the system's energy function (the Hamiltonian, not the enthalpy) changes with the coupling parameter λ.
- ener.edr contains information about the energy of the system.




Here is a very DETAILED description of the files, as best as I understand:

Configuration and core parameters (core.xml)
The .xml file (core.xml) contains parameters for the simulation that are passed to mdrun (the command that reads the files and writes output files and does the simulation magic). It also specifies which files are supposed to be uploaded to the collection server. If core.xml contains different parameters, then files different than the ones described below may be used. What I describe in this post comes from WUs with a core.xml that contains something like this (with the N in frameN replaced with the generation number of the WU):

Code:

<config><core-args v='-c frameN.gro
    -s frameN.tpr
    -x frameN.xtc'/><return v='*.log
    *.xtc
    *.xvg
    state.cpt
    *.gro'/>
</config>
This tells the core to start the simulation with "mdrun -c frameN.gro -s frameN.tpr -x frameN.xtc", plus extra arguments chosen by the core (like "-cpt 5 -np 7 -ntmpi 1" appended to that to set checkpoint intervals and multithreading options). It also says to return to the server any files ending in .log, .xtc, .xvg, .gro, and the file with the full name state.cpt. Because the researcher in charge of the project creates the core.xml file, he or she could add additional options as needed. If they wanted the energy file to be sent to them, they could add *.edr to the list of files to return and it will be packaged into the wuresults_01.dat file instead of discarded.
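
To make the two roles of core.xml concrete, here is a small Python sketch that reads the example above (with N replaced by 26) and works out which files in a work directory match the return patterns. The helper names are made up, and whether the real core matches the patterns shell-style like this is an assumption.

```python
import fnmatch
import xml.etree.ElementTree as ET

# The example core.xml from above, with N replaced by generation 26.
CORE_XML = """<config><core-args v='-c frame26.gro
    -s frame26.tpr
    -x frame26.xtc'/><return v='*.log
    *.xtc
    *.xvg
    state.cpt
    *.gro'/>
</config>"""


def parse_core_xml(text):
    """Return (mdrun arguments, return patterns) from a core.xml string."""
    root = ET.fromstring(text)
    args = root.find("core-args").get("v").split()
    patterns = root.find("return").get("v").split()
    return args, patterns


def files_to_return(names, patterns):
    """Keep the file names that match at least one return pattern (shell-style)."""
    return [n for n in names if any(fnmatch.fnmatchcase(n, p) for p in patterns)]
```

With a work directory that also holds ener.edr, state_prev.cpt and checkpt.crc, files_to_return leaves those three out, which fits the list of temporary files given earlier. Note that logfile_01.txt does not match any of these patterns either, so the core presumably adds it to the results on its own.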

Molecular structure (frameN.gro)
The .gro file (frameN.gro) contains the molecular structure in a GROMACS-specific format. By concatenating multiple .gro files, you get a trajectory file (I don't think FAH uses it that way though). Each line describes one atom, names the molecule (residue) that it's a part of, and specifies its position and velocity vectors. This seems to be the largest file. It actually contains completely redundant information, and there's no real need for it to be created and sent to the server. The other files that are sent to the server (or were sent to us by the server) can be used to re-create the .gro file:

Code:

$ mkdir wudata wuresults
$ tail -c+513 wudata_01.dat | tar -C wudata -x
$ tail -c+513 wuresults_01.dat | tar -C wuresults -x
$ gmx convert-trj -f wuresults/state.cpt -s wudata/frame4.tpr -o out.gro 2>/dev/null
$ diff wuresults/frame4.gro out.gro
1c1
< FOO in water
---
> frame t= 25000.000
The only difference is the title on the first line; the atom data is identical. The .gro format can also be used to visualize the ending point of the simulation. Here is an example from a WU I ran recently. I've removed the water molecules and ions to make it easier to see and show it in two different visualization styles:

Image Image
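
For anyone who wants to read a .gro file themselves: the atom lines are fixed-width. Below is a minimal Python sketch, assuming the standard column layout from the public GROMACS file-format documentation (residue number, residue name, atom name and atom number at 5 characters each, then three 8-character positions in nm and three 8-character velocities in nm/ps). parse_gro_atom is a made-up helper name, and files written with a non-default precision would need different column widths.

```python
def parse_gro_atom(line):
    """Split one fixed-width .gro atom line into its fields."""
    has_velocities = len(line.rstrip()) >= 68
    return {
        "resnr": int(line[0:5]),
        "resname": line[5:10].strip(),
        "atomname": line[10:15].strip(),
        "atomnr": int(line[15:20]),
        # Positions occupy columns 20-43, velocities (if present) columns 44-67.
        "pos": tuple(float(line[20 + 8 * i:28 + 8 * i]) for i in range(3)),
        "vel": tuple(float(line[44 + 8 * i:52 + 8 * i]) for i in range(3))
        if has_velocities else None,
    }
```

The first line of a .gro file is a free-form title and the second is the atom count, so atom lines start at line three; the last line holds the box vectors and must not be fed to this function.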

Energy information (ener.edr)
The .edr file (ener.edr) contains information about the energy of the system (temperature, volume, enthalpy, pressure, potential energy, etc.) in a portable (machine-independent) format. It doesn't seem to be sent back to the server in any of the WUs I have run. The file can be analyzed with the "gmx energy" command. The binary equivalent of the .edr file is .ene. The two can be inter-converted, but a .ene file created on one architecture might not be readable if moved to another architecture. I don't think FAH uses .ene files.

Trajectory information (frameN.xtc and frameN.trr)
The .xtc file (frameN.xtc) is a portable (machine-independent) file containing low-precision trajectories. It stores a list of steps (and their timestamps) and the atoms' coordinates at each of those steps. The .trr file is also a trajectory file. Unlike the .xtc file, it contains full-precision trajectories as well as velocities, forces, and energies. I don't see the .trr file on any WUs I am running, but some FAH projects use it instead of the reduced-precision .xtc file. The simulation periodically appends a new "snapshot" to the trajectory file. How often a snapshot is made is configured by the researcher.
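
To give a feel for what "low-precision" means here, the Python sketch below rounds a coordinate the way a lossy trajectory format would. It assumes a precision factor of 1000 (0.001 nm), which is the default for compressed output in public GROMACS; the projects' actual setting may differ, and the real .xtc format also compresses the rounded values, which this sketch ignores.

```python
# Assumed precision factor: the public GROMACS default for compressed trajectories.
XTC_PRECISION = 1000


def quantize(coord_nm, precision=XTC_PRECISION):
    """Round a coordinate in nm to the nearest 1/precision nm."""
    return round(coord_nm * precision) / precision
```

For example, quantize(1.23456) gives 1.235, so under this assumption no coordinate is off by more than half a thousandth of a nanometre.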

Simulation state checkpoints (state.cpt, state_prev.cpt, and state_stepN.cpt)
The .cpt file (state.cpt) is just a checkpoint file. It contains everything necessary to resume the simulation exactly where it stopped as well as the offset of files (such as logs) when the checkpoint was made, so that those files can be truncated to restore them to the state they were at then. There is also a state_stepN.cpt (N is the step number that the checkpoint is created for). When a checkpoint is made, the new state is written to state_stepN.cpt. Once that file is written, state.cpt is renamed to state_prev.cpt and state_stepN.cpt is renamed to state.cpt.
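
The rename sequence described above can be sketched in Python as follows. This only illustrates the ordering of the steps and is not GROMACS code; rotate_checkpoint is a made-up name, and the fsync call is my own addition to make the sketch safe against a sudden power loss.

```python
import os


def rotate_checkpoint(workdir, step, data):
    """Write a new checkpoint and rotate state.cpt / state_prev.cpt."""
    new = os.path.join(workdir, f"state_step{step}.cpt")
    current = os.path.join(workdir, "state.cpt")
    previous = os.path.join(workdir, "state_prev.cpt")

    # 1. Write the new state to state_stepN.cpt and force it to disk.
    with open(new, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fsync(handle.fileno())

    # 2. Keep the old checkpoint around as state_prev.cpt.
    if os.path.exists(current):
        os.replace(current, previous)

    # 3. Move the fully written file into place as state.cpt.
    os.replace(new, current)
```

The point of this ordering is that a complete checkpoint exists on disk at every moment: a crash during step 1 leaves state.cpt untouched, and a crash between steps 2 and 3 still leaves state_prev.cpt and the finished state_stepN.cpt.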

If you install the vanilla GROMACS program, you can view incredibly detailed information about it (hundreds of thousands of lines or more) with the command "gmx dump -cp state.cpt". This won't change the file contents btw. It just prints the information it contains.

Checkpoint checksums?? (checkpt.crc)
I don't know what the .crc file (checkpt.crc) is. I can't find anything about it in the GROMACS code so I guess it is something proprietary and custom for FAH. All I can tell is that it's 1760 bytes long and every few minutes, all but the initial 168 bytes are updated. I suspect it contains CRC checksums for components of the checkpoint files based on the name and the fact that it gets updated every time a new checkpoint is made. The GROMACS public source code contains some references to FAH-specific checksumming files and this is probably what it is referring to. FAH's copy of GROMACS seems to disable checksums in the .cpt file, so this file may be what is being used instead.

There is a note that might be relevant in "src/gromacs/fileio/checkpoint.cpp" in the vanilla GROMACS code, describing an undocumented function "fcCheckpoint()" that is probably defined in one of FAH's private repositories and enhances crash recovery:

Code:

#if GMX_FAHCORE
    /* Always FAH checkpoint immediately after a Gromacs checkpoint.
     *
     * Note that it is critical that we save a FAH checkpoint directly
     * after writing a Gromacs checkpoint.  If the program dies, either
     * by the machine powering off suddenly or the process being,
     * killed, FAH can recover files that have only appended data by
     * truncating them to the last recorded length.  The Gromacs
     * checkpoint does not just append data, it is fully rewritten each
     * time so a crash between moving the new Gromacs checkpoint file in
     * to place and writing a FAH checkpoint is not recoverable.  Thus
     * the time between these operations must be kept as short a
     * possible.
     */
    fcCheckpoint();
#endif
This function might be what is responsible for creating the checkpt.crc file.
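
The recovery idea in that source comment (truncate append-only files back to their last recorded length) is easy to illustrate. The Python sketch below is hypothetical: the function names are made up and this is not how fcCheckpoint() is actually implemented, which I have no way of knowing.

```python
import os


def record_lengths(paths):
    """At checkpoint time, remember how long each append-only file is."""
    return {path: os.path.getsize(path) for path in paths}


def truncate_to_recorded(lengths):
    """After a crash, cut off anything appended since the last checkpoint."""
    for path, length in lengths.items():
        if os.path.getsize(path) > length:
            os.truncate(path, length)
```

This works for logs and trajectory files because new data only ever goes at the end. It cannot work for state.cpt, which is rewritten in full each time, and that is why the comment insists that the two checkpoints be written back to back.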

Molecular topology and simulation parameters (frameN.tpr)
The .tpr file (frameN.tpr) contains the starting state of the simulation (atom positions, energies, velocities, etc) and simulation parameters. It can be thought of as a combination of the "initial checkpoint" that is used when you first start running the WU and a detailed configuration file. If there are no checkpoint files, then this file is used instead and the simulation starts at 0%. I'm guessing that the .cpt file from a WU result is converted into the .tpr file for the new WU for the next Generation on the server.

Just like with the .cpt file, you can view a huge trove of information about the simulation with "gmx dump -s frameN.tpr" (the -s option is the one that takes a .tpr run input file; -f is for trajectory files). You may notice that some of the information is also printed at the beginning of md.log. Here is an example: https://textbin.net/q2ux7935bg

Log files (logfile_01.txt, science.log, and md.log)
The log files contain human-readable log information. The logfile_01.txt file contains the core's output log. The client's log that you can view through the web app contains this log (as well as lines from other running cores and the client itself). The science.log file is where all the output of the mdrun command goes that would otherwise be printed (on Linux, it just binds the core's stderr and stdout to this file). The md.log file contains information about the state of the simulation at various points in time as well as information about the GROMACS build that is being used and its configuration.

Graphical plot (dhdl.xvg)
The .xvg file (dhdl.xvg) is a graphical plot file that contains plots of various components of the simulation over time. For FAH core a8, at least with the WU I looked at, it contains a graph of dH/dλ (dH/dL, hence the name dhdl.xvg) over time. As far as I can tell from the GROMACS documentation, H here is the Hamiltonian (the system's energy function) rather than the enthalpy, and λ is the coupling parameter that free-energy calculations use to move between two states. Don't ask me what a lambda state is exactly, but apparently the rate at which the energy changes with λ is important enough to save. Here's an example plot generated from a .xvg file on a WU I completed recently:

Image

I created that image from dhdl.xvg with the following commands (on Linux, this needs the packages grace, ghostscript, and imagemagick):

Code:

gracebat dhdl.xvg
gs -dQUIET -dBATCH -dTextAlphaBits=4 -sDEVICE=pnggray -r96 -odhdl.png dhdl.ps
mogrify -transverse -flip -trim -strip -border 16 -bordercolor white -quality 100 dhdl.png
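
If you would rather get at the numbers than at a picture, a .xvg file is plain text and easy to read. A minimal Python sketch, assuming the usual Grace conventions (lines starting with # are comments, lines starting with @ are plot settings, a line starting with & separates data sets, everything else is whitespace-separated columns); parse_xvg is a made-up helper name.

```python
import re

# Matches Grace legend lines such as: @ s0 legend "some label"
LEGEND_RE = re.compile(r'@\s*s\d+\s+legend\s+"(.*)"')


def parse_xvg(text):
    """Return (legends, rows) from the text of a .xvg file."""
    legends, rows = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "&")):
            continue  # comment, data set separator, or blank line
        if line.startswith("@"):
            match = LEGEND_RE.match(line)
            if match:
                legends.append(match.group(1))
            continue  # other plot settings are ignored
        rows.append([float(value) for value in line.split()])
    return legends, rows
```

The first column of each row is normally the time, and the remaining columns line up with the legends in order.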
Different FAH GROMACS cores may use slightly different files. Even individual projects may use and send different files (some use .trr, some don't create or send a .xvg file, some don't even create or send a .xtc file) or different file names (some projects use mdN instead of frameN). This is up to the researcher, who can adjust it by changing core.xml. The GPU cores, based on OpenMM instead of GROMACS, are very different (I think the only file they have in common is a .xtc file). The OpenMM cores mostly use .xml files instead of arcane custom formats. Interestingly, FAH's internal editions of both OpenMM and GROMACS seem to support encrypted input files. I don't know what the purpose of that would be since the process would need access to the key to make use of the input files in the first place...




N.B. Uploading the .gro file doesn't make too much sense to me. If it can be generated at the end of the simulation from other existing files in the work directory, why aren't those files uploaded to the collection server instead, and have the collection server generate the .gro if it's really needed? That would save more than 60% on upload bandwidth, probably more for large WUs.
Last edited by arisu on Tue Mar 18, 2025 6:48 am, edited 1 time in total.