@badarsh2
Created August 13, 2018 17:03
Summary of my work at OpenChemistry as part of Google Summer of Code 2018

Sample Screenshot

As part of Google Summer of Code 2018, I contributed to Avogadro 2, adding support for analysing Molecular Dynamics (MD) trajectories and generating inputs for a few MD codes. My work covered new trajectory readers, MD input readers and writers, MD input generation tools, and a revamp of some of the visualization tools.

Work Accomplished

I made the following pull requests towards adding MD trajectory readers:

  • LAMMPS dump reader: ASCII full-precision trajectory dump reader. This patch contains the code additions for LAMMPS dump reader, and this patch contains the corresponding IO tests.
  • trr reader - The GROMACS trr trajectory format stores trajectory information in a "packed" binary format. To unpack the binary data, I used the struct library: in this patch, I added the struct library to the production code, and in this patch, I added the reader implementation, which uses the library to unpack the binary data and set the molecule data.
  • dcd reader - dcd is a trajectory format supported by several packages such as OpenMM, CHARMM, NAMD and LAMMPS. Like trr, dcd stores data in a packed binary format. This patch adds the dcd reader implementation, which handles both CHARMM-standard and non-CHARMM dcd files appropriately, as CHARMM dcd files include extra information such as unit cell coordinates.
  • OUTCAR reader - The OUTCAR ASCII file gives detailed output of a VASP run. This patch adds the implementation and tests for the VASP OUTCAR reader.
  • mdcrd reader - mdcrd is AMBER's ASCII trajectory format. The format itself is quite rudimentary: it contains merely a sequence of floats, so the number of atoms and the presence or absence of unit cell floats cannot easily be determined from the trajectory alone. Hence, in addition to this patch, which stores the sequence of floats in an array, this patch, which is still WIP, opens a dialog that asks the user for the number of atoms and whether unit cell data is present, and sets up the molecule accordingly.
  • pdb trajectory reader - The PDB trajectory format is not much different from the conventional PDB format: it is simply a concatenation of conventional PDB frames separated by the ENDMDL keyword. OpenMM uses PDB as one of its trajectory reporters. This patch contains the code additions for supporting PDB trajectory reading on top of the existing native PDB reader.
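
To illustrate why the mdcrd reader needs the atom count and the unit cell flag from the user, here is a minimal Python sketch. It is not the Avogadro 2 implementation (which is C++); the function name `split_mdcrd_frames` and its parameters are hypothetical, and it assumes the floats have already been parsed into a flat list, with each frame holding 3 coordinates per atom followed by 3 box values when a unit cell is present.

```python
def split_mdcrd_frames(values, n_atoms, has_box):
    """Group a flat list of floats into (coordinates, box) frames.

    Assumption: each frame holds 3 * n_atoms coordinates, followed by
    3 box values when has_box is True.
    """
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    per_frame = 3 * n_atoms + (3 if has_box else 0)
    if len(values) % per_frame != 0:
        raise ValueError("float count does not match n_atoms / has_box")
    frames = []
    for start in range(0, len(values), per_frame):
        chunk = values[start:start + per_frame]
        # Consecutive triples of floats are the x, y, z of one atom.
        coords = [tuple(chunk[i:i + 3]) for i in range(0, 3 * n_atoms, 3)]
        box = tuple(chunk[3 * n_atoms:]) if has_box else None
        frames.append((coords, box))
    return frames


# 18 floats: two frames of two atoms, each followed by three box values.
flat = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 10.0, 10.0, 10.0,
        0.1, 0.0, 0.0, 1.1, 0.0, 0.0, 10.0, 10.0, 10.0]
with_box = split_mdcrd_frames(flat, n_atoms=2, has_box=True)
without_box = split_mdcrd_frames(flat, n_atoms=3, has_box=False)
```

Both calls succeed on the same 18 floats: they can be read as two atoms plus a box or as three atoms without one, so the file alone cannot settle the question.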

After adding readers for various MD codes, it became clear that a topology file import was needed, because many of the trajectory formats do not contain information about the constituent elements.

  • This patch addresses that need. Choosing the topology import option from the Build menu opens a file picker dialog, and the available native/OpenBabel file reader is used to read the chosen file. Element information is then obtained from the topology file and assigned to the trajectory molecule.
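
The idea behind the topology import can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Avogadro 2 code: `assign_elements` is a hypothetical helper, the topology is reduced to a list of element symbols in atom order, and each trajectory frame is a list of (x, y, z) tuples in that same order.

```python
def assign_elements(topology_elements, trajectory_frames):
    """Pair each atom position in every frame with its element symbol.

    Assumption: atoms appear in the same order in the topology file
    and in the trajectory.
    """
    n_atoms = len(topology_elements)
    for index, frame in enumerate(trajectory_frames):
        if len(frame) != n_atoms:
            raise ValueError(
                f"frame {index} has {len(frame)} atoms, topology has {n_atoms}"
            )
    return [list(zip(topology_elements, frame)) for frame in trajectory_frames]


# A water molecule over two frames; the trajectory itself carries no elements.
elements = ["O", "H", "H"]
frames = [
    [(0.00, 0.00, 0.0), (0.96, 0.00, 0.0), (-0.24, 0.93, 0.0)],
    [(0.01, 0.00, 0.0), (0.97, 0.00, 0.0), (-0.23, 0.93, 0.0)],
]
labelled = assign_elements(elements, frames)
```

The atom count check matters because a mismatched topology would otherwise silently label atoms with the wrong elements.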

I also worked on input generation for a few of the MD codes:

  • LAMMPS Input Generator - Avogadro 1 had a plugin for LAMMPS input generation, but it generated only the job file. This patch brings the plugin to Avogadro 2 with some modifications: alongside the main job script, the lmpdat data file can now also be previewed. This patch adds the implementation of the native lmpdat writer. Simulation inputs such as periodic boundary conditions, velocity distribution and temperature can be set in the dialog, and the job file and the data file are generated from them.
  • OpenMM Input Generator - Adapted from http://builder.openmm.org. This patch contains the implementation of the OpenMM script builder plugin. In addition to the Python job script, a coordinate file is also generated if a writer for the corresponding data file format is available.
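
As a rough picture of what turning dialog inputs into a job file involves, the Python sketch below fills a small LAMMPS input template. It is an assumption-laden illustration rather than the plugin's actual template: `build_lammps_input` and its parameters are hypothetical, the choice of `units real`, `atom_style full`, an NVT thermostat and the damping constant are arbitrary, and force-field commands such as `pair_style` are omitted, so the result is not a complete simulation input.

```python
def build_lammps_input(data_file, temperature, periodic=(True, True, True),
                       steps=1000, seed=12345):
    """Return a minimal LAMMPS input script as a string.

    Assumption: a Gaussian velocity distribution and an NVT ensemble,
    chosen only for illustration.
    """
    # "p" marks a periodic dimension, "f" a fixed, non-periodic one.
    boundary = " ".join("p" if flag else "f" for flag in periodic)
    lines = [
        "units real",
        "atom_style full",
        f"boundary {boundary}",
        f"read_data {data_file}",
        f"velocity all create {temperature:.1f} {seed} dist gaussian",
        f"fix 1 all nvt temp {temperature:.1f} {temperature:.1f} 100.0",
        "timestep 1.0",
        f"run {steps}",
    ]
    return "\n".join(lines) + "\n"


script = build_lammps_input("water.lmpdat", 300.0, periodic=(True, True, False))
print(script)
```

The `boundary` line comes before `read_data` because LAMMPS needs the boundary conditions before the simulation box is defined.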

Setting the readers and writers aside, at the heart of Avogadro 2 lie the visualization tools. Tools for animating and exporting media formats of the trajectory evolution are of pivotal importance. To enhance their compatibility with Molecular Dynamics, I made the following tweaks and improvements.

  • Player tool overhaul - I made some enhancements to the player tool: I added a navigation slider and an input spinbox to navigate easily to a desired time frame. In this patch, I also added the ability to store the timestep values in an array in the molecule class, so that they can be displayed in the player tool (this feature is yet to be completed).
  • Support for native media exports - Media exporting tools were already available, but some of the features did not work as intended. For instance, GIF and AVI export needed certain codecs and libraries to be preinstalled. In this patch, I've fixed those issues by adding libraries natively for GIF export (https://github.com/ginsweater/gif-h) and AVI export (https://github.com/Rolinh/libgwavi). I've also made another patch to fix an issue with PNG files not getting exported to the right path in the MSVC-compiled version.

I also created a few Molecular Dynamics related analysis tools:

  • Root Mean Square Deviation VTK Chart - This patch adds support for rendering a VTK chart displaying the Root Mean Square Deviation curve for trajectories.
  • Pair Distribution Function VTK Chart - This patch adds support for rendering a VTK chart displaying a plot of the Pair (Radial) distribution function of the (starting) molecule.
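
For reference, the quantity plotted in the RMSD chart can be sketched in plain Python as follows. This is a minimal sketch under assumptions, not the Avogadro 2 code: `rmsd_curve` is a hypothetical name, the first frame is taken as the reference, and no superposition (alignment) of frames is performed before measuring the deviation.

```python
import math


def rmsd_curve(frames):
    """Return the RMSD of every frame relative to the first frame.

    Each frame is a list of (x, y, z) tuples with atoms in the same order.
    """
    reference = frames[0]
    curve = []
    for frame in frames:
        total = 0.0
        for (x0, y0, z0), (x, y, z) in zip(reference, frame):
            # Squared displacement of one atom from its reference position.
            total += (x - x0) ** 2 + (y - y0) ** 2 + (z - z0) ** 2
        curve.append(math.sqrt(total / len(reference)))
    return curve


frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # both atoms shifted by 1 along z
    [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)],  # one atom shifted by 2 along z
]
curve = rmsd_curve(frames)
```

Here the curve starts at 0 for the reference frame, is 1 for the uniformly shifted frame, and is the square root of 2 for the last frame.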

In addition to the primary project goals, I also made some bonus patches:

  • Force vector arrows - In this patch, I've cloned the force arrow rendering functionality from Avogadro 1, added relevant functions to create force arrows, and added class variables and functions in the molecule class to store force vectors.
  • Residue classes - In this patch, I've implemented the Biological Residue class, which keeps track of residue name, residue id, chain id, atom names and physically stores the residue atoms. These classes were created with the primary objective of cutting down on the complexity of bond assignment, which here is done by means of dictionary lookup on preset data from the residuedata header.
  • Native PDB Reader - Upon implementing residue classes, I completed the patch which was initiated for implementing a native PDB reader class. I integrated the residue class implementation with the reader, and created appropriate handlers for bond assignment from residue data.
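
The dictionary-lookup approach to bond assignment can be sketched as follows. This Python sketch is illustrative only: the real classes are C++, and the `Residue` interface and the tiny `RESIDUE_BONDS` table here are hypothetical stand-ins for the preset data in the residuedata header.

```python
# Hypothetical preset data: residue name -> bonded pairs of atom names.
RESIDUE_BONDS = {
    "HOH": [("O", "H1"), ("O", "H2")],
    "GLY": [("N", "CA"), ("CA", "C"), ("C", "O")],
}


class Residue:
    """Tracks a residue's identity and the atoms that belong to it."""

    def __init__(self, name, residue_id, chain_id):
        self.name = name
        self.residue_id = residue_id
        self.chain_id = chain_id
        self.atoms = {}  # atom name -> atom index in the molecule

    def add_atom(self, atom_name, atom_index):
        self.atoms[atom_name] = atom_index

    def bonds(self):
        """Look up preset bonds and return them as pairs of atom indices."""
        pairs = []
        for first, second in RESIDUE_BONDS.get(self.name, []):
            # Skip bonds whose atoms are missing from this residue.
            if first in self.atoms and second in self.atoms:
                pairs.append((self.atoms[first], self.atoms[second]))
        return pairs


water = Residue("HOH", residue_id=1, chain_id="A")
for index, atom_name in enumerate(["O", "H1", "H2"]):
    water.add_atom(atom_name, index)
water_bonds = water.bonds()
```

Because the bonds come from a lookup keyed on residue and atom names, no distance-based bond perception is needed for residues present in the table.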

Future Work & Work In Progress

Although I've tried to cover a handful of Molecular Dynamics codes and managed to complete most of the project goals, there are still a couple of things left out which I'll continue working on:

  • Now that residue classes have been implemented, I'll work on modifying the existing GROMACS gro reader to read the residue information and perform relevant actions such as bond assignment from residue data.
  • I haven't completed writing tests for some of the readers and writers. I'll work on finishing them.
  • There are some minor UX issues that require appropriate handling. Firstly, reading a trajectory file immediately fires the Custom Element tool, which is unnecessary, as it would be impractical to rely on the Custom Element tool to remap atom elements in systems containing a large number of atoms. Instead, an alert dialog suggesting the "Import Topology" option from the Build menu would be a more user-friendly approach. Also, some of the menu options need to be enabled or disabled appropriately, depending on whether the input file is a single molecule or a trajectory.
  • Other avenues for creating new readers, writers, tools and generators are to be explored and worked on.

Acknowledgements

I would like to thank Dr. Marcus Hanwell and Prof. Geoffrey Hutchison for their continuous guidance and support, without whom completing this project would have been impossible.

Useful Links

Link to all the project Pull Requests
About Me
