@RuhiRG
Last active August 28, 2023 22:58
GSoC 2023: Python Bindings for d-SEAMS under the PSF

PYSEAMS : Python bindings for d-SEAMS

d-SEAMS (Deferred Structural Elucidation Analysis for Molecular Simulations) is a free and open-source post-processing engine for the analysis of molecular dynamics trajectories. It is specifically able to qualitatively classify ice structures in both strongly confined and bulk systems.

The objective of my project was to create Python bindings to the seams-core C++ engine of d-SEAMS. I named the bindings repository and project pyseams, with the importable name cyoda, in keeping with yodaStruct and yodaLib from d-SEAMS. The d-SEAMS project provided Lua bindings, which many users may find difficult. I wrote bindings with pybind11 for the functions and classes present in d-SEAMS. Each basic example has its own file in pyseams, which users can run. Initially, d-SEAMS didn't support the latest version of macOS, but I worked to ensure that users on M1 Macs can use d-SEAMS as well.

Beyond the original scope of the project, in the final stages I also changed the C++ CLI within d-SEAMS to remove the Lua bindings completely, since we had refactored the functions to return values instead of working with pointers.

Weekly progress

One of the best parts of being a mentee under the PSF is the mandatory blog posts, which let me see how far my work has come and also serve as a yardstick for future contributors. I wrote 13 posts for the PSF blog during the coding period, which may be found here.

Milestones

In the evolution of the pyseams project, several notable milestones and improvements were achieved:

  1. Transition from Enum to Class:

    • To prevent the accidental misuse of constants, especially within the Python bindings, we transitioned from plain enums to enum classes. This ensured that enum values could no longer be mistakenly treated as integers.
  2. Modifications in d-SEAMS:

    • Within d-SEAMS, the original C++ function code was altered. Previously, it processed objects in place and returned an integer status code. This is not very Pythonic, and first-class objects make more sense from a bindings perspective. To address this, the core function in seams-core was restructured to accept a filename string as input and return a populated PointCloud object, instead of filling one in through a pointer.
  3. Establishment of a Separate Repository:

    • A dedicated repository was set up for pyseams, ensuring better management and separation of concerns.
  4. Building Blocks of Python Bindings:

    • The foundational Python bindings were introduced for both enums and classes in pyseams.
  5. Function Binding Additions:

    • Python bindings were then added to pyseams for the seams-core functions.
  6. String Representation:

    • To make objects easier to inspect from Python, pyseams was given its own string representations of the seams-core classes.
  7. Successful Automated Testing:

    • Through GitHub's infrastructure, pyseams underwent and successfully passed automated tests, ensuring the robustness of its codebase.
  8. Binding Extensions for d-SEAMS Examples:

    • To round out the ecosystem, Python bindings were developed for the built-in Lua-based examples present in d-SEAMS.
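
Milestones 1, 2 and 6 can be illustrated from the Python side with a small, pure-Python sketch. It is only an analogy and not the pyseams or seams-core code: AtomState, the PointCloud fields and read_xyz below are hypothetical stand-ins, and the XYZ parsing is deliberately minimal.

```python
import os
import tempfile
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class AtomState(Enum):
    """Scoped constants (hypothetical names): members are not integers."""
    UNCLASSIFIED = 0
    PRISM = 1


@dataclass
class PointCloud:
    """Hypothetical stand-in for the bound C++ class of the same name."""
    n_atoms: int = 0
    coords: List[Tuple[float, ...]] = field(default_factory=list)

    def __repr__(self) -> str:
        # A readable summary instead of a bare memory address.
        return f"<PointCloud with {self.n_atoms} atoms>"


def read_xyz(filename: str) -> PointCloud:
    """Take a filename and return a populated object, not a status code."""
    with open(filename) as handle:
        lines = handle.read().splitlines()
    n_atoms = int(lines[0])
    # Line 2 of an XYZ file is a comment; atom records follow it.
    coords = [tuple(float(x) for x in line.split()[1:4])
              for line in lines[2:2 + n_atoms]]
    return PointCloud(n_atoms=n_atoms, coords=coords)


# Write a tiny two-atom file so the sketch runs without external data.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.xyz")
    with open(path, "w") as handle:
        handle.write("2\ncomment line\nO 0.0 0.0 0.0\nH 0.96 0.0 0.0\n")
    cloud = read_xyz(path)

print(cloud)                 # <PointCloud with 2 atoms>
print(AtomState.PRISM == 1)  # False: a scoped member is not an int
```

The caller gets a first-class object back and can print it directly, and a scoped constant cannot silently stand in for an integer.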

Current state

pyseams has a full set of bindings to d-SEAMS/seams-core, and every example that can be run with yodaStruct can also be run with the native Python bindings (the rest are here):

import bbdir.cyoda as cyoda

trajectory = "subprojects/seams-core/input/traj/exampleTraj.lammpstrj"

# Get the frame
resCloud = cyoda.readLammpsTrjreduced(
    filename=trajectory,
    targetFrame=1,
    typeI=2,  # oxygenAtomType
    isSlice=False,
    coordLow=[0, 0, 0],
    coordHigh=[0, 0, 0],
)

# Calculate the neighbour list by ID
nList = cyoda.neighListO(
    rcutoff=3.5,
    yCloud=resCloud,
    typeI=2,  # oxygenAtomType
)

# Get the hydrogen-bonded network for the current frame
hbnList = cyoda.populateHbonds(
    filename=trajectory,
    yCloud=resCloud,
    nList=nList,
    targetFrame=1,
    Htype=1,  # hydrogen atom type
)

# Hydrogen-bonded network using indices, not IDs
hbnList = cyoda.neighbourListByIndex(
    yCloud=resCloud,
    nList=hbnList,
)

# Get every ring (non-primitives included)
rings = cyoda.ringNetwork(
    nList=hbnList,
    maxDepth=6,
)

# Do the prism analysis for quasi-one-dimensional ice
cyoda.prismAnalysis(
    path="runOne/",  # outDir
    rings=rings,
    nList=hbnList,
    yCloud=resCloud,
    maxDepth=6,
    atomID=0,
    firstFrame=1,  # targetFrame
    currentFrame=1,  # frame
    doShapeMatching=False,
)
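
Two steps in the example above, switching a neighbour list from atom IDs to indices and collecting every ring up to a maximum size, can be sketched in plain Python on a toy system. This is a naive illustration and not the d-SEAMS implementation: the row layout (each row starts with the atom itself, followed by its neighbours), the reading of maxDepth as the largest ring size, and both function names are assumptions made for the sketch.

```python
from typing import Dict, List


def neighbours_by_index(atom_ids: List[int],
                        nlist_by_id: List[List[int]]) -> List[List[int]]:
    """Rewrite an ID-based neighbour list so every entry is a 0-based index.

    Assumed layout (hypothetical): row i starts with the ID of atom i,
    followed by the IDs of its neighbours.
    """
    index_of: Dict[int, int] = {a: i for i, a in enumerate(atom_ids)}
    return [[index_of[a] for a in row] for row in nlist_by_id]


def rings_up_to(nlist_by_index: List[List[int]],
                max_depth: int) -> List[List[int]]:
    """Enumerate simple cycles with at most max_depth nodes.

    Every simple cycle is reported, whether or not it is primitive.
    Assumes the neighbour list is symmetric.
    """
    adjacency = {row[0]: set(row[1:]) for row in nlist_by_index}
    rings: List[List[int]] = []

    def extend(path: List[int]) -> None:
        start, last = path[0], path[-1]
        for nxt in sorted(adjacency[last]):
            if nxt == start and len(path) >= 3:
                # Each cycle is found in both directions; keep one of them.
                if path[1] < path[-1]:
                    rings.append(list(path))
            elif nxt > start and nxt not in path and len(path) < max_depth:
                # Only visit nodes above the start so each ring is
                # discovered from its smallest member exactly once.
                extend(path + [nxt])

    for start in sorted(adjacency):
        extend([start])
    return rings


# Toy system: a four-membered ring of atoms with IDs 10, 20, 30, 40.
atom_ids = [10, 20, 30, 40]
nlist_by_id = [[10, 20, 40], [20, 10, 30], [30, 20, 40], [40, 30, 10]]

nlist_by_index = neighbours_by_index(atom_ids, nlist_by_id)
rings = rings_up_to(nlist_by_index, max_depth=6)

print(nlist_by_index)  # [[0, 1, 3], [1, 0, 2], [2, 1, 3], [3, 2, 0]]
print(rings)           # [[0, 1, 2, 3]]
```

Working with indices rather than IDs lets later steps address atoms directly by their position in the point cloud, which is presumably why the example converts the hydrogen-bond network before the ring search.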

Merged Code

During the course of the project I worked across both the C++ engine and my own Python bindings. I refactored code in the engine, designed and wrote the bindings, and added a design document on deprecating the Lua bindings. Some of the highlights are listed below.

seams-core Engine

  • PR - 1: Use meson without lua, add CI
  • PR - 2: Update sol2
  • PR - 3: Pin fmt to v9
  • PR - 4: Fix meson / library paths
  • PR - 5: Enum class fix
  • PR - 6: Simplifying the function readXYZ
  • PR - 7: Work on MacOS, use as a subproject
  • PR - 8: Remove lua bindings
  • PR - 9: Fix C++14 builds

Along with some issues:

The pyseams repository

  • Early on, my mentors suggested that I write the bindings in a separate repository, the pyseams project
    • This entire repository is one of the main code outputs of the program
  • Later, my mentors made me a member of the d-SEAMS organization and we moved the pyseams repository to its current location under d-SEAMS
    • Since then I have been making PRs to it from my personal fork, in keeping with best practices for code review

Future Directions

Documentation needs to be improved. Some more tests would be useful. I was not able to look into the physical chemistry / biological applications this time (the other GSoC project idea) but I hope to return for it (or find time elsewhere).

Personal Development

This project was a first for me in many ways:

  • My first time working with a compiled language
  • My first time writing bindings (with pybind11)
  • My first public Git-based collaborative project
  • My first maintainer position within a GitHub organization

These mark the milestones of my personal growth, but I also learned several other skills, including efficient communication and build system portability. I learned to take responsibility for my changes and explain exactly what I did. I learned to defend my opinions and take constructive criticism. I feel the program taught me a lot about working in the real world outside an academic setting. I look forward to contributing to other open source projects and working with the d-SEAMS team further!
