@Rohan-cod
Last active October 2, 2023 09:26
Google Summer of Code 2022 'Dynamic computation of variant annotations using GA4GH Variant Annotations plus FHIR Genomics' Final Report

Google Summer of Code 2022 Final Report

Project Details

Overview of the project

FHIR Genomics operations are based on the premise that genomic data, in FHIR format and/or some other format (e.g. VCF format), are stored in a repository, either in or alongside an EHR, possibly along with phenotype annotations. The FHIR Genomics operations essentially 'wrap' the repository, presenting a uniform interface to applications, regardless of internal repository data structures.
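
The 'wrapping' idea can be illustrated with a minimal sketch. Everything in it is hypothetical and is not taken from the reference implementation: the class names, the `find_variants` function, the field names, and the half-open `[start, end)` range convention are assumptions chosen only to show two differently structured stores answering the same query through one interface.

```python
from __future__ import annotations

from typing import Protocol


class VariantRepository(Protocol):
    """Hypothetical interface that every backing store must satisfy."""

    def variants_in_range(self, subject: str, start: int, end: int) -> list[dict]:
        ...


class VcfBackedRepository:
    """Hypothetical store holding VCF-like rows: (subject, pos, ref, alt)."""

    def __init__(self, rows: list[tuple[str, int, str, str]]) -> None:
        self._rows = rows

    def variants_in_range(self, subject: str, start: int, end: int) -> list[dict]:
        return [
            {"pos": pos, "ref": ref, "alt": alt}
            for row_subject, pos, ref, alt in self._rows
            if row_subject == subject and start <= pos < end
        ]


class FhirBackedRepository:
    """Hypothetical store holding simplified, FHIR-like dictionaries."""

    def __init__(self, observations: list[dict]) -> None:
        self._observations = observations

    def variants_in_range(self, subject: str, start: int, end: int) -> list[dict]:
        return [
            {"pos": obs["position"], "ref": obs["refAllele"], "alt": obs["altAllele"]}
            for obs in self._observations
            if obs["subject"] == subject and start <= obs["position"] < end
        ]


def find_variants(repo: VariantRepository, subject: str, start: int, end: int) -> list[dict]:
    """Operation-style entry point: callers never see the internal layout."""
    return sorted(repo.variants_in_range(subject, start, end), key=lambda v: v["pos"])
```

An application calling `find_variants` gets the same shape of result whichever store is passed in, which is the property the operations are meant to provide.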

This project enhanced the existing open-source pipeline to dynamically compute variant annotations derived from VA-encoded ClinVar, CIViC, and PharmGKB knowledge.

My Contributions

  • First Phase of Project development (13-06-2022 to 25-07-2022)

    During the first phase of project development (13-06-2022 to 25-07-2022), we worked on creating the schema for VA-encoded knowledge. We met with the GA4GH group to obtain the VA-encoded knowledge.

    Along with that, we made some minor enhancements to the reference implementation, one of which was adding a new utility named find_the_gene.

    Link to the commits I made to the reference implementation from 13-06-2022 to 25-07-2022.

  • Second Phase of Project development (25-07-2022 to 04-09-2022)

    During the second phase of project development (25-07-2022 to 04-09-2022), I updated the code to migrate from the existing knowledge to the VA-encoded knowledge. The migration was smooth because only a few fields had to be updated and the logic stayed almost the same.

    Along with that, we fixed some minor issues and added more functionality, such as sequence phase relationship and the ranges parameter for subject phenotype operations.

    Link to the commits I made to the reference implementation from 25-07-2022 to 04-09-2022.

Future Scope

  • Optimization: The code and queries need to be optimized to make the operations more efficient.
  • Refactoring: The codebase needs refactoring, which will help new contributors get up to speed quickly.

Experience

It was a great summer working on the project with my mentors.
