- Organization: Global Alliance for Genomics and Health
- Project Name: Dynamic computation of variant annotations using GA4GH Variant Annotations plus FHIR Genomics
- Project Link (GSoC): Dynamic computation of variant annotations using GA4GH Variant Annotations plus FHIR Genomics
- Project Link (GitHub): genomics-operations
- Mentors:
  - Bob Dolin (@rhdolin)
  - Srikar Chamala (@srikarchamala)
  - Shailesh Gothi (@srgothi92)
  - Bret Heale (@bheale)
- Student: Rohan Gupta (@Rohan-cod)
- GSoC Proposal: GSoC 2022 Proposal Dynamic computation of variant annotations using GA4GH Variant Annotations plus FHIR Genomics
FHIR Genomics operations are based on the premise that genomic data, in FHIR format and/or some other format (e.g. VCF format), are stored in a repository, either in or alongside an EHR, possibly along with phenotype annotations. The FHIR Genomics operations essentially 'wrap' the repository, presenting a uniform interface to applications, regardless of internal repository data structures.
This project enhanced the existing open-source pipeline to dynamically compute variant annotations derived from VA-encoded ClinVar, CIViC, and PharmGKB knowledge.
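The 'wrapping' idea described above can be illustrated with a minimal sketch. It is not the reference implementation: the class names, record layouts, and the `find_subject_variants` function are hypothetical, chosen only to show how two differently structured repositories could sit behind one interface.

```python
from __future__ import annotations

from typing import Protocol


class VariantRepository(Protocol):
    """Hypothetical contract: any backend that can list a subject's variants in a region."""

    def find_variants(self, subject: str, ref_seq: str, start: int, end: int) -> list[dict]:
        ...


class VcfStyleRepository:
    """Hypothetical backend storing VCF-like tuples (chrom, pos, ref, alt) per subject."""

    def __init__(self, rows_by_subject):
        self._rows = rows_by_subject

    def find_variants(self, subject, ref_seq, start, end):
        return [
            {"refSeq": chrom, "position": pos, "ref": ref, "alt": alt}
            for chrom, pos, ref, alt in self._rows.get(subject, [])
            if chrom == ref_seq and start <= pos < end
        ]


class DocumentStyleRepository:
    """Hypothetical backend storing one dict per variant with different field names."""

    def __init__(self, documents):
        self._documents = documents

    def find_variants(self, subject, ref_seq, start, end):
        results = []
        for doc in self._documents:
            if doc["patient"] != subject or doc["seq"] != ref_seq:
                continue
            if not (start <= doc["pos"] < end):
                continue
            ref, alt = doc["change"].split(">")
            results.append(
                {"refSeq": doc["seq"], "position": doc["pos"], "ref": ref, "alt": alt}
            )
        return results


def find_subject_variants(repo: VariantRepository, subject, ref_seq, start, end):
    """The 'operation': callers see one result shape regardless of the backend."""
    return repo.find_variants(subject, ref_seq, start, end)
```

An application calling `find_subject_variants` does not need to know which backend it was given; both return dictionaries with the same keys. The half-open interval (`start <= position < end`) is an assumption of this sketch.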
During the first phase of project development (13-06-2022 to 25-07-2022), we worked on creating the schema for the VA-encoded knowledge and met with the GA4GH group to obtain that knowledge.
Along with that, we made some minor enhancements to the reference implementation, one of which was adding a new utility named find_the_gene.
Link to the commits I made to the reference implementation from 13-06-2022 to 25-07-2022.
During the second phase of project development (25-07-2022 to 04-09-2022), I updated the code to migrate from the existing knowledge to the VA-encoded knowledge. The migration was smooth, as only a few fields had to be updated and the logic stayed nearly the same.
Along with that, we fixed some minor issues and added more functionality, such as sequence phase relationship support and a ranges parameter for the subject phenotype operations.
Link to the commits I made to the reference implementation from 25-07-2022 to 04-09-2022.
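To make the ranges parameter mentioned above more concrete, here is a small sketch of how such a value might be parsed and checked for overlap. It assumes a comma-separated `refSeq:start-end` syntax with half-open intervals; the names `GenomicRange`, `parse_ranges`, and `overlaps` are hypothetical and are not taken from the reference implementation.

```python
from __future__ import annotations

from typing import NamedTuple


class GenomicRange(NamedTuple):
    ref_seq: str
    start: int
    end: int


def parse_ranges(value: str) -> list[GenomicRange]:
    """Parse 'refSeq:start-end[,refSeq:start-end...]' (assumed syntax)."""
    ranges = []
    for item in value.split(","):
        # Split on the last ':' so the reference sequence name is kept whole.
        ref_seq, _, span = item.strip().rpartition(":")
        start_text, _, end_text = span.partition("-")
        if not ref_seq or not start_text.isdigit() or not end_text.isdigit():
            raise ValueError(f"malformed range: {item!r}")
        start, end = int(start_text), int(end_text)
        if start >= end:
            raise ValueError(f"empty or inverted range: {item!r}")
        ranges.append(GenomicRange(ref_seq, start, end))
    return ranges


def overlaps(query: GenomicRange, ref_seq: str, start: int, end: int) -> bool:
    """True if [start, end) on ref_seq intersects the query range (half-open assumed)."""
    return query.ref_seq == ref_seq and query.start < end and start < query.end
```

A phenotype-style operation could then keep only the annotations whose variant coordinates overlap at least one parsed range.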
- Optimization: The code/queries need to be optimized to increase the efficiency of the operations.
- Refactoring: The codebase needs refactoring, which will help new contributors get up to speed quickly.
It was a great summer working on the project with my mentors.