@JustinAronson
Created August 27, 2023 15:00

Google Summer of Code 2023 Final Report

Project Information

  • Project Name: Implementing Phenopackets in a Variant Discovery Pipeline
  • Organization: Global Alliance for Genomics and Health (GA4GH)
  • Mentors:
  • Contributor: Justin Aronson
  • Repository: GitHub

Project Overview

Phenopacket Injection:

Enabled users to import phenopackets from a Fast Healthcare Interoperability Resources (FHIR) server and receive variant information for genes relevant to the patient's phenotype. Human Phenotype Ontology (HPO) terms are retrieved from the FHIR server using FHIR search. A tool called Phen2Gene is then queried to translate the HPO terms into relevant genes. These genes are then displayed to the user, ranked by their likelihood of causing the phenotypes seen in the patient.
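The two data-shaping steps in this pipeline (collecting HPO terms from a FHIR search result, then ordering the genes returned for them) can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the `FhirBundle` interface is a reduced, assumed view of a search Bundle, `HPO_SYSTEM` is an assumed code system URI, and `ScoredGene` is a hypothetical shape for whatever the gene prioritization tool returns.

```typescript
// Reduced, assumed view of a FHIR search Bundle; real resources carry many more fields.
interface Coding {
  system?: string;
  code?: string;
  display?: string;
}

interface FhirBundle {
  entry?: { resource?: { code?: { coding?: Coding[] } } }[];
}

// Assumed code system URI for HPO codings; confirm what the target server uses.
const HPO_SYSTEM = "http://purl.obolibrary.org/obo/hp.owl";

// Collect the distinct HPO codes found in the bundle, in first-seen order.
function extractHpoTerms(bundle: FhirBundle, system: string = HPO_SYSTEM): string[] {
  const terms: string[] = [];
  for (const entry of bundle.entry ?? []) {
    for (const coding of entry.resource?.code?.coding ?? []) {
      if (coding.system === system && coding.code && terms.indexOf(coding.code) === -1) {
        terms.push(coding.code);
      }
    }
  }
  return terms;
}

// Hypothetical shape for one gene returned by the prioritization step.
interface ScoredGene {
  gene: string;
  score: number;
}

// Return a copy of the genes sorted so the highest score comes first.
function rankGenes(genes: ScoredGene[]): ScoredGene[] {
  return genes.slice().sort((a, b) => b.score - a.score);
}
```

Keeping both functions free of network calls means the FHIR search and the Phen2Gene request can be swapped or mocked independently of this logic.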

Improved Query Strategy:

This app is very data-heavy: typical use involves querying 10 or more genes from the FHIR reference implementation. A new query strategy was adopted for this app, which submits parallel queries to the reference implementation. When querying 3 genes from the reference implementation, an average speed improvement of 45% was observed over the old call strategy. This improvement increases for queries involving more genes.
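The difference between the two call strategies can be sketched as below. This is an illustrative assumption about how such a change might look, not the code from the pull request: `GeneFetcher` is a hypothetical stand-in for whatever function requests one gene's variants from the reference implementation, and both helper names are invented for the example.

```typescript
// Hypothetical signature for a function that requests the variants of one gene.
type GeneFetcher<T> = (gene: string) => Promise<T>;

// Old-style strategy: each request starts only after the previous one finishes.
async function queryGenesSequentially<T>(
  genes: string[],
  fetchGene: GeneFetcher<T>
): Promise<Record<string, T>> {
  const results: Record<string, T> = {};
  for (const gene of genes) {
    results[gene] = await fetchGene(gene);
  }
  return results;
}

// Parallel strategy: all requests are started up front, then awaited together.
// Promise.all rejects as soon as any single request rejects.
async function queryGenesInParallel<T>(
  genes: string[],
  fetchGene: GeneFetcher<T>
): Promise<Record<string, T>> {
  const pending = genes.map(async (gene): Promise<[string, T]> => [gene, await fetchGene(gene)]);
  const entries = await Promise.all(pending);
  const results: Record<string, T> = {};
  for (const [gene, value] of entries) {
    results[gene] = value;
  }
  return results;
}
```

With the parallel version, total wait time is bounded by the slowest single request rather than the sum of all requests, which fits the observation that the gain grows with the number of genes. If partial results are preferable to failing the whole batch, `Promise.allSettled` is one alternative to consider.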

Improved UI/UX:

Translated the app into React.js and TypeScript to improve the user experience. Improved multinucleotide variant reporting by including the component single nucleotide variants.

Multiple Gene Loading:

Enabled users to load multiple genes at once, which makes the phenopacket processing pipeline possible.

These changes can be found at this pull request

Next Steps

AI variant prioritization solution:

Rare disease variant prioritization is time-consuming for clinicians. We aim to determine whether the FHIR server data contains enough signal to power an AI algorithm designed to 'bubble up' variants that potentially cause rare disease. Public datasets, including ClinVar and OMIM, will be used to train the algorithm.
