@sgalpha01
Last active October 2, 2023 09:18
Google Summer of Code '22 Final Report - TES Callback endpoint and mechanism in TESK

Google Summer of Code '22 Final Report

This report summarizes the work I did in the Google Summer of Code 2022 program as a contributor to the Global Alliance for Genomics and Health organization, under the guidance of my mentors Alexander Kanitz, Alvaro Gonzalez, Ania Niewielska and Thanasis Vergoulis.

Background 📚

The Global Alliance for Genomics and Health (GA4GH) is a policy-framing and technical standards-setting organization, seeking to enable responsible genomic data sharing within a human rights framework.

ELIXIR is a multinational Europe-based initiative that unites life science laboratories and organizations to establish a common infrastructure that supports and integrates scalable, sustainable bioinformatics and data analysis services for member states and beyond.

ELIXIR Cloud & AAI is a subgroup of ELIXIR and a driver project of GA4GH. It develops services towards establishing a federated cloud computing network that enables the analysis of population-scale genomic and phenotypic data across participating international nodes.

Motivation 💪

TESK is an implementation of a task execution engine based on the TES standard running on Kubernetes.

cwl-WES is a Flask/Gunicorn application that makes use of Connexion to implement the GA4GH WES OpenAPI specification. It enables clients/users to execute CWL workflows in the cloud via a GA4GH Task Execution Service (TES)-compatible execution backend (e.g., TESK or Funnel). Internally, it uses cwl-tes to communicate with the TES server and manage tasks.

Currently, cwl-WES (through cwl-tes) determines the status of a task or workflow by periodically polling the TES server. Polling increases the network load, because the status often has not changed between two requests, which makes many of those requests futile. As the number of tasks grows, this behavior becomes a scalability issue.

This project aims to implement a callback mechanism for task status updates in TES, which will send requests from the server to the client whenever a task’s status changes, thus eliminating the need for expensive polling.
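To make the difference concrete, here is a rough back-of-the-envelope sketch. The task duration, the polling interval and the list of states the task passes through are made-up values chosen for illustration, not measurements from TESK or cwl-WES.

```python
def count_polling_requests(duration_s: int, poll_interval_s: int) -> int:
    """Status requests a client sends when polling at a fixed interval."""
    return duration_s // poll_interval_s


def count_callback_requests(state_changes: list) -> int:
    """Requests a server sends when it notifies only on state changes."""
    return len(state_changes)


# Hypothetical task: runs for one hour and is reported in four TES states.
states = ["QUEUED", "INITIALIZING", "RUNNING", "COMPLETE"]
polling = count_polling_requests(duration_s=3600, poll_interval_s=10)
callbacks = count_callback_requests(states)
print(polling, callbacks)  # prints "360 4"
```

Under these assumptions, polling costs 360 requests for a single task, while callbacks cost one request per state change.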

Implementation ⚙️

The entire callback service implementation can be divided into three parts:

  1. Callback sender on the TESK side.
  2. Callback listener on the cwl-WES/client side.
  3. Migration of polling from (cwl-tes ⇒ TESK) to (cwl-tes ⇒ callback listener service).
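The way the three parts fit together can be sketched in memory, without any network code. This is only an illustration of the idea: the class names, the `deliver` parameter and the `{"id", "state"}` payload shape are assumptions made for this example, not the actual TESK, cwl-tes or TES specification interfaces.

```python
from typing import Callable, Dict, Optional


class CallbackListener:
    """Client-side store that keeps the latest state reported for each task."""

    def __init__(self) -> None:
        self._states: Dict[str, str] = {}

    def receive(self, payload: dict) -> None:
        # Part 2: record the state carried by an incoming callback.
        self._states[payload["id"]] = payload["state"]

    def get_state(self, task_id: str) -> Optional[str]:
        # Part 3: the workflow engine asks the listener, not the TES server.
        return self._states.get(task_id)


class CallbackSender:
    """Server-side helper that notifies a client only when a state changes."""

    def __init__(self, deliver: Callable[[dict], None]) -> None:
        self._deliver = deliver  # hypothetical transport, e.g. an HTTP POST
        self._last_sent: Dict[str, str] = {}

    def update(self, task_id: str, state: str) -> bool:
        # Part 1: skip the notification if the state did not change.
        if self._last_sent.get(task_id) == state:
            return False
        self._deliver({"id": task_id, "state": state})
        self._last_sent[task_id] = state
        return True


listener = CallbackListener()
sender = CallbackSender(deliver=listener.receive)
sent = [
    sender.update("task-1", state)
    for state in ["QUEUED", "RUNNING", "RUNNING", "COMPLETE"]
]
print(sent)                          # [True, True, False, True]
print(listener.get_state("task-1"))  # COMPLETE
```

Passing the transport in as a callable keeps the sender logic separate from how the request is actually delivered, so the same sketch works with a direct function call here and with an HTTP client in a real deployment.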

Architecture

What did I achieve? 🏆

During the development of this project, the following milestones were achieved:

  • Extended the TES OpenAPI specification to add a callback definition (#176).
  • Implemented the callback sender on the TESK side (tesk-api:#39 and tesk-core:#39).
  • Made changes to cwl-tes to support the callback listener.
  • Implemented the callback listener using Flask.
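For readers who want a feel for what a callback listener does, below is a minimal sketch. The project's listener was written with Flask; this example deliberately swaps in Python's standard-library `http.server` so that it runs without extra dependencies. The `/callback` path, the `{"id", "state"}` payload and the `send_callback` helper are hypothetical and are not taken from the project's code.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

TASK_STATES: dict = {}  # task id -> latest reported state


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/callback":  # hypothetical endpoint path
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            TASK_STATES[payload["id"]] = payload["state"]
        except (ValueError, KeyError, TypeError):
            self.send_error(400)  # malformed JSON or missing fields
            return
        self.send_response(204)  # accepted, nothing to return
        self.end_headers()

    def log_message(self, *args) -> None:
        pass  # keep the example quiet


def send_callback(host: str, port: int, task_id: str, state: str) -> int:
    """Stand-in for the server-side sender: POST one state update."""
    body = json.dumps({"id": task_id, "state": state}).encode("utf-8")
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("POST", "/callback", body=body,
                     headers={"Content-Type": "application/json"})
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), CallbackHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
status = send_callback("127.0.0.1", server.server_port, "task-1", "RUNNING")
server.shutdown()
server.server_close()
print(status, TASK_STATES)  # 204 {'task-1': 'RUNNING'}
```

The module-level dictionary is the simplest possible store; a production listener would more likely persist states in a database and authenticate incoming callbacks.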

Outlook 💭

Although most of the milestones were achieved, some key areas still need work:

  • Handle the situations in which the callback sender does not update the task status.
  • Use a more robust and scalable solution for the callback listener, like FOCA.
  • Integration of the callback listener with the cwl-WES service.
  • Write documentation for the callback listener.

Acknowledgement 🙏

I would like to thank my mentors Alexander Kanitz, Alvaro Gonzalez, Ania Niewielska and Thanasis Vergoulis for giving me the opportunity to participate in Google Summer of Code 2022 and to contribute to the Global Alliance for Genomics and Health with my project, TES Callback endpoint and mechanism in TESK.

Project Presentation - Link

