@psankhe28
Last active January 23, 2025 10:14
Final report GSoC'24 project ELIXIR Cloud Components

This is the final report for the project I worked on during the summer of 2024 under the guidance of Alex Kanitz, Anurag Gupta, and my other mentors.

Click here for the submitted proposal.

Background

The Global Alliance for Genomics and Health, better known as GA4GH, is a "policy-framing and technical standards-setting organization, seeking to enable responsible genomic data sharing within a human rights framework."

ELIXIR coordinates and develops life science resources across Europe so that researchers can more easily find, analyse and share data, exchange expertise, and implement best practices.

ELIXIR Cloud & AAI is a cross platform initiative of ELIXIR and a Driver Project of the GA4GH that develops services towards establishing a federated cloud computing network that enables the analysis of population-scale genomic and phenotypic data across participating, international nodes.

ELIXIR Cloud Components are Web Components developed and maintained by the ELIXIR Cloud & AAI community.

Data Repository Service Schemas is the repository for the schemas used by the Data Repository Service. The Data Repository Service (DRS) API provides a generic interface to data repositories so that data consumers, including workflow systems, can access data in a single, standardized way regardless of where it is stored or how it is managed. The primary function of DRS is to map a logical ID to a means of physically retrieving the data represented by that ID.
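
To make the ID-to-data mapping concrete, here is a minimal sketch in Python. It assumes a hostname-based DRS URI, which the DRS specification resolves to an HTTPS request under `/ga4gh/drs/v1/objects/`. The host names, the object ID, and the helper names `drs_uri_to_url` and `pick_access_method` are made up for illustration and are not part of DRS-Filer or the components described in this report.

```python
from urllib.parse import urlparse


def drs_uri_to_url(drs_uri: str) -> str:
    """Resolve a hostname-based DRS URI to the HTTPS URL of its DrsObject."""
    parsed = urlparse(drs_uri)
    if parsed.scheme != "drs" or not parsed.netloc:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    object_id = parsed.path.lstrip("/")
    if not object_id:
        raise ValueError(f"DRS URI has no object ID: {drs_uri!r}")
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{object_id}"


def pick_access_method(drs_object: dict, preferred_types=("https", "s3")) -> dict:
    """Return the first access method whose type appears in preferred_types."""
    methods = drs_object.get("access_methods", [])
    for wanted in preferred_types:
        for method in methods:
            if method.get("type") == wanted:
                return method
    raise LookupError("no access method of a preferred type")


# Hypothetical DrsObject, trimmed to the fields used here.
example_object = {
    "id": "314159",
    "access_methods": [
        {"type": "s3", "access_url": {"url": "s3://example-bucket/314159"}},
        {"type": "https", "access_url": {"url": "https://files.example.org/314159"}},
    ],
}

object_url = drs_uri_to_url("drs://drs.example.org/314159")
chosen = pick_access_method(example_object)
print(object_url)
print(chosen["access_url"]["url"])
```

A client would fetch the DrsObject from `object_url` and then use one of its `access_methods` to retrieve the actual bytes. The preference order shown here is an arbitrary choice for the example.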

DRS Filer is a microservice implementing the Global Alliance for Genomics and Health (GA4GH) Data Repository Service (DRS) API specification.

Motivation

My objective was to develop reusable Web Component clients for file upload and data handling, adhering to the DRS API specification established by the Global Alliance for Genomics and Health (GA4GH). These components let users upload files, which were stored on a MinIO file server. Additionally, a suite of components was created to interact with DRS-Filer's CRUD endpoints, facilitating the management of data sets within DRS-Filer deployments. The generic DRS component was used to access DRS-Filer's GET endpoints. Furthermore, DRS-Filer was upgraded to comply with the latest DRS API version (DRS 1.4.0), focusing on the exclusion of POST endpoints.

Implementation

The implementation of the "Dashboard Web Components" project is focused on establishing a robust file management system and creating reusable web components that adhere to the GA4GH (Global Alliance for Genomics and Health) DRS (Data Repository Service) API standards. The initial setup included deploying a MinIO server for efficient file storage management and configuring an instance of DRS-Filer, a specialized microservice for cloud-based file handling in genomics. Components for this project, currently being developed with LitElement, are intended to facilitate seamless interactions with GA4GH DRS endpoints, enabling crucial data operations like retrieval and management of genomic datasets. A key component is aimed at supporting generic DRS implementations, while additional components will handle CRUD operations within DRS-Filer to ensure comprehensive data management.

DRS-Filer is also being updated to align with the latest DRS API version, 1.4.0, while aiming to maintain backward compatibility. Each component will be packaged for easy deployment and published on npm, enabling future integration into the ELIXIR Cloud Components suite. The project aims to streamline access to genomic data by providing scalable, interoperable components aligned with current data standards.

In parallel, a cloud storage handler compatible with MinIO and tus-js-client for resumable uploads is being implemented. This handler, built on FOCA (Flask OpenAPI Connexion Application) as the API framework, handles file uploads, manages endpoint access, and ensures S3-compatible storage operations. It is designed to integrate with existing systems, supporting resumable uploads and providing a reliable backend for file storage and retrieval.
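
As a rough illustration of what "resumable" means here, the following toy sketch mimics the offset handshake of the tus protocol in memory: the client first asks the server how many bytes it already holds (a HEAD request returning `Upload-Offset` in tus), then sends the remaining bytes in chunks (PATCH requests). This is not the cloud-storage-handler code. The class `InMemoryTusServer`, the function `resume_upload`, and the `fail_after` switch are hypothetical names used only for this example, and no network or MinIO access is involved.

```python
class InMemoryTusServer:
    """Toy stand-in for a tus server: tracks the bytes received for one upload."""

    def __init__(self, upload_length: int):
        self.upload_length = upload_length
        self.data = bytearray()

    def head(self) -> int:
        """Like a tus HEAD request: report the current upload offset."""
        return len(self.data)

    def patch(self, offset: int, chunk: bytes) -> int:
        """Like a tus PATCH request: append the chunk only if the offset matches."""
        if offset != len(self.data):
            # A real tus server answers an offset mismatch with 409 Conflict.
            raise ValueError("offset mismatch")
        if offset + len(chunk) > self.upload_length:
            raise ValueError("chunk exceeds declared upload length")
        self.data.extend(chunk)
        return len(self.data)


def resume_upload(server, payload: bytes, chunk_size: int = 4, fail_after=None) -> int:
    """Send payload starting at the server's offset; optionally simulate a drop."""
    offset = server.head()
    sent_chunks = 0
    while offset < len(payload):
        if fail_after is not None and sent_chunks >= fail_after:
            raise ConnectionError("simulated network drop")
        chunk = payload[offset:offset + chunk_size]
        offset = server.patch(offset, chunk)
        sent_chunks += 1
    return offset


payload = b"genomic-data"
server = InMemoryTusServer(upload_length=len(payload))

try:
    resume_upload(server, payload, fail_after=2)  # interrupted after two chunks
except ConnectionError:
    offset_after_drop = server.head()

final_offset = resume_upload(server, payload)  # continues from the stored offset
```

Because the second call starts from the offset reported by the server, the bytes sent before the interruption are not transmitted again.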

What did I achieve?

This project, a blend of frontend, backend, and storage solutions, allowed me to delve deeply into various technologies implemented within GA4GH standards, as well as explore the intricacies of storage solutions and data handling.

In progress

Outlook

I had hoped to contribute more and to cover all my goals.

Future work:

  • Complete the drs-filer components
  • Integrate them with the cloud-storage-handler
  • Extend tus-node-server with a JavaScript plugin that adds the functionality needed by the cloud-storage-handler, leveraging microservice flexibility.

Acknowledgment

I would like to express my sincere gratitude to my mentors, Alex Kanitz and Anurag Gupta, who gave me the golden opportunity to work on the project "Dashboard Web Components".

Over the last few months, apart from writing quality code, I have learned to take ownership of a project. I would also like to thank the GSoC program and the GA4GH organization for providing me with this wonderful experience.

Pratiksha Sankhe

I am immensely grateful for the experience and growth this opportunity has brought me; it was like no other.
