@uniqueg
Created March 31, 2021 12:36
RO-Crate x GA4GH interoperability

RO-Crate x GA4GH Cloud

The Research Object Crate (RO-Crate) and Global Alliance for Genomics and Health (GA4GH) communities both propose schemas to represent computational analysis workflows, with the Workflow RO-Crate (WROC) and Tool Registry Service (TRS) API specifications, respectively.

Apart from the workflows themselves, it is also important to be able to adequately describe individual runs of a workflow, i.e., roughly speaking, a workflow and all of its inputs, outputs and engine-specific parameters. In the case of the GA4GH, the Workflow Execution Service (WES) API specification includes schemas to describe such run-related properties. In the RO-Crate community, an equivalent specification consistent with the RO-Crate approach to packaging research data is currently under development, tentatively named (workflow) Run RO-Crate (RROC) here.

Taking into account the overlaps between the RO-Crate and GA4GH approaches, while respecting their differences in philosophy, scope and intended use cases, this document serves as a starting point for aligning the work being done within the two communities. The goal is to increase interoperability across the various services adopting (or planning to adopt) specifications developed by either or both parties. In particular, the alignment/coordination work should aim to propose changes and share ideas/considerations with a view to a possible two-way (and ideally lossless) translation between corresponding schemas.

The document contains a list of possible interfaces, considerations to take into account to achieve interoperability as well as lists of references to relevant specifications and implementations.

Interfaces

This section outlines functionalities that would be beneficial or required for a high level of interoperability between RO-Crate and GA4GH-based workflow and workflow run descriptions.

1. Unpack a WROC into a TRS entity

Allows workflow registries to import or have users upload workflows as WROCs and serve them at a TRS API, whence they can be discovered and retrieved by consuming applications.
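
To make the direction of this mapping concrete, here is a minimal sketch of how the metadata of a WROC might be turned into a TRS `Tool` object. It assumes an RO-Crate 1.1 style `ro-crate-metadata.json` whose root dataset has a `mainEntity` with a `programmingLanguage`; the function names, the example crate, the `toolclass` values and the use of `alternateName` as the TRS descriptor type are all illustrative assumptions, not part of either specification.

```python
# Hypothetical example crate; only the properties used below are included.
EXAMPLE_CRATE = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {"@id": "ro-crate-metadata.json", "@type": "CreativeWork",
         "about": {"@id": "./"}},
        {"@id": "./", "@type": "Dataset", "name": "Example workflow",
         "mainEntity": {"@id": "main.cwl"}},
        {"@id": "main.cwl",
         "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
         "programmingLanguage": {"@id": "#cwl"}, "version": "1.0.0"},
        {"@id": "#cwl", "@type": "ComputerLanguage",
         "name": "Common Workflow Language", "alternateName": "CWL"},
    ],
}


def find_main_workflow(metadata):
    """Return the (root dataset, main workflow, language) entities of a WROC."""
    entities = {entity["@id"]: entity for entity in metadata["@graph"]}
    # The metadata descriptor points to the root data entity via "about".
    root = entities[entities["ro-crate-metadata.json"]["about"]["@id"]]
    workflow = entities[root["mainEntity"]["@id"]]
    language = entities[workflow["programmingLanguage"]["@id"]]
    return root, workflow, language


def wroc_to_trs_tool(metadata, tool_id, organization, base_url):
    """Build a TRS-Tool-like dictionary (sketch; not a complete mapping)."""
    root, workflow, language = find_main_workflow(metadata)
    version_id = str(workflow.get("version", "1"))
    tool_url = f"{base_url}/tools/{tool_id}"
    return {
        "id": tool_id,
        "url": tool_url,
        "organization": organization,  # not derivable from this crate; passed in
        "name": root.get("name", ""),
        "description": root.get("description", ""),
        "toolclass": {"id": "workflow", "name": "Workflow"},  # illustrative values
        "versions": [{
            "id": version_id,
            "url": f"{tool_url}/versions/{version_id}",
            # Assumption: alternateName holds a TRS descriptor type such as "CWL".
            "descriptor_type": [language.get("alternateName", "").upper()],
        }],
    }


tool = wroc_to_trs_tool(
    EXAMPLE_CRATE, "example", "example-org",
    "https://trs.example.org/ga4gh/trs/v2",
)
```

A registry would additionally need to serve the workflow files themselves through the TRS descriptor endpoints, which this sketch does not cover.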

Examples of services where this would be useful:

2. Pack a TRS entity into a WROC

Allows WROC-consuming applications to receive workflows through an internal (server-side) TRS API, retrieve them from an external TRS API as a client, or both, in order to display their contents, store them in a WROC-based database schema or execute them.
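
The reverse direction could look roughly like the following sketch, which builds a minimal `ro-crate-metadata.json` document from a TRS `Tool` and one of its versions. The function name, the `#<language>` identifier convention and the choice to copy the TRS version URL into `url` are assumptions made for illustration; properties such as `license` and `datePublished` are omitted, so the result is not a complete crate.

```python
def trs_to_wroc_metadata(tool, version_id, descriptor_path):
    """Build minimal WROC metadata for one version of a TRS tool (sketch)."""
    version = next(v for v in tool["versions"] if v["id"] == version_id)
    # Assumption: the first listed descriptor type is the one being packed.
    descriptor_type = version["descriptor_type"][0]
    language_id = "#" + descriptor_type.lower()
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                "@id": "./",
                "@type": "Dataset",
                "name": tool.get("name", tool["id"]),
                "description": tool.get("description", ""),
                "mainEntity": {"@id": descriptor_path},
                "hasPart": [{"@id": descriptor_path}],
            },
            {
                "@id": descriptor_path,
                "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
                "name": tool.get("name", tool["id"]),
                "version": version_id,
                "url": version["url"],  # keeps a pointer back to the TRS entry
                "programmingLanguage": {"@id": language_id},
            },
            {
                "@id": language_id,
                "@type": "ComputerLanguage",
                "name": descriptor_type,
                "alternateName": descriptor_type,
            },
        ],
    }
```

The primary descriptor file retrieved from the TRS API would then be stored at `descriptor_path` inside the crate directory, next to the metadata file.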

Examples of services where this would be useful:

3. Create a WES run request from a WROC

Allows applications that enable their users to execute workflows via the WES API to consume WROCs, instead of or in addition to TRS entities or other/custom solutions, in order to receive/retrieve workflows, prepare them for execution and trigger their execution.
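
As a rough illustration, the form fields of a WES run request (`workflow_url`, `workflow_type`, `workflow_type_version`, `workflow_params`, `tags`) could be derived from a WROC as sketched below. The sketch assumes that the crate contents are reachable under a base URL, that the language entity's `alternateName` matches a workflow type the WES instance supports, and that the input parameters and language version are supplied by the caller; the function name and the `source` tag are hypothetical.

```python
import json


def wroc_to_wes_run_request(metadata, crate_base_url, params, type_version):
    """Derive WES run request form fields from WROC metadata (sketch)."""
    entities = {entity["@id"]: entity for entity in metadata["@graph"]}
    # Follow descriptor -> root dataset -> main workflow -> language.
    root = entities[entities["ro-crate-metadata.json"]["about"]["@id"]]
    workflow = entities[root["mainEntity"]["@id"]]
    language = entities[workflow["programmingLanguage"]["@id"]]
    return {
        # Assumption: crate files are served below crate_base_url.
        "workflow_url": crate_base_url.rstrip("/") + "/" + workflow["@id"],
        "workflow_type": language.get("alternateName", language.get("name", "")),
        "workflow_type_version": type_version,
        # WES expects these two fields as JSON-encoded strings.
        "workflow_params": json.dumps(params),
        "tags": json.dumps({"source": "workflow-ro-crate"}),
    }
```

The resulting dictionary could then be submitted as a multipart form to the `/runs` endpoint of a WES instance; alternatively, the workflow files could be sent along as `workflow_attachment` entries instead of being referenced by URL.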

Examples of services where this would be useful:

4. Pack a WES run result into an RROC

Allows applications that enable their users to execute workflows via the WES API to prepare self-describing RROCs that can be easily shared and published at suitable repositories.
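
Since the RROC specification is still under development, the following is only a sketch of what such a conversion might involve. It maps a WES `RunLog` response to a schema.org `CreateAction` entity plus one `File` entity per output. The use of `CreateAction` with `instrument` and `result`, the state-to-status mapping and the assumption that `outputs` maps names to URL strings are all illustrative choices (the WES `outputs` field is a free-form object), and the function name is hypothetical.

```python
# Assumed mapping from WES run states to schema.org action statuses.
ACTION_STATUS = {
    "COMPLETE": "http://schema.org/CompletedActionStatus",
    "EXECUTOR_ERROR": "http://schema.org/FailedActionStatus",
    "SYSTEM_ERROR": "http://schema.org/FailedActionStatus",
}


def wes_run_log_to_entities(run_log, workflow_id):
    """Turn a WES RunLog into run-describing RO-Crate entities (sketch)."""
    details = run_log.get("run_log", {})
    # Assumption: outputs is a flat mapping of output name -> URL string.
    outputs = run_log.get("outputs", {})
    files = [
        {"@id": url, "@type": "File", "name": name}
        for name, url in sorted(outputs.items())
    ]
    action = {
        "@id": "#" + run_log["run_id"],
        "@type": "CreateAction",
        "name": details.get("name", run_log["run_id"]),
        "instrument": {"@id": workflow_id},  # the workflow that was executed
        "result": [{"@id": f["@id"]} for f in files],
    }
    if "start_time" in details:
        action["startTime"] = details["start_time"]
    if "end_time" in details:
        action["endTime"] = details["end_time"]
    status = ACTION_STATUS.get(run_log.get("state"))
    if status is not None:
        action["actionStatus"] = {"@id": status}
    return [action] + files
```

The returned entities would be appended to the `@graph` of a crate that also describes the workflow identified by `workflow_id`, for example one created from the corresponding TRS entity.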

Examples of services where this would be useful:

Interoperability considerations

  • WROC <> TRS conversions should ideally be lossless, at least at the level of the metadata that is recommended to be part of a WROC.
  • In contrast to TRS objects, WROCs may contain all the required input parameters to start a workflow run. This allows decoupling the preparation of a workflow run (i.e., gathering the workflow and all required input parameters) from starting (or triggering the start of) the run. In other words, "ready-to-go" workflows can be shared, e.g., in publications, and potentially easily triggered in cloud environments, including those relying on GA4GH APIs.
  • Information contained in a TRS entity and the WES run results, when combined, should ideally be sufficient to create an RROC that contains all of the metadata recommended by the RO-Crate community.

Specifications

Implementations

Lists of projects/services implementing the relevant specifications. Lists are not exhaustive.

WROC

RROC

Under development.

TRS

WES

@stain

stain commented Jun 22, 2023

Consider also Workflow Run RO-Crate Profiles for describing history of runs, now supported by multiple workflow engines.

Also consider the Trusted Workflow Run Crate profile, which also covers a workflow run request. In TRE-FX we are using such crates as the payload of the TES API to initiate workflow runs.

@stain

stain commented Jun 22, 2023

How WorkflowHub uses TRS: https://about.workflowhub.eu/developer/trs/ -- this is used by usegalaxy.eu and WfExS to execute workflows by requesting Workflow RO-Crates.

@uniqueg
Author

uniqueg commented Jun 22, 2023

Thanks @stain. Actually, I had totally forgotten about this document :)
Together with your notes, this may be useful for Yuvraj or whoever is going to tackle this
