The Research Object Crate (RO-Crate) and Global Alliance for Genomics and Health (GA4GH) communities both propose schemas to represent computational analysis workflows, with the Workflow RO-Crate (WROC) and Tool Registry Service (TRS) API specifications, respectively.
Apart from the workflows themselves, it is also important to be able to adequately describe individual runs of a workflow, i.e., roughly speaking, a workflow and all of its inputs, outputs and engine-specific parameters. In the case of the GA4GH, the Workflow Execution Service (WES) API specification includes schemas to describe such run-related properties. In the RO-Crate community, an equivalent specification consistent with the RO-Crate approach to packaging research data is currently under development; it is tentatively named (workflow) Run RO-Crate (RROC) here.
Taking into account the overlaps between the RO-Crate and GA4GH approaches, but also respecting their differences in philosophy, scope and intended use cases, this document serves as a starting point in an attempt to align the work being done within those two communities with the goal of increasing interoperability across the various services adopting (or planning to adopt) specifications developed by either or both parties. In particular, the alignment/coordination work should aim to propose changes and share ideas/considerations with a view to a possible two-way (and ideally lossless) translation between corresponding schemas.
The document contains a list of possible interfaces, considerations to take into account to achieve interoperability as well as lists of references to relevant specifications and implementations.
This section outlines functionalities that would be beneficial or required for a high level of interoperability between RO-Crate and GA4GH-based workflow and workflow run descriptions.
Allows workflow registries to import or have users upload workflows as WROCs and serve them at a TRS API, whence they can be discovered and retrieved by consuming applications.
Examples of services where this would be useful:
Allows WROC-consuming applications to receive workflows through an internal (server-side) TRS API, retrieve them from an external TRS API (acting as a client), or both, in order to display their contents, store them in a WROC-based database schema or execute them.
Examples of services where this would be useful:
Allows applications that enable their users to execute workflows via the WES API to consume WROCs instead of or in addition to TRS entities or other/custom solutions in order to receive/retrieve workflows, and prepare them for and trigger execution.
Examples of services where this would be useful:
Allows applications that enable their users to execute workflows via the WES API to prepare self-describing RROCs that can be easily shared and published at suitable repositories.
Examples of services where this would be useful:
- WROC <> TRS conversions should ideally be lossless, at least at the level of the metadata that is recommended to be part of a WROC.
- In contrast to TRS objects, WROCs may contain all the required input parameters to start a workflow run. This allows decoupling the preparation of a workflow run (i.e., gathering the workflow and all required input parameters) from starting (or triggering the start of) the run. In other words, "ready-to-go" workflows can be shared, e.g., in publications, and potentially easily triggered in cloud environments, including those relying on GA4GH APIs.
- Information contained in a TRS entity and the WES run results, when coupled, should ideally be sufficient to create an RROC that contains all of the metadata recommended by the RO-Crate.
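To make the first consideration more concrete, the following is a minimal sketch of the WROC to TRS direction, not a reference implementation. It assumes an RO-Crate 1.1 style `ro-crate-metadata.json` with a `mainEntity` workflow, and it fills only a small subset of the fields of a TRS tool and tool version. The function name `wroc_to_trs_tool`, the `DESCRIPTOR_TYPES` lookup table, the `tool_id` and `base_url` parameters and the example URL are all hypothetical and chosen for illustration only. Any WROC metadata not mapped here (e.g., creators, license) would be lost, which is exactly the kind of gap an alignment effort would need to address.

```python
import json

# Hypothetical lookup from a WROC ComputerLanguage name to a TRS descriptor type.
DESCRIPTOR_TYPES = {
    "common workflow language": "CWL",
    "workflow description language": "WDL",
    "nextflow": "NFL",
    "galaxy": "GALAXY",
    "snakemake": "SMK",
}


def wroc_to_trs_tool(metadata, tool_id, base_url):
    """Map WROC metadata (parsed JSON-LD) to a partial, TRS-like tool dict."""
    # Index all entities of the flattened JSON-LD graph by their identifier.
    entities = {entity["@id"]: entity for entity in metadata["@graph"]}
    # The metadata descriptor points at the root dataset via "about".
    root = entities[entities["ro-crate-metadata.json"]["about"]["@id"]]
    # In a WROC, the root dataset references the main workflow via "mainEntity".
    workflow = entities[root["mainEntity"]["@id"]]
    language_ref = workflow.get("programmingLanguage", {}).get("@id")
    language = entities.get(language_ref, {})
    descriptor_type = DESCRIPTOR_TYPES.get(language.get("name", "").lower())
    # Assumption: fall back to "1" if the workflow entity declares no version.
    version_id = str(workflow.get("version", "1"))
    tool_url = f"{base_url}/tools/{tool_id}"
    version = {
        "id": version_id,
        "url": f"{tool_url}/versions/{version_id}",
        "name": version_id,
        "descriptor_type": [descriptor_type] if descriptor_type else [],
    }
    return {
        "id": tool_id,
        "url": tool_url,
        "name": workflow.get("name", root.get("name")),
        "description": root.get("description"),
        "versions": [version],
    }


# Made-up example crate metadata, reduced to the entities used above.
example = {
    "@graph": [
        {"@id": "ro-crate-metadata.json", "about": {"@id": "./"}},
        {
            "@id": "./",
            "@type": "Dataset",
            "description": "An example crate",
            "mainEntity": {"@id": "workflow.cwl"},
        },
        {
            "@id": "workflow.cwl",
            "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
            "name": "Demo workflow",
            "version": "0.1.0",
            "programmingLanguage": {
                "@id": "https://w3id.org/workflowhub/workflow-ro-crate#cwl"
            },
        },
        {
            "@id": "https://w3id.org/workflowhub/workflow-ro-crate#cwl",
            "@type": "ComputerLanguage",
            "name": "Common Workflow Language",
        },
    ]
}

tool = wroc_to_trs_tool(example, "demo-workflow", "https://trs.example.org/ga4gh/trs/v2")
print(json.dumps(tool, indent=2))
```

A reverse mapping (TRS to WROC) would additionally have to fetch the descriptor files themselves, since a TRS tool record only references them.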
- Workflow RO-Crate (WROC) specification
- GA4GH Tool Registry Service (TRS) API specification
- GA4GH Workflow Execution Service (WES) API specification
Lists of projects/services implementing the relevant specifications. Lists are not exhaustive.
Under development.
Consider also the Workflow Run RO-Crate profiles for describing the history of runs, which are now supported by multiple workflow engines.
See also the Trusted Workflow Run Crate profile, which additionally covers workflow run requests. In TRE-FX, we are using such crates as the payload of the TES API to initiate workflow runs.