In this workshop we will explore recent advancements in Research Objects and publishing of research data.
Scholarly Communication has evolved significantly in recent years, with an increasing focus on Open Research, FAIR data sharing and community-developed open source methods. The concepts of authorship and citation are changing, as researchers increasingly reuse and evolve common software tools and datasets. Yet even with a growing amount of cloud compute power and open platforms available, reproducibility of computational analyses is becoming more challenging, and is not yet commonly included in peer review. While recent advances in scientific workflows and provenance capture systems have improved this situation, the question remains of how to publish, archive, explore and understand digital research outputs, as academic authors and publishers remain focused on PDFs and the occasional CSV file, with the Web and Open Research often left to “best effort” rather than being the expected norm.
A number of community initiatives have begun to explore how to package various multi-part research outcomes with their context, how to handle distributed and living content, and how to port and safely exchange these “Research Objects” between platforms and between researchers.
One such approach is researchobject.org, which has proposed a way to package and describe research outputs, data, methods, workflows, provenance and structured metadata, reusing existing Web standards and formats. Research Objects and Research Object-like approaches have gathered pace across:
- Scientific domains – bioinformatics, systems biology (e.g. COMBINE Archives), health informatics (e.g. BioComputeObject);
- Tasks – handling big data (BDBags), reproducible workflows (Common Workflow Language), scholarly communication and publishing (Dryad, DataONE);
- Platforms:
  - Virtual Research Environments, e.g. the EVER-EST VRE for Earth Sciences;
  - Community output aggregators such as the EU’s OpenAIRE, FAIRDOMHub, CodeOcean and Open Science Framework;
- Stakeholders:
  - Publishers – such as eLife’s Reproducible Document Stack project and science.ai;
  - Repository providers such as DataONE and Dryad;
  - Funders, including the EU (e.g. European Open Science Cloud) and NIH (e.g. Data Commons).
However, many challenges remain: how to increase Research Object uptake among data providers, researchers, infrastructures, publishers and other stakeholders; how to provide credit and tracking metrics; how to develop supporting tooling; how to build effective community efforts; and how rich metadata manifests relate to emerging container platforms.
The workshop aims to be a mix of:
- Two 30-minute invited talks
- 15-minute presentations by accepted speakers
- Short demos
- 5-minute lightning talks, organized at the workshop, allowing participants to demo or present technology and topics that come up during discussions in the break-out sessions
- 1-2 hour break-out sessions to further build relationships across scientific domains and RO practitioners and “do it”. The break-out topics will be canvassed before the meeting.
Keynote speaker: Carl Kesselman, Information Sciences Institute, University of Southern California