Purpose

This document describes a possible abstract design for the butler to facilitate discussion. It does not map directly to actual code; rather, it serves to clarify our thinking about the concepts.

Components

Dataset

Represents a single entity of data, with associated metadata (e.g. a particular calexp for a particular instrument recorded at a particular time).
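
As a rough illustration of the concept only (this design is explicitly not meant to map directly to actual code), a dataset could be modeled as a payload plus descriptive metadata. The class name `Dataset` follows the text above; the field names `dataset_type`, `instrument`, `observed_at`, and `payload`, and all example values, are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class Dataset:
    """A single entity of data together with its metadata (hypothetical fields)."""

    dataset_type: str      # e.g. "calexp"
    instrument: str        # instrument that recorded the data
    observed_at: datetime  # time at which the data was recorded
    payload: Any = None    # the data itself; left opaque in this sketch


# A particular calexp for a particular instrument at a particular time.
# The instrument name and timestamp are made-up example values.
example = Dataset(
    dataset_type="calexp",
    instrument="ExampleCam",
    observed_at=datetime(2017, 8, 24, 16, 48),
)
```

Keeping the metadata fields separate from the payload is one way to let the entity be identified and described without touching the data itself.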

@pschella
pschella / butler-use-cases.md
Last active August 24, 2017 16:48
Use Cases for the LSST DM Data Butler Working Group

Personas

  • Dave the Developer (e.g. pipeline developer)
  • Susy the Astronomer (e.g. general public astronomer user)
  • Otto the Operator (e.g. person running pipelines on a cluster in operations)

Use cases

  • Susy is going to a conference and wants to pre-cache some data from a remote repository so that she can access it through the butler while she has no network connectivity.
  • Susy and her colleagues want to access the same (or overlapping) data from a remote repository. It would be efficient if this data could be cached in an on-site proxy.
  • Susy (or the task she is running) wants to load a dataset through the butler, but the dataset is too large to fit in her device's memory. What does the butler do?
  • Susy wants to access metadata associated with a dataset. Does the butler need to load the entire dataset?
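
The last use case can be made concrete with a small sketch. This is not the butler's API, only one possible answer to the question: a handle that keeps metadata available immediately and defers reading the payload until it is first requested. The names `LazyDataset`, `loader`, and `loaded` are hypothetical.

```python
from typing import Any, Callable, Dict


class LazyDataset:
    """Hypothetical handle: metadata is available at once, data loads on demand."""

    def __init__(self, metadata: Dict[str, Any], loader: Callable[[], Any]):
        self.metadata = metadata  # small, cheap to hold in memory
        self._loader = loader     # called only when the payload is needed
        self._data = None
        self.loaded = False

    @property
    def data(self) -> Any:
        # Read the payload on first access, then reuse the cached copy.
        if not self.loaded:
            self._data = self._loader()
            self.loaded = True
        return self._data


# The lambda stands in for an expensive read from a repository.
handle = LazyDataset({"instrument": "ExampleCam"}, loader=lambda: [0] * 1000)
instrument = handle.metadata["instrument"]  # no payload has been read yet
```

In this sketch, reading `handle.metadata` never triggers the loader; only accessing `handle.data` does. Whether the butler should behave this way is exactly what the use case asks.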