@SteVwonder
Created October 2, 2018 20:59
Notes from discussion on R

Identify the requirements on R

Consumers/Producers

  • Definition: express a resource set
  • R_local: subset of R per broker rank
    • IMP consumes this
  • Producers + Consumers
    • Scheduler produces
    • Scheduler consumes
      • Resources incrementally handed back from the exec
    • Exec consumes
      • Used to determine the ranks on which job shells need to be executed
      • Divides R into R_locals
    • Job shell consumes R (and jobspec)
      • Also might use R_local
    • Users produce
    • Utilities (user-facing interfaces)
      • What resources are available/idle
      • What resources are still allocated to a job
    • Potential consumers/producers
      • Resource monitoring
      • Dynamic scheduling
  • Relation to hwloc
    • core ids should be hwloc logical ids
    • independent but R can be built from hwloc
    • R is a superset (supports power, etc)
  • Needs to be extensible enough to support both R and R_local requirements
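
The note that the exec system divides R into R_locals and uses R to decide which ranks need job shells can be sketched as below. This is only an illustration: the flat list-of-dicts encoding, the "rank" key, and the name split_by_rank are assumptions made for the example, not a format agreed in the discussion.

```python
from collections import defaultdict

# Hypothetical flat encoding of R: one dict per resource, each annotated
# with the broker rank that hosts it. The real encoding was not decided.
R = [
    {"type": "core", "id": 0, "rank": 0},
    {"type": "core", "id": 1, "rank": 0},
    {"type": "gpu", "id": 0, "rank": 1},
]


def split_by_rank(resources):
    """Group resources by their "rank" attribute: one R_local per rank."""
    r_locals = defaultdict(list)
    for res in resources:
        r_locals[res["rank"]].append(res)
    return dict(r_locals)


r_locals = split_by_rank(R)
# The ranks that own at least one resource are the ranks needing a job shell.
ranks_needing_shells = sorted(r_locals)
```

Each value in r_locals is the subset of R that would be handed to the consumer on that rank (the IMP, and possibly the job shell).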

Requirements

  • Should be versioned
  • Representation needs to be scalable
    • Should support a compact representation
      • Two-level compression (first a compact representation, then a compression algorithm)
  • Should support rank & slot as an attribute
    • Could hurt scalability and compactness; need to look at optimizing this later
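
One possible reading of the versioning and two-level compression requirements, as a minimal sketch. The range-string encoding, the field names ("version", "ranks", "cores"), and the choice of zlib are all assumptions for illustration; the notes do not pick a representation or an algorithm.

```python
import json
import zlib


def encode_idset(ids):
    """Level 1: collapse integer ids into ranges, e.g. [0, 1, 2, 5] -> "0-2,5"."""
    ids = sorted(set(ids))
    parts = []
    i = 0
    while i < len(ids):
        j = i
        # Extend j while the next id is consecutive.
        while j + 1 < len(ids) and ids[j + 1] == ids[j] + 1:
            j += 1
        parts.append(str(ids[i]) if i == j else f"{ids[i]}-{ids[j]}")
        i = j + 1
    return ",".join(parts)


def decode_idset(text):
    """Inverse of encode_idset."""
    ids = []
    for part in text.split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            ids.extend(range(int(lo), int(hi) + 1))
        else:
            ids.append(int(part))
    return ids


def pack(r):
    """Level 2: run a general-purpose compressor over the serialized form."""
    return zlib.compress(json.dumps(r, sort_keys=True).encode("utf-8"))


def unpack(blob):
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical versioned R using the compact id encoding.
R = {
    "version": 1,
    "ranks": encode_idset(range(0, 4)),
    "cores": encode_idset([0, 1, 2, 3, 8, 9]),
}
blob = pack(R)
```

Per-resource rank and slot attributes break up runs like these, which is one way the scalability concern above could show up.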

General

  • Recursive nature
    • R should be passed from parent to child
  • Should be validated against/reconciled with hwloc
  • Resources
    • Arbitrary types and attributes
  • Should be expressive enough to represent a graph
    • Should be a superset of hwloc
    • Can be optimized for specific graph configurations (multiple trees)
    • Should support arbitrary types of edges between resources (and attributes on those edges)
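
The graph requirements above (arbitrary resource types and attributes, arbitrary edge types with their own attributes, multiple trees over the same resources) could look roughly like the following. The class names, the "contains" and "powers" edge types, and the "watts" attribute are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    type: str                       # arbitrary resource type, e.g. "node"
    name: str
    attrs: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    type: str                       # arbitrary edge type, e.g. "contains"
    attrs: dict = field(default_factory=dict)


@dataclass
class ResourceGraph:
    vertices: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_vertex(self, vertex):
        self.vertices[vertex.name] = vertex

    def add_edge(self, source, target, edge_type, **attrs):
        self.edges.append(Edge(source, target, edge_type, attrs))

    def neighbors(self, name, edge_type):
        """Targets reachable from name over edges of one type."""
        return [e.target for e in self.edges
                if e.source == name and e.type == edge_type]


g = ResourceGraph()
for v in (Vertex("node", "node0"), Vertex("core", "core0"),
          Vertex("gpu", "gpu0"), Vertex("pdu", "pdu0")):
    g.add_vertex(v)

# Containment tree, as hwloc would describe it.
g.add_edge("node0", "core0", "contains")
g.add_edge("node0", "gpu0", "contains")
# A second, overlapping tree for power, with an attribute on the edge.
g.add_edge("pdu0", "node0", "powers", watts=350)
```

Selecting edges by type recovers each tree separately, which is where an optimization for the multiple-trees case could hook in.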

Exec System

  • Required resources
    • Node
    • Core
    • Memory
    • GPUs
  • Needs to be able to resolve the resource set of each rank
    • In the general case, maybe the node name can be used
    • In the non-general case, the scheduler can annotate resources with rank information
  • Should emit jobid as well for optimization on the scheduler side (add to the GitHub issue about the exec system)
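
A small sketch of how the exec system might resolve the rank for a resource under the two cases above: use a scheduler-provided annotation when present, otherwise fall back to the node name. The function name resolve_rank, the "rank" and "name" keys, and the hostname-to-rank table are assumptions for illustration.

```python
def resolve_rank(resource, rank_by_hostname):
    """Prefer an explicit rank annotation; otherwise map the node name to a rank."""
    if "rank" in resource:
        return resource["rank"]
    return rank_by_hostname[resource["name"]]


# Hypothetical lookup table from node name to broker rank.
rank_by_hostname = {"node0": 0, "node1": 1}

annotated = {"type": "node", "name": "node1", "rank": 7}
plain = {"type": "node", "name": "node1"}

rank_annotated = resolve_rank(annotated, rank_by_hostname)
rank_plain = resolve_rank(plain, rank_by_hostname)
```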

Job shell

Utilities

Users

Scheduler

  • Required resources
    • resources required by exec
  • Can support/allow arbitrary resources too

Common operations

  • Intersection
    • between two Rs
  • Filter
    • based on a key or attribute
    • variants:
      • just the resources that have the key
      • resources that have the key plus their children (even if the children don’t have the key)
  • Serialize and deserialize
    • Sanitize invalid attributes like rank and slot when passing to child/receiving from parent
  • Insert/Add
  • Delete/Remove
  • Traversal
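
The intersection, filter, and sanitize-on-serialize operations can be sketched over a simple tree encoding. Everything here is an assumption made for the example: the dict layout, the helper names, matching resources by (type, name), and the choice to keep ancestors of a match so a filtered result stays connected.

```python
import copy
import json


def make(type_, name, children=(), **attrs):
    """Build one resource in a hypothetical tree encoding of R."""
    return {"type": type_, "name": name, "attrs": attrs,
            "children": list(children)}


def filter_tree(res, key, keep_children=False):
    """Pruned copy of res keeping resources that carry attribute key.

    With keep_children=True a matching resource keeps its whole subtree,
    even if the children lack the key. Returns None if nothing matches.
    """
    has_key = key in res["attrs"]
    if has_key and keep_children:
        return copy.deepcopy(res)
    kept = [c for c in (filter_tree(ch, key, keep_children)
                        for ch in res["children"]) if c is not None]
    if has_key or kept:
        return {**res, "attrs": dict(res["attrs"]), "children": kept}
    return None


def intersect(a, b):
    """Resources present in both trees, matched by (type, name) per level."""
    if (a["type"], a["name"]) != (b["type"], b["name"]):
        return None
    b_children = {(c["type"], c["name"]): c for c in b["children"]}
    kept = []
    for child in a["children"]:
        other = b_children.get((child["type"], child["name"]))
        if other is not None:
            kept.append(intersect(child, other))
    return {**a, "attrs": dict(a["attrs"]), "children": kept}


def sanitize(res, invalid=("rank", "slot")):
    """Drop attributes that are not meaningful across instance boundaries."""
    return {**res,
            "attrs": {k: v for k, v in res["attrs"].items()
                      if k not in invalid},
            "children": [sanitize(c, invalid) for c in res["children"]]}


def serialize(res):
    """Sanitize, then emit JSON for passing to a child or parent."""
    return json.dumps(sanitize(res), sort_keys=True)


a = make("node", "node0",
         [make("core", "core0"), make("core", "core1")], rank=0)
b = make("node", "node0",
         [make("core", "core1"), make("core", "core2")])

common = intersect(a, b)
only_keyed = filter_tree(a, "rank")
keyed_with_children = filter_tree(a, "rank", keep_children=True)
wire = serialize(a)
```

Here only_keyed holds node0 alone, while keyed_with_children also carries both cores, matching the two filter variants listed above.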

Plan of attack

  • First implement the emitter in resource matching service
    • Emit simple examples as JSON
    • Then work on a reader
  • Producer and consumer owners can play with it and experiment
    • Enumerating and annotating slots and ranks