Scalable job scheduling for Hal
job "hal_job_1234" {
datacenters = ["dc1"]
type = "batch"
constraint {
attribute = "${attr.kernel.name}"
value = "linux"
}
group "default" {
task "run_job" {
resources {
cpu = 100
memory = 512
}
logs {
max_files = 1
max_file_size = 10
}
artifact {
source = "s3::https://s3-us-east-1.amazonaws.com/my-bucket-example/hal-agent.phar"
options {
aws_access_key_id = "<id>"
aws_access_key_secret = "<secret>"
aws_access_token = "<token>"
}
}
driver = "raw_exec"
config {
# When running a binary that exists on the host, the path must be absolute/
command = "local/hal-agent.phar"
args = ["deploy", "job-id-here-1234"]
}
}
}
}
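This spec bakes the job ID into both the job name and the args, so the frontend has to render and register a fresh spec for every job. The HashiCorp post linked below covers Nomad's parameterized jobs (dispatch), which avoid that: register one job, then dispatch an instance per Hal job with metadata. A minimal sketch assuming the same S3 artifact; the job name and the job_id meta key are hypothetical:

job "hal-agent" {
  datacenters = ["dc1"]
  type        = "batch"

  # Registered once; each Hal job becomes a dispatched instance.
  parameterized {
    meta_required = ["job_id"]
  }

  group "default" {
    task "run_job" {
      driver = "raw_exec"

      artifact {
        source = "s3::https://s3-us-east-1.amazonaws.com/my-bucket-example/hal-agent.phar"
      }

      config {
        command = "local/hal-agent.phar"
        # NOMAD_META_job_id is populated from -meta at dispatch time.
        args = ["deploy", "${NOMAD_META_job_id}"]
      }

      resources {
        cpu    = 100
        memory = 512
      }
    }
  }
}

The frontend would then run the equivalent of nomad job dispatch -meta job_id=1234 hal-agent (or POST to /v1/job/hal-agent/dispatch) instead of templating a full spec per job.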

https://www.hashicorp.com/blog/replacing-queues-with-nomad-dispatch

Workflow 1

Cron replacement. Separate Nomad (agent), builder (Linux), and builder (Windows) clusters are still needed.

  1. Frontend schedules the job in Nomad (with a unique name and job definition)
    • The agent CLI binary is uploaded to S3 so the job can fetch it
  2. Job is executed in a trusted (raw_exec) environment; see the client-config sketch after this list
    • Download the agent binary
    • Execute it with the job to run (hal.phar run ${job_id})

      Requires database access and access to the API
      Docker must be installed (the user Nomad runs as must have Docker access)
      A PHP environment must be installed
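The raw_exec driver is disabled by default in Nomad, so each client in the trusted pool has to opt in. A minimal sketch of the relevant client configuration (the rest of the agent config is elided):

# Nomad client config: enable the raw_exec driver (off by default).
client {
  enabled = true

  options = {
    "driver.raw_exec.enable" = "1"
  }
}

raw_exec tasks run as the same user as the Nomad client agent, which is why that user needs Docker access and a PHP environment on the host.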

Workflow 2

Full scheduler replacement. Only a single Nomad cluster is needed (Windows and Linux machines in the fleet).

  1. Frontend schedules the job in Nomad (with a unique name and job definition)
    • The agent CLI binary is uploaded to S3 so the job can fetch it
  2. Job is executed in a trusted (raw_exec) environment
    • Download the agent binary
    • Execute it with the job to run (hal.phar run ${job_id})
    • Clone / fetch the artifact, store it in temp S3
    • Parse the project config and build the pipeline (stages for the job)
  3. "Overwatch" job triggers other Nomad jobs (Docker) and monitors them; a sketch of one such stage job follows this list
    • Trigger new jobs (AKA stages)
      • Store the artifact
      • Monitor
      • Retrieve the artifact
      • (Bonus points) Change/modify the pipeline on the fly
  4. "Overwatch" job shuts down
    • Store the entire job logs
    • Report back to the API
    • Clean up all temp artifacts
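A sketch of what one overwatch-dispatched stage job could look like, again as a parameterized job; the hal-stage name and the stage_image / artifact_url meta keys are hypothetical:

job "hal-stage" {
  datacenters = ["dc1"]
  type        = "batch"

  # Registered once; "Overwatch" dispatches one instance per stage.
  parameterized {
    meta_required = ["stage_image", "artifact_url"]
  }

  group "default" {
    task "run_stage" {
      driver = "docker"

      # Temp artifact from the previous stage, fetched into the task dir.
      artifact {
        source = "${NOMAD_META_artifact_url}"
      }

      config {
        image = "${NOMAD_META_stage_image}"
      }

      resources {
        cpu    = 500
        memory = 1024
      }
    }
  }
}

Overwatch can watch each dispatched instance through the HTTP API (e.g. GET /v1/job/<dispatched-id>/allocations) to implement the monitor/retrieve steps, and stop stragglers with a DELETE on the job.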

Questions

  • (Step 2) How are logs sent back to the frontend/database?
  • (Step 2) How is the artifact sent back to the artifact repository?