@michaelsauter
Last active March 9, 2021 14:46
ods-pipeline-introduction.md
@henrjk

henrjk commented Mar 4, 2021

In the following, I am not trying to propose how this could fit into the current proposal, as I have not found the time to think about it deeply enough. Nonetheless, I think it is worth sharing.

We have been using ODS with a monorepo where several services (most based on quickstarters) are versioned together. This showed tremendous benefits and also allowed us to have the Jenkins pipeline coordinate the builds.

Our current pipeline does a few preparatory steps and then builds a number of services in parallel, each in its own subdirectory. Afterwards they are also deployed in parallel. As a last step, the gateway is deployed. If the pipeline builds a feature branch, a preview environment is created. This is enabled by having all services use Tailor templates (infrastructure as code).

When looking at our builds one can see that they do a lot of unnecessary work.

  1. Services are built and tested, and a new image is created, even though their subdirectory did not change in the monorepo.
  2. The external dependencies of services are retrieved from scratch over and over.

Inspired by https://stackoverflow.com/a/47721593 (comment from akhy), I inspected my current local workspace to see whether I can quickly figure out what changed with git (the hashes match, as I would expect):

$ git rev-parse HEAD:frontend/
34a3e85efea780c80507e743562d07b0137c1746
$ git rev-parse develop:frontend/
34a3e85efea780c80507e743562d07b0137c1746

$ git rev-parse HEAD:backend/be-datafeed/
869e9babb6f38dc0f1bce9fad4da7fc8ea715dce
$ git rev-parse develop:backend/be-datafeed/
869e9babb6f38dc0f1bce9fad4da7fc8ea715dce

If one were to store images/artifacts by these git tree hashes instead of the root repo git hash, one could avoid rebuilding sub-services whose directory did not change.
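As a rough illustration of that idea, here is a minimal Python sketch. It wraps the same `git rev-parse <ref>:<path>` call shown above and derives an image tag from the result. The tag scheme (`service:<first 12 hex chars>`) and the function names are my own assumptions for illustration, not something ODS provides, and the set of existing tags would have to come from whatever registry is in use.

```python
import subprocess
from typing import Set


def tree_hash(ref: str, path: str, repo_dir: str = ".") -> str:
    """Return the git tree object ID of `path` at `ref`, e.g. HEAD:frontend/."""
    result = subprocess.run(
        ["git", "rev-parse", f"{ref}:{path}"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def image_tag(service: str, tree_sha: str) -> str:
    """Hypothetical tag scheme: service name plus a shortened tree hash."""
    return f"{service}:{tree_sha[:12]}"


def needs_build(service: str, tree_sha: str, existing_tags: Set[str]) -> bool:
    """Build only if no image exists yet for this exact directory content."""
    return image_tag(service, tree_sha) not in existing_tags
```

In a pipeline, one might call `needs_build("frontend", tree_hash("HEAD", "frontend/"), tags)` before the build step and skip it when the result is `False`.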

Of course there are also cases where this relationship could include other files. For example the frontend builds are copied in the gateway, so the gateway would also be dependent on the frontend directory hash.

Perhaps there could be an initial stage which computes a build plan. The build plan might be expressed as dependencies on folders of the repo. ODS would then use this information to skip a build if the declared build output for the hashes associated with its dependencies is already available.
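To make the build plan idea more concrete, here is a small Python sketch under stated assumptions: each service declares the folders it depends on, the tree hashes of those folders are combined into one key, and a build is skipped when an artifact for that key already exists. The names `plan_key` and `compute_build_plan`, the `service:key` artifact naming, and the use of SHA-1 to combine hashes are all hypothetical choices for illustration. `tree_hash_of` stands in for a lookup such as `git rev-parse HEAD:<folder>`.

```python
import hashlib
from typing import Callable, Dict, List, Set


def plan_key(paths: List[str], tree_hash_of: Callable[[str], str]) -> str:
    """Combine the tree hashes of all dependency folders into a single key."""
    digest = hashlib.sha1()
    # Sort so the key does not depend on the order dependencies are declared in.
    for path in sorted(paths):
        digest.update(f"{path}={tree_hash_of(path)}\n".encode())
    return digest.hexdigest()


def compute_build_plan(
    deps: Dict[str, List[str]],
    tree_hash_of: Callable[[str], str],
    available: Set[str],
) -> Dict[str, str]:
    """Map each service to "skip" or "build" based on existing artifacts."""
    plan = {}
    for service, paths in deps.items():
        key = plan_key(paths, tree_hash_of)
        plan[service] = "skip" if f"{service}:{key}" in available else "build"
    return plan
```

With `deps = {"frontend": ["frontend/"], "gateway": ["gateway/", "frontend/"]}`, a change in `frontend/` changes the key of both services, so the gateway is rebuilt as well, which matches the gateway case described above.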

To address the second point, fetching required libraries over and over: would it be possible to keep the shared workspace, or have another PVC available, that could be used for several pipeline runs associated with the same branch? This should be optional so that some branches (develop, master) would always refetch.

@michaelsauter
Author

michaelsauter commented Mar 4, 2021

@henrjk Thanks for your comments! I have been thinking about your use case as well and also came to the point of the SO answer. However, I don't have a complete picture right now what a good balance would be. My thoughts so far:

  • One point of my proposal is to have better support for monorepos, as I see that the current Jenkins / orchestration pipeline / component-type concept has quite poor support for them
  • The proposal I made above would not easily allow parallel builds. If you used different build tasks for different subfolders, they would try to mount the same workspace, and would therefore run sequentially. However, if you build the parallelism into the build task itself, you basically require teams to build their own "agent images", as one cannot predict which toolchains to put into the image (and the other option of putting every conceivable toolchain into one image is not good either)
  • I've added a bullet point to the "concerns / limitations" section reflecting one potential solution I see, which is mounting several volumes, one per subfolder ...

In any case, the proposal is still quite flexible (right now it is mainly "readme-ware"), so I'm sure we can adapt it until we have a nice concept. Would love to workshop this with you / the team so that we learn from your experience with the monorepo!

@michaelsauter
Author

michaelsauter commented Mar 5, 2021

I re-read the docs (always a good thing to do 😄) and realised that I was wrong on the "no task parallelism when tasks mount the same workspace" assumption. See https://tekton.dev/docs/pipelines/workspaces/#specifying-workspace-order-in-a-pipeline-and-affinity-assistants. I verified that this works as expected in our cluster, meaning tasks still run in parallel by default even if they share the same workspace (because they are assigned to the same node by default). This removes a big limitation outlined above, and I've updated the proposal accordingly. Build tasks in monorepos in particular will benefit hugely from this.

@felipecruz91

With regards to running tasks in parallel sharing files in the same workspace, there's a nice thread about it:

tektoncd/pipeline#2586

@michaelsauter
Author

@felipecruz91 thanks for the link. I discovered this as well and the current proposal already takes this into account :)
