@belgattitude
Last active May 1, 2024 09:31
Composite github action to improve CI time with yarn 3+ / node-modules linker.

Why

Although actions/setup-node has a built-in cache option, it misses an opportunity regarding cache persistence. Depending on usage, the action below might give you faster installs and potentially reduce carbon emissions (♻️🌳❤️).

Requirements

Yarn 3+/4+ with nodeLinker: node-modules. (Not using yarn? See the corresponding pnpm 7/8+ action gist.)

Structure

.
└── .github
    ├── actions
    │   └── yarn-nm-install/action.yml (composite action)    
    └── workflows
        └── ci.yml (uses: ./.github/actions/yarn-nm-install)    

Composite action

Create the file .github/actions/yarn-nm-install/action.yml and paste the following:

########################################################################################
# "yarn install" composite action for yarn 3/4+ and "nodeLinker: node-modules"         #
#--------------------------------------------------------------------------------------#
# Requirement: actions/setup-node should be run before                                 #
#                                                                                      #
# Usage in workflows steps:                                                            #
#                                                                                      #
#      - name: 📥 Monorepo install                                                     #
#        uses: ./.github/actions/yarn-nm-install                                       #
#        with:                                                                         #
#          enable-corepack: false                   # (default = 'true')               #
#          cwd: ${{ github.workspace }}/apps/my-app # (default = '.')                  #
#          cache-prefix: add cache key prefix       # (default = 'default')            #
#          cache-node-modules: false                # (default = 'false')              #
#          cache-install-state: false               # (default = 'false')              #
#                                                                                      #
# Reference:                                                                           #
#   - latest: https://gist.github.com/belgattitude/042f9caf10d029badbde6cf9d43e400a    #
#                                                                                      #
# Versions:                                                                            #
#   - 1.2.0 - 01-05-2024 - actions/cache upgraded to v4                                #
#   - 1.1.0 - 22-07-2023 - Option to enable npm global cache folder.                   #
#   - 1.0.4 - 15-07-2023 - Fix corepack was always enabled.                            #
#   - 1.0.3 - 05-07-2023 - YARN_ENABLE_MIRROR to false (speed up cold start)           #
#   - 1.0.2 - 02-06-2023 - install-state default to false                              #
#   - 1.0.1 - 29-05-2023 - cache-prefix doc                                            #
#   - 1.0.0 - 27-05-2023 - new input: cache-prefix                                     #
########################################################################################

name: 'Monorepo install (yarn)'
description: 'Run yarn install with node_modules linker and cache enabled'
inputs:
  cwd:
    description: "Changes node's process.cwd() if the project is not located on the root. Default to process.cwd()"
    required: false
    default: '.'
  cache-prefix:
    description: 'Add a specific cache-prefix'
    required: false
    default: 'default'
  cache-npm-cache:
    description: 'Cache npm global cache folder often used by node-gyp, prebuild binaries (invalidated on lock/os/node-version)'
    required: false
    default: 'true'
  cache-node-modules:
    description: 'Cache node_modules, might speed up link step (invalidated lock/os/node-version/branch)'
    required: false
    default: 'false'
  cache-install-state:
    description: 'Cache yarn install state, might speed up resolution step when node-modules cache is activated (invalidated lock/os/node-version/branch)'
    required: false
    default: 'false'
  enable-corepack:
    description: 'Enable corepack'
    required: false
    default: 'true'

runs:
  using: 'composite'

  steps:
    - name: ⚙️ Enable Corepack
      if: inputs.enable-corepack == 'true'
      shell: bash
      working-directory: ${{ inputs.cwd }}
      run: corepack enable

    - name: ⚙️ Expose yarn config as "$GITHUB_OUTPUT"
      id: yarn-config
      shell: bash
      working-directory: ${{ inputs.cwd }}
      env:
        YARN_ENABLE_GLOBAL_CACHE: 'false'
      run: |
        echo "CACHE_FOLDER=$(yarn config get cacheFolder)" >> $GITHUB_OUTPUT
        echo "CURRENT_NODE_VERSION=node-$(node --version)" >> $GITHUB_OUTPUT
        echo "CURRENT_BRANCH=$(echo ${GITHUB_REF#refs/heads/} | sed -r 's,/,-,g')" >> $GITHUB_OUTPUT
        echo "NPM_GLOBAL_CACHE_FOLDER=$(npm config get cache)" >> $GITHUB_OUTPUT

    - name: ♻️ Restore yarn cache
      uses: actions/cache@v4
      id: yarn-download-cache
      with:
        path: ${{ steps.yarn-config.outputs.CACHE_FOLDER }}
        key: yarn-download-cache-${{ inputs.cache-prefix }}-${{ hashFiles(format('{0}/yarn.lock', inputs.cwd), format('{0}/.yarnrc.yml', inputs.cwd)) }}
        restore-keys: |
          yarn-download-cache-${{ inputs.cache-prefix }}-

    - name: ♻️ Restore node_modules
      if: inputs.cache-node-modules == 'true'
      id: yarn-nm-cache
      uses: actions/cache@v4
      with:
        path: ${{ inputs.cwd }}/**/node_modules
        key: yarn-nm-cache-${{ inputs.cache-prefix }}-${{ runner.os }}-${{ steps.yarn-config.outputs.CURRENT_NODE_VERSION }}-${{ steps.yarn-config.outputs.CURRENT_BRANCH }}-${{ hashFiles(format('{0}/yarn.lock', inputs.cwd), format('{0}/.yarnrc.yml', inputs.cwd)) }}

    - name: ♻️ Restore global npm cache folder
      if: inputs.cache-npm-cache == 'true'
      id: npm-global-cache
      uses: actions/cache@v4
      with:
        path: ${{ steps.yarn-config.outputs.NPM_GLOBAL_CACHE_FOLDER }}
        key: npm-global-cache-${{ inputs.cache-prefix }}-${{ runner.os }}-${{ steps.yarn-config.outputs.CURRENT_NODE_VERSION }}-${{ hashFiles(format('{0}/yarn.lock', inputs.cwd), format('{0}/.yarnrc.yml', inputs.cwd)) }}

    - name: ♻️ Restore yarn install state
      if: inputs.cache-install-state == 'true' && inputs.cache-node-modules == 'true'
      id: yarn-install-state-cache
      uses: actions/cache@v4
      with:
        path: ${{ inputs.cwd }}/.yarn/ci-cache
        key: yarn-install-state-cache-${{ inputs.cache-prefix }}-${{ runner.os }}-${{ steps.yarn-config.outputs.CURRENT_NODE_VERSION }}-${{ steps.yarn-config.outputs.CURRENT_BRANCH }}-${{ hashFiles(format('{0}/yarn.lock', inputs.cwd), format('{0}/.yarnrc.yml', inputs.cwd)) }}

    - name: 📥 Install dependencies
      shell: bash
      working-directory: ${{ inputs.cwd }}
      run: yarn install --immutable --inline-builds
      env:
        # Overrides/align yarnrc.yml options (v3, v4) for a CI context
        YARN_ENABLE_GLOBAL_CACHE: 'false' # Use local cache folder to keep downloaded archives
        YARN_ENABLE_MIRROR: 'false' # Prevent populating global cache for caches misses (local cache only)
        YARN_NM_MODE: 'hardlinks-local' # Reduce node_modules size
        YARN_INSTALL_STATE_PATH: '.yarn/ci-cache/install-state.gz' # Might speed up resolution step when node_modules present
        # Other environment variables
        HUSKY: '0' # By default do not run HUSKY install

Workflow action

To use it in a workflow:

    steps:
      - uses: actions/checkout@v4

      - name: Use Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}

      - name: 📥 Monorepo install
        uses: ./.github/actions/yarn-nm-install

Yarn config

Be sure that your .yarnrc.yml sets the nodeLinker: node-modules parameter:

nodeLinker: node-modules

#compressionLevel: 0  # Will give 10%-30% install speed up, but takes more space locally

# This line can be omitted if corepack is enabled (requires the packageManager field in package.json)
yarnPath: .yarn/releases/yarn-3.6.0.cjs # or 4.0.0-rc.45 (rc's seem quite stable imho)...
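
If you want to verify the setting before wiring up the action, a quick local check could look like the sketch below. The `uses_node_modules_linker` helper is hypothetical and only does a simple text match on the file (unquoted or double-quoted value); `yarn config get nodeLinker` reports the effective value if you prefer to ask yarn directly.

```shell
#!/usr/bin/env bash
# uses_node_modules_linker is a hypothetical helper: it succeeds when the given
# .yarnrc.yml contains an uncommented "nodeLinker: node-modules" line.
uses_node_modules_linker() {
  grep -Eq '^nodeLinker:[[:space:]]*"?node-modules"?[[:space:]]*(#.*)?$' "$1"
}

if [ -f .yarnrc.yml ] && uses_node_modules_linker .yarnrc.yml; then
  echo "nodeLinker is node-modules"
else
  echo "nodeLinker: node-modules not found in .yarnrc.yml"
fi
```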

Input parameters

      - name: 📥 Monorepo install
        uses: ./.github/actions/yarn-nm-install
        with:
          cwd: '.'
          enable-corepack: false           
          cache-node-modules: true
          cache-install-state: true 
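
| Parameter | Default | Comment |
| --- | --- | --- |
| cwd | '.' | Run the install in a specific folder. |
| enable-corepack | true | Activate corepack. |
| cache-prefix | default | Allows multiple distinct installs. |
| cache-npm-cache | true | Cache the npm global cache folder (often used by node-gyp and prebuilt binaries). |
| cache-node-modules | false | Cache node_modules (only for exceptional use cases). |
| cache-install-state | false | Only useful if cache-node-modules is activated. |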

This action always caches the folder returned by yarn config get cacheFolder to avoid fetching archives from the npm registry. Depending on the number of dependencies, this generally gives a 2x overall improvement. It speeds up the yarn fetch step and also protects against npm outages. An example of the speed gain:

| CI Scenario | Install | CI fetch cache | Total | Cache size | CI persist cache |
| --- | --- | --- | --- | --- | --- |
| yarn4 with cache | 34s | 3s | 37s | 201Mb | (±5s) |
| yarn4 without cache | 83s | N/A | 83s | N/A | N/A |

Link: https://github.com/belgattitude/compare-package-managers#-install-speed
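
To make the "2x" figure concrete, the small sketch below recomputes the ratio from the timings in the table above (83s without cache versus 34s install plus 3s cache fetch). The `speedup` helper is a hypothetical name used only for this illustration.

```shell
#!/usr/bin/env bash
# speedup is a hypothetical helper: prints uncached / cached with one decimal.
speedup() {
  awk -v u="$1" -v c="$2" 'BEGIN { printf "%.1f\n", u / c }'
}

install_cached=34      # seconds, from the table
fetch_cache=3          # seconds, from the table
install_uncached=83    # seconds, from the table

total_cached=$((install_cached + fetch_cache))
echo "total with cache: ${total_cached}s"
echo "speedup: $(speedup "$install_uncached" "$total_cached")x"
```

The persist step (±5s) is left out here, as it is in the table's Total column.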

In some circumstances, you might achieve better install times by caching the node_modules folder as well. This affects the yarn link step, which is where yarn runs postinstall scripts (node-gyp, binary downloads...). Be aware that the time saved comes with an overhead for cache fetch/compression/persist (actions/cache has more to deal with). Use this option with care and only in exceptional circumstances (i.e. you have multiple dependent steps that run install). Also remember that the node_modules folder does not have the same portability/reliability as the recommended yarn cache folder. To get an idea of the performance gains, see this table:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
| --- | --- | --- | --- | --- |
| yarnCache:on - installState:off - nm:off | 22.717 ± 0.225 | 22.510 | 22.957 | 5.95 ± 0.25 |
| yarnCache:on - installState:on - nm:off | 21.350 ± 0.216 | 21.185 | 21.595 | 5.59 ± 0.24 |
| yarnCache:on - installState:off - nm:on | 11.676 ± 0.041 | 11.647 | 11.722 | 3.06 ± 0.13 |
| yarnCache:on - installState:on - nm:on | 3.820 ± 0.156 | 3.645 | 3.946 | 1.00 |

Link: https://github.com/belgattitude/compare-package-managers#compare-yarn-options
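
The Relative column is each mean divided by the fastest mean. As a sketch, the snippet below recomputes it from the Mean values in the table; the `relative_to_fastest` helper is a hypothetical name, and the labels are shortened for readability.

```shell
#!/usr/bin/env bash
# relative_to_fastest is a hypothetical helper: reads "label<TAB>mean" lines on
# stdin and prints each label with its mean divided by the smallest mean.
relative_to_fastest() {
  awk -F '\t' '
    { label[NR] = $1; mean[NR] = $2 + 0; if (NR == 1 || mean[NR] < min) min = mean[NR] }
    END { for (i = 1; i <= NR; i++) printf "%s\t%.2f\n", label[i], mean[i] / min }
  '
}

# Mean values copied from the table above (seconds).
printf '%s\t%s\n' \
  "installState:off nm:off" 22.717 \
  "installState:on nm:off"  21.350 \
  "installState:off nm:on"  11.676 \
  "installState:on nm:on"   3.820 | relative_to_fastest
```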

Results

On install, when only a few deps changed

image

Cost of action/cache compression

image

Cleanup caches

When a PR is closed or merged, it is best to remove its install caches rather than letting GitHub reach the maximum (10GB) and prune.

image

Link: https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#force-deleting-cache-entries

Here's an example (feel free to adapt it if you need to preserve some caches, e.g. gh actions-cache list -R $REPO -B $BRANCH | cut -f 1 | grep yarn selects only yarn-related caches for deletion):

# https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#force-deleting-cache-entries
name: Cleanup caches for closed branches

on:
  pull_request:
    types:
      - closed
  workflow_dispatch:

jobs:
  cleanup:
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v3

      - name: Cleanup
        run: |
          gh extension install actions/gh-actions-cache

          REPO=${{ github.repository }}
          BRANCH="refs/pull/${{ github.event.pull_request.number }}/merge"

          echo "Fetching list of cache key"
          cacheKeysForPR=$(gh actions-cache list -R $REPO -B $BRANCH | cut -f 1 )

          ## Setting this to not fail the workflow while deleting cache keys. 
          set +e
          echo "Deleting caches..."
          for cacheKey in $cacheKeysForPR
          do
              gh actions-cache delete $cacheKey -R $REPO -B $BRANCH --confirm
          done
          echo "Done"
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
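
To preview which keys a filter such as grep yarn would select before deleting anything, you can run the same pipeline on a listing. The sketch below does this on made-up sample data; the `filter_cache_keys` helper and the sample keys are hypothetical, and the only assumption about the real listing is the one the workflow already makes (tab-separated, cache key in the first column).

```shell
#!/usr/bin/env bash
# filter_cache_keys is a hypothetical helper: reads tab-separated listing lines
# on stdin (cache key in the first column) and prints only the keys that match
# the given pattern.
filter_cache_keys() {
  cut -f 1 | grep -- "$1" || true   # an empty result is not treated as an error
}

# Made-up sample listing standing in for the real command output.
printf '%s\t%s\n' \
  "yarn-download-cache-default-abc123" "201 MB" \
  "yarn-nm-cache-default-Linux-node-v20-feature-x-abc123" "350 MB" \
  "playwright-browsers-def456" "120 MB" | filter_cache_keys yarn
```

Only the two yarn-prefixed keys are printed, so a delete loop fed with this output would leave the third cache alone.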
@belgattitude
Author

What I've understood about it:

  • install state alone isn't enough. (see params desc in the gist)

  • install-state+node-modules allows to reduce resolution step.

See https://github.com/belgattitude/compare-package-managers#compare-yarn-options

| Command | Mean [s] | Min [s] | Max [s] | Relative |
| --- | --- | --- | --- | --- |
| yarnCache:on - installState:on - nm:off | 21.350 ± 0.216 | 21.185 | 21.595 | 5.59 ± 0.24 |
| yarnCache:on - installState:off - nm:off | 22.717 ± 0.225 | 22.510 | 22.957 | 5.95 ± 0.25 |
| yarnCache:on - installState:off - nm:on | 11.676 ± 0.041 | 11.647 | 11.722 | 3.06 ± 0.13 |
| yarnCache:on - installState:on - nm:on | 3.820 ± 0.156 | 3.645 | 3.946 | 1.00 |

  • but caching node-modules is heavy on CI (compress time and cache budget), so it's only relevant in some scenarios.

  • and in yarn 4, it won't help with resolution speed for PR's anyway -> https://yarnpkg.com/features/security#hardened-mode (could be disabled on private repo depending on trust)

IMHO, just leave install-state and node-modules set to false. I kept them in the gist as an advanced recipe.

Hope it clarifies. Thx for the kind words :)

@navidemad

Hello,

      - name: 🌐 Setup Node.js (Configure Node.js Environment)
        uses: actions/setup-node@master
        with:
          node-version-file: "package.json"

      - name: 📦 Install Node Modules (Fetch Required Packages)
        uses: ./.github/actions/yarn-install
        with:
          cwd: "."
          enable-corepack: false
          cache-node-modules: true
          cache-install-state: true

We are getting some errors:
Error: Input required and not supplied: path

Prepare all required actions
Getting action download info
Run ./.github/actions/yarn-install
  with:
    cwd: .
    enable-corepack: false
    cache-node-modules: true
    cache-install-state: true
    cache-prefix: default
    cache-npm-cache: true
  env:
    CI: true
    RAILS_ENV: test
    NODE_ENV: test
    RAKE_ENV: test
    TZ: Europe/Paris
Run EOF=$(dd if=/dev/urandom bs=15 count=1 status=none | base64)
  EOF=$(dd if=/dev/urandom bs=15 count=1 status=none | base64)
  echo "config<<$EOF" >> $GITHUB_OUTPUT
  echo "CACHE_FOLDER=$(yarn config get cacheFolder)" >> $GITHUB_OUTPUT
  echo "CURRENT_NODE_VERSION="node-$(node --version)"" >> $GITHUB_OUTPUT
  echo "CURRENT_BRANCH=$(echo ${GITHUB_REF#refs/heads/} | sed -r 's,/,-,g')" >> $GITHUB_OUTPUT
  echo "NPM_GLOBAL_CACHE_FOLDER=$(npm config get cache)" >> $GITHUB_OUTPUT
  echo "$EOF" >> $GITHUB_OUTPUT
  shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
  env:
    CI: true
    RAILS_ENV: test
    NODE_ENV: test
    RAKE_ENV: test
    TZ: Europe/Paris
    YARN_ENABLE_GLOBAL_CACHE: false
Run actions/cache@master
  with:
    key: yarn-download-cache-default-b4f21e4745143b0e4b2e3220af4fa429ecae49b7b671d16b510b0015fc55ebcd
    restore-keys: yarn-download-cache-default-
  
    enableCrossOsArchive: false
    fail-on-cache-miss: false
    lookup-only: false
  env:
    CI: true
    RAILS_ENV: test
    NODE_ENV: test
    RAKE_ENV: test
    TZ: Europe/Paris
Error: Input required and not supplied: path
Run actions/cache@master
  with:
    path: ./**/node_modules
    key: yarn-nm-cache-default-Linux---b4f21e4745143b0e4b2e3220af4fa429ecae49b7b671d16b510b0015fc55ebcd
    enableCrossOsArchive: false
    fail-on-cache-miss: false
    lookup-only: false
  env:
    CI: true
    RAILS_ENV: test
    NODE_ENV: test
    RAKE_ENV: test
    TZ: Europe/Paris
Cache not found for input keys: yarn-nm-cache-default-Linux---b4f21e4745143b0e4b2e3220af4fa429ecae49b7b671d16b510b0015fc55ebcd
Run actions/cache@master
  with:
    key: npm-global-cache-default-Linux--b4f21e4745143b0e4b2e3220af4fa429ecae49b7b671d16b510b0015fc55ebcd
    enableCrossOsArchive: false
    fail-on-cache-miss: false
    lookup-only: false
  env:
    CI: true
    RAILS_ENV: test
    NODE_ENV: test
    RAKE_ENV: test
    TZ: Europe/Paris
Error: Input required and not supplied: path
Run actions/cache@master
  with:
    path: ./.yarn/ci-cache
    key: yarn-install-state-cache-default-Linux---b4f21e4745143b0e4b2e3220af4fa429ecae49b7b671d16b510b0015fc55ebcd
    enableCrossOsArchive: false
    fail-on-cache-miss: false
    lookup-only: false
  env:
    CI: true
    RAILS_ENV: test
    NODE_ENV: test
    RAKE_ENV: test
    TZ: Europe/Paris
Cache not found for input keys: yarn-install-state-cache-default-Linux---b4f21e4745143b0e4b2e3220af4fa429ecae49b7b671d16b510b0015fc55ebcd

@belgattitude
Author

belgattitude commented Oct 17, 2023

Can you try with a version, rather than master, ie actions/cache@v3… Might be broken upstream

@navidemad

You are right, it works with actions/cache@v3.

@belgattitude
Author

Thx for confirming, I’ll keep an eye on it

@navidemad

Apparently it was not the problem:

Prepare all required actions
Getting action download info
Run ./.github/actions/yarn-install
  with:
    cwd: .
    enable-corepack: false
    cache-node-modules: true
    cache-install-state: true
    cache-prefix: default
    cache-npm-cache: true
  env:
    CI: true
    RAILS_ENV: test
    NODE_ENV: test
    RAKE_ENV: test
    TZ: Europe/Paris
Run EOF=$(dd if=/dev/urandom bs=15 count=1 status=none | base64)
  EOF=$(dd if=/dev/urandom bs=15 count=1 status=none | base64)
  echo "config<<$EOF" >> $GITHUB_OUTPUT
  echo "CACHE_FOLDER=$(yarn config get cacheFolder)" >> $GITHUB_OUTPUT
  echo "CURRENT_NODE_VERSION="node-$(node --version)"" >> $GITHUB_OUTPUT
  echo "CURRENT_BRANCH=$(echo ${GITHUB_REF#refs/heads/} | sed -r 's,/,-,g')" >> $GITHUB_OUTPUT
  echo "NPM_GLOBAL_CACHE_FOLDER=$(npm config get cache)" >> $GITHUB_OUTPUT
  echo "$EOF" >> $GITHUB_OUTPUT
  shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
  env:
    CI: true
    RAILS_ENV: test
    NODE_ENV: test
    RAKE_ENV: test
    TZ: Europe/Paris
    YARN_ENABLE_GLOBAL_CACHE: false
  
Run actions/cache@v3
  with:
    key: yarn-download-cache-default-4165d925ad3510be8de9664e03f237bd83ad736936ef86735d923d396c15a171
    restore-keys: yarn-download-cache-default-
  
    enableCrossOsArchive: false
    fail-on-cache-miss: false
    lookup-only: false
  env:
    CI: true
    RAILS_ENV: test
    NODE_ENV: test
    RAKE_ENV: test
    TZ: Europe/Paris
  
Error: Input required and not supplied: path

@navidemad

I removed the EOF mechanics and stuck to the original version, with some error checking suggested by ChatGPT:

    - name: ⚙️ Expose yarn config as "$GITHUB_OUTPUT"
      id: yarn-config
      shell: bash
      working-directory: ${{ inputs.cwd }}
      env:
        YARN_ENABLE_GLOBAL_CACHE: "false"
      run: |
        CACHE_FOLDER=$(yarn config get cacheFolder) || { echo "Failed to get Yarn cache folder"; exit 1; }
        echo "CACHE_FOLDER=$CACHE_FOLDER" >> $GITHUB_OUTPUT

        CURRENT_NODE_VERSION="node-$(node --version)" || { echo "Failed to get Node version"; exit 1; }
        echo "CURRENT_NODE_VERSION=$CURRENT_NODE_VERSION" >> $GITHUB_OUTPUT

        CURRENT_BRANCH=$(echo ${GITHUB_REF#refs/heads/} | sed -r 's,/,-,g') || { echo "Failed to get current branch"; exit 1; }
        echo "CURRENT_BRANCH=$CURRENT_BRANCH" >> $GITHUB_OUTPUT

        NPM_GLOBAL_CACHE_FOLDER=$(npm config get cache) || { echo "Failed to get NPM global cache folder"; exit 1; }
        echo "NPM_GLOBAL_CACHE_FOLDER=$NPM_GLOBAL_CACHE_FOLDER" >> $GITHUB_OUTPUT

    - name: ♻️ Show exposed yarn-config
      shell: bash
      run: |
        echo "Cache folder: ${{ steps.yarn-config.outputs.CACHE_FOLDER }}"
        echo "Current Node Version: ${{ steps.yarn-config.outputs.CURRENT_NODE_VERSION }}"
        echo "Current Branch: ${{ steps.yarn-config.outputs.CURRENT_BRANCH }}"
        echo "NPM Global Cache: ${{ steps.yarn-config.outputs.NPM_GLOBAL_CACHE_FOLDER }}"
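
One plausible reading of the earlier "Input required and not supplied: path" error (an interpretation, not something confirmed in this thread): in the $GITHUB_OUTPUT file format, a line of the form name<<DELIMITER starts a multiline value, so wrapping the four KEY=value lines between config<<$EOF and $EOF would define a single output named config and leave CACHE_FOLDER and NPM_GLOBAL_CACHE_FOLDER empty. The node_modules key in that log also contains Linux---, i.e. an empty node version and branch, which fits this reading. The sketch below uses a hypothetical `list_output_names` helper, a simplified stand-in for the runner's parser, to show which output names each variant would define.

```shell
#!/usr/bin/env bash
# list_output_names is a hypothetical, simplified stand-in for the runner's
# parser: it prints the output names an output file would define, handling both
# "name=value" lines and "name<<DELIMITER ... DELIMITER" multiline blocks.
list_output_names() {
  local line delim="" eq hd
  while IFS= read -r line; do
    if [ -n "$delim" ]; then
      # Inside a multiline value: skip everything up to the delimiter line.
      [ "$line" = "$delim" ] && delim=""
      continue
    fi
    eq=${line%%=*}       # text before the first "=" (whole line if none)
    hd=${line%%"<<"*}    # text before the first "<<" (whole line if none)
    if [ "${#hd}" -lt "${#eq}" ]; then
      echo "$hd"; delim=${line#*"<<"}
    elif [ "${#eq}" -lt "${#line}" ]; then
      echo "$eq"
    fi
  done < "$1"
}

out=$(mktemp)
# Variant from the failing log: everything wrapped in one delimiter block.
printf '%s\n' 'config<<XYZ' 'CACHE_FOLDER=/tmp/cache' 'NPM_GLOBAL_CACHE_FOLDER=/tmp/npm' 'XYZ' > "$out"
echo "wrapped: $(list_output_names "$out" | tr '\n' ' ')"
# Variant from the gist: one key=value line per output.
printf '%s\n' 'CACHE_FOLDER=/tmp/cache' 'NPM_GLOBAL_CACHE_FOLDER=/tmp/npm' > "$out"
echo "plain:   $(list_output_names "$out" | tr '\n' ' ')"
rm -f "$out"
```

In the wrapped variant only config is listed, so steps.yarn-config.outputs.CACHE_FOLDER would resolve to an empty string and actions/cache would see no path.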

@Swiftwork

I was using your composite action with yarn workspaces. However, I ran into an issue when installing with yarn workspaces focus packageA, which installs only the dependencies needed for that package: it could not resolve the jest binary (command not found: jest). Oddly, this only occurred in one of my packages/jobs. After a day of debugging I tracked the issue down to YARN_INSTALL_STATE_PATH. I think there needs to be a safeguard for when the cache misses or is not used; otherwise it has the potential to confuse the installation. So I altered the line to be:

YARN_INSTALL_STATE_PATH: ${{ steps.yarn-install-state-cache.outputs.cache-hit == 'true' && '.yarn/ci-cache/install-state.gz' || '.yarn/install-state.gz' }}

Leaving it here for others to find 😄.
