@kelvintaywl
Created April 11, 2023 06:19

Diffing the circleci/path-filtering orb source code between versions 0.1.3 and 0.1.4

# steps
$ circleci orb source circleci/path-filtering@0.1.3 > path-filter-013.yaml
$ circleci orb source circleci/path-filtering@0.1.4 > path-filter-014.yaml
$ git diff --no-index path-filter-013.yaml path-filter-014.yaml
diff --git a/path-filter-013.yaml b/path-filter-014.yaml
index bcf3896..605544e 100644
--- a/path-filter-013.yaml
+++ b/path-filter-014.yaml
@@ -1,8 +1,8 @@
 version: 2.1
 description: |
-    Continue a pipeline based on paths of changed files.
+    Continue a pipeline based on paths of changed files. This can be useful in a monorepo setup where one may want to trigger different workflows based on which module(s) in the repo has changed.
 display:
-    home_url: https://github.com/CircleCI-Public/path-filtering-orb
+    home_url: https://circleci.com/docs/2.0/dynamic-config
     source_url: https://github.com/CircleCI-Public/path-filtering-orb
 orbs:
     continuation: circleci/continuation@0.2.0
@@ -19,7 +19,7 @@ commands:
             mapping:
                 default: ""
                 description: |
-                    Mapping of path regular expressions to pipeline parameters and values. One mapping per line, whitespace-delimited.
+                    Mapping of path regular expressions to pipeline parameters and values. One mapping per line, whitespace-delimited. If duplicate parameter keys are found, the last matching pattern will apply.
                 type: string
             output-path:
                 default: /tmp/pipeline-parameters.json
@@ -28,13 +28,14 @@ commands:
                 type: string
         steps:
             - run:
-                command: |+
+                command: |
                     #!/usr/bin/env python3
 
                     import json
                     import os
                     import re
                     import subprocess
+                    from functools import partial
 
                     def checkout(revision):
                       """
@@ -48,52 +49,31 @@ commands:
                         check=True
                       )
 
-                    output_path = os.environ.get('OUTPUT_PATH')
-                    head = os.environ.get('CIRCLE_SHA1')
-                    base_revision = os.environ.get('BASE_REVISION')
-                    checkout(base_revision)  # Checkout base revision to make sure it is available for comparison
-                    checkout(head)  # return to head commit
-
-                    base = subprocess.run(
-                      ['git', 'merge-base', base_revision, head],
-                      check=True,
-                      capture_output=True
-                    ).stdout.decode('utf-8').strip()
-
-                    if head == base:
-                      try:
-                        # If building on the same branch as BASE_REVISION, we will get the
-                        # current commit as merge base. In that case try to go back to the
-                        # first parent, i.e. the last state of this branch before the
-                        # merge, and use that as the base.
-                        base = subprocess.run(
-                          ['git', 'rev-parse', 'HEAD~1'], # FIXME this breaks on the first commit, fallback to something
-                          check=True,
-                          capture_output=True
-                        ).stdout.decode('utf-8').strip()
-                      except:
-                        # This can fail if this is the first commit of the repo, so that
-                        # HEAD~1 actually doesn't resolve. In this case we can compare
-                        # against this magic SHA below, which is the empty tree. The diff
-                        # to that is just the first commit as patch.
-                        base = '4b825dc642cb6eb9a060e54bf8d69288fbee4904'
-
-                    print('Comparing {}...{}'.format(base, head))
-                    changes = subprocess.run(
-                      ['git', 'diff', '--name-only', base, head],
-                      check=True,
-                      capture_output=True
-                    ).stdout.decode('utf-8').splitlines()
-
-                    mappings = [
-                      m.split() for m in
-                      os.environ.get('MAPPING').splitlines()
-                    ]
-
-                    def check_mapping(m):
+                    def merge_base(base, head):
+                      return subprocess.run(
+                        ['git', 'merge-base', base, head],
+                        check=True,
+                        capture_output=True
+                      ).stdout.decode('utf-8').strip()
+
+                    def parent_commit():
+                      return subprocess.run(
+                        ['git', 'rev-parse', 'HEAD~1'],
+                        check=True,
+                        capture_output=True
+                      ).stdout.decode('utf-8').strip()
+
+                    def changed_files(base, head):
+                      return subprocess.run(
+                        ['git', '-c', 'core.quotepath=false', 'diff', '--name-only', base, head],
+                        check=True,
+                        capture_output=True
+                      ).stdout.decode('utf-8').splitlines()
+
+                    def check_mapping(changes, m):
                       if 3 != len(m):
                         raise Exception("Invalid mapping")
-                      path, param, value = m
+                      path, _param, _value = m
                       regex = re.compile(r'^' + path + r'$')
                       for change in changes:
                         if regex.match(change):
@@ -103,13 +83,59 @@ commands:
                     def convert_mapping(m):
                       return [m[1], json.loads(m[2])]
 
-                    mappings = filter(check_mapping, mappings)
-                    mappings = map(convert_mapping, mappings)
-                    mappings = dict(mappings)
+                    def write_mappings(mappings, output_path):
+                      with open(output_path, 'w') as fp:
+                        fp.write(json.dumps(mappings))
+
+                    def is_mapping_line(line: str) -> bool:
+                      is_empty_line = (line.strip() == "")
+                      is_comment_line = (line.strip().startswith("#"))
+                      return not (is_comment_line or is_empty_line)
+
+                    def create_parameters(output_path, head, base, mapping):
+                      checkout(base)  # Checkout base revision to make sure it is available for comparison
+                      checkout(head)  # return to head commit
+                      base = merge_base(base, head)
 
-                    with open(output_path, 'w') as fp:
-                      fp.write(json.dumps(mappings))
+                      if head == base:
+                        try:
+                          # If building on the same branch as BASE_REVISION, we will get the
+                          # current commit as merge base. In that case try to go back to the
+                          # first parent, i.e. the last state of this branch before the
+                          # merge, and use that as the base.
+                          base = parent_commit()
+                        except:
+                          # This can fail if this is the first commit of the repo, so that
+                          # HEAD~1 actually doesn't resolve. In this case we can compare
+                          # against this magic SHA below, which is the empty tree. The diff
+                          # to that is just the first commit as patch.
+                          base = '4b825dc642cb6eb9a060e54bf8d69288fbee4904'
 
+                      print('Comparing {}...{}'.format(base, head))
+                      changes = changed_files(base, head)
+
+                      if os.path.exists(mapping):
+                        with open(mapping) as f:
+                          mappings = [
+                            m.split() for m in f.read().splitlines() if is_mapping_line(m)
+                          ]
+                      else:
+                        mappings = [
+                          m.split() for m in
+                          mapping.splitlines() if is_mapping_line(m)
+                        ]
+                      mappings = filter(partial(check_mapping, changes), mappings)
+                      mappings = map(convert_mapping, mappings)
+                      mappings = dict(mappings)
+
+                      write_mappings(mappings, output_path)
+
+                    create_parameters(
+                      os.environ.get('OUTPUT_PATH'),
+                      os.environ.get('CIRCLE_SHA1'),
+                      os.environ.get('BASE_REVISION'),
+                      os.environ.get('MAPPING')
+                    )
                 environment:
                     BASE_REVISION: << parameters.base-revision >>
                     MAPPING: << parameters.mapping >>
@@ -132,6 +158,7 @@ jobs:
     filter:
         description: |
             Continues a pipeline in the `setup` state based with static config and a set of pipeline parameters based on the changes in this push.
+            The mapping should be a set of items like so: <path regular expression> <pipeline parameter> <value> Multiple mappings can be supplied on separate lines. If the regular expression matches any file changed between HEAD and the base revision, the pipeline parameter will be set to the supplied value for the setup workflow continuation. This way the continuation config can be filtered to only perform relevant tasks.
         executor:
             name: default
             tag: << parameters.tag >>
@@ -153,7 +180,12 @@ jobs:
             mapping:
                 default: ""
                 description: |
-                    Mapping of path regular expressions to pipeline parameters and values. One mapping per line, whitespace-delimited.
+                    Mapping of path regular expressions to pipeline parameters and values. If the value is a file, then it will be loaded from the disk. One mapping per line, whitespace-delimited.
+                type: string
+            output-path:
+                default: /tmp/pipeline-parameters.json
+                description: |
+                    Path to save the generated parameters to.
                 type: string
             resource_class:
                 default: small
@@ -183,10 +215,11 @@ jobs:
             - set-parameters:
                 base-revision: << parameters.base-revision >>
                 mapping: << parameters.mapping >>
+                output-path: << parameters.output-path >>
             - continuation/continue:
                 circleci_domain: << parameters.circleci_domain >>
                 configuration_path: << parameters.config-path >>
-                parameters: /tmp/pipeline-parameters.json
+                parameters: << parameters.output-path >>
 examples:
     example:
         description: |
@@ -200,9 +233,13 @@ examples:
                     jobs:
                         - path-filtering/filter:
                             base-revision: main
-                            config-path: .circleci/continue-config.yml
+                            config-path: .circleci/continue_config.yml
                             mapping: |
                                 src/.* build-code true
                                 doc/.* build-docs true
+                        - path-filtering/filter:
+                            base-revision: main
+                            config-path: .circleci/continue-config.yml
+                            mapping: .circleci/mapping.conf
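
The mapping behavior that 0.1.4 introduces (comment and blank lines are skipped, and the last matching pattern wins when parameter keys are duplicated) can be tried outside CircleCI with a rough standalone sketch like the one below. `is_mapping_line`, `check_mapping`, and `convert_mapping` follow the diff above; the return statements of `check_mapping` are elided in the hunk, so a boolean result is assumed here. `resolve_parameters` and the sample data are hypothetical names for illustration only, not part of the orb.

```python
import json
import re
from functools import partial


def is_mapping_line(line: str) -> bool:
    # Same rule as the 0.1.4 script: skip blank lines and "#" comment lines.
    stripped = line.strip()
    return not (stripped == "" or stripped.startswith("#"))


def check_mapping(changes, m):
    if len(m) != 3:
        raise Exception("Invalid mapping")
    path, _param, _value = m
    regex = re.compile(r'^' + path + r'$')
    # Assumption: the diff hunk does not show the return statements; a boolean
    # is assumed because the function is passed to filter().
    return any(regex.match(change) for change in changes)


def convert_mapping(m):
    # The value column is parsed as JSON, so "true" becomes True.
    return [m[1], json.loads(m[2])]


def resolve_parameters(mapping_text, changes):
    # Hypothetical helper: the mapping-related part of create_parameters,
    # without any git calls or file output.
    mappings = [m.split() for m in mapping_text.splitlines() if is_mapping_line(m)]
    mappings = filter(partial(check_mapping, changes), mappings)
    # dict() keeps the last value for a repeated key, hence "last match wins".
    return dict(map(convert_mapping, mappings))


mapping_text = """
# comment lines and blank lines are ignored
src/.* build-code true

doc/.* build-docs true
src/legacy/.* build-code false
"""

changes = ["src/legacy/old.py", "README.md"]
params = resolve_parameters(mapping_text, changes)
print(params)  # {'build-code': False}
```

Both `src/.*` and `src/legacy/.*` match the changed file, and because the later line overwrites the earlier one in the resulting dict, `build-code` ends up `false`.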