Skip to content

Instantly share code, notes, and snippets.

@thomaspoignant
Created November 8, 2021 13:48
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save thomaspoignant/eb6ddaa355e416f89ded01acbf1a86c5 to your computer and use it in GitHub Desktop.
Save thomaspoignant/eb6ddaa355e416f89ded01acbf1a86c5 to your computer and use it in GitHub Desktop.
An example to sort a list of jobs depending on their depends.
from typing import Dict, List, Optional
import networkx as nx
def main():
stages = {
"Lint": [],
"Test": [],
"Coverage": ["Test"],
"Docs": ["Coverage", "Lint"],
"Benchmark": ["Coverage"],
}
print(list(_sort_jobs(stages)))
# Output: [['Lint', 'Test'], ['Coverage'], ['Docs', 'Benchmark']]
def _sort_jobs(dependencies: Dict[str, List[str]]) -> List[List[str]]:
"""
_sort_jobs is taking a dependency dict and sort the jobs to be able to have parallel run.
:param dependencies: A dependency dict that contains every stages and their dependencies
something like { "1": [], "2": [], "3": ["2", "1", "4"], "4": ["2", "1"] }
:return: The order of the stages
example: [["1", "2"], ["4"], ["3"]]
"""
g = nx.DiGraph(dependencies)
# detect cycling workflows
cycles = list(nx.simple_cycles(g))
if len(cycles) > 0:
raise CyclingPipeline(cycles=cycles)
# sort the stages
out_degree_map = {v: d for v, d in g.out_degree() if d > 0}
zero_out_degree = [v for v, d in g.out_degree() if d == 0]
while zero_out_degree:
yield zero_out_degree
new_zero_out_degree = []
for v in zero_out_degree:
for child, _ in g.in_edges(v):
out_degree_map[child] -= 1
if not out_degree_map[child]:
new_zero_out_degree.append(child)
zero_out_degree = new_zero_out_degree
class CyclingPipeline(Exception):
"""
CyclingPipeline is an exception raised if we detect a cycle in the pipeline.
"""
def __init__(self, cycles=Optional[List]):
message = f"Invalid workflow you have a cycling dependencies: {cycles}"
super().__init__(message)
if __name__ == "__main__":
main()
@garyzava
Copy link

Hi good stuff! Comparing the initial dependencies vs the output ([['Lint', 'Test'], ['Coverage'], ['Docs', 'Benchmark']]), they are not exactly the same. In the output, Coverage has a dependency on both Lint and Test. Or maybe there's no way to represent accurately your sample graph in a list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment