
@alive2020
Last active March 31, 2023 11:32
GSoC-2022. Final Submission.

beam-gsoc2022

GSoC-2022, Work on Runner Comparison / Capability Matrix revamp project for Apache Beam.

Mentor: Pablo Estrada
Contributor: Aisulu Karimbaeva

Background

Apache Beam maintains a static capability matrix that tracks which Beam features are supported by which combinations of language SDKs and runners. The Runners/Beam Capability Matrix page needed to be updated. The goal was tighter coupling between the matrix portion of the comparison and the tags on ValidatesRunner tests.

Action: Create a completely new Capability Matrix based on the ValidatesRunner tests that run on the various Apache Beam runners. Improve the UI/UX of the matrix table. Use the test in ./test-infra/validates-runner/ to generate a JSON file that records the capabilities supported by the various runners and exercised by each individual test.

Design:

How do we reformat test results into the file we need?
- “flink” / “spark” / “dataflow” are the COLUMNS
- The categories are the ROWS
- If a test has no categories, then 
- Each element is a unit test. Each test has one or more categories.
- Status: “PASSED” / “FAILED” / “NOT RUN”
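
The pivot described in the list above can be sketched in a few lines of Python. This is only an illustration, not the code from the project: the input record shape (`runner`, `test`, `categories`, `status`), the function name `build_matrix`, the sample test name, and the `UNCATEGORIZED` fallback for tests without categories are all assumptions made for the example.

```python
from collections import defaultdict

RUNNERS = ("flink", "spark", "dataflow")  # matrix columns, as listed above
NOT_RUN = "NOT RUN"


def build_matrix(results, runners=RUNNERS):
    """Pivot flat test results into {category: {test: {runner: status}}}."""
    matrix = defaultdict(dict)
    for r in results:
        # Hypothetical fallback: how tests with no categories are handled
        # is not stated in the design notes, so this key is an assumption.
        categories = r.get("categories") or ["UNCATEGORIZED"]
        for category in categories:
            # Every test starts as NOT RUN on every runner (column).
            row = matrix[category].setdefault(
                r["test"], {runner: NOT_RUN for runner in runners}
            )
            row[r["runner"]] = r["status"]
    return dict(matrix)


# Hypothetical sample input: one test reported by two of the three runners.
results = [
    {"runner": "flink", "test": "ParDoTest.testParDo",
     "categories": ["ParDo"], "status": "PASSED"},
    {"runner": "spark", "test": "ParDoTest.testParDo",
     "categories": ["ParDo"], "status": "FAILED"},
]
matrix = build_matrix(results)
```

Here `matrix["ParDo"]["ParDoTest.testParDo"]` holds one status per runner, with `dataflow` left as `NOT RUN` because no result was reported for it.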

design

What was done?

  • Got familiar with the code and tools, especially Docker and the Hugo framework
  • Got familiar with the manually generated capability matrix and the ValidatesRunner test results
  • Fetched the ValidatesRunner test results and saved them as a local file, reformatted them to the needed format, and used that data to create a new Capability Matrix table, in Single and Big table variants: committed code1, committed code2
  • Replaced the locally imported reformatted test results with data uploaded to GCloud, and added links to test names in the Big table of the new Capability Matrix (based on runner testing): committed code
  • Added a command-line utility for generating reformatted data from test results and uploading it to GCloud: committed code
  • Reported a bug related to ValidatesRunner test results
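
For readers unfamiliar with this kind of tool, a command-line utility of the sort mentioned above might be shaped roughly as follows. This is a minimal sketch and not the committed utility: the `--input` and `--output` flags, the `reformat` stand-in, and the input record fields are hypothetical, and the upload step is deliberately left out because it depends on the cloud storage client in use.

```python
import argparse
import json
from pathlib import Path


def parse_args(argv=None):
    # Flag names are assumptions made for this sketch.
    parser = argparse.ArgumentParser(
        description="Reformat ValidatesRunner results (illustrative sketch)."
    )
    parser.add_argument("--input", required=True, type=Path,
                        help="JSON file with raw test results")
    parser.add_argument("--output", required=True, type=Path,
                        help="where to write the reformatted JSON")
    return parser.parse_args(argv)


def reformat(results):
    """Group statuses by test name; a stand-in for the real transformation."""
    out = {}
    for r in results:
        out.setdefault(r["test"], {})[r["runner"]] = r["status"]
    return out


def main(argv=None):
    args = parse_args(argv)
    results = json.loads(args.input.read_text())
    args.output.write_text(json.dumps(reformat(results), indent=2, sort_keys=True))
    # Uploading the output file to cloud storage would happen here;
    # it is omitted because it requires a storage client and credentials.
    return 0
```

To run it as a script, one would call `main()` under the usual `if __name__ == "__main__":` guard and pass the two file paths on the command line.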

Created New Tables View:

Unexpanded Cap Matrix

UnexpandedCapMatrix2

ExpandedCapMatrix

ExpandedCapMatrix2

All of my GSoC committed code is in pull request #22930 (in review).

Contact:
Email: alive.k001@gmail.com
Linkedin
