@yashugarg
Last active October 4, 2022 17:22
Google Summer of Code 2022 Final Work Submission

GSoC 2022

Google Summer of Code Final Work Report


Phase-1: Improving code coverage

Summary

Code coverage is a measurement of which lines of code a test suite executes. Codecov measures and tracks the code coverage with every commit. Before the GSoC period started, the total test coverage was slightly shy of 80%.
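
To make the idea of line coverage concrete, here is a minimal sketch that uses only the Python standard library. It is not how Codecov or the project's test suite measures coverage; `classify` and the helper names are hypothetical and exist only for illustration.

```python
import dis
import sys


def classify(n):
    # Toy function under test (hypothetical, not from the project).
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


def executable_lines(func):
    """Line numbers in func's body that hold at least one bytecode instruction."""
    code = func.__code__
    return {
        lineno
        for _, lineno in dis.findlinestarts(code)
        if lineno is not None and lineno != code.co_firstlineno
    }


def executed_lines(func, *args):
    """Run func(*args) under a trace hook and record which of its lines ran."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            seen.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return seen


def coverage_ratio(func, inputs):
    """Fraction of func's executable lines hit by calling it on each input."""
    all_lines = executable_lines(func)
    hit = set()
    for value in inputs:
        hit |= executed_lines(func, value) & all_lines
    return len(hit) / len(all_lines)
```

Calling `coverage_ratio(classify, [5])` exercises only the "positive" path, while `coverage_ratio(classify, [5, -1, 0])` also reaches the other two branches and therefore reports a higher ratio. Adding tests for untested branches raises the coverage figure in the same way.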

Aim

To increase the code coverage of the tool to 95% or more, which includes:

  • Writing new unit tests and improving the existing ones.
  • Improving the test harnesses and CI infrastructure to cover everything.

Tasks Achieved

Writing and improving unit tests.

Working on tests and CI for Windows

Results


The graph shows a growth pattern in code coverage during Phase 1 of GSoC.

The total test coverage of the tool increased significantly during this phase of the project. With more than a 10% increase in code coverage, the test suite covered execution paths for both Windows and Linux operating systems.

Challenges

  • #1720 took exceptionally long to implement because it required covering all possible execution paths for both operating systems.
  • With new code added regularly, keeping the test coverage percentage up was also challenging.

Phase-2: Implementing fuzzing

Summary

Fuzz testing, also known as fuzzing, is an automated software testing approach that involves feeding random, erroneous, or invalid data to a program. The goal of fuzzing is to find bugs and vulnerabilities in the program. The team decided to go with Atheris as the primary fuzzing tool for the project.
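
The core loop of a fuzzer can be sketched with the standard library alone. This is a simplified stand-in for illustration, not Atheris and not the project's harness: `parse_report` is a hypothetical parser, and the mutation strategy is deliberately naive (Atheris is coverage-guided, which this sketch is not).

```python
import json
import random


def parse_report(raw: bytes) -> dict:
    """Hypothetical parser under test: expects a JSON object with a 'name' key."""
    doc = json.loads(raw.decode("utf-8"))
    if not isinstance(doc, dict):
        raise ValueError("report must be a JSON object")
    return {"name": str(doc["name"])}  # a missing key raises KeyError


def fuzz(target, runs=2000, seed=0):
    """Feed mutated byte strings to target and collect unexpected exceptions."""
    rng = random.Random(seed)
    seeds = [b'{"name": "demo"}', b"[]", b"{}", b""]
    crashes = []
    for _run in range(runs):
        data = bytearray(rng.choice(seeds))
        # Apply zero to three random byte-level mutations to the seed input.
        for _ in range(rng.randint(0, 3)):
            if data and rng.random() < 0.5:
                data[rng.randrange(len(data))] = rng.randrange(256)
            else:
                data.insert(rng.randint(0, len(data)), rng.randrange(256))
        try:
            target(bytes(data))
        except ValueError:
            pass  # expected rejection (covers JSON and UTF-8 decode errors)
        except Exception as exc:
            crashes.append((bytes(data), type(exc).__name__))
    return crashes
```

Here `fuzz(parse_report)` reports the unhandled `KeyError` that occurs when the input is a JSON object without a `name` key. With Atheris, the same target would typically be wrapped in a `TestOneInput(data)` function, registered with `atheris.Setup(sys.argv, TestOneInput)`, and run with `atheris.Fuzz()`.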

Aim

To implement fuzzing in the tool to find bugs and vulnerabilities, which includes:

  • Setting up the fuzzing infrastructure.
  • Implementing structure-aware fuzzing for JSON inputs in merged reports and for the various input formats of CycloneDX and SPDX SBOMs.
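
Structure-aware fuzzing generates inputs that are already syntactically valid, so the fuzzer spends its time on the parser's logic rather than on inputs rejected as malformed. The sketch below shows the idea for JSON using only the standard library. It is an assumption-laden illustration, not the protobuf-based setup used in the project, and the key names in `random_report` are hypothetical.

```python
import json
import random


def random_value(rng, depth=0):
    """Build a random JSON-compatible value; nesting is capped at depth 3."""
    kinds = ["null", "bool", "int", "str"]
    if depth < 3:
        kinds += ["list", "dict"]
    kind = rng.choice(kinds)
    if kind == "null":
        return None
    if kind == "bool":
        return rng.random() < 0.5
    if kind == "int":
        return rng.randint(-10**6, 10**6)
    if kind == "str":
        alphabet = "abc:/.-_ \u00e9"
        return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 8)))
    if kind == "list":
        return [random_value(rng, depth + 1) for _ in range(rng.randint(0, 3))]
    return {f"k{i}": random_value(rng, depth + 1) for i in range(rng.randint(0, 3))}


def random_report(rng):
    """Hypothetical report shape: a top-level object with optional known keys."""
    report = {}
    for key in ("name", "version", "components"):
        if rng.random() < 0.7:
            report[key] = random_value(rng)
    return json.dumps(report).encode("utf-8")
```

Every output of `random_report(random.Random(0))` parses as a JSON object, but the presence and types of its fields vary, which is what stresses the code that consumes the parsed document.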

Tasks Achieved

Results

After implementing the fuzzing infrastructure successfully, the tool supports fuzzing for various input formats and parsers. The fuzz tests are still in progress and cover a relatively low percentage of the code. They have not yielded any new bugs so far, which I attribute to the sound code quality and the excellent parsing tools used in the project.

Challenges

  • Setting up fuzzing from scratch was a challenge in itself. I had no experience with Atheris, but the mentors helped me immensely during this process.

Things I Learned

  • It was my first time writing unit tests in Python, and I learned about pytest-mock and implemented mock tests from scratch.
  • I also learned about protobuf while setting up structure-aware fuzzing in the project.
  • Besides many soft skills like communication and time management, the program also helped me learn a lot about best coding practices and open source etiquette.
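
As an illustration of the mock tests mentioned above, here is a minimal sketch that uses the standard library's `unittest.mock`, which pytest-mock wraps. The helper `tool_version` is hypothetical and not taken from the project.

```python
import subprocess
from unittest import mock


def tool_version(binary):
    """Hypothetical helper: ask an external binary for its version string."""
    result = subprocess.run(
        [binary, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def test_tool_version():
    # Replace subprocess.run so the test never launches a real process.
    fake = mock.Mock(stdout="demo 1.2.3\n")
    with mock.patch("subprocess.run", return_value=fake) as run:
        assert tool_version("demo") == "demo 1.2.3"
    run.assert_called_once_with(
        ["demo", "--version"], capture_output=True, text=True, check=True
    )
```

With pytest-mock, the same patch would be written as `mocker.patch("subprocess.run", return_value=fake)` inside a test that takes the `mocker` fixture, and the patch is undone automatically when the test ends.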

You can find a detailed description of progress and work done in weekly blogs.

Future Work

I plan on contributing significantly to the project after the GSoC period. Things I plan to do:

  • Working with SBOMs and improving the tool to support more formats.
  • Adding more tests to cover the code paths that are not covered yet.
  • Fuzz testing the project's handling of vulnerability data formats.

I am thankful to Google, the Python Software Foundation, and Intel for providing me with this excellent opportunity, and to the mentors, Terri Oda, Suhail, and Anthony Harrison, who guided me throughout the program.

I also want to thank my fellow GSoC contributors, Rhythm and Anant, and the cve-bin-tool community for helping me during the program.
