@rpotter12
Last active March 1, 2021 06:39
Google Summer of Code 2020 Final Report

Organization - Aboutcode

Project - Add additional package metadata and lockfile parsers

ScanCode-toolkit is open-source software that detects licenses, copyrights, package manifests, dependencies, and more by scanning code. This allows automated discovery of the third-party packages and licenses used in a project. ScanCode currently handles various package metadata formats, such as npm (package.json) and Python (setup.py). The goal of this project is to add additional package metadata and lockfile parsers to scancode-toolkit. ScanCode currently implements parsers for Python packages (setup.py, .whl), npm (package-lock.json/npm-shrinkwrap.json), Ruby Gems (Gemfile, Gemfile.lock), Java Jars, PHP Composer packages, Debian .deb files/Yum .rpm files, and Rust crates.

However, there are a few formats still missing, such as:

  • Go (go.mod, go.sum)
  • OCaml (.opam)
  • Python packages (requirements.txt, Pipfile.lock)
  • Rust (Cargo.lock)

Here are the links to the related pull requests I made during the summer.

| PR | Status | Description |
|----|--------|-------------|
| #2078 | Merged | Packagedcode to handle Python (requirements.txt) files. |
| #2097 | Merged | Packagedcode to handle Go (go.mod) files. |
| #2116 | Merged | Packagedcode to handle Python (Pipfile.lock) files. |
| #2132 | Merged | Detect vcs_url from Go (go.mod) files. |
| #2133 | Merged | Packagedcode to handle Go (go.sum) files. |
| #2152 | Merged | Merge the go.mod and go.sum parsers into one module. This PR improves the readability of the code and reduces repeated lines of code. |
| #2153 | Merged | Packagedcode to handle Rust (Cargo.lock) files. |
| #2156 | Merged | Packagedcode to handle OCaml (opam) files. |
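
To give a rough idea of what "handling" a manifest such as go.mod involves, here is a minimal sketch in Python. It is not the code from the PRs above: the function name `parse_go_mod`, the regular expression, and the sample manifest are all assumptions made for illustration, and it only covers the `module` and `require` directives.

```python
import re

# Hypothetical helper for illustration only; not ScanCode's actual parser.
# A requirement entry looks like: "github.com/pkg/errors v0.9.1"
REQUIRE_ENTRY = re.compile(r'^(?P<module>\S+)\s+(?P<version>v\S+)')


def parse_go_mod(text):
    """Return (module_name, [(dependency, version), ...]) from go.mod text."""
    module = None
    requires = []
    in_block = False
    for raw in text.splitlines():
        # Drop trailing comments such as "// indirect".
        line = raw.split('//', 1)[0].strip()
        if not line:
            continue
        if line.startswith('module '):
            module = line[len('module '):].strip()
        elif line == 'require (':
            in_block = True
        elif in_block and line == ')':
            in_block = False
        elif in_block or line.startswith('require '):
            # Single-line form carries the "require " prefix; block entries do not.
            entry = line[len('require '):] if line.startswith('require ') else line
            match = REQUIRE_ENTRY.match(entry.strip())
            if match:
                requires.append((match['module'], match['version']))
    return module, requires


# Made-up sample manifest.
SAMPLE_GO_MOD = """
module github.com/example/app

go 1.14

require (
    github.com/pkg/errors v0.9.1
    golang.org/x/text v0.3.2 // indirect
)

require github.com/stretchr/testify v1.6.1
"""

if __name__ == '__main__':
    print(parse_go_mod(SAMPLE_GO_MOD))
```

A real parser would also need to deal with directives such as `replace` and `exclude`, which this sketch ignores.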

During the summer I also worked on parsing Rubygems (.gemspec) and Cocoapods (.podspec) files.

.gemspec and .podspec files are almost exactly the same, so we only need one parser to handle both. To detect all the data, we created a separate module to organize the information. Here is the link to the PR.

| PR | Status | Description |
|----|--------|-------------|
| #2075 | Open | Packagedcode to handle Rubygems (.gemspec) and Cocoapods (.podspec) files. |
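
Because both formats are Ruby DSL files that declare dependencies in a similar way, one line-based parser can cover both. The following is a minimal sketch of that idea in Python. It does not use the gemfileparser API or the code from the PR: the function name `parse_spec_dependencies`, the regular expressions, and the sample files are assumptions made for illustration.

```python
import re

# Hypothetical patterns for illustration only. They match lines such as:
#   s.add_dependency 'rake', '>= 12.0'                  (.gemspec)
#   s.add_development_dependency "rspec", "~> 3.0"      (.gemspec)
#   s.dependency 'AFNetworking', '~> 4.0'               (.podspec)
DEPENDENCY_LINE = re.compile(
    r'^\s*\w+\.(?P<kind>add_(?:runtime_|development_)?dependency|dependency)'
    r'\s*\(?\s*(?P<args>.+?)\)?\s*$'
)
QUOTED = re.compile(r'''['"]([^'"]+)['"]''')


def parse_spec_dependencies(text):
    """Collect dependencies declared in .gemspec or .podspec text."""
    deps = []
    for line in text.splitlines():
        match = DEPENDENCY_LINE.match(line)
        if not match:
            continue
        values = QUOTED.findall(match['args'])
        if not values:
            continue
        # First quoted value is the name; the rest are version requirements.
        name, requirements = values[0], values[1:]
        kind = 'development' if 'development' in match['kind'] else 'runtime'
        deps.append({'name': name, 'requirements': requirements, 'kind': kind})
    return deps


# Made-up sample files.
SAMPLE_GEMSPEC = """
Gem::Specification.new do |s|
  s.name = 'example'
  s.add_dependency 'rake', '>= 12.0'
  s.add_development_dependency "rspec", "~> 3.0"
end
"""

SAMPLE_PODSPEC = """
Pod::Spec.new do |s|
  s.name = 'Example'
  s.dependency 'AFNetworking', '~> 4.0'
end
"""

if __name__ == '__main__':
    print(parse_spec_dependencies(SAMPLE_GEMSPEC))
    print(parse_spec_dependencies(SAMPLE_PODSPEC))
```

Since these files are real Ruby code, a regex approach like this only catches simple literal declarations; dependencies built dynamically would be missed.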

To parse both of these files, we contributed to the gemfileparser organization, which had a library for parsing Rubygems files. We added new code to handle .podspec files too. The library was limited to specific types of dependencies, so we added code to handle all types of dependencies. Here are the links to that work:

| PR | Status | Description |
|----|--------|-------------|
| #9 | Merged | Detect all types of dependencies. |
| #11 | Merged | Remove runtime dependency on nose. |

Links to my pre-GSoC work that helped me get selected for GSoC 2020:

| PR | Status | Description |
|----|--------|-------------|
| #1955 | Merged | Fixes undetected url in setup.py. |
| #1960 | Closed | Detect download_url in setup.py file. |
| #1981 | Closed | Basic Packagedcode to handle OCaml (opam) files. This PR was closed during GSoC phase 3 because a new PR was opened that parses the whole file and detects all the data properly (PR link: #2156). |

A lot of other cool stuff happened in these past few months. This was an awesome summer working with Aboutcode under Google Summer of Code 2020. I want to thank Google and Aboutcode for giving me the opportunity to work with such an amazing community. A shoutout and special thanks to my mentors (Philippe Ombredanne and Steven Esser) for guiding me throughout the summer!

And so the wonderful journey comes to an end.

@pombredanne

that's great 👍

@steven-esser

looks good 👍

@aarjavjain123

nice one

ghost commented Dec 3, 2020

hi,
It tells about dependencies of the code which have not explicitly mentioned them in any file, right??

What is the main idea, how it tells about dependencies??
