@parsiya
Created March 22, 2023 21:27

How to build Semgrep

OS: Debian 11 running under WSL2.

Starting here: https://github.com/returntocorp/semgrep/blob/develop/INSTALL.md

# clone the latest tag (at the time of writing)
$ git clone --depth=1 --branch v1.15.0 https://github.com/returntocorp/semgrep && cd semgrep
$ make dev-setup

We get an error: Opam has not been initialised, please run 'opam init'.

Let's do it:

$ opam init
[ERROR] Missing dependencies -- the following commands are required for opam to operate:
  - bwrap: Sandboxing tool bwrap was not found. You should install 'bubblewrap'. See https://opam.ocaml.org/doc/FAQ.html#Why-does-opam-require-bwrap.

Install bubblewrap and try again.

$ sudo apt-get install bubblewrap
$ opam init

This time it works. Let's rerun the original command.

$ make dev-setup

It succeeds and installs the build dependencies.

Now, let's continue with the instructions and run:

$ make

We get another error: make cannot find dune.

make core
make[1]: Entering directory '/home/parsia/playground/semgrep/semgrep'
rm -f bin
make minimal-build
make[2]: Entering directory '/home/parsia/playground/semgrep/semgrep'
dune build
make[2]: dune: No such file or directory
make[2]: *** [Makefile:98: minimal-build] Error 127
make[2]: Leaving directory '/home/parsia/playground/semgrep/semgrep'
make[1]: *** [Makefile:89: core] Error 2
make[1]: Leaving directory '/home/parsia/playground/semgrep/semgrep'
make: *** [Makefile:78: build] Error 2

But dune is already installed:

$ opam install dune
[NOTE] Package dune is already installed (current version is 3.7.0).

We need to add it to PATH by loading the opam environment into the current shell:

$ eval $(opam config env)
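The eval only affects the current shell, so a new terminal will hit the same dune error again. As a minimal sketch, the check below tells you whether the opam switch directory is already on PATH. The helper name in_path is made up for illustration, and ~/.opam/default/bin assumes the default switch used in this walkthrough. In opam 2.x, opam env is the shorter spelling of opam config env.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if directory $1 is an entry of the
# colon-separated list $2 (defaults to the current $PATH).
in_path() {
    case ":${2-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: the default opam switch lives in ~/.opam/default.
if in_path "$HOME/.opam/default/bin"; then
    echo "opam switch is on PATH"
else
    echo "opam switch is not on PATH; run: eval \$(opam env)"
fi
```

If the second message shows up in every new terminal, adding the eval line to your shell startup file is one way to make it stick.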

Let's continue with make:

$ make
make core
make[1]: Entering directory '/home/parsia/playground/semgrep/semgrep'
rm -f bin
make minimal-build
make[2]: Entering directory '/home/parsia/playground/semgrep/semgrep'
dune build
warning: free variables in primitive code "pcre_wasm_module" (/home/parsia/playground/semgrep/semgrep/_build/default/js/pcre.js:7)
vars: Module
warning: free variables in primitive code "tree_sitter_wasm_module" (/home/parsia/playground/semgrep/semgrep/_build/default/js/ocaml-tree-sitter.js:1)
vars: Module
File "src/core/dune", line 59, characters 13-19:
59 |         (run pipenv install)
                  ^^^^^^
Error: Program pipenv not found in the tree or in PATH
 (context: default)

Install pipenv:

$ python3 -m pip install pipenv
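Every failure so far has been a missing command, so a quick check up front can save a few make runs. The sketch below lists which tools are not on PATH. The function name missing_cmds is hypothetical, and the tool list only covers what this walkthrough ran into; other systems may need more.

```shell
#!/bin/sh
# Hypothetical helper: print each command from the argument list that is
# not found on PATH, one per line. Returns non-zero if any are missing.
missing_cmds() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            rc=1
        fi
    done
    return "$rc"
}

# Tools that caused errors in this walkthrough (not an exhaustive list).
missing_cmds opam bwrap dune pipenv \
    || echo "install the commands listed above, then rerun make" >&2
```

Run it after eval $(opam config env), otherwise dune will be reported as missing even though opam has installed it.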

Run make again. It fails with this error, which we can ignore:

pre-commit run -a mypy
make[2]: pre-commit: No such file or directory

Now we have a bunch of .so files.

$ find . -type f -name "*.so"
./cli/src/semgrep/bin/semgrep_bridge_python.so
./cli/src/semgrep/bin/semgrep_bridge_core.so

# ignore the ones in `_build`.
./_build/default/src/cli-bridge/semgrep_bridge_python.so
./_build/default/src/cli-bridge/semgrep_bridge_core.so
./_build/default/libs/ocaml-tree-sitter-core/src/bindings/lib/dlltree_sitter_bindings_stubs.so
./_build/default/libs/ocaml-tree-sitter-core/downloads/tree-sitter-0.20.6/libtree-sitter.so
./_build/default/libs/ocaml-tree-sitter-core/downloads/tree-sitter/libtree-sitter.so
./_build/default/libs/ocaml-tree-sitter-core/tree-sitter/lib/libtree-sitter.so
./_build/default/libs/commons/dllcommons_stubs.so
...
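To skip the copies under _build instead of ignoring them by eye, find can prune that directory. This is a small sketch; the function name find_libs is made up for illustration, and it assumes the build directory sits directly under the directory you pass in.

```shell
#!/bin/sh
# Hypothetical helper: list *.so files under $1 (default: current directory),
# skipping everything inside its top-level _build directory.
find_libs() {
    root=${1:-.}
    find "$root" -path "$root/_build" -prune -o -type f -name '*.so' -print
}

# Run from the repository root.
find_libs .
```

The -prune action stops find from descending into _build at all, which is also faster than filtering the output afterwards.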

Inside semgrep/cli/src/semgrep/bin we also see a binary named semgrep-core.

parsia@Parsia-PC:~/playground/semgrep/semgrep/cli/src/semgrep/bin$ ls
__init__.py  semgrep_bridge_core.so  semgrep_bridge_python.so  semgrep-core

Note that semgrep-core's flags differ from the Python wrapper's, and it doesn't behave like the familiar Semgrep CLI (run semgrep-core -help to see the switches). You can also find all of these files (and more) in ~/.opam/default/bin/.

$ ls ~/.opam/default/bin/semgrep* -1
/home/parsia/.opam/default/bin/semgrep_bridge_core.so
/home/parsia/.opam/default/bin/semgrep_bridge_python.so
/home/parsia/.opam/default/bin/semgrep-core

But to make the FFI bindings, we need the header files (i.e., how to call the functions). They are in:

$ ls src/cli-bridge/ -1
bridge_design.ded
bridge_design.ded.png
bridge_design.txt
bridge_ml.c
bridge_ml.h
bridge_py.c
dlhelp.c
dlhelp.h
dune
Semgrep_bridge_core.ml

The bridge_design.ded.png and bridge_design.txt files here are handy: they explain what's happening.
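Before writing bindings, it can help to compare the headers against what the shared library actually exports. The sketch below uses nm from GNU binutils to list the defined function symbols. The function name list_exports is hypothetical, and I have not checked which symbols the bridge exports, so treat the output as a starting point and not as the API.

```shell
#!/bin/sh
# Hypothetical helper: print the names of defined text (function) symbols
# in the dynamic symbol table of the shared library $1.
list_exports() {
    lib=$1
    if [ ! -f "$lib" ]; then
        echo "no such library: $lib" >&2
        return 1
    fi
    # nm prints "address type name"; type T marks a global function.
    nm -D --defined-only "$lib" | awk '$2 == "T" { print $3 }'
}

# Run from the repository root after a successful build.
list_exports cli/src/semgrep/bin/semgrep_bridge_core.so \
    || echo "build Semgrep first, then run this from the repository root" >&2
```

Any name that appears both here and in bridge_ml.h is a reasonable first candidate for an FFI call.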
