@dasl-
Last active March 26, 2019 16:52

Debugging vitess with delve

In this guide, we will run vitess in a docker container and attach to it with delve, a Go debugger. This lets us debug either through delve's command line interface or through a Go IDE such as JetBrains GoLand.

Get the source code

In your terminal:

cd $(go env GOPATH)/src && mkdir vitess.io && cd vitess.io
git clone https://github.com/vitessio/vitess && cd vitess
git apply <(curl https://gist.githubusercontent.com/dasl-/8f44c285b8deebf101ffeb613c5b5ba9/raw/e6ab6ee1dcd3c83870f370aad46ad94c34a122a8/vitess_debug.diff)

This clones the vitess source code into your GOPATH (typically ~/go). If you plan to debug using the GoLand IDE, it is particularly important to clone into the location specified above -- GoLand maps the sources automatically, as long as you use GOPATH correctly.

The commands above also apply a patch that makes debugging possible. The patch adds a build_debug target to the Makefile, which builds the source with some compiler optimizations disabled so that delve can debug the program more easily. It also adjusts the Dockerfiles (one for each flavor of MySQL that vitess supports) to build vitess with these debug improvements and with delve included.
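The patch itself is not reproduced in this guide, but debug builds for delve are commonly produced by passing -gcflags "all=-N -l" to the go tool, which disables optimizations (-N) and inlining (-l). Below is a minimal sketch, assuming the patch does something along these lines. The helper name build_debug_cmd and the ./go/... package path are illustrative and not taken from the patch:

```shell
# Compiler flags commonly used for delve-friendly builds:
#   -N disables optimizations, -l disables inlining, "all=" applies to every package.
DEBUG_GCFLAGS='all=-N -l'

# build_debug_cmd (hypothetical helper) prints the go command for a debug build
# of the given package path. It only prints; nothing is compiled.
build_debug_cmd() {
    printf 'go install -gcflags "%s" %s\n' "$DEBUG_GCFLAGS" "$1"
}

# Illustrative package path; the real build_debug target may build something else.
build_debug_cmd ./go/...
```

Printing the command rather than running it makes it easy to compare against what the patched Makefile actually does.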

Build the new docker image

For building the docker images and setting up vitess inside the containers, we will follow instructions similar to these. Recent commits to the vitess repo have removed the docker instructions from the latest version of the docs and website (more details); the current docs suggest running on bare metal instead. In my experience, getting vitess to run inside a container was much easier than running on bare metal, where I ran into dependency hell.

First we will build the docker image for whatever flavor of MySQL you would prefer to run. See the available flavors here. If you prefer percona, you can do:

make docker_base_percona

Future commands in this tutorial will all assume that we have picked the percona flavor. It should be fairly obvious how to alter the commands if you chose a different flavor.

Note: building the docker image copies the source code and its compiled binaries from your host computer to the container. Anytime you update the source code on the host computer, you should rebuild the docker image.

Note: anytime the vitess go dependencies change (e.g. when pulling the latest vitess code), you may get errors like this when running make docker_base_percona. In that case, run the following in order:

  1. docker/bootstrap/build.sh common
  2. docker/bootstrap/build.sh percona
  3. make docker_base_percona
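The three rebuild steps above can be sketched as a small helper that prints them for any flavor. The function name rebuild_steps is hypothetical, and the sketch assumes other flavors follow the same naming pattern as percona:

```shell
# rebuild_steps (hypothetical helper) prints, in order, the commands that
# rebuild the bootstrap images and then the base image for one MySQL flavor.
# Assumption: other flavors use the same naming pattern as "percona".
rebuild_steps() {
    flavor="${1:-percona}"
    printf '%s\n' \
        "docker/bootstrap/build.sh common" \
        "docker/bootstrap/build.sh $flavor" \
        "make docker_base_$flavor"
}

# Print the steps for the flavor used throughout this guide.
rebuild_steps percona
```

To actually run the steps from the root of the vitess checkout, stopping at the first failure, something like sh -ec "$(rebuild_steps percona)" should work.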

Run the docker image

Increase Docker Engine resources

First, increase the resources allocated to the Docker Engine. Running vitess in the container takes a lot of resources, and with too few, some components will fail to start properly. I tried with 4 CPUs and 6GB RAM, and it worked fine, so I recommend something similar.
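To see what the Docker Engine currently has available, docker info can report the CPU count and total memory. A minimal sketch follows; the helper names bytes_to_gib and docker_resources are made up for this example, and the reported memory is often slightly below the value configured in the Docker settings:

```shell
# bytes_to_gib converts a byte count to GiB with one decimal place.
bytes_to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / (1024 * 1024 * 1024) }'
}

# docker_resources (hypothetical helper) prints the CPU count and memory
# that the Docker Engine reports via `docker info`.
docker_resources() {
    cpus=$(docker info --format '{{.NCPU}}') || return 1
    mem=$(docker info --format '{{.MemTotal}}') || return 1
    echo "CPUs: $cpus, memory: $(bytes_to_gib "$mem") GiB"
}
```

Run docker_resources on the host before starting the container and compare the output with the 4 CPUs and 6GB mentioned above.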

Create an init script to initialize all vitess components in the container

Once the container is running, we will run an init script inside the container to set up vitess and all its components.

Copy the init script to your computer:

mkdir ~/vitess-scripts && curl https://gist.githubusercontent.com/dasl-/c46c31f604c82a0167ac06908aff54ea/raw/e6e29c6f55c3ce3fc167d1fb6efcd7e1d693c096/init.sh > ~/vitess-scripts/init.sh && chmod a+x ~/vitess-scripts/init.sh

This script will initialize zookeeper, mysql, vttablet, and vtgate. It sets up 3 mysql instances: one holding an unsharded shops_index table that backs a lookup vindex for the sharded shop_data table, and two holding the shards of the shop_data table.

Run the container

Run the container via:

docker run -v ~/vitess-scripts:/vitess-scripts -p 40000:40000 -p 15000:15000 -p 15100-15101:15100-15101 -p 15200-15201:15200-15201 -p 15300-15301:15300-15301 -h localhost --security-opt=seccomp:unconfined -it vitess/base:percona bash

Note: you may want to change the image name from vitess/base:percona to match the flavor of MySQL you built.

Rundown of the args:

  • -v ~/vitess-scripts:/vitess-scripts: this mounts the init script we created in the previous step to the path /vitess-scripts/init.sh inside the container. Thus we will be able to run the init script inside the container.
  • -p 40000:40000: We will be making the delve debugging server available on port 40000. Thus we need to expose this port so that your IDE can connect to it from your local computer.
  • -p 15XXX:15XXX: vitess creates admin web interfaces that we may want to access from outside the container in our web browsers. Thus we need to expose / forward these ports outside the container.
  • -h localhost: this changes the hostname of the container to localhost, which will make the admin web interface URLs printed by the init script use localhost:<port> rather than a container hash for the hostname.
  • --security-opt=seccomp:unconfined: required for delve to work properly
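Since the run command is long and is used more than once in this guide, it can help to generate it. A minimal sketch, using only the flags listed above: the function name vitess_run_cmd is hypothetical, and it assumes the image tag matches the flavor name, as vitess/base:percona does.

```shell
# vitess_run_cmd (hypothetical helper) prints the docker run command from this
# guide for a given MySQL flavor and host scripts directory.
# Assumption: the image tag equals the flavor name (e.g. vitess/base:percona).
vitess_run_cmd() {
    flavor="${1:-percona}"
    scripts_dir="${2:-$HOME/vitess-scripts}"
    # Mount the directory holding init.sh into the container.
    printf 'docker run -v %s:/vitess-scripts' "$scripts_dir"
    # 40000 is the delve server; the 15XXX ports are the vitess web interfaces.
    for ports in 40000 15000 15100-15101 15200-15201 15300-15301; do
        printf ' -p %s:%s' "$ports" "$ports"
    done
    printf ' -h localhost --security-opt=seccomp:unconfined -it vitess/base:%s bash\n' "$flavor"
}

# Print the command for the percona flavor.
vitess_run_cmd percona
```

The function only prints the command. To run it directly, eval "$(vitess_run_cmd percona)" should work as long as the scripts directory path contains no spaces.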

Run the init script

You should now have a bash prompt open inside the container:

vitess@localhost:/vt/src/vitess.io/vitess$

Run the init script we created earlier via:

/vitess-scripts/init.sh

You should see output like this.

Note in particular that the delve debugger failed to attach to the vtgate process in the last few lines of output:

API server listening at: [::]:40000
INFO[0000] attaching to pid 9877                         layer=debugger
Could not attach to pid 9877: this could be caused by a kernel security setting, try writing "0" to /proc/sys/kernel/yama/ptrace_scope

Fix ptrace issue

We need to resolve the above error to allow the delve debugger to attach to the vtgate process. First, kill the container that you just started above.

On Mac OS

Unfortunately the solution I have found is a bit hacky. It works, but it is an annoying process, and if anyone knows of a better way to solve this, please let me know.

  1. Comment out lines in the Dockerfile for whichever image you are using such that the last line in the file is USER root. For example, if you are using Dockerfile.percona, the file should look like this.
  2. Build the docker image for the Dockerfile you edited: make docker_base_percona
  3. Run the image you just built: docker run --privileged --security-opt=seccomp:unconfined -it vitess/base:percona bash
  4. You should now have a shell open in the container: root@b84ae3e85834:/vt/src/vitess.io/vitess#
  5. Run the following command in the container's shell: echo 0 > /proc/sys/kernel/yama/ptrace_scope
  6. Kill the container
  7. Undo the edits you made in (1)
  8. Build the docker image again, now that you've undone the edits: make docker_base_percona

The next time that you run the container, the ptrace errors should be fixed. Note that if you restart the docker daemon, you may have to repeat these steps to fix ptrace errors again.

On Linux

On a Linux system, you may be able to fix the ptrace setting outside of the container on the host system.

  1. On the host's shell, run: echo "kernel.yama.ptrace_scope = 0" | sudo tee /etc/sysctl.d/10-ptrace.conf
  2. Restart the host to apply the changes: shutdown -r now
  3. Confirm the host has the change applied: cat /proc/sys/kernel/yama/ptrace_scope (should be 0 now)
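The value in /proc/sys/kernel/yama/ptrace_scope can be interpreted with a small helper. This is a sketch based on the meanings documented for the kernel's Yama security module; the function name describe_ptrace_scope is made up for this example:

```shell
# describe_ptrace_scope (hypothetical helper) explains a Yama ptrace_scope value.
describe_ptrace_scope() {
    case "$1" in
        0) echo "0: classic ptrace, attaching by pid is allowed" ;;
        1) echo "1: restricted, attaching to unrelated processes is refused" ;;
        2) echo "2: admin-only, attaching requires CAP_SYS_PTRACE" ;;
        3) echo "3: no attach, ptrace attach is disabled" ;;
        *) echo "unknown ptrace_scope value: $1" ;;
    esac
}

# Report the current setting when the file exists (Linux with Yama enabled).
scope_file=/proc/sys/kernel/yama/ptrace_scope
if [ -r "$scope_file" ]; then
    describe_ptrace_scope "$(cat "$scope_file")"
fi
```

A result of 0 is what delve needs here. On many Linux hosts, sudo sysctl -w kernel.yama.ptrace_scope=0 applies the setting immediately without a reboot, though it does not persist across restarts unless the sysctl.d file from step 1 is also in place.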

Run the container and init script, part 2

Run the container again:

docker run -v ~/vitess-scripts:/vitess-scripts -p 40000:40000 -p 15000:15000 -p 15100-15101:15100-15101 -p 15200-15201:15200-15201 -p 15300-15301:15300-15301 -h localhost --security-opt=seccomp:unconfined -it vitess/base:percona bash

Run the init script in the container's shell:

/vitess-scripts/init.sh

You should see output like this. Note in particular that, at the end of the output, delve is now able to attach to the vtgate process:

API server listening at: [::]:40000
INFO[0000] attaching to pid 9899                         layer=debugger

Note that we can also connect to the vitess admin web interface on the host machine by opening a browser and navigating to http://localhost:15000/ (this URL is printed in the init script's output).

Debug with delve

You have two options for debugging with delve:

  1. debug on the command line
  2. debug with an IDE, for instance GoLand from JetBrains.

Debugging on the command line

To debug on the command line, you will actually want to kill the container and modify the init script on the host machine. If you open up the init script, the last two lines should be:

dlv attach $vtgate_pid --listen=:40000 --headless --api-version=2 --log
# dlv attach $vtgate_pid

Swap which of the two lines is commented out:

# dlv attach $vtgate_pid --listen=:40000 --headless --api-version=2 --log
dlv attach $vtgate_pid
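As an alternative to editing the comments each time, the end of the init script could choose the mode from an environment variable. This is only a sketch of a possible modification, not something the init script does today: the function name attach_delve and the variable DLV_HEADLESS are hypothetical, while the dlv flags are the ones shown above.

```shell
# attach_delve (hypothetical helper) attaches delve to the given pid.
# DLV_HEADLESS is a made-up variable for this sketch:
#   unset or 1 -> headless server on port 40000 (for an IDE to connect to)
#   anything else -> interactive (dlv) prompt in the terminal
attach_delve() {
    pid="$1"
    if [ "${DLV_HEADLESS:-1}" = "1" ]; then
        dlv attach "$pid" --listen=:40000 --headless --api-version=2 --log
    else
        dlv attach "$pid"
    fi
}
```

With this in place, the last line of the script would become attach_delve "$vtgate_pid", and running DLV_HEADLESS=0 /vitess-scripts/init.sh inside the container would give the command line prompt.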

Run the container again, and run the (now modified) init script in the container. You should see output like this. Note the last two lines are now:

Type 'help' for list of commands.
(dlv)

We now have a delve debugging prompt. Set a breakpoint at the first line of the lookup_hash_unique vindex's Map method in the delve command prompt:

(dlv) break /vt/src/vitess.io/vitess/go/vt/vtgate/vindexes/lookup_hash.go:258
Breakpoint 1 set at 0xaa1034 for vitess.io/vitess/go/vt/vtgate/vindexes.(*LookupHashUnique).Map() /vt/src/vitess.io/vitess/go/vt/vtgate/vindexes/lookup_hash.go:258
(dlv)

Tell delve to continue vtgate program execution (when delve attaches to the vtgate process, it pauses execution):

(dlv) continue

In a new terminal window, open another shell into the container and query some data:

  1. get the container name via docker ps output
  2. open another shell into the container: docker exec -it <container name> bash, where <container name> will be something like adoring_elbakyan
  3. You should now have a shell open inside the container: vitess@localhost:/vt/src/vitess.io/vitess$
  4. Query some data by typing this into the container's shell: vtctl -enable_queries -topo_implementation zk2 -topo_global_server_address localhost:21811,localhost:21812,localhost:21813 -topo_global_root /vitess/global VtGateExecute -server localhost:15991 "select * from shop_data where shop_id = 4"
  5. Observe that in the other terminal window with delve running, the breakpoint is hit:
> vitess.io/vitess/go/vt/vtgate/vindexes.(*LookupHashUnique).Map() /vt/src/vitess.io/vitess/go/vt/vtgate/vindexes/lookup_hash.go:258 (hits goroutine(394):1 total:1) (PC: 0xaa1034)
   253:		return false
   254:	}
   255:
   256:	// Map can map ids to key.Destination objects.
   257:	func (lhu *LookupHashUnique) Map(vcursor VCursor, ids []sqltypes.Value) ([]key.Destination, error) {
=> 258:		out := make([]key.Destination, 0, len(ids))
   259:		if lhu.writeOnly {
   260:			for range ids {
   261:				out = append(out, key.DestinationKeyRange{KeyRange: &topodatapb.KeyRange{}})
   262:			}
   263:			return out, nil
(dlv)
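The query command in step 4 is long, so when hitting a breakpoint repeatedly it may be convenient to wrap it in a shell function inside the container. A minimal sketch: the function name vtgate_query is hypothetical, and all flags and addresses are copied from the command above.

```shell
# vtgate_query (hypothetical helper) sends one SQL statement to vtgate using
# the same vtctl flags as the command in step 4.
vtgate_query() {
    topo_addrs="localhost:21811,localhost:21812,localhost:21813"
    vtctl -enable_queries \
        -topo_implementation zk2 \
        -topo_global_server_address "$topo_addrs" \
        -topo_global_root /vitess/global \
        VtGateExecute -server localhost:15991 "$1"
}
```

After pasting the function into the container's shell, vtgate_query "select * from shop_data where shop_id = 4" issues the same query as step 4.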

Debugging with GoLand IDE

In GoLand, open the vitess project located on your GOPATH (go env GOPATH). The path to your project will probably be something like ~/go/src/vitess.io/vitess.

  1. Go to Run > Edit Configurations… and add a new Go Remote configuration
  2. Fill in the Host localhost, Port 40000, and optionally name it. Should look like this.
  3. Click OK
  4. Click the debug button to connect to delve and start a new debug session.
  5. You should see some output in the container running delve like: DEBU[0667] continuing layer=debugger
  6. Set a breakpoint at the first line of the lookup_hash_unique vindex's Map method. Should look like this.

In a new terminal window, open another shell into the container and query some data:

  1. get the container name via docker ps output
  2. open another shell into the container: docker exec -it <container name> bash, where <container name> will be something like adoring_elbakyan
  3. You should now have a shell open inside the container: vitess@localhost:/vt/src/vitess.io/vitess$
  4. Query some data by typing this into the container's shell: vtctl -enable_queries -topo_implementation zk2 -topo_global_server_address localhost:21811,localhost:21812,localhost:21813 -topo_global_root /vitess/global VtGateExecute -server localhost:15991 "select * from shop_data where shop_id = 4"
  5. Observe that in GoLand, the breakpoint is hit.