Skip to content

Instantly share code, notes, and snippets.

@mroth
Created January 4, 2019 22:38
Show Gist options
  • Save mroth/94adf3ef10313ae19fe1193ea610a9c5 to your computer and use it in GitHub Desktop.
Save mroth/94adf3ef10313ae19fe1193ea610a9c5 to your computer and use it in GitHub Desktop.
Debugging interesting behavior with a common pattern for testing subprocesses and the race detector.
package raceexec
import (
"log"
"os"
"os/exec"
"testing"
"time"
)
// This one trick you won't believe to mock an external binary within test!
//
// Creates an exec.Cmd that actually calls back into the test binary itself,
// invoking a specific test function, with [-- cmd, args...] appended to end,
// and a magic env var specified.
//
// This allows us to easily mock external binary behavior in a cross-platform
// way. The go stdlib uses this trick in os/exec.Cmd's own tests!
//
// https://npf.io/2015/06/testing-exec-command/
func mockExecCommand(command string, args ...string) *exec.Cmd {
cs := []string{"-test.run=TestHelperProcess", "--", command}
cs = append(cs, args...)
cmd := exec.Command(os.Args[0], cs...)
cmd.Env = []string{"GO_WANT_HELPER_PROCESS=1"}
return cmd
}
func TestHelperProcess(t *testing.T) {
// ignore me when not being specifically invoked
if os.Getenv("GO_WANT_HELPER_PROCESS") != "1" {
return
}
// grab all the args after "--"
var args []string
for i, arg := range os.Args {
if arg == "--" {
args = os.Args[i+1:]
break
}
}
if len(args) == 0 {
log.Fatal("mock command not specified")
}
switch args[0] {
case "sleep": // variant of sleep, taking ParseDuration string
t, _ := time.ParseDuration(args[1])
time.Sleep(t)
os.Exit(0)
}
log.Fatal("mocked command not implemented:", args[0])
}
func TestReportProcessDelay(t *testing.T) {
for _, delay := range []string{"10ms", "100ms", "1s"} {
c := mockExecCommand("sleep", delay)
t.Log("running process which should sleep for:", delay)
startTS := time.Now()
err := c.Run()
measuredRuntime := time.Since(startTS)
if err != nil {
t.Fatalf("failed to run cmd: %v", err)
}
t.Log("test measured time elapsed:", measuredRuntime)
}
}
@mroth
Copy link
Author

mroth commented Jan 4, 2019

When running normally, you'll see the subprocesses complete in about the expected amount of time, with perhaps a few milliseconds of overhead at most:

$ go test -run TestReportProcessDelay -v
=== RUN   TestReportProcessDelay
--- PASS: TestReportProcessDelay (1.14s)
    raceexec_test.go:58: running process which should sleep for: 10ms
    raceexec_test.go:65: test measured time elapsed: 16.371754ms
    raceexec_test.go:58: running process which should sleep for: 100ms
    raceexec_test.go:65: test measured time elapsed: 113.842707ms
    raceexec_test.go:58: running process which should sleep for: 1s
    raceexec_test.go:65: test measured time elapsed: 1.009933137s
PASS
ok      github.com/mroth/jeju/examples/raceexec 1.145s

When running the same code with race detector active, things are quite different:

$ go test -run TestReportProcessDelay -v -race
=== RUN   TestReportProcessDelay
--- PASS: TestReportProcessDelay (4.17s)
    raceexec_test.go:58: running process which should sleep for: 10ms
    raceexec_test.go:65: test measured time elapsed: 1.025608074s
    raceexec_test.go:58: running process which should sleep for: 100ms
    raceexec_test.go:65: test measured time elapsed: 1.122880153s
    raceexec_test.go:58: running process which should sleep for: 1s
    raceexec_test.go:65: test measured time elapsed: 2.022351714s
PASS
ok      github.com/mroth/jeju/examples/raceexec 5.186s

You can see roughly exactly 1sec is added to the execution time of each subcommand when run with -race.

Fairly certain what is happening here is since the test binary is built with -race, when you call out to it in the subprocess there is some default behavior (either at tear-up or tear-down for the race detector?) that runs for about a second.

Multiply that across potentially thousands of tests involving subcommands, and you've got a problem.

An initial response might be "well exclude those tests via // +build !race package!" however, the actual behavior we're interested in testing involves verifying are no race conditions in the way multiple concurrent subcommands are wrapped and orchestrated across threads, so the race detection for the actual tests is important.

Unfortunately, since the race build flags are not exposed to runtime (this is by design, from what I can see, previous issues asking for it have been rejected), I can't think of a way to get around this while preserving embedding the mocked subcommand tests. The options exposed under the GORACE environment variable don't seem to have anything that would help.

Of course, it's possible to build external binaries, and shell out to those instead, but this is suboptimal because then the test process is reliant on making sure those binaries are built and up to date, and you end up needing to rely on some Makefile magic and things will break for developers who just do the expected thing and run go test -race, and well, it gets messy fast.

Perhaps I'm missing an obvious and clean solution here?

@mroth
Copy link
Author

mroth commented Jan 4, 2019

Update, this is related to: golang/go#20364
And can be resolved with the undocumented (in the go docs at least) env flag GORACE=atexit_sleep_ms=0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment