@ereyes01
Last active February 14, 2020 17:40
Concurrent MD5 Hash (From Austin Linux Meetup 2/13/2020)

The compiler error we hit as we were running out of time was caused by an invisible underscore! Somehow, I had typed:

for_h := range ...

but we couldn't see the underscore due to the problem with my editor and the font size change!

Besides the compiler error, I fixed one more problem in the reducer. A common pitfall in Go is writing a loop whose body starts an inner function / goroutine that uses the containing loop's iteration variable. You need to pass the value as an argument to the inner function; otherwise, in Go versions before 1.22, all the goroutines share the same variable, which usually holds the last value of the loop by the time they run! (Go 1.22 and later give each iteration its own copy of the variable.)
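As a minimal sketch of the safe pattern described above (the `collect` function and its inputs are hypothetical, not part of the gist), each goroutine receives the loop value as an argument, so it holds its own copy no matter which Go version compiles it:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect is a hypothetical helper: it starts one goroutine per item and
// passes the item in as an argument, so each goroutine works on its own copy.
func collect(items []string) []string {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		out []string
	)
	for _, item := range items {
		wg.Add(1)
		go func(s string) { // s is a private copy of item
			defer wg.Done()
			mu.Lock()
			out = append(out, s)
			mu.Unlock()
		}(item)
	}
	wg.Wait()
	sort.Strings(out) // goroutines finish in no particular order
	return out
}

func main() {
	fmt.Println(collect([]string{"a", "b", "c"})) // prints [a b c]
}
```

The reducer in this gist uses the same idea when it passes workerOut[i] into the goroutine as the out parameter.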

Benchmark results on my computer: 18s with no concurrency, 5s with concurrency.

You can run the benchmark yourself like this: go test -bench=.

The serial / no concurrency implementation is not included here, but it's easy to write yourself.

Hope you had fun, enjoy!

package main

import (
	"crypto/md5"
	"fmt"
	"io/ioutil"
	"os"
	"sync"
)

// WORKERS is the number of concurrent hashing goroutines.
const WORKERS = 4

// hash is the result of hashing one file; Err is set if the file could not be read.
type hash struct {
	Filename string
	Hash     [md5.Size]byte
	Err      error
}

// createInput sends each filename on a new channel and closes it when done.
func createInput(files []string) <-chan string {
	in := make(chan string)
	go func() {
		for _, f := range files {
			in <- f
		}
		close(in)
	}()
	return in
}

// createWorker starts a goroutine that reads filenames from in, hashes each
// file, and sends the result on the returned channel. The channel is closed
// once in is closed and drained.
func createWorker(in <-chan string) <-chan hash {
	out := make(chan hash)
	go func() {
		for f := range in {
			data, err := ioutil.ReadFile(f)
			if err != nil {
				out <- hash{Err: err, Filename: f}
			} else {
				h := hash{Filename: f, Hash: md5.Sum(data)}
				out <- h
			}
		}
		close(out)
	}()
	return out
}

// reducer merges all worker output channels into a single channel, which is
// closed after every worker channel has been closed.
func reducer(workerOut []<-chan hash) <-chan hash {
	var wg sync.WaitGroup
	reduced := make(chan hash)
	for i := range workerOut {
		wg.Add(1)
		// Pass the channel as an argument so each goroutine gets its own copy.
		go func(out <-chan hash) {
			for h := range out {
				reduced <- h
			}
			wg.Done() // out is closed
		}(workerOut[i])
	}
	go func() {
		wg.Wait()
		close(reduced)
	}()
	return reduced
}

// hashFiles fans the filenames out to WORKERS workers and returns the merged results.
func hashFiles(filenames ...string) <-chan hash {
	in := createInput(filenames)
	var workerOut []<-chan hash
	for i := 0; i < WORKERS; i++ {
		workerOut = append(workerOut, createWorker(in))
	}
	return reducer(workerOut)
}

func main() {
	for h := range hashFiles(os.Args[1:]...) {
		if h.Err != nil {
			fmt.Printf("%s: %s\n", h.Filename, h.Err.Error())
		} else {
			fmt.Printf("%x %s\n", h.Hash, h.Filename)
		}
	}
}

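package main

import (
	"testing"
)

// BenchmarkHashFiles hashes the same three files b.N times.
// file1, file2 and file3 must exist in the working directory.
func BenchmarkHashFiles(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for h := range hashFiles("file1", "file2", "file3") {
			if h.Err != nil {
				b.Fatal(h.Err)
			}
		}
	}
}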