Skip to content

Instantly share code, notes, and snippets.

@ashwanthkumar
Created February 28, 2022 02:13
Show Gist options
  • Save ashwanthkumar/095b0f6c3c1f2858648f4b9739fb1e0e to your computer and use it in GitHub Desktop.
Save ashwanthkumar/095b0f6c3c1f2858648f4b9739fb1e0e to your computer and use it in GitHub Desktop.
GroupBy implementation in Golang for string slices
func GroupBy(slice []string, groupFn func(string) string) map[string][]string {
groups := make(map[string][]string)
for _, elem := range slice {
elementKey := groupFn(elem)
existing, present := groups[elementKey]
if !present {
existing = []string{}
}
existing = append(existing, elem)
groups[elementKey] = existing
}
return groups
}
@karthikcru
Copy link

karthikcru commented Feb 28, 2022

Its perfect, making it a bit concise

func GroupBy(slice []string, groupFn func(string) string) map[string][]string {
	groups := map[string][]string{}
	for _, elem := range slice {
		var elementKey string
		elementKey = groupFn(elem)
		if existing, present := groups[elementKey]; !present {
			groups[elementKey] = []string{elem}
		} else {
			groups[elementKey] = append(existing, elem)
		}
	}
	return groups
}

Cache to make it faster, if elements are repeated cache makes it faster

func GroupBy(slice []string, groupFn func(string) string) map[string][]string {
	groups := map[string][]string{}
	cache := map[string]string{}
	for _, elem := range slice {
		var elementKey string
		if val, ok := cache[elem]; ok {
			elementKey = val
		} else {
			elementKey = groupFn(elem)
			cache[elem] = elementKey
		}
		if existing, present := groups[elementKey]; !present {
			groups[elementKey] = []string{elem}
		} else {
			groups[elementKey] = append(existing, elem)
		}
	}
	return groups
}

Did a bench and the cache does it make it faster. Can use thread safe sync maps and use waitgroups to make it even faster

@ashwanthkumar
Copy link
Author

Can use thread safe sync maps and use waitgroups to make it even faster
@karthikcru Wouldn't locks make it slow? 🤔

@karthikcru
Copy link

If we prealloc the slices used to store the groups, but this increases the space to n^2, appending to non prealloc slice is costly

@karthikcru
Copy link

karthikcru commented Feb 28, 2022

@ashwanthkumar The locks are on the group rather than the entire map, so it essentially is wasting resources in the thread waiting for the lock to get free if it writes into the same group, but if they write to different groups it does get faster right ? Am I saying this right ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment